update doc

This commit is contained in:
zachary62 2024-12-31 05:57:56 +00:00
parent 8c5e3be044
commit 86297ea2a8
2 changed files with 20 additions and 2 deletions

View File

@ -86,4 +86,22 @@ You can nest a **BatchFlow** in another **BatchFlow**. For instance:
- **Outer** batch: returns a list of diretory param dicts (e.g., `{"directory": "/pathA"}`, `{"directory": "/pathB"}`, ...).
- **Inner** batch: returning a list of per-file param dicts.
At each level, **BatchFlow** merges its own param dict with the parents. By the time you reach the **innermost** node, the final `params` is the merged result of **all** parents in the chain. This way, a nested structure can keep track of the entire context (e.g., directory + file name) at once.
At each level, **BatchFlow** merges its own param dict with the parents. By the time you reach the **innermost** node, the final `params` is the merged result of **all** parents in the chain. This way, a nested structure can keep track of the entire context (e.g., directory + file name) at once.
```python
class FileBatchFlow(BatchFlow):
def prep(self, shared):
directory = self.params["directory"]
files = [f for f in os.listdir(directory) if f.endswith(".txt")]
return [{"filename": f} for f in files]
class DirectoryBatchFlow(BatchFlow):
def prep(self, shared):
directories = [ "/path/to/dirA", "/path/to/dirB"]
return [{"directory": d} for d in directories]
inner_flow = FileBatchFlow(start=MapSummaries())
outer_flow = DirectoryBatchFlow(start=inner_flow)
```

View File

@ -17,7 +17,7 @@ If you know memory management, **Shared Store** is like a **heap** shared across
### Why Not Use Other Communication Models like Message Passing?
**Message passing** works well for simple DAGs (e.g., for data pipelines), but with **nested graphs** (Flows containing Flows, repeated or cyclic calls), routing messages becomes hard to maintain. A shared store keeps the design simple and easy.
**Message passing** works well for simple DAGs, but with **nested graphs** (Flows containing Flows, repeated or cyclic calls), routing messages becomes hard to maintain. A shared store keeps the design simple and easy.
---