update doc
parent 8c5e3be044
commit 86297ea2a8
@@ -86,4 +86,22 @@ You can nest a **BatchFlow** in another **BatchFlow**. For instance:
- **Outer** batch: returns a list of directory param dicts (e.g., `{"directory": "/pathA"}`, `{"directory": "/pathB"}`, ...).
- **Inner** batch: returns a list of per-file param dicts.

At each level, **BatchFlow** merges its own param dict with the parent’s. By the time you reach the **innermost** node, the final `params` is the merged result of **all** parents in the chain. This way, a nested structure can keep track of the entire context (e.g., directory + file name) at once.
```python
import os

# BatchFlow and MapSummaries are assumed to be defined or imported elsewhere in these docs.

class FileBatchFlow(BatchFlow):
    def prep(self, shared):
        # The directory comes from the parent (outer) batch's params.
        directory = self.params["directory"]
        files = [f for f in os.listdir(directory) if f.endswith(".txt")]
        # One param dict per .txt file in this directory.
        return [{"filename": f} for f in files]

class DirectoryBatchFlow(BatchFlow):
    def prep(self, shared):
        directories = ["/path/to/dirA", "/path/to/dirB"]
        # One param dict per directory; each triggers a run of the inner flow.
        return [{"directory": d} for d in directories]

# The inner flow runs once per file; the outer flow runs the inner flow once per directory.
inner_flow = FileBatchFlow(start=MapSummaries())
outer_flow = DirectoryBatchFlow(start=inner_flow)
```
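
To make the merging concrete, here is a minimal sketch in plain Python of what the innermost node would see. It does not use the library: `merge_params` is a hypothetical helper, the file names are made up, and the rule that a level's own keys win on a conflict is an assumption, since the text above does not say which side takes precedence.

```python
def merge_params(parent_params, own_params):
    """Hypothetical stand-in for the merge step (not the library's code).

    Returns a new dict holding the parent's entries plus this level's own.
    Assumption: on a key conflict, this level's value wins.
    """
    return {**parent_params, **own_params}

# Param dicts as the outer and inner prep() calls might return them.
outer_batches = [{"directory": "/path/to/dirA"}, {"directory": "/path/to/dirB"}]
inner_batches = [{"filename": "a.txt"}, {"filename": "b.txt"}]  # made-up file names

# Walk the nesting: each inner item is merged on top of its outer item.
innermost_params = [
    merge_params(merge_params({}, outer), inner)
    for outer in outer_batches
    for inner in inner_batches
]

for params in innermost_params:
    print(params)
```

Each printed dict carries both `directory` and `filename`, which is the "entire context" the innermost node works with.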
@@ -17,7 +17,7 @@ If you know memory management, **Shared Store** is like a **heap** shared across
### Why Not Use Other Communication Models like Message Passing?
**Message passing** works well for simple DAGs (e.g., for data pipelines), but with **nested graphs** (Flows containing Flows, repeated or cyclic calls), routing messages becomes hard to maintain. A shared store keeps the design simple and easy.

**Message passing** works well for simple DAGs, but with **nested graphs** (Flows containing Flows, repeated or cyclic calls), routing messages becomes hard to maintain. A shared store keeps the design simple and easy.

---