diff --git a/docs/index.md b/docs/index.md
index d31eb01..a6cfda5 100644
--- a/docs/index.md
+++ b/docs/index.md
@@ -15,6 +15,7 @@ We model the LLM workflow as a **Nested Flow**:
- A Flow can be treated as a Node for **Nested Flows**.
- Both Nodes and Flows can be **Batched** for data-intensive tasks.
- Nodes and Flows can be **Async** for user inputs.
+- **Async** Nodes and Flows can be executed in **Parallel**.
@@ -27,6 +28,7 @@ We model the LLM workflow as a **Nested Flow**:
- [Communication](./communication.md)
- [Batch](./batch.md)
- [Async](./async.md)
+- [Parallel](./parallel.md)
## Preparation
diff --git a/docs/parallel.md b/docs/parallel.md
new file mode 100644
index 0000000..b1b8d3a
--- /dev/null
+++ b/docs/parallel.md
@@ -0,0 +1,104 @@
+---
+layout: default
+title: "Parallel"
+parent: "Core Abstraction"
+nav_order: 6
+---
+
+# Parallel
+
+**Parallel** Nodes and Flows let you run multiple tasks **concurrently**—for example, summarizing multiple texts at once. Unlike a regular **BatchNode**, which processes items one after another, **AsyncParallelBatchNode** and **AsyncParallelBatchFlow** fire off their tasks in parallel. Since the work is typically I/O-bound (waiting on LLM API responses), overlapping those calls can substantially reduce total wall-clock time.
+
+## AsyncParallelBatchNode
+
+Like **AsyncBatchNode**, but runs `exec_async()` for all items **in parallel**, while `prep_async()` and `post_async()` still run once, before and after the batch:
+
+```python
+class ParallelSummaries(AsyncParallelBatchNode):
+    async def prep_async(self, shared):
+        # e.g., multiple texts
+        return shared["texts"]
+
+    async def exec_async(self, text):
+        prompt = f"Summarize: {text}"
+        return await call_llm_async(prompt)
+
+    async def post_async(self, shared, prep_res, exec_res_list):
+        shared["summary"] = "\n\n".join(exec_res_list)
+        return "default"
+
+node = ParallelSummaries()
+flow = AsyncFlow(start=node)
+```
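+
+A minimal usage sketch (hypothetical driver code; it assumes `call_llm_async` is defined elsewhere and that `shared["texts"]` is already populated):
+
+```python
+shared = {"texts": ["first article ...", "second article ...", "third article ..."]}
+await flow.run_async(shared)  # exec_async() fires for every text concurrently
+print(shared["summary"])      # joined summaries written by post_async()
+```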
+
+## AsyncParallelBatchFlow
+
+Parallel version of **BatchFlow**. Each iteration of the sub-flow runs **concurrently** using different parameters:
+
+```python
+class SummarizeMultipleFiles(AsyncParallelBatchFlow):
+    async def prep_async(self, shared):
+        # One params dict per file; each dict drives one parallel run of the sub-flow
+        return [{"filename": f} for f in shared["files"]]
+
+sub_flow = AsyncFlow(start=LoadAndSummarizeFile())
+parallel_flow = SummarizeMultipleFiles(start=sub_flow)
+await parallel_flow.run_async(shared)
+```
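+
+For illustration, `LoadAndSummarizeFile` could be an ordinary **AsyncNode** that reads the per-iteration `filename` from its params. A minimal, hypothetical sketch (it assumes `call_llm_async` is available and the files fit in memory):
+
+```python
+class LoadAndSummarizeFile(AsyncNode):
+    async def prep_async(self, shared):
+        # Each parallel run of the sub-flow receives one {"filename": ...} dict as its params
+        with open(self.params["filename"]) as f:
+            return f.read()
+
+    async def exec_async(self, content):
+        return await call_llm_async(f"Summarize:\n{content}")
+
+    async def post_async(self, shared, prep_res, exec_res):
+        # Store each file's summary under its filename in the shared store
+        shared.setdefault("summaries", {})[self.params["filename"]] = exec_res
+        return "default"
+```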
+
+## Best Practices
+
+- **Ensure Tasks Are Independent**
+ If each item depends on the output of a previous item, **don’t** parallelize. Parallelizing dependent tasks can lead to inconsistencies or race conditions.
+
+- **Beware Rate Limits**
+ Parallel calls can **quickly** trigger rate limits on LLM services. You may need a **throttling** mechanism (e.g., semaphores or sleep intervals) to stay under vendor limits; see the semaphore sketch after this list.
+
+- **Consider Single-Node Batch APIs**
+ Some LLMs offer a **batch inference** API where you can send multiple prompts in a single call. This is more complex to implement but can be more efficient than launching many parallel requests. Conceptually, it can look similar to an **AsyncBatchNode** or **BatchNode**, but the underlying call bundles multiple items into **one** request.
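+
+As a concrete throttling example, one option is to cap in-flight calls with an `asyncio.Semaphore` inside `exec_async()`. This is a minimal sketch, not framework machinery; `call_llm_async` and the limit of 5 concurrent requests are assumptions:
+
+```python
+import asyncio
+
+class ThrottledSummaries(AsyncParallelBatchNode):
+    async def prep_async(self, shared):
+        # Create the semaphore inside the running event loop
+        self.limiter = asyncio.Semaphore(5)
+        return shared["texts"]
+
+    async def exec_async(self, text):
+        async with self.limiter:  # at most 5 LLM calls in flight at once
+            return await call_llm_async(f"Summarize: {text}")
+
+    async def post_async(self, shared, prep_res, exec_res_list):
+        shared["summary"] = "\n\n".join(exec_res_list)
+        return "default"
+```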
\ No newline at end of file