# PocketFlow Parallel Batch Node Example

This example demonstrates parallel processing using `AsyncParallelBatchNode` to summarize multiple news articles concurrently. It shows how to:

1. Process multiple items in parallel
2. Handle I/O-bound tasks efficiently
3. Manage rate limits with throttling

## What this Example Does

When you run the example:

1. It loads multiple news articles from a data directory
2. Processes them in parallel using `AsyncParallelBatchNode`
3. For each article:
   - Extracts key information
   - Generates a summary using an LLM
   - Saves the results
4. Combines all summaries into a final report

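The loading and reporting steps (1 and 4) can be sketched with the standard library alone. This is a minimal illustration, not the example's actual code: the helper names `load_articles` and `write_report`, the `*.txt` glob, and the report layout are assumptions. Only the `data/` directory and the `summaries.txt` file name come from this README.

```python
from pathlib import Path


def load_articles(data_dir: str = "data") -> dict[str, str]:
    """Read every .txt file in data_dir into a {filename: text} mapping."""
    # Hypothetical helper: sorted() keeps the processing order deterministic.
    return {
        path.name: path.read_text(encoding="utf-8")
        for path in sorted(Path(data_dir).glob("*.txt"))
    }


def write_report(summaries: dict[str, str], out_path: str = "summaries.txt") -> str:
    """Join the per-article summaries into one numbered report and save it."""
    lines = [
        f"{i}. {name}: {summary}"
        for i, (name, summary) in enumerate(summaries.items(), start=1)
    ]
    report = "\n".join(lines) + "\n"
    Path(out_path).write_text(report, encoding="utf-8")
    return report
```

Keying both helpers by file name keeps each summary traceable to its source article when the report is assembled.
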
## How it Works

The example uses `AsyncParallelBatchNode` to process articles in parallel:

```python
from pocketflow import AsyncParallelBatchNode

# call_llm_async is the example's async LLM helper (assumed to live in utils.py)

class ParallelSummarizer(AsyncParallelBatchNode):
    async def prep_async(self, shared):
        # Return list of articles to process
        return shared["articles"]

    async def exec_async(self, article):
        # Process single article (called in parallel)
        summary = await call_llm_async(f"Summarize: {article}")
        return summary

    async def post_async(self, shared, prep_res, summaries):
        # Store all summaries (one per article) in the shared store
        shared["summaries"] = summaries
        return "default"
```

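Conceptually, a parallel batch node awaits one `exec_async` call per item returned by `prep_async` and collects the results in input order. The standalone sketch below imitates that pattern with plain `asyncio`, so it runs without PocketFlow or an LLM. `fake_llm_async`, `summarize_all`, and `run_demo` are hypothetical stand-ins, and the code approximates the idea rather than reproducing the library's implementation.

```python
import asyncio
import time


async def fake_llm_async(prompt: str) -> str:
    """Hypothetical stand-in for call_llm_async: waits, then returns a stub."""
    await asyncio.sleep(0.1)  # simulate the network latency of an LLM request
    return f"summary of [{prompt}]"


async def summarize_all(articles: list[str]) -> list[str]:
    # Start one coroutine per article and await them together.
    # asyncio.gather returns results in the same order as its inputs.
    return await asyncio.gather(
        *(fake_llm_async(f"Summarize: {a}") for a in articles)
    )


def run_demo() -> tuple[list[str], float]:
    """Summarize three fake articles and report the wall-clock time."""
    start = time.perf_counter()
    summaries = asyncio.run(summarize_all(["article1", "article2", "article3"]))
    return summaries, time.perf_counter() - start


if __name__ == "__main__":
    summaries, elapsed = run_demo()
    print(summaries)
    print(f"elapsed: {elapsed:.2f}s")
```

Because the three simulated requests overlap, the elapsed time should be close to the latency of a single request rather than the sum of all three, which is why this approach suits I/O-bound work.
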
Key features demonstrated:

- Parallel execution of `exec_async`
- Rate limiting with semaphores
- Error handling for failed requests
- Progress tracking for parallel tasks

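The `ParallelSummarizer` snippet does not show the throttling itself. One common way to cap concurrency in `asyncio` code is to wrap each request in an `asyncio.Semaphore`. The sketch below assumes that approach; `MAX_CONCURRENT`, `throttled_call`, `summarize_throttled`, and the simulated request are hypothetical names, not taken from the example's source.

```python
import asyncio

MAX_CONCURRENT = 2  # assumed limit; tune it to the API's rate limits


async def throttled_call(sem, article, stats):
    """Run one simulated request while holding a semaphore slot."""
    async with sem:  # waits here while MAX_CONCURRENT requests are in flight
        stats["active"] += 1
        stats["peak"] = max(stats["peak"], stats["active"])
        await asyncio.sleep(0.05)  # placeholder for the real LLM call
        stats["active"] -= 1
        return f"summary of {article}"


async def summarize_throttled(articles):
    """Summarize all articles with at most MAX_CONCURRENT running at once."""
    sem = asyncio.Semaphore(MAX_CONCURRENT)  # created inside the running loop
    stats = {"active": 0, "peak": 0}
    results = await asyncio.gather(
        *(throttled_call(sem, a, stats) for a in articles)
    )
    return results, stats["peak"]


if __name__ == "__main__":
    articles = [f"article{i}.txt" for i in range(1, 6)]
    results, peak = asyncio.run(summarize_throttled(articles))
    print(results)
    print("peak concurrency:", peak)
```

Note that a semaphore bounds the number of in-flight requests, not the number of requests per second. If the API enforces a strict per-minute quota, an additional delay or token-bucket scheme may be needed.
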
## Project Structure

```
pocketflow-parallel-batch-node/
├── README.md
├── requirements.txt
├── data/
│   ├── article1.txt
│   ├── article2.txt
│   └── article3.txt
├── main.py
├── flow.py
├── nodes.py
└── utils.py
```

## Running the Example

```bash
# Install dependencies
pip install -r requirements.txt

# Run the example
python main.py
```

## Sample Output

```
Loading articles...
Found 3 articles to process

Processing in parallel...
[1/3] Processing article1.txt...
[2/3] Processing article2.txt...
[3/3] Processing article3.txt...

Summaries generated:
1. First article summary...
2. Second article summary...
3. Third article summary...

Final report saved to: summaries.txt
```

## Key Concepts

1. **Parallel Processing**
   - Using `AsyncParallelBatchNode` for concurrent execution
   - Managing parallel tasks efficiently

2. **Rate Limiting**
   - Using semaphores to control concurrent requests
   - Avoiding API rate limits

3. **Error Handling**
   - Graceful handling of failed requests
   - Retrying failed tasks

4. **Progress Tracking**
   - Monitoring parallel task progress
   - Providing user feedback
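
As a rough illustration of the last two concepts, the sketch below retries a flaky coroutine with exponential backoff and prints a progress line as each task finishes. Everything in it is hypothetical scaffolding (`with_retries`, `make_flaky_summarizer`, `summarize_with_progress`, the retry count, and the delays), written with plain `asyncio` rather than taken from the example's source. Inside a flow, PocketFlow nodes also accept `max_retries` and `wait` constructor arguments, which may be the simpler option.

```python
import asyncio


async def with_retries(make_call, retries=3, base_delay=0.01):
    """Await make_call(); on failure, retry with exponential backoff."""
    for attempt in range(retries):
        try:
            return await make_call()
        except Exception:
            if attempt == retries - 1:
                raise  # out of attempts: surface the last error
            await asyncio.sleep(base_delay * 2 ** attempt)


def make_flaky_summarizer(failures_before_success=1):
    """Build a fake summarizer that fails the first N calls per article."""
    attempts = {}

    async def flaky_summarize(article):
        attempts[article] = attempts.get(article, 0) + 1
        if attempts[article] <= failures_before_success:
            raise RuntimeError(f"transient error for {article}")
        return f"summary of {article}"

    return flaky_summarize


async def summarize_with_progress(articles):
    """Summarize articles concurrently, retrying failures and printing progress."""
    summarize = make_flaky_summarizer()
    done = 0

    async def worker(article):
        nonlocal done
        try:
            result = await with_retries(lambda: summarize(article))
        except Exception as exc:
            result = f"FAILED: {exc}"  # record the failure, keep the batch going
        done += 1
        print(f"[{done}/{len(articles)}] finished {article}")
        return result

    return await asyncio.gather(*(worker(a) for a in articles))


if __name__ == "__main__":
    print(asyncio.run(summarize_with_progress(["article1.txt", "article2.txt"])))
```

Catching the exception inside `worker` means one failed article yields a `FAILED: ...` entry instead of cancelling the whole batch. Whether that is preferable to failing fast depends on how the final report should treat missing summaries.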