compact doc

This commit is contained in:
zachary62 2025-03-03 22:32:27 -05:00
parent 355e2573c4
commit 467c9dbbd8
17 changed files with 11 additions and 41 deletions

@ -31,5 +31,4 @@ callouts:
color: blue
best-practice:
title: Best Practice
color: green

@ -10,12 +10,9 @@ nav_order: 5
**Async** Nodes implement `prep_async()`, `exec_async()`, `exec_fallback_async()`, and/or `post_async()`. This is useful for:
1. **prep_async()**: For *fetching/reading data (files, APIs, DB)* in an I/O-friendly way.
2. **exec_async()**: Typically used for async LLM calls.
3. **post_async()**: For *awaiting user feedback*, *coordinating across multi-agents* or any additional async steps after `exec_async()`.
**Note**: `AsyncNode` must be wrapped in `AsyncFlow`. `AsyncFlow` can also include regular (sync) nodes.
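The async lifecycle above can be sketched without the framework; `AsyncNode` and `AsyncFlow` here are minimal stand-ins for illustration, not the real PocketFlow classes:

```python
# Framework-free sketch of the prep_async -> exec_async -> post_async lifecycle.
import asyncio

class AsyncNode:
    async def prep_async(self, shared):      # fetch/read data in an I/O-friendly way
        return shared.get("text", "")

    async def exec_async(self, prep_res):    # stand-in for an async LLM call
        await asyncio.sleep(0)               # placeholder for real awaited I/O
        return prep_res.upper()

    async def post_async(self, shared, prep_res, exec_res):
        shared["result"] = exec_res          # write results back
        return "default"                     # the next Action

class AsyncFlow:
    def __init__(self, start):
        self.start = start

    async def run_async(self, shared):
        node = self.start
        prep_res = await node.prep_async(shared)
        exec_res = await node.exec_async(prep_res)
        return await node.post_async(shared, prep_res, exec_res)

shared = {"text": "hello"}
asyncio.run(AsyncFlow(start=AsyncNode()).run_async(shared))
print(shared["result"])  # HELLO
```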
### Example

@ -15,7 +15,6 @@ Nodes and Flows **communicate** in two ways:
- Great for data results, large content, or anything multiple nodes need.
- You should design the data structure and populate it in advance.
2. **Params (only for [Batch](./batch.md))**
- Each node has a local, ephemeral `params` dict passed in by the **parent Flow**, used as an identifier for tasks. Parameter keys and values should be treated as **immutable**.
- Good for identifiers like filenames or numeric IDs, in Batch mode.
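The two channels can be sketched with a minimal stand-in `Node` (not the real PocketFlow class): the shared store is a plain dict every node reads and writes, while `params` is a small per-node dict the parent sets:

```python
# Stand-in sketch: shared store for data, params for task identifiers.
class Node:
    def __init__(self):
        self.params = {}

    def set_params(self, params):
        self.params = params                         # replaced, not merged

    def run(self, shared):
        filename = self.params["filename"]           # identifier from params
        shared[filename] = f"summary of {filename}"  # result into shared store

shared = {}
node = Node()
node.set_params({"filename": "doc1.txt"})
node.run(shared)
print(shared)  # {'doc1.txt': 'summary of doc1.txt'}
```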
@ -85,7 +84,6 @@ Here:
- **Set** via `set_params()`.
- **Cleared** and updated each time a parent Flow calls it.
> Only set the uppermost Flow params because others will be overwritten by the parent Flow.
>
> If you need to set child node params, see [Batch](./batch.md).
@ -125,6 +123,4 @@ flow = Flow(start=node)
# 5) Set Flow params (overwrites node params)
flow.set_params({"filename": "doc2.txt"})
flow.run(shared) # The node summarizes doc2, not doc1
```

@ -86,7 +86,6 @@ flowchart TD
- `node.run(shared)`: Just runs that node alone (calls `prep->exec->post()`), returns an Action.
- `flow.run(shared)`: Executes from the start node, follows Actions to the next node, and so on until the flow can't continue.
> `node.run(shared)` **does not** proceed to the successor.
> This is mainly for debugging or testing a single node.
>
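The distinction can be sketched with minimal stand-in classes (not the real PocketFlow implementation): `node.run` executes one node and returns its Action; `flow.run` keeps following Actions to successors:

```python
# Stand-in sketch: node.run stops after one node; flow.run follows Actions.
class Node:
    def __init__(self, name):
        self.name, self.successors = name, {}

    def __rshift__(self, other):            # a >> b: successor on "default"
        self.successors["default"] = other
        return other

    def run(self, shared):
        shared["log"].append(self.name)
        return "default"                    # the returned Action

class Flow:
    def __init__(self, start):
        self.start = start

    def run(self, shared):
        node = self.start
        while node:                         # continue until no successor matches
            action = node.run(shared)
            node = node.successors.get(action)

a, b = Node("a"), Node("b")
a >> b

shared = {"log": []}
a.run(shared)                # runs "a" alone; does NOT proceed to "b"
print(shared["log"])         # ['a']

shared = {"log": []}
Flow(start=a).run(shared)    # follows the Action to the successor
print(shared["log"])         # ['a', 'b']
```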
@ -108,7 +107,6 @@ A **Flow** is also a **Node**, so it will run `prep()` and `post()`. However:
- It **won't** run `exec()`, as its main logic is to orchestrate its nodes.
- `post()` always receives `None` for `exec_res` and should instead get the flow execution results from the shared store.
### Basic Flow Nesting
Here's how to connect a flow to another node:
@ -177,5 +175,4 @@ flowchart LR
paymentFlow --> inventoryFlow
inventoryFlow --> shippingFlow
end
```

@ -13,7 +13,6 @@ A **Node** is the smallest building block. Each Node has 3 steps `prep->exec->po
<img src="https://github.com/the-pocket/PocketFlow/raw/main/assets/node.png?raw=true" width="400"/>
</div>
1. `prep(shared)`
- **Read and preprocess data** from `shared` store.
- Examples: *query DB, read files, or serialize data into a string*.
@ -31,14 +30,11 @@ A **Node** is the smallest building block. Each Node has 3 steps `prep->exec->po
- Examples: *update DB, change states, log results*.
- **Decide the next action** by returning a *string* (`action = "default"` if *None*).
> **Why 3 steps?** To enforce the principle of *separation of concerns*: data storage and data processing are handled separately.
>
> All steps are *optional*. E.g., you can only implement `prep` and `post` if you just need to process data.
{: .note }
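The three-step split can be sketched with a framework-free stand-in (the real PocketFlow `Node` drives these calls for you); data access lives in `prep`/`post`, pure computation in `exec`:

```python
# Stand-in sketch of the prep -> exec -> post lifecycle.
class SummarizeNode:
    def prep(self, shared):                  # read from the shared store
        return shared["text"]

    def exec(self, text):                    # compute only; no shared access
        return text[:10]                     # stand-in for an LLM summary

    def post(self, shared, prep_res, exec_res):
        shared["summary"] = exec_res         # write results back
        return "default"                     # Action for the parent Flow

    def run(self, shared):
        prep_res = self.prep(shared)
        exec_res = self.exec(prep_res)
        return self.post(shared, prep_res, exec_res)

shared = {"text": "a very long document body"}
action = SummarizeNode().run(shared)
print(action, shared["summary"])  # default a very lon
```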
### Fault Tolerance & Retries
You can **retry** `exec()` if it raises an exception via two parameters when defining the Node:
@ -106,5 +102,4 @@ action_result = summarize_node.run(shared)
print("Action returned:", action_result) # "default"
print("Summary stored:", shared["summary"])
```

@ -19,7 +19,6 @@ nav_order: 6
> - **Consider Single-Node Batch APIs**: Some LLMs offer a **batch inference** API where you can send multiple prompts in a single call. This is more complex to implement but can be more efficient than launching many parallel requests and mitigates rate limits.
{: .best-practice }
## AsyncParallelBatchNode
Like **AsyncBatchNode**, but runs `exec_async()` in **parallel**:
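The parallel-batch idea reduces to plain `asyncio.gather` under the hood; this sketch uses a stand-in `exec_async` rather than the real class:

```python
# Sketch: one exec_async call per item, awaited concurrently.
import asyncio

async def exec_async(item):
    await asyncio.sleep(0)        # placeholder for a real async LLM call
    return item * 2

async def run_parallel(items):
    # gather schedules all calls at once and preserves input order
    return await asyncio.gather(*(exec_async(i) for i in items))

print(asyncio.run(run_parallel([1, 2, 3])))  # [2, 4, 6]
```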
@ -54,4 +53,4 @@ class SummarizeMultipleFiles(AsyncParallelBatchFlow):
sub_flow = AsyncFlow(start=LoadAndSummarizeFile())
parallel_flow = SummarizeMultipleFiles(start=sub_flow)
await parallel_flow.run_async(shared)
```

@ -98,5 +98,4 @@ search - "decide" >> decide # Loop back
flow = Flow(start=decide)
flow.run({"query": "Who won the Nobel Prize in Physics 2024?"})
```

@ -14,7 +14,6 @@ MapReduce is a design pattern suitable when you have either:
and there is a logical way to break the task into smaller, ideally independent parts.
You first break down the task using [BatchNode](../core_abstraction/batch.md) in the map phase, followed by aggregation in the reduce phase.
### Example: Document Summarization
```python
@ -36,4 +35,4 @@ map_node >> reduce_node
# Create flow
summarize_flow = Flow(start=map_node)
summarize_flow.run(shared)
```

@ -122,4 +122,4 @@ reply - "continue" >> retrieve
flow = Flow(start=retrieve)
shared = {}
flow.run(shared)
```

@ -70,7 +70,6 @@ Agent received: Network connectivity: stable | timestamp_2
Agent received: Processing load: optimal | timestamp_3
```
### Interactive Multi-Agent Example: Taboo Game
Here's a more complex example where two agents play the word-guessing game Taboo.

@ -9,7 +9,6 @@ nav_order: 2
Many real-world tasks are too complex for one LLM call. The solution is to decompose them into a [chain](../core_abstraction/flow.md) of multiple Nodes.
> - You don't want to make each task **too coarse**, because it may be *too complex for one LLM call*.
> - You don't want to make each task **too granular**, because then *the LLM call doesn't have enough context* and results are *not consistent across nodes*.
>
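The right granularity can be sketched framework-free: each step is one focused task that reads what the previous step wrote to the shared store (step names here are hypothetical):

```python
# Sketch of a two-step chain communicating through a shared dict.
def make_outline(shared):
    shared["outline"] = f"outline for {shared['topic']}"

def write_draft(shared):
    shared["draft"] = f"draft from {shared['outline']}"

shared = {"topic": "AI Safety"}
for step in (make_outline, write_draft):    # a fixed two-step chain
    step(shared)
print(shared["draft"])  # draft from outline for AI Safety
```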
@ -47,4 +46,4 @@ shared = {"topic": "AI Safety"}
writing_flow.run(shared)
```
For *dynamic cases*, consider using [Agents](./agent.md).

@ -11,7 +11,6 @@ nav_order: 1
A [100-line](https://github.com/the-pocket/PocketFlow/blob/main/pocketflow/__init__.py) minimalist LLM framework for *Agents, Task Decomposition, RAG, etc*.
We model the LLM workflow as a **Graph + Shared Store**:
- **Nodes** handle simple (LLM) tasks.
- Nodes connect through **Actions** (labeled edges) for *Agents*.
@ -21,17 +20,13 @@ We model the LLM workflow as a **Graph + Shared Store**:
- **Batch** Nodes/Flows for data-intensive tasks.
- **Async** Nodes/Flows allow waits for asynchronous tasks.
<div align="center">
<img src="https://github.com/the-pocket/PocketFlow/raw/main/assets/meme.jpg?raw=true" width="400"/>
</div>
> Have questions? Chat with [AI Assistant](https://chatgpt.com/g/g-677464af36588191b9eba4901946557b-mini-llm-flow-assistant)
{: .note }
## Core Abstraction
- [Node](./core_abstraction/node.md)

@ -77,7 +77,6 @@ class SummarizeNode(Node):
return call_llm(f"Summarize: {text}", self.cur_retry==0)
```
- Enable logging:
```python
@ -93,4 +92,4 @@ def call_llm(prompt):
I believe it is a **bad practice** to provide LLM-specific implementations in a general framework:
- **LLM APIs change frequently**. Hardcoding them makes maintenance a nightmare.
- You may need **flexibility** to switch vendors, use fine-tuned models, or deploy local LLMs.
- You may need **optimizations** like prompt caching, request batching, or response streaming.

@ -68,7 +68,6 @@ def execute_sql(query):
return result
```
> ⚠️ Beware of SQL injection risk
{: .warning }
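The standard mitigation is to bind user input as parameters instead of interpolating it into the SQL string; a minimal sketch with `sqlite3`:

```python
# Sketch: parameterized query keeps user input out of the SQL text.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.execute("INSERT INTO users VALUES ('alice')")

def find_user(name):
    # the ? placeholder is bound by the driver, never spliced into the query
    return conn.execute("SELECT name FROM users WHERE name = ?", (name,)).fetchall()

print(find_user("alice"))             # [('alice',)]
print(find_user("alice' OR '1'='1"))  # [] -- injection attempt matches nothing
```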

@ -137,6 +137,4 @@ data_science_flow = DataScienceFlow(start=data_prep_node)
data_science_flow.run({})
```
The output would be: `Call stack: ['EvaluateModelNode', 'ModelFlow', 'DataScienceFlow']`