update .cursorrules
parent 4a9dd5bca0
commit a2447f0bb6

@@ -1,155 +1,24 @@
================================================
File: docs/guide.md
================================================
---
layout: default
title: "Design Guidance"
parent: "Apps"
nav_order: 1
---

# LLM System Design Guidance

## Example LLM Project File Structure

```
my_project/
├── main.py
├── flow.py
├── utils/
│   ├── __init__.py
│   ├── call_llm.py
│   └── search_web.py
├── tests/
│   ├── __init__.py
│   ├── test_flow.py
│   └── test_nodes.py
├── requirements.txt
└── docs/
    └── design.md
```

### `docs/`

Stores the documentation of the project.

It should include a `design.md` file, which describes:

- Project requirements
- Required utility functions
- High-level flow with a Mermaid diagram
- Shared memory data structure
- For each node, discuss:
  - Node purpose and design (e.g., should it be a batch or async node?)
  - How the data shall be read (for `prep`) and written (for `post`)
  - How the data shall be processed (for `exec`)

### `utils/`

Houses functions for external API calls (e.g., LLMs, web searches, etc.).

It's recommended to dedicate one Python file per API call, with names like `call_llm.py` or `search_web.py`. Each file should include:

- The function to call the API
- A main function to run that API call

For instance, here's a simplified `call_llm.py` example:

```python
from openai import OpenAI

def call_llm(prompt):
    client = OpenAI(api_key="YOUR_API_KEY_HERE")
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}]
    )
    return response.choices[0].message.content

def main():
    prompt = "Hello, how are you?"
    print(call_llm(prompt))

if __name__ == "__main__":
    main()
```

### `main.py`

Serves as the project's entry point.

### `flow.py`

Implements the application's flow, starting with the node definitions, followed by the flow structure.

### `tests/`

Optionally contains all tests. Use `pytest` for testing flows, nodes, and utility functions.
For example, `test_call_llm.py` might look like:

```python
from utils.call_llm import call_llm

def test_call_llm():
    prompt = "Hello, how are you?"
    assert call_llm(prompt) is not None
```

## System Design Steps

1. **Project Requirements**
   - Identify the project's core entities.
   - Define each functional requirement and map out how these entities interact step by step.

2. **Utility Functions**
   - Determine the low-level utility functions you'll need (e.g., for LLM calls, web searches, file handling).
   - Implement these functions and write basic tests to confirm they work correctly.

3. **Flow Design**
   - Develop a high-level process flow that meets the project's requirements.
   - Specify which utility functions are used at each step.
   - Identify possible decision points for *Node Actions* and data-intensive operations for *Batch* tasks.
   - Illustrate the flow with a Mermaid diagram.

4. **Data Structure**
   - Decide how to store and update state, whether in memory (for smaller applications) or a database (for larger or persistent needs).
   - Define data schemas or models that detail how information is stored, accessed, and updated.

5. **Implementation**
   - Start coding with a simple, direct approach (avoid over-engineering at first).
   - For each node in your flow:
     - **prep**: Determine how data is accessed or retrieved.
     - **exec**: Outline the actual processing or logic needed.
     - **post**: Handle any final updates or data persistence tasks.

6. **Optimization**
   - **Prompt Engineering**: Use clear and specific instructions with illustrative examples to reduce ambiguity.
   - **Task Decomposition**: Break large, complex tasks into manageable, logical steps.

7. **Reliability**
   - **Structured Output**: Verify outputs conform to the required format. Consider increasing `max_retries` if needed.
   - **Test Cases**: Develop clear, reproducible tests for each part of the flow.
   - **Self-Evaluation**: Introduce an additional Node (powered by LLMs) to review outputs when the results are uncertain.
================================================
File: docs/agent.md
================================================
---
layout: default
title: "Agent"
parent: "Design"
nav_order: 6
---

# Agent

Agent is a powerful design pattern, where a node can take dynamic actions based on the context it receives.
To express an agent, create a Node (the agent) with [branching](./flow.md) to other nodes (Actions).

> The core of building **performant** and **reliable** agents boils down to:
>
> 1. **Context Management:** Provide *clear, relevant context* so agents can understand the problem. E.g., rather than dumping an entire chat history or entire files, use a [Workflow](./decomp.md) that filters out and includes only the most relevant information.
>
> 2. **Action Space:** Define *a well-structured, unambiguous, and easy-to-use* set of actions. For instance, avoid creating overlapping actions like `read_databases` and `read_csvs`. Instead, unify data sources (e.g., move CSVs into a database) and design a single action. The action can be parameterized (e.g., a string for search) or programmable (e.g., SQL queries).
{: .best-practice }

### Example: Search Agent

@@ -234,8 +103,6 @@ flow = Flow(start=decide)
flow.run({"query": "Who won the Nobel Prize in Physics 2024?"})
```
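The decide/act loop behind such an agent can be sketched outside the framework as plain Python; `decide`, `search`, and the action names below are illustrative stand-ins, not PocketFlow API:

```python
# Hedged sketch: an agent picks one action per step based on its context.
# decide() stands in for an LLM call that returns an action name.
def decide(context):
    # pretend the LLM chooses "search" until context exists, then "answer"
    return "answer" if context else "search"

def search(query):
    # stand-in for a real web-search utility
    return f"results for {query}"

def run_agent(query, max_steps=5):
    context = None
    for _ in range(max_steps):
        action = decide(context)  # branch on the chosen action
        if action == "search":
            context = search(query)
        elif action == "answer":
            return f"answer based on: {context}"
    return "gave up"

print(run_agent("Nobel Prize in Physics 2024"))
# answer based on: results for Nobel Prize in Physics 2024
```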
================================================
File: docs/async.md
================================================

@@ -436,10 +303,8 @@ Nodes and Flows **communicate** in two ways:

If you know memory management, think of the **Shared Store** like a **heap** (shared by all function calls), and **Params** like a **stack** (assigned by the caller).

> Use `Shared Store` for almost all cases. It's flexible and easy to manage. It separates *Data Schema* from *Compute Logic*, making the code easier to maintain. `Params` is more a syntax sugar for [Batch](./batch.md).
{: .best-practice }
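As a rough plain-Python analogy of the two channels (the function names below are illustrative, not the framework's API):

```python
# Hedged sketch: shared store (heap-like) vs. params (stack-like).
shared = {}  # one dict visible to every node, like a heap

def node_a(shared, params):
    # params are per-call values assigned by the caller (stack-like)
    shared["greeting"] = f"hello, {params['name']}"

def node_b(shared, params):
    # reads what node_a wrote into the shared store
    return shared["greeting"].upper()

node_a(shared, {"name": "pocketflow"})
print(node_b(shared, {}))  # HELLO, POCKETFLOW
```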
---

@@ -551,14 +416,21 @@ File: docs/decomp.md
================================================
---
layout: default
title: "Workflow"
parent: "Design"
nav_order: 2
---

# Workflow

Many real-world tasks are too complex for one LLM call. The solution is to decompose them into a [chain](./flow.md) of multiple Nodes.

> - You don't want to make each task **too coarse**, because it may be *too complex for one LLM call*.
> - You don't want to make each task **too granular**, because then *the LLM call doesn't have enough context* and results are *not consistent across nodes*.
>
> You usually need multiple *iterations* to find the *sweet spot*. If the task has too many *edge cases*, consider using [Agents](./agent.md).
{: .best-practice }
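A minimal sketch of such a chain, with `stub_llm` standing in for a real LLM call (the step names are illustrative):

```python
# Hedged sketch: decomposing one big task into a chain of smaller calls.
def stub_llm(prompt):
    # stand-in for a real LLM call; echoes the prompt in brackets
    return f"[{prompt}]"

def outline(topic):
    return stub_llm(f"Outline: {topic}")

def draft(outline_text):
    return stub_llm(f"Draft from {outline_text}")

def polish(draft_text):
    return stub_llm(f"Polish {draft_text}")

# each step consumes the previous step's output
article = polish(draft(outline("LLM frameworks")))
print(article)  # [Polish [Draft from [Outline: LLM frameworks]]]
```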
### Example: Article Writing

@@ -932,6 +804,123 @@ flowchart LR
================================================
File: docs/guide.md
================================================
---
layout: default
title: "Design Guidance"
parent: "Apps"
nav_order: 1
---

# LLM System Design Guidance

## System Design Steps

1. **Project Requirements**
   - Identify the project's core entities, and provide a step-by-step user story.
   - Define a list of both functional and non-functional requirements.

2. **Utility Functions**
   - Determine the utility functions on which this project depends (e.g., for LLM calls, web searches, file handling).
   - Implement these functions and write basic tests to confirm they work correctly.

> After this step, don't jump straight into building an LLM system.
>
> First, make sure you clearly understand the problem by manually solving it using some example inputs.
>
> It's always easier to first build a solid intuition about the problem and its solution, then focus on automating the process.
{: .warning }
3. **Flow Design**
   - Build a high-level design of the flow of nodes (for example, using a Mermaid diagram) to automate the solution.
   - For each node in your flow, specify:
     - **prep**: How data is accessed or retrieved.
     - **exec**: The specific utility function to use (ideally one function per node).
     - **post**: How data is updated or persisted.
   - Identify potential design patterns, such as Batch, Agent, or RAG.
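The prep/exec/post split above can be sketched as a stand-alone class; this is a plain-Python stand-in for illustration, not the framework's actual `Node` base class:

```python
# Hedged sketch: the prep/exec/post contract of a single node.
class SummarizeNode:
    def prep(self, shared):
        # read input from the shared store
        return shared["text"]

    def exec(self, text):
        # call exactly one utility function (stubbed here as truncation)
        return text[:12] + "..."

    def post(self, shared, prep_res, exec_res):
        # write the result back to the shared store
        shared["summary"] = exec_res

    def run(self, shared):
        p = self.prep(shared)
        e = self.exec(p)
        return self.post(shared, p, e)

shared = {"text": "A very long document about flows."}
SummarizeNode().run(shared)
print(shared["summary"])  # A very long ...
```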
4. **Data Structure**
   - Decide how you will store and update state (in memory for smaller applications or in a database for larger, persistent needs).
   - If it isn't straightforward, define data schemas or models detailing how information is stored, accessed, and updated.
   - As you finalize your data structure, you may need to refine your flow design.

5. **Implementation**
   - For each node, implement the **prep**, **exec**, and **post** functions based on the flow design.
   - Start coding with a simple, direct approach (avoid over-engineering at first).
   - Add logging throughout the code to facilitate debugging.
6. **Optimization**
   - **Prompt Engineering**: Use clear, specific instructions with illustrative examples to reduce ambiguity.
   - **Task Decomposition**: Break large or complex tasks into manageable, logical steps.

7. **Reliability**
   - **Structured Output**: Ensure outputs conform to the required format. Consider increasing `max_retries` if needed.
   - **Test Cases**: Develop clear, reproducible tests for each part of the flow.
   - **Self-Evaluation**: Introduce an additional node (powered by LLMs) to review outputs when results are uncertain.
## Example LLM Project File Structure

```
my_project/
├── main.py
├── flow.py
├── utils/
│   ├── __init__.py
│   ├── call_llm.py
│   └── search_web.py
├── requirements.txt
└── docs/
    └── design.md
```
### `docs/`

Holds all project documentation. Include a `design.md` file covering:

- Project requirements
- Utility functions
- High-level flow (with a Mermaid diagram)
- Shared memory data structure
- Node designs:
  - Purpose and design (e.g., batch or async)
  - Data read (`prep`) and write (`post`)
  - Data processing (`exec`)
### `utils/`

Houses functions for external API calls (e.g., LLMs, web searches, etc.). It's recommended to dedicate one Python file per API call, with names like `call_llm.py` or `search_web.py`. Each file should include:

- The function to call the API
- A main function to run that API call for testing

For instance, here's a simplified `call_llm.py` example:

```python
from openai import OpenAI

def call_llm(prompt):
    client = OpenAI(api_key="YOUR_API_KEY_HERE")
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}]
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    prompt = "Hello, how are you?"
    print(call_llm(prompt))
```
### `main.py`

Serves as the project's entry point.

### `flow.py`

Implements the application's flow, starting with the node definitions, followed by the flow structure.
================================================
File: docs/index.md
================================================

@@ -956,7 +945,7 @@ We model the LLM workflow as a **Nested Directed Graph**:

<div align="center">
  <img src="https://github.com/the-pocket/PocketFlow/raw/main/assets/meme.jpg?raw=true" width="400"/>
</div>
@@ -974,23 +963,21 @@ We model the LLM workflow as a **Nested Directed Graph**:
- [(Advanced) Async](./async.md)
- [(Advanced) Parallel](./parallel.md)

## Utility Functions

- [LLM Wrapper](./llm.md)
- [Tool](./tool.md)
- [Viz and Debug](./viz.md)
- Chunking

> We do not provide built-in utility functions. Example implementations are provided as reference.
{: .warning }

## Design Patterns

- [Structured Output](./structure.md)
- [Workflow](./decomp.md)
- [Map Reduce](./mapreduce.md)
- [RAG](./rag.md)
- [Chat Memory](./memory.md)

@@ -1012,7 +999,7 @@ File: docs/llm.md
---
layout: default
title: "LLM Wrapper"
parent: "Utility"
nav_order: 1
---

@@ -1113,13 +1100,19 @@ File: docs/mapreduce.md
---
layout: default
title: "Map Reduce"
parent: "Design"
nav_order: 3
---

# Map Reduce

MapReduce is a design pattern suitable when you have either:

- Large input data (e.g., multiple files to process), or
- Large output data (e.g., multiple forms to fill)

and there is a logical way to break the task into smaller, ideally independent parts.

You first break down the task using [BatchNode](./batch.md) in the map phase, followed by aggregation in the reduce phase.
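Stripped of the framework, the map/reduce split looks like this; `summarize` is a stub standing in for a real LLM call:

```python
# Hedged sketch: map each chunk independently, then reduce the results.
def summarize(text):
    # placeholder for something like call_llm(f"Summarize: {text}")
    return text[:10]

def map_phase(docs):
    # "map": process each chunk independently (a BatchNode's exec per item)
    return [summarize(d) for d in docs]

def reduce_phase(summaries):
    # "reduce": aggregate the per-chunk results
    return " | ".join(summaries)

docs = ["first document text", "second document text"]
print(reduce_phase(map_phase(docs)))  # first docu | second doc
```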
### Example: Document Summarization
@@ -1151,7 +1144,7 @@ File: docs/memory.md
---
layout: default
title: "Chat Memory"
parent: "Design"
nav_order: 5
---

@@ -1197,59 +1190,81 @@ We can:
2. Use [vector search](./tool.md) to retrieve relevant exchanges beyond the last 4.

```python
################################
# Node A: Retrieve user input & relevant messages
################################
class ChatRetrieve(Node):
    def prep(self, s):
        s.setdefault("history", [])
        s.setdefault("memory_index", None)
        user_input = input("You: ")
        # exec has no access to the shared store, so pass along what it needs
        return user_input, s["history"], s["memory_index"]

    def exec(self, inputs):
        user_input, history, memory_index = inputs
        emb = get_embedding(user_input)
        relevant = []
        if len(history) > 8 and memory_index:
            idx, _ = search_index(memory_index, emb, top_k=2)
            relevant = [history[i[0]] for i in idx]
        return user_input, relevant

    def post(self, s, p, r):
        user_input, relevant = r
        s["user_input"] = user_input
        s["relevant"] = relevant
        return "continue"

################################
# Node B: Call LLM, update history + index
################################
class ChatReply(Node):
    def prep(self, s):
        user_input = s["user_input"]
        recent = s["history"][-8:]
        relevant = s.get("relevant", [])
        return user_input, recent, relevant

    def exec(self, inputs):
        user_input, recent, relevant = inputs
        msgs = [{"role": "system", "content": "You are a helpful assistant."}]
        if relevant:
            msgs.append({"role": "system", "content": f"Relevant: {relevant}"})
        msgs.extend(recent)
        msgs.append({"role": "user", "content": user_input})
        ans = call_llm(msgs)
        return ans

    def post(self, s, pre, ans):
        user_input, _, _ = pre
        s["history"].append({"role": "user", "content": user_input})
        s["history"].append({"role": "assistant", "content": ans})

        # Manage memory index
        if len(s["history"]) == 8:
            embs = []
            for i in range(0, 8, 2):
                text = s["history"][i]["content"] + " " + s["history"][i+1]["content"]
                embs.append(get_embedding(text))
            s["memory_index"] = create_index(embs)
        elif len(s["history"]) > 8:
            text = s["history"][-2]["content"] + " " + s["history"][-1]["content"]
            new_emb = np.array([get_embedding(text)]).astype('float32')
            s["memory_index"].add(new_emb)

        print(f"Assistant: {ans}")
        return "continue"

################################
# Flow wiring
################################
retrieve = ChatRetrieve()
reply = ChatReply()
retrieve - "continue" >> reply
reply - "continue" >> retrieve

flow = Flow(start=retrieve)
shared = {}
flow.run(shared)
```
@@ -1259,7 +1274,7 @@ File: docs/multi_agent.md
---
layout: default
title: "(Advanced) Multi-Agents"
parent: "Design"
nav_order: 7
---

@@ -1268,6 +1283,8 @@ nav_order: 7
Multiple [Agents](./flow.md) can work together by handling subtasks and communicating the progress.
Communication between agents is typically implemented using message queues in shared storage.

> Most of the time, you don't need Multi-Agents. Start with a simple solution first.
{: .best-practice }

### Example Agent Communication: Message Queue
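A hedged sketch of the message-queue idea using `asyncio.Queue`; the names are illustrative and the actual example in the docs may differ:

```python
import asyncio

# Hedged sketch: one agent produces messages, another consumes them
# through a shared queue.
async def producer(queue):
    for i in range(3):
        await queue.put(f"msg {i}")
    await queue.put(None)  # sentinel: no more messages

async def consumer(queue):
    received = []
    while True:
        msg = await queue.get()
        if msg is None:
            break
        received.append(msg)
    return received

async def main():
    queue = asyncio.Queue()
    # run both agents concurrently; they coordinate only via the queue
    _, received = await asyncio.gather(producer(queue), consumer(queue))
    return received

print(asyncio.run(main()))  # ['msg 0', 'msg 1', 'msg 2']
```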
@@ -1548,18 +1565,6 @@ print("Action returned:", action_result)  # "default"
print("Summary stored:", shared["summary"])
```
================================================
File: docs/paradigm.md
================================================
---
layout: default
title: "Paradigm"
nav_order: 4
has_children: true
---

================================================
File: docs/parallel.md
================================================
@@ -1577,6 +1582,14 @@ nav_order: 6
> Because of Python's GIL, parallel nodes and flows can't truly parallelize CPU-bound tasks (e.g., heavy numerical computations). However, they excel at overlapping I/O-bound work—like LLM calls, database queries, API requests, or file I/O.
{: .warning }

> - **Ensure Tasks Are Independent**: If each item depends on the output of a previous item, **do not** parallelize.
>
> - **Beware of Rate Limits**: Parallel calls can **quickly** trigger rate limits on LLM services. You may need a **throttling** mechanism (e.g., semaphores or sleep intervals).
>
> - **Consider Single-Node Batch APIs**: Some LLMs offer a **batch inference** API where you can send multiple prompts in a single call. This is more complex to implement but can be more efficient than launching many parallel requests and mitigates rate limits.
{: .best-practice }

## AsyncParallelBatchNode

Like **AsyncBatchNode**, but run `exec_async()` in **parallel**:

@@ -1613,33 +1626,13 @@ parallel_flow = SummarizeMultipleFiles(start=sub_flow)
await parallel_flow.run_async(shared)
```
================================================
File: docs/preparation.md
================================================
---
layout: default
title: "Details"
nav_order: 3
has_children: true
---
================================================
File: docs/rag.md
================================================
---
layout: default
title: "RAG"
parent: "Design"
nav_order: 4
---

@@ -1653,34 +1646,44 @@ Use [vector search](./tool.md) to find relevant context for LLM responses.
```python
class PrepareEmbeddings(Node):
    def prep(self, shared):
        return shared["texts"]

    def exec(self, texts):
        # Embed each text chunk
        embs = [get_embedding(t) for t in texts]
        return embs

    def post(self, shared, prep_res, exec_res):
        shared["search_index"] = create_index(exec_res)
        # no action string means "default"

class AnswerQuestion(Node):
    def prep(self, shared):
        question = input("Enter question: ")
        # exec has no access to the shared store, so pass along what it needs
        return question, shared["search_index"], shared["texts"]

    def exec(self, inputs):
        question, index, texts = inputs
        q_emb = get_embedding(question)
        idx, _ = search_index(index, q_emb, top_k=1)
        best_id = idx[0][0]
        relevant_text = texts[best_id]
        prompt = f"Question: {question}\nContext: {relevant_text}\nAnswer:"
        return call_llm(prompt)

    def post(self, shared, p, answer):
        print("Answer:", answer)

############################################
# Wire up the flow
prep = PrepareEmbeddings()
qa = AnswerQuestion()
prep >> qa

flow = Flow(start=prep)

# Example usage
shared = {"texts": ["I love apples", "Cats are great", "The sky is blue"]}
flow.run(shared)
```
================================================

@@ -1689,7 +1692,7 @@ File: docs/structure.md
---
layout: default
title: "Structured Output"
parent: "Design"
nav_order: 1
---

@@ -1771,6 +1774,9 @@ summary:
    return structured_result
```

> Besides using `assert` statements, another popular way to validate schemas is [Pydantic](https://github.com/pydantic/pydantic).
{: .note }
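A minimal sketch of the `assert`-based validation on an already-parsed result; the schema here is illustrative, not the docs' actual one:

```python
# Hedged sketch: validating structured output with assert statements.
def validate_summary(result):
    assert isinstance(result, dict), "output must be a mapping"
    assert "summary" in result, "missing required field: summary"
    assert isinstance(result["summary"], list), "summary must be a list"
    assert all(isinstance(s, str) for s in result["summary"]), \
        "each summary item must be a string"
    return result

parsed = {"summary": ["point one", "point two"]}
print(validate_summary(parsed)["summary"][0])  # point one
```

If validation raises, a retry wrapper (e.g. the `max_retries` mentioned elsewhere in these docs) can re-prompt the LLM.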
### Why YAML instead of JSON?

Current LLMs struggle with escaping. YAML is easier with strings since they don't always need quotes.
@@ -1804,7 +1810,7 @@ File: docs/tool.md
---
layout: default
title: "Tool"
parent: "Utility"
nav_order: 2
---

@@ -1814,7 +1820,6 @@ Similar to LLM wrappers, we **don't** provide built-in tools. Here, we recommend

---

## 1. Embedding Calls

```python
@@ -2025,7 +2030,7 @@ File: docs/viz.md
---
layout: default
title: "Viz and Debug"
parent: "Utility"
nav_order: 3
---

@@ -2162,4 +2167,3 @@ data_science_flow.run({})
```
The output would be: `Call stack: ['EvaluateModelNode', 'ModelFlow', 'DataScienceFlow']`

@@ -36,7 +36,7 @@ For a new development paradigm: **Build LLM Apps by Chatting with LLM agents, N

- **For quick questions**: Use the [GPT assistant](https://chatgpt.com/g/g-677464af36588191b9eba4901946557b-pocket-flow-assistant) (note: it uses older models not ideal for coding).
- **For one-time LLM task**: Create a [ChatGPT](https://help.openai.com/en/articles/10169521-using-projects-in-chatgpt) or [Claude](https://www.anthropic.com/news/projects) project; upload the [docs](docs) to project knowledge.
- **For LLM App development**: Use [Cursor AI](https://www.cursor.com/). Copy [.cursorrules](.cursorrules) to your project root as **[Cursor Rules](https://docs.cursor.com/context/rules-for-ai)**.

</details>