update cursor rule

zachary62 2025-03-11 19:27:43 -04:00
parent 71d650bb64
commit c531387685
1 changed file with 174 additions and 63 deletions


@@ -6,9 +6,9 @@ layout: default
title: "Build your LLM App" title: "Build your LLM App"
--- ---
# LLM Application Development Playbook # LLM System Design Playbook
> If you are an AI assistant involved in building LLM Apps, read this guide **VERY, VERY** carefully! This is the most important chapter in the entire document. Throughout development, you should always (1) start with a small and simple solution, (2) design at a high level (`docs/design.md`) before implementation, and (3) frequently ask humans for feedback and clarification. > If you are an AI assistant involved in building LLM Systems, read this guide **VERY, VERY** carefully! This is the most important chapter in the entire document. Throughout development, you should always (1) start with a small and simple solution, (2) design at a high level (`docs/design.md`) before implementation, and (3) frequently ask humans for feedback and clarification.
{: .warning } {: .warning }
## System Design Steps ## System Design Steps
@@ -17,48 +17,54 @@ These system designs should be a collaboration between humans and AI assistants:
| Stage | Human | AI | Comment |
|:-----------------------|:----------:|:----------:|:------------------------------------------------------------------------|
| 1. Requirements | ★★★ High | ★☆☆ Low | Humans understand the requirements and context. |
| 2. Flow | ★★☆ Medium | ★★☆ Medium | Humans specify the high-level design, and the AI fills in the details. |
| 3. Utilities | ★★☆ Medium | ★★☆ Medium | Humans provide available external APIs and integrations, and the AI helps with implementation. |
| 4. Node | ★☆☆ Low | ★★★ High | The AI helps design the node types and data handling based on the flow. |
| 5. Implementation | ★☆☆ Low | ★★★ High | The AI implements the flow based on the design. |
| 6. Optimization | ★★☆ Medium | ★★☆ Medium | Humans evaluate the results, and the AI helps optimize. |
| 7. Reliability | ★☆☆ Low | ★★★ High | The AI writes test cases and addresses corner cases. |

1. **Requirements**: Clarify the requirements for your project, and evaluate whether an AI system is a good fit. AI systems are:
    - suitable for routine tasks that require common sense (e.g., filling out forms, replying to emails).
    - suitable for creative tasks where all inputs are provided (e.g., building slides, writing SQL).
    - **NOT** suitable for tasks that are highly ambiguous and require complex information (e.g., building a startup).
    - > **If a human can't solve it, an LLM can't automate it!** Before building an LLM system, thoroughly understand the problem by manually solving example inputs to develop intuition.
      {: .best-practice }
2. **Flow Design**: Outline at a high level how your AI system orchestrates nodes.
    - Identify applicable design patterns (e.g., [Map Reduce](./design_pattern/mapreduce.md), [Agent](./design_pattern/agent.md), [RAG](./design_pattern/rag.md)).
    - For each node, provide a high-level purpose description.
    - Draw the flow in a mermaid diagram.
3. **Utilities**: Based on the flow design, identify and implement the necessary utility functions.
    - Think of your AI system as the brain. It needs a body—these *external utility functions*—to interact with the real world:
      <div align="center"><img src="https://github.com/the-pocket/PocketFlow/raw/main/assets/utility.png?raw=true" width="400"/></div>
        - Reading inputs (e.g., retrieving Slack messages, reading emails)
        - Writing outputs (e.g., generating reports, sending emails)
        - Using external tools (e.g., calling LLMs, searching the web)
    - NOTE: *LLM-based tasks* (e.g., summarizing text, analyzing sentiment) are **NOT** utility functions; rather, they are *core functions* internal to the AI system.
    - > **Start small!** Only include the most important ones to begin with!
      {: .best-practice }
4. **Node Design**: Plan how each node will read and write data, and use utility functions (see the minimal sketch after this list).
    - Start with the shared data design:
        - For simple systems, use an in-memory dictionary.
        - For more complex systems or when persistence is required, use a database.
        - **Remove Data Redundancy**: Don't store the same data in multiple places; use in-memory references or foreign keys instead.
    - For each node, design its type and data handling:
        - `type`: Decide between Regular, Batch, or Async
        - `prep`: How the node reads data
        - `exec`: Which utility function this node uses
        - `post`: How the node writes data
5. **Implementation**: Implement the initial nodes and flows based on the design.
    - **“Keep it simple, stupid!”** Avoid complex features and full-scale type checking.
    - **FAIL FAST**! Avoid `try` logic so you can quickly identify any weak points in the system.
    - Add logging throughout the code to facilitate debugging.
6. **Optimization**:
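To make steps 2 and 4 concrete, here is a minimal sketch of a single Regular node wired into a flow, following the `prep`/`exec`/`post` contract above. The `pocketflow` import and the `call_llm` utility are assumptions for illustration, not part of this guide:

```python
# A minimal sketch, assuming a pocketflow-style Node/Flow API and a
# hypothetical call_llm utility (see the utils/ convention below).
from pocketflow import Node, Flow
from utils.call_llm import call_llm

class SummarizeText(Node):
    def prep(self, shared):
        # Read input from the shared store.
        return shared["text"]

    def exec(self, text):
        # One utility call per node keeps the design easy to reason about.
        return call_llm(f"Summarize in one sentence:\n{text}")

    def post(self, shared, prep_res, exec_res):
        # Write the result back to the shared store.
        shared["summary"] = exec_res

summarize = SummarizeText()
flow = Flow(start=summarize)

shared = {"text": "Some long document text..."}  # in-memory shared store
flow.run(shared)
print(shared["summary"])
```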
@@ -97,7 +103,7 @@ my_project/
- **`utils/`**: Contains all utility functions (see the sketch below).
    - It's recommended to dedicate one Python file to each API call, for example `call_llm.py` or `search_web.py`.
    - Each file should also include a `main()` function to try that API call.
- **`flow.py`**: Implements the system's flow, starting with node definitions followed by the overall structure.
- **`main.py`**: Serves as the project's entry point.
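As an illustration of that convention, a minimal `utils/call_llm.py` might look like the following sketch (the OpenAI client is an assumption here; substitute whichever provider you actually use):

```python
# utils/call_llm.py: one file per API call, plus a main() to try it.
# Assumes the openai package; swap in your actual provider.
from openai import OpenAI

def call_llm(prompt):
    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

def main():
    # Quick manual check for this API call.
    print(call_llm("In one word, what color is the sky?"))

if __name__ == "__main__":
    main()
```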
================================================
@@ -1291,52 +1297,157 @@ nav_order: 4
# RAG (Retrieval Augmented Generation)

For certain LLM tasks like answering questions, providing relevant context is essential. One common architecture is a **two-stage** RAG pipeline:

<div align="center">
  <img src="https://github.com/the-pocket/PocketFlow/raw/main/assets/rag.png?raw=true" width="400"/>
</div>

1. **Offline stage**: Preprocess and index documents ("building the index").
2. **Online stage**: Given a question, generate answers by retrieving the most relevant context.

---

## Stage 1: Offline Indexing

We create three Nodes:

1. `ChunkDocs` [chunks](../utility_function/chunking.md) raw text.
2. `EmbedDocs` [embeds](../utility_function/embedding.md) each chunk.
3. `StoreIndex` stores embeddings into a [vector database](../utility_function/vector.md).
```python
class ChunkDocs(BatchNode):
    def prep(self, shared):
        # A list of file paths in shared["files"]. We process each file.
        return shared["files"]

    def exec(self, filepath):
        # Read file content. In real usage, do error handling.
        with open(filepath, "r", encoding="utf-8") as f:
            text = f.read()
        # Chunk by 100 chars each.
        chunks = []
        size = 100
        for i in range(0, len(text), size):
            chunks.append(text[i : i + size])
        return chunks

    def post(self, shared, prep_res, exec_res_list):
        # exec_res_list is a list of chunk-lists, one per file.
        # Flatten them all into a single list of chunks.
        all_chunks = []
        for chunk_list in exec_res_list:
            all_chunks.extend(chunk_list)
        shared["all_chunks"] = all_chunks

class EmbedDocs(BatchNode):
    def prep(self, shared):
        return shared["all_chunks"]

    def exec(self, chunk):
        return get_embedding(chunk)

    def post(self, shared, prep_res, exec_res_list):
        # Store the list of embeddings.
        shared["all_embeds"] = exec_res_list
        print(f"Total embeddings: {len(exec_res_list)}")

class StoreIndex(Node):
    def prep(self, shared):
        # We'll read all embeds from shared.
        return shared["all_embeds"]

    def exec(self, all_embeds):
        # Create a vector index (faiss or another DB in real usage).
        index = create_index(all_embeds)
        return index

    def post(self, shared, prep_res, index):
        shared["index"] = index

# Wire them in sequence
chunk_node = ChunkDocs()
embed_node = EmbedDocs()
store_node = StoreIndex()
chunk_node >> embed_node >> store_node

OfflineFlow = Flow(start=chunk_node)
```
Usage example:
```python
shared = {
    "files": ["doc1.txt", "doc2.txt"],  # any text files
}
OfflineFlow.run(shared)
```
---
## Stage 2: Online Query & Answer
We have 3 nodes:
1. `EmbedQuery` embeds the user's question.
2. `RetrieveDocs` retrieves the top chunk from the index.
3. `GenerateAnswer` calls the LLM with the question + chunk to produce the final answer.
```python
class EmbedQuery(Node):
    def prep(self, shared):
        return shared["question"]

    def exec(self, question):
        return get_embedding(question)

    def post(self, shared, prep_res, q_emb):
        shared["q_emb"] = q_emb

class RetrieveDocs(Node):
    def prep(self, shared):
        # We'll need the query embedding, plus the offline index/chunks.
        return shared["q_emb"], shared["index"], shared["all_chunks"]

    def exec(self, inputs):
        q_emb, index, chunks = inputs
        I, D = search_index(index, q_emb, top_k=1)
        best_id = I[0][0]
        relevant_chunk = chunks[best_id]
        return relevant_chunk

    def post(self, shared, prep_res, relevant_chunk):
        shared["retrieved_chunk"] = relevant_chunk
        print("Retrieved chunk:", relevant_chunk[:60], "...")

class GenerateAnswer(Node):
    def prep(self, shared):
        return shared["question"], shared["retrieved_chunk"]

    def exec(self, inputs):
        question, chunk = inputs
        prompt = f"Question: {question}\nContext: {chunk}\nAnswer:"
        return call_llm(prompt)

    def post(self, shared, prep_res, answer):
        shared["answer"] = answer
        print("Answer:", answer)

# Wire up the flow
embed_qnode = EmbedQuery()
retrieve_node = RetrieveDocs()
generate_node = GenerateAnswer()

embed_qnode >> retrieve_node >> generate_node
OnlineFlow = Flow(start=embed_qnode)
```
Usage example:

```python
# Suppose we already ran OfflineFlow and have:
# shared["all_chunks"], shared["index"], etc.
shared["question"] = "Why do people like cats?"
OnlineFlow.run(shared)
# final answer in shared["answer"]
```
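Note that `get_embedding`, `create_index`, `search_index`, and `call_llm` are assumed utility functions, not defined in this doc. As a rough stand-in for the three retrieval helpers, a self-contained numpy sketch (a real system would use an embedding model and a vector database such as faiss) could be:

```python
# Minimal numpy-based stand-ins for the assumed retrieval utilities.
import numpy as np

def get_embedding(text):
    # Hypothetical toy embedding: a fixed-size character-frequency vector.
    vec = np.zeros(128)
    for ch in text:
        vec[ord(ch) % 128] += 1.0
    return vec

def create_index(all_embeds):
    # The "index" is just a matrix of embeddings, one row per chunk.
    return np.array(all_embeds)

def search_index(index, q_emb, top_k=1):
    # Rank chunks by L2 distance to the query embedding.
    dists = np.linalg.norm(index - q_emb, axis=1)
    ids = np.argsort(dists)[:top_k]
    # Match the (I, D) return convention used above.
    return ids.reshape(1, -1), dists[ids].reshape(1, -1)
```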
================================================