From afe290a015e7433b6f045194e4c59763d99ac3ef Mon Sep 17 00:00:00 2001 From: zachary62 Date: Sat, 1 Mar 2025 10:32:08 -0500 Subject: [PATCH] update cursorrules --- .cursorrules | 164 ++++++++++++++++++++------------------------------- 1 file changed, 64 insertions(+), 100 deletions(-) diff --git a/.cursorrules b/.cursorrules index 97b2a8e..0aba9a3 100644 --- a/.cursorrules +++ b/.cursorrules @@ -4,7 +4,7 @@ File: docs/agent.md --- layout: default title: "Agent" -parent: "Design" +parent: "Design Pattern" nav_order: 6 --- @@ -410,14 +410,13 @@ flow.run(shared) # The node summarizes doc2, not doc1 --- - ================================================ File: docs/decomp.md ================================================ --- layout: default title: "Workflow" -parent: "Design" +parent: "Design Pattern" nav_order: 2 --- @@ -809,57 +808,59 @@ File: docs/guide.md ================================================ --- layout: default -title: "Design Guidance" +title: "Development Playbook" parent: "Apps" nav_order: 1 --- -# LLM System Design Guidance - +# LLM Application Development Playbook ## System Design Steps -1. **Project Requirements** - - Identify the project's core entities, and provide a step-by-step user story. - - Define a list of both functional and non-functional requirements. +1. **Project Requirements**: Clarify the requirements for your project. -2. **Utility Functions** - - Determine the utility functions on which this project depends (e.g., for LLM calls, web searches, file handling). - - Implement these functions and write basic tests to confirm they work correctly. +2. 
**Utility Functions**: Although the system acts as the main decision-maker, it depends on utility functions for routine tasks and real-world interactions: + - `call_llm` (of course) + - Routine tasks (e.g., chunking text, formatting strings) + - External inputs (e.g., searching the web, reading emails) + - Output generation (e.g., producing reports, sending emails) -> After this step, don't jump straight into building an LLM system. -> -> First, make sure you clearly understand the problem by manually solving it using some example inputs. -> -> It's always easier to first build a solid intuition about the problem and its solution, then focus on automating the process. -{: .warning } + - > **If a human can’t solve it, an LLM can’t automate it!** Before building an LLM system, thoroughly understand the problem by manually solving example inputs to develop intuition. + {: .best-practice } -3. **Flow Design** - - Build a high-level design of the flow of nodes (for example, using a Mermaid diagram) to automate the solution. - - For each node in your flow, specify: - - **prep**: How data is accessed or retrieved. - - **exec**: The specific utility function to use (ideally one function per node). - - **post**: How data is updated or persisted. +3. **Flow Design (Compute)**: Create a high-level design for the application’s flow. - Identify potential design patterns, such as Batch, Agent, or RAG. + - For each node, specify: + - **Purpose**: The high-level compute logic + - `exec`: The specific utility function to call (ideally, one function per node) -4. **Data Structure** - - Decide how you will store and update state (in memory for smaller applications or in a database for larger, persistent needs). - - If it isn’t straightforward, define data schemas or models detailing how information is stored, accessed, and updated. - - As you finalize your data structure, you may need to refine your flow design. +4. **Data Schema (Data)**: Plan how data will be stored and updated. 
+ - For simple apps, use an in-memory dictionary. + - For more complex apps or when persistence is required, use a database. + - For each node, specify: + - `prep`: How the node reads data + - `post`: How the node writes data -5. **Implementation** - - For each node, implement the **prep**, **exec**, and **post** functions based on the flow design. - - Start coding with a simple, direct approach (avoid over-engineering at first). +5. **Implementation**: Implement nodes and flows based on the design. + - Start with a simple, direct approach (avoid over-engineering and full-scale type checking or testing). Let it fail fast to identify weaknesses. - Add logging throughout the code to facilitate debugging. -6. **Optimization** - - **Prompt Engineering**: Use clear, specific instructions with illustrative examples to reduce ambiguity. - - **Task Decomposition**: Break large or complex tasks into manageable, logical steps. +6. **Optimization**: + - **Use Intuition**: For a quick initial evaluation, human intuition is often a good start. + - **Redesign Flow (Back to Step 3)**: Consider breaking down tasks further, introducing agentic decisions, or better managing input contexts. + - If your flow design is already solid, move on to micro-optimizations: + - **Prompt Engineering**: Use clear, specific instructions with examples to reduce ambiguity. + - **In-Context Learning**: Provide robust examples for tasks that are difficult to specify with instructions alone. + + - > **You’ll likely iterate a lot!** Expect to repeat Steps 3–6 hundreds of times. + > + >
+ {: .best-practice } 7. **Reliability** - - **Structured Output**: Ensure outputs conform to the required format. Consider increasing `max_retries` if needed. - - **Test Cases**: Develop clear, reproducible tests for each part of the flow. - - **Self-Evaluation**: Introduce an additional node (powered by LLMs) to review outputs when results are uncertain. + - **Node Retries**: Add checks in the node `exec` to ensure outputs meet requirements, and consider increasing `max_retries` and `wait` times. + - **Logging and Visualization**: Maintain logs of all attempts and visualize node results for easier debugging. + - **Self-Evaluation**: Add a separate node (powered by an LLM) to review outputs when results are uncertain. ## Example LLM Project File Structure @@ -876,50 +877,12 @@ my_project/ └── design.md ``` -### `docs/` - -Holds all project documentation. Include a `design.md` file covering: -- Project requirements -- Utility functions -- High-level flow (with a Mermaid diagram) -- Shared memory data structure -- Node designs: - - Purpose and design (e.g., batch or async) - - Data read (prep) and write (post) - - Data processing (exec) - -### `utils/` - -Houses functions for external API calls (e.g., LLMs, web searches, etc.). It’s recommended to dedicate one Python file per API call, with names like `call_llm.py` or `search_web.py`. Each file should include: - -- The function to call the API -- A main function to run that API call for testing - -For instance, here’s a simplified `call_llm.py` example: - -```python -from openai import OpenAI - -def call_llm(prompt): - client = OpenAI(api_key="YOUR_API_KEY_HERE") - response = client.chat.completions.create( - model="gpt-4o", - messages=[{"role": "user", "content": prompt}] - ) - return response.choices[0].message.content - -if __name__ == "__main__": - prompt = "Hello, how are you?" - print(call_llm(prompt)) -``` - -### `main.py` - -Serves as the project’s entry point. 
- -### `flow.py` - -Implements the application’s flow, starting with node followed by the flow structure. +- **`docs/design.md`**: Contains project documentation and the details of each step above. +- **`utils/`**: Contains all utility functions. + - It’s recommended to dedicate one Python file to each API call, for example `call_llm.py` or `search_web.py`. + - Each file should also include a `main()` function to try that API call +- **`flow.py`**: Implements the application’s flow, starting with node definitions followed by the overall structure. +- **`main.py`**: Serves as the project’s entry point. ================================================ File: docs/index.md @@ -963,7 +926,7 @@ We model the LLM workflow as a **Nested Directed Graph**: - [(Advanced) Async](./async.md) - [(Advanced) Parallel](./parallel.md) -## Utility Functions +## Utility Function - [LLM Wrapper](./llm.md) - [Tool](./tool.md) @@ -974,7 +937,7 @@ We model the LLM workflow as a **Nested Directed Graph**: {: .warning } -## Design Patterns +## Design Pattern - [Structured Output](./structure.md) - [Workflow](./decomp.md) @@ -985,13 +948,7 @@ We model the LLM workflow as a **Nested Directed Graph**: - [(Advanced) Multi-Agents](./multi_agent.md) - Evaluation -## Example LLM Apps - -[LLM System Design Guidance](./guide.md) - -- [Summarization + QA agent for Paul Graham Essay](./essay.md) -- More coming soon... - +## [LLM Application Development Playbook](./guide.md) ================================================ File: docs/llm.md @@ -999,7 +956,7 @@ File: docs/llm.md --- layout: default title: "LLM Wrapper" -parent: "Utility" +parent: "Utility Function" nav_order: 1 --- @@ -1089,7 +1046,7 @@ def call_llm(prompt): ## Why Not Provide Built-in LLM Wrappers? I believe it is a **bad practice** to provide LLM-specific implementations in a general framework: -- **LLM APIs change frequently**. Hardcoding them makes maintenance a nighmare. +- **LLM APIs change frequently**. 
Hardcoding them makes maintenance a nightmare. - You may need **flexibility** to switch vendors, use fine-tuned models, or deploy local LLMs. - You may need **optimizations** like prompt caching, request batching, or response streaming. @@ -1100,7 +1057,7 @@ File: docs/mapreduce.md --- layout: default title: "Map Reduce" -parent: "Design" +parent: "Design Pattern" nav_order: 3 --- @@ -1144,7 +1101,7 @@ File: docs/memory.md --- layout: default title: "Chat Memory" -parent: "Design" +parent: "Design Pattern" nav_order: 5 --- @@ -1274,7 +1231,7 @@ File: docs/multi_agent.md --- layout: default title: "(Advanced) Multi-Agents" -parent: "Design" +parent: "Design Pattern" nav_order: 7 --- @@ -1473,6 +1430,11 @@ nav_order: 1 A **Node** is the smallest building block. Each Node has 3 steps `prep->exec->post`: +
+ +
+ + 1. `prep(shared)` - **Read and preprocess data** from `shared` store. - Examples: *query DB, read files, or serialize data into a string*. @@ -1490,6 +1452,8 @@ A **Node** is the smallest building block. Each Node has 3 steps `prep->exec->po - Examples: *update DB, change states, log results*. - **Decide the next action** by returning a *string* (`action = "default"` if *None*). + + > **Why 3 steps?** To enforce the principle of *separation of concerns*. The data storage and data processing are operated separately. > > All steps are *optional*. E.g., you can only implement `prep` and `post` if you just need to process data. @@ -1632,7 +1596,7 @@ File: docs/rag.md --- layout: default title: "RAG" -parent: "Design" +parent: "Design Pattern" nav_order: 4 --- @@ -1692,7 +1656,7 @@ File: docs/structure.md --- layout: default title: "Structured Output" -parent: "Design" +parent: "Design Pattern" nav_order: 1 --- @@ -1810,7 +1774,7 @@ File: docs/tool.md --- layout: default title: "Tool" -parent: "Utility" +parent: "Utility Function" nav_order: 2 --- @@ -2030,7 +1994,7 @@ File: docs/viz.md --- layout: default title: "Viz and Debug" -parent: "Utility" +parent: "Utility Function" nav_order: 3 --- @@ -2166,4 +2130,4 @@ data_science_flow = DataScienceFlow(start=data_prep_node) data_science_flow.run({}) ``` -The output would be: `Call stack: ['EvaluateModelNode', 'ModelFlow', 'DataScienceFlow']` +The output would be: `Call stack: ['EvaluateModelNode', 'ModelFlow', 'DataScienceFlow']` \ No newline at end of file
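
The three-step `prep->exec->post` lifecycle that the docs/node.md hunk above describes can be sketched in isolation. This is a minimal, hedged illustration only: the `Node` base class and the `SummarizeNode` example below are assumptions made for demonstration, not the framework's actual classes, and the `exec` body stands in for a real `call_llm` utility.

```python
class Node:
    """Illustrative sketch of the 3-step node lifecycle; not the real base class."""

    def prep(self, shared):
        # Read and preprocess data from the shared store.
        return None

    def exec(self, prep_res):
        # Execute the compute logic, ideally one utility-function call.
        return None

    def post(self, shared, prep_res, exec_res):
        # Write results back to the shared store; return the next action string.
        return "default"

    def run(self, shared):
        # Chain the three steps; a None action falls back to "default".
        prep_res = self.prep(shared)
        exec_res = self.exec(prep_res)
        return self.post(shared, prep_res, exec_res) or "default"


class SummarizeNode(Node):
    """Hypothetical node: reads a doc, produces a 'summary', stores it."""

    def prep(self, shared):
        return shared["doc"]

    def exec(self, doc):
        return doc[:10]  # stand-in for a call_llm(...) summarization

    def post(self, shared, prep_res, exec_res):
        shared["summary"] = exec_res
        return "default"


shared = {"doc": "A long document about nodes and flows."}
action = SummarizeNode().run(shared)
```

Because data access (`prep`/`post`) is separated from compute (`exec`), retry logic can wrap `exec` alone without re-reading or re-writing the shared store.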