From a2447f0bb6d2920fb1dfe17e3616d59acf0412a1 Mon Sep 17 00:00:00 2001 From: zachary62 Date: Fri, 28 Feb 2025 18:24:16 -0500 Subject: [PATCH] update .cursorrules --- assets/.cursorrules => .cursorrules | 510 ++++++++++++++-------------- README.md | 2 +- 2 files changed, 258 insertions(+), 254 deletions(-) rename assets/.cursorrules => .cursorrules (86%) diff --git a/assets/.cursorrules b/.cursorrules similarity index 86% rename from assets/.cursorrules rename to .cursorrules index 3e4c5cc..97b2a8e 100644 --- a/assets/.cursorrules +++ b/.cursorrules @@ -1,155 +1,24 @@ - - -================================================ -File: docs/guide.md -================================================ ---- -layout: default -title: "Design Guidance" -parent: "Apps" -nav_order: 1 ---- - -# LLM System Design Guidance - - -## Example LLM Project File Structure - -``` -my_project/ -├── main.py -├── flow.py -├── utils/ -│ ├── __init__.py -│ ├── call_llm.py -│ └── search_web.py -├── tests/ -│ ├── __init__.py -│ ├── test_flow.py -│ └── test_nodes.py -├── requirements.txt -└── docs/ - └── design.md -``` - - -### `docs/` - -Store the documentation of the project. - -It should include a `design.md` file, which describes -- Project requirements -- Required utility functions -- High-level flow with a mermaid diagram -- Shared memory data structure -- For each node, discuss - - Node purpose and design (e.g., should it be a batch or async node?) - - How the data shall be read (for `prep`) and written (for `post`) - - How the data shall be processed (for `exec`) - -### `utils/` - -Houses functions for external API calls (e.g., LLMs, web searches, etc.). - -It’s recommended to dedicate one Python file per API call, with names like `call_llm.py` or `search_web.py`. Each file should include: - -- The function to call the API -- A main function to run that API call - -For instance, here’s a simplified `call_llm.py` example: - -```python -from openai import OpenAI - -def call_llm(prompt): - client = OpenAI(api_key="YOUR_API_KEY_HERE") - response = client.chat.completions.create( - model="gpt-4o", - messages=[{"role": "user", "content": prompt}] - ) - return response.choices[0].message.content - -def main(): - prompt = "Hello, how are you?" - print(call_llm(prompt)) - -if __name__ == "__main__": - main() -``` - -### `main.py` - -Serves as the project’s entry point. - -### `flow.py` - -Implements the application’s flow, starting with node followed by the flow structure. - - -### `tests/` - -Optionally contains all tests. Use `pytest` for testing flows, nodes, and utility functions. -For example, `test_call_llm.py` might look like: - -```python -from utils.call_llm import call_llm - -def test_call_llm(): - prompt = "Hello, how are you?" - assert call_llm(prompt) is not None -``` - -## System Design Steps - -1. **Project Requirements** - - Identify the project's core entities. - - Define each functional requirement and map out how these entities interact step by step. - -2. **Utility Functions** - - Determine the low-level utility functions you’ll need (e.g., for LLM calls, web searches, file handling). - - Implement these functions and write basic tests to confirm they work correctly. - -3. **Flow Design** - - Develop a high-level process flow that meets the project’s requirements. - - Specify which utility functions are used at each step. - - Identify possible decision points for *Node Actions* and data-intensive operations for *Batch* tasks. - - Illustrate the flow with a Mermaid diagram. - -4. 
**Data Structure**
   - Decide how to store and update state, whether in memory (for smaller applications) or a database (for larger or persistent needs).
   - Define data schemas or models that detail how information is stored, accessed, and updated.
-
-5. **Implementation**
   - Start coding with a simple, direct approach (avoid over-engineering at first).
   - For each node in your flow:
     - **prep**: Determine how data is accessed or retrieved.
     - **exec**: Outline the actual processing or logic needed.
     - **post**: Handle any final updates or data persistence tasks.
-
-6. **Optimization**
   - **Prompt Engineering**: Use clear and specific instructions with illustrative examples to reduce ambiguity.
   - **Task Decomposition**: Break large, complex tasks into manageable, logical steps.
-
-7. **Reliability**
   - **Structured Output**: Verify outputs conform to the required format. Consider increasing `max_retries` if needed.
   - **Test Cases**: Develop clear, reproducible tests for each part of the flow.
   - **Self-Evaluation**: Introduce an additional Node (powered by LLMs) to review outputs when the results are uncertain.
-
================================================
File: docs/agent.md
================================================
---
layout: default
title: "Agent"
-parent: "Paradigm"
+parent: "Design"
nav_order: 6
---

# Agent

-For many tasks, we need agents that take dynamic and recursive actions based on the inputs they receive.
-You can create these agents as **Nodes** connected by *Actions* in a directed graph using [Flow](./flow.md).
+Agent is a powerful design pattern in which a node can take dynamic actions based on the context it receives.
+To express an agent, create a Node (the agent) with [branching](./flow.md) to other nodes (Actions).

+> The core of building **performant** and **reliable** agents boils down to:
+>
+> 1. **Context Management:** Provide *clear, relevant context* so agents can understand the problem. E.g., rather than dumping an entire chat history or entire files, use a [Workflow](./decomp.md) that filters out irrelevant details and includes only the most relevant information.
+>
+> 2. **Action Space:** Define *a well-structured, unambiguous, and easy-to-use* set of actions. For instance, avoid creating overlapping actions like `read_databases` and `read_csvs`. Instead, unify data sources (e.g., move CSVs into a database) and design a single action. The action can be parameterized (e.g., string for search) or programmable (e.g., SQL queries).
+{: .best-practice }

### Example: Search Agent

@@ -234,8 +103,6 @@ flow = Flow(start=decide)
flow.run({"query": "Who won the Nobel Prize in Physics 2024?"})
```
-
-
================================================
File: docs/async.md
================================================

@@ -436,10 +303,8 @@ Nodes and Flows **communicate** in two ways:

If you know memory management, think of the **Shared Store** like a **heap** (shared by all function calls), and **Params** like a **stack** (assigned by the caller).

-> **Best Practice:** Use `Shared Store` for almost all cases. It's flexible and easy to manage. It separates data storage from data processing, making the code more readable and easier to maintain.
->
-> `Params` is more a syntax sugar for [Batch](./batch.md).
-{: .note }
+> Use `Shared Store` for almost all cases. It's flexible and easy to manage. It separates *Data Schema* from *Compute Logic*, making the code easier to maintain. `Params` is mostly syntactic sugar for [Batch](./batch.md).
+{: .best-practice } --- @@ -551,14 +416,21 @@ File: docs/decomp.md ================================================ --- layout: default -title: "Task Decomposition" -parent: "Paradigm" +title: "Workflow" +parent: "Design" nav_order: 2 --- -# Task Decomposition +# Workflow -Many real-world tasks are too complex for one LLM call. The solution is to decompose them into multiple calls as a [Flow](./flow.md) of Nodes. +Many real-world tasks are too complex for one LLM call. The solution is to decompose them into a [chain](./flow.md) of multiple Nodes. + + +> - You don't want to make each task **too coarse**, because it may be *too complex for one LLM call*. +> - You don't want to make each task **too granular**, because then *the LLM call doesn't have enough context* and results are *not consistent across nodes*. +> +> You usually need multiple *iterations* to find the *sweet spot*. If the task has too many *edge cases*, consider using [Agents](./agent.md). +{: .best-practice } ### Example: Article Writing @@ -932,6 +804,123 @@ flowchart LR +================================================ +File: docs/guide.md +================================================ +--- +layout: default +title: "Design Guidance" +parent: "Apps" +nav_order: 1 +--- + +# LLM System Design Guidance + + +## System Design Steps + +1. **Project Requirements** + - Identify the project's core entities, and provide a step-by-step user story. + - Define a list of both functional and non-functional requirements. + +2. **Utility Functions** + - Determine the utility functions on which this project depends (e.g., for LLM calls, web searches, file handling). + - Implement these functions and write basic tests to confirm they work correctly. + +> After this step, don't jump straight into building an LLM system. +> +> First, make sure you clearly understand the problem by manually solving it using some example inputs. +> +> It's always easier to first build a solid intuition about the problem and its solution, then focus on automating the process. +{: .warning } + +3. **Flow Design** + - Build a high-level design of the flow of nodes (for example, using a Mermaid diagram) to automate the solution. + - For each node in your flow, specify: + - **prep**: How data is accessed or retrieved. + - **exec**: The specific utility function to use (ideally one function per node). + - **post**: How data is updated or persisted. + - Identify potential design patterns, such as Batch, Agent, or RAG. + +4. **Data Structure** + - Decide how you will store and update state (in memory for smaller applications or in a database for larger, persistent needs). + - If it isn’t straightforward, define data schemas or models detailing how information is stored, accessed, and updated. + - As you finalize your data structure, you may need to refine your flow design. + +5. **Implementation** + - For each node, implement the **prep**, **exec**, and **post** functions based on the flow design. + - Start coding with a simple, direct approach (avoid over-engineering at first). + - Add logging throughout the code to facilitate debugging. + +6. **Optimization** + - **Prompt Engineering**: Use clear, specific instructions with illustrative examples to reduce ambiguity. + - **Task Decomposition**: Break large or complex tasks into manageable, logical steps. + +7. **Reliability** + - **Structured Output**: Ensure outputs conform to the required format. Consider increasing `max_retries` if needed. 
+ - **Test Cases**: Develop clear, reproducible tests for each part of the flow.
+ - **Self-Evaluation**: Introduce an additional node (powered by LLMs) to review outputs when results are uncertain.
+
+## Example LLM Project File Structure
+
+```
+my_project/
+├── main.py
+├── flow.py
+├── utils/
+│   ├── __init__.py
+│   ├── call_llm.py
+│   └── search_web.py
+├── requirements.txt
+└── docs/
+    └── design.md
+```
+
+### `docs/`
+
+Holds all project documentation. Include a `design.md` file covering:
+- Project requirements
+- Utility functions
+- High-level flow (with a Mermaid diagram)
+- Shared memory data structure
+- Node designs:
+  - Purpose and design (e.g., batch or async)
+  - Data read (prep) and write (post)
+  - Data processing (exec)
+
+### `utils/`
+
+Houses functions for external API calls (e.g., LLMs, web searches, etc.). It’s recommended to dedicate one Python file per API call, with names like `call_llm.py` or `search_web.py`. Each file should include:
+
+- The function to call the API
+- A main function to run that API call for testing
+
+For instance, here’s a simplified `call_llm.py` example:
+
+```python
+from openai import OpenAI
+
+def call_llm(prompt):
+    client = OpenAI(api_key="YOUR_API_KEY_HERE")
+    response = client.chat.completions.create(
+        model="gpt-4o",
+        messages=[{"role": "user", "content": prompt}]
+    )
+    return response.choices[0].message.content
+
+if __name__ == "__main__":
+    prompt = "Hello, how are you?"
+    print(call_llm(prompt))
+```
+
+### `main.py`
+
+Serves as the project’s entry point.
+
+### `flow.py`
+
+Implements the application’s flow, starting with the node definitions, followed by the flow structure.
+
================================================
File: docs/index.md
================================================
@@ -956,7 +945,7 @@ We model the LLM workflow as a **Nested Directed Graph**:

@@ -974,23 +963,21 @@ We model the LLM workflow as a **Nested Directed Graph**: - [(Advanced) Async](./async.md) - [(Advanced) Parallel](./parallel.md) -## Low-Level Details +## Utility Functions - [LLM Wrapper](./llm.md) - [Tool](./tool.md) - [Viz and Debug](./viz.md) - Chunking -> We do not provide built-in implementations. -> -> Example implementations are provided as reference. +> We do not provide built-in utility functions. Example implementations are provided as reference. {: .warning } -## High-Level Paradigm +## Design Patterns - [Structured Output](./structure.md) -- [Task Decomposition](./decomp.md) +- [Workflow](./decomp.md) - [Map Reduce](./mapreduce.md) - [RAG](./rag.md) - [Chat Memory](./memory.md) @@ -1012,7 +999,7 @@ File: docs/llm.md --- layout: default title: "LLM Wrapper" -parent: "Details" +parent: "Utility" nav_order: 1 --- @@ -1113,13 +1100,19 @@ File: docs/mapreduce.md --- layout: default title: "Map Reduce" -parent: "Paradigm" +parent: "Design" nav_order: 3 --- # Map Reduce -Process large inputs by splitting them into chunks using [BatchNode](./batch.md), then combining results. +MapReduce is a design pattern suitable when you have either: +- Large input data (e.g., multiple files to process), or +- Large output data (e.g., multiple forms to fill) + +and there is a logical way to break the task into smaller, ideally independent parts. +You first break down the task using [BatchNode](./batch.md) in the map phase, followed by aggregation in the reduce phase. + ### Example: Document Summarization @@ -1151,7 +1144,7 @@ File: docs/memory.md --- layout: default title: "Chat Memory" -parent: "Paradigm" +parent: "Design" nav_order: 5 --- @@ -1197,59 +1190,81 @@ We can: 2. Use [vector search](./tool.md) to retrieve relevant exchanges beyond the last 4. 
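The chat-memory example below (and the RAG example later) calls small vector-search helpers (`get_embedding`, `create_index`, and `search_index`) whose implementations are not shown in this excerpt. The sketch below is only an illustration of what such helpers might look like, assuming OpenAI embeddings and a FAISS index (both are assumptions, not part of this patch); substitute whatever embedding model and vector store you actually use.

```python
# Illustrative sketch only (assumed helpers, not part of this patch).
# Requires the `openai` and `faiss-cpu` packages; the embedding model name is an arbitrary choice.
import numpy as np
import faiss
from openai import OpenAI

client = OpenAI(api_key="YOUR_API_KEY_HERE")

def get_embedding(text):
    # Embed a single string and return a float32 vector
    resp = client.embeddings.create(model="text-embedding-3-small", input=text)
    return np.array(resp.data[0].embedding, dtype="float32")

def create_index(embeddings):
    # Build a flat L2 index over a list of vectors
    mat = np.array(embeddings, dtype="float32")
    index = faiss.IndexFlatL2(mat.shape[1])
    index.add(mat)
    return index

def search_index(index, query_embedding, top_k=5):
    # Return (indices, distances), each shaped (1, top_k)
    distances, indices = index.search(np.array([query_embedding], dtype="float32"), top_k)
    return indices, distances
```

The `(indices, distances)` return order and shape are chosen to match how the examples below index into the result (e.g., `idx[0][0]`).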
```python -class ChatWithMemory(Node): +################################ +# Node A: Retrieve user input & relevant messages +################################ +class ChatRetrieve(Node): def prep(self, s): - # Initialize shared dict s.setdefault("history", []) s.setdefault("memory_index", None) - user_input = input("You: ") - - # Retrieve relevant past if we have enough history and an index + return user_input + + def exec(self, user_input): + emb = get_embedding(user_input) relevant = [] - if len(s["history"]) > 8 and s["memory_index"]: - idx, _ = search_index(s["memory_index"], get_embedding(user_input), top_k=2) - relevant = [s["history"][i[0]] for i in idx] + if len(shared["history"]) > 8 and shared["memory_index"]: + idx, _ = search_index(shared["memory_index"], emb, top_k=2) + relevant = [shared["history"][i[0]] for i in idx] + return (user_input, relevant) - return {"user_input": user_input, "recent": s["history"][-8:], "relevant": relevant} + def post(self, s, p, r): + user_input, relevant = r + s["user_input"] = user_input + s["relevant"] = relevant + return "continue" - def exec(self, c): - messages = [{"role": "system", "content": "You are a helpful assistant."}] - # Include relevant history if any - if c["relevant"]: - messages.append({"role": "system", "content": f"Relevant: {c['relevant']}"}) - # Add recent history and the current user input - messages += c["recent"] + [{"role": "user", "content": c["user_input"]}] - return call_llm(messages) +################################ +# Node B: Call LLM, update history + index +################################ +class ChatReply(Node): + def prep(self, s): + user_input = s["user_input"] + recent = s["history"][-8:] + relevant = s.get("relevant", []) + return user_input, recent, relevant + + def exec(self, inputs): + user_input, recent, relevant = inputs + msgs = [{"role":"system","content":"You are a helpful assistant."}] + if relevant: + msgs.append({"role":"system","content":f"Relevant: {relevant}"}) + msgs.extend(recent) + msgs.append({"role":"user","content":user_input}) + ans = call_llm(msgs) + return ans def post(self, s, pre, ans): - # Update chat history - s["history"] += [ - {"role": "user", "content": pre["user_input"]}, - {"role": "assistant", "content": ans} - ] + user_input, _, _ = pre + s["history"].append({"role":"user","content":user_input}) + s["history"].append({"role":"assistant","content":ans}) - # When first reaching 8 messages, create index + # Manage memory index if len(s["history"]) == 8: - embeddings = [] + embs = [] for i in range(0, 8, 2): - e = s["history"][i]["content"] + " " + s["history"][i+1]["content"] - embeddings.append(get_embedding(e)) - s["memory_index"] = create_index(embeddings) - - # Embed older exchanges once we exceed 8 messages + text = s["history"][i]["content"] + " " + s["history"][i+1]["content"] + embs.append(get_embedding(text)) + s["memory_index"] = create_index(embs) elif len(s["history"]) > 8: - pair = s["history"][-10:-8] - embedding = get_embedding(pair[0]["content"] + " " + pair[1]["content"]) - s["memory_index"].add(np.array([embedding]).astype('float32')) - + text = s["history"][-2]["content"] + " " + s["history"][-1]["content"] + new_emb = np.array([get_embedding(text)]).astype('float32') + s["memory_index"].add(new_emb) + print(f"Assistant: {ans}") return "continue" -chat = ChatWithMemory() -chat - "continue" >> chat -flow = Flow(start=chat) -flow.run({}) +################################ +# Flow wiring +################################ +retrieve = ChatRetrieve() +reply = 
ChatReply() +retrieve - "continue" >> reply +reply - "continue" >> retrieve + +flow = Flow(start=retrieve) +shared = {} +flow.run(shared) ``` @@ -1259,7 +1274,7 @@ File: docs/multi_agent.md --- layout: default title: "(Advanced) Multi-Agents" -parent: "Paradigm" +parent: "Design" nav_order: 7 --- @@ -1268,6 +1283,8 @@ nav_order: 7 Multiple [Agents](./flow.md) can work together by handling subtasks and communicating the progress. Communication between agents is typically implemented using message queues in shared storage. +> Most of time, you don't need Multi-Agents. Start with a simple solution first. +{: .best-practice } ### Example Agent Communication: Message Queue @@ -1548,18 +1565,6 @@ print("Action returned:", action_result) # "default" print("Summary stored:", shared["summary"]) ``` - - -================================================ -File: docs/paradigm.md -================================================ ---- -layout: default -title: "Paradigm" -nav_order: 4 -has_children: true ---- - ================================================ File: docs/parallel.md ================================================ @@ -1577,6 +1582,14 @@ nav_order: 6 > Because of Python’s GIL, parallel nodes and flows can’t truly parallelize CPU-bound tasks (e.g., heavy numerical computations). However, they excel at overlapping I/O-bound work—like LLM calls, database queries, API requests, or file I/O. {: .warning } +> - **Ensure Tasks Are Independent**: If each item depends on the output of a previous item, **do not** parallelize. +> +> - **Beware of Rate Limits**: Parallel calls can **quickly** trigger rate limits on LLM services. You may need a **throttling** mechanism (e.g., semaphores or sleep intervals). +> +> - **Consider Single-Node Batch APIs**: Some LLMs offer a **batch inference** API where you can send multiple prompts in a single call. This is more complex to implement but can be more efficient than launching many parallel requests and mitigates rate limits. +{: .best-practice } + + ## AsyncParallelBatchNode Like **AsyncBatchNode**, but run `exec_async()` in **parallel**: @@ -1613,33 +1626,13 @@ parallel_flow = SummarizeMultipleFiles(start=sub_flow) await parallel_flow.run_async(shared) ``` - -## Best Practices - -- **Ensure Tasks Are Independent**: If each item depends on the output of a previous item, **do not** parallelize. - -- **Beware of Rate Limits**: Parallel calls can **quickly** trigger rate limits on LLM services. You may need a **throttling** mechanism (e.g., semaphores or sleep intervals). - -- **Consider Single-Node Batch APIs**: Some LLMs offer a **batch inference** API where you can send multiple prompts in a single call. This is more complex to implement but can be more efficient than launching many parallel requests and mitigates rate limits. - - -================================================ -File: docs/preparation.md -================================================ ---- -layout: default -title: "Details" -nav_order: 3 -has_children: true ---- - ================================================ File: docs/rag.md ================================================ --- layout: default title: "RAG" -parent: "Paradigm" +parent: "Design" nav_order: 4 --- @@ -1653,34 +1646,44 @@ Use [vector search](./tool.md) to find relevant context for LLM responses. 
```python class PrepareEmbeddings(Node): def prep(self, shared): - texts = shared["texts"] - embeddings = [get_embedding(text) for text in texts] - shared["search_index"] = create_index(embeddings) + return shared["texts"] + + def exec(self, texts): + # Embed each text chunk + embs = [get_embedding(t) for t in texts] + return embs + + def post(self, shared, prep_res, exec_res): + shared["search_index"] = create_index(exec_res) + # no action string means "default" class AnswerQuestion(Node): def prep(self, shared): question = input("Enter question: ") - query_embedding = get_embedding(question) - indices, _ = search_index(shared["search_index"], query_embedding, top_k=1) - relevant_text = shared["texts"][indices[0][0]] - return question, relevant_text + return question - def exec(self, inputs): - question, context = inputs - prompt = f"Question: {question}\nContext: {context}\nAnswer: " + def exec(self, question): + q_emb = get_embedding(question) + idx, _ = search_index(shared["search_index"], q_emb, top_k=1) + best_id = idx[0][0] + relevant_text = shared["texts"][best_id] + prompt = f"Question: {question}\nContext: {relevant_text}\nAnswer:" return call_llm(prompt) - def post(self, shared, prep_res, exec_res): - print(f"Answer: {exec_res}") + def post(self, shared, p, answer): + print("Answer:", answer) -# Connect nodes +############################################ +# Wire up the flow prep = PrepareEmbeddings() qa = AnswerQuestion() prep >> qa -# Create flow -qa_flow = Flow(start=prep) -qa_flow.run(shared) +flow = Flow(start=prep) + +# Example usage +shared = {"texts": ["I love apples", "Cats are great", "The sky is blue"]} +flow.run(shared) ``` ================================================ @@ -1689,7 +1692,7 @@ File: docs/structure.md --- layout: default title: "Structured Output" -parent: "Paradigm" +parent: "Design" nav_order: 1 --- @@ -1771,6 +1774,9 @@ summary: return structured_result ``` +> Besides using `assert` statements, another popular way to validate schemas is [Pydantic](https://github.com/pydantic/pydantic) +{: .note } + ### Why YAML instead of JSON? Current LLMs struggle with escaping. YAML is easier with strings since they don't always need quotes. @@ -1804,7 +1810,7 @@ File: docs/tool.md --- layout: default title: "Tool" -parent: "Details" +parent: "Utility" nav_order: 2 --- @@ -1814,7 +1820,6 @@ Similar to LLM wrappers, we **don't** provide built-in tools. Here, we recommend --- - ## 1. Embedding Calls ```python @@ -2025,7 +2030,7 @@ File: docs/viz.md --- layout: default title: "Viz and Debug" -parent: "Details" +parent: "Utility" nav_order: 3 --- @@ -2162,4 +2167,3 @@ data_science_flow.run({}) ``` The output would be: `Call stack: ['EvaluateModelNode', 'ModelFlow', 'DataScienceFlow']` - diff --git a/README.md b/README.md index a09aab6..71ff130 100644 --- a/README.md +++ b/README.md @@ -36,7 +36,7 @@ For a new development paradigmn: **Build LLM Apps by Chatting with LLM agents, N - **For quick questions**: Use the [GPT assistant](https://chatgpt.com/g/g-677464af36588191b9eba4901946557b-pocket-flow-assistant) (note: it uses older models not ideal for coding). - **For one-time LLM task**: Create a [ChatGPT](https://help.openai.com/en/articles/10169521-using-projects-in-chatgpt) or [Claude](https://www.anthropic.com/news/projects) project; upload the [docs](docs) to project knowledge. - - **For LLM App development**: Use [Cursor AI](https://www.cursor.com/). 
Copy [.cursorrules](assets/.cursorrules) to your project root as **[Cursor Rules](https://docs.cursor.com/context/rules-for-ai)**. + - **For LLM App development**: Use [Cursor AI](https://www.cursor.com/). Copy [.cursorrules](.cursorrules) to your project root as **[Cursor Rules](https://docs.cursor.com/context/rules-for-ai)**.