From c3c7dcaa2b73724c2e65a1da41a90555e2c066c4 Mon Sep 17 00:00:00 2001
From: zachary62 <zhuang333@wisc.edu>
Date: Fri, 21 Feb 2025 19:30:14 -0500
Subject: [PATCH] update the guide

---
 docs/guide.md | 81 ++++++++++++++++++++++++++++-----------------------
 1 file changed, 45 insertions(+), 36 deletions(-)

diff --git a/docs/guide.md b/docs/guide.md
index 227a36c..64134fb 100644
--- a/docs/guide.md
+++ b/docs/guide.md
@@ -8,6 +8,51 @@ nav_order: 1
 # LLM System Design Guidance
 
 
+## System Design Steps
+
+1. **Project Requirements**  
+   - Identify the project's core entities and provide a step-by-step user story.  
+   - Define a list of both functional and non-functional requirements.
+
+2. **Utility Functions**  
+   - Determine the utility functions on which this project depends (e.g., for LLM calls, web searches, file handling).  
+   - Implement these functions and write basic tests to confirm they work correctly.
+
+> After this step, don't jump straight into building an LLM system.  
+>
+> First, make sure you clearly understand the problem by manually solving it using some example inputs.  
+>
+> It's always easier to first build a solid intuition about the problem and its solution, then focus on automating the process.  
+{: .warning }
+
+3. **Flow Design**  
+   - Design a high-level flow of nodes that automates the solution (for example, as a Mermaid diagram).  
+   - For each node in your flow, specify:  
+     - **prep**: How data is accessed or retrieved.  
+     - **exec**: The specific utility function to use (ideally one function per node).  
+     - **post**: How data is updated or persisted.  
+   - Identify potential design patterns, such as Batch, Agent, or RAG.
+
+4. **Data Structure**  
+   - Decide how you will store and update state (in memory for smaller applications or in a database for larger, persistent needs).  
+   - If the data structure isn’t straightforward, define data schemas or models detailing how information is stored, accessed, and updated.  
+   - As you finalize your data structure, you may need to refine your flow design.
+
+5. **Implementation**  
+   - For each node, implement the **prep**, **exec**, and **post** functions based on the flow design.  
+   - Start coding with a simple, direct approach (avoid over-engineering at first).  
+   - Add logging throughout the code to facilitate debugging.
+
+6. **Optimization**  
+   - **Prompt Engineering**: Use clear, specific instructions with illustrative examples to reduce ambiguity.  
+   - **Task Decomposition**: Break large or complex tasks into manageable, logical steps.
+
+7. **Reliability**  
+   - **Structured Output**: Ensure outputs conform to the required format. If outputs often fail validation, consider increasing `max_retries`.  
+   - **Test Cases**: Develop clear, reproducible tests for each part of the flow.  
+   - **Self-Evaluation**: Add an LLM-powered node that reviews outputs when results are uncertain.
+
+
 ## Example LLM Project File Structure
 
 ```
@@ -93,39 +138,3 @@ def test_call_llm():
     prompt = "Hello, how are you?"
     assert call_llm(prompt) is not None
 ```
-
-## System Design Steps
-
-1. **Project Requirements**  
-   - Identify the project's core entities.  
-   - Define each functional requirement and map out how these entities interact step by step.
-
-2. **Utility Functions**  
-   - Determine the low-level utility functions you’ll need (e.g., for LLM calls, web searches, file handling).  
-   - Implement these functions and write basic tests to confirm they work correctly.
-
-3. **Flow Design**  
-   - Develop a high-level process flow of nodes that meets the project’s requirements.  
-   - For each node in your flow:
-     - **prep**: Determine how data is accessed or retrieved.  
-     - **exec**: Outline the utility function to be used. Ideally, one utility function per node. Highlight the `utility function` name.
-     - **post**: Handle any final updates or data persistence tasks.
-   - Identify possible decision points for *Node Actions* and data-intensive operations for *Batch* tasks.  
-   - Illustrate the flow with a Mermaid diagram.
-
-4. **Data Structure**  
-   - Decide how to store and update state, whether in memory (for smaller applications) or a database (for larger or persistent needs).  
-   - Define data schemas or models that detail how information is stored, accessed, and updated.
-
-5. **Implementation**  
-   - For each node, implement the `prep`, `exec`, and `post` functions based on the flow design.
-   - Start coding with a simple, direct approach (avoid over-engineering at first).  
-
-6. **Optimization**  
-   - **Prompt Engineering**: Use clear and specific instructions with illustrative examples to reduce ambiguity.  
-   - **Task Decomposition**: Break large, complex tasks into manageable, logical steps.
-
-7. **Reliability**  
-   - **Structured Output**: Verify outputs conform to the required format. Consider increasing `max_retries` if needed.  
-   - **Test Cases**: Develop clear, reproducible tests for each part of the flow.  
-   - **Self-Evaluation**: Introduce an additional Node (powered by LLMs) to review outputs when the results are uncertain.
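
Note on steps 3, 5, and 7 of the added section: the prep/exec/post split, the logging advice, and the `max_retries` hint could fit together roughly as in the sketch below. Everything in it is hypothetical and not taken from this repository's framework: the `SummarizeNode` class, the `fake_call_llm` stand-in, the `shared` dict, and the `run` loop are illustrative names, and `max_retries` is assumed here to mean the total number of attempts.

```python
import json
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("flow_sketch")


def fake_call_llm(prompt):
    """Hypothetical stand-in for a real LLM utility; returns canned JSON."""
    return json.dumps({"summary": prompt[:20]})


class SummarizeNode:
    """Hypothetical node: one utility function, wrapped by prep and post."""

    def __init__(self, max_retries=1):
        # Assumption: max_retries counts total attempts, with a floor of one.
        self.max_retries = max(1, max_retries)

    def prep(self, shared):
        # prep: read what this node needs from the shared state.
        return shared["text"]

    def exec(self, text):
        # exec: call exactly one utility function and validate its output.
        raw = fake_call_llm(f"Summarize: {text}")
        result = json.loads(raw)  # malformed JSON raises a ValueError subclass
        if "summary" not in result:
            raise ValueError("missing 'summary' key")
        return result

    def post(self, shared, result):
        # post: write the validated result back to the shared state.
        shared["summary"] = result["summary"]

    def run(self, shared):
        data = self.prep(shared)
        for attempt in range(1, self.max_retries + 1):
            try:
                result = self.exec(data)
                break
            except ValueError as err:
                log.warning("attempt %d/%d failed: %s",
                            attempt, self.max_retries, err)
                if attempt == self.max_retries:
                    raise
        self.post(shared, result)
        return shared


shared = {"text": "Flows are graphs of nodes."}
SummarizeNode(max_retries=3).run(shared)
```

Keeping validation inside `exec` means a malformed output is retried without repeating `prep`, and `post` only ever sees a result that passed the format check.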