From c3c7dcaa2b73724c2e65a1da41a90555e2c066c4 Mon Sep 17 00:00:00 2001
From: zachary62
Date: Fri, 21 Feb 2025 19:30:14 -0500
Subject: [PATCH] update the guide

---
 docs/guide.md | 81 ++++++++++++++++++++++++++++-----------------------
 1 file changed, 45 insertions(+), 36 deletions(-)

diff --git a/docs/guide.md b/docs/guide.md
index 227a36c..64134fb 100644
--- a/docs/guide.md
+++ b/docs/guide.md
@@ -8,6 +8,51 @@ nav_order: 1
 
 # LLM System Design Guidance
 
+## System Design Steps
+
+1. **Project Requirements**
+    - Identify the project's core entities, and provide a step-by-step user story.
+    - Define a list of both functional and non-functional requirements.
+
+2. **Utility Functions**
+    - Determine the utility functions on which this project depends (e.g., for LLM calls, web searches, file handling).
+    - Implement these functions and write basic tests to confirm they work correctly.
+
+> After this step, don't jump straight into building an LLM system.
+>
+> First, make sure you clearly understand the problem by manually solving it using some example inputs.
+>
+> It's always easier to first build a solid intuition about the problem and its solution, then focus on automating the process.
+{: .warning }
+
+3. **Flow Design**
+    - Build a high-level design of the flow of nodes (for example, using a Mermaid diagram) to automate the solution.
+    - For each node in your flow, specify:
+        - **prep**: How data is accessed or retrieved.
+        - **exec**: The specific utility function to use (ideally one function per node).
+        - **post**: How data is updated or persisted.
+    - Identify potential design patterns, such as Batch, Agent, or RAG.
+
+4. **Data Structure**
+    - Decide how you will store and update state (in memory for smaller applications or in a database for larger, persistent needs).
+    - If it isn’t straightforward, define data schemas or models detailing how information is stored, accessed, and updated.
+    - As you finalize your data structure, you may need to refine your flow design.
+
+5. **Implementation**
+    - For each node, implement the **prep**, **exec**, and **post** functions based on the flow design.
+    - Start coding with a simple, direct approach (avoid over-engineering at first).
+    - Add logging throughout the code to facilitate debugging.
+
+6. **Optimization**
+    - **Prompt Engineering**: Use clear, specific instructions with illustrative examples to reduce ambiguity.
+    - **Task Decomposition**: Break large or complex tasks into manageable, logical steps.
+
+7. **Reliability**
+    - **Structured Output**: Ensure outputs conform to the required format. Consider increasing `max_retries` if needed.
+    - **Test Cases**: Develop clear, reproducible tests for each part of the flow.
+    - **Self-Evaluation**: Introduce an additional node (powered by LLMs) to review outputs when results are uncertain.
+
+
 ## Example LLM Project File Structure
 
 ```
@@ -93,39 +138,3 @@ def test_call_llm():
     prompt = "Hello, how are you?"
     assert call_llm(prompt) is not None
 ```
-
-## System Design Steps
-
-1. **Project Requirements**
-    - Identify the project's core entities.
-    - Define each functional requirement and map out how these entities interact step by step.
-
-2. **Utility Functions**
-    - Determine the low-level utility functions you’ll need (e.g., for LLM calls, web searches, file handling).
-    - Implement these functions and write basic tests to confirm they work correctly.
-
-3. **Flow Design**
-    - Develop a high-level process flow of nodes that meets the project’s requirements.
-    - For each node in your flow:
-        - **prep**: Determine how data is accessed or retrieved.
-        - **exec**: Outline the utility function to be used. Ideally, one utility function per node. Highlight the `utility function` name.
-        - **post**: Handle any final updates or data persistence tasks.
-    - Identify possible decision points for *Node Actions* and data-intensive operations for *Batch* tasks.
-    - Illustrate the flow with a Mermaid diagram.
-
-4. **Data Structure**
-    - Decide how to store and update state, whether in memory (for smaller applications) or a database (for larger or persistent needs).
-    - Define data schemas or models that detail how information is stored, accessed, and updated.
-
-5. **Implementation**
-    - For each node, implement the `prep`, `exec`, and `post` functions based on the flow design.
-    - Start coding with a simple, direct approach (avoid over-engineering at first).
-
-6. **Optimization**
-    - **Prompt Engineering**: Use clear and specific instructions with illustrative examples to reduce ambiguity.
-    - **Task Decomposition**: Break large, complex tasks into manageable, logical steps.
-
-7. **Reliability**
-    - **Structured Output**: Verify outputs conform to the required format. Consider increasing `max_retries` if needed.
-    - **Test Cases**: Develop clear, reproducible tests for each part of the flow.
-    - **Self-Evaluation**: Introduce an additional Node (powered by LLMs) to review outputs when the results are uncertain.
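
The prep/exec/post split from the Flow Design and Implementation steps, together with the `max_retries` and logging advice, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the project's actual API: the `Node` base class, its `run` method, the `SummarizeNode` example, the in-memory `shared` dict used for state, and the offline `call_llm` stub that stands in for a real LLM utility.

```python
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("flow_sketch")


def call_llm(prompt: str) -> str:
    """Hypothetical stand-in for an LLM utility; returns canned text so the sketch runs offline."""
    return f"summary: {len(prompt.split())} words"


class Node:
    """Hypothetical base class: prep reads state, exec does the work, post writes state."""

    def __init__(self, max_retries: int = 1):
        self.max_retries = max_retries

    def prep(self, shared: dict):
        return None

    def exec(self, prep_result):
        return None

    def post(self, shared: dict, prep_result, exec_result) -> None:
        pass

    def run(self, shared: dict) -> None:
        prep_result = self.prep(shared)
        last_error = None
        # Retry exec only; a ValueError signals output in the wrong format.
        for attempt in range(1, self.max_retries + 1):
            try:
                exec_result = self.exec(prep_result)
                break
            except ValueError as err:
                last_error = err
                log.warning("attempt %d/%d failed: %s", attempt, self.max_retries, err)
        else:
            raise RuntimeError("exec failed after all retries") from last_error
        self.post(shared, prep_result, exec_result)


class SummarizeNode(Node):
    """Example node that wraps exactly one utility function (call_llm)."""

    def prep(self, shared: dict) -> str:
        # prep: how data is accessed (here, from the in-memory shared dict).
        return shared["text"]

    def exec(self, text: str) -> str:
        # exec: call the utility function and check the output format.
        reply = call_llm(f"Summarize: {text}")
        prefix = "summary:"
        if not reply.startswith(prefix):
            raise ValueError("unexpected output format")
        return reply[len(prefix):].strip()

    def post(self, shared: dict, text: str, summary: str) -> None:
        # post: how data is persisted (here, written back to the shared dict).
        shared["summary"] = summary
        log.info("stored summary for %d input characters", len(text))


if __name__ == "__main__":
    shared = {"text": "Hello world"}
    SummarizeNode(max_retries=2).run(shared)
    print(shared["summary"])
```

In this sketch only `exec` is retried, so `prep` and `post` can stay free of LLM-specific error handling. A database-backed store could replace the `shared` dict without changing the node interface.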