add supervisor

2025-03-20 17:59:34 -04:00 · 136387bbaa
parent 3544dd5440
commit 136387bbaa
5 changed files with 137 additions and 28 deletions
--- a/cookbook/pocketflow-agent/README.md
+++ b/cookbook/pocketflow-agent/README.md
@@ -1,6 +1,12 @@
-# PocketFlow Research Agent - Tutorial for Dummy
+# Research Agent

-This project demonstrates a simple yet powerful LLM-powered research agent built with PocketFlow, a minimalist LLM framework in just 100 lines of code! This implementation is based directly on the tutorial post [LLM Agents are simply Graph — Tutorial For Dummies](https://zacharyhuang.substack.com/p/llm-agent-internal-as-a-graph-tutorial).
+This project demonstrates a simple yet powerful LLM-powered research agent. This implementation is based directly on the tutorial: [LLM Agents are simply Graph — Tutorial For Dummies](https://zacharyhuang.substack.com/p/llm-agent-internal-as-a-graph-tutorial).
+
+## Features
+
+- Performs web searches to gather information
+- Makes decisions about when to search vs. when to answer
+- Generates comprehensive answers based on research findings

 ## Getting Started

--- a/cookbook/pocketflow-supervisor/README.md
+++ b/cookbook/pocketflow-supervisor/README.md
@@ -1,6 +1,12 @@
-# PocketFlow Research Agent - Tutorial for Dummy
+# Supervisor Agent

-This project demonstrates a simple yet powerful LLM-powered research agent built with PocketFlow, a minimalist LLM framework in just 100 lines of code! This implementation is based directly on the tutorial post [LLM Agents are simply Graph — Tutorial For Dummies](https://zacharyhuang.substack.com/p/llm-agent-internal-as-a-graph-tutorial).
+This project demonstrates a supervisor that oversees an unreliable [research agent](../pocketflow-agent) to ensure high-quality answers.
+
+## Features
+
+- Evaluates responses for quality and relevance
+- Rejects nonsensical or unreliable answers
+- Requests new answers until a quality response is produced

 ## Getting Started

@@ -37,22 +43,54 @@ python main.py --"What is quantum computing?"

 ## How It Works?

-The magic happens through a simple but powerful graph structure with three main parts:
+The magic happens through a simple but powerful graph structure with these main components:

 ```mermaid
 graph TD
-    A[DecideAction] -->|"search"| B[SearchWeb]
-    A -->|"answer"| C[AnswerQuestion]
-    B -->|"decide"| A
+    subgraph InnerAgent[Inner Research Agent]
+        DecideAction -->|"search"| SearchWeb
+        DecideAction -->|"answer"| UnreliableAnswerNode
+        SearchWeb -->|"decide"| DecideAction
+    end
+    
+    InnerAgent --> SupervisorNode
+    SupervisorNode -->|"retry"| InnerAgent
 ```

 Here's what each part does:
-1. **DecideAction**: The brain that figures out whether to search or answer
-2. **SearchWeb**: The researcher that goes out and finds information
-3. **AnswerQuestion**: The writer that crafts the final answer
+1. **DecideAction**: The brain that figures out whether to search or answer based on current context
+2. **SearchWeb**: The researcher that goes out and finds information using web search
+3. **UnreliableAnswerNode**: Generates answers (with a 50% chance of being unreliable)
+4. **SupervisorNode**: Quality control that validates answers and rejects nonsensical ones
+
+## Example Output
+
+```
+🤔 Processing question: Who won the Nobel Prize in Physics 2024?
+🤔 Agent deciding what to do next...
+🔍 Agent decided to search for: Nobel Prize in Physics 2024 winner
+🌐 Searching the web for: Nobel Prize in Physics 2024 winner
+📚 Found information, analyzing results...
+🤔 Agent deciding what to do next...
+💡 Agent decided to answer the question
+🤪 Generating unreliable dummy answer...
+✅ Answer generated successfully
+    🔍 Supervisor checking answer quality...
+    ❌ Supervisor rejected answer: Answer appears to be nonsensical or unhelpful
+🤔 Agent deciding what to do next...
+💡 Agent decided to answer the question
+✍️ Crafting final answer...
+✅ Answer generated successfully
+    🔍 Supervisor checking answer quality...
+    ✅ Supervisor approved answer: Answer appears to be legitimate
+
+🎯 Final Answer:
+The Nobel Prize in Physics for 2024 was awarded jointly to John J. Hopfield and Geoffrey Hinton. They were recognized "for foundational discoveries and inventions that enable machine learning with artificial neural networks." Their work has been pivotal in the field of artificial intelligence, specifically in developing the theories and technologies that support machine learning using artificial neural networks. John Hopfield is associated with Princeton University, while Geoffrey Hinton is connected to the University of Toronto. Their achievements have laid essential groundwork for advancements in AI and its widespread application across various domains.
+```
+
+## Files

-Here's what's in each file:
 - [`main.py`](./main.py): The starting point - runs the whole show!
- [`flow.py`](./flow.py): Connects everything together into a smart agent
- [`nodes.py`](./nodes.py): The building blocks that make decisions and take actions
+- [`flow.py`](./flow.py): Connects everything together into a smart agent with supervision
+- [`nodes.py`](./nodes.py): The building blocks that make decisions, take actions, and validate answers
 - [`utils.py`](./utils.py): Helper functions for talking to the LLM and searching the web
--- a/cookbook/pocketflow-supervisor/flow.py
+++ b/cookbook/pocketflow-supervisor/flow.py
@@ -1,18 +1,17 @@
 from pocketflow import Flow
-from nodes import DecideAction, SearchWeb, UnreliableAnswerNode
+from nodes import DecideAction, SearchWeb, UnreliableAnswerNode, SupervisorNode

-def create_agent_flow():
+def create_agent_inner_flow():
    """
-    Create and connect the nodes to form a complete agent flow.
+    Create the inner research agent flow without supervision.
    
-    The flow works like this:
+    This flow handles the research cycle:
    1. DecideAction node decides whether to search or answer
-    2. If search, go to SearchWeb node
-    3. If answer, go to UnreliableAnswerNode (which has a 50% chance of giving nonsense answers)
-    4. After SearchWeb completes, go back to DecideAction
+    2. If search, go to SearchWeb node and return to decide
+    3. If answer, go to UnreliableAnswerNode
    
    Returns:
-        Flow: A complete research agent flow with unreliable answering capability
+        Flow: A research agent flow
    """
    # Create instances of each node
    decide = DecideAction()
@@ -29,5 +28,35 @@ def create_agent_flow():
    # After SearchWeb completes and returns "decide", go back to DecideAction
    search - "decide" >> decide
    
-    # Create and return the flow, starting with the DecideAction node
-    return Flow(start=decide) 
+    # Create and return the inner flow, starting with the DecideAction node
+    return Flow(start=decide)
+
+def create_agent_flow():
+    """
+    Create a supervised agent flow by treating the entire agent flow as a node
+    and placing the supervisor outside of it.
+    
+    The flow works like this:
+    1. Inner agent flow does research and generates an answer
+    2. SupervisorNode checks if the answer is valid
+    3. If answer is valid, flow completes
+    4. If answer is invalid, restart the inner agent flow
+    
+    Returns:
+        Flow: A complete research agent flow with supervision
+    """
+    # Create the inner flow
+    agent_flow = create_agent_inner_flow()
+    
+    # Create the supervisor node
+    supervisor = SupervisorNode()
+    
+    # Connect the components
+    # After agent_flow completes, go to supervisor
+    agent_flow >> supervisor
+    
+    # If supervisor rejects the answer, go back to agent_flow
+    supervisor - "retry" >> agent_flow
+    
+    # Create and return the outer flow, starting with the agent_flow
+    return Flow(start=agent_flow) 
--- a/cookbook/pocketflow-supervisor/main.py
+++ b/cookbook/pocketflow-supervisor/main.py
@@ -2,7 +2,7 @@ import sys
 from flow import create_agent_flow

 def main():
-    """Simple function to process a question."""
+    """Simple function to process a question with supervised answers."""
    # Default question
    default_question = "Who won the Nobel Prize in Physics 2024?"
    
@@ -13,7 +13,7 @@ def main():
            question = arg[2:]
            break
    
-    # Create the agent flow
+    # Create the agent flow with supervision
    agent_flow = create_agent_flow()
    
    # Process the question
--- a/cookbook/pocketflow-supervisor/nodes.py
+++ b/cookbook/pocketflow-supervisor/nodes.py
@@ -129,6 +129,42 @@ Provide a comprehensive answer using the research results.
        shared["answer"] = exec_res
        
        print(f"✅ Answer generated successfully")
+
+class SupervisorNode(Node):
+    def prep(self, shared):
+        """Get the current answer for evaluation."""
+        return shared["answer"]
+    
+    def exec(self, answer):
+        """Check if the answer is valid or nonsensical."""
+        print(f"    🔍 Supervisor checking answer quality...")
        
-        # We're done - no need to continue the flow
-        return "done" 
+        # Check for obvious markers of the nonsense answers
+        nonsense_markers = [
+            "coffee break", 
+            "purple unicorns", 
+            "made up", 
+            "42", 
+            "Who knows?"
+        ]
+        
+        # Check if the answer contains any nonsense markers
+        is_nonsense = any(marker in answer for marker in nonsense_markers)
+        
+        if is_nonsense:
+            return {"valid": False, "reason": "Answer appears to be nonsensical or unhelpful"}
+        else:
+            return {"valid": True, "reason": "Answer appears to be legitimate"}
+    
+    def post(self, shared, prep_res, exec_res):
+        """Decide whether to accept the answer or restart the process."""
+        if exec_res["valid"]:
+            print(f"    ✅ Supervisor approved answer: {exec_res['reason']}")
+        else:
+            print(f"    ❌ Supervisor rejected answer: {exec_res['reason']}")
+            # Clean up the bad answer
+            shared["answer"] = None
+            # Add a note about the rejected answer
+            context = shared.get("context", "")
+            shared["context"] = context + "\n\nNOTE: Previous answer attempt was rejected by supervisor."
+            return "retry"
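
Note on the supervisor check and retry loop: the sketch below reproduces the logic of `SupervisorNode` in plain Python, without PocketFlow, so the behavior can be tried in isolation. The marker list, the two reason strings, and the rejection note are copied from the diff. The names `check_answer` and `run_supervised` and the `max_attempts` cap are hypothetical additions for illustration; the flow in this commit retries without a limit.

```python
# Same substrings as the nonsense_markers list in SupervisorNode.exec.
NONSENSE_MARKERS = ["coffee break", "purple unicorns", "made up", "42", "Who knows?"]


def check_answer(answer):
    """Mirror SupervisorNode.exec: case-sensitive substring matching."""
    is_nonsense = any(marker in answer for marker in NONSENSE_MARKERS)
    if is_nonsense:
        return {"valid": False, "reason": "Answer appears to be nonsensical or unhelpful"}
    return {"valid": True, "reason": "Answer appears to be legitimate"}


def run_supervised(generate_answer, max_attempts=5):
    """Hypothetical stand-in for the outer flow: ask the agent until approved.

    generate_answer is any callable that takes the shared dict and returns a
    string. max_attempts is an assumption added here as a safety cap.
    """
    shared = {"context": ""}
    for attempt in range(1, max_attempts + 1):
        shared["answer"] = generate_answer(shared)
        verdict = check_answer(shared["answer"])
        if verdict["valid"]:
            return shared["answer"], attempt
        # Same cleanup as SupervisorNode.post on rejection.
        shared["answer"] = None
        shared["context"] += "\n\nNOTE: Previous answer attempt was rejected by supervisor."
    raise RuntimeError(f"No acceptable answer after {max_attempts} attempts")


# Usage: a scripted agent that fails once, then gives a real answer.
scripted = iter([
    "Who knows? Maybe purple unicorns.",
    "John J. Hopfield and Geoffrey Hinton.",
])
final_answer, attempts_used = run_supervised(lambda shared: next(scripted))
print(final_answer, attempts_used)
```

One consequence of plain substring matching is worth keeping in mind: the marker `"42"` also matches inside longer numbers, so a legitimate answer that mentions a year such as 1942 would be rejected by this check.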