add chat with memory tutorial

This commit is contained in:
zachary62 2025-03-21 15:28:55 -04:00
parent a0a1a4cadd
commit 154e37528b
13 changed files with 555 additions and 309 deletions

View File

@ -1,91 +1,105 @@
# Retrieval Augmented Generation (RAG)
# PocketFlow Chat with Memory
This project demonstrates a simplified RAG system that retrieves relevant documents based on user queries.
A chat application with memory retrieval using PocketFlow. This example maintains a sliding window of recent conversations while retrieving relevant past conversations based on context.
## Features
- Simple vector-based document retrieval
- Two-stage pipeline (offline indexing, online querying)
- FAISS-powered similarity search
- Maintains a window of 3 most recent conversation pairs
- Archives older conversations with embeddings
- Uses vector similarity to retrieve the most relevant past conversation
- Combines recent context (3 pairs) with retrieved context (1 pair) for better responses
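Conceptually, the memory policy is small. The following is an illustrative sketch only; `embed_pair` and `find_similar` are placeholder names, not the actual functions in `nodes.py`:
```python
# Illustrative sketch of the memory policy -- not the actual nodes.py logic.
# embed_pair / find_similar are placeholders for the embedding and
# vector-search utilities.
MAX_RECENT_MESSAGES = 6  # 3 user/assistant pairs

def build_context(messages, archive, query):
    # Archive the oldest pair once more than 3 pairs sit in the active window
    while len(messages) > MAX_RECENT_MESSAGES:
        oldest_pair, messages = messages[:2], messages[2:]
        archive.append((embed_pair(oldest_pair), oldest_pair))
    # Retrieve at most 1 relevant archived pair and prepend it to the window
    retrieved = find_similar(archive, query, k=1)
    return (retrieved or []) + messages
```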
## Getting Started
## Run It
1. Install the required dependencies:
1. Make sure your OpenAI API key is set:
```bash
export OPENAI_API_KEY="your-api-key-here"
```
```bash
pip install -r requirements.txt
```
2. Run the application with a sample query:
```bash
python main.py --"Large Language Model"
```
3. Or run without arguments to use the default query:
```bash
python main.py
```
## API Key
By default, the demo uses a dummy embedding based on character frequencies. To use real OpenAI embeddings:
1. Edit nodes.py to replace the dummy `get_embedding` with `get_openai_embedding`:
```python
# Change this line:
query_embedding = get_embedding(query)
# To this:
query_embedding = get_openai_embedding(query)
# And also change this line:
return get_embedding(text)
# To this:
return get_openai_embedding(text)
```
2. Make sure your OpenAI API key is set:
```bash
export OPENAI_API_KEY="your-api-key-here"
```
2. Install requirements and run the application:
```bash
pip install -r requirements.txt
python main.py
```
## How It Works
The magic happens through a two-stage pipeline implemented with PocketFlow:
```mermaid
graph TD
subgraph OfflineFlow[Offline Document Indexing]
EmbedDocs[EmbedDocumentsNode] --> CreateIndex[CreateIndexNode]
end
subgraph OnlineFlow[Online Query Processing]
EmbedQuery[EmbedQueryNode] --> RetrieveDoc[RetrieveDocumentNode]
end
flowchart LR
Question[GetUserQuestionNode] -->|retrieve| Retrieve[RetrieveNode]
Retrieve -->|answer| Answer[AnswerNode]
Answer -->|question| Question
Answer -->|embed| Embed[EmbedNode]
Embed -->|question| Question
```
Here's what each part does:
1. **EmbedDocumentsNode**: Converts documents into vector representations
2. **CreateIndexNode**: Creates a searchable FAISS index from embeddings
3. **EmbedQueryNode**: Converts user query into the same vector space
4. **RetrieveDocumentNode**: Finds the most similar document using vector search
The chat application uses:
- Four specialized nodes:
- `GetUserQuestionNode`: Handles interactive user input
- `RetrieveNode`: Finds relevant past conversations using vector similarity
- `AnswerNode`: Generates responses using both recent and retrieved context
- `EmbedNode`: Archives older conversations with embeddings
- A sliding window approach that maintains only the 3 most recent conversation pairs in active context
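The edge labels in the diagram are PocketFlow actions: a node's `post` method returns an action string, and the flow follows the transition registered under that label (the real wiring lives in `flow.py`). A toy sketch of the pattern, with made-up node names:
```python
# Toy sketch (not this project's flow.py): the string returned from post()
# is the action that selects which labeled transition to follow.
from pocketflow import Node, Flow

class AskNode(Node):
    def post(self, shared, prep_res, exec_res):
        return "retrieve"   # follow the edge labeled "retrieve"

class SearchNode(Node):
    def post(self, shared, prep_res, exec_res):
        return "answer"

ask, search = AskNode(), SearchNode()
ask - "retrieve" >> search   # transition taken when AskNode returns "retrieve"
flow = Flow(start=ask)
```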
## Files
- [`nodes.py`](./nodes.py): Four node implementations with clear separation of concerns
- [`flow.py`](./flow.py): Chat flow structure definition
- [`main.py`](./main.py): Entry point for running the demo
- [`utils/`](./utils/): Utility functions for embeddings, LLM calls, and vector operations
## Example Output
```
✅ Created 5 document embeddings
🔍 Creating search index...
✅ Index created with 5 vectors
🔍 Embedding query: Large Language Model
🔎 Searching for relevant documents...
📄 Retrieved document (index: 3, distance: 0.3296)
📄 Most relevant text: "PocketFlow is a 100-line Large Language Model Framework."
==================================================
PocketFlow Chat with Memory
==================================================
This chat keeps your 3 most recent conversations
and brings back relevant past conversations when helpful
Type 'exit' to end the conversation
==================================================
Welcome to the interactive chat! Type 'exit' to end the conversation.
You: remember, my cat name is Whiskers and she only drinks from running water taps
Assistant: Got it! Whiskers only drinks from running water taps. If you have any questions or need advice related to her, feel free to ask!
You: remember, I have a peanut allergy but I can eat almonds and cashews without problems
Assistant: Understood! You have a peanut allergy but can safely eat almonds and cashews. If you need any assistance or information related to your dietary preferences or restrictions, just let me know!
You: remember, my anniversary is June 17th and I always forget to buy a gift until the last minute
Assistant: Got it! Your anniversary is on June 17th, and you often remember to buy a gift at the last minute. If you'd like, I can help you brainstorm gift ideas or set reminders as the date approaches!
You: remember, I lived in Portugal for 3 years when I was a child and still speak some Portuguese
Assistant: That's great to know! You lived in Portugal for three years during your childhood and still speak some Portuguese. If you ever want to practice or need help with anything related to Portuguese language or culture, feel free to ask!
✅ Added conversation to index at position 0
✅ Index now contains 1 conversations
You: remember, my favorite dessert is tiramisu but only if it's made with real mascarpone cheese
🔍 Finding relevant conversation for: remember, my favorite dessert ...
📄 Retrieved conversation (distance: 0.5008)
Assistant: Got it! Your favorite dessert is tiramisu, but only when it's made with real mascarpone cheese. If you ever need recommendations or recipes, just let me know!
✅ Added conversation to index at position 1
✅ Index now contains 2 conversations
You: remember, I collect vintage mechanical watches and my most valuable one is a 1965 Omega Seamaster
🔍 Finding relevant conversation for: remember, I collect vintage me...
📄 Retrieved conversation (distance: 0.5374)
Assistant: Got it! You collect vintage mechanical watches, and your most valuable piece is a 1965 Omega Seamaster. If you have questions about watches or need assistance with your collection, feel free to reach out!
✅ Added conversation to index at position 2
✅ Index now contains 3 conversations
You: what's my cat name?
🔍 Finding relevant conversation for: what's my cat name?...
📄 Retrieved conversation (distance: 0.3643)
Assistant: Your cat's name is Whiskers.
✅ Added conversation to index at position 3
✅ Index now contains 4 conversations
```
## Files
- [`main.py`](./main.py): Main entry point for running the RAG demonstration
- [`flow.py`](./flow.py): Configures the flows that connect the nodes
- [`nodes.py`](./nodes.py): Defines the nodes for document processing and retrieval
- [`utils.py`](./utils.py): Utility functions including the embedding function

View File

@ -1,22 +1,33 @@
from pocketflow import Flow
from nodes import EmbedDocumentsNode, CreateIndexNode, EmbedQueryNode, RetrieveDocumentNode
from nodes import GetUserQuestionNode, RetrieveNode, AnswerNode, EmbedNode
def get_offline_flow():
# Create offline flow for document indexing
embed_docs_node = EmbedDocumentsNode()
create_index_node = CreateIndexNode()
embed_docs_node >> create_index_node
offline_flow = Flow(start=embed_docs_node)
return offline_flow
def create_chat_flow():
# Create the nodes
question_node = GetUserQuestionNode()
retrieve_node = RetrieveNode()
answer_node = AnswerNode()
embed_node = EmbedNode()
def get_online_flow():
# Create online flow for document retrieval
embed_query_node = EmbedQueryNode()
retrieve_doc_node = RetrieveDocumentNode()
embed_query_node >> retrieve_doc_node
online_flow = Flow(start=embed_query_node)
return online_flow
# Connect the flow:
# 1. Start with getting a question
# 2. Retrieve relevant conversations
# 3. Generate an answer
# 4. Optionally embed old conversations
# 5. Loop back to get the next question
# Initialize flows
offline_flow = get_offline_flow()
online_flow = get_online_flow()
# Main flow path
question_node - "retrieve" >> retrieve_node
retrieve_node - "answer" >> answer_node
# When we need to embed old conversations
answer_node - "embed" >> embed_node
# Loop back for next question
answer_node - "question" >> question_node
embed_node - "question" >> question_node
# Create the flow starting with question node
return Flow(start=question_node)
# Initialize the flow
chat_flow = create_chat_flow()

View File

@ -1,55 +1,26 @@
import sys
from flow import offline_flow, online_flow
from flow import chat_flow
def run_rag_demo():
def run_chat_memory_demo():
"""
Run a demonstration of the RAG system.
Run an interactive chat interface with memory retrieval.
This function:
1. Indexes a set of sample documents (offline flow)
2. Takes a query from the command line
3. Retrieves the most relevant document (online flow)
Features:
1. Maintains a window of the 3 most recent conversation pairs
2. Archives older conversations with embeddings
3. Retrieves 1 relevant past conversation when needed
4. Total context to LLM: 3 recent pairs + 1 retrieved pair
"""
# Sample texts - corpus of documents to search
texts = [
"The quick brown fox jumps over the lazy dog.",
"Machine learning is a subset of artificial intelligence.",
"Python is a popular programming language for data science.",
"PocketFlow is a 100-line Large Language Model Framework.",
"The weather is sunny and warm today.",
]
print("=" * 50)
print("PocketFlow RAG Document Retrieval")
print("PocketFlow Chat with Memory")
print("=" * 50)
print("This chat keeps your 3 most recent conversations")
print("and brings back relevant past conversations when helpful")
print("Type 'exit' to end the conversation")
print("=" * 50)
# Default query
default_query = "Large Language Model"
# Get query from command line if provided with --
query = default_query
for arg in sys.argv[1:]:
if arg.startswith("--"):
query = arg[2:]
break
# Single shared store for both flows
shared = {
"texts": texts,
"embeddings": None,
"index": None,
"query": query,
"query_embedding": None,
"retrieved_document": None
}
# Initialize and run the offline flow (document indexing)
offline_flow.run(shared)
# Run the online flow to retrieve the most relevant document
online_flow.run(shared)
# Run the chat flow
chat_flow.run({})
if __name__ == "__main__":
run_rag_demo()
run_chat_memory_demo()

View File

@ -1,95 +1,203 @@
from pocketflow import Node, Flow, BatchNode
import numpy as np
import faiss
from utils import get_embedding, get_openai_embedding
from pocketflow import Node
from utils.vector_index import create_index, add_vector, search_vectors
from utils.call_llm import call_llm
from utils.get_embedding import get_embedding
# Nodes for the offline flow
class EmbedDocumentsNode(BatchNode):
class GetUserQuestionNode(Node):
def prep(self, shared):
"""Read texts from shared store and return as an iterable"""
return shared["texts"]
"""Initialize messages if first run"""
if "messages" not in shared:
shared["messages"] = []
print("Welcome to the interactive chat! Type 'exit' to end the conversation.")
def exec(self, text):
"""Embed a single text"""
return get_embedding(text)
return None
def post(self, shared, prep_res, exec_res_list):
"""Store embeddings in the shared store"""
embeddings = np.array(exec_res_list, dtype=np.float32)
shared["embeddings"] = embeddings
print(f"✅ Created {len(embeddings)} document embeddings")
return "default"
def exec(self, _):
"""Get user input interactively"""
# Get interactive input from user
user_input = input("\nYou: ")
class CreateIndexNode(Node):
def prep(self, shared):
"""Get embeddings from shared store"""
return shared["embeddings"]
# Check if user wants to exit
if user_input.lower() == 'exit':
return None
def exec(self, embeddings):
"""Create FAISS index and add embeddings"""
print("🔍 Creating search index...")
dimension = embeddings.shape[1]
# Create a flat L2 index
index = faiss.IndexFlatL2(dimension)
# Add the embeddings to the index
index.add(embeddings)
return index
return user_input
def post(self, shared, prep_res, exec_res):
"""Store the index in shared store"""
shared["index"] = exec_res
print(f"✅ Index created with {exec_res.ntotal} vectors")
return "default"
# If exec_res is None, the user wants to exit
if exec_res is None:
print("\nGoodbye!")
return None # End the conversation
# Nodes for the online flow
class EmbedQueryNode(Node):
# Add user message to current messages
shared["messages"].append({"role": "user", "content": exec_res})
return "retrieve"
class AnswerNode(Node):
def prep(self, shared):
"""Get query from shared store"""
return shared["query"]
"""Prepare context for the LLM"""
if not shared.get("messages"):
return None
def exec(self, query):
"""Embed the query"""
print(f"🔍 Embedding query: {query}")
query_embedding = get_embedding(query)
return np.array([query_embedding], dtype=np.float32)
# 1. Get the last 3 conversation pairs (or fewer if not available)
recent_messages = shared["messages"][-6:] if len(shared["messages"]) > 6 else shared["messages"]
# 2. Add the retrieved relevant conversation if available
context = []
if shared.get("retrieved_conversation"):
# Add a system message to indicate this is a relevant past conversation
context.append({
"role": "system",
"content": "The following is a relevant past conversation that may help with the current query:"
})
context.extend(shared["retrieved_conversation"])
context.append({
"role": "system",
"content": "Now continue the current conversation:"
})
# 3. Add the recent messages
context.extend(recent_messages)
return context
def exec(self, messages):
"""Generate a response using the LLM"""
if messages is None:
return None
# Call LLM with the context
response = call_llm(messages)
return response
def post(self, shared, prep_res, exec_res):
"""Store query embedding in shared store"""
shared["query_embedding"] = exec_res
return "default"
"""Process the LLM response"""
if prep_res is None or exec_res is None:
return None # End the conversation
class RetrieveDocumentNode(Node):
# Print the assistant's response
print(f"\nAssistant: {exec_res}")
# Add assistant message to history
shared["messages"].append({"role": "assistant", "content": exec_res})
# If we have more than 6 messages (3 conversation pairs), archive the oldest pair
if len(shared["messages"]) > 6:
return "embed"
# We only end if the user explicitly typed 'exit';
# otherwise loop back to get the next question in interactive mode
return "question"
class EmbedNode(Node):
def prep(self, shared):
"""Get query embedding, index, and texts from shared store"""
return shared["query_embedding"], shared["index"], shared["texts"]
"""Extract the oldest conversation pair for embedding"""
if len(shared["messages"]) <= 6:
return None
def exec(self, inputs):
"""Search the index for similar documents"""
print("🔎 Searching for relevant documents...")
query_embedding, index, texts = inputs
# Extract the oldest user-assistant pair
oldest_pair = shared["messages"][:2]
# Remove them from current messages
shared["messages"] = shared["messages"][2:]
# Search for the most similar document
distances, indices = index.search(query_embedding, k=1)
return oldest_pair
# Get the index of the most similar document
best_idx = indices[0][0]
distance = distances[0][0]
def exec(self, conversation):
"""Embed a conversation"""
if not conversation:
return None
# Get the corresponding text
most_relevant_text = texts[best_idx]
# Combine user and assistant messages into a single text for embedding
user_msg = next((msg for msg in conversation if msg["role"] == "user"), {"content": ""})
assistant_msg = next((msg for msg in conversation if msg["role"] == "assistant"), {"content": ""})
combined = f"User: {user_msg['content']} Assistant: {assistant_msg['content']}"
# Generate embedding
embedding = get_embedding(combined)
return {
"text": most_relevant_text,
"index": best_idx,
"distance": distance
"conversation": conversation,
"embedding": embedding
}
def post(self, shared, prep_res, exec_res):
"""Store retrieved document in shared store"""
shared["retrieved_document"] = exec_res
print(f"📄 Retrieved document (index: {exec_res['index']}, distance: {exec_res['distance']:.4f})")
print(f"📄 Most relevant text: \"{exec_res['text']}\"")
return "default"
"""Store the embedding and add to index"""
if not exec_res:
# If there's nothing to embed, just continue with the next question
return "question"
# Initialize the vector index if it doesn't exist yet
if "vector_index" not in shared:
shared["vector_index"] = create_index()
shared["vector_items"] = [] # Track items separately
# Add the embedding to the index and store the conversation
position = add_vector(shared["vector_index"], exec_res["embedding"])
shared["vector_items"].append(exec_res["conversation"])
print(f"✅ Added conversation to index at position {position}")
print(f"✅ Index now contains {len(shared['vector_items'])} conversations")
# Continue with the next question
return "question"
class RetrieveNode(Node):
def prep(self, shared):
"""Get the current query for retrieval"""
if not shared.get("messages"):
return None
# Get the latest user message for searching
latest_user_msg = next((msg for msg in reversed(shared["messages"])
if msg["role"] == "user"), {"content": ""})
# Check if we have a vector index with items
if ("vector_index" not in shared or
"vector_items" not in shared or
len(shared["vector_items"]) == 0):
return None
return {
"query": latest_user_msg["content"],
"vector_index": shared["vector_index"],
"vector_items": shared["vector_items"]
}
def exec(self, inputs):
"""Find the most relevant past conversation"""
if not inputs:
return None
query = inputs["query"]
vector_index = inputs["vector_index"]
vector_items = inputs["vector_items"]
print(f"🔍 Finding relevant conversation for: {query[:30]}...")
# Create embedding for the query
query_embedding = get_embedding(query)
# Search for the most similar conversation
indices, distances = search_vectors(vector_index, query_embedding, k=1)
if not indices:
return None
# Get the corresponding conversation
conversation = vector_items[indices[0]]
return {
"conversation": conversation,
"distance": distances[0]
}
def post(self, shared, prep_res, exec_res):
"""Store the retrieved conversation"""
if exec_res is not None:
shared["retrieved_conversation"] = exec_res["conversation"]
print(f"📄 Retrieved conversation (distance: {exec_res['distance']:.4f})")
else:
shared["retrieved_conversation"] = None
return "answer"

View File

@ -1,79 +0,0 @@
import os
import numpy as np
from openai import OpenAI
def get_embedding(text):
"""
A simple embedding function that converts text to vector.
In a real application, you would use a proper embedding model like OpenAI,
Hugging Face, or other embedding services. For this example, we'll use a
simple approach based on character frequencies for demonstration purposes.
"""
# Create a simple embedding (128-dimensional) based on character frequencies
# This is just for demonstration - not a real embedding algorithm!
embedding = np.zeros(128, dtype=np.float32)
# Generate a deterministic but distributed embedding based on character frequency
for i, char in enumerate(text):
# Use modulo to distribute values across the embedding dimensions
pos = ord(char) % 128
embedding[pos] += 1.0
# Normalize the embedding
norm = np.linalg.norm(embedding)
if norm > 0:
embedding = embedding / norm
return embedding
def get_openai_embedding(text):
client = OpenAI(api_key=os.environ.get("OPENAI_API_KEY", "YOUR_API_KEY"))
response = client.embeddings.create(
model="text-embedding-ada-002",
input=text
)
# Extract the embedding vector from the response
embedding = response.data[0].embedding
# Convert to numpy array for consistency with other embedding functions
return np.array(embedding, dtype=np.float32)
if __name__ == "__main__":
# Test the embedding function
text1 = "The quick brown fox jumps over the lazy dog."
text2 = "Python is a popular programming language for data science."
emb1 = get_embedding(text1)
emb2 = get_embedding(text2)
print(f"Embedding 1 shape: {emb1.shape}")
print(f"Embedding 2 shape: {emb2.shape}")
# Calculate similarity (dot product)
similarity = np.dot(emb1, emb2)
print(f"Similarity between texts: {similarity:.4f}")
# Compare with a different text
text3 = "Machine learning is a subset of artificial intelligence."
emb3 = get_embedding(text3)
similarity13 = np.dot(emb1, emb3)
similarity23 = np.dot(emb2, emb3)
print(f"Similarity between text1 and text3: {similarity13:.4f}")
print(f"Similarity between text2 and text3: {similarity23:.4f}")
# These simple comparisons should show higher similarity
# between related concepts (text2 and text3) than between
# unrelated texts (text1 and text3)
# Uncomment to test OpenAI embeddings (requires API key)
print("\nTesting OpenAI embeddings (requires API key):")
oai_emb1 = get_openai_embedding(text1)
oai_emb2 = get_openai_embedding(text2)
print(f"OpenAI Embedding 1 shape: {oai_emb1.shape}")
oai_similarity = np.dot(oai_emb1, oai_emb2)
print(f"OpenAI similarity between texts: {oai_similarity:.4f}")

View File

@ -0,0 +1 @@

View File

@ -0,0 +1,20 @@
import os
from openai import OpenAI
def call_llm(messages):
client = OpenAI(api_key=os.environ.get("OPENAI_API_KEY", "your-api-key"))
response = client.chat.completions.create(
model="gpt-4o",
messages=messages,
temperature=0.7
)
return response.choices[0].message.content
if __name__ == "__main__":
# Test the LLM call
messages = [{"role": "user", "content": "In a few words, what's the meaning of life?"}]
response = call_llm(messages)
print(f"Prompt: {messages[0]['content']}")
print(f"Response: {response}")

View File

@ -0,0 +1,33 @@
import os
import numpy as np
from openai import OpenAI
def get_embedding(text):
client = OpenAI(api_key=os.environ.get("OPENAI_API_KEY", "YOUR_API_KEY"))
response = client.embeddings.create(
model="text-embedding-ada-002",
input=text
)
# Extract the embedding vector from the response
embedding = response.data[0].embedding
# Convert to numpy array for consistency with other embedding functions
return np.array(embedding, dtype=np.float32)
if __name__ == "__main__":
# Test the embedding function
text1 = "The quick brown fox jumps over the lazy dog."
text2 = "Python is a popular programming language for data science."
emb1 = get_embedding(text1)
emb2 = get_embedding(text2)
print(f"Embedding 1 shape: {emb1.shape}")
print(f"Embedding 2 shape: {emb2.shape}")
# Calculate similarity (dot product)
similarity = np.dot(emb1, emb2)
print(f"Similarity between texts: {similarity:.4f}")

View File

@ -0,0 +1,65 @@
import numpy as np
import faiss
def create_index(dimension=1536):
return faiss.IndexFlatL2(dimension)
def add_vector(index, vector):
# Make sure the vector is a numpy array with the right shape for FAISS
vector = np.array(vector).reshape(1, -1).astype(np.float32)
# Add the vector to the index
index.add(vector)
# Return the position (index.ntotal is the total number of vectors in the index)
return index.ntotal - 1
def search_vectors(index, query_vector, k=1):
"""Search for the k most similar vectors to the query vector
Args:
index: The FAISS index
query_vector: The query vector (numpy array or list)
k: Number of results to return (default: 1)
Returns:
tuple: (indices, distances) where:
- indices is a list of positions in the index
- distances is a list of the corresponding distances
"""
# Make sure we don't try to retrieve more vectors than exist in the index
k = min(k, index.ntotal)
if k == 0:
return [], []
# Make sure the query is a numpy array with the right shape for FAISS
query_vector = np.array(query_vector).reshape(1, -1).astype(np.float32)
# Search the index
distances, indices = index.search(query_vector, k)
return indices[0].tolist(), distances[0].tolist()
# Example usage
if __name__ == "__main__":
# Create a new index
index = create_index(dimension=3)
# Add some random vectors and track them separately
items = []
for i in range(5):
vector = np.random.random(3)
position = add_vector(index, vector)
items.append(f"Item {i}")
print(f"Added vector at position {position}")
print(f"Index contains {index.ntotal} vectors")
# Search for a similar vector
query = np.random.random(3)
indices, distances = search_vectors(index, query, k=2)
print("Query:", query)
print("Found indices:", indices)
print("Distances:", distances)
print("Retrieved items:", [items[idx] for idx in indices])

View File

@ -0,0 +1,92 @@
import numpy as np
import faiss
def create_index(dimension=128):
"""Create a new vector index for fast similarity search
Args:
dimension: The dimensionality of the vectors to be indexed
Returns:
tuple: (index, items_list) where:
- index is the FAISS index for searching
- items_list is an empty list for storing the items
"""
# Create a flat (exact, brute-force) index for storing vectors
index = faiss.IndexFlatL2(dimension)
# Initialize an empty list to store the actual items
items_list = []
return index, items_list
def add_to_index(index, items_list, embedding, item):
"""Add an item and its vector representation to the index
Args:
index: The FAISS index
items_list: The list of items corresponding to vectors in the index
embedding: The vector representation of the item (numpy array)
item: The actual item to store
Returns:
int: The position where the item was added
"""
# Make sure the embedding is a numpy array with the right shape for FAISS
vector = np.array(embedding).reshape(1, -1).astype(np.float32)
# Add the vector to the index
index.add(vector)
# Store the item and return its position
items_list.append(item)
return len(items_list) - 1
def search_index(index, items_list, query_embedding, k=1):
"""Search for the k most similar items to the query vector
Args:
index: The FAISS index
items_list: The list of items corresponding to vectors in the index
query_embedding: The query vector (numpy array)
k: Number of results to return (default: 1)
Returns:
tuple: (found_items, distances) where:
- found_items is a list of the k most similar items
- distances is a list of the corresponding distances
"""
# Make sure we don't try to retrieve more items than exist in the index
k = min(k, len(items_list))
if k == 0:
return [], []
# Make sure the query is a numpy array with the right shape for FAISS
query_vector = np.array(query_embedding).reshape(1, -1).astype(np.float32)
# Search the index
D, I = index.search(query_vector, k)
# Get the items corresponding to the found indices
found_items = [items_list[i] for i in I[0]]
distances = D[0].tolist()
return found_items, distances
# Example usage
if __name__ == "__main__":
# Create a new index
index, items = create_index(dimension=3)
# Add some random vectors and items
for i in range(5):
vector = np.random.random(3)
add_to_index(index, items, vector, f"Item {i}")
print(f"Added {len(items)} items to the index")
# Search for a similar vector
query = np.random.random(3)
found_items, distances = search_index(index, items, query, k=2)
print("Query:", query)
print("Found items:", found_items)
print("Distances:", distances)

View File

@ -39,6 +39,6 @@ The chat application uses:
## Files
- `main.py`: Implementation of the ChatNode and chat flow
- `utils.py`: Simple wrapper for calling the OpenAI API
- [`main.py`](./main.py): Implementation of the ChatNode and chat flow
- [`utils.py`](./utils.py): Simple wrapper for calling the OpenAI API

View File

@ -69,7 +69,7 @@ For comparison:
- [GPT-o1 pro with thinking](https://chatgpt.com/share/67dcb1bf-ceb0-8000-823a-8ce894032e37): Correct answer after 1.5 min
Below is an example of how Claude 3.7 Sonnet uses thinking mode to solve this complex problem, and get the correct result:
Below is an example of how Claude 3.7 Sonnet (without native thinking) solves this complex problem and gets the correct result:
```
🤔 Processing question: Break a stick, then break the longer piece again. What's the probability of forming a triangle?

View File

@ -140,6 +140,7 @@ Agentic Coding should be a collaboration between Human System Design and Agent I
```
my_project/
├── main.py
├── nodes.py
├── flow.py
├── utils/
│ ├── __init__.py
@ -154,13 +155,12 @@ my_project/
- **`utils/`**: Contains all utility functions.
- It's recommended to dedicate one Python file to each API call, for example `call_llm.py` or `search_web.py`.
- Each file should also include a `main()` function to try that API call
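For example, a `utils/call_llm.py` could look like the minimal sketch below (it assumes the OpenAI Python client, mirroring the utility used elsewhere in this repo; swap in whichever provider you use):
```python
# utils/call_llm.py
import os
from openai import OpenAI

def call_llm(messages):
    client = OpenAI(api_key=os.environ.get("OPENAI_API_KEY"))
    response = client.chat.completions.create(model="gpt-4o", messages=messages)
    return response.choices[0].message.content

def main():
    # Quick manual test for this utility
    print(call_llm([{"role": "user", "content": "Hello!"}]))

if __name__ == "__main__":
    main()
```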
- **`flow.py`**: Implements the system's flow, starting with node definitions followed by the overall structure.
- **`nodes.py`**: Contains all the node definitions.
```python
# flow.py
from pocketflow import Node, Flow
# nodes.py
from pocketflow import Node
from utils.call_llm import call_llm
# Example with two nodes in a flow
class GetQuestionNode(Node):
def exec(self, _):
# Get question directly from user input
@ -184,7 +184,15 @@ my_project/
def post(self, shared, prep_res, exec_res):
# Store the answer in shared
shared["answer"] = exec_res
```
- **`flow.py`**: Implements functions that create flows by importing node definitions and connecting them.
```python
# flow.py
from pocketflow import Flow
from nodes import GetQuestionNode, AnswerNode
def create_qa_flow():
"""Create and return a question-answering flow."""
# Create nodes
get_question_node = GetQuestionNode()
answer_node = AnswerNode()
@ -193,12 +201,12 @@ my_project/
get_question_node >> answer_node
# Create flow starting with input node
qa_flow = Flow(start=get_question_node)
return Flow(start=get_question_node)
```
- **`main.py`**: Serves as the project's entry point.
```python
# main.py
from flow import qa_flow
from flow import create_qa_flow
# Example main function
# Please replace this with your own main function
@ -208,6 +216,8 @@ my_project/
"answer": None # Will be populated by AnswerNode
}
# Create the flow and run it
qa_flow = create_qa_flow()
qa_flow.run(shared)
print(f"Question: {shared['question']}")
print(f"Answer: {shared['answer']}")