Retrieval Augmented Generation (RAG)
This project demonstrates a simplified RAG system that retrieves relevant documents based on user queries.
Features
- Simple vector-based document retrieval
- Two-stage pipeline (offline indexing, online querying)
- FAISS-powered similarity search
Getting Started
- Install the required dependencies:
pip install -r requirements.txt
- Run the application with a sample query:
python main.py --"Large Language Model"
- Or run without arguments to use the default query:
python main.py
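The project's argument handling isn't shown in this README, so here is a minimal, hypothetical sketch of how main.py might pick up a --"..." query from the command line; the parsing logic and the DEFAULT_QUERY value are assumptions for illustration, not the project's actual code:
import sys

DEFAULT_QUERY = "Large Language Model"  # assumed default, for illustration

def parse_query(argv):
    # Hypothetical: take the first argument starting with "--" as the query,
    # stripping the leading dashes; otherwise fall back to the default.
    for arg in argv[1:]:
        if arg.startswith("--"):
            return arg[2:]
    return DEFAULT_QUERY

query = parse_query(sys.argv)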
API Key
By default, the demo uses a dummy embedding based on character frequencies. To use real OpenAI embeddings:
- Edit nodes.py to replace the dummy get_embedding with get_openai_embedding:
# Change this line:
query_embedding = get_embedding(query)
# To this:
query_embedding = get_openai_embedding(query)
# And also change this line:
return get_embedding(text)
# To this:
return get_openai_embedding(text)
- Make sure your OpenAI API key is set:
export OPENAI_API_KEY="your-api-key-here"
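For reference, here is a hedged sketch of what the two embedding helpers in nodes.py could look like. The character-frequency scheme for the dummy is an assumption based on the description above, and the model name in the OpenAI call is illustrative; only the openai client usage itself is the library's standard API:
import numpy as np
from openai import OpenAI

def get_embedding(text):
    # Assumed dummy scheme: normalized frequency of each of the 26 letters.
    vec = np.zeros(26, dtype=np.float32)
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - ord("a")] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec

def get_openai_embedding(text):
    # Real embedding via the OpenAI API; model choice is illustrative.
    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    resp = client.embeddings.create(model="text-embedding-3-small", input=text)
    return np.array(resp.data[0].embedding, dtype=np.float32)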
How It Works
The magic happens through a two-stage pipeline implemented with PocketFlow:
graph TD
    subgraph OfflineFlow[Offline Document Indexing]
        EmbedDocs[EmbedDocumentsNode] --> CreateIndex[CreateIndexNode]
    end
    subgraph OnlineFlow[Online Query Processing]
        EmbedQuery[EmbedQueryNode] --> RetrieveDoc[RetrieveDocumentNode]
    end
Here's what each part does:
- EmbedDocumentsNode: Converts documents into vector representations
- CreateIndexNode: Creates a searchable FAISS index from embeddings
- EmbedQueryNode: Converts user query into the same vector space
- RetrieveDocumentNode: Finds the most similar document using vector search
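Condensed into plain FAISS calls, the four nodes boil down to something like the sketch below. This is not the project's actual node code: the embed helper is a stand-in for the dummy get_embedding described earlier, and the two sample documents are made up; only the FAISS indexing and search calls are the library's real API:
import numpy as np
import faiss

def embed(text):
    # Toy character-frequency embedding (stand-in for get_embedding in nodes.py).
    vec = np.zeros(26, dtype=np.float32)
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - ord("a")] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec

documents = [
    "PocketFlow is a 100-line Large Language Model Framework.",
    "FAISS enables efficient similarity search over dense vectors.",
]

# Offline stage: EmbedDocumentsNode + CreateIndexNode
doc_vectors = np.stack([embed(d) for d in documents])
index = faiss.IndexFlatL2(doc_vectors.shape[1])  # exact L2-distance index
index.add(doc_vectors)

# Online stage: EmbedQueryNode + RetrieveDocumentNode
query_vector = embed("Large Language Model").reshape(1, -1)
distances, indices = index.search(query_vector, 1)  # top-1 nearest neighbor
print(f"Retrieved document (index: {indices[0][0]}, distance: {distances[0][0]:.4f})")
Running the real pipeline with the default query produces output like the following: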
Example Output
✅ Created 5 document embeddings
🔍 Creating search index...
✅ Index created with 5 vectors
🔍 Embedding query: Large Language Model
🔎 Searching for relevant documents...
📄 Retrieved document (index: 3, distance: 0.3296)
📄 Most relevant text: "PocketFlow is a 100-line Large Language Model Framework."