diff --git a/docs/guide.md b/docs/guide.md
index 9cc1a06..1c7a045 100644
--- a/docs/guide.md
+++ b/docs/guide.md
@@ -54,7 +54,6 @@ Agentic Coding should be a collaboration between Human System Design and Agent I
     - *Output*: a vector of 3072 floats
    - *Necessity:* Used by the second node to embed text
 
-
 4. **Node Design**: Plan how each node will read and write data, and use utility functions.
     - Start with the shared data design
       - For simple systems, use an in-memory dictionary.
@@ -66,7 +65,6 @@ Agentic Coding should be a collaboration between Human System Design and Agent I
     - `exec`: Call the embedding utility function
    - `post`: Write "embedding" to the shared store
 
-
 5. **Implementation**: Implement the initial nodes and flows based on the design.
     - 🎉 If you’ve reached this step, humans have finished the design. Now *Agentic Coding* begins!
    - **“Keep it simple, stupid!”** Avoid complex features and full-scale type checking.
diff --git a/docs/index.md b/docs/index.md
index 986a478..0c1dcd5 100644
--- a/docs/index.md
+++ b/docs/index.md
@@ -25,7 +25,7 @@ We model the LLM workflow as a **Graph + Shared Store**:
 - [Flow](./core_abstraction/flow.md) connects nodes through **Actions** (labeled edges).
 - [Shared Store](./core_abstraction/communication.md) enables communication between nodes within flows.
 - [Batch](./core_abstraction/batch.md) nodes/flows allow for data-intensive tasks.
-- [(Advanced) Async](./core_abstraction/async.md) nodes/flows allow waiting for asynchronous tasks.
+- [Async](./core_abstraction/async.md) nodes/flows allow waiting for asynchronous tasks.
 - [(Advanced) Parallel](./core_abstraction/parallel.md) nodes/flows handle I/O-bound tasks.
@@ -49,15 +49,20 @@ From there, it’s easy to implement popular design patterns:
 
 ## Utility Function
 
-We provide utility functions not in *codes*, but in *docs*:
+We **DON'T** provide built-in utility functions. Instead, please implement your own. I believe it is a **bad practice** to provide vendor-specific implementations in a general framework:
+
+- **Vendor APIs change frequently**. Hardcoding them makes maintenance a nightmare.
+- You may need **flexibility** to switch vendors, use fine-tuned models, or deploy local LLMs.
+- You may need **optimizations** like prompt caching, request batching, or response streaming.
+
+However, we provide example implementations for commonly used utility functions.
 - [LLM Wrapper](./utility_function/llm.md)
-- [Tool](./utility_function/tool.md)
-- [(Optional) Viz and Debug](./utility_function/viz.md)
-- [(Optional) Web Search](./utility_function/websearch.md)
-- [(Optional) Chunking](./utility_function/chunking.md)
-- [(Optional) Embedding](./utility_function/embedding.md)
-- [(Optional) Vector Databases](./utility_function/vector.md)
-- [(Optional) Text-to-Speech](./utility_function/text_to_speech.md)
+- [Viz and Debug](./utility_function/viz.md)
+- [Web Search](./utility_function/websearch.md)
+- [Chunking](./utility_function/chunking.md)
+- [Embedding](./utility_function/embedding.md)
+- [Vector Databases](./utility_function/vector.md)
+- [Text-to-Speech](./utility_function/text_to_speech.md)
 
 ## Ready to build your Apps?
 [Learn Agentic Coding!](./guide.md)
diff --git a/docs/utility_function/chunking.md b/docs/utility_function/chunking.md
index e009c33..875be3f 100644
--- a/docs/utility_function/chunking.md
+++ b/docs/utility_function/chunking.md
@@ -2,7 +2,7 @@
 layout: default
 title: "Text Chunking"
 parent: "Utility Function"
-nav_order: 5
+nav_order: 4
 ---
 
 # Text Chunking
diff --git a/docs/utility_function/embedding.md b/docs/utility_function/embedding.md
index 6261554..65e3539 100644
--- a/docs/utility_function/embedding.md
+++ b/docs/utility_function/embedding.md
@@ -2,7 +2,7 @@
 layout: default
 title: "Embedding"
 parent: "Utility Function"
-nav_order: 6
+nav_order: 5
 ---
 
 # Embedding
diff --git a/docs/utility_function/llm.md b/docs/utility_function/llm.md
index 0d83fce..cbb586b 100644
--- a/docs/utility_function/llm.md
+++ b/docs/utility_function/llm.md
@@ -7,7 +7,7 @@ nav_order: 1
 
 # LLM Wrappers
 
-We **don't** provide built-in LLM wrappers. Instead, please implement your own or check out libraries like [litellm](https://github.com/BerriAI/litellm).
+Check out libraries like [litellm](https://github.com/BerriAI/litellm). Here, we provide some minimal example implementations:
 
 1. OpenAI
 
@@ -141,8 +141,3 @@ def call_llm(prompt):
     return response
 ```
 
-## Why Not Provide Built-in LLM Wrappers?
-I believe it is a **bad practice** to provide LLM-specific implementations in a general framework:
-- **LLM APIs change frequently**. Hardcoding them makes maintenance a nightmare.
-- You may need **flexibility** to switch vendors, use fine-tuned models, or deploy local LLMs.
-- You may need **optimizations** like prompt caching, request batching, or response streaming.
\ No newline at end of file
diff --git a/docs/utility_function/text_to_speech.md b/docs/utility_function/text_to_speech.md
index 59d4811..5fe1774 100644
--- a/docs/utility_function/text_to_speech.md
+++ b/docs/utility_function/text_to_speech.md
@@ -2,7 +2,7 @@
 layout: default
 title: "Text-to-Speech"
 parent: "Utility Function"
-nav_order: 8
+nav_order: 7
 ---
 
 # Text-to-Speech
diff --git a/docs/utility_function/tool.md b/docs/utility_function/tool.md
deleted file mode 100644
index bf50466..0000000
--- a/docs/utility_function/tool.md
+++ /dev/null
@@ -1,199 +0,0 @@
----
-layout: default
-title: "Tool"
-parent: "Utility Function"
-nav_order: 2
----
-
-# Tool
-
-Similar to LLM wrappers, we **don't** provide built-in tools. Here, we recommend some *minimal* (and incomplete) implementations of commonly used tools. These examples can serve as a starting point for your own tooling.
-
----
-
-## 1. Embedding Calls
-
-```python
-def get_embedding(text):
-    from openai import OpenAI
-    client = OpenAI(api_key="YOUR_API_KEY_HERE")
-    r = client.embeddings.create(
-        model="text-embedding-ada-002",
-        input=text
-    )
-    return r.data[0].embedding
-
-get_embedding("What's the meaning of life?")
-```
-
----
-
-## 2. Vector Database (Faiss)
-
-```python
-import faiss
-import numpy as np
-
-def create_index(embeddings):
-    dim = len(embeddings[0])
-    index = faiss.IndexFlatL2(dim)
-    index.add(np.array(embeddings).astype('float32'))
-    return index
-
-def search_index(index, query_embedding, top_k=5):
-    D, I = index.search(
-        np.array([query_embedding]).astype('float32'),
-        top_k
-    )
-    return I, D
-
-index = create_index(embeddings)
-search_index(index, query_embedding)
-```
-
----
-
-## 3. Local Database
-
-```python
-import sqlite3
-
-def execute_sql(query):
-    conn = sqlite3.connect("mydb.db")
-    cursor = conn.cursor()
-    cursor.execute(query)
-    result = cursor.fetchall()
-    conn.commit()
-    conn.close()
-    return result
-```
-
-> ⚠️ Beware of SQL injection risk
-{: .warning }
-
----
-
-## 4. Python Function Execution
-
-```python
-def run_code(code_str):
-    env = {}
-    exec(code_str, env)
-    return env
-
-run_code("print('Hello, world!')")
-```
-
-> ⚠️ exec() is dangerous with untrusted input
-{: .warning }
-
-
----
-
-## 5. PDF Extraction
-
-If your PDFs are text-based, use PyMuPDF:
-
-```python
-import fitz  # PyMuPDF
-
-def extract_text(pdf_path):
-    doc = fitz.open(pdf_path)
-    text = ""
-    for page in doc:
-        text += page.get_text()
-    doc.close()
-    return text
-
-extract_text("document.pdf")
-```
-
-For image-based PDFs (e.g., scanned), OCR is needed. A easy and fast option is using an LLM with vision capabilities:
-
-```python
-from openai import OpenAI
-import base64
-
-def call_llm_vision(prompt, image_data):
-    client = OpenAI(api_key="YOUR_API_KEY_HERE")
-    img_base64 = base64.b64encode(image_data).decode('utf-8')
-
-    response = client.chat.completions.create(
-        model="gpt-4o",
-        messages=[{
-            "role": "user",
-            "content": [
-                {"type": "text", "text": prompt},
-                {"type": "image_url",
-                 "image_url": {"url": f"data:image/png;base64,{img_base64}"}}
-            ]
-        }]
-    )
-
-    return response.choices[0].message.content
-
-pdf_document = fitz.open("document.pdf")
-page_num = 0
-page = pdf_document[page_num]
-pix = page.get_pixmap()
-img_data = pix.tobytes("png")
-
-call_llm_vision("Extract text from this image", img_data)
-```
-
----
-
-## 6. Web Crawling
-
-```python
-def crawl_web(url):
-    import requests
-    from bs4 import BeautifulSoup
-    html = requests.get(url).text
-    soup = BeautifulSoup(html, "html.parser")
-    return soup.title.string, soup.get_text()
-```
-
----
-
-
-## 7. Audio Transcription (OpenAI Whisper)
-
-```python
-def transcribe_audio(file_path):
-    import openai
-    audio_file = open(file_path, "rb")
-    transcript = openai.Audio.transcribe("whisper-1", audio_file)
-    return transcript["text"]
-```
-
----
-
-## 8. Text-to-Speech (TTS)
-
-```python
-def text_to_speech(text):
-    import pyttsx3
-    engine = pyttsx3.init()
-    engine.say(text)
-    engine.runAndWait()
-```
-
----
-
-## 9. Sending Email
-
-```python
-def send_email(to_address, subject, body, from_address, password):
-    import smtplib
-    from email.mime.text import MIMEText
-
-    msg = MIMEText(body)
-    msg["Subject"] = subject
-    msg["From"] = from_address
-    msg["To"] = to_address
-
-    with smtplib.SMTP_SSL("smtp.gmail.com", 465) as server:
-        server.login(from_address, password)
-        server.sendmail(from_address, [to_address], msg.as_string())
-```
\ No newline at end of file
diff --git a/docs/utility_function/vector.md b/docs/utility_function/vector.md
index f309831..1c1d33d 100644
--- a/docs/utility_function/vector.md
+++ b/docs/utility_function/vector.md
@@ -2,7 +2,7 @@
 layout: default
 title: "Vector Databases"
 parent: "Utility Function"
-nav_order: 7
+nav_order: 6
 ---
 
 # Vector Databases
diff --git a/docs/utility_function/viz.md b/docs/utility_function/viz.md
index c46e1b7..324a516 100644
--- a/docs/utility_function/viz.md
+++ b/docs/utility_function/viz.md
@@ -2,7 +2,7 @@
 layout: default
 title: "Viz and Debug"
 parent: "Utility Function"
-nav_order: 3
+nav_order: 2
 ---
 
 # Visualization and Debugging
diff --git a/docs/utility_function/websearch.md b/docs/utility_function/websearch.md
index eabf2dd..0613f34 100644
--- a/docs/utility_function/websearch.md
+++ b/docs/utility_function/websearch.md
@@ -2,7 +2,7 @@
 layout: default
 title: "Web Search"
 parent: "Utility Function"
-nav_order: 4
+nav_order: 3
 ---
 
 # Web Search