update utility

zachary62 2025-03-17 01:55:45 -04:00
parent 8e13414687
commit 21744b7a3f
10 changed files with 21 additions and 222 deletions

View File

@ -54,7 +54,6 @@ Agentic Coding should be a collaboration between Human System Design and Agent I
     - *Output*: a vector of 3072 floats
     - *Necessity:* Used by the second node to embed text
 4. **Node Design**: Plan how each node will read and write data, and use utility functions.
    - Start with the shared data design
      - For simple systems, use an in-memory dictionary.
@ -66,7 +65,6 @@ Agentic Coding should be a collaboration between Human System Design and Agent I
      - `exec`: Call the embedding utility function
      - `post`: Write "embedding" to the shared store
 5. **Implementation**: Implement the initial nodes and flows based on the design.
    - 🎉 If you’ve reached this step, humans have finished the design. Now *Agentic Coding* begins!
    - **“Keep it simple, stupid!”** Avoid complex features and full-scale type checking.
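
To make the `prep`/`exec`/`post` split concrete, here is a minimal sketch of the embedding node described above. It assumes a PocketFlow-style `Node` base class that calls these three hooks in order, plus the `get_embedding` utility from the design step; treat it as an illustration, not the project's actual code.

```python
# Sketch only: assumes a Node base class with prep/exec/post hooks
# and the get_embedding utility function described above.
class EmbedText(Node):
    def prep(self, shared):
        # Read "text" from the shared store.
        return shared["text"]

    def exec(self, text):
        # Call the embedding utility function.
        return get_embedding(text)

    def post(self, shared, prep_res, exec_res):
        # Write "embedding" back to the shared store.
        shared["embedding"] = exec_res
```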

View File

@ -25,7 +25,7 @@ We model the LLM workflow as a **Graph + Shared Store**:
 - [Flow](./core_abstraction/flow.md) connects nodes through **Actions** (labeled edges).
 - [Shared Store](./core_abstraction/communication.md) enables communication between nodes within flows.
 - [Batch](./core_abstraction/batch.md) nodes/flows allow for data-intensive tasks.
-- [(Advanced) Async](./core_abstraction/async.md) nodes/flows allow waiting for asynchronous tasks.
+- [Async](./core_abstraction/async.md) nodes/flows allow waiting for asynchronous tasks.
 - [(Advanced) Parallel](./core_abstraction/parallel.md) nodes/flows handle I/O-bound tasks.
 <div align="center">
@ -49,15 +49,20 @@ From there, it’s easy to implement popular design patterns:
 ## Utility Function
-We provide utility functions not in *codes*, but in *docs*:
+We **DON'T** provide built-in utility functions. Instead, please implement your own. I believe it is a **bad practice** to provide vendor-specific implementations in a general framework:
+- **Vendor APIs change frequently**. Hardcoding them makes maintenance a nightmare.
+- You may need **flexibility** to switch vendors, use fine-tuned models, or deploy local LLMs.
+- You may need **optimizations** like prompt caching, request batching, or response streaming.
+However, we provide example implementations for commonly used utility functions.
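
To make these points concrete, a user-implemented wrapper might look like the following sketch: vendor-specific code stays behind one function, and a naive cache covers repeated prompts. The model name and key handling are placeholders, not framework code.

```python
import functools

@functools.lru_cache(maxsize=1024)  # naive prompt cache; swap in a real store for production
def call_llm(prompt: str) -> str:
    # Vendor-specific code lives only here, so switching providers or
    # pointing at a local model means changing just this function.
    from openai import OpenAI
    client = OpenAI(api_key="YOUR_API_KEY_HERE")
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content
```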
 - [LLM Wrapper](./utility_function/llm.md)
-- [Tool](./utility_function/tool.md)
-- [(Optional) Viz and Debug](./utility_function/viz.md)
-- [(Optional) Web Search](./utility_function/websearch.md)
-- [(Optional) Chunking](./utility_function/chunking.md)
-- [(Optional) Embedding](./utility_function/embedding.md)
-- [(Optional) Vector Databases](./utility_function/vector.md)
-- [(Optional) Text-to-Speech](./utility_function/text_to_speech.md)
+- [Viz and Debug](./utility_function/viz.md)
+- [Web Search](./utility_function/websearch.md)
+- [Chunking](./utility_function/chunking.md)
+- [Embedding](./utility_function/embedding.md)
+- [Vector Databases](./utility_function/vector.md)
+- [Text-to-Speech](./utility_function/text_to_speech.md)
 ## Ready to build your Apps? [Learn Agentic Coding!](./guide.md)

View File

@ -2,7 +2,7 @@
 layout: default
 title: "Text Chunking"
 parent: "Utility Function"
-nav_order: 5
+nav_order: 4
 ---
 # Text Chunking

View File

@ -2,7 +2,7 @@
 layout: default
 title: "Embedding"
 parent: "Utility Function"
-nav_order: 6
+nav_order: 5
 ---
 # Embedding

View File

@ -7,7 +7,7 @@ nav_order: 1
 # LLM Wrappers
-We **don't** provide built-in LLM wrappers. Instead, please implement your own or check out libraries like [litellm](https://github.com/BerriAI/litellm).
+Check out libraries like [litellm](https://github.com/BerriAI/litellm).
 Here, we provide some minimal example implementations:
 1. OpenAI
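
The implementations themselves sit between this hunk and the next and are elided from the diff; a minimal OpenAI wrapper along these lines might look like the sketch below. The model name and key handling are placeholders, not the file's actual code.

```python
def call_llm(prompt):
    from openai import OpenAI
    client = OpenAI(api_key="YOUR_API_KEY_HERE")
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}]
    )
    return response.choices[0].message.content

call_llm("How are you?")
```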
@ -141,8 +141,3 @@ def call_llm(prompt):
     return response
 ```
-## Why Not Provide Built-in LLM Wrappers?
-I believe it is a **bad practice** to provide LLM-specific implementations in a general framework:
-- **LLM APIs change frequently**. Hardcoding them makes maintenance a nightmare.
-- You may need **flexibility** to switch vendors, use fine-tuned models, or deploy local LLMs.
-- You may need **optimizations** like prompt caching, request batching, or response streaming.

View File

@ -2,7 +2,7 @@
 layout: default
 title: "Text-to-Speech"
 parent: "Utility Function"
-nav_order: 8
+nav_order: 7
 ---
 # Text-to-Speech

View File

@ -1,199 +0,0 @@
---
layout: default
title: "Tool"
parent: "Utility Function"
nav_order: 2
---
# Tool
Similar to LLM wrappers, we **don't** provide built-in tools. Here, we recommend some *minimal* (and incomplete) implementations of commonly used tools. These examples can serve as a starting point for your own tooling.
---
## 1. Embedding Calls
```python
def get_embedding(text):
    from openai import OpenAI
    client = OpenAI(api_key="YOUR_API_KEY_HERE")
    r = client.embeddings.create(
        model="text-embedding-ada-002",
        input=text
    )
    # One embedding vector is returned per input string.
    return r.data[0].embedding

get_embedding("What's the meaning of life?")
```
---
## 2. Vector Database (Faiss)
```python
import faiss
import numpy as np

def create_index(embeddings):
    dim = len(embeddings[0])
    # IndexFlatL2 does exact L2 search; for large collections, consider
    # approximate indexes (e.g., IndexIVFFlat) that trade recall for speed.
    index = faiss.IndexFlatL2(dim)
    index.add(np.array(embeddings).astype('float32'))
    return index

def search_index(index, query_embedding, top_k=5):
    # Returns distances D and row indices I of the top_k nearest vectors.
    D, I = index.search(
        np.array([query_embedding]).astype('float32'),
        top_k
    )
    return I, D

# embeddings / query_embedding: vectors such as those from get_embedding above.
index = create_index(embeddings)
search_index(index, query_embedding)
```
---
## 3. Local Database
```python
import sqlite3

def execute_sql(query):
    conn = sqlite3.connect("mydb.db")
    cursor = conn.cursor()
    cursor.execute(query)
    result = cursor.fetchall()
    conn.commit()  # persist any writes; harmless for read-only queries
    conn.close()
    return result
```
> ⚠️ Beware of SQL injection risk
{: .warning }
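
One standard mitigation, sketched here with a hypothetical `users` table, is to bind untrusted values as parameters instead of formatting them into the SQL string:

```python
import sqlite3

def get_user(user_id):
    conn = sqlite3.connect("mydb.db")
    cursor = conn.cursor()
    # The ? placeholder lets sqlite3 escape the value safely.
    cursor.execute("SELECT * FROM users WHERE id = ?", (user_id,))
    result = cursor.fetchall()
    conn.close()
    return result
```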
---
## 4. Python Function Execution
```python
def run_code(code_str):
    env = {}
    # Runs arbitrary Python in a fresh namespace; see the warning below.
    exec(code_str, env)
    return env

run_code("print('Hello, world!')")
```
> ⚠️ exec() is dangerous with untrusted input
{: .warning }
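
If untrusted snippets must run at all, one partial mitigation is to execute them in a separate process with a timeout. This is a sketch, not a security boundary; a real sandbox is still needed for hostile input:

```python
import subprocess
import sys

def run_code_isolated(code_str, timeout=5):
    # A child process contains crashes and lets us enforce a time limit,
    # but it does not stop malicious code from touching the filesystem.
    result = subprocess.run(
        [sys.executable, "-c", code_str],
        capture_output=True, text=True, timeout=timeout
    )
    return result.stdout
```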
---
## 5. PDF Extraction
If your PDFs are text-based, use PyMuPDF:
```python
import fitz # PyMuPDF
def extract_text(pdf_path):
    doc = fitz.open(pdf_path)
    text = ""
    # Concatenate the text layer of every page.
    for page in doc:
        text += page.get_text()
    doc.close()
    return text

extract_text("document.pdf")
```
For image-based PDFs (e.g., scanned), OCR is needed. An easy and fast option is to use an LLM with vision capabilities:
```python
from openai import OpenAI
import base64
def call_llm_vision(prompt, image_data):
    client = OpenAI(api_key="YOUR_API_KEY_HERE")
    # Encode the raw image bytes so they can be sent inline as a data URL.
    img_base64 = base64.b64encode(image_data).decode('utf-8')
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/png;base64,{img_base64}"}}
            ]
        }]
    )
    return response.choices[0].message.content

# Render the first page to PNG bytes (fitz imported in the example above),
# then OCR it with the vision model.
pdf_document = fitz.open("document.pdf")
page_num = 0
page = pdf_document[page_num]
pix = page.get_pixmap()
img_data = pix.tobytes("png")
call_llm_vision("Extract text from this image", img_data)
```
---
## 6. Web Crawling
```python
def crawl_web(url):
    import requests
    from bs4 import BeautifulSoup
    html = requests.get(url).text
    soup = BeautifulSoup(html, "html.parser")
    # Return the page title plus all visible text.
    return soup.title.string, soup.get_text()
```
---
## 7. Audio Transcription (OpenAI Whisper)
```python
def transcribe_audio(file_path):
    import openai
    # Note: this uses the legacy (pre-1.0) OpenAI SDK interface.
    with open(file_path, "rb") as audio_file:
        transcript = openai.Audio.transcribe("whisper-1", audio_file)
    return transcript["text"]
```
---
## 8. Text-to-Speech (TTS)
```python
def text_to_speech(text):
    import pyttsx3
    engine = pyttsx3.init()
    engine.say(text)
    engine.runAndWait()  # blocks until speech finishes
```
---
## 9. Sending Email
```python
def send_email(to_address, subject, body, from_address, password):
    import smtplib
    from email.mime.text import MIMEText

    msg = MIMEText(body)
    msg["Subject"] = subject
    msg["From"] = from_address
    msg["To"] = to_address

    # Gmail's SSL endpoint; other providers use their own host/port.
    with smtplib.SMTP_SSL("smtp.gmail.com", 465) as server:
        server.login(from_address, password)
        server.sendmail(from_address, [to_address], msg.as_string())
```

View File

@ -2,7 +2,7 @@
 layout: default
 title: "Vector Databases"
 parent: "Utility Function"
-nav_order: 7
+nav_order: 6
 ---
 # Vector Databases

View File

@ -2,7 +2,7 @@
 layout: default
 title: "Viz and Debug"
 parent: "Utility Function"
-nav_order: 3
+nav_order: 2
 ---
 # Visualization and Debugging

View File

@ -2,7 +2,7 @@
 layout: default
 title: "Web Search"
 parent: "Utility Function"
-nav_order: 4
+nav_order: 3
 ---
 # Web Search