add more docs
commit af48dbf402 (parent cc4f7f632a)

@@ -8,7 +8,11 @@ nav_order: 4
# RAG (Retrieval Augmented Generation)

For certain LLM tasks like answering questions, providing context is essential.

The most common way to retrieve text-based context is through embedding (see the sketch after the list):

1. Given texts, you first [chunk](../utility_function/chunking.md) them.
2. Next, you [embed](../utility_function/embedding.md) each chunk.
3. Then you store the chunks in a [vector database](../utility_function/vector.md).
4. Finally, given a query, you embed the query and retrieve the closest chunks from the vector database.
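
A minimal end-to-end sketch of these four steps, using a stand-in `get_embedding` and a plain in-memory index instead of the utilities linked above (names and sizes are illustrative):

```python
import numpy as np

# Stand-in embedding function; swap in a real embedding utility.
def get_embedding(text: str) -> np.ndarray:
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.random(128).astype("float32")

# 1. Chunk: naive fixed-size chunking
def chunk(text: str, size: int = 100) -> list[str]:
    return [text[i:i + size] for i in range(0, len(text), size)]

document = "Long document text goes here. " * 50
chunks = chunk(document)

# 2. Embed each chunk
chunk_vecs = np.stack([get_embedding(c) for c in chunks])

# 3. "Store" the vectors (here a plain array; a vector database in practice)
# 4. Embed the query and retrieve the closest chunks by cosine similarity
query_vec = get_embedding("What does the document say?")
sims = chunk_vecs @ query_vec / (
    np.linalg.norm(chunk_vecs, axis=1) * np.linalg.norm(query_vec) + 1e-9
)
top_idx = np.argsort(-sims)[:3]
context = "\n".join(chunks[i] for i in top_idx)
print(context)  # pass this as context to the LLM together with the question
```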

### Example: Question Answering

@@ -41,7 +41,8 @@ We model the LLM workflow as a **Graph + Shared Store**:

- [(Optional) Web Search](./utility_function/websearch.md)
- [(Optional) Chunking](./utility_function/chunking.md)
- [(Optional) Embedding](./utility_function/embedding.md)
- [(Optional) Vector Databases](./utility_function/vector.md)
- [(Optional) Text-to-Speech](./utility_function/text_to_speech.md)

> We do not provide built-in utility functions. Example implementations are provided for reference.
{: .warning }

@@ -32,7 +32,7 @@ def fixed_size_chunk(text, chunk_size=100):

However, sentences are often cut awkwardly, losing coherence.

### 2. Sentence-Based Chunking

```python
import nltk
# ... (rest of the block truncated at the hunk boundary)
```
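
The function body is cut off by the hunk boundary; a plausible sketch of a chunker with the signature shown in the next hunk header, using `nltk.sent_tokenize` (an assumption, not necessarily the repo's implementation):

```python
import nltk

nltk.download("punkt", quiet=True)  # tokenizer models needed by sent_tokenize

def sentence_based_chunk(text, max_sentences=2):
    # Group consecutive sentences into chunks of max_sentences each
    sentences = nltk.sent_tokenize(text)
    return [
        " ".join(sentences[i:i + max_sentences])
        for i in range(0, len(sentences), max_sentences)
    ]

print(sentence_based_chunk("First sentence. Second one. And a third."))
```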

@@ -47,7 +47,7 @@ def sentence_based_chunk(text, max_sentences=2):

However, this might not handle very long sentences or paragraphs well.

### 3. Other Chunking

- **Paragraph-Based**: Split text by paragraphs (e.g., newlines), as in the sketch below. Large paragraphs can create big chunks.
- **Semantic**: Use embeddings or topic modeling to chunk by semantic boundaries.
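
A minimal sketch of the paragraph-based variant, splitting on blank lines (one common convention; not the only one):

```python
def paragraph_chunk(text: str) -> list[str]:
    # Treat blank lines as paragraph boundaries and drop empty pieces
    return [p.strip() for p in text.split("\n\n") if p.strip()]

sample = "First paragraph.\n\nSecond paragraph.\n\nThird."
print(paragraph_chunk(sample))  # ['First paragraph.', 'Second paragraph.', 'Third.']
```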

@@ -1,7 +1,7 @@

---
layout: default
title: "Embedding"
parent: "Utility Function"
nav_order: 6
---

@@ -15,7 +15,7 @@ Below you will find an overview table of various text embedding APIs, along with

{: .best-practice }

| **API** | **Free Tier** | **Pricing Model** | **Docs** |
| --- | --- | --- | --- |
| **OpenAI** | ~$5 credit | ~$0.0001/1K tokens | [OpenAI Embeddings](https://platform.openai.com/docs/api-reference/embeddings) |
| **Azure OpenAI** | $200 credit | Same as OpenAI (~$0.0001/1K tokens) | [Azure OpenAI Embeddings](https://learn.microsoft.com/azure/cognitive-services/openai/how-to/create-resource?tabs=portal) |
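
As a reference, a minimal embeddings call with the `openai` Python package (v1+ interface; the model name and input text are just examples, and the key is read from `OPENAI_API_KEY`):

```python
from openai import OpenAI

client = OpenAI()  # picks up OPENAI_API_KEY from the environment

resp = client.embeddings.create(
    model="text-embedding-3-small",  # example model; check current pricing
    input="The quick brown fox jumps over the lazy dog.",
)
vector = resp.data[0].embedding
print(len(vector))  # dimensionality of the returned embedding
```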

@@ -0,0 +1,107 @@

---
layout: default
title: "Text-to-Speech"
parent: "Utility Function"
nav_order: 8
---

## Text-to-Speech

| **Service** | **Free Tier** | **Pricing Model** | **Docs** |
| --- | --- | --- | --- |
| **Amazon Polly** | 5M std + 1M neural | ~$4/M (std), ~$16/M (neural) after free tier | [Polly Docs](https://aws.amazon.com/polly/) |
| **Google Cloud TTS** | 4M std + 1M WaveNet | ~$4/M (std), ~$16/M (WaveNet) pay-as-you-go | [Cloud TTS Docs](https://cloud.google.com/text-to-speech) |
| **Azure TTS** | 500K neural ongoing | ~$15/M (neural), discount at higher volumes | [Azure TTS Docs](https://azure.microsoft.com/products/cognitive-services/text-to-speech/) |
| **IBM Watson TTS** | 10K chars Lite plan | ~$0.02/1K (≈$20/M); enterprise options available | [IBM Watson Docs](https://www.ibm.com/cloud/watson-text-to-speech) |
| **ElevenLabs** | 10K chars monthly | From ~$5/mo (30K chars) up to $330/mo (2M chars); enterprise options available | [ElevenLabs Docs](https://elevenlabs.io) |

## Example Python Code

### Amazon Polly

```python
import boto3

# Static keys shown for clarity; boto3 can also pick up credentials
# from the environment or an AWS profile.
polly = boto3.client("polly", region_name="us-east-1",
                     aws_access_key_id="YOUR_AWS_ACCESS_KEY_ID",
                     aws_secret_access_key="YOUR_AWS_SECRET_ACCESS_KEY")

resp = polly.synthesize_speech(
    Text="Hello from Polly!",
    OutputFormat="mp3",
    VoiceId="Joanna"
)

with open("polly.mp3", "wb") as f:
    f.write(resp["AudioStream"].read())
```

### Google Cloud TTS

```python
from google.cloud import texttospeech

client = texttospeech.TextToSpeechClient()
input_text = texttospeech.SynthesisInput(text="Hello from Google Cloud TTS!")
voice = texttospeech.VoiceSelectionParams(language_code="en-US")
audio_cfg = texttospeech.AudioConfig(audio_encoding=texttospeech.AudioEncoding.MP3)

resp = client.synthesize_speech(input=input_text, voice=voice, audio_config=audio_cfg)

with open("gcloud_tts.mp3", "wb") as f:
    f.write(resp.audio_content)
```

### Azure TTS

```python
import azure.cognitiveservices.speech as speechsdk

speech_config = speechsdk.SpeechConfig(
    subscription="AZURE_KEY", region="AZURE_REGION")
audio_cfg = speechsdk.audio.AudioConfig(filename="azure_tts.wav")

synthesizer = speechsdk.SpeechSynthesizer(
    speech_config=speech_config,
    audio_config=audio_cfg
)

synthesizer.speak_text_async("Hello from Azure TTS!").get()
```

### IBM Watson TTS

```python
from ibm_watson import TextToSpeechV1
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator

auth = IAMAuthenticator("IBM_API_KEY")
service = TextToSpeechV1(authenticator=auth)
service.set_service_url("IBM_SERVICE_URL")

resp = service.synthesize(
    "Hello from IBM Watson!",
    voice="en-US_AllisonV3Voice",
    accept="audio/mp3"
).get_result()

with open("ibm_tts.mp3", "wb") as f:
    f.write(resp.content)
```

### ElevenLabs

```python
import requests

api_key = "ELEVENLABS_KEY"
voice_id = "ELEVENLABS_VOICE"
url = f"https://api.elevenlabs.io/v1/text-to-speech/{voice_id}"
headers = {"xi-api-key": api_key, "Content-Type": "application/json"}

json_data = {
    "text": "Hello from ElevenLabs!",
    "voice_settings": {"stability": 0.75, "similarity_boost": 0.75}
}

resp = requests.post(url, headers=headers, json=json_data)

with open("elevenlabs.mp3", "wb") as f:
    f.write(resp.content)
```

@@ -0,0 +1,218 @@

---
layout: default
title: "Vector Databases"
parent: "Utility Function"
nav_order: 7
---

# Vector Databases

Below is a table of the popular vector search solutions:

| **Tool** | **Free Tier** | **Pricing Model** | **Docs** |
| --- | --- | --- | --- |
| **FAISS** | N/A, self-host | Open-source | [Faiss.ai](https://faiss.ai) |
| **Pinecone** | 2GB free | From $25/mo | [pinecone.io](https://pinecone.io) |
| **Qdrant** | 1GB free cloud | Pay-as-you-go | [qdrant.tech](https://qdrant.tech) |
| **Weaviate** | 14-day sandbox | From $25/mo | [weaviate.io](https://weaviate.io) |
| **Milvus** | 5GB free cloud | PAYG or $99/mo dedicated | [milvus.io](https://milvus.io) |
| **Chroma** | N/A, self-host | Free (Apache 2.0) | [trychroma.com](https://trychroma.com) |
| **Redis** | 30MB free | From $5/mo | [redis.io](https://redis.io) |

---

## Example Python Code

Below are basic usage snippets for each tool.

### FAISS

```python
import faiss
import numpy as np

# Dimensionality of embeddings
d = 128

# Create a flat L2 index
index = faiss.IndexFlatL2(d)

# Random vectors
data = np.random.random((1000, d)).astype('float32')
index.add(data)

# Query
query = np.random.random((1, d)).astype('float32')
D, I = index.search(query, k=5)

print("Distances:", D)
print("Neighbors:", I)
```

### Pinecone

```python
import pinecone

# Assumes the pre-3.0 pinecone-client interface
pinecone.init(api_key="YOUR_API_KEY", environment="YOUR_ENV")

index_name = "my-index"

# Create the index if it doesn't exist
if index_name not in pinecone.list_indexes():
    pinecone.create_index(name=index_name, dimension=128)

# Connect
index = pinecone.Index(index_name)

# Upsert
vectors = [
    ("id1", [0.1]*128),
    ("id2", [0.2]*128)
]
index.upsert(vectors)

# Query
response = index.query(vector=[0.15]*128, top_k=3)
print(response)
```

### Qdrant

```python
import qdrant_client
from qdrant_client.models import Distance, VectorParams, PointStruct

client = qdrant_client.QdrantClient(
    url="https://YOUR-QDRANT-CLOUD-ENDPOINT",
    api_key="YOUR_API_KEY"
)

collection = "my_collection"
client.recreate_collection(
    collection_name=collection,
    vectors_config=VectorParams(size=128, distance=Distance.COSINE)
)

points = [
    PointStruct(id=1, vector=[0.1]*128, payload={"type": "doc1"}),
    PointStruct(id=2, vector=[0.2]*128, payload={"type": "doc2"}),
]

client.upsert(collection_name=collection, points=points)

results = client.search(
    collection_name=collection,
    query_vector=[0.15]*128,
    limit=2
)
print(results)
```

### Weaviate

```python
import weaviate

client = weaviate.Client("https://YOUR-WEAVIATE-CLOUD-ENDPOINT")

schema = {
    "classes": [
        {
            "class": "Article",
            "vectorizer": "none"
        }
    ]
}
client.schema.create(schema)

obj = {
    "title": "Hello World",
    "content": "Weaviate vector search"
}
client.data_object.create(obj, "Article", vector=[0.1]*128)

resp = (
    client.query
    .get("Article", ["title", "content"])
    .with_near_vector({"vector": [0.15]*128})
    .with_limit(3)
    .do()
)
print(resp)
```

### Milvus

```python
from pymilvus import connections, FieldSchema, CollectionSchema, DataType, Collection
import numpy as np

connections.connect(alias="default", host="localhost", port="19530")

fields = [
    FieldSchema(name="id", dtype=DataType.INT64, is_primary=True),
    FieldSchema(name="embedding", dtype=DataType.FLOAT_VECTOR, dim=128)
]
schema = CollectionSchema(fields)
collection = Collection("MyCollection", schema)

emb = np.random.rand(10, 128).astype('float32')
ids = list(range(10))
collection.insert([ids, emb])

index_params = {
    "index_type": "IVF_FLAT",
    "params": {"nlist": 128},
    "metric_type": "L2"
}
collection.create_index("embedding", index_params)
collection.load()

query_emb = np.random.rand(1, 128).astype('float32')
results = collection.search(
    query_emb, "embedding",
    param={"metric_type": "L2", "params": {"nprobe": 10}},
    limit=3
)
print(results)
```

### Chroma

```python
import chromadb
from chromadb.config import Settings

# Uses the pre-0.4 chromadb configuration style
client = chromadb.Client(Settings(
    chroma_db_impl="duckdb+parquet",
    persist_directory="./chroma_data"
))

coll = client.create_collection("my_collection")

vectors = [[0.1, 0.2, 0.3], [0.2, 0.2, 0.2]]
metas = [{"doc": "text1"}, {"doc": "text2"}]
ids = ["id1", "id2"]
coll.add(embeddings=vectors, metadatas=metas, ids=ids)

res = coll.query(query_embeddings=[[0.15, 0.25, 0.3]], n_results=2)
print(res)
```

### Redis

```python
import struct

import redis
from redis.commands.search.query import Query

r = redis.Redis(host="localhost", port=6379)

# Create index (requires the RediSearch module)
r.execute_command(
    "FT.CREATE", "my_idx", "ON", "HASH",
    "SCHEMA", "embedding", "VECTOR", "FLAT", "6",
    "TYPE", "FLOAT32", "DIM", "128",
    "DISTANCE_METRIC", "L2"
)

# Insert
vec = struct.pack('128f', *[0.1]*128)
r.hset("doc1", mapping={"embedding": vec})

# Search (KNN vector queries need query dialect 2)
qvec = struct.pack('128f', *[0.15]*128)
q = Query("*=>[KNN 3 @embedding $BLOB AS dist]").dialect(2)
res = r.ft("my_idx").search(q, query_params={"BLOB": qvec})
print(res.docs)
```

@@ -8,7 +8,7 @@ nav_order: 4

We recommend some implementations of commonly used web search tools.

| **API** | **Free Tier** | **Pricing Model** | **Docs** |
| --- | --- | --- | --- |
| **Google Custom Search JSON API** | 100 queries/day free | $5 per 1,000 queries | [Link](https://developers.google.com/custom-search/v1/overview) |
| **Bing Web Search API** | 1,000 queries/month | $15–$25 per 1,000 queries | [Link](https://azure.microsoft.com/en-us/services/cognitive-services/bing-web-search-api/) |
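
A minimal sketch of calling the Google Custom Search JSON API with `requests` (the key and search engine ID are placeholders):

```python
import requests

API_KEY = "YOUR_GOOGLE_API_KEY"
CX = "YOUR_SEARCH_ENGINE_ID"  # Programmable Search Engine ID

resp = requests.get(
    "https://www.googleapis.com/customsearch/v1",
    params={"key": API_KEY, "cx": CX, "q": "retrieval augmented generation"},
)
for item in resp.json().get("items", []):
    print(item["title"], item["link"])
```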