add more docs
commit af48dbf402 (parent cc4f7f632a)

@@ -8,7 +8,11 @@ nav_order: 4
# RAG (Retrieval Augmented Generation)

For certain LLM tasks like answering questions, providing context is essential.

The most common way to retrieve text-based context is through embedding (see the sketch after the list):

1. Given texts, you first [chunk](../utility_function/chunking.md) them.
2. Next, you [embed](../utility_function/embedding.md) each chunk.
3. Then you store the chunks in a [vector database](../utility_function/vector.md).
4. Finally, given a query, you embed the query and retrieve the closest chunks from the vector database.
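
A minimal end-to-end sketch of these four steps, using a stand-in `get_embedding` and a plain in-memory index instead of the utilities linked above (names and sizes are illustrative):

```python
import numpy as np

# Stand-in embedding function; swap in a real embedding utility.
def get_embedding(text: str) -> np.ndarray:
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.random(128).astype("float32")

# 1. Chunk: naive fixed-size chunking
def chunk(text: str, size: int = 100) -> list[str]:
    return [text[i:i + size] for i in range(0, len(text), size)]

document = "Long document text goes here. " * 50
chunks = chunk(document)

# 2. Embed each chunk
chunk_vecs = np.stack([get_embedding(c) for c in chunks])

# 3. "Store" the vectors (here a plain array; a vector database in practice)
# 4. Embed the query and retrieve the closest chunks by cosine similarity
query_vec = get_embedding("What does the document say?")
sims = chunk_vecs @ query_vec / (
    np.linalg.norm(chunk_vecs, axis=1) * np.linalg.norm(query_vec) + 1e-9
)
top_idx = np.argsort(-sims)[:3]
context = "\n".join(chunks[i] for i in top_idx)
print(context)  # pass this as context to the LLM together with the question
```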

### Example: Question Answering

@@ -41,7 +41,8 @@ We model the LLM workflow as a **Graph + Shared Store**:

- [(Optional) Web Search](./utility_function/websearch.md)
- [(Optional) Chunking](./utility_function/chunking.md)
- [(Optional) Embedding](./utility_function/embedding.md)
- [(Optional) Vector Databases](./utility_function/vector.md)
- [(Optional) Text-to-Speech](./utility_function/text_to_speech.md)

> We do not provide built-in utility functions. Example implementations are provided for reference.
{: .warning }

@@ -32,7 +32,7 @@ def fixed_size_chunk(text, chunk_size=100):

However, sentences are often cut awkwardly, losing coherence.

### 2. Sentence-Based Chunking

```python
import nltk
# ... (rest of the block truncated at the hunk boundary)
```
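
The function body is cut off by the hunk boundary; a plausible sketch of a chunker with the signature shown in the next hunk header, using `nltk.sent_tokenize` (an assumption, not necessarily the repo's implementation):

```python
import nltk

nltk.download("punkt", quiet=True)  # tokenizer models needed by sent_tokenize

def sentence_based_chunk(text, max_sentences=2):
    # Group consecutive sentences into chunks of max_sentences each
    sentences = nltk.sent_tokenize(text)
    return [
        " ".join(sentences[i:i + max_sentences])
        for i in range(0, len(sentences), max_sentences)
    ]

print(sentence_based_chunk("First sentence. Second one. And a third."))
```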

@@ -47,7 +47,7 @@ def sentence_based_chunk(text, max_sentences=2):

However, this might not handle very long sentences or paragraphs well.

### 3. Other Chunking

- **Paragraph-Based**: Split text by paragraphs (e.g., newlines), as in the sketch below. Large paragraphs can create big chunks.
- **Semantic**: Use embeddings or topic modeling to chunk by semantic boundaries.
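
A minimal sketch of the paragraph-based variant, splitting on blank lines (one common convention; not the only one):

```python
def paragraph_chunk(text: str) -> list[str]:
    # Treat blank lines as paragraph boundaries and drop empty pieces
    return [p.strip() for p in text.split("\n\n") if p.strip()]

sample = "First paragraph.\n\nSecond paragraph.\n\nThird."
print(paragraph_chunk(sample))  # ['First paragraph.', 'Second paragraph.', 'Third.']
```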

@@ -1,7 +1,7 @@

---
layout: default
title: "Embedding"
parent: "Utility Function"
nav_order: 6
---

@@ -15,7 +15,7 @@ Below you will find an overview table of various text embedding APIs, along with

{: .best-practice }

| **API** | **Free Tier** | **Pricing Model** | **Docs** |
| --- | --- | --- | --- |
| **OpenAI** | ~$5 credit | ~$0.0001/1K tokens | [OpenAI Embeddings](https://platform.openai.com/docs/api-reference/embeddings) |
| **Azure OpenAI** | $200 credit | Same as OpenAI (~$0.0001/1K tokens) | [Azure OpenAI Embeddings](https://learn.microsoft.com/azure/cognitive-services/openai/how-to/create-resource?tabs=portal) |
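
As a reference, a minimal embeddings call with the `openai` Python package (v1+ interface; the model name and input text are just examples, and the key is read from `OPENAI_API_KEY`):

```python
from openai import OpenAI

client = OpenAI()  # picks up OPENAI_API_KEY from the environment

resp = client.embeddings.create(
    model="text-embedding-3-small",  # example model; check current pricing
    input="The quick brown fox jumps over the lazy dog.",
)
vector = resp.data[0].embedding
print(len(vector))  # dimensionality of the returned embedding
```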

@@ -0,0 +1,107 @@

---
layout: default
title: "Text-to-Speech"
parent: "Utility Function"
nav_order: 8
---

## Text-to-Speech

| **Service** | **Free Tier** | **Pricing Model** | **Docs** |
| --- | --- | --- | --- |
| **Amazon Polly** | 5M std + 1M neural | ~$4/M (std), ~$16/M (neural) after free tier | [Polly Docs](https://aws.amazon.com/polly/) |
| **Google Cloud TTS** | 4M std + 1M WaveNet | ~$4/M (std), ~$16/M (WaveNet) pay-as-you-go | [Cloud TTS Docs](https://cloud.google.com/text-to-speech) |
| **Azure TTS** | 500K neural ongoing | ~$15/M (neural), discount at higher volumes | [Azure TTS Docs](https://azure.microsoft.com/products/cognitive-services/text-to-speech/) |
| **IBM Watson TTS** | 10K chars Lite plan | ~$0.02/1K (≈$20/M); enterprise options available | [IBM Watson Docs](https://www.ibm.com/cloud/watson-text-to-speech) |
| **ElevenLabs** | 10K chars monthly | From ~$5/mo (30K chars) up to $330/mo (2M chars); enterprise options available | [ElevenLabs Docs](https://elevenlabs.io) |

## Example Python Code

### Amazon Polly

```python
import boto3

# Static keys shown for clarity; boto3 can also pick up credentials
# from the environment or an AWS profile.
polly = boto3.client("polly", region_name="us-east-1",
                     aws_access_key_id="YOUR_AWS_ACCESS_KEY_ID",
                     aws_secret_access_key="YOUR_AWS_SECRET_ACCESS_KEY")

resp = polly.synthesize_speech(
    Text="Hello from Polly!",
    OutputFormat="mp3",
    VoiceId="Joanna"
)

with open("polly.mp3", "wb") as f:
    f.write(resp["AudioStream"].read())
```

### Google Cloud TTS

```python
from google.cloud import texttospeech

client = texttospeech.TextToSpeechClient()
input_text = texttospeech.SynthesisInput(text="Hello from Google Cloud TTS!")
voice = texttospeech.VoiceSelectionParams(language_code="en-US")
audio_cfg = texttospeech.AudioConfig(audio_encoding=texttospeech.AudioEncoding.MP3)

resp = client.synthesize_speech(input=input_text, voice=voice, audio_config=audio_cfg)

with open("gcloud_tts.mp3", "wb") as f:
    f.write(resp.audio_content)
```

### Azure TTS

```python
import azure.cognitiveservices.speech as speechsdk

speech_config = speechsdk.SpeechConfig(
    subscription="AZURE_KEY", region="AZURE_REGION")
audio_cfg = speechsdk.audio.AudioConfig(filename="azure_tts.wav")

synthesizer = speechsdk.SpeechSynthesizer(
    speech_config=speech_config,
    audio_config=audio_cfg
)

synthesizer.speak_text_async("Hello from Azure TTS!").get()
```

### IBM Watson TTS

```python
from ibm_watson import TextToSpeechV1
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator

auth = IAMAuthenticator("IBM_API_KEY")
service = TextToSpeechV1(authenticator=auth)
service.set_service_url("IBM_SERVICE_URL")

resp = service.synthesize(
    "Hello from IBM Watson!",
    voice="en-US_AllisonV3Voice",
    accept="audio/mp3"
).get_result()

with open("ibm_tts.mp3", "wb") as f:
    f.write(resp.content)
```

### ElevenLabs

```python
import requests

api_key = "ELEVENLABS_KEY"
voice_id = "ELEVENLABS_VOICE"
url = f"https://api.elevenlabs.io/v1/text-to-speech/{voice_id}"
headers = {"xi-api-key": api_key, "Content-Type": "application/json"}

json_data = {
    "text": "Hello from ElevenLabs!",
    "voice_settings": {"stability": 0.75, "similarity_boost": 0.75}
}

resp = requests.post(url, headers=headers, json=json_data)

with open("elevenlabs.mp3", "wb") as f:
    f.write(resp.content)
```

@@ -0,0 +1,218 @@

---
layout: default
title: "Vector Databases"
parent: "Utility Function"
nav_order: 7
---

# Vector Databases

Below is a table of the popular vector search solutions:

| **Tool** | **Free Tier** | **Pricing Model** | **Docs** |
| --- | --- | --- | --- |
| **FAISS** | N/A, self-host | Open-source | [Faiss.ai](https://faiss.ai) |
| **Pinecone** | 2GB free | From $25/mo | [pinecone.io](https://pinecone.io) |
| **Qdrant** | 1GB free cloud | Pay-as-you-go | [qdrant.tech](https://qdrant.tech) |
| **Weaviate** | 14-day sandbox | From $25/mo | [weaviate.io](https://weaviate.io) |
| **Milvus** | 5GB free cloud | PAYG or $99/mo dedicated | [milvus.io](https://milvus.io) |
| **Chroma** | N/A, self-host | Free (Apache 2.0) | [trychroma.com](https://trychroma.com) |
| **Redis** | 30MB free | From $5/mo | [redis.io](https://redis.io) |

---

## Example Python Code

Below are basic usage snippets for each tool.

### FAISS

```python
import faiss
import numpy as np

# Dimensionality of embeddings
d = 128

# Create a flat L2 index
index = faiss.IndexFlatL2(d)

# Random vectors
data = np.random.random((1000, d)).astype('float32')
index.add(data)

# Query
query = np.random.random((1, d)).astype('float32')
D, I = index.search(query, k=5)

print("Distances:", D)
print("Neighbors:", I)
```

### Pinecone

```python
import pinecone

# Assumes the pre-3.0 pinecone-client interface
pinecone.init(api_key="YOUR_API_KEY", environment="YOUR_ENV")

index_name = "my-index"

# Create the index if it doesn't exist
if index_name not in pinecone.list_indexes():
    pinecone.create_index(name=index_name, dimension=128)

# Connect
index = pinecone.Index(index_name)

# Upsert
vectors = [
    ("id1", [0.1]*128),
    ("id2", [0.2]*128)
]
index.upsert(vectors)

# Query
response = index.query(vector=[0.15]*128, top_k=3)
print(response)
```

### Qdrant

```python
import qdrant_client
from qdrant_client.models import Distance, VectorParams, PointStruct

client = qdrant_client.QdrantClient(
    url="https://YOUR-QDRANT-CLOUD-ENDPOINT",
    api_key="YOUR_API_KEY"
)

collection = "my_collection"
client.recreate_collection(
    collection_name=collection,
    vectors_config=VectorParams(size=128, distance=Distance.COSINE)
)

points = [
    PointStruct(id=1, vector=[0.1]*128, payload={"type": "doc1"}),
    PointStruct(id=2, vector=[0.2]*128, payload={"type": "doc2"}),
]

client.upsert(collection_name=collection, points=points)

results = client.search(
    collection_name=collection,
    query_vector=[0.15]*128,
    limit=2
)
print(results)
```

### Weaviate

```python
import weaviate

client = weaviate.Client("https://YOUR-WEAVIATE-CLOUD-ENDPOINT")

schema = {
    "classes": [
        {
            "class": "Article",
            "vectorizer": "none"
        }
    ]
}
client.schema.create(schema)

obj = {
    "title": "Hello World",
    "content": "Weaviate vector search"
}
client.data_object.create(obj, "Article", vector=[0.1]*128)

resp = (
    client.query
    .get("Article", ["title", "content"])
    .with_near_vector({"vector": [0.15]*128})
    .with_limit(3)
    .do()
)
print(resp)
```

### Milvus

```python
from pymilvus import connections, FieldSchema, CollectionSchema, DataType, Collection
import numpy as np

connections.connect(alias="default", host="localhost", port="19530")

fields = [
    FieldSchema(name="id", dtype=DataType.INT64, is_primary=True),
    FieldSchema(name="embedding", dtype=DataType.FLOAT_VECTOR, dim=128)
]
schema = CollectionSchema(fields)
collection = Collection("MyCollection", schema)

emb = np.random.rand(10, 128).astype('float32')
ids = list(range(10))
collection.insert([ids, emb])

index_params = {
    "index_type": "IVF_FLAT",
    "params": {"nlist": 128},
    "metric_type": "L2"
}
collection.create_index("embedding", index_params)
collection.load()

query_emb = np.random.rand(1, 128).astype('float32')
results = collection.search(
    query_emb, "embedding",
    param={"metric_type": "L2", "params": {"nprobe": 10}},
    limit=3
)
print(results)
```

### Chroma

```python
import chromadb
from chromadb.config import Settings

# Uses the pre-0.4 chromadb configuration style
client = chromadb.Client(Settings(
    chroma_db_impl="duckdb+parquet",
    persist_directory="./chroma_data"
))

coll = client.create_collection("my_collection")

vectors = [[0.1, 0.2, 0.3], [0.2, 0.2, 0.2]]
metas = [{"doc": "text1"}, {"doc": "text2"}]
ids = ["id1", "id2"]
coll.add(embeddings=vectors, metadatas=metas, ids=ids)

res = coll.query(query_embeddings=[[0.15, 0.25, 0.3]], n_results=2)
print(res)
```

### Redis

```python
import struct

import redis
from redis.commands.search.query import Query

r = redis.Redis(host="localhost", port=6379)

# Create index (requires the RediSearch module)
r.execute_command(
    "FT.CREATE", "my_idx", "ON", "HASH",
    "SCHEMA", "embedding", "VECTOR", "FLAT", "6",
    "TYPE", "FLOAT32", "DIM", "128",
    "DISTANCE_METRIC", "L2"
)

# Insert
vec = struct.pack('128f', *[0.1]*128)
r.hset("doc1", mapping={"embedding": vec})

# Search (KNN vector queries need query dialect 2)
qvec = struct.pack('128f', *[0.15]*128)
q = Query("*=>[KNN 3 @embedding $BLOB AS dist]").dialect(2)
res = r.ft("my_idx").search(q, query_params={"BLOB": qvec})
print(res.docs)
```

@@ -8,7 +8,7 @@ nav_order: 4

We recommend some implementations of commonly used web search tools.

| **API** | **Free Tier** | **Pricing Model** | **Docs** |
| --- | --- | --- | --- |
| **Google Custom Search JSON API** | 100 queries/day free | $5 per 1,000 queries | [Link](https://developers.google.com/custom-search/v1/overview) |
| **Bing Web Search API** | 1,000 queries/month | $15–$25 per 1,000 queries | [Link](https://azure.microsoft.com/en-us/services/cognitive-services/bing-web-search-api/) |
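
A minimal sketch of calling the Google Custom Search JSON API with `requests` (the key and search engine ID are placeholders):

```python
import requests

API_KEY = "YOUR_GOOGLE_API_KEY"
CX = "YOUR_SEARCH_ENGINE_ID"  # Programmable Search Engine ID

resp = requests.get(
    "https://www.googleapis.com/customsearch/v1",
    params={"key": API_KEY, "cx": CX, "q": "retrieval augmented generation"},
)
for item in resp.json().get("items", []):
    print(item["title"], item["link"])
```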