---
layout: default
title: "LLM Wrapper"
parent: "Utility Function"
nav_order: 1
---
# LLM Wrappers
We don't provide built-in LLM wrappers. Instead, please implement your own or check out libraries like litellm. Here are some minimal example implementations, followed by a litellm-based sketch:
- **OpenAI**

  ```python
  def call_llm(prompt):
      from openai import OpenAI
      client = OpenAI(api_key="YOUR_API_KEY_HERE")
      r = client.chat.completions.create(
          model="gpt-4o",
          messages=[{"role": "user", "content": prompt}]
      )
      return r.choices[0].message.content

  # Example usage
  call_llm("How are you?")
  ```

  > Store the API key in an environment variable like OPENAI_API_KEY for security.
  {: .best-practice }
- **Claude (Anthropic)**

  ```python
  def call_llm(prompt):
      from anthropic import Anthropic
      client = Anthropic(api_key="YOUR_API_KEY_HERE")
      response = client.messages.create(
          model="claude-2",
          messages=[{"role": "user", "content": prompt}],
          max_tokens=100
      )
      # The Messages API returns a list of content blocks; return the text of the first one
      return response.content[0].text
  ```
- **Google (Generative AI Studio / PaLM API)**

  ```python
  def call_llm(prompt):
      import google.generativeai as genai
      genai.configure(api_key="YOUR_API_KEY_HERE")
      response = genai.generate_text(
          model="models/text-bison-001",
          prompt=prompt
      )
      return response.result
  ```
- **Azure (Azure OpenAI)**

  ```python
  def call_llm(prompt):
      from openai import AzureOpenAI
      client = AzureOpenAI(
          azure_endpoint="https://<YOUR_RESOURCE_NAME>.openai.azure.com/",
          api_key="YOUR_API_KEY_HERE",
          api_version="2023-05-15"
      )
      r = client.chat.completions.create(
          model="<YOUR_DEPLOYMENT_NAME>",
          messages=[{"role": "user", "content": prompt}]
      )
      return r.choices[0].message.content
  ```
- **Ollama (Local LLM)**

  ```python
  def call_llm(prompt):
      from ollama import chat
      response = chat(
          model="llama2",
          messages=[{"role": "user", "content": prompt}]
      )
      return response.message.content
  ```
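If you prefer not to maintain provider-specific code, a unified client such as litellm can cover all of the above behind one interface. Below is a minimal sketch, assuming litellm is installed and the relevant key (e.g. OPENAI_API_KEY) is set in the environment, per the best-practice note above; the model name is just an example:

```python
def call_llm(prompt):
    from litellm import completion
    # litellm picks up provider keys (e.g. OPENAI_API_KEY) from the environment
    # and returns responses in the OpenAI-compatible format.
    response = completion(
        model="gpt-4o",  # swap in any provider/model litellm supports
        messages=[{"role": "user", "content": prompt}]
    )
    return response.choices[0].message.content
```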
## Improvements
Feel free to enhance your `call_llm` function as needed. Here are some examples:
- Handle chat history (see the usage sketch after this list):

  ```python
  def call_llm(messages):
      from openai import OpenAI
      client = OpenAI(api_key="YOUR_API_KEY_HERE")
      r = client.chat.completions.create(
          model="gpt-4o",
          messages=messages
      )
      return r.choices[0].message.content
  ```
- Add in-memory caching:

  ```python
  from functools import lru_cache

  @lru_cache(maxsize=1000)
  def call_llm(prompt):
      # Your implementation here
      pass
  ```
  > ⚠️ Caching conflicts with Node retries, as retries yield the same result.
  > To address this, you could use cached results only if not retried.
  {: .warning }
  ```python
  from functools import lru_cache

  @lru_cache(maxsize=1000)
  def cached_call(prompt):
      pass

  def call_llm(prompt, use_cache):
      if use_cache:
          return cached_call(prompt)
      # Call the underlying function directly
      return cached_call.__wrapped__(prompt)

  class SummarizeNode(Node):
      def exec(self, text):
          # Use the cache only on the first attempt; bypass it on retries
          return call_llm(f"Summarize: {text}", self.cur_retry==0)
  ```
- Enable logging:

  ```python
  def call_llm(prompt):
      import logging
      logging.info(f"Prompt: {prompt}")
      response = ... # Your implementation here
      logging.info(f"Response: {response}")
      return response
  ```
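For the chat-history variant above, usage might look like the following. This is only a sketch: the `ask` helper and the sample prompts are hypothetical, and `call_llm(messages)` refers to the history-aware version in the first bullet.

```python
# Hypothetical usage of the history-aware call_llm(messages) from above.
history = []

def ask(user_message):
    history.append({"role": "user", "content": user_message})
    reply = call_llm(history)  # pass the whole conversation so far
    history.append({"role": "assistant", "content": reply})
    return reply

ask("How are you?")
ask("Summarize what you just said in five words.")  # the model sees earlier turns
```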
## Why Not Provide Built-in LLM Wrappers?
I believe it is a bad practice to provide LLM-specific implementations in a general framework:
- LLM APIs change frequently. Hardcoding them makes maintenance a nightmare.
- You may need flexibility to switch vendors, use fine-tuned models, or deploy local LLMs.
- You may need optimizations like prompt caching, request batching, or response streaming.