diff --git a/docs/index.md b/docs/index.md
index 3258b51..cd3c002 100644
--- a/docs/index.md
+++ b/docs/index.md
@@ -8,6 +8,12 @@ nav_order: 1
 
 A [100-line](https://github.com/zachary62/miniLLMFlow/blob/main/minillmflow/__init__.py) minimalist LLM framework for *Agents, Task Decomposition, RAG, etc*.
 
+<div>
+
+</div>
+
+## Core Abstraction
+
 We model the LLM workflow as a **Nested Directed Graph**:
 - **Nodes** handle simple (LLM) tasks.
 - Nodes connect through **Actions** (labeled edges) for *Agents*.
@@ -16,12 +22,7 @@ We model the LLM workflow as a **Nested Directed Graph**:
 - **Batch** Nodes/Flows for data-intensive tasks.
 - **Async** Nodes/Flows allow waits or **Parallel** execution
 
-<div>
-
-</div>
-
-## Core Abstraction
-
+To learn more:
 - [Node](./node.md)
 - [Flow](./flow.md)
 - [Communication](./communication.md)
@@ -29,20 +30,32 @@ We model the LLM workflow as a **Nested Directed Graph**:
 - [(Advanced) Async](./async.md)
 - [(Advanced) Parallel](./parallel.md)
 
-## Preparation
+## LLM Wrapper & Tools
 
+**We DO NOT provide built-in LLM wrappers and tools!**
+
+I believe it is a *bad practice* to provide low-level implementations in a general framework:
+- **APIs change frequently.** Hardcoding them makes maintenance a nightmare.
+- You may need **flexibility.** E.g., using fine-tuned LLMs or deploying local ones.
+- You may need **optimizations.** E.g., prompt caching, request batching, response streaming...
+
+We provide some simple example implementations:
 - [LLM Wrapper](./llm.md)
 - [Tool](./tool.md)
 
-## Paradigm Implementation
+## Paradigm
 
+Based on the core abstraction, we implement common high-level paradigms:
+
+- [Structured Output](./structure.md)
 - Task Decomposition
-- Agent
-- Map Reduce
 - RAG
-- Structured Output
+- Chat Memory
+- Map Reduce
+- Agent
+- Multi-Agent
 - Evaluation
 
 ## Example Projects
-- TODO
+- Coming soon ...
diff --git a/docs/llm.md b/docs/llm.md
index 9229940..55e44ab 100644
--- a/docs/llm.md
+++ b/docs/llm.md
@@ -62,9 +62,3 @@ def call_llm(prompt):
     return response
 ```
 
-## Why Not Provide a Built-in LLM Wrapper?
-I believe it is a **bad practice** to provide LLM-specific implementations in a general framework:
-- **LLM APIs change frequently**. Hardcoding them makes maintenance a nighmare.
-- You may need **flexibility** to switch vendors, use fine-tuned models, or deploy local LLMs.
-- You may need **optimizations** like prompt caching, request batching, or response streaming.
-
diff --git a/docs/paradigm.md b/docs/paradigm.md
new file mode 100644
index 0000000..ec5072f
--- /dev/null
+++ b/docs/paradigm.md
@@ -0,0 +1,6 @@
+---
+layout: default
+title: "Paradigm"
+nav_order: 4
+has_children: true
+---
\ No newline at end of file
diff --git a/docs/structure.md b/docs/structure.md
new file mode 100644
index 0000000..1138f5c
--- /dev/null
+++ b/docs/structure.md
@@ -0,0 +1,111 @@
+---
+layout: default
+title: "Structured Output"
+parent: "Paradigm"
+nav_order: 1
+---
+
+# Structured Output
+
+In many use cases, you may want the LLM to output a specific structure, such as a list or a dictionary with predefined keys.
+
+There are several approaches to achieve structured output:
+- **Prompting** the LLM to strictly return a defined structure.
+- Using LLMs that natively support **schema enforcement**.
+- **Post-processing** the LLM’s response to extract structured content.
+
+In practice, **Prompting** is simple and reliable for modern LLMs.
+
+## Example Use Cases
+
+1. **Extracting Key Information**
+
+```yaml
+product:
+  name: Widget Pro
+  price: 199.99
+  description: |
+    A high-quality widget designed for professionals.
+    Recommended for advanced users.
+```
+
+2. **Summarizing Documents into Bullet Points**
+
+```yaml
+summary:
+  - This product is easy to use.
+  - It is cost-effective.
+  - Suitable for all skill levels.
+```
+
+3. **Generating Configuration Files**
+
+```yaml
+server:
+  host: 127.0.0.1
+  port: 8080
+  ssl: true
+```
+
+## Prompt Engineering
+
+When prompting the LLM to produce **structured** output:
+1. **Wrap** the structure in code fences (e.g., ```yaml).
+2. **Validate** that all required fields exist (and retry if necessary; see the sketch below).
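+
+For illustration only, a minimal validate-and-retry helper might look like the sketch below. It is not part of the framework: `call_llm` is assumed to be the user-defined wrapper from [LLM Wrapper](./llm.md), and `required_keys` and the retry count are made up for this example.
+
+```python
+import yaml
+
+def call_llm_structured(prompt, required_keys, max_retries=3):
+    # Retry until the reply parses as YAML and contains all required keys.
+    for _ in range(max_retries):
+        response = call_llm(prompt)  # assumed user-defined wrapper (./llm.md)
+        try:
+            yaml_str = response.split("```yaml")[1].split("```")[0].strip()
+            result = yaml.safe_load(yaml_str)
+        except (IndexError, yaml.YAMLError):
+            continue  # missing code fence or invalid YAML; ask again
+        if isinstance(result, dict) and all(k in result for k in required_keys):
+            return result
+    raise ValueError("LLM did not return the required structure")
+```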
+
+### Example: Text Summarization
+
+```python
+from minillmflow import Node  # core Node abstraction (see ./node.md)
+
+class SummarizeNode(Node):
+    def exec(self, prep_res):
+        # Suppose `prep_res` is the text to summarize.
+        prompt = f"""
+Please summarize the following text as YAML, with exactly 3 bullet points:
+
+{prep_res}
+
+Now, output:
+```yaml
+summary:
+  - bullet 1
+  - bullet 2
+  - bullet 3
+```"""
+        response = call_llm(prompt)  # the wrapper from ./llm.md
+        yaml_str = response.split("```yaml")[1].split("```")[0].strip()
+
+        import yaml
+        structured_result = yaml.safe_load(yaml_str)
+
+        assert "summary" in structured_result
+        assert isinstance(structured_result["summary"], list)
+
+        return structured_result
+```
+
+### Why YAML instead of JSON?
+
+Current LLMs struggle with escaping. YAML is easier for strings, since they don’t always need quotes.
+
+**In JSON**
+
+```json
+{
+  "dialogue": "Alice said: \"Hello Bob.\\nHow are you?\\nI am good.\""
+}
+```
+
+- Every double quote inside the string must be escaped with `\"`.
+- Each newline in the dialogue must be represented as `\n`.
+
+**In YAML**
+
+```yaml
+dialogue: |
+  Alice said: "Hello Bob.
+  How are you?
+  I am good."
+```
+
+- No need to escape interior quotes; just place the entire text under a block literal (`|`).
+- Newlines are naturally preserved without needing `\n`.
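+
+As a quick, illustrative sanity check (using the standard `json` module and the PyYAML package), the two snippets above decode to the same dialogue:
+
+```python
+import json
+import yaml
+
+json_text = '{"dialogue": "Alice said: \\"Hello Bob.\\nHow are you?\\nI am good.\\""}'
+yaml_text = 'dialogue: |\n  Alice said: "Hello Bob.\n  How are you?\n  I am good."'
+
+# Same text either way; the YAML block literal (`|`) keeps one trailing newline.
+assert json.loads(json_text)["dialogue"] == yaml.safe_load(yaml_text)["dialogue"].rstrip("\n")
+```
\ No newline at end of file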