structured output

parent 118293a641
commit 1d8377f6cd
A [100-line](https://github.com/zachary62/miniLLMFlow/blob/main/minillmflow/__init__.py) minimalist LLM framework for *Agents, Task Decomposition, RAG, etc*.
<div align="center">
  <img src="https://github.com/zachary62/miniLLMFlow/blob/main/assets/minillmflow.jpg?raw=true" width="400"/>
</div>
## Core Abstraction
We model the LLM workflow as a **Nested Directed Graph**:

- **Nodes** handle simple (LLM) tasks.
- Nodes connect through **Actions** (labeled edges) for *Agents*.
- **Batch** Nodes/Flows for data-intensive tasks.
- **Async** Nodes/Flows allow waiting or **Parallel** execution.
To learn more:
- [Node](./node.md)
- [Flow](./flow.md)
- [Communication](./communication.md)
- [(Advanced) Async](./async.md)
- [(Advanced) Parallel](./parallel.md)
## LLM Wrapper & Tools
**We DO NOT provide built-in LLM wrappers and tools!**
I believe it is a *bad practice* to provide low-level implementations in a general framework:
- **APIs change frequently.** Hardcoding them makes maintenance a nightmare.
- You may need **flexibility.** E.g., using fine-tuned LLMs or deploying local ones.
- You may need **optimizations.** E.g., prompt caching, request batching, response streaming...
We provide some simple example implementations:
- [LLM Wrapper](./llm.md)
- [Tool](./tool.md)
## Paradigm
Based on the core abstraction, we implement common high-level paradigms:
- [Structured Output](./structure.md)
- Task Decomposition
- RAG
- Chat Memory
- Map Reduce
- Agent
- Multi-Agent
- Evaluation
## Example Projects
- Coming soon ...
---
layout: default
title: "Paradigm"
nav_order: 4
has_children: true
---
---
layout: default
title: "Structured Output"
parent: "Paradigm"
nav_order: 1
---
# Structured Output
In many use cases, you may want the LLM to output a specific structure, such as a list or a dictionary with predefined keys.
There are several approaches to achieve structured output:

- **Prompting** the LLM to strictly return a defined structure.
- Using LLMs that natively support **schema enforcement**.
- **Post-processing** the LLM’s response to extract structured content.

In practice, **prompting** is simple and reliable for modern LLMs.
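For instance, the prompting-plus-extraction approach can be sketched end to end. This is a minimal stand-alone sketch (not framework API); the canned `response` string stands in for a real LLM call, and the fence string is built programmatically only to keep this snippet renderable:

```python
import yaml  # PyYAML, as used in the examples below

FENCE = "`" * 3  # i.e., a markdown code fence

def extract_yaml(response: str) -> dict:
    """Pull the fenced YAML block out of an LLM response and parse it."""
    yaml_str = response.split(FENCE + "yaml")[1].split(FENCE)[0].strip()
    return yaml.safe_load(yaml_str)

# A canned response standing in for a real LLM call.
response = (
    "Here is the summary:\n"
    + FENCE + "yaml\n"
    "summary:\n"
    "  - Easy to use.\n"
    "  - Cost-effective.\n"
    + FENCE
)

result = extract_yaml(response)
assert result == {"summary": ["Easy to use.", "Cost-effective."]}
```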
## Example Use Cases
1. **Extracting Key Information**

   ```yaml
   product:
     name: Widget Pro
     price: 199.99
     description: |
       A high-quality widget designed for professionals.
       Recommended for advanced users.
   ```
2. **Summarizing Documents into Bullet Points**

   ```yaml
   summary:
     - This product is easy to use.
     - It is cost-effective.
     - Suitable for all skill levels.
   ```
3. **Generating Configuration Files**

   ```yaml
   server:
     host: 127.0.0.1
     port: 8080
     ssl: true
   ```
## Prompt Engineering
When prompting the LLM to produce **structured** output:

1. **Wrap** the structure in code fences (e.g., ```yaml).
2. **Validate** that all required fields exist (and retry if necessary).
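The validate-and-retry step can be sketched as a small loop. The `get_structured` helper below is hypothetical (not framework API), and `call_llm` is replaced by a stub that returns one malformed response before producing valid YAML:

```python
import yaml

def get_structured(prompt, call_llm, max_retries=3):
    """Call the LLM and re-ask until the response parses and has the required field."""
    fence = "`" * 3  # markdown code fence
    for _ in range(max_retries):
        response = call_llm(prompt)
        try:
            yaml_str = response.split(fence + "yaml")[1].split(fence)[0].strip()
            result = yaml.safe_load(yaml_str)
            assert "summary" in result                 # required field present
            assert isinstance(result["summary"], list)  # with the expected type
            return result
        except (IndexError, TypeError, AssertionError, yaml.YAMLError):
            continue  # malformed response: retry
    raise ValueError("no valid structured output after retries")

# Stub LLM: fails once, then answers correctly.
answers = iter([
    "oops, no YAML here",
    "`" * 3 + "yaml\nsummary:\n  - point one\n" + "`" * 3,
])
result = get_structured("summarize...", lambda prompt: next(answers))
assert result == {"summary": ["point one"]}
```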
### Example: Text Summarization
```python
class SummarizeNode(Node):
    def exec(self, prep_res):
        # Suppose `prep_res` is the text to summarize.
        prompt = f"""
Please summarize the following text as YAML, with exactly 3 bullet points

{prep_res}

Now, output:
```yaml
summary:
  - bullet 1
  - bullet 2
  - bullet 3
```"""
        response = call_llm(prompt)
        yaml_str = response.split("```yaml")[1].split("```")[0].strip()

        import yaml
        structured_result = yaml.safe_load(yaml_str)

        assert "summary" in structured_result
        assert isinstance(structured_result["summary"], list)

        return structured_result
```
### Why YAML instead of JSON?
Current LLMs struggle with escaping. YAML is easier for strings, since they don’t always need quotes.
**In JSON**
```json
{
  "dialogue": "Alice said: \"Hello Bob.\nHow are you?\nI am good.\""
}
```
- Every double quote inside the string must be escaped with `\"`.
- Each newline in the dialogue must be represented as `\n`.
**In YAML**
```yaml
dialogue: |
  Alice said: "Hello Bob.
  How are you?
  I am good."
```
- No need to escape interior quotes; just place the entire text under a block literal (`|`).
- Newlines are naturally preserved without needing `\n`.
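To confirm the two forms encode the same string, one can parse both; this is a quick check using Python's `json` module and PyYAML (note that a `|` block literal keeps one trailing newline, which we strip before comparing):

```python
import json
import yaml

# The JSON form: every interior quote escaped, newlines written as \n.
json_text = r'{"dialogue": "Alice said: \"Hello Bob.\nHow are you?\nI am good.\""}'

# The YAML form: a block literal, no escaping needed.
yaml_text = (
    "dialogue: |\n"
    '  Alice said: "Hello Bob.\n'
    "  How are you?\n"
    '  I am good."\n'
)

j = json.loads(json_text)["dialogue"]
y = yaml.safe_load(yaml_text)["dialogue"].rstrip("\n")  # drop block-literal newline
assert j == y == 'Alice said: "Hello Bob.\nHow are you?\nI am good."'
```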