111 lines
2.5 KiB
Markdown
111 lines
2.5 KiB
Markdown
---
|
|
layout: default
|
|
title: "Structured Output"
|
|
parent: "Design"
|
|
nav_order: 1
|
|
---
|
|
|
|
# Structured Output
|
|
|
|
In many use cases, you may want the LLM to output a specific structure, such as a list or a dictionary with predefined keys.
|
|
|
|
There are several approaches to achieve a structured output:
|
|
- **Prompting** the LLM to strictly return a defined structure.
|
|
- Using LLMs that natively support **schema enforcement**.
|
|
- **Post-processing** the LLM's response to extract structured content.
|
|
|
|
In practice, **Prompting** is simple and reliable for modern LLMs.
|
|
|
|
### Example Use Cases
|
|
|
|
- Extracting Key Information
|
|
|
|
```yaml
|
|
product:
|
|
name: Widget Pro
|
|
price: 199.99
|
|
description: |
|
|
A high-quality widget designed for professionals.
|
|
Recommended for advanced users.
|
|
```
|
|
|
|
- Summarizing Documents into Bullet Points
|
|
|
|
```yaml
|
|
summary:
|
|
- This product is easy to use.
|
|
- It is cost-effective.
|
|
- Suitable for all skill levels.
|
|
```
|
|
|
|
- Generating Configuration Files
|
|
|
|
```yaml
|
|
server:
|
|
host: 127.0.0.1
|
|
port: 8080
|
|
ssl: true
|
|
```
|
|
|
|
## Prompt Engineering
|
|
|
|
When prompting the LLM to produce **structured** output:
|
|
1. **Wrap** the structure in code fences (e.g., `yaml`).
|
|
2. **Validate** that all required fields exist (and let `Node` handles retry).
|
|
|
|
### Example Text Summarization
|
|
|
|
```python
|
|
class SummarizeNode(Node):
|
|
def exec(self, prep_res):
|
|
# Suppose `prep_res` is the text to summarize.
|
|
prompt = f"""
|
|
Please summarize the following text as YAML, with exactly 3 bullet points
|
|
|
|
{prep_res}
|
|
|
|
Now, output:
|
|
```yaml
|
|
summary:
|
|
- bullet 1
|
|
- bullet 2
|
|
- bullet 3
|
|
```"""
|
|
response = call_llm(prompt)
|
|
yaml_str = response.split("```yaml")[1].split("```")[0].strip()
|
|
|
|
import yaml
|
|
structured_result = yaml.safe_load(yaml_str)
|
|
|
|
assert "summary" in structured_result
|
|
assert isinstance(structured_result["summary"], list)
|
|
|
|
return structured_result
|
|
```
|
|
|
|
### Why YAML instead of JSON?
|
|
|
|
Current LLMs struggle with escaping. YAML is easier with strings since they don't always need quotes.
|
|
|
|
**In JSON**
|
|
|
|
```json
|
|
{
|
|
"dialogue": "Alice said: \"Hello Bob.\\nHow are you?\\nI am good.\""
|
|
}
|
|
```
|
|
|
|
- Every double quote inside the string must be escaped with `\"`.
|
|
- Each newline in the dialogue must be represented as `\n`.
|
|
|
|
**In YAML**
|
|
|
|
```yaml
|
|
dialogue: |
|
|
Alice said: "Hello Bob.
|
|
How are you?
|
|
I am good."
|
|
```
|
|
|
|
- No need to escape interior quotes—just place the entire text under a block literal (`|`).
|
|
- Newlines are naturally preserved without needing `\n`. |