---
layout: default
title: "Structured Output"
parent: "Paradigm"
nav_order: 1
---

# Structured Output

In many use cases, you may want the LLM to output a specific structure, such as a list or a dictionary with predefined keys.

There are several approaches to achieve a structured output:
- **Prompting** the LLM to strictly return a defined structure.
- Using LLMs that natively support **schema enforcement**.
- **Post-processing** the LLM’s response to extract structured content.

In practice, **Prompting** is simple and reliable for modern LLMs.

## Example Use Cases

1. **Extracting Key Information**

```yaml
product:
  name: Widget Pro
  price: 199.99
  description: |
    A high-quality widget designed for professionals.
    Recommended for advanced users.
```

2. **Summarizing Documents into Bullet Points**

```yaml
summary:
  - This product is easy to use.
  - It is cost-effective.
  - Suitable for all skill levels.
```

3. **Generating Configuration Files**

```yaml
server:
  host: 127.0.0.1
  port: 8080
  ssl: true
```

## Prompt Engineering

When prompting the LLM to produce **structured** output:
1. **Wrap** the structure in code fences (e.g., `` ```yaml ``).
2. **Validate** that all required fields exist (and retry if necessary).

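
The validate-and-retry step can be sketched as follows. This is a minimal illustration, not part of any framework: `validate_summary`, `call_with_retry`, and the `produce` callable are hypothetical names, and `produce` is assumed to wrap whatever calls the LLM and parses its reply.

```python
from typing import Any, Callable, List


def validate_summary(result: Any, expected_bullets: int = 3) -> List[str]:
    """Return a list of problems; an empty list means the structure is valid."""
    if not isinstance(result, dict):
        return ["top-level value is not a mapping"]
    if "summary" not in result:
        return ["missing required key 'summary'"]
    summary = result["summary"]
    if not isinstance(summary, list):
        return ["'summary' is not a list"]
    problems = []
    if len(summary) != expected_bullets:
        problems.append(f"expected {expected_bullets} bullets, got {len(summary)}")
    if not all(isinstance(item, str) for item in summary):
        problems.append("every bullet must be a string")
    return problems


def call_with_retry(produce: Callable[[], Any], max_retries: int = 3) -> Any:
    """Call `produce` until its result validates, up to `max_retries` times."""
    problems: List[str] = []
    for _ in range(max_retries):
        try:
            result = produce()  # hypothetical: call the LLM and parse the reply
        except Exception as exc:  # e.g. a missing fence or a parse error
            problems = [f"could not produce a result: {exc}"]
            continue
        problems = validate_summary(result)
        if not problems:
            return result
    raise ValueError(f"no valid output after {max_retries} attempts: {problems}")
```

Returning a list of problems rather than a bare boolean makes it possible to log why an attempt failed, or to feed the reasons back into the next prompt.
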
### Example Text Summarization

```python
import yaml

# `Node` and `call_llm` are assumed to be defined elsewhere in the project.
class SummarizeNode(Node):
    def exec(self, prep_res):
        # Suppose `prep_res` is the text to summarize.
        prompt = f"""
Please summarize the following text as YAML, with exactly 3 bullet points

{prep_res}

Now, output:
```yaml
summary:
  - bullet 1
  - bullet 2
  - bullet 3
```"""
        response = call_llm(prompt)
        # Keep only the text between the opening yaml fence and the next closing fence.
        yaml_str = response.split("```yaml")[1].split("```")[0].strip()

        structured_result = yaml.safe_load(yaml_str)

        # Validate the structure; a failed assertion raises AssertionError,
        # which the caller can catch in order to retry.
        assert "summary" in structured_result
        assert isinstance(structured_result["summary"], list)

        return structured_result
```

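
The `split`-based extraction in the example raises `IndexError` when the response contains no `` ```yaml `` fence. A slightly more forgiving variant might look like the sketch below; `extract_yaml_block` is a hypothetical helper, and falling back to the whole response when no fence is found is an assumption about the desired behavior, not something the example prescribes.

```python
import re
import textwrap

# Matches a fence opened with ```yaml, ```yml, or a bare ```, up to the next ```.
FENCE_RE = re.compile(r"```(?:yaml|yml)?[ \t]*\n(.*?)```", re.DOTALL)


def extract_yaml_block(response: str) -> str:
    """Return the body of the first fenced block, or the whole response if none."""
    match = FENCE_RE.search(response)
    body = match.group(1) if match else response
    # Remove common indentation and surrounding blank lines, but keep the
    # relative indentation that YAML depends on.
    return textwrap.dedent(body).strip("\n")
```

The returned string can then be passed to `yaml.safe_load` exactly as in the example.
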
### Why YAML instead of JSON?

Current LLMs often make mistakes when escaping characters inside quoted strings. YAML is easier for string values because they don't always need quotes.

**In JSON**

```json
{
  "dialogue": "Alice said: \"Hello Bob.\nHow are you?\nI am good.\""
}
```

- Every double quote inside the string must be escaped with `\"`.
- Each newline in the dialogue must be represented as `\n`.

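
To see how much escaping the JSON form needs, the snippet below encodes the same dialogue with Python's standard `json` module. It is only an illustration; the variable names are arbitrary.

```python
import json

# The dialogue as a Python string: two interior quotes and two line breaks.
dialogue = 'Alice said: "Hello Bob.\nHow are you?\nI am good."'

encoded = json.dumps({"dialogue": dialogue})
print(encoded)

# Round-tripping recovers the original text exactly.
decoded = json.loads(encoded)["dialogue"]
assert decoded == dialogue

# Leaving the interior quotes unescaped makes the document invalid.
broken = '{"dialogue": "Alice said: "Hello Bob.""}'
try:
    json.loads(broken)
except json.JSONDecodeError as err:
    print("invalid JSON:", err.msg)
```

The encoded text contains four backslash escapes (two `\"` and two `\n`), and missing any one of them either breaks parsing or changes the string.
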
**In YAML**

```yaml
dialogue: |
  Alice said: "Hello Bob.
  How are you?
  I am good."
```

- No need to escape interior quotes; just place the entire text under a block literal (`|`).
- Newlines are naturally preserved without needing `\n`.
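
As a quick check, the block literal can be parsed with `yaml.safe_load`. The sketch assumes PyYAML is installed, and the variable names are illustrative.

```python
import yaml  # PyYAML, assumed to be installed

yaml_text = '''\
dialogue: |
  Alice said: "Hello Bob.
  How are you?
  I am good."
'''

parsed = yaml.safe_load(yaml_text)
dialogue = parsed["dialogue"]

# Interior quotes and line breaks come through without any escaping.
assert dialogue.startswith('Alice said: "Hello Bob.\n')
# A plain `|` keeps one trailing newline; `|-` would strip it.
assert dialogue.endswith('I am good."\n')
```

One detail to keep in mind: because `|` keeps a final newline, the parsed string is not byte-for-byte identical to a JSON string without one. Use `|-` in the prompt template if the trailing newline matters.
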