Output Control with JSON and Schemas

Free-form AI output is fragile in production. This article explains how to use JSON and schema validation to make LLM outputs safer, more predictable, and easier to integrate with deterministic systems.

level: intermediate
topics: prompting
tags: prompting, schemas, validation, production, reliability

The Problem with Free-Form Output

Unstructured AI output is a reliability hazard.

Consider this prompt:

Extract the customer's name, email, and order number from this message.

The model might respond in any number of ways:

    "The customer is John Smith, reachable at john@example.com, order #12345."
    Name: John Smith | Email: john@example.com | Order: 12345
    John Smith (john@example.com) placed order 12345.

All are correct, but none are programmatically parseable without fragile regex.


JSON as Output Format

Basic JSON Constraint

prompt = f"""
Extract customer information.

Input: {message}

Output format: JSON
{{
    "name": "string",
    "email": "string",
    "order_number": "string"
}}

Extract:
"""

Benefits:

  • Parseable output
  • Type expectations clear
  • Consistent structure
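
Even with an explicit format instruction, models sometimes wrap the JSON in markdown fences. A minimal parsing sketch (llm.generate stands in for whatever client you use):

import json

def parse_json_response(response: str) -> dict:
    """Parse a model reply that may wrap its JSON in ``` fences."""
    text = response.strip()
    if text.startswith("```"):
        text = text.split("\n", 1)[1]      # drop opening fence (and optional "json" tag)
        text = text.rsplit("```", 1)[0]    # drop closing fence
    return json.loads(text)

data = parse_json_response(llm.generate(prompt))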

Schema-Driven Prompting

Define Schema First

from pydantic import BaseModel, EmailStr

class CustomerInfo(BaseModel):
    name: str
    email: EmailStr
    order_number: str

# Generate prompt from schema
prompt = f"""
Extract customer information.

Input: {message}

Output format (JSON):
{CustomerInfo.model_json_schema()}

Extract:
"""

Validate Before Using

from pydantic import ValidationError

response = llm.generate(prompt)

try:
    # Parse and validate
    data = CustomerInfo.model_validate_json(response)

    # Now safe to use
    send_email(data.email)
    lookup_order(data.order_number)

except ValidationError as e:
    # Handle invalid output
    log_error("Schema validation failed", e)
    retry_with_clarification()
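
retry_with_clarification is left abstract above. One possible sketch, again assuming the generic llm.generate client, feeds the validation error back to the model a bounded number of times:

def retry_with_clarification(prompt: str, error: ValidationError, max_retries: int = 2) -> CustomerInfo:
    """Re-prompt with the validation error attached, up to max_retries times."""
    for _ in range(max_retries):
        retry_prompt = (
            f"{prompt}\n\nYour previous output failed validation:\n"
            f"{error}\nReturn corrected JSON only."
        )
        try:
            return CustomerInfo.model_validate_json(llm.generate(retry_prompt))
        except ValidationError as e:
            error = e
    raise error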

Benefits:

  • Type safety
  • Validation errors caught early
  • Self-documenting code

Nested and Complex Schemas

Multi-Level Data

class Product(BaseModel):
    name: str
    price: float
    quantity: int

class Order(BaseModel):
    customer_name: str
    customer_email: EmailStr
    products: list[Product]
    total: float
    notes: str | None = None

prompt = f"""
Extract order information.

Input: {order_text}

Output format (JSON):
{{
    "customer_name": "string",
    "customer_email": "email",
    "products": [
        {{"name": "str", "price": "float", "quantity": "int"}}
    ],
    "total": "float",
    "notes": "string or null"
}}

Extract:
"""

Enum Constraints

Problem: Unreliable Classification

# Free-form response
prompt = "Classify this email as urgent, normal, or low priority"
# Might return: "Urgent", "URGENT", "urgent!", "very urgent", "high priority"

Solution: Enum Schema

from enum import Enum
from typing import Literal

class Priority(str, Enum):
    URGENT = "urgent"
    NORMAL = "normal"
    LOW = "low"

class EmailClassification(BaseModel):
    priority: Priority
    category: Literal["support", "sales", "billing"]
    requires_response: bool

prompt = f"""
Classify this email.

Input: {email_text}

Output format (JSON):
{{
    "priority": "urgent" | "normal" | "low",
    "category": "support" | "sales" | "billing",
    "requires_response": true | false
}}

Classify:
"""

Benefits:

  • Only valid values accepted
  • No parsing ambiguity
  • Downstream code doesn’t break

Optional vs Required Fields

class ProductReview(BaseModel):
    rating: int  # Required
    review_text: str  # Required
    reviewer_name: str | None = None  # Optional
    would_recommend: bool = True  # Default value

prompt = f"""
Extract product review.

Input: {review}

Output format (JSON):
{{
    "rating": int (1-5, required),
    "review_text": "string (required)",
    "reviewer_name": "string or null (optional)",
    "would_recommend": bool (default: true)
}}

Extract:
"""

Validation Rules Beyond Types

Field Constraints

from pydantic import Field, field_validator

def is_profane(text: str) -> bool:
    return False  # placeholder: plug in your own content filter

class UserProfile(BaseModel):
    username: str = Field(min_length=3, max_length=20, pattern="^[a-zA-Z0-9_]+$")
    age: int = Field(ge=13, le=120)
    bio: str = Field(max_length=500)

    @field_validator('username')
    @classmethod
    def username_must_not_be_profane(cls, v):
        if is_profane(v):
            raise ValueError('Username contains inappropriate content')
        return v
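
These constraints fire at validation time, so out-of-range model output is rejected before it reaches your code:

from pydantic import ValidationError

try:
    UserProfile(username="ab", age=7, bio="hi")
except ValidationError as e:
    print(e.error_count())  # 2: username too short, age below minimum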

Handling Extraction Failures

Graceful Degradation

class ExtractionResult(BaseModel):
    success: bool
    data: dict | None = None
    error: str | None = None
    confidence: Literal["high", "medium", "low"]

prompt = f"""
Extract structured data from text.

Input: {text}

Output format (JSON):
{{
    "success": true | false,
    "data": {{...}} | null,
    "error": "string if success=false" | null,
    "confidence": "high" | "medium" | "low"
}}

Rules:
- If extraction succeeds, set success=true and populate data
- If text is ambiguous or incomplete, set success=false, error="reason"
- Always provide confidence level

Extract:
"""

result = ExtractionResult.model_validate_json(response)

if result.success:
    process_data(result.data)
elif result.confidence == "low":
    request_human_review(text)
else:
    log_error(result.error)

Array Constraints

class Article(BaseModel):
    title: str
    tags: list[str] = Field(min_length=1, max_length=5)
    authors: list[str] = Field(min_length=1)
    related_articles: list[str] = Field(default_factory=list)

prompt = f"""
Extract article metadata.

Output format (JSON):
{{
    "title": "string",
    "tags": ["string"] (1-5 tags required),
    "authors": ["string"] (at least 1 required),
    "related_articles": ["string"] (optional, can be empty)
}}
"""

Unions and Discriminated Types

class ErrorResponse(BaseModel):
    type: Literal["error"]
    message: str
    code: str

class SuccessResponse(BaseModel):
    type: Literal["success"]
    data: dict

Response = ErrorResponse | SuccessResponse

import json

def parse_response(response_json: str) -> Response:
    data = json.loads(response_json)
    if data["type"] == "error":
        return ErrorResponse.model_validate(data)
    else:
        return SuccessResponse.model_validate(data)
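
pydantic v2 can also perform this dispatch itself via a discriminated union, which yields more precise error messages (a sketch):

from typing import Annotated
from pydantic import Field, TypeAdapter

ResponseAdapter = TypeAdapter(
    Annotated[ErrorResponse | SuccessResponse, Field(discriminator="type")]
)

result = ResponseAdapter.validate_json(response_json)  # returns the matching model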

Schema Evolution

Version Schemas

class OrderV1(BaseModel):
    customer_name: str
    items: list[str]
    total: float

class OrderV2(BaseModel):
    customer_name: str
    customer_email: EmailStr  # New required field
    items: list[dict]  # Now structured
    total: float
    tax: float = 0.0  # New optional field

# Use appropriate schema based on context
if api_version == "v1":
    schema = OrderV1
else:
    schema = OrderV2
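
When a new version adds required fields, old records need an explicit upgrade path. A sketch (the email argument here is hypothetical; supply it from wherever the missing data lives):

def upgrade_order(v1: OrderV1, email: str) -> OrderV2:
    return OrderV2(
        customer_name=v1.customer_name,
        customer_email=email,  # new required field must be supplied
        items=[{"name": name} for name in v1.items],
        total=v1.total,
    )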

Real-World Example: Form Extraction

class Address(BaseModel):
    street: str
    city: str
    state: str = Field(pattern="^[A-Z]{2}$")
    zip_code: str = Field(pattern="^\\d{5}(-\\d{4})?$")

class ContactForm(BaseModel):
    first_name: str = Field(min_length=1)
    last_name: str = Field(min_length=1)
    email: EmailStr
    phone: str = Field(pattern="^\\+?1?\\d{10,}$")
    address: Address
    inquiry_type: Literal["sales", "support", "general"]
    message: str = Field(min_length=10, max_length=1000)

prompt = f"""
Extract contact form information.

Input: {form_text}

Output format (JSON):
{ContactForm.model_json_schema()}

Validation rules:
- first_name, last_name: Non-empty strings
- email: Valid email format
- phone: US format, 10+ digits
- state: 2-letter abbreviation (e.g., "CA")
- zip_code: 5 digits or 5+4 format
- inquiry_type: One of: sales, support, general
- message: 10-1000 characters

Extract:
"""

try:
    form = ContactForm.model_validate_json(llm.generate(prompt))
    # All validation passed, safe to process
    process_form(form)
except ValidationError as e:
    # Send back to LLM with error details for retry
    retry_prompt = f"""
Previous extraction failed validation:
{e}

Please re-extract following all rules exactly.
Input: {form_text}
"""
    form = ContactForm.model_validate_json(llm.generate(retry_prompt))
    process_form(form)

Performance Optimization

Caching Schema Strings

# Don't regenerate schema in every prompt
SCHEMAS = {
    "contact_form": ContactForm.model_json_schema(),
    "order": Order.model_json_schema(),
    "review": Review.model_json_schema()
}

# Reuse cached schemas
prompt = f"""
Input: {data}
Output format: {SCHEMAS['contact_form']}
"""

Streaming + Validation

from pydantic import BaseModel, ValidationError

async def extract_with_streaming(prompt: str, schema: type[BaseModel]):
    buffer = ""
    async for chunk in llm.stream(prompt):
        buffer += chunk

        # Try parsing incrementally; partial JSON fails validation
        try:
            return schema.model_validate_json(buffer)  # valid JSON received
        except ValidationError:
            continue  # keep accumulating

    raise ValueError("Stream ended without valid JSON")
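
Usage, assuming an async llm.stream client that yields text chunks:

import asyncio

form = asyncio.run(extract_with_streaming(prompt, ContactForm))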

Common Mistakes

❌ No validation, just parse JSON

# Dangerous: Assumes JSON is correct
data = json.loads(response)
send_email(data["email"])  # Might not be valid email

❌ Overly complex schemas

# Too nested, LLMs struggle with deep nesting
class Level5Nested(BaseModel):
    a: dict[str, list[dict[str, list[dict]]]]

❌ Not handling validation failures

# Missing try/except means unhandled exceptions
data = Schema.model_validate_json(response)

Best Practices

  1. Define schemas before prompting
  2. Always validate before using
  3. Keep schemas simple (max 3 levels deep)
  4. Provide clear examples in prompts
  5. Log validation failures for debugging
  6. Retry with error feedback when validation fails
  7. Version your schemas

Conclusion

JSON plus schema validation transforms an LLM from an unreliable text generator into a structured data source.

Key benefits:

  • Predictable outputs: Same structure every time
  • Type safety: Downstream code doesn’t break
  • Validation: Catch errors before they propagate
  • Maintainability: Schemas document expected outputs

Free-form text is fine for humans. Production systems need structure.
