Stop Fighting JSON Parsing Errors in Your LLM Apps
How DSPy.rb's new reliability features make JSON extraction from LLMs actually reliable
Vicente Reig
Fractional Engineering Lead
If you’ve built anything with LLMs, you know the pain. You carefully craft a prompt asking for JSON output, the model responds with something that looks like JSON, and then… JSON::ParserError.
Maybe it wrapped the JSON in markdown code blocks. Maybe it added a helpful explanation before the actual data. Maybe it just forgot a comma. Whatever the reason, you’re now debugging string manipulation instead of building features.
DSPy.rb just shipped reliability features that make this problem (mostly) go away.
The Problem We’re Solving
Here’s what typically happens when you need structured data from an LLM:
response = lm.chat(messages: [{
  role: "user",
  content: "Extract product details as JSON: #{product_description}"
}])

# This works... sometimes
data = JSON.parse(response.content) # 💥 JSON::ParserError
You end up writing defensive code like this:
# Please no more of this
json_match = response.content.match(/```json\n(.*?)\n```/m) ||
             response.content.match(/(\{.*\})/m)
data = json_match && (JSON.parse(json_match[1]) rescue nil)
The Solution: Provider-Optimized Strategies
DSPy.rb now automatically selects the best JSON extraction strategy based on your LLM provider and model. No configuration needed - it just works.
For OpenAI and Gemini Users
If you’re using OpenAI’s GPT-4/GPT-4o or Google’s Gemini models, DSPy.rb automatically uses native structured outputs:
# OpenAI structured outputs
lm = DSPy::LM.new("openai/gpt-4o-mini",
                  api_key: ENV["OPENAI_API_KEY"],
                  structured_outputs: true)

# Gemini structured outputs (gemini-1.5-pro, gemini-1.5-flash, gemini-2.0-flash-exp)
lm = DSPy::LM.new("gemini/gemini-1.5-flash",
                  api_key: ENV["GEMINI_API_KEY"],
                  structured_outputs: true)
class ProductExtractor < DSPy::Signature
  input do
    const :description, String
  end

  output do
    const :name, String
    const :price, Float
    const :in_stock, T::Boolean
  end
end

# This now returns guaranteed valid JSON
predict = DSPy::Predict.new(ProductExtractor)
result = predict.forward(description: "iPhone 15 Pro - $999, available now")
# => { name: "iPhone 15 Pro", price: 999.0, in_stock: true }
No more parsing errors. Both OpenAI and Gemini guarantee valid JSON when using their native structured output features.
For Anthropic Users
Claude users get the battle-tested 4-pattern extraction that handles Claude’s various response formats:
lm = DSPy::LM.new("anthropic/claude-3-haiku-20240307",
                  api_key: ENV["ANTHROPIC_API_KEY"])

# Same code, optimized extraction for Claude
predict = DSPy::Predict.new(ProductExtractor)
result = predict.forward(description: "MacBook Air M3 - $1199")
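The exact patterns are internal to DSPy.rb, but the general idea of multi-pattern extraction is easy to sketch: try progressively looser patterns until one yields parseable JSON. The following is an illustration of the technique, not the library's actual code:

require "json"

# Illustrative sketch of multi-pattern extraction (not DSPy.rb's internals)
EXTRACTION_PATTERNS = [
  /```json\s*\n(.*?)\n\s*```/m, # JSON inside a ```json fence
  /```\s*\n(.*?)\n\s*```/m,     # JSON inside a generic fence
  /(\{.*\})/m,                  # first {...} span in free text
  /(\[.*\])/m                   # or a top-level JSON array
].freeze

def extract_json(content)
  EXTRACTION_PATTERNS.each do |pattern|
    match = content.match(pattern) or next
    begin
      return JSON.parse(match[1])
    rescue JSON::ParserError
      next # looked like JSON but wasn't; try the next pattern
    end
  end
  nil
end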
For Everything Else
Models without special support get enhanced prompting that explicitly asks for clean JSON and tries multiple extraction patterns:
# Works with any model
lm = DSPy::LM.new("ollama/llama2", base_url: "http://localhost:11434")
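In practice, "enhanced prompting" means appending an explicit instruction to the prompt. The exact template is DSPy.rb's own; something along these lines conveys the idea (hypothetical wording, for illustration only):

# Hypothetical instruction text, not DSPy.rb's actual template
JSON_INSTRUCTION = <<~PROMPT
  Respond with ONLY a JSON object matching this schema. Do not wrap it
  in markdown fences and do not add any text before or after it:
  {"name": "string", "price": "number", "in_stock": "boolean"}
PROMPT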
Reliability Features That Actually Matter
Automatic Retries with Fallback
Sometimes things fail. Networks hiccup. Models have bad days. DSPy.rb now retries intelligently:
- First attempt with the optimal strategy
- Retry with exponential backoff if parsing fails
- Fallback to the next best strategy if retries exhausted
- Progressive degradation through all available strategies
This happens automatically. You don’t need to configure anything.
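As a conceptual sketch of that flow (the strategy objects and retry constant here are hypothetical, not DSPy.rb's API):

MAX_RETRIES = 3 # hypothetical constant, for illustration

def extract_with_fallback(strategies, response)
  strategies.each do |strategy|
    MAX_RETRIES.times do |attempt|
      begin
        return strategy.call(response)
      rescue JSON::ParserError
        sleep(2**attempt * 0.1) # exponential backoff: 0.1s, 0.2s, 0.4s
      end
    end
    # retries exhausted; degrade to the next strategy
  end
  raise "All extraction strategies failed"
end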
Smart Caching
Schema conversion and capability detection are now cached:
- Schema caching: OpenAI schemas cached for 1 hour
- Capability caching: Model capabilities cached for 24 hours
- Thread-safe: Works correctly in multi-threaded apps
In practice, repeat requests skip schema conversion and capability detection entirely, so everything after the first call is faster.
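A thread-safe TTL cache of this kind is simple to picture. This sketch uses Ruby's Monitor for locking; the class and cache instances are illustrative, not DSPy.rb's internals:

require "monitor"

# Minimal thread-safe cache with per-entry expiry (illustrative only)
class TTLCache
  def initialize(ttl_seconds)
    @ttl = ttl_seconds
    @store = {}
    @lock = Monitor.new
  end

  def fetch(key)
    @lock.synchronize do
      entry = @store[key]
      return entry[:value] if entry && Time.now - entry[:stored_at] < @ttl
      value = yield # recompute on miss or expiry
      @store[key] = { value: value, stored_at: Time.now }
      value
    end
  end
end

schema_cache = TTLCache.new(3600)       # schemas: 1 hour
capability_cache = TTLCache.new(86_400) # capabilities: 24 hours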
Better Error Messages
When things do go wrong, you get useful errors:
Failed to parse LLM response as JSON: unexpected token.
Original content length: 156 chars
Not just “invalid JSON” - you get context to actually debug the issue.
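Surfacing these errors in your own logs needs nothing DSPy-specific; a plain rescue works (reusing predict from the earlier example; input and logger are assumed):

begin
  result = predict.forward(description: input)
rescue => e
  # The message includes the parse failure and original content length
  logger.error("Extraction failed: #{e.message}")
  raise
end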
Configuration Per Provider
Enable native structured outputs for maximum reliability:
DSPy.configure do |config|
  # OpenAI: Native structured outputs (highly recommended)
  config.lm = DSPy::LM.new(
    "openai/gpt-4o-mini",
    api_key: ENV["OPENAI_API_KEY"],
    structured_outputs: true
  )

  # Gemini: Native structured outputs (highly recommended)
  # config.lm = DSPy::LM.new(
  #   "gemini/gemini-2.5-flash",
  #   api_key: ENV["GEMINI_API_KEY"],
  #   structured_outputs: true
  # )

  # Anthropic: Tool-based extraction (default, recommended)
  # config.lm = DSPy::LM.new(
  #   "anthropic/claude-sonnet-4-5-20250929",
  #   api_key: ENV["ANTHROPIC_API_KEY"],
  #   structured_outputs: true # Default
  # )

  # Anthropic: Enhanced prompting (alternative)
  # config.lm = DSPy::LM.new(
  #   "anthropic/claude-sonnet-4-5-20250929",
  #   api_key: ENV["ANTHROPIC_API_KEY"],
  #   structured_outputs: false
  # )
end
Real Performance Impact
In our testing with production workloads:
- OpenAI + structured outputs: Near-zero JSON parsing errors
- Gemini + structured outputs: Near-zero JSON parsing errors
- Anthropic tool extraction: <0.1% errors
- Enhanced prompting fallback: ~0.5% errors
The robust extraction strategies handle edge cases gracefully, providing reliable parsing across all providers.
Migration is Seamless
If you’re already using DSPy.rb, you get these improvements automatically. Your existing code continues to work, just more reliably:
# Existing code - no changes needed
class SentimentAnalysis < DSPy::Signature
  input do
    const :text, String
  end

  output do
    const :sentiment, String
    const :confidence, Float
  end
end

# This is now more reliable
analyzer = DSPy::Predict.new(SentimentAnalysis)
result = analyzer.forward(text: "This library is amazing!")
What’s Next
This is part of our broader push to make DSPy.rb the most reliable way to build LLM applications in Ruby. We’re focusing on:
- Streaming support for real-time applications
- Batch processing optimizations
- Provider-specific optimizations for Gemini, Cohere, and others
Try It Now
gem install dspy
Or in your Gemfile:
gem 'dspy', '~> 0.9.0'
Check out the documentation for more examples, or dive into the reliability features guide for advanced usage.
Building something cool with DSPy.rb? I’d love to hear about it - @vicentereig