Observability
DSPy.rb provides a simple, lightweight observability system based on structured logging and span tracking. The system is designed to be OTEL/Langfuse compatible while maintaining minimal overhead.
Overview
The observability system offers:
- Span Tracking: Trace operations with parent-child relationships
- Structured Logging: JSON or key=value format based on environment
- Context Propagation: Automatic trace correlation across operations
- GenAI Conventions: Following OpenTelemetry semantic conventions for LLMs
- Zero Dependencies: No external instrumentation libraries required
- Thread-Safe: Isolated context per thread
Architecture
DSPy.rb uses a simple Context system for observability:
# lib/dspy/context.rb - ~50 lines total
module DSPy
class Context
# Thread-local storage for trace context
# Manages span stack and trace IDs
end
end
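The exact internals are not shown here, but the pattern — a thread-local trace id plus a stack of span ids — can be sketched in a few lines. This is illustrative only; MiniContext is a hypothetical stand-in, not DSPy.rb's actual Context class:

```ruby
require 'securerandom'

# Illustrative sketch of a thread-local span stack; MiniContext is a
# hypothetical stand-in, not DSPy.rb's actual implementation.
module MiniContext
  def self.current
    # Each thread lazily gets its own trace context
    Thread.current[:mini_ctx] ||= { trace_id: SecureRandom.uuid, span_stack: [] }
  end

  def self.with_span(operation:, **attributes)
    # attributes would be attached to the emitted span in a real system
    span_id = SecureRandom.uuid
    current[:span_stack].push(span_id)
    yield
  ensure
    current[:span_stack].pop
  end
end

MiniContext.with_span(operation: 'outer') do
  MiniContext.with_span(operation: 'inner') do
    MiniContext.current[:span_stack].size # => 2 (inner span nested under outer)
  end
end
```

Because the stack lives in `Thread.current`, each thread sees its own trace, which is how the real Context stays thread-safe without locks.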
Basic Configuration
Enable Logging
DSPy.configure do |config|
# Configure logger (uses Dry::Logger)
config.logger = Dry.Logger(:dspy)
# Or with custom configuration
config.logger = Dry.Logger(:dspy, formatter: :json) do |logger|
logger.add_backend(stream: "log/production.log")
end
end
Environment-Aware Formatting
The logger automatically detects the environment:
- Production (RAILS_ENV=production or RACK_ENV=production): JSON format
- Development/Test: key=value format for readability
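The detection amounts to checking the standard environment variables. A minimal sketch (production_env? is a hypothetical helper, not part of DSPy.rb's API):

```ruby
# Hypothetical helper mirroring the environment detection described above.
def production_env?(env = ENV)
  env['RAILS_ENV'] == 'production' || env['RACK_ENV'] == 'production'
end

production_env?('RACK_ENV' => 'production')   # => true
production_env?('RAILS_ENV' => 'development') # => false
```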
Using the Context System
Basic Span Tracking
# Wrap any operation in a span
DSPy::Context.with_span(
operation: 'my_operation',
'my.attribute' => 'value'
) do
# Your code here
result = perform_work()
# Log events within the span
DSPy.log('my.event', result: result, status: 'success')
result
end
Nested Spans
Spans automatically track parent-child relationships:
DSPy::Context.with_span(operation: 'parent_operation') do
# Parent span
DSPy::Context.with_span(operation: 'child_operation') do
# Child span - automatically linked to parent
end
end
Accessing Current Context
# Get current trace context
context = DSPy::Context.current
# => { trace_id: "uuid", span_stack: ["parent_id", "current_id"] }
# Check if in a span
if DSPy::Context.current[:span_stack].any?
# Currently within a span
end
Semantic Conventions
DSPy.rb follows GenAI semantic conventions for LLM operations:
LLM Operations
# Automatically added by DSPy::LM
DSPy::Context.with_span(
operation: 'llm.generate',
'gen_ai.system' => 'openai',
'gen_ai.request.model' => 'gpt-4',
'gen_ai.usage.prompt_tokens' => 150,
'gen_ai.usage.completion_tokens' => 50
) do
# LLM call
end
Module Operations
# Automatically added by DSPy modules
DSPy::Context.with_span(
operation: 'dspy.predict',
'dspy.module' => 'ChainOfThought',
'dspy.signature' => 'QuestionAnswering'
) do
# Module execution
end
Reading Logs
JSON Format (Production)
{
"timestamp": "2024-01-15T10:30:45.123Z",
"level": "INFO",
"event": "llm.generate",
"trace_id": "123e4567-e89b-12d3-a456-426614174000",
"span_id": "987fcdeb-51a2-43f1-9012-345678901234",
"parent_span_id": "abcdef12-3456-7890-abcd-ef1234567890",
"gen_ai.system": "openai",
"gen_ai.request.model": "gpt-4",
"duration_ms": 1250
}
Key=Value Format (Development)
timestamp=2024-01-15T10:30:45.123Z level=INFO event=llm.generate trace_id=123e4567 span_id=987fcdeb parent_span_id=abcdef12 gen_ai.system=openai gen_ai.request.model=gpt-4 duration_ms=1250
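The key=value lines are easy to parse back into a hash for ad-hoc debugging. A simple sketch that assumes values contain no spaces:

```ruby
# Split a key=value log line into a Hash (assumes values contain no spaces).
line = 'timestamp=2024-01-15T10:30:45.123Z level=INFO event=llm.generate duration_ms=1250'
fields = line.split.map { |pair| pair.split('=', 2) }.to_h

fields['event']       # => "llm.generate"
fields['duration_ms'] # => "1250" (values come back as strings)
```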
Integration with External Systems
Exporting to OpenTelemetry
Since DSPy.rb logs follow OTEL conventions, you can use log forwarding:
# Use a log forwarder to send JSON logs to OTEL collector
# Configure your infrastructure to forward logs from:
DSPy.configure do |config|
config.logger = Dry.Logger(:dspy, formatter: :json) do |logger|
logger.add_backend(stream: "/var/log/dspy/traces.json")
end
end
Langfuse Integration
Langfuse can ingest the JSON logs directly:
# Forward logs to Langfuse using their log ingestion API
tail -f log/production.log | \
jq -c 'select(.event | startswith("llm"))' | \
curl -X POST https://api.langfuse.com/logs \
-H "Authorization: Bearer $LANGFUSE_API_KEY" \
-H "Content-Type: application/json" \
-d @-
Custom Processing
Process logs with your preferred tools:
# Parse and analyze logs
File.foreach("log/production.log") do |line|
event = JSON.parse(line)
if event["event"] == "llm.generate"
# Track LLM usage
tokens = event["gen_ai.usage.total_tokens"]
model = event["gen_ai.request.model"]
# ... your analytics
end
end
Common Patterns
Adding Custom Attributes
class MyModule < DSPy::Module
def forward(**inputs)
DSPy::Context.with_span(
operation: 'my_module.process',
'module.version' => '1.0',
'module.customer_id' => customer_id
) do
# Your logic
result = process(inputs)
# Log custom metrics
DSPy.log('my_module.complete',
items_processed: result.count,
processing_time: elapsed_time
)
result
end
end
end
Error Tracking
DSPy::Context.with_span(operation: 'risky_operation') do
begin
perform_operation()
rescue => e
DSPy.log('error.occurred',
error_class: e.class.name,
error_message: e.message,
error_backtrace: e.backtrace.first(5)
)
raise
end
end
Performance Monitoring
start_time = Process.clock_gettime(Process::CLOCK_MONOTONIC)
DSPy::Context.with_span(operation: 'batch_process') do
results = process_batch(items)
duration = Process.clock_gettime(Process::CLOCK_MONOTONIC) - start_time
DSPy.log('batch.complete',
duration_ms: (duration * 1000).round(2),
items_count: items.size,
success_rate: results.count(&:success?) / items.size.to_f
)
results
end
Migration from Old Instrumentation
If you were using the old instrumentation system:
Before (Old System)
# Complex configuration
DSPy.configure do |config|
config.instrumentation.enabled = true
config.instrumentation.subscribers = [:logger, :otel]
config.instrumentation.sampling_rate = 0.1
end
# Event emission
DSPy::Instrumentation.instrument('my.event', payload) do
work()
end
After (New System)
# Simple configuration
DSPy.configure do |config|
config.logger = Dry.Logger(:dspy)
end
# Span tracking
DSPy::Context.with_span(operation: 'my.event', **payload) do
work()
end
Best Practices
- Use Semantic Names: Follow dot notation for operations (e.g., user.signup, llm.generate)
- Keep Attributes Flat: Avoid deeply nested attribute structures
- Limit Attribute Size: Don't log large payloads as span attributes
- Use Consistent Keys: Maintain consistent attribute naming across your application
- Sample in Production: For high-volume applications, implement sampling:
# Simple sampling
if rand < 0.1 # 10% sampling
  DSPy::Context.with_span(operation: 'sampled_op') do
    # ...
  end
else
  # Execute without span tracking
end
Troubleshooting
No Logs Appearing
Check that the logger is configured:
DSPy.config.logger # Should not be nil
Missing Trace Correlation
Ensure operations are wrapped in spans:
# Correct - with span
DSPy::Context.with_span(operation: 'my_op') do
DSPy.log('my.event', data: 'value')
end
# Incorrect - no span context
DSPy.log('my.event', data: 'value') # Will log but no trace_id
Thread Safety Issues
Context is thread-local, so each thread has isolated context:
Thread.new do
# This thread has its own context
DSPy::Context.with_span(operation: 'thread_op') do
# ...
end
end
Langfuse Integration (Zero Configuration)
DSPy.rb includes zero-config Langfuse integration via OpenTelemetry. Simply set your Langfuse environment variables and DSPy will automatically export spans to Langfuse alongside the normal logging.
Setup
# Required environment variables
export LANGFUSE_PUBLIC_KEY=pk-lf-your-public-key
export LANGFUSE_SECRET_KEY=sk-lf-your-secret-key
# Optional: specify host (defaults to cloud.langfuse.com)
export LANGFUSE_HOST=https://cloud.langfuse.com # or https://us.cloud.langfuse.com
How It Works
When Langfuse environment variables are detected, DSPy automatically:
- Configures OpenTelemetry SDK with OTLP exporter
- Creates dual output: Both structured logs AND OpenTelemetry spans
- Exports to Langfuse using proper authentication and endpoints
- Falls back gracefully if OpenTelemetry gems are missing or configuration fails
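The decision logic can be pictured as follows. observability_mode is a hypothetical illustration of the behavior described above, not DSPy.rb's actual code:

```ruby
# Hypothetical illustration of auto-detection with graceful fallback.
def observability_mode(env = ENV)
  unless env['LANGFUSE_PUBLIC_KEY'] && env['LANGFUSE_SECRET_KEY']
    return :logging_only # no Langfuse credentials: logging-only mode
  end

  begin
    require 'opentelemetry/sdk'           # raises LoadError if the gem is missing
    require 'opentelemetry/exporter/otlp'
    :logs_plus_otel_export                # dual output: logs AND spans
  rescue LoadError
    :logging_only                         # gems unavailable: fall back gracefully
  end
end

observability_mode({}) # => :logging_only (no env vars set)
```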
Example Output
With Langfuse configured, your DSPy applications will send traces like this:
In your logs (as usual):
{
"severity": "INFO",
"time": "2025-08-08T22:02:57Z",
"trace_id": "abc-123-def",
"span_id": "span-456",
"parent_span_id": "span-789",
"operation": "ChainOfThought.forward",
"dspy.module": "ChainOfThought",
"event": "span.start"
}
In Langfuse (automatically):
Trace: abc-123-def
├─ ChainOfThought.forward [2000ms]
│ ├─ Module: ChainOfThought
│ └─ llm.generate [1000ms]
│ ├─ Model: gpt-4-0613
│ ├─ Temperature: 0.7
│ ├─ Tokens: 100 in / 50 out / 150 total
│ └─ Cost: $0.0021 (calculated by Langfuse)
GenAI Semantic Conventions
DSPy automatically includes OpenTelemetry GenAI semantic conventions:
# LLM operations automatically include:
{
"gen_ai.system": "openai",
"gen_ai.request.model": "gpt-4",
"gen_ai.response.model": "gpt-4-0613",
"gen_ai.usage.prompt_tokens": 100,
"gen_ai.usage.completion_tokens": 50,
"gen_ai.usage.total_tokens": 150
}
Manual Configuration (Advanced)
For custom OpenTelemetry setups, you can disable auto-configuration and set up manually:
# Disable auto-config by not setting Langfuse env vars
# Then configure OpenTelemetry yourself:
require 'opentelemetry/sdk'
require 'opentelemetry/exporter/otlp'
OpenTelemetry::SDK.configure do |config|
config.service_name = 'my-dspy-app'
# Your custom configuration
end
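To actually export spans from a manual setup you also need to register a span processor. A sketch using the opentelemetry-ruby SDK's documented span-processor API; the endpoint shown is a placeholder for your own collector:

```ruby
require 'opentelemetry/sdk'
require 'opentelemetry/exporter/otlp'

OpenTelemetry::SDK.configure do |c|
  c.service_name = 'my-dspy-app'
  # Batch spans and ship them to an OTLP collector (placeholder endpoint)
  c.add_span_processor(
    OpenTelemetry::SDK::Trace::Export::BatchSpanProcessor.new(
      OpenTelemetry::Exporter::OTLP::Exporter.new(
        endpoint: 'http://localhost:4318/v1/traces'
      )
    )
  )
end
```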
Dependencies
The Langfuse integration requires these gems (automatically included):
- opentelemetry-sdk (~> 1.8)
- opentelemetry-exporter-otlp (~> 0.30)
If these gems are not available, DSPy gracefully falls back to logging-only mode.
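If you manage dependencies explicitly, the pinned versions above translate to a Gemfile entry like:

```ruby
# Gemfile (versions from the list above)
gem 'opentelemetry-sdk', '~> 1.8'
gem 'opentelemetry-exporter-otlp', '~> 0.30'
```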
Troubleshooting Langfuse Integration
Spans not appearing in Langfuse:
- Verify environment variables are set correctly
- Check Langfuse host/region (EU vs US)
- Ensure network connectivity to Langfuse endpoints
OpenTelemetry errors:
- Check that required gems are installed: bundle install
- Look for observability error logs: grep "observability.error" log/production.log
Authentication issues:
- Verify your public and secret keys are correct
- Check that keys have proper permissions in Langfuse dashboard
Summary
The observability system in DSPy.rb provides three modes of operation:
- Logging Only (default): Structured logs with span tracking
- Langfuse Integration (zero-config): Logs + automatic OpenTelemetry export
- Custom OTEL (advanced): Full control over OpenTelemetry configuration
Key benefits:
- Zero-config Langfuse: Just set env vars and it works
- Non-invasive: Logging still works without Langfuse
- Standards-compliant: Uses OpenTelemetry GenAI semantic conventions
- Graceful degradation: Falls back to logging if anything fails
- Minimal overhead: ~100 lines of observability code total
For most applications, the zero-config Langfuse integration provides production-ready observability with minimal setup.