Signatures
Signatures define the interface between your application and language models. They specify inputs, outputs, and task descriptions using Sorbet types for basic type safety.
Basic Signature Structure
class TaskSignature < DSPy::Signature
description "Clear description of what this signature accomplishes"
input do
const :field_name, String
end
output do
const :result_field, String
end
end
Input Definition
Supported Types
class BasicClassifier < DSPy::Signature
description "Classify text into categories"
input do
const :text, String # Required string
const :context, T.nilable(String) # Optional string
const :max_length, Integer # Required integer
const :include_score, T::Boolean # Boolean
const :created_date, Date # Date (ISO 8601 format)
const :updated_at, DateTime # DateTime with timezone
const :processed_time, Time # Time (converted to UTC)
const :tags, T::Array[String] # Array of strings
const :metadata, T::Hash[String, String] # Hash with string keys/values
end
end
Date and Time Types
DSPy.rb provides comprehensive support for date and time types with automatic serialization/deserialization:
class EventScheduler < DSPy::Signature
description "Schedule events based on requirements"
input do
const :start_date, Date # Required date
const :end_date, T.nilable(Date) # Optional date
const :preferred_time, DateTime # DateTime with timezone
const :deadline, Time # Time (stored as UTC)
end
output do
const :scheduled_date, Date # LLM returns ISO 8601 date string, auto-converted
const :event_datetime, DateTime # LLM returns ISO datetime, preserves timezone
const :created_at, Time # LLM returns time string, converted to UTC
end
end
Date/Time Format Handling
- Date: Serialized as ISO 8601 format (YYYY-MM-DD)
- DateTime: Serialized as ISO 8601 with timezone (YYYY-MM-DDTHH:MM:SS+00:00)
- Time: Serialized as ISO 8601, automatically converted to UTC for consistency
predictor = DSPy::Predict.new(EventScheduler)
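result = predictor.call(
start_date: "2024-01-15", # String input, converted to Date
preferred_time: "2024-01-15T10:30:45Z", # String input, converted to DateTime
deadline: Time.utc(2024, 1, 20, 17, 0, 0) # Required field; end_date is nilable and may be omitted
)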
puts result.scheduled_date.class # => Date
puts result.event_datetime.class # => DateTime
Timezone Considerations
Following ActiveRecord conventions:
- Time objects are automatically converted to UTC for consistent storage
- DateTime objects preserve timezone information
- Date objects are timezone-agnostic
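The conventions above can be illustrated with Ruby's standard library alone. This is a minimal sketch of the formats involved, not DSPy.rb's internal conversion code; the sample values are made up for illustration.

```ruby
require "date"
require "time" # provides Time.iso8601 and Time#iso8601

# Date: timezone-agnostic, serialized as YYYY-MM-DD
date = Date.iso8601("2024-01-15")
date_string = date.iso8601

# DateTime: the original UTC offset survives a round trip
datetime = DateTime.iso8601("2024-01-15T10:30:45+02:00")
datetime_string = datetime.iso8601

# Time: normalized to UTC before it is serialized
time = Time.iso8601("2024-01-15T10:30:45+02:00").utc
time_string = time.iso8601

puts date_string     # 2024-01-15
puts datetime_string # 2024-01-15T10:30:45+02:00
puts time_string     # 2024-01-15T08:30:45Z
```

Note how the same instant is printed with its original offset as a DateTime but shifted to UTC as a Time.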
Output Definition
Using Enums for Controlled Outputs
class SentimentAnalysis < DSPy::Signature
description "Analyze sentiment of text"
class Sentiment < T::Enum
enums do
Positive = new('positive')
Negative = new('negative')
Neutral = new('neutral')
end
end
input do
const :text, String
end
output do
const :sentiment, Sentiment
const :score, Float
const :reasoning, T.nilable(String)
end
end
Using Structs for Structured Outputs
class EntityExtraction < DSPy::Signature
description "Extract entities from text"
class EntityType < T::Enum
enums do
Person = new('person')
Organization = new('organization')
Location = new('location')
end
end
class Entity < T::Struct
const :name, String
const :type, EntityType
const :confidence, Float
end
input do
const :text, String
end
output do
const :entities, T::Array[Entity]
const :total_found, Integer
end
end
Union Types
You can use T.any() to specify fields that can accept multiple types:
class FlexibleExtraction < DSPy::Signature
description "Extract data that could be in different formats"
input do
const :text, String
end
output do
# Value can be numeric or categorical
const :result, T.any(Float, String)
const :confidence, Float
end
end
For more complex union types with structs and automatic type conversion, see the Union Types section in Rich Types.
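Code that consumes a union-typed field usually has to branch on the runtime class. The sketch below shows one way to do that for a value typed as T.any(Float, String); describe_result is a hypothetical helper for illustration and is not part of DSPy.rb.

```ruby
# Hypothetical helper (not part of DSPy.rb): turn a value typed as
# T.any(Float, String) into a human-readable label.
def describe_result(value)
  case value
  when Float
    format("numeric result: %.2f", value)
  when String
    "categorical result: #{value}"
  else
    # Anything else violates the declared union type
    raise ArgumentError, "unexpected type: #{value.class}"
  end
end

puts describe_result(0.9)    # numeric result: 0.90
puts describe_result("high") # categorical result: high
```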
Optional Fields
class ContentGeneration < DSPy::Signature
description "Generate content with configurable parameters"
input do
const :topic, String
const :style, T.nilable(String) # Optional field
const :max_words, Integer
end
output do
const :content, String
const :word_count, Integer
const :estimated_time, T.nilable(Float) # May not always be provided
end
end
Default Values (New in v0.7.0)
Default values make your signatures more flexible and handle missing LLM responses gracefully:
class SmartSearch < DSPy::Signature
description "Search with intelligent defaults"
input do
const :query, String
const :max_results, Integer, default: 10
const :language, String, default: "English"
const :include_metadata, T::Boolean, default: false
end
output do
const :results, T::Array[String]
const :total_found, Integer
const :search_time_ms, Float, default: 0.0
const :cached, T::Boolean, default: false
end
end
# Usage - input defaults reduce boilerplate
search = DSPy::Predict.new(SmartSearch)
# Only need to provide required fields
result = search.call(query: "Ruby programming")
# max_results=10, language="English", include_metadata=false are used
# Output defaults handle missing LLM responses
# If LLM doesn't return search_time_ms or cached, defaults are applied
How Default Values Work
- Input Defaults: Applied when creating the input struct
  - Reduce boilerplate in your code
  - Make APIs more user-friendly
- Output Defaults: Applied when LLM response is missing fields
  - Improve robustness when LLMs omit optional fields
  - Prevent errors from incomplete responses
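The output-default behavior can be pictured as a hash merge in which values from the response take precedence. The following is a plain-Ruby sketch of that idea, assuming a parsed response hash with symbol keys; OUTPUT_DEFAULTS and apply_output_defaults are hypothetical names and do not reflect how DSPy.rb implements defaults internally.

```ruby
# Hypothetical defaults table mirroring the SmartSearch output fields.
OUTPUT_DEFAULTS = { search_time_ms: 0.0, cached: false }.freeze

# Fill in any field the response omitted; fields that are present win.
def apply_output_defaults(response, defaults = OUTPUT_DEFAULTS)
  defaults.merge(response)
end

# A response in which the LLM left out search_time_ms and cached
partial = { results: ["Ruby docs"], total_found: 1 }
complete = apply_output_defaults(partial)

puts complete[:search_time_ms] # 0.0
puts complete[:cached]         # false
puts complete[:total_found]    # 1
```

Input defaults follow the same pattern in the other direction: values the caller passes override the declared defaults.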
Practical Examples
Email Classification
class EmailClassifier < DSPy::Signature
description "Classify emails by category and priority"
class Priority < T::Enum
enums do
Low = new('low')
Medium = new('medium')
High = new('high')
Urgent = new('urgent')
end
end
input do
const :email_content, String
const :sender, String
end
output do
const :category, String
const :priority, Priority
const :confidence, Float
const :requires_action, T::Boolean
end
end
Product Review Analysis
class ProductReview < DSPy::Signature
description "Analyze product reviews and extract ratings"
input do
const :review_text, String
const :product_category, String
end
output do
const :rating, Integer
const :summary, String
const :key_points, T::Array[String]
end
end
Schema Formats
DSPy.rb supports two schema formats for communicating with language models: JSON Schema (default) and BAML Schema. The schema format controls how DSPy describes your signature’s structure to the LLM.
JSON Schema (Default)
Signatures automatically generate JSON schemas for language model integration:
class TextClassifier < DSPy::Signature
description "Classify text documents"
class Category < T::Enum
enums do
Technical = new('technical')
Business = new('business')
Personal = new('personal')
end
end
input do
const :text, String
const :length_limit, Integer
end
output do
const :category, Category
const :confidence, Float
const :keywords, T::Array[String]
end
end
# Access generated schemas
TextClassifier.input_json_schema # Returns JSON schema for inputs
TextClassifier.output_json_schema # Returns JSON schema for outputs
BAML Schema Format (New in v0.28.2)
BAML (Basically A Markup Language) is a compact schema format that reduces token usage by 84%+ compared to JSON Schema in Enhanced Prompting mode (structured_outputs: false). It provides a more readable, concise representation of your data structures.
Configure BAML schema format:
# Option 1: Configure globally via LM
DSPy.configure do |c|
c.lm = DSPy::LM.new(
'openai/gpt-4o-mini',
api_key: ENV['OPENAI_API_KEY'],
schema_format: :baml # Use BAML format for all signatures
)
end
# Option 2: Per-signature override via Prompt
prompt = DSPy::Prompt.from_signature(YourSignature, schema_format: :baml)
Schema Format Comparison:
For the TextClassifier signature above:
JSON Schema (verbose):
{
"$schema": "http://json-schema.org/draft-06/schema#",
"type": "object",
"properties": {
"category": {
"type": "string",
"enum": ["technical", "business", "personal"]
},
"confidence": {
"type": "number"
},
"keywords": {
"type": "array",
"items": {"type": "string"}
}
},
"required": ["category", "confidence", "keywords"]
}
BAML Schema (compact):
class TextClassifierOutput {
category string
confidence float
keywords string[]
}
Token Savings: 84.4% (verified by integration tests)
For rich signatures with nested types, BAML saves 80-85% of prompt tokens used for schema definitions in Enhanced Prompting mode.
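To get a rough feel for the size difference, the two representations of the TextClassifier output can be compared by character count. This is only a sketch: character counts are a crude proxy for tokens, so the printed percentage will not match the token-based figure quoted above, and the exact whitespace of both schemas is an assumption.

```ruby
require "json"

# JSON Schema for the TextClassifier output, as listed in the comparison
json_schema = {
  "$schema" => "http://json-schema.org/draft-06/schema#",
  "type" => "object",
  "properties" => {
    "category" => { "type" => "string", "enum" => %w[technical business personal] },
    "confidence" => { "type" => "number" },
    "keywords" => { "type" => "array", "items" => { "type" => "string" } }
  },
  "required" => %w[category confidence keywords]
}

# BAML rendering of the same structure
baml_schema = <<~BAML
  class TextClassifierOutput {
    category string
    confidence float
    keywords string[]
  }
BAML

json_chars = JSON.pretty_generate(json_schema).length
baml_chars = baml_schema.length
savings = (1 - baml_chars.to_f / json_chars) * 100

puts "JSON Schema: #{json_chars} chars, BAML: #{baml_chars} chars"
puts format("Character savings: %.1f%%", savings)
```

For real measurements, count tokens with the tokenizer of the model you are targeting.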
Note: BAML format applies only to Enhanced Prompting mode (structured_outputs: false). When using Structured Outputs mode (structured_outputs: true), OpenAI’s native API receives the JSON Schema directly and BAML format has no effect.
When to Use BAML:
- Complex signatures with many fields
- Nested structs and arrays
- Cost-sensitive applications
- High-volume LLM API usage
When to Use JSON Schema:
- Simple signatures (1-3 fields)
- When LLM provider specifically requires JSON Schema
- Legacy compatibility requirements
Requirements:
BAML format requires the sorbet-baml gem:
# Gemfile
gem 'sorbet-baml'
The gem is automatically included as a dependency of dspy-rb.
Usage with Predictors
# Use signature with a predictor
classifier = DSPy::Predict.new(TextClassifier)
# Call with input matching the signature
result = classifier.call(
text: "This is a technical document about APIs",
length_limit: 1000
)
# Access typed outputs (automatically converted from JSON)
puts result.category # => TextClassifier::Category::Technical (not a string!)
puts result.category.serialize # => "technical"
puts result.confidence # => e.g. 0.85 (actual value depends on the LLM response)
puts result.keywords # => e.g. ["APIs", "technical", "document"]
Automatic Type Conversion (v0.9.0+)
DSPy automatically converts LLM JSON responses to the proper Ruby types:
- Enums: Strings are converted to T::Enum instances
- Structs: Nested hashes become T::Struct objects
- Arrays: Elements are converted recursively
- Defaults: Missing fields use their default values
See Rich Types for detailed information.
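The conversion rules listed above can be sketched in plain Ruby. In this illustration a Struct stands in for T::Struct and a frozen array of allowed strings stands in for T::Enum; build_entity, convert_response, and the fallback used for total_found are hypothetical and are not DSPy.rb's actual implementation.

```ruby
require "json"

# Plain-Ruby stand-ins (assumptions for illustration only)
ENTITY_TYPES = %w[person organization location].freeze
Entity = Struct.new(:name, :type, :confidence, keyword_init: true)

# Convert one parsed JSON hash into an Entity, validating the enum value.
def build_entity(hash)
  type = hash.fetch("type")
  raise ArgumentError, "unknown entity type: #{type}" unless ENTITY_TYPES.include?(type)

  Entity.new(
    name: hash.fetch("name"),
    type: type.to_sym,
    confidence: hash.fetch("confidence").to_f
  )
end

# Convert a whole response: arrays element by element, missing fields via a fallback.
def convert_response(json)
  data = JSON.parse(json)
  entities = data.fetch("entities", []).map { |item| build_entity(item) }
  { entities: entities, total_found: data.fetch("total_found", entities.length) }
end

raw = '{"entities": [{"name": "Ada Lovelace", "type": "person", "confidence": 0.97}]}'
result = convert_response(raw)

puts result[:entities].first.name         # Ada Lovelace
puts result[:entities].first.type.inspect # :person
puts result[:total_found]                 # 1
```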
Testing Signatures
RSpec.describe TextClassifier do
let(:predictor) { DSPy::Predict.new(TextClassifier) }
it "classifies text correctly" do
result = predictor.call(
text: "This is a technical document",
length_limit: 500
)
expect(result.category).to be_a(TextClassifier::Category)
expect(result.confidence).to be_a(Float)
expect(result.keywords).to be_a(Array)
end
it "generates proper JSON schemas" do
input_schema = TextClassifier.input_json_schema
expect(input_schema[:properties]).to have_key(:text)
expect(input_schema[:properties]).to have_key(:length_limit)
output_schema = TextClassifier.output_json_schema
expect(output_schema[:properties]).to have_key(:category)
expect(output_schema[:properties]).to have_key(:confidence)
end
end
Special Considerations
Working with ChainOfThought
When using DSPy::ChainOfThought, be aware that it automatically adds a :reasoning field to your signature’s output:
# DO NOT define :reasoning in your output when using ChainOfThought
class AnalysisSignature < DSPy::Signature
description "Analyze text sentiment"
input do
const :text, String
end
output do
const :sentiment, String
# :reasoning field will be added automatically by ChainOfThought
end
end
# ChainOfThought usage
analyzer = DSPy::ChainOfThought.new(AnalysisSignature)
result = analyzer.call(text: "Great product!")
# Access both original fields and automatic reasoning
puts result.sentiment # => e.g. "positive"
puts result.reasoning # => e.g. "The text uses positive language..."
Important: If you define your own :reasoning field in a signature that will be used with ChainOfThought, it may cause conflicts or unexpected behavior.
Best Practices
1. Clear and Specific Descriptions
# Good: Specific and actionable
description "Classify customer support tickets by urgency and category based on message content"
# Bad: Vague
description "Classify text"
2. Meaningful Enum Values
# Good: Clear business meaning
class TicketPriority < T::Enum
enums do
Low = new('low')
Medium = new('medium')
High = new('high')
Urgent = new('urgent')
end
end
# Bad: Unclear values
class Priority < T::Enum
enums do
P1 = new('p1')
P2 = new('p2')
P3 = new('p3')
end
end
3. Use Optional Fields Appropriately
class ConfigurableAnalysis < DSPy::Signature
description "Analyze text with optional configuration"
input do
const :text, String
const :include_metadata, T.nilable(T::Boolean) # Optional
const :max_words, T.nilable(Integer) # Optional
end
output do
const :analysis, String
const :confidence, Float
const :metadata, T.nilable(T::Hash[String, String]) # Only if requested
end
end
Signatures provide the basic contract between your application and language models with type safety through Sorbet integration.