DSPy Python vs DSPy.rb Feature Comparison
This document provides a comprehensive comparison between the original DSPy Python library and the DSPy.rb implementation, highlighting what’s available, what’s missing, and what should be prioritized.
Important Note: DSPy.rb is not aiming for 1:1 parity with the Python library. Instead, we’re building a Ruby-idiomatic LLM framework inspired by DSPy’s core concepts while embracing Ruby’s strengths and conventions. Some features may be implemented differently, some may be skipped in favor of Ruby-specific alternatives, and some new features may be added that don’t exist in the Python version.
Core Modules/Predictors
Available in Both Python and Ruby
- Predict - Basic prediction module (✓ Ruby implemented)
- ChainOfThought - Adds reasoning field for step-by-step thinking (✓ Ruby implemented)
- ReAct - Thought-Action-Observation loop for agent tasks (✓ Ruby implemented)
Available in Python but NOT in Ruby
- ProgramOfThought - Teaches the LM to output code, whose execution results dictate the response
- MultiChainComparison - Compares multiple outputs from ChainOfThought to produce a final prediction
- Majority - Takes a majority vote across multiple predictions
- MultiHopRAG - Multi-hop retrieval-augmented generation
- SimplifiedBaleen - Simplified version of Baleen for multi-hop reasoning
- Retry - Adds retry logic with feedback
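To make the Majority entry above concrete, here is a minimal plain-Ruby sketch of majority voting over several sampled answers. The `majority_vote` helper is hypothetical and is not part of DSPy.rb or the Python library; in practice the answers would come from repeated predictor calls rather than a hard-coded array.

```ruby
# Hypothetical helper (not a DSPy.rb API): return the most frequent answer
# among several sampled completions.
def majority_vote(answers)
  raise ArgumentError, "answers must not be empty" if answers.empty?

  # Normalize so that "Paris" and "paris " count as the same vote.
  normalized = answers.map { |a| a.to_s.strip.downcase }
  counts = normalized.tally
  winner, _count = counts.max_by { |_answer, count| count }

  # Return the original (un-normalized) form of the first matching answer.
  answers[normalized.index(winner)]
end

samples = ["Paris", "paris ", "Lyon", "Paris"]
puts majority_vote(samples) # prints "Paris"
```

Normalizing before counting is a design choice for free-text answers; structured outputs would more likely be compared field by field.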
Ruby-Specific Modules (Not in Python)
- CodeAct - Think-Code-Observe pattern for Ruby code generation and execution (available via the dspy-code_act gem)
Optimization Techniques
Available in Both
- MIPROv2 - Bayesian optimization for instructions and demonstrations (✓ Ruby implemented)
Available in Python but NOT in Ruby
- BootstrapRS (BootstrapFewShotWithRandomSearch) - Synthesizes good few-shot examples with random search
- BootstrapFewShot - Generates complete demonstrations without further optimization
- COPRO - Coordinate ascent for instruction optimization
- BootstrapFinetune - Distills prompt-based programs into weight updates
- SignatureOptimizer - Optimizes input/output field names and descriptions
- BayesianSignatureOptimizer - Bayesian optimization for signature fields
- KNNFewShot - K-nearest neighbors for few-shot example selection
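As an illustration of the idea behind BootstrapFewShot in the list above, the following sketch runs a program over a training set and keeps only the traces that a metric accepts as demonstrations. Everything here (`Example`, `bootstrap_demos`, the toy program and metric) is a hypothetical stand-in written in plain Ruby, not the Python optimizer's API or a planned DSPy.rb interface.

```ruby
Example = Struct.new(:question, :answer, keyword_init: true)

# Hypothetical sketch: call `program` on each training example and keep the
# input/output pairs that `metric` accepts, up to `max_demos`.
def bootstrap_demos(program, trainset, metric, max_demos: 4)
  demos = []
  trainset.each do |example|
    break if demos.size >= max_demos

    prediction = program.call(example.question)
    next unless metric.call(example, prediction)

    demos << Example.new(question: example.question, answer: prediction)
  end
  demos
end

# Stand-in for an LM-backed program: a lookup table that is wrong for "3+3".
TOY_ANSWERS = { "2+2" => "4", "3+3" => "7", "5+5" => "10" }.freeze
toy_program = ->(question) { TOY_ANSWERS.fetch(question, "unknown") }
exact_match = ->(example, prediction) { example.answer == prediction }

trainset = [
  Example.new(question: "2+2", answer: "4"),
  Example.new(question: "3+3", answer: "6"),
  Example.new(question: "5+5", answer: "10")
]

demos = bootstrap_demos(toy_program, trainset, exact_match)
demos.each { |d| puts "Q: #{d.question} A: #{d.answer}" }
```

The failed "3+3" trace is discarded, so only verified examples would end up in the few-shot prompt.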
Advanced Features
Retrievers and Vector Stores
Python DSPy
- ColBERTv2 - Fast and accurate neural retrieval
- ChromaDB - Vector database integration
- Pinecone - Vector database integration
- Weaviate - Vector database integration
- Qdrant - Vector database integration
- FAISS - Facebook AI Similarity Search
- Azure Cognitive Search - Cloud-based search
- Custom Retriever base class
Ruby DSPy
- Memory System - Custom memory storage with embeddings (different approach)
- InMemoryStore - Simple in-memory vector storage
- LocalEmbeddingEngine - Local embedding generation
- No direct retriever integrations yet
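The in-memory vector storage mentioned above comes down to storing embeddings and ranking them by similarity to a query. The sketch below shows that idea with cosine similarity in plain Ruby. `ToyVectorStore` is a hypothetical class for illustration only and does not reflect the actual interface of DSPy.rb's InMemoryStore; the embeddings are hand-written two-dimensional vectors rather than model output.

```ruby
# Hypothetical illustration (not the DSPy.rb InMemoryStore API).
class ToyVectorStore
  def initialize
    @entries = []
  end

  # Store an id together with its embedding; returns self to allow chaining.
  def add(id, embedding)
    @entries << [id, embedding]
    self
  end

  # Return the k [id, score] pairs most similar to the query embedding.
  def search(query, k: 2)
    @entries
      .map { |id, embedding| [id, cosine(query, embedding)] }
      .sort_by { |_id, score| -score }
      .first(k)
  end

  private

  def cosine(a, b)
    dot = a.zip(b).sum { |x, y| x * y }
    norm = Math.sqrt(a.sum { |x| x * x }) * Math.sqrt(b.sum { |x| x * x })
    norm.zero? ? 0.0 : dot / norm
  end
end

store = ToyVectorStore.new
store.add("a", [1.0, 0.0]).add("b", [0.0, 1.0]).add("c", [1.0, 1.0])

store.search([1.0, 0.1], k: 2).each do |id, score|
  puts "#{id}: #{score.round(3)}"
end
```

A linear scan like this is fine for small collections; a dedicated vector database becomes relevant once the corpus no longer fits comfortably in memory.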
Type System and Validation
Python DSPy
- TypedPredictor - Enforces type constraints via Pydantic
- TypedChainOfThought - Type-safe chain of thought
- Assertions - Runtime assertions for outputs
- Pydantic-based type validation
Ruby DSPy
- Sorbet type system integration (stronger static typing)
- Built-in type validation through Sorbet runtime
- Schema validation via JSON schemas
Tool Support
Python DSPy
- Basic tool/function calling support
- Integration with external APIs
Ruby DSPy
- ReAct with full tool integration
- MemoryToolset - Tools for memory operations
- TextProcessingToolset - Text manipulation tools
- More structured tool system with Sorbet types
Production Features
Python DSPy
- Basic logging and monitoring
- Limited production instrumentation
Ruby DSPy (More Advanced)
- Comprehensive Instrumentation System
  - OpenTelemetry integration
  - New Relic integration
  - Langfuse integration
  - Custom event system
- Storage and Registry Systems
  - Program storage and versioning
  - Signature registry
  - Deployment support
- Configuration Management
  - dry-configurable integration
  - Environment-based configs
Missing Components That Should Be Prioritized
High Priority (Core Functionality Gaps)
- ProgramOfThought Module
  - Critical for code generation tasks
  - Complements the optional dspy-code_act module
  - Should support multiple languages
- MultiChainComparison Module
  - Essential for improving answer quality
  - Can leverage existing ChainOfThought
- Retriever Integrations
  - ColBERTv2 wrapper
  - Vector database clients (Pinecone, ChromaDB, Weaviate)
  - Custom retriever base class
- BootstrapFewShot Optimizer
  - Core optimization technique
  - Simpler than MIPROv2
  - Good starting point for users
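One possible shape for the custom retriever base class listed above is an abstract class with a single `retrieve` method that concrete backends override. The sketch below is an assumption about how such an interface could look, not an existing or planned DSPy.rb API; `BaseRetriever`, `Passage`, and `KeywordRetriever` are hypothetical names, and the keyword-overlap scoring stands in for a real backend such as ColBERTv2 or a vector database client.

```ruby
# Hypothetical interface sketch (not a DSPy.rb API).
class BaseRetriever
  Passage = Struct.new(:text, :score, keyword_init: true)

  # Subclasses return an array of Passage objects, best match first.
  def retrieve(query, k: 3)
    raise NotImplementedError, "#{self.class} must implement #retrieve"
  end
end

# Toy backend: scores each document by how many query terms it shares.
class KeywordRetriever < BaseRetriever
  def initialize(corpus)
    @corpus = corpus
  end

  def retrieve(query, k: 3)
    terms = tokenize(query)
    @corpus
      .map { |text| Passage.new(text: text, score: (tokenize(text) & terms).size) }
      .select { |passage| passage.score.positive? }
      .sort_by { |passage| -passage.score }
      .first(k)
  end

  private

  def tokenize(text)
    text.downcase.scan(/\w+/).uniq
  end
end

corpus = [
  "Ruby is a dynamic language",
  "Sorbet adds static types to Ruby",
  "Python uses Pydantic"
]

retriever = KeywordRetriever.new(corpus)
retriever.retrieve("static types in Ruby").each do |passage|
  puts "#{passage.score}: #{passage.text}"
end
```

Keeping the contract to one method means a multi-hop module could depend only on `retrieve` and stay independent of the storage backend.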
Medium Priority (Enhanced Capabilities)
- COPRO Optimizer
  - Instruction optimization
  - Complements MIPROv2
- Typed Predictors
  - TypedPredictor equivalent using Sorbet
  - Better type safety for outputs
- Assertions System
  - Runtime validation of outputs
  - Quality control mechanism
- MultiHopRAG Module
  - Advanced retrieval patterns
  - Builds on retriever integrations
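The assertions idea above (runtime validation of outputs as a quality control mechanism) can be sketched as a wrapper that checks each output and retries with feedback when the check fails. This is a hypothetical plain-Ruby illustration; `with_assertion` and `OutputAssertionError` are invented names and do not correspond to an existing DSPy.rb or Python DSPy API. The block stands in for an LM call that would receive the feedback in its prompt.

```ruby
# Hypothetical sketch (not a DSPy.rb API).
class OutputAssertionError < StandardError; end

# Yields the previous failure message (nil on the first attempt) to the block,
# returns the first output that passes `check`, and raises once the retries
# are exhausted.
def with_assertion(check:, message:, max_retries: 2)
  feedback = nil
  (max_retries + 1).times do
    output = yield(feedback)
    return output if check.call(output)

    feedback = message
  end
  raise OutputAssertionError, "#{message} (after #{max_retries} retries)"
end

attempts = 0
answer = with_assertion(
  check: ->(out) { out.length <= 20 },
  message: "Answer must be 20 characters or fewer"
) do |feedback|
  attempts += 1
  # Stand-in for an LM call: only produces a short answer after feedback.
  feedback.nil? ? "A very long and rambling answer" : "Short answer"
end

puts "#{answer} (attempts: #{attempts})"
```

Raising after the final retry keeps failures visible to the caller instead of silently returning an output that violates the constraint.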
Lower Priority (Nice to Have)
- SignatureOptimizer
  - Advanced optimization
  - Can improve prompt quality
- KNNFewShot
  - Advanced example selection
  - Requires embedding infrastructure
Ruby DSPy Advantages
- Stronger Type System - Sorbet provides better static analysis
- Production-Ready Instrumentation - Comprehensive observability
- Better Configuration Management - dry-configurable integration
- Advanced Memory System - Built-in memory management
- Structured Tool System - Better tool integration patterns
Recommendations
- Immediate Focus: Implement ProgramOfThought and MultiChainComparison modules
- Retriever Strategy: Create a base retriever class and implement ColBERTv2 wrapper
- Optimizer Expansion: Add BootstrapFewShot as a simpler alternative to MIPROv2
- Type Safety: Leverage Sorbet to create TypedPredictor equivalent
- Vector Store Integration: Start with ChromaDB or Pinecone for vector storage
The Ruby implementation is ahead on production readiness and observability, but it still lacks several core DSPy modules and retriever integrations. Closing those gaps is the priority; strict feature parity with the Python version is not the goal.