DSPy Signatures anchor your app in a world where everything changes: prompting techniques, model families, even serialization formats. They’re the declarative contract for your prompt, so you never handcraft schemas or payloads again. Here’s the latest signature we used for the benchmark, now with nested structs and enums:
```ruby
class TaskDecomposition < DSPy::Signature
  description "Autonomously analyze a research topic and define optimal subtasks with strategic prioritization"

  input do
    const :topic, String, description: "The main research topic to investigate"
    const :context, String, description: "Any additional context or constraints"
    const :complexity_level, ComplexityLevel,
      description: "Desired complexity level for task decomposition"
  end

  output do
    const :subtasks, T::Array[Task], description: "Autonomously defined research subtasks"
    const :task_types, T::Array[TaskType], description: "Type classification for each task"
    const :priority_order, T::Array[Integer], description: "Priority rankings (1-5 scale)"
    const :estimated_effort, T::Array[EstimatedEffortWithReasoning], description: "Effort estimates in hours with rationale"
    const :dependencies, T::Array[Task], description: "Task dependency relationships"
    const :agent_requirements, T::Array[String], description: "Suggested agent skills"
  end
end
```
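The nested types the signature references (ComplexityLevel, TaskType, Task, EstimatedEffortWithReasoning) are ordinary Sorbet enums and structs. Their real definitions ship with the benchmark script; the sketch below is only illustrative, and the field names are assumptions rather than the benchmark’s actual schema:

```ruby
require 'sorbet-runtime'

# Illustrative sketch only - the real enums/structs live in
# examples/baml_vs_json_benchmark.rb and may carry different fields.
class ComplexityLevel < T::Enum
  enums do
    Basic = new('basic')
    Intermediate = new('intermediate')
    Advanced = new('advanced')
  end
end

class TaskType < T::Enum
  enums do
    Research = new('research')
    Analysis = new('analysis')
    Synthesis = new('synthesis')
  end
end

class Task < T::Struct
  const :name, String
  const :objective, String
  const :success_metric, String
end

class EstimatedEffortWithReasoning < T::Struct
  const :hours, Float
  const :reasoning, String
end
```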
The remaining cost has always been tokens: JSON Schema guidance is verbose, and JSON payloads repeat every key, especially when you’re shipping time-series data or long lists of structs. Starting today, you can flip two symbols and trim Enhanced Prompting back down to size, even for signatures that emit nested structs, enums, and rationales.
TL;DR
- Schema guidance: switch `schema_format: :baml` and drop 3,528 → 608 characters (≈ 83% smaller), even with nested structs/enums.
- Data blocks: switch `data_format: :toon` (Token-Oriented Object Notation, powered by the new `sorbet-toon` gem) and keep your inputs/outputs, ReAct histories, and tool payloads structured without JSON overhead. TOON itself lives at github.com/toon-format/toon.
- Net effect: the enhanced `TaskDecomposition` signature now ships ≈ 9,490 fewer schema tokens and ≈ 2,420 fewer data tokens per Enhanced Prompting call. That still cuts prompts roughly in half, even though the tasks now carry objectives, success metrics, and reasoning.
```ruby
DSPy.configure do |c|
  c.lm = DSPy::LM.new(
    'openai/gpt-4o-mini',
    api_key: ENV['OPENAI_API_KEY'],
    schema_format: :baml,
    data_format: :toon
  )
end
```
That’s it. Predictors, ChainOfThought, ReAct, and every DSPy module keep the same API; prompts just get cheaper.
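For example, a plain predictor keeps the same call shape. A minimal sketch, assuming the illustrative ComplexityLevel enum from earlier; the result exposes the signature’s output fields as accessors:

```ruby
# Same Predict API as before; only the rendered prompt changes
# (BAML schema guidance + TOON data blocks under the hood).
decomposer = DSPy::Predict.new(TaskDecomposition)

result = decomposer.call(
  topic: 'Impact of quantization on small open-weight models',
  context: 'Literature review for an internal engineering report',
  complexity_level: ComplexityLevel::Intermediate
)

result.subtasks.each_with_index do |task, i|
  puts "#{result.priority_order[i]}. #{task.name}"
end
```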
Why TOON + BAML matters
| Scenario | JSON Schema + JSON Data | BAML Schema + TOON Data |
|---|---|---|
| Signature guidance size | 3,528 chars | 608 chars |
| Sample input + output payload | 2,063 chars | 1,180 chars |
| Total prompt tokens (Enhanced Prompting) | ~13,500 | ~6,300 |
Source: examples/baml_vs_json_benchmark.rb, live run baml_benchmark_20251107_172759.json.
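To reproduce the numbers yourself, run the script from a checkout of the repo; the exact invocation may vary with your setup, but it is roughly:

```bash
bundle exec ruby examples/baml_vs_json_benchmark.rb
```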
That reduction isn’t just abstract token math:
- Schema savings: ~9,490 tokens disappear every time you render the signature guidance. That’s ~75% of the system prompt cost.
- Payload savings: TOON trims another ~2,420 tokens per request by avoiding repeated JSON keys.
- Latency/cost: When a model follows TOON, per-call cost falls 10-20% and latency drops 15-25% (e.g., `gpt-4o` BAML+TOON runs averaged 4.7 s vs 20 s for JSON+JSON in the benchmark). The same pattern held for Anthropic and Gemini models.
What the model feels
- Clear guidance, compact tables: BAML renders the signature schema in a TypeScript-like form instead of a 200-line JSON Schema blob. Models latch onto the important parts faster.
- Structured payloads without braces: Sorbet::Toon turns your input struct into a Token-Oriented Object Notation (TOON) block. Arrays of structs become literal tables, so histories, toolsets, time-series data, and complex outputs stop repeating field names. JSON adds padding every time you send a list; TOON stays slim (see the before/after sketch after this list).
- Enhanced Prompting by default: You keep the exact same predictor APIs, with no function calling or JSON Schema extraction tricks. Swapping formats only changes how we render the prompt, not how you write or parse completions.
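To see why arrays of structs shrink, here is a hand-written before/after for a small task list. The field names are invented, and the TOON block follows the format documented at github.com/toon-format/toon rather than verbatim Sorbet::Toon output:

```json
[
  { "name": "Survey prior work", "hours": 6, "priority": 1 },
  { "name": "Draft evaluation plan", "hours": 4, "priority": 2 }
]
```

```
subtasks[2]{name,hours,priority}:
  Survey prior work,6,1
  Draft evaluation plan,4,2
```

The JSON version repeats every key on every element; the TOON table declares the keys once and then streams rows.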
Where the savings show up
- Prediction prompts – any signature-backed `Predict` now emits TOON payloads, so even single-call apps get the 57% token cut.
- ReAct loops – every turn now shares tools, histories, and observations as TOON. Long multi-tool dialogues stop reprinting JSON hashes.
- Tool ecosystems – TOON preserves typing (thanks to `Sorbet::Toon.decode`), so tool outputs round-trip back into Sorbet structs without manual serialization glue.
FAQ
- Do I need to use BAML and TOON together?
- No. They’re independent toggles. Use `schema_format: :baml` when you want compact schema guidance, `data_format: :toon` when you want lean payloads. You can enable either one (or both) per LM (see the sketch after this FAQ).
- Where’s the benchmarking code?
- In `examples/baml_vs_json_benchmark.rb`. It ships with the repo and emits the same `.json`/`.csv`/`.txt` artifacts referenced here.
- Does this rely on function calling or structured outputs?
- No. Everything stays in Enhanced Prompting: you still write plain `Predict`, `ChainOfThought`, or `ReAct` code and parse completions the same way.
- Can I combine TOON with provider-native structured outputs?
- Not today. Provider structured outputs still expect JSON. TOON is purpose-built for Enhanced Prompting, so use it when you’re controlling the prompt yourself.
- Will TOON break my ReAct tools or custom modules?
- No. ReAct, toolsets, and other DSPy modules already understand `data_format: :toon`; they simply serialize histories, tools, and responses using Sorbet::Toon instead of JSON.
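As the first answer above notes, the toggles are independent per LM. A small sketch of mixing them (the model ids here are just examples):

```ruby
# Compact BAML schema guidance only; data stays JSON.
schema_only = DSPy::LM.new(
  'openai/gpt-4o-mini',
  api_key: ENV['OPENAI_API_KEY'],
  schema_format: :baml
)

# Lean TOON payloads only; schema guidance stays JSON Schema.
data_only = DSPy::LM.new(
  'anthropic/claude-3-5-haiku-20241022',
  api_key: ENV['ANTHROPIC_API_KEY'],
  data_format: :toon
)

DSPy.configure { |c| c.lm = schema_only }
```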
What’s the migration diff?
```diff
 DSPy.configure do |c|
-  c.lm = DSPy::LM.new('openai/gpt-4o-mini', api_key: ENV['OPENAI_API_KEY'])
+  c.lm = DSPy::LM.new(
+    'openai/gpt-4o-mini',
+    api_key: ENV['OPENAI_API_KEY'],
+    schema_format: :baml,
+    data_format: :toon
+  )
 end
```
Flip the formats, keep your prompts declarative, and run TOON wherever Enhanced Prompting makes sense.