Let the Model Write Your Tools
Building a research agent with CodeAct where the LLM generates Ruby code on the fly
Vicente Reig
Fractional Engineering Lead β’ β’ 4 min read
CodeAct is a DSPy.rb module that lets an LLM write and execute Ruby code to solve tasks. Instead of defining tools upfront, the agent generates code on the fly. This post walks through a minimal research agent that fetches data from web APIs.
The Signature
First, define what the agent does:
class ResearchQuery < DSPy::Signature
description "Research a topic by generating Ruby code to fetch and analyze web content."
input do
const :query, String, description: "The research question or topic to investigate"
const :context, String, description: "Available libraries and guidelines"
end
output do
const :answer, String, description: "The answer based on research findings"
end
end
Nothing special here - just a standard signature with a query in and an answer out.
The Context
The context input is just a string describing what the agent can use. Here we explain that Net::HTTP, URI, and JSON are available, and list some useful API endpoints:
RESEARCH_CONTEXT = <<~CONTEXT
You have access to these Ruby libraries:
- Net::HTTP - for HTTP requests
- URI - for parsing URLs
- JSON - for parsing JSON responses
Useful APIs:
- GitHub API: https://api.github.com/repos/{owner}/{repo}
- Wikipedia API: https://en.wikipedia.org/api/rest_v1/page/summary/{title}
CONTEXT
Running the Agent
Configure DSPy with your LLM, then create a CodeAct instance:
DSPy.configure do |config|
config.lm = DSPy::LM.new('openai/gpt-4o-mini', api_key: ENV['OPENAI_API_KEY'])
end
agent = DSPy::CodeAct.new(ResearchQuery, max_iterations: 8)
Call the agent with your query:
result = agent.call(
query: "How many stars does rails/rails have?",
context: RESEARCH_CONTEXT
)
puts result.answer
What Happens
The agent loops through three phases: Think (reason about the approach), Code (generate and execute Ruby), and Observe (decide if done or continue). Hereβs a real run:
π¬ Research Query: How many stars does rails/rails have?
π Final Answer:
The repository 'rails/rails' has 57915 stars.
π Execution History (1 iterations):
[Step 1]
π Thought: To fetch the number of stars for the 'rails/rails' repository
on GitHub, I will use the GitHub API endpoint for repository information.
π» Code:
uri = URI.parse('https://api.github.com/repos/rails/rails')
http = Net::HTTP.new(uri.host, uri.port)
http.use_ssl = true
response = http.get(uri.request_uri)
data = JSON.parse(response.body)
stars = data['stargazers_count']
puts "rails/rails has #{stars} stars."
β
Result: rails/rails has 57915 stars.
The agent:
- Reasoned about which API to use
- Generated valid Ruby code with proper HTTPS handling
- Executed the code and captured the output
- Decided the task was complete after one iteration
Specializing via Signatures
The signature defines what your agent does. Change the signature, get a different agent:
# A data analyst
class AnalyzeCSV < DSPy::Signature
description "Analyze CSV data and compute statistics"
input do
const :csv_data, String
const :question, String
end
output do
const :analysis, String
const :numbers, T::Array[Float]
end
end
# A code explainer
class ExplainCode < DSPy::Signature
description "Execute code snippets to understand their behavior"
input do
const :code, String
const :language, String
end
output do
const :explanation, String
const :output_example, String
end
end
Each signature produces a specialized agent. The description guides the LLMβs approach, input fields define what data it receives, and output fields shape what it returns. CodeAct handles the execution loop - you just define the interface.
When to Use This
CodeAct works well when you need flexible data fetching or transformation. The agent can combine APIs, parse responses, and compute derived values - all without you writing tool definitions.
Use ReAct instead when you have well-defined tools with clear interfaces, or need stricter control over what the agent can do.
The tradeoff with CodeAct is safety: it uses eval to run generated code. Fine for experimentation, but add sandboxing before using with untrusted input.
Sandboxing Options
Ruby has no built-in safe sandbox ($SAFE was removed in Ruby 3.0). For production use with untrusted input, consider these alternatives:
| Approach | Gem | How it works | Overhead |
|---|---|---|---|
| Docker containers | trusted-sandbox | Runs code in isolated containers with resource limits | ~100ms+ |
| V8 engine | ruby_box | Compiles Ruby to JS via Opal, executes in V8 | Fast, but limited Ruby compatibility |
| Process isolation | safe_ruby | Forks process with whitelisted methods | Moderate |
| WebAssembly | wasmer-ruby | Memory-safe WASM sandbox | Complex setup |
For controlled experimentation, a lightweight approach with timeouts and pattern blocking:
class SaferCodeAct < DSPy::CodeAct
FORBIDDEN = [/`.*`/, /system\s*\(/, /exec\s*\(/, /File\.(delete|write)/]
def execute_ruby_code_safely(ruby_code)
return [nil, "Forbidden pattern"] if FORBIDDEN.any? { |p| ruby_code.match?(p) }
Timeout.timeout(5) { super }
rescue Timeout::Error
[nil, "Timed out"]
end
end
This isnβt a real sandbox - itβs just guardrails. For untrusted input, use Docker or WASM.
Try It
The full example is at examples/codeact_research_agent.rb:
$ bundle exec ruby examples/codeact_research_agent.rb "How many stars does rails/rails have?"
π¬ Research Query: How many stars does rails/rails have?
============================================================
π Final Answer:
------------------------------------------------------------
The repository 'rails/rails' has 57915 stars.
π Execution History (1 iterations):
------------------------------------------------------------
[Step 1]
π Thought: To fetch the number of stars for the 'rails/rails' repository on GitHub,
I will use the GitHub API endpoint for repository information.
π» Code:
uri = URI.parse('https://api.github.com/repos/rails/rails')
http = Net::HTTP.new(uri.host, uri.port)
http.use_ssl = true
response = http.get(uri.request_uri)
data = JSON.parse(response.body)
stars = data['stargazers_count']
puts "rails/rails has #{stars} stars."
β
Result: rails/rails has 57915 stars.
Or run it interactively:
bundle exec ruby examples/codeact_research_agent.rb