Workflow-first multi-agent orchestration for Ruby.
Smith gives you a disciplined way to build agent systems that are explicit, inspectable, and operationally sane. Instead of hiding orchestration inside prompts, it lets you model the work as a workflow with named states, named transitions, budgets, guardrails, tool policy, persistence hooks, artifacts, tracing, and composable multi-agent patterns.
Warning
smith is not published yet and is still under active development.
Expect API changes, contract tightening, and sharp edges.
If you want to build on it early, pin a commit and verify behavior against the runtime specs and spec/SPEC_MATRIX.md.
Smith is named after Agent Smith from The Matrix.
The reference fits the kind of systems this library is built for, drawing on Smith's full role in the films: control, enforcement, replication, propagation, coordination, containment pressure, and a system that becomes more dangerous as it becomes more autonomous and harder to constrain. That is closer to real agent software than most friendly assistant demos: multiple actors, repeated delegation, expanding scope, and failure modes that matter once the system has real consequences.
Smith is built for that layer: not just "call a model," but managing agent behavior as a system.
Most agent demos look good until you need one of these:
- explicit control flow instead of "the prompt told the model what to do"
- repeatable failure behavior
- tool authorization and guardrails
- budget and deadline enforcement
- parallel fan-out with controlled accounting
- nested workflows and reusable subflows
- evaluator-optimizer and orchestrator-worker loops
- persistence and resume at workflow boundaries
- artifacts for large outputs
- tracing and best-known token/cost accounting
Smith is built for that layer.
It is especially useful when you want to:
- turn one-off prompting into a real application workflow
- keep orchestration in Ruby instead of burying everything in model text
- compose multiple agents without losing control of budgets, deadlines, and failure semantics
- let host apps own storage, queues, retries, and long-lived durability
- A Ruby library for workflow-first agent orchestration
- Built on top of RubyLLM, not a replacement for it
- In-process and host-controlled
- Good for application-level orchestration where you want explicit state and explicit control
- Not a hosted runtime
- Not a durable workflow engine by itself
- Not a job queue
- Not a billing-grade cost system
- Not a replacement for your app's persistence, retries, or deployment platform
Your application still owns:
- persistence
- job execution
- retries at the host/process level
- tenant isolation policy
- provider credentials and provider-level configuration
With the current surface you can build:
- a single guarded agent behind a workflow boundary
- sequential multi-step flows
- classifier routers
- bounded parallel fan-out
- reusable subflows through nested workflows
- generator/evaluator loops
- orchestrator/worker systems
- workflows that store big outputs as artifact refs
- resumable flows using to_state / from_state
Here is the practical shift Smith gives you.
Without Smith, a typical app ends up with:
- a prompt string
- an LLM call
- some ad hoc branching around the response
- unclear failure handling
- no workflow state
With Smith, the same job becomes an explicit workflow with real structure:
class ReplyContext < Smith::Context
persist :user_message
inject_state do |persisted|
"User message: #{persisted[:user_message]}"
end
end
class ReplyAgent < Smith::Agent
register_as :reply_agent
model "gpt-4.1-nano"
instructions do |_context|
"Write a concise, professional reply."
end
end
class ReplyWorkflow < Smith::Workflow
context_manager ReplyContext
initial_state :idle
state :done
state :failed
transition :reply, from: :idle, to: :done do
execute :reply_agent
on_failure :fail
end
end
result = ReplyWorkflow.new(
context: { user_message: "I was charged twice for the same invoice." }
).run!
result.state
# => :done
result.output
# => final assistant output
result.steps
# => [{ transition: :reply, from: :idle, to: :done, output: ... }]

That buys you immediately:
- explicit workflow state
- explicit success and failure routing
- a step log you can inspect
- a clean place to add budgets, tools, guardrails, tracing, persistence, and artifacts later
smith is not on RubyGems yet.
Use a local path:
# Gemfile
gem "smith", path: "../smith"

Or a git source from your own remote:
# Gemfile
gem "smith", git: "ssh://git@your-git-host/your-org/smith.git"

Then install:
bundle install

After adding Smith to your bundle, verify the integration.
smith doctor # offline verification
smith doctor --live # includes real provider call
smith doctor --durability # includes persistence round-trip
smith install # scaffold config/smith.rb

bin/rails smith:doctor # offline verification
bin/rails smith:doctor:live # includes real provider call
bin/rails smith:doctor:durability # includes persistence round-trip
bin/rails smith:install # scaffold config/initializers/smith.rb

Or use the Rails generator:
bin/rails generate smith:install

- Baseline (always): Smith loads, Ruby version, RubyLLM loads, minimal workflow boots
- Configuration (always): logger, artifacts, tracing, pricing — warns if missing
- Serialization (with --durability): to_state, JSON round-trip, from_state, resume
- Durability (with --durability): host persistence adapter round-trip and resumed execution
- Persistence (with --profile rails_persistence): ActiveRecord, DB connection, RubyLLM persistence surface, schema
- Live (with --live): real provider call against configured RubyLLM model
Doctor is offline by default. Live verification and persistence checks are opt-in.
For durability verification, Smith supports these first-class adapter modes:
- :rails_cache for standard Rails cache integration
- :solid_cache as a Rails-cache alias when your cache backend is Solid Cache
- :cache_store for any cache-like store that responds to write, read, and delete
- :redis for a Redis client
- :active_record for a keyed ActiveRecord model such as WorkflowState
:rails_cache and :solid_cache are only as durable as the configured Rails cache backend.
If Rails is using ActiveSupport::Cache::MemoryStore, Smith can round-trip in-process but that storage will not survive restarts, and doctor will warn accordingly.
The same warning applies to :cache_store if you point it at a process-local memory backend.
Example Rails config:
Smith.configure do |config|
config.persistence_adapter = :rails_cache
config.persistence_options = { namespace: "smith" }
end

If the workflow should begin with a deterministic conversation turn, you can seed that session history directly on the workflow:
class SeededReplyWorkflow < Smith::Workflow
seed_messages do |ctx|
[{ role: :user, content: ctx[:user_message] }]
end
initial_state :idle
state :done
transition :reply, from: :idle, to: :done do
execute :reply_agent
end
end

Example Redis config:
Smith.configure do |config|
config.persistence_adapter = :redis
config.persistence_options = {
redis: Redis.new(url: ENV.fetch("REDIS_URL")),
namespace: "smith"
}
end

Example ActiveRecord config:
Smith.configure do |config|
config.persistence_adapter = :active_record
config.persistence_options = {
model: WorkflowState,
key_column: :key,
payload_column: :payload
}
end

You can still provide a custom adapter object if your host app already has its own persistence API. It just needs to implement:
- store(key, payload)
- fetch(key)
- delete(key)
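That duck type can be satisfied by a tiny in-memory object. The class below is an illustrative sketch only (a real host adapter would write to durable storage such as your database or cache):

```ruby
# Minimal in-memory adapter satisfying Smith's persistence duck type.
# Illustrative only: real host adapters should back this with durable storage.
class HashStateAdapter
  def initialize
    @store = {}
  end

  # Persist the serialized workflow payload under a key.
  def store(key, payload)
    @store[key] = payload
  end

  # Return the payload previously stored under the key, or nil when absent.
  def fetch(key)
    @store[key]
  end

  # Remove a stored payload.
  def delete(key)
    @store.delete(key)
  end
end
```

Presumably such an object is handed to Smith through the same configuration surface as the built-in modes (an assumption; check the configuration docs for the exact setting).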
The setup model is:
- configure your provider through RubyLLM
- optionally configure Smith runtime services
- define an agent
- define a workflow
- run it
To make a real Smith workflow run, you need:
- working RubyLLM provider setup
- at least one Smith::Agent with a model
- a register_as name for any agent a workflow will execute
- the agent class to be loaded before the workflow step that references it runs
- a Smith::Workflow with at least one transition
Everything else is optional at first:
- Smith.configure
- budgets
- guardrails
- context management
- tracing
- artifacts
- pricing
Smith depends on RubyLLM.
It does not replace provider setup.
Minimal OpenAI example:
require "ruby_llm"
RubyLLM.configure do |config|
config.openai_api_key = ENV.fetch("OPENAI_API_KEY")
config.default_model = "gpt-4.1-nano"
end

The exact keys change by provider, but the layering does not:
- RubyLLM owns provider credentials and default provider behavior
- Smith owns orchestration on top of that
register_as is a class-load side effect.
When Smith executes a transition like:
execute :reply_agent

it resolves :reply_agent from Smith::Agent::Registry.
That means the agent class must already have been loaded so its register_as :reply_agent line has actually run.
This matters most in environments with autoloading and partial eager loading, such as Rails development mode. If an agent class lives in an autoloaded path but nothing has referenced that constant yet, the registry entry may not exist when the workflow runs.
Smith now fails fast in that situation. Unresolved agent symbols raise Smith::WorkflowError instead of silently advancing with nil. The same rule applies across:
- execute
- route
- optimize
- orchestrate
In host apps, the fix is to make agent loading explicit at boot or reload time. In Rails, that usually means a small initializer or to_prepare hook that references or registers the workflow-facing agent classes your app uses.
register_as is reload-safe. Under the hood it delegates to Smith::Agent::Registry.ensure_registered, which handles:
- First boot: registers the agent normally.
- Rails reload (stale same-name class): detects the old class object by matching .name, replaces it atomically.
- Same object re-registration: no-op.
- True collision (different class name): raises Smith::AgentRegistryError.
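Those rules form a small decision table. The sketch below models them in plain Ruby so the semantics are concrete; it is not Smith's implementation, and the class and error names are stand-ins:

```ruby
# Simplified model of the ensure_registered decision table described above.
# Not Smith's code: MiniRegistry and Collision are illustrative stand-ins.
class MiniRegistry
  Collision = Class.new(StandardError)

  def initialize
    @agents = {}
  end

  def ensure_registered(key, klass)
    existing = @agents[key]
    if existing.nil?
      @agents[key] = klass               # first boot: register normally
    elsif existing.equal?(klass)
      klass                              # same object re-registration: no-op
    elsif existing.name == klass.name
      @agents[key] = klass               # reload: replace stale same-name class
    else
      raise Collision, "#{key} already registered to #{existing.name}"
    end
  end

  def find(key)
    @agents[key]
  end
end
```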
Host apps should not call Smith::Agent::Registry.clear! during normal runtime. clear! exists for test isolation only.
Host apps should not access Smith::Agent::Registry._container directly. All registry operations go through the public API: find, fetch!, register, delete, ensure_registered, clear!.
Registry collisions with different class names fail loudly with Smith::AgentRegistryError. This catches real misconfiguration (two different agents registered under the same key).
Post-clear! re-registration in tests: if a test calls clear! while agent constants are already loaded, referencing the constant does not re-run register_as (class body does not re-execute). Tests must explicitly re-register via Smith::Agent::Registry.ensure_registered(klass.register_as, klass).
The clean fix order is:
- Ensure every workflow-facing agent declares the exact symbol it uses through register_as.
- Make agent loading explicit in the host app via to_prepare.
- Let Smith fail fast if that bootstrap is missing.
In Rails, the preferred fix is a narrow to_prepare hook that references agent classes directly:
# config/initializers/smith_agents.rb
Rails.application.config.to_prepare do
ReplyAgent
TriageAgent
ResearchOrchestrator
ResearchWorker
end

That keeps development reload behavior intact and makes the dependency explicit. No app-level registry module is needed — Smith's ensure_registered handles reload safety.
Smith standardizes prompt roles across providers at the workflow handoff:
- system: stable control-plane framing for the whole agent invocation
- assistant: prior model output from an earlier round
- user: the current task input plus turn-local workflow metadata
Smith keeps exactly one control-plane system message per agent invocation. Agent instructions and injected context state are merged into that single message before the provider call.
This matters for both provider compatibility and prompt semantics:
- Anthropic accepts a single top-level system prompt
- OpenAI-style providers treat system/developer as high-authority instructions
- turn-local workflow markers should stay adjacent to the current round, not be hoisted into the global instruction layer
In practice that means:
- injected state like [smith:injected-state] belongs in the single system message
- refinement feedback and orchestration worker results belong in user content for the current round
- prior candidates or prior orchestrator outputs stay in assistant when Smith is continuing a multi-round exchange
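The single-system-message rule can be illustrated with plain Ruby. The message shapes and the merge function here are hypothetical (Smith performs this internally before the provider call); the sketch only shows the layering:

```ruby
# Illustration of the single control-plane message rule. The hash shapes
# and merge_control_plane are hypothetical, not Smith's internals.
def merge_control_plane(instructions, injected_state, turn_input)
  system_text = [instructions, injected_state].compact.join("\n\n")
  [
    { role: :system, content: system_text },  # exactly one system message
    { role: :user,   content: turn_input }    # turn-local input stays here
  ]
end

messages = merge_control_plane(
  "Write a concise, professional reply.",
  "[smith:injected-state]\nUser message: I was charged twice.",
  "Draft the reply for this round."
)
```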
Setting config.eager_load = true can hide the problem by loading more classes at boot, but it is not the preferred fix:
- it couples correctness to a global loading mode
- it is a worse default for development
- it does not make the workflow-facing agent set explicit
Production eager loading is still fine. The point is that Smith correctness should not depend on eager loading as the only registration mechanism.
If you skip Smith.configure, you still need:
require "smith"

Smith.configure is global runtime config for orchestration concerns:
- artifacts
- tracing
- pricing
- logging
Minimal example:
require "logger"
require "smith"
Smith.configure do |config|
config.logger = Logger.new($stdout)
config.trace_adapter = Smith::Trace::Memory.new
config.artifact_store = Smith::Artifacts::Memory.new
config.pricing = {
"gpt-4.1-nano" => {
input_cost_per_token: 0.0000001,
output_cost_per_token: 0.0000004
}
}
end

You do not need all of that to get started.
For a first run, Smith.configure can be omitted entirely.
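The pricing table shown in the configuration example maps to simple per-token arithmetic. A sketch of the calculation (the table format follows the config example; the helper itself is illustrative, not part of Smith's API):

```ruby
# Per-token cost arithmetic for a pricing table shaped like the config example.
# call_cost is an illustrative helper, not a Smith method.
PRICING = {
  "gpt-4.1-nano" => {
    input_cost_per_token: 0.0000001,
    output_cost_per_token: 0.0000004
  }
}.freeze

def call_cost(model, input_tokens, output_tokens, pricing: PRICING)
  rates = pricing.fetch(model)
  input_tokens * rates[:input_cost_per_token] +
    output_tokens * rates[:output_cost_per_token]
end

# 10_000 input tokens and 2_000 output tokens:
# 10_000 * 0.0000001 + 2_000 * 0.0000004 = 0.001 + 0.0008 = 0.0018
cost = call_cost("gpt-4.1-nano", 10_000, 2_000)
```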
class SupportReplyAgent < Smith::Agent
register_as :support_reply_agent
model "gpt-4.1-nano"
instructions do |_context|
"Write a concise, calm support reply with a concrete next step."
end
end

class SupportReplyContext < Smith::Context
persist :ticket_id, :user_message
inject_state do |persisted|
<<~TEXT
Ticket: #{persisted[:ticket_id]}
User message: #{persisted[:user_message]}
TEXT
end
end
class SupportReplyWorkflow < Smith::Workflow
context_manager SupportReplyContext
initial_state :idle
state :done
state :failed
transition :reply, from: :idle, to: :done do
execute :support_reply_agent
on_failure :fail
end
end

result = SupportReplyWorkflow.new(
context: {
ticket_id: "T-1042",
user_message: "I was charged twice for the same invoice."
}
).run!
result.state
# => :done
result.output
# => final workflow output

The immediate value is not just "call a model". It is that the call now happens inside:
- an explicit workflow state machine
- a step log
- a standard failure path
- a result object with cumulative best-known totals
The normal public way to pass input into a workflow is exactly what the quickstart does:
- pass data through context:
- declare which keys matter with persist
- turn those keys into agent-visible input with inject_state
If you need conversation history rather than just structured workflow input, that history lives in session_messages in persisted workflow state and comes back through .from_state.
If a workflow should start from a deterministic first turn, use seed_messages on the workflow. Seeded messages are only added for newly initialized workflows and do not rerun on restore.
Use Smith::Agent when you want RubyLLM agents plus Smith-specific operational controls.
Smith adds:
- budget
- guardrails
- output_schema
- data_volume
- fallback_models
- register_as
It still keeps the RubyLLM agent surface, so you can continue using things like:
- model
- tools
- instructions
- temperature
- thinking
Example:
class ResearchSummarySchema
# Replace this with your real RubyLLM schema object/class.
# The intended shape here is something like:
# { summary: "...", sources: ["..."] }
end
class ResearchAgent < Smith::Agent
register_as :research_agent
model "gpt-4.1-nano"
temperature 0.2
budget token_limit: 20_000, cost: 0.75, wall_clock: 20, tool_calls: 5
fallback_models "gpt-4.1-mini"
output_schema ResearchSummarySchema
instructions do |_context|
"Research the topic and return a concise, factual answer."
end
end

Notes:
- output_schema is passed through to RubyLLM schema support for providers that support structured outputs.
- thinking is inherited from RubyLLM and forwards reasoning/thinking controls to providers that support them.
- use thinking with reasoning-capable models, for example:
class DeepReasoningAgent < Smith::Agent
register_as :deep_reasoning_agent
model "o4-mini"
thinking effort: :medium, budget: 2_048
end

Use Smith::Workflow to define the actual orchestration graph.
It gives you:
- states
- transitions
- workflow budgets
- max transition bounds
- workflow-level guardrails
- context management
- stepwise execution
- persistence and resume
Example:
class ResearchWorkflow < Smith::Workflow
initial_state :idle
state :researching
state :done
state :failed
budget total_tokens: 150_000, total_cost: 2.50, wall_clock: 300, tool_calls: 20
max_transitions 12
transition :start, from: :idle, to: :researching do
execute :research_agent
on_success :finish
on_failure :fail
end
transition :finish, from: :researching, to: :done
end

workflow.run! returns a result object with:
- state
- output
- steps
- total_cost
- total_tokens
- context
- session_messages
- tool_results
- outcome
- outcome_kind
- outcome_payload
- usage_entries
Those totals are cumulative best-known workflow totals, including resumed execution and nested roll-up, not just the last run! segment.
usage_entries is the per-agent-call billing-facts list — one Smith::Workflow::UsageEntry per agent provider call, each carrying usage_id, agent_name, model, input_tokens, output_tokens, cost, attempt_kind (:completed_attempt or :failed_attempt), and recorded_at. Hosts persist these idempotently (the usage_id is the natural unique key) and use them to compute per-row charges. The collection is deep-copied on RunResult population — host mutation does not leak back into workflow state.
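Since usage_id is the natural unique key, the host-side idempotent upsert is straightforward. A sketch using a plain Hash as a stand-in for the host's datastore (field names follow the UsageEntry description; the upsert helper is host code, not a Smith API):

```ruby
# Host-side sketch: persist usage entries idempotently, keyed on usage_id,
# so replays and resumed runs never double-count a provider call.
def upsert_usage!(store, entries)
  entries.each do |entry|
    store[entry[:usage_id]] ||= entry   # insert-if-absent; replays are no-ops
  end
  store
end

store = {}
entry = {
  usage_id: "u-1", agent_name: :reply_agent, model: "gpt-4.1-nano",
  input_tokens: 120, output_tokens: 40, cost: 0.00003,
  attempt_kind: :completed_attempt
}
upsert_usage!(store, [entry])
upsert_usage!(store, [entry])  # a resumed run replaying the same entry
```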
Convenience helpers:
- terminal_output
- last_error
- failed_transition
- failure_detail
context, session_messages, tool_results, and outcome are returned as final-state snapshots for host handling code. Mutating them does not mutate workflow internals.
Typical successful step entry:
{
transition: :reply,
from: :idle,
to: :done,
output: { "status" => "ok" }
}

Typical failed step entry:
{
transition: :reply,
from: :idle,
to: :done,
error: #<Smith::AgentError ...>
}

Blank or nil agent completions are treated as agent-boundary failures, not successful step output.
Smith will fail the step with Smith::BlankAgentOutputError instead of accepting an empty assistant turn.
Use advance! when you want stepwise execution instead of running the workflow to completion in one call.
workflow = ResearchWorkflow.new
first_step = workflow.advance!
workflow.state
# => :researching
result = workflow.run!
workflow.state
# => :done

This shows the mixed mode clearly:
- advance! executes one step
- run! then continues from the current workflow state
- in this example, run! performs the remaining work and finishes the workflow
This is useful when your host app wants to inspect or persist state between step boundaries.
Use this as the quick selection rule:
| If you need... | Use... |
|---|---|
| One guarded model call behind a workflow boundary | A single transition with execute |
| Fixed sequential stages | pipeline |
| Classification-based branching | route |
| Fan-out across N parallel calls | execute ..., parallel: true, count: N |
| Reusable subflows | workflow ChildWorkflow |
| Iterative improve-and-judge loops | optimize |
| An orchestrator delegating structured tasks to workers | orchestrate |
The rest of this README walks through those in increasing complexity.
Use this when you want one agent call with real workflow semantics around it.
class TicketReplyAgent < Smith::Agent
register_as :ticket_reply_agent
model "gpt-4.1-nano"
instructions do |_context|
"Draft a support reply that is concise, calm, and actionable."
end
end
class TicketReplyWorkflow < Smith::Workflow
initial_state :idle
state :done
state :failed
transition :reply, from: :idle, to: :done do
execute :ticket_reply_agent
on_failure :fail
end
end
result = TicketReplyWorkflow.new.run!

Why this is useful even when it looks small:
- you get a named transition
- failures route consistently
- the step is visible in result.steps
- you can later add budgets, guardrails, persistence, context, or tracing without rewriting the shape
Use this when you want sequential work, but each stage still needs its own step boundary and failure semantics.
class IntakeAgent < Smith::Agent
register_as :intake_agent
model "gpt-4.1-nano"
end
class DraftAgent < Smith::Agent
register_as :draft_agent
model "gpt-4.1-nano"
end
class ReviewWorkflow < Smith::Workflow
initial_state :idle
state :triaged
state :drafted
state :done
state :failed
transition :intake, from: :idle, to: :triaged do
execute :intake_agent
on_success :draft
on_failure :fail
end
transition :draft, from: :triaged, to: :drafted do
execute :draft_agent
on_success :finish
on_failure :fail
end
transition :finish, from: :drafted, to: :done
end

Value:
- no hidden control flow
- no prompt-level "now do step 2"
- if step 1 or step 2 fails, the failure is a real workflow event, not an accidental provider exception leaking through
Use pipeline when the flow is mechanically sequential and you do not want to hand-write each transition.
class ResearchAgent < Smith::Agent
register_as :research_agent
model "gpt-4.1-nano"
end
class OutlineAgent < Smith::Agent
register_as :outline_agent
model "gpt-4.1-nano"
end
class DraftAgent < Smith::Agent
register_as :draft_agent
model "gpt-4.1-nano"
end
class ArticleWorkflow < Smith::Workflow
initial_state :idle
state :drafted
state :failed
pipeline :draft_article, from: :idle, to: :drafted do
stage :research, execute: :research_agent
stage :outline, execute: :outline_agent
stage :draft, execute: :draft_agent
on_failure :fail
end
end

Why pipeline matters:
- you still get real step boundaries
- each stage is still visible in the step log
- the last stage output becomes the workflow result
- the generated transitions are explicit and stable, rather than hidden in a loop
Note: on_failure inside the pipeline block applies to the generated pipeline transitions as a whole.
It is not a separate per-stage custom failure policy surface.
Use route when a classifier decides which specialist transition should run next.
The classifier output must be a hash that includes:
- :route
- :confidence
Example:
class RouteDecisionSchema
# Replace this with your real RubyLLM schema object/class.
# Intended shape:
# { route: :refund, confidence: 0.91 }
end
class TriageAgent < Smith::Agent
register_as :triage_agent
model "gpt-4.1-nano"
output_schema RouteDecisionSchema
instructions do |_context|
<<~TEXT
Return a Hash with:
- :route => one of the declared route keys
- :confidence => a float between 0.0 and 1.0
TEXT
end
end
class RefundAgent < Smith::Agent
register_as :refund_agent
model "gpt-4.1-nano"
end
class GeneralSupportAgent < Smith::Agent
register_as :general_support_agent
model "gpt-4.1-nano"
end
class SupportRouterWorkflow < Smith::Workflow
initial_state :idle
state :triaged
state :refund_handled
state :general_handled
state :failed
transition :classify, from: :idle, to: :triaged do
route :triage_agent,
routes: {
refund: :handle_refund,
support: :handle_general
},
confidence_threshold: 0.75,
fallback: :handle_general
on_failure :fail
end
transition :handle_refund, from: :triaged, to: :refund_handled do
execute :refund_agent
on_failure :fail
end
transition :handle_general, from: :triaged, to: :general_handled do
execute :general_support_agent
on_failure :fail
end
end

Why this is better than "classifier prompt + if/else outside":
- route resolution is part of the workflow contract
- confidence thresholds are explicit
- invalid router outputs fail as workflow errors
- the chosen next transition is persisted and restored across resume
In practice, router outputs should be treated as structured outputs, not free-form prose.
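The resolution rule (routes map, confidence threshold, fallback) can be modeled in plain Ruby. This is a sketch of the contract as described above, not Smith's implementation:

```ruby
# Model of route resolution: a structured classifier decision in,
# the next transition name out. Illustrative, not Smith's code.
ROUTES = { refund: :handle_refund, support: :handle_general }.freeze

def resolve_route(decision, routes: ROUTES, threshold: 0.75, fallback: :handle_general)
  route = decision[:route]&.to_sym
  confidence = decision[:confidence]
  raise ArgumentError, "invalid router output" unless route && confidence

  return fallback unless routes.key?(route)   # unknown route key
  return fallback if confidence < threshold   # low-confidence decision

  routes.fetch(route)
end

resolve_route({ route: :refund, confidence: 0.91 })  # confident match
resolve_route({ route: :refund, confidence: 0.40 })  # below threshold: fallback
```

Note that a missing :route or :confidence raises rather than silently falling back, mirroring the rule that invalid router outputs fail as workflow errors.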
Use parallel execution when the same kind of work must be done across multiple branches.
class FindingAgent < Smith::Agent
register_as :finding_agent
model "gpt-4.1-nano"
budget token_limit: 8_000, cost: 0.20, wall_clock: 15
end
class ParallelResearchWorkflow < Smith::Workflow
initial_state :idle
state :done
state :failed
budget total_tokens: 60_000, total_cost: 1.50, wall_clock: 90
transition :fan_out, from: :idle, to: :done do
execute :finding_agent, parallel: true, count: 4
on_failure :fail
end
end

Why this is valuable:
- Smith treats each branch as a real invocation
- workflow budgets remain cumulative outer limits
- agent budgets still narrow each branch call
- branch failures discard step output and route through normal failure handling
- prepared input is reused consistently across branches
Use nested workflows when one part of the system deserves to be a reusable subflow with its own states and transitions.
class ChildResearchAgent < Smith::Agent
register_as :child_research_agent
model "gpt-4.1-nano"
end
class ResearchSubflow < Smith::Workflow
initial_state :idle
state :done
transition :research, from: :idle, to: :done do
execute :child_research_agent
end
end
class ParentWorkflow < Smith::Workflow
initial_state :idle
state :researched
state :done
state :failed
transition :run_research, from: :idle, to: :researched do
workflow ResearchSubflow
on_failure :fail
end
transition :finish, from: :researched, to: :done
end

What you get:
- the child workflow's final output becomes the parent step output
- parent step count stays parent-scoped
- parent and child share the outer budget ledger
- nested best-known token/cost totals roll up into the parent result
- artifact scope is preserved across nesting
Use optimize when one agent generates candidates and another agent evaluates whether the result is acceptable.
The evaluator output is expected to carry a contract like:
- accept: true/false
- feedback: ... when rejecting
- optional score
- optional converged
Example:
class TranslationEvaluationSchema
# Replace this with your real RubyLLM schema object/class.
# Intended shape:
# { accept: true/false, feedback: "...", score: 0.93 }
end
class TranslationGenerator < Smith::Agent
register_as :translation_generator
model "gpt-4.1-nano"
end
class TranslationEvaluator < Smith::Agent
register_as :translation_evaluator
model "gpt-4.1-nano"
output_schema TranslationEvaluationSchema
end
class TranslationWorkflow < Smith::Workflow
initial_state :idle
state :done
state :failed
transition :translate, from: :idle, to: :done do
optimize generator: :translation_generator,
evaluator: :translation_evaluator,
max_rounds: 3,
evaluator_schema: TranslationEvaluationSchema,
improvement_threshold: 0.05
on_failure :fail
end
end

Why this matters:
- the loop is explicit, bounded, and observable
- acceptance criteria are structured
- exhaustion, malformed evaluator output, and convergence without acceptance fail normally
- costs and token usage from the full loop roll into the workflow totals
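The loop contract itself can be modeled in a few lines of plain Ruby: generate, evaluate, feed rejection feedback into the next round, stop on acceptance or round exhaustion. The lambdas below stand in for the generator and evaluator agents; this is an illustration of the contract, not Smith's implementation:

```ruby
# Model of the evaluator-optimizer loop contract described above.
# generator and evaluator are plain lambdas standing in for agents.
def optimize_loop(generator, evaluator, max_rounds:)
  feedback = nil
  max_rounds.times do
    candidate = generator.call(feedback)
    verdict = evaluator.call(candidate)
    return { accepted: true, output: candidate } if verdict[:accept]
    feedback = verdict[:feedback]   # rejected: carry feedback into next round
  end
  { accepted: false }               # exhaustion is a normal failure path
end

generator = ->(feedback) { feedback ? "bonjour (formal)" : "salut" }
evaluator = lambda do |candidate|
  if candidate.include?("formal")
    { accept: true }
  else
    { accept: false, feedback: "use a formal register" }
  end
end

result = optimize_loop(generator, evaluator, max_rounds: 3)
```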
Use orchestrate when you need an orchestrator that can emit structured tasks for workers and later decide when the system is done.
The orchestrator can emit one of:
- tasks: [...]
- final: {...}
- stop: "...reason..."
Example schemas:
class ResearchTaskSchema
def self.required_keys = %i[task_id input]
end
class WorkerOutputSchema
def self.required_keys = %i[finding]
end
class FinalOutputSchema
def self.required_keys = %i[summary]
end
class OrchestratorDecisionSchema
# Replace this with your real RubyLLM schema object/class.
# Intended shape:
# { tasks: [...] } or { final: {...} } or { stop: "..." }
end

Example workflow:
class ResearchOrchestrator < Smith::Agent
register_as :research_orchestrator
model "gpt-4.1-nano"
output_schema OrchestratorDecisionSchema
instructions do |_context|
<<~TEXT
Return exactly one of:
- { tasks: [{ task_id:, input: }] }
- { final: { summary: ... } }
- { stop: "reason" }
TEXT
end
end
class ResearchWorker < Smith::Agent
register_as :research_worker
model "gpt-4.1-nano"
end
class ResearchProgramWorkflow < Smith::Workflow
initial_state :idle
state :done
state :failed
transition :research, from: :idle, to: :done do
orchestrate orchestrator: :research_orchestrator,
worker: :research_worker,
max_workers: 4,
max_delegation_rounds: 3,
task_schema: ResearchTaskSchema,
worker_output_schema: WorkerOutputSchema,
final_output_schema: FinalOutputSchema
on_failure :fail
end
end

Why this is valuable:
- delegation is explicit and bounded
- tasks and outputs are structured
- worker fan-out is controlled
- exhaustion and malformed orchestrator output fail as first-class workflow failures
Notes:
- the workflow helper validates task_schema, worker_output_schema, and final_output_schema
- worker execution automatically applies worker_output_schema
- the orchestrator still benefits from output_schema, so its decision shape is pushed down to the provider layer too
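Dispatch on the three decision shapes reduces to a small exclusive-key check. A plain-Ruby model of that contract (Smith performs this inside orchestrate; the helper and the returned action tuples are illustrative):

```ruby
# Model of orchestrator decision handling: exactly one of tasks/final/stop
# must be present. The action tuples returned here are illustrative.
def dispatch_decision(decision)
  keys = decision.keys & %i[tasks final stop]
  raise ArgumentError, "malformed orchestrator output" unless keys.size == 1

  case keys.first
  when :tasks then [:delegate, decision[:tasks]]  # fan out to workers
  when :final then [:finish, decision[:final]]    # terminal output
  when :stop  then [:halt, decision[:stop]]       # orchestrator gave up
  end
end

dispatch_decision(tasks: [{ task_id: 1, input: "find sources" }])
```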
Not every workflow step needs an agent. Sometimes you need small, deterministic logic inside the graph: verification, routing, normalization, or failure classification. Smith provides two transition primitives for this: compute and run.
Both yield a constrained step object — not the full workflow — and execute synchronously with no agent call, no budget consumption, and no session message output.
Use compute for steps that check prior output and decide what happens next.
transition :verify_research, from: :gathered, to: :verified do
compute do |step|
if step.tool_results.any? { |t| t[:captured]&.dig(:retryable) }
step.fail!("research temporarily unavailable", retryable: true)
end
unless step.last_output
step.write_outcome(kind: :terminal_failure, payload: { message: "no usable research output" })
step.route_to(:finish_terminal_failure)
end
step.route_to(:structure)
end
on_failure :fail
end

Use run for steps that transform or prepare workflow-local state.
transition :normalize, from: :gathered, to: :prepared do
run do |step|
step.write_context(:normalized, step.last_output&.upcase)
step.route_to(:structure)
end
end

The yielded step object exposes a narrow, read-heavy surface:
| Read | Write / Control |
|---|---|
| step.context | step.write_context(key, value) |
| step.read_context(key) | step.write_outcome(kind:, payload:) |
| step.last_output / step.output | step.route_to(:transition_name) |
| step.tool_results | step.fail!(msg, retryable:, kind:, details:) |
| step.session_messages | |
| step.current_state | |
| step.transition_name | |
- Routing: step.route_to overrides on_success. If neither is set, normal state-based resolution applies. Named transitions that do not exist fail loudly with WorkflowError.
- Failure: step.fail! raises Smith::DeterministicStepFailure (extends WorkflowError) with retryable, kind, and details metadata. Routes through on_failure like any other step failure.
- Outcome: step.write_outcome(kind:, payload:) stores a workflow-owned terminal payload without smuggling it through context. The payload is persisted with the workflow and surfaced on RunResult.outcome, RunResult.outcome_kind, and RunResult.outcome_payload.
- Context reads: step.context returns an isolated snapshot of the workflow context at step start. Mutating that snapshot does not mutate workflow state. step.read_context(key) returns a merged view — pending write_context values override the snapshot. Use read_context when you need read-after-write coherence within the same step.
- No output: Deterministic steps produce no session message output. last_output continues to mean the last agent output.
- No budget: No tokens or cost consumed.
- Persistence: Context writes and written outcomes survive to_state/from_state. The block itself (a Proc) lives on the class-level Transition and is never serialized.
- Trace: Emits :deterministic_step traces for start, success/routed, and failure. When a step writes an outcome, the trace includes outcome_kind.
- Mutual exclusivity: compute and run cannot be combined with execute, route, workflow, optimize, or orchestrate. A transition declares exactly one primary execution body.
Fallback chains are declared on the agent and stay inside one logical invocation.
class CriticalAgent < Smith::Agent
register_as :critical_agent
model "gpt-4.1"
fallback_models "gpt-4.1-mini", "gpt-4.1-nano"
end

Current behavior:
- the primary model is tried first
- fallback moves through the declared chain
- only transient upstream failures trigger fallback
- guardrail, policy, schema, budget, deadline, and workflow failures do not
- best-known token and cost accounting accumulates across attempts
- the successful attempt is priced against the model that actually handled it
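The chain-walking behavior above can be sketched in plain Ruby (illustrative only; `call_with_fallback` and `TransientUpstreamError` are hypothetical names, not Smith's internals):

```ruby
# Illustrative sketch of fallback-chain mechanics (not Smith's runtime):
# only transient upstream failures advance to the next model, and
# best-known accounting accumulates across attempts.
class TransientUpstreamError < StandardError; end

def call_with_fallback(models, usage)
  models.each do |model|
    output = yield(model)                    # one attempt on this model
    return { model: model, output: output }  # priced against this model
  rescue TransientUpstreamError
    usage[:failed_attempts] += 1             # accounting across attempts
  end
  raise "all models in the fallback chain failed"
end

usage = { failed_attempts: 0 }
result = call_with_fallback(["gpt-4.1", "gpt-4.1-mini"], usage) do |model|
  raise TransientUpstreamError if model == "gpt-4.1" # primary flakes
  "ok from #{model}"
end
result # => { model: "gpt-4.1-mini", output: "ok from gpt-4.1-mini" }
```

A non-transient error (guardrail, schema, budget) simply propagates out of the first attempt instead of being rescued.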
Smith tools extend RubyLLM tools with:
- privilege enforcement
- custom authorization
- tool guardrails
- deadline enforcement
- tool-call budgeting
- tracing
- result capture (workflow-scoped tool output collection)
Example:
```ruby
class RefundCustomer < Smith::Tool
  category :action

  capabilities do
    privilege :elevated
  end

  authorize do |context|
    context[:account_id] && context[:role] == :elevated
  end

  def perform(context:, charge_id:, reason:)
    # call your billing system here
    { refunded: true, charge_id: charge_id, reason: reason }
  end
end
```

Tools can declare a `capture_result` block to collect structured data during workflow execution. Smith stores captured data on the workflow and exposes it on `RunResult#tool_results`. Smith does not interpret the payload — the host app owns all projection.
```ruby
class WebSearch < Smith::Tool
  capture_result do |kwargs, result|
    { query: kwargs[:query], urls: extract_urls(result) }
  end

  def perform(query:)
    # search implementation
  end
end
```

After workflow execution:
```ruby
result = MyWorkflow.run_persisted!(key: "search:123", context: { topic: "AI" })

result.tool_results
# => [{ tool: "web_search", captured: { query: "AI trends", urls: ["https://..."] } }]
```

Captured tool results survive persistence — they are included in `to_state` and restored via `from_state`.
`tool_results` is designed for compact structured evidence (URLs, metadata, refs). Hosts should avoid storing large raw payloads there. If large tool outputs are needed, use artifacts and capture refs or metadata instead.
You can still use RubyLLM agent tool wiring on your agents:
```ruby
class RefundAgent < Smith::Agent
  register_as :refund_agent

  model "gpt-4.1-nano"
  tools RefundCustomer
end
```

Guardrails can be attached at either the workflow level or the agent level.
Workflow guardrails run before agent guardrails, for both inputs and outputs.
Example:
```ruby
class SupportGuardrails < Smith::Guardrails
  def require_input(payload)
    raise "missing input" if payload.nil?
  end

  def sanitize_output(payload)
    raise "empty response" if payload.nil?
  end

  def require_ticket(kwargs)
    raise "ticket_id required" unless kwargs.dig(:context, :ticket_id)
  end

  input :require_input
  output :sanitize_output
  tool :require_ticket, on: [:refund_customer]
end
```

Attach them like this:
```ruby
class GuardedAgent < Smith::Agent
  register_as :guarded_agent

  model "gpt-4.1-nano"
  guardrails SupportGuardrails
end

class GuardedWorkflow < Smith::Workflow
  guardrails SupportGuardrails

  initial_state :idle
  state :done

  transition :finish, from: :idle, to: :done do
    execute :guarded_agent
  end
end
```

Use `Smith::Context` when you want:
- persisted workflow context keys
- observation masking over session history
- injected state summaries
Example:
```ruby
class ReviewContext < Smith::Context
  persist :ticket_id, :current_findings, :source_urls

  session_strategy :observation_masking, window: 6

  inject_state do |persisted|
    <<~TEXT
      Ticket: #{persisted[:ticket_id]}
      Findings: #{persisted[:current_findings]}
      Sources: #{Array(persisted[:source_urls]).join(", ")}
    TEXT
  end
end

class ReviewWorkflow < Smith::Workflow
  context_manager ReviewContext

  initial_state :idle
  state :done

  transition :review, from: :idle, to: :done do
    execute :review_agent
  end
end
```

What Smith does for you:
- prepares masked session input at step boundaries
- injects a state summary message into that prepared input
- persists declared workflow context keys
- persists accepted session history
- preserves chosen next transitions across persistence
- supports JSON host round-trips through `to_state` and `from_state`
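Observation masking itself can be sketched in plain Ruby (an illustration of the idea behind `session_strategy :observation_masking, window: 6`, not Smith's implementation):

```ruby
# Illustrative sketch of observation masking with a window (not Smith's
# code): the most recent `window` messages pass through verbatim, while
# older tool observations are replaced with a placeholder so the model
# keeps the conversation's shape without the stale bulk.
def mask_observations(messages, window:)
  cutoff = [messages.length - window, 0].max
  messages.each_with_index.map do |msg, i|
    # Only tool observations older than the window are masked.
    next msg if i >= cutoff || msg[:role] != :tool
    msg.merge(content: "[observation masked]")
  end
end

history = [
  { role: :tool, content: "huge old search dump" },
  { role: :assistant, content: "summary of findings" },
  { role: :tool, content: "recent lookup" },
  { role: :user, content: "continue" }
]
mask_observations(history, window: 2)
# Masks only the first tool message; the last two stay verbatim.
```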
Example host-controlled persistence:
```ruby
workflow = ReviewWorkflow.new(context: {
  ticket_id: "T-1042",
  current_findings: "needs escalation",
  source_urls: ["https://example.test/refund-policy"]
})

payload = JSON.generate(workflow.to_state)
# Store payload wherever your app wants.

restored = ReviewWorkflow.from_state(JSON.parse(payload))
result = restored.run!
```

Important: Smith is resumable, but it is still your app's job to store and retrieve that state.
For the common restore-or-initialize case, Smith also exposes a one-liner over the configured persistence adapter:
```ruby
result = ReviewWorkflow.run_persisted!(
  key: "ticket:T-1042",
  context: {
    ticket_id: "T-1042",
    current_findings: "needs escalation"
  },
  on_step: ->(step) { puts "checkpointed #{step[:transition]}" },
  clear: :done
)
```

`clear: :done` is the default. Pass `clear: false` to preserve terminal state for host-managed cleanup timing, or `clear: :terminal` to clear any terminal workflow state once the run completes.
`on_step:` is a best-effort host callback. It runs after an accepted step has been checkpointed. Callback failures are logged and ignored; they do not roll back or abort durable workflow progression.
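That best-effort contract can be sketched in plain Ruby (`notify_step` is a hypothetical illustration, not Smith's internals):

```ruby
# Illustrative sketch of best-effort callback semantics (not Smith's
# code): the checkpoint is already durable when the callback runs, and
# a failing callback is logged and swallowed, never propagated.
def notify_step(on_step, step, log)
  on_step&.call(step)
rescue StandardError => e
  log << "on_step callback failed: #{e.message}"
  nil # durable progression already happened; do not roll it back
end

log = []
notify_step(->(s) { raise "boom" }, { transition: :finish }, log)
log # => ["on_step callback failed: boom"]
```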
If the persistence key is a deterministic function of workflow context, declare it once on the workflow:
```ruby
class ReviewWorkflow < Smith::Workflow
  persistence_key { |ctx| "ticket:#{ctx[:ticket_id]}" }
end

result = ReviewWorkflow.run_persisted!(
  context: {
    ticket_id: "T-1042",
    current_findings: "needs escalation"
  }
)
```

When a workflow derives its key this way, Smith persists the resolved durability key in workflow state. That keeps instance-level helpers such as `persist!`, `advance_persisted!`, and `clear_persisted!` stable across restore even when the workflow's context manager persists only a filtered subset of context keys.
If you need more explicit control, the lower-level lifecycle is still available:
```ruby
workflow = ReviewWorkflow.restore_or_initialize(
  key: "ticket:T-1042",
  context: {
    ticket_id: "T-1042",
    current_findings: "needs escalation"
  }
)

step = workflow.advance_persisted!("ticket:T-1042")

# Host app can broadcast or project progress here.
emit_progress(step)

result = workflow.run_persisted!("ticket:T-1042")
workflow.clear_persisted!("ticket:T-1042")
```

`restore(key, ...)` is intentionally stricter: it requires a non-blank explicit key, and the lookup key remains authoritative for the restored workflow even if stored state contains an embedded `persistence_key`.
These helpers do not make Smith a job system or durable runtime. They only remove repetitive restore/checkpoint boilerplate around the configured persistence adapter while leaving queueing, projection, and recovery policy with the host app.
Use artifacts when outputs are too large to keep inline.
Smith exposes:
- `Smith.artifacts.store`
- `Smith.artifacts.fetch`
- `Smith.artifacts.expired`
The common pattern is to hand off the heavy payload in `after_completion`.
```ruby
class LargeReportAgent < Smith::Agent
  register_as :large_report_agent

  model "gpt-4.1-nano"
  data_volume :unbounded

  def after_completion(result, _context)
    ref = Smith.artifacts.store(
      result[:full_report],
      content_type: "application/json"
    )

    {
      report_ref: ref,
      summary: result[:summary]
    }
  end
end
```

Configure a backend:
```ruby
Smith.configure do |config|
  config.artifact_store = Smith::Artifacts::Memory.new
  config.artifact_retention = 3600
end
```

Why this matters:
- large payloads can move out of the inline workflow result
- refs are execution-scoped
- nested workflows inherit artifact scope correctly
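A plain-Ruby sketch of the assumed store/fetch/expired contract (`SketchArtifacts` is an illustration, not `Smith::Artifacts::Memory`):

```ruby
# Illustrative in-memory artifact store (not Smith's implementation):
# `store` returns an opaque ref instead of the payload, `fetch`
# resolves a ref, and `expired` lists refs past the retention window.
class SketchArtifacts
  Entry = Struct.new(:payload, :content_type, :stored_at)

  def initialize(retention:)
    @retention = retention # seconds
    @entries = {}
  end

  def store(payload, content_type:, now: Time.now)
    ref = "artifact-#{@entries.size + 1}"
    @entries[ref] = Entry.new(payload, content_type, now)
    ref # the workflow result carries this ref, not the heavy payload
  end

  def fetch(ref)
    @entries.fetch(ref).payload
  end

  def expired(now: Time.now)
    @entries.select { |_, e| now - e.stored_at > @retention }.keys
  end
end

store = SketchArtifacts.new(retention: 3600)
ref = store.store({ report: "..." }, content_type: "application/json")
store.fetch(ref) # => { report: "..." }
```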
Smith supports two budget layers.
Workflow budgets are cumulative outer limits:
```ruby
class BudgetedWorkflow < Smith::Workflow
  budget total_tokens: 100_000, total_cost: 3.00, wall_clock: 300, tool_calls: 20
end
```

Agent budgets are per-invocation narrowing constraints:
```ruby
class BudgetedAgent < Smith::Agent
  budget token_limit: 12_000, cost: 0.40, wall_clock: 20, tool_calls: 4
end
```

The naming is intentionally asymmetric:
- workflow budget dimensions are cumulative totals: `total_tokens`, `total_cost`
- agent budget dimensions are per-invocation caps: `token_limit`, `cost`
Shortcut:
- workflow budget means "how much can the whole workflow consume?"
- agent budget means "how much can this one invocation consume?"
Current budget model:
- workflow budgets are cumulative workflow truth
- agent budgets narrow individual invocations
- parallel branches honor per-branch agent budgets
- tool calls participate in budget enforcement
- denied tool calls do not leak the exact `tool_calls` budget
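The two layers can be sketched in plain Ruby (`BudgetLedger` is a hypothetical illustration of the accounting, not Smith's enforcement; only the token dimension is shown):

```ruby
# Illustrative sketch of the two budget layers (not Smith's code):
# the workflow budget is a cumulative total across the whole run,
# while the agent budget caps each individual invocation.
class BudgetExceeded < StandardError; end

class BudgetLedger
  attr_reader :used_tokens

  def initialize(total_tokens:)
    @total_tokens = total_tokens # cumulative workflow limit
    @used_tokens  = 0
  end

  def charge!(tokens, token_limit:)
    # The per-invocation cap narrows this one call...
    raise BudgetExceeded, "agent token_limit" if tokens > token_limit
    # ...while the workflow total accumulates across all calls.
    raise BudgetExceeded, "workflow total_tokens" if @used_tokens + tokens > @total_tokens
    @used_tokens += tokens
  end
end

ledger = BudgetLedger.new(total_tokens: 100_000)
ledger.charge!(8_000, token_limit: 12_000) # within both layers
```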
Cost tracking is deliberately best-known:
- Smith computes model-call cost when pricing is configured
- unknown pricing does not fabricate cost
- unknown usage does not fabricate cost or tokens
- `RunResult.total_cost` and `total_tokens` are cumulative best-known totals
- totals include resumed execution, nested roll-up, and fallback attempts where usage is known
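The best-known posture can be sketched in plain Ruby (`best_known_cost` is a hypothetical illustration assuming a catalog shaped like `config.pricing`, not Smith's internals):

```ruby
# Illustrative best-known cost calculation (not Smith's code): unknown
# pricing or unknown usage yields nil rather than a fabricated number.
def best_known_cost(pricing, model:, input_tokens:, output_tokens:)
  rates = pricing[model]
  return nil unless rates && input_tokens && output_tokens

  input_tokens * rates[:input_cost_per_token] +
    output_tokens * rates[:output_cost_per_token]
end

pricing = {
  "gpt-4.1-nano" => {
    input_cost_per_token: 0.0000001,
    output_cost_per_token: 0.0000004
  }
}

best_known_cost(pricing, model: "gpt-4.1-nano",
                input_tokens: 1_000, output_tokens: 500)
# dollars when pricing and usage are known; nil otherwise
```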
Example pricing configuration:
```ruby
Smith.configure do |config|
  config.pricing = {
    "gpt-4.1-nano" => {
      input_cost_per_token: 0.0000001,
      output_cost_per_token: 0.0000004
    }
  }
end
```

Smith can emit structural traces for:
- transitions
- tool calls
- token usage
- cost
Example:
```ruby
Smith.configure do |config|
  config.trace_adapter = Smith::Trace::Logger
  config.trace_transitions = true
  config.trace_tool_calls = true
  config.trace_token_usage = true
  config.trace_cost = true
  config.trace_content = false
end
```

Built-in adapters include:
- `Smith::Trace::Memory`
- `Smith::Trace::Logger`
- `Smith::Trace::OpenTelemetry`
The default posture is structural tracing with content omitted unless you opt in.
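That posture can be sketched in plain Ruby (`filter_trace` is a hypothetical illustration of field allowlisting and content omission, not Smith's adapter contract):

```ruby
# Illustrative sketch of structural-trace filtering (not Smith's code):
# apply a per-event-type field allowlist and drop content unless
# content tracing is explicitly enabled.
def filter_trace(event, trace_fields:, trace_content: false)
  data = event[:data]
  allowed = trace_fields[event[:type]]
  data = data.slice(*allowed) if allowed # allowlisted fields only
  data = data.reject { |k, _| k == :content } unless trace_content
  { type: event[:type], data: data }
end

event = {
  type: :tool_call,
  data: { tool: "web_search", duration: 0.4, content: "raw output" }
}
filter_trace(event, trace_fields: { tool_call: %i[tool duration content] })
# => { type: :tool_call, data: { tool: "web_search", duration: 0.4 } }
```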
There are three different configuration scopes.
Use `Smith.configure` for shared runtime services:
- artifact backend
- tracing
- pricing catalog
- logger
Use agent classes for invocation behavior:
`model`, `tools`, `instructions`, `temperature`, `thinking`, `budget`, `guardrails`, `output_schema`, `data_volume`, `fallback_models`, `register_as`
Use workflow classes for orchestration behavior:
`initial_state`, `state`, `transition`, `pipeline`, `budget`, `max_transitions`, `guardrails`, `context_manager`
- "Which model should this agent use?" -> agent class
- "How do I store artifacts or emit traces?" -> `Smith.configure`
- "What happens after this step succeeds or fails?" -> workflow class
- "How many tokens/cost/tool calls can this one invocation use?" -> agent budget
- "How much total budget can the whole workflow consume?" -> workflow budget
- "Which provider credentials should the app use?" -> RubyLLM, not Smith
```ruby
Smith.configure do |config|
  config.artifact_store = Smith::Artifacts::Memory.new
  config.artifact_retention = 3600
  config.artifact_encryption = :none
  config.artifact_tenant_isolation = false

  config.trace_adapter = Smith::Trace::Logger
  config.trace_transitions = true
  config.trace_tool_calls = true
  config.trace_token_usage = true
  config.trace_cost = true
  config.trace_fields = {
    transition: %i[transition from to],
    tool_call: %i[tool duration]
  }
  config.trace_content = false
  config.trace_retention = 86_400
  config.trace_tenant_isolation = false

  config.pricing = {
    "gpt-4.1-nano" => {
      input_cost_per_token: 0.0000001,
      output_cost_per_token: 0.0000004
    }
  }

  config.logger = Logger.new($stdout)
end
```

| Setting | What it controls | Typical first use |
|---|---|---|
| `artifact_store` | Where large handoff payloads are stored | Start with `Smith::Artifacts::Memory.new` |
| `artifact_retention` | Default retention window for artifact expiry checks | Set once you have a cleanup policy |
| `artifact_encryption` | Metadata-level encryption policy flag | Leave at default until you wire a real backend |
| `artifact_tenant_isolation` | Require namespaced artifact writes | Enable in multi-tenant systems |
| `trace_adapter` | Where structural traces go | Use `Smith::Trace::Memory` or `Smith::Trace::Logger` first |
| `trace_transitions` | Emit transition traces | Usually leave on |
| `trace_tool_calls` | Emit tool call traces | Usually leave on |
| `trace_token_usage` | Emit usage traces | Useful for budget visibility |
| `trace_cost` | Emit cost traces | Useful once pricing is configured |
| `trace_fields` | Allowlist structural trace fields | Use when you want tighter trace output |
| `trace_content` | Whether content appears in traces | Leave `false` first |
| `trace_retention` | Trace retention policy hook | Useful when traces leave memory |
| `trace_tenant_isolation` | Trace multi-tenant isolation flag | Enable in multi-tenant systems |
| `pricing` | Best-known model-call cost catalog | Add once you care about `total_cost` |
| `logger` | Smith's runtime logger | Usually the first setting to add |
Add settings in this order:
1. `config.logger`
2. `config.trace_adapter`
3. `config.artifact_store`
4. `config.pricing`
Do not start by configuring every advanced switch at once.
If you are evaluating Smith seriously before release:
- treat this README as a guide, not a frozen contract
- pin the exact commit you depend on
- check `spec/SPEC_MATRIX.md` for what is directly covered
- verify the specific runtime seam you care about in the specs
The project is already useful for exploring workflow-first agent design, but the public surface is still settling.
```shell
bundle install
bundle exec rspec
```

Smith is for Ruby teams that want agent systems with:
- explicit orchestration
- composable multi-agent patterns
- real budgets and guardrails
- resumable workflow state
- artifacts and tracing
- enough structure to build serious applications without pretending prompts are control flow
If that is the layer you need, Smith is the interesting part of the stack.