AgentObs


An Elixir library for LLM agent observability.

AgentObs provides a simple, powerful, and idiomatic interface for instrumenting LLM agentic applications with telemetry events. It supports multiple observability backends through a pluggable handler architecture.

Features

  • 🎯 High-level instrumentation helpers - trace_agent/3, trace_tool/3, trace_llm/3, trace_prompt/3
  • 🤖 ReqLLM integration helpers (optional) - Automatic instrumentation for ReqLLM with token tracking and streaming support
  • 🔌 Pluggable backend architecture - Support for multiple observability platforms
  • 🌟 OpenInference support - Full semantic conventions for Arize Phoenix
  • 📊 Rich metadata tracking - Token usage, costs, tool calls, and more
  • 🚀 Built on OTP - Supervised handlers with fault tolerance
  • 🧪 Backend-agnostic - Standardized event schema independent of backends

Architecture

AgentObs uses a two-layer architecture:

Layer 1: Core Telemetry API (Backend-Agnostic)

  • Leverages Elixir's native :telemetry ecosystem
  • Provides high-level helpers for instrumenting agent operations
  • Defines standardized event schemas

Layer 2: Pluggable Backend Handlers

  • Phoenix handler with OpenInference semantic conventions
  • Generic OpenTelemetry handler
  • Extensible to other platforms (Langfuse, Datadog, etc.)
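The backend layer can also be extended by attaching to the standardized events directly. A minimal sketch of a custom handler, assuming the events follow a `[:agent_obs, <span_type>, <phase>]` naming convention (the event names and handler contract here are assumptions, not the library's documented API):

```elixir
defmodule MyApp.Handlers.CustomBackend do
  @moduledoc """
  Illustrative custom handler sketch. The event names below are
  assumptions based on common :telemetry conventions.
  """

  def attach do
    :telemetry.attach_many(
      "my-custom-backend",
      [
        [:agent_obs, :agent, :start],
        [:agent_obs, :agent, :stop],
        [:agent_obs, :llm, :stop],
        [:agent_obs, :tool, :stop]
      ],
      &__MODULE__.handle_event/4,
      nil
    )
  end

  def handle_event(event, measurements, metadata, _config) do
    # Forward the event to your platform of choice here.
    IO.inspect({event, measurements, metadata}, label: "agent_obs")
  end
end
```

Attaching directly like this bypasses the supervised handler tree; a real handler would implement whatever behaviour the library defines for its `handlers:` configuration.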

Installation

Add agent_obs to your list of dependencies in mix.exs:

def deps do
  [
    {:agent_obs, "~> 0.1.0"}
  ]
end

Quick Start

1. Configure AgentObs

# config/config.exs
config :agent_obs,
  enabled: true,
  handlers: [AgentObs.Handlers.Phoenix]

# config/runtime.exs (for Arize Phoenix)
config :opentelemetry,
  span_processor: :batch,
  resource: [service: [name: "my_llm_agent"]]

config :opentelemetry_exporter,
  otlp_protocol: :http_protobuf,
  otlp_endpoint: System.get_env("ARIZE_PHOENIX_OTLP_ENDPOINT", "http://localhost:6006"),
  otlp_headers: []
# Note: /v1/traces is automatically appended by the exporter

2. Instrument Your Agent

defmodule MyApp.WeatherAgent do
  def get_forecast(city) do
    AgentObs.trace_agent("weather_forecast", %{input: "What's the weather in #{city}?"}, fn ->
      # Call LLM to determine tool to use
      {:ok, tool_call, _metadata} = call_llm_for_planning(city)

      # Execute the tool
      {:ok, weather_data} = AgentObs.trace_tool("get_weather_api", %{
        arguments: %{city: city}
      }, fn ->
        {:ok, %{temp: 72, condition: "sunny"}}
      end)

      # Return final result
      {:ok, "The weather in #{city} is #{weather_data.condition}", %{
        tools_used: ["get_weather_api"],
        iterations: 1
      }}
    end)
  end

  defp call_llm_for_planning(city) do
    AgentObs.trace_llm("gpt-4o", %{
      input_messages: [%{role: "user", content: "Get weather for #{city}"}]
    }, fn ->
      # Simulate an LLM API call (call_openai/1 is a private helper, not shown)
      response = call_openai(city)

      {:ok, response, %{
        output_messages: [%{role: "assistant", content: response}],
        tokens: %{prompt: 50, completion: 25, total: 75},
        cost: 0.00012
      }}
    end)
  end
end

3. View Traces in Arize Phoenix

Start a local Phoenix instance:

docker run -p 6006:6006 -p 4317:4317 arizephoenix/phoenix:latest

Navigate to http://localhost:6006 to view your traces with:

  • Rich chat message visualization
  • Token usage and cost tracking
  • Tool call inspection
  • Nested span relationships

Handlers

Phoenix Handler (OpenInference)

Translates events to OpenInference semantic conventions for Arize Phoenix:

config :agent_obs,
  handlers: [AgentObs.Handlers.Phoenix]

Generic Handler (Basic OpenTelemetry)

Creates basic OpenTelemetry spans without OpenInference:

config :agent_obs,
  handlers: [AgentObs.Handlers.Generic]

Multiple Handlers

Use multiple backends simultaneously:

config :agent_obs,
  handlers: [
    AgentObs.Handlers.Phoenix,  # For detailed LLM observability
    AgentObs.Handlers.Generic   # For APM integration
  ]

ReqLLM Integration (Optional)

For applications using ReqLLM, AgentObs provides high-level helpers that automatically instrument LLM calls with full observability:

# Add to your deps
{:req_llm, "~> 1.0.0-rc.7"}

# Non-streaming text generation
{:ok, response} =
  AgentObs.ReqLLM.trace_generate_text(
    "anthropic:claude-3-5-sonnet",
    [%{role: "user", content: "Hello!"}]
  )

text = ReqLLM.Response.text(response)

# Streaming text generation
{:ok, stream_response} =
  AgentObs.ReqLLM.trace_stream_text(
    "anthropic:claude-3-5-sonnet",
    [%{role: "user", content: "Tell me a story"}]
  )

stream_response.stream
|> Stream.filter(&(&1.type == :content))
|> Stream.each(&IO.write(&1.text))
|> Stream.run()

# Structured data generation
schema = [name: [type: :string, required: true], age: [type: :pos_integer]]

{:ok, response} =
  AgentObs.ReqLLM.trace_generate_object(
    "anthropic:claude-3-5-sonnet",
    [%{role: "user", content: "Generate a person"}],
    schema
  )

object = ReqLLM.Response.object(response)
#=> %{name: "Alice", age: 30}

Benefits:

  • Automatic token usage extraction
  • Automatic tool call parsing
  • Works across all ReqLLM providers (Anthropic, OpenAI, Google, etc.)
  • Supports both streaming and non-streaming
  • Structured data generation with schema validation
  • Bang variants (!) for convenience

See the demo agent and ReqLLM integration guide for complete examples.

API Reference

High-Level Instrumentation

  • trace_agent/3 - Instruments agent loops or invocations
  • trace_tool/3 - Instruments tool calls
  • trace_llm/3 - Instruments LLM API calls
  • trace_prompt/3 - Instruments prompt template rendering
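`trace_prompt/3` is the one helper not shown in the Quick Start; a hedged sketch, assuming it follows the same `{:ok, result, metadata}` callback shape as the other helpers (the metadata keys are assumptions):

```elixir
# Render a prompt template inside a traced span (sketch only).
template = "Summarize the following text: <%= text %>"

{:ok, prompt, _meta} =
  AgentObs.trace_prompt("summarize_prompt", %{template: template}, fn ->
    rendered = EEx.eval_string(template, text: "Elixir is a dynamic language.")
    {:ok, rendered, %{variables: [:text]}}
  end)
```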

ReqLLM Helpers (Optional)

Text Generation:

  • AgentObs.ReqLLM.trace_generate_text/3 - Non-streaming text generation
  • AgentObs.ReqLLM.trace_generate_text!/3 - Non-streaming (bang variant)
  • AgentObs.ReqLLM.trace_stream_text/3 - Streaming text generation

Structured Data Generation:

  • AgentObs.ReqLLM.trace_generate_object/4 - Non-streaming structured data
  • AgentObs.ReqLLM.trace_generate_object!/4 - Non-streaming (bang variant)
  • AgentObs.ReqLLM.trace_stream_object/4 - Streaming structured data

Tool Execution:

  • AgentObs.ReqLLM.trace_tool_execution/3 - Instrumented tool execution

Stream Helpers:

  • AgentObs.ReqLLM.collect_stream/1 - Collect text stream with metadata
  • AgentObs.ReqLLM.collect_stream_object/1 - Collect object stream with metadata
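Instead of consuming the stream manually as in the Quick Start, `collect_stream/1` can gather it in one call. A sketch, assuming it returns the accumulated text together with usage metadata (the exact return shape is an assumption):

```elixir
{:ok, stream_response} =
  AgentObs.ReqLLM.trace_stream_text(
    "anthropic:claude-3-5-sonnet",
    [%{role: "user", content: "Hello!"}]
  )

# Assumed return shape: full text plus token/cost metadata.
{:ok, text, metadata} = AgentObs.ReqLLM.collect_stream(stream_response)
```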

Low-Level API

  • emit/2 - Emits custom telemetry events
  • configure/1 - Runtime configuration updates
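A hedged sketch of the low-level API; the event name and option keys shown are assumptions chosen to match the conventions above, not confirmed signatures:

```elixir
# Emit a custom telemetry event (event name is illustrative).
AgentObs.emit([:agent_obs, :cache, :lookup], %{hit: true, duration_ms: 3})

# Update configuration at runtime, e.g. to disable instrumentation:
AgentObs.configure(enabled: false)
```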

See the full documentation for detailed API reference and examples.

Testing

Running Tests

# Run all tests (unit tests only)
mix test

# Include integration tests (requires API keys)
mix test --include integration

# Run only integration tests
mix test --only integration

ReqLLM Integration Tests

The ReqLLM module includes comprehensive test coverage with 193 tests:

Unit Tests (185 tests) - Run by default, use mocked streams:

  • Stream text and object collection
  • Tool call extraction and argument parsing
  • Token usage extraction
  • Function signature validation
  • Error handling (malformed JSON, missing data)
  • Edge cases (nil values, partial data, multiple fragments)
  • All generate_text, generate_object, and stream_object variants

Integration Tests (8 tests) - Excluded by default, require real LLM API calls:

  • Real LLM streaming with telemetry verification
  • Real non-streaming text generation
  • Real structured data generation (objects)
  • Real streaming object generation
  • Real tool execution with instrumentation
  • Full agent loop with streaming and tools
  • Bang variants (!) with real API calls

To run integration tests, set one of these environment variables:

export ANTHROPIC_API_KEY=your_key  # Uses claude-3-5-haiku-latest
# OR
export OPENAI_API_KEY=your_key     # Uses gpt-4o-mini
# OR
export GOOGLE_API_KEY=your_key     # Uses gemini-2.0-flash-exp

mix test --include integration

If no API key is configured, integration tests gracefully skip without failing.

Development

Quick Commands

# Install dependencies
mix deps.get

# Run pre-commit checks (format, test, credo)
mix precommit

# Run CI checks (format check, test, credo)
mix ci

Individual Commands

# Run tests
mix test

# Format code
mix format

# Check if code is formatted
mix format --check-formatted

# Run Credo (code quality)
mix credo

# Run Credo in strict mode
mix credo --strict

# Generate documentation
mix docs

# Run Dialyzer (type checking)
mix dialyzer

Pre-commit Hook

For automatic code quality checks before commits, you can run:

mix precommit

This will:

  1. Format your code
  2. Run all tests
  3. Run Credo in strict mode
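To run these checks automatically on every commit, you can wire the task into a Git hook yourself; a minimal sketch (the hook file is something you create locally, not shipped by the library):

```shell
#!/bin/sh
# Save as .git/hooks/pre-commit and make it executable:
#   chmod +x .git/hooks/pre-commit
exec mix precommit
```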

CI Pipeline

The mix ci command is designed for continuous integration and will:

  1. Check that code is properly formatted (fails if not)
  2. Run all tests
  3. Run Credo in strict mode

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

MIT License - see LICENSE file for details.

Copyright (c) 2025 Edgar Gomes
