29 May 19:38

dca833e

v1.0.0 Latest

Latest

odek v1.0.0 — First Stable Release

Minimal Go autonomous agent runtime — 385 commits, 191 releases, one binary.

What is odek?

odek is a runtime, not a framework. It's the smallest possible surface area between an LLM and your tools: a single ~12 MB static binary, zero frameworks (stdlib + 2 packages), instant startup.

At its core is a ReAct loop (Reasoning + Acting): observe → think → act → repeat. The LLM reasons about the current state, decides what to do, and odek executes those actions — in parallel when possible, with systematic recovery when things fail.

$ go install github.com/BackendStack21/odek/cmd/odek@v1.0.0
$ export ODEK_API_KEY=sk-...
$ odek run "Run the tests and fix any failures"

The Journey to 1.0

Milestone	What shipped
v0.1.0 – v0.40.0	Core loop, tool registry, CLI, REPL, browser, file tools, MCP server, Docker sandbox
v0.41.0 – v0.52.0	Systematic tool-failure recovery, persistent memory (facts + episodes), Telegram bot, parallel tool execution, batch approval gate, Web UI, session resolver, security hardening
v0.53.0 – v0.55.0	Context-limit protection (trimToSurvival), sub-agent delegation, skill auto-learning, bypass-resistant danger classifier
v0.56.0 – v0.58.0	Async post-processing (no more hang), semantic session search, artifact-aware file search, MCP client, episode + skill provenance gating, FD-based API key handoff
v1.0.0	Audit system with divergence heuristic, untrusted-content wrapper with per-call nonce, approver friction mode, sub-agent risk caps, UI refactor — stability and security complete

385 commits. 191 tagged releases. One binary. We shipped fast, fixed fast, and never let a regression survive longer than a release.

Architecture at a Glance

CLI / REPL / Web UI / Telegram bot
            │
     ┌──────▼──────┐
     │  ReAct Loop  │  observe → think → parallel-act → repeat
     │  (300 iter)  │
     └──────┬──────┘
            │
   ┌────────┼────────┐
   ▼        ▼        ▼
Tools    Memory   Sub-agents
(25+)   (3-tier)  (up to 8)

Core Engine

Parallel tool execution — independent tool calls run concurrently (default: 4, configurable)
Batch approval gate — multiple risky tools shown in a single prompt, reducing fatigue
Context-limit protection — trimToSurvival drops oldest messages when approaching the model's context window, keeping the agent functional under extended sessions
Tool-failure recovery — systematic recovery: retry transient errors, skip permanently failed tools, continue without crashing
Async post-processing — skill learning and episode extraction run in background goroutines (eliminated the 2-5 second hang after every run)
Interaction modes — engaging (narrated), enhance (persistent), verbose (raw), off

Security — 12-Layer Defense

odek is an LLM agent that executes shell commands, reads/writes files, fetches URLs, and spawns sub-agents. That capability is the point. It's also the security problem. v1.0.0 ships layered defenses against prompt injection and approval fatigue:

#	Layer	What it does
1	Sandboxed execution	Isolated Docker container per session — no network, no host mounts beyond cwd, zero capabilities, destroyed on exit. `odek serve` enables it by default.
2	Untrusted-content wrapper	Every tool output from outside the trust boundary (browser, shell, read_file, MCP tools, transcribe) is wrapped in `<untrusted_content_<nonce>>`. Per-call nonce defeats wrapper-escape attacks.
3	Audit log + divergence heuristic	Every ingest is recorded with source + content-hash + turn. After each turn, a heuristic flags `suspicious_divergence` when the agent references resources the user didn't mention. Inspect with `odek audit <session-id>`.
4	Tainted memory episodes	Episodes from sessions that ingested untrusted content are stored but never auto-replayed. `Search()` filters them out.
5	Skill provenance gate	Skills auto-learned from untrusted contexts are pinned to `Lazy` (never auto-load). `odek skill promote` clears the flag after user review.
6	Sub-agent risk caps	`delegate_tasks` carries `trust_level` + `max_risk`. Untrusted → all dangerous actions forced to Deny. `max_risk` → everything above cap Deny.
7	FD-based API key handoff	Parent writes key to a 0600 tempfile, immediately `unlink()`s, passes the FD via `cmd.ExtraFiles`. Key never in `/proc/<pid>/environ`.
8	Bypass-resistant classifier	`normalize()` expands `$IFS`, extracts `$()` and backtick substitutions, strips `command`/`exec`/`builtin` wrappers, collapses unquoted backslashes, basenames absolute paths.
9	Approver friction mode	After 3 approvals of the same class in 60 seconds: requires typing literal `approve`, enforces 1.5s pause. Disabled shortcut for `destructive` + `blocked` regardless.
10	WS Origin allowlist	Rejects non-localhost WebSocket upgrades. Closes CSRF-on-localhost.
11	Secret redaction	20+ patterns: OpenAI, Anthropic, GitHub PAT, AWS, PEM, JWT, Vault, Google OAuth, SendGrid, Discord, DB URLs.
12	Regression bar	Every documented mitigation has a corresponding test in `security_report_validation_test.go`.

Full threat model: docs/SECURITY.md

What's in the Binary

25+ Built-in Tools (zero subprocess forks)

read_file, write_file, search_files, patch, batch_read, batch_patch, glob, file_info, shell, parallel_shell, browser, http_batch, math_eval, diff, count_lines, multi_grep, json_query, tree, checksum, sort, head_tail, base64, tr, word_count, transcribe, delegate_tasks, session_search

Persistent Memory — 3 Tiers

Facts — agent-managed durable key-value entries
Session buffer — auto-appended turn summaries
Episodes — LLM-extracted knowledge from past sessions. Merge-on-write via go-vector RandomProjections (cosine >0.7 auto-merges, <0.3 auto-adds). Saves ~80% LLM calls.

Skill System (on by default)

Skill-matched SKILL.md files load on-demand. Auto-learns patterns from every session — detects multi-step procedures, error recoveries, repeated actions, and user corrections. LLM-enriched with names, descriptions, triggers, and structured bodies. Import from any URI with automatic LLM risk assessment.

Sub-Agent Delegation

Parallel OS-process sub-agents via delegate_tasks. True isolation — each sub-agent is a fresh odek subagent process with its own config, tools, and timeout. Up to 8 concurrent workers. Risk-based trust caps.

MCP — Model Context Protocol

Full server implementation (stdio + SSE transport) and client (connect to external MCP servers). Tools are discovered and usable within the agent loop.

Platform Support

CLI, REPL (with raw-mode terminal editor), Web UI (HTTP + WebSocket), Telegram bot — all from one binary.

Performance

Metric	Value
Binary size	~12 MB (static)
Startup time	Instant (< 50ms)
Dependencies	5 packages (3 stdlib + 2 focused)
Benchmark	AIEB v2.0 — 80.3% (highest published agent score)
Test coverage	200+ unit + E2E tests across all tools

Breaking Changes from v0.x

None. v1.0.0 is backwards-compatible with all v0.58.x configurations and workflows. The 1.0 designation marks stability, not a rewrite.

Upgrade

go install github.com/BackendStack21/odek/cmd/odek@v1.0.0
odek --version  # → odek v1.0.0

What's Next

1.0 means the core is stable. Upcoming:

Streaming tool output — real-time shell and browser output in the terminal
Multi-model routing — route different workloads to different LLMs automatically
Remote sandbox — execute in cloud VMs, not just local Docker
Plugin system — load external tools as shared libraries

385 commits. 191 releases. 1 binary. Let's build.

Full Changelog: v0.58.8...v1.0.0

Assets 7

26 May 07:09

molty3000

v0.58.8

6b5c61f

v0.58.8 — Archive sessions on /new, fix deepsearch test

Features

archive sessions on /new instead of deleting + fix deepsearch test

Documentation

reverse CHANGELOG order to newest-first
regenerate full CHANGELOG.md from git history via generate-changelog.sh
deprecate manual changelog edits — point to generate-changelog.sh

Infrastructure

add generate-changelog.sh — conventional-commit changelog generator

Full Changelog: v0.58.7...v0.58.8

Assets 7

26 May 06:26

molty3000

v0.58.7

cbdd5b3

v0.58.7 — Dynamic release badge on landing page

Changes

🌐 Landing Page

Replaced the hardcoded v0.48.0 version badge in the hero section with a dynamic Shields.io badge linked to GitHub Releases
Zero JavaScript — the badge auto-updates via CDN-cached metadata from the latest release tag
Clicking the badge now takes you straight to the releases page

Full Changelog: v0.58.6...v0.58.7

Assets 7

26 May 06:10

molty3000

v0.58.6

5d6ed2b

v0.58.6 — session recall edge-case tests

Tests

6 new edge-case tests for the session recall pipeline

TestSessionSearch_DeepSearchTwoTokens — Verifies the v0.58.4 fix: a session where only "changes" appears in message content does NOT match query "go-vector changes". A session with both "go-vector" AND "changes" DOES match. Prevents the false positive that plagued the events fetcher analysis session.

TestSessionSearch_GetReturnsSessionMessages — Verifies the v0.58.3 fix: get returns the full session_messages array with correct role and content for every user/assistant message. System messages are excluded.

TestSessionSearch_PreSavePersistence — Verifies the v0.58.5 fix: a session saved to the Store is immediately findable by session_search. This simulates the pre-agent-loop save that ensures the current turn's data is visible to search tools inside the ReAct loop.

TestSessionSearch_DeepSearchEdgeCases — Three sub-tests:

Empty messages don't cause panics
System-only messages are excluded from deepSearch matching
Unicode content with two matching tokens works correctly

Full Changelog: v0.58.5...v0.58.6

Assets 7

26 May 06:07

molty3000

v0.58.5

3d43ef8

v0.58.5 — save user message before agent loop

Fixes

Telegram bot: save user message before agent loop

session_search inside the agent loop could never find the current turn's data. The user message was appended to an in-memory slice (line 998 of telegram.go) but only persisted to disk AFTER RunWithMessages completed (line 1534).

The entire ReAct loop ran with the current turn's messages invisible to both:

Vector search (Phase 1) — the index didn't have the current content
Deep search (Phase 2) — Store.Load() read a stale file from disk

Now the user message is saved to the Store immediately after being appended, using a direct Store.Save() call that bypasses the TurnCount increment. The normal end-of-turn save at line 1534 still runs and overwrites with the final state (including tool results and bot responses).

This ensures that any session_search call inside the agent loop can find the current turn's conversation content on disk and in the vector index.

Full Changelog: v0.58.4...v0.58.5

Assets 7

26 May 05:51

molty3000

v0.58.4

43ba6de

v0.58.4 — deepSearch requires 2+ distinct token matches

Fixes

session_search no longer matches on a single common word

Query "go-vector changes" was matching the events fetcher analysis session because "changes" appeared once in a message like "Events changed: +8 -8 = 10 total". deepSearch accepted any single token match across 100+ messages.

Now deepSearch tracks distinct matched tokens and requires at least 2 (or all for single-token queries). A single common word like "changes", "release", or "update" can no longer qualify an unrelated session.

Tool description updated

Added guidance telling the LLM to use get after search to read the actual conversation content. In v0.58.3, get was updated to return full session_messages but the LLM didn't know to use it.

Full Changelog: v0.58.3...v0.58.4

Assets 7

26 May 05:37

molty3000

v0.58.3

12d4add

v0.58.3 — session message content + recursive glob

Fixes

session_search `get` now returns actual message content

Previously get only returned message count + buffer summaries. The LLM couldn't read what was actually said in past sessions — it only saw 2-line buffer snippets. Now session_messages includes every user and assistant message with role + content, so the bot can directly read and understand past conversations.

`glob` tool now supports `**` recursive patterns

Go's filepath.Match and filepath.Glob don't support ** (globstar) — they treat ** as literal * characters. When the bot called glob {"pattern":"**/*.json","path":"..."}, it got {"matches":null} every time. The pattern **/*.json was silently failing because filepath.Match("**/*.json", path) never matches anything.

Now ** patterns are detected and converted to equivalent regex (e.g. **/*.json -> ^.*/[^/]*\.json$), so recursive globs actually work.

Full Changelog: v0.58.2...v0.58.3

Assets 7

26 May 05:24

molty3000

v0.58.2

f4ffc8a

v0.58.2 — stale vector cleanup on prune

Fixes

/prune no longer leaves orphaned vectors

Store.Cleanup() primary path (with index) bypassed Store.Delete() and directly removed session files + index entries — but never called Vec.Remove(). Every /prune command left stale vectors in vectors.gob.

The fallback path (no index) was correct — it used Store.Delete() which includes Vec.Remove().

Now the index-based path also calls Vec.Remove(id) alongside file removal.

Impact: No data corruption (stale vectors are skipped during search since Load() returns nil for deleted files, and the threshold filter vr.Score < 0.40 drops them). But the store accumulated uncompacted garbage that would never be cleaned up.

Full Changelog: v0.58.1...v0.58.2

Assets 7

26 May 05:19

molty3000

v0.58.1

9e3ef5e

v0.58.1 — session_search false-positive fix

Fixes

session_search no longer returns garbage results

Problem: Two bugs caused handleSearch to return "say hello" sessions as false positives:

Vector score threshold too low (0.05): Random Projections (bag-of-words) matches generic tech queries against "say hello" sessions at 0.30-0.36. Querying "odek project molty agent skill" returned irrelevant hello sessions with scores above the old threshold.
Deep search pool too narrow (20 sessions): Keyword fallback only searched the 20 most recent sessions. With 115+ "say hello" heartbeat tests occupying the recent list, substantive older sessions were never reached.

Fix:

Raised vector score threshold to 0.40 — only strong matches pass Phase 1
Changed deep search to List(0) — scans ALL sessions, not just the 20 most recent

Full Changelog: v0.58.0...v0.58.1

Assets 7

26 May 04:53

github-actions

v0.58.0

086f212

v0.58.0

Full Changelog: v0.57.0...v0.58.0

Assets 7

Releases: BackendStack21/odek

v1.0.0

odek v1.0.0 — First Stable Release

What is odek?

The Journey to 1.0

Architecture at a Glance

Core Engine

Security — 12-Layer Defense

What's in the Binary

25+ Built-in Tools (zero subprocess forks)

Persistent Memory — 3 Tiers

Skill System (on by default)

Sub-Agent Delegation

MCP — Model Context Protocol

Platform Support

Performance

Breaking Changes from v0.x

Upgrade

What's Next

Uh oh!

v0.58.8 — Archive sessions on /new, fix deepsearch test

Features

Documentation

Infrastructure

Uh oh!

v0.58.7 — Dynamic release badge on landing page

Changes

🌐 Landing Page

Uh oh!

v0.58.6 — session recall edge-case tests

Tests

6 new edge-case tests for the session recall pipeline

Uh oh!

v0.58.5 — save user message before agent loop

Fixes

Telegram bot: save user message before agent loop

Uh oh!

v0.58.4 — deepSearch requires 2+ distinct token matches

Fixes

session_search no longer matches on a single common word

Tool description updated

Uh oh!

v0.58.3 — session message content + recursive glob

Fixes

session_search get now returns actual message content

glob tool now supports ** recursive patterns

Uh oh!

v0.58.2 — stale vector cleanup on prune

Fixes

/prune no longer leaves orphaned vectors

Uh oh!

v0.58.1 — session_search false-positive fix

Fixes

session_search no longer returns garbage results

Uh oh!

v0.58.0

Uh oh!

session_search `get` now returns actual message content

`glob` tool now supports `**` recursive patterns