feat: add SQLite database support to OpenCode analyzer #120
Conversation
OpenCode has migrated from individual JSON message files to a SQLite database (opencode.db). This adds seamless support for the new format alongside the existing JSON files — no new tab, all data merges under the existing 'OpenCode' tab.

## What changed

- Parse messages from `~/.local/share/opencode/opencode.db` using the `message`, `session`, `project`, and `part` tables
- Batch-load tool call stats from the `part` table with a `LIKE` pre-filter to avoid deserializing large non-tool parts (see the sketch below)
- Deduplicate messages across JSON files and the SQLite DB during the migration period (same `global_hash` for identical messages)
- Dynamic contribution strategy: `MultiSession` when the DB exists (correct for a multi-message source), `SingleMessage` for JSON-only installs
- Watch both the legacy `storage/message/` directory and the parent `opencode/` directory for SQLite DB changes
- Accept `opencode.db` as a valid data path for file watcher events

## Refactoring

Extracted shared logic into reusable helpers:

- `compute_message_stats()` — stats computation from message + tool stats
- `build_conversation_message()` — `ConversationMessage` construction
- `json_to_conversation_message()` — legacy JSON path wrapper
- Made `OpenCodeMessage.id` and `session_id` `#[serde(default)]` so the same struct parses both full JSON files and DB data blobs (which omit those fields; they come from DB columns instead)

## Tests

Added 20 new tests covering:

- SQLite data blob parsing (assistant, user, minimal)
- Stats computation (with cost, user messages, tool stats preservation)
- `build_conversation_message` with various project hash fallbacks
- Global hash consistency between JSON and SQLite paths
- In-memory SQLite integration tests (projects, sessions, tool stats, end-to-end message conversion)
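For context, here is a minimal sketch of the batch tool-stats query described above. It assumes `rusqlite` and `serde_json`; the `ToolStats` struct, the `tool` field name, and the helper name are illustrative stand-ins, not the PR's actual code.

```rust
use rusqlite::{Connection, OpenFlags, Result};
use std::collections::HashMap;

/// Illustrative stand-in for the analyzer's per-message tool stats.
#[derive(Default)]
struct ToolStats {
    tool_calls: u64,
    files_read: u64,
}

/// Batch-load tool-call stats keyed by message id. The LIKE pre-filter skips
/// large non-tool parts (text, reasoning) before any JSON parsing happens;
/// false positives are filtered out again after deserialization.
fn batch_load_tool_stats(db_path: &str) -> Result<HashMap<String, ToolStats>> {
    let conn = Connection::open_with_flags(db_path, OpenFlags::SQLITE_OPEN_READ_ONLY)?;
    let mut stmt = conn.prepare(
        "SELECT message_id, data FROM part \
         WHERE data LIKE '%\"type\":\"tool\"%' OR data LIKE '%\"type\": \"tool\"%'",
    )?;
    let mut stats: HashMap<String, ToolStats> = HashMap::new();
    let rows = stmt.query_map([], |row| {
        Ok((row.get::<_, String>(0)?, row.get::<_, String>(1)?))
    })?;
    for row in rows {
        let (message_id, data) = row?;
        // Confirm the pre-filter hit really is a tool part.
        if let Ok(json) = serde_json::from_str::<serde_json::Value>(&data) {
            if json.get("type").and_then(|t| t.as_str()) == Some("tool") {
                let entry = stats.entry(message_id).or_default();
                entry.tool_calls += 1;
                if json.get("tool").and_then(|t| t.as_str()) == Some("read") {
                    entry.files_read += 1;
                }
            }
        }
    }
    Ok(stats)
}
```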
> **Note:** Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. This behavior is configurable in the CodeRabbit settings.
📝 **Walkthrough:** Adds read-only SQLite database support to the OpenCode analyzer alongside the existing JSON parsing path.
**Sequence Diagram**

```mermaid
sequenceDiagram
    participant Discovery as Discovery (FS/DB)
    participant FS as Filesystem (JSON)
    participant DB as SQLite DB
    participant Parser as Parser/Converter
    participant Stats as Stats Aggregator
    participant Dedup as Deduplicator
    participant Output as Output

    Discovery->>FS: detect legacy JSON message dirs
    Discovery->>DB: detect opencode.db / opencode-*.db
    par JSON path
        FS->>Parser: load JSON files
        Parser->>Stats: extract per-message tool stats from parts
        Stats-->>Parser: per-message tool stats
        Parser->>Parser: json_to_conversation_message / build_conversation_message
    and DB path
        DB->>Parser: open_db (read-only), load project/session/messages
        Parser->>Stats: batch_load_tool_stats_from_db & batch_load_step_finish_from_db
        Stats-->>Parser: aggregated tool & step-finish stats
        Parser->>Parser: parse_sqlite_messages -> build_conversation_message
    end
    Parser->>Dedup: emit messages with global hash (DB precedence)
    Dedup->>Output: unified, deduplicated messages
```
**Estimated code review effort:** 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 **Pre-merge checks:** ✅ 3 passed
Fixes clippy::field_reassign_with_default triggered by CI's `cargo clippy --tests -- -D warnings`.
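The lint fires when a struct is built with `Default::default()` and fields are then reassigned one by one. A minimal illustration of the fix pattern, using an assumed `Stats` struct rather than the project's actual type:

```rust
#[derive(Default)]
struct Stats {
    tool_calls: u64,
    files_read: u64,
    file_searches: u64,
}

fn make_stats() -> Stats {
    // clippy::field_reassign_with_default fires on:
    //     let mut tool_stats = Stats::default();
    //     tool_stats.tool_calls = 1;
    //     tool_stats.files_read = 2;
    // The fix is a single struct initialization:
    Stats {
        tool_calls: 1,
        files_read: 2,
        ..Default::default()
    }
}

fn main() {
    assert_eq!(make_stats().file_searches, 0);
}
```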
Actionable comments posted: 3
🧹 Nitpick comments (5)
src/analyzers/opencode.rs (5)
`459-477`: **Duplicated tool-stats accumulation logic.** The tool-name matching and stat incrementing in `extract_tool_stats_from_parts` (lines 459–477) and `batch_load_tool_stats_from_db` (lines 633–652) are identical. Extracting a shared helper would reduce the chance of future divergence.

♻️ Proposed shared helper
```rust
fn accumulate_tool_stat(stats: &mut Stats, tool_name: &str, value: &OwnedValue) {
    stats.tool_calls += 1;
    match tool_name {
        "read" => {
            stats.files_read += 1;
        }
        "glob" => {
            stats.file_searches += 1;
            if let Some(count) = value
                .get("state")
                .and_then(|s| s.get("metadata"))
                .and_then(|m| m.get("count"))
                .and_then(|c| c.as_u64())
            {
                stats.files_read += count;
            }
        }
        _ => {}
    }
}
```

Then both call sites become:
```diff
-        stats.tool_calls += 1;
-
-        match tool_name {
-            "read" => {
-                stats.files_read += 1;
-            }
-            "glob" => {
-                stats.file_searches += 1;
-                ...
-            }
-            _ => {}
-        }
+        accumulate_tool_stat(&mut stats, tool_name, &value);
```

Also applies to: 633-652
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/analyzers/opencode.rs` around lines 459 - 477, Extract the duplicated tool-stat logic into a helper function (e.g., fn accumulate_tool_stat(stats: &mut Stats, tool_name: &str, value: &OwnedValue)) and replace the matching blocks in extract_tool_stats_from_parts and batch_load_tool_stats_from_db with calls to this helper; the helper should increment stats.tool_calls and handle "read" and "glob" cases (including extracting the nested "state"->"metadata"->"count" as_u64 to add to stats.files_read and increment stats.file_searches for "glob"). Ensure the function is visible to both call sites (module-level) and use the same Stats and OwnedValue types as in the original code.
`883-927`: **Duplicated source-partitioning and parsing logic across methods.** `get_stats_with_sources` (lines 883–927) repeats the partition → load-context → parallel-parse-JSON → sequential-parse-DB pattern that already exists in `parse_sources_parallel_with_paths` (lines 814–860). Consider reusing `parse_sources_parallel`:

```rust
fn get_stats_with_sources(&self, sources: Vec<DataSource>) -> Result<AgenticCodingToolStats> {
    let messages = self.parse_sources_parallel(&sources);
    // ... aggregate stats from `messages` ...
}
```

This eliminates ~40 lines of duplicated logic and ensures future changes to the parsing pipeline are applied in one place.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/analyzers/opencode.rs` around lines 883 - 927, get_stats_with_sources currently reimplements the partition/load/parallel-JSON/sequential-DB parsing logic already implemented in parse_sources_parallel_with_paths (aka parse_sources_parallel); replace the duplicated block in get_stats_with_sources with a call to that parsing helper to obtain Vec<ConversationMessage> (or adapt the helper to return that type), e.g. let messages = self.parse_sources_parallel_with_paths(sources) and then aggregate stats from messages; ensure any error handling or storage_root-dependent behavior is centralized in parse_sources_parallel_with_paths and update get_stats_with_sources to use its return value for further aggregation.
`597-603`: **LIKE pre-filter may miss tool parts with unexpected JSON formatting.** The two LIKE patterns cover `"type":"tool"` and `"type": "tool"`, but won't match other valid JSON whitespace variants (e.g., `"type" : "tool"` or multi-line formatting). Since the filter is an optimization and false negatives would silently drop tool stats, consider a single broader pattern or a note documenting the assumption about OpenCode's serialization format.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/analyzers/opencode.rs` around lines 597 - 603, The current conn.prepare call uses two specific LIKE patterns that miss valid JSON spacing/formatting variants and can silently drop tool parts; replace the fragile LIKE filter with a robust check such as using SQLite JSON functions (e.g., json_extract(data, '$.type') = 'tool') or broaden the pattern to a single catch‑all before parsing, so all parts with type=="tool" are reliably detected; update the SQL string passed to conn.prepare (the query in the SELECT message_id, data FROM part ...) accordingly and ensure subsequent code that deserializes data still handles non-tool rows if you keep a looser pre-filter.
`789-805`: **`parse_source` reloads all projects & sessions for every individual JSON file.** When called in a loop (e.g., from a watcher processing one file at a time), `load_projects` and `load_sessions` are invoked per file. This is fine for one-off parses but is worth noting; the batch path (`parse_sources_parallel_with_paths`) correctly loads context once.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/analyzers/opencode.rs` around lines 789 - 805, parse_source currently calls load_projects and load_sessions on every invocation (reloading context per JSON file); change parse_source to avoid per-file reloads by accepting preloaded context or reusing a cached value: update parse_source signature to take projects and sessions (e.g., add parameters like projects: &ProjectsType, sessions: &SessionsType or a single Context struct), remove the internal calls to load_projects/load_sessions and use the supplied preloaded data, and update call sites (including parse_sources_parallel_with_paths and the watcher loop) to load projects/sessions once and pass them through; alternatively implement a small memoized/cache lookup keyed by storage_root inside parse_source if changing the signature is impractical.
`853-858`: **Consider structured logging instead of `eprintln!`.** Using `eprintln!` for error reporting (also at line 921) mixes analyzer output with stderr. If the project uses a logging framework (e.g., `tracing`), switching to `tracing::warn!` would allow log-level filtering and structured metadata.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/analyzers/opencode.rs` around lines 853 - 858, Replace the eprintln! calls in the Err(e) arms (e.g., the block that prints "Failed to parse OpenCode SQLite DB {:?}: {}" and the similar call near line 921) with structured tracing logs: import tracing and use tracing::warn! (or trace/debug/info as appropriate) with named fields for the path and error (for example: tracing::warn!(path = %source.path.display(), error = %e, "Failed to parse OpenCode SQLite DB");). Ensure you remove the eprintln! usage, add the necessary use tracing::... import, and format the message as structured metadata so the logs can be filtered and queried.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/analyzers/opencode.rs`:
- Around line 1206-1208: Replace the two-step reassignment of a default Stats
with a single struct initialization: instead of creating tool_stats via
Stats::default() and then setting tool_calls and files_read, construct
tool_stats using the struct init pattern (base it on Default::default()) and set
tool_calls and files_read inline; target the variable tool_stats and the Stats
type, replacing the existing two assignments with the combined initialization.
- Around line 47-50: The doc for has_sqlite_db() claims it checks existence and
schema but the implementation only checks file existence; either update the
comment to state it only checks existence, or implement a lightweight schema
check: in has_sqlite_db() (or a helper called from it) open the SQLite at
Self::db_path(), run a simple query against sqlite_master to ensure the expected
table (e.g., "message") exists (for example: SELECT name FROM sqlite_master
WHERE type='table' AND name='message' LIMIT 1), and return true only if the file
exists, the DB opens, and the table is present; ensure errors opening/queries
are handled and result in false (see the sketch after this list).
- Around line 705-707: The SQLite-path fallback uses session.project_id while
the JSON-path uses session.id, causing inconsistent project_hash; update the
SQLite-path fallback to use the session id instead. Locate the variables
session_title, worktree, fallback in opencode.rs and change the fallback
assignment from session.map(|s| s.project_id.as_str()) to use session.id (e.g.,
session.map(|s| s.id.as_str()) or session.map(|s| s.id.clone()) as appropriate)
so both JSON and SQLite paths use the same session_id fallback.
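For the `has_sqlite_db()` item above, here is a minimal sketch of the suggested lightweight schema check. It assumes `rusqlite`; the function name is hypothetical, and `message` is the table named in the prompt.

```rust
use rusqlite::{Connection, OpenFlags};
use std::path::Path;

/// Returns true only if the file exists, opens read-only as SQLite,
/// and contains a `message` table; any open/query error yields false.
fn has_message_table(db_path: &Path) -> bool {
    if !db_path.exists() {
        return false;
    }
    let Ok(conn) = Connection::open_with_flags(db_path, OpenFlags::SQLITE_OPEN_READ_ONLY) else {
        return false;
    };
    conn.query_row(
        "SELECT name FROM sqlite_master WHERE type='table' AND name='message' LIMIT 1",
        [],
        |_| Ok(()),
    )
    .is_ok()
}
```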
Actionable comments posted: 3
> ⚠️ **Caution:** Some comments are outside the diff and can't be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/analyzers/opencode.rs (1)
878-946: 🛠️ Refactor suggestion | 🟠 Major
**`get_stats_with_sources` duplicates `parse_sources_parallel` logic.** The JSON/SQLite partitioning, context loading, parallel parsing, and deduplication are duplicated here. Since `parse_sources_parallel` already handles all of this (including deduplication), this method could delegate to it:

♻️ Proposed refactor
```diff
 fn get_stats_with_sources(
     &self,
     sources: Vec<DataSource>,
 ) -> Result<crate::types::AgenticCodingToolStats> {
-    // Partition sources into JSON files and DB files.
-    let (db_sources, json_sources): (Vec<_>, Vec<_>) = sources
-        .iter()
-        .partition(|s| s.path.extension().is_some_and(|ext| ext == "db"));
-
-    let mut all_messages: Vec<ConversationMessage> = Vec::new();
-
-    // --- Parse JSON sources in parallel ---
-    if !json_sources.is_empty()
-        && let Some(storage_root) = Self::storage_root()
-    {
-        // ... ~30 lines of duplicated parsing ...
-    }
-
-    // --- Parse SQLite sources ---
-    for source in db_sources {
-        // ... duplicated DB parsing ...
-    }
-
-    // Deduplicate
-    let messages = crate::utils::deduplicate_by_global_hash(all_messages);
+    let messages = self.parse_sources_parallel(&sources);

     // Aggregate stats.
     let mut daily_stats = crate::utils::aggregate_by_date(&messages);
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/analyzers/opencode.rs` around lines 878 - 946, get_stats_with_sources duplicates the JSON/SQLite partitioning, context loading, parallel parsing, and deduplication already implemented in parse_sources_parallel; replace the body of get_stats_with_sources with a delegation to parse_sources_parallel and then adapt its returned messages/stats into the AgenticCodingToolStats struct. Specifically: call Self::parse_sources_parallel(sources) (ensuring storage_root/context are handled there), use the returned deduplicated messages to compute daily_stats and num_conversations (reuse crate::utils::aggregate_by_date) and construct the AgenticCodingToolStats with analyzer_name from self.display_name(); remove the duplicated json/db parsing and parse_sqlite_messages usage from get_stats_with_sources. Ensure any helper functions referenced (storage_root, parse_sources_parallel, deduplicate_by_global_hash) are used rather than reimplemented.
🧹 Nitpick comments (1)
src/analyzers/opencode.rs (1)
`853-858`: **Consider structured logging instead of `eprintln!`.** Using `eprintln!` for error reporting in a TUI application may interfere with the UI. If the project has a logging framework (e.g., `tracing` or `log`), prefer `warn!` or `error!` macros for better observability.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/analyzers/opencode.rs` around lines 853 - 858, Replace the direct stderr print in the Err arm that uses eprintln! with a structured logging macro (e.g., tracing::error! or log::error!) so the error doesn't disrupt the TUI; keep the same context by logging the source.path and the error (e), and add the appropriate use/import (tracing::error or log::error) at the top of the module; target the Err(e) => block that references source.path and e and swap eprintln! for the chosen logging macro with a clear message and structured fields if using tracing (e.g., error!(path = %source.path, error = %e, "Failed to parse OpenCode SQLite DB")).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/analyzers/opencode.rs`:
- Around line 596-603: The LIKE pre-filter in the conn.prepare call (query
string selecting from part WHERE data LIKE '%"type":"tool"%' OR data LIKE
'%"type": "tool"%') can miss tool parts with arbitrary whitespace/newlines;
update the query to either remove the LIKE filter entirely and rely on the
post-parse type check (the code path that inspects parsed JSON around line 625)
or replace the pattern with a broader match (e.g., use REGEXP/JSON functions if
supported) so that all candidate rows are returned; modify the SQL in the
conn.prepare invocation accordingly and keep the existing post-parse filtering
logic intact.
- Around line 527-530: DbSession.title is currently String but the DB title
column can be NULL, causing row.get(2)? to fail and rows to be lost; change the
DbSession struct's title field to Option<String> and update the query extraction
to use row.get::<_, Option<String>>(2)? (replace any plain row.get(2)? calls),
then adjust downstream code that expects a String (e.g., the mapping/usage
around the former line 705) to handle Option<String> safely (provide fallback or
propagate None) so sessions with NULL titles are preserved instead of being
dropped.
- Around line 270-276: The field s.cached_tokens is incorrectly set to only
tokens.cache.read; update the assignment in the block that checks msg.tokens so
s.cached_tokens = tokens.cache.write + tokens.cache.read (sum write and read)
instead of using tokens.cache.read alone; locate the code around the msg.tokens
handling where s.input_tokens, s.output_tokens, s.reasoning_tokens,
s.cache_creation_tokens and s.cache_read_tokens are set and change the
s.cached_tokens assignment accordingly.
---
Duplicate comments:
In `@src/analyzers/opencode.rs`:
- Around line 705-707: The JSON and SQLite code paths generate inconsistent
project hashes because the JSON path uses session.id while the SQLite path uses
session.project_id; pick one and make both consistent — update the JSON path
(where fallback_project_hash is computed) to use session.project_id (matching
the SQLite path) so the same fallback value is used; locate uses of session.id
and session.project_id (and related vars like fallback_project_hash,
session_title, worktree, fallback) and replace the JSON-side session.id usage
with session.project_id.as_str() (or the equivalent) so grouping is consistent.
- Around line 47-50: The doc for has_sqlite_db() is incorrect: it claims the
function checks "exists and has the expected schema" but the implementation only
tests Self::db_path().is_some_and(|p| p.exists()); either update the doc to say
it only checks file existence or extend the function to validate schema (e.g.,
open the DB at Self::db_path(), run a PRAGMA user_version or query expected
tables/columns, and return true only if schema matches). Modify the comment or
implement the schema check in has_sqlite_db() and keep references to
Self::db_path() and has_sqlite_db() so callers remain unchanged.
- Fix doc comment on has_sqlite_db() to match implementation (only checks file existence, not schema) [comment 1]
- Fix inconsistent fallback_project_hash between JSON and SQLite paths: JSON used session.id but SQLite used session.project_id, causing different project_hash values for the same message depending on which source won deduplication. Both now use session_id. [comment 2]
- Extract shared accumulate_tool_stat() helper to deduplicate the tool-name matching logic between extract_tool_stats_from_parts (JSON filesystem) and batch_load_tool_stats_from_db (SQLite). [nitpick 1]
- Collapse get_stats_with_sources() to reuse parse_sources_parallel() instead of reimplementing the partition/parse/dedup pipeline, removing ~40 lines of duplicated logic. [nitpick 2]
- Document the LIKE pre-filter assumption: OpenCode uses JSON.stringify without pretty-printing, so the two patterns cover all expected formatting. False positives are harmless (filtered in Rust). [nitpick 3]

Skipped two nitpicks that don't apply:

- parse_source per-file reload: trait method signature is fixed, the batch path already handles it, and this matches the pre-existing pattern.
- eprintln! vs tracing: the entire codebase uses eprintln!, not tracing.
I'll do this one in the next release - after the one I'm about to do.
Aah, dying for this one.
OpenCode has continued evolving their SQLite database schema since the
initial v1.1.53 migration. This adds support for the current schema
alongside the original format.
## What changed
### Channel-specific database support
- Discover and parse opencode-{channel}.db files (e.g. opencode-canary.db)
in addition to the default opencode.db
- Updated discovery, glob patterns, watch directories, and validation
to recognize channel-specific databases (see the sketch below)
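As a rough illustration of the filename validation described in this subsection, a minimal sketch follows; the helper name is hypothetical and the check is name-based only.

```rust
/// Accept opencode.db and opencode-{channel}.db (e.g. opencode-canary.db).
fn is_opencode_db(file_name: &str) -> bool {
    file_name == "opencode.db"
        || (file_name.starts_with("opencode-")
            && file_name.ends_with(".db")
            && file_name.len() > "opencode-.db".len())
}

fn main() {
    assert!(is_opencode_db("opencode.db"));
    assert!(is_opencode_db("opencode-canary.db"));
    assert!(!is_opencode_db("opencode-.db")); // empty channel name rejected
    assert!(!is_opencode_db("notes.db"));
}
```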
### Step-finish token aggregation
- When a message's data blob has zero tokens at message level, fall back
to aggregating token/cost data from step-finish parts
- This handles newer OpenCode versions where per-step accounting is the
primary data source (see the sketch below)
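A minimal sketch of this fallback, with simplified, illustrative field names rather than the PR's actual types:

```rust
/// Simplified token counts from one step-finish part.
#[derive(Default, Clone, Copy)]
struct StepTokens {
    input: u64,
    output: u64,
    reasoning: u64,
    cache_read: u64,
    cache_write: u64,
}

/// Fallback used when a message's own token counts are all zero:
/// sum the token data across that message's step-finish parts.
fn aggregate_step_finish(parts: &[StepTokens]) -> StepTokens {
    parts.iter().fold(StepTokens::default(), |mut acc, p| {
        acc.input += p.input;
        acc.output += p.output;
        acc.reasoning += p.reasoning;
        acc.cache_read += p.cache_read;
        acc.cache_write += p.cache_write;
        acc
    })
}
```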
### Updated test infrastructure
- create_test_db() now matches the real initial schema (with slug,
version, sandboxes, icon_url, etc.)
- Added create_test_db_v2() matching the current schema (workspace_id,
workspace table, composite indexes, commands column); see the sketch below
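A sketch of what an in-memory v2-style test database could look like, assuming `rusqlite`; only a few of the columns named above are modeled, so this is not the PR's actual `create_test_db_v2()`.

```rust
use rusqlite::{Connection, Result};

/// In-memory stand-in for a v2-style schema (abridged).
fn create_test_db_v2() -> Result<Connection> {
    let conn = Connection::open_in_memory()?;
    conn.execute_batch(
        "CREATE TABLE project (id TEXT PRIMARY KEY, worktree TEXT);
         CREATE TABLE workspace (id TEXT PRIMARY KEY, name TEXT);
         CREATE TABLE session (
             id TEXT PRIMARY KEY,
             project_id TEXT,
             workspace_id TEXT,
             title TEXT
         );
         CREATE TABLE message (id TEXT PRIMARY KEY, session_id TEXT, data TEXT);
         CREATE TABLE part (id TEXT PRIMARY KEY, message_id TEXT, data TEXT);
         CREATE INDEX idx_session_project ON session(project_id, id);",
    )?;
    Ok(conn)
}

#[test]
fn v2_schema_loads() {
    let conn = create_test_db_v2().unwrap();
    conn.execute("INSERT INTO project (id, worktree) VALUES ('p1', '/tmp/w')", [])
        .unwrap();
    let n: i64 = conn
        .query_row("SELECT COUNT(*) FROM project", [], |r| r.get(0))
        .unwrap();
    assert_eq!(n, 1);
}
```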
## Tests
Added 9 new tests:
- v2 schema project/session loading
- Workspace-linked sessions
- 'global' project_id fallback
- step-finish token aggregation
- step-finish fallback for zero-token messages
- v2 end-to-end with real-world data shapes
- Newer message format parsing (tools field, total tokens)
- Channel-specific DB file validation
All 229 tests pass. Clippy, fmt, and doc checks clean.
Actionable comments posted: 3
♻️ Duplicate comments (1)
src/analyzers/opencode.rs (1)
`294-301`: ⚠️ Potential issue | 🟡 Minor

**`cached_tokens` is undercounted by excluding cache writes.** Line 300 only copies `tokens.cache.read`, so total cached usage is understated whenever cache writes are present.

Proposed fix

```diff
- s.cached_tokens = tokens.cache.read;
+ s.cached_tokens = tokens.cache.read + tokens.cache.write;
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/analyzers/opencode.rs` around lines 294 - 301, The code sets s.cached_tokens to tokens.cache.read, which undercounts cached usage because it omits cache writes; update the assignment in the block handling msg.tokens so s.cached_tokens = tokens.cache.read + tokens.cache.write (or use a checked/overflow-safe add if types require) referencing s.cached_tokens, tokens.cache.read and tokens.cache.write to ensure total cached tokens include both reads and writes.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/analyzers/opencode.rs`:
- Around line 800-807: The fallback condition ignores aggregates that have
nonzero reasoning/cache metrics; update the if that checks
step_finish_map.get(&id) to consider any nonzero aggregate field (not just
agg.input || agg.output). Specifically, when retrieving agg from step_finish_map
in the block that uses msg_has_tokens, change the predicate to check (agg.input
> 0 || agg.output > 0 || agg.reasoning > 0 || agg.cache_read > 0 ||
agg.cache_write > 0) so cache-only or reasoning-only aggregates trigger the
step-finish fallback for msg tokens.
- Around line 1074-1079: The contribution_strategy() method flips between
ContributionStrategy::MultiSession and ContributionStrategy::SingleMessage based
on Self::has_sqlite_db(), which allows opencode.db appearing/disappearing at
runtime to move contributions between cache buckets and corrupt incremental
counts; make the strategy deterministic for the lifetime of the analyzer by
selecting and caching the strategy once at startup (e.g., compute and store a
fixed ContributionStrategy in the analyzer struct during construction) instead
of calling Self::has_sqlite_db() on each contribution_strategy() call so the
strategy cannot change mid-run.
- Around line 948-972: The JSON path currently pushes json_results into results
before deduping, allowing legacy JSON to win; change the load/dedup ordering so
SQLite rows take priority: either (A) load/extend results with SQLite-derived
rows first and then add JSON rows while skipping any whose global_hash is
already present, or (B) when building json_results check against an existing
HashSet of global_hashes from the already-loaded SQLite results and filter them
out. Look for json_sources, storage_root, load_projects, load_sessions,
json_results, results and the dedup step that checks global_hash (around the
current dedupe logic) and implement the skip-if-hash-exists behavior so JSON
cannot shadow richer SQLite records.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 22b25d55-ba96-4bfb-b04c-ac3eb48ecfd2
📒 Files selected for processing (1)
src/analyzers/opencode.rs
Fix 4 issues identified by CodeRabbit review:

1. cached_tokens now sums cache.write + cache.read (was read-only). Consistent with all other analyzers (piebald, kilo_code, cline, etc.)
2. DbSession.title changed to Option<String> for NULL safety. Prevents silently dropping sessions if the title column is NULL. Updated load_sessions_from_db to use row.get::<_, Option<String>> and all downstream usage to use .and_then()/.as_deref().
3. Step-finish fallback now checks reasoning/cache tokens too. Previously only checked input/output > 0, missing cases where step-finish parts had only reasoning or cache token data.
4. Dedup order swapped: SQLite parsed before JSON. SQLite records are richer (tool stats, step-finish tokens) so they should win dedup. Since deduplicate_by_global_hash keeps the first seen entry, SQLite sources are now added to results first (see the sketch below).

Dismissed 4 suggestions as not applicable:

- LIKE pre-filter whitespace: OpenCode uses JSON.stringify() (no pretty-printing), documented assumption, false positives harmless
- parse_source per-file reload: trait method used by watcher for single-file updates; batch path already optimized
- eprintln → tracing: all analyzers use eprintln consistently
- Runtime strategy flips: dismissed by PR author as unrealistic
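The dedup-order point in item 4 relies on first-wins semantics. A minimal sketch under that assumption, using a (global_hash, payload) tuple as a stand-in for the real message type:

```rust
use std::collections::HashSet;

/// Keep the first occurrence of each global hash. Because SQLite-derived
/// messages are pushed into the vector before JSON-derived ones, the richer
/// SQLite record wins whenever both sources contain the same message.
fn deduplicate_by_global_hash(messages: Vec<(String, String)>) -> Vec<(String, String)> {
    let mut seen = HashSet::new();
    messages
        .into_iter()
        .filter(|(global_hash, _)| seen.insert(global_hash.clone()))
        .collect()
}

fn main() {
    let sqlite_msg = ("opencode_ses1_msg1".to_string(), "from sqlite".to_string());
    let json_msg = ("opencode_ses1_msg1".to_string(), "from json".to_string());
    let deduped = deduplicate_by_global_hash(vec![sqlite_msg, json_msg]);
    assert_eq!(deduped.len(), 1);
    assert_eq!(deduped[0].1, "from sqlite"); // SQLite entry, added first, wins
}
```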
Replace fragile LIKE pattern matching on JSON text with proper json_extract() calls for the tool-part and step-finish-part queries.

Before: `WHERE data LIKE '%"type":"tool"%' OR data LIKE '%"type": "tool"%'`
After: `WHERE json_extract(data, '$.type') = 'tool'`

This is whitespace-agnostic and handles any valid JSON formatting. The bundled SQLite in rusqlite includes JSON1 by default.

Addresses CodeRabbit review feedback on PR #120.
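A minimal sketch of the json_extract form of the query, assuming rusqlite's bundled SQLite (JSON1 enabled); table and column names follow the commit message:

```rust
use rusqlite::{Connection, Result};

/// Count tool parts per message with json_extract, so any valid JSON
/// formatting of the data blob matches (no whitespace assumptions).
fn count_tool_parts(conn: &Connection) -> Result<Vec<(String, i64)>> {
    let mut stmt = conn.prepare(
        "SELECT message_id, COUNT(*) FROM part \
         WHERE json_extract(data, '$.type') = 'tool' \
         GROUP BY message_id",
    )?;
    let rows = stmt
        .query_map([], |row| Ok((row.get(0)?, row.get(1)?)))?
        .collect::<Result<Vec<_>>>()?;
    Ok(rows)
}
```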
cargo update aws-lc-rs, aws-lc-sys, rustls-webpki, quinn-proto, reqwest

- aws-lc-sys 0.36.0 → 0.39.1 (RUSTSEC-2026-0044/0045/0046/0047/0048)
- rustls-webpki 0.103.8 → 0.103.10 (RUSTSEC-2026-0049)
- quinn-proto 0.11.13 → 0.11.14 (RUSTSEC-2026-0037)
- aws-lc-rs 1.15.3 → 1.16.2
- reqwest 0.13.1 → 0.13.2

cargo audit now reports 0 vulnerabilities.


## Summary
OpenCode has gone through three storage format generations. This PR adds seamless support for all three alongside each other — no new tab, all data merges under the existing OpenCode tab.
The three generations:

- Legacy JSON message files under `~/.local/share/opencode/storage/message/`
- SQLite v1: `opencode.db` with basic tables
- SQLite v2: channel-specific databases (`opencode-{channel}.db`), composite indexes, and per-step token accounting

## What changed

### SQLite parsing (v1 + v2)

- Parse messages from `~/.local/share/opencode/opencode.db` using the `message`, `session`, `project`, and `part` tables
- Batch-load tool call stats from the `part` table with a `LIKE` pre-filter to avoid deserializing large non-tool parts (text, reasoning, etc.)

### Channel-specific database support (v2)

- Discover and parse `opencode-{channel}.db` files (e.g. `opencode-canary.db`) in addition to the default `opencode.db`

### Step-finish token aggregation (v2)

- When a message's data blob has zero tokens at the message level, fall back to aggregating token/cost data from `step-finish` parts
- Sums `input`, `output`, `reasoning`, `cache.read`, `cache.write`, and `cost` across all step-finish parts per message

### Seamless integration

- Deduplicates messages across JSON files and SQLite DBs via the same `global_hash` formula (`opencode_{session_id}_{msg_id}`)
- Dynamic contribution strategy: `MultiSession` when any DB exists, `SingleMessage` for JSON-only installs
- Watches both the legacy `storage/message/` directory and the parent `opencode/` directory for SQLite DB changes
- Accepts `opencode.db`, `opencode-{channel}.db`, and legacy `.json` files as valid data paths

## Refactoring

Extracted shared logic into reusable helpers:

- `compute_message_stats()` — stats computation from message + tool stats
- `build_conversation_message()` — `ConversationMessage` construction
- `json_to_conversation_message()` — legacy JSON path wrapper
- `accumulate_tool_stat()` — shared tool-stat counting for both filesystem and DB paths
- `batch_load_step_finish_from_db()` — aggregates step-finish part tokens per message
- Made `OpenCodeMessage.id` and `session_id` `#[serde(default)]` so the same struct parses both full JSON files and DB data blobs

## Compatibility

Supports all three user states simultaneously: JSON-only installs, v1 SQLite installs, and current v2 SQLite installs, including migration periods where formats coexist.

## Tests

Added 29 new tests covering the JSON format, SQLite v1, and SQLite v2:

- `build_conversation_message` with various project hash fallbacks (SQLite v1)
- `global` project_id fallback handling (SQLite v2)
- `step-finish` part token aggregation (SQLite v2)
- Channel-specific DB file validation (`opencode-canary.db` etc.)

All 229 tests pass (up from 200 on main). Clippy, fmt, and doc checks clean.

**Summary by CodeRabbit**

- New Features
- Tests