feat: Java tracing agent with end-to-end optimization pipeline by misrasaurabh1 · Pull Request #1874 · codeflash-ai/codeflash

misrasaurabh1 · 2026-03-19T06:05:42Z

Summary

Adds a complete Java tracing pipeline that captures method arguments from running Java programs, generates JUnit 5 replay tests, and feeds them into the optimization pipeline — achieving feature parity with the Python tracer.

Two-stage approach:

JFR profiling — uses Java Flight Recorder for accurate method-level timing (JIT-friendly, ~1% overhead)
Argument capture — uses a bytecode instrumentation agent (ASM) to serialize method arguments via Kryo into SQLite

The traced data is used to generate replay tests that exercise the original functions with real-world inputs, which are then used by the optimizer to verify correctness and benchmark candidates.
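The two stages amount to launching the target program twice with different JVM flags. The sketch below shows one way the commands could be assembled. `-XX:StartFlightRecording` and `-javaagent:<jar>=<options>` are standard JVM options, and the `trace=` prefix follows the AgentDispatcher description. The function name `build_stage_commands`, the file paths, and the choice to pass a JSON config path after the prefix are assumptions made for illustration, not the PR's actual code.

```python
from pathlib import Path


def build_stage_commands(
    java_args: list[str], agent_jar: Path, jfr_out: Path, config_json: Path
) -> tuple[list[str], list[str]]:
    """Return (jfr_command, agent_command) for the two tracing stages."""
    # Stage 1: Java Flight Recorder writes a profile to jfr_out.
    jfr_cmd = ["java", f"-XX:StartFlightRecording=filename={jfr_out}", *java_args]
    # Stage 2: the instrumentation agent; "trace=" selects tracer mode.
    # Passing the JSON config path after the prefix is an assumption.
    agent_cmd = ["java", f"-javaagent:{agent_jar}=trace={config_json}", *java_args]
    return jfr_cmd, agent_cmd


# Hypothetical paths and main class, for illustration only.
jfr_cmd, agent_cmd = build_stage_commands(
    ["-cp", "target/classes", "com.example.Workload"],
    Path("codeflash-runtime-1.0.0.jar"),
    Path("profile.jfr"),
    Path("tracer-config.json"),
)
```

Running the program once per stage keeps the profiling run free of instrumentation overhead, which is presumably why timing and argument capture are separated.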

Java agent (codeflash-java-runtime)

TracerAgent / TracerConfig — agent entry point, JSON config parsing
TracingTransformer / TracingClassVisitor / TracingMethodAdapter — ASM bytecode instrumentation (uses COMPUTE_MAXS to avoid classloader deadlocks)
TraceRecorder / TraceWriter — async SQLite writer with Kryo serialization timeout (500ms via CachedThreadPool)
ReplayHelper — runtime class for replay tests: deserializes args from trace DB, invokes methods via reflection
AgentDispatcher — routes to tracer mode via trace= agent arg prefix
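The serialization timeout in TraceRecorder / TraceWriter can be illustrated with a Python analogue. This is a sketch of the general technique only: it swaps in `pickle` for Kryo and `ThreadPoolExecutor` for Java's cached thread pool, and the names `serialize_with_timeout` and `slow_serializer` are hypothetical.

```python
import pickle
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout
from typing import Any, Callable, Optional

_POOL = ThreadPoolExecutor(max_workers=4)


def serialize_with_timeout(
    value: Any,
    serializer: Callable[[Any], bytes] = pickle.dumps,
    timeout_s: float = 0.5,
) -> Optional[bytes]:
    """Serialize value on a worker thread; return None if it takes too long."""
    future = _POOL.submit(serializer, value)
    try:
        return future.result(timeout=timeout_s)
    except FutureTimeout:
        # The worker cannot be interrupted; we only stop waiting for it.
        future.cancel()
        return None


def slow_serializer(value: Any) -> bytes:
    """Hypothetical slow serializer used to demonstrate the timeout."""
    time.sleep(0.2)
    return pickle.dumps(value)
```

With the default 0.5 s limit `slow_serializer` would succeed. Passing `timeout_s=0.05` makes the call give up and return `None`, so one pathological argument cannot stall the traced program.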

Python orchestration

codeflash/languages/java/tracer.py — JavaTracer two-stage flow (JFR + agent), run_java_tracer() entry point
codeflash/languages/java/replay_test.py — generates JUnit 5 replay tests from trace SQLite DB with metadata comments
codeflash/languages/java/jfr_parser.py — parses JFR files via jfr CLI tool for method-level profiling
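To make the replay test generation step concrete, here is a minimal sketch of turning trace rows into a JUnit 5 source file with a metadata comment. The `invocations` table schema, the `// codeflash-replay functions=...` comment format, and the `helper.replay(...)` argument list are all assumptions for illustration. The real replay_test.py may differ on every one of these.

```python
import sqlite3


def generate_replay_test(conn: sqlite3.Connection, test_class: str) -> str:
    """Render a JUnit 5 class with one test method per traced invocation."""
    rows = conn.execute(
        "SELECT id, class_name, method_name FROM invocations ORDER BY id"
    ).fetchall()
    functions = sorted({f"{cls}.{method}" for _, cls, method in rows})
    lines = [
        # Metadata comment so discovery does not need to analyze string args.
        f"// codeflash-replay functions={','.join(functions)}",
        "import org.junit.jupiter.api.Test;",
        "",
        f"class {test_class} {{",
        "    // helper field setup omitted in this sketch",
    ]
    for inv_id, cls, method in rows:
        lines += [
            "    @Test",
            f"    void replay_{method}_{inv_id}() throws Exception {{",
            f'        helper.replay("{cls}", "{method}", {inv_id});',
            "    }",
        ]
    lines.append("}")
    return "\n".join(lines)


# Hypothetical trace data in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE invocations "
    "(id INTEGER PRIMARY KEY, class_name TEXT, method_name TEXT)"
)
conn.executemany(
    "INSERT INTO invocations (class_name, method_name) VALUES (?, ?)",
    [("com.example.Workload", "repeatString"), ("com.example.Workload", "filterEvens")],
)
source = generate_replay_test(conn, "ReplayTest_Workload")
```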

Integration with optimizer pipeline

codeflash/tracer.py — language detection from codeflash.toml config; routes Java projects to _run_java_tracer()
codeflash/discovery/functions_to_optimize.py — _get_java_replay_test_functions() parses replay test metadata to discover traced functions
codeflash/languages/java/test_discovery.py — discovers ReplayTest_*.java files via metadata comments (static analysis can't trace helper.replay() string args)
codeflash/discovery/discover_unit_tests.py — classifies replay tests as TestType.REPLAY_TEST using TestInfo.is_replay flag
codeflash/benchmarking/function_ranker.py — JavaFunctionRanker ranks by JFR samples with min_functions=5 escape hatch for short workloads
codeflash/optimization/optimizer.py — extracts Java packages from file paths for JFR filtering; uses JavaFunctionRanker when language == "java"
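The `min_functions=5` escape hatch is not spelled out in this description. One possible reading is that when a short workload leaves too few functions above the importance threshold, the ranker falls back to the top few by sample count. The sketch below implements that reading. The function name `rank_by_samples` and the `min_share` threshold are hypothetical, not taken from JavaFunctionRanker.

```python
def rank_by_samples(
    samples: dict[str, int], min_share: float = 0.05, min_functions: int = 5
) -> list[str]:
    """Rank method names by sample count, with a fallback for tiny profiles."""
    ordered = sorted(samples, key=lambda name: samples[name], reverse=True)
    total = sum(samples.values())
    if total == 0:
        return ordered[:min_functions]
    important = [name for name in ordered if samples[name] / total >= min_share]
    # Escape hatch: a short workload may leave too few functions above the
    # threshold, so fall back to the top min_functions by sample count.
    if len(important) < min_functions:
        return ordered[:min_functions]
    return important


short_run = rank_by_samples({"a": 97, "b": 1, "c": 1, "d": 1})
long_run = rank_by_samples({f"m{i}": 10 for i in range(10)})
```

In `short_run` only `a` clears the 5% threshold, so the fallback returns all four names in order. In `long_run` every method clears it and no fallback is needed.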

Verified end-to-end

Ran the full pipeline on the Workload.java fixture:

Traced 10 invocations across 5 methods
Generated 1 replay test file with 10 test methods (correctly classified as replay tests)
Optimized 4 functions, found winning candidates for 3:
- repeatString: 2.58x faster (StringBuilder → String.repeat())
- filterEvens: 32% faster (bitwise parity, pre-sized list)
- instanceMethod: 75% faster (inlined computation)

Test plan

8 tracer e2e tests (test_java_tracer_e2e.py) — agent capture, replay test generation, two-stage orchestration
6 integration tests (test_java_tracer_integration.py) — function discovery, test discovery, JFR parsing, ranking, Maven compilation
2 replay test discovery tests (test_test_discovery.py) — metadata-based mapping, no confusion with regular tests
All 40 tracer-related tests pass; 643/644 total Java tests pass (1 pre-existing flaky timing test)
Full optimizer pipeline run on Workload fixture: trace → discover → rank → optimize → verify → benchmark
GitHub Actions workflow (e2e-java-tracer.yaml) for CI validation

🤖 Generated with Claude Code

claude · 2026-03-19T06:06:25Z

Claude finished @misrasaurabh1's task in 7s

PR Review Summary

Tasks

Prek Checks

✅ Ruff and formatting: pass (no issues)

Mypy flags 3 new errors introduced by this PR:

| File | Line | Issue |
|---|---|---|
| `jfr_parser.py` | 112–113 | `top_method_key` typed as `str \| None` used as dict index and passed to `_store_method_info`; needs narrowing |
| `base.py` | 697 | `tuple[bool, list]` is missing type parameter; should be `tuple[bool, list[Any]]` |

For jfr_parser.py:111-113, the logic is actually correct (the `matches_packages_cached` guard returns False for None), but mypy can't see through it. Fix: change the guard to `if top_method_key and matches_packages_cached(top_method_key):` to explicitly narrow the type.
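A minimal sketch of the suggested narrowing, using stand-in definitions. `matches_packages_cached` and `count_top_method` here are simplified placeholders, not the code in jfr_parser.py.

```python
from typing import Optional


def matches_packages_cached(key: Optional[str]) -> bool:
    # Stand-in for the real closure: returns False for None.
    return key is not None and key.startswith("com.example.")


def count_top_method(counts: dict[str, int], top_method_key: Optional[str]) -> None:
    # The explicit truthiness check narrows Optional[str] to str for mypy,
    # so the dict index below type-checks.
    if top_method_key and matches_packages_cached(top_method_key):
        counts[top_method_key] = counts.get(top_method_key, 0) + 1


counts: dict[str, int] = {}
count_top_method(counts, None)
count_top_method(counts, "com.example.Workload.repeatString")
count_top_method(counts, "java.lang.String.length")
```

Runtime behavior is unchanged because the original guard already rejected `None`. The added check only makes that visible to the type checker.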

Code Review

Bugs / Correctness

1. _get_java_replay_test_functions — LSP-mode silent return with invalid path (codeflash/discovery/functions_to_optimize.py:608–616)

`exit_with_message` doesn't call `sys.exit` when LSP is enabled; it logs and returns. When `trace_file_path` is set but the file doesn't exist on disk, execution falls through to `return dict(functions), trace_file_path`, returning a path to a nonexistent file. There's already a `raise AssertionError("Unreachable")` guard for the `None` case above; the same pattern should be applied here.
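The fall-through and the proposed guard can be sketched as follows. `LSP_ENABLED`, `resolve_trace_file`, and this version of `exit_with_message` are simplified stand-ins written for illustration. Only the `raise AssertionError("Unreachable")` pattern comes from the review.

```python
import logging
from pathlib import Path
from typing import Optional

logger = logging.getLogger(__name__)
LSP_ENABLED = True  # hypothetical flag for this sketch


def exit_with_message(message: str) -> None:
    """Stand-in: logs and returns in LSP mode instead of exiting."""
    logger.error(message)
    if not LSP_ENABLED:
        raise SystemExit(1)


def resolve_trace_file(trace_file_path: Optional[Path]) -> Path:
    if trace_file_path is None:
        exit_with_message("No trace file given")
        raise AssertionError("Unreachable")
    if not trace_file_path.exists():
        exit_with_message(f"Trace file not found: {trace_file_path}")
        # Same guard as the None case: never fall through with a bad path.
        raise AssertionError("Unreachable")
    return trace_file_path
```

Without the second `raise`, a caller in LSP mode would receive the nonexistent path and fail later with a less informative error.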

2. CompilationCache.clear() called unconditionally for all languages (codeflash/optimization/optimizer.py:751–753)

`cleanup_temporary_paths` always imports Java's `CompilationCache` and calls `.clear()`, even for Python and JS projects. This is a Java-specific side effect applied globally. It should be guarded with a language check (or `CompilationCache` should handle being called when empty).
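A sketch of the language guard, assuming stand-in `Language` and `CompilationCache` definitions. The real classes live elsewhere in the codebase and may look different.

```python
from enum import Enum


class Language(str, Enum):
    """Stand-in enum for this sketch."""
    PYTHON = "python"
    JAVASCRIPT = "javascript"
    JAVA = "java"


class CompilationCache:
    """Stand-in for the Java compilation cache."""
    _entries: dict[str, bytes] = {}
    clear_calls = 0

    @classmethod
    def clear(cls) -> None:
        cls.clear_calls += 1
        cls._entries.clear()


def cleanup_temporary_paths(language: Language) -> None:
    # Language-independent cleanup would go here.
    # Only touch the Java cache for Java projects.
    if language is Language.JAVA:
        CompilationCache.clear()


cleanup_temporary_paths(Language.PYTHON)
calls_after_python = CompilationCache.clear_calls
cleanup_temporary_paths(Language.JAVA)
calls_after_java = CompilationCache.clear_calls
```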

Design Issues

3. Language guard uses string literal instead of enum (codeflash/optimization/optimizer.py:385)

if functions_only and functions_only[0].language == "java":

The codebase mixes string comparisons ("java", "javascript") with Language enum usage — line 264 in functions_to_optimize.py uses Language.JAVASCRIPT. This is inconsistent and error-prone. Per language-patterns.md, the language should be compared via the Language enum. (Low priority since this pattern exists elsewhere in the codebase, but worth noting.)
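To illustrate why the enum comparison is less error-prone, here is a small sketch with a stand-in `Language` enum. The member names other than `JAVASCRIPT` and the `is_java` helper are assumptions.

```python
from enum import Enum


class Language(str, Enum):
    """Stand-in enum; the real Language enum may differ."""
    PYTHON = "python"
    JAVASCRIPT = "javascript"
    JAVA = "java"


def is_java(language: Language) -> bool:
    # A misspelled member (Language.JAAVA) raises AttributeError immediately,
    # whereas a misspelled literal ("jaava") would silently compare unequal.
    return language is Language.JAVA
```

A typo in a string literal produces a branch that is never taken, with no error at all. A typo in an enum member name fails loudly the first time the line runs and is also caught by type checkers.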

4. Binary JAR committed to git (codeflash/languages/java/resources/codeflash-runtime-1.0.0.jar)

A 16MB JAR is modified in the diff (15.95MB → 15.97MB) and stored directly in git. There's no .gitattributes LFS config for it. This permanently bloats repo clone size. The code_to_optimize/java-gradle/libs/codeflash-runtime-1.0.0.jar (14.6MB) is also newly added in this PR for the test fixture. Consider LFS or a CI download step instead.

5. _run_java_tracer broad silent exception swallowing (codeflash/tracer.py:66, 80)

_detect_non_python_language has two except Exception: pass blocks that silently swallow any errors during language detection. If, say, the config file is malformed or the file path doesn't exist, the function returns None (treats it as Python) with no feedback. At minimum, a logger.debug in the except block would help with debugging.
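A minimal sketch of the suggested logging, not the actual function. It assumes the config stores a `language = "..."` key, and the name `detect_language` and the regex-based parsing are illustrative only.

```python
import logging
import re
from pathlib import Path
from typing import Optional

logger = logging.getLogger(__name__)

_LANGUAGE_RE = re.compile(r'^\s*language\s*=\s*"(\w+)"', re.MULTILINE)


def detect_language(config_path: Path) -> Optional[str]:
    """Return the configured language, or None if it cannot be determined."""
    try:
        match = _LANGUAGE_RE.search(config_path.read_text(encoding="utf-8"))
    except Exception:
        # Still falls back to None, but leaves a trace for debugging.
        logger.debug("Language detection failed for %s", config_path, exc_info=True)
        return None
    return match.group(1) if match else None
```

The fallback behavior is the same as with `except Exception: pass`. The difference is that a malformed or missing config now shows up in debug logs.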

Duplicate Detection

No meaningful duplicates detected. parse_replay_test_metadata is defined once in replay_test.py and wrapped by _parse_replay_metadata in test_discovery.py (thin delegation, not a duplicate).

detect_packages_from_source is a new JavaTracer static method with no equivalent elsewhere.

Test Coverage

The new files have accompanying tests:

test_java_tracer_e2e.py (8 tests)
test_java_tracer_integration.py (6 tests)
test_test_discovery.py (2 new tests)

_detect_non_python_language in codeflash/tracer.py is not exercised by unit tests — it's covered only by the e2e suite. Consider adding a unit test for the LSP-mode path in _get_java_replay_test_functions given the bug identified above.

Optimization PRs

PR #1877 (JfrProfile.get_method_ranking 73% speedup): Has merge conflicts with java-tracer (due to the previously merged #1876 touching the same file) and CI unit test failures. Leaving open — PR is less than 3 days old. The conflicts need manual resolution before it can be merged.

- Use `uv run -m codeflash.main` instead of direct file path
- Remove redundant --no-pr (already hardcoded in _run_java_tracer)
- Clean up leftover replay tests between retry attempts
- Add error logging for subprocess output

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

The original code performed a linear scan over `self._ranking` on every call to `get_function_addressable_time`, which `rank_functions` invokes repeatedly (once per function to filter, plus once per function to sort). The optimized version builds a hash map `_ranking_by_name` during `__init__`, replacing the O(n) loop with an O(1) dictionary lookup. Line profiler confirms the loop and comparison accounted for 94.7% of original runtime. When `rank_functions` calls `get_function_addressable_time` dozens or hundreds of times across a 1000-method ranking (as in `test_large_number_of_methods_and_repeated_queries_perf_and_correctness`), the lookup cost drops from ~293 µs to ~10 µs per call, yielding the 1244% overall speedup. The optimization also consolidates the two calls to `get_addressable_time_ns` in `get_function_stats_summary` into a single call, stored in a local variable, eliminating redundant work.
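The lookup change described above can be sketched as follows. `RankerSketch` and the `"name"` / `"time_ns"` entry keys are hypothetical. Only the idea of building `_ranking_by_name` once in `__init__` comes from the description.

```python
from typing import Any


class RankerSketch:
    def __init__(self, ranking: list[dict[str, Any]]) -> None:
        self._ranking = ranking
        # Built once; later lookups avoid scanning the list.
        self._ranking_by_name: dict[str, dict[str, Any]] = {}
        for entry in ranking:
            # setdefault keeps the first entry per name, matching what a
            # first-match linear scan would have returned.
            self._ranking_by_name.setdefault(entry["name"], entry)

    def get_function_addressable_time(self, name: str) -> float:
        entry = self._ranking_by_name.get(name)
        return float(entry["time_ns"]) if entry else 0.0


ranker = RankerSketch(
    [
        {"name": "repeatString", "time_ns": 500},
        {"name": "filterEvens", "time_ns": 300},
        {"name": "repeatString", "time_ns": 1},
    ]
)
```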

codeflash-ai · 2026-03-19T06:42:04Z

⚡️ Codeflash found optimizations for this PR

📄 1,245% (12.45x) speedup for `JavaFunctionRanker.get_function_addressable_time` in `codeflash/benchmarking/function_ranker.py`

⏱️ Runtime : 1.14 milliseconds → 85.0 microseconds (best of 250 runs)

A dependent PR with the suggested changes has been created. Please review:

⚡️ Speed up method JavaFunctionRanker.get_function_addressable_time by 1,245% in PR #1874 (java-tracer) #1875

If you approve, it will be merged into this PR (branch java-tracer).

codeflash-ai · 2026-03-19T06:49:57Z

This PR is now faster! 🚀 @misrasaurabh1 accepted my optimizations from:

⚡️ Speed up method JavaFunctionRanker.get_function_addressable_time by 1,245% in PR #1874 (java-tracer) #1875

- Read --timeout from both config.timeout and config.tracer_timeout
- Handle multi-line /* */ block comments in package detection (aerospike source files start with license block comments before package declaration)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
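A sketch of package detection that tolerates a leading license block comment. The regex-based approach and the name `detect_package` are assumptions for illustration. The PR's `detect_packages_from_source` may work differently.

```python
import re
from typing import Optional

_BLOCK_COMMENT = re.compile(r"/\*.*?\*/", re.DOTALL)
_LINE_COMMENT = re.compile(r"//[^\n]*")
_PACKAGE = re.compile(r"^\s*package\s+([\w.]+)\s*;", re.MULTILINE)


def detect_package(source: str) -> Optional[str]:
    """Return the declared Java package, ignoring comments."""
    # Remove multi-line /* ... */ comments first, then // comments.
    stripped = _BLOCK_COMMENT.sub("", source)
    stripped = _LINE_COMMENT.sub("", stripped)
    match = _PACKAGE.search(stripped)
    return match.group(1) if match else None


example_source = (
    "/*\n"
    " * Licensed under some license.\n"
    " * package not.a.real.declaration;\n"
    " */\n"
    "package com.example.client;\n"
    "\n"
    "public class Key {}\n"
)
```

This sketch does not handle comment markers inside string literals, which is acceptable here because the package declaration precedes any code.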

…n_ranker

- Import Language from codeflash.languages (exported) not codeflash.languages.base
- Fix _detect_non_python_language return type: object | None -> Language | None
- Fix bare dict type annotations: dict -> dict[str, Any] in jfr_parser.py and function_ranker.py
- Fix pytest_splits/test_paths type narrowing by separating assignment from None check

Co-authored-by: Saurabh Misra <undefined@users.noreply.github.com>

The optimization precomputes all frame-to-key conversions for a stack trace once (into a `keys` list) instead of calling `_frame_to_key` repeatedly inside the caller-callee loop, cutting per-frame extraction from ~3.3 µs to ~0.19 µs (83% reduction) and lifting `_frame_to_key` from 20.8% of total time to 43.2% (the loop cost is now dominated by the upfront list comprehension rather than repeated calls). A local `matches_packages_cached` closure memoizes package-filter results to avoid re-checking the same method keys across caller relationships, reducing `_matches_packages` overhead from 12.6% to 0.8% of total time; profiler data shows `_matches_packages` hits dropped from 18,364 to 1,500. The timestamp-duration calculation switched from accumulating a list then calling `max()`/`min()` to inline min/max tracking, removing intermediate allocations; combined, these changes yield a 42% overall speedup (46.4 ms → 32.6 ms).
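The three changes described above (precomputed keys, a memoized package filter, inline min/max) can be sketched on a simplified event shape. The event dictionaries, `frame_to_key`, and `parse_events` are hypothetical and do not reflect the JSON that the `jfr` tool emits.

```python
from typing import Any, Optional


def frame_to_key(frame: dict[str, str]) -> str:
    return f'{frame["class"]}.{frame["method"]}'


def parse_events(
    events: list[dict[str, Any]], packages: list[str]
) -> tuple[dict[str, int], dict[tuple[str, str], int], int]:
    cache: dict[str, bool] = {}

    def matches_packages_cached(key: str) -> bool:
        # Memoize the package filter per method key.
        if key not in cache:
            cache[key] = any(key.startswith(p + ".") for p in packages)
        return cache[key]

    samples: dict[str, int] = {}
    callers: dict[tuple[str, str], int] = {}
    first_ts: Optional[int] = None
    last_ts: Optional[int] = None

    for event in events:
        # Convert every frame once, instead of inside the caller-callee loop.
        keys = [frame_to_key(f) for f in event["frames"]]
        if keys and matches_packages_cached(keys[0]):
            samples[keys[0]] = samples.get(keys[0], 0) + 1
        # Frames are ordered top of stack first, so keys[i + 1] called keys[i].
        for callee, caller in zip(keys, keys[1:]):
            if matches_packages_cached(callee):
                callers[(caller, callee)] = callers.get((caller, callee), 0) + 1
        # Track min/max inline rather than collecting timestamps in a list.
        ts = event["timestamp_ns"]
        first_ts = ts if first_ts is None or ts < first_ts else first_ts
        last_ts = ts if last_ts is None or ts > last_ts else last_ts

    duration = last_ts - first_ts if first_ts is not None and last_ts is not None else 0
    return samples, callers, duration


events = [
    {"timestamp_ns": 100, "frames": [
        {"class": "com.example.Workload", "method": "repeatString"},
        {"class": "com.example.Workload", "method": "main"},
    ]},
    {"timestamp_ns": 350, "frames": [
        {"class": "java.lang.String", "method": "length"},
        {"class": "com.example.Workload", "method": "repeatString"},
    ]},
]
samples, callers, duration = parse_events(events, ["com.example"])
```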

codeflash-ai · 2026-03-19T08:12:01Z

⚡️ Codeflash found optimizations for this PR

📄 42% (0.42x) speedup for `JfrProfile._parse_json` in `codeflash/languages/java/jfr_parser.py`

⏱️ Runtime : 46.4 milliseconds → 32.6 milliseconds (best of 32 runs)

A dependent PR with the suggested changes has been created. Please review:

⚡️ Speed up method JfrProfile._parse_json by 42% in PR #1874 (java-tracer) #1876

If you approve, it will be merged into this PR (branch java-tracer).

codeflash/languages/java/jfr_parser.py

codeflash-ai · 2026-03-19T09:01:41Z

⚡️ Codeflash found optimizations for this PR

📄 73% (0.73x) speedup for `JfrProfile.get_method_ranking` in `codeflash/languages/java/jfr_parser.py`

⏱️ Runtime : 4.38 milliseconds → 2.53 milliseconds (best of 5 runs)

A dependent PR with the suggested changes has been created. Please review:

⚡️ Speed up method JfrProfile.get_method_ranking by 73% in PR #1874 (java-tracer) #1877

If you approve, it will be merged into this PR (branch java-tracer).

…, filter empty names

- Consolidate _parse_replay_metadata to call parse_replay_test_metadata instead of duplicating the parsing logic
- Replace hardcoded fallback java command with a clear error message when no java command is provided
- Filter empty strings from function_names split (`"".split(",")` returns `[""]` which is truthy)
- Fix import ordering in tracer.py (ruff I001)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
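The empty-string pitfall mentioned in this commit is easy to reproduce. The helper name `split_function_names` below is hypothetical; the behavior of `str.split` is standard Python.

```python
def split_function_names(raw: str) -> list[str]:
    # "".split(",") returns [""], which is truthy; drop empty entries.
    return [name.strip() for name in raw.split(",") if name.strip()]


naive = "".split(",")
filtered = split_function_names("")
```

Because `naive` is a non-empty list, a check such as `if function_names:` would pass even though no function name was recorded. Filtering keeps that check meaningful.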

codeflash-ai · 2026-03-19T18:42:45Z

This PR is now faster! 🚀 @claude[bot] accepted my optimizations from:

⚡️ Speed up method JfrProfile._parse_json by 42% in PR #1874 (java-tracer) #1876

Initial e2e tracer implementation

c699093

github-actions bot added the workflow-modified label (This PR modifies GitHub Actions workflows) Mar 19, 2026

misrasaurabh1 and others added 4 commits March 18, 2026 23:17

fix: ensure src/test/java directory exists before config validation

c31f837

Git doesn't track empty directories, so src/test/java must be created before process_pyproject_config validates tests-root exists.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

style: fix ruff formatting and revert auto-generated version

d5ca52a

- Unwrap logger.info call in tracer.py that fits within 120-char limit
- Revert auto-generated dev version string in version.py back to 0.20.3

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

codeflash-ai bot mentioned this pull request Mar 19, 2026

⚡️ Speed up method JavaFunctionRanker.get_function_addressable_time by 1,245% in PR #1874 (java-tracer) #1875

Merged

Merge pull request #1875 from codeflash-ai/codeflash/optimize-pr1874-…

59f34d8

…2026-03-19T06.41.55 ⚡️ Speed up method `JavaFunctionRanker.get_function_addressable_time` by 1,245% in PR #1874 (`java-tracer`)

misrasaurabh1 and others added 4 commits March 19, 2026 00:05

add some initial java docs

d12e631

codeflash-ai bot mentioned this pull request Mar 19, 2026

⚡️ Speed up method JfrProfile._parse_json by 42% in PR #1874 (java-tracer) #1876

Merged

codeflash-ai bot reviewed Mar 19, 2026


codeflash/languages/java/jfr_parser.py (resolved)

codeflash-ai bot mentioned this pull request Mar 19, 2026

⚡️ Speed up method JfrProfile.get_method_ranking by 73% in PR #1874 (java-tracer) #1877

Closed

misrasaurabh1 and others added 2 commits March 19, 2026 11:37

Merge pull request #1876 from codeflash-ai/codeflash/optimize-pr1874-…

5402088

…2026-03-19T08.11.52 ⚡️ Speed up method `JfrProfile._parse_json` by 42% in PR #1874 (`java-tracer`)

misrasaurabh1 and others added 2 commits March 19, 2026 11:48

chore: remove accidental package-lock.json and package.json

3bf8ffb

These files were unrelated to the PR and got swept in during a stash/pop operation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

make workload purposefully terrible

3b39657

misrasaurabh1 merged commit 59031a1 into main Mar 19, 2026
29 of 31 checks passed

misrasaurabh1 deleted the java-tracer branch March 19, 2026 22:12


misrasaurabh1 merged 14 commits into main from java-tracer
