feat: zero-config Java projects + smart ReplayHelper for end-to-end optimization#1880
feat: zero-config Java projects + smart ReplayHelper for end-to-end optimization#1880misrasaurabh1 wants to merge 14 commits intomainfrom
Conversation
…iles Java projects no longer need a standalone config file. Codeflash reads config from pom.xml <properties> or gradle.properties, and auto-detects source/test roots from build tool conventions. Changes: - Add parse_java_project_config() to read codeflash.* properties from pom.xml and gradle.properties - Add multi-module Maven scanning: parses each module's pom.xml for <sourceDirectory> and <testSourceDirectory>, picks module with most Java files as source root, identifies test modules by name - Route Java projects through build-file detection in config_parser.py before falling back to pyproject.toml - Detect Java language from pom.xml/build.gradle presence (no config needed) - Fix project_root for multi-module projects (was resolving to sub-module) - Fix JFR parser / separators (JVM uses com/example, normalized to com.example) - Fix graceful timeout (SIGTERM before SIGKILL for JFR dump + shutdown hooks) - Remove isRecording() check from TracingTransformer (was preventing class instrumentation for classes loaded during serialization) - Delete all codeflash.toml files from fixtures and code_to_optimize - Add 33 config detection tests - Update docs for zero-config Java setup Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replay tests call helper.replay() via reflection, not the target function directly. The behavior instrumentation can't wrap indirect calls and produces malformed output (code emitted outside class body) for large replay test files. For replay tests, just rename the class without adding instrumentation — JUnit pass/fail results verify correctness. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Detect test framework from project build config and generate replay tests with appropriate imports (org.junit.Test for JUnit 4, org.junit.jupiter.api.Test for JUnit 5). Fixes compilation failures on projects using JUnit 4 (like aerospike-client-java). Also passes test_framework through run_java_tracer to generate_replay_tests. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ay tests Use a global counter per method name across all descriptors to generate unique test method names. Previously, overloaded methods (same name, different descriptor) would generate duplicate replay_methodName_N methods, causing compilation errors. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…on skip 10 new tests covering: - JUnit 5 replay test generation (imports, class visibility) - JUnit 4 replay test generation (imports, public methods, @afterclass) - Overloaded method handling (no duplicate test method names) - Instrumentation skip for replay tests (behavior + perf mode) - Regular tests still get instrumented normally Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…solution 13 new tests covering: - JFR class name normalization (/ to . conversion) - Package-based sample filtering - Addressable time calculation from JFR samples - Method ranking order and format - Graceful timeout (SIGTERM before SIGKILL) - Multi-module project root detection (Path not str) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The behavior instrumentation was producing malformed output for compact @test lines (annotation + method signature on same line, common in replay tests). The method signature collection loop would skip past the opening brace and consume subsequent methods' content. Fix: detect when the @test annotation line already contains { and treat it as both annotation and method signature, avoiding the separate signature search that was over-consuming lines. Reverted the instrumentation skip for replay tests — they now get properly instrumented for both behavior capture and performance timing. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
Claude finished @misrasaurabh1's task in 6s —— View job PR Review Summary
Prek ChecksAll clean. ✅ Code ReviewBug:
|
ReplayHelper now reads CODEFLASH_MODE env var and produces the same output as the existing test instrumentation: - Behavior mode: captures return value via Kryo serialization, writes to SQLite (test_results table) for correctness comparison, prints start/end timing markers - Performance mode: runs inner loop for JIT warmup, prints timing markers for each iteration matching the expected format - No mode: just invokes the method (trace-only or manual testing) This achieves feature parity with the existing test instrumentation for replay tests, which call functions via reflection and can't be wrapped by text-level instrumentation. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ay tests + speedups - Trigger on any codeflash/** or tests/** changes (not just java subset) - Validate replay test files are discovered per-function - Already validates: replay test generation, global discovery count, optimization success, and minimum speedup percentage Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The refactored Java project_root handling moved args.tests_root resolution after the project_root_from_module_root call, which passed a string instead of a Path. Restore the original order: resolve tests_root to Path first, then set test_project_root, then override both for Java multi-module projects. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Use Path comparisons instead of forward-slash substring matching - Avoid parse_args() in test (reads stdin on Windows) — use Namespace directly Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Use print(flush=True) instead of logging.info for subprocess output so CI logs show progress in real-time instead of buffering until completion. Also set PYTHONUNBUFFERED=1 for the subprocess. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…_write_gradle_properties Co-authored-by: Saurabh Misra <undefined@users.noreply.github.com>
…ions harder - Set jdk.ExecutionSample#period=1ms (default was 10ms) so JFR captures samples from shorter-running programs - Workload.main now runs 1000 rounds with larger inputs so JFR can capture method-level CPU samples (repeatString with O(n²) concat dominates ~75% of samples) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary
Eliminates
codeflash.tomlfor Java projects and fixes the complete trace → optimize pipeline to work end-to-end on real Java projects (validated on aerospike-client-java).Zero-config Java support
pom.xml/build.gradle— no config file neededpom.xml<properties>orgradle.properties(codeflash.*keys)<sourceDirectory>/<testSourceDirectory>, picks module with most Java files as source rootcodeflash.tomlfilesSmart ReplayHelper (behavior + performance parity)
ReplayHelper.replay()now readsCODEFLASH_MODEenv var and produces the same output as existing test instrumentationtest_resultstable for correctness comparisonBug fixes
/→.in class names (JVM internal format vs Java package format)isRecording()check that prevented instrumenting classes loaded during serialization (was causing 3 captures instead of 10,000+)org.junit.Testvsorg.junit.jupiter.api.Test), detect from project build config_add_behavior_instrumentationfor compact@Testlines (annotation + signature on same line)add_help=Falseso-hin Java commands isn't intercepted as--helpValidated end-to-end on aerospike-client-java
Crypto.computeDigest)Test plan
🤖 Generated with Claude Code