feat: multi-language orchestration loop with per-language config discovery by mashraf-222 · Pull Request #1859 · codeflash-ai/codeflash

mashraf-222 · 2026-03-18T00:21:38Z

Summary

Adds multi-language orchestration to the Codeflash CLI. The optimizer now discovers all language configs in a project (Python, Java, JS/TS) and automatically runs a full optimization pass for each language.

This PR is part of a 3-repo change set:

codeflash (this PR) — Multi-language orchestration loop, config discovery, and language-agnostic git diff
codeflash-cc-plugin — Unified hook that triggers a single codeflash --subagent call
optimize-me — Mixed-language test fixture (Python + Java + JS/TS) for E2E validation

The CLI orchestration loop is the core — the cc-plugin delegates to it, and optimize-me validates it end-to-end.

What Changed

Config Discovery

find_all_config_files() walks from the CWD up to the root and discovers pyproject.toml (Python), codeflash.toml (Java), and package.json (JS/TS). The closest config wins per language.
New LanguageConfig dataclass holds config dict, path, and language enum per discovered config.
Extracted normalize_toml_config() shared helper for consistent config normalization.

Orchestration Loop

main() iterates over discovered configs, deep-copies args per language, calls apply_language_config() then optimizer.run_with_args() for each.
--file flag filters to the matching language only. --all and no-flags run all discovered languages.
Per-language error isolation — one language failing doesn't block others.
Summary logging after all passes complete.

Auto-Detection

detect_unconfigured_languages() compares configs vs git diff to find languages with changed files but no config.
auto_configure_language() creates on-the-fly configs by detecting project roots (pom.xml, package.json).

Language-Agnostic Git Diff

get_git_diff() now uses get_supported_extensions() from the registry instead of the singleton. The coarse filter lets through all supported files; per-file language detection happens downstream.
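
A coarse extension filter of this kind might look like the sketch below. The `get_supported_extensions()` defined here is only a stand-in for the registry function: its return type and the exact extension set are assumptions, and `filter_supported` is a hypothetical name.

```python
from pathlib import Path


def get_supported_extensions() -> set[str]:
    # Stand-in for the registry function; the real set comes from the
    # registered language supports.
    return {".py", ".java", ".js", ".jsx", ".ts", ".tsx"}


def filter_supported(paths: list[str]) -> list[str]:
    # Coarse filter: keep any file whose extension some language supports.
    # Deciding which language owns each file is left to later stages.
    supported = get_supported_extensions()
    return [path for path in paths if Path(path).suffix in supported]
```

Since the filter no longer depends on a single active language, a mixed Python and Java diff keeps both file types.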

Tests

73 tests in test_multi_language_orchestration.py (orchestration loop, apply_language_config, summary logging, CLI path routing, unconfigured detection)
11 tests in test_multi_config_discovery.py (find_all_config_files, LanguageConfig)
15 tests in test_git_utils.py (language-agnostic diff filtering)
Registry test pollution fix in test_registry.py

All tests pass locally.

E2E Validation

Full E2E session with local backend services — Fibonacci (java-test-project), encodedLength (aerospike), isPalindrome + mergeSorted (optimize-me Java), retryWithBackoff (optimize-me JS) all passed.

Related PRs

codeflash-cc-plugin — Unified multi-language hook

The cc-plugin is the entry point that triggers Codeflash from Claude Code. Previously it had 3 separate per-language code paths that each invoked the CLI differently. With the orchestration loop introduced in this PR, the cc-plugin was simplified to a single codeflash --subagent call — it only needs to detect that any config exists and find the binary. All multi-language discovery and dispatch is now handled by find_all_config_files() and the orchestration loop in main.py from this PR.

optimize-me — Mixed-language test fixture

The optimize-me repo was extended with Java and JS/TS subprojects to serve as an E2E validation target for the orchestration loop. Each subproject has its own codeflash config file, so running Codeflash at the repo root exercises find_all_config_files() discovering multiple configs and the loop running separate optimization passes per language. The E2E validation results above were produced by running this PR's code against the optimize-me fixture.

Generated with Claude Code

…r all languages
The language singleton was only set after function discovery, but get_git_diff() needs it during discovery to filter by file extension. Now set it in process_pyproject_config() based on the config file type.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

claude · 2026-03-18T00:22:22Z

Claude finished @mashraf-222's task in 6m 18s

PR Review Summary

Prek Checks

ruff-format was failing in CI: config_parser.py lines 169-171 had a 3-line configs.append(...) that fits on one line. Fixed and pushed in commit 275527db.

ruff-check F811 (redefined-while-unused) in git_utils.py was flagged in the CI run against commit c5d7c394. The current HEAD does not have a duplicate import — the inline from codeflash.languages.registry import get_supported_extensions inside get_git_diff() is the only import, with no top-level shadow. This appears to have been introduced and then resolved during the 20-commit history before this run. ✅ Clean on HEAD after the format fix.

Code Review

🔴 Accidental File Inclusions

.planning/ directory committed — 5 files added: .planning/STATE.md, .planning/config.json, .planning/phases/08-*/08-02-PLAN.md, 08-02-SUMMARY.md, 08-03-PLAN.md. These are AI agent internal workflow artifacts, not production code. They should either be removed from the PR or added to .gitignore.

🟠 Design Issues

1. get_changed_file_paths() in main.py duplicates get_git_diff() logic

main.py:188 introduces a new subprocess.run(["git", "diff", "--name-only", "HEAD~1"]) implementation while codeflash/code_utils/git_utils.py already has the full-featured get_git_diff(). Problems with the new function:

Returns relative Path objects (git diff --name-only output is repo-relative, not absolute)
HEAD~1 fails on an initial commit with no parent
Placed in main.py (an entry point) instead of git_utils.py (where all git utilities live)

The fix should call into get_git_diff() or extract a helper in git_utils.py. Concretely, get_git_diff() already returns dict[str, list[int]] — get_changed_file_paths() only needs the keys.
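
One way to read this suggestion is sketched below. It assumes the keys of the dict returned by get_git_diff() are file paths, and it takes that dict as a parameter rather than calling git, so the sketch stays self-contained. The `repo_root` parameter is an assumption added to address the relative-path problem noted above.

```python
from pathlib import Path


def get_changed_file_paths(diff: dict[str, list[int]], repo_root: Path) -> list[Path]:
    # `diff` stands in for the result of get_git_diff(); only its keys are used.
    # Joining with repo_root makes repo-relative keys absolute, and pathlib
    # leaves a key unchanged when it is already absolute.
    return [repo_root / key for key in diff]
```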

2. Duplicate tests_root detection logic

The Java/JS/TS tests_root fallback logic is copy-pasted verbatim between process_pyproject_config (cli.py:126-155) and apply_language_config (cli.py:289-312). A private helper _resolve_tests_root(args, is_java, is_js_ts) should be extracted.

3. Business logic in main.py entry point

detect_unconfigured_languages, detect_project_for_language, and auto_configure_language (main.py:176-264) contain significant domain logic that should live in setup/ or code_utils/, not the CLI entry point. Per the architecture doc, setup logic belongs in setup/. Moving them would also improve testability.

4. apply_language_config missing --benchmark validation

process_pyproject_config (cli.py:159-183) validates --benchmarks-root is set and that the GitHub app is installed when --benchmark is used. apply_language_config skips all of this. Multi-language runs with --benchmark will silently proceed without these guards.

5. detect_project_for_language imports private-prefixed functions

main.py:203-211 imports _detect_formatter, _detect_ignore_paths, _detect_java_module_root, _detect_js_module_root, _detect_python_module_root, _detect_test_runner, _detect_tests_root from setup/detector.py. CLAUDE.md states: "NEVER use leading underscores — Python has no true private functions, use public names." Either rename those detector functions to remove the underscore prefix, or expose a public API from detector.py that main.py can call without reaching into private helpers.

🟡 Minor Issues

Docstrings on new functions — CLAUDE.md: "Do not add docstrings to new or changed code unless explicitly asked." The following new/changed functions have docstrings that should be removed: _handle_config_loading (main.py:267), print_codeflash_banner (main.py:313), _handle_reset_config (cli.py:392).

Silent exception swallowing in find_all_config_files — config_parser.py:172 has except Exception: continue which silently swallows all config parse errors. At minimum a logger.debug should log the exception to aid debugging.

Duplicate Detection

HIGH confidence — get_changed_file_paths() (main.py:188) vs get_git_diff() (git_utils.py:21): both call git diff to find changed files; the new function is a simplified re-implementation of the existing one. Should be eliminated in favour of the existing utility.

MEDIUM confidence — detect_project_for_language() (main.py:202) vs detect_project() (setup/detector.py:79): both call the same individual _detect_* helpers and build a DetectedProject. The key difference is that detect_project auto-detects the language while detect_project_for_language takes an explicit language. This near-duplication could be resolved by adding a language override parameter to detect_project().

Test Coverage

All 73 new tests pass. Coverage on the key new paths:

| File | Coverage |
| --- | --- |
| `codeflash/main.py` | 70% |
| `codeflash/code_utils/config_parser.py` | 45% |
| `codeflash/code_utils/git_utils.py` | 64% |
| `codeflash/cli_cmds/cli.py` | 20% |

cli.py at 20% is expected (interactive CLI). The new multi-language orchestration logic in main.py at 70% is reasonable. auto_configure_language (main.py:244) is covered; detect_project_for_language (main.py:202) lacks direct test coverage — consider adding at least a happy-path unit test.

Pushed fix: style: auto-fix ruff formatting in config_parser.py (commit 275527db)
…r all languages
The language singleton was only set after function discovery, but get_git_diff() needs it during discovery to filter by file extension.
- config_parser.py: set config["language"] based on config file type (codeflash.toml → java, pyproject.toml → python) so all project types return a language
- cli.py: call set_current_language() in process_pyproject_config() using the config value, before the optimizer runs
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…ensions
- Replace current_language_support().file_extensions with get_supported_extensions() from registry
- Update tests: remove singleton dependency, add unsupported extension filtering test
- Mixed Python+Java diffs now return both file types regardless of singleton state
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Add LanguageConfig dataclass with config, config_path, language fields
- Add find_all_config_files() that discovers all codeflash configs in project hierarchy
- Supports pyproject.toml (Python), codeflash.toml (Java), package.json (JS/TS)
- Skips configs without [tool.codeflash] section, closest config wins per language
- Add 6 tests covering discovery, filtering, parent directory search, deduplication
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…DISC-04)
- Add smoke test confirming get_language_support usage, not singleton
- No code changes needed, function already uses per-file registry
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Add apply_language_config() to cli.py for multi-language mode config application
- Import LanguageConfig and Language enum in cli.py
- Create test_multi_language_orchestration.py with 9 tests covering: module_root/tests_root setting, path resolution, project_root, CLI override preservation, formatter_cmds, language singleton, Python/Java config handling, Java default tests_root
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Add orchestration loop that iterates over all discovered LanguageConfigs
- Deep-copy args per language pass to prevent mutation leakage
- Run git/GitHub checks once before loop via handle_optimize_all_arg_parsing
- Preserve fallback to single-config path when find_all_config_files returns empty
- Add 4 orchestration tests: sequential passes, singleton per pass, fallback to single config, args isolation between passes
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…ig_files
- Extract shared normalization logic (path resolution, defaults, key conversion) into normalize_toml_config()
- Use it in both find_all_config_files and parse_config_file to eliminate duplication
- Add 6 tests verifying normalization behavior
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Wrap each language pass in try/except so one failure doesn't block others
- Track per-language status (success/failed/skipped) in results dict
- Add 3 tests verifying error isolation and failure tracking
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Test summary format with all success, mixed statuses, and empty results
- Test skipped status when formatter check fails
- 4 new tests covering _log_orchestration_summary behavior
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…ation
- Detect file language via get_language_support(Path(args.file))
- Filter language_configs to only the matching language before loop
- Gracefully handle unsupported extensions and missing configs
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- test_file_flag_filters_to_matching_language: Java file runs only Java pass
- test_file_flag_python_file_filters_to_python: Python file runs only Python pass
- test_file_flag_unknown_extension_runs_all: .rs file runs all language passes
- test_file_flag_no_matching_config_runs_all: Java file with only Python config runs all
- test_all_flag_sets_module_root_per_language: --all sets pass_args.all per language
- test_no_flags_runs_all_language_passes: no flags runs all language passes
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Tests for detect_unconfigured_languages() function
- Tests for auto_configure_language() success and failure paths
- Test for per-language logging output
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…tion
- Add detect_unconfigured_languages() to identify languages in changed files lacking configs
- Add detect_project_for_language() using per-language detection helpers (avoids wrong-language pitfall)
- Add auto_configure_language() that writes config and re-discovers it in one step
- Add get_changed_file_paths() helper using git diff
- Wire auto-config into orchestration loop (only for subagent/no-flags path)
- Failed auto-config logs warning with manual setup instructions, continues gracefully
- Per-language "Processing {lang} (config: {path})" logging confirmed working
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Add JS/TS config discovery tests (package.json, all three config types)
- Add malformed TOML and missing codeflash section tests
- Add JS/TS extension git diff tests (.js, .ts, .jsx, .tsx)
- Add mixed three-language git diff test
- Add TypeScript/JSX file flag routing tests
- Add direct function coverage for get_changed_file_paths, detect_project_for_language
- Add empty config normalize test
- 13 new tests across 3 files (60 -> 73 total)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Mock posthog and sentry initialization in all tests calling main() to prevent SystemExit when prior tests overwrite CODEFLASH_API_KEY
- Re-register JavaSupport in clear_registry test to prevent Java language lookup failures in subsequent tests
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

misrasaurabh1 reviewed Mar 18, 2026

codeflash/cli_cmds/cli.py (outdated, resolved)

mashraf-222 commented Mar 18, 2026

code_to_optimize/java/src/main/java/com/example/Fibonacci.java (outdated, resolved)

mashraf-222 force-pushed the fix/set-language-singleton-early branch from 8c9aab5 to d9dfb0b Compare March 18, 2026 00:34

mashraf-222 requested a review from misrasaurabh1 March 18, 2026 00:43

mashraf-222 and others added 18 commits March 18, 2026 04:03

test(07-02): verify find_all_functions_in_file uses registry lookup (…

52894fd

…DISC-04)
- Add smoke test confirming get_language_support usage, not singleton
- No code changes needed, function already uses per-file registry
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

docs(08-02): complete error isolation and config normalization plan

8ca57b2

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

docs(08): create gap closure plan for CLI path routing

da2704c

docs(08): revise 08-03 plan based on checker feedback

ee7ef5e

fix(08): revise plan 03 based on checker feedback

87fda32

test(09-02): add failing tests for unconfigured language detection

467adfd

- Tests for detect_unconfigured_languages() function
- Tests for auto_configure_language() success and failure paths
- Test for per-language logging output
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

mashraf-222 changed the title ~~fix: set language singleton early so git diff auto-detection works for non-Python languages~~ feat: multi-language orchestration loop with per-language config discovery Mar 19, 2026

style: auto-fix ruff formatting in config_parser.py

275527d

This was referenced Mar 19, 2026

feat: unified multi-language hook with single codeflash trigger codeflash-ai/codeflash-cc-plugin#34

Open

feat: add Java and JS/TS subprojects for mixed-language testing codeflash-ai/optimize-me#305

Open

mashraf-222 wants to merge 21 commits into main from fix/set-language-singleton-early
