⚡️ Speed up function _extract_type_body_context by 31% in PR #1199 (omni-java)
#1253
⚡️ This pull request contains optimizations for PR #1199
If you approve this dependent PR, these changes will be merged into the original PR branch omni-java.
📄 31% (0.31x) speedup for _extract_type_body_context in codeflash/languages/java/context.py
⏱️ Runtime: 477 microseconds → 364 microseconds (best of 40 runs)
📝 Explanation and details
This optimization achieves a 31% runtime improvement (from 477μs to 364μs) by eliminating redundant UTF-8 decoding operations and reducing attribute lookups.
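As a rough illustration of the decoding change, the sketch below slices text out of already-decoded lines by (row, column) points instead of decoding a byte range. The name slice_text_by_points and its signature are hypothetical; the PR's actual _slice_text_by_points() is not reproduced here. It also assumes columns can be used as character indices, which holds only when the text before them on the line is ASCII, since tree-sitter reports columns in bytes.

```python
from typing import List, Tuple

# (row, column), shaped like tree-sitter's start_point / end_point
Point = Tuple[int, int]


def slice_text_by_points(lines: List[str], start: Point, end: Point) -> str:
    """Return the text between two points using already-decoded lines.

    Hypothetical helper for illustration only. Assumes column values are
    valid character indices into each decoded line (ASCII-only prefix).
    """
    (start_row, start_col), (end_row, end_col) = start, end
    if start_row == end_row:
        # Single-line span: one string slice, no decoding involved.
        return lines[start_row][start_col:end_col]
    # Multi-line span: tail of the first line, whole middle lines,
    # head of the last line.
    parts = [lines[start_row][start_col:]]
    parts.extend(lines[start_row + 1:end_row])
    parts.append(lines[end_row][:end_col])
    return "\n".join(parts)


# Usage: compare against the byte-slice-and-decode approach on a small enum.
source = "enum Color {\n    RED,\n    /** doc\n     */\n    GREEN\n}"
source_bytes = source.encode("utf8")
lines = source.split("\n")

# Enum constant RED sits on row 1, columns 4..7 (bytes 17..20).
assert slice_text_by_points(lines, (1, 4), (1, 7)) == source_bytes[17:20].decode("utf8")

# The block comment spans rows 2..3 (bytes 26..41).
assert slice_text_by_points(lines, (2, 4), (3, 7)) == source_bytes[26:41].decode("utf8")
```

The decode happens once when the lines list is built, and each later extraction is a plain string slice. For lines containing multi-byte characters before the span, byte columns and character indices diverge, so a real implementation would need to account for that.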
Key optimizations:
Eliminated repeated UTF-8 decoding: The original code called .decode("utf8") on byte slices multiple times per iteration (for enum constants and block comments). The optimized version introduces _slice_text_by_points(), which extracts text directly from the already-decoded lines list and so avoids the overhead of repeated UTF-8 decoding.
Reduced attribute lookups: Added the local alias ls = lines and hoisted skip_types = ("{", "}", ";", ",") out of the loop, reducing repeated name resolutions in the hot path where body_node.children is iterated.
Smarter text extraction: The helper function _slice_text_by_points() uses line/column coordinates instead of byte offsets, indexing directly into the decoded lines. This is faster because the lines array is already UTF-8 decoded when passed in, so the same bytes are not decoded again.
Performance characteristics by test case:
Why this matters:
Line profiler shows the original code spent significant time in decode operations (lines with source_bytes[...].decode("utf8")). For Java source files with many enum constants or Javadoc comments, this optimization reduces the cumulative decode overhead across all iterations, resulting in the observed 31% speedup on representative workloads.
✅ Correctness verification report:
To edit these changes, git checkout codeflash/optimize-pr1199-2026-02-02T00.44.56 and push.