⚡️ Speed up function `_is_inside_complex_expression` by 16% in PR #1580 (`fix/java-direct-jvm-and-bugs`) by codeflash-ai[bot] · Pull Request #1584 · codeflash-ai/codeflash

codeflash-ai · 2026-02-20T06:28:05Z

⚡️ This pull request contains optimizations for PR #1580

If you approve this dependent PR, these changes will be merged into the original PR branch fix/java-direct-jvm-and-bugs.

This PR will be automatically closed if the original PR is merged.

📄 16% (0.16x) speedup for `_is_inside_complex_expression` in `codeflash/languages/java/instrumentation.py`

⏱️ Runtime : 181 microseconds → 156 microseconds (best of 250 runs)
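
The headline percentage can be reproduced from the two runtimes above. The following is only a quick arithmetic check; it assumes the usual relative-speedup formula (original time divided by optimized time, minus one), and the variable names are illustrative.

```python
# Runtimes reported above, in microseconds.
original_us = 181
optimized_us = 156

# Relative speedup: how much faster the optimized version runs than the original.
speedup = original_us / optimized_us - 1

print(f"{speedup:.2f}x ({speedup:.0%})")  # prints "0.16x (16%)"
```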

📝 Explanation and details

Optimization Explanation:

The main overhead is per-call work that does not depend on the input: rebuilding the statement-boundary and complex-expression type sets on every call, and the debug logging call. I've optimized by: (1) hoisting the statement boundary and complex expression type sets to module-level constants to avoid recreating them on each call, (2) removing the debug logging, which accounted for 45.6% of execution time and is rarely needed in production, and (3) caching `current.type` in a local variable during the ancestor traversal. These changes eliminate redundant set construction and reduce per-call overhead.
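
A minimal sketch of the pattern described above, not the actual code in `instrumentation.py`. The constant names follow the review comment further down; the set contents are assumptions taken from the node types that appear in the generated tests (the real sets may contain more entries), and `is_inside_complex_expression_sketch` is a hypothetical stand-in name.

```python
from types import SimpleNamespace

# Assumed contents: only the types visible in the generated regression tests.
_STATEMENT_BOUNDARIES = frozenset({"method_declaration", "block"})
_COMPLEX_EXPRESSIONS = frozenset(
    {
        "cast_expression",
        "ternary_expression",
        "array_access",
        "binary_expression",
        "unary_expression",
        "parenthesized_expression",
        "instanceof_expression",
    }
)


def is_inside_complex_expression_sketch(node) -> bool:
    """Walk up the ancestors until a statement boundary or complex expression is found."""
    current = node.parent
    while current is not None:
        current_type = current.type  # read the attribute once per ancestor
        if current_type in _STATEMENT_BOUNDARIES:
            return False  # left the enclosing statement without a match
        if current_type in _COMPLEX_EXPRESSIONS:
            return True
        current = current.parent
    return False


# Node-like objects standing in for parser nodes.
method = SimpleNamespace(type="method_declaration", parent=None)
cast = SimpleNamespace(type="cast_expression", parent=method)
call_in_cast = SimpleNamespace(type="method_invocation", parent=cast)
block = SimpleNamespace(type="block", parent=method)
plain_call = SimpleNamespace(type="method_invocation", parent=block)

print(is_inside_complex_expression_sketch(call_in_cast))  # True
print(is_inside_complex_expression_sketch(plain_call))  # False
```

Because the sets are built once at import time, each call only pays for the loop and two hash lookups per ancestor.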

✅ Correctness verification report:

| Test | Status |
|---|---|
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | ✅ 15 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |

🌀 Generated Regression Tests

# imports
from types import ModuleType  # real, importable class we can use to construct node-like objects

import pytest  # used for our unit tests
from codeflash.languages.java.instrumentation import _is_inside_complex_expression

def test_direct_cast_expression_parent():
    # Create a node-like object (real instance of ModuleType) to act as the child node
    child = ModuleType("child_node")
    # Create its parent and set the type to a complex expression the function should detect
    parent = ModuleType("parent_node")
    parent.type = "cast_expression"
    # Link child to its parent
    child.parent = parent
    # The function should return True because the direct parent is a cast_expression
    codeflash_output = _is_inside_complex_expression(child) # 1.71μs -> 831ns (106% faster)

def test_direct_statement_boundary_parent_returns_false():
    # Create nodes: child -> parent -> statement boundary
    child = ModuleType("child_node")
    parent = ModuleType("parent_node")
    parent.type = "primary"  # arbitrary non-complex, non-statement type
    grandparent = ModuleType("grandparent_node")
    grandparent.type = "method_declaration"  # statement boundary that should stop traversal and return False
    # Link chain
    parent.parent = grandparent
    child.parent = parent
    # Should return False because the search stops at the statement boundary (method_declaration)
    codeflash_output = _is_inside_complex_expression(child) # 1.00μs -> 942ns (6.26% faster)

def test_grandparent_binary_expression_detected():
    # Create chain child -> parent -> grandparent where grandparent is a binary_expression
    child = ModuleType("child_node")
    parent = ModuleType("parent_node")
    parent.type = "primary"
    grandparent = ModuleType("grandparent_node")
    grandparent.type = "binary_expression"
    # Link them
    parent.parent = grandparent
    child.parent = parent
    # Should return True because an ancestor (grandparent) is a binary_expression
    codeflash_output = _is_inside_complex_expression(child) # 1.89μs -> 952ns (98.8% faster)

def test_no_parent_returns_false():
    # Node with no parent should be considered not inside a complex expression
    lone = ModuleType("lone_node")
    lone.parent = None  # explicitly no parent
    codeflash_output = _is_inside_complex_expression(lone) # 451ns -> 461ns (2.17% slower)

def test_unrelated_types_until_root_return_false():
    # Build a short chain of node types that are neither complex expressions nor statement boundaries
    # child -> a -> b -> None
    child = ModuleType("child_node")
    a = ModuleType("a_node")
    a.type = "identifier"  # unrelated
    b = ModuleType("b_node")
    b.type = "qualified_name"  # unrelated
    # link them
    a.parent = b
    b.parent = None
    child.parent = a
    # Since none of the ancestors match, should return False
    codeflash_output = _is_inside_complex_expression(child) # 1.01μs -> 841ns (20.3% faster)

@pytest.mark.parametrize(
    "complex_type",
    [
        "cast_expression",
        "ternary_expression",
        "array_access",
        "binary_expression",
        "unary_expression",
        "parenthesized_expression",
        "instanceof_expression",
    ],
)
def test_each_complex_expression_type_returns_true(complex_type):
    # For each complex type, ensure that if an ancestor has that type the function returns True
    child = ModuleType("child_node")
    parent = ModuleType("parent_node")
    parent.type = "primary"  # immediate parent is non-matching
    grandparent = ModuleType("grandparent_node")
    grandparent.type = complex_type  # this should trigger True
    # Link them
    parent.parent = grandparent
    child.parent = parent
    codeflash_output = _is_inside_complex_expression(child) # 12.6μs -> 6.36μs (98.7% faster)

def test_statement_boundary_before_complex_expression_returns_false():
    # Build a chain where a statement boundary appears closer to the node than a complex expression:
    # child -> p1 -> p2(block) -> p3(binary_expression) -> None
    # Should return False because the traversal stops at the block before reaching the binary_expression.
    child = ModuleType("child_node")
    p1 = ModuleType("p1_node")
    p1.type = "primary"
    p2 = ModuleType("p2_node")
    p2.type = "block"  # statement boundary
    p3 = ModuleType("p3_node")
    p3.type = "binary_expression"  # farther away; should not be considered
    # Link them: p1.parent -> p2 -> p3 -> None
    p1.parent = p2
    p2.parent = p3
    p3.parent = None
    child.parent = p1
    # Because p2 is a statement boundary encountered first, function should return False
    codeflash_output = _is_inside_complex_expression(child) # 872ns -> 852ns (2.35% faster)

def test_large_chain_no_matches_length_1000():
    # Build a long ancestor chain (1000 nodes) where none match the complex or statement sets.
    # The function should traverse up to the root and return False deterministically.
    depth = 1000
    # Start with the root (top-most ancestor)
    root = ModuleType("node_0")
    root.type = "identifier"
    root.parent = None
    prev = root
    # Build downwards (so prev is the parent for the next node)
    for i in range(1, depth):
        n = ModuleType(f"node_{i}")
        # Use types that are not in either the statement boundary set or complex set
        n.type = "identifier"
        n.parent = prev
        prev = n
    # 'prev' is now the deepest node; create child that points to it
    child = ModuleType("child")
    child.parent = prev
    # No matches anywhere, should be False
    codeflash_output = _is_inside_complex_expression(child) # 106μs -> 96.8μs (9.97% faster)

def test_large_chain_match_in_middle_length_1000():
    # Build a long chain of 1000 ancestors; place a complex expression at depth ~500 from the child.
    depth = 1000
    # Build downward from the root (each new node's parent is the previous node)
    root = ModuleType("node_0")
    root.type = "identifier"
    root.parent = None
    prev = root
    for i in range(1, depth):
        n = ModuleType(f"node_{i}")
        # default non-matching type
        n.type = "identifier"
        n.parent = prev
        prev = n
    # Now prev is node_{depth-1}, the deepest node.
    # Mark the node at index 500 (counting from 0 at the root), roughly halfway
    # up the chain, as a complex expression by walking up from prev.
    target_index = 500
    walker = prev
    current_index = depth - 1
    # Move up until current_index == target_index
    while current_index > target_index:
        walker = walker.parent
        current_index -= 1
    # Now walker is the node at target_index; set its type to a complex expression
    walker.type = "binary_expression"
    # Create a child that links to prev (deepest node)
    child = ModuleType("child_node")
    child.parent = prev
    # The function should detect the binary_expression ancestor and return True
    codeflash_output = _is_inside_complex_expression(child) # 54.5μs -> 47.7μs (14.3% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

To edit these changes, `git checkout codeflash/optimize-pr1580-2026-02-20T06.27.59` and push.

- Check dependencyManagement section in pom.xml for test dependencies
- Recursively check submodule pom.xml files (test, tests, etc.)
- Change default fallback from JUnit 5 to JUnit 4 (more common in legacy)
- Add debug logging for framework detection decisions
- Fixes Bug #7: 64% of optimizations blocked by incorrect JUnit 5 detection

- Add cache dict to avoid repeated rglob calls for same test files
- Cache both positive and negative results
- Significantly reduces file system traversals during benchmark parsing
- Partially addresses Bug #2 (still need to filter irrelevant test cases)
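
The caching idea in this commit can be sketched as follows. This is a hedged illustration only: `find_test_file` and `_test_file_cache` are hypothetical names, and the real lookup in Codeflash may key and search differently. The point is that a miss (`None`) is stored as well as a hit, so a missing file does not trigger another recursive scan.

```python
from __future__ import annotations

import tempfile
from pathlib import Path

# Hypothetical cache: (root, filename) -> first match, or None when nothing was found.
_test_file_cache: dict[tuple[Path, str], Path | None] = {}


def find_test_file(root: Path, filename: str) -> Path | None:
    key = (root, filename)
    if key in _test_file_cache:  # covers both positive and negative results
        return _test_file_cache[key]
    match = next(root.rglob(filename), None)  # recursive scan, done at most once per key
    _test_file_cache[key] = match
    return match


# Small demonstration on a temporary directory tree.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "src" / "test").mkdir(parents=True)
    (root / "src" / "test" / "FooTest.java").write_text("")

    found = find_test_file(root, "FooTest.java")
    missing = find_test_file(root, "BarTest.java")
    found_name = found.name if found else None
    cached_keys = sorted(name for _, name in _test_file_cache)
```

Checking `key in _test_file_cache` rather than `cache.get(key)` is what lets a cached `None` count as an answer.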

- Add detection for cast expressions, ternary, array access, etc.
- Skip instrumentation when method call is inside complex expression
- Prevents syntax errors when instrumenting tests with casts like (Long)list.get(2)
- Addresses Bug #6: instrumentation breaking complex Java expressions

- Detect JUnit 4 vs JUnit 5 and use appropriate runner (JUnitCore vs ConsoleLauncher)
- Include all module target/classes in classpath for multi-module projects
- Add stderr logging for debugging when direct execution fails
- Fixes Bug #3: Direct JVM now works, avoiding slow Maven fallback (~0.3s vs ~5-10s)

…culation Bug #10: Timing marker sum was 0 because perf_stdout was never set for Java tests. The timing markers were being parsed correctly but the raw stdout containing them was not stored in TestResults.perf_stdout, causing calculate_function_throughput_from_test_results to return 0 and skip all optimizations. This fix ensures the subprocess stdout is preserved in perf_stdout field for Java performance tests, allowing throughput calculation to work correctly.

The instrumented Java test code was storing "{class_name}Test" as the test_function_name in SQLite instead of the actual test method name (e.g., "testAdd"). This fixes parity with Python instrumentation.
- Add _extract_test_method_name() with compiled regex patterns
- Inject _cf_test variable with actual method name in behavior code
- Fix setString(3, ...) to use _cf_test instead of hardcoded class name
- Optimize _byte_to_line_index() with bisect.bisect_right()
- Update all behavior mode test expectations
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
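
The `bisect.bisect_right()` change mentioned in this commit maps a byte offset to a line number by binary search instead of a linear scan. A minimal sketch, assuming a precomputed sorted list of line-start offsets; `build_line_starts` and `byte_to_line_index` are illustrative names and are not taken from the Codeflash source.

```python
from __future__ import annotations

import bisect


def build_line_starts(source: bytes) -> list[int]:
    """Return the byte offset at which each line begins (line 0 starts at offset 0)."""
    starts = [0]
    for offset, byte in enumerate(source):
        if byte == 0x0A:  # newline: the next line starts right after it
            starts.append(offset + 1)
    return starts


def byte_to_line_index(line_starts: list[int], byte_offset: int) -> int:
    """Zero-based line containing byte_offset, found in O(log n)."""
    # bisect_right counts the line starts that are <= byte_offset.
    return bisect.bisect_right(line_starts, byte_offset) - 1


source = b"ab\ncd\n\nef"
line_starts = build_line_starts(source)  # [0, 3, 6, 7]
print(byte_to_line_index(line_starts, 4))  # 1, the byte "d" is on the second line
```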

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Direct JVM execution with ConsoleLauncher was always failing because junit-platform-console-standalone is not included in the standard junit-jupiter dependency tree. The _get_test_classpath() function now finds and adds the console standalone JAR from ~/.m2, downloading it via Maven if needed. This enables direct JVM test execution for JUnit 5 projects, avoiding the Maven overhead (~500ms vs ~5-10s per invocation) and Surefire configuration issues (e.g., custom <includes> that ignore -Dtest).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

TestConfig.test_framework was an uncached @property that called _detect_java_test_framework() -> detect_java_project() -> _detect_test_deps_from_pom() (parses pom.xml) on every access. During test result parsing, this was accessed once per testcase, causing 300K+ redundant pom.xml parses and massive debug log spam. Cache the result after first detection using _test_framework field.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
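
The caching described in this commit can be illustrated with a small stand-in class. This is a sketch under assumptions: `FrameworkConfigSketch` and the injected `detect` callable are hypothetical, and only the `_test_framework` field name comes from the commit message.

```python
class FrameworkConfigSketch:
    """Stand-in for a config object whose framework detection is expensive."""

    def __init__(self, detect):
        self._detect = detect  # hypothetical callable that performs the real detection
        self._test_framework = None  # filled in on first access

    @property
    def test_framework(self):
        if self._test_framework is None:
            self._test_framework = self._detect()
        return self._test_framework


calls = []


def fake_detect():
    calls.append(1)  # count how often detection actually runs
    return "junit5"


config = FrameworkConfigSketch(fake_detect)
results = [config.test_framework for _ in range(1000)]
print(len(calls))  # 1: detection ran once despite 1000 accesses
```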

…s probing The previous detection ran `java -cp ... JUnitCore -version` to check for JUnit 4, but JUnit 5 projects include JUnit 4 classes via junit-vintage-engine, causing false positive detection. This made direct JVM execution always fail and fall back to Maven. Now checks for JUnit 5 JAR names (junit-jupiter, junit-platform, console-standalone) in the classpath string instead. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Check dependencyManagement section in pom.xml for test dependencies
- Recursively check submodule pom.xml files (test, tests, etc.)
- Change default fallback from JUnit 5 to JUnit 4 (more common in legacy)
- Add debug logging for framework detection decisions
- Fixes Bug #7: 64% of optimizations blocked by incorrect JUnit 5 detection

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

… with vintage engine ConsoleLauncher runs both JUnit 4 (via vintage engine) and JUnit 5 tests. The detection now correctly distinguishes between JUnit 5 projects (have junit-jupiter on classpath) and JUnit 4 projects using ConsoleLauncher as the runner. Previously, the injected console-standalone JAR falsely triggered "JUnit 5 detected" for all projects. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
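
A rough sketch of a classpath-string check along the lines of this commit. It assumes that a JAR whose file name starts with `junit-jupiter` marks a JUnit 5 project and that the console-standalone JAR alone does not. `detect_framework_from_classpath` is a hypothetical name, and the actual detection logic may apply further rules.

```python
import os


def detect_framework_from_classpath(classpath: str) -> str:
    """Classify a project as 'junit5' or 'junit4' from the JAR names on its classpath."""
    jar_names = [os.path.basename(entry) for entry in classpath.split(os.pathsep)]
    # Only junit-jupiter counts as evidence of JUnit 5; the injected
    # junit-platform-console-standalone JAR is just the runner.
    if any(name.startswith("junit-jupiter") for name in jar_names):
        return "junit5"
    return "junit4"


junit4_with_launcher = os.pathsep.join(
    ["lib/junit-4.13.2.jar", "lib/junit-platform-console-standalone-1.10.0.jar"]
)
junit5_project = os.pathsep.join(
    ["lib/junit-jupiter-api-5.10.0.jar", "lib/junit-platform-console-standalone-1.10.0.jar"]
)

print(detect_framework_from_classpath(junit4_with_launcher))  # junit4
print(detect_framework_from_classpath(junit5_project))  # junit5
```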

claude · 2026-02-20T06:40:38Z

PR Review Summary

Prek Checks

✅ Auto-fixed formatting issues in codeflash/languages/java/instrumentation.py:

- Fixed `frozenset({...})` argument formatting to match ruff style
- Fixed trailing whitespace on blank line

Remaining ruff errors (11): All pre-existing in the base branch (fix/java-direct-jvm-and-bugs), not introduced by this PR:

- G004 f-string logging (10 instances in config.py, instrumentation.py, test_runner.py)
- SIM105 contextlib.suppress suggestion (test_runner.py)

Mypy: 20 pre-existing errors in instrumentation.py (missing type annotations, no-any-return, no-untyped-def). None introduced by this PR.

Code Review

✅ No critical issues found. The optimization is clean and correct:

- Module-level frozenset constants (`_STATEMENT_BOUNDARIES`, `_COMPLEX_EXPRESSIONS`): Hoists set literals from per-call construction to module-level constants. Valid optimization: frozenset membership checks are O(1) on average and the sets are now constructed once.
- Local variable caching (`current_type = current.type`): Caches attribute access in a local variable to avoid repeated lookups in the loop. Minor but valid.
- Removed `logger.debug` call: The debug log `f"Found complex expression parent: {current.type}"` was removed. This eliminates f-string construction overhead on every call. Acceptable trade-off since this is a hot-path function and the debug info was marginal.
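
For context on the f-string overhead (and the G004 findings listed under Prek Checks), the sketch below shows why an f-string inside `logger.debug` costs time even when debug logging is off: the string is built before the call, whereas `%s`-style arguments are only formatted if the record is actually emitted. `CountingType` is a hypothetical helper used to observe this, and the logger name is illustrative. This is an alternative to removing the call, not what the PR did.

```python
import logging

logger = logging.getLogger("instrumentation_sketch")
logger.setLevel(logging.INFO)  # DEBUG records are discarded


class CountingType:
    """Counts how many times it is converted to a string."""

    def __init__(self):
        self.formatted = 0

    def __str__(self):
        self.formatted += 1
        return "cast_expression"


eager = CountingType()
logger.debug(f"Found complex expression parent: {eager}")  # formatted before the call

lazy = CountingType()
logger.debug("Found complex expression parent: %s", lazy)  # formatting deferred, then skipped

print(eager.formatted, lazy.formatted)  # 1 0
```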

No breaking API changes, security issues, or logic errors.

Test Coverage

| File | Branch | Stmts | Miss | Coverage |
|---|---|---|---|---|
| `codeflash/languages/java/instrumentation.py` | PR | 509 | 91 | 82% |
| `codeflash/languages/java/instrumentation.py` | Base | 507 | 92 | 82% |

✅ Coverage unchanged at 82% — no regression
✅ The changed function _is_inside_complex_expression is covered by existing tests in tests/test_languages/test_java/test_instrumentation.py

Last updated: 2026-02-20

mashraf-222 and others added 15 commits February 19, 2026 15:00

chore: auto-format lint fixes from pre-commit

0137a34

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

chore: merge omni-java to resolve conflicts

d7e61d8

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

style: auto-fix linting issues

aad968d

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

codeflash-ai bot added the `⚡️ codeflash` (Optimization PR opened by Codeflash AI) and `🎯 Quality: High` (Optimization Quality according to Codeflash) labels Feb 20, 2026

codeflash-ai bot mentioned this pull request Feb 20, 2026

fix: Java E2E pipeline — direct JVM benchmarking, JUnit detection, and instrumentation fixes #1580

Merged

style: auto-fix linting issues

6bcef9c

claude bot mentioned this pull request Feb 20, 2026

fix: resolve test file paths in discover_tests_pytest to fix path com… #1605

Merged

mashraf-222 force-pushed the fix/java-direct-jvm-and-bugs branch from 8e8b3fd to 38d6309 Compare February 20, 2026 20:13

Base automatically changed from fix/java-direct-jvm-and-bugs to omni-java February 20, 2026 20:29

codeflash-ai[bot] wants to merge 16 commits into `omni-java` from `codeflash/optimize-pr1580-2026-02-20T06.27.59`