⚡️ Speed up function _extract_hook_usages by 16% in PR #1571 (codeflash/optimize-pr1561-2026-02-20T03.56.09) #1573

Closed
codeflash-ai[bot] wants to merge 2 commits into codeflash/optimize-pr1561-2026-02-20T03.56.09 from codeflash/optimize-pr1571-2026-02-20T04.05.48

Conversation

codeflash-ai bot commented Feb 20, 2026

⚡️ This pull request contains optimizations for PR #1571

If you approve this dependent PR, these changes will be merged into the original PR branch codeflash/optimize-pr1561-2026-02-20T03.56.09.

This PR will be automatically closed if the original PR is merged.


📄 16% (0.16x) speedup for _extract_hook_usages in codeflash/languages/javascript/frameworks/react/context.py

⏱️ Runtime: 8.28 milliseconds → 7.14 milliseconds (best of 88 runs)

📝 Explanation and details

The optimized code achieves a 16% speedup (8.28 ms → 7.14 ms) by replacing character-by-character scanning with targeted string searches, using Python's built-in str.find() method, when matching parentheses in React hook calls.

Key Optimization:

Instead of iterating through every character to find matching parentheses:

```python
# Original: char-by-char iteration
while j < n:
    char = cs[j]
    if char == "(":
        bracket_depth += 1
    elif char == ")":
        bracket_depth -= 1
    j += 1
```

The optimized version jumps directly between parentheses:

```python
# Optimized: direct search for next ( or )
next_open = cs.find("(", j)
next_close = cs.find(")", j)
# Then jump to whichever comes first
```
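For readers who want to see the two strategies side by side, here is a minimal, self-contained sketch. It is not the code from context.py: the function names `find_matching_paren_scan` and `find_matching_paren_jump` are hypothetical, and the early return on depth zero is an assumption about how the surrounding loop terminates.

```python
def find_matching_paren_scan(cs: str, open_idx: int) -> int:
    """Char-by-char: return the index of the ')' closing cs[open_idx], or -1."""
    depth = 0
    j = open_idx
    n = len(cs)
    while j < n:
        char = cs[j]
        if char == "(":
            depth += 1
        elif char == ")":
            depth -= 1
            if depth == 0:
                return j
        j += 1
    return -1  # unterminated call


def find_matching_paren_jump(cs: str, open_idx: int) -> int:
    """Same result, but jumps from parenthesis to parenthesis with str.find()."""
    depth = 0
    j = open_idx
    while True:
        next_open = cs.find("(", j)
        next_close = cs.find(")", j)
        if next_close == -1:
            return -1  # no ')' left, so the depth can never return to zero
        if next_open != -1 and next_open < next_close:
            depth += 1
            j = next_open + 1
        else:
            depth -= 1
            if depth == 0:
                return next_close
            j = next_close + 1


src = "useEffect(() => { doSomething(); }, [foo, bar])"
start = src.index("(")
print(find_matching_paren_scan(src, start), find_matching_paren_jump(src, start))
```

Both calls report the same index for the closing parenthesis. The sketch issues two `str.find()` calls per step for clarity; a tuned version might cache whichever result is still valid.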

Why This Is Faster:

  1. Reduced iterations: Line profiler shows the inner while loop dropped from 40,809 hits to 8,005 hits (~80% reduction), directly correlating to the speedup
  2. Native string search: Python's str.find() is implemented in C and optimized for substring searching, far faster than Python-level character comparisons
  3. Bigger jumps: Instead of checking every character, the code skips directly to the next relevant delimiter
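The drop in loop hits can be reproduced on a toy input. The sketch below counts Python-level loop iterations for each strategy on a 200-dependency array like the one in the generated tests. The helper names are hypothetical, and the counts describe only this input, not the line-profiler figures quoted above.

```python
def count_scan_steps(cs: str, open_idx: int) -> int:
    """Loop iterations a char-by-char scan needs to match cs[open_idx]."""
    steps = 0
    depth = 0
    j = open_idx
    while j < len(cs):
        steps += 1
        if cs[j] == "(":
            depth += 1
        elif cs[j] == ")":
            depth -= 1
            if depth == 0:
                break
        j += 1
    return steps


def count_jump_steps(cs: str, open_idx: int) -> int:
    """Loop iterations needed when jumping between parentheses with str.find()."""
    steps = 0
    depth = 0
    j = open_idx
    while True:
        steps += 1
        next_open = cs.find("(", j)
        next_close = cs.find(")", j)
        if next_close == -1:
            break
        if next_open != -1 and next_open < next_close:
            depth += 1
            j = next_open + 1
        else:
            depth -= 1
            if depth == 0:
                break
            j = next_close + 1
    return steps


deps = ", ".join(f"dep{i}" for i in range(200))
source = f"useEffect(() => {{}}, [{deps}])"
open_idx = source.index("(")
scan_steps = count_scan_steps(source, open_idx)
jump_steps = count_jump_steps(source, open_idx)
print(scan_steps, jump_steps)
```

The scan visits every character between the opening and closing parenthesis, while the jump version iterates once per parenthesis (four times here). That is why inputs with long stretches of non-parenthesis text benefit most.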

Performance Impact Based on Test Results:

The optimization particularly excels with:

  • Large dependency arrays: 414% faster on 200-dependency arrays (131μs → 25.7μs)
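  • Multi-line dependency arrays: 40.5% faster when the array spans several lines, and 40.3% faster on an array of 100 parenthesized expressions
  • High hook counts: 34.4% faster on 1000 hooks, 15.0% faster on a synthetic 600-hook component (200 repetitions of three hook calls)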
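The percentages in this report appear to follow the formula (original time / optimized time − 1) × 100. That formula is an inference from the reported numbers, not something the report states. A small sketch, with a hypothetical helper name:

```python
def speedup_percent(original: float, optimized: float) -> float:
    """Speedup as a percentage; negative values mean the new code is slower.

    Assumed formula, inferred from the figures in this report:
    (original / optimized - 1) * 100.
    """
    return (original / optimized - 1.0) * 100.0


overall = speedup_percent(8.28, 7.14)    # headline runtime, in milliseconds
regression = speedup_percent(7.74, 8.32)  # a test reported as "6.99% slower"
print(round(overall), round(regression, 1))
```

Applied to the rounded display values, the formula gives roughly 410% for 131μs → 25.7μs rather than the reported 414%. The gap is consistent with the report computing from unrounded timings.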

Context from function_references:

The function is called in unit tests for React hook analysis, where components may contain dozens of hooks. Because React components with many useEffect, useMemo, and custom hooks are common in production codebases, this optimization should benefit build-time static analysis tools that parse large component files. The 16% overall speedup adds up when hundreds of React components are analyzed in a codebase.

The optimization maintains identical behavior—same hook detection, same dependency counting logic—while simply reducing the number of character-level operations needed to find parenthesis boundaries.

Correctness verification report:

| Test | Status |
|---|---|
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | 45 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |
🌀 Generated Regression Tests
from typing import List

# imports
import pytest  # used for our unit tests
# Import the function and the HookUsage class from the actual module under test.
# We rely on the real module path shown in the provided source.
from codeflash.languages.javascript.frameworks.react.context import (
    HookUsage, _extract_hook_usages)

def test_single_hook_with_dependency_array_counts_items():
    # Hook with a two-element dependency array should be detected and counted as 2.
    src = "useEffect(() => { doSomething(); }, [foo, bar])"
    codeflash_output = _extract_hook_usages(src); hooks = codeflash_output # 12.2μs -> 10.8μs (13.4% faster)
    h = hooks[0]

def test_generic_type_hook_and_empty_dependency_array():
    # Generic hook usage and an explicit empty dependency array.
    # The regex allows optional generic type parameters before the '('.
    src = "const r = useRef<MyType>(null); useEffect(() => {}, [])"
    codeflash_output = _extract_hook_usages(src); hooks = codeflash_output # 12.0μs -> 11.6μs (3.46% faster)
    # Find them by name to avoid relying on ordering
    names = {h.name for h in hooks}
    # Verify the empty array is recognized as present and counted as 0 deps
    for h in hooks:
        if h.name == "useEffect":
            pass

def test_trailing_comma_in_dependency_array_counts_as_extra_due_to_simple_comma_counting():
    # This tests the current implementation behavior: dependency_count is computed as
    # count(",") + 1 when array_content is non-empty. A trailing comma increases the count.
    src = "useEffect(fn, [a,])"
    codeflash_output = _extract_hook_usages(src); hooks = codeflash_output # 8.62μs -> 7.83μs (9.97% faster)
    h = hooks[0]

def test_hook_name_case_sensitivity_and_non_matching_lowercase_after_use():
    # 'usecustom' should not match (lowercase 'c' after 'use'), but 'useCustom' should.
    src = "usecustom(); useCustom(); useAnotherThing()"
    codeflash_output = _extract_hook_usages(src); hooks = codeflash_output # 7.74μs -> 8.32μs (6.99% slower)
    # Only useCustom and useAnotherThing should be matched (capital letter required after 'use')
    matched_names = [h.name for h in hooks]

def test_unterminated_hook_call_no_closing_paren_results_in_no_dependency_array_but_still_returned():
    # If the source lacks a closing parenthesis for the hook call, bracket_depth never reaches 0.
    # The current implementation will append a HookUsage with has_dependency_array False.
    src = "function broken() { useEffect(() => { console.log('oops');  // missing closing paren"
    codeflash_output = _extract_hook_usages(src); hooks = codeflash_output # 11.4μs -> 8.53μs (33.3% faster)
    h = hooks[0]

def test_dependency_array_with_commas_inside_strings_is_counted_by_simple_comma_counting():
    # The implementation counts commas inside the array content naively,
    # so strings containing commas will affect the count.
    src = 'useEffect(() => {}, ["a,b", "c"])'
    codeflash_output = _extract_hook_usages(src); hooks = codeflash_output # 9.91μs -> 8.91μs (11.2% faster)
    h = hooks[0]

def test_large_scale_many_hooks_are_all_detected_and_counted():
    # Create a large source string with 1000 hook invocations. Each has a single dependency.
    n = 1000
    # Build many useEffect calls, each with a unique dependency identifier.
    parts = [f"useEffect(() => {{ }}, [dep{i}])" for i in range(n)]
    large_src = "; ".join(parts)
    # Run the parser on the large source
    codeflash_output = _extract_hook_usages(large_src); hooks = codeflash_output # 3.00ms -> 2.23ms (34.4% faster)
    # Ensure each HookUsage has the correct shape: name, has_dependency_array True, dependency_count 1
    for i, h in enumerate(hooks):
        pass

def test_multiple_hooks_with_nested_parentheses_and_order_preserved():
    # Complex args with nested parentheses and multiple hooks in one string.
    src = (
        "const m = useMemo(() => ((x) => x + 1)(val), [val]); "
        "const cb = useCallback(function(a) { return (b) => b + a; }, [a, b]); "
        "const s = useState(0)"
    )
    codeflash_output = _extract_hook_usages(src); hooks = codeflash_output # 20.4μs -> 18.1μs (12.7% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
import re
from dataclasses import dataclass

# imports
import pytest
from codeflash.languages.javascript.frameworks.react.context import \
    _extract_hook_usages

def test_empty_string():
    """Test that an empty component source returns an empty list of hooks."""
    codeflash_output = _extract_hook_usages(""); result = codeflash_output # 1.93μs -> 1.91μs (1.10% faster)

def test_single_hook_no_dependencies():
    """Test extraction of a single hook without a dependency array."""
    source = "useState(initialValue)"
    codeflash_output = _extract_hook_usages(source); result = codeflash_output # 7.88μs -> 7.22μs (9.03% faster)

def test_single_hook_with_empty_dependencies():
    """Test extraction of a single hook with an empty dependency array."""
    source = "useEffect(() => { }, [])"
    codeflash_output = _extract_hook_usages(source); result = codeflash_output # 8.57μs -> 8.45μs (1.42% faster)

def test_single_hook_with_one_dependency():
    """Test extraction of a single hook with one dependency."""
    source = "useEffect(() => { }, [count])"
    codeflash_output = _extract_hook_usages(source); result = codeflash_output # 9.55μs -> 8.88μs (7.56% faster)

def test_single_hook_with_multiple_dependencies():
    """Test extraction of a single hook with multiple dependencies."""
    source = "useEffect(() => { }, [count, name, value])"
    codeflash_output = _extract_hook_usages(source); result = codeflash_output # 10.5μs -> 8.75μs (19.6% faster)

def test_multiple_hooks_mixed():
    """Test extraction of multiple hooks with varying dependency configurations."""
    source = """
    const [state, setState] = useState(0);
    useEffect(() => { }, [state]);
    const value = useContext(MyContext);
    """
    codeflash_output = _extract_hook_usages(source); result = codeflash_output # 15.0μs -> 13.6μs (9.70% faster)

def test_hook_with_generic_type():
    """Test extraction of a hook call with generic type parameters."""
    source = "useState<number>(0)"
    codeflash_output = _extract_hook_usages(source); result = codeflash_output # 6.16μs -> 6.19μs (0.484% slower)

def test_hook_with_whitespace_around_parens():
    """Test extraction of a hook call with various whitespace patterns."""
    source = "useState  (  initialValue  )"
    codeflash_output = _extract_hook_usages(source); result = codeflash_output # 7.97μs -> 6.84μs (16.5% faster)

def test_hook_name_with_multiple_capitals():
    """Test extraction of hooks with multiple capital letters in the name."""
    source = "useMyCustomHook()"
    codeflash_output = _extract_hook_usages(source); result = codeflash_output # 5.43μs -> 5.75μs (5.58% slower)

def test_nested_function_calls():
    """Test extraction of hooks within nested function calls."""
    source = "const result = processData(useState(0));"
    codeflash_output = _extract_hook_usages(source); result = codeflash_output # 6.35μs -> 6.56μs (3.21% slower)

def test_hook_at_start_of_source():
    """Test hook extraction when hook is the first token."""
    source = "useState(0)"
    codeflash_output = _extract_hook_usages(source); result = codeflash_output # 5.84μs -> 5.98μs (2.36% slower)

def test_hook_at_end_of_source():
    """Test hook extraction when hook is the last token."""
    source = "const hook = useState"
    codeflash_output = _extract_hook_usages(source); result = codeflash_output # 3.93μs -> 3.84μs (2.35% faster)

def test_hook_followed_immediately_by_paren():
    """Test hook when parenthesis is immediately after hook name."""
    source = "useCallback(()=>{},[a,b,c])"
    codeflash_output = _extract_hook_usages(source); result = codeflash_output # 9.23μs -> 8.79μs (5.01% faster)

def test_hook_not_matching_pattern():
    """Test that functions not matching the pattern are not extracted."""
    source = "useState(0) use(1) fakeUse(2)"
    codeflash_output = _extract_hook_usages(source); result = codeflash_output # 6.65μs -> 6.79μs (2.08% slower)

def test_dependency_array_with_whitespace():
    """Test extraction with whitespace in dependency array."""
    source = "useEffect(() => {}, [ count , name , value ])"
    codeflash_output = _extract_hook_usages(source); result = codeflash_output # 10.8μs -> 9.11μs (18.2% faster)

def test_dependency_array_with_newlines():
    """Test extraction with newlines in dependency array."""
    source = """useEffect(() => {}, [
        count,
        name,
        value
    ])"""
    codeflash_output = _extract_hook_usages(source); result = codeflash_output # 13.0μs -> 9.24μs (40.5% faster)

def test_hook_with_no_closing_paren():
    """Test hook call that is not properly closed (malformed code)."""
    source = "useState(0"
    codeflash_output = _extract_hook_usages(source); result = codeflash_output # 5.24μs -> 5.37μs (2.42% slower)

def test_hook_with_complex_nested_parens():
    """Test hook with complex nested parentheses in arguments."""
    source = "useCallback(() => doSomething(a, b), [a, b])"
    codeflash_output = _extract_hook_usages(source); result = codeflash_output # 10.9μs -> 9.70μs (12.5% faster)

def test_hook_with_object_dependency():
    """Test hook with object literal in dependency array."""
    source = "useEffect(() => {}, [{ key: 'value' }])"
    codeflash_output = _extract_hook_usages(source); result = codeflash_output # 10.2μs -> 8.64μs (18.4% faster)

def test_multiple_hooks_on_same_line():
    """Test extraction of multiple hooks on the same line."""
    source = "useState(0); useEffect(() => {}); useContext(ctx);"
    codeflash_output = _extract_hook_usages(source); result = codeflash_output # 10.9μs -> 11.4μs (3.88% slower)

def test_hook_in_string_literal():
    """Test that hooks inside string literals are still matched by pattern."""
    source = 'const str = "useState(0)"; useState(1);'
    codeflash_output = _extract_hook_usages(source); result = codeflash_output # 8.33μs -> 8.52μs (2.23% slower)

def test_hook_with_trailing_whitespace_in_deps():
    """Test dependency array with trailing whitespace."""
    source = "useEffect(() => {}, [dependency1, dependency2  ])"
    codeflash_output = _extract_hook_usages(source); result = codeflash_output # 11.2μs -> 8.81μs (27.3% faster)

def test_hook_with_comments():
    """Test hook extraction with comments in the code (comments are not parsed)."""
    source = "useEffect(() => {}, [count]) // dependency on count"
    codeflash_output = _extract_hook_usages(source); result = codeflash_output # 9.58μs -> 8.82μs (8.64% faster)

def test_hook_with_trailing_comma_in_deps():
    """Test dependency array with a trailing comma."""
    source = "useEffect(() => {}, [a, b, c,])"
    codeflash_output = _extract_hook_usages(source); result = codeflash_output # 9.58μs -> 8.40μs (14.0% faster)

def test_use_prefix_without_capital_letter():
    """Test that 'use' without a capital letter following doesn't match."""
    source = "used(0) user(1) useState(2)"
    codeflash_output = _extract_hook_usages(source); result = codeflash_output # 6.49μs -> 6.67μs (2.70% slower)

def test_hook_with_empty_string_dependency():
    """Test dependency array with empty string as dependency."""
    source = 'useEffect(() => {}, [""])'
    codeflash_output = _extract_hook_usages(source); result = codeflash_output # 8.67μs -> 8.26μs (4.99% faster)

def test_only_use_word_boundary():
    """Test that 'use' is matched only at word boundaries."""
    source = "notUse(0) useState(1)"
    codeflash_output = _extract_hook_usages(source); result = codeflash_output # 6.18μs -> 6.31μs (2.06% slower)

def test_single_character_hook_name_after_use():
    """Test hook with single character following 'use'."""
    source = "useA() useB()"
    codeflash_output = _extract_hook_usages(source); result = codeflash_output # 7.47μs -> 7.88μs (5.09% slower)

def test_many_hooks_without_dependencies():
    """Test extraction of 100 hooks without dependency arrays."""
    hooks_code = "\n".join([f"use{chr(65 + (i % 26))}Hook{i}(arg{i})" for i in range(100)])
    codeflash_output = _extract_hook_usages(hooks_code); result = codeflash_output # 142μs -> 123μs (15.9% faster)

def test_deeply_nested_parentheses():
    """Test hook wrapped in deeply nested parentheses (250 levels)."""
    # Wrap the hook call in 250 levels of nested parentheses (500 characters in total)
    nested = "(" * 250 + "useState(0)" + ")" * 250
    codeflash_output = _extract_hook_usages(nested); result = codeflash_output # 11.9μs -> 12.2μs (2.45% slower)

def test_large_dependency_array():
    """Test hook with a very large dependency array (200 dependencies)."""
    deps = ", ".join([f"dep{i}" for i in range(200)])
    source = f"useEffect(() => {{}}, [{deps}])"
    codeflash_output = _extract_hook_usages(source); result = codeflash_output # 131μs -> 25.7μs (414% faster)

def test_very_long_component_source():
    """Test extraction from a very long component source (200 blocks, 600 hook calls)."""
    # Create a large source code with hooks scattered throughout
    large_source = ""
    for i in range(200):
        large_source += f"const padding{i} = getPadding();\n"
        large_source += f"useEffect(() => {{}}, [var{i}]);\n"
        large_source += f"const var{i} = useState({i});\n"
        large_source += f"const callback{i} = useCallback(() => {{}}, []);\n"
    
    codeflash_output = _extract_hook_usages(large_source); result = codeflash_output # 1.45ms -> 1.26ms (15.0% faster)
    useEffect_count = sum(1 for h in result if h.name == "useEffect")
    useState_count = sum(1 for h in result if h.name == "useState")
    useCallback_count = sum(1 for h in result if h.name == "useCallback")

def test_component_with_repeated_patterns():
    """Test component with repeated hook patterns (500 repetitions, 1000 hook calls)."""
    pattern = "useState(0); useEffect(() => {}, [x]);\n"
    source = pattern * 500
    codeflash_output = _extract_hook_usages(source); result = codeflash_output # 1.79ms -> 1.67ms (7.29% faster)
    useState_hooks = [h for h in result if h.name == "useState"]
    useEffect_hooks = [h for h in result if h.name == "useEffect"]

def test_random_hook_names_performance():
    """Test extraction with 100 custom hook calls (26 distinct generated names)."""
    hooks_code = ""
    for i in range(100):
        custom_hook_name = f"use{''.join([chr(65 + (i*7 + j) % 26) for j in range(3)])}Hook"
        hooks_code += f"{custom_hook_name}(arg);\n"
    
    codeflash_output = _extract_hook_usages(hooks_code); result = codeflash_output # 129μs -> 123μs (4.67% faster)

def test_mixed_valid_and_invalid_patterns():
    """Test source with mix of valid hooks and similar patterns (1000 valid, 1000 invalid)."""
    source = ""
    for i in range(500):
        source += "useState(0);\n"  # Valid
        source += "use(0);\n"  # Invalid (no capital after 'use')
        source += f"useHook{i}();\n"  # Valid
        source += f"used{i}();\n"  # Invalid (lowercase letter after 'use')
    
    codeflash_output = _extract_hook_usages(source); result = codeflash_output # 1.08ms -> 1.20ms (10.3% slower)

def test_extreme_whitespace_handling():
    """Test extraction with extreme amounts of whitespace in source."""
    source = "   \n\n\n  " + "useState(0)  \n\n\n  " * 100
    codeflash_output = _extract_hook_usages(source); result = codeflash_output # 115μs -> 125μs (7.99% slower)

def test_dependency_array_with_complex_expressions():
    """Test dependency array with complex nested expressions (100 dependencies with operators)."""
    deps = ", ".join([f"({i} + {i+1})" for i in range(100)])
    source = f"useEffect(() => {{}}, [{deps}])"
    codeflash_output = _extract_hook_usages(source); result = codeflash_output # 103μs -> 74.1μs (40.3% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
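The test comments above describe two behaviors: hook names must be `use` followed by a capital letter, with optional generic type parameters before the `(`, and the dependency count is `count(",") + 1` for non-empty array content. The sketch below reconstructs both from those comments only. The regex, the `.strip()` call, and the helper names are assumptions; the actual pattern in context.py is not shown in this PR and may differ.

```python
import re

# Hypothetical pattern inferred from the test comments, not copied from context.py:
# word boundary, "use", a capital letter, optional <Generic>, then "(".
HOOK_CALL = re.compile(r"\b(use[A-Z]\w*)\s*(?:<[^>]*>)?\s*\(")


def hook_names(source: str) -> list[str]:
    """Return hook names in order of appearance, according to the assumed pattern."""
    return [m.group(1) for m in HOOK_CALL.finditer(source)]


def naive_dependency_count(array_content: str) -> int:
    """Count dependencies as commas + 1, as the test comments describe.

    Stripping whitespace first is an assumption. Trailing commas and commas
    inside string literals inflate the count, as the tests note.
    """
    content = array_content.strip()
    return content.count(",") + 1 if content else 0


print(hook_names("usecustom(); useCustom(); useAnotherThing()"))
print(naive_dependency_count("a,"), naive_dependency_count('"a,b", "c"'))
```

This makes the documented quirks concrete: `[a,]` counts as two dependencies and `["a,b", "c"]` as three, because the count never parses the array contents.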

To edit these changes, run `git checkout codeflash/optimize-pr1571-2026-02-20T04.05.48`, commit your edits, and push.

codeflash-ai bot added the "⚡️ codeflash" (Optimization PR opened by Codeflash AI) and "🎯 Quality: High" (Optimization Quality according to Codeflash) labels Feb 20, 2026
claude bot commented Feb 20, 2026

PR Review Summary

Prek Checks

Passed after auto-fix. Fixed 1 issue:

  • RET507 (superfluous-else-continue) in context.py — removed unnecessary else after continue
  • Reformatted 1 file

Committed and pushed as style: auto-fix linting issues.

Mypy

Passed — no type errors found in context.py.

Code Review

No critical issues found. The optimization correctly replaces character-by-character parenthesis scanning with str.find() jumps between ( and ). Since the original code only acted on parenthesis characters and ignored everything else, the str.find() approach is logically equivalent while being faster (C-level string search).
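One way to gain confidence in the equivalence claim is a randomized comparison of the two strategies. The sketch below is an illustration, not part of the PR's test suite; `scan`, `jump`, and `check_equivalence` are hypothetical stand-ins for the original and optimized inner loops.

```python
import random


def scan(cs: str, start: int) -> int:
    """Char-by-char reference: index where the depth returns to zero, or -1."""
    depth = 0
    for j in range(start, len(cs)):
        if cs[j] == "(":
            depth += 1
        elif cs[j] == ")":
            depth -= 1
            if depth == 0:
                return j
    return -1


def jump(cs: str, start: int) -> int:
    """str.find()-based variant that only stops at parentheses."""
    depth = 0
    j = start
    while True:
        next_open = cs.find("(", j)
        next_close = cs.find(")", j)
        if next_close == -1:
            return -1
        if next_open != -1 and next_open < next_close:
            depth += 1
            j = next_open + 1
        else:
            depth -= 1
            if depth == 0:
                return next_close
            j = next_close + 1


def check_equivalence(trials: int = 2000, seed: int = 0) -> bool:
    """Compare both variants on random strings, from every start offset."""
    rng = random.Random(seed)
    for _ in range(trials):
        cs = "".join(rng.choice("()ab, ") for _ in range(rng.randint(0, 30)))
        for start in range(len(cs) + 1):
            if scan(cs, start) != jump(cs, start):
                return False
    return True


print(check_equivalence())
```

Because both variants update the depth only on `(` and `)` and visit those characters in the same order, they agree even on unbalanced input. The random check is a sanity test of that argument, not a proof.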

Test Coverage

| File | Stmts | Miss | Coverage |
|---|---|---|---|
| codeflash/languages/javascript/frameworks/react/context.py | 124 | 2 | 98% |
  • This file is new (introduced in parent PR), not present on main
  • ✅ Coverage is 98%, well above the 75% threshold for new files
  • ✅ The optimized _extract_hook_usages function is fully exercised by existing tests
  • ✅ No coverage regression — optimization is a refactor of existing logic

Last updated: 2026-02-20

claude bot deleted the branch codeflash/optimize-pr1561-2026-02-20T03.56.09 on February 20, 2026 04:18
claude bot closed this Feb 20, 2026
codeflash-ai bot deleted the codeflash/optimize-pr1571-2026-02-20T04.05.48 branch on February 20, 2026 04:18