⚡️ Speed up function function_has_return_statement by 147% in PR #1460 (call-graphee) #1535

Merged
KRRT7 merged 1 commit into call-graphee from
codeflash/optimize-pr1460-2026-02-18T22.34.56
Feb 18, 2026

codeflash-ai bot commented Feb 18, 2026

⚡️ This pull request contains optimizations for PR #1460

If you approve this dependent PR, these changes will be merged into the original PR branch call-graphee.

This PR will be automatically closed if the original PR is merged.


📄 147% (1.47x) speedup for function_has_return_statement in codeflash/discovery/functions_to_optimize.py

⏱️ Runtime: 1.47 milliseconds → 595 microseconds (best of 58 runs)

📝 Explanation and details

The optimized code achieves a 147% speedup (from 1.47 ms to 595 μs) by replacing ast.iter_child_nodes() with direct field access on AST nodes, which removes the generator overhead.
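
As a quick sanity check on the headline number, the arithmetic can be sketched as below. This assumes the percentage is computed as (original - optimized) / optimized, which is not stated in the report but does reproduce the 147% figure from the two runtimes. The helper name `percent_faster` is hypothetical.

```python
def percent_faster(original_s: float, optimized_s: float) -> float:
    """Return how much faster the optimized run is, as a percentage.

    Assumed formula: (original - optimized) / optimized * 100.
    """
    return (original_s - optimized_s) / optimized_s * 100.0


original = 1.47e-3  # 1.47 milliseconds, from the report
optimized = 595e-6  # 595 microseconds, from the report

print(f"{percent_faster(original, optimized):.0f}% faster")  # 147% faster
print(f"{original / optimized:.2f}x as fast")                # 2.47x as fast
```

Under this reading, "147% faster" means the optimized function runs in roughly 40% of the original time.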

Key optimizations:

  1. Direct stack initialization: Instead of starting with [function_node] and then traversing into its body, the stack is initialized directly with list(function_node.body). This skips one iteration and avoids processing the function definition wrapper itself.

  2. Manual field traversal: Rather than calling ast.iter_child_nodes(node), a generator that yields every child node, the code reads node._fields and uses getattr() to inspect each field directly. This removes the generator and function-call overhead of ast.iter_child_nodes().

  3. Targeted statement filtering: A field value is checked with isinstance(child, ast.stmt) when it is a single node, or with isinstance(item, ast.stmt) per item when it is a list. Only statement nodes are pushed onto the stack, since ast.Return is a statement, so the traversal avoids unnecessary checks on expression nodes.
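
The three points above can be sketched roughly as follows. This is a reconstruction from the description, not the actual diff: the function names `has_return_generic` and `has_return_direct` are hypothetical, and the baseline is only assumed to be a plain stack-based walk over ast.iter_child_nodes(). One caveat: ast.ExceptHandler and ast.match_case are not ast.stmt subclasses, so a filter on ast.stmt alone would not descend into their bodies. This sketch follows them explicitly, which is an addition of the sketch and not something the description states.

```python
import ast

# Node kinds whose fields can hold statements. ExceptHandler and match_case are
# not ast.stmt subclasses, so they are listed explicitly (assumption of this sketch).
_BLOCK_NODES = (ast.stmt, ast.excepthandler)
if hasattr(ast, "match_case"):  # Python 3.10+
    _BLOCK_NODES += (ast.match_case,)


def has_return_generic(function_node: ast.AST) -> bool:
    """Assumed baseline: depth-first walk over every child node."""
    stack = [function_node]
    while stack:
        node = stack.pop()
        if isinstance(node, ast.Return):
            return True
        stack.extend(ast.iter_child_nodes(node))
    return False


def has_return_direct(function_node: ast.AST) -> bool:
    """Sketch of the described approach: follow only statement-bearing fields."""
    stack = list(function_node.body)  # seed with the body, skipping the def node
    while stack:
        node = stack.pop()
        if isinstance(node, ast.Return):
            return True
        for field in node._fields:
            child = getattr(node, field, None)
            if isinstance(child, _BLOCK_NODES):
                stack.append(child)  # single node stored in the field
            elif isinstance(child, list):
                # list-valued field such as body, orelse, handlers
                stack.extend(item for item in child if isinstance(item, _BLOCK_NODES))
    return False


fn = ast.parse("def f(x):\n    if x:\n        return x\n").body[0]
print(has_return_generic(fn), has_return_direct(fn))  # True True
```

Because Python expressions cannot contain statements, skipping expression subtrees does not change the result in this sketch. Both variants also report True when the only return belongs to a nested function definition.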

Why this is faster:

  • Reduced function call overhead: ast.iter_child_nodes() is a generator function that incurs call/yield overhead on every iteration. Direct attribute access via getattr() is faster for small numbers of fields.
  • Fewer iterations: The line profiler shows the original code's ast.iter_child_nodes() line hit 5,453 times (69% of runtime), while the optimized version's field iteration hits only 3,290 times (17.4% of runtime).
  • Better cache locality: Direct field access patterns may benefit from better CPU cache utilization compared to generator state management.
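
To illustrate the "fewer iterations" point, the sketch below counts how many nodes a generic walk touches versus how many of them are statements. The sample function is hypothetical and the counts do not reproduce the profiler figures quoted above. It only shows that expression nodes usually outnumber statements.

```python
import ast

SOURCE = """
def func(a, b, c):
    total = a * b + c
    print(total, [x for x in range(a)])
"""

fn = ast.parse(SOURCE).body[0]

# Every node a generic traversal touches (ast.walk is built on ast.iter_child_nodes).
all_nodes = list(ast.walk(fn))

# Only the statements inside the function, which is where an ast.Return could sit.
statements = [n for n in all_nodes if isinstance(n, ast.stmt) and n is not fn]

print(f"all nodes: {len(all_nodes)}, statements in body: {len(statements)}")
```

The exact node total varies between Python versions, so no expected output is given for it.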

Test case performance:

The optimization shows dramatic improvements particularly for:

  • Functions with many sequential statements (2365% faster for 1000 statements, 1430% faster for 1000 sequential inner function definitions)
  • Simple functions (234-354% faster for basic return detection)
  • Moderately complex control flow (80-125% faster for nested conditionals/loops)

The speedup holds across all test cases, ranging from about 66% for deeply nested control flow to 2365% for a 1000-statement body. Functions whose return sits directly in the function body benefit the most, since the return statement is found before unnecessary nodes are processed.
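
One plausible contributor to the large gains on long function bodies is the cost of seeding the stack. The sketch below isolates just that step on a generated 1000-statement function and compares ast.iter_child_nodes() with a plain list copy of the body. It is an illustration under that assumption, not the benchmark behind the reported numbers, and the timings depend on interpreter and machine.

```python
import ast
import timeit

# Build a function with 1000 assignments followed by a return, mirroring the
# shape of the largest generated test.
body = "\n".join(f"    x = {i}" for i in range(1000))
fn = ast.parse(f"def big():\n{body}\n    return 'done'\n").body[0]

# Time 200 repetitions of each way of collecting the function's children.
via_generator = timeit.timeit(lambda: list(ast.iter_child_nodes(fn)), number=200)
via_body_copy = timeit.timeit(lambda: list(fn.body), number=200)

print(f"iter_child_nodes: {via_generator:.6f}s, list(body): {via_body_copy:.6f}s")
```

With a stack that pops from the end, a trailing return is the first node examined after the copy, so the copy itself dominates the cost in this shape of input.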

Correctness verification report:

| Test | Status |
|---|---|
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | 80 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |
🌀 Generated Regression Tests
import ast  # used to parse Python source into AST nodes

import pytest  # used for our unit tests
from codeflash.discovery.functions_to_optimize import \
    function_has_return_statement

def test_detects_simple_return():
    # Parse a simple function that directly returns a value
    src = "def foo():\n    return 42\n"
    module = ast.parse(src)
    func_node = module.body[0]  # real ast.FunctionDef instance
    # The function contains a top-level Return statement -> should be True
    codeflash_output = function_has_return_statement(func_node) # 4.08μs -> 1.22μs (234% faster)

def test_returns_false_when_no_return_present():
    # Parse a function that has statements but no return
    src = "def foo():\n    x = 1\n    y = x + 2\n"
    module = ast.parse(src)
    func_node = module.body[0]
    # No Return nodes anywhere -> should be False
    codeflash_output = function_has_return_statement(func_node) # 7.40μs -> 3.94μs (88.1% faster)

def test_detects_return_in_nested_block_structures():
    # Return inside an if-block should be detected
    src = (
        "def foo():\n"
        "    if True:\n"
        "        x = 1\n"
        "        return x\n"
    )
    module = ast.parse(src)
    func_node = module.body[0]
    codeflash_output = function_has_return_statement(func_node) # 5.55μs -> 2.83μs (96.5% faster)

    # Return inside a try/except/finally should be detected
    src2 = (
        "def bar():\n"
        "    try:\n"
        "        x = 1\n"
        "    except Exception:\n"
        "        return 'handled'\n"
        "    finally:\n"
        "        pass\n"
    )
    module2 = ast.parse(src2)
    func_node2 = module2.body[0]
    codeflash_output = function_has_return_statement(func_node2) # 6.51μs -> 3.35μs (94.7% faster)

def test_ignores_yield_only_functions():
    # A function with only yield (a generator) contains no Return statement nodes.
    src = "def gen():\n    yield 1\n    yield 2\n"
    module = ast.parse(src)
    func_node = module.body[0]
    # There are Yield nodes but no ast.Return -> should be False
    codeflash_output = function_has_return_statement(func_node) # 5.80μs -> 2.26μs (156% faster)

def test_docstring_only_is_not_a_return():
    # A function containing only a docstring is represented as an Expr with a Constant;
    # there is no Return node.
    src = 'def foo():\n    """docstring only"""\n'
    module = ast.parse(src)
    func_node = module.body[0]
    codeflash_output = function_has_return_statement(func_node) # 4.67μs -> 1.78μs (162% faster)

def test_detects_return_inside_nested_function_definition():
    # A nested inner function that contains a Return should still cause the outer
    # FunctionDef to be reported as having a return because traversal visits inner defs.
    src = (
        "def outer():\n"
        "    def inner():\n"
        "        # return belongs to inner, but function_has_return_statement explores nested stmts\n"
        "        return 'inner'\n"
        "    x = 5\n"
    )
    module = ast.parse(src)
    outer_node = module.body[0]
    # According to the implementation, nested Return statements are found -> True
    codeflash_output = function_has_return_statement(outer_node) # 7.77μs -> 4.53μs (71.6% faster)

def test_async_function_def_with_and_without_return():
    # AsyncFunctionDef with a return statement
    src_with = "async def af():\n    return 10\n"
    module_with = ast.parse(src_with)
    async_node_with = module_with.body[0]  # real ast.AsyncFunctionDef instance
    codeflash_output = function_has_return_statement(async_node_with) # 3.89μs -> 1.11μs (249% faster)

    # AsyncFunctionDef without a return should be False
    src_without = "async def af2():\n    await some_coroutine()\n"
    module_without = ast.parse(src_without)
    async_node_without = module_without.body[0]
    codeflash_output = function_has_return_statement(async_node_without) # 3.36μs -> 1.40μs (139% faster)

def test_return_with_and_without_value():
    # Return with a value should be detected
    src_val = "def f1():\n    return x + y\n"
    node_val = ast.parse(src_val).body[0]
    codeflash_output = function_has_return_statement(node_val) # 3.86μs -> 1.10μs (250% faster)

    # Bare return (implicitly returning None) should also be detected
    src_bare = "def f2():\n    return\n"
    node_bare = ast.parse(src_bare).body[0]
    codeflash_output = function_has_return_statement(node_bare) # 2.33μs -> 601ns (289% faster)

def test_many_sequential_statements_with_final_return():
    # Create a function containing 1000 simple statements followed by a return.
    # This ensures the DFS examines many nodes and still finds the Return.
    n = 1000
    body_lines = ["    x = %d" % i for i in range(n)]
    body_lines.append("    return 'done'")
    src = "def big():\n" + "\n".join(body_lines) + "\n"
    module = ast.parse(src)
    big_node = module.body[0]
    codeflash_output = function_has_return_statement(big_node) # 106μs -> 4.33μs (2365% faster)

def test_many_inner_function_defs_with_last_containing_return():
    # Construct a function that defines 1000 inner functions (sequentially).
    # Only the last inner function contains a return. The traversal should still find it.
    # This tests performance and that nested FunctionDef statements are explored.
    count = 1000
    inner_defs = []
    for i in range(count):
        if i == count - 1:
            # last inner function has a return
            inner_defs.append(f"    def inner_{i}():\n        return {i}\n")
        else:
            inner_defs.append(f"    def inner_{i}():\n        pass\n")
    src = "def container():\n" + "".join(inner_defs)
    module = ast.parse(src)
    container_node = module.body[0]
    codeflash_output = function_has_return_statement(container_node) # 109μs -> 7.13μs (1430% faster)

def test_no_false_positive_for_return_like_names_or_comments():
    # Ensure that tokens like 'return' in comments or names are not parsed as ast.Return nodes.
    src = (
        "def tricky():\n"
        "    # return should not be detected here because it's a comment\n"
        "    my_return_variable = 'return'\n"
        "    def not_a_return():\n"
        "        # 'return' in a string literal\n"
        "        s = 'this looks like return in text'\n"
        "    pass\n"
    )
    module = ast.parse(src)
    node = module.body[0]
    # No actual Return statement nodes -> should be False
    codeflash_output = function_has_return_statement(node) # 9.87μs -> 5.28μs (86.9% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
import ast
from _ast import AsyncFunctionDef, FunctionDef

# imports
import pytest
from codeflash.discovery.functions_to_optimize import \
    function_has_return_statement

def test_function_with_simple_return_statement():
    # Test: A function with a single return statement should return True
    code = """
def func():
    return 42
"""
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 4.28μs -> 1.21μs (253% faster)

def test_function_with_return_no_value():
    # Test: A function with return (no value) should return True
    code = """
def func():
    return
"""
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 4.15μs -> 1.09μs (280% faster)

def test_function_without_return_statement():
    # Test: A function without any return statement should return False
    code = """
def func():
    x = 42
"""
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 5.51μs -> 2.75μs (101% faster)

def test_async_function_with_return():
    # Test: An async function with return should return True
    code = """
async def async_func():
    return 42
"""
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 4.05μs -> 1.07μs (278% faster)

def test_async_function_without_return():
    # Test: An async function without return should return False
    code = """
async def async_func():
    await something()
"""
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 4.77μs -> 1.83μs (160% faster)

def test_function_with_return_in_if_block():
    # Test: Return nested in if block should return True
    code = """
def func(x):
    if x > 0:
        return x
    else:
        pass
"""
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 6.64μs -> 3.40μs (95.6% faster)

def test_function_with_return_in_for_loop():
    # Test: Return nested in for loop should return True
    code = """
def func():
    for i in range(10):
        return i
"""
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 5.80μs -> 3.21μs (80.9% faster)

def test_function_with_return_in_while_loop():
    # Test: Return nested in while loop should return True
    code = """
def func():
    while True:
        return 1
"""
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 5.16μs -> 2.69μs (92.2% faster)

def test_function_with_return_in_try_block():
    # Test: Return in try-except block should return True
    code = """
def func():
    try:
        return 42
    except Exception:
        pass
"""
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 5.81μs -> 2.90μs (100% faster)

def test_function_with_return_in_except_block():
    # Test: Return in except block should return True
    code = """
def func():
    try:
        pass
    except Exception:
        return 42
"""
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 5.85μs -> 2.96μs (98.0% faster)

def test_function_with_return_in_finally_block():
    # Test: Return in finally block should return True
    code = """
def func():
    try:
        pass
    finally:
        return 42
"""
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 5.64μs -> 2.81μs (101% faster)

def test_function_with_return_in_with_block():
    # Test: Return in with block should return True
    code = """
def func():
    with open('file') as f:
        return f.read()
"""
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 5.81μs -> 3.06μs (89.6% faster)

def test_function_with_multiple_return_statements():
    # Test: Function with multiple return statements should return True
    code = """
def func(x):
    if x > 0:
        return 1
    return 2
"""
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 4.39μs -> 1.04μs (321% faster)

def test_function_with_deeply_nested_return():
    # Test: Return deeply nested in multiple blocks should return True
    code = """
def func(x):
    if x > 0:
        if x > 5:
            while True:
                for i in range(10):
                    try:
                        return i
                    except:
                        pass
"""
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 10.6μs -> 6.40μs (66.0% faster)

def test_function_empty_body():
    # Test: Function with only pass statement should return False
    code = """
def func():
    pass
"""
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 4.29μs -> 1.47μs (191% faster)

def test_function_with_docstring_only():
    # Test: Function with docstring but no return should return False
    code = '''
def func():
    """This is a docstring."""
'''
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 4.67μs -> 1.79μs (160% faster)

def test_function_with_docstring_and_return():
    # Test: Function with docstring and return should return True
    code = '''
def func():
    """This is a docstring."""
    return 42
'''
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 4.29μs -> 1.02μs (320% faster)

def test_function_with_nested_function_containing_return():
    # Test: Outer function whose only return lives in a nested function
    code = """
def outer():
    def inner():
        return 42
    x = inner()
"""
    tree = ast.parse(code)
    func_node = tree.body[0]
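    # inner() owns the only Return node; a traversal that descends into nested defs
    # (as in test_detects_return_inside_nested_function_definition) reports True here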
    codeflash_output = function_has_return_statement(func_node) # 7.82μs -> 4.70μs (66.3% faster)

def test_function_with_nested_function_and_outer_return():
    # Test: Outer function with return should return True
    code = """
def outer():
    def inner():
        pass
    return 42
"""
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 4.30μs -> 961ns (347% faster)

def test_function_with_nested_class_containing_method_with_return():
    # Test: Outer function whose only return lives in a nested class method
    code = """
def func():
    class MyClass:
        def method(self):
            return 42
    x = MyClass()
"""
    tree = ast.parse(code)
    func_node = tree.body[0]
    # method() owns the only Return node; a traversal that descends into nested defs reports True here
    codeflash_output = function_has_return_statement(func_node) # 9.58μs -> 5.59μs (71.3% faster)

def test_function_with_return_in_if_elif_else_chain():
    # Test: Return in elif block should return True
    code = """
def func(x):
    if x > 10:
        pass
    elif x > 5:
        return x
    else:
        pass
"""
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 7.59μs -> 3.99μs (90.5% faster)

def test_function_with_return_expression_containing_call():
    # Test: Return with function call expression should return True
    code = """
def func():
    return some_function()
"""
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 3.95μs -> 1.03μs (282% faster)

def test_function_with_return_complex_expression():
    # Test: Return with complex expression should return True
    code = """
def func():
    return x + y * z if condition else a
"""
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 4.00μs -> 1.05μs (280% faster)

def test_function_with_return_list_comprehension():
    # Test: Return with list comprehension should return True
    code = """
def func():
    return [x for x in range(10)]
"""
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 3.95μs -> 1.10μs (258% faster)

def test_function_with_return_dict_comprehension():
    # Test: Return with dict comprehension should return True
    code = """
def func():
    return {x: x*2 for x in range(10)}
"""
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 4.06μs -> 1.08μs (275% faster)

def test_function_with_return_none_explicit():
    # Test: Explicit return None should return True
    code = """
def func():
    return None
"""
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 4.13μs -> 1.03μs (300% faster)

def test_function_with_return_tuple():
    # Test: Return with tuple should return True
    code = """
def func():
    return (1, 2, 3)
"""
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 3.85μs -> 1.03μs (272% faster)

def test_function_with_return_unpacking():
    # Test: Return with unpacking should return True
    code = """
def func():
    return *items,
"""
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 4.01μs -> 1.06μs (277% faster)

def test_function_with_multiple_statements_no_return():
    # Test: Multiple statements without return should return False
    code = """
def func():
    x = 1
    y = 2
    z = 3
    print(x, y, z)
"""
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 9.09μs -> 4.71μs (93.0% faster)

def test_function_with_return_in_elif_not_if():
    # Test: Return only in elif, not in if should return True
    code = """
def func(x):
    if x < 0:
        pass
    elif x == 0:
        return 0
"""
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 7.01μs -> 3.74μs (87.7% faster)

def test_function_with_return_in_nested_if():
    # Test: Return in nested if inside while should return True
    code = """
def func():
    while condition:
        if True:
            return 1
"""
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 6.47μs -> 3.60μs (80.0% faster)

def test_function_with_return_in_with_as():
    # Test: Return in with statement with as clause should return True
    code = """
def func():
    with something() as s:
        return s.value
"""
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 5.63μs -> 2.97μs (89.9% faster)

def test_function_with_return_in_for_else():
    # Test: Return in for-else block should return True
    code = """
def func():
    for i in range(10):
        pass
    else:
        return 42
"""
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 5.91μs -> 3.28μs (80.4% faster)

def test_function_with_return_in_while_else():
    # Test: Return in while-else block should return True
    code = """
def func():
    while condition:
        pass
    else:
        return 42
"""
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 5.51μs -> 2.77μs (99.3% faster)

def test_function_with_return_in_try_except_else():
    # Test: Return in the else clause of a try/except should return True
    code = """
def func():
    try:
        pass
    except:
        pass
    else:
        return 42
"""
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 5.77μs -> 2.73μs (112% faster)

def test_function_no_statements():
    # Test: Function body is a lone Ellipsis expression (...), so there is no return
    code = "def func(): ..."
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 4.57μs -> 1.77μs (158% faster)

def test_function_with_lambda_containing_return_like_expression():
    # Test: Lambda expressions don't contain return statements syntactically
    code = """
def func():
    x = lambda y: y if y > 0 else 0
"""
    tree = ast.parse(code)
    func_node = tree.body[0]
    # Lambda doesn't have return statements in the AST, just expressions
    codeflash_output = function_has_return_statement(func_node) # 5.37μs -> 2.67μs (101% faster)

def test_function_with_raise_statement_no_return():
    # Test: Function with raise but no return should return False
    code = """
def func():
    raise ValueError("test")
"""
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 5.01μs -> 2.12μs (136% faster)

def test_function_with_raise_and_return():
    # Test: Function with both raise and return should return True
    code = """
def func(x):
    if x < 0:
        raise ValueError()
    return x
"""
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 4.34μs -> 962ns (351% faster)

def test_function_with_assert_no_return():
    # Test: Function with assert but no return should return False
    code = """
def func(x):
    assert x > 0
"""
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 5.16μs -> 2.24μs (130% faster)

def test_function_with_break_and_return():
    # Test: Function with break inside loop and return should return True
    code = """
def func():
    for i in range(10):
        if i == 5:
            break
    return 42
"""
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 4.26μs -> 991ns (330% faster)

def test_function_with_continue_no_return():
    # Test: Function with continue but no return should return False
    code = """
def func():
    for i in range(10):
        if i == 5:
            continue
        print(i)
"""
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 8.94μs -> 5.05μs (77.0% faster)

def test_function_with_yield_no_return():
    # Test: Generator with yield but no return should return False
    code = """
def func():
    yield 42
"""
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 4.66μs -> 1.82μs (155% faster)

def test_function_with_yield_and_return():
    # Test: Generator with both yield and return should return True
    code = """
def func():
    yield 1
    yield 2
    return
"""
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 4.45μs -> 1.02μs (335% faster)

def test_async_function_with_await_and_return():
    # Test: Async function with await and return should return True
    code = """
async def func():
    result = await something()
    return result
"""
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 4.46μs -> 982ns (354% faster)

def test_async_generator_with_yield_and_return():
    # Test: Async generator with yield and return should return True
    code = """
async def func():
    yield 42
    return
"""
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 4.24μs -> 942ns (350% faster)

def test_function_with_decorator():
    # Test: Decorators should not affect detection
    code = """
@decorator
def func():
    return 42
"""
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 4.22μs -> 1.15μs (266% faster)

def test_function_with_multiple_decorators():
    # Test: Multiple decorators should not affect detection
    code = """
@decorator1
@decorator2
@decorator3
def func():
    pass
"""
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 4.94μs -> 1.54μs (220% faster)

def test_function_with_default_arguments():
    # Test: Default arguments should not affect detection
    code = """
def func(x=42, y=None):
    return x + y
"""
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 4.02μs -> 1.11μs (261% faster)

def test_function_with_annotations():
    # Test: Type annotations should not affect detection
    code = """
def func(x: int) -> int:
    return x * 2
"""
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 4.05μs -> 1.08μs (274% faster)

def test_function_with_very_long_name():
    # Test: Function name length should not affect detection
    code = """
def this_is_a_very_long_function_name_that_goes_on_and_on():
    return 42
"""
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 3.92μs -> 1.07μs (265% faster)

def test_function_with_many_sequential_statements_no_return():
    # Test: Function with 100 sequential statements without return
    code_lines = ["def func():"]
    for i in range(100):
        code_lines.append(f"    x{i} = {i}")
    code = "\n".join(code_lines)
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 111μs -> 54.2μs (106% faster)

def test_function_with_many_sequential_statements_with_return():
    # Test: Function with 100 sequential statements and return at end
    code_lines = ["def func():"]
    for i in range(100):
        code_lines.append(f"    x{i} = {i}")
    code_lines.append("    return 42")
    code = "\n".join(code_lines)
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 15.4μs -> 1.41μs (991% faster)

def test_function_with_many_nested_if_statements():
    # Test: Function with deeply nested if statements
    code_lines = ["def func(x):"]
    indent = "    "
    for i in range(50):
        code_lines.append(f"{indent}if x > {i}:")
        indent += "    "
    code_lines.append(f"{indent}return 42")
    code = "\n".join(code_lines)
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 53.0μs -> 28.3μs (87.3% faster)

def test_function_with_many_if_elif_else_branches():
    # Test: Function with many if-elif-else branches
    code_lines = ["def func(x):"]
    code_lines.append("    if x == 0:")
    code_lines.append("        pass")
    for i in range(1, 50):
        code_lines.append(f"    elif x == {i}:")
        code_lines.append("        pass")
    code_lines.append("    else:")
    code_lines.append("        return 42")
    code = "\n".join(code_lines)
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 59.5μs -> 30.8μs (93.0% faster)

def test_function_with_many_try_except_blocks():
    # Test: Function with many try-except blocks
    code_lines = ["def func():"]
    for i in range(30):
        code_lines.append("    try:")
        code_lines.append(f"        x{i} = {i}")
        code_lines.append("    except Exception:")
        code_lines.append("        pass")
    code = "\n".join(code_lines)
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 77.3μs -> 38.9μs (98.6% faster)

def test_function_with_many_try_except_with_return_in_middle():
    # Test: Function with return in middle of many try-except blocks
    code_lines = ["def func():"]
    for i in range(20):
        code_lines.append("    try:")
        code_lines.append(f"        x{i} = {i}")
        code_lines.append("    except Exception:")
        code_lines.append("        pass")
    code_lines.append("    return 42")
    for i in range(20, 40):
        code_lines.append("    try:")
        code_lines.append(f"        x{i} = {i}")
        code_lines.append("    except Exception:")
        code_lines.append("        pass")
    code = "\n".join(code_lines)
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 55.4μs -> 26.6μs (108% faster)

def test_function_with_many_nested_loops():
    # Test: Function with multiple nested loops
    code_lines = ["def func():"]
    indent = "    "
    for level in range(10):
        code_lines.append(f"{indent}for i{level} in range(10):")
        indent += "    "
    code_lines.append(f"{indent}return 42")
    code = "\n".join(code_lines)
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 18.0μs -> 10.6μs (69.3% faster)

def test_function_with_many_nested_with_statements():
    # Test: Function with multiple nested with statements
    code_lines = ["def func():"]
    indent = "    "
    for i in range(15):
        code_lines.append(f"{indent}with context{i}() as c{i}:")
        indent += "    "
    code_lines.append(f"{indent}return 42")
    code = "\n".join(code_lines)
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 20.7μs -> 11.6μs (78.9% faster)

def test_function_with_many_nested_while_loops():
    # Test: Function with nested while loops
    code_lines = ["def func():"]
    indent = "    "
    for i in range(20):
        code_lines.append(f"{indent}while condition{i}:")
        indent += "    "
    code_lines.append(f"{indent}return 42")
    code = "\n".join(code_lines)
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 23.9μs -> 12.7μs (88.5% faster)

def test_function_with_100_statements_return_at_position_50():
    # Test: Return statement in the middle of a long function
    code_lines = ["def func():"]
    for i in range(50):
        code_lines.append(f"    x{i} = {i}")
    code_lines.append("    return 42")
    for i in range(50, 100):
        code_lines.append(f"    x{i} = {i}")
    code = "\n".join(code_lines)
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 64.1μs -> 28.5μs (125% faster)

def test_function_with_multiple_returns_in_different_branches():
    # Test: Multiple returns in different branches
    code_lines = ["def func(x):"]
    for i in range(20):
        code_lines.append(f"    if x == {i}:")
        code_lines.append(f"        return {i}")
    code = "\n".join(code_lines)
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 8.04μs -> 2.67μs (202% faster)

def test_function_without_returns_100_branches():
    # Test: No returns in function with 100 branches
    code_lines = ["def func(x):"]
    for i in range(100):
        code_lines.append(f"    if x == {i}:")
        code_lines.append("        pass")
    code = "\n".join(code_lines)
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 150μs -> 66.0μs (127% faster)

def test_function_with_many_assignments_no_return():
    # Test: Performance with many assignment statements
    code_lines = ["def func():"]
    for i in range(200):
        code_lines.append(f"    var_{i} = {i * 2}")
    code = "\n".join(code_lines)
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 219μs -> 106μs (106% faster)

def test_function_with_complex_control_flow_with_return():
    # Test: Complex control flow with multiple statement types
    code = """
def func(x, y, z):
    if x > 0:
        try:
            for i in range(y):
                if i % 2 == 0:
                    while i < z:
                        with context() as c:
                            if c.valid:
                                return i
                        i += 1
                else:
                    continue
        except Exception as e:
            pass
        finally:
            pass
    else:
        for j in range(z):
            yield j
"""
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 19.2μs -> 10.8μs (77.8% faster)

def test_function_with_complex_control_flow_no_return():
    # Test: Complex control flow without returns
    code = """
def func(x, y, z):
    if x > 0:
        try:
            for i in range(y):
                if i % 2 == 0:
                    while i < z:
                        with context() as c:
                            if c.valid:
                                break
                        i += 1
                else:
                    continue
        except Exception as e:
            pass
        finally:
            pass
    else:
        for j in range(z):
            yield j
"""
    tree = ast.parse(code)
    func_node = tree.body[0]
    codeflash_output = function_has_return_statement(func_node) # 19.2μs -> 10.7μs (79.6% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

To edit these changes, run `git checkout codeflash/optimize-pr1460-2026-02-18T22.34.56` and push.


The optimized code achieves a **147% speedup** (from 1.47ms to 595μs) by eliminating the overhead of `ast.iter_child_nodes()` and replacing it with direct field access on AST nodes.

**Key optimizations:**

1. **Direct stack initialization**: Instead of starting with `[function_node]` and then traversing into its body, the stack is initialized directly with `list(function_node.body)`. This skips one iteration and avoids processing the function definition wrapper itself.

2. **Manual field traversal**: Rather than calling `ast.iter_child_nodes(node)` which is a generator that yields all child nodes, the code directly accesses `node._fields` and uses `getattr()` to inspect each field. This eliminates the generator overhead and function call costs associated with `ast.iter_child_nodes()`.

3. **Targeted statement filtering**: By checking `isinstance(child, ast.stmt)` or `isinstance(item, ast.stmt)` only on relevant fields (handling both single statements and lists of statements), the traversal focuses on statement nodes where `ast.Return` can appear, avoiding unnecessary checks on expression nodes.
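
The three points above can be sketched as follows. This is a minimal illustration written from the description, not the merged code: the names `has_return_baseline`, `has_return_fields`, and `_BODY_HOLDERS` are hypothetical. One deliberate difference from the description is labeled in the comments: the sketch also descends into `ast.excepthandler` and `ast.match_case` nodes, because they hold statement bodies but are not `ast.stmt` subclasses, so a filter on `ast.stmt` alone would skip a `return` inside an `except` block.

```python
import ast

# Node types whose fields can contain statements. ast.stmt is what the PR
# description names; the other two are an assumption added for this sketch.
_BODY_HOLDERS = (ast.stmt, ast.excepthandler) + (
    (ast.match_case,) if hasattr(ast, "match_case") else ()
)


def has_return_baseline(function_node: ast.AST) -> bool:
    """Baseline: iterative DFS over every child via ast.iter_child_nodes()."""
    stack = [function_node]
    while stack:
        node = stack.pop()
        if isinstance(node, ast.Return):
            return True
        stack.extend(ast.iter_child_nodes(node))
    return False


def has_return_fields(function_node: ast.AST) -> bool:
    """Field-based DFS: start from the body, push only statement holders."""
    stack = list(function_node.body)
    while stack:
        node = stack.pop()
        if isinstance(node, ast.Return):
            return True
        for field in node._fields:
            child = getattr(node, field, None)
            if isinstance(child, list):
                for item in child:
                    if isinstance(item, _BODY_HOLDERS):
                        stack.append(item)
            elif isinstance(child, _BODY_HOLDERS):
                stack.append(child)
    return False
```

Both functions treat a `return` inside a nested function as a match, so they should agree on any function definition; comparing them over a corpus of parsed functions is a cheap way to check that.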

**Why this is faster:**

- **Reduced function call overhead**: `ast.iter_child_nodes()` is a generator function that incurs call/yield overhead on every iteration. Direct attribute access via `getattr()` is faster for small numbers of fields.
- **Fewer iterations**: The line profiler shows the original code's `ast.iter_child_nodes()` line hit 5,453 times (69% of runtime), while the optimized version's field iteration hits only 3,290 times (17.4% of runtime).
- **Better cache locality**: Direct field access patterns may benefit from better CPU cache utilization compared to generator state management.
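
A rough way to probe these claims locally is to count visited nodes and time both traversal styles on a synthetic function. The sketch below is an assumption-laden micro-benchmark, not the profiler run quoted above: `walk_all_children` and `walk_stmt_children` are hypothetical helpers, and absolute timings will vary by machine and Python version.

```python
import ast
import timeit

# Synthetic input: a function with 200 assignments and no return.
source = "def func():\n" + "\n".join(f"    var_{i} = {i * 2}" for i in range(200))
func_node = ast.parse(source).body[0]


def walk_all_children(root):
    """Visit every node reachable through ast.iter_child_nodes(); return the count."""
    count = 0
    stack = [root]
    while stack:
        node = stack.pop()
        count += 1
        stack.extend(ast.iter_child_nodes(node))
    return count


def walk_stmt_children(root):
    """Visit only statement nodes reachable through direct field access."""
    count = 0
    stack = list(root.body)
    while stack:
        node = stack.pop()
        count += 1
        for field in node._fields:
            child = getattr(node, field, None)
            if isinstance(child, list):
                stack.extend(item for item in child if isinstance(item, ast.stmt))
            elif isinstance(child, ast.stmt):
                stack.append(child)
    return count


visited_all = walk_all_children(func_node)
visited_stmts = walk_stmt_children(func_node)
time_all = timeit.timeit(lambda: walk_all_children(func_node), number=200)
time_stmts = timeit.timeit(lambda: walk_stmt_children(func_node), number=200)

print(f"nodes visited: all children={visited_all}, statements only={visited_stmts}")
print(f"seconds for 200 runs: all children={time_all:.4f}, statements only={time_stmts:.4f}")
```

Printing the visit counts alongside the timings helps separate the two effects: how much of the gain comes from visiting fewer nodes, and how much from avoiding the generator.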

**Test case performance:**

The optimization shows dramatic improvements particularly for:
- **Functions with many sequential statements** (2365% faster for 1000 statements, 1430% faster for 1000 nested functions)
- **Simple functions** (234-354% faster for basic return detection)
- **Moderately complex control flow** (80-125% faster for nested conditionals/loops)

The speedup is consistent across all test cases, with early-return scenarios benefiting the most as the optimization allows faster discovery of the return statement before processing unnecessary nodes.
@codeflash-ai codeflash-ai bot added the "⚡️ codeflash" (Optimization PR opened by Codeflash AI) and "🎯 Quality: High" (Optimization Quality according to Codeflash) labels Feb 18, 2026
@codeflash-ai codeflash-ai bot mentioned this pull request Feb 18, 2026
2 tasks
@KRRT7 KRRT7 merged commit 392453a into call-graphee Feb 18, 2026
27 of 28 checks passed
@KRRT7 KRRT7 deleted the codeflash/optimize-pr1460-2026-02-18T22.34.56 branch February 18, 2026 22:49
@claude
Contributor

claude bot commented Feb 18, 2026

PR Review Summary

Prek Checks

✅ All prek checks pass — no formatting or linting issues found.

Mypy

⚠️ 19 pre-existing mypy errors in codeflash/discovery/functions_to_optimize.py — none introduced by this PR. All errors relate to existing str vs Path type mismatches and Optional handling that predate these changes.

Code Review

✅ No critical issues found. The PR makes two clean optimizations:

  1. find_functions_with_return_statement (lines 117-140): Refactors the FunctionWithReturnStatement class (ast.NodeVisitor) into a standalone function using iterative DFS. Functionally equivalent — same behavior for function/class/async traversal, same continue to skip recursing into function bodies.

  2. function_has_return_statement (lines 957-974): Optimizes AST traversal by only visiting ast.stmt children instead of all child nodes. Correct because ast.Return is always an ast.stmt subclass and can only appear in statement contexts. Initializes the stack with function_node.body directly instead of wrapping the function node.
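
For readers unfamiliar with the first refactor, here is a minimal sketch of replacing an `ast.NodeVisitor` with an iterative DFS that stops at function boundaries. It is an illustration under assumptions, not the code under review: `find_functions_with_return` and `_has_return` are hypothetical names, and the sketch returns dotted names rather than whatever objects the real function builds.

```python
import ast


def _has_return(function_node: ast.AST) -> bool:
    """Simple stand-in check: any Return node anywhere under the function."""
    return any(isinstance(n, ast.Return) for n in ast.walk(function_node))


def find_functions_with_return(module: ast.Module) -> list[str]:
    """Collect top-level functions and class methods that contain a return."""
    results = []
    stack = [(module, "")]
    while stack:
        node, prefix = stack.pop()
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            if _has_return(node):
                results.append(prefix + node.name)
            continue  # do not recurse into function bodies
        # Classes extend the qualifying prefix for the methods they contain.
        child_prefix = f"{prefix}{node.name}." if isinstance(node, ast.ClassDef) else prefix
        # Push in reverse so nodes pop in source order.
        for child in reversed(list(ast.iter_child_nodes(node))):
            stack.append((child, child_prefix))
    return results
```

Because of the `continue`, functions defined inside other functions are not reported separately, which matches the behavior described in item 1.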

Test Coverage

| File | Stmts | Miss | Cover |
|------|-------|------|-------|
| codeflash/discovery/functions_to_optimize.py | 525 | 165 | 69% |

Changed lines analysis:

  • Lines 117-140 (find_functions_with_return_statement): All executable lines covered via find_all_functions_in_file() → _find_all_functions_in_python_file() call chain
  • Lines 957-974 (function_has_return_statement): All executable lines covered — both return True (functions with returns) and return False (functions without) paths exercised

The codeflash bot reports 100% test coverage on the optimization with 80 generated regression tests passing.

Test suite: 2392 passed, 57 skipped, 8 failed (all failures in test_tracer.py — pre-existing, unrelated to this PR)

Codeflash Optimization PRs

No mergeable optimization PRs targeting main — PR #1389 and #1291 both have failing CI checks.


Last updated: 2026-02-18

