refactor: move Python static analysis modules to languages/python/static_analysis/ #1546
The optimization achieves a **523% speedup** (from 2.29s to 367ms) by eliminating expensive libcst metadata operations and replacing the visitor/transformer pattern with direct AST manipulation.

## Key Performance Improvements

**1. Removed MetadataWrapper (~430ms saved, ~9% of total time)**
- Original: `cst.metadata.MetadataWrapper(cst.parse_module(optimized_code))` then `optimized_module.visit(visitor)` took 5.45s combined
- Optimized: Direct `cst.parse_module(optimized_code)` takes only 183ms
- The metadata infrastructure was unnecessary for this use case since we only need to identify and extract function definitions, not track parent-child relationships

**2. Replaced Visitor Pattern with Direct Iteration (~5.3s saved, ~78% of total time)**
- Original: Used `OptimFunctionCollector` visitor class with metadata dependencies, requiring full tree traversal and metadata resolution
- Optimized: Simple for-loop over `optimized_module.body` to collect functions and classes
- Direct iteration avoids the overhead of visitor callback infrastructure and metadata lookups

**3. Eliminated Transformer Pattern (~87ms saved, ~1.6% of total time)**
- Original: Used `OptimFunctionReplacer` transformer to traverse and rebuild the entire AST
- Optimized: Manual list building with targeted `with_changes()` calls only where needed
- Reduces redundant tree traversals and object creation

**4. Improved Memory Efficiency**
- Pre-allocated data structures instead of using visitor state
- Single-pass collection instead of multiple tree traversals
- Direct list manipulation instead of transformer's recursive rebuilding

## Test Performance Pattern

The optimization excels across all test cases:
- **Simple functions**: 587-696% faster (e.g., `test_replace_simple_function`: 2.62ms → 459μs)
- **Class methods**: 509-549% faster (e.g., `test_replace_function_in_class`: 2.24ms → 367μs)
- **Large files**: Still shows gains even with parsing overhead (e.g., `test_replace_function_in_large_file`: 9.37ms → 7.32ms, 28% faster)
- **Batch operations**: Dramatic improvement in loops (e.g., 1000 iterations: 1.91s → 201ms, 850% faster)

## Impact on Workloads

Based on `function_references`, this optimization benefits:
- **Test suites** that perform multiple function replacements during test execution
- **Code refactoring tools** that need to replace functions while preserving surrounding code
- **Language parity testing** where consistent performance across language support implementations matters

The optimization is particularly valuable for batch processing scenarios (as shown by the 850% improvement in the loop test), making it highly effective for CI/CD pipelines and automated code transformation workflows.
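The difference between visitor-based collection and direct iteration over the module body can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: it uses the standard-library `ast` module as a stand-in for libcst so that it runs without extra dependencies, and the names `SOURCE`, `FunctionCollector`, and `collect_top_level` are hypothetical.

```python
import ast

# Hypothetical input module, used only for illustration.
SOURCE = '''
import os

class Greeter:
    def greet(self):
        return "hi"

def helper():
    return 1

def main():
    return helper()
'''


class FunctionCollector(ast.NodeVisitor):
    """Visitor-based collection: a callback is dispatched for each node reached."""

    def __init__(self):
        self.functions = {}
        self.classes = {}

    def visit_FunctionDef(self, node):
        # No generic_visit() call, so nested definitions are not collected.
        self.functions[node.name] = node

    def visit_ClassDef(self, node):
        # No generic_visit() call, so methods are not collected as functions.
        self.classes[node.name] = node


def collect_top_level(module):
    """Direct iteration: inspect only the module's immediate children."""
    functions, classes = {}, {}
    for stmt in module.body:
        if isinstance(stmt, ast.FunctionDef):
            functions[stmt.name] = stmt
        elif isinstance(stmt, ast.ClassDef):
            classes[stmt.name] = stmt
    return functions, classes


module = ast.parse(SOURCE)

visitor = FunctionCollector()
visitor.visit(module)

functions, classes = collect_top_level(module)
```

Both approaches find the same top-level definitions here; the loop simply skips the dispatch machinery. Whether the saving is as large as the figures quoted above depends on the libcst metadata providers involved, which this stand-in does not model.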
⚡️ Codeflash found optimizations for this PR 📄 523% (5.23x) speedup for
## PR Review Summary

### Prek Checks
Fixed and verified. Auto-fixed 4 ruff errors and 6 formatting issues:

All fixes committed and pushed. Prek now passes cleanly.

### Mypy
Pre-existing mypy errors in moved files (libcst union-attr, missing generic type params). These are not introduced by this PR: the refactoring moved files without changing logic.

### Code Review
No new critical issues found. Two previously flagged issues still apply:

### Test Coverage

Notable changes:

### Codeflash Optimization PRs
2 open PRs targeting main (#1389, #1291): both have CI failures (…)

Last updated: 2026-02-19T12:00:00Z
…2026-02-19T09.51.40 ⚡️ Speed up method `PythonSupport.replace_function` by 523% in PR #1546 (`follow-up-reference-graph`)
This PR is now faster! 🚀 @KRRT7 accepted my optimizations from:
⚡️ Codeflash found optimizations for this PR 📄 12% (0.12x) speedup for

⚡️ Codeflash found optimizations for this PR 📄 1,230% (12.30x) speedup for
    if new_functions:
        if max_function_index is not None:
            new_body = [*new_body[: max_function_index + 1], *new_functions, *new_body[max_function_index + 1 :]]
**Bug:** Index offset when both new classes and new functions are inserted.
`max_function_index` is computed from the original module body, but after new classes are inserted at line 497, `new_body` has grown by `len(unique_classes)` elements. If `new_classes_insertion_idx <= max_function_index`, the function insertion uses a stale index and places new functions at the wrong position.
Example: original body = `[import, ClassA, func_B, func_C]` → `max_class_index=1`, `max_function_index=3`. After inserting `NewClass` at index 1, `new_body` becomes `[import, NewClass, ClassA, func_B, func_C]`. Functions would be inserted at index 4 instead of 5, landing before `func_C` rather than after it.
Consider adjusting the index:
    if new_functions:
        offset = len(unique_classes) if new_classes and unique_classes and (new_classes_insertion_idx or 0) <= (max_function_index or 0) else 0
        idx = max_function_index + 1 + offset if max_function_index is not None else ...
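The stale-index problem described in this comment can be reproduced with plain lists. The sketch below is illustrative only: `insert_nodes` and its parameters are hypothetical names, strings stand in for CST nodes, and it assumes both indices are computed against the original body, as the comment describes.

```python
def insert_nodes(body, new_classes, new_functions, class_idx, max_function_index, adjust):
    """Insert new classes at class_idx, then new functions after the last original function.

    Both indices refer to positions in the ORIGINAL body. When adjust is True,
    the function index is shifted by the number of classes inserted at or
    before it; when False, the stale index is used unchanged.
    """
    new_body = [*body[:class_idx], *new_classes, *body[class_idx:]]

    offset = 0
    if adjust and new_classes and class_idx <= max_function_index:
        offset = len(new_classes)

    idx = max_function_index + 1 + offset
    return [*new_body[:idx], *new_functions, *new_body[idx:]]


# Example from the review comment: the last original function sits at index 3.
body = ["import", "ClassA", "func_B", "func_C"]

stale = insert_nodes(body, ["NewClass"], ["new_func"], 1, 3, adjust=False)
fixed = insert_nodes(body, ["NewClass"], ["new_func"], 1, 3, adjust=True)
```

With the stale index, `new_func` lands between `func_B` and `func_C`; with the offset applied, it lands after `func_C` as intended.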
    """
    try:
        from codeflash.languages.javascript.treesitter import TreeSitterAnalyzer, TreeSitterLanguage
Nit: This is a self-import. `extract_calling_function_source` is defined in `codeflash.languages.javascript.treesitter`, so importing `TreeSitterAnalyzer` and `TreeSitterLanguage` from the same module is unnecessary. This was carried over from when the function lived in `code_extractor.py`. Consider removing the import and using the classes directly.
…java-support Merge main's lang_support refactor (PR #1546) with Java support branch. Key decisions: grouped Python-specific imports (main's style), kept Java routing in parse_test_output and verification_utils, added tree-sitter-java dep and B009 ruff ignore.
## Summary
- …(`static_analysis.py`, `concolic_utils.py`, `coverage_utils.py`, `line_profile_utils.py`, `edit_generated_tests.py`, `code_extractor.py`, `code_replacer.py`) from `codeflash/code_utils/` to `codeflash/languages/python/static_analysis/`
- …`languages/python/`

## Test plan
- `uv run python -c "from codeflash.languages.python.static_analysis import ..."`: all 7 modules import
- `uv run pytest tests/test_static_analysis.py tests/test_code_replacement.py tests/test_add_needed_imports_from_module.py -x`: 74 tests pass
- `uv run ruff check`: clean