Performance issues with shutil.copytree on Windows #144687

@Marc-Pierre-Barbier

Feature or enhancement

Proposal:

I noticed that Conan seemed to hang while building clang-tidy on Windows, and I wondered why. After digging around for a while, I found that _copy_sources calls shutil.copytree, which appears to get stuck for a long time. It is technically not stuck, just very slow.

To get performance numbers for comparison, I patched installer.py._copy_sources to time the copy, first with shutil.copytree and then with robocopy:

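    import shutil
    import time

    # source_folder and build_folder come from the surrounding Conan method
    start_time = time.time()
    shutil.copytree(source_folder, build_folder, symlinks=True)
    end_time = time.time()
    print(end_time - start_time)  # elapsed wall-clock seconds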

This runs for 1329.195054769516 s.

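    import subprocess
    import time

    # source_folder and build_folder come from the surrounding Conan method
    start_time = time.time()
    out = subprocess.run(['robocopy', '/ndl', '/nfl', '/sl', '/S', source_folder + '\\', build_folder + '\\'])
    # 0 and 1 are success codes here; this check treats anything else as an error
    assert out.returncode <= 1
    end_time = time.time()
    print(end_time - start_time)  # elapsed wall-clock seconds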

This runs for 92.37016916275024 s.
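As a side note on the return code check above: robocopy's exit code is a bit field, and to my knowledge Microsoft's documentation treats any value of 8 or higher as "at least one failure", while lower values (for example 3, meaning files were copied and extra files were found in the destination) are not failures. A minimal sketch of a more tolerant check follows; the names `robocopy_failed` and `run_robocopy` are hypothetical helpers, not part of Conan or the standard library.

```python
import subprocess

# Assumption: exit codes below 8 mean that no copy operation failed.
ROBOCOPY_FAILURE_THRESHOLD = 8


def robocopy_failed(returncode: int) -> bool:
    """Return True when a robocopy exit code signals at least one failure."""
    return returncode >= ROBOCOPY_FAILURE_THRESHOLD


def run_robocopy(source_folder: str, build_folder: str) -> int:
    """Hypothetical wrapper around the command used in this report (Windows only)."""
    out = subprocess.run(
        ['robocopy', '/ndl', '/nfl', '/sl', '/S',
         source_folder + '\\', build_folder + '\\']
    )
    if robocopy_failed(out.returncode):
        raise RuntimeError(f"robocopy failed with exit code {out.returncode}")
    return out.returncode
```

The stricter `<= 1` assertion used for the measurement is still fine for a fresh destination folder; the looser check only matters when the destination already contains files.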

The copied folder contains only the LLVM source code as shipped on their release page, so this is a real-world workload, not a synthetic one.

So robocopy is roughly 14 times faster, finishing in about 1.5 minutes, while shutil.copytree needed about 22 minutes.
The robocopy patch is obviously not a clean solution, and it takes a lot more space on disk (it no longer creates symlinks), but it does show that shutil has a serious performance problem when copying trees with a high file count.
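For anyone who wants to reproduce the shutil side without downloading LLVM, here is a minimal sketch of a timing harness. It is not the measurement from this report: the synthetic tree of small files and the helper names (`make_tree`, `time_copy`, `count_files`) are assumptions made for illustration, and the default sizes are far smaller than a real LLVM checkout.

```python
import shutil
import tempfile
import time
from pathlib import Path


def make_tree(root: Path, dirs: int = 20, files_per_dir: int = 50) -> int:
    """Create a synthetic tree of small files and return the file count."""
    count = 0
    for d in range(dirs):
        folder = root / f"dir_{d}"
        folder.mkdir(parents=True)
        for f in range(files_per_dir):
            (folder / f"file_{f}.txt").write_text("x" * 128)
            count += 1
    return count


def time_copy(copy_fn, src: Path, dst: Path) -> float:
    """Run copy_fn(src, dst) and return the elapsed wall-clock seconds."""
    start = time.perf_counter()
    copy_fn(src, dst)
    return time.perf_counter() - start


def count_files(root: Path) -> int:
    """Count regular files below root, to verify that a copy is complete."""
    return sum(1 for p in root.rglob("*") if p.is_file())


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "src"
        dst = Path(tmp) / "dst"
        n = make_tree(src)
        elapsed = time_copy(
            lambda s, d: shutil.copytree(s, d, symlinks=True), src, dst
        )
        print(f"copied {n} files in {elapsed:.3f} s")
```

Raising `dirs` and `files_per_dir` should bring the file count closer to the workload described above; any other copy function with the same `(src, dst)` signature can be passed to `time_copy` for comparison.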

I will update this issue tomorrow with performance numbers for symlinks=False.

Has this already been discussed elsewhere?

It was already discussed on Discourse a while ago.

Links to previous discussion of this feature:

At the time, the only proposal was to use multithreading. I did not benchmark that approach, since it was judged inadequate back then, but now that the GIL is less of an issue it may be worth reconsidering.

#124117
https://discuss.python.org/t/significantly-improve-shutil-copytree/62078/25
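To make the multithreading idea concrete, here is a minimal sketch of a threaded tree copy. It is not the implementation proposed in the linked discussion, and it has not been benchmarked on Windows; `threaded_copytree` and its `max_workers` parameter are hypothetical names. It also simplifies compared with shutil.copytree: symlinks to files are copied as regular files, symlinked directories are skipped, and directory metadata is not preserved.

```python
import os
import shutil
from concurrent.futures import ThreadPoolExecutor


def threaded_copytree(src: str, dst: str, max_workers: int = 8) -> int:
    """Copy the tree at src to dst using worker threads for the file copies.

    Returns the number of files copied. Directories are created up front in
    the calling thread so that workers only ever copy into existing folders.
    """
    jobs = []
    for dirpath, _dirnames, filenames in os.walk(src):
        rel = os.path.relpath(dirpath, src)
        target_dir = os.path.normpath(os.path.join(dst, rel))
        os.makedirs(target_dir, exist_ok=True)
        for name in filenames:
            jobs.append(
                (os.path.join(dirpath, name), os.path.join(target_dir, name))
            )

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # list() waits for every copy and re-raises the first exception, if any.
        list(pool.map(lambda job: shutil.copy2(job[0], job[1]), jobs))
    return len(jobs)
```

Whether this actually narrows the gap with robocopy would need to be measured on the same LLVM tree; the sketch only shows the shape of the approach.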

Labels: OS-windows, performance, stdlib, type-feature
