Description
Feature or enhancement
Proposal:
I noticed that Conan seemed to softlock while building clang-tidy on Windows, and I wondered why. After digging around for a while, I found that _copy_sources calls shutil.copytree, which appears to hang for a long time. It is technically not stuck, just very slow.
To get performance numbers for comparison, I tried robocopy by patching installer.py._copy_sources:
start_time = time.time()
shutil.copytree(source_folder, build_folder, symlinks=True)
end_time = time.time()
print(end_time - start_time)
This runs for 1329.195054769516s.
start_time = time.time()
out = subprocess.run(['robocopy', '/ndl', '/nfl', '/sl', '/S', source_folder+'\\', build_folder+'\\'])
assert out.returncode <= 1  # 0 and 1 are not errors; anything else should indicate an error
end_time = time.time()
print(end_time - start_time)
This runs for 92.37016916275024s.
The copied folder contains just the LLVM source code as shipped on their release page. This is not a theoretical workload.
So robocopy is significantly faster: it finished in about 1.5 minutes, while shutil.copytree needed about 22 minutes, which makes shutil roughly 14 times slower.
This solution is obviously not clean and takes a lot more space on disk (it no longer creates symlinks), but it does show that shutil has a significant performance problem when copying a large number of files.
I will update this issue tomorrow with performance numbers for symlinks=False.
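For anyone who wants to reproduce this kind of measurement without the LLVM sources, here is a minimal sketch that times shutil.copytree on a synthetic tree of many small files. The helper names (`make_tree`, `time_copytree`, `count_files`) and the tree dimensions are assumptions made for illustration; they are not part of Conan or shutil, and a synthetic tree will not necessarily show the same slowdown as the real workload.

```python
import os
import shutil
import tempfile
import time


def make_tree(root, n_dirs=20, files_per_dir=50, size=1024):
    """Create a synthetic tree of small files under root (hypothetical helper)."""
    payload = b"x" * size
    for d in range(n_dirs):
        dir_path = os.path.join(root, f"dir{d}")
        os.makedirs(dir_path)
        for f in range(files_per_dir):
            with open(os.path.join(dir_path, f"file{f}.txt"), "wb") as fh:
                fh.write(payload)


def time_copytree(src, dst, **kwargs):
    """Return the wall-clock seconds shutil.copytree needs to copy src to dst."""
    start = time.perf_counter()
    shutil.copytree(src, dst, **kwargs)
    return time.perf_counter() - start


def count_files(root):
    """Count regular directory entries reported as files by os.walk."""
    return sum(len(files) for _, _, files in os.walk(root))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "src")
        os.makedirs(src)
        make_tree(src)
        elapsed = time_copytree(src, os.path.join(tmp, "dst"), symlinks=True)
        print(f"copied {count_files(src)} files in {elapsed:.3f}s")
```

Passing `symlinks=False` instead gives the other data point mentioned above; time.perf_counter is used rather than time.time because it is intended for measuring intervals.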
Has this already been discussed elsewhere?
It was already discussed on Discourse a while ago
Links to previous discussion of this feature:
At the time, the only proposal was to use multithreading. I did not benchmark that solution because it was judged inadequate back then, but now that the GIL is less of an issue, it may be worth considering again.
#124117
https://discuss.python.org/t/significantly-improve-shutil-copytree/62078/25
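To make the multithreading idea concrete, here is a minimal sketch of a thread-pool based tree copy. It is not the implementation proposed in the Discourse thread; `threaded_copytree` and its `max_workers` parameter are hypothetical names, and the sketch assumes a simplified behavior: directories are created up front, files are copied with shutil.copy2, file symlinks are recreated as links, and symlinked directories are skipped. Whether this is actually faster depends on the filesystem and has not been measured here.

```python
import os
import shutil
from concurrent.futures import ThreadPoolExecutor


def threaded_copytree(src, dst, max_workers=8):
    """Copy the tree at src to dst, copying files in a thread pool.

    Hypothetical helper for illustration only. Symlinked directories are
    not descended into and are not recreated in this sketch.
    """
    jobs = []
    # First pass (single-threaded): create directories and collect file pairs.
    for dirpath, _dirnames, filenames in os.walk(src):
        rel = os.path.relpath(dirpath, src)
        target_dir = dst if rel == "." else os.path.join(dst, rel)
        os.makedirs(target_dir, exist_ok=True)
        for name in filenames:
            jobs.append((os.path.join(dirpath, name),
                         os.path.join(target_dir, name)))

    # Second pass: copy files concurrently.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [
            pool.submit(shutil.copy2, s, d, follow_symlinks=False)
            for s, d in jobs
        ]
        for fut in futures:
            fut.result()  # re-raise the first copy error, if any
    return dst
```

Calling `threaded_copytree(source_folder, build_folder)` in place of shutil.copytree would allow a direct timing comparison, with the caveat that, unlike shutil.copytree, this sketch does not fail if the destination already exists.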