Thanks to @cloudRoutine, we have a solid benchmarking framework in place. In some cases the SIMD versions of the operations don't win, or win only with very large arrays. We should investigate ways to improve that.
Loop Unrolling
Unrolling the main SIMD loop could be viable, as the JIT tends not to do that, while C++ compilers tend to put two SIMD operations in each loop iteration. EDIT: I've not managed to get any meaningful performance gains by trying this, but maybe there is a cleverer way.
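For anyone who wants to experiment, here is a minimal sketch of what "two SIMD operations per loop" could look like, written in portable C++ rather than in this project's own code. Everything in it is an assumption for illustration: `Vec4`, `load`, `add`, `horizontal_sum` and `sum_unrolled` are hypothetical names, and `Vec4` is only a stand-in for a real 4-lane hardware vector type. The idea is that the main loop consumes two vectors per iteration into two independent accumulators, a second loop handles one leftover vector, and a scalar loop handles the tail.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a 4-lane hardware SIMD register; a real
// implementation would use the platform's vector type instead.
struct Vec4 {
    float lane[4];
};

// Load four consecutive floats into a vector.
inline Vec4 load(const float* p) {
    return Vec4{{p[0], p[1], p[2], p[3]}};
}

// Lane-wise addition of two vectors.
inline Vec4 add(const Vec4& a, const Vec4& b) {
    return Vec4{{a.lane[0] + b.lane[0], a.lane[1] + b.lane[1],
                 a.lane[2] + b.lane[2], a.lane[3] + b.lane[3]}};
}

// Collapse the four lanes into a single scalar.
inline float horizontal_sum(const Vec4& v) {
    return v.lane[0] + v.lane[1] + v.lane[2] + v.lane[3];
}

// Sum with the main loop unrolled by two: each iteration issues two
// vector adds into separate accumulators, so neither add depends on
// the other's result.
float sum_unrolled(const std::vector<float>& xs) {
    const std::size_t n = xs.size();
    const float* p = xs.data();
    Vec4 acc0{{0.0f, 0.0f, 0.0f, 0.0f}};
    Vec4 acc1{{0.0f, 0.0f, 0.0f, 0.0f}};
    std::size_t i = 0;

    // Unrolled main loop: two vectors (8 floats) per iteration.
    for (; i + 8 <= n; i += 8) {
        acc0 = add(acc0, load(p + i));
        acc1 = add(acc1, load(p + i + 4));
    }
    // At most one whole vector can remain after the unrolled loop.
    for (; i + 4 <= n; i += 4) {
        acc0 = add(acc0, load(p + i));
    }
    // Combine both accumulators, then finish the scalar tail.
    float total = horizontal_sum(add(acc0, acc1));
    for (; i < n; ++i) {
        total += p[i];
    }
    return total;
}
```

Using two accumulators rather than one is the design choice worth testing: it breaks the dependency chain between consecutive adds. Whether that translates into a measurable win under the JIT is exactly what the benchmarks would need to show, and the attempt described above did not find one.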