Conversation
|
Thanks. It would be great to have a CI reproducer first, could you have a look or do you want me to handle it? |
|
I could cherry pick the commit here in case you want to merge them in one go |
yes, please do, first the ci commit then the fix :-) |
eac3479 to
d5ed51f
Compare
|
The fact that it breaks it is a good thing for me. I was chasing bugs in finufft with gcc-10. This might be the cause. flatironinstitute/finufft#780 |
the issue seems to be gcc-10 auto vectorization not xsimd explicit one. Adding that this code too causes issues: https://marco.godbolt.org/z/bjo6o8TfG |
|
if that's a GCC issue, specific to gcc-10, we could add something specific, but is that worth the effort? |
|
gcc-10 has a problem with avx512 but there is also an issue that I introduced when changing the avx512 swizzle: flatironinstitute/finufft#780 (comment) I should fix that. For gcc-10 if this does not pass CI, I am not sure what is best. I just need xsimd to work with gcc-10 as is not too old of a compiler otherwise I need a #error in finufft if avx512 is used to compile finufft. |
test/test_shuffle.cpp
Outdated
| } | ||
| }; | ||
|
|
||
| #if defined(__GNUC__) && (__GNUC__ == 10) && XSIMD_WITH_AVX512F |
There was a problem hiding this comment.
clang also defines __GNUC__, probably not as __GNUC__ == 10 but && !defined (__clang__) would be safer.
| template <typename ElemT, typename U, U... Vs> | ||
| XSIMD_INLINE constexpr bool is_cross_lane() noexcept | ||
| { | ||
| static_assert(std::is_integral<U>::value, "swizzle mask values must be integral"); |
There was a problem hiding this comment.
you could probably also static_assert one sizeof...(U) to be on the safe side.
| static_assert(sizeof...(Vs) >= 1, "Need at least one lane"); | ||
| constexpr std::size_t N = sizeof...(Vs); | ||
| constexpr std::size_t lane_elems = 16 / sizeof(ElemT); | ||
| return cross_impl128<0, N, lane_elems, U, Vs...>::value; |
There was a problem hiding this comment.
Now that we're in C++14, which has better constexpr support you could probably write this in a more procedural style, using a temporary array and a for loop to perform the check?
e9699a8 to
346301a
Compare
Replace _mm512_cvtsi512_si32 with _mm_cvtsi128_si32(_mm512_castsi512_si128(self)) to fix compilation issues with GCC 10. The _mm512_cvtsi512_si32 intrinsic is not available in older compiler versions.
Public API now checks 128-bit (16-byte) lanes, the standard for SSE/AVX/AVX512. Internal helper available for explicit lane sizes. Uses C++14 constexpr procedural style.
346301a to
1f39847
Compare
Add compiler-specific workaround for GCC 10 with AVX-512F in shuffle tests. Use zip_lo/zip_hi as stable reference for the expected interleave pattern instead of manually constructing the reference arrays.
1f39847 to
df8db98
Compare
|
@DiamonDinoia , this looks good to me, CI is green, can I merge? |
|
Yes! thanks |
|
Thank you |
On my machine with gcc-10 I get the error:
This fixes it for me.