Fix non-release kernel builds via CudaBuilder #322
LegNeato wants to merge 1 commit into Rust-GPU:main
First, we weren't handling all the types. After fixing that, it exposed a `libnvvm` crash. Also saw a type issue in one of the warp APIs used by vecadd so fixed that. Fixes Rust-GPU#320
nnethercote left a comment
One problem, one question. It would be nice to have some kind of test added to avoid regressing, too, though I'm not sure what that would look like.
  extern "C" {
      #[link_name = "llvm.nvvm.match.any.sync.i64"]
-     fn __nvvm_warp_match_any_64(mask: u32, value: u64) -> u32;
+     fn __nvvm_warp_match_any_64(mask: u32, value: u64) -> u64;
This looks wrong. https://docs.nvidia.com/cuda/nvvm-ir-spec/index.html has this:
declare i32 @llvm.nvvm.match.any.sync.i32(i32 %membermask, i32 %value)
declare i32 @llvm.nvvm.match.any.sync.i64(i32 %membermask, i64 %value)
declare {i32, i1} @llvm.nvvm.match.all.sync.i32(i32 %membermask, i32 %value)
declare {i32, i1} @llvm.nvvm.match.all.sync.i64(i32 %membermask, i64 %value)
Not sure about the signed/unsigned mismatches, but the return value is definitely 32 bits.
Aside: The `match_all_{32,64}` functions below don't have `link_name` attributes the way the `match_any_{32,64}` functions do. Not sure if this is valid. I suspect these functions aren't tested at all!
Anyway, I think this change should be reverted.
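For readers unfamiliar with the intrinsic, here is a rough host-side model of why a 32-bit return makes sense: the result is a lane mask, and a warp has 32 lanes, so one bit per lane fits in a `u32` regardless of how wide the compared value is. The function name `match_any_64_model` and the choice to return 0 for lanes outside the member mask are assumptions made for this sketch; it is not code from the PR and does not call the real intrinsic.

```rust
/// Number of lanes in a CUDA warp; one result bit per lane fits in a `u32`.
const WARP_SIZE: usize = 32;

/// Host-side model of `match.any.sync` for 64-bit values (illustrative only,
/// hypothetical name). `values[i]` is the value held by lane `i`. Returns,
/// for each lane, the mask of participating lanes holding the same value.
fn match_any_64_model(member_mask: u32, values: &[u64; WARP_SIZE]) -> [u32; WARP_SIZE] {
    let mut out = [0u32; WARP_SIZE];
    for lane in 0..WARP_SIZE {
        // Modelling choice: lanes outside the member mask keep a result of 0.
        if member_mask & (1u32 << lane) == 0 {
            continue;
        }
        for other in 0..WARP_SIZE {
            let participating = member_mask & (1u32 << other) != 0;
            if participating && values[other] == values[lane] {
                // Set the bit for every participating lane with an equal value.
                out[lane] |= 1u32 << other;
            }
        }
    }
    out
}
```

The compared values are `u64`, but every entry in the result is still a `u32` lane mask, which matches the `declare i32 @llvm.nvvm.match.any.sync.i64(...)` line quoted from the spec.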
Oh, hm. Ok, will revert.
TypeKind::Vector | TypeKind::ScalableVector => {
    // Recurse on element type for vector floats
    self.float_width(self.element_type(ty))
}
Are all of Half/BFloat/Vector/ScalableVector needed to fix the issue? I see that rustc_codegen_llvm only has Half. Seems wise to only add code that's necessary for the fix (and thus has some level of testing).
I think we need at least BFloat as well but I'll double check.
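To make the shape of the change under discussion concrete, here is a small standalone sketch of a `float_width`-style lookup that handles half-precision kinds and recurses through vectors. The `Ty` enum and the `Option` return type are hypothetical stand-ins invented for this example; the actual backend works on LLVM type handles and may well panic on unsupported kinds instead.

```rust
/// Toy stand-in for an LLVM-style type (hypothetical, not the backend's types).
#[derive(Debug, Clone, PartialEq)]
enum Ty {
    Half,
    BFloat,
    Float,
    Double,
    /// A vector whose elements all have the boxed type.
    Vector(Box<Ty>),
    Integer,
}

/// Bit width of the floating-point scalar behind `ty`, or `None` when the
/// type is not float-based.
fn float_width(ty: &Ty) -> Option<u32> {
    match ty {
        // Both 16-bit formats have the same storage width.
        Ty::Half | Ty::BFloat => Some(16),
        Ty::Float => Some(32),
        Ty::Double => Some(64),
        // Recurse on the element type for vectors of floats.
        Ty::Vector(elem) => float_width(elem),
        Ty::Integer => None,
    }
}
```

Each arm here is a separate code path, which is the reviewer's point: an arm that no kernel build exercises (for example the vector case) ships without any testing.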
What do you think about changing the default here to be keyed off of
sounds ok