added safe wrappers called copy_from_async_sync and copy_to_async_syc in crates/cust/src/memory/device/device_slice.rs by Adesoji1 · Pull Request #140 · Rust-GPU/rust-cuda

Adesoji1 · 2025-02-02T21:31:55Z

Safe wrappers:
The two new methods, copy_from_async_sync and copy_to_async_sync, wrap the existing unsafe methods defined by the AsyncCopyDestination trait (
https://bheisler.github.io/RustaCUDA/rustacuda/memory/trait.CopyDestination.html). Each one performs the asynchronous copy and then immediately calls stream.synchronize(), so the copy is complete before the method returns.
With these additions, users who do not need to overlap computation with data transfer can avoid unsafe blocks and explicit synchronization. Both methods return a CudaResult<()> (https://bheisler.github.io/RustaCUDA/rustacuda/error/enum.CudaError.html), so any error from either the asynchronous copy or the stream synchronization is propagated.
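As a rough illustration of the pattern described above (queue an unsafe asynchronous copy, then synchronize before returning), here is a minimal std-only sketch. MockStream, MockDeviceSlice, MockError and MockResult are hypothetical stand-ins invented for this example; they are not the cust types, and the real wrappers in this PR may differ in detail.

```rust
use std::cell::RefCell;

// Hypothetical error and result types standing in for CudaError / CudaResult.
#[derive(Debug, PartialEq)]
enum MockError {
    LengthMismatch,
}
type MockResult<T> = Result<T, MockError>;

// Hypothetical stand-in for a CUDA stream: work is queued and only runs on synchronize().
struct MockStream {
    pending: RefCell<Vec<Box<dyn FnOnce()>>>,
}

impl MockStream {
    fn new() -> Self {
        MockStream { pending: RefCell::new(Vec::new()) }
    }

    // Run everything queued so far, in order.
    fn synchronize(&self) -> MockResult<()> {
        let jobs = std::mem::take(&mut *self.pending.borrow_mut());
        for job in jobs {
            job();
        }
        Ok(())
    }
}

// Hypothetical stand-in for a device slice.
struct MockDeviceSlice {
    data: Vec<u32>,
}

impl MockDeviceSlice {
    // Unsafe because the copy is only queued: the caller must keep `src` and
    // `self` alive and untouched until the stream has been synchronized.
    unsafe fn async_copy_from(&mut self, src: &[u32], stream: &MockStream) -> MockResult<()> {
        if src.len() != self.data.len() {
            return Err(MockError::LengthMismatch);
        }
        let (src_ptr, dst_ptr, len) = (src.as_ptr(), self.data.as_mut_ptr(), src.len());
        stream.pending.borrow_mut().push(Box::new(move || unsafe {
            std::ptr::copy_nonoverlapping(src_ptr, dst_ptr, len);
        }));
        Ok(())
    }

    // Safe wrapper: queue the copy, then wait for it before returning.
    fn copy_from_async_sync(&mut self, src: &[u32], stream: &MockStream) -> MockResult<()> {
        // SAFETY: both borrows outlive this call, and the queued copy runs
        // inside synchronize() below, before the borrows end.
        unsafe { self.async_copy_from(src, stream)? };
        stream.synchronize()
    }
}

fn main() -> MockResult<()> {
    let stream = MockStream::new();
    let mut device = MockDeviceSlice { data: vec![0; 4] };
    device.copy_from_async_sync(&[1, 2, 3, 4], &stream)?;
    println!("{:?}", device.data);
    Ok(())
}
```

The `?` on the queued copy and the returned result of synchronize() are what propagate an error from either step to the caller.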

Availability on DeviceBuffer:
Since DeviceBuffer implements Deref<Target = DeviceSlice>, the new methods are also available on DeviceBuffer.
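To show why a method defined on the slice type becomes callable on the buffer type, here is a small std-only sketch of the Deref mechanism. MockDeviceSlice and MockDeviceBuffer are hypothetical stand-ins, not the cust types, and the stream parameter is left out to keep the example short. The sketch also assumes a DerefMut implementation, which a method taking &mut self needs in addition to Deref.

```rust
use std::ops::{Deref, DerefMut};

// Hypothetical stand-in for the slice type that owns the new method.
struct MockDeviceSlice {
    data: Vec<u32>,
}

impl MockDeviceSlice {
    // Simplified stand-in for the wrapper; the real one also takes a stream.
    fn copy_from_async_sync(&mut self, src: &[u32]) -> Result<(), String> {
        if src.len() != self.data.len() {
            return Err("length mismatch".to_string());
        }
        self.data.copy_from_slice(src);
        Ok(())
    }
}

// Hypothetical stand-in for the owning buffer type.
struct MockDeviceBuffer {
    slice: MockDeviceSlice,
}

impl Deref for MockDeviceBuffer {
    type Target = MockDeviceSlice;
    fn deref(&self) -> &MockDeviceSlice {
        &self.slice
    }
}

// Assumed here: methods taking &mut self are reached through DerefMut.
impl DerefMut for MockDeviceBuffer {
    fn deref_mut(&mut self) -> &mut MockDeviceSlice {
        &mut self.slice
    }
}

fn main() {
    let mut buffer = MockDeviceBuffer {
        slice: MockDeviceSlice { data: vec![0; 3] },
    };
    // The method is defined on MockDeviceSlice but is called on the buffer via auto-deref.
    buffer.copy_from_async_sync(&[7, 8, 9]).unwrap();
    assert_eq!(buffer.slice.data, vec![7, 8, 9]);
}
```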


juntyr · 2025-02-02T21:49:09Z

Thank you for the PR! What can these new wrappers do that the CopyDestination trait methods, which are sync and safe, cannot? Does the addition of an explicit stream parameter help with some problem?

Adesoji1 · 2025-02-03T12:05:48Z

> Thank you for the PR! What can these new wrappers do that the CopyDestination trait methods, which are sync and safe, cannot? Does the addition of an explicit stream parameter help with some problem?

Thank you @juntyr. As I understand it, the synchronous CopyDestination methods block until the copy is complete, using the default stream or an internal one, whereas the new wrappers let the caller supply a specific stream. I believe this matters when you want to integrate the copy into a larger asynchronous workflow or coordinate it with other operations running on that stream, though I stand to be corrected. Also, because the asynchronous copy is wrapped safely, one could later modify the usage (or write additional wrappers) to take advantage of overlapping computation and data transfer.
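The point about the explicit stream parameter can be illustrated with a toy model. In this sketch a stream is just an in-order queue of named operations; MockStream and copy_async_sync are hypothetical names made up for the example and do not reflect how cust or CUDA implement streams. It only shows the ordering idea: the copy is queued behind earlier work on the stream the caller passes in, and the wrapper waits for that stream alone.

```rust
use std::cell::RefCell;

// Hypothetical stand-in for a CUDA stream: an in-order queue of named operations.
struct MockStream {
    pending: RefCell<Vec<&'static str>>,
    completed: RefCell<Vec<&'static str>>,
}

impl MockStream {
    fn new() -> Self {
        MockStream {
            pending: RefCell::new(Vec::new()),
            completed: RefCell::new(Vec::new()),
        }
    }

    // Queue an operation without waiting for it.
    fn enqueue(&self, op: &'static str) {
        self.pending.borrow_mut().push(op);
    }

    // Finish everything queued on this stream, in the order it was queued.
    fn synchronize(&self) {
        let ops = std::mem::take(&mut *self.pending.borrow_mut());
        self.completed.borrow_mut().extend(ops);
    }
}

// Hypothetical wrapper shape: queue a copy on the caller's stream, then wait for that stream.
fn copy_async_sync(stream: &MockStream, label: &'static str) {
    stream.enqueue(label);
    stream.synchronize();
}

fn main() {
    let stream_a = MockStream::new();
    let stream_b = MockStream::new();

    // Earlier work queued on each stream.
    stream_a.enqueue("kernel A");
    stream_b.enqueue("kernel B");

    // The copy is ordered after "kernel A" and only waits for stream A.
    copy_async_sync(&stream_a, "copy to host");

    println!("stream A completed: {:?}", stream_a.completed.borrow());
    println!("stream B pending:   {:?}", stream_b.pending.borrow());
}
```

Because the wrapper synchronizes before returning, the calling thread still blocks; in this model the only thing the stream argument changes is which queue the copy joins and waits on.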

LegNeato · 2025-02-08T07:29:00Z

Is this a theoretical improvement or do you have code or intend to write code that needs this? I'd love to see an actual example.

Adesoji1 · 2025-02-13T10:16:40Z

> Is this a theoretical improvement or do you have code or intend to write code that needs this? I'd love to see an actual example.

For code that needs this, please see the example below:


use cust::prelude::*; // Brings the commonly used cust types into scope.
use cust::memory::*;
use cust::stream::{Stream, StreamFlags};

fn main() -> cust::error::CudaResult<()> {
    // Initialize a CUDA context.
    let _ctx = cust::quick_init()?;

    // Create a non-blocking stream. Other asynchronous operations can be interleaved on this stream if desired.
    let stream = Stream::new(StreamFlags::NON_BLOCKING, None)?;

    // Host data that we want to copy into a device buffer.
    let host_data = [10u32, 20, 30, 40, 50, 60];

    // Create a device buffer with the same length as host_data.
    let mut device_buffer = DeviceBuffer::from_slice(&[0u32; 6])?;

    // Use the new safe wrapper to copy data from host_data to the device buffer on the given stream.
    // This call internally uses an unsafe asynchronous copy but immediately synchronizes the stream.
    device_buffer
        .as_slice_mut()
        .copy_from_async_sync(&host_data, &stream)?;

    // Prepare a host array to copy the data back from the device.
    let mut host_result = [0u32; 6];

    // Use the safe wrapper to copy the data back to the host on the same stream.
    device_buffer
        .as_slice()
        .copy_to_async_sync(&mut host_result, &stream)?;

    // Verify that the data round-tripped correctly.
    assert_eq!(host_data, host_result);
    println!("Async sync copy successful: {:?}", host_result);

    Ok(())
}
The output should be: Async sync copy successful: [10, 20, 30, 40, 50, 60]

I await your response on this. Also, the CI/CD run for this code failed because of the version used in the GitHub Actions workflow; after updating it, the code runs without errors. Thank you.

Commit 4154f71: added safe wrappers called copy_from_async_sync and copy_to_async_sync in crates/cust/src/memory/device/device_slice.rs

Adesoji1 wants to merge 1 commit into Rust-GPU:main from Adesoji1:main
