Accessing a global memory pointer from a CUDA kernel

We don't use cudaMalloc and cudaMemcpy on __device__ variables.

Read the documentation for __device__ variables, where it states the API calls to be used:

 cudaMemcpyToSymbol();
 cudaMemcpyFromSymbol();
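
For instance, here is a minimal sketch (the variable name devValue and the whole snippet are illustrative, not part of the original question) of moving a value between the host and a __device__ variable with the symbol-copy API instead of cudaMemcpy:

#include <cuda_runtime.h>
#include <cstdio>

__device__ int devValue;   // illustrative __device__ variable

int main()
{
    int h_in = 42, h_out = 0;
    // write the host value into the __device__ variable
    cudaMemcpyToSymbol(devValue, &h_in, sizeof(int));
    // read it back to the host
    cudaMemcpyFromSymbol(&h_out, devValue, sizeof(int));
    printf("%d\n", h_out);   // prints 42
    return 0;
}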

If you want to allocate a device array dynamically with cudaMalloc, but store the returned pointer in a __device__ variable, you'll have to do something like this:

__device__ int* cData;   // file-scope __device__ variable that will hold the device pointer

void Init()
{
    int* data = new int[SIZE];
    int* d_data;
    cudaError_t cudaStatus;
    cudaStatus = cudaMalloc(&d_data, SIZE * sizeof(int));
    for (int i = 0; i < SIZE; i++)
        data[i] = i;

    cudaStatus = cudaMemcpy(d_data, data, SIZE * sizeof(int), cudaMemcpyHostToDevice);
    // copy the value of the device pointer itself into the __device__ variable cData
    cudaMemcpyToSymbol(cData, &d_data, sizeof(int *));
    delete[] data;   // allocated with new[], so free with delete[]
}
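
With the pointer stored in cData, device code can use the allocation without receiving it as a kernel argument. A minimal sketch (the kernel name and scale factor are illustrative, not from the original question):

// Illustrative kernel: dereferences the device pointer held in cData.
__global__ void ScaleKernel(int factor)
{
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < SIZE)
        cData[idx] *= factor;
}

Note that Init() as written keeps no host-side copy of d_data, so to free the allocation later you would either retain d_data or copy the pointer back with cudaMemcpyFromSymbol before calling cudaFree.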

When I compile your code as-is, I get the following compiler warning from CUDA 6 nvcc:

t411.cu(15): warning: a __device__ variable "cData" cannot be directly read in a host function

These warnings should not be ignored.

If SIZE is known at compile-time, as it is in your example, you can also do something like this:

__device__ int cData[SIZE];

void Init()
{
    int* data = new int[SIZE];
    cudaError_t cudaStatus;
    for (int i = 0; i < SIZE; i++)
        data[i] = i;
    // copy the host array straight into the statically sized __device__ array
    cudaStatus = cudaMemcpyToSymbol(cData, data, SIZE * sizeof(int));
    delete[] data;   // allocated with new[], so free with delete[]
}
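
For completeness, the companion call cudaMemcpyFromSymbol mentioned above can copy the __device__ array back to the host, for example to verify that Init() worked. A small sketch (the Check() function is illustrative, not part of the original answer):

#include <cassert>

void Check()
{
    int* host = new int[SIZE];
    // copy the contents of the __device__ array back to host memory
    cudaMemcpyFromSymbol(host, cData, SIZE * sizeof(int));
    for (int i = 0; i < SIZE; i++)
        assert(host[i] == i);
    delete[] host;
}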