Updated: 2022-05-14 23:30:14
We don't use cudaMalloc and cudaMemcpy on __device__ variables.
Read the documentation for __device__ variables, where it states the API calls to be used:
cudaMemcpyToSymbol();
cudaMemcpyFromSymbol();
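As a minimal sketch of those two calls in isolation (the variable name dValue and the value 42 are illustrative, not from the question):

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// A statically declared __device__ variable lives in device memory;
// host code must access it through the symbol-based copy APIs.
__device__ int dValue;

int main()
{
    int h = 42;
    // Host -> device symbol (note: not cudaMemcpy).
    cudaMemcpyToSymbol(dValue, &h, sizeof(int));

    int back = 0;
    // Device symbol -> host.
    cudaMemcpyFromSymbol(&back, dValue, sizeof(int));
    printf("%d\n", back);  // prints 42 on a working CUDA device
    return 0;
}
```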
If you want to use cudaMalloc on a dynamically allocated device array, but store the returned pointer in a __device__ variable, you'll have to do something like this:
// Assumes a file-scope declaration: __device__ int* cData;
void Init()
{
    int* data = new int[SIZE];
    int* d_data;
    cudaError_t cudaStatus;

    // Allocate the device array with cudaMalloc as usual.
    cudaStatus = cudaMalloc(&d_data, SIZE * sizeof(int));

    for (int i = 0; i < SIZE; i++)
        data[i] = i;

    // Ordinary cudaMemcpy works on the cudaMalloc'ed pointer...
    cudaStatus = cudaMemcpy(d_data, data, SIZE * sizeof(int), cudaMemcpyHostToDevice);

    // ...but the pointer value itself must be stored in the
    // __device__ variable via cudaMemcpyToSymbol.
    cudaMemcpyToSymbol(cData, &d_data, sizeof(int*));

    delete[] data;  // new[] must be paired with delete[]
}
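A kernel can then dereference that pointer by name, with no extra argument. A hedged sketch (the kernel name AddOne is illustrative; the declaration is repeated here for self-containment):

```cuda
__device__ int* cData;  // filled in by Init() via cudaMemcpyToSymbol

__global__ void AddOne(int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    // Device code reads the __device__ pointer directly.
    if (i < n)
        cData[i] += 1;
}
```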
When I compile your code as-is, I get the following compiler warning from CUDA 6 nvcc:
t411.cu(15): warning: a __device__ variable "cData" cannot be directly read in a host function
Such warnings should not be ignored.
If SIZE is known at compile-time, as it is in your example, you can also do something like this:
__device__ int cData[SIZE];

void Init()
{
    int* data = new int[SIZE];
    cudaError_t cudaStatus;

    for (int i = 0; i < SIZE; i++)
        data[i] = i;

    // Copy directly into the statically sized __device__ array;
    // no cudaMalloc or intermediate device pointer is needed.
    cudaStatus = cudaMemcpyToSymbol(cData, data, SIZE * sizeof(int));

    delete[] data;  // new[] must be paired with delete[]
}
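Putting the static-array variant together end to end, a hedged sketch of a full program (the kernel name Doubler and SIZE value 16 are illustrative):

```cuda
#include <cstdio>
#include <cuda_runtime.h>

#define SIZE 16

__device__ int cData[SIZE];

// Kernels reference the __device__ array by name, no pointer argument.
__global__ void Doubler()
{
    int i = threadIdx.x;
    if (i < SIZE)
        cData[i] *= 2;
}

int main()
{
    int host[SIZE];
    for (int i = 0; i < SIZE; i++)
        host[i] = i;

    cudaMemcpyToSymbol(cData, host, SIZE * sizeof(int));
    Doubler<<<1, SIZE>>>();
    cudaMemcpyFromSymbol(host, cData, SIZE * sizeof(int));

    printf("%d %d\n", host[0], host[SIZE - 1]);  // 0 and 30 on success
    return 0;
}
```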