Updated: 2022-12-10 13:27:12
There is a way to do so using cblas functions, though it is a bit awkward.
What you need to do is define an "all ones" vector, then take the dot product of this vector and your matrix; the result is the sum.
Let myBlob be a Caffe Blob whose elements you want to sum:
// all-ones vector of the same length as the blob
vector<Dtype> mult_data(myBlob.count(), Dtype(1));
// dot product with the ones vector yields the sum of all elements
Dtype sum = caffe_cpu_dot(myBlob.count(), &mult_data[0], myBlob.cpu_data());
"Reduction"
层的实现.
To make this answer GPU compliant as well, one needs to allocate a Blob for mult_data rather than a std::vector (because you need its gpu_data()):
vector<int> sum_mult_shape(1, myBlob.count());
Blob<Dtype> sum_multiplier_(sum_mult_shape);
// Caffe zero-initializes blob memory, so the multiplier must be filled with ones
caffe_set(sum_multiplier_.count(), Dtype(1), sum_multiplier_.mutable_cpu_data());
const Dtype* mult_data = sum_multiplier_.cpu_data();
Dtype sum = caffe_cpu_dot(myBlob.count(), mult_data, myBlob.cpu_data());
For GPU (in a '.cu' source file):
vector<int> sum_mult_shape(1, myBlob.count());
Blob<Dtype> sum_multiplier_(sum_mult_shape);
// fill the multiplier with ones on the device
caffe_gpu_set(sum_multiplier_.count(), Dtype(1), sum_multiplier_.mutable_gpu_data());
const Dtype* mult_data = sum_multiplier_.gpu_data();
Dtype sum;
caffe_gpu_dot(myBlob.count(), mult_data, myBlob.gpu_data(), &sum);