C++ Armadillo and OpenMP: parallel accumulation of outer products - defining a reduction for Armadillo matrices

Updated: 2023-11-11 23:40:46

OpenMP's built-in reductions are only available for built-in types (double, int, etc.), so for an arma::mat you have to do the reduction yourself, which is simple: accumulate each thread's results in a thread-local variable, then add that variable to the global result inside a critical section.

#include <armadillo>
#include <omp.h>

int main()
{

  arma::mat A = arma::randu<arma::mat>(1000,700);
  arma::mat X = arma::zeros(700,700);
  arma::rowvec point = A.row(0);

  #pragma omp parallel shared(A)
  {
    arma::mat X_local = arma::zeros(700,700);

    #pragma omp for
    for(unsigned int i = 0; i < A.n_rows; i++)
    {
      arma::rowvec diff = point - A.row(i);
      X_local += diff.t() * diff; // Accumulate into this thread's private matrix
    }

    // One thread at a time adds its partial sum to the shared result
    #pragma omp critical
    X += X_local;
  }
}
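
The same pattern can be sketched without Armadillo, which makes it easy to check against a hand-computed result. This is only an illustration under stated assumptions: the function name outer_product_sum and the row-major std::vector<double> storage are hypothetical and not part of Armadillo or OpenMP. If the code is compiled without OpenMP support, the pragmas are ignored and the loop runs serially with the same result.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: returns the sum over all rows i of diff^T * diff,
// where diff = point - (row i of A). A is stored row-major (n_rows x n_cols);
// the result is an n_cols x n_cols matrix, also row-major.
std::vector<double> outer_product_sum(const std::vector<double>& A,
                                      std::size_t n_rows, std::size_t n_cols,
                                      const std::vector<double>& point)
{
  std::vector<double> X(n_cols * n_cols, 0.0);

  #pragma omp parallel
  {
    // Declared inside the parallel region, so each thread has its own copy.
    std::vector<double> X_local(n_cols * n_cols, 0.0);
    std::vector<double> diff(n_cols);

    #pragma omp for
    for (std::size_t i = 0; i < n_rows; ++i)
    {
      for (std::size_t k = 0; k < n_cols; ++k)
        diff[k] = point[k] - A[i * n_cols + k];

      // Outer product of diff with itself, accumulated thread-locally.
      for (std::size_t r = 0; r < n_cols; ++r)
        for (std::size_t c = 0; c < n_cols; ++c)
          X_local[r * n_cols + c] += diff[r] * diff[c];
    }

    // Merge the partial sums; only one thread executes this at a time.
    #pragma omp critical
    {
      for (std::size_t k = 0; k < X.size(); ++k)
        X[k] += X_local[k];
    }
  }
  return X;
}
```

Each thread enters the critical section only once, after finishing its share of the loop, so the cost of the lock is small compared with the accumulation itself.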

With OpenMP 4.0 or later you can also declare a user-defined reduction for your type.

#include <armadillo>
#include <omp.h>

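// Each thread's private copy starts as a zero matrix with the same size as X,
// so the result stays correct even if X is not zero before the loop.
#pragma omp declare reduction( + : arma::mat : omp_out += omp_in ) \
  initializer( omp_priv = arma::zeros<arma::mat>(omp_orig.n_rows, omp_orig.n_cols) )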

int main()
{

  arma::mat A = arma::randu<arma::mat>(1000,700);
  arma::mat X = arma::zeros(700,700);
  arma::rowvec point = A.row(0);

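  // "parallel for" splits the iterations across threads; plain "parallel"
  // would make every thread run the whole loop.
  #pragma omp parallel for shared(A) reduction(+:X)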
  for(unsigned int i = 0; i < A.n_rows; i++)
  {
    arma::rowvec diff = point - A.row(i);
    X += diff.t() * diff; // Adding the matrices to X here
  }
}
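
A user-defined reduction can also be sketched on a plain std::vector<double>, which avoids the Armadillo dependency when experimenting. The reduction name vec_plus, the function name outer_product_sum_udr and the row-major layout are assumptions made for this illustration. Compiled without OpenMP support, the pragmas are ignored and the loop simply runs serially.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <vector>

// Element-wise sum of two vectors, registered as a reduction named vec_plus
// (a hypothetical name). Private copies start as zeros of the original size.
#pragma omp declare reduction(vec_plus : std::vector<double> : \
    std::transform(omp_out.begin(), omp_out.end(), omp_in.begin(), \
                   omp_out.begin(), std::plus<double>())) \
  initializer(omp_priv = std::vector<double>(omp_orig.size(), 0.0))

// Hypothetical helper: sum over rows i of diff^T * diff with
// diff = point - (row i of A); A is row-major, n_rows x n_cols.
std::vector<double> outer_product_sum_udr(const std::vector<double>& A,
                                          std::size_t n_rows, std::size_t n_cols,
                                          const std::vector<double>& point)
{
  std::vector<double> X(n_cols * n_cols, 0.0);

  // Inside the loop, X refers to the thread's private copy; OpenMP combines
  // the copies with vec_plus once the loop has finished.
  #pragma omp parallel for reduction(vec_plus : X)
  for (std::size_t i = 0; i < n_rows; ++i)
  {
    for (std::size_t r = 0; r < n_cols; ++r)
      for (std::size_t c = 0; c < n_cols; ++c)
        X[r * n_cols + c] += (point[r] - A[i * n_cols + r])
                           * (point[c] - A[i * n_cols + c]);
  }
  return X;
}
```

Compared with the manual critical-section version, the merge step is left to the OpenMP runtime, at the price of declaring the combiner and initializer once per type.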