gsl::gsl_vector vs std::vector: overhead and efficiency

Updated: 2023-08-21 16:59:04

On one hand, it is true that with gsl_vector you can use gsl BLAS, which is a big advantage. On the other hand, it is also true that the gsl interface is quite cumbersome for a C++ programmer. So neither solution is truly satisfactory. However, I strongly prefer the use of gsl_matrix, because

(i) with some effort you can write a small wrapper class that ameliorates the cumbersome C interface of gsl_matrix (it is much harder to work around the lack of a BLAS library with std::vector); a sketch of such a wrapper follows this list.

(ii) gsl_matrix is just a wrapper around a one-dimensional contiguous array, where m(i,j) = array[i*N + j] for a square matrix (even when the matrix is not square, gsl_matrix still implements it as a one-dimensional array). With std::vector<gsl_vector*>, you would need to malloc each gsl_vector individually, which means the memory will not be contiguous. This hurts performance, because the lack of spatial locality in memory allocation usually increases cache misses substantially.
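
To make points (i) and (ii) concrete, here is a minimal sketch of such a wrapper, assuming GSL is installed (link with -lgsl -lgslcblas). The class name Matrix, its members, and the multiply helper are illustrative names chosen for this example, not part of GSL.

```cpp
// Minimal RAII wrapper hiding the C interface of gsl_matrix.
#include <gsl/gsl_matrix.h>
#include <gsl/gsl_blas.h>
#include <cstddef>
#include <new>

class Matrix {
public:
    Matrix(std::size_t rows, std::size_t cols)
        : m_(gsl_matrix_calloc(rows, cols)) {
        if (!m_) throw std::bad_alloc();
    }
    ~Matrix() { gsl_matrix_free(m_); }

    // Non-copyable for brevity; a real wrapper would add copy/move support.
    Matrix(const Matrix&) = delete;
    Matrix& operator=(const Matrix&) = delete;

    // Element access: gsl_matrix stores one contiguous block,
    // so m(i,j) lives at data[i * tda + j].
    double operator()(std::size_t i, std::size_t j) const { return gsl_matrix_get(m_, i, j); }
    void set(std::size_t i, std::size_t j, double x)       { gsl_matrix_set(m_, i, j, x); }

    // Expose the raw handle so gsl BLAS routines can still be used.
    gsl_matrix*       raw()       { return m_; }
    const gsl_matrix* raw() const { return m_; }

private:
    gsl_matrix* m_;
};

// Usage: C = A * B through gsl BLAS (dgemm). This is exactly what a
// std::vector<gsl_vector*> layout cannot offer, since each gsl_vector
// would be a separate, non-contiguous allocation.
void multiply(const Matrix& A, const Matrix& B, Matrix& C) {
    gsl_blas_dgemm(CblasNoTrans, CblasNoTrans, 1.0, A.raw(), B.raw(), 0.0, C.raw());
}
```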

If you have the choice to use a completely different solution, I would implement the tensor calculation using the StaticMatrix or DynamicMatrix classes from the Blaze library.

Blaze

Why Blaze?

(i) The StaticMatrix or DynamicMatrix interface is much better than std::vector<gsl_vector*> or gsl_matrix.

(ii) Blaze is the fastest BLAS library available in C++. It is even faster than gsl if you have Intel MKL available (remember that Intel MKL is faster than gsl BLAS). Why so? Because Blaze uses a new technique called "Smart Expression Templates". Basically, researchers in Germany showed in a series of articles (paper 1, paper 2) that the "Expression Template" technique, the standard technique in many C++ BLAS libraries, is terrible for matrix-matrix operations (BLAS level-3 operations), because the compiler cannot be smarter than hand-tuned low-level code. However, expression templates can be used as a smarter wrapper around low-level BLAS libraries such as Intel MKL. So they created the "Smart Expression Template" technique, which is just a wrapper around your choice of low-level BLAS library. Their benchmarks are astonishing.
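
As an illustration of the interface point above, here is a minimal sketch of the same kind of matrix product written with Blaze's DynamicMatrix. It assumes the header-only Blaze library is available; which BLAS backend (e.g. Intel MKL) the product dispatches to is decided by Blaze's build configuration, not by this code.

```cpp
#include <blaze/Math.h>
#include <iostream>

int main() {
    // 100x100 matrices, all elements initialized to 1.0 and 2.0 respectively.
    blaze::DynamicMatrix<double> A(100, 100, 1.0);
    blaze::DynamicMatrix<double> B(100, 100, 2.0);

    // With smart expression templates, the product below is not expanded
    // element by element by the compiler; the whole expression is forwarded
    // to an optimized BLAS 3 kernel when it is assigned to C.
    blaze::DynamicMatrix<double> C = A * B;

    std::cout << C(0, 0) << '\n';  // 200 for these inputs
    return 0;
}
```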

Benchmarks