

gsl::gsl_vector vs std::vector: overhead and efficiency

Updated: 2023-08-21 16:59:04

On the one hand, gsl_vector lets you use GSL BLAS, which is a big advantage. On the other hand, the GSL interface is quite cumbersome for a C++ programmer. So neither solution is truly satisfactory. However, I strongly prefer gsl_matrix because:

(i) with some effort you can write a small wrapper class that smooths over the cumbersome C interface of gsl_matrix (it is much harder to make up for the lack of a BLAS library for std::vector).

(ii) gsl_matrix is just a wrapper around a one-dimensional contiguous array, where m(i,j) = array[i*N + j] for an N-by-N square matrix (a non-square matrix is still stored row by row in a single one-dimensional array). In std::vector<gsl_vector*>, you need to "malloc" each gsl_vector individually, which means the memory won't be contiguous. This hurts performance, because the lack of "spatial locality" in the allocations usually increases cache misses substantially.
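To make the layout argument in (ii) concrete, here is a minimal sketch of a matrix kept in one contiguous buffer with the i*N + j mapping. It deliberately uses only the standard library, so it is not GSL code: FlatMatrix, sum_entries and rows_are_adjacent are hypothetical names invented for this illustration, and the class only mimics the storage scheme described above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical illustration class (not part of GSL or the standard library):
// a dense row-major matrix stored in ONE contiguous buffer.
class FlatMatrix {
public:
    FlatMatrix(std::size_t rows, std::size_t cols)
        : rows_(rows), cols_(cols), data_(rows * cols, 0.0) {}

    // Element (i, j) lives at offset i * cols + j, the mapping the text
    // describes for gsl_matrix.
    double& operator()(std::size_t i, std::size_t j) {
        assert(i < rows_ && j < cols_);
        return data_[i * cols_ + j];
    }
    const double& operator()(std::size_t i, std::size_t j) const {
        assert(i < rows_ && j < cols_);
        return data_[i * cols_ + j];
    }

    std::size_t rows() const { return rows_; }
    std::size_t cols() const { return cols_; }
    const double* data() const { return data_.data(); }

private:
    std::size_t rows_;
    std::size_t cols_;
    std::vector<double> data_;  // single allocation for the whole matrix
};

// Sum every entry by walking the flat buffer front to back. This linear
// walk is the cache-friendly access pattern that contiguity allows.
double sum_entries(const FlatMatrix& m) {
    double total = 0.0;
    const double* p = m.data();
    const std::size_t n = m.rows() * m.cols();
    for (std::size_t k = 0; k < n; ++k) {
        total += p[k];
    }
    return total;
}

// Check that the last element of each row is immediately followed in memory
// by the first element of the next row. Separately allocated rows (one
// allocation per row) give no such guarantee.
bool rows_are_adjacent(const FlatMatrix& m) {
    if (m.cols() == 0) return true;
    for (std::size_t i = 0; i + 1 < m.rows(); ++i) {
        if (&m(i, m.cols() - 1) + 1 != &m(i + 1, 0)) return false;
    }
    return true;
}
```

With a 3-by-4 FlatMatrix, writing m(1, 2) touches offset 1*4 + 2 = 6 of the buffer, and rows_are_adjacent(m) holds by construction. With a vector of individually allocated rows, each row is its own heap block, so nothing guarantees that consecutive rows sit next to each other in memory.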


If you are free to use a completely different solution, I would implement the tensor calculation using the StaticMatrix or DynamicMatrix classes from the Blaze library.



Why Blaze?

(i) The StaticMatrix or DynamicMatrix interface is much nicer than either std::vector<gsl_vector*> or gsl_matrix.

(ii) Blaze is the fastest BLAS library available in C++. It is faster than GSL if you have Intel MKL available (remember that Intel MKL is faster than GSL BLAS). Why? Because Blaze uses a newer technique called "Smart Expression Templates". Basically, researchers in Germany showed in a series of articles (paper 1, paper 2) that the "Expression Template" technique, which is standard in many C++ BLAS libraries, performs terribly for matrix operations (BLAS level 3 operations), because the compiler cannot be smarter than optimized low-level code. However, expression templates can serve as a smart wrapper around low-level BLAS libraries such as Intel MKL. So they created the "Smart Expression Template" technique, which is essentially a wrapper around the low-level BLAS library of your choice. Their benchmarks are astonishing.
