Dynamic allocation and arrays

Updated: 2023-02-11 18:11:56

On Fri, 10 Sep 2004 14:47:41 +1000, Tan Thuan Seah <u2******@anu.edu.au> wrote:
Hi all,

I was told this in one of the university courses I was doing.

In C we may expect good performance for:

    double a[N][N], c[N][N], d;
    for (i = 0; i < N; i++)
        for (j = 0; j < N; j++)
            a[i][j] = a[i][j] + c[i][j] * d;

But this is another matter:

    double *a[N], *c[N], d;
    for (i = 0; i < N; i++) {
        a[i] = (double *) malloc(N * sizeof(double));
        c[i] = (double *) malloc(N * sizeof(double));
    }

    for (i = 0; i < N; i++)
        for (j = 0; j < N; j++)
            a[i][j] = a[i][j] + c[i][j] * d;

It seems that we should expect some performance hit if we use dynamic
memory allocation of some sort. But standard ANSI C++ does not allow an
array whose size is determined at runtime. So is there any good
recommendation for minimizing this performance hit, or for avoiding it
entirely through some other method? I would expect a linked list to be
even worse. Any recommendations? Thanks.





You are using arrays with non-constant size (assuming N is not constant)
in both examples. C99 allows that, but C++ does not. So both examples
won't compile under standard C++.

What were the reasons cited for getting worse performance?

If it's the cost of the malloc, then you are doing it at setup time
anyway and it's unavoidable without resorting to platform-specific
stack access (such as alloca).

If it's the extra pointer deref then that's pretty much unavoidable,
though you managed to add two of them, which can also be avoided by
the code that follows the next point.

If it's that the memory isn't "localised" and hence cache misses are
more frequent, you could use "flattened" arrays:

    double *a = new double[N*N];
    double *c = new double[N*N];
    double d;

    /* ... */

    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            a[i*N + j] += c[i*N + j] * d;

Have you benchmarked to see if it matters? I get an 8% difference (or
76 milliseconds) with everything as globals; with things on the stack it
crashes, since two million doubles don't fit on the stack on my machine.

Is speed or not crashing more important?

--
Sam Holden
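
As a rough illustration of the flattened layout and of the benchmarking question above, here is a minimal sketch that keeps each matrix in one contiguous std::vector<double> and times a single pass with std::chrono. The names scaled_add_flat and time_ms are made up for this example and appear nowhere in the thread; the figures quoted in the post were not produced by this code.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical helper: a[i][j] += c[i][j] * d for n-by-n matrices that are
// stored row-major in one contiguous buffer each (element (i, j) at i*n + j).
void scaled_add_flat(std::vector<double>& a, const std::vector<double>& c,
                     double d, std::size_t n) {
    assert(a.size() == n * n && c.size() == n * n);
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j)
            a[i * n + j] += c[i * n + j] * d;
}

// Hypothetical helper: wall-clock time of one call to f, in milliseconds.
template <typename F>
double time_ms(F&& f) {
    auto start = std::chrono::steady_clock::now();
    f();
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Something like time_ms([&] { scaled_add_flat(a, c, d, n); }) gives one sample; a fair comparison against the row-pointer version would repeat the measurement several times with optimisation enabled. Because the storage lives on the heap, a large n does not run into the stack-size crash mentioned above.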


Tan Thuan Seah posted:
Hi all,

I was told this in one of the university courses I was doing.

In C we may expect good performance for:

    double a[N][N], c[N][N], d;
    for (i = 0; i < N; i++)
        for (j = 0; j < N; j++)
            a[i][j] = a[i][j] + c[i][j] * d;

But this is another matter:

    double *a[N], *c[N], d;
    for (i = 0; i < N; i++) {
        a[i] = (double *) malloc(N * sizeof(double));
        c[i] = (double *) malloc(N * sizeof(double));
    }

    for (i = 0; i < N; i++)
        for (j = 0; j < N; j++)
            a[i][j] = a[i][j] + c[i][j] * d;

It seems that we should expect some performance hit if we use dynamic
memory allocation of some sort. But standard ANSI C++ does not allow an
array whose size is determined at runtime. So is there any good
recommendation for minimizing this performance hit, or for avoiding it
entirely through some other method? I would expect a linked list to be
even worse. Any recommendations? Thanks.

Thuan Seah







You could allocate one big chunk of memory and then chop it up like so.

I'd give an example but it'd take me a few minutes to figure out what your
code is doing.

-JKop
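
One possible reading of "allocate one big chunk and chop it up" is sketched below: a single contiguous allocation holds all n*n elements, and a small table of row pointers is carved out of it so the a[i][j] syntax from the question still works. ChoppedMatrix and scaled_add are hypothetical names for this illustration, not code from the poster.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical type: one big chunk plus row pointers into it.
struct ChoppedMatrix {
    std::vector<double> storage;  // the single contiguous n*n allocation
    std::vector<double*> rows;    // rows[i] points at the start of row i

    explicit ChoppedMatrix(std::size_t n) : storage(n * n, 0.0), rows(n) {
        for (std::size_t i = 0; i < n; ++i)
            rows[i] = storage.data() + i * n;  // chop the chunk into rows
    }

    // A copy would keep pointers into the source object's storage,
    // so copying is disabled in this sketch.
    ChoppedMatrix(const ChoppedMatrix&) = delete;
    ChoppedMatrix& operator=(const ChoppedMatrix&) = delete;

    double* operator[](std::size_t i) { return rows[i]; }
    const double* operator[](std::size_t i) const { return rows[i]; }
};

// Same update as in the question: a[i][j] = a[i][j] + c[i][j] * d.
void scaled_add(ChoppedMatrix& a, const ChoppedMatrix& c, double d,
                std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j)
            a[i][j] += c[i][j] * d;
}
```

Compared with one malloc per row, this needs two allocations per matrix regardless of n, and consecutive rows sit next to each other in memory. The extra pointer load per row access remains.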

"Tan Thuan Seah" <u2******@anu.edu.au> wrote in message news:<41********@clarion.carno.net.au>...
Hi all,

I was told this in one of the university courses I was doing.

In C we may expect good performance for:

    double a[N][N], c[N][N], d;
    for (i = 0; i < N; i++)
        for (j = 0; j < N; j++)
            a[i][j] = a[i][j] + c[i][j] * d;

But this is another matter:

    double *a[N], *c[N], d;
    for (i = 0; i < N; i++) {
        a[i] = (double *) malloc(N * sizeof(double));
        c[i] = (double *) malloc(N * sizeof(double));
    }

It seems that we should expect some performance hit if we use dynamic
memory allocation of some sort. But standard ANSI C++ does not allow an
array whose size is determined at runtime.







Try using a vector of vectors. With a modern implementation
it may not be as slow as you expect; certainly faster than
the malloc version (which doesn't work anyway, you can't have
the arrays a[] and c[]). If this is still too slow, you could
look for a custom Matrix class that someone has already
written efficiently.
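
For reference, a vector-of-vectors version of the loop from the question could look roughly like the sketch below. Matrix, make_matrix and scaled_add_vv are names invented for this example; whether it actually beats the malloc version would need measuring on the compiler and machine in question.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical alias: each row is its own std::vector<double>.
using Matrix = std::vector<std::vector<double>>;

// Build an n-by-n matrix whose size is chosen at runtime.
Matrix make_matrix(std::size_t n, double fill) {
    return Matrix(n, std::vector<double>(n, fill));
}

// Same update as in the question: a[i][j] = a[i][j] + c[i][j] * d.
void scaled_add_vv(Matrix& a, const Matrix& c, double d) {
    assert(a.size() == c.size());
    for (std::size_t i = 0; i < a.size(); ++i) {
        assert(a[i].size() == c[i].size());
        for (std::size_t j = 0; j < a[i].size(); ++j)
            a[i][j] += c[i][j] * d;
    }
}
```

The rows are still separate heap allocations, so the memory layout resembles the malloc-per-row version; what changes is that the size can be a runtime value in standard C++ and the memory is released automatically.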