Do these two mergesort implementations have the same space complexity?

Updated: 2023-02-04 17:57:41

First, let's assume that we have perfect garbage collection and every list is deallocated immediately after it falls out of use.

With this assumption, the algorithms have the same big O space complexity.

Take a look at Algorithm 2 first and consider the following example: Imagine you're sorting a list of length 16.

[15,14,13,12,11,10,9,8,7,6,5,4,3,2,1,0]

You compute the first and the second half of the list:

[15,14,13,12,11,10,9,8]  [7,6,5,4,3,2,1,0]

Then you sort the first half, in particular you divide it into two new sublists:

[15,14,13,12]  [11,10,9,8]

And you do the same again:

[15,14]  [13,12]

And again:

[15]  [14]

Only then you begin to merge the lists.
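The question's original code isn't reproduced here, but a slicing-based mergesort matching the walk-through above might look like this (a hypothetical sketch of Algorithm 2; every recursive call allocates two fresh sublists via slicing, and `merge` allocates a new result list):

```python
def merge(left, right):
    # Merge two sorted lists into a newly allocated list.
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
    result.extend(left[i:])
    result.extend(right[j:])
    return result

def mergesort(lst):
    # Algorithm 2 style: slicing allocates two new sublists per call
    # before any merging happens.
    if len(lst) <= 1:
        return lst
    mid = len(lst) // 2
    return merge(mergesort(lst[:mid]), mergesort(lst[mid:]))

print(mergesort([15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0]))
```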

What is the total length of the lists this algorithm has allocated at this point?

It is 16 + 2*8 + 2*4 + 2*2 + 2*1. In general, it's N + 2N/2 + 2N/4 + 2N/8 + ... + 2. That's a simple geometric progression that sums to something around 3*N.
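The sum can be checked directly; for any power-of-two N it comes to exactly 3N - 2:

```python
def total_allocated(n):
    # N for the top-level list, plus 2 * (N/2 + N/4 + ... + 1)
    # for the pairs of sublists created at each recursion level.
    total = n
    half = n // 2
    while half >= 1:
        total += 2 * half
        half //= 2
    return total

print(total_allocated(16))   # 16 + 2*8 + 2*4 + 2*2 + 2*1 = 46
assert total_allocated(16) == 3 * 16 - 2
```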

The algorithm also needs O(log(N)) space for the call stack, but that vanishes in the big O notation, leaving O(N) overall.

It is easy to see that this is the maximum of what the algorithm will use at any given point -- the length of allocated lists that will be used in the future (and cannot be deallocated because of that) will never exceed 3*N.

Now for the other algorithm. Consider the same example again: we're to sort the following list.

[15,14,13,12,11,10,9,8,7,6,5,4,3,2,1,0]

Imagine that we have already sorted its first and second half:

[8,9,10,11,12,13,14,15, 0,1,2,3,4,5,6,7]

Now, we need to allocate a temporary list of length N to perform the merge. So at that moment we actively use two lists of length N, that gives us 2*N = O(N).
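Again as a hypothetical sketch (the question's code isn't shown here), this style of mergesort sorts both halves within the original list and allocates a temporary buffer only for the merge step, so at the top-level merge the input list and one temporary list coexist:

```python
def mergesort_inplace(lst, lo=0, hi=None):
    # Algorithm 1 style: sort lst[lo:hi] in place, allocating a
    # temporary list only while merging two sorted halves.
    if hi is None:
        hi = len(lst)
    if hi - lo <= 1:
        return
    mid = (lo + hi) // 2
    mergesort_inplace(lst, lo, mid)
    mergesort_inplace(lst, mid, hi)
    temp = []                      # temporary buffer for the merge
    i, j = lo, mid
    while i < mid and j < hi:
        if lst[i] <= lst[j]:
            temp.append(lst[i])
            i += 1
        else:
            temp.append(lst[j])
            j += 1
    temp.extend(lst[i:mid])
    temp.extend(lst[j:hi])
    lst[lo:hi] = temp              # copy merged result back

data = [15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
mergesort_inplace(data)
print(data)
```

At the moment the top-level merge runs, `temp` grows to length N while `data` is still live, matching the 2*N figure above.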

Again, it is easy to see that we'll never use more memory: the task of sorting the halves of the list naturally cannot cost more than sorting the list itself.

Both algorithms use O(N) memory. They use O(log(N)) for the call stack, but that's a minor expense compared to O(N).

Knowing additionally that Python uses reference counting to deallocate unused objects validates our initial assumption about garbage collection.
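This immediate-deallocation behavior is observable on CPython specifically (it is an implementation detail, not a language guarantee): once the last reference to an object goes away, the object is freed at once, which a weak reference can witness.

```python
import weakref

class Node:
    """A trivial object we can hold a weak reference to."""
    pass

obj = Node()
ref = weakref.ref(obj)

del obj  # last strong reference gone: CPython frees the object immediately

# The weak reference now returns None, showing the object was
# deallocated as soon as its reference count hit zero.
print(ref() is None)  # True on CPython
```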