更新时间:2023-02-05 18:47:02
这里是 O(n / k log(k))
解决方案:
i = 0
while i+k-1 < n: //don't get out of bounds
if arr[i] == arr[i+k-1]:
produce arr[i] as dupe
i = min { j | arr[j] > arr[i] } //binary search
else:
c = min { j | arr[j] == arr[i+k-1] } //binary search
i = c
这个想法是,如果索引 i + k-1
的元素与索引 i $的元素匹配,则检查该元素c $ c>-很好,是骗子。否则,您无需检查
i
到 i + k-1
之间的所有元素。与 arr [i + k-1]
的值相同。
The idea is, you check the element at index i+k-1
, if it matches the element at index i
- good, it's a dupe. Otherwise, you don't need to check all the element between i
to i+k-1
, only the one with the same value as arr[i+k-1]
.
您确实需要回顾一下该元素的最早索引,但是可以保证超过 i + k
进行下一次迭代,使该算法的迭代总数为 O(n / k)
,每个算法取 O(logk)
时间。
You do need to look back for for the earliest index of this element, but you are guaranteed to exceed the index i+k
by next iteration, making the total number of iteration of this algorithm O(n/k)
, each takes O(logk)
time.
这在渐近性上优于线性时间算法,尤其是对于 k
的大值>(对于 k
在 O中的情况,算法衰减到
,例如-查找至少以频率0.1重复的元素) O(logn)
(n)
This is asymptotically better than linear time algorithm, especially for large values of k
(where the algorithm decays to O(logn)
for cases where k
is in O(n)
, like for example - find element that repeats at least with frequency 0.1)