Updated: 2021-12-26 23:20:49
The numpy_indexed package contains efficient (generally O(n log n)) and vectorized solutions to these types of problems:
import numpy_indexed as npi
count = len(npi.intersection(a, b))
Note that this is subtly different from your double loop: for instance, it discards duplicate entries in a and b. If you want to retain duplicates in b, this would work:
count = npi.in_(b, a).sum()
Duplicate entries in a could also be handled by calling npi.count(a) and factoring in its result. That said, I am mostly mentioning this for illustration, since I imagine the distinction probably does not matter to you.
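To make the difference between these counts concrete, here is a small pure-Python sketch. It does not use numpy_indexed; it only mimics the semantics described above with the standard library. The sample lists and the shape of the "double loop" are assumptions, since the original question's code is not shown here.

```python
from collections import Counter

# Hypothetical sample data with duplicates on both sides.
a = [1, 2, 2, 3]
b = [2, 2, 3, 4, 5]

# Assumed shape of the original double loop: count every matching (a, b) pair.
pair_count = sum(1 for x in a for y in b if x == y)

# Analogue of len(npi.intersection(a, b)): distinct values present in both.
unique_count = len(set(a) & set(b))

# Analogue of npi.in_(b, a).sum(): elements of b found in a, duplicates in b kept.
set_a = set(a)
in_count = sum(1 for y in b if y in set_a)

# Weighting each element of b by its multiplicity in a (the role that
# npi.count(a) would play) recovers the pair count of the double loop.
counts_a = Counter(a)
weighted_count = sum(counts_a[y] for y in b)

print(pair_count, unique_count, in_count, weighted_count)  # 5 2 3 5
```

If neither a nor b contains duplicates, all four numbers coincide, which is why the distinction often does not matter in practice.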