Updated: 2023-11-28 18:16:34
If you want the words that appear the same number of times, a Counter dict is much simpler:
from collections import Counter
c = Counter(["foo", "foo", "bar", "bar", "foobar", "foob"])
print([k for k, v in c.items() if v == 2])
['foo', 'bar']
However you calculate the similarity, if it is not based on frequency, simply store the common words in your dict and access them by key. Indexing, apart from being much less efficient than a dict lookup, will always return the first occurrence.
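To illustrate the point about indexing, here is a minimal sketch. It assumes "indexing" refers to list.index, which the answer does not spell out, and the positions dict is a hypothetical layout chosen only for this example.

```python
words = ["foo", "foo", "bar", "bar", "foobar", "foob"]

# list.index scans from the left and stops at the first match,
# so a repeated word always reports its first position.
first_bar = words.index("bar")

# A dict keyed by word is a direct lookup instead of a linear scan.
# Hypothetical layout: each word maps to every position it occupies.
positions = {}
for i, w in enumerate(words):
    positions.setdefault(w, []).append(i)

print(first_bar)         # 2
print(positions["bar"])  # [2, 3]
```

The list scan never reaches the second "bar", while the dict keeps whatever you chose to store under that key.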
To store the count as the key and the words with that count as the values:
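from collections import defaultdict
# Invert the Counter: each count maps to the list of words with that count.
d = defaultdict(list)
for k, v in c.items():
    d[v].append(k)
# .get returns the fallback for a count that no word has.
print(d.get(2, "N/A"))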
['foo', 'bar']
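The two steps can also be combined into one self-contained helper. This is only a sketch: the name group_by_count is made up for illustration and is not part of collections.

```python
from collections import Counter, defaultdict

def group_by_count(words):
    """Map each occurrence count to the list of words seen that many times."""
    groups = defaultdict(list)
    for word, count in Counter(words).items():
        groups[count].append(word)
    # Return a plain dict so a missing count does not create an empty entry.
    return dict(groups)

groups = group_by_count(["foo", "foo", "bar", "bar", "foobar", "foob"])
print(groups.get(2, "N/A"))  # ['foo', 'bar']
print(groups.get(1, "N/A"))  # ['foobar', 'foob']
print(groups.get(3, "N/A"))  # N/A
```

Converting to a plain dict before returning is a design choice: with a defaultdict, writing groups[3] would silently insert an empty list, whereas .get leaves the mapping unchanged.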