Gensim LDA multicore Python script runs too slowly

Updated: 2023-02-27 08:43:35

I noticed a few problems in your code, though I'm not sure they are the cause of the slow execution. This loop is useless; it will never run:

for text in author_text['full_text'].tolist():
    word_list = []
    for word in text:
        word_list.append(word)
        author_text.append(word_list)

There is also no need to loop over the words of the text. Calling split on the text is enough and gives you a list of words; do this while looping over the authors cursor.
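To illustrate the difference, here is a minimal sketch using a made-up sample string (the text is an assumption for illustration, not data from the question). It shows that iterating over a string yields single characters, while `str.split()` yields whitespace-separated words:

```python
# Hypothetical sample text, not taken from the question's data.
text = "topic models need tokens"

# Iterating over a string yields single characters, not words.
chars = [ch for ch in text]

# str.split() with no arguments splits on runs of whitespace.
words = text.split()

print(chars[:5])  # ['t', 'o', 'p', 'i', 'c']
print(words)      # ['topic', 'models', 'need', 'tokens']
```

So if `full_text` holds plain strings, a character-by-character loop would not produce the word lists a dictionary expects, whereas `split()` does.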

Try writing it like this. First:

all_authors_text = []
for author in authors:
    all_authors_text.append(author['full_text'].split())

Then create the dictionary:

from gensim import corpora
dictionary = corpora.Dictionary(all_authors_text)
