Updated: 2023-11-10 16:11:58
Going by your code snippet, I would split the file into 8 chunks and have 4 workers do the computation (why 8 chunks and 4 workers? Just an arbitrary choice for the example):
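from multiprocessing import Pool
import itertools


def myfunction(lines):
    """Build the output rows for one chunk of input lines."""
    returnlist = []
    for line in lines:
        list_of_elem = line.split(",")
        elem_id = list_of_elem[1]
        elem_to_check = list_of_elem[5]
        ids = list_of_elem[2].split("|")
        for x in itertools.permutations(ids, 2):
            # x is a 2-tuple, so str.join needs its members passed individually
            # (assumed output layout: id, first, second, flag).
            flag = "1\n" if x[1] == elem_to_check else "0\n"
            returnlist.append(",".join([elem_id, x[0], x[1], flag]))
    return returnlist


def chunk(it, size):
    """Return an iterator over tuples of at most `size` items from `it`."""
    it = iter(it)
    return iter(lambda: tuple(itertools.islice(it, size)), ())


if __name__ == "__main__":
    with open(r"my_input_file_to_be_processed.txt", "r") as f:
        # splitlines() avoids the empty trailing string that split("\n") leaves
        my_data = f.read().splitlines()
    # Roughly 8 chunks; max(1, ...) guards against a chunk size of 0
    prep = list(chunk(my_data, max(1, round(len(my_data) / 8))))
    with Pool(4) as p:
        res = p.map(myfunction, prep)
    # Flatten the per-chunk lists into a single list
    result = list(itertools.chain.from_iterable(res))
    print(result)  # ... or do something with the result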
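To see what the chunking step actually produces, here is a small sketch that runs the same `chunk` helper on a made-up list of 10 lines (the `lines` list and the `size` variable are only for illustration). It suggests that "8 parts" is approximate: the chunk size is rounded, so the real number of chunks can differ from 8, which is harmless because `Pool.map` accepts any number of chunks.

```python
import itertools


def chunk(it, size):
    """Return an iterator over tuples of at most `size` items from `it`."""
    it = iter(it)
    return iter(lambda: tuple(itertools.islice(it, size)), ())


# Hypothetical input: 10 dummy lines instead of a real file
lines = ["line{}".format(i) for i in range(10)]

# Same size computation as in the answer: 10 / 8 = 1.25, rounded to 1
size = max(1, round(len(lines) / 8))
chunks = list(chunk(lines, size))

print(size)         # chunk size actually used
print(len(chunks))  # number of chunks handed to the pool
print(chunks[0])    # first chunk, a tuple of lines
```

With a chunk size of 1 every line becomes its own chunk, so small inputs end up with more than 8 chunks; for large files the count stays close to 8.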
Edit: This assumes you are confident that every line is formatted the same way and will not raise an error.
Following your comments, it may help to find out what is wrong in your function or in the contents of your file, either by testing it without multiprocessing or by wrapping the loop body in a deliberately broad (and ugly) try/except, so that some output is almost certainly produced (either the exception or the result):
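def myfunction(lines):
    """Same as above, but report a failing line instead of crashing."""
    returnlist = []
    for line in lines:
        try:
            list_of_elem = line.split(",")
            elem_id = list_of_elem[1]
            elem_to_check = list_of_elem[5]
            ids = list_of_elem[2].split("|")
            for x in itertools.permutations(ids, 2):
                # x is a 2-tuple, so str.join needs its members passed individually
                # (assumed output layout: id, first, second, flag).
                flag = "1\n" if x[1] == elem_to_check else "0\n"
                returnlist.append(",".join([elem_id, x[0], x[1], flag]))
        except Exception as err:
            # Deliberately broad: record the error and the offending line
            returnlist.append('I encountered error {} on line {}'.format(err, line))
    return returnlist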
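As a rough illustration of the "test it without multiprocessing" suggestion, the sketch below calls a try/except version of the function directly on two made-up lines, one well formed and one too short. The sample lines are hypothetical, and the output layout (ID, first element, second element, flag) is an assumption, since the real file format is not shown.

```python
import itertools


def myfunction(lines):
    returnlist = []
    for line in lines:
        try:
            fields = line.split(",")
            elem_id = fields[1]
            elem_to_check = fields[5]
            ids = fields[2].split("|")
            for x in itertools.permutations(ids, 2):
                # Assumed layout: id, first, second, flag
                flag = "1\n" if x[1] == elem_to_check else "0\n"
                returnlist.append(",".join([elem_id, x[0], x[1], flag]))
        except Exception as err:
            returnlist.append("I encountered error {} on line {}".format(err, line))
    return returnlist


# Hypothetical sample: one valid line (6 fields) and one malformed line
sample = ["r0,id1,a|b,f3,f4,b", "too,short"]

# Plain serial call: no Pool, so any problem shows up directly
output = myfunction(sample)
for row in output:
    print(repr(row))
```

Running it serially on a handful of lines first makes it easier to spot malformed input before handing the whole file to the worker pool.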