How can I filter rows while loading with the pandas read_csv function?

Updated: 2023-01-29 19:41:54

There is no option to filter rows before the CSV file is loaded into a pandas object.

You can either load the file and then filter with df[df['field'] > constant], or, if the file is very large and you are worried about running out of memory, read it with an iterator and apply the filter to each chunk as you concatenate the chunks, e.g.:

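import pandas as pd

constant = 0  # placeholder threshold; replace with your own value
# passing chunksize makes read_csv return an iterator of DataFrames
iter_csv = pd.read_csv('file.csv', chunksize=1000)
df = pd.concat([chunk[chunk['field'] > constant] for chunk in iter_csv])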
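
To see that the two approaches select the same rows, here is a minimal self-contained sketch. The sample data, the column names, and the threshold are made up for illustration, and an in-memory buffer stands in for 'file.csv'.

```python
import io
import pandas as pd

# Hypothetical sample data standing in for 'file.csv'
csv_text = "field,name\n1,a\n5,b\n3,c\n8,d\n2,e\n"
constant = 2  # hypothetical threshold

# Approach 1: load the whole file, then filter
full = pd.read_csv(io.StringIO(csv_text))
filtered_full = full[full["field"] > constant]

# Approach 2: read in chunks of 2 rows and filter each chunk before concatenating
chunks = pd.read_csv(io.StringIO(csv_text), chunksize=2)
filtered_chunked = pd.concat([chunk[chunk["field"] > constant] for chunk in chunks])

# Both approaches should keep the same rows with the same index labels
same = filtered_full.equals(filtered_chunked)
```

With the chunked approach only the rows that pass the filter are kept from each chunk, so the full file never has to sit in memory at once.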

You can vary the chunksize to suit your available memory. See here for more details.
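
As a rough sketch of how the chunk size can be exposed as a tuning knob, the chunked filter can be wrapped in a small function. The name read_csv_filtered and its parameters are hypothetical and not part of pandas; the sample data is made up for illustration.

```python
import io
import pandas as pd

def read_csv_filtered(source, column, threshold, chunksize=1000):
    """Hypothetical helper: keep only rows where `column` > `threshold`."""
    kept = []
    # Each chunk is a DataFrame with at most `chunksize` rows
    for chunk in pd.read_csv(source, chunksize=chunksize):
        kept.append(chunk[chunk[column] > threshold])
    return pd.concat(kept)

# Made-up data: a single column 'field' holding the values 0..9
csv_text = "field\n" + "\n".join(str(i) for i in range(10)) + "\n"

# A small chunksize lowers peak memory use; a large one means fewer iterations
small = read_csv_filtered(io.StringIO(csv_text), "field", 6, chunksize=3)
large = read_csv_filtered(io.StringIO(csv_text), "field", 6, chunksize=100)
```

The chunk size only changes how many rows are held in memory at a time; the filtered result is the same in both calls.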
