且构网

How do I add a new column at the beginning of each row in a CSV file?

Updated: 2023-11-18 22:08:22

This is fairly easy to do with the csv module's DictReader and DictWriter classes. Here is an example that reads the old file and writes the new one in a single pass.

A DictReader instance returns each logical line (row) of the file as a dictionary whose keys are the field names. You can specify the field names explicitly, or let them be read from the first line of the file (as is done in the example below).
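As a small illustration of the two ways field names can be supplied, the sketch below reads in-memory CSV text through io.StringIO rather than a real file. The sample data and the variable names are made up for the demonstration.

```python
import csv
import io

sample = "ID,view\nid 1,N\nid 2,Y\n"  # made-up data for illustration

# Field names taken from the first line of the input.
reader = csv.DictReader(io.StringIO(sample))
rows_from_header = [dict(row) for row in reader]
header_names = reader.fieldnames

# Field names given explicitly; the first line is then treated as data.
explicit_reader = csv.DictReader(io.StringIO(sample), fieldnames=['a', 'b'])
rows_explicit = [dict(row) for row in explicit_reader]

print(header_names)         # ['ID', 'view']
print(rows_from_header[0])  # {'ID': 'id 1', 'view': 'N'}
print(rows_explicit[0])     # {'a': 'ID', 'b': 'view'}
```

When the names are passed explicitly, a header line in the file is not skipped, so it shows up as the first data row.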

You must specify the desired field names when creating a DictWriter instance, and the order of the field names defines the order in which they appear on each line of the output file. In this case the new field name is simply added to the beginning of the list of names read from the input file, whatever they may be.

import csv

# newline='' lets the csv module handle line endings itself.
with open('testdata.txt', 'r', newline='') as inf, \
     open('testdata2.txt', 'w', newline='') as outf:
    csvreader = csv.DictReader(inf)  # field names come from the first line
    fieldnames = ['Node'] + csvreader.fieldnames  # Add column name to beginning.
    csvwriter = csv.DictWriter(outf, fieldnames)
    csvwriter.writeheader()
    # Number the rows from 1 and copy each one with the extra 'Node' value.
    for node, row in enumerate(csvreader, start=1):
        csvwriter.writerow(dict(row, Node='node %s' % node))

If these were the contents of the input file:

ID,Test Description,file-name,module,view,path1,path2
id 1,test 1 desc,test1file.txt,test1module,N,test1path1,test1path2
id 2,test 2 desc,test2file.txt,test2module,Y,test2path1,test2path2
id 3,test 3 desc,test3file.txt,test3module,Y,test3path1,test3path2
id 4,test 4 desc,test4file.txt,test4module,N,test4path1,test4path2
id 5,test 5 desc,test5file.txt,test5module,Y,test5path1,test5path2

then these would be the contents of the output file after running the script:

Node,ID,Test Description,file-name,module,view,path1,path2
node 1,id 1,test 1 desc,test1file.txt,test1module,N,test1path1,test1path2
node 2,id 2,test 2 desc,test2file.txt,test2module,Y,test2path1,test2path2
node 3,id 3,test 3 desc,test3file.txt,test3module,Y,test3path1,test3path2
node 4,id 4,test 4 desc,test4file.txt,test4module,N,test4path1,test4path2
node 5,id 5,test 5 desc,test5file.txt,test5module,Y,test5path1,test5path2

Note that adding the data for a field to each row with dict(row, Node='node %s' % node), as shown, only works when the field name can be written as a keyword argument, that is, when it is a valid Python identifier such as Node.

Valid identifiers consist only of letters, digits, and underscores, cannot start with a digit, and cannot be a language keyword such as class, for, return, global, pass, etc.
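One way to check ahead of time whether a column name can be written as a keyword argument is to combine str.isidentifier() with keyword.iskeyword() from the standard library. The helper name usable_as_keyword_argument is hypothetical, introduced only for this sketch.

```python
import keyword

def usable_as_keyword_argument(name):
    """Return True if `name` could be written as name=value in a call."""
    return name.isidentifier() and not keyword.iskeyword(name)

# A leading underscore is allowed; spaces, leading digits and keywords are not.
for candidate in ['Node', '_node', 'Invalid Keyword', '2nd', 'class']:
    print(candidate, usable_as_keyword_argument(candidate))
```

Names that fail this check need the separate-assignment approach instead of the dict(row, name=value) shortcut.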

To get around this limitation, add the new field to the row in a separate statement:

    for node, row in enumerate(csvreader, 1):
        row['Invalid Keyword'] = 'node %s' % node  # add new field and value
        csvwriter.writerow(row)
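For completeness, here is a self-contained sketch of the same workaround that can be run without any files on disk. It uses io.StringIO objects in place of the input and output files, a made-up two-column sample, and a hypothetical column name 'Node Label' that contains a space. It builds each output row with dictionary unpacking, which is an alternative to assigning into row and likewise accepts any string as a key.

```python
import csv
import io

source = io.StringIO("ID,view\nid 1,N\nid 2,Y\n")  # stands in for the input file
target = io.StringIO(newline='')                    # stands in for the output file

new_field = 'Node Label'  # hypothetical name; not a valid identifier

csvreader = csv.DictReader(source)
fieldnames = [new_field] + csvreader.fieldnames  # new column goes first
csvwriter = csv.DictWriter(target, fieldnames)
csvwriter.writeheader()
for node, row in enumerate(csvreader, 1):
    # Copy the row and add the new key; column order still follows fieldnames.
    csvwriter.writerow({**row, new_field: 'node %s' % node})

result = target.getvalue()
print(result)
```

The new value is added to each dictionary last, yet it is written first, because DictWriter orders the columns by fieldnames rather than by dictionary key order.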