且构网

Reading a file with strings and floats using loadtxt

Updated: 2023-02-18 20:20:04

The tables on the site you linked differ considerably from one another, and each one mixes different types across its columns.

You need to define a record type for each table.
A record type lets you declare strings, integers, and floats in the same array. It is defined and used as in this example:

>>> from numpy import array, dtype, float32, int32, str_
>>> recordtype = dtype([('name', str_, 20), ('age', int32), ('weight', float32)])
>>> people = array([('Joaquin', 51, 60.0), ('Cat', 18, 8.6)], dtype=recordtype)
>>> people
array([('Joaquin', 51, 60.0), ('Cat', 18, 8.600000381469727)], dtype=[('name', '<U20'), ('age', '<i4'), ('weight', '<f4')])

On the other hand, some of your rows contain entries such as '...' that break the consistency of the column data. So if you need to read directly from the file, you will need to pass converter functions through the converters parameter of loadtxt.
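As a rough illustration of the converters approach, the sketch below reads a small whitespace-separated table in which one weight is given as '...'. The sample data, the to_float helper, and the choice to map '...' to NaN are all assumptions made for this example; your tables may need a different placeholder value and converters on other columns.

```python
import io
import numpy as np

# Hypothetical sample data: name, age, weight; '...' marks a missing weight.
raw = io.StringIO(
    "Joaquin 51 60.0\n"
    "Cat 18 ...\n"
)

recordtype = np.dtype([('name', np.str_, 20),
                       ('age', np.int32),
                       ('weight', np.float32)])

def to_float(field):
    """Hypothetical converter: turn '...' into NaN, anything else into a float."""
    # Depending on the NumPy version, converters receive bytes or str.
    if isinstance(field, bytes):
        field = field.decode()
    field = field.strip()
    return np.nan if field == '...' else float(field)

# Apply the converter to column 2 (the weight column) only.
people = np.loadtxt(raw, dtype=recordtype, converters={2: to_float})
```

The fields can then be accessed by name, for example people['weight'], and missing entries can be located with np.isnan.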

Alternatively, since loadtxt also accepts a generator as input, you could clean each line inside a generator and feed loadtxt the cleaned lines.
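A minimal sketch of the generator approach might look like the following. The cleaned_lines helper, the sample data, and the decision to replace '...' with the text 'nan' (which loadtxt parses as a float NaN) are assumptions for illustration, not part of the original question.

```python
import io
import numpy as np

recordtype = np.dtype([('name', np.str_, 20),
                       ('age', np.int32),
                       ('weight', np.float32)])

def cleaned_lines(lines, placeholder='...', replacement='nan'):
    """Hypothetical helper: yield lines with placeholder fields replaced."""
    for line in lines:
        fields = [replacement if f == placeholder else f for f in line.split()]
        if fields:  # skip blank lines
            yield ' '.join(fields)

# Hypothetical sample data standing in for the real file.
raw = io.StringIO(
    "Joaquin 51 60.0\n"
    "Cat 18 ...\n"
)

# loadtxt consumes the generator one cleaned line at a time.
people = np.loadtxt(cleaned_lines(raw), dtype=recordtype)
```

With a real file, the same generator can wrap an open file object, for example cleaned_lines(open(path)) for whatever path holds your table.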

Finally, you should also set the skiprows parameter to skip the table headings.
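For instance, assuming a table whose first two lines are a heading and a separator (the layout here is made up for the example), skiprows=2 tells loadtxt to start parsing at the third line:

```python
import io
import numpy as np

recordtype = np.dtype([('name', np.str_, 20),
                       ('age', np.int32),
                       ('weight', np.float32)])

# Hypothetical table with two heading lines before the data.
raw = io.StringIO(
    "Name Age Weight\n"
    "---- --- ------\n"
    "Joaquin 51 60.0\n"
    "Cat 18 8.6\n"
)

# Skip the two heading lines so only data rows are parsed.
people = np.loadtxt(raw, dtype=recordtype, skiprows=2)
```

The number passed to skiprows has to match the number of heading lines in each table, so it may differ from one table to the next.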