Pandas DataFrame stored list as string: How to convert back to list?

Updated: 2023-10-22 15:20:28

I have an n-by-m Pandas DataFrame df defined as follows. (I know this is not the best way to do it. It makes sense for what I'm trying to do in my actual code, but that would be TMI for this post so just take my word that this approach works in my particular scenario.)

>>> from pandas import DataFrame, Series
>>> df = DataFrame(columns=['col1'])
>>> df.append(Series([None]), ignore_index=True)
>>> df
Empty DataFrame
Columns: [col1]
Index: []

I stored lists in the cells of this DataFrame as follows.

>>> df['col1'][0] = [1.23, 2.34]
>>> df
           col1
0  [1.23, 2.34]

For some reason, the DataFrame stored this list as a string instead of a list.

>>> df['col1'][0]
'[1.23, 2.34]'

I have 2 questions for you.

  1. Why does the DataFrame store a list as a string and is there a way around this behavior?
  2. If not, then is there a Pythonic way to convert this string into a list?


Update

The DataFrame I was using had been saved to and loaded from a CSV file. This format, rather than the DataFrame itself, converted the list from a literal to a string.

As you pointed out, this can commonly happen when saving and loading pandas DataFrames as .csv files, which is a text format.

In your case this happened because list objects have a string representation, which is what gets written to the .csv file. Loading the .csv then yields that string representation rather than the original list.
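To make the round trip concrete, here is a minimal sketch. It assumes a one-row frame with a column named col1 and writes to an in-memory buffer rather than a real file; both choices are for illustration only.

```python
import io

import pandas as pd

# A one-row frame whose single cell holds a real Python list.
df = pd.DataFrame({"col1": [[1.23, 2.34]]})

# Write to an in-memory CSV buffer instead of a file on disk.
buffer = io.StringIO()
df.to_csv(buffer, index=False)
buffer.seek(0)

# Reading the CSV back gives text, because CSV has no notion of a list.
loaded = pd.read_csv(buffer)

print(type(df["col1"].iloc[0]))      # <class 'list'>
print(type(loaded["col1"].iloc[0]))  # <class 'str'>
print(loaded["col1"].iloc[0])        # [1.23, 2.34]
```

The value printed on the last line looks like a list, but it is the string '[1.23, 2.34]'.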

If you want to store the actual objects, you should use DataFrame.to_pickle() (note: objects must be picklable!).
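A rough sketch of the pickle route, assuming the same one-row frame and a temporary file named frame.pkl (the file name is hypothetical):

```python
import os
import tempfile

import pandas as pd

df = pd.DataFrame({"col1": [[1.23, 2.34]]})

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "frame.pkl")  # hypothetical file name
    df.to_pickle(path)               # serializes the Python objects themselves
    restored = pd.read_pickle(path)  # the list comes back as a list

print(type(restored["col1"].iloc[0]))  # <class 'list'>
print(restored["col1"].iloc[0])        # [1.23, 2.34]
```

Keep in mind that pickle files should only be loaded from sources you trust, since unpickling can execute arbitrary code.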

To answer your second question, you can convert it back with ast.literal_eval:

>>> from ast import literal_eval
>>> literal_eval('[1.23, 2.34]')
[1.23, 2.34]
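If you need to stay with CSV, the same function can be applied to a whole column. The sketch below shows two possible ways to do it, assuming the list column is named col1 and every cell contains a valid Python literal; the sample CSV text is made up for illustration.

```python
import io
from ast import literal_eval

import pandas as pd

# Made-up CSV content with one list-like string per row.
csv_text = 'col1\n"[1.23, 2.34]"\n"[3.45, 4.56]"\n'

# Option 1: parse the column while loading, via read_csv's converters argument.
loaded = pd.read_csv(io.StringIO(csv_text), converters={"col1": literal_eval})

# Option 2: load the column as text first, then parse it afterwards.
as_text = pd.read_csv(io.StringIO(csv_text))
parsed = as_text["col1"].apply(literal_eval)

print(loaded["col1"].iloc[0])  # [1.23, 2.34]
print(parsed.tolist())         # [[1.23, 2.34], [3.45, 4.56]]
```

literal_eval only evaluates Python literals (numbers, strings, lists, dicts and so on), so it is a safer choice than eval for text read from a file. It raises an error on cells that are not valid literals, so empty or malformed cells would need separate handling.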