First column in a pandas DataFrame is not a column?

Updated: 2022-12-11 20:14:15

That first column is not a regular column. It is called the index, and you can check it with:

print (df.index)
Int64Index([102, 301, 302], dtype='int64', name='vo_11')
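
To make the distinction concrete, here is a minimal sketch, assuming a small frame whose index is named vo_11 as in the output above. The column name `value` and the data are made up for illustration. It shows that the index name is not listed among the columns, and that `reset_index()` turns the index into an ordinary column if that is what you need.

```python
import pandas as pd

# Hypothetical data; only the index name 'vo_11' and its labels come from the output above.
df = pd.DataFrame({'value': [2, 21, 10]},
                  index=pd.Index([102, 301, 302], name='vo_11'))

print(df.columns.tolist())   # the index name is not among the columns
print(df.index.name)         # the name belongs to the index itself

# reset_index() moves the index into an ordinary column
df_flat = df.reset_index()
print(df_flat.columns.tolist())
```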

Also check the docs:

The axis labeling information in pandas objects serves many purposes:

- Identifies data (i.e. provides metadata) using known indicators, important for analysis, visualization, and interactive console display
- Enables automatic and explicit data alignment
- Allows intuitive getting and setting of subsets of the data set
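
The alignment and subsetting points can be illustrated with a small sketch. The two Series and their values are invented for this example; the labels are chosen to overlap only partially.

```python
import pandas as pd

# Hypothetical Series with partially overlapping index labels
s1 = pd.Series([1, 2, 3], index=[102, 301, 302])
s2 = pd.Series([10, 20, 30], index=[301, 302, 304])

# Arithmetic aligns on index labels, not on position;
# labels present in only one Series produce NaN.
total = s1 + s2
print(total)

# Label-based selection of a subset
print(s1.loc[[301, 302]])
```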

If you need to merge both DataFrames by their indexes:

df = pd.merge(df1, df2, left_index=True, right_index=True)

Or use concat:

df = pd.concat([df1, df2], axis=1) 

Note:

For the labels to match, both indexes need to be of the same type: both int or both object (typically string).
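
A minimal sketch of that pitfall, assuming one frame was loaded with integer labels and the other with string labels (the frames and column names `a` and `b` are hypothetical). The labels look the same when printed, so checking the dtype and converting one side before merging is a reasonable precaution.

```python
import pandas as pd

df1 = pd.DataFrame({'a': [21, 10]}, index=[301, 302])      # integer labels
df2 = pd.DataFrame({'b': [5, 4]}, index=['301', '302'])    # string labels

# The labels look alike when printed, but the dtypes differ
print(df1.index.dtype, df2.index.dtype)

# Convert one side so both indexes share a type, then merge
df2.index = df2.index.astype('int64')
df = pd.merge(df1, df2, left_index=True, right_index=True)
print(df)
```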

Sample:

import pandas as pd

df1 = pd.DataFrame({
'Column1': {302: 10, 301: 21, 102: 2}, 
'Column2': {302: 0, 301: 0, 102: 0}})
print (df1)
    Column1  Column2
102        2        0
301       21        0
302       10        0

df2 = pd.DataFrame({
'Column1': {302: 4, 301: 5, 304: 6}, 
'Column2': {302: 0, 301: 0, 304: 0}})
print (df2)
     Column1  Column2
301        5        0
302        4        0
304        6        0


df = pd.merge(df1, df2, left_index=True, right_index=True)
print (df)
     Column1_x  Column2_x  Column1_y  Column2_y
301         21          0          5          0
302         10          0          4          0

df = pd.merge(df1, df2, left_index=True, right_index=True, how='outer')
print (df)
     Column1_x  Column2_x  Column1_y  Column2_y
102        2.0        0.0        NaN        NaN
301       21.0        0.0        5.0        0.0
302       10.0        0.0        4.0        0.0
304        NaN        NaN        6.0        0.0

df = pd.concat([df1, df2], axis=1) 
print (df)
     Column1  Column2  Column1  Column2
102      2.0      0.0      NaN      NaN
301     21.0      0.0      5.0      0.0
302     10.0      0.0      4.0      0.0
304      NaN      NaN      6.0      0.0

df = pd.concat([df1, df2], axis=1, join='inner') 
print (df)
     Column1  Column2  Column1  Column2
301       21        0        5        0
302       10        0        4        0
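
The concat results above contain duplicate column names, which can make later selection ambiguous. One possible way around this, sketched here as an optional extra rather than part of the original answer, is the `keys` parameter of `pd.concat`. The key labels 'df1' and 'df2' are arbitrary names chosen for this example.

```python
import pandas as pd

df1 = pd.DataFrame({'Column1': {302: 10, 301: 21, 102: 2},
                    'Column2': {302: 0, 301: 0, 102: 0}})
df2 = pd.DataFrame({'Column1': {302: 4, 301: 5, 304: 6},
                    'Column2': {302: 0, 301: 0, 304: 0}})

# keys= adds an outer column level, so columns from each frame stay distinguishable
df = pd.concat([df1, df2], axis=1, keys=['df1', 'df2'])
print(df)

# Select one frame's column with a (key, column) tuple
print(df[('df1', 'Column1')])
```

With merge, the `suffixes` parameter plays a similar role, replacing the default `_x` and `_y`.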