Converting a pandas column of string quarter-and-year arrays to a datetime column

Updated: 2023-02-03 14:08:07

First create a quarterly PeriodIndex, then convert it to datetimes with PeriodIndex.to_timestamp and floor the result to whole days with DatetimeIndex.floor:

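import pandas as pd

# If necessary, turn strings like "[Q1,10]" into lists like ['Q1', '10']
df['Date'] = df['Date'].str.strip('[]').str.split(',')

# Check that the rebuilt labels match the 'YYYYQn' format
print('20' + df['Date'].str[::-1].str.join(''))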
0    2010Q1
1    2010Q2
2    2010Q3
3    2010Q4
4    2011Q1
5    2011Q2
Name: Date, dtype: object


# how='e' gives the last instant of each quarter; floor('D') drops the time part
df['Date'] = (pd.PeriodIndex('20' + df['Date'].str[::-1].str.join(''), freq='Q')
                .to_timestamp(how='e')
                .floor('D'))
print(df)
        Date  Data
0 2010-03-31   8.7
1 2010-06-30   8.4
2 2010-09-30  14.1
3 2010-12-31  16.2
4 2011-03-31  18.6
5 2011-06-30  20.4
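
For reference, the steps above can be run end to end as in the sketch below. The original question's data is not shown on this page, so the input strings of the form "[Q1,10]" are an assumption inferred from the strip and split calls; the Data values are taken from the printed output.

```python
import pandas as pd

# Assumed input format: the source data is not shown, so "[Q1,10]" is a guess
# based on the .str.strip('[]').str.split(',') step.
df = pd.DataFrame({
    "Date": ["[Q1,10]", "[Q2,10]", "[Q3,10]", "[Q4,10]", "[Q1,11]", "[Q2,11]"],
    "Data": [8.7, 8.4, 14.1, 16.2, 18.6, 20.4],
})

# "[Q1,10]" -> ['Q1', '10'] -> ['10', 'Q1'] -> '10Q1' -> '2010Q1'
parts = df["Date"].str.strip("[]").str.split(",")
labels = "20" + parts.str[::-1].str.join("")

# Quarterly periods -> last instant of each quarter -> midnight of that day
df["Date"] = (pd.PeriodIndex(labels, freq="Q")
                .to_timestamp(how="e")
                .floor("D"))
print(df)
```

Note that the hard-coded "20" prefix only suits years from 2000 to 2099; data from another century would need a different rule.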

Alternative that converts each value to a Period:

df['Date'] = (df['Date'].str[::-1].str.join('').apply(lambda x: pd.Period(x, freq='Q'))
                    .dt.to_timestamp(how='e')
                    .dt.floor('d'))
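
A small sketch of what this variant does step by step, in case the intermediate Period column is useful on its own (for example, for grouping by quarter). The labels here are hypothetical and already written with four-digit years; they are not taken from the original data.

```python
import pandas as pd

# Hypothetical labels in 'YYYYQn' form
labels = pd.Series(["2010Q1", "2010Q2", "2011Q4"])

# Each string becomes a quarterly Period; pandas stores them as a period column
periods = labels.apply(lambda x: pd.Period(x, freq="Q"))

# The .dt accessor then converts the periods to end-of-quarter dates
dates = periods.dt.to_timestamp(how="e").dt.floor("D")

print(periods)
print(dates)
```

Compared with building a PeriodIndex in one call, apply creates the Period objects one at a time, so it is likely to be slower on long columns.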

Or the solution from @MrFuppes (thank you):

# Parse the quarter labels to quarter-start dates, then roll forward to quarter end
df['Date'] = (pd.to_datetime('20' + df['Date'].str[::-1].str.join('')) +
              pd.offsets.QuarterEnd(0))
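
The zero in QuarterEnd(0) matters: it rolls a date forward to the end of its quarter but leaves a date that is already a quarter end unchanged, whereas QuarterEnd(1) would push such a date to the next quarter. A minimal sketch with hypothetical dates (not from the original data):

```python
import pandas as pd

# Hypothetical quarter-start dates, as produced by parsing quarter labels
quarter_starts = pd.to_datetime(pd.Series(["2010-01-01", "2010-04-01"]))
on_boundary = pd.Timestamp("2010-03-31")

# n=0 rolls forward to the quarter end only if the date is not already on one
rolled = quarter_starts + pd.offsets.QuarterEnd(0)
unchanged = on_boundary + pd.offsets.QuarterEnd(0)

# n=1 always moves forward, so a quarter-end date jumps to the next quarter end
next_quarter = on_boundary + pd.offsets.QuarterEnd(1)

print(rolled.tolist())
print(unchanged, next_quarter)
```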