且构网


Selecting Pandas DataFrame rows between two dates

Updated: 2023-11-29 18:19:40

If your dataframes are not very big, you can simply join them on a dummy key, which pairs every row of one frame with every row of the other, and then filter the result down to the rows where the date falls inside the validity range. See the example below (note that I had to update your example a little so that the dates are formatted correctly).
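
To make the dummy-key idea concrete, here is a minimal sketch on two toy frames. The names `left`, `right`, and `pairs` are illustrative only and are not part of the example data in this answer. It shows that a constant key yields one output row per pairing, which is why the approach only suits frames that are not very big.

```python
import pandas as pd

# Toy frames, purely illustrative (not the data used in this answer).
left = pd.DataFrame({'a': [1, 2, 3]})
right = pd.DataFrame({'b': ['x', 'y']})

# Giving every row the same key value makes the merge pair each row of
# `left` with each row of `right`.
pairs = pd.merge(left.assign(key=1), right.assign(key=1), on='key').drop('key', axis=1)

# The joined frame has len(left) * len(right) rows.
print(len(pairs))
```

The row count grows with the product of the two lengths, so memory use can climb quickly as either frame gets larger.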

import pandas as pd

# Each rate applies over the window from valid_from to valid_to
rates = {'rate': [0.974, 0.966, 0.996, 0.998, 0.994, 1.006, 1.042, 1.072, 0.954],
         'valid_from': ['31/12/2018','15/01/2019','01/02/2019','01/03/2019','01/04/2019','15/04/2019','01/05/2019','01/06/2019','30/06/2019'],
         'valid_to': ['14/01/2019','31/01/2019','28/02/2019','31/03/2019','14/04/2019','30/04/2019','31/05/2019','29/06/2019','31/07/2019']}

df1 = pd.DataFrame(rates)
# Parse the day/month/year strings into datetimes so they can be compared
df1['valid_to'] = pd.to_datetime(df1['valid_to'], format='%d/%m/%Y')
df1['valid_from'] = pd.to_datetime(df1['valid_from'], format='%d/%m/%Y')

Then your df1 will be:

        rate    valid_from  valid_to
    0   0.974   2018-12-31  2019-01-14
    1   0.966   2019-01-15  2019-01-31
    2   0.996   2019-02-01  2019-02-28
    3   0.998   2019-03-01  2019-03-31
    4   0.994   2019-04-01  2019-04-14
    5   1.006   2019-04-15  2019-04-30
    6   1.042   2019-05-01  2019-05-31
    7   1.072   2019-06-01  2019-06-29
    8   0.954   2019-06-30  2019-07-31

This is your second data frame, df2:

data = {'date': ['03/01/2019','23/01/2019','27/02/2019','14/03/2019','05/04/2019','30/04/2019','14/06/2019'],
        'amount': [200, 305, 155, 67, 95, 174, 236]}

df2 = pd.DataFrame(data)
df2['date'] = pd.to_datetime(df2['date'], format='%d/%m/%Y')

Your df2 then looks like this:

    date        amount
0   2019-01-03  200
1   2019-01-23  305
2   2019-02-27  155
3   2019-03-14  67
4   2019-04-05  95
5   2019-04-30  174
6   2019-06-14  236

Your solution:

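# Give both frames the same constant key so the merge pairs every df1 row with every df2 row
df1['key'] = 1
df2['key'] = 1
df_output = pd.merge(df1, df2, on='key').drop('key', axis=1)
# Keep only the pairs whose date falls inside the validity window (both ends inclusive,
# so a date equal to valid_from is matched as well)
df_output = df_output[(df_output['date'] >= df_output['valid_from']) & (df_output['date'] <= df_output['valid_to'])]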
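
As a variation, the same join-then-filter step can be sketched without the helper column. This assumes pandas 1.2 or later, where `merge` accepts `how='cross'`, and it assumes that both window ends should count as inside the range. The function name `match_rates` and the small `rates_df` / `amounts_df` frames are hypothetical, made up for this illustration.

```python
import pandas as pd

def match_rates(rates_df, amounts_df):
    """Attach to each dated row the rate whose validity window contains its date."""
    # how='cross' (pandas >= 1.2) builds every pairing without a dummy key column.
    pairs = pd.merge(rates_df, amounts_df, how='cross')
    # Series.between is inclusive on both ends by default.
    in_window = pairs['date'].between(pairs['valid_from'], pairs['valid_to'])
    return pairs[in_window].reset_index(drop=True)

# Small hypothetical frames in the same shape as df1 and df2.
rates_df = pd.DataFrame({
    'rate': [0.974, 0.966],
    'valid_from': pd.to_datetime(['31/12/2018', '15/01/2019'], format='%d/%m/%Y'),
    'valid_to': pd.to_datetime(['14/01/2019', '31/01/2019'], format='%d/%m/%Y'),
})
amounts_df = pd.DataFrame({
    # The second date sits exactly on a valid_from boundary.
    'date': pd.to_datetime(['03/01/2019', '15/01/2019'], format='%d/%m/%Y'),
    'amount': [200, 305],
})

result = match_rates(rates_df, amounts_df)
print(result)
```

Because the filter is inclusive, the row dated 15/01/2019 is matched to the window that starts on that day rather than being dropped.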

The result, df_output, looks like this:

    rate    valid_from  valid_to    date    amount
0   0.974   2018-12-31  2019-01-14  2019-01-03  200
8   0.966   2019-01-15  2019-01-31  2019-01-23  305
16  0.996   2019-02-01  2019-02-28  2019-02-27  155
24  0.998   2019-03-01  2019-03-31  2019-03-14  67
32  0.994   2019-04-01  2019-04-14  2019-04-05  95
40  1.006   2019-04-15  2019-04-30  2019-04-30  174
55  1.072   2019-06-01  2019-06-29  2019-06-14  236
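
If the frames are too large for a full pairing, one possible alternative is `pd.merge_asof`, which matches each date to the most recent `valid_from` at or before it. This is only a sketch under the assumption that the validity windows do not overlap; the frame names `rates_df`, `amounts_df`, and `matched` are hypothetical, and the data is a reduced version of the example above with one date placed outside every window.

```python
import pandas as pd

rates_df = pd.DataFrame({
    'rate': [0.974, 0.966, 0.996],
    'valid_from': pd.to_datetime(['2018-12-31', '2019-01-15', '2019-02-01']),
    'valid_to': pd.to_datetime(['2019-01-14', '2019-01-31', '2019-02-28']),
})
amounts_df = pd.DataFrame({
    # The last date falls after every window and should end up unmatched.
    'date': pd.to_datetime(['2019-01-03', '2019-01-23', '2019-03-14']),
    'amount': [200, 305, 67],
})

# merge_asof requires both frames to be sorted on their join keys.
matched = pd.merge_asof(
    amounts_df.sort_values('date'),
    rates_df.sort_values('valid_from'),
    left_on='date',
    right_on='valid_from',  # default direction='backward': latest valid_from <= date
)
# Drop rows whose date lies past the end of the matched window.
matched = matched[matched['date'] <= matched['valid_to']]
print(matched)
```

This avoids materialising every pairing, at the cost of relying on sorted, non-overlapping windows.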