Updated: 2021-08-26 02:30:45
You can explode a sequence of timestamps:
import pyspark.sql.functions as F

# Replace each row's Date with one row per month, covering `Interval` months
df2 = df.withColumn(
    'Date',
    F.expr("""
        explode(
            sequence(
                timestamp(Date),
                add_months(timestamp(Date), `Interval` - 1),
                interval 1 month
            )
        )
    """)
)

df2.show(99)
+------------------+----------+-------------------+---------------+--------+
|     OpptyHeaderID|   OpptyID|               Date|BaseAmountMonth|Interval|
+------------------+----------+-------------------+---------------+--------+
|0067000000i6ONPAA2|OP-0164615|2014-07-27 00:00:00|    4375.800000|      12|
|0067000000i6ONPAA2|OP-0164615|2014-08-27 00:00:00|    4375.800000|      12|
|0067000000i6ONPAA2|OP-0164615|2014-09-27 00:00:00|    4375.800000|      12|
|0067000000i6ONPAA2|OP-0164615|2014-10-27 00:00:00|    4375.800000|      12|
|0067000000i6ONPAA2|OP-0164615|2014-11-27 00:00:00|    4375.800000|      12|
|0067000000i6ONPAA2|OP-0164615|2014-12-27 00:00:00|    4375.800000|      12|
|0067000000i6ONPAA2|OP-0164615|2015-01-27 00:00:00|    4375.800000|      12|
|0067000000i6ONPAA2|OP-0164615|2015-02-27 00:00:00|    4375.800000|      12|
|0067000000i6ONPAA2|OP-0164615|2015-03-27 00:00:00|    4375.800000|      12|
|0067000000i6ONPAA2|OP-0164615|2015-04-27 00:00:00|    4375.800000|      12|
|0067000000i6ONPAA2|OP-0164615|2015-05-27 00:00:00|    4375.800000|      12|
|0067000000i6ONPAA2|OP-0164615|2015-06-27 00:00:00|    4375.800000|      12|
|0065w0000215k5kAAA|OP-0218055|2020-12-23 00:00:00|    4975.000000|       7|
|0065w0000215k5kAAA|OP-0218055|2021-01-23 00:00:00|    4975.000000|       7|
|0065w0000215k5kAAA|OP-0218055|2021-02-23 00:00:00|    4975.000000|       7|
|0065w0000215k5kAAA|OP-0218055|2021-03-23 00:00:00|    4975.000000|       7|
|0065w0000215k5kAAA|OP-0218055|2021-04-23 00:00:00|    4975.000000|       7|
|0065w0000215k5kAAA|OP-0218055|2021-05-23 00:00:00|    4975.000000|       7|
|0065w0000215k5kAAA|OP-0218055|2021-06-23 00:00:00|    4975.000000|       7|
+------------------+----------+-------------------+---------------+--------+
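
If you want to sanity-check the expected rows without a Spark session, the same expansion can be sketched in plain Python. This is only an illustration, not what Spark runs internally: `add_months` and `expand_months` are hypothetical helper names, and clamping the day to the end of shorter months is an assumption about how month stepping behaves.

```python
import calendar
from datetime import datetime


def add_months(start, months):
    """Shift `start` by whole months, clamping the day to the month's last day."""
    total = start.year * 12 + (start.month - 1) + months
    year, month_index = divmod(total, 12)
    month = month_index + 1
    last_day = calendar.monthrange(year, month)[1]
    return start.replace(year=year, month=month, day=min(start.day, last_day))


def expand_months(start, interval):
    """Return `interval` timestamps, one per month, beginning at `start`."""
    # Each entry is computed from the original start, not from the previous one.
    return [add_months(start, i) for i in range(interval)]


# Second opportunity from the output above: start 2020-12-23, Interval = 7
rows = expand_months(datetime(2020, 12, 23), 7)
for ts in rows:
    print(ts)
```

The list has one entry per month, starting at the original Date and ending `Interval - 1` months later, which is why the Spark expression passes `` `Interval` - 1 `` to `add_months` as the upper bound of the sequence.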