Updated: 2023-02-26 20:56:39
To test for times: pandas fills in today's date by default when it parses a time-only string, so one possible solution is to compare the parsed dates with today's date using Series.dt.date and Timestamp.date, then use Series.all to check that every value in the column matches.
To test for dates, check whether the values stay the same after the times are removed with Series.dt.floor:
import pandas as pd

df = pd.DataFrame({'a': ['2019-01-01 12:23:10',
                         '2019-01-02 12:23:10'],
                   'b': ['2019-01-01',
                         '2019-01-02'],
                   'c': ['12:23:10',
                         '15:23:10'],
                   'd': ['a', 'b']})
print (df)
                     a           b         c  d
0  2019-01-01 12:23:10  2019-01-01  12:23:10  a
1  2019-01-02 12:23:10  2019-01-02  15:23:10  b
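def check(col):
    try:
        dt = pd.to_datetime(df[col])
        # Pure dates: flooring to the day changes nothing
        if (dt.dt.floor('D') == dt).all():
            return 'Its a pure date field'
        # Pure times: every parsed value carries today's date
        elif (dt.dt.date == pd.Timestamp('now').date()).all():
            return 'Its a pure time field'
        else:
            return 'Its a Datetime field'
    except (ValueError, TypeError):
        # The column could not be parsed as datetimes
        return 'its not a datefield'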
print (check('a'))
print (check('b'))
print (check('c'))
print (check('d'))
Its a Datetime field
Its a pure date field
Its a pure time field
its not a datefield
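The two building blocks can also be tried on their own. The sketch below is only an illustration: the helper names is_pure_date and is_pure_time are hypothetical, and the time test swaps in a different technique from the one above. Instead of comparing against today's date, it parses with an explicit format ("%H:%M:%S") and errors="coerce", so the result does not depend on which date pandas fills in.

```python
import pandas as pd


def is_pure_date(values: pd.Series) -> bool:
    """True when every parsed value sits exactly on midnight."""
    dt = pd.to_datetime(values)
    # Flooring to the day drops the time part; equality means there was none
    return bool((dt.dt.floor("D") == dt).all())


def is_pure_time(values: pd.Series) -> bool:
    """True when every string parses with the explicit format HH:MM:SS."""
    # Values that do not match the format become NaT instead of raising
    parsed = pd.to_datetime(values, format="%H:%M:%S", errors="coerce")
    return bool(parsed.notna().all())


dates = pd.Series(["2019-01-01", "2019-01-02"])
times = pd.Series(["12:23:10", "15:23:10"])
stamps = pd.Series(["2019-01-01 12:23:10", "2019-01-02 12:23:10"])

print(is_pure_date(dates), is_pure_date(stamps))
print(is_pure_time(times), is_pure_time(dates), is_pure_time(stamps))
```

Whether this fits depends on the data: the explicit format only accepts strings that are times and nothing else, which is stricter than the today's-date comparison.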
Another idea is to test for numeric columns first and return "not a date field" for them by default, which prevents numbers from being cast to datetimes. In addition, because it is possible for all datetimes in a column to contain only today's date (column f), the test for times is done differently, with Series.str.contains matching the pattern HH:MM:SS or H:MM:SS:
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': ['2019-01-01 12:23:10',
                         '2019-01-02'],
                   'b': ['2019-01-01',
                         '2019-01-02'],
                   'c': ['12:23:10',
                         '15:23:10'],
                   'd': ['a', 'b'],
                   'e': [1, 2],
                   'f': ['2019-11-13 12:23:10',
                         '2019-11-13']})
print (df)
                     a           b         c  d  e                    f
0  2019-01-01 12:23:10  2019-01-01  12:23:10  a  1  2019-11-13 12:23:10
1           2019-01-02  2019-01-02  15:23:10  b  2           2019-11-13
def check(col):
    # Numeric columns would be silently cast to datetimes, so reject them first
    if np.issubdtype(df[col].dtype, np.number):
        return 'its not a datefield'
    try:
        dt = pd.to_datetime(df[col])
        if (dt.dt.floor('D') == dt).all():
            return 'Its a pure date field'
        # Match the original strings against H:MM:SS or HH:MM:SS
        elif df[col].str.contains(r"^\d{1,2}:\d{2}:\d{2}$").all():
            return 'Its a pure time field'
        else:
            return 'Its a Datetime field'
    except (ValueError, TypeError, AttributeError):
        return 'its not a datefield'
print (check('a'))
print (check('b'))
print (check('c'))
print (check('d'))
print (check('e'))
print (check('f'))
Its a Datetime field
Its a pure date field
Its a pure time field
its not a datefield
its not a datefield
Its a Datetime field
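To see why the numeric guard matters and what the time pattern accepts, here is a small sketch under the same assumptions as above. The variable names (numbers, TIME_PATTERN, times, mixed) are made up for the illustration. Integers passed to pd.to_datetime are read as offsets from the Unix epoch (nanoseconds by default), so the conversion succeeds without raising instead of failing.

```python
import numpy as np
import pandas as pd

numbers = pd.Series([1, 2])

# No exception here: the integers become timestamps just after 1970-01-01
as_dates = pd.to_datetime(numbers)
print(as_dates.dt.year.tolist())

# The guard used in check(): detect numeric dtypes before any conversion
is_numeric = bool(np.issubdtype(numbers.dtype, np.number))
print(is_numeric)

# One or two hour digits, then two-digit minutes and seconds, nothing else
TIME_PATTERN = r"^\d{1,2}:\d{2}:\d{2}$"

times = pd.Series(["12:23:10", "5:03:59"])
mixed = pd.Series(["2019-11-13 12:23:10", "2019-11-13"])

times_ok = bool(times.str.contains(TIME_PATTERN).all())
mixed_ok = bool(mixed.str.contains(TIME_PATTERN).all())
print(times_ok, mixed_ok)
```

Because the pattern is anchored with ^ and $, a full timestamp such as the values in column f does not count as a time, which is why that column still comes out as a datetime field.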