Why does dplyr's mutate change the time format?

Updated: 2023-11-03 16:23:58

It's actually ifelse() that is causing the issue, not dplyr::mutate(). ifelse() strips attributes from its result, including the class, so a date-time comes back as a plain number. An example of this attribute stripping is shown in help(ifelse):


## ifelse() strips attributes
## This is important when working with Dates and factors
x <- seq(as.Date("2000-02-29"), as.Date("2004-10-04"), by = "1 month")
## has many "yyyy-mm-29", but a few "yyyy-03-01" in the non-leap years
y <- ifelse(as.POSIXlt(x)$mday == 29, x, NA)
head(y) # not what you expected ... ==> need restore the class attribute:
class(y) <- class(x)
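The same stripping happens with date-times like the ones in the question. The following is a minimal sketch on made-up data (the `time` and `id` vectors are hypothetical, not the asker's data frame), showing that ifelse() returns the underlying seconds since 1970-01-01 and that the class has to be restored by hand:

```r
# Hypothetical example data: two timestamps and an id vector with one NA
time <- as.POSIXct(c("2015-03-05 02:28:11", "2015-03-03 13:10:59"), tz = "UTC")
id <- c(1674L, NA)

# ifelse() keeps only the underlying numbers and drops the POSIXct class
stripped <- ifelse(is.na(id), NA, time)
print(class(stripped))

# Converting back restores the date-time class (the time zone must be given again)
restored <- as.POSIXct(stripped, origin = "1970-01-01", tz = "UTC")
print(restored)
```

Note that the time zone is an attribute too, so it is lost along with the class and has to be supplied again when converting back.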


So there you have it. Using ifelse() takes a bit of extra work, because the class has to be restored afterwards. Here are two possible methods that get you to the desired result without ifelse(). The first is really simple and uses is.na<-.

## mark 'time' as NA if 'id' is NA
is.na(mydf$time) <- is.na(mydf$id)

## resulting in
mydf
#                  time    id
# 1 2015-03-05 02:28:11  1674
# 2 2015-03-03 13:10:59 36749
# 3                <NA>    NA
# 4                <NA>    NA
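For readers unfamiliar with `is.na<-`: it assigns NA at the selected positions, much like plain indexed assignment, and leaves the vector's class alone. A small sketch, assuming a hypothetical three-row data frame `df` built only for illustration:

```r
# Hypothetical example data, not the asker's data frame
df <- data.frame(
  time = as.POSIXct(c("2015-03-05 02:28:11", "2015-03-03 13:10:59",
                      "2015-03-05 15:55:48"), tz = "UTC"),
  id = c(1674L, 36749L, NA)
)

# Variant 1: the replacement form of is.na()
by_isna <- df
is.na(by_isna$time) <- is.na(by_isna$id)

# Variant 2: ordinary indexed assignment of NA
by_index <- df
by_index$time[is.na(by_index$id)] <- NA

# In both variants the column is still a date-time, with NA where id is NA
print(class(by_isna$time))
print(by_isna)
print(by_index)
```

Neither variant builds a new vector from scratch the way ifelse() does, which is why the attributes survive.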

If you'd rather not go that route and want to stay with the dplyr approach, you can use replace() instead of ifelse().

library(dplyr)  # provides mutate() and %>%
mydf %>% mutate(time = replace(time, is.na(id), NA))
#                  time    id
# 1 2015-03-05 02:28:11  1674
# 2 2015-03-03 13:10:59 36749
# 3                <NA>    NA
# 4                <NA>    NA
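Another option, not part of the original answer, is dplyr's own if_else(), which is designed to keep the type of its inputs. The sketch below is hedged: the data frame `df` is hypothetical, the call is guarded so it only runs when dplyr is installed, and the missing value is written as `as.POSIXct(NA)` on the assumption that some dplyr versions require both branches to have the same class.

```r
# Hypothetical example data, not the asker's data frame
df <- data.frame(
  time = as.POSIXct(c("2015-03-05 02:28:11", "2015-03-05 15:55:48"), tz = "UTC"),
  id = c(1674L, NA)
)

# Run only if dplyr is available; if_else() keeps 'time' as a date-time
if (requireNamespace("dplyr", quietly = TRUE)) {
  out <- dplyr::mutate(
    df,
    time = dplyr::if_else(!is.na(id), time, as.POSIXct(NA))
  )
  print(class(out$time))
  print(out)
}
```

Compared with replace(), this keeps the familiar condition/yes/no shape of ifelse(), at the cost of having to spell out a typed NA.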

Data:

mydf <- structure(list(time = structure(c(1425551291, 1425417059, 1425570948, 
1425564799), class = c("POSIXct", "POSIXt"), tzone = ""), id = c(1674L, 
36749L, NA, NA)), .Names = c("time", "id"), class = "data.frame", row.names = c(NA, 
-4L))