Question

I am having trouble with a date calculation on data imported from a .csv file. I want to take the date stored in the factor DateClosed and subtract a number of days (a) from it to get a new date. For example, if a = 203, I want the result to be DateClosed - 203. However, the code below does not give the dates I expect.

DateClosed is a factor.

> head(DateClosed)
[1] 7/30/2007  12/12/2007 5/8/2009   6/24/2009  6/24/2009  2/29/2008 
165 Levels: 1/12/2010 1/15/2011 1/15/2013 1/17/2009 1/18/2008 1/19/2012 1/2/2013 1/21/2013 1/22/2010 1/24/2013 1/26/2014 ... 9/7/2010
> head(as.Date(DateClosed,format="%m/%d/%y"))
[1] "2020-07-30" "2020-12-12" "2020-05-08" "2020-06-24" "2020-06-24" "2020-02-29"

> head(as.Date(DateClosed,format="%m/%d/%y"))-203
[1] "2020-01-09" "2020-05-23" "2019-10-18" "2019-12-04" "2019-12-04" "2019-08-10"

It subtracts 203 days correctly but for some reason reads the date wrong.


Solution

DateClosed <- factor(c("7/30/2007","12/12/2007", "5/8/2009"))
as.Date(DateClosed, format="%m/%d/%Y")

Produces:

[1] "2007-07-30" "2007-12-12" "2009-05-08"

Notice the capital "Y" in the format parameter. The lowercase "%y" matches a two-digit year, so as.Date reads only the first two digits of the year ("20") and ignores the remaining characters. On input, R maps two-digit years 00 to 68 to 20xx and 69 to 99 to 19xx, so "20" becomes 2020 and every date ends up in 2020.
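To tie this back to the original goal of subtracting a number of days, here is a minimal sketch in base R. The sample dates and the offset vector a are made up for illustration and stand in for the columns read from the .csv file; as.character() is used only to make the factor conversion explicit.

```r
# Hypothetical sample data standing in for the CSV columns.
DateClosed <- factor(c("7/30/2007", "12/12/2007", "5/8/2009"))
a <- c(203, 30, 0)  # hypothetical day offsets, one per row

# %y consumes only two digits ("20"), so every year parses as 2020.
wrong <- as.Date(as.character(DateClosed), format = "%m/%d/%y")

# %Y consumes the full four-digit year.
closed <- as.Date(as.character(DateClosed), format = "%m/%d/%Y")

# Subtracting a number from a Date shifts it back by that many days.
result <- closed - a
print(result)
# [1] "2007-01-08" "2007-11-12" "2009-05-08"
```

Because the subtraction is vectorized, a can be a single number (such as 203) or a column with one offset per row.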

Other tips

Manipulating dates becomes really easy with the lubridate package.

library(lubridate)
mdy(factor(c("7/30/2007","12/12/2007", "5/8/2009")))

"2007-07-30 UTC" "2007-12-12 UTC" "2009-05-08 UTC"

Or use parse_date_time from the same package:

parse_date_time(factor(c("7/30/2007","12/12/2007", "5/8/2009")),c('mdY'))
[1] "2007-07-30 UTC" "2007-12-12 UTC" "2009-05-08 UTC"
Licensed under: CC-BY-SA with attribution