Question

I want any values that do not match the given format (dateFormat below) to be appended with rbind to a blank data frame instead of being converted. Currently, as.Date converts them all to NA. We don't want them converted at all, just output to the data frame. Does anyone know how to short-circuit as.Date at that point?

dataValues = data.frame(id = c("a1", "a2", "a4", "a5", "a6","a7", "a8", "a9", "a10",  "a11", "a12","a13", "a14", "a15", "a16", "a17"),
                        value1 = c('10/3/2012', '13/4/2012', NA, '0', '1/2/2012', '2/30/2013',
                        '2/4/2012', "N/A", 'No Data', '5-6-2012', '2/5/2012',
                        'Not Applicable', '5/8/2013', '2/5/2014', '6/9/2010', '5/4/2014'),
                        stringsAsFactors =  FALSE)
dateFormat = "%m/%d/%Y"
# If as.Date() encounters anything that is not valid for the format,
# such as an out-of-range value or a different layout, it changes
# that value to NA in the resulting data frame.
dates = data.frame(Date = as.Date(dataValues[,2], dateFormat))

Solution

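If I understand correctly, you can use split to separate the valid and invalid dates. as.Date returns NA for every value that does not match the format, so is.na on its result flags the invalid rows, and the original value1 strings are left unchanged: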

split(dataValues, is.na(as.Date(dataValues$value1, dateFormat)))
$`FALSE`
    id    value1
1   a1 10/3/2012
5   a6  1/2/2012
7   a8  2/4/2012
11 a12  2/5/2012
13 a14  5/8/2013
14 a15  2/5/2014
15 a16  6/9/2010
16 a17  5/4/2014

$`TRUE`
    id         value1
2   a2      13/4/2012
3   a4           <NA>
4   a5              0
6   a7      2/30/2013
8   a9            N/A
9  a10        No Data
10 a11       5-6-2012
12 a13 Not Applicable
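
If separate data frames are more convenient than the list returned by split, the same NA test can be used as a logical index. The following is only a minimal sketch using the sample data from the question; the names parsed, isInvalid, invalidRows and validRows are illustrative and not part of the original code:

```r
# Same sample data and format as in the question.
dataValues <- data.frame(
  id = c("a1", "a2", "a4", "a5", "a6", "a7", "a8", "a9", "a10",
         "a11", "a12", "a13", "a14", "a15", "a16", "a17"),
  value1 = c("10/3/2012", "13/4/2012", NA, "0", "1/2/2012", "2/30/2013",
             "2/4/2012", "N/A", "No Data", "5-6-2012", "2/5/2012",
             "Not Applicable", "5/8/2013", "2/5/2014", "6/9/2010", "5/4/2014"),
  stringsAsFactors = FALSE
)
dateFormat <- "%m/%d/%Y"

# as.Date() yields NA wherever the string does not fit the format.
parsed <- as.Date(dataValues$value1, format = dateFormat)
isInvalid <- is.na(parsed)

# Rows that failed to parse, with the original strings untouched.
invalidRows <- dataValues[isInvalid, ]

# Rows that parsed, with the converted dates in a new column.
validRows <- dataValues[!isInvalid, ]
validRows$Date <- parsed[!isInvalid]
```

Note that a value that was already NA in the input (id a4 here) also ends up in invalidRows. If missing values should be treated differently from malformed ones, the index could be narrowed to isInvalid & !is.na(dataValues$value1).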
Licensed under: CC-BY-SA with attribution