Visit number by date within factor group?
23-04-2021
Question
G'day All,
I am working in R. Sorry about this really basic question, but I am a bit stuck. I have a data set of presence/absence point count data with the date of each count and a site number (see below). Ultimately, I would like to create a data.frame that collates all counts by grid cell number and has each visit to a site as a new column (see below). I can't figure out how to do this, so I thought I would take an easier route and make a column that gives a visit number for each record, numbered by the date of the visit within each site group (see below). I can't figure out how to do this either! Any help would be great; thank you in advance.
Kind regards, Adam
I have this:
Site date
1 12/01/2000
1 24/02/2000
1 13/08/2001
2 14/01/2000
2 21/01/2002
3 1/01/1999
3 21/04/2000
Ultimately want this:
Site visit1 v2 v3
1 12/01/2000 24/02/2000 13/08/2001
2 14/01/2000 21/01/2002 na
3 01/01/1999 21/04/2000 na
But this would be good:
Site date visit
1 12/01/2000 1
1 24/02/2000 2
1 13/08/2001 3
2 14/01/2000 1
2 21/01/2002 2
3 01/01/1999 1
3 21/04/2000 2
Solution
Basically, you want to reshape your data from a long format to a wide format, with repeated observations from a Site all in a single line. The base R function reshape() was designed for just this task.
The only (slight) complication is that you first need to add a column (which I here call obsNum) that identifies which is the first, second, third etc. observation at a Site. By setting timevar = "obsNum", you can then let reshape() know into which column you want to put each of the values of date.
df <- read.table(text = "Site date
1 12/01/2000
1 24/02/2000
1 13/08/2001
2 14/01/2000
2 21/01/2002
3 1/01/1999
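3 21/04/2000", header=TRUE, stringsAsFactors=FALSE)
# Number the observations within each run of identical Site values
# (assumes the rows are already grouped by Site). sequence() is used
# instead of unlist(sapply(..., seq)) because sapply() returns a matrix
# when every site has the same number of visits.
df$obsNum <- sequence(rle(df$Site)$lengths)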
reshape(df, idvar="Site", timevar="obsNum", direction="wide")
# Site date.1 date.2 date.3
# 1 1 12/01/2000 24/02/2000 13/08/2001
# 4 2 14/01/2000 21/01/2002 <NA>
# 6 3 1/01/1999 21/04/2000 <NA>
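If only the intermediate result from the question is needed (a visit column numbered by date within each Site), a minimal base R sketch might look like the one below. It assumes the dates are day/month/year strings as in the question; the helper variable parsed is just an illustrative name. Unlike the rle() approach, it sorts the rows first, so it does not rely on the input already being in order.

```r
# Example data from the question (dates kept as day/month/year strings)
df <- data.frame(
  Site = c(1, 1, 1, 2, 2, 3, 3),
  date = c("12/01/2000", "24/02/2000", "13/08/2001",
           "14/01/2000", "21/01/2002", "1/01/1999", "21/04/2000"),
  stringsAsFactors = FALSE
)

# Parse the strings so that sorting is chronological, not alphabetical
parsed <- as.Date(df$date, format = "%d/%m/%Y")

# Sort by site, then by date within site
df <- df[order(df$Site, parsed), ]

# Number the rows within each site: 1, 2, 3, ...
df$visit <- ave(seq_along(df$Site), df$Site, FUN = seq_along)

print(df)
```

The visit column can then be passed to reshape() as timevar in place of obsNum.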
Other tips
Here is another solution with ddply and dcast.
library(plyr)      # provides ddply
library(reshape2)  # provides dcast
# Convert the date column into actual dates
df$date <- as.Date(df$date, format="%d/%m/%Y")
# Ensure that the data.frame is sorted
df <- df[ order(df$Site, df$date), ]
# Number the visits for each site
df$visit <- 1
d <- ddply(df, "Site", transform, visit=cumsum(visit))
# Convert to a wide format
# (Since dcast forgets the Date type, convert it to strings
# before and back to dates after.)
d$date <- as.character(d$date)
d <- dcast(d, Site ~ visit, value.var="date")
d[,-1] <- lapply(d[,-1], as.Date)
d
Here is another take on the solution using plyr and reshape2.
require(plyr); require(reshape2); require(lubridate)
df <- ddply(df, .(Site), transform, visit = rank(dmy(date)))
dcast(df, Site ~ visit, value.var = 'date')
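One thing to watch with rank(): its default ties.method is "average", so if a site were ever visited twice on the same date, the tied rows would get fractional visit numbers. The question's data has no such duplicates, so this is only a precaution; the small vector below is hypothetical and just illustrates the difference.

```r
# Hypothetical dates: the first two visits fall on the same day
dates <- as.Date(c("12/01/2000", "12/01/2000", "24/02/2000"),
                 format = "%d/%m/%Y")

# Default behaviour: tied values share the average of their ranks
avg_rank <- rank(dates)
print(avg_rank)    # 1.5 1.5 3.0

# ties.method = "first" breaks ties by order of appearance,
# which gives whole-number visit numbers
first_rank <- rank(dates, ties.method = "first")
print(first_rank)  # 1 2 3
```

Passing ties.method = "first" inside the ddply() call above would keep the visit numbers as whole numbers in that situation.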