Question

I have about 500,000 points in R of occurrence data of a migratory bird species throughout the US.

I am attempting to overlay a grid on these points, and then count the number of occurrences in each grid. Once the counts have been tallied, I then want to reference them to a grid cell ID.

In R, I've used the over() function to just get the points within the range map, which is a shapefile.

#Read in occurrence data
data=read.csv("data.csv", header=TRUE)
coordinates(data)=c("LONGITUDE","LATITUDE")

#Get shapefile of the species' range map
range=readOGR(".",layer="data")

proj4string(data)=proj4string(range)

#Get points within the range map
inside.range=!is.na(over(data,as(range,"SpatialPolygons")))

The above worked exactly as I hoped, but does not address my current problem: how to deal with points that are the type SpatialPointsDataFrame, and a grid that is a raster. Would you recommend polygonizing the raster grid, and using the same method I indicated above? Or would another process be more efficient?

Was it helpful?

Solution

First of all, your R code doesn't work as written. I would suggest copy-pasting it into a clean session, and if it errors out for you as well, correcting syntax errors or including add-on libraries until it runs.

That said, I assume that you are supposed to end up with a data.frame of two-dimensional numeric coordinates. So, for the purposes of binning and counting them, any such data will do, so I took the liberty of simulating such a dataset. Please correct me if this doesn't capture a relevant aspect of your data.

## Skip this line if you are the OP, and substitute the real data instead.
data<-data.frame(LATITUDE=runif(100,1,100),LONGITUDE=runif(100,1,100));

## Add the latitudes and longitudes between which each observation is located
## You can substitute any number of breaks you want. Or, a vector of fixed cutpoints
## LATgrid and LONgrid are going to be factors. With ugly level names.
data$LATgrid<-cut(data$LATITUDE,breaks=10,include.lowest=T);
data$LONgrid<-cut(data$LONGITUDE,breaks=10,include.lowest=T);

## Create a single factor that gives the lat,long of each observation. 
data$IDgrid<-with(data,interaction(LATgrid,LONgrid));

## Now, create another factor based on the above one, with shorter IDs and no empty levels
data$IDNgrid<-factor(data$IDgrid); 
levels(data$IDNgrid)<-seq_along(levels(data$IDNgrid));

## If you want total grid-cell count repeated for each observation falling into that grid cell, do this:
data$count<- ave(data$LATITUDE,data$IDNgrid,FUN=length);
## You could have also used data$LONGITUDE, doesn't matter in this case

## If you want just a table of counts at each grid-cell, do this:
aggregate(data$LATITUDE,data[,c('LATgrid','LONgrid','IDNgrid')],FUN=length);
## I included the LATgrid and LONgrid vectors so there would be some 
## sort of descriptive reference accompanying the anonymous numbers in IDNgrid,
## but only IDNgrid is actually necessary

## If you want a really minimalist table, you could do this:
table(data$IDNgrid);
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top