tapply like issue, but require dataframe output - R

https://stackoverflow.com/questions/8919959

30-10-2019
|

Pergunta

This is my first post, so hopefully I explain what I need to do properly. I am still quite new to R and I may have read posts that answer this, but I just can't for the life of me understand what they mean. So apologies in advance if this has already been answered.

I have a very large data set of GPS locations from radiocollars and there are inconsistent numbers of locations for each day. I want to go through the dataset and select a single data point for each day based on the accuracy level of the GPS signal.

So it essentially looks like this.

Accuracy    Month    Day    Easting    Northing    Etc
   5          6       1     #######    ########     #
   3.2        6       1     #######    ########     #
   3.8        6       1     #######    ########     #
   1.6        6       2     #######    ########     #
   4          6       3     #######    ########     #
   3.2        6       3     #######    ########     #

And I want to pull out the most accurate point for each day (the lowest accuracy measure) while keeping the rest of the associated data.

Currently I have been using the tapply function

datasub1<-subset(data,MONTH==6)
tapply(datasub1$accuracy, datasub1$day, min)

Using this method I can successfully retrieve the minimum values, one for each day, however I cannot take the associated coordinates and timing, and all the other important information along with it, and as the data set is nearly 300 000 rows, I really can't do it by hand.

So essentially, I need to get the same results as the tapply, but instead of individual points, I need the entire row that that point is found in.

Thanks in advance to anyone that could lend a hand. If you need any more information, let me know, I'll try my best to get it to you.

Nenhuma solução correta

Licenciado em: CC-BY-SA com atribuição

Não afiliado a StackOverflow