Question

I have a table (data.frame) with numeric columns and factor columns, several of which hold character codes (e.g. 'species', 'Fam_name', 'gear'). I want to calculate the subtotals (sum) of the 'weight' and 'number' variables for each 'ss'.

I have tried using the 'aggregate' function, but I have failed to get it to return the character value for the 'gear' variable.

Below is the head of my table

   survey station         ss species weight number bdep      lon      lat                       Sci_name       Fam_name gear
1 2012901       1 2012901001 CARSC04  11.20     20   23 37.61650 19.14900        Scomberoides lysan     CARANGIDAE   TB
2 2012901       1 2012901001 SCMGR02   0.98      2   23 37.61650 19.14900 Grammatorcynus bilineatus     SCOMBRIDAE   TB
3 2012901       2 2012901002 NOCATCH   0.00      0    6 38.48333 18.71667                  NO CATCH       NO CATCH   TB
4 2012901       3 2012901003 LUTLU06   5.65      1    6 38.48333 18.71667            Lutjanus bohar     LUTJANIDAE   TB
5 2012901       3 2012901003 SHACAB1   4.00      1    6 38.48333 18.71667         Triaenodon obesus CARCHARHINIDAE   TB
6 2012901       4 2012901004 NOCATCH   0.00      0    9 38.48333 18.71667                  NO CATCH       NO CATCH   TB

I tried the following code, intending to bind the two results together:

catch1<-aggregate(cbind(weight, number) ~ ss, data = catch, FUN = sum) 

catch2<-aggregate(cbind(survey, station, bdep, lon, lat, gear) ~ ss, data = catch, FUN=median) 

The first line does what I want: it sums 'weight' and 'number' for each 'ss'. The second, however, returns a numeric median for 'gear', whereas I want it to return the 'gear' code for that particular 'ss'.

Reconstruction of the 'gear' factor (thanks to BrodieG):

catch2$gear <- factor(levels(catch$gear)[catch2$gear], levels=levels(catch$gear))

Problem solved :-)


Solution

Your problem is that gear is a factor. Inside cbind() it is reduced to its underlying integer codes, so median returns the median of those codes rather than a level label. Try:

catch2$gear <- factor(levels(catch$gear)[catch2$gear], levels=levels(catch$gear))

or something like it to reconstruct the factor for catch2. This assumes each ss has only one gear, so that the median is itself a valid level code.
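
The whole round trip can be sketched on a small toy table. This is only an illustration under assumptions: the values are made up (loosely modelled on the question's table), the gear code "HL" is hypothetical, and each ss is assumed to have a single gear so the median of the codes is a whole number.

```r
# Toy data; values and the "HL" gear code are hypothetical
catch <- data.frame(
  ss     = c(1, 1, 2, 3, 3),
  weight = c(11.20, 0.98, 0.00, 5.65, 4.00),
  number = c(20, 2, 0, 1, 1),
  bdep   = c(23, 23, 6, 6, 6),
  gear   = factor(c("TB", "TB", "HL", "TB", "TB"))
)

# Subtotals of weight and number for each ss
catch1 <- aggregate(cbind(weight, number) ~ ss, data = catch, FUN = sum)

# cbind() turns the gear factor into its integer codes, so median works on codes
catch2 <- aggregate(cbind(bdep, gear) ~ ss, data = catch, FUN = median)

# Map the integer codes back to the original level labels
catch2$gear <- factor(levels(catch$gear)[catch2$gear],
                      levels = levels(catch$gear))

# Combine the sums and the per-ss descriptors into one table
result <- merge(catch1, catch2, by = "ss")
result
```

Merging on ss rather than binding columns side by side avoids relying on both aggregates returning rows in the same order.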

OTHER TIPS

I assumed there could be more than one kind of gear for a given ss. In that case the problem boils down to finding the median (or mode) of a character variable. Here is code to find the mode of a character variable (here gear); when several values tie, all of them are returned.

catch <- read.table(text = '
         ss  gear
          1    AA
          1    AA
          1    BB
          1    BB
          2    CC
          2    CC
          2    CC
          3    BB
          4    AA
          4    CC
', header = TRUE)

gear.mode <- tapply(catch$gear, catch$ss, function(x) { y = table(x) ; names(y)[y==max(y)] })
gear.mode <- as.data.frame(gear.mode)
gear.mode

  gear.mode
1    AA, BB
2        CC
3        BB
4    AA, CC

You can also do this with aggregate:

aggregate(gear ~ ss, data = catch, FUN = function (x) {
   y = table(x) ; names(y)[y==max(y)] 
})

  ss   gear
1  1 AA, BB
2  2     CC
3  3     BB
4  4 AA, CC
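
If a single value per ss is needed even when there are ties, one possible variation is to keep only the first mode. This is a sketch, not part of the original answer: first_mode is a hypothetical helper name, and "first" here means first in the sorted order that table() produces, not first in order of appearance.

```r
# Same toy data as above, built directly as a data frame
catch <- data.frame(
  ss   = c(1, 1, 1, 1, 2, 2, 2, 3, 4, 4),
  gear = c("AA", "AA", "BB", "BB", "CC", "CC", "CC", "BB", "AA", "CC"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: return one mode, taking the first on ties
first_mode <- function(x) {
  counts <- table(x)                 # frequencies, names in sorted order
  names(counts)[which.max(counts)]   # which.max() picks the first maximum
}

res <- aggregate(gear ~ ss, data = catch, FUN = first_mode)
res
```

Because every group now yields exactly one string, the gear column comes back as a plain vector instead of a list, which is easier to merge with other per-ss summaries.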
Licensed under: CC-BY-SA with attribution