Question

I have a data frame that is 196 rows long, and I would like to divide it into 12 groups (as evenly as possible). The only way to do this is with 4 groups of 17 rows and 8 groups of 16 rows.

However, the only example I can find online works only for data frames whose row count is an exact multiple of the number of groups:

d <- split(dataFrame,rep(1:12,each=16))

This would split a 192-row dataFrame into 12 groups of 16 consecutive rows. However, when I try to apply this to a 196-row dataFrame, I get a warning:

Warning message:
In split.default(x = seq_len(nrow(x)), f = f, drop = drop, ...) :
  data length is not a multiple of split variable

I understand the warning message; I just don't know how to indicate that I would like any remainder to be divided across the groups as evenly as possible.

Solution

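I think you're asking how to handle a number of groups that does not evenly divide the number of rows in the data frame. The cut function handles this well: given a number of breaks, it divides the range of its input into that many equal-width intervals, so you can apply it to the row numbers of your data frame: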

dat <- data.frame(x=1:196)
spl <- split(dat, cut(seq_len(nrow(dat)), 12))
str(spl)
# List of 12
#  $ (0.805,17.1]:'data.frame': 17 obs. of  1 variable:
#   ..$ x: int [1:17] 1 2 3 4 5 6 7 8 9 10 ...
#  $ (17.1,33.4] :'data.frame': 16 obs. of  1 variable:
#   ..$ x: int [1:16] 18 19 20 21 22 23 24 25 26 27 ...
#  $ (33.4,49.7] :'data.frame': 16 obs. of  1 variable:
#   ..$ x: int [1:16] 34 35 36 37 38 39 40 41 42 43 ...
#  $ (49.7,65.9] :'data.frame': 16 obs. of  1 variable:
#   ..$ x: int [1:16] 50 51 52 53 54 55 56 57 58 59 ...
#  $ (65.9,82.2] :'data.frame': 17 obs. of  1 variable:
#   ..$ x: int [1:17] 66 67 68 69 70 71 72 73 74 75 ...
#  $ (82.2,98.5] :'data.frame': 16 obs. of  1 variable:
#   ..$ x: int [1:16] 83 84 85 86 87 88 89 90 91 92 ...
#  $ (98.5,115]  :'data.frame': 16 obs. of  1 variable:
#   ..$ x: int [1:16] 99 100 101 102 103 104 105 106 107 108 ...
#  $ (115,131]   :'data.frame': 17 obs. of  1 variable:
#   ..$ x: int [1:17] 115 116 117 118 119 120 121 122 123 124 ...
#  $ (131,147]   :'data.frame': 16 obs. of  1 variable:
#   ..$ x: int [1:16] 132 133 134 135 136 137 138 139 140 141 ...
#  $ (147,164]   :'data.frame': 16 obs. of  1 variable:
#   ..$ x: int [1:16] 148 149 150 151 152 153 154 155 156 157 ...
#  $ (164,180]   :'data.frame': 16 obs. of  1 variable:
#   ..$ x: int [1:16] 164 165 166 167 168 169 170 171 172 173 ...
#  $ (180,196]   :'data.frame': 17 obs. of  1 variable:
#   ..$ x: int [1:17] 180 181 182 183 184 185 186 187 188 189 ...

As can be seen from the str output, four of the groups have 17 observations and the remaining eight groups have 16 observations.
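
If it helps, the group sizes can be checked more compactly than by reading the str output. The sketch below first repeats the cut approach with labels = FALSE, so the groups are named 1 to 12 instead of by interval, and then shows one possible alternative that builds the sizes explicitly so the larger groups come first. The names n, k, grp_cut, grp_rep, sizes, spl_cut and spl_rep are made up for illustration and are not part of the original answer.

```r
# Sketch only: all variable names here are illustrative assumptions.
dat <- data.frame(x = 1:196)
n <- nrow(dat)  # number of rows
k <- 12         # number of groups wanted

# Same cut() approach, but with integer group codes instead of interval labels.
grp_cut <- cut(seq_len(n), k, labels = FALSE)
spl_cut <- split(dat, grp_cut)
table(vapply(spl_cut, nrow, integer(1)))  # tabulates how many groups have each size

# Alternative: compute the sizes directly. Every group gets n %/% k rows,
# and the first n %% k groups get one extra row each.
sizes <- rep(n %/% k, k) + (seq_len(k) <= n %% k)
grp_rep <- rep(seq_len(k), times = sizes)
spl_rep <- split(dat, grp_rep)
vapply(spl_rep, nrow, integer(1))  # group sizes in order
```

Both versions give four groups of 17 rows and eight groups of 16 rows. The difference is only in where the larger groups fall: cut spreads them through the sequence, while the rep version puts them at the start.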

Licensed under: CC-BY-SA with attribution