replicate(3, sample(200, 50))
Where 200 is the number of rows in the data frame (adjust accordingly). More automagically, assuming the data are in object df:
replicate(3, sample(nrow(df), 50))
Here is an example:
set.seed(10)
df <- data.frame(x1 = rnorm(1000), x2 = rnorm(1000))
ind <- replicate(3, sample(nrow(df), 50))
head(ind)
> head(ind)
     [,1] [,2] [,3]
[1,]  380  220  702
[2,]   75  751  720
[3,]  775  278  153
[4,]  988  612  340
[5,]  282  568  925
[6,]  266  794  812
Each column of ind contains the row indices for one of the 3 subsets you want. You can then use these to index the original data frame, e.g.
df[ind[,1], "x2"]
> df[ind[,1], "x2"]
 [1]  0.57982435  0.27016645 -0.08435526  1.16768142  1.38124150  0.62444167
 [7] -0.54887437  1.91301831  1.84116197  0.94045377 -1.15417235 -0.06809104
[13] -2.03652525  1.06773801 -0.34235315 -0.24707548 -1.80470122  0.11993674
[19] -0.36358182  0.16819156 -1.84507669 -0.16707925 -1.80789383  0.78894210
[25] -0.05741295 -0.28905260  2.38724835  2.75762831 -0.18082554  1.61820620
[31] -0.48192569 -0.03298339  0.52087746  0.32774925  1.52103207 -0.15619668
[37] -0.49687983 -0.06623606  2.21855213 -0.48727519  1.01115806  0.25213485
[43]  1.01927105  0.31362619  0.40260968  0.26795767  0.01803656  0.19579576
[49] -0.26464131  0.48141105
where I take the first subset and only the variable x2.
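If you would rather have all three subsets as separate data frames, one possible sketch is to loop over the columns of the index matrix. The name subsets is just an illustrative choice, not something from the question; the example data are recreated so the snippet runs on its own.

```r
set.seed(10)
df <- data.frame(x1 = rnorm(1000), x2 = rnorm(1000))
ind <- replicate(3, sample(nrow(df), 50))

# Build one data frame per column of ind.
# 'subsets' is a hypothetical name used only for this illustration.
subsets <- lapply(seq_len(ncol(ind)), function(j) df[ind[, j], , drop = FALSE])

length(subsets)     # 3 subsets
dim(subsets[[1]])   # 50 rows, 2 columns
```

Each element of subsets keeps every variable of df; drop = FALSE guards against the result collapsing to a vector if df happened to have a single column.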
Note this assumes that you want to sample without replacement; in other words, each row in df can occur 0 or 1 times in a given subset, not multiple times. If you want the latter, see the replace argument in ?sample.
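As a rough sketch of the with-replacement variant, assuming the same example data as above, you could pass replace = TRUE through to sample(). The name ind_rep is only illustrative.

```r
set.seed(10)
df <- data.frame(x1 = rnorm(1000), x2 = rnorm(1000))

# Sample row indices WITH replacement, so a row may appear more than once
# within a subset. 'ind_rep' is a hypothetical name for this illustration.
ind_rep <- replicate(3, sample(nrow(df), 50, replace = TRUE))

dim(ind_rep)   # 50 rows, 3 columns

# Number of repeated row indices in each subset (can be zero by chance)
apply(ind_rep, 2, function(col) sum(duplicated(col)))
```

Indexing works exactly as before, e.g. df[ind_rep[, 1], "x2"]; repeated indices simply return the same row more than once.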