
I have a data set that looks like this:

structure(list(A = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = c("1", 
"2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", 
"14", "15", "16", "17", "18", "19", "20", "21", "22", "23", "24", 
"25"), class = "factor"), T = c(0.04, 0.08, 0.12, 0.16, 0.2, 
0.24), X = c(464.4, 464.4, 464.4, 464.4, 464.4, 464.4), Y = c(418.5, 
418.5, 418.5, 418.5, 418.5, 418.5), V = c(0, 0, 0, 0, 0, 0), 
    GD = c(0, 0, 0, 0, 0, 0), ND = c(NA, 0, 0, 0, 0, 0), ND2 = c(NA, 
    0, 0, 0, 0, 0), TID = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = c("t1", 
    "t10", "t11", "t12", "t13", "t14", "t15", "t16", "t17", "t18", 
    "t19", "t2", "t20", "t21", "t22", "t23", "t24", "t25", "t3", 
    "t4", "t5", "t6", "t7", "t8", "t9"), class = "factor")), .Names = c("A", 
"T", "X", "Y", "V", "GD", "ND", "ND2", "TID"), row.names = c(NA, 
6L), class = "data.frame")

I want to select the first 80 observations of all variables for each TID. So far, I can do this with the first TID only using the code:

sub.data1<-NM[1:80, ]

How can I do it for all my other TIDs?



Solution 2

Using the function ddply() from the plyr package, you can split the data by TID, select the first 80 rows of each piece with head(), and then combine everything back into one data frame:

library(plyr)
ddply(NM, .(TID), head, n = 80)


I would do:

lapply(split(dat, dat$TID), head, 80)

It returns a list of data.frames, each with 80 (or fewer) rows. If you instead want everything in one data.frame:

do.call(rbind, lapply(split(dat, dat$TID), head, 80))
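A minimal sketch of the split/head/rbind idea on a toy data frame may make the behaviour easier to see. The names toy, n_keep, per_tid and combined and all values are made up for illustration; with the real data you would use your own data frame and n_keep <- 80.

```r
# Toy data: three TIDs with 5, 2 and 4 rows (hypothetical values)
toy <- data.frame(
  TID = rep(c("t1", "t2", "t3"), times = c(5, 2, 4)),
  T   = seq(0.04, by = 0.04, length.out = 11)
)

n_keep <- 3  # use 80 for the real data

# One data frame per TID, each cut down to at most n_keep rows
per_tid <- lapply(split(toy, toy$TID), head, n_keep)

# Stack the pieces back into a single data frame
combined <- do.call(rbind, per_tid)
rownames(combined) <- NULL  # drop the "t1.1"-style row names that rbind creates

# Number of rows kept per TID
table(combined$TID)
```

Note that head() simply returns the whole group when it has fewer than n_keep rows, so short groups such as t2 here are kept in full rather than padded or dropped.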

Using data.table, I made a shorter example with just TIDs t1 and t2 that returns the first 2 rows of each. It can be adjusted for your data.

data<-structure(list(A = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = c("1", 
                "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", 
                "14", "15", "16", "17", "18", "19", "20", "21", "22", "23", "24", 
                "25"), class = "factor"), T = c(0.04, 0.08, 0.12, 0.16, 0.2, 
                0.24), X = c(464.4, 464.4, 464.4, 464.4, 464.4, 464.4), Y = c(418.5, 
                        418.5, 418.5, 418.5, 418.5, 418.5), V = c(0, 0, 0, 0, 0, 0), 
                GD = c(0, 0, 0, 0, 0, 0), ND = c(NA, 0, 0, 0, 0, 0), ND2 = c(NA, 
                        0, 0, 0, 0, 0), TID = c("t1","t1","t1","t2","t2","t2")), .Names = c("A", 
                "T", "X", "Y", "V", "GD", "ND", "ND2", "TID"), row.names = c(NA, 
                6L), class = "data.frame")
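The grouping call itself does not appear in this answer. A sketch of what it presumably looked like is given here, assuming the data.table package is installed. The data frame is a compact stand-in for the six-row object defined above, and the name first_two is made up for illustration.

```r
library(data.table)

# Compact stand-in for the six-row example data (same values as above)
data <- data.frame(
  A   = factor(rep("1", 6)),
  T   = c(0.04, 0.08, 0.12, 0.16, 0.2, 0.24),
  X   = 464.4, Y = 418.5, V = 0, GD = 0,
  ND  = c(NA, 0, 0, 0, 0, 0),
  ND2 = c(NA, 0, 0, 0, 0, 0),
  TID = rep(c("t1", "t2"), each = 3)
)

# Group by TID and keep the first 2 rows of each group;
# .SD is the subset of the data for the current group
first_two <- as.data.table(data)[, head(.SD, 2), by = TID]
first_two
```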

This results in:

   TID A    T     X     Y V GD ND ND2
1:  t1 1 0.04 464.4 418.5 0  0 NA  NA
2:  t1 1 0.08 464.4 418.5 0  0  0   0
3:  t2 1 0.16 464.4 418.5 0  0  0   0
4:  t2 1 0.20 464.4 418.5 0  0  0   0

and can be changed back to a data frame if desired by wrapping the last line in data.frame(), e.g. data.frame(data.table(data)[,head(.SD,2),by=TID])

Here is another solution in base R:

do.call(rbind, by(NM, NM$TID, head, 80))
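As a rough illustration of what by() hands back, here is a small sketch on toy data. The names toy, pieces and first_rows are hypothetical, and 2 stands in for 80.

```r
# Toy data: two TIDs with 4 and 3 rows (hypothetical values)
toy <- data.frame(
  TID = rep(c("t1", "t2"), times = c(4, 3)),
  X   = 1:7
)

# by() splits toy by TID and applies head() to each piece
pieces <- by(toy, toy$TID, head, 2)

# The pieces form a list-like "by" object; rbind them into one data frame
first_rows <- do.call(rbind, pieces)
first_rows
```

On its own, by() returns one piece per TID, so the do.call(rbind, ...) step is what turns the result into a single data frame.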
Licensed under: CC-BY-SA with attribution