Question

I'm trying to print rows of a table based on the values found in certain columns.

1) If column V4 has a value of 0 and column V8 on the same line has a value greater than 30,

2) then print that line and the 2 lines below it,

3) but only if the same criteria are also met by the line three lines above (the previous line with a 0 in V4).

I can do steps 1 and 2 but I'm having trouble doing step 3. Here is the input:

> library(data.table)
> file
        V1   V2  V3 V4 V5  V6 V7  V8
   1:    0 -232 -77 -1  D dog  0   0
   2:    1 -231 -77  0  C cat  0  40
   3:    2 -230 -77  1  T tai  0   0
   4:    3 -229 -76 -1  F fis  0   0
   5:    4 -228 -76  0  G goo  0 100
  ---
1162: 1161  929 310 -1  S soo  0   0
1163: 1162  930 310  0  B bye  0   0
1164: 1163  931 310  1  G goo  0   0
1165: 1164  932 311 -1  T tuu  0   0
1166: 1165  933 311  0  R roo  0  50

If I then run the following code, I get output that is close but not quite right.

# split the data table into consecutive groups of 3 rows
file[, grp := rep(seq_len(ceiling(.N/3)), each = 3, length.out = .N)]
# keep only complete groups that contain a row with V4 == 0
# (any() collapses the per-group vector to a single TRUE/FALSE for if)
file = file[, if(.N == 3 && any(V4 == 0)) .SD, by = grp]

# generate the output formatted in the way I want
out <- file[, if(any(V4 == 0 & V8 > 30)) c(V1[1], V1[3], V2[1], V2[3], as.list(V4), as.list(V5), as.list(V6), as.list(V8)), by=grp]

In this step (see below), I want to add a criterion so that output is generated only if column V4 is 0, column V8 on the same line is greater than 30, and column V8 three lines above is also greater than 30.

file[, if(any(V4 == 0 & V8 > 30)) ...

Any ideas?

Solution

I would do this in three steps:

df <- read.table(textConnection("
V1   V2  V3 V4 V5      V6 V7 V8
 0 -232 -77 -1  D     dog  0  0
 1 -231 -77  0  C     cat  0  40
 2 -230 -77  1  T     tai  0  0
 3 -229 -76 -1  F     fis  0  0
 4 -228 -76  0  G     goo  0  100
 1161  929 310 -1  S     soo  0  0
 1162  930 310  0  B     bye  0  0"), header=TRUE)

# condition 1
a <- df$V4 == 0 & df$V8 > 30

# condition 3 (does the row 3 rows above fulfill condition 1?)
aIdx <- which(a)
b <- (aIdx-3) %in% aIdx
a[a] <- b

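# condition 2 (also select the two rows below each match)
# each = 3 pairs every match with the offsets 0, 1, 2; without it the
# offsets are recycled incorrectly when there is more than one match
i <- rep(which(a), each = 3) + 0:2
i <- i[i <= nrow(df)]  # drop indices that run past the last row
a[i] <- TRUE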

df[a, ]
#     V1   V2  V3 V4 V5  V6 V7  V8
# 5    4 -228 -76  0  G goo  0 100
# 6 1161  929 310 -1  S soo  0   0
# 7 1162  930 310  0  B bye  0   0
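
Since the question uses data.table, the same three conditions can also be sketched with `shift()`, which lags a column by a given number of rows. This is only a minimal sketch under some assumptions: it uses a reduced copy of the seven sample rows (columns V1, V4 and V8 only), and the helper columns `hit`, `start` and `keep` are names invented here for illustration.

```r
library(data.table)

# Reduced sample: the seven rows used above, only the columns that matter
dt <- data.table(
  V1 = c(0, 1, 2, 3, 4, 1161, 1162),
  V4 = c(-1, 0, 1, -1, 0, -1, 0),
  V8 = c(0, 40, 0, 0, 100, 0, 0)
)

# condition 1: V4 is 0 and V8 exceeds 30 on the same row
dt[, hit := V4 == 0 & V8 > 30]

# condition 3: the row three lines above must also satisfy condition 1
# (fill = FALSE covers the first three rows, which have no such row)
dt[, start := hit & shift(hit, n = 3, fill = FALSE)]

# condition 2: keep each qualifying row plus the two rows below it
dt[, keep := start |
             shift(start, n = 1, fill = FALSE) |
             shift(start, n = 2, fill = FALSE)]

result <- dt[keep == TRUE, .(V1, V4, V8)]
print(result)
```

On this sample the rows with V1 equal to 4, 1161 and 1162 are kept, matching the base R result above. Because the lagged columns have the same length as the table, no index can run past the last row.
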
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow