Question

I have a data frame, df, with 12 columns:

df<-read.table(header=T,text="V1    V2       V3         V4             V5 V6   V7       V8       V9    V10  V11 V12
 A01 10378809 10379882 Contig1401|m.3412 101 -  10378809 10379882 255,0,0  1 1073   0
 A01 10469105 10469293 Contig1755|m.4465  48  + 10469105 10469293 255,0,0  2  188   0
 A01 10469429 10469630 Contig1755|m.4465   5  + 10469429 10469630 255,0,0  NA  201  0")

First I want to group the rows by contig (column V4) and then generate the values for the 12th column shown in the desired output further down. I tried to do this with dplyr, but I get an error.

as.data.frame(df %.% group_by(V4) %.% summarise(V12=apply(df[2], 2, function(x)x-x[1])))

The error:

Error in summarise_impl(.data, named_dots(...), environment()) : attempt to use zero-length variable name.

Within each group, I want to subtract the first value of the 2nd column (V2) from every row's V2 value. I can do this easily if a group has only 2 rows (max - min), but if it has more than 2 rows, that approach misses the middle rows.

So I thought I would write my own function and use it inside dplyr, but it seems I cannot use my own function with dplyr.

Here is the final output I need:

   V1       V2       V3                V4  V5 V6       V7       V8      V9 V10  V11 V12
1 A01 10378809 10379882 Contig1401|m.3412 101  - 10378809 10379882 255,0,0   1 1073   0
2 A01 10469105 10469293 Contig1755|m.4465  48  + 10469105 10469293 255,0,0   2  188   0
3 A01 10469429 10469630 Contig1755|m.4465   5  + 10469429 10469630 255,0,0  NA  201 324

Solution

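I suppose you're looking for this. Use mutate() rather than summarise(): mutate() keeps every row, while summarise() collapses each group to a single row.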

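library(dplyr)
# %>% is used here because the older %.% operator has been deprecated since dplyr 0.2
df %>%
  group_by(V4) %>%              # one group per contig
  mutate(V12 = V2 - V2[1])      # each row's V2 minus the first V2 in its group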
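
For comparison, the same per-group offset can also be computed in base R with ave(), which is a handy way to check the dplyr result. This is only a minimal sketch: it re-reads the three rows from the question so that it runs on its own, and the helper name offset_from_first is hypothetical, introduced here purely for illustration.

```r
# Data from the question, re-read so the sketch is self-contained
df <- read.table(header = TRUE, text = "V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 V11 V12
A01 10378809 10379882 Contig1401|m.3412 101 - 10378809 10379882 255,0,0 1 1073 0
A01 10469105 10469293 Contig1755|m.4465 48 + 10469105 10469293 255,0,0 2 188 0
A01 10469429 10469630 Contig1755|m.4465 5 + 10469429 10469630 255,0,0 NA 201 0")

# Hypothetical helper: distance of each value from the first value of its group
offset_from_first <- function(x) x - x[1]

# ave() applies the helper within each level of V4 and keeps the original row order
df$V12 <- ave(df$V2, df$V4, FUN = offset_from_first)

df$V12  # 0, 0, 324
```

The same helper should also work inside the grouped dplyr pipeline, e.g. mutate(V12 = offset_from_first(V2)), so user-defined functions are not a problem as long as they return one value per row of the group.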
Licensed under: CC-BY-SA with attribution