Find the subset sorted according to absolute difference between rows in specific columns of a matrix

StackOverflow https://stackoverflow.com/questions/20772246

Question

Please help me with the following code.

set.seed(5)
matrix <- matrix(round(rnorm(8, 100, 50)), nrow = 4, ncol = 2, byrow = TRUE,
                 dimnames = list(c("r1", "r2", "r3", "r4"), c("c1", "c2")))

I need the rows of the above matrix arranged by the absolute difference between row r1 and each of the other rows in column c1. If I could sort the rows by that difference in increasing order, that would also be useful. From there I can find the rows with the minimum difference values.

Input matrix

   c1 c2
r1 10  4
r2  6 11
r3  9 17
r4 21 91

Output Matrix

   c1 c2
r1 10  4
r2  9 17
r3  6 11
r4 21 91

Row r1 remains first as the reference. Rows r2 to r4 are sorted by increasing difference from row r1 in column c1. Any help or clues appreciated.

Solution

First, you can calculate the absolute differences between row 1 and every row (in columns 3 and 4) with the following command:

differences <- abs(t(t(matrix[ , 3:4]) - matrix[1, 3:4])) 

#     c3 c4
# r1   0  0
# r2  39 36
# r3 124 44
# r4   9 11
# r5  75 17
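
The double transpose is needed because R recycles a vector down the columns of a matrix, not across its rows. The following is a minimal sketch on a small made-up matrix (the names x, ref, wrong, right and same are illustrative, not from the question), with sweep() shown as an equivalent way to subtract a reference row:

```r
# Toy 3 x 2 matrix; values and names are assumptions for illustration only.
x <- matrix(1:6, nrow = 3, dimnames = list(NULL, c("a", "b")))
ref <- x[1, ]               # reference row: a = 1, b = 4

wrong <- x - ref            # ref is recycled down the columns, mixing a and b
right <- t(t(x) - ref)      # transpose first, so ref lines up with each row
same  <- sweep(x, 2, ref)   # sweep() subtracts ref from every row directly

print(right)
```

Here right and same both hold each row minus the first row, while wrong does not.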

Now you can order the rows by these differences, using the first column (c3) as the primary key and the second column (c4) to break ties. This ordering is then used to index your original matrix:

matrix[order(differences[ , 1], differences[ , 2]), ]

#     c1  c2  c3  c4
# r1  58 169  37 104
# r4  46  92  46  93
# r2 186  70  76  68
# r5  70  -9 112  87
# r3  86 107 161  60

Update based on the new example in the question:

differences <- abs(t(t(matrix[ , ]) - matrix[1, ])) 

#    c1 c2
# r1  0  0
# r2  4  7
# r3  1 13
# r4 11 87

matrix[order(differences[ , 1], differences[ , 2]), ]

#    c1 c2
# r1 10  4
# r3  9 17
# r2  6 11
# r4 21 91
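
If row r1 should stay on top no matter what, and only column c1 matters (as the question asks), one possible variation is sketched below. It rebuilds the example input by hand and uses the hypothetical name m so that base::matrix() is not masked; rest and sorted are likewise illustrative names.

```r
# Example input from the question, entered by hand.
m <- matrix(c(10, 4, 6, 11, 9, 17, 21, 91), nrow = 4, ncol = 2, byrow = TRUE,
            dimnames = list(c("r1", "r2", "r3", "r4"), c("c1", "c2")))

# Absolute difference of every row from row r1, column by column.
differences <- abs(sweep(m, 2, m["r1", ]))

# Pin r1 in first place, then order the remaining rows by their c1 difference.
rest <- setdiff(rownames(m), "r1")
sorted <- m[c("r1", rest[order(differences[rest, "c1"])]), ]

print(sorted)
```

Sorting only the non-reference rows means r1 stays first even if another row happens to have a difference of zero.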

Other tips

Assuming c3 and c4 are columns 3 and 4, use apply to compute the sum of absolute differences between row 1 and each row. Inside the function passed to apply, r is the current row as a vector:

> apply(matrix,1,function(r){sum(abs(r[3:4]-matrix[1,3:4]))})
 r1  r2  r3  r4  r5 
  0  75 168  20  92 

The first one is zero, which is good because that's the sum of absolute differences of row 1 with itself. So proceed:

> diffs = apply(matrix,1,function(r){sum(abs(r[3:4]-matrix[1,3:4]))})
> diffs
 r1  r2  r3  r4  r5 
  0  75 168  20  92 

To find the index of the smallest, ignoring the zero, take the first element off, find the first one that equals the minimum (this returns only one index if there are ties) and add one back:

> 1+which(diffs[-1]==min(diffs[-1]))[1]
r4 
 4 
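
As a possible shorthand for the same steps, rowSums() and which.min() can replace the apply call and the comparison against min(). This is a sketch, not a change to the method: it rebuilds the 5 x 4 matrix by hand from the values printed in this answer, and the names m, diffs and closest are illustrative.

```r
# The 5 x 4 matrix as printed in this answer, entered by hand.
m <- matrix(c( 58, 169,  37, 104,
              186,  70,  76,  68,
               86, 107, 161,  60,
               46,  92,  46,  93,
               70,  -9, 112,  87),
            nrow = 5, byrow = TRUE,
            dimnames = list(paste0("r", 1:5), paste0("c", 1:4)))

# Sum of absolute differences from row 1 over columns 3 and 4.
diffs <- rowSums(abs(sweep(m[, 3:4], 2, m[1, 3:4])))

# which.min() returns the first minimum, so ties resolve as in the code above.
closest <- 1 + which.min(diffs[-1])

print(closest)
```

The result is again index 4, named r4.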

To reorder your matrix by increasing sum of absolute differences:

> order(diffs)
[1] 1 4 2 5 3
> matrix[order(diffs),]
    c1  c2  c3  c4
r1  58 169  37 104
r4  46  92  46  93
r2 186  70  76  68
r5  70  -9 112  87
r3  86 107 161  60
License: CC-BY-SA with attribution
Not affiliated with StackOverflow