How to map matrix row names from another file and rewrite them on the same matrix in R?

StackOverflow https://stackoverflow.com/questions/23524415

  •  17-07-2023

Question

I have a very big matrix, and I want to map its row names using another file.

Basically I have two files. File A contains names and their IDs. File B is the matrix, and its row names come from the names in File A. What I need to do is read the row names of the matrix in File B, find the ID associated with each one in File A, and replace the row name with that ID.

Does anyone know how to implement this in R?

Here is the structure of my files:

File A:

    Names                                                                               IDs
    unc.edu.3bdcdadf-67da-4a50-b311-81196c0c8362.1162097.rsem.genes.results             TCGA-B6-A0WW-01A-11R-A109-07
    unc.edu.3bdcdadf-67da-4a50-b311-81196c0c8362.1162128.rsem.genes.normalized_results  TCGA-B6-A0WW-01A-11R-A109-07
    unc.edu.3c1b6647-26bb-4110-aaea-f542024e8bf3.1989626.rsem.genes.results             TCGA-AQ-A54O-01A-11R-A266-07

and File B:

    > rownames(mymatirx)
    unc.edu.3bdcdadf-67da-4a50-b311-81196c0c8362.1162097.rsem.genes.results
    unc.edu.3c1b6647-26bb-4110-aaea-f542024e8bf3.1989626.rsem.genes.results

Expected output:

File B:

    > rownames(mymatirx)
    TCGA-B6-A0WW-01A-11R
    TCGA-AQ-A54O-01A-11R

I only need to keep the part of each ID before the fifth hyphen and drop the rest. In our case the matched IDs are:

    TCGA-B6-A0WW-01A-11R-A109-07
    TCGA-AQ-A54O-01A-11R-A266-07

and we keep only:

    TCGA-B6-A0WW-01A-11R
    TCGA-AQ-A54O-01A-11R

and drop:

    -A109-07
    -A266-07

Solution

This seems to work.

rownames(A) <- A$Names
rownames(B) <- A[rownames(B), ]$IDs   # note the comma: index rows of A, not columns
rownames(B) <- gsub("\\-[^-]+\\-[^-]+$", "", rownames(B))
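  • The first line sets the row names of A to the values in A$Names. We need this so that A can be indexed by name.
  • The second line uses the row names of B as a row index into A and replaces them with the corresponding values in A$IDs.
  • The third line removes everything from the second-to-last hyphen onward from the new row names of B. For IDs with seven hyphen-separated fields, as in the question, this leaves the first five.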
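For anyone who wants to try the idea on a small case first, here is a minimal, self-contained sketch of the same lookup. It is a variation on the three lines above, not the same code: it uses match() instead of row-name indexing, and it keeps the first five hyphen-separated fields explicitly rather than stripping the last two. The data frame A and matrix B are toy stand-ins built from the values in the question; the column names gene1 and gene2 and the numbers in B are made up for illustration.

```r
# Toy lookup table standing in for File A (values copied from the question).
A <- data.frame(
  Names = c(
    "unc.edu.3bdcdadf-67da-4a50-b311-81196c0c8362.1162097.rsem.genes.results",
    "unc.edu.3bdcdadf-67da-4a50-b311-81196c0c8362.1162128.rsem.genes.normalized_results",
    "unc.edu.3c1b6647-26bb-4110-aaea-f542024e8bf3.1989626.rsem.genes.results"
  ),
  IDs = c(
    "TCGA-B6-A0WW-01A-11R-A109-07",
    "TCGA-B6-A0WW-01A-11R-A109-07",
    "TCGA-AQ-A54O-01A-11R-A266-07"
  ),
  stringsAsFactors = FALSE
)

# Toy matrix standing in for File B; the numeric values are hypothetical.
B <- matrix(1:4, nrow = 2,
            dimnames = list(A$Names[c(1, 3)], c("gene1", "gene2")))

# Position of each row name of B within A$Names (NA if a name is absent).
idx <- match(rownames(B), A$Names)
stopifnot(!anyNA(idx))  # stop early if any row name has no entry in A

# Keep the first five hyphen-separated fields of each matched ID.
ids <- A$IDs[idx]
short_ids <- vapply(strsplit(ids, "-", fixed = TRUE),
                    function(p) paste(p[1:5], collapse = "-"),
                    character(1))

rownames(B) <- short_ids
print(rownames(B))
```

With the toy data this prints TCGA-B6-A0WW-01A-11R and TCGA-AQ-A54O-01A-11R. The stopifnot() check is a design choice: with a very big matrix it is easy to miss a row name that has no match in File A, so failing early is safer than silently ending up with NA row names.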
Licensed under: CC-BY-SA with attribution