Question

There are at least two sparse matrix packages for R. I'm looking into these because I'm working with datasets that are too big and sparse to fit in memory with a dense representation. I want basic linear algebra routines, plus the ability to easily write C code to operate on them. Which library is the most mature and best to use?

So far I've found

  • Matrix which has many reverse dependencies, implying it's the most used one.
  • SparseM which doesn't have as many reverse deps.
  • Various graph libraries probably have their own (implicit) versions of this; e.g. igraph and network (the latter is part of statnet). These are too specialized for my needs.

Anyone have experience with this?

From searching around RSeek.org a little bit, the Matrix package seems the most commonly mentioned one. I often think of CRAN Task Views as fairly authoritative, and the Multivariate Task View mentions Matrix and SparseM.

Solution

Matrix is the most widely used, and it has just been accepted into the standard R installation as a recommended package (as of R 2.9.0), so it should be broadly available.

Matrix in base: https://stat.ethz.ch/pipermail/r-announce/2009/000499.html
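
For a feel of the basic linear algebra routines, here is a minimal sketch using Matrix. The matrix A and vector b are made-up illustration data, not anything from the question, and this only touches a few common operations.

```r
library(Matrix)

# Build a 5 x 5 sparse matrix from (row, column, value) triplets.
# Only the six listed entries are stored; everything else is an implicit zero.
A <- sparseMatrix(i = c(1, 2, 3, 4, 5, 1),
                  j = c(1, 2, 3, 4, 5, 5),
                  x = c(2, 3, 4, 5, 6, 1),
                  dims = c(5, 5))

b <- c(1, 2, 3, 4, 5)

y  <- A %*% b       # sparse matrix times dense vector
At <- t(A)          # the transpose is still a sparse matrix
G  <- crossprod(A)  # t(A) %*% A without forming a dense matrix
z  <- solve(A, b)   # solve A z = b with a sparse solver
```

The results of %*% and solve() come back as Matrix objects rather than plain vectors, so wrap them in as.numeric() if you need an ordinary R vector.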

OTHER TIPS

In my experience, Matrix is the best supported and most mature of the packages you mention. Its C architecture should also be fairly well-exposed and relatively straightforward to work with.

Calling log(x) on a sparse matrix is a bad idea: log(0) is not finite (R returns -Inf), and because most elements of a sparse matrix are zero, the result would be almost entirely -Inf and no longer sparse.

If you only want the log of the non-zero elements, try converting to a triplet sparse representation and taking the log of the stored values.
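
A rough sketch of that idea with the Matrix package, assuming all stored values are strictly positive (an explicitly stored zero would still turn into -Inf). The matrix m is made-up example data.

```r
library(Matrix)

# Small sparse matrix with three stored, strictly positive entries.
m <- sparseMatrix(i = c(1, 3, 4),
                  j = c(2, 1, 4),
                  x = c(2, 8, 32),
                  dims = c(4, 4))

# Convert to triplet form: slots i and j hold the (0-based) indices,
# slot x holds the stored values.
tm <- as(m, "TsparseMatrix")

# Take the log of the stored values only; implicit zeros are left untouched,
# so the result stays sparse.
tm@x <- log(tm@x)
```

The same x slot also exists on the compressed column classes, so the conversion is mostly there to make the (i, j, x) structure explicit.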

Licensed under: CC-BY-SA with attribution