Large SpMat object with RcppArmadillo

https://stackoverflow.com/questions/16159174

11-04-2022
|

Question

I am trying to learn and use Rcpp and RcppArmadillo for the sparse linear algebra routines.

Code below is adaptation of the example here: http://gallery.rcpp.org/articles/armadillo-sparse-matrix/

code <- '
  S4 matx(x);
  IntegerVector Xd = matx.slot("Dim");
  IntegerVector Xi = matx.slot("i");
  IntegerVector Xp = matx.slot("p");
  NumericVector Xx = matx.slot("x");

  arma::sp_mat Xsp(Xd[0], Xd[1]);

  // create space for values, and copy
  arma::access::rw(Xsp.values) = arma::memory::acquire_chunked<double>(Xx.size() + 1);
  arma::arrayops::copy(arma::access::rwp(Xsp.values), 
             Xx.begin(), 
             Xx.size() + 1);

  // create space for row_indices, and copy -- so far in a lame loop
  arma::access::rw(Xsp.row_indices) = arma::memory::acquire_chunked<arma::uword>(Xx.size() + 1);
  for (int j=0; j<Xi.size(); j++)
    arma::access::rwp(Xsp.row_indices)[j] = Xi[j];

  // create space for col_ptrs, and copy -- so far in a lame loop
  arma::access::rw(Xsp.col_ptrs) = arma::memory::acquire_chunked<arma::uword>(Xp.size() + 1);
  for (int j=0; j<Xp.size(); j++)
      arma::access::rwp(Xsp.col_ptrs)[j] = Xp[j];

  // important: set the sentinel as well
  arma::access::rwp(Xsp.col_ptrs)[Xp.size()+1] = std::numeric_limits<arma::uword>::max();

  // set the number of non-zero elements
  arma::access::rw(Xsp.n_nonzero) = Xx.size();

  Rcout << "SpMat Xsp:\\n" << arma::dot(Xsp,Xsp) << std::endl;
'

norm2 <- cxxfunction(signature(x="Matrix"),
                     code,plugin="RcppArmadillo")

When I use a vector of 1e4, things work fine:

> p <- 10000
> X <- Matrix(rnorm(p),sparse=TRUE)
> norm2(X)
SpMat Xsp:
9997.14
NULL

However, when I use a vector of length 1e5, an error is produced

> p <- 100000
> X <- Matrix(rnorm(p),sparse=TRUE)
> norm2(X)

error: SpMat::init(): requested size is too large

Error: 
>

I cannot seem to figure out what I am doing wrong. Any pointers would be appreciated.

============== more information ==============

The problem seems to be with having dimension >= 2^16=65536

Following works:

> m <- 1000
> n <- 65535
> nnz <- 10000
> iind <- sample.int(m,nnz,replace=TRUE)
> jind <- sample.int(n,nnz,replace=TRUE)
> xval <- rnorm(nnz)
> X <- sparseMatrix(i=iind,j=jind,x=xval,dims=c(m,n))
> norm2(X)
SpMat Xsp:
10029.8
NULL

Following doesn't work:

> m <- 1000
> n <- 65536
> nnz <- 10000
> iind <- sample.int(m,nnz,replace=TRUE)
> jind <- sample.int(n,nnz,replace=TRUE)
> xval <- rnorm(nnz)
> X <- sparseMatrix(i=iind,j=jind,x=xval,dims=c(m,n))
> norm2(X)

error: SpMat::init(): requested size is too large

Error: 
>

Why would this be the case?

Solution

Your matrix seems odd. By saying

 Matrix(rnorm(p),sparse=TRUE)

you end up with a p x 1 matrix, albeit sparse. If I just assign 10 rows or colums things just work.

R> p <- 100000
R> X <- Matrix(rnorm(p),nrow=10,sparse=TRUE)
R> dim(X)
[1]    10 10000
R> norm2(X)
SpMat Xsp:
100832
NULL
R>

So I think you just need a better sparse matrix to work with -- the conversion code, and Armadillo's sparse Matrix type, are fine.

Update on 2013-04-30: This was actually an Armadillo bug, which was just fixed upstream. A new RcppArmadillo verion 0.3.810.2 is now in SVN, and should migrate soon to CRAN shortly. You no longer need to define ARMA_64BIT_WORD.

OTHER TIPS

I ran into this: http://arma.sourceforge.net/docs.html#config_hpp.

The solution is to set the 64 bit integer option by adding a line above the header file for armadillo:

#define ARMA_64BIT_WORD

I put this in [R root]/lib/R/library/RcppArmadillo/include/RcppArmadilloConfig.h

I had tried using "includes" parameter for cxxfunction, but I think this define statement has to be above the include statement,

#include RcppArmadillo.h

in the cpp code. I don't know if cxxfunction of the inline package allows this.

After modifying the RcppArmadilloConfig.h, I can now declare a matrix larger than 2^16.

> m <- 1000
> n <- 65536+1000
> nnz <- 10000
> iind <- sample.int(m,nnz,replace=TRUE)
> jind <- sample.int(n,nnz,replace=TRUE)
> xval <- rnorm(nnz)
> X <- sparseMatrix(i=iind,j=jind,x=xval,dims=c(m,n))
> norm2(X)
SpMat Xsp:
10218.8
NULL
>

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow