Most efficient way of converting DataFrame to Matrix and vice versa
Question
I have been trying to implement some of the basic R functions, like split, in Rcpp for data frames and matrices. For that I need to know the most efficient way of converting between DataFrame and matrix in both directions. So far I use the DataFrame constructor to convert a matrix to a DataFrame. How can I convert a DataFrame back to a matrix? Let's assume that all the data are of type double.
The current approach is
Matrix to DataFrame
NumericMatrix x;
DataFrame y= DataFrame(x);
y.attr("names")=x.attr("names");
DataFrame to Matrix
DataFrame x;
int xsize = x.size();
NumericVector col = x(0);
NumericMatrix y(col.size(), xsize);
for (int i = 0; i < xsize; i++) {
    y(_, i) = col;
    if (i < xsize - 1) {
        col = x(i + 1);
    }
}
y.attr("names") = x.attr("names");
Is there a more efficient way of doing this conversion?
Also, I am new to Rcpp. Can anybody explain how to find the source code of a particular class implementation, for example NumericMatrix?
Also, the last line
y.attr("names")=x.attr("names")
does not set the column names of x as the column names of y. Can anybody explain how to give the matrix the same column names as the data frame?
Solution
You are missing the nrows method of DataFrame, which could simplify your code. With it, you don't need to special-case the first column:
int xsize = x.size();
NumericMatrix y(x.nrows(), xsize);
for (int i = 0; i < xsize; i++) {
    y(_, i) = NumericVector(x[i]);
}
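As for efficiency, a loop like the one above is probably close to the minimum amount of work. A data frame is a list of separate column vectors, while a matrix is one block stored in column-major order, so each column has to be copied into its slot once. The following is a minimal standalone sketch of that idea, not Rcpp code: ColMajorMatrix, columns_to_matrix and matrix_to_columns are hypothetical names, and std::vector stands in for both the data frame columns and the matrix storage.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a numeric matrix: column-major storage,
// so element (r, c) lives at data[c * nrow + r].
struct ColMajorMatrix {
    std::size_t nrow;
    std::size_t ncol;
    std::vector<double> data;
};

// "Data frame" (a list of columns) to matrix: one contiguous copy per column.
ColMajorMatrix columns_to_matrix(const std::vector<std::vector<double>>& cols) {
    const std::size_t ncol = cols.size();
    const std::size_t nrow = ncol == 0 ? 0 : cols[0].size();
    ColMajorMatrix m{nrow, ncol, std::vector<double>(nrow * ncol)};
    for (std::size_t c = 0; c < ncol; ++c) {
        assert(cols[c].size() == nrow);  // all columns must have equal length
        std::copy(cols[c].begin(), cols[c].end(), m.data.begin() + c * nrow);
    }
    return m;
}

// Matrix to "data frame": slice each contiguous column back out.
std::vector<std::vector<double>> matrix_to_columns(const ColMajorMatrix& m) {
    std::vector<std::vector<double>> cols;
    cols.reserve(m.ncol);
    for (std::size_t c = 0; c < m.ncol; ++c) {
        auto first = m.data.begin() + c * m.nrow;
        cols.emplace_back(first, first + m.nrow);
    }
    return cols;
}
```

Both directions touch every value exactly once, which is why the per-column loop is hard to beat unless you can avoid the copy altogether.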
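As for setting the column names, a matrix keeps them in the second element of its dimnames attribute rather than in names, which is why assigning names does not set them. You can go through the dimnames attribute instead: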
y.attr("dimnames") = List::create( R_NilValue, x.attr("names") ) ;
The source code for Matrix is here, but I'm not sure this is going to help you.