I'm writing an R package that manipulates Matrices in C. Currently, the matrices returned to R have numbers for the row/column names. I would rather assign my own row/column names when modifying the object in C.
I've googled around for about an hour, but haven't found a good solution yet. The closest I've found is dimnames, but I want to name each column, not just the two dimensions. The matrices get larger than 4x4, below is just a small example of what I want to do.
The number of rows is 4^x where X is the length of the row name
Current
[,1] [,2] [,3] [,4]
[1,] 0.20 0.00 0.00 0.80
[2,] 0.25 0.25 0.25 0.25
[3,] 0.25 0.25 0.25 0.25
[4,] 1.00 0.00 0.00 0.00
[5,] 0.20 0.00 0.00 0.80
[6,] 0.25 0.25 0.25 0.25
[7,] 0.25 0.25 0.25 0.25
[8,] 1.00 0.00 0.00 0.00
[9,] 0.20 0.00 0.00 0.80
[10,] 0.25 0.25 0.25 0.25
[11,] 0.25 0.25 0.25 0.25
[12,] 1.00 0.00 0.00 0.00
[13,] 0.20 0.00 0.00 0.80
[14,] 0.25 0.25 0.25 0.25
[15,] 0.25 0.25 0.25 0.25
[16,] 1.00 0.00 0.00 0.00
Desired
[A] [C] [G] [T]
[AA] 0.20 0.00 0.00 0.80
[AC] 0.25 0.25 0.25 0.25
[AG] 0.25 0.25 0.25 0.25
[AT] 1.00 0.00 0.00 0.00
[CA] 0.20 0.00 0.00 0.80
[CC] 0.25 0.25 0.25 0.25
[CG] 0.25 0.25 0.25 0.25
[CT] 1.00 0.00 0.00 0.00
[GA] 0.20 0.00 0.00 0.80
[GC] 0.25 0.25 0.25 0.25
[GG] 0.25 0.25 0.25 0.25
[GT] 1.00 0.00 0.00 0.00
[TA] 0.20 0.00 0.00 0.80
[TC] 0.25 0.25 0.25 0.25
[TG] 0.25 0.25 0.25 0.25
[TT] 1.00 0.00 0.00 0.00
Method 1: using colnames() method colnames() method in R is used to rename and replace the column names of the data frame in R. The columns of the data frame can be renamed by specifying the new column names as a vector. The new name replaces the corresponding old name of the column in the data frame.
Method 1 : Using rownames() method The rownames() method in R is used to assign row names to the dataframe. It is assigned using a character vector consisting of desired names with a length equivalent to the number of rows in dataframe.
The rownames and colnames functions are used to define the corresponding names of a matrix and if we want to extract those names then the same function will be used. For example, if we have a matrix called M that has row names and column names then these names can be found by using rownames(M) and colnames(M).
If you are open to C++ instead of C, then Rcpp can make this a little easier. We just create a list object with rows and column names as we would in R, and assign that to the dimnames
attribute of the matrix object:
R> library(inline) # to compile, link, load the code here
R> src <- '
+ Rcpp::NumericMatrix x(2,2);
+ x.fill(42); // or more interesting values
+ // C++0x can assign a set of values to a vector, but we use older standard
+ Rcpp::CharacterVector rows(2); rows[0] = "aa"; rows[1] = "bb";
+ Rcpp::CharacterVector cols(2); cols[0] = "AA"; cols[1] = "BB";
+ // now create an object "dimnms" as a list with rows and cols
+ Rcpp::List dimnms = Rcpp::List::create(rows, cols);
+ // and assign it
+ x.attr("dimnames") = dimnms;
+ return(x);
+ '
R> fun <- cxxfunction(signature(), body=src, plugin="Rcpp")
R> fun()
AA BB
aa 42 42
bb 42 42
R>
The actual assignment of the column and row names is so manual ... because the current C++ standard does not allow direct assignment of vectors at initialization, but that will change.
Edit: I just realized that I can of course use static create()
method on the row and colnames too, which makes this a little easier and shorter still
R> src <- '
+ Rcpp::NumericMatrix x(2,2);
+ x.fill(42); // or more interesting values
+ Rcpp::List dimnms = // two vec. with static names
+ Rcpp::List::create(Rcpp::CharacterVector::create("cc", "dd"),
+ Rcpp::CharacterVector::create("ee", "ff"));
+ // and assign it
+ x.attr("dimnames") = dimnms;
+ return(x);
+ '
R> fun <- cxxfunction(signature(), body=src, plugin="Rcpp")
R> fun()
ee ff
cc 42 42
dd 42 42
R>
So we are down to three or four statements, no monkeying with PROTECT / UNPROTECT and no memory management.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With