Possibly a stupid question, but I've hunted around a lot for an answer and been unable to find one:
I'm trying to write a file reader, a la fread or read.delim, but implemented in C++ and connected to R via Rcpp. The easiest way to do this and have it output a data.frame is to have it produce a List of vectors - one for each column - and set the class to data.frame:
List foo;
foo.push_back(column);
foo.attr("class") = "data.frame";
return foo;
Simple enough, and I've done it before. Unfortunately:
So, the answer is to be able to define foo and then, for each row I read in, push_back() a field on to each of foo's underlying vectors:
List foo(1);
foo[0].push_back("turnip");
Unfortunately I can't work out how to do that: it doesn't appear that a List's member vectors can be pushed_back() to, since this results in the error "Rcpp::Vector<19>::Proxy has no member named push_back()"
So, my question: is there any way to append to a vector within an Rcpp list? Or is my only option to read the file in column-by-column, appending the resulting vectors to "foo", and bite the performance cost that's going to result from having to iterate through it [number of columns] times instead of once?
Hopefully this question is clear enough. Happy to answer any questions.
It is a semi-hard problem when you know neither rows nor columns beforehand.
In a for-work, remained-closed project a few years ago, I collected my data as a variant type (using the corresponding Boost class) and converted at the end.
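For what it is worth, the "collect as a variant, convert at the end" idea can be sketched roughly as below. This is not the code from that project: it swaps in std::variant (C++17) for the Boost class, and the Cell alias and column_as_double helper are names made up purely for illustration.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>
#include <string>
#include <variant>
#include <vector>

// A cell can hold any of the types we expect to meet while reading.
// (Illustrative alias; std::variant stands in for the Boost variant class.)
using Cell = std::variant<int, double, std::string>;

// Hypothetical helper: once all rows are collected, extract one column as
// doubles. Integers are widened; string cells are treated as missing and
// become NaN.
std::vector<double> column_as_double(const std::vector<std::vector<Cell>>& rows,
                                     std::size_t col) {
    std::vector<double> out;
    out.reserve(rows.size());
    for (const auto& row : rows) {
        const Cell& cell = row.at(col);   // throws if the row is too short
        if (const int* i = std::get_if<int>(&cell)) {
            out.push_back(static_cast<double>(*i));
        } else if (const double* d = std::get_if<double>(&cell)) {
            out.push_back(*d);
        } else {
            out.push_back(std::numeric_limits<double>::quiet_NaN());
        }
    }
    return out;
}
```

The point of the design is that the column type does not have to be decided until every row has been seen; the cost is an extra pass over the data at the end.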
In Rblpapi (to which I contributed some other code), Whit tried a few approaches and ended up defining his own helper functions and I have been meaning to distill / refactor this and discuss it with Kevin -- but that hasn't happened yet.
So feel free to come up with something better :)
Generally speaking, and getting back to your problem, we frequently receive data row-wise, often via call-backs. The Rcpp types (wrapping R types) do very poorly when you append element by element -- so don't do the naive push_back, as you will end up copying a lot.
So if you know your types, do std::list over corresponding std::vector<T> for the given T. These vectors you can grow. Once you have them, assembling a Rcpp::List and hence Rcpp::DataFrame is easier.
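A minimal sketch of that growable-vector approach, assuming a hypothetical two-column, tab-separated input (an integer id and a string name). The Columns struct and read_rows function are invented for the example. The reading part is plain C++, so it compiles without R; the Rcpp conversion is shown only as a comment.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical layout chosen only for illustration: one std::vector per
// column, each with the element type known up front.
struct Columns {
    std::vector<int> id;
    std::vector<std::string> name;
};

// Read tab-separated rows in a single pass, appending one field to each
// column vector per row. std::vector::push_back is amortised constant time,
// so the file is traversed once and no per-element copying of R vectors
// takes place.
Columns read_rows(std::istream& in) {
    Columns cols;
    std::string line;
    while (std::getline(in, line)) {
        if (line.empty()) continue;               // skip blank lines
        std::istringstream fields(line);
        std::string id_field, name_field;
        std::getline(fields, id_field, '\t');
        std::getline(fields, name_field, '\t');
        cols.id.push_back(std::stoi(id_field));   // throws on non-numeric input
        cols.name.push_back(name_field);
    }
    assert(cols.id.size() == cols.name.size());   // columns stay in step
    return cols;
}

// In an Rcpp source file, the final conversion would look roughly like:
//   return Rcpp::DataFrame::create(Rcpp::Named("id")   = cols.id,
//                                  Rcpp::Named("name") = cols.name);
```

The conversion to R objects then happens once per column, after all rows have been read, rather than once per appended element.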