I am trying to use R to analyze large DNA sequence files (FASTQ files, several gigabytes each), but the standard R interface to these files (ShortRead) has to read the entire file at once. The data doesn't fit in memory, so the read fails with an error. Is there any way to read a few thousand lines at a time, stuff them into an in-memory file, and then have ShortRead read from that in-memory file?
I'm looking for something like Perl's IO::Scalar, for R.
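For reference, base R can do the chunked reading on its own: `readLines()` on an open connection picks up where the previous call stopped, and `textConnection()` is probably the closest base-R analogue of IO::Scalar. Below is a minimal sketch, assuming every FASTQ record is exactly four lines (no wrapped sequence lines). `process_fastq_chunks` and its `handler` argument are made-up names for illustration, and the in-memory connection is only parsed with base R here, since as far as I know ShortRead's `readFastq` expects a file path rather than a connection.

```r
# Hypothetical helper (not part of ShortRead): read a FASTQ file in chunks of
# `records_per_chunk` records, assuming each record is exactly four lines.
process_fastq_chunks <- function(path, handler, records_per_chunk = 1000L) {
  con <- file(path, open = "r")
  on.exit(close(con))
  total <- 0L
  repeat {
    # readLines() on an open connection resumes where the last call stopped
    lines <- readLines(con, n = 4L * records_per_chunk)
    if (length(lines) == 0L) break
    # Wrap the chunk in an in-memory connection (rough IO::Scalar analogue)
    mem <- textConnection(lines)
    chunk <- readLines(mem)
    close(mem)
    # The second line of every four-line record holds the sequence
    seqs <- chunk[seq_along(chunk) %% 4L == 2L]
    handler(seqs)
    total <- total + length(seqs)
  }
  total  # number of records seen
}
```

Only one chunk is held in memory at a time, so the chunk size can be tuned to whatever fits. The `handler` can be any function that takes a character vector of sequences, for example one that tabulates read lengths.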
I don’t know much about R, but have you had a look at the mmap package?
It looks like ShortRead is soon to add a "FastqStreamer" class that does what I want.
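In ShortRead releases that include it, `FastqStreamer` is used together with `yield()`, which returns the next chunk of records on each call. A minimal sketch of that pattern, assuming the Bioconductor package ShortRead is installed; `count_reads_streaming` is a made-up name, and the chunk size is an arbitrary choice:

```r
# Sketch: count reads chunk by chunk with ShortRead's FastqStreamer.
# `count_reads_streaming` is a hypothetical helper, not part of ShortRead.
count_reads_streaming <- function(path, chunk_size = 50000L) {
  strm <- ShortRead::FastqStreamer(path, n = chunk_size)
  on.exit(close(strm))
  total <- 0L
  # yield() returns the next chunk of records; a zero-length result means
  # the end of the file has been reached
  while (length(fq <- ShortRead::yield(strm)) > 0L) {
    total <- total + length(fq)
  }
  total
}
```

Any per-chunk analysis would go inside the loop in place of the counting, so that only `chunk_size` records are in memory at once.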