The File I/O API in Phobos is relatively easy to use, but right now I feel like it's not very well integrated with D's range interface.
I could create a range delimiting the full contents by reading the entire file into an array:
import std.file;
auto mydata = cast(ubyte[]) read("filename");
processData(mydata); // takes a range of ubytes
But this eager evaluation of the data might be undesired if I only want to retrieve a file's header, for example. The upTo parameter doesn't solve this issue if the file's format assumes a variable-length header or any other element we wish to retrieve. It could even be in the middle of the file, and read forces me to read all of the file up to that point.
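For reference, this is roughly what the upTo approach looks like when the header size is known in advance. It is only a sketch: readFixedHeader is a hypothetical helper name, and the fixed size is exactly the assumption that breaks down for variable-length headers.

```d
import std.file : read;

/// Hypothetical helper: eagerly reads at most `headerSize` bytes
/// from the start of the file at `path`.
ubyte[] readFixedHeader(string path, size_t headerSize)
{
    // The second argument of std.file.read (upTo) caps how many bytes
    // are read; a shorter file simply yields fewer bytes.
    return cast(ubyte[]) read(path, headerSize);
}
```

Calling readFixedHeader("filename", 16) would return the first 16 bytes, but the caller has to know that number before opening the file.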
There are alternatives, though: readf, readln, byLine and, most particularly, byChunk let me retrieve pieces of data until I reach the end of the file, or stop whenever I no longer want to read from it.
import std.stdio;
auto file = File("filename"); // File is constructed by calling it, not with C++-style declaration syntax
auto chunkRange = file.byChunk(1000); // a range of ubyte[]s
processData(chunkRange); // oops! not expecting chunks!
But now I have introduced the complexity of dealing with fixed-size chunks of data, rather than a continuous range of bytes.
So how can I create a simple input range of bytes from a file that is lazily evaluated, either byte by byte or in small chunks (to reduce the number of reads)? Can the range in the second example be seamlessly wrapped so that the data can be processed as in the first example?
You can use std.algorithm.joiner:
auto r = File("test.txt").byChunk(4096).joiner();
Note that byChunk reuses the same buffer for each chunk, so you may need to add .map!(chunk => chunk.idup) to lazily copy the chunks to the heap.
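A minimal sketch of both variants follows. The helper names firstBytes and ownedChunks and the default chunk size are made up for illustration. As far as I can tell, feeding byChunk straight into joiner is safe because each byte is consumed before the next chunk overwrites the buffer; the idup copy matters when the chunks themselves are stored.

```d
import std.algorithm : joiner, map;
import std.array : array;
import std.range : take;
import std.stdio : File;

/// Hypothetical helper: lazily walks the file as a flat range of ubyte
/// and stops after at most `n` bytes, so only the chunks that are
/// actually needed get read.
ubyte[] firstBytes(string path, size_t n, size_t chunkSize = 4096)
{
    return File(path)
        .byChunk(chunkSize) // input range of ubyte[] (buffer is reused)
        .joiner             // flattened into an input range of ubyte
        .take(n)            // stop early
        .array;             // copies the bytes out of the shared buffer
}

/// Hypothetical helper: keeps every chunk, so each one is copied with
/// idup; without the copy all stored slices would alias one buffer.
immutable(ubyte)[][] ownedChunks(string path, size_t chunkSize = 4096)
{
    return File(path)
        .byChunk(chunkSize)
        .map!(chunk => chunk.idup)
        .array;
}
```

With this, something like processData(File("filename").byChunk(4096).joiner) should work whenever processData accepts any input range of ubyte, since joiner yields an input range rather than an array.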