What is the most efficient general-purpose way of reading "large" files (which may be text or binary), without going into unsafe territory? I was surprised how few relevant results there were when I did a web search for "rust read large file in chunks".
For example, one of my use cases is to calculate an MD5 checksum for a file using rust-crypto (the Md5 module allows you to add &[u8] chunks iteratively).
Here is what I have, which seems to perform slightly better than some other methods like read_to_end:
use std::{
    fs::File,
    io::{self, BufRead, BufReader},
};

fn main() -> io::Result<()> {
    const CAP: usize = 1024 * 128;
    let file = File::open("my.file")?;
    let mut reader = BufReader::with_capacity(CAP, file);

    loop {
        // fill_buf() returns the buffered data, reading more from the file if the buffer is empty
        let length = {
            let buffer = reader.fill_buf()?;
            // do stuff with buffer here
            buffer.len()
        };
        if length == 0 {
            break;
        }
        // mark the chunk as consumed so the next fill_buf() returns fresh data
        reader.consume(length);
    }

    Ok(())
}
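For the MD5 use case mentioned above, here is a sketch of how the loop body could feed each chunk into the hasher. This assumes the rust-crypto crate's Digest trait (Md5::new(), input(), result_str()); the helper name md5_of_file is just for illustration:

use crypto::digest::Digest;
use crypto::md5::Md5;
use std::{
    fs::File,
    io::{self, BufRead, BufReader},
};

fn md5_of_file(path: &str) -> io::Result<String> {
    const CAP: usize = 1024 * 128;
    let file = File::open(path)?;
    let mut reader = BufReader::with_capacity(CAP, file);
    let mut hasher = Md5::new();

    loop {
        let length = {
            let buffer = reader.fill_buf()?;
            // feed the current chunk into the digest
            hasher.input(buffer);
            buffer.len()
        };
        if length == 0 {
            break;
        }
        reader.consume(length);
    }

    // result_str() returns the hex-encoded digest
    Ok(hasher.result_str())
}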
I don't think you can write code more efficient than that. fill_buf on a BufReader over a File is basically just a straight call to read(2).
That said, BufReader isn't really a useful abstraction when you use it like that; it would probably be less awkward to just call file.read(&mut buf) directly.
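A minimal sketch of that direct approach, assuming the same 128 KiB buffer size and placeholder file name from the question:

use std::{
    fs::File,
    io::{self, Read},
};

fn main() -> io::Result<()> {
    const CAP: usize = 1024 * 128;
    let mut file = File::open("my.file")?;
    let mut buf = vec![0u8; CAP];

    loop {
        // read() may return fewer bytes than CAP; 0 means end of file
        let length = file.read(&mut buf)?;
        if length == 0 {
            break;
        }
        // do stuff with &buf[..length] here
    }

    Ok(())
}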