I am trying to read a large file (around 10G) into a list of lines. The file contains a sequence of integers, something like this:
0x123456
0x123123
0x123123
.....
By default my codebase uses the function below to read files, but it turns out to be quite slow (~12 minutes) in this scenario:
let lines_from_file (filename : string) : string list =
  let lines = ref [] in
  let chan = open_in filename in
  try
    while true; do
      lines := input_line chan :: !lines
    done; []
  with End_of_file ->
    close_in chan;
    List.rev !lines;;
I guess I need to read the whole file into memory and then split it into lines (I am using a 128G server, so memory should not be a problem). But after searching the documents here, I still could not tell whether OCaml provides such a facility.
So here is my question:
Given my situation, how can I read a file into a string list quickly?
How about using a stream? That would mean adjusting the related application code, which would take some time.
First of all, you should consider whether you really need to have all the information in memory at once. Maybe it is better to process the file line by line?
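To illustrate the line-by-line idea, here is a minimal sketch that folds a function over the lines of a file without ever building a list. The names fold_lines and sum_file are mine, not from the question, and the sketch assumes OCaml 4.08 or later (for Fun.protect) and that every line holds exactly one integer literal that int_of_string accepts, such as 0x123456.

```ocaml
(* Fold [f] over the lines of [filename], keeping only the accumulator
   in memory. The channel is closed even if [f] raises. *)
let fold_lines (f : 'a -> string -> 'a) (init : 'a) (filename : string) : 'a =
  let chan = open_in filename in
  let rec loop acc =
    match input_line chan with
    | line -> loop (f acc line)          (* tail call: constant stack *)
    | exception End_of_file -> acc
  in
  Fun.protect ~finally:(fun () -> close_in chan) (fun () -> loop init)

(* Hypothetical use: sum every integer in the file.
   int_of_string understands the 0x prefix. *)
let sum_file (filename : string) : int =
  fold_lines (fun acc line -> acc + int_of_string line) 0 filename
```

If the existing code really needs a string list, this does not help directly, but any computation that can be phrased as a fold avoids allocating one list cell and one string per line.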
If you really want to have it all in memory at once, then you can use Bigarray's map_file function to map the file as an array of characters, and then do something with it.
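A rough sketch of the mapping approach follows. It assumes a recent OCaml (4.08 or later), where the mapping function is exposed as Unix.map_file rather than under Bigarray, so the program has to be linked against the unix library. The names map_chars and count_newlines are hypothetical helpers, not part of any library.

```ocaml
(* Requires the unix library, e.g. ocamlfind ocamlopt -package unix ... *)

(* Map [filename] read-only as a one-dimensional bigarray of chars.
   Passing -1 as the dimension lets the size be taken from the file. *)
let map_chars (filename : string) =
  let fd = Unix.openfile filename [Unix.O_RDONLY] 0 in
  Fun.protect ~finally:(fun () -> Unix.close fd) (fun () ->
    Unix.map_file fd Bigarray.char Bigarray.c_layout false [| -1 |]
    |> Bigarray.array1_of_genarray)

(* Example of "doing something with it": count the newline characters. *)
let count_newlines data =
  let n = ref 0 in
  for i = 0 to Bigarray.Array1.dim data - 1 do
    if Bigarray.Array1.get data i = '\n' then incr n
  done;
  !n
```

The mapped data lives outside the OCaml heap, so the garbage collector never scans it. Turning it back into a string list would reintroduce the per-line allocations, so this pays off mostly when the data can be processed in place.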
Also, as far as I can see, this file contains numbers. Maybe it is better to allocate an array (or, even better, a bigarray), then process each line in order and store the integers in the (big)array.
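One way this could look, as a sketch only: read the file twice, first to count the lines and then to parse them into a bigarray of that size. The function name ints_from_file is made up for illustration. The code assumes OCaml 4.08 or later (Bigarray in the standard library, Fun.protect available) and that every line is a single literal accepted by int_of_string.

```ocaml
(* Parse one integer per line into a bigarray of native ints. *)
let ints_from_file (filename : string) =
  let chan = open_in filename in
  Fun.protect ~finally:(fun () -> close_in chan) (fun () ->
    (* First pass: count the lines so the bigarray is allocated once. *)
    let count = ref 0 in
    (try
       while true do
         ignore (input_line chan);
         incr count
       done
     with End_of_file -> ());
    let arr = Bigarray.Array1.create Bigarray.int Bigarray.c_layout !count in
    (* Second pass: rewind and parse each line in order. *)
    seek_in chan 0;
    for i = 0 to !count - 1 do
      Bigarray.Array1.set arr i (int_of_string (input_line chan))
    done;
    arr)
```

Reading the file twice costs extra I/O. If that matters, a growable buffer or an estimate based on the file size could replace the first pass. Either way, each integer takes one machine word instead of a list cell plus a heap-allocated string.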
I often use the following two functions to read the lines of a file. Note that the function lines_from_files is tail-recursive.
(* Read one line, or None at end of file. *)
let read_line i = try Some (input_line i) with End_of_file -> None

let lines_from_files filename =
  let rec lines_from_files_aux i acc = match (read_line i) with
    | None -> close_in i; List.rev acc  (* close the channel once the file is exhausted *)
    | Some s -> lines_from_files_aux i (s :: acc) in
  lines_from_files_aux (open_in filename) []

let () =
  lines_from_files "foo"
  |> List.iter (Printf.printf "lines = %s\n")