I have a file that is in the format:
1 4298 3598 39980 58903
39 3598 395 395 3598 3598
So it is just a bunch of numbers on each line (every value fits in a 32-bit signed int).
My current code has to parse this every single time, first splitting each line into an array of strings and then converting each string to an int. I have to go over the same file many, many times, so is there a faster way to do this, via serialization or something else that cuts out most of the parsing? In other words, I am happy to preprocess the file.
Why not have the file in a binary format? The String conversions are completely unnecessary if you are only trying to get at the numerical values. Read in four bytes at a time and build an integer from them with bitwise operations. Serialization is a default mechanism designed to give the programmer an easy way to store objects, but a well-planned file format will work out to be easier and faster to parse.
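As a rough sketch of that idea (the question does not name a language, so Python is used here only for brevity): preprocess the text file once into a binary file, then read the integers back four bytes at a time. The function names and the file layout are assumptions made for illustration, not something from the question. Each line is stored as a 4-byte count followed by that many 4-byte big-endian signed integers, so line boundaries survive the conversion.

```python
import struct


def text_to_binary(text_path, bin_path):
    """One-time preprocessing: convert the text file to the assumed binary layout."""
    with open(text_path) as src, open(bin_path, "wb") as dst:
        for line in src:
            values = [int(token) for token in line.split()]
            # Assumed layout: a 4-byte count, then the values, all big-endian int32.
            dst.write(struct.pack(f">{len(values) + 1}i", len(values), *values))


def read_int32(buf, offset):
    """Assemble one signed 32-bit big-endian integer from four bytes."""
    value = (
        (buf[offset] << 24)
        | (buf[offset + 1] << 16)
        | (buf[offset + 2] << 8)
        | buf[offset + 3]
    )
    # Python integers are unbounded, so apply the two's-complement sign by hand.
    return value - (1 << 32) if value & 0x80000000 else value


def read_binary(bin_path):
    """Load the binary file back into a list of rows, with no string parsing."""
    with open(bin_path, "rb") as f:
        buf = f.read()
    rows = []
    offset = 0
    while offset < len(buf):
        count = read_int32(buf, offset)
        offset += 4
        rows.append([read_int32(buf, offset + 4 * i) for i in range(count)])
        offset += 4 * count
    return rows
```

Run `text_to_binary("numbers.txt", "numbers.bin")` once, then call `read_binary("numbers.bin")` on every later pass (both file names are placeholders). The hand-written shifts mirror the bitwise approach described above; in Python specifically, `struct.unpack` or the `array` module would probably be faster than a per-integer loop, while in a language with fixed-width integers the shifts alone are enough.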