I have a big text file (100+ MB) with one integer per line, about 10 million numbers in total. Of course, the size and count may change, so I don't know them in advance.
I want to load the file into an int[], making the process as fast as possible. First I came to this solution:
public int[] fileToArray(String fileName) throws IOException
{
    List<String> list = Files.readAllLines(Paths.get(fileName));
    int[] res = new int[list.size()];
    int pos = 0;
    for (String line: list)
    {
        res[pos++] = Integer.parseInt(line);
    }
    return res;
}
It was pretty fast: 5.5 seconds, of which 5.1 s goes to the readAllLines call and 0.4 s to the loop.
But then I decided to try using BufferedReader, and came to this different solution:
public int[] fileToArray(String fileName) throws IOException
{
    BufferedReader bufferedReader = new BufferedReader(new FileReader(new File(fileName)));
    ArrayList<Integer> ints = new ArrayList<Integer>();
    String line;
    while ((line = bufferedReader.readLine()) != null)
    {
        ints.add(Integer.parseInt(line));
    }
    bufferedReader.close();
    int[] res = new int[ints.size()];
    int pos = 0;
    for (Integer i: ints)
    {
        res[pos++] = i.intValue();
    }
    return res;
}
This was even faster! 3.1 seconds: just 3 s for the while loop and not even 0.1 s for the for loop.
I know there is not much room for optimization here, at least in time, but using an ArrayList and then an int[] seems like too much memory to me.
Any ideas on how to make this faster, or avoid using the middle ArrayList?
Just for comparison, I do this same task with FreePascal in 1.9 seconds [see edit], using the TStringList class and the StrToInt function.
EDIT: Since I got a pretty short time with the Java method, I had to improve the FreePascal one. It now takes 330~360 ms.
If you're using Java 8, you can eliminate this intermediate ArrayList by using lines(), mapping each line to an int, and then collecting the values into an array.
You should also be using try-with-resources for proper exception handling and auto-closing.
try (BufferedReader br = new BufferedReader(new FileReader(fileName))) {
    return br.lines()
             .mapToInt(Integer::parseInt)
             .toArray();
}
I'm not sure if this is faster, but it is certainly much easier to maintain.
Edit: It is apparently MUCH faster.
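For anyone who wants to try this end to end, here is a minimal self-contained sketch of the same approach. The class name IntFileLoader and the small main method are only illustrative assumptions, not part of the original question. It assumes every line holds exactly one valid integer, since Integer.parseInt throws NumberFormatException on blank or malformed lines.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical wrapper class, named here only for illustration.
public class IntFileLoader {

    // Reads one integer per line and returns them as a primitive int[].
    // No intermediate List<Integer> is built, so no boxing takes place.
    public static int[] fileToArray(String fileName) throws IOException {
        // try-with-resources closes the reader even if parsing fails.
        try (BufferedReader br = new BufferedReader(new FileReader(fileName))) {
            return br.lines()                     // lazy Stream<String>, one element per line
                     .mapToInt(Integer::parseInt) // IntStream of primitive ints
                     .toArray();                  // collected into an int[]
        }
    }

    // Small demonstration: write a temporary file, load it, report the count.
    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("ints", ".txt");
        try {
            Files.write(tmp, Arrays.asList("1", "-2", "300"));
            int[] values = fileToArray(tmp.toString());
            System.out.println(values.length + " values loaded");
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

Because mapToInt yields an IntStream, the values stay primitive the whole way through, which is what removes the ArrayList<Integer> step from the question. Timings will depend on your file and machine, so measure before relying on any particular number.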