I've got a set of large ASCII data files that I need to read into a NumPy array. By large, I mean 390 lines, where each line is 60,000 space-separated values (doubles written with high precision from a C++ program).
Currently I am using the following (naive) code:
import numpy as np
data_array = np.genfromtxt('l_sim_s_data.txt')
However, this takes upwards of 25 seconds to run. I suspect it is because data_array is not preallocated before the values are read in. Is there any way to tell genfromtxt the size of the array it is building (so memory would be preallocated)? Or does anyone have an idea of how to speed this process up?
Have you tried np.loadtxt?
(genfromtxt is a more advanced file loader, which handles things like missing values and format converters.)
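Since your file is purely numeric with no missing entries, something along these lines should be enough (a minimal sketch; the filename is the one from your question, and float64 is already loadtxt's default dtype):

import numpy as np

# loadtxt skips the missing-value handling and converter machinery that
# genfromtxt carries, which usually makes it faster on clean numeric files.
data_array = np.loadtxt('l_sim_s_data.txt', dtype=np.float64)

print(data_array.shape)  # should be (390, 60000) for the file described above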