I have data stored in a CSV where the first row is strings (column names) and the remaining rows are numbers. How do I store this to a numpy array? All I can find is how to set data type for columns but not for rows.
Right now I'm just skipping the headers to do the calculations, but I need to have the headers in the final version. If I leave the headers in, it sets the whole array as string and the calculations fail.
This is what I have:
data = np.genfromtxt(path_to_csv, dtype=None, delimiter=',', skip_header=1)
To read CSV data into a NumPy array, you can use NumPy's genfromtxt() function with the delimiter parameter set to a comma. genfromtxt() is commonly used to load data from text files in Python.
The elements of a NumPy array are usually numbers, but they can also be booleans, strings, or other objects.
You can keep the column names if you use the names=True argument in the function np.genfromtxt:
data = np.genfromtxt(path_to_csv, dtype=float, delimiter=',', names=True)
Please note the dtype=float, which will convert your data to float. This is more efficient than using dtype=None, which asks np.genfromtxt to guess the datatype for you.
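To see the difference between the two dtype settings, here is a minimal sketch. The CSV text and its column names (height, age) are made up for illustration, and io.StringIO stands in for path_to_csv; this assumes a reasonably recent NumPy that accepts text streams.

```python
import io
import numpy as np

# Hypothetical CSV content used in place of a real file
csv_text = "height,age\n1.5,30\n1.8,40\n"

# dtype=float: every column is stored as a float
as_float = np.genfromtxt(io.StringIO(csv_text), dtype=float,
                         delimiter=',', names=True)

# dtype=None: genfromtxt infers a type for each column separately
guessed = np.genfromtxt(io.StringIO(csv_text), dtype=None,
                        delimiter=',', names=True)

print(as_float.dtype)  # all fields are floats
print(guessed.dtype)   # 'age' is inferred as an integer field
```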
The output will be a structured array, where you can access individual columns by their name. The names will be taken from your first row. Some modifications may occur; for example, spaces in a column name will be changed to _. The documentation should cover most questions you could have.
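As a rough sketch of working with the structured array, the example below reads a small made-up CSV (the column names and values are assumptions, and io.StringIO again stands in for path_to_csv), looks up a column by name, and, if a plain 2D array is needed for calculations, converts it with structured_to_unstructured from numpy.lib.recfunctions (available in NumPy 1.16 and later).

```python
import io
import numpy as np
from numpy.lib import recfunctions as rfn

# Hypothetical CSV content; note the space in "body weight"
csv_text = "height,body weight,age\n1.5,60.0,30\n1.8,80.0,40\n"

data = np.genfromtxt(io.StringIO(csv_text), dtype=float,
                     delimiter=',', names=True)

# Header names survive as field names (the space becomes an underscore)
column_names = data.dtype.names
print(column_names)

# Columns are accessed by name and behave like ordinary float arrays
mean_height = data['height'].mean()
print(mean_height)

# Optional: a plain 2D float array, one row per CSV row, for numeric work
plain = rfn.structured_to_unstructured(data)
print(plain.shape)
```

Keeping column_names alongside plain lets you run the calculations on the numeric array and still write the headers back out in the final version.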