I have a DataFrame with a column that contains NumPy arrays.
I saved it to a CSV file.
After loading the CSV file, I found that the column that contained the NumPy arrays now holds strings.
How can I get NumPy arrays back when loading the file with read_csv?
import pandas as pd
import numpy as np
df = pd.DataFrame(columns = ['name', 'sex'])
df.loc[len(df), :] = ['Sam', 'M']
df.loc[len(df), :] = ['Mary', 'F']
df.loc[len(df), :] = ['Ann', 'F']
# insert one np.array per row (assigning the whole column avoids chained indexing)
df['data'] = [np.array([2, 5, 7]), np.array([6, 4, 8]), np.array([9, 2, 1])]
# save to csv file
df.to_csv('data.csv', index=False)
# load csv file
# the data column comes back as strings such as '[2 5 7]'; how to change it to np.array?
df2 = pd.read_csv('data.csv')
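Here is a workaround: split each string on spaces, strip the brackets, convert to int, and then wrap each row in np.array. It assumes the values are separated by single spaces, as in the example data.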
In [114]: df2['data'] = df2.data.str.split(' ',expand=True).replace(r'\[|\]','',regex=True).astype(int).values.tolist()
In [115]: df2['data'] = [np.array(i) for i in df2.data]
In [116]: df2.loc[0,'data']
Out[116]: array([2, 5, 7])
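Since the question asks about read_csv itself, another option is to do the parsing while loading, through the converters argument of read_csv. The following is only a sketch under some assumptions: parse_array is a hypothetical helper name, the cells are assumed to look like '[2 5 7]' (the text produced for a 1-D integer array), and an in-memory string stands in for data.csv so the example runs on its own.

```python
import io

import numpy as np
import pandas as pd

# Stand-in for data.csv (assumption: same layout as the file written in the question).
# The last row uses padded spacing, which NumPy produces when values differ in width.
csv_text = """name,sex,data
Sam,M,[2 5 7]
Mary,F,[6 4 8]
Ann,F,[ 9 12  1]
"""


def parse_array(cell):
    """Hypothetical helper: turn a string like '[2 5 7]' into an integer np.array."""
    # Drop the surrounding brackets, then split on any run of whitespace.
    tokens = cell.strip('[]').split()
    return np.array([int(tok) for tok in tokens])


# converters maps a column name to a function applied to each raw cell string.
df2 = pd.read_csv(io.StringIO(csv_text), converters={'data': parse_array})

print(type(df2.loc[0, 'data']))
print(df2.loc[2, 'data'])
```

Splitting on arbitrary whitespace rather than a single space keeps this working when the values have different widths. It still only recovers what was written to the file: NumPy abbreviates long arrays with ... when converting them to text, so very large arrays cannot be restored this way.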