
How to keep a numpy array when saving a pandas DataFrame to csv

I have a pandas.DataFrame in which one column holds images. Each row of that column is an image stored as a 2d numpy.array. I saved the DataFrame to a csv file with pandas.DataFrame.to_csv(). However, when I read the csv file back, the values in that column are strings instead of numpy.array objects.

How can I read the csv file and keep the numpy.array?
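
For illustration, the behavior described above can be reproduced with a small sketch. The 2x2 array and the column names are made up for the example, and an in-memory buffer stands in for the csv file:

```python
import io

import numpy as np
import pandas as pd

# Hypothetical data: each row holds a name and a 2d array.
image = np.array([[0.1, 0.2], [0.3, 0.4]])
df = pd.DataFrame(
    [['image name1', image],
     ['image name2', image]],
    columns=['names', 'images'])

# Write to an in-memory buffer instead of a file on disk.
buffer = io.StringIO()
df.to_csv(buffer, index=False)
buffer.seek(0)

# Without a converter, the column comes back as plain strings.
df2 = pd.read_csv(buffer)
value = df2.loc[0, 'images']
print(type(value))
print(repr(value))
```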

zesla asked Mar 13 '17 01:03


1 Answer

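To read the numpy.array back from the csv file, you can pass a converter function to pandas.read_csv through its converters argument. to_csv stores each array as its printed form, with items separated by whitespace rather than commas, so the converter has to turn that text back into something that can be parsed.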

Code:

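import ast
import re
import numpy as np

def from_np_array(array_string):
    # The printed array may pad numbers with spaces (e.g. '[[  1 100]'), so
    # drop any whitespace after '[' and turn the remaining runs into commas.
    array_string = re.sub(r'\[\s+', '[', array_string.strip())
    array_string = ','.join(array_string.split())
    return np.array(ast.literal_eval(array_string))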

Test Code:

import numpy as np
import pandas as pd

image = np.array([[0.1, 0.2], [0.3, 0.4]])
df = pd.DataFrame(
    [['image name1', image],
     ['image name2', image],
     ],
    columns=['names', 'images']).set_index('names')
print(df)
df.to_csv('sample.csv')

df2 = pd.read_csv('sample.csv', converters={'images': from_np_array})
print(df2)

Results:

                               images
names                                
image name1  [[0.1, 0.2], [0.3, 0.4]]
image name2  [[0.1, 0.2], [0.3, 0.4]]

         names                    images
0  image name1  [[0.1, 0.2], [0.3, 0.4]]
1  image name2  [[0.1, 0.2], [0.3, 0.4]]
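
The printed results look the same for strings and arrays, so it can be worth checking that the values really are arrays again. The following is only a sketch: it repeats a regex-based variant of the converter so that it runs on its own, uses an in-memory buffer in place of sample.csv, and the integer test array is made up to exercise padded output.

```python
import ast
import io
import re

import numpy as np
import pandas as pd


def from_np_array(array_string):
    # Variant of the converter above: strip padding after '[' and
    # turn whitespace runs into commas before parsing.
    array_string = re.sub(r'\[\s+', '[', array_string.strip())
    return np.array(ast.literal_eval(','.join(array_string.split())))


# Integer arrays are printed with padding, e.g. '[[  1 100]', so use one here.
image = np.array([[1, 100], [20, 3]])
df = pd.DataFrame([['image name1', image]], columns=['names', 'images'])

# Round-trip through csv text held in memory.
buffer = io.StringIO()
df.to_csv(buffer, index=False)
buffer.seek(0)

df2 = pd.read_csv(buffer, converters={'images': from_np_array})
restored = df2.loc[0, 'images']
print(type(restored), restored.shape)
print(np.array_equal(restored, image))
```

One caveat, as far as I know: numpy abbreviates arrays with more than 1000 elements using '...' when converting them to text, so for real images you would likely need to raise the threshold with np.set_printoptions before saving, or store the images in a different format.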
Stephen Rauch answered Oct 16 '22 14:10