
Create pandas dataframe from numpy array

To create a pandas DataFrame from a NumPy array, I can use:

import numpy as np
import pandas as pd

columns = ['1', '2']
data = np.array([[1, 2], [1, 5], [2, 3]])
df_1 = pd.DataFrame(data, columns=columns)
df_1

If I instead use:

columns = ['1','2']
data = np.array([[1,2,2] , [1,5,3]])
df_1 = pd.DataFrame(data,columns=columns)
df_1

Here each inner array is meant to be a column of data, but this throws an error:

ValueError: Wrong number of items passed 3, placement implies 2

Does pandas support this data layout, or must I use the format from the first example?

asked Dec 23 '22 07:12 by blue-sky



1 Answer

You need to transpose your numpy array:

df_1 = pd.DataFrame(data.T, columns=columns)

To see why this is necessary, consider the shape of your array:

print(data.shape)

(2, 3)

The second number in the shape tuple is the number of columns in the array, and it must equal the number of column labels you pass to the dataframe. Here the array has 3 columns but only 2 labels were given.
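As a rough sketch of that rule, the check below compares the array's column count with the number of labels. The variable names follow the question; the names matches_as_is and matches_transposed are made up for illustration, and the comparison is not something pandas requires you to write yourself.

```python
import numpy as np

columns = ['1', '2']
data = np.array([[1, 2, 2], [1, 5, 3]])

# shape is (rows, columns); the second entry must match len(columns)
n_rows, n_cols = data.shape
matches_as_is = n_cols == len(columns)                 # compares 3 with 2
matches_transposed = data.T.shape[1] == len(columns)   # compares 2 with 2

print(matches_as_is, matches_transposed)
```

Only the transposed array has as many columns as there are labels.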

When we transpose the array, its rows and columns are swapped, so it can be passed into a dataframe with two columns:

print(data.T.shape)

(3, 2)

print(data.T)

[[1 1]
 [2 5]
 [2 3]]
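Putting it together, here is a minimal end-to-end sketch using the data from the question. The second construction, which pairs each label with one row of the original array through a dict, is an alternative that is not part of the answer above; it assumes each row of data is meant to become one column. The names df_t and df_d are chosen here for illustration.

```python
import numpy as np
import pandas as pd

columns = ['1', '2']
data = np.array([[1, 2, 2], [1, 5, 3]])

# Option 1: transpose so each original row becomes a column
df_t = pd.DataFrame(data.T, columns=columns)

# Option 2 (alternative): map each label to one row of the array
df_d = pd.DataFrame(dict(zip(columns, data)))

print(df_t)
print(df_d)
```

Both frames should come out with three rows and two columns. The dict form can be easier to read when the labels and rows are already paired up, while the transpose keeps the change to a single attribute.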
answered Jan 10 '23 09:01 by jpp
