To create a pandas DataFrame from a NumPy array I can use:
import numpy as np
import pandas as pd

columns = ['1', '2']
data = np.array([[1, 2], [1, 5], [2, 3]])
df_1 = pd.DataFrame(data, columns=columns)
df_1
If I instead use:
columns = ['1','2']
data = np.array([[1,2,2] , [1,5,3]])
df_1 = pd.DataFrame(data,columns=columns)
df_1
Here each inner array is meant to be a column of data, but this throws an error:
ValueError: Wrong number of items passed 3, placement implies 2
Does pandas support this data format, or must I use the format from the first example?
You need to transpose your NumPy array:
df_1 = pd.DataFrame(data.T, columns=columns)
To see why this is necessary, consider the shape of your array:
print(data.shape)
(2, 3)
The second number in the shape tuple, which is the number of columns in the array, must equal the number of column labels you pass to the dataframe.
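As a minimal sketch of that rule, the snippet below compares the array's column count with the number of labels and confirms that pandas rejects the mismatch. The names `n_rows`, `n_cols`, and `mismatch_error` are introduced here for illustration only, and the exact wording of the error message may differ between pandas versions.

```python
import numpy as np
import pandas as pd

columns = ['1', '2']
data = np.array([[1, 2, 2], [1, 5, 3]])

# shape is (rows, columns); here the array has 2 rows and 3 columns
n_rows, n_cols = data.shape

# pandas raises ValueError when the column count and label count differ
try:
    pd.DataFrame(data, columns=columns)
    mismatch_error = None
except ValueError as exc:
    mismatch_error = exc

print(n_cols, len(columns))        # 3 labels would be needed, only 2 given
print(mismatch_error is not None)  # the constructor refused the input
```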
When we transpose the array, its rows and columns are swapped, so the shape becomes (3, 2) and the array can be passed into a dataframe with two columns:
print(data.T.shape)
(3, 2)
print(data.T)
[[1 1]
 [2 5]
 [2 3]]
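Putting it together, here is a short self-contained sketch of the transpose approach. It also shows one possible alternative that was not part of the answer above: pairing each label with one inner array in a dict, which pandas treats as column data. The names `df_t` and `df_d` are made up for this example.

```python
import numpy as np
import pandas as pd

columns = ['1', '2']
data = np.array([[1, 2, 2], [1, 5, 3]])  # each inner array is one column

# Approach from the answer: transpose so the shape becomes (3, 2)
df_t = pd.DataFrame(data.T, columns=columns)

# Alternative (an assumption, not from the answer): map each label to a row
# of the array, so pandas builds one column per dict entry
df_d = pd.DataFrame(dict(zip(columns, data)))

print(df_t)
print(np.array_equal(df_t.to_numpy(), df_d.to_numpy()))
```

Transposing is the more direct choice when the data is already a single 2-D array; the dict form may read more clearly when the columns start out as separate arrays.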