I have a numpy array (a):
array([[ 1. ,  5.1,  3.5,  1.4,  0.2],
       [ 1. ,  4.9,  3. ,  1.4,  0.2],
       [ 2. ,  4.7,  3.2,  1.3,  0.2],
       [ 2. ,  4.6,  3.1,  1.5,  0.2]])
I would like to make a pandas DataFrame with values=a, columns=A,B,C,D and the index set to the first column of my numpy array. In the end it should look like this:
     A    B    C    D
1  5.1  3.5  1.4  0.2
1  4.9  3.0  1.4  0.2
2  4.7  3.2  1.3  0.2
2  4.6  3.1  1.5  0.2
I am trying this:
df = pd.DataFrame(a, index=a[:,0], columns=['A', 'B','C','D'])
and I get the following error:
ValueError: Shape of passed values is (5, 4), indices imply (4, 4)
Any help?
You passed the complete array as the data param, so pandas received five columns of values but only four column names. If you want just four columns from the array as the data, you need to slice the array as well:
In [158]:
df = pd.DataFrame(a[:,1:], index=a[:,0], columns=['A', 'B','C','D'])
df
Out[158]:
     A    B    C    D
1  5.1  3.5  1.4  0.2
1  4.9  3.0  1.4  0.2
2  4.7  3.2  1.3  0.2
2  4.6  3.1  1.5  0.2
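Also, having duplicate values in the index will make filtering/indexing problematic: a label lookup such as df.loc[1] returns every row that carries that label rather than a single row.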
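To make the duplicate-index caveat concrete, here is a minimal sketch using the same array. The variable names (index_is_unique, rows_for_label_1, df_flat) and the column name 'key' are made up for illustration and are not part of the question; keeping the first column as an ordinary column is just one possible workaround, not necessarily what you want.

```python
import numpy as np
import pandas as pd

a = np.array([[1.0, 5.1, 3.5, 1.4, 0.2],
              [1.0, 4.9, 3.0, 1.4, 0.2],
              [2.0, 4.7, 3.2, 1.3, 0.2],
              [2.0, 4.6, 3.1, 1.5, 0.2]])

# Same construction as in the answer: first column becomes the index.
df = pd.DataFrame(a[:, 1:], index=a[:, 0], columns=['A', 'B', 'C', 'D'])

# The index contains repeated labels, so it is not unique.
index_is_unique = df.index.is_unique

# A label lookup returns every row with that label (a DataFrame, not one row).
rows_for_label_1 = df.loc[1.0]

# One possible workaround (an assumption about what you need): keep the
# first column as a regular column named 'key' and use a default integer index.
df_flat = df.rename_axis('key').reset_index()
```

With df_flat you can still filter by the old labels, e.g. df_flat[df_flat['key'] == 1.0], while each row keeps a unique positional label.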
Here, a[:,1:] takes all the rows but only the columns from index 1 onwards, as desired, while a[:,0] takes the first column of every row for the index; see the docs.
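As a quick check of the slicing described above, the following sketch compares the shapes of the full array and the two slices, and shows that passing the unsliced array together with four column names fails. The names values, labels, and error_message are only illustrative; the exact wording of the error message may differ between pandas versions, so it is captured rather than assumed.

```python
import numpy as np
import pandas as pd

a = np.array([[1.0, 5.1, 3.5, 1.4, 0.2],
              [1.0, 4.9, 3.0, 1.4, 0.2],
              [2.0, 4.7, 3.2, 1.3, 0.2],
              [2.0, 4.6, 3.1, 1.5, 0.2]])

values = a[:, 1:]   # all rows, columns 1..4 -> 2-D array of shape (4, 4)
labels = a[:, 0]    # all rows, column 0     -> 1-D array of shape (4,)

print(a.shape, values.shape, labels.shape)

# Passing the full 5-column array with only 4 column names raises ValueError.
error_message = None
try:
    pd.DataFrame(a, index=labels, columns=['A', 'B', 'C', 'D'])
except ValueError as exc:
    error_message = str(exc)
print(error_message)

# With the sliced values, the shapes line up and construction succeeds.
df = pd.DataFrame(values, index=labels, columns=['A', 'B', 'C', 'D'])
```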