I am facing an issue while adding a new row to a DataFrame.
Here is the example DataFrame:
import pandas as pd
column_names = ['A','B','C']
items = [['a1','b1','c1'],['a2','b2']]
newDF = pd.DataFrame(items,columns=column_names)
print(newDF)
Output:
    A   B     C
0  a1  b1    c1
1  a2  b2  None
Since c2 was missing, it was replaced with None. This is fine and as expected.
Now if I continue to add similar rows to this existing DataFrame, like this:
newDF.loc[len(newDF)] = ['a3','b3']
I get the error "cannot set a row with mismatched columns".
How can I add this additional row so that the missing c3 is automatically filled with None or NaN?
One option is DataFrame.append (deprecated in pandas 1.4 and removed in 2.0, so this needs an older version):
>>> new_row = ['a3', 'b3']
>>> newDF.append(pd.Series(new_row, index=newDF.columns[:len(new_row)]), ignore_index=True)
    A   B     C
0  a1  b1    c1
1  a2  b2  None
2  a3  b3   NaN
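On a pandas version where append is not available, a similar effect seems achievable with plain .loc assignment, as long as the value is a Series rather than a list. This is only a sketch: it reuses the newDF and new_row names from above and assumes the default 0..n-1 index, so that len(newDF) is the next free label. With a Series, pandas aligns on the column labels instead of checking the length, and the columns missing from the Series are left as NaN.

```python
import pandas as pd

newDF = pd.DataFrame([['a1', 'b1', 'c1'], ['a2', 'b2']], columns=['A', 'B', 'C'])
new_row = ['a3', 'b3']

# Label the short row with the first len(new_row) column names.
labelled = pd.Series(new_row, index=newDF.columns[:len(new_row)])

# Assigning a Series (not a list) aligns by column label, so the
# unlabelled column C is filled with NaN instead of raising.
# Assumption: newDF has the default integer index 0..n-1.
newDF.loc[len(newDF)] = labelled
print(newDF)
```

Unlike append, this modifies newDF in place.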
What about just assigning the row directly? Starting from:
>>> print(newDF)
    A   B     C
0  a1  b1    c1
1  a2  b2  None
Just assign the values a3 and b3 to the new index 2, and put np.nan in the last column yourself.
>>> import numpy as np
>>> newDF.loc[2] = ['a3','b3', np.nan]
>>> newDF
    A   B     C
0  a1  b1    c1
1  a2  b2  None
2  a3  b3   NaN
OR
>>> row = ['a3','b3', np.nan]
>>> newDF.loc[2] = row
>>> newDF
    A   B     C
0  a1  b1    c1
1  a2  b2  None
2  a3  b3   NaN
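Since the question asks for the padding to happen automatically, the np.nan could also be filled in by a small helper instead of by hand. The following is a rough sketch, not a pandas API: add_padded_row is a hypothetical name, and it assumes the missing values are always the trailing columns and that the DataFrame has the default 0..n-1 index.

```python
import numpy as np
import pandas as pd

def add_padded_row(df, values, fill=np.nan):
    """Hypothetical helper: add `values` as a new last row of `df` in place,
    padding on the right with `fill` up to the number of columns."""
    if len(values) > len(df.columns):
        raise ValueError("row has more values than the DataFrame has columns")
    padded = list(values) + [fill] * (len(df.columns) - len(values))
    # Assumption: default integer index, so len(df) is the next free label.
    df.loc[len(df)] = padded
    return df

newDF = pd.DataFrame([['a1', 'b1', 'c1'], ['a2', 'b2']], columns=['A', 'B', 'C'])
add_padded_row(newDF, ['a3', 'b3'])
print(newDF)
```

Because the padded list always has one entry per column, the "mismatched columns" check passes.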
Another way around: append a one-row DataFrame that holds only the columns you have values for (here A and B). Any other column in that row becomes NaN.
>>> row
['a3', 'b3']
>>> newDF.append(pd.DataFrame([row],index=[2],columns=['A', 'B']))
    A   B     C
0  a1  b1    c1
1  a2  b2  None
2  a3  b3   NaN
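The same one-row-DataFrame idea should carry over to pd.concat, which also aligns on column names and works on pandas versions that no longer have append. A minimal sketch, reusing the names from this answer (row_df and result are just illustrative names):

```python
import pandas as pd

newDF = pd.DataFrame([['a1', 'b1', 'c1'], ['a2', 'b2']], columns=['A', 'B', 'C'])
row = ['a3', 'b3']

# One-row frame holding only the columns we have values for.
row_df = pd.DataFrame([row], columns=['A', 'B'])

# concat aligns on column names, so the missing C becomes NaN.
# ignore_index=True renumbers the rows 0..n-1.
result = pd.concat([newDF, row_df], ignore_index=True)
print(result)
```

Like append, concat returns a new DataFrame and leaves newDF unchanged, so the result has to be assigned back if you want to keep adding rows.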