I'm trying to create a DataFrame with an append:
col_stats = ['Attribute', 'Mean', 'Var', 'Std']
stats = pd.DataFrame(columns=[col_stats])
for i in train:
    new_row = [
        i,
        train[i].mean(),
        np.var(train[i]),
        np.nanstd(train[i])
    ]
    new_row = pd.Series(new_row)
    stats = stats.append(new_row, ignore_index=True)
stats
It works when I eliminate this line:
stats = stats.append(new_row, ignore_index=True)
Otherwise, it gives me this error:
ValueError: Buffer dtype mismatch, expected 'Python object' but got 'long'
The 'Attribute' column is a string (the name of the variable). The other columns (Mean, Var, Std) are numbers (integers or floats).
Why can't I use DataFrame.append here?
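A likely cause of the error, offered as a sketch rather than a confirmed diagnosis: columns=[col_stats] wraps the list in a second list, which pandas interprets as MultiIndex columns, and a Series built from a bare list is labelled 0..3 instead of by column name, so the append has to align two incompatible indexes. The names nested, flat and labelled_row below are illustrative only. Note also that DataFrame.append was deprecated in pandas 1.4 and removed in 2.0, so the snippet avoids calling it.

```python
import pandas as pd

col_stats = ['Attribute', 'Mean', 'Var', 'Std']

# Wrapping the list in another list yields (single-level) MultiIndex columns.
nested = pd.DataFrame(columns=[col_stats])
# Passing the list directly yields a flat Index with four labels.
flat = pd.DataFrame(columns=col_stats)

print(type(nested.columns).__name__)
print(type(flat.columns).__name__)

# A Series built from a bare list is labelled 0..3, not by column name.
new_row = pd.Series(['B', 4.5, 0.25, 0.5])
print(list(new_row.index))

# Labelling the row with the column names lets it line up with the frame.
labelled_row = pd.Series(['B', 4.5, 0.25, 0.5], index=col_stats)
print(list(labelled_row.index))
```

Even with both issues fixed, growing a DataFrame row by row copies the data on every iteration, which is why collecting rows in a list first is usually preferred.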
For a loop solution, append the rows to a list and pass it to the DataFrame constructor:
L = []
for i in train:
    new_row = [
        i,
        train[i].mean(),
        np.var(train[i]),
        np.nanstd(train[i])
    ]
    L.append(new_row)
col_stats = ['Attribute', 'Mean', 'Var', 'Std']
stats = pd.DataFrame(L, columns=col_stats)
Sample:
import numpy as np
import pandas as pd

train = pd.DataFrame({'B': [4, 5, 4, 5, 5, 4],
                      'C': [7, 8, 9, 4, 2, 3],
                      'D': [1, 3, 5, 7, 1, 0]})
# Collect one row of statistics per column, then build the frame once.
L = []
for i in train:
    new_row = [
        i,
        train[i].mean(),
        np.var(train[i]),
        np.nanstd(train[i])
    ]
    L.append(new_row)
col_stats = ['Attribute', 'Mean', 'Var', 'Std']
stats = pd.DataFrame(L, columns=col_stats)
print(stats)
  Attribute      Mean       Var       Std
0         B  4.500000  0.250000  0.500000
1         C  5.500000  6.916667  2.629956
2         D  2.833333  6.138889  2.477678
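Alternatively, skip the loop and use DataFrame.agg:
# ddof=0 matches the defaults of np.var and np.nanstd
f1 = lambda x: x.var(ddof=0)
f2 = lambda x: x.std(ddof=0)
stats = train.agg(['mean', f1, f2]).T.reset_index()
stats.columns = ['Attribute', 'Mean', 'Var', 'Std']
print(stats)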
  Attribute      Mean       Var       Std
0         B  4.500000  0.250000  0.500000
1         C  5.500000  6.916667  2.629956
2         D  2.833333  6.138889  2.477678
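The ddof=0 in the lambdas matters because NumPy and pandas use different defaults: np.var and np.nanstd divide by n (ddof=0), while Series.var and Series.std divide by n - 1 (ddof=1). A minimal sketch of the difference, using column 'B' from the sample; the variable names are illustrative:

```python
import numpy as np
import pandas as pd

s = pd.Series([4, 5, 4, 5, 5, 4])  # column 'B' from the sample

# NumPy default: population variance, divides by n (ddof=0).
pop_var = np.var(s.to_numpy())
# pandas default: sample variance, divides by n - 1 (ddof=1).
sample_var = s.var()
# Passing ddof=0 makes pandas agree with NumPy.
matched_var = s.var(ddof=0)
matched_std = s.std(ddof=0)

print(pop_var, sample_var, matched_var, matched_std)
```

Without ddof=0, train.agg(['mean', 'var', 'std']) would return the sample statistics, which differ slightly from the loop version's output.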