I have a pandas DataFrame with a single column, and each row of that column holds a list of values. I need to calculate the mean of the corresponding values across rows, that is, one mean for each of the eight positions in the list. Each element in the list is the value of a variable.
>>> df_ex
0 [1, 2, 3, 4, 5, 6, 7, 8]
1 [2, 3, 4, 5, 6, 7, 8, 1]
I tried converting it to a NumPy array and then taking the mean, but I keep getting the error TypeError: unsupported operand type(s) for /: 'list' and 'int'. I understand that I should use columns instead of lists, but that isn't possible in my context. Any idea how I could accomplish this?
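You can convert the column to nested lists first, then to a 2-D array, and then calculate the mean (df_ex.tolist() assumes df_ex is the Series holding the lists; for a DataFrame, select the column first):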
a = np.array(df_ex.tolist())
print (a)
[[1 2 3 4 5 6 7 8]
[2 3 4 5 6 7 8 1]]
# Mean of all values
print (a.mean())
4.5
# Specify row-wise mean
print (a.mean(axis=1))
[ 4.5 4.5]
# Specify column-wise mean
print (a.mean(axis=0))
[ 1.5 2.5 3.5 4.5 5.5 6.5 7.5 4.5]
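To make the steps above reproducible, here is a minimal self-contained sketch. It assumes the data lives in a DataFrame column named col1 (a name chosen for illustration, the question does not show one) and that every list has the same length. It also shows why the direct conversion most likely fails: the result is a 1-D object array whose elements are still Python lists.

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of the question's data; "col1" is an illustrative name.
df_ex = pd.DataFrame({"col1": [[1, 2, 3, 4, 5, 6, 7, 8],
                               [2, 3, 4, 5, 6, 7, 8, 1]]})

# Converting the column directly yields a 1-D object array of lists,
# not a 2-D numeric array.
obj = df_ex["col1"].to_numpy()
print(obj.dtype, obj.shape)

# Taking the mean of that object array is what triggers the TypeError.
try:
    obj.mean()
except TypeError as exc:
    print("TypeError:", exc)

# Going through nested lists gives a proper 2-D numeric array.
a = np.array(df_ex["col1"].tolist())
print(a.shape)

col_means = a.mean(axis=0)  # one mean per list position (per variable)
row_means = a.mean(axis=1)  # one mean per row
print(col_means)
print(row_means)
```

For the question as asked (one mean per variable), axis=0 is the relevant call.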
You can call np.mean directly, passing the nested lists and specifying an axis.
Setup
df_ex = pd.DataFrame(dict(
col1=[[1, 2, 3, 4, 5, 6, 7, 8],
[2, 3, 4, 5, 6, 7, 8, 1]]))
df_ex
col1
0 [1, 2, 3, 4, 5, 6, 7, 8]
1 [2, 3, 4, 5, 6, 7, 8, 1]
Solution
np.mean(df_ex['col1'].tolist(), axis=1)
array([ 4.5, 4.5])
Or
np.mean(df_ex['col1'].tolist(), axis=0)
array([ 1.5, 2.5, 3.5, 4.5, 5.5, 6.5, 7.5, 4.5])
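Since the lists have to stay in the column, the results can be kept next to the original data instead of replacing it. The sketch below is one possible way to do that; the names row_mean and var_means are hypothetical, and it again assumes all lists have equal length.

```python
import numpy as np
import pandas as pd

df_ex = pd.DataFrame({"col1": [[1, 2, 3, 4, 5, 6, 7, 8],
                               [2, 3, 4, 5, 6, 7, 8, 1]]})

nested = df_ex["col1"].tolist()

# One mean per row, stored in a new column; the list column is left untouched.
df_ex["row_mean"] = np.mean(nested, axis=1)

# One mean per list position (per variable), as a Series indexed by position.
var_means = pd.Series(np.mean(nested, axis=0), name="mean")

print(df_ex)
print(var_means)
```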