Let's consider the following CSV file test.csv
:
"x","y","A","B"
8000000000,"0,1","0.113948,0.113689",0.114042
8000000000,"0,1","0.114063,0.113823",0.114175
8000000000,"0,1","0.114405,0.114366",0.114524
8000000000,"0,1,2,3","0.167543,0.172369,0.419197,0.427285",0.427576
8000000000,"0,1,2,3","0.167784,0.172145,0.418624,0.426492",0.428736
8000000000,"0,1,2,3","0.168121,0.172729,0.419768,0.427467",0.428578
My goal is to group the lines by the columns "x"
and "y"
, and compute the arithmetic mean over the columns "A"
and "B"
.
My first approach was to use a combination of groupby()
and mean()
in Pandas:
import pandas
if __name__ == "__main__":
data = pandas.read_csv("test.csv", header=0)
data = data.groupby(["x", "y"], as_index=False).mean()
print(data)
Running this script yields the following output:
x y B
0 8000000000 0,1 0.114247
1 8000000000 0,1,2,3 0.428297
As we can see, achieving my goal for the single-valued column "B"
is straightforward. However, the column "A"
is omitted. Instead, I'd like to have the column "A"
with with a string containing the arithmetic mean for each comma-separated value. The desired output should look like this:
x y A B
0 8000000000 0,1 0.114139,0.113959 0.114247
1 8000000000 0,1,2,3 0.167816,0.172414,0.419196,0.427081 0.428297
Does anybody know how to do this?
You can make a custom aggregate function that parses those strings into lists, finds the averages per column, and formats them back as strings:
def string_mean(rows):
data_list = []
for row in rows:
data_list.append([float(item) for item in row.split(",")])
data = np.array(data_list)
return ",".join([f"{item:.6f}" for item in data.mean(axis=0)])
df.groupby(["x", "y"], as_index=False).agg({"A": string_mean, "B": "mean"})
which returns
x y A B
0 8000000000 0,1 0.114139,0.113959 0.114247
1 8000000000 0,1,2,3 0.167816,0.172414,0.419196,0.427081 0.428297
Note it will error if your strings in A have a different number of columns in a single group.
You can probably clean up my function above considerably btw
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With