Let's consider the following CSV file test.csv:
"x","y","A","B"
8000000000,"0,1","0.113948,0.113689",0.114042
8000000000,"0,1","0.114063,0.113823",0.114175
8000000000,"0,1","0.114405,0.114366",0.114524
8000000000,"0,1,2,3","0.167543,0.172369,0.419197,0.427285",0.427576
8000000000,"0,1,2,3","0.167784,0.172145,0.418624,0.426492",0.428736
8000000000,"0,1,2,3","0.168121,0.172729,0.419768,0.427467",0.428578
My goal is to group the lines by the columns "x" and "y", and compute the arithmetic mean over the columns "A" and "B".
My first approach was to use a combination of groupby() and mean() in Pandas:
import pandas

if __name__ == "__main__":
    data = pandas.read_csv("test.csv", header=0)
    data = data.groupby(["x", "y"], as_index=False).mean()
    print(data)
Running this script yields the following output:
x y B
0 8000000000 0,1 0.114247
1 8000000000 0,1,2,3 0.428297
As we can see, achieving my goal for the single-valued column "B" is straightforward. However, the column "A" is omitted. Instead, I'd like to have the column "A" contain a string with the arithmetic mean of each comma-separated position. The desired output should look like this:
x y A B
0 8000000000 0,1 0.114139,0.113959 0.114247
1 8000000000 0,1,2,3 0.167816,0.172414,0.419196,0.427081 0.428297
Does anybody know how to do this?
You can make a custom aggregate function that parses those strings into lists, finds the averages per column, and formats them back as strings:
import numpy as np

def string_mean(rows):
    # Parse each comma-separated string into a list of floats
    data_list = []
    for row in rows:
        data_list.append([float(item) for item in row.split(",")])
    # Average position-wise, then format back into a single string
    data = np.array(data_list)
    return ",".join(f"{item:.6f}" for item in data.mean(axis=0))

df.groupby(["x", "y"], as_index=False).agg({"A": string_mean, "B": "mean"})
which returns
x y A B
0 8000000000 0,1 0.114139,0.113959 0.114247
1 8000000000 0,1,2,3 0.167816,0.172414,0.419196,0.427081 0.428297
Note that this will raise an error if the strings in "A" within a single group contain different numbers of values.
You can probably clean up my function above considerably, by the way.
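If you'd rather avoid parsing strings inside the aggregate function, another option is to split "A" into real numeric columns first, group once, and join the means back into a string. Here's a sketch of that idea; the "A0", "A1", … column names are made up for illustration, and the sample data is embedded inline so the snippet is self-contained (in practice you'd read your own test.csv). Groups whose strings have fewer values just produce NaNs in the trailing positions, which mean() skips:

```python
import io

import pandas as pd

# Sample data from the question, embedded so the snippet runs as-is
csv_text = """\
"x","y","A","B"
8000000000,"0,1","0.113948,0.113689",0.114042
8000000000,"0,1","0.114063,0.113823",0.114175
8000000000,"0,1","0.114405,0.114366",0.114524
8000000000,"0,1,2,3","0.167543,0.172369,0.419197,0.427285",0.427576
8000000000,"0,1,2,3","0.167784,0.172145,0.418624,0.426492",0.428736
8000000000,"0,1,2,3","0.168121,0.172729,0.419768,0.427467",0.428578
"""
df = pd.read_csv(io.StringIO(csv_text), header=0)

# Expand "A" into one numeric column per comma-separated position
# (the "A0", "A1", ... names are placeholders for this sketch)
expanded = df["A"].str.split(",", expand=True).astype(float)
expanded.columns = [f"A{i}" for i in expanded.columns]
a_cols = list(expanded.columns)

# Group on the original keys and average every numeric column at once
grouped = (
    pd.concat([df[["x", "y", "B"]], expanded], axis=1)
    .groupby(["x", "y"], as_index=False)
    .mean()
)

# Glue the position-wise means back into a single string column
grouped["A"] = grouped[a_cols].apply(
    lambda row: ",".join(f"{v:.6f}" for v in row.dropna()), axis=1
)
grouped = grouped.drop(columns=a_cols)[["x", "y", "A", "B"]]
print(grouped)
```

This produces the same result as the custom aggregate, but all the averaging happens in one vectorized groupby.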