I have three dataframes containing 17 sets of data with groups A, B, and C. A shown in the following code snippet
import pandas as pd
import numpy as np
data1 = pd.DataFrame(np.random.rand(17,3), columns=['A','B','C'])
data2 = pd.DataFrame(np.random.rand(17,3)+0.2, columns=['A','B','C'])
data3 = pd.DataFrame(np.random.rand(17,3)+0.4, columns=['A','B','C'])
I would like to plot a box plot to compare the three groups as shown in the figure below I am trying make the plot using seaborn's box plot as follows
import seaborn as sns
sns.boxplot(data1, groupby='A','B','C')
but obviously this does not work. Can someone please help?
We can use the concat function in pandas to append either columns or rows from one DataFrame to another.
Consider assigning an indicator like Location to distinguish your three sets of data. Then concatenate all three and melt the data to retrieve one value column, one Letter categorical column, and one Location column, all inputs into sns.boxplot
:
import pandas as pd
import numpy as np
from matplotlib pyplot as plt
import seaborn as sns
data1 = pd.DataFrame(np.random.rand(17,3), columns=['A','B','C']).assign(Location=1)
data2 = pd.DataFrame(np.random.rand(17,3)+0.2, columns=['A','B','C']).assign(Location=2)
data3 = pd.DataFrame(np.random.rand(17,3)+0.4, columns=['A','B','C']).assign(Location=3)
cdf = pd.concat([data1, data2, data3])
mdf = pd.melt(cdf, id_vars=['Location'], var_name=['Letter'])
print(mdf.head())
# Location Letter value
# 0 1 A 0.223565
# 1 1 A 0.515797
# 2 1 A 0.377588
# 3 1 A 0.687614
# 4 1 A 0.094116
ax = sns.boxplot(x="Location", y="value", hue="Letter", data=mdf)
plt.show()
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With