I have a pandas DataFrame that I am plotting with seaborn:
# `height` was called `size` before seaborn 0.9
g = sns.FacetGrid(readCov, col='chr', col_wrap=4, height=4)
g.map(plt.scatter, 'pos', 'bergC9', edgecolor='white')
g.set(xlim=(0, 250000))
This works great and gives me a single graph for each 'chr' that is in the 'chr' column. However, I would like each graph to have multiple columns on it. Currently only one is displayed, the one called 'bergC9'. I want to put more columns on the same graph with different colors.
Any ideas?
Thanks!
edit: input data file
   chr  description                                        pos     bergB7    bergC9    EvolB20
1  1    '"ID=PBANKA_010290;Name=PBANKA_010290;descript...  108389  0.785456  0.899275  0.803017
2  1    '"ID=PBANKA_010300;Name=PBANKA_010300;descript...  117894  1.070673  0.964203  0.989372
3  1    '"ID=PBANKA_010310;Name=PBANKA_010310;descript...  119281  1.031106  1.042189  0.883518
4  1    '"ID=PBANKA_010320;Name=PBANKA_010320;descript...  122082  0.880109  1.031673  1.026539
5  1    '"ID=PBANKA_010330;Name=PBANKA_010330;descript...  126075  0.948105  0.969198  0.849213
EDIT: I would like a scatterplot with pos on the x-axis and bergB7, bergC9, EvolB20, etc. (which are all 'strains') on the y-axis, so that several strains appear on the same graph. I was able to accomplish this by reformatting my data set so that it now has a 'strain' column, with all of the y data concatenated into a single column. Now I can use the hue syntax with 'strain'. I would like to not have to reformat all of my data sets. I thought it might be possible to write a loop that references all the columns I want plotted, but I tried several syntaxes to no avail. I have thought of other ways to accomplish this, but they create new data sets, and I know that is not the way to go programmatically. I am a new user and would like to start out correctly.
This is what the output should look like (subset of 15 graph panel shown): (I cannot post the image because my 'reputation' is not high enough)
I edited the data to have two chr cases. This should work for any number of "strains" columns. The data does need reformatting; from the seaborn documentation:
To use these features, your data has to be in a Pandas DataFrame and it must take the form of what Hadley Wickham calls “tidy” data. In brief, that means your dataframe should be structured such that each column is a variable and each row is an observation.
But pandas makes the reformatting easy with pd.melt:
from os.path import basename, realpath

import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Sample data in wide format: one column per strain
readCov = pd.DataFrame(
    [(1, '\'"ID=PBANKA_010290;Name=PBANKA_010290;descript...', 108389, 0.785456, 0.899275, 0.803017),
     (1, '\'"ID=PBANKA_010300;Name=PBANKA_010300;descript...', 117894, 1.070673, 0.964203, 0.9893719999999999),
     (1, '\'"ID=PBANKA_010310;Name=PBANKA_010310;descript...', 119281, 1.0311059999999999, 1.042189, 0.883518),
     (2, '\'"ID=PBANKA_010320;Name=PBANKA_010320;descript...', 122082, 0.880109, 1.031673, 1.0265389999999999),
     (2, '\'"ID=PBANKA_010330;Name=PBANKA_010330;descript...', 126075, 0.948105, 0.969198, 0.8492129999999999)],
    columns=['chr', 'description', 'pos', 'bergB7', 'bergC9', 'EvolB20'],
)

# Reshape to long ("tidy") format: one row per (row, strain) pair,
# with the strain name in 'strain' and the measurement in 'value'
meltCov = pd.melt(readCov, id_vars=['chr', 'description', 'pos'], var_name='strain')

# One panel per chr, one color per strain
g = sns.FacetGrid(meltCov, col='chr', hue='strain')
g.map(plt.scatter, 'pos', 'value')
g.set_xticklabels(rotation=45)
g.add_legend()

# Save the figure under the script's own name
# (this only works when run as a script, because it relies on __file__)
s = basename(realpath(__file__))
fig = plt.gcf()
fig.savefig(s.split('.')[0])
plt.show()