
plot multiple columns on same graph seaborn

I have a pandas DataFrame that I am plotting with seaborn:

g = sns.FacetGrid(readCov, col='chr', col_wrap=4, size=4)  # `size` was renamed `height` in seaborn 0.9
g.map(plt.scatter, 'pos', 'bergC9', edgecolor='white')
g.set(xlim=(0, 250000))

This works great and gives me a single graph for each 'chr' that is in the 'chr' column. However, I would like each graph to have multiple columns on it. Currently only one is displayed, the one called 'bergC9'. I want to put more columns on the same graph with different colors.

Any ideas?

Thanks!

edit: input data file

   chr  description                                         pos     bergB7    bergC9    EvolB20
1  1    '"ID=PBANKA_010290;Name=PBANKA_010290;descript...   108389  0.785456  0.899275  0.803017
2  1    '"ID=PBANKA_010300;Name=PBANKA_010300;descript...   117894  1.070673  0.964203  0.989372
3  1    '"ID=PBANKA_010310;Name=PBANKA_010310;descript...   119281  1.031106  1.042189  0.883518
4  1    '"ID=PBANKA_010320;Name=PBANKA_010320;descript...   122082  0.880109  1.031673  1.026539
5  1    '"ID=PBANKA_010330;Name=PBANKA_010330;descript...   126075  0.948105  0.969198  0.849213

EDIT: I would like a scatterplot with pos on the x-axis and bergB7, bergC9, EvolB20, etc. (all of which are 'strains') on the y-axis, so that several strains appear on the same graph. I was able to accomplish this by reformatting my data set so that it has a 'strain' column, with all of the y data concatenated into one column; now I can use the hue syntax with 'strain'. However, I would like to avoid reformatting all of my data sets. I thought it might be possible to write a loop that references all the columns I want plotted, but I tried several syntaxes to no avail. I have thought of other ways to accomplish this, but they create new datasets, and I know that is not the way to go programmatically. I am a new user and would like to start out correctly.

This is what the output should look like (subset of 15 graph panel shown): (I cannot post the image because my 'reputation' is not high enough)

pinkvirus asked Apr 18 '15 02:04


1 Answer

I edited the data to have two chr values. This should work for any number of "strain" columns. The data does need reformatting; from the seaborn documentation:

To use these features, your data has to be in a Pandas DataFrame and it must take the form of what Hadley Wickham calls “tidy” data. In brief, that means your dataframe should be structured such that each column is a variable and each row is an observation.

but pandas does it easily:

import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
readCov = pd.DataFrame(
    [(1, '\'"ID=PBANKA_010290;Name=PBANKA_010290;descript...', 108389, 0.785456, 0.899275, 0.803017),
     (1, '\'"ID=PBANKA_010300;Name=PBANKA_010300;descript...', 117894, 1.070673, 0.964203, 0.989372),
     (1, '\'"ID=PBANKA_010310;Name=PBANKA_010310;descript...', 119281, 1.031106, 1.042189, 0.883518),
     (2, '\'"ID=PBANKA_010320;Name=PBANKA_010320;descript...', 122082, 0.880109, 1.031673, 1.026539),
     (2, '\'"ID=PBANKA_010330;Name=PBANKA_010330;descript...', 126075, 0.948105, 0.969198, 0.849213)],
    columns=['chr', 'description', 'pos', 'bergB7', 'bergC9', 'EvolB20'],
)

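# Reshape wide to long: one row per (chr, description, pos, strain), with the reading in 'value'
meltCov = pd.melt(readCov, id_vars=['chr', 'description', 'pos'], var_name='strain')
# One panel per chr; hue colors the points by strain
g = sns.FacetGrid(meltCov, col='chr', hue='strain')
g.map(plt.scatter, 'pos', 'value')
g.set_xticklabels(rotation=45)
g.add_legend()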

# Save the figure under the script's own name (__file__ is only defined when this runs as a script)
from os.path import realpath, basename
s = basename(realpath(__file__))
fig = plt.gcf()
fig.savefig(s.split('.')[0])
plt.show()
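
To see what the pd.melt call does to the table, here is a minimal sketch on two of the sample rows (the description column is left out for brevity, and the names `wide` and `long` are only illustrative). Each strain column is stacked into rows, which is what lets hue='strain' work:

```python
import pandas as pd

# Small wide-format frame mirroring the question's layout (values taken from the sample rows).
wide = pd.DataFrame(
    {
        "chr": [1, 1],
        "pos": [108389, 117894],
        "bergB7": [0.785456, 1.070673],
        "bergC9": [0.899275, 0.964203],
        "EvolB20": [0.803017, 0.989372],
    }
)

# Every non-id column becomes rows of (strain, value); the id_vars are repeated for each strain.
long = pd.melt(wide, id_vars=["chr", "pos"], var_name="strain")

print(long.shape)                      # 2 rows x 3 strain columns -> 6 rows, 4 columns
print(long["strain"].unique().tolist())
```

Adding another strain column to the wide frame needs no code change, since melt picks up every column not listed in id_vars.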

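
The question also asks whether a loop over the strain columns could avoid reshaping. The following is only a sketch of one way that might look, not the approach taken in this answer: it relies on FacetGrid.map_dataframe handing each facet's rows to the plotting function as a `data` keyword, and the names `strains` and `scatter_strains` are hypothetical. The strain columns have to be listed by hand.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Wide-format sample in the question's layout (description column omitted for brevity).
readCov = pd.DataFrame(
    {
        "chr": [1, 1, 1, 2, 2],
        "pos": [108389, 117894, 119281, 122082, 126075],
        "bergB7": [0.785456, 1.070673, 1.031106, 0.880109, 0.948105],
        "bergC9": [0.899275, 0.964203, 1.042189, 1.031673, 0.969198],
        "EvolB20": [0.803017, 0.989372, 0.883518, 1.026539, 0.849213],
    }
)

# Assumption: the columns to overlay are known up front.
strains = ["bergB7", "bergC9", "EvolB20"]


def scatter_strains(data, color=None, label=None, **kwargs):
    """Draw one scatter per strain column on the current facet's axes."""
    ax = plt.gca()
    for strain in strains:
        # No explicit color: each scatter call takes the next color in the axes' cycle,
        # so a given strain gets the same color in every panel.
        ax.scatter(data["pos"], data[strain], label=strain, **kwargs)


g = sns.FacetGrid(readCov, col="chr")
g.map_dataframe(scatter_strains)
g.add_legend()
```

Call plt.show() afterwards to display the figure. Compared with melting, this keeps the original frame untouched, but seaborn's own hue handling is bypassed, so the melt version is likely the more idiomatic choice.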

cphlewis answered Oct 05 '22 02:10