
Using R in Python with Rpy2: how to ggplot2?

I am trying to use R from Python and found rpy2 very interesting. It is powerful and not that difficult to use. However, even after reading the documentation and searching for similar questions, I could not solve my problem with the ggplot2 library.

Basically I have a dataset with 2 columns, 11 rows and no header and I would like to do a scatter plot using this R code from Python:

ggplot(dataset,aes(dataset$V1, dataset$V2))+geom_point()+scale_color_gradient(low="yellow",high="red")+geom_smooth(method='auto')+labs(title = "Features distribution on Scaffolds", x='Scaffolds Length', y='Number of Features')

I have tested this code in R (after loading my file with read.table) and it works. Now, this is my Python script:

import math, datetime
import rpy2
import rpy2.robjects as robjects
import rpy2.robjects.lib.ggplot2 as ggplot2

r = robjects.r
df = r("read.table('file_name.txt',sep='\t', header=F)")
gp = ggplot2.ggplot(df, ggplot2.aes(df[0], df[1])) + ggplot2.geom_point() + ggplot2.scale_color_gradient(low="yellow",high="red") + ggplot2.geom_smooth(method='auto') + ggplot2.labs(title = "Features distribution on Scaffolds", x='Scaffolds Length', y='Number of Features')
gp.plot()

If I run this Python code, it gives me two errors. The first is:

gp = ggplot2.ggplot(df, ggplot2.aes(df[0], df[1]))
TypeError: new() takes exactly 1 argument (3 given)

and the second is:

AttributeError: 'module' object has no attribute 'scale_color_gradient'

Can someone please help me understand what I'm doing wrong?

Revo asked Feb 02 '16 11:02




1 Answer

You probably need to map a dataframe column to the colour of the scatter points, so that scale_colour_gradient has a colour aesthetic to act on:

import rpy2.robjects.packages as packages
import rpy2.robjects.lib.ggplot2 as ggplot2
import rpy2.robjects as ro
R = ro.r
datasets = packages.importr('datasets')
mtcars = packages.data(datasets).fetch('mtcars')['mtcars']
gp = ggplot2.ggplot(mtcars)
pp = (gp 
      + ggplot2.aes_string(x='wt', y='mpg')
      + ggplot2.geom_point(ggplot2.aes_string(colour='qsec'))
      + ggplot2.scale_colour_gradient(low="yellow", high="red") 
      + ggplot2.geom_smooth(method='auto') 
      + ggplot2.labs(title="mtcars", x='wt', y='mpg'))

pp.plot()
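# Copy the current plot to a PNG device, then close that device so the file is written
R("dev.copy(png, '/tmp/out.png')")
R("dev.off()")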



The error

gp = ggplot2.ggplot(df, ggplot2.aes(df[0], df[1]))
TypeError: new() takes exactly 1 argument (3 given)

occurred because rpy2's ggplot2.ggplot takes only one argument, the dataframe:

gp = ggplot2.ggplot(df)

You can then add the aesthetics mapping to gp:

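gp + ggplot2.aes_string(x='V1', y='V2')

where 'V1' and 'V2' are the column names of df (read.table assigns V1, V2, ... when header=F, which is why your R code refers to dataset$V1 and dataset$V2). Per the examples in the docs, I've used aes_string here instead of aes.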
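Putting this first fix into the script from the question might look like the sketch below. It assumes the file is tab-separated with no header, so that R names the columns V1 and V2. The helper names default_colnames and build_plot are made up for this illustration, and read.table is called through rpy2 rather than as an R code string. The colour scale is left out because nothing is mapped to colour in the original plot.

```python
def default_colnames(n):
    """Column names R's read.table assigns when header=FALSE: V1, V2, ..., Vn."""
    return ['V%d' % i for i in range(1, n + 1)]


def build_plot(path):
    """Build (but do not draw) the scatter plot for a two-column, tab-separated file."""
    # rpy2 is imported here so that defining the function does not itself need R.
    import rpy2.robjects as ro
    import rpy2.robjects.lib.ggplot2 as ggplot2

    # Calling read.table through rpy2 avoids escaping '\t' inside an R code string.
    df = ro.r['read.table'](path, sep='\t', header=False)
    x_col, y_col = default_colnames(2)

    return (ggplot2.ggplot(df)                       # the dataframe is the only argument
            + ggplot2.aes_string(x=x_col, y=y_col)   # the mapping is added separately
            + ggplot2.geom_point()
            + ggplot2.geom_smooth(method='auto')
            + ggplot2.labs(title='Features distribution on Scaffolds',
                           x='Scaffolds Length',
                           y='Number of Features'))


# Usage (needs R, ggplot2 and rpy2; 'file_name.txt' is the question's placeholder):
# build_plot('file_name.txt').plot()
```

The key change from the question's script is that the aesthetics are added with + instead of being passed to ggplot2.ggplot as a second argument.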


The second error

AttributeError: 'module' object has no attribute 'scale_color_gradient'

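occurred because rpy2's ggplot2 wrapper module, at least in the version used here, only exposes the British spelling of the function: scale_colour_gradient. R's own ggplot2 accepts both spellings, which is why the same call worked when you ran it directly in R.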
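If you want code that does not care which spelling a given rpy2 version exposes, one option is a small lookup helper. This is only a sketch: resolve_scale is a hypothetical name, not part of rpy2, and the example runs against a stand-in object rather than the real wrapper module so that it works without R installed.

```python
from types import SimpleNamespace


def resolve_scale(module, name):
    """Return attribute `name` from `module`, trying both 'colour' and 'color' spellings."""
    candidates = [
        name,
        name.replace('color', 'colour'),   # American -> British
        name.replace('colour', 'color'),   # British -> American
    ]
    for candidate in candidates:
        fn = getattr(module, candidate, None)
        if fn is not None:
            return fn
    raise AttributeError('no attribute matching %r in either spelling' % name)


# Stand-in for a wrapper module that only defines the British spelling.
fake_ggplot2 = SimpleNamespace(scale_colour_gradient=lambda **kwargs: kwargs)

scale = resolve_scale(fake_ggplot2, 'scale_color_gradient')
result = scale(low='yellow', high='red')
```

With the real module you would call resolve_scale(ggplot2, 'scale_color_gradient')(low="yellow", high="red") and add the result to the plot as usual.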

unutbu answered Oct 25 '22 18:10