
Loading a .rds file in Pandas

I have downloaded a file in .rds format. How can I load it with Pandas? It is supposed to be an R file, but I haven't been able to find any information on how to load it.

D1X asked Dec 06 '16 13:12





2 Answers

If you would prefer not to install R (rpy2 requires it), the package "pyreadr" can read Rds and RData files very easily.

It is a wrapper around the C library librdata, so it is very fast.

You can install it easily with pip:

pip install pyreadr 

Then you can read your rds file:

import pyreadr

result = pyreadr.read_r('/path/to/file.Rds')  # also works for RData

# done!
# result is a dictionary where the keys are the names of the objects and the
# values are Python objects. In the case of Rds there is only one object,
# with None as its key.
df = result[None]  # extract the pandas data frame
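Since the same call handles both Rds files (a single object under the key None) and RData files (objects under their names), a small wrapper can make the "exactly one object" assumption explicit. This is only a sketch: the helper names `only_object` and `load_single` are hypothetical and not part of pyreadr, and the only pyreadr call used is the `read_r` shown above.

```python
def only_object(result):
    """Return the single object held in a read_r-style result dictionary."""
    if len(result) != 1:
        raise ValueError(
            f"expected exactly one object, found {len(result)}: {list(result)}"
        )
    # For an Rds file the only key is None; for RData it is the object's name.
    return next(iter(result.values()))


def load_single(path):
    """Read an Rds or RData file with pyreadr and return its only object."""
    # Imported here so that only_object stays usable without pyreadr installed.
    import pyreadr

    return only_object(pyreadr.read_r(path))
```

With this in place, `df = load_single('/path/to/file.Rds')` would behave like `result[None]` in the snippet above, while an RData file holding several objects would raise an error instead of silently returning one of them.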

The repo is here: https://github.com/ofajardo/pyreadr

Disclaimer: I am the developer of this package.

Otto Fajardo answered Sep 23 '22 23:09



You could use the rpy2 interface to Pandas, in the following manner:

import rpy2.robjects as robjects
from rpy2.robjects import pandas2ri

pandas2ri.activate()

readRDS = robjects.r['readRDS']
df = readRDS('my_file.rds')
# Note: ri2py is the rpy2 2.x name; rpy2 3.x renamed it to rpy2py.
df = pandas2ri.ri2py(df)
# do something with the dataframe
mgalardini answered Sep 24 '22 23:09
