Reading numpy ndarrays into R?

I've got a three-dimensional numpy ndarray saved to disk as a .npy file. I want to bring it into R to apply some statistical functions that aren't implemented in Python. Is there a convenient way to do so? The RcppCNPy package doesn't handle arrays with three or more dimensions, at least not yet.

I could always save the array in some different format on the Python side, but that'd be less convenient and more error-prone.

Here's some dummy data:

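import numpy as np
# 24 consecutive integers arranged as a 4 x 3 x 2 array
goats_are_super = np.arange(24).reshape(4, 3, 2)
# np.save appends the .npy extension, so this writes goats_are_super.npy
np.save("goats_are_super", goats_are_super)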
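
To make the "error-prone" worry concrete, here is a minimal sketch of the plain-text fallback. It assumes a made-up output file name and a made-up layout (one header line with the dimensions, then one value per line). The main trap is memory order: NumPy flattens row-major by default, whereas R fills arrays column-major, so the values are written with order="F".

```python
import os
import tempfile

import numpy as np

# Same dummy data as above.
goats_are_super = np.arange(24).reshape(4, 3, 2)

# NumPy's default (C, row-major) flattening varies the LAST index fastest.
c_order = goats_are_super.ravel(order="C")

# R fills arrays with the FIRST index varying fastest (column-major),
# which corresponds to order="F" in NumPy.
f_order = goats_are_super.ravel(order="F")

# Hypothetical export: a header line holding the dims, then the values
# in column-major order, one per line.
out_path = os.path.join(tempfile.gettempdir(), "goats_are_super.txt")
np.savetxt(
    out_path,
    f_order,
    fmt="%d",
    header=" ".join(str(n) for n in goats_are_super.shape),
    comments="",  # write the header without a leading '# '
)
```

On the R side, something like `array(scan("goats_are_super.txt", skip = 1), dim = c(4, 3, 2))` should then rebuild the array, though that step is untested here.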
generic_user asked Apr 04 '19 16:04


2 Answers

You can try reticulate, which lets you call the existing Python code from R. It is a little newer, but quite general and supports many types.

The RcppCNPy package has a vignette showing how reticulate can do what RcppCNPy does (at the cost of a potentially slightly more involved installation), so maybe give that a try?

Again, the vignette is here for your perusal.

Dirk Eddelbuettel answered Sep 30 '22 05:09


Back in 2016, I had a similar issue. The solution that Avinash Balakrishnan and I came up with can be found here:

http://thecoatlessprofessor.com/programming/numpy-arrays-to-r-array-objects/

In short, we used rpy2 to handle the conversion of a NumPy array to an R array.

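import os
import re
import numpy as np

# Import path as used at the time of writing; newer rpy2 releases may
# expose this converter under a different name.
from rpy2.robjects import r
from rpy2.robjects.numpy2ri import numpy2ri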

def convert_numpy(path_to_data, fname, export_dir):
    """Convert NumPy N-D array to R object

    Keyword arguments:
    path_to_data -- full dir path to data
    fname        -- partial file name to match
    export_dir   -- Name of export dir added to data dir
    """  
    # Create the export directory if it does not exist yet
    os.makedirs(os.path.join(path_to_data, export_dir), exist_ok=True)

    # Get list of files in the directory
    files = os.listdir(path_to_data)

    # Keep the files whose name contains fname, in sorted order
    numpy_files = sorted([f for f in files if fname in f])

    # Begin process conversion
    for numpy_fname in numpy_files:

        # Load the N-D NumPy array
        d = np.load("%s/%s" % (path_to_data, numpy_fname))

        # Remove the .npy file extension (raw string avoids an invalid escape)
        file_name = re.sub(r'\.npy$', '', numpy_fname)

        # Convert the numpy object to R
        ro = numpy2ri(d)

        # Bind the converted array to that name in the R session
        r.assign(file_name, ro)

        # Export with R's save(); the resulting .gzip file is readable by R's load()
        r("save(%s, file='%s/%s/%s.gzip', compress=TRUE)" % (file_name,path_to_data,export_dir,file_name))

This can be read into R using:

load("a_patches_b1.gzip")      
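
One caveat about the file matching above: `fname in f` keeps any directory entry whose name contains fname, not only .npy files. Below is a small sketch of a stricter variant. The helper names find_npy_files and r_object_name and the demo file names are made up for illustration and are not part of the original script.

```python
import os
import re
import tempfile


def find_npy_files(path_to_data, fname):
    """Return sorted .npy file names in path_to_data whose name contains fname."""
    return sorted(
        f for f in os.listdir(path_to_data)
        if fname in f and f.endswith(".npy")
    )


def r_object_name(numpy_fname):
    """Strip a trailing .npy so the remainder can serve as the R object name."""
    return re.sub(r"\.npy$", "", numpy_fname)


# Small demonstration in a throwaway directory (file names are made up).
with tempfile.TemporaryDirectory() as tmp:
    for name in ("a_patches_b1.npy", "a_patches_b2.npy", "a_patches_notes.txt"):
        open(os.path.join(tmp, name), "w").close()
    matches = find_npy_files(tmp, "a_patches")
    names = [r_object_name(m) for m in matches]
```

Here the text file is skipped, and the remaining names are what the script would use for the R objects and the exported .gzip files.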
coatless answered Oct 01 '22 05:10