I've got a three-dimensional NumPy ndarray saved to disk as a .npy file. I want to bring it into R to apply some statistical functions that aren't implemented in Python. Is there a convenient way to do so? The RcppCNPy package doesn't generalize to 3+ dimensions, at least not yet.
I could always save the array in some different format on the Python side, but that'd be less convenient and more error-prone.
Here's some dummy data:
import numpy as np
goats_are_super = np.array(list(range(24))).reshape(4,3,2)
np.save("goats_are_super", goats_are_super)
We can also import a three-dimensional array from NumPy, as the next example shows. While the RcppCNPy package provides functions for simple reading and writing of NumPy files, we can also use the reticulate package to access NumPy functionality directly from R.
You can save your NumPy arrays to CSV files using the savetxt() function. This function takes a filename and array as arguments and saves the array into CSV format. You must also specify the delimiter; this is the character used to separate each variable in the file, most commonly a comma.
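One caveat for the three-dimensional case in the question: savetxt() only accepts 1-D or 2-D input, so the array has to be reshaped first. Here is a minimal sketch of that workaround, reusing the dummy data from the question. The temporary directory and the variable names (flat, restored) are just illustrative choices, not part of any answer above.

```python
import os
import tempfile

import numpy as np

# Same dummy data as in the question: shape (4, 3, 2).
arr = np.arange(24).reshape(4, 3, 2)

# np.savetxt rejects arrays with more than two dimensions, so collapse the
# trailing axes: each of the 4 rows holds one 3x2 slice, flattened in
# C (row-major) order.
flat = arr.reshape(arr.shape[0], -1)

# Write to a throwaway directory (illustrative location only).
path = os.path.join(tempfile.mkdtemp(), "goats_are_super.csv")
np.savetxt(path, flat, delimiter=",", fmt="%d")

# Round trip on the Python side to confirm that nothing was lost.
restored = np.loadtxt(path, delimiter=",", dtype=int).reshape(arr.shape)
print(np.array_equal(restored, arr))
```

On the R side the 4 x 6 table then has to be rebuilt into a 4 x 3 x 2 array by hand. R fills arrays in column-major order while the flattening above is row-major, so the axes need care when reshaping. This is the kind of step that makes the CSV route more error-prone than reading the .npy file directly.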
rpy2 has features to ease bidirectional communication with NumPy.
You can try using reticulate to wrap the existing Python code from R. It is a little newer, but pretty general and supports many types.
In the RcppCNPy package I have a vignette showing how reticulate can do what RcppCNPy does (at the cost of a potentially slightly more involved installation), so maybe give that a try?
Again, the vignette is here for your perusal.
Back in 2016, I had a similar issue. The solution that Avinash Balakrishnan and I came up with can be found here:
http://thecoatlessprofessor.com/programming/numpy-arrays-to-r-array-objects/
In short, we used rpy2 to handle the conversion of NumPy to an R array.
import os
import re

import numpy as np
from rpy2.robjects import r
# Written against the rpy2 API of 2016; newer rpy2 releases may expose
# this conversion under a different name.
from rpy2.robjects.numpy2ri import numpy2ri


def convert_numpy(path_to_data, fname, export_dir):
    """Convert NumPy N-D array to R object

    Keyword arguments:
    path_to_data -- full dir path to data
    fname -- partial file name to match
    export_dir -- Name of export dir added to data dir
    """
    # Create the export directory if it does not exist yet
    if not os.path.exists("%s/%s" % (path_to_data, export_dir)):
        os.makedirs("%s/%s" % (path_to_data, export_dir))

    # Get list of files in the directory
    files = os.listdir(path_to_data)

    # Keep only the files whose name contains fname
    numpy_files = sorted([f for f in files if fname in f])

    # Begin process conversion
    for numpy_fname in numpy_files:
        # Load in the N-D NumPy array
        d = np.load("%s/%s" % (path_to_data, numpy_fname))

        # Remove the file extension of .npy binary
        # (the result is used as an R variable name below)
        file_name = re.sub(r'\.npy$', '', numpy_fname)

        # Convert the numpy object to R
        ro = numpy2ri(d)

        # Assign the name
        r.assign(file_name, ro)

        # Export to .gzip readable by R's load()
        r("save(%s, file='%s/%s/%s.gzip', compress=TRUE)"
          % (file_name, path_to_data, export_dir, file_name))
This can be read into R using:
load("a_patches_b1.gzip")