
How to deal with hdf5 files in R?

Tags: r, hdf5

I have a file in HDF5 format. I know that it is supposed to contain a matrix, and I want to read that matrix into R so that I can study it. I see that there is an h5r package that is supposed to help with this, but I cannot find a tutorial that is simple to read and understand. Is such a tutorial available online? Specifically, how do you read an HDF5 object with this package, and how do you actually extract the matrix?

UPDATE

I found the rhdf5 package, which is not on CRAN but is part of Bioconductor. Its interface is relatively easy to understand, and the documentation and example code are quite clear. I could use it without problems. My problem, it seems, was the input file: the matrix I wanted to read was actually stored in the HDF5 file as a Python pickle, so every time I tried to open and access it from R I got a segmentation fault. I figured out how to save the matrix from within Python as a TSV file, and now that problem is solved.
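
A minimal sketch of reading such a TSV export into an R matrix. It assumes the file has no header row and no row names; the temporary file below is only a stand-in for the real export, and its contents are made up for illustration.

```r
# Stand-in for the TSV exported from Python (hypothetical data):
# 2 rows, 3 columns, tab-separated, no header.
tsv_path <- tempfile(fileext = ".tsv")
writeLines(c("1\t2\t3", "4\t5\t6"), tsv_path)

# read.table returns a data frame; as.matrix converts it to a matrix.
m <- as.matrix(read.table(tsv_path, sep = "\t", header = FALSE))
dimnames(m) <- NULL  # drop the default V1, V2, ... column names

dim(m)
str(m)
```

If the export does include a header line, header = TRUE would be the setting to change.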

Sam asked Apr 12 '13 14:04



1 Answer

The rhdf5 package works really well, although it is not on CRAN. Install it from Bioconductor:

# as of 2020-09-08, these are the updated instructions per
# https://bioconductor.org/install/
if (!requireNamespace("BiocManager", quietly = TRUE))
  install.packages("BiocManager")
# install rhdf5 itself from the Bioconductor release that matches your R version
BiocManager::install("rhdf5")

And to use it:

library(rhdf5) 

List the objects within the file to find the data group you want to read:

h5ls("path/to/file.h5") 

Read the HDF5 data:

mydata <- h5read("path/to/file.h5", "/mygroup/mydata") 

And inspect the structure:

str(mydata) 

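Note that multidimensional arrays may appear transposed: HDF5 stores arrays in row-major order while R is column-major, so a dataset written from Python or C comes back with its dimensions reversed. You can also read an entire group, which is returned as a named list in R.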
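
As a rough sketch of the group-reading behaviour, the example below builds a small throwaway file and reads it back. It assumes rhdf5 is already installed; the group and dataset names (mygroup, mydata, weights) are hypothetical and only there to make the example self-contained.

```r
library(rhdf5)

# Build a small throwaway file so the example is self-contained.
# The group and dataset names are hypothetical.
h5file <- tempfile(fileext = ".h5")
h5createFile(h5file)
h5createGroup(h5file, "mygroup")
h5write(matrix(1:6, nrow = 2), h5file, "mygroup/mydata")
h5write(c(0.5, 1.5), h5file, "mygroup/weights")

# Reading a dataset returns an array; reading a group returns a named list
# with one element per object in the group.
mydata <- h5read(h5file, "/mygroup/mydata")
grp <- h5read(h5file, "/mygroup")
names(grp)
dim(grp$mydata)

# For a 2-D matrix written by a row-major tool (for example h5py),
# t() restores the orientation the writer used:
# mydata <- t(mydata)

h5closeAll()
```

A file written and read by rhdf5 itself keeps its orientation, so the t() step is only relevant for data produced by other tools.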

Mike T answered Oct 18 '22 23:10
