
Storing a list of strings to a HDF5 Dataset from Python

Tags: python, hdf5, h5py

I am trying to store a variable-length list of strings in an HDF5 dataset. The code for this is:

    import h5py

    h5File = h5py.File('xxx.h5', 'w')
    strList = ['asas', 'asas', 'asas']
    h5File.create_dataset('xxx', (len(strList), 1), 'S10', strList)
    h5File.flush()
    h5File.close()

I am getting the error "TypeError: No conversion path for dtype: dtype('<U3')".
How can I solve this problem?

gman asked Apr 22 '14 13:04



1 Answer

You're reading in Unicode strings, but specifying your datatype as ASCII. According to the h5py wiki, h5py does not currently support this conversion.

You'll need to encode the strings in a format h5py handles:

    asciiList = [n.encode("ascii", "ignore") for n in strList]
    h5File.create_dataset('xxx', (len(asciiList), 1), 'S10', asciiList)

Note: not everything encoded in UTF-8 can be encoded in ASCII!
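To make that caveat concrete, here is a minimal sketch of the encoding step using only the Python standard library. The helper name `to_fixed_bytes` and its `width` parameter are hypothetical (they are not part of h5py); `width=10` is only meant to mirror the 'S10' dtype from the question. Unlike the "ignore" handler, this version fails loudly rather than dropping characters or exceeding the field width.

```python
def to_fixed_bytes(strings, width=10, encoding="ascii"):
    """Encode each str to bytes, raising instead of silently losing data.

    `width` is an assumed limit matching the 'S10' dtype: at most 10 bytes per item.
    """
    encoded = []
    for s in strings:
        # Strict mode (the default) raises UnicodeEncodeError on unsupported characters.
        b = s.encode(encoding)
        if len(b) > width:
            raise ValueError(f"{s!r} needs {len(b)} bytes, more than the {width} allowed")
        encoded.append(b)
    return encoded


str_list = ["asas", "asas", "asas"]
ascii_list = to_fixed_bytes(str_list)
print(ascii_list)  # [b'asas', b'asas', b'asas']

# The "ignore" handler drops characters that ASCII cannot represent.
print("café".encode("ascii", "ignore"))  # b'caf'

# Strict encoding raises for the same input, so the loss is not silent.
try:
    to_fixed_bytes(["café"])
except UnicodeEncodeError as exc:
    print("cannot encode:", exc.reason)
```

The resulting list of bytes objects can be passed to `create_dataset` in place of `asciiList`. Whether to drop, replace, or reject non-ASCII characters depends on the data, so treat the strict behavior here as one option rather than the required one.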

SlightlyCuban answered Oct 07 '22 04:10