
Converting hdf5 to csv or tsv files

Tags: csv, hdf5, bigdata

I am looking for sample code that converts .h5 files to CSV or TSV. It has to read an .h5 file and write the output as CSV or TSV.

Sample code would be much appreciated. Please help, as I have been stuck on this for the last few days. I tried following the wrapper classes but don't know how to use them. I am not a strong programmer, so I am running into a lot of problems.

Please help. Thanks a lot in advance.

Sanjay Tiwari asked May 20 '14 11:05



1 Answer

Another Python solution, using pandas.

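#!/usr/bin/env python3
"""Dump a pandas-readable HDF5 file to CSV on stdout."""

import sys

import pandas as pd

fpath = sys.argv[1]
if len(sys.argv) > 2:
    # Optional second argument: key of the object to read from the file.
    key = sys.argv[2]
    df = pd.read_hdf(fpath, key=key)
else:
    # Without a key, read_hdf works only if the file holds a single pandas object.
    df = pd.read_hdf(fpath)

# Write CSV to stdout, leaving out the index column.
df.to_csv(sys.stdout, index=False)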


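The first argument to this script is the HDF5 file. If a second argument is passed, it is used as the key, that is, the name of the object stored in the file (not a column name). Without it, the script reads the file's only object, which works only when the file contains a single pandas object. In both cases all columns are printed. The script dumps the CSV to stdout, which you can redirect to a file.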
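If you don't know which keys a file contains, something like the following might help. It is only a sketch, assuming the file was written by pandas (via PyTables) and that the `tables` package is installed. The function names `list_hdf_keys` and `export_all_keys` are made up for this illustration. Passing `sep="\t"` gives TSV instead of CSV.

```python
import os

import pandas as pd


def list_hdf_keys(h5_path):
    """Return the keys (stored object names) of a pandas-written HDF5 file."""
    with pd.HDFStore(h5_path, mode="r") as store:
        return list(store.keys())


def export_all_keys(h5_path, out_dir, sep=","):
    """Write every stored object to its own delimited text file."""
    os.makedirs(out_dir, exist_ok=True)
    ext = ".tsv" if sep == "\t" else ".csv"
    written = []
    for key in list_hdf_keys(h5_path):
        df = pd.read_hdf(h5_path, key=key)
        # Keys look like "/group/name"; flatten them into a file name.
        out_name = key.strip("/").replace("/", "_") + ext
        out_path = os.path.join(out_dir, out_name)
        df.to_csv(out_path, sep=sep, index=False)
        written.append(out_path)
    return written
```

Each object is loaded fully into memory before it is written, so this is only suitable for objects that fit in RAM.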

For example, if your data is stored in an HDF5 file called data.h5 and you have saved this script as hdf2df.py, then

$ python3 hdf2df.py data.h5 > data.csv

will write the data to a csv file data.csv.
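Note that pd.read_hdf generally expects files in the layout pandas itself writes. If your .h5 file came from another tool, reading the raw datasets with h5py may work instead. The following is a rough sketch, assuming h5py is installed and the datasets are 1-D or 2-D numeric arrays (or 1-D compound/record arrays). The function name `h5_datasets_to_csv` is hypothetical.

```python
import csv
import os

import h5py


def h5_datasets_to_csv(h5_path, out_dir):
    """Write each 1-D or 2-D dataset in an HDF5 file to its own CSV file."""
    os.makedirs(out_dir, exist_ok=True)
    written = []

    def visit(name, obj):
        # Skip groups and anything that does not map onto rows and columns.
        if not isinstance(obj, h5py.Dataset) or obj.ndim not in (1, 2):
            return
        data = obj[()]  # loads the whole dataset into memory as a NumPy array
        out_path = os.path.join(out_dir, name.replace("/", "_") + ".csv")
        with open(out_path, "w", newline="") as fh:
            writer = csv.writer(fh)
            if data.dtype.names and data.ndim == 1:
                # Compound (record) dtype: field names become the header row.
                writer.writerow(data.dtype.names)
                writer.writerows(data.tolist())
            elif data.ndim == 1:
                # Plain 1-D array: one value per row.
                writer.writerows([value] for value in data.tolist())
            else:
                writer.writerows(data.tolist())
        written.append(out_path)

    with h5py.File(h5_path, "r") as f:
        f.visititems(visit)
    return written
```

Datasets with more than two dimensions are skipped, and string datasets may come out as byte literals, so treat this as a starting point rather than a general converter.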

Dilawar answered Oct 24 '22 22:10