
Random access on a huge file in Haskell

Tags:

haskell

What is the best way to read a huge file (around 1 TB) in Haskell? The file contains a matrix of integer data, and I may need to efficiently calculate the correlation between different rows or between columns.

I have previously used PyTables for this but was thinking of trying the same in Haskell. I know Haskell has some HDF5 bindings, but are there any other options that I am not aware of?

Abhijit Ray asked Sep 24 '13 15:09



1 Answer

As in any other language: you seek to the offset you need (using System.IO.hSeek), and then read the bytes with binary I/O (Data.ByteString.hGet). Then you parse the result (e.g. using attoparsec) and process as needed.
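A minimal sketch of the seek-then-read approach, under assumptions the question does not confirm: the file is taken to be a row-major matrix of 8-byte little-endian signed integers with a known column count. The names `cellBytes`, `decodeInt64LE`, `decodeRow` and `readRow` are hypothetical helpers, and the fixed-width cells are decoded by hand here instead of with attoparsec, only to keep the example free of extra dependencies.

```haskell
import qualified Data.ByteString as BS
import Data.Bits (shiftL, (.|.))
import Data.Int (Int64)
import Data.Word (Word64)
import System.IO

-- Assumed layout (not stated in the question): each cell is an
-- 8-byte little-endian signed integer, stored row by row.
cellBytes :: Int
cellBytes = 8

-- Decode one 8-byte little-endian chunk into an Int64.
-- foldr visits the last byte first, so it ends up most significant.
decodeInt64LE :: BS.ByteString -> Int64
decodeInt64LE = fromIntegral . BS.foldr step (0 :: Word64)
  where
    step b acc = (acc `shiftL` 8) .|. fromIntegral b

-- Split a buffer into fixed-width cells and decode each one.
-- An incomplete trailing cell (e.g. a short read at end of file) is dropped.
decodeRow :: BS.ByteString -> [Int64]
decodeRow bs
  | BS.length bs < cellBytes = []
  | otherwise =
      let (cell, rest) = BS.splitAt cellBytes bs
      in decodeInt64LE cell : decodeRow rest

-- Read row r (0-based) of a matrix with nCols columns.
-- The handle should be opened in binary mode.
readRow :: Handle -> Int -> Integer -> IO [Int64]
readRow h nCols r = do
  -- hSeek takes an Integer offset, so offsets beyond 2^31 bytes are fine.
  hSeek h AbsoluteSeek (r * fromIntegral (nCols * cellBytes))
  decodeRow <$> BS.hGet h (nCols * cellBytes)
```

With a hypothetical file `matrix.bin` holding 1000 columns, row 42 could be fetched with `withBinaryFile "matrix.bin" ReadMode (\h -> readRow h 1000 42)`. Reading a column under this layout needs one seek per row, so for column-heavy work it may be worth storing a transposed copy as well.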

Roman Cheplyaka answered Oct 29 '22 17:10
