csv-conduit was recommended to me as a good Haskell package for working with CSV files. I want to learn how it works, but the documentation is too terse for a newbie Haskell programmer.
Is there a way for me to figure out how it works by trial-and-error in GHCi?
More specifically, should I load modules and files from GHCi or should I write a simple HS file to load them and then move around interactively?
I mentioned csv-conduit, but I'm open to using any CSV package. I just need to get my hands on one and fool around with it until I feel at ease (much like I would do in IDLE).
Take a look at the following function:
readCSVFile :: (MonadResource m, CSV ByteString a) => CSVSettings -> FilePath -> m [a]
It's relatively simple to call, as we just need a CSVSettings, such as defCSVSettings, and a FilePath (aka String), "file.csv" or something.
Thus, after the call, we get a value of type m [a], subject to the constraints (MonadResource m, CSV ByteString a). We can resolve these one at a time to figure out an appropriate concrete type. We are performing IO in this operation, so for MonadResource m, m should just be ResourceT IO, which happens to be an instance of MonadBaseControl IO as required by runResourceT. This is a conduit-specific thing.
For the CSV ByteString a constraint, we need to find which instances of CSV exist. To do so, go to http://hackage.haskell.org/packages/archive/csv-conduit/0.2.1.1/doc/html/Data-CSV-Conduit.html#t:CSV (where the documentation for the package is, in my opinion, somewhat obnoxiously all stuffed into the typeclass...) and click on Instances to see which instances of the form CSV ByteString a are available. The two options are CSV ByteString (Row ByteString) and CSV ByteString (Row Text).
Of these two, Text is preferable because it handles Unicode, and a CSV file is unlikely to contain binary data. ByteString is more or less similar to [Word8], while Text is more similar to [Char], which is probably what you want. Hence, a should be Row Text (although Row ByteString will still work).
This means the result of the function call is ResourceT IO [Row Text]. We can't do much with this directly, but because ResourceT is a monad transformer, we can easily "pop" off the transformer layer with the function runResourceT. Thus,
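-- Note: this name clashes with Prelude's readFile at use sites, so hide the Prelude one or rename this.
readFile :: FilePath -> IO [Row Text]
readFile = runResourceT . readCSVFile defCSVSettings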
which is easily usable within, say, main to get at the [Row Text], which you can then iterate over with a map or a fold to get your hands on the individual rows.
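To make the "map or fold" step concrete, here is a minimal sketch that uses only base. It does not call csv-conduit at all: Row is redefined locally as a list of fields (an assumption standing in for the library's type), String stands in for Text, and sampleRows, firstColumn, parseQty and totalQty are hypothetical names invented for illustration.

```haskell
module Main where

import Data.List (foldl')

-- Local stand-in for the library's Row type (assumed to be a list of fields).
type Row a = [a]

-- Hypothetical data in place of the rows read from "file.csv".
sampleRows :: [Row String]
sampleRows =
  [ ["name", "qty"]
  , ["apple", "3"]
  , ["pear", "5"]
  ]

-- Collect the first field of every non-empty row.
firstColumn :: [Row a] -> [a]
firstColumn rows = [field | (field : _) <- rows]

-- Parse the second field as an Int, falling back to 0 if it is missing or malformed.
parseQty :: Row String -> Int
parseQty (_ : q : _) = case reads q of
  [(n, "")] -> n
  _         -> 0
parseQty _ = 0

-- Fold over the data rows (skipping the header) to sum the second column.
totalQty :: [Row String] -> Int
totalQty = foldl' (\acc row -> acc + parseQty row) 0 . drop 1

main :: IO ()
main = do
  print (firstColumn sampleRows)   -- first field of each row
  print (map length sampleRows)    -- number of fields per row
  print (totalQty sampleRows)      -- sum of the "qty" column
```

With real data, the same functions would be applied to the list returned by the runResourceT call instead of sampleRows, with Text in place of String.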
To run this sort of thing in GHCi, you absolutely have to specify the type explicitly. The reason is that the instance chosen for the result does not depend on any of the parameters; thus, for any given CSVSettings and FilePath, readCSVFile could return any number of different types, as long as m is an instance of MonadResource and there is an instance CSV ByteString a. Thus, we have to tell GHCi explicitly which type we want.
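The same situation arises with Prelude's read, whose result type is likewise not fixed by its argument. The sketch below is an analogy using only base, not csv-conduit code; asInt and asDouble are hypothetical names chosen for illustration.

```haskell
module Main where

-- read :: Read a => String -> a
-- The String argument alone does not determine a, so the caller must say
-- which result type is wanted, either with a signature or inline.

-- Result type fixed by a top-level signature.
asInt :: Int
asInt = read "42"

-- Same input string, different result type.
asDouble :: Double
asDouble = read "42"

main :: IO ()
main = do
  print asInt
  print asDouble
  -- Inline annotation, the form you would typically type at the GHCi prompt.
  print (read "[1,2,3]" :: [Int])
```

For the csv-conduit call, and assuming the 0.2.1.1 API described above, the equivalent annotation at the GHCi prompt would be along the lines of runResourceT (readCSVFile defCSVSettings "file.csv") :: IO [Row Text], after importing the modules that provide those names.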
Have you tried Text.CSV? It might be more appropriate if you're just starting out with Haskell, as it's much simpler. As for exploring new modules, you can just load them into GHCi; there's no need to write an additional file.