
Convert a Lazy ByteString to a strict ByteString

I have a function that takes a lazy ByteString and should return a list of strict ByteStrings (the laziness should move to the list type of the output).

import qualified Data.ByteString as B
import qualified Data.ByteString.Lazy as L
csVals :: L.ByteString -> [B.ByteString]

I want to do this for a few reasons: several lexing functions require strict ByteStrings, and I can guarantee that the strict ByteStrings returned by csVals above are very small.

How do I go about "strictifying" ByteStrings without chunking them?

Update0

I want to take a lazy ByteString and make one strict ByteString containing all its data.

Matt Joiner asked Oct 19 '11 00:10


2 Answers

The bytestring package now exports a toStrict function:

http://hackage.haskell.org/packages/archive/bytestring/0.10.2.0/doc/html/Data-ByteString-Lazy.html#v:toStrict

This might not be exactly what you want, but it certainly answers the question in the title of this post :)
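As a minimal sketch of how toStrict might be used (the `sample` data and the names `sample` and `strictSample` are made up for illustration and are not from the question):

```haskell
import qualified Data.ByteString       as B
import qualified Data.ByteString.Char8 as BC
import qualified Data.ByteString.Lazy  as BL

-- Hypothetical sample data: a lazy ByteString built from three chunks.
sample :: BL.ByteString
sample = BL.fromChunks (map BC.pack ["alpha,", "beta,", "gamma"])

-- BL.toStrict copies every chunk into one contiguous strict ByteString.
-- Note that this forces the whole lazy input into memory.
strictSample :: B.ByteString
strictSample = BL.toStrict sample
```

Because the result is a single buffer, this is probably only appropriate when the lazy input is known to be small enough to hold in memory at once.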

ocharles answered Sep 22 '22 14:09


As @sclv said in the comments above, a lazy ByteString is just a list of strict ByteStrings. There are two approaches to converting a lazy ByteString to a strict one (source: the Haskell mailing list discussion about adding a toStrict function). The relevant code from that email thread is below.

First, the relevant imports:

import qualified Data.ByteString               as B
import qualified Data.ByteString.Internal      as BI
import qualified Data.ByteString.Lazy          as BL
import qualified Data.ByteString.Lazy.Internal as BLI
import           Foreign.ForeignPtr
import           Foreign.Ptr

Approach 1 (same as @sclv):

toStrict1 :: BL.ByteString -> B.ByteString
toStrict1 = B.concat . BL.toChunks
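
The chunk-list view that Approach 1 relies on can be seen directly. This is only an illustrative sketch; the names `chunked`, `chunkLengths`, and `joined` and the sample strings are assumptions, not part of the mailing list code:

```haskell
import qualified Data.ByteString       as B
import qualified Data.ByteString.Char8 as BC
import qualified Data.ByteString.Lazy  as BL

-- Hypothetical input: a lazy ByteString assembled from three strict chunks.
chunked :: BL.ByteString
chunked = BL.fromChunks [BC.pack "foo", BC.pack "bar", BC.pack "baz"]

-- BL.toChunks exposes the underlying strict chunks; here we take their lengths.
chunkLengths :: [Int]
chunkLengths = map B.length (BL.toChunks chunked)

-- Concatenating the chunks yields one strict ByteString, as in toStrict1.
joined :: B.ByteString
joined = B.concat (BL.toChunks chunked)
```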

Approach 2:

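toStrict2 :: BL.ByteString -> B.ByteString
toStrict2 BLI.Empty = B.empty
-- A single chunk is already strict, so return it without copying.
toStrict2 (BLI.Chunk c BLI.Empty) = c
-- Otherwise allocate one buffer of the total length and copy each chunk into it.
toStrict2 lb = BI.unsafeCreate len $ go lb
  where
    -- Total length, computed in one pass over the chunks.
    len = BLI.foldlChunks (\l sb -> l + B.length sb) 0 lb

    go  BLI.Empty                   _   = return ()
    -- PS fp s l: the chunk's buffer, offset, and length.
    go (BLI.Chunk (BI.PS fp s l) r) ptr =
        withForeignPtr fp $ \p -> do
            BI.memcpy ptr (p `plusPtr` s) (fromIntegral l)
            go r (ptr `plusPtr` l)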

If performance is a concern, I recommend checking out the email thread above, which also includes criterion benchmarks. In those benchmarks, toStrict2 is faster than toStrict1.
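
For the csVals function from the question, one possible sketch is to split the lazy input first and strictify each small piece afterwards. This assumes the values are separated by an ASCII comma and ignores quoting and escaping; the `comma` constant is a hypothetical choice, since the question does not say what the delimiter is:

```haskell
import qualified Data.ByteString      as B
import qualified Data.ByteString.Lazy as BL
import           Data.Word            (Word8)

-- Assumed delimiter: ASCII comma (not stated in the question).
comma :: Word8
comma = 44

-- Split lazily on the delimiter, then convert each field to a strict
-- ByteString. Only the individual fields are copied, so each strict
-- result stays as small as the field it came from.
csVals :: BL.ByteString -> [B.ByteString]
csVals = map BL.toStrict . BL.split comma
```

The list itself stays lazy, so fields should only be strictified as the consumer demands them. BL.toStrict could be swapped for toStrict1 or toStrict2 above on older versions of bytestring.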

Sal answered Sep 25 '22 14:09