
Many types of String (ByteString)

I wish to compress my application's network traffic.

According to the (latest?) "Haskell Popularity Rankings", zlib seems to be a pretty popular solution. zlib's interface uses ByteStrings:

compress :: ByteString -> ByteString
decompress :: ByteString -> ByteString

I am using regular Strings, which is also the data type used by read, show, and Network.Socket:

sendTo :: Socket -> String -> SockAddr -> IO Int
recvFrom :: Socket -> Int -> IO (String, Int, SockAddr)

So to compress my strings, I need some way to convert a String to a ByteString and vice-versa. With hoogle's help, I found:

Data.ByteString.Char8 pack :: String -> ByteString

Trying to use it:

Prelude Codec.Compression.Zlib Data.ByteString.Char8> compress (pack "boo")

<interactive>:1:10:
    Couldn't match expected type `Data.ByteString.Lazy.Internal.ByteString'
           against inferred type `ByteString'
    In the first argument of `compress', namely `(pack "boo")'
    In the expression: compress (pack "boo")
    In the definition of `it': it = compress (pack "boo")

This fails, presumably because there are different types of ByteString?

So basically:

  • Are there several types of ByteString? What types, and why?
  • What's "the" way to convert Strings to ByteStrings?

Btw, I found that it does work with Data.ByteString.Lazy.Char8's ByteString, but I'm still intrigued.

asked Sep 20 '09 19:09 by yairchu



2 Answers

There are two kinds of bytestrings: strict (defined in Data.ByteString.Internal) and lazy (defined in Data.ByteString.Lazy.Internal). zlib uses lazy bytestrings, as you've discovered.
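
As a minimal sketch of the distinction, assuming only the bytestring package: each of the two Char8 modules exports its own pack, and they build different types. A lazy bytestring is essentially a sequence of strict chunks, so a strict value can be wrapped as a single-chunk lazy one before it is handed to a function such as zlib's compress. The names strictBoo, lazyBoo, and strictToLazy are made up for illustration.

```haskell
import qualified Data.ByteString            as BS
import qualified Data.ByteString.Char8      as SC
import qualified Data.ByteString.Lazy       as LBS
import qualified Data.ByteString.Lazy.Char8 as LC

-- Strict: the whole string lives in one contiguous block of bytes.
strictBoo :: BS.ByteString
strictBoo = SC.pack "boo"

-- Lazy: a lazily produced sequence of strict chunks.
lazyBoo :: LBS.ByteString
lazyBoo = LC.pack "boo"

-- Hypothetical helper: wrap a strict bytestring as a single-chunk lazy one.
strictToLazy :: BS.ByteString -> LBS.ByteString
strictToLazy s = LBS.fromChunks [s]
```

With these definitions, compress (strictToLazy strictBoo) should type-check where compress strictBoo does not. Note that both Char8 pack functions keep only the low 8 bits of each Char, so they are only a safe conversion for text in the 0-255 range.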

answered Sep 28 '22 09:09 by Alexey Romanov


The function you're looking for is:

import qualified Data.ByteString as BS
import qualified Data.ByteString.Lazy as LBS

lazyToStrictBS :: LBS.ByteString -> BS.ByteString
lazyToStrictBS x = BS.concat $ LBS.toChunks x

I expect it can be written more concisely without the x (i.e. point-free), but I'm new to Haskell.
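
The point-free form mentioned above could look like the following sketch. It only composes the same two library functions, so it should behave identically to the version with the explicit x.

```haskell
import qualified Data.ByteString      as BS
import qualified Data.ByteString.Lazy as LBS

-- Point-free: split the lazy bytestring into its strict chunks,
-- then concatenate those chunks into one strict bytestring.
lazyToStrictBS :: LBS.ByteString -> BS.ByteString
lazyToStrictBS = BS.concat . LBS.toChunks
```

Versions of the bytestring package from 0.10 onward also export LBS.toStrict, which does the same job, so the hand-written helper is mainly useful with older versions.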

answered Sep 28 '22 10:09 by fadedbee