
Which Haskell string type to use for Unicode data when fast (O(1)) indexing is required?

After reading about all five commonly used string types in Haskell (String, Text, Text.Lazy, ByteString, ByteString.Lazy), I am rather at my wits' end:

What I need is a string type that is immutable (I read it once from a file and never change it), supports fast (O(1)) indexing, and can be consumed by code point rather than by the individual, potentially incomplete bytes that make up a code point.

I could actually live with a Data.ByteString.UTF32: with that representation, I would never need to worry about multi-byte encodings again.

Will I have to write such a module myself or, by any chance, has someone else already come to the same conclusion and done it?

BitTickler asked Jul 30 '19 01:07


1 Answer

That sounds just like an array of Char: Data.Vector.Unboxed.Vector Char.

https://hackage.haskell.org/package/vector-0.12.0.3/docs/Data-Vector-Unboxed.html
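A minimal sketch of how this could look, assuming the vector, text, and bytestring packages are available and the input file is UTF-8 encoded. The names CharVec, fromString, and readCharVec are hypothetical helpers made up for illustration, not part of any library. As far as I know, an unboxed vector stores each Char in 4 bytes, so this is in effect the UTF-32 layout the question asks for.

```haskell
import qualified Data.ByteString as BS
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE
import qualified Data.Vector.Unboxed as VU

-- | Hypothetical alias: an immutable, unboxed array with one Char
-- (one Unicode code point) per element.
type CharVec = VU.Vector Char

-- | Hypothetical helper: build the vector from an ordinary String.
fromString :: String -> CharVec
fromString = VU.fromList

-- | Hypothetical helper: read a file once and decode it into code points.
-- Assumes the file is valid UTF-8; decodeUtf8 throws an exception otherwise.
readCharVec :: FilePath -> IO CharVec
readCharVec path = do
  bytes <- BS.readFile path
  pure (VU.fromList (T.unpack (TE.decodeUtf8 bytes)))

main :: IO ()
main = do
  let v = fromString "naïve λ→∀ 😀"
  print (VU.length v)                 -- counts code points, not bytes
  print (v VU.! 6)                    -- O(1) bounds-checked indexing
  print (v VU.!? 100)                 -- Nothing when the index is out of range
  print (VU.toList (VU.slice 6 3 v))  -- O(1) slice that shares the buffer
```

The trade-off is memory: every code point takes 4 bytes regardless of how short its UTF-8 encoding was, which is the price of constant-time indexing by code point. Note that a Char is a code point, not a user-perceived character, so combining sequences still span several elements.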

Li-yao Xia answered Oct 23 '22 13:10