
Data.Text vs String

While the general opinion of the Haskell community seems to be that it's always better to use Text instead of String, the fact that the APIs of most maintained libraries are still String-oriented confuses the hell out of me. On the other hand, there are notable projects that consider String a mistake altogether and provide a Prelude in which every String-oriented function is replaced with its Text counterpart.

So are there any reasons for people to keep writing String-oriented APIs, other than backwards compatibility, compatibility with the standard Prelude, and the "switch-making inertia"? Are there any other drawbacks to Text compared to String?

Particularly, I'm interested in this because I'm designing a library and trying to decide which type to use to express error messages.

Nikita Volkov asked Oct 26 '13 15:10



2 Answers

My unqualified guess is that most library writers don't want to add more dependencies than necessary. Since String is part of literally every Haskell distribution (it's part of the language standard!), a library is a lot easier to adopt if it uses String and doesn't require its users to sort out the text package from Hackage.

It's one of those "design mistakes"* that you just have to live with unless you can convince most of the community to switch overnight. Just look at how long it has taken to get Applicative to be a superclass of Monad, a relatively minor but much wanted change, and imagine how long it would take to replace all the String things with Text.


To answer your more specific question: I would go with String unless you get noticeable performance benefits by using Text. Error messages are usually rather small one-off things so it shouldn't be a big problem to use String.

On the other hand, if you are the kind of ideological purist that eschews pragmatism for idealism, go with Text.
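To make the String option concrete, here is a minimal sketch of a library error type that carries its message as a String. The names `ParseError`, `renderError` and `parsePositive` are hypothetical and invented for illustration; only `base` is assumed.

```haskell
module Main where

-- Hypothetical error type for an imagined library (not from the question).
data ParseError = ParseError
  { errorLine    :: Int     -- line on which the problem was found
  , errorMessage :: String  -- short human-readable description
  } deriving (Eq, Show)

-- Render an error for display. Plain list concatenation is fine for
-- short, one-off messages like these.
renderError :: ParseError -> String
renderError (ParseError ln msg) = "line " ++ show ln ++ ": " ++ msg

-- Toy parser that reports failures through Either.
parsePositive :: Int -> String -> Either ParseError Int
parsePositive ln input =
  case reads input of
    [(n, "")] | n > 0 -> Right n
    [(_, "")]         -> Left (ParseError ln "expected a positive number")
    _                 -> Left (ParseError ln ("not a number: " ++ show input))

main :: IO ()
main = do
  print (parsePositive 1 "42")
  putStrLn (either renderError show (parsePositive 2 "abc"))
```

If Text later turns out to be the better fit, only the `errorMessage` field and `renderError` would need to change in a design like this, so the choice is not hard to revisit.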


* I put "design mistakes" in scare quotes because String being a list of Chars is a neat property: it makes strings easy to reason about and lets them work with all the existing list-operating functions.
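As a small illustration of that property, the sketch below processes Strings using nothing but ordinary list functions and pattern matching from `base`. The helper names `capitalise`, `titleCase` and `isPalindrome` are made up for this example.

```haskell
module Main where

import Data.Char (isAlpha, toUpper)

-- Pattern match on a String exactly as on any other list.
capitalise :: String -> String
capitalise []       = []
capitalise (c : cs) = toUpper c : cs

-- Keep only letters and spaces, then capitalise every word.
-- filter, words, map and unwords are all plain list functions.
titleCase :: String -> String
titleCase = unwords . map capitalise . words . filter keep
  where
    keep c = isAlpha c || c == ' '

-- reverse and (==) work on String because it is just [Char].
isPalindrome :: String -> Bool
isPalindrome s = s == reverse s

main :: IO ()
main = do
  putStrLn (titleCase "hello, list of chars!")
  print (isPalindrome "level")
```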

kqr answered Sep 20 '22 08:09


If your API is targeted at processing large amounts of character-oriented data and/or various encodings, then your API should use Text.

If your API is primarily for dealing with small one-off strings, then using the built-in String type should be fine.

Using String for large amounts of text will make applications using your API consume significantly more memory. Using it with foreign encodings could seriously complicate usage depending on how your API works.
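A rough sketch of what a Text-based API with a thin String wrapper might look like follows. It assumes the `text` and `bytestring` packages are available; `normalise`, `normaliseString` and `utf8Length` are hypothetical names, not part of any existing library.

```haskell
module Main where

import qualified Data.ByteString    as BS
import qualified Data.Text          as T
import qualified Data.Text.Encoding as TE
import qualified Data.Text.IO       as TIO

-- Hypothetical library function that works on Text internally:
-- lower-case the input and collapse runs of whitespace.
normalise :: T.Text -> T.Text
normalise = T.unwords . T.words . T.toLower

-- Convenience wrapper for callers that still hold a String.
-- T.pack and T.unpack each traverse the whole input.
normaliseString :: String -> String
normaliseString = T.unpack . normalise . T.pack

-- The encoding is explicit when Text is turned into bytes.
utf8Length :: T.Text -> Int
utf8Length = BS.length . TE.encodeUtf8

main :: IO ()
main = do
  TIO.putStrLn (normalise (T.pack "  Hello   WORLD "))
  putStrLn (normaliseString "Mixed  CASE   input")
  -- "na\239ve" is "naive" with U+00EF in place of the plain i
  let t = T.pack "na\239ve"
  print (T.length t, utf8Length t)
```

Keeping Text as the core type and converting only at the edges means the conversion cost is paid once per call instead of throughout the library.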

String is quite expensive: roughly 5N words, where N is the number of Chars in the String (three words for each list cell plus two for each boxed Char). A word has as many bits as the processor architecture (e.g. 32 or 64 bits), so on a 64-bit machine that is about 40 bytes per character: http://blog.johantibell.com/2011/06/memory-footprints-of-some-common-data.html
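The arithmetic behind that figure can be sketched as below. This is only a back-of-the-envelope estimate that takes the 5-words-per-Char figure at face value; it assumes that `Int` is one machine word wide (as it is in GHC), and the function names are invented for the example. Real programs can differ because of sharing and laziness.

```haskell
module Main where

import Foreign.Storable (sizeOf)

-- Bytes per machine word on the current platform.
-- Assumption: Int is exactly one word wide, which holds for GHC.
wordBytes :: Int
wordBytes = sizeOf (0 :: Int)

-- Estimated heap words for a fully evaluated String of n characters,
-- using the 5-words-per-Char figure quoted above.
stringWords :: Int -> Int
stringWords n = 5 * n

-- The same estimate expressed in bytes.
stringBytes :: Int -> Int
stringBytes n = stringWords n * wordBytes

main :: IO ()
main = do
  let n = 1000000 :: Int
  putStrLn ("word size in bytes: " ++ show wordBytes)
  putStrLn ("estimated bytes for " ++ show n ++ " Chars: " ++ show (stringBytes n))
```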

Alain O'Dea answered Sep 20 '22 08:09