I have downloaded and installed Camomile, so I am ready to use it. The question is: how should I use it?
In OCaml, for a default string, I just write
let s = "a string";;
but what about with Camomile? For example, if I want to construct the UTF-8 string こんにちは (Japanese for "hello", copied from Google Translate), how should I do it with Camomile?
Edit:
It is funny that people say OCaml can't support UTF-8, but I tried this code:
let s = "你好";;
let _ = print_string s; print_string "\n";;
and it worked in OCaml. But why? 你好 is Chinese, so how can OCaml print and handle it if everyone says OCaml 4.00.1 cannot handle UTF-8?
UTF-8 is an 8-bit variable-width encoding. The first 128 Unicode characters, when encoded in UTF-8, have the same byte representation as in ASCII.
UTF-8 encodes a character as a sequence of one, two, three, or four bytes. UTF-16 encodes a Unicode character as either two or four bytes. This distinction is reflected in their names: in UTF-8, the smallest unit of a character's representation is one byte, or eight bits.
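As a concrete illustration of the one-to-four-byte scheme, here is a minimal hand-rolled encoder in plain OCaml (stdlib only, no Camomile; the function name utf8_encode is my own, and a real program would use a library rather than this sketch):

```ocaml
(* Sketch: turn a Unicode code point into its 1-4 byte UTF-8 encoding. *)
let utf8_encode cp =
  let buf = Buffer.create 4 in
  if cp < 0x80 then
    (* 1 byte: 0xxxxxxx *)
    Buffer.add_char buf (Char.chr cp)
  else if cp < 0x800 then begin
    (* 2 bytes: 110xxxxx 10xxxxxx *)
    Buffer.add_char buf (Char.chr (0xC0 lor (cp lsr 6)));
    Buffer.add_char buf (Char.chr (0x80 lor (cp land 0x3F)))
  end else if cp < 0x10000 then begin
    (* 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx *)
    Buffer.add_char buf (Char.chr (0xE0 lor (cp lsr 12)));
    Buffer.add_char buf (Char.chr (0x80 lor ((cp lsr 6) land 0x3F)));
    Buffer.add_char buf (Char.chr (0x80 lor (cp land 0x3F)))
  end else begin
    (* 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx *)
    Buffer.add_char buf (Char.chr (0xF0 lor (cp lsr 18)));
    Buffer.add_char buf (Char.chr (0x80 lor ((cp lsr 12) land 0x3F)));
    Buffer.add_char buf (Char.chr (0x80 lor ((cp lsr 6) land 0x3F)));
    Buffer.add_char buf (Char.chr (0x80 lor (cp land 0x3F)))
  end;
  Buffer.contents buf

(* U+4F60 is 你; its UTF-8 encoding is the three bytes E4 BD A0. *)
let () = assert (utf8_encode 0x4F60 = "\xE4\xBD\xA0")
```

This ignores error cases (surrogates, out-of-range code points), which a real library handles for you.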
Within an identifier, you would also want to allow bytes >= 0x80, the range where UTF-8 lead and continuation bytes live. Most C string library routines still work with UTF-8, since they only scan for the terminating NUL character.
Valid UTF-8 has a specific binary format. A single-byte UTF-8 character is always of the form 0xxxxxxx, where x is any binary digit. A two-byte UTF-8 character is always of the form 110xxxxx 10xxxxxx.
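The bit patterns above can be checked mechanically. Here is a small sketch in plain OCaml (the function name classify is my own) that buckets a byte by those patterns:

```ocaml
(* Sketch: classify a byte according to the UTF-8 bit patterns. *)
let classify b =
  if b land 0x80 = 0x00 then "ASCII (0xxxxxxx)"
  else if b land 0xC0 = 0x80 then "continuation (10xxxxxx)"
  else if b land 0xE0 = 0xC0 then "2-byte lead (110xxxxx)"
  else if b land 0xF0 = 0xE0 then "3-byte lead (1110xxxx)"
  else if b land 0xF8 = 0xF0 then "4-byte lead (11110xxx)"
  else "invalid in UTF-8"

(* The first byte of "你" is 0xE4, a 3-byte lead. *)
let () = print_endline (classify (Char.code "你".[0]))
```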
Here is a short presentation of the different actors:
ASCII is both a set of characters (there are 128 of them) and a code to represent them (on 7 bits).
Unicode is a set of characters (there are a lot more than 127).
UTF-8 is a code to represent unicode characters.
Your terminal. It interprets bytes output by your program as UTF-8 encoded characters and displays the corresponding unicode characters.
OCaml processes sequences of bytes (OCaml uses the name char, but it is misleading and the name byte would be more appropriate). So if OCaml outputs the sequence of bytes corresponding to the UTF-8 encoding of "你好", your terminal will interpret it as a UTF-8 string and will display 你好. But for OCaml, "你好" is just a sequence of 6 bytes.
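This is easy to check directly: String.length counts bytes, not characters, so the two-character string comes out as 6.

```ocaml
(* "你好" is two characters, but each is three UTF-8 bytes. *)
let s = "你好";;
print_string s; print_newline ();;               (* the terminal decodes the bytes *)
print_int (String.length s); print_newline ();;  (* prints 6, not 2 *)
```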
TörökEdwin told you everything you need to know, I think. UTF-8 is specifically designed as a way to store Unicode values (code points) in a series of 8-bit bytes, for code that is used to dealing with ASCII C strings. Since OCaml strings are sequences of 8-bit bytes, there is no problem storing a UTF-8 value there. If the program you use to create your OCaml source handles UTF-8, it will have no trouble producing a string containing a UTF-8 value. You don't need to do anything special to make that happen. (As I said, I've done this many times myself.)
If you don't need to process the value, the OCaml I/O functions can also write out such a value (or read one in), and if your display's encoding is UTF-8 (which is what I use), it will display correctly. But most often you will need to process your values. If you change your code to, for example, write out the length of the string, you might start to see why you need a special library for handling UTF-8.
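To make the length point concrete, here is a hedged, stdlib-only sketch (not Camomile's API; the function name utf8_length is my own) that counts characters by skipping continuation bytes, which works only if the input is already valid UTF-8:

```ocaml
(* Sketch: count UTF-8 characters by counting bytes that are NOT
   continuation bytes (continuation bytes have the form 10xxxxxx).
   Assumes the string is valid UTF-8; a real program would use a
   library such as Camomile instead of this. *)
let utf8_length s =
  let n = ref 0 in
  String.iter
    (fun c -> if Char.code c land 0xC0 <> 0x80 then incr n)
    s;
  !n

let () =
  Printf.printf "bytes=%d chars=%d\n"
    (String.length "你好") (utf8_length "你好")
(* bytes=6 chars=2 *)
```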
If you wonder why a certain Unicode string is represented as a certain series of bytes in the UTF-8 encoding, you just need to read up on UTF-8. The Wikipedia article on UTF-8 is a reasonable place to start.