
Portable and simple unicode string library for C/C++?

Tags: c++, c, unicode

I'm looking for a portable, easy-to-use string library for C/C++ that helps me work with Unicode input and output. Ideally, it would store its strings in memory as UTF-8 and let me convert strings from ASCII to UTF-8/UTF-16 and back. I don't need much more than that (OK, a liberal license wouldn't hurt). I have seen that C++ comes with a <locale> header, but it seems to work on wchar_t only, which may or may not be UTF-16 encoded, and I'm not sure how good it actually is.

Example use cases: on Windows, the Unicode APIs expect UTF-16 strings, so I need to convert ASCII or UTF-8 strings before passing them to the API. The same goes for XML parsing: the input may arrive as UTF-16, but internally I only want to process UTF-8 (and if I ever switch to UTF-16 internally, I'll need a conversion in the other direction anyway).

So far, I've taken a look at ICU, which is quite huge. Moreover, it wants to be built using its own project files, while I'd prefer a library that either has a CMake project or is easy to build (something like: compile all these .c files, link, and good to go), rather than shipping something as large as ICU along with my application.

Do you know of such a library that is also actively maintained? After all, this seems to be a pretty basic problem.

Anteru asked Jan 11 '09 17:01



1 Answer

UTF8-CPP seems to be exactly what you want.
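
To make the requested conversion concrete, here is a minimal, self-contained sketch of UTF-8 to UTF-16 and back in standard C++11. It is not UTF8-CPP's implementation; the library offers this functionality through its own iterator-based functions such as utf8::utf8to16 and utf8::utf16to8. The names utf8_to_utf16 and utf16_to_utf8 below are hypothetical helpers chosen for illustration, and the sketch assumes a compiler with char16_t and std::u16string.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical helper: decode UTF-8 and re-encode as UTF-16.
// Throws std::invalid_argument on malformed input.
std::u16string utf8_to_utf16(const std::string& in) {
    // Smallest code point that legitimately needs a sequence of each length.
    static const std::uint32_t min_cp[] = {0, 0, 0x80, 0x800, 0x10000};
    std::u16string out;
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char lead = static_cast<unsigned char>(in[i]);
        std::uint32_t cp = 0;
        std::size_t len = 0;
        if (lead < 0x80)              { cp = lead;        len = 1; }
        else if ((lead >> 5) == 0x06) { cp = lead & 0x1F; len = 2; }
        else if ((lead >> 4) == 0x0E) { cp = lead & 0x0F; len = 3; }
        else if ((lead >> 3) == 0x1E) { cp = lead & 0x07; len = 4; }
        else throw std::invalid_argument("invalid UTF-8 lead byte");
        if (len > in.size() - i)
            throw std::invalid_argument("truncated UTF-8 sequence");
        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char c = static_cast<unsigned char>(in[i + k]);
            if ((c & 0xC0) != 0x80)
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (c & 0x3F);
        }
        // Reject overlong encodings, surrogates, and out-of-range values.
        if (cp < min_cp[len] || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            throw std::invalid_argument("invalid code point in UTF-8 input");
        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            // Code points above the BMP become a surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
        i += len;
    }
    return out;
}

// Hypothetical helper: decode UTF-16 and re-encode as UTF-8.
// Throws std::invalid_argument on unpaired surrogates.
std::string utf16_to_utf8(const std::u16string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        std::uint32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF) {
            if (i + 1 >= in.size())
                throw std::invalid_argument("unpaired high surrogate");
            const std::uint32_t low = in[i + 1];
            if (low < 0xDC00 || low > 0xDFFF)
                throw std::invalid_argument("unpaired high surrogate");
            cp = 0x10000 + ((cp - 0xD800) << 10) + (low - 0xDC00);
            ++i;  // consume the low surrogate as well
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            throw std::invalid_argument("unpaired low surrogate");
        }
        if (cp < 0x80) {
            out.push_back(static_cast<char>(cp));
        } else if (cp < 0x800) {
            out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else if (cp < 0x10000) {
            out.push_back(static_cast<char>(0xE0 | (cp >> 12)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else {
            out.push_back(static_cast<char>(0xF0 | (cp >> 18)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 12) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        }
    }
    return out;
}
```

Plain ASCII input passes through unchanged, since ASCII is a subset of UTF-8. On Windows, where wchar_t is 16 bits wide, the resulting code units would still have to be copied into a wchar_t buffer before being handed to a wide-character API; that step is left out here.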

Nemanja Trifunovic answered Oct 04 '22 09:10
