
Detecting locale from unicode string in c++

Tags:

c++

unicode

I have a string and I want to check whether its content is in English or Hindi (my local language). I figured out that the Unicode range for Hindi characters is U+0900 to U+097F (the Devanagari block).

What is the simplest way to find if the string has any characters in this range?

I can use std::string or Glib::ustring, whichever is more convenient.

Pallavi asked Aug 17 '09 13:08



1 Answer

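Here is how you do it with Glib::ustring:

#include <glibmm/ustring.h>

using Glib::ustring;

ustring x("सहस");    // Hindi string, UTF-8 encoded
bool is_hindi = false;
// Iterating a ustring yields Unicode code points (gunichar), not bytes.
for (ustring::iterator i = x.begin(); i != x.end(); ++i) {
    if (*i >= 0x0900 && *i <= 0x097F) {   // Devanagari block
        is_hindi = true;
        break;                            // one match is enough
    }
}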
Sahas answered Oct 15 '22 09:10
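
The question also allows std::string, which the answer above does not cover. The following is a minimal sketch of the same range check for that case, assuming the std::string holds UTF-8 encoded text. The function name contains_devanagari is hypothetical (it is not part of any library), and the decoder is deliberately simple: it skips malformed sequences one byte at a time rather than rejecting them, and it does not check for overlong encodings.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper (not from any library): returns true if the UTF-8
// encoded string contains at least one code point in U+0900..U+097F.
// Assumption: the input is UTF-8; malformed bytes are skipped, not reported.
bool contains_devanagari(const std::string& utf8) {
    const std::size_t n = utf8.size();
    std::size_t i = 0;
    while (i < n) {
        const unsigned char lead = static_cast<unsigned char>(utf8[i]);

        // Work out the sequence length and the payload bits of the lead byte.
        std::size_t len = 1;
        std::uint32_t cp = lead;
        if (lead >= 0xF0)      { len = 4; cp = lead & 0x07; }
        else if (lead >= 0xE0) { len = 3; cp = lead & 0x0F; }
        else if (lead >= 0xC0) { len = 2; cp = lead & 0x1F; }

        // Fold in the continuation bytes, each of the form 10xxxxxx.
        bool ok = (i + len <= n);
        for (std::size_t k = 1; ok && k < len; ++k) {
            const unsigned char c = static_cast<unsigned char>(utf8[i + k]);
            if ((c & 0xC0) != 0x80) ok = false;
            else cp = (cp << 6) | (c & 0x3F);
        }

        if (!ok) { ++i; continue; }   // truncated or malformed: resync on next byte
        if (cp >= 0x0900 && cp <= 0x097F) return true;
        i += len;
    }
    return false;
}
```

With this sketch, contains_devanagari("hello") returns false, while passing the UTF-8 bytes of "सहस" (written as "\xE0\xA4\xB8\xE0\xA4\xB9\xE0\xA4\xB8" to avoid depending on the source file encoding) returns true. If glibmm is already a dependency, the Glib::ustring loop is shorter because the library does the decoding.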