
Problems with Cyrillic when searching in a string

Tags:

c

I have to write a character counter. If I search for s in the string, the count is 3, but if I search for a Cyrillic character (н), something goes wrong. I tried searching for code 237, which I found in the ASCII table at http://ascii.org.ru/ascii.pdf.

How can I fix it?

#include <stdio.h>
#include <string.h>

int main () {
  char str[] = "This is a string. нннн";
  char * pch;
  int count = 0;

  pch = strchr(str, 's');

  while (pch != NULL) {
    count++;
    pch = strchr(pch + 1, 's');
  }
  printf("%i", count);
  return 0;
}
rel1x asked Feb 11 '23 07:02



1 Answer

I would suggest switching to wchar_t and the wide-character functions (wcschr(), etc.).

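Character data in the program would then be stored in 32-bit (Linux) or 16-bit (Windows) units instead of 8-bit ones, so a Cyrillic letter such as н fits in a single wchar_t and can be found with a single-character search. This allows non-ASCII text to be handled properly.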

Also, if you need to work with UTF-8 (multibyte strings), mbstowcs() can convert the data to wchar_t. The conversion follows the current locale, so call setlocale() first.

Full example:

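#include <stdio.h>
#include <wchar.h>

int main(void) {
  /* Wide string literal: each element is one wchar_t. */
  wchar_t str[] = L"This is a string. нннн";
  wchar_t *pch;
  int count = 0;

  /* Find the first occurrence of the wide character. */
  pch = wcschr(str, L'н');

  while (pch != NULL) {
    count++;
    /* Continue searching just past the previous match. */
    pch = wcschr(pch + 1, L'н');
  }
  wprintf(L"%d\n", count);
  return 0;
}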
kestasx answered Feb 13 '23 21:02
