
How do I find out how many bytes a character has?

How do I find out how many bytes a single character takes up?

HELP asked May 22 '11 07:05



2 Answers

If you want to find out how many bytes the first character of a UTF-8 encoded PHP string occupies:

print strlen(mb_substr($string, 0, 1, "utf-8"));

strlen() returns the raw byte length, while mb_substr() extracts characters according to the given charset/encoding. In this example it extracts one character starting at position 0.
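As a rough sketch of the same idea wrapped in a reusable function: the helper name utf8_char_bytes() is made up for illustration, and it assumes the mbstring extension is loaded and PHP 7 or later (for the "\u{...}" escapes and type declarations).

```php
<?php
// Hypothetical helper (not part of PHP): byte length of the character
// at position $index in a UTF-8 encoded string.
function utf8_char_bytes(string $s, int $index = 0): int
{
    // mb_substr() picks out one character; strlen() counts its raw bytes.
    return strlen(mb_substr($s, $index, 1, 'UTF-8'));
}

echo utf8_char_bytes("a"), "\n";          // 1 (ASCII letter)
echo utf8_char_bytes("\u{00E9}"), "\n";   // 2 (é)
echo utf8_char_bytes("\u{20AC}"), "\n";   // 3 (€)
echo utf8_char_bytes("\u{1F600}"), "\n";  // 4 (emoji outside the BMP)
```

The escapes are used instead of literal characters so the result does not depend on how the source file itself is saved.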

mario answered Sep 22 '22 14:09



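  • ASCII is 7 bits, normally stored in 1 byte.
  • Single-byte encodings for Western European and similar languages (e.g. ISO-8859-1, Windows-1252) use 8 bits (1 byte) per character.
  • Legacy East Asian encodings (e.g. Shift_JIS, GBK, Big5) use 1 or 2 bytes per character.
  • Unicode depends on the encoding form: UTF-8 uses 1 to 4 bytes, UTF-16 uses 2 or 4 bytes, and UTF-32 always uses 4 bytes.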

How a character is stored and represented depends on the character encoding in use, which in turn depends on the programming language and the platform you are using.
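To see the encoding dependence in PHP, here is a minimal sketch that converts the same character into several encodings and counts the bytes. The function name bytes_in_encoding() is hypothetical, and the sketch assumes the mbstring extension and PHP 7 or later.

```php
<?php
// Hypothetical helper: byte length of $char (given as UTF-8)
// after converting it to $encoding.
function bytes_in_encoding(string $char, string $encoding): int
{
    return strlen(mb_convert_encoding($char, $encoding, 'UTF-8'));
}

$eAcute = "\u{00E9}"; // é
foreach (['ISO-8859-1', 'UTF-8', 'UTF-16LE', 'UTF-32LE'] as $enc) {
    // Prints 1, 2, 2 and 4 bytes respectively.
    printf("%-10s %d\n", $enc, bytes_in_encoding($eAcute, $enc));
}

$kanji = "\u{65E5}"; // 日
printf("%-10s %d\n", 'SJIS', bytes_in_encoding($kanji, 'SJIS'));   // 2
printf("%-10s %d\n", 'UTF-8', bytes_in_encoding($kanji, 'UTF-8')); // 3
```

The explicit little-endian names (UTF-16LE, UTF-32LE) are used so that the count does not depend on how byte order is handled.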

Oded answered Sep 21 '22 14:09
