
What is Unicode, and how does encoding work? [closed]

A few hours ago I was reading a C programming book and came across the terms "character encoding" and "Unicode". I started googling for information about Unicode and learned that the Unicode character set contains every character from every language, and that UTF-8, UTF-16 and UTF-32 can encode the characters listed in that character set.

But I was not able to understand how it works.
Does Unicode depend on the operating system?
How is it related to software and programs?
Is UTF-8 a piece of software that was installed on my computer when I installed the operating system?
Or is it related to hardware?
And how does a computer encode things?

I find all of this very confusing. Please answer in detail. I am new to these things, so please keep that in mind when you answer.

Thank you.

Noob.Alone.Programmer asked Jul 07 '13 12:07



1 Answer

I have written about this extensively in What Every Programmer Absolutely, Positively Needs To Know About Encodings And Character Sets To Work With Text. Here are some highlights:

  • encodings are plentiful; each encoding defines how a "character" like "A" is represented as bits and bytes
  • most encodings only specify this for a small number of selected characters; for example all (or at least most) characters needed to write English or Czech; single byte encodings typically support a set of up to 256 characters
  • Unicode is one large standard effort which has catalogued and specified a number ⟷ character relationship (each number is called a code point) for virtually all characters and symbols of every major language in use, which comes to well over a hundred thousand characters
  • UTF-8, UTF-16 and UTF-32 are different sub-standards for how to encode this ginormous catalog of numbers to bytes, each with different size tradeoffs
  • software needs to specifically support Unicode and its UTF-* encodings, just like it needs to support any other kind of specialized encoding; these days most of the work is done by the OS, which exposes supporting functions to applications
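
To make the points above a bit more concrete, here is a minimal sketch in Python (chosen only because it is short; the same idea applies in C or any other language). The helper names `describe` and `try_single_byte` and the sample characters are made up for illustration. It shows the Unicode number of a character, the bytes that UTF-8, UTF-16 and UTF-32 produce for it, and what happens when a single-byte encoding such as Latin-1 has no slot for the character.

```python
# Illustrative sketch: helper names and sample characters are arbitrary choices.
# Requires Python 3.8+ (for the separator argument of bytes.hex).

def describe(ch):
    """Return the Unicode code point of ch and its bytes in three UTF encodings."""
    return {
        "code_point": f"U+{ord(ch):04X}",           # the number Unicode assigns
        "utf-8": ch.encode("utf-8").hex(" "),        # 1 to 4 bytes per character
        "utf-16": ch.encode("utf-16-be").hex(" "),   # 2 or 4 bytes (big-endian, no BOM)
        "utf-32": ch.encode("utf-32-be").hex(" "),   # always 4 bytes (big-endian, no BOM)
    }

def try_single_byte(ch, encoding="latin-1"):
    """Return the hex bytes of ch in a single-byte encoding, or None if unsupported."""
    try:
        return ch.encode(encoding).hex(" ")
    except UnicodeEncodeError:
        # The encoding simply has no byte value assigned to this character.
        return None

for ch in ["A", "é", "€", "😀"]:
    print(ch, describe(ch), "latin-1:", try_single_byte(ch))
```

Note that the same character always has the same code point, but a different byte sequence in each encoding; that is the size tradeoff mentioned above. No hardware is involved: the encoding is just a rule that software applies when it turns text into bytes and back.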
deceze answered Oct 14 '22 15:10
