I have created a file with UTF-8 encoding, but I don't understand the rules for the size it takes up on disk. Here is my complete research:
First I created the file with a single Hindi letter 'क', and the file size on Windows 7 was 8 bytes.
Then with two letters, 'कक', the file size was 11 bytes.
Then with three letters, 'ककक', the file size was 14 bytes.
Can someone please explain to me why it shows these sizes?
The first three bytes are used for the BOM (Byte Order Mark): EF BB BF.
Then, the bytes E0 A4 95 encode the letter क.
Then the bytes 0D 0A encode a carriage return followed by a line feed (CR LF), the Windows line ending.
Total: 3 + 3 + 2 = 8 bytes. For each letter क you add, you need three more bytes, which gives 11 bytes for two letters and 14 bytes for three.
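If you want to check this arithmetic yourself, here is a minimal Python sketch. The helper name `utf8_file_size` is made up for illustration, and it assumes the editor wrote a UTF-8 BOM and a single trailing CR LF, which is what the byte dump above suggests.

```python
import codecs

def utf8_file_size(text, bom=True, newline="\r\n"):
    """Estimate the on-disk size of a UTF-8 text file (hypothetical helper).

    Assumes an optional BOM at the start and one trailing line ending.
    """
    size = len(text.encode("utf-8"))      # 3 bytes per 'क' (E0 A4 95)
    size += len(newline.encode("utf-8"))  # 2 bytes for CR LF (0D 0A)
    if bom:
        size += len(codecs.BOM_UTF8)      # 3 bytes (EF BB BF)
    return size

# Show the raw bytes of one letter, then the sizes for 1, 2 and 3 letters.
print("क".encode("utf-8").hex(" "))
for n in (1, 2, 3):
    print(n, utf8_file_size("क" * n))
```

For one, two and three letters this reports 8, 11 and 14 bytes, matching the sizes in the question. Passing `bom=False` or `newline=""` shows how the size changes if the file is saved without a BOM or without a trailing line break.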