
How does MySQL varchar know how many bytes indicate the length?

The MySQL manual and several StackOverflow answers say that a VARCHAR value is stored with a length prefix of:

  • 1 byte for varchars with 0-255 characters in them
  • 2 bytes for varchars with more than 255 characters in them.

The first part makes sense. A single byte can store 256 different values, i.e. 0 through 255.

What I am trying to figure out is how MySQL knows how many bytes indicate the length.

Imagine a 255-char varchar starting with the following bytes: [255][w][o][r][d]~

According to the manual, only the first byte is used to indicate the length in this scenario. When reading the field, MySQL will somehow have to know that this is the case here, and that the second byte is not part of the length.

Now imagine a 256-char varchar starting with the following bytes: [255][1][w][o][r][d]~

Now MySQL miraculously knows that it should interpret the first two bytes as the length, when reading the field.

How does it distinguish? The only foolproof way I have come up with is to interpret only the first byte as length, then determine if the text length matches (in its current encoding), and if not, we know that the first two bytes must be the length.

Timo asked Apr 08 '15 13:04



1 Answer

It is decided when the column is defined, not per value. Every length prefix in a particular VARCHAR column has the same size in bytes: the column uses either 1 byte or 2 bytes for all of its values, depending on the defined size in characters and the character set.

Any VARCHAR column defined such that its values might require more than 255 bytes uses 2 bytes to store the length. MySQL isn't going to use 1 byte for some values in a column and 2 bytes for others.
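To illustrate why no guessing is needed when reading a value, here is a minimal Python sketch of a column whose prefix width is fixed in its metadata. The `VarcharColumn` class, its field names, and the little-endian byte order are assumptions made for illustration; this is a simplified model of the rule described above, not MySQL's actual storage code, and real on-disk layouts are engine-specific.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VarcharColumn:
    """Hypothetical column metadata; prefix_bytes is fixed at definition time."""
    name: str
    prefix_bytes: int  # 1 or 2, the same for every value in the column

    def encode(self, payload: bytes) -> bytes:
        # Write the payload length using the column's fixed prefix width.
        # Little-endian order is an assumption of this sketch.
        return len(payload).to_bytes(self.prefix_bytes, "little") + payload

    def decode(self, raw: bytes) -> bytes:
        # The reader takes the prefix width from the column definition,
        # so it never has to inspect the data to find out.
        length = int.from_bytes(raw[:self.prefix_bytes], "little")
        start = self.prefix_bytes
        return raw[start:start + length]


short_col = VarcharColumn("short_col", prefix_bytes=1)
long_col = VarcharColumn("long_col", prefix_bytes=2)

print(short_col.encode(b"word"))  # one length byte, then the payload
print(long_col.encode(b"word"))   # two length bytes, then the payload
```

The same four-byte payload gets a different prefix in each column, and each column's `decode` reads it back correctly because the width comes from the definition rather than from the bytes themselves.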

The MySQL documentation on CHAR and VARCHAR types states this pretty clearly (emphasis mine):

A column uses one length byte if values require no more than 255 bytes, two length bytes if values may require more than 255 bytes.

If you declare a VARCHAR(255) column to use the utf8 character set, it's still going to use 2 bytes for the length prefix, not 1. A utf8 character can take up to 3 bytes, so a 255-character value may need up to 765 bytes, which is more than a single length byte can express.
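The arithmetic can be sketched in Python. The helper names and the small character-set table below are illustrative assumptions, not part of any MySQL API; the per-character maxima are 1 byte for latin1, 3 for MySQL's utf8, and 4 for utf8mb4.

```python
# Maximum bytes per character for a few MySQL character sets.
MAX_BYTES_PER_CHAR = {"latin1": 1, "utf8": 3, "utf8mb4": 4}


def max_value_bytes(declared_chars: int, charset: str) -> int:
    """Worst-case byte length of a value in VARCHAR(declared_chars)."""
    return declared_chars * MAX_BYTES_PER_CHAR[charset]


def length_prefix_bytes(declared_chars: int, charset: str) -> int:
    """Apply the manual's rule: 1 byte if values fit in 255 bytes, else 2."""
    return 1 if max_value_bytes(declared_chars, charset) <= 255 else 2


for charset in MAX_BYTES_PER_CHAR:
    print(charset, max_value_bytes(255, charset), length_prefix_bytes(255, charset))

# The euro sign is a 3-byte character in UTF-8, so 255 of them need 765 bytes.
worst_case = ("€" * 255).encode("utf-8")
print(len(worst_case))
```

Under this rule a utf8 column keeps the 1-byte prefix only up to VARCHAR(85), since 85 × 3 = 255 bytes; VARCHAR(86) already crosses the threshold.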

Marcus Adams answered Oct 15 '22 04:10