Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

mysql varchar byte length 255 issue

Tags:

mysql

According to the mysql documentation:

A column uses one length byte if values require no more than 255 bytes, two length bytes if values may require more than 255 bytes

AND

The maximum row size constrains the number of columns because the total width of all columns cannot exceed this size. For example, utf8 characters require up to three bytes per character, so for a CHAR(255) CHARACTER SET utf8 column, the server must allocate 255 × 3 = 765 bytes per value. Consequently, a table cannot contain more than 65,535 / 765 = 85 such columns.

For clarity, what then is the maximum value I can set in the varchar argument so it only uses 1 byte to store its length?

like image 218
David19801 Avatar asked Dec 27 '10 14:12

David19801


2 Answers

From the MySQL documentation:

The CHAR and VARCHAR types are declared with a length that indicates the maximum number of characters you want to store. For example, CHAR(30) can hold up to 30 characters.

A [VARCHAR] column uses one length byte if values require no more than 255 bytes, two length bytes if values may require more than 255 bytes.

This makes the answer to your question depend on the character encoding.

With a single-byte encoding like windows-1252 (which MySQL calls latin1), the character length is the same as the byte length, so you can use a VARCHAR(255).

With UTF-8, a VARCHAR(N) may require up to 3N bytes, as would be the case if all characters were in the range U+0800 to U+FFFF. Thus, a VARCHAR(85) is the greatest that ensures a single-byte byte length (requiring a maximum of 255 bytes).

(Note that MySQL apparently does not support characters outside the BMP. The official definition of UTF-8 allows 4 bytes per character.)

like image 150
dan04 Avatar answered Oct 25 '22 12:10

dan04


For clarity, what then is the maximum value I can set in the varchar argument so it only uses 1 byte to store its length?

This depends on the collation of the VARCHAR column.

As you noted, UTF8 may use up to three bytes per character, so if your declare a UTF8 column more than 85 characters long, there is a chance that it will use more than 255 bytes to store its data, and the length hence should be stored in a two-byte field.

If you use latin1, each character is stored in 1 byte.

So the answer is:

VARCHAR(85) COLLATE UTF8_GENERAL_CI

, or

VARCHAR(255) COLLATE LATIN1_GENERAL_CI
like image 35
Quassnoi Avatar answered Oct 25 '22 13:10

Quassnoi