Which one of the following queries will be faster and more optimal (and why):
SELECT * FROM items WHERE w = 320 AND h = 200
(w and h are INT)
SELECT * FROM items WHERE dimensions = '320x200'
(dimensions is VARCHAR)
1 Answer. Int comparisons are faster than varchar comparisons, and ints take much less space than varchars. This is applicable true for both unindexed and indexed access. You can use an indexed int column to make it faster.
Integer is for numbers, and varchar is for numbers, letters and other characters (Short text). So for age you can use a int type, for genders you can use the enum() type if there are only two options. Varchar is text and integer is number.
As the name suggests, varchar means character data that is varying. Also known as Variable Character, it is an indeterminate length string data type. It can hold numbers, letters and special characters.
A VARCHAR declaration must include a positive integer in parentheses to define the maximum allowable character string length. For example, VARCHAR(n) can accept any length of character string up to n characters in length. The length parameter may take any value from 1 to the current table page size.
Here are some actual measurements. (Using SQLite; may try it with MySQL later.)
Data = All 1,000,000 combinations of w, h ∈ {1...1000}, in randomized order.
CREATE TABLE items (id INTEGER PRIMARY KEY, w INTEGER, h INTEGER)
Average time (of 20 runs) to execute SELECT * FROM items WHERE w = 320 and h = 200
was 5.39±0.29 µs.
CREATE TABLE items (id INTEGER PRIMARY KEY, dimensions TEXT)
Average time to execute SELECT * FROM items WHERE dimensions = '320x200'
was 5.69±0.23 µs.
There is no significant difference, efficiency-wise.
There is a huge difference in terms of usability. For example, if you want to calculate the area and perimeter of the rectangles, the two-column approach is easy:
SELECT w * h, 2 * (w + h) FROM items
Try to write the corresponding query for the other way.
Intuitively, if you do not create INDEX
es on those columns, integer comparison seems faster.
In integer comparison, you compare directly 32-bit values equality with logical operators.
On the other hand, strings are character arrays, it will be difficult to compare them. Character-by-character.
However, another point is that, in 2nd query you have 1 field to compare, in 1st query you have 2 fields. If you have 1,000,000 records and no indexes on columns, that means you may have 1,000,000 string comparisons on worst case (unluckily last result is the thing you've looking for or not found at all)
On the other hand you have 1,000,000 records and all are w=320
, then you'll be comparing them for h
,too. That means 2,000,000 comparisons. However you create INDEXes on those fields IMHO they will be almost identical since VARCHAR will be hashed (takes O(1)
constant time) and will be compared using INT comparison and take O(logn)
time.
Conclusion, it depends. Prefer indexes on searchable columns and use ints.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With