I have a warehouse table that looks like this:
CREATE TABLE Warehouse (
id BIGINT(20) UNSIGNED NOT NULL AUTO_INCREMENT,
eventId BIGINT(20) UNSIGNED NOT NULL,
groupId BIGINT(20) NOT NULL,
activityId BIGINT(20) UNSIGNED NOT NULL,
... many more ids,
"txtProperty1" VARCHAR(255),
"txtProperty2" VARCHAR(255),
"txtProperty3" VARCHAR(255),
"txtProperty4" VARCHAR(255),
"txtProperty5" VARCHAR(255),
... many more of these
PRIMARY KEY ("id")
KEY "WInvestmentDetail_idx01" ("groupId"),
... several more indices
) ENGINE=INNODB;
Now, the following query spends about 0.8s in query time and 0.2s in fetch time, for a total of about one second. The query returns ~67,000 rows.
SELECT eventId
FROM Warehouse
WHERE accountId IN (10, 8, 13, 9, 7, 6, 12, 11)
AND scenarioId IS NULL
AND insertDate BETWEEN DATE '2002-01-01' AND DATE '2011-12-31'
ORDER BY insertDate;
Adding more ids to the select clause doesn't really change the performance at all.
SELECT eventId, groupId, activityId, insertDate
FROM Warehouse
WHERE accountId IN (10, 8, 13, 9, 7, 6, 12, 11)
AND scenarioId IS NULL
AND insertDate BETWEEN DATE '2002-01-01' AND DATE '2011-12-31'
ORDER BY insertDate;
However, adding a "property" column does change it to 0.6s fetch time and 1.8s query time.
SELECT eventId, txtProperty1
FROM Warehouse
WHERE accountId IN (10, 8, 13, 9, 7, 6, 12, 11)
AND scenarioId IS NULL
AND insertDate BETWEEN DATE '2002-01-01' AND DATE '2011-12-31'
ORDER BY insertDate;
Now to really blow your socks off. Instead of txtProperty1, using txtProperty2 changes the times to 0.8s fetch, 24s query!
SELECT eventId, txtProperty2
FROM Warehouse
WHERE accountId IN (10, 8, 13, 9, 7, 6, 12, 11)
AND scenarioId IS NULL
AND insertDate BETWEEN DATE '2002-01-01' AND DATE '2011-12-31'
ORDER BY insertDate;
The two columns are pretty much identical in the type of data they hold: mostly non-null, and neither are indexed (not that that should make a difference anyways). To be sure the table itself is healthy I ran analyze/optimize against it.
This is really mystifying to me. I can see why adding columns to the select clause only can slightly increase fetch time, but it should not change query time, especially not significantly. I would appreciate any ideas as to what is causing this slowdown.
EDIT - More data points
SELECT * actually outperforms txtProperty2 - 0.8s query, 8.4s fetch. Too bad I can't use it because the fetch time is (expectedly) too long.
The MySQL documentation for the InnoDB engine suggests that if your varchar
data doesn't fit on the page (i.e. the node of the b-tree structure), then the information will be referenced on overflow pages. So on your wide Warehouse table, it may be that txtProperty1
is on-page and txtProperty2
is off-page, thus requiring additional I/O to retrieve.
Not too sure as to why the SELECT *
is better; it may be able to take advantage of reading data sequentially, rather than picking its way around the disk.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With