Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Extra column ruins MySQL performance

I have a warehouse table that looks like this:

CREATE TABLE Warehouse (
  id BIGINT(20) UNSIGNED NOT NULL AUTO_INCREMENT,
  eventId BIGINT(20) UNSIGNED NOT NULL,
  groupId BIGINT(20) NOT NULL,
  activityId BIGINT(20) UNSIGNED NOT NULL,
  ... many more ids,
  "txtProperty1" VARCHAR(255),
  "txtProperty2" VARCHAR(255),
  "txtProperty3" VARCHAR(255),
  "txtProperty4" VARCHAR(255),
  "txtProperty5" VARCHAR(255),
  ... many more of these
  PRIMARY KEY ("id")
  KEY "WInvestmentDetail_idx01" ("groupId"),
  ... several more indices
) ENGINE=INNODB;

Now, the following query spends about 0.8s in query time and 0.2s in fetch time, for a total of about one second. The query returns ~67,000 rows.

SELECT eventId
FROM Warehouse
WHERE accountId IN (10, 8, 13, 9, 7, 6, 12, 11)
  AND scenarioId IS NULL
  AND insertDate BETWEEN DATE '2002-01-01' AND DATE '2011-12-31'
ORDER BY insertDate;

Adding more ids to the select clause doesn't really change the performance at all.

SELECT eventId, groupId, activityId, insertDate
FROM Warehouse
WHERE accountId IN (10, 8, 13, 9, 7, 6, 12, 11)
  AND scenarioId IS NULL
  AND insertDate BETWEEN DATE '2002-01-01' AND DATE '2011-12-31'
ORDER BY insertDate;

However, adding a "property" column does change it to 0.6s fetch time and 1.8s query time.

SELECT eventId, txtProperty1
FROM Warehouse
WHERE accountId IN (10, 8, 13, 9, 7, 6, 12, 11)
  AND scenarioId IS NULL
  AND insertDate BETWEEN DATE '2002-01-01' AND DATE '2011-12-31'
ORDER BY insertDate;

Now to really blow your socks off. Instead of txtProperty1, using txtProperty2 changes the times to 0.8s fetch, 24s query!

SELECT eventId, txtProperty2
FROM Warehouse
WHERE accountId IN (10, 8, 13, 9, 7, 6, 12, 11)
  AND scenarioId IS NULL
  AND insertDate BETWEEN DATE '2002-01-01' AND DATE '2011-12-31'
ORDER BY insertDate;

The two columns are pretty much identical in the type of data they hold: mostly non-null, and neither are indexed (not that that should make a difference anyways). To be sure the table itself is healthy I ran analyze/optimize against it.

This is really mystifying to me. I can see why adding columns to the select clause only can slightly increase fetch time, but it should not change query time, especially not significantly. I would appreciate any ideas as to what is causing this slowdown.

EDIT - More data points

SELECT * actually outperforms txtProperty2 - 0.8s query, 8.4s fetch. Too bad I can't use it because the fetch time is (expectedly) too long.

like image 1000
Monkey Boson Avatar asked Oct 01 '10 15:10

Monkey Boson


1 Answers

The MySQL documentation for the InnoDB engine suggests that if your varchar data doesn't fit on the page (i.e. the node of the b-tree structure), then the information will be referenced on overflow pages. So on your wide Warehouse table, it may be that txtProperty1 is on-page and txtProperty2 is off-page, thus requiring additional I/O to retrieve.

Not too sure as to why the SELECT * is better; it may be able to take advantage of reading data sequentially, rather than picking its way around the disk.

like image 197
richaux Avatar answered Oct 09 '22 19:10

richaux