I'm "ok" at basic MySQL, but this is "WAY OVER MY HEAD"!
Objectives:
The database tables are HUGE, so speed is an issue.
It does not have to be MyISAM; would InnoDB be faster? Each database will be in its own table.
I was given this as a starting point for what I'm trying to do:
CREATE TABLE `table` LIKE LiveTable;
LOAD DATA INFILE..... INTO TABLE `table`;
UPDATE LiveTable SET `delete`=1; -- Flag every live row for deletion; rows that get updated below are unflagged
UPDATE LiveTable INNER JOIN `table` ON LiveTable.ID=`table`.ID
SET LiveTable.Col1=`table`.Col1, LiveTable.Col2=`table`.Col2….. LiveTable.`delete`=0;
INSERT INTO LiveTable(ID,Col1,Col2,… `delete`)
SELECT `table`.ID,`table`.Col1,`table`.Col2,... 0 FROM `table`
LEFT JOIN LiveTable
ON `table`.ID = LiveTable.ID
WHERE LiveTable.ID IS NULL;
DELETE FROM LiveTable WHERE `delete` = 1; -- Rows still flagged were not in the new data
TRUNCATE TABLE `table`;
> CREATE TABLE `product_table` (
> `programname` VARCHAR(100) NOT NULL,
> `name` VARCHAR(160) NOT NULL,
> `keywords` VARCHAR(300) NOT NULL,
> `description` TEXT NOT NULL,
> `sku` VARCHAR(100) NOT NULL,
> -- This is the only unique identifier given; none will be duplicates
> `price` DECIMAL(10, 2) NOT NULL,
> `created` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
> `updatedat` TIMESTAMP NOT NULL DEFAULT '0000-00-00 00:00:00',
> `delete` TINYINT(4) NOT NULL DEFAULT '0',
> PRIMARY KEY (`sku`) ) ENGINE=myisam DEFAULT CHARSET=latin1;
>
> CREATE TABLE IF NOT EXISTS `temptable` LIKE `product_table`;
>
> TRUNCATE TABLE `temptable`; -- Remove data from temp table if for some reason it has data in it.
>
> LOAD DATA LOW_PRIORITY LOCAL INFILE "catalog.csv" INTO TABLE
> `temptable` FIELDS TERMINATED BY "," OPTIONALLY ENCLOSED BY """"
> LINES TERMINATED BY "\n" IGNORE 1 LINES (`PROGRAMNAME`, `NAME`,
> `KEYWORDS`, `DESCRIPTION`, `SKU`, `PRICE`);
>
>
> UPDATE `temptable` SET `delete` = 1; -- Set the delete field to true
> UPDATE `temptable` ttable
> INNER JOIN `product_table` mtable
> ON ( mtable.sku = ttable.sku )
> SET mtable.programname = ttable.programname,
> mtable.name = ttable.name,
> mtable.keywords = ttable.keywords,
> mtable.description = ttable.description,
> mtable.sku = ttable.sku,
> mtable.price = ttable.price,
> mtable.created = ttable.created,
> mtable.updatedat = NOW(),-- Set Last Update
> mtable.delete = 0; -- Set Delete to NO
>
> -- Not sure what this is for... I'm LOST at this part...
> INSERT INTO `product_table` VALUES (`programname`,
> `name`,
> `keywords`,
> `description`,
> `sku`,
> `price`,
> `created`,
> `updatedat`,
> `delete`);
>
> -- This type of join requires alias as far as I know?
> SELECT `programname`,
> `name`,
> `keywords`,
> `description`,
> `sku`,
> `price`,
> `created`,
> `updatedat`,
> `delete` FROM `temptable` tmptable
> LEFT JOIN `product_table` maintbl
> ON tmptable.sku = maintbl.sku WHERE maintbl.sku IS NULL;
>
> DELETE FROM `product_table` WHERE `delete` = 0;
>
> TRUNCATE TABLE `temptable`; -- Remove all the data from the temporary table.
I answered this question myself here: https://dba.stackexchange.com/questions/16197/innodb-update-slow-need-a-better-option/16283#16283
Using the information I've received from here, the web, and several internet chat rooms, I've come up with the process below. Web source: http://www.softwareprojects.com/resources/programming/t-how-to-use-mysql-fast-load-data-for-updates-1753.html
Demo: http://sqlfiddle.com/#!2/4ebe0/1
The process is:
Import the data into a new temp table.
Update the old table's rows with the information in the temp table.
Insert the new data into the table. (In the real world I'm making a new CSV file and using LOAD DATA INFILE for the insert.)
Delete everything that is no longer in the data feed.
Delete the temp table.
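The steps above can be sketched in MySQL roughly as follows. This is only a sketch, not the exact code from the demo: it reuses the `product_table` and `temptable` names and columns from the question, assumes the feed is a file called `catalog.csv` with the layout shown there, and assumes `LOCAL INFILE` is enabled on both client and server. Note that step 4 uses a `LEFT JOIN` anti-join instead of the `delete` flag column; that is a substitution on my part, chosen because it avoids an extra full-table `UPDATE`.

```sql
-- 1. Import the feed into an empty staging table.
CREATE TABLE IF NOT EXISTS `temptable` LIKE `product_table`;
TRUNCATE TABLE `temptable`;

LOAD DATA LOCAL INFILE 'catalog.csv'
INTO TABLE `temptable`
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 LINES
(`programname`, `name`, `keywords`, `description`, `sku`, `price`);

-- 2. Update rows that exist in both the live table and the feed.
UPDATE `product_table` AS p
INNER JOIN `temptable` AS t ON t.`sku` = p.`sku`
SET p.`programname` = t.`programname`,
    p.`name`        = t.`name`,
    p.`keywords`    = t.`keywords`,
    p.`description` = t.`description`,
    p.`price`       = t.`price`,
    p.`updatedat`   = NOW();

-- 3. Insert rows that are in the feed but not yet in the live table.
INSERT INTO `product_table`
    (`programname`, `name`, `keywords`, `description`, `sku`, `price`, `updatedat`)
SELECT t.`programname`, t.`name`, t.`keywords`, t.`description`,
       t.`sku`, t.`price`, NOW()
FROM `temptable` AS t
LEFT JOIN `product_table` AS p ON p.`sku` = t.`sku`
WHERE p.`sku` IS NULL;

-- 4. Delete live rows that are no longer in the feed.
DELETE p
FROM `product_table` AS p
LEFT JOIN `temptable` AS t ON t.`sku` = p.`sku`
WHERE t.`sku` IS NULL;

-- 5. Remove the staging table.
DROP TABLE `temptable`;
```

Both joins are on `sku`, which is the primary key in both tables because `temptable` was created with `LIKE`, so each lookup can use the index.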
This seems to be the fastest process so far.
Let me know what your opinion is.
InnoDB is usually much better than MyISAM at keeping tables available while INSERT, UPDATE and DELETE are happening, because InnoDB uses row-level locking for updates whereas MyISAM uses table-level locking.
That is the first step.
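As a rough illustration of that first step, assuming the table is the `product_table` from the question, the engine can be checked and switched like this. Keep in mind that the conversion rebuilds the whole table, so on a huge table it can take a while.

```sql
-- Check which storage engine the table uses right now (see the Engine column).
SHOW TABLE STATUS LIKE 'product_table';

-- Convert the table to InnoDB; this copies and rebuilds the table.
ALTER TABLE `product_table` ENGINE = InnoDB;
```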
The second step is to disable all indexes on the table before loading data into it, using ALTER TABLE .. DISABLE KEYS, and then to re-enable them after the load using ALTER TABLE .. ENABLE KEYS.
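A minimal sketch of that pattern around a bulk load is below, assuming the staging table `temptable` and the `catalog.csv` file from the question. One caveat worth checking against your own setup: as far as I know, `DISABLE KEYS` only affects non-unique indexes on MyISAM tables, and InnoDB ignores it with a warning, so it may not help a table whose only index is the primary key.

```sql
-- Stop maintaining non-unique indexes while the rows are loaded (MyISAM).
ALTER TABLE `temptable` DISABLE KEYS;

LOAD DATA LOCAL INFILE 'catalog.csv'
INTO TABLE `temptable`
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 LINES
(`programname`, `name`, `keywords`, `description`, `sku`, `price`);

-- Rebuild the disabled indexes in one pass after the load.
ALTER TABLE `temptable` ENABLE KEYS;
```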
These two steps should give a large improvement in performance.
As another optimization, when doing large scale updates, break them down into batches (perhaps based on the primary key) so that all the rows are not locked simultaneously.
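One way to sketch the batching idea, assuming the `product_table` and `temptable` from the question and autocommit turned on so that each statement commits on its own, is to split the update by ranges of the primary key. The boundaries `'H'` and `'P'` are made up for illustration; pick ranges that divide your own `sku` values into reasonably even chunks.

```sql
-- Batch 1: every sku that sorts before 'H'.
UPDATE `product_table` AS p
INNER JOIN `temptable` AS t ON t.`sku` = p.`sku`
SET p.`price` = t.`price`, p.`updatedat` = NOW()
WHERE p.`sku` < 'H';

-- Batch 2: sku from 'H' up to, but not including, 'P'.
UPDATE `product_table` AS p
INNER JOIN `temptable` AS t ON t.`sku` = p.`sku`
SET p.`price` = t.`price`, p.`updatedat` = NOW()
WHERE p.`sku` >= 'H' AND p.`sku` < 'P';

-- Batch 3: everything from 'P' onwards.
UPDATE `product_table` AS p
INNER JOIN `temptable` AS t ON t.`sku` = p.`sku`
SET p.`price` = t.`price`, p.`updatedat` = NOW()
WHERE p.`sku` >= 'P';
```

Because the three ranges do not overlap and together cover every `sku`, each row is updated exactly once, but only one range's rows are locked at a time.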