I have a script that captures tweets and stores them in a database. I will run the script from a cron job and then display the tweets on my site from the database, to avoid hitting the Twitter API rate limit.
I don't want duplicate tweets in my database. I understand I can use 'INSERT...ON DUPLICATE KEY UPDATE' to achieve this, but I don't quite understand how to use it.
My database structure is as follows.
Table: hash
Columns: id (auto_increment), tweet, user, user_url
And currently my SQL to insert is as follows:
$tweet = $clean_content[0];
$user_url = $clean_uri[0];
$user = $clean_name[0];
$query='INSERT INTO hash (tweet, user, user_url) VALUES ("'.$tweet.'", "'.$user.'", "'.$user_url.'")';
mysql_query($query);
How would I correctly use 'INSERT...ON DUPLICATE KEY UPDATE' to insert only if it doesn't exist, and update if it does?
Thanks
INSERT ... ON DUPLICATE KEY UPDATE is a MariaDB/MySQL extension to the INSERT statement: if the new row would duplicate a unique or primary key, it performs an UPDATE of the existing row instead. The affected-rows value is reported as 1 if a row is inserted and 2 if an existing row is updated, unless the API's CLIENT_FOUND_ROWS flag is set.
By definition, atomicity requires that each transaction be all or nothing. INSERT ... ON DUPLICATE KEY UPDATE is atomic in that sense: if the row you are trying to insert would cause a duplicate in the primary key or in a unique index, the statement performs an update instead of failing with an error.
If the table contains an AUTO_INCREMENT column and INSERT ... ON DUPLICATE KEY UPDATE inserts or updates a row, the LAST_INSERT_ID() function returns the AUTO_INCREMENT value. The ON DUPLICATE KEY UPDATE clause can contain multiple column assignments, separated by commas. The use of VALUES() to refer to the new row and columns is deprecated beginning with MySQL 8.0.20; a row alias can be used instead.
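As a rough illustration of the row-alias form that replaces VALUES(), here is a minimal sketch. It assumes MySQL 8.0.19 or later; the table page_hits, its columns, and the alias name new_row are hypothetical and chosen only for this example.

```sql
-- Hypothetical demo table; names are illustrative only.
CREATE TABLE page_hits (
  page VARCHAR(50) NOT NULL,
  hits INT NOT NULL,
  last_seen DATETIME NOT NULL,
  PRIMARY KEY (page)
);

-- "AS new_row" names the row that would have been inserted, so the
-- UPDATE part can refer to new_row.hits instead of VALUES(hits).
-- Two column assignments are given, separated by a comma.
INSERT INTO page_hits (page, hits, last_seen)
VALUES ('home', 1, NOW()) AS new_row
ON DUPLICATE KEY UPDATE
  hits = page_hits.hits + new_row.hits,
  last_seen = new_row.last_seen;
```

On servers that do not accept the alias syntax, the VALUES() form is the one to use.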
INSERT INTO table_name(c1) VALUES(c1) ON DUPLICATE KEY UPDATE c1 = VALUES(c1) + 1;
If the insert would cause a duplicate in a UNIQUE index or PRIMARY KEY, the statement above sets c1 to the value that would have been inserted, given by the expression VALUES(c1), plus 1.
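To make the insert-versus-update behaviour concrete, here is a small worked sketch. The table counter_demo and its columns are hypothetical, and the affected-row counts in the comments assume the CLIENT_FOUND_ROWS flag is not set.

```sql
-- Hypothetical demo table; names are illustrative only.
CREATE TABLE counter_demo (
  name VARCHAR(50) NOT NULL,
  hits INT NOT NULL,
  PRIMARY KEY (name)
);

-- First run: no row with name = 'home' exists yet, so the row is
-- inserted with hits = 1 (1 row affected).
INSERT INTO counter_demo (name, hits) VALUES ('home', 1)
ON DUPLICATE KEY UPDATE hits = hits + VALUES(hits);

-- Second run: the primary key collides, so the existing row is updated
-- instead (2 rows affected). The bare "hits" is the stored value and
-- VALUES(hits) is the value from the attempted insert, so hits becomes 2.
INSERT INTO counter_demo (name, hits) VALUES ('home', 1)
ON DUPLICATE KEY UPDATE hits = hits + VALUES(hits);
```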
You need a UNIQUE KEY on your table. If user_url is actually the tweet's URL, then it will work as the key (every tweet has a unique URL, although the tweet id would be better).
CREATE TABLE `hash` (
`user_url` ...,
...,
UNIQUE KEY `user_url` (`user_url`)
);
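If the table already exists, the key can probably be added in place rather than by recreating the table. A minimal sketch, assuming user_url is a VARCHAR column short enough to be indexed in full:

```sql
-- Adds the unique key to the existing table.
-- Assumption: user_url is a VARCHAR short enough to index completely.
-- This statement fails with a duplicate-entry error if the table
-- already contains repeated user_url values, so remove those first.
ALTER TABLE `hash` ADD UNIQUE KEY `user_url` (`user_url`);
```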
In your case it is better to use INSERT IGNORE:
$query='INSERT IGNORE INTO hash (tweet, user, user_url) VALUES ("'.$tweet.'", "'.$user.'", "'.$user_url.'")';
ON DUPLICATE KEY UPDATE is useful when you need to update the existing row but still want each row inserted only once.
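If you do want the stored row refreshed when the same tweet turns up again, the statement from the question could be written roughly as follows. This is only a sketch: it assumes the unique key on user_url is in place, and the ? markers are placeholders for a prepared statement rather than values concatenated into the string.

```sql
-- Inserts the tweet if its user_url is new; otherwise overwrites the
-- tweet and user columns of the existing row with the incoming values.
-- Assumption: `hash` has a UNIQUE KEY on user_url.
INSERT INTO `hash` (`tweet`, `user`, `user_url`)
VALUES (?, ?, ?)
ON DUPLICATE KEY UPDATE
  `tweet` = VALUES(`tweet`),
  `user`  = VALUES(`user`);
```

The id column is not listed, so on a fresh insert it gets its AUTO_INCREMENT value as usual, and on a duplicate the existing row keeps its id.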