Database example:
| country | animal | size | x_id* |
|---------+--------+--------+-------|
| 777 | 1001 | small | 1 |
| 777 | 2002 | medium | 2 |
| 777 | 7007 | medium | 3 |
| 777 | 7007 | large | 4 |
| 42 | 1001 | small | 1 |
| 42 | 2002 | medium | 2 |
| 42 | 7007 | large | 4 |
I need to generate the x_id
continuously based on entries in (animal, size) and if x_id
for the combination x_id
exist use it again.
Currently i use the following PHP script for this action, but on a large db table it is very slow.
query("UPDATE myTable SET x_id = -1");
$i = $j;
$c = array();
$q = query("
SELECT animal, size
FROM myTable
WHERE x_id = -1
GROUP BY animal, size");
while($r = fetch_array($q)) {
$hk = $r['animal'] . '-' . $r['size'];
if( !isset( $c[$hk] ) ) $c[$hk] = $i++;
query("
UPDATE myTable
SET x_id = {$c[$hk]}
WHERE animal = '".$r['animal']."'
AND size = '".$r['size']."'
AND x_id = -1");
}
Is there a way to convert the PHP script to one or two mysql commands?
edit:
CREATE TABLE `myTable` (
`country` int(10) unsigned NOT NULL DEFAULT '1', -- country
`animal` int(3) NOT NULL,
`size` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
`lang_id` tinyint(4) NOT NULL DEFAULT '1',
`x_id` int(10) NOT NULL,
KEY `country` (`country`),
KEY `x_id` (`x_id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
UPDATE myTable m
JOIN (
SELECT animal, size, @newid := @newid + 1 AS x_id
FROM myTable a
CROSS JOIN (SELECT @newid := 0) b
WHERE x_id = -1
GROUP BY animal, size
) t ON m.animal = t.animal AND m.size = t.size
SET m.x_id = t.x_id
;
http://sqlfiddle.com/#!9/5525ba/1
The group by in the subquery is not needed. It generates useless overhead. If it's fast enough, leave it like this, otherwise we can use distinct+another subquery instead.
User variables are awkward but should do the trick,tested on my machine
CREATE TABLE t
( animal VARCHAR(20),
size VARCHAR(20),
x_id INT);
INSERT INTO T(animal,size) VALUES('crocodile','small'),
('elephant','medium'),
('giraffe','medium'),
('giraffe','large'),
('crocodile','small'),
('elephant','medium'),
('giraffe','large');
UPDATE t RIGHT JOIN
(SELECT animal,size,
MIN(CASE WHEN @var:=CONCAT(animal,size) THEN @id ELSE @id:=@id+1 END)id
FROM t,
(SELECT @var:=CONCAT(animal,size) FROM t)x ,
(SELECT @id:=0)y
GROUP BY animal,size)q
ON t.animal=q.animal AND t.size=q.size
SET x_id=q.id
Results
"animal" "size" "x_id"
"crocodile" "small" "1"
"elephant" "medium" "2"
"giraffe" "medium" "3"
"giraffe" "large" "4"
"crocodile" "small" "1"
"elephant" "medium" "2"
"giraffe" "large" "4"
You want these indexes added for (a lot) faster access
ALTER TABLE `yourtable` ADD INDEX `as_idx` (`animal`,`size`);
ALTER TABLE `yourtable` ADD INDEX `id_idx` (`x_id`);
This is a conceptual. Worm it into your world if useful.
create table AnimalSize
( id int auto_increment primary key,
animal varchar(100) not null,
size varchar(100) not null,
unique key(animal,size) -- this is critical, no dupes
);
create table CountryAnimalSize
( id int auto_increment primary key,
country varchar(100) not null,
animal varchar(100) not null,
size varchar(100) not null,
xid int not null -- USE THE id achieved thru use of AnimalSize table
);
-- truncate table animalsize; -- clobber and reset auto_increment back to 1
insert ignore AnimalSize(animal,size) values ('snake','small'); -- id=1
select last_insert_id(); -- 1
insert ignore AnimalSize(animal,size) values ('snake','small'); -- no real insert but creates id GAP (ie blows slot 2)
select last_insert_id(); -- 1
insert ignore AnimalSize(animal,size) values ('snake','small'); -- no real insert but creates id GAP (ie blows slot 3)
select last_insert_id(); -- 1
insert ignore AnimalSize(animal,size) values ('frog','medium'); -- id=4
select last_insert_id(); -- 4
insert ignore AnimalSize(animal,size) values ('snake','small'); -- no real insert but creates id GAP (ie blows slot 3)
select last_insert_id(); -- 4
Note: insert ignore
says do it, and ignore the fact that it may die. In our case, it would fail due to unique key (which is fine). In general, do not use insert ignore
unless you know what you are doing.
It is often thought of in connection with an insert on duplicate key update
(IODKU) call. Or should I say thought about, as in, How can I solve this current predicament. But, that (IODKU) would be a stretch in this case. Yet, keep both in your toolchest for solutions.
After insert ignore fires off, you know, one way or the other, that the row is there.
Forgetting the INNODB GAP aspect, what the above suggests is that if the row already exists prior to insert ignore, that
You cannot rely on last_insert_id()
for the id
So after firing off insert ignore, go and fetch the id that you know has to be there. Use that in subsequent calls against CountryAnimalSize
continue along this line of reasoning for your CountryAnimalSize table inserts where the row may or may not already be there.
There is no reason to formalize the solution here because, as you say, these aren't even your tables anyway in the Question.
Also, back to INNODB GAP
. Google that. Figure out whether or not you can live with gaps created.
Most people have bigger fish to fry that keeping id's tight and gapless.
Other people (read: OCD) are so consumed by the perceived gap problem that they blow days on it.
So, these are general comments meant to help a broader audience, than to answer your question, which, as you say, isn't even your schema.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With