Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

create mysql row with not really unique keys based on some other rows

Tags:

sql

php

mysql

Database example:

| country | animal | size   | x_id* |
|---------+--------+--------+-------|
|  777    | 1001   | small  |   1   |
|  777    | 2002   | medium |   2   |
|  777    | 7007   | medium |   3   |
|  777    | 7007   | large  |   4   |
|  42     | 1001   | small  |   1   |
|  42     | 2002   | medium |   2   |
|  42     | 7007   | large  |   4   |

I need to generate the x_id continuously based on entries in (animal, size) and if x_id for the combination x_id exist use it again.

Currently i use the following PHP script for this action, but on a large db table it is very slow.

query("UPDATE myTable SET x_id = -1");

$i = $j;
$c = array();
$q = query("
    SELECT animal, size
      FROM myTable
     WHERE x_id = -1
  GROUP BY animal, size");

while($r = fetch_array($q)) {

  $hk = $r['animal'] . '-' . $r['size'];

  if( !isset( $c[$hk] ) ) $c[$hk] = $i++;

  query("
      UPDATE myTable
       SET x_id = {$c[$hk]}
     WHERE animal = '".$r['animal']."'
       AND size = '".$r['size']."'
       AND x_id = -1");

}

Is there a way to convert the PHP script to one or two mysql commands?

edit:

CREATE TABLE `myTable` (
`country` int(10) unsigned NOT NULL DEFAULT '1', -- country
`animal` int(3) NOT NULL,
`size` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
`lang_id` tinyint(4) NOT NULL DEFAULT '1',
`x_id` int(10) NOT NULL,
KEY `country` (`country`),
KEY `x_id` (`x_id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
like image 423
HQ5 Avatar asked Dec 25 '15 19:12

HQ5


3 Answers

UPDATE myTable m
    JOIN (
        SELECT animal, size, @newid := @newid + 1 AS x_id
        FROM myTable a
            CROSS JOIN (SELECT @newid := 0) b
        WHERE x_id = -1
        GROUP BY animal, size
    ) t ON m.animal = t.animal AND m.size = t.size
SET m.x_id = t.x_id
;

http://sqlfiddle.com/#!9/5525ba/1

The group by in the subquery is not needed. It generates useless overhead. If it's fast enough, leave it like this, otherwise we can use distinct+another subquery instead.

like image 168
Sebas Avatar answered Nov 05 '22 06:11

Sebas


User variables are awkward but should do the trick,tested on my machine

CREATE TABLE t
( animal VARCHAR(20),
 size VARCHAR(20),
x_id INT);

 INSERT INTO T(animal,size) VALUES('crocodile','small'),
   ('elephant','medium'),
 ('giraffe','medium'),
 ('giraffe','large'),
 ('crocodile','small'),
 ('elephant','medium'),
 ('giraffe','large');




 UPDATE t  RIGHT JOIN
 (SELECT animal,size,
MIN(CASE WHEN @var:=CONCAT(animal,size) THEN @id ELSE @id:=@id+1  END)id
FROM t,
(SELECT  @var:=CONCAT(animal,size) FROM t)x ,
                      (SELECT @id:=0)y
                     GROUP BY animal,size)q
                     ON t.animal=q.animal AND t.size=q.size
                     SET x_id=q.id

Results

"animal"    "size"     "x_id"
"crocodile" "small"     "1"
"elephant"  "medium"    "2"
"giraffe"   "medium"    "3"
"giraffe"   "large"     "4"
"crocodile" "small"     "1"
"elephant"  "medium"    "2"
"giraffe"   "large"     "4"

You want these indexes added for (a lot) faster access

ALTER TABLE `yourtable` ADD INDEX `as_idx` (`animal`,`size`);
ALTER TABLE `yourtable` ADD INDEX `id_idx` (`x_id`);
like image 28
Mihai Avatar answered Nov 05 '22 07:11

Mihai


This is a conceptual. Worm it into your world if useful.

Schema

create table AnimalSize
(   id int auto_increment primary key,
    animal varchar(100) not null,
    size varchar(100) not null,
    unique key(animal,size) -- this is critical, no dupes
);

create table CountryAnimalSize
(   id int auto_increment primary key,
    country varchar(100) not null,
    animal varchar(100) not null,
    size varchar(100) not null,
    xid int not null -- USE THE id achieved thru use of AnimalSize table
);

Some queries

-- truncate table animalsize; -- clobber and reset auto_increment back to 1
insert ignore AnimalSize(animal,size) values ('snake','small'); -- id=1
select last_insert_id(); -- 1
insert ignore AnimalSize(animal,size) values ('snake','small'); -- no real insert but creates id GAP (ie blows slot 2)
select last_insert_id(); -- 1
insert ignore AnimalSize(animal,size) values ('snake','small'); -- no real insert but creates id GAP (ie blows slot 3)
select last_insert_id(); -- 1
insert ignore AnimalSize(animal,size) values ('frog','medium'); -- id=4
select last_insert_id(); -- 4
insert ignore AnimalSize(animal,size) values ('snake','small'); -- no real insert but creates id GAP (ie blows slot 3)
select last_insert_id(); -- 4

Note: insert ignore says do it, and ignore the fact that it may die. In our case, it would fail due to unique key (which is fine). In general, do not use insert ignore unless you know what you are doing.

It is often thought of in connection with an insert on duplicate key update (IODKU) call. Or should I say thought about, as in, How can I solve this current predicament. But, that (IODKU) would be a stretch in this case. Yet, keep both in your toolchest for solutions.

After insert ignore fires off, you know, one way or the other, that the row is there.

Forgetting the INNODB GAP aspect, what the above suggests is that if the row already exists prior to insert ignore, that

You cannot rely on last_insert_id() for the id

So after firing off insert ignore, go and fetch the id that you know has to be there. Use that in subsequent calls against CountryAnimalSize

continue along this line of reasoning for your CountryAnimalSize table inserts where the row may or may not already be there.

There is no reason to formalize the solution here because, as you say, these aren't even your tables anyway in the Question.

Also, back to INNODB GAP. Google that. Figure out whether or not you can live with gaps created.

Most people have bigger fish to fry that keeping id's tight and gapless.

Other people (read: OCD) are so consumed by the perceived gap problem that they blow days on it.

So, these are general comments meant to help a broader audience, than to answer your question, which, as you say, isn't even your schema.

like image 2
Drew Avatar answered Nov 05 '22 06:11

Drew