Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

From a set of values, how do I find the values not stored in a table's column?

Tags:

mysql

I have a table which will potentially store hundreds of thousands of integers:

desc id_key_table;

+----------------+--------------+------+-----+---------+-------+
| Field          | Type         | Null | Key | Default | Extra |
+----------------+--------------+------+-----+---------+-------+
| id_key         | int(16)      | NO   | PRI | NULL    |       |
+----------------+--------------+------+-----+---------+-------+

From a program, I have a large set of integers. I'd like to see which of these integers are NOT in the above id_key column.

So far I've come up with the following approaches:

1) Iterate through each integer and perform a:

select count(*) count from id_key_table where id_key = :id_key

When count is 0 the id_key is missing from the table.

This seems like a horrible, horrible way to do it.


2) Create a temporary table, insert each of the values into the temporary table, and perform a JOIN on the two tables.

create temporary table id_key_table_temp (id_key int(16) primary key );

insert into id_key_table_temp values (1),(2),(3),...,(500),(501);

select temp.id_key
from id_key_table_temp temp left join id_key_table as main 
         on temp.id_key = main.id_key 
where main.killID is null;

drop table id_key_table_temp;

This seems like the best approach, however, I'm sure there is a far better approach I haven't thought of yet. I'd prefer to not have to create a temporary table and use one query to determine which integers are missing.

Is there a proper query for this type of search?

(MySQL)

like image 299
Clinton Avatar asked Oct 11 '22 02:10

Clinton


1 Answers

Using your code in the second example given in the question, I created two stored procedures (SP): 1 SP to load a sample table of prime numbers as keys, the other SP to find the missing integers.

Here is the first SP:

DELIMITER $$

DROP PROCEDURE IF EXISTS `test`.`CreateSampleTable` $$
CREATE PROCEDURE `test`.`CreateSampleTable` (maxinttoload INT)
BEGIN

  DECLARE X,OKTOUSE,MAXLOOP INT;

  DROP TABLE IF EXISTS test.id_key_table;
  CREATE TABLE test.id_key_table (id_key INT(16)) ENGINE=MyISAM;

  SET X=2;
  WHILE X <= maxinttoload DO
    INSERT INTO test.id_key_table VALUES (X);
    SET X = X + 1;
  END WHILE;
  ALTER TABLE test.id_key_table ADD PRIMARY KEY (id_key);

  SET MAXLOOP = FLOOR(SQRT(maxinttoload));
  SET X = 2;
  WHILE X <= MAXLOOP DO
    DELETE FROM test.id_key_table WHERE MOD(id_key,X) = 0 AND id_key > X;
    SELECT MIN(id_key) INTO OKTOUSE FROM test.id_key_table WHERE id_key > X;
    SET X = OKTOUSE;
  END WHILE;
  OPTIMIZE TABLE test.id_key_table;

  SELECT * FROM test.id_key_table;

END $$

DELIMITER ;

Here is the second SP:

DELIMITER $$

DROP PROCEDURE IF EXISTS `test`.`GetMissingIntegers` $$
CREATE PROCEDURE `test`.`GetMissingIntegers` (maxinttoload INT)
BEGIN

  DECLARE X INT;

  DROP TABLE IF EXISTS test.id_key_table_temp;
  CREATE TEMPORARY TABLE test.id_key_table_temp (id_key INT(16)) ENGINE=MyISAM;

  SET X=1;
  WHILE X <= maxinttoload DO
    INSERT INTO test.id_key_table_temp VALUES (X);
    SET X = X + 1;
  END WHILE;
  ALTER TABLE test.id_key_table_temp ADD PRIMARY KEY (id_key);

  SELECT temp.id_key FROM test.id_key_table_temp temp
  LEFT JOIN test.id_key_table main USING (id_key)
  WHERE main.id_key IS NULL;

END $$

DELIMITER ;

Here is the Sample Run of First SP using the number 25 to create prime numbers:

mysql> CALL test.CreateSampleTable(25);
+-------------------+----------+----------+----------+
| Table             | Op       | Msg_type | Msg_text |
+-------------------+----------+----------+----------+
| test.id_key_table | optimize | status   | OK       |
+-------------------+----------+----------+----------+
1 row in set (0.16 sec)

+--------+
| id_key |
+--------+
|      2 |
|      3 |
|      5 |
|      7 |
|     11 |
|     13 |
|     17 |
|     19 |
|     23 |
+--------+
9 rows in set (0.17 sec)

mysql>

Here is the run of the second SP using 25 as the full list to compare:

mysql> CALL test.GetMissingIntegers(25);
+--------+
| id_key |
+--------+
|      1 |
|      4 |
|      6 |
|      8 |
|      9 |
|     10 |
|     12 |
|     14 |
|     15 |
|     16 |
|     18 |
|     20 |
|     21 |
|     22 |
|     24 |
|     25 |
+--------+
16 rows in set (0.03 sec)

Query OK, 0 rows affected (0.05 sec)

mysql>

While this solution is OK for small samples, big lists become a major headache. You may want to keep the temp table (don't use CREATE TEMPORARY TABLE over and over, use CREATE TABLE just once) permamnently loaded with the numbers 1 .. MAX(id_key) and populate that permanent temp table via a trigger on id_key_table.

Just a question because I am curious: Are you doing this to see if auto_increment keys from a table can be reused ???

like image 158
RolandoMySQLDBA Avatar answered Oct 14 '22 05:10

RolandoMySQLDBA