Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How can you find ID gaps in a MySQL recordset?

The issue here is related to another question I had...

I have millions of records, and the ID of each of those records is auto-incremented, unfortunately sometimes the ID that is generated is sometimes thrown away so there are many many gaps between IDs.

I want to find the gaps, and re-use the ids that were abandoned.

What's an efficient way to do this in MySQL?

like image 735
qodeninja Avatar asked Dec 09 '11 17:12

qodeninja


People also ask

How do I find missing values in mysql?

To look for NULL values, you must use the IS NULL test. The following statements show how to find the NULL phone number and the empty phone number: mysql> SELECT * FROM my_table WHERE phone IS NULL; mysql> SELECT * FROM my_table WHERE phone = ''; See Section 3.3.


3 Answers

First of all, what advantage are you trying to get by reusing the skipped values? An ordinary INT UNSIGNED will let you count up to 4,294,967,295. With "millions of records" your database would have to grow a thousand times over before running out of valid IDs. (And then using a BIGINT UNSIGNED will bump you up to 18,446,744,073,709,551,615 values.)

Trying to recycle values MySQL has skipped is likely to use up a lot of your time trying to compensate for something that really doesn't bother MySQL in the first place.

With that said, you can find missing IDs with something like:

SELECT id + 1
FROM the_table
WHERE NOT EXISTS (SELECT 1 FROM the_table t2 WHERE t2.id = the_table.id + 1);

This will find only the first missing number in each sequence (e.g., if you have {1, 2, 3, 8, 10} it will find {4,9}) but it's likely to be efficient, and of course once you've filled in an ID you can always run it again.

like image 96
VoteyDisciple Avatar answered Sep 18 '22 17:09

VoteyDisciple


The following will return a row for each gap in the integer field "n" in mytab:

/* cs will contain 1 row for each contiguous sequence of integers in mytab.n
   and will have the start of that chain.
   ce will contain the end of that chain */
create temporary table cs (row int auto_increment primary key, n int);
create temporary table ce like cs;
insert into cs (n) select n from mytab where n-1 not in (select n from mytab) order by n;
insert into ce (n) select n from mytab where n+1 not in (select n from mytab) order by n;
select ce.n + 1 as bgap, cs.n - 1 as egap
  from cs, ce where cs.row = ce.row + 1;

If instead of the gaps you want the contiguous chains then the final select should be:

select cs.n as bchain, ce.n as echain from cs,ce where cs.row=ce.row;
like image 43
ccc Avatar answered Sep 21 '22 17:09

ccc


This solution is better, in case you need to include the first element as 1:

SELECT
    1 AS gap_start,
    MIN(e.id) - 1 AS gap_end
FROM
    factura_entrada e
WHERE
    NOT EXISTS(
        SELECT
            1
        FROM
            factura_entrada
        WHERE
            id = 1
    )
LIMIT 1
UNION
    SELECT
        a.id + 1 AS gap_start,
        MIN(b.id)- 1 AS gap_end
    FROM
        factura_entrada AS a,
        factura_entrada AS b
    WHERE
        a.id < b.id
    GROUP BY
        a.id
    HAVING
        gap_start < MIN(b.id);
like image 22
jalopezsuarez Avatar answered Sep 19 '22 17:09

jalopezsuarez