Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What would be the best algorithm to find an ID that is not used from a table that has the capacity to hold a million rows

To elaborate .. a) A table (BIGTABLE) has a capacity to hold a million rows with a primary Key as the ID. (random and unique) b) What algorithm can be used to arrive at an ID that has not been used so far. This number will be used to insert another row into table BIGTABLE.

Updated the question with more details.. C) This table already has about 100 K rows and the primary key is not an set as identity. d) Currently, a random number is generated as the primary key and a row inserted into this table, if the insert fails another random number is generated. the problem is sometimes it goes into a loop and the random numbers generated are pretty random, but unfortunately, They already exist in the table. so if we re try the random number generation number after some time it works. e) The sybase rand() function is used to generate the random number.

Hope this addition to the question helps clarify some points.

like image 954
vicky Avatar asked Sep 18 '08 17:09

vicky


1 Answers

The question is of course: why do you want a random ID?

One case where I encountered a similar requirement, was for client IDs of a webapp: the client identifies himself with his client ID (stored in a cookie), so it has to be hard to brute force guess another client's ID (because that would allow hijacking his data).

The solution I went with, was to combine a sequential int32 with a random int32 to obtain an int64 that I used as the client ID. In PostgreSQL:

CREATE FUNCTION lift(integer, integer) returns bigint AS $$
SELECT ($1::bigint << 31) + $2
$$ LANGUAGE SQL;

CREATE FUNCTION random_pos_int() RETURNS integer AS $$
select floor((lift(1,0) - 1)*random())::integer
$$ LANGUAGE sql;

ALTER TABLE client ALTER COLUMN id SET DEFAULT
lift((nextval('client_id_seq'::regclass))::integer, random_pos_int());

The generated IDs are 'half' random, while the other 'half' guarantees you cannot obtain the same ID twice:

select lift(1, random_pos_int());  => 3108167398
select lift(2, random_pos_int());  => 4673906795
select lift(3, random_pos_int());  => 7414644984
...
like image 54
Bruno De Fraine Avatar answered Sep 28 '22 18:09

Bruno De Fraine