
Index on a column which is only used for IS NULL and IS NOT NULL

I have a column deleted in my table. In every SQL statement I check whether this flag IS NULL. When someone deletes an entry, the flag is set to the current timestamp.

When entries need to be restored, this timestamp is used to restore them. This is the only use case in which the actual value of the column is used.

In all other cases, it's only important to know whether it IS NULL or it IS NOT NULL.

In the future the table can and will contain millions of rows.

Is it useful to create an index on this column, given that 99% of the statements and use cases don't care about the actual value? Or does MySQL optimize IS NULL conditions so that an index is not needed?

lszrh asked Nov 06 '11 07:11



2 Answers

An index on 'deleted' will also index NULL values, and thus allow faster lookups for both IS NULL and IS NOT NULL.

I think this will be sufficient in this case and will not cause too much overhead, since the timestamp is only set on deletion and therefore won't change very often. (The opposite case, an edit timestamp that changes all the time and is only sometimes set to NULL, would force an index update every time a record is changed. That might not be optimal, but it is not the case here.)

(Also, although I don't know whether the index implementation is smart enough to take advantage of it, the expected changes always land at one of the ends of the index: either the NULL end or the 'most recent' end.)

Of course, profile (both query execution times and storage space if important) to find out if there are actual problems arising from this.
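As a minimal sketch of the above: the table name `entries`, the `id` and `title` columns, the index name, and the id value 42 are all hypothetical; only the nullable `deleted` column comes from the question. The index can be created and then checked with EXPLAIN roughly like this:

```sql
-- Hypothetical table; only the nullable `deleted` column is from the question.
CREATE TABLE entries (
    id      INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    title   VARCHAR(255) NOT NULL,
    deleted DATETIME NULL DEFAULT NULL  -- NULL = live row, timestamp = soft-deleted
);

-- Secondary index on the flag; NULL entries are stored in the index as well.
CREATE INDEX idx_entries_deleted ON entries (deleted);

-- Soft delete and restore: the index is only touched when `deleted` changes.
UPDATE entries SET deleted = NOW() WHERE id = 42;
UPDATE entries SET deleted = NULL  WHERE id = 42;

-- Ask the optimizer which access path it would choose for each predicate.
EXPLAIN SELECT id, title FROM entries WHERE deleted IS NULL;
EXPLAIN SELECT id, title FROM entries WHERE deleted IS NOT NULL;
```

If the `key` column of the EXPLAIN output names `idx_entries_deleted`, the index is being used. Whether it is depends on the data: when nearly every row has `deleted IS NULL`, the optimizer may well prefer a full table scan for that predicate, which is one more reason to measure on realistic data.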

Inca answered Sep 22 '22 03:09


Can't you create an "archive" table and store the deleted rows there together with their timestamp? If the user wants to restore a row, you just have to transfer it from the archive back to your main table.

And then you don't have to check whether the flag IS NULL in every query.
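A rough sketch of this idea, assuming hypothetical tables `entries` and `entries_archive` with illustrative `id` and `title` columns (none of these names appear in the question) and a transactional storage engine such as InnoDB:

```sql
-- Hypothetical main table: no `deleted` flag is needed in this variant.
CREATE TABLE entries (
    id    INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    title VARCHAR(255) NOT NULL
);

-- Hypothetical archive table: same columns plus the deletion timestamp.
CREATE TABLE entries_archive (
    id         INT UNSIGNED NOT NULL PRIMARY KEY,
    title      VARCHAR(255) NOT NULL,
    deleted_at DATETIME NOT NULL
);

-- "Delete": move the row into the archive in one transaction.
START TRANSACTION;
INSERT INTO entries_archive (id, title, deleted_at)
    SELECT id, title, NOW() FROM entries WHERE id = 42;
DELETE FROM entries WHERE id = 42;
COMMIT;

-- Restore: move the row back, keeping its original id.
START TRANSACTION;
INSERT INTO entries (id, title)
    SELECT id, title FROM entries_archive WHERE id = 42;
DELETE FROM entries_archive WHERE id = 42;
COMMIT;
```

The trade-off is that the two tables have to be kept structurally in sync, and any foreign keys pointing at the main table need to be considered before rows are moved out of it.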

j_freyre answered Sep 22 '22 03:09