Consider a SQL query with the following WHERE
predicate:
... WHERE name IS NOT NULL ...
where name is a textual field in PostgreSQL. No other query checks any textual property of this value, just whether it is NULL or not. A full B-tree index therefore seems like overkill, even though it supports this distinction:
Also, an IS NULL or IS NOT NULL condition on an index column can be used with a B-tree index.
What's the right PostgreSQL index to quickly distinguish NULLs from non-NULLs?
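For context, here is a minimal sketch of the kind of setup I have in mind (table and column names are illustrative, not my actual schema):

-- Hypothetical table: only the NULL-ness of "name" is ever queried.
CREATE TABLE my_table (
    id   bigserial PRIMARY KEY,
    name text          -- nullable; never sorted, compared, or pattern-matched
);

-- The only predicate ever applied to "name":
SELECT * FROM my_table WHERE name IS NOT NULL;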
I'm interpreting your claim that it's "overkill" in two ways: in terms of complexity (using a B-tree instead of just a list) and in terms of space/performance.
For complexity, it's not overkill. A B-tree index is preferable because deletes from it will be faster than from some kind of "unordered" index (for lack of a better term), which would require a full index scan just to find the entry to delete. In light of that, any gains from an unordered index would usually be outweighed by the drawbacks, so the development effort isn't justified.
For space and performance, though, if you want a highly selective index for efficiency, you can include a WHERE
clause on an index, as noted in the fine manual:
CREATE INDEX ON my_table (name) WHERE name IS NOT NULL;
Note that you'll only see benefits from this index if it allows PostgreSQL to skip a large number of rows when executing your query. For example, if 99% of the rows have name IS NOT NULL, the index isn't buying you anything over a full table scan; in fact, it would be less efficient (as @CraigRinger notes), since it would require extra disk reads. If, however, only 1% of rows have name IS NOT NULL, then this represents a huge saving, as PostgreSQL can ignore most of the table for your query. If your table is very large, even eliminating 50% of the rows might be worth it. This is a tuning problem, and whether the index is valuable depends heavily on the size and distribution of the data.
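As a quick sanity check (only a sketch, reusing the hypothetical my_table above), you can ask the planner whether it actually chooses the partial index for this predicate:

EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM my_table WHERE name IS NOT NULL;
-- An Index Scan or Bitmap Index Scan on the partial index means the planner
-- matched the index's WHERE clause to the query predicate; a Seq Scan usually
-- means the predicate isn't selective enough for the index to be worthwhile.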
Additionally, there is very little gain in terms of space if you still need another index for the name IS NULL rows. See Craig Ringer's answer for details.
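For completeness, and only as an illustrative sketch (not a summary of that answer), the opposite case gets its own partial index, subject to the same selectivity caveats:

CREATE INDEX ON my_table (name) WHERE name IS NULL;
-- Every key in this index is NULL, so the indexed column mainly acts as a
-- row locator; any small column of my_table would work equally well here.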