PostgreSQL - Right Index choice for a status field (varchar)

Tags: sql, postgresql

I have a table with lots of entries and a varchar(8) field that represents different statuses. There are only about 5 different statuses, let's say 'STATUS1', 'STATUS2', ..., and most of the time the field is NULL.

When I index the field, it doesn't help much: there are a lot of equal values, so Postgres doesn't use the index.

My question is: is there a way to index such a field and make queries on it faster? Most of the time I query for status IS NULL, and I think I can't make that faster. But what if I check for status = 'STATUS1'?

Andwari asked Jan 05 '17 10:01



1 Answer

You can use partial indexes in some cases. Let's say you have lots of queries similar to

SELECT *
  FROM the_table
 WHERE color in ('green', 'blue') AND status = 'STATUS1' ;

This query would most probably run (much) faster with a partial index. Given a table such as the one below, the index covers only the rows with status = 'STATUS1':

CREATE TABLE the_table
(
   color text, 
   status character varying(8)
    /* and anything you need */
) ; 

CREATE INDEX
  ON public.the_table (color)
  WHERE status = 'STATUS1' ;
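
The question also asks about filtering on status alone. A minimal sketch, assuming the same the_table as above and that most rows have status NULL: a partial index restricted to the non-NULL rows stays small, and the planner should be able to use it for status = 'STATUS1', because that condition implies status IS NOT NULL. The index name the_table_status_not_null_idx is made up for illustration, and EXPLAIN is the way to confirm what the planner actually picks on your data.

```sql
-- Hypothetical index name; indexes only the (few) rows that have a status set.
CREATE INDEX the_table_status_not_null_idx
  ON public.the_table (status)
  WHERE status IS NOT NULL ;

-- Check which plan the planner chooses for your data:
EXPLAIN
SELECT *
  FROM the_table
 WHERE status = 'STATUS1' ;
```

Queries on status IS NULL match most of the table, so a sequential scan is usually the cheapest plan for them and an index is unlikely to help.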

If using PostgreSQL (or any other database that allows it), I'd probably create an enumerated type as well, instead of using varchar. This has two advantages: only the enumerated values are allowed (so you get "autochecking"), and the space needed to store the value (and to index it) is less than for varchar(8):

CREATE TYPE status_type AS ENUM
   ('STATUS1',
    'STATUS2',
    'STATUS3');

and then create the table with it:

CREATE TABLE the_table
(
   color text, 
   status status_type
    /* and anything you need */
) ; 
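
As a short illustration of the "autochecking", assuming the status_type enum and the enum-based the_table defined above (the row values are made up): valid labels and NULL are accepted, anything else is rejected at insert time. The partial index from earlier can be created on the enum column with the same statement.

```sql
-- Accepted: 'STATUS1' is one of the declared labels.
INSERT INTO the_table (color, status) VALUES ('green', 'STATUS1') ;

-- Accepted: the column is nullable.
INSERT INTO the_table (color, status) VALUES ('blue', NULL) ;

-- Rejected: 'STATUS9' is not a label of status_type,
-- so PostgreSQL raises an "invalid input value for enum" error.
INSERT INTO the_table (color, status) VALUES ('red', 'STATUS9') ;

-- The partial index is declared exactly as with varchar.
CREATE INDEX
  ON public.the_table (color)
  WHERE status = 'STATUS1' ;
```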

If you need to know (programmatically) which are the allowed values for the enumeration (for instance, to create a menu), check here.
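
One way to list the allowed values from SQL, as a sketch assuming the status_type enum defined above, is PostgreSQL's built-in enum_range function (the column alias status_value is only illustrative):

```sql
-- Returns one row per label, in the order the enum was declared.
SELECT unnest(enum_range(NULL::status_type)) AS status_value ;
```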

If the database didn't support enums, I'd normalize the status into a small[ish] lookup table of (anonymous_id_PK, status_value) pairs and reference it from the main table.
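
A minimal sketch of that normalized layout, written in PostgreSQL syntax for consistency with the rest of the answer. The table name statuses, the column names status_id and status_value, and the id values are all hypothetical:

```sql
-- Hypothetical names throughout; smallint keeps the key compact.
CREATE TABLE statuses
(
   status_id smallint PRIMARY KEY,
   status_value character varying(8) NOT NULL UNIQUE
) ;

INSERT INTO statuses (status_id, status_value)
VALUES (1, 'STATUS1'), (2, 'STATUS2'), (3, 'STATUS3') ;

CREATE TABLE the_table
(
   color text,
   status_id smallint REFERENCES statuses (status_id)  -- NULL = no status
    /* and anything you need */
) ;
```

The foreign key plays the role of the enum's value check: only ids present in statuses can be stored.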

joanolo answered Nov 15 '22 03:11