
Postgresql ORDER BY - choosing right index

There is a table T(user, timestamp, ...) with 100M+ records (PostgreSQL 9.1).

A query of the form

SELECT * 
FROM T 
WHERE user='abcd' 
ORDER BY timestamp 
LIMIT 1

uses the timestamp index instead of the user index when there are ~100,000 records for that user.

Using the timestamp index always gives poor results (20+ seconds), since it ends up scanning nearly all records before finding a match. Bypassing the timestamp index by rewriting the query to use ORDER BY DATE(timestamp) makes the planner fall back to the user index and return results in under 100 ms.

  • Total RAM: 64 GB
  • shared_buffers: 16 GB
  • work_mem: 32 MB

Why is PostgreSQL ignoring the user index and using the timestamp index instead (which requires visiting a large share of the table)? Are there any PostgreSQL config parameters that can be altered to make the query use the user index?

asked Nov 25 '14 by Anoop

1 Answer

Nice question! I came across this problem a while ago.

Why this happens?

You should look at how frequent the user='abcd' value is in your statistics, like this:

SELECT attname, null_frac, avg_width, n_distinct,
       most_common_vals, most_common_freqs, histogram_bounds
  FROM pg_stats
 WHERE tablename='t';  -- the catalog stores unquoted identifiers in lower case

My guess is that this value occurs quite often and you'll find it in the most_common_vals output. Pick the matching element from most_common_freqs to get the frequency of the value, then multiply it by the total number of rows (available from pg_class.reltuples) to get the number of rows estimated to have the 'abcd' value.
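As a sketch of that arithmetic (table and column names taken from the question; this assumes 'abcd' actually appears in most_common_vals, and array_position requires PostgreSQL 9.5+):

```sql
-- Estimate the planner's expected row count for user='abcd' by pairing
-- the value's slot in most_common_vals with most_common_freqs and
-- multiplying by the table's total row count from pg_class.
SELECT s.most_common_freqs[
         array_position(s.most_common_vals::text::text[], 'abcd')
       ] * c.reltuples AS estimated_rows
  FROM pg_stats s
  JOIN pg_class c ON c.relname = s.tablename
 WHERE s.tablename = 't'
   AND s.attname   = 'user';
```

If 'abcd' is not among the most common values, the planner instead falls back to n_distinct and the histogram, and the estimate can be much further off.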

The planner assumes values are uniformly distributed across the table. In reality things are different, of course. Also, there are currently no cross-column (correlated) statistics, although some work is being done in this direction.
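One related knob worth knowing (my addition, not part of the original answer): raising the per-column statistics target makes most_common_vals hold more entries, so heavy users like 'abcd' are more likely to be captured with an accurate frequency:

```sql
-- Widen the statistics sample for the user column, then re-analyze.
-- 1000 is an illustrative value; the default target is 100.
ALTER TABLE T ALTER COLUMN "user" SET STATISTICS 1000;
ANALYZE T;
```

This improves the estimate but does not fix the uniform-distribution assumption itself.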

So, let's take the user='abcd' value, with a frequency of 0.001 (per the question) in the corresponding most_common_freqs entry. This means the value occurs once every 1000 rows on average (assuming uniform distribution). So if we scan the table in any order, we expect to hit a user='abcd' row within roughly 1000 rows. That sounds fast! The planner "thinks" the same and chooses the index on the timestamp column.

But it is not. If your table T contains logs of user activity, and user='abcd' was on vacation for the last 3 weeks, then we have to read quite a lot of rows from the timestamp index (3 weeks' worth of data) before we actually hit the row we want. You, as the DBA, know this, but the planner assumes a uniform distribution.
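You can observe this mismatch directly; a sketch (same table as above):

```sql
-- Shows which index the planner picked, its estimated row counts,
-- and how long the scan actually took, so the bad estimate is visible.
EXPLAIN (ANALYZE, BUFFERS)
SELECT *
  FROM T
 WHERE user='abcd'
 ORDER BY timestamp
 LIMIT 1;
```

A large gap between estimated and actual rows (or a long-running index scan on timestamp) confirms the scenario described here.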

So, how to fix?

You'll have to trick the planner into using what you need, since you have more knowledge of your data than it does.

  1. Use OFFSET 0 trick with subquery:

    SELECT *
      FROM
      (
         SELECT * FROM T WHERE user='abcd' OFFSET 0
      ) AS s  -- PostgreSQL requires an alias for a subquery in FROM
      ORDER BY timestamp 
      LIMIT 1;
    

    This trick prevents the subquery from being inlined, so the inner part is planned and executed on its own.

  2. Use CTE (named subquery):

    WITH s AS (
            SELECT * FROM T WHERE user='abcd'
    )
    SELECT *
      FROM s
     ORDER BY timestamp 
     LIMIT 1;
    

    Per documentation:

    A useful property of WITH queries is that they are evaluated only once per execution of the parent query, even if they are referred to more than once by the parent query or sibling WITH queries.

  3. Use count(*) for aggregated queries:

    SELECT min(session_id), count(*) -- instead of simply `min(session_id)`
      FROM T 
     WHERE user='abcd';
     -- (an ungrouped aggregate returns a single row, so an
     --  ORDER BY ... LIMIT 1 here would be rejected)
    

    This is not really applicable here, but I wanted to mention it: adding count(*) forces the whole matching set to be read, which keeps the planner from picking an index scan on another column.

And please, consider upgrading to 9.3.

P.S. More on row estimates in the docs, of course.

answered Oct 04 '22 by vyegorov