Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Optimal Postgres text index for LIKE query?

Using Postgres 9.5, I have a table addresses.

CREATE TABLE addresses (
    id        integer PRIMARY KEY,
    address   text
);

In that table I have 7.5 million rows. Example:

1, "1600 Pennsylvania Avenue NW, Washington, DC, 20500"

I'm using this table for an autosuggest search in my application so I need to use this type of query:

SELECT * FROM addresses WHERE address LIKE '123 Main St%';

I created this index:

CREATE INDEX address_idx ON addresses (address);

But the problem is it's taking around 1 sec which is too slow.

Here's the query plan:

EXPLAIN SELECT * FROM addresses WHERE address LIKE '123 Main St%';
----
Seq Scan on addresses  (cost=0.00..161309.76 rows=740 width=41)
  Filter: (address ~~ '123 Main St%'::text)

I tried creating a few types of gin indexes but they either had no effect or made the query slower. I'm not sure if I was using them correctly though.

Any ideas on how to create an index that's optimized for this kind of query?


EDIT

The best solution found so far is to use a text range scan:

SELECT *
FROM addresses
WHERE address >= '123 Main St' AND
      address <= concat('123 Main St', 'z');
like image 863
Tyler Avatar asked Mar 09 '16 00:03

Tyler


2 Answers

This is an elaboration on the between method and too long for a comment.

If you are using standard ASCII characters, you can use the tilde-trick:

SELECT *
FROM addresses
WHERE address >= '123 Main St' AND
      address <= concat('123 Main St', '~');

Tilde has a larger ASCII value than other characters.

I do note that Postgres should use the index for the LIKE query as well. My guess is that the problem is something to do with compatibility of the types. Perhaps if you converted the pattern to a varchar(), Postgres would use the index.

like image 181
Gordon Linoff Avatar answered Sep 24 '22 00:09

Gordon Linoff


Three things you could try:

  1. If your database is on 'C' locale (which you can check with \l on the psql prompt) then the regular Btree indexes should help in optimizing a LIKE 'abc%' type of query.
  2. If not, you could try using a suitable operator class when creating the Btree index. For e.g. CREATE INDEX tbl_col_text_pattern_ops_idx ON tbl(col text_pattern_ops);
  3. If that doesn't work, you could also try using the GiST / GIN, more detailing for which is given here.

If you'd like to know more you should read Erwin's StackOverflow answer here, that details how different Postgres indexes work with LIKE / ILIKE.

like image 20
Robins Tharakan Avatar answered Sep 24 '22 00:09

Robins Tharakan