LOWER LIKE vs iLIKE

How does the performance of the following two query components compare?

LOWER LIKE

... LOWER(description) LIKE '%abcde%' ...

iLIKE

... description iLIKE '%abcde%' ...

201

asked Dec 02 '13 19:12

user664833

3 Answers

The answer depends on many factors like Postgres version, encoding and locale - LC_COLLATE in particular.

The bare expression lower(description) LIKE '%abc%' is typically a bit faster than description ILIKE '%abc%', and either is a bit faster than the equivalent regular expression: description ~* 'abc'. This matters for sequential scans where the expression has to be evaluated for every tested row.

But for big tables like the one you demonstrate in your answer, one would certainly use an index. For arbitrary patterns (not only left-anchored) I suggest a trigram index using the additional module pg_trgm. Then we talk about milliseconds instead of seconds, and the difference between the above expressions is nullified.

GIN and GiST indexes (using the gin_trgm_ops or gist_trgm_ops operator classes) support LIKE (~~), ILIKE (~~*), ~, ~* (and some more variants) alike. With a trigram GIN index on description (typically bigger than GiST, but faster for reads), your query would use description ILIKE 'case_insensitive_pattern'.
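
As a minimal sketch of such a trigram index, assuming the books table and description column from the question (the index name is made up for illustration):

```sql
-- pg_trgm is an additional module; installing it requires sufficient privileges.
CREATE EXTENSION IF NOT EXISTS pg_trgm;

-- Hypothetical index name; table and column are taken from the question.
CREATE INDEX books_description_trgm_idx
    ON books USING gin (description gin_trgm_ops);

-- Check with EXPLAIN ANALYZE whether the planner picks the index:
EXPLAIN ANALYZE SELECT * FROM books WHERE description ILIKE '%abcde%';
EXPLAIN ANALYZE SELECT * FROM books WHERE description ~* 'abcde';
```

Note that trigram index support for the regular expression operators arrived in PostgreSQL 9.3; on older versions only the LIKE and ILIKE forms can use the index.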

PostgreSQL LIKE query performance variations
Similar UTF-8 strings for autocomplete field

Basics for pattern matching in Postgres:

Pattern matching with LIKE, SIMILAR TO or regular expressions in PostgreSQL

When working with said trigram index it's typically more practical to work with:

description ILIKE '%abc%'

Or with the case-insensitive regexp operator (without % wildcards):

description ~* 'abc'

An index on (description) does not support queries on lower(description) like:

lower(description) LIKE '%abc%'

And vice versa.

With predicates on lower(description) exclusively, the expression index is the slightly better option.

In all other cases, an index on (description) is preferable as it supports both case-sensitive and -insensitive predicates.
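
The expression-index variant might look like this; a sketch only, assuming pg_trgm is already installed and using a made-up index name:

```sql
-- Hypothetical index name; indexes the lowercased expression, not the raw column.
CREATE INDEX books_lower_description_trgm_idx
    ON books USING gin (lower(description) gin_trgm_ops);

-- The predicate matches the indexed expression, so this index can be used:
SELECT * FROM books WHERE lower(description) LIKE '%abcde%';

-- The predicate is on the bare column, so this index cannot be used:
SELECT * FROM books WHERE description ILIKE '%abcde%';
```

The predicate has to repeat the indexed expression exactly, which is why an index on the plain column is the more flexible choice when both case-sensitive and case-insensitive searches occur.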

185

answered Sep 23 '22 05:09

Erwin Brandstetter

According to my tests (ten of each query), LOWER LIKE is about 17% faster than iLIKE.

Explanation

I created a million rows containing random mixed-case text data:

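require 'securerandom'

# Build one VALUES tuple per row; urlsafe_base64 yields a random mixed-case string.
inserts = Array.new(1_000_000) do
  "(1, 'fake', '#{SecureRandom.urlsafe_base64(64)}')"
end

# Load all rows with a single multi-row INSERT statement.
sql = "insert into books (user_id, title, description) values #{inserts.join(', ')}"
ActiveRecord::Base.connection.execute(sql)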

Verify the number of rows:

my_test_db=# select count(id) from books ;
  count  
---------
 1000009

(Yes, I have nine extra rows from other tests - not a problem.)

Example query and results:

my_test_db=# SELECT "books".* FROM "books" WHERE "books"."published" = 'f'
my_test_db=# and (LOWER(description) LIKE '%abcde%') ;
   id    | user_id | title |                                      description                                       | published 
---------+---------+-------+----------------------------------------------------------------------------------------+------
 1232322 |       1 | fake  | 5WRGr7oCKABcdehqPKsUqV8ji61rsNGS1TX6pW5LJKrspOI_ttLNbaSyRz1BwTGQxp3OaxW7Xl6fzVpCu9y3fA | f
 1487103 |       1 | fake  | J6q0VkZ8-UlxIMZ_MFU_wsz_8MP3ZBQvkUo8-2INiDIp7yCZYoXqRyp1Lg7JyOwfsIVdpPIKNt1uLeaBCdelPQ | f
 1817819 |       1 | fake  | YubxlSkJOvmQo1hkk5pA1q2mMK6T7cOdcU3ADUKZO8s3otEAbCdEcmm72IOxiBdaXSrw20Nq2Lb383lq230wYg | f

Results for LOWER LIKE

my_test_db=# EXPLAIN ANALYZE SELECT "books".* FROM "books" WHERE "books"."published" = 'f' and (LOWER(description) LIKE '%abcde%') ;
                                                   QUERY PLAN                                                   
----------------------------------------------------------------------------------------------------------------
 Seq Scan on books  (cost=0.00..32420.14 rows=1600 width=117) (actual time=938.627..4114.038 rows=3 loops=1)
   Filter: ((NOT published) AND (lower(description) ~~ '%abcde%'::text))
   Rows Removed by Filter: 1000006
 Total runtime: 4114.098 ms

Results for iLIKE

my_test_db=# EXPLAIN ANALYZE SELECT "books".* FROM "books" WHERE "books"."published" = 'f' and (description iLIKE '%abcde%') ;
                                                   QUERY PLAN                                                   
----------------------------------------------------------------------------------------------------------------
 Seq Scan on books  (cost=0.00..29920.11 rows=100 width=117) (actual time=1147.612..4986.771 rows=3 loops=1)
   Filter: ((NOT published) AND (description ~~* '%abcde%'::text))
   Rows Removed by Filter: 1000006
 Total runtime: 4986.831 ms
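
For the single pair of runs shown above, the gap can be checked with plain arithmetic. This is only a sanity check on these two samples, not a recomputation of the ten-run average:

```sql
-- Runtimes copied from the two EXPLAIN ANALYZE outputs above (milliseconds).
SELECT round(100 * (4986.831 - 4114.098) / 4986.831, 1) AS lower_like_pct_less_time,  -- 17.5
       round(100 * (4986.831 - 4114.098) / 4114.098, 1) AS ilike_pct_more_time;       -- 21.2
```

So "about 17% faster" here means LOWER LIKE took about 17% less time than iLIKE; put the other way round, iLIKE took about 21% longer.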

Database info disclosure

Postgres version:

my_test_db=# select version();
                                                                                 version
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 PostgreSQL 9.2.4 on x86_64-apple-darwin12.4.0, compiled by i686-apple-darwin11-llvm-gcc-4.2 (GCC) 4.2.1 (Based on Apple Inc. build 5658) (LLVM build 2336.11.00), 64-bit

Collation setting:

my_test_db=# select datcollate from pg_database where datname = 'my_test_db';
 datcollate  
-------------
 en_CA.UTF-8

Table definition:

my_test_db=# \d books 
                                      Table "public.books"
   Column    |            Type             |                       Modifiers
-------------+-----------------------------+-------------------------------------------------------
 id          | integer                     | not null default nextval('books_id_seq'::regclass)
 user_id     | integer                     | not null
 title       | character varying(255)      | not null
 description | text                        | not null default ''::text
 published   | boolean                     | not null default false
Indexes:
    "books_pkey" PRIMARY KEY, btree (id)

answered Sep 24 '22 05:09

user664833

In my Rails project, ILIKE is almost 10x faster than LOWER LIKE. I added a GIN index on the entities.name column:

> Entity.where("LOWER(name) LIKE ?", name.strip.downcase).limit(1).first
Entity Load (2443.9ms)  SELECT  "entities".* FROM "entities" WHERE (lower(name) like 'baidu') ORDER BY "entities"."id" ASC LIMIT $1  [["LIMIT", 1]]

> Entity.where("name ILIKE ?", name.strip).limit(1).first
Entity Load (285.0ms)  SELECT  "entities".* FROM "entities" WHERE (name ilike 'Baidu') ORDER BY "entities"."id" ASC LIMIT $1  [["LIMIT", 1]]

# explain analyze SELECT  "entities".* FROM "entities" WHERE (name ilike 'Baidu') ORDER BY "entities"."id" ASC LIMIT 1;
                                                                   QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=3186.03..3186.04 rows=1 width=1588) (actual time=7.812..7.812 rows=1 loops=1)
   ->  Sort  (cost=3186.03..3187.07 rows=414 width=1588) (actual time=7.811..7.811 rows=1 loops=1)
         Sort Key: id
         Sort Method: quicksort  Memory: 26kB
         ->  Bitmap Heap Scan on entities  (cost=1543.21..3183.96 rows=414 width=1588) (actual time=7.797..7.805 rows=1 loops=1)
               Recheck Cond: ((name)::text ~~* 'Baidu'::text)
               Rows Removed by Index Recheck: 6
               Heap Blocks: exact=7
               ->  Bitmap Index Scan on index_entities_on_name  (cost=0.00..1543.11 rows=414 width=0) (actual time=7.787..7.787 rows=7 loops=1)
                     Index Cond: ((name)::text ~~* 'Baidu'::text)
 Planning Time: 6.375 ms
 Execution Time: 7.874 ms
(12 rows)

A GIN index really helps ILIKE performance.
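
One likely reason for the gap, offered as a reading of the plan above rather than something the logs confirm: the index is on name itself, so the lower(name) predicate cannot use it. A sketch, assuming the existing index uses the pg_trgm operator class (its definition is not shown here) and using a made-up name for the second index:

```sql
-- Assumed shape of the existing index: the name comes from the plan above,
-- the operator class is a guess.
CREATE INDEX index_entities_on_name
    ON entities USING gin (name gin_trgm_ops);

-- Hypothetical expression index that would let the lower(name) query use an index as well.
CREATE INDEX index_entities_on_lower_name
    ON entities USING gin (lower(name) gin_trgm_ops);
```

With both indexes in place, the comparison between the two query forms would be on more equal footing.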

answered Sep 25 '22 05:09

lfx_cool

Tags: performance, pattern-matching, postgresql, database-performance