Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

DocumentDB: Performance impact of built-in string functions (like UPPER)

In our .NET application we use the DocumentDB SDK to query our Azure DocumentDB. We were trying to find the cause of a performance problem when we realized that the built-in string functions in queries seems to have a big impact on performance.

I was going to paste some stats that I got from our application but I've been able to replicate the situation with the playground here: https://www.documentdb.com/sql/demo (click on the sandbox tab)

With the following query:

SELECT *
FROM food 
WHERE food.description="Babyfood, dessert, fruit pudding, orange, strained"

I get:

Query with no UPPER function

And with UPPER string function:

SELECT *
FROM food 
WHERE UPPER(food.description)=UPPER("Babyfood, dessert, fruit pudding, orange, strained")

I get:

Query with UPPER string function

The absolute numbers don't really matter here, in our application we apply UPPER on an email field and we see a big difference. Operation takes 1s with no UPPER vs 20s with it!

like image 284
phil_lgr Avatar asked Mar 10 '23 17:03

phil_lgr


1 Answers

With a few exceptions, any time you use functions on field values it can't use indexes so the query becomes a full table scan. The best way around this is to store the values in another field already UPPER and query against that. Alternatively, if you can combine a more highly selective clause with an UPPER() clause you'll get better performance.

like image 177
Larry Maccherone Avatar answered Apr 26 '23 23:04

Larry Maccherone