Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Slow query with cfqueryparam searching on indexed column containing hashes

I have the following query that runs in 16ms - 30ms.

<cfquery name="local.test1" datasource="imagecdn">
    SELECT hash FROM jobs WHERE hash in(
        'EBDA95630915EB80709C69089315399B',
        '3617B8E6CF0C62ECBD3C48DDF8585466',
        'D519A38F09FDA868A2FEF1C55C9FEE76',
        '135F94C3774F7719CFF8FF3A275D2D05',
        'D58FAE69C559273D8427673A08193789',
        '2BD7276F209768F2FCA6635659D7922A',
        'B1E3CFBFCCFF6F5B48A849A050E6D424',
        '2288F5B8A797F5302E8CA24323617236',
        '8951883E36B5D38A4643DFAA0396BF13',
        '839210BD564E30BE1355D1A6D4EF7081',
        'ED4A2CB0C28B608C29576819CF7BE19B',
        'CB26925A4874945B810707D5FF0B91F2',
        '33B2FC229F0CC797A02AD163CDBA0875',
        '624986E7547DBAC0F47B3005CFDE0A16',
        '6F692C289BD805CEE41EF59F83F16F4D',
        '8551F0033C617BD9EADAAD6CEC4B3E9E',
        '94C3C0A74C2DE085FF9F1BBF928821A4',
        '28DC1A9D2A69C2EDF5E6C0E6368A0B3C'
    )
</cfquery>

If I execute the same query but use cfqueryparam it runs in 500ms - 2000ms.

<cfset local.hashes = "[list of the same ids as above]">
<cfquery name="local.test2" datasource="imagecdn">
    SELECT hash FROM jobs WHERE hash in(
        <cfqueryparam cfsqltype="cf_sql_varchar" value="#local.hashes#" list="yes">
    )
</cfquery>

The table has roughly 60,000 rows. The "hash" column is varchar(50) and has a unique non-clustered index, but is not the primary key. DB server is MSSQL 2008. The web server is running the latest version of CF9.

Any idea why the cfqueryparam causes the performance to bomb out? It behaves this way every single time, no matter how many times I refresh the page. If I pair the list down to only 2 or 3 hashes, it still performs poorly at like 150-200ms. When I eliminate the cfqueryparam the performance is as expected. In this situation there is the possibility for SQL injection and thus using cfqueryparam would certainly be preferable, but it shouldn't take 100ms to find 2 records from an indexed column.

Edits:

  1. We are using hashes generated by hash() not UUIDS or GUIDS. The hash is generated by a hash(SerializeJSON({ struct })) which contains the plan for a set of operations to execute on an image. The purpose for this is that it allows us to know before insert and before query the exact unique id for that structure. These hashes act as an "index" of what structures have already been stored in the DB. In addition with hashes the same structure will hash to the same result, which is not true for UUIDS and GUIDS.

  2. The query is being executed on 5 different CF9 servers and all of them exhibit the same behavior. To me this rules out the idea that CF9 is caching something. All servers are connecting to the exact same DB so if caching was occurring it would have to be the DB level.

like image 949
Nucleon Avatar asked Feb 21 '23 13:02

Nucleon


1 Answers

Your issue may be related to VARCHAR vs NVARCHAR. These 2 links may help Querying MS SQL Server G/UUIDs from ColdFusion and nvarchar vs. varchar in SQL Server, BEWARE

What might be happening is there is a setting in ColdFusion administrator if cfqueryparam sends varchars as unicode or not. If that setting does not match the column setting (in your case, if that setting is enabled) then MS SQL will not use that index.

like image 147
Yisroel Avatar answered Apr 06 '23 18:04

Yisroel