I am doing some crude benchmarks with the xml datatype of SQL Server 2008. I've seen many places where .exist is used in where clauses. I recently compared two queries, though, and got odd results.
select count(testxmlrid) from testxml
where Attributes.exist('(form/fields/field)[@id="1"]')=1
This query takes about 1.5 seconds to run, with no indexes on anything but the primary key (testxmlrid).
select count(testxmlrid) from testxml
where Attributes.value('(/form/fields/field/@id)[1]','integer')=1
This query, on the other hand, takes about 0.75 seconds to run.
I'm using untyped XML and my benchmarking is taking place on a SQL Server 2008 Express instance. There are about 15,000 rows in the dataset and each XML string is about 25 lines long.
Are these results I'm getting correct? If so, why does everyone use .exist? Am I doing something wrong, and could .exist be faster?
You are not counting the same things. Your .exist query (form/fields/field)[@id="1"] checks all occurrences of @id in the XML until it finds one with the value 1, and your .value query (/form/fields/field/@id)[1] only fetches the first occurrence of @id.
Test this:
declare @T table
(
testxmlrid int identity primary key,
Attributes xml
)
insert into @T values
('<form>
<fields>
<field id="2"/>
<field id="1"/>
</fields>
</form>')
select count(testxmlrid) from @T
where Attributes.exist('(form/fields/field)[@id="1"]')=1
select count(testxmlrid) from @T
where Attributes.value('(/form/fields/field/@id)[1]','integer')=1
The .exist query count is 1 because it finds the @id=1 in the second field node, and the .value query count is 0 because it only checks the value for the first occurrence of @id.
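To see what each expression has to work with, it can help to list every @id the document contains. The following is only a sketch: it assumes the same sample XML as above, held in a hypothetical variable @x, and uses .nodes() to shred the field elements into rows.

```sql
-- Hypothetical variable holding the same sample XML as the test table above.
declare @x xml = '<form>
<fields>
<field id="2"/>
<field id="1"/>
</fields>
</form>';

-- One row per field element; F(N) is an arbitrary table/column alias.
-- For this sample the id values are 2 and 1.
select F.N.value('@id', 'int') as id
from @x.nodes('/form/fields/field') as F(N);
```

The .value query only ever sees the first of these values, while the .exist query can match any of them.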
An .exist query that only checks the value for the first occurrence of @id, like your .value query, would look like this.
select count(testxmlrid) from @T
where Attributes.exist('(/form/fields/field/@id)[1][.="1"]')=1
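The three expressions can also be compared side by side on a single document. This is a minimal sketch, again assuming the sample XML from above in a hypothetical variable @x; the column aliases are made up for illustration.

```sql
-- Hypothetical variable holding the same sample XML as the test table above.
declare @x xml = '<form>
<fields>
<field id="2"/>
<field id="1"/>
</fields>
</form>';

select
    -- Any field with @id = "1"? Yes, the second one: returns 1.
    @x.exist('(form/fields/field)[@id="1"]') as exist_any,
    -- Value of the first @id only: returns 2.
    @x.value('(/form/fields/field/@id)[1]', 'int') as value_first,
    -- Is the first @id equal to "1"? No: returns 0.
    @x.exist('(/form/fields/field/@id)[1][.="1"]') as exist_first;
```

When benchmarking, the last two expressions are the comparable pair, because both look only at the first @id.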