Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

T-SQL: How can I compare two variables of type XML when length > VarChar(MAX)?

Using only SQL Server 2008 R2 (this is going to be in a stored proc), how can I determine if two variables of type XML are equivalent?

Here is what I want to do:

DECLARE @XmlA   XML
DECLARE @XmlB   XML

SET @XmlA = '[Really long Xml value]'
SET @XmlB = '[Really long Xml value]'

IF @XmlA = @XmlB
    SELECT 'Matching Xml!'

But as you probably know, it returns:

Msg 305, Level 16, State 1, Line 7 The XML data type cannot be compared or sorted, except when using the IS NULL operator.

I can convert to VarChar(MAX) and compare, but that only compares the first 2MB. Is there another way?

like image 761
ScottBai Avatar asked Jan 26 '12 03:01

ScottBai


People also ask

How do I compare two string lengths in SQL?

We can compare two or more strings using the STRCMP string function, LIKE operator, and Equal operator.

What is the character limit of VARCHAR Max?

Values in VARCHAR columns are variable-length strings. The length can be specified as a value from 0 to 65,535. The effective maximum length of a VARCHAR is subject to the maximum row size (65,535 bytes, which is shared among all columns) and the character set used.

What is the difference between VARCHAR Max and Nvarchar Max?

The key difference between varchar and nvarchar is the way they are stored, varchar is stored as regular 8-bit data(1 byte per character) and nvarchar stores data at 2 bytes per character. Due to this reason, nvarchar can hold upto 4000 characters and it takes double the space as SQL varchar.

Does VARCHAR Max affect performance?

Yes, it can affect performance, as space needs to be allocated to hold large values in the query engine. In your case, you could also use a suitably large size such as varchar(50) which would easily hold whatever you needed.


2 Answers

Check this SQL function:

CREATE FUNCTION [dbo].[CompareXml]
(
    @xml1 XML,
    @xml2 XML
)
RETURNS INT
AS 
BEGIN
    DECLARE @ret INT
    SELECT @ret = 0


    -- -------------------------------------------------------------
    -- If one of the arguments is NULL then we assume that they are
    -- not equal. 
    -- -------------------------------------------------------------
    IF @xml1 IS NULL OR @xml2 IS NULL 
    BEGIN
        RETURN 1
    END

    -- -------------------------------------------------------------
    -- Match the name of the elements 
    -- -------------------------------------------------------------
    IF  (SELECT @xml1.value('(local-name((/*)[1]))','VARCHAR(MAX)')) 
        <> 
        (SELECT @xml2.value('(local-name((/*)[1]))','VARCHAR(MAX)'))
    BEGIN
        RETURN 1
    END

     ---------------------------------------------------------------
     --Match the value of the elements
     ---------------------------------------------------------------
    IF((@xml1.query('count(/*)').value('.','INT') = 1) AND (@xml2.query('count(/*)').value('.','INT') = 1))
    BEGIN
    DECLARE @elValue1 VARCHAR(MAX), @elValue2 VARCHAR(MAX)

    SELECT
        @elValue1 = @xml1.value('((/*)[1])','VARCHAR(MAX)'),
        @elValue2 = @xml2.value('((/*)[1])','VARCHAR(MAX)')

    IF  @elValue1 <> @elValue2
    BEGIN
        RETURN 1
    END
    END

    -- -------------------------------------------------------------
    -- Match the number of attributes 
    -- -------------------------------------------------------------
    DECLARE @attCnt1 INT, @attCnt2 INT
    SELECT
        @attCnt1 = @xml1.query('count(/*/@*)').value('.','INT'),
        @attCnt2 = @xml2.query('count(/*/@*)').value('.','INT')

    IF  @attCnt1 <> @attCnt2 BEGIN
        RETURN 1
    END


    -- -------------------------------------------------------------
    -- Match the attributes of attributes 
    -- Here we need to run a loop over each attribute in the 
    -- first XML element and see if the same attribut exists
    -- in the second element. If the attribute exists, we
    -- need to check if the value is the same.
    -- -------------------------------------------------------------
    DECLARE @cnt INT, @cnt2 INT
    DECLARE @attName VARCHAR(MAX)
    DECLARE @attValue VARCHAR(MAX)

    SELECT @cnt = 1

    WHILE @cnt <= @attCnt1 
    BEGIN
        SELECT @attName = NULL, @attValue = NULL
        SELECT
            @attName = @xml1.value(
                'local-name((/*/@*[sql:variable("@cnt")])[1])', 
                'varchar(MAX)'),
            @attValue = @xml1.value(
                '(/*/@*[sql:variable("@cnt")])[1]', 
                'varchar(MAX)')

        -- check if the attribute exists in the other XML document
        IF @xml2.exist(
                '(/*/@*[local-name()=sql:variable("@attName")])[1]'
            ) = 0
        BEGIN
            RETURN 1
        END

        IF  @xml2.value(
                '(/*/@*[local-name()=sql:variable("@attName")])[1]', 
                'varchar(MAX)')
            <>
            @attValue
        BEGIN
            RETURN 1
        END

        SELECT @cnt = @cnt + 1
    END

    -- -------------------------------------------------------------
    -- Match the number of child elements 
    -- -------------------------------------------------------------
    DECLARE @elCnt1 INT, @elCnt2 INT
    SELECT
        @elCnt1 = @xml1.query('count(/*/*)').value('.','INT'),
        @elCnt2 = @xml2.query('count(/*/*)').value('.','INT')


    IF  @elCnt1 <> @elCnt2
    BEGIN
        RETURN 1
    END


    -- -------------------------------------------------------------
    -- Start recursion for each child element
    -- -------------------------------------------------------------
    SELECT @cnt = 1
    SELECT @cnt2 = 1
    DECLARE @x1 XML, @x2 XML
    DECLARE @noMatch INT

    WHILE @cnt <= @elCnt1 
    BEGIN

        SELECT @x1 = @xml1.query('/*/*[sql:variable("@cnt")]')
    --RETURN CONVERT(VARCHAR(MAX),@x1)
    WHILE @cnt2 <= @elCnt2
    BEGIN
        SELECT @x2 = @xml2.query('/*/*[sql:variable("@cnt2")]')
        SELECT @noMatch = dbo.CompareXml( @x1, @x2 )
        IF @noMatch = 0 BREAK
        SELECT @cnt2 = @cnt2 + 1
    END

    SELECT @cnt2 = 1

        IF @noMatch = 1
        BEGIN
            RETURN 1
        END

        SELECT @cnt = @cnt + 1
    END

    RETURN @ret
END

Here is the Source


The function fails to compare XML fragments e.g. when there is not a single root element, like:

SELECT dbo.CompareXml('<data/>', '<data/><data234/>') 

In order to fix this, you must wrap your XMLs in root elements, when they are passed to the function or edit the function to do this. For, example:

SELECT dbo.CompareXml('<r><data/></r>', '<r><data/><data234/></r>')  
like image 192
gotqn Avatar answered Nov 15 '22 11:11

gotqn


There are many different ways of comparing two XML documents, and a lot depends on what kind of differences you want to tolerate: you definitely need to tolerate differences in encoding, attribute order, insignificant whitespace, numeric character references, and use of attribute delimiters, and you should probably also tolerate differences in use of comments, namespace prefixes, and CDATA. So comparing two XML documents as strings is definitely not a good idea - unless you invoke XML canonicalization first.

For many purposes the XQuery deep-equals() function does the right thing (and is more-or-less equivalent to comparing the canonical forms of the two XML documents). I don't know enough about Microsoft's SQL Server implementation of XQuery to tell you how to invoke this from the SQL level.

like image 28
Michael Kay Avatar answered Nov 15 '22 11:11

Michael Kay