Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Invalid XML in a varbinary(max) column in SQL Server 2005

So I got a varbinary(max) column in SQL Server 2005 and it's full of XML. A few records somewhere have truncated XML so they're invalid.

This means if I run a

SELECT CAST(myVarbinaryColumn as XML) ... 

it blows chunks.

How can I filter out/skip invalid xml ?

When I've done similar with a varchar that supposedly has dates I could use ISDATE(blah) = 1. So an equivalent ISVALIDXML() would be nice.

Please don't comment about "Why isn't the column XML datatype anyway.." This happened in the past and I don't have a time machine.

like image 317
hollystyles Avatar asked Sep 22 '10 16:09

hollystyles


People also ask

What is the max size of Varbinary in SQL Server?

varbinary [ ( n | max) ] Variable-length binary data. n can be a value from 1 through 8,000. max indicates that the maximum storage size is 2^31-1 bytes. The storage size is the actual length of the data entered + 2 bytes.

How can I see Varbinary data in SQL Server?

INSERT INTO #TempTable(PK, VarBinaryColumn) SELECT PK, VarBinaryColumn FROM dbo. YourPermanentTable; If you need to convert the varbinary data back to the original file text format in T-SQL, you can use CAST or CONVERT to convert to varchar or nvarchar as long as the data was originally ASCII or Unicode.


2 Answers

I think your best bet would be to write a custom CLR function, perhaps using XmlDocument.Load. In the CLR you could trap the error on a failed load and return an appropriate result.

EDIT: The code below would also work although it's not as elegant as a UDF. Unfortunately, we can't use TRY/CATCH in a UDF.

create procedure dbo.usp_IsValidXML(@XMLCandidate varbinary(max), @Return bit output)
as
begin
    declare @x xml
    begin try
        set @x = cast(@XMLCandidate as xml)
        set @Return = 1
    end try
    begin catch
        set @Return = 0
    end catch
end
go

declare @test1 varbinary(max)
declare @test2 varbinary(max)
set @test1 = cast('<data>asdf</data>' as varbinary(max))
set @test2 = cast('<data>asdf</da' as varbinary(max))

declare @IsValid bit
exec dbo.usp_IsValidXML @test1, @IsValid output
select @IsValid
exec dbo.usp_IsValidXML @test2, @IsValid output
select @IsValid

drop procedure dbo.usp_IsValidXML
like image 108
Joe Stefanelli Avatar answered Oct 06 '22 00:10

Joe Stefanelli


I wish I had that IsValidXML() function, too..... unfortunately, I don't think there's anything like that.....

Just a thought: is there anything you could check for to filter out the invalid XML??

E.g. could you filter out all those strings that don't end in .....</data> or something like that?? (seeing that you say your invalid XML is typically truncated XML, so I would think the closing tag - </data> or whatever - would be missing in those cases).

like image 34
marc_s Avatar answered Oct 06 '22 00:10

marc_s