Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to check VARCHAR(n) for well-formed XML before CAST/CONVERT

My company has a logging table containing a VARCHAR(N) column in which a string is placed that is supposed to be XML, but as it turns out it is not always well-formed. In order to perform analysis on the logging (to identify error trends, etc.), I have been using a LIKE statement. However, this is remarkably slow.

Recently, I discovered that SQL Server supports XQuery, so I started playing with it. The problem I'm running into is that I can't figure out a way to handle CAST/CONVERT errors within my SELECT statement. The closest I've come requires SQL Server 2012 as it has the TRY_CONVERT function, but upgrading from 2008 R2 is not an option right now.

Here is what I have (which would work if my company was running 2012):

CREATE FUNCTION IsMatch(
    @message AS VARCHAR(MAX),
    @match AS VARCHAR(MAX),
    @default AS VARCHAR(MAX))
RETURNS VARCHAR(MAX)
AS
BEGIN
    DECLARE @xml XML = TRY_CONVERT(XML, @message)
    DECLARE @result VARCHAR(MAX) =
        CASE WHEN @xml IS NOT NULL
             THEN CASE WHEN @xml.exist('(/FormattedMessage)[contains(.,sql:variable("@match"))]') = 1
                       THEN @match 
                       ELSE @default 
                       END
             ELSE CASE WHEN @message LIKE '%' + @match + '%'
                       THEN @match
                       ELSE @default
                       END
             END
    RETURN @result
END

GO

DECLARE @search VARCHAR(MAX) = 'a substring of my xml error message'

SELECT Error, COUNT(*) as 'Count'
FROM ( SELECT TOP 319 [LogID]
          ,[Severity]
          ,[Title]
          ,[Timestamp]
          ,[MachineName]
          ,[FormattedMessage]
          --,CAST([formattedmessage] as xml)
          ,IsMatch(@search, 'Other') as 'Error'
       FROM [MyDatabase].[dbo].[Log] (NOLOCK) ) a
GROUP BY Error

The commented CAST (or alternately a CONVERT) will cause the query to error as soon as it encounters malformed XML. If I limit to a TOP (N), I can ensure that there are no errors and the SELECT statement work incredibly fast. I just need a way to handle errors on a per-row basis.

I considered using a TRY/CATCH in IsMatch(), but that can't be used in a function. Alternatively, to use the TRY/CATCH, I considered a stored proc, but I can't figure out how to include that in my SELECT clause.

like image 717
gregsdennis Avatar asked Oct 06 '22 02:10

gregsdennis


1 Answers

If you're stuck on 2008 R2 I think what you want to do is use a readonly forward CURSOR in your stored procedure. Then use the TRY CATCH block inside of the WHILE @@FETCH_STATUS = 0 loop.

DECLARE logcursor CURSOR FORWARD_ONLY READ_ONLY FOR 
SELECT TOP 319 [LogId]
          ,[formattedmessage]
       FROM [GenesisLogging].[dbo].[Log] (NOLOCK)

OPEN logcursor

FETCH NEXT FROM logcursor
INTO @id, @formattedmessage

WHILE @@FETCH_STATUS = 0
BEGIN
    BEGIN TRY
    SET @xml = CONVERT(xml, @formattedmessage)
    END TRY
    BEGIN CATCH
    PRINT @id
    END CATCH; 
    FETCH NEXT FROM logcursor 
    INTO @id, @formattedmessage
END 
CLOSE logcursor;
DEALLOCATE logcursor;
like image 52
Louis Ricci Avatar answered Oct 10 '22 03:10

Louis Ricci