
How to compare if two strings contain the same words in T-SQL for SQL Server 2008?

When I compare two strings in SQL Server, there are a couple of simple ways to do it, such as = or LIKE.

I want to redefine equality as:

If two strings contain the same words - no matter in what order - they are equal, otherwise they are not.

For example:

  • 'my word' and 'word my' are equal
  • 'my word' and 'aaamy word' are not

What's the best simple solution for this problem?

KentZhou asked Mar 16 '12 15:03


2 Answers

I don't think there is a simple solution for what you are trying to do in SQL Server. My first thought would be to create a CLR UDF that:

  1. Accepts two strings
  2. Breaks them into two arrays using the split function on " "
  3. Compares the contents of the two arrays, returning true if they contain the same elements.

If this is a route you'd like to go, take a look at this article to get started on creating CLR UDFs.
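
For comparison, the three steps above can also be sketched in plain T-SQL, without CLR. This is only an illustration under assumptions: dbo.SplitWords is a hypothetical helper name (not a built-in SQL Server function), and @a and @b are placeholder variables. It splits each string on a separator with a recursive CTE and then compares the two word lists as sets using EXCEPT.

```sql
-- Hypothetical helper (name and shape are assumptions, not a built-in):
-- returns one row per word found between separators.
CREATE FUNCTION dbo.SplitWords(@sep char(1), @s varchar(8000))
RETURNS TABLE
AS
RETURN
    WITH pos_cte AS (
        -- pos = position of the current separator, lastPos = the previous one
        SELECT CHARINDEX(@sep, @s) AS pos, 0 AS lastPos
        UNION ALL
        SELECT CHARINDEX(@sep, @s, pos + 1), pos
        FROM pos_cte
        WHERE pos > 0
    )
    SELECT SUBSTRING(@s, lastPos + 1,
                     CASE WHEN pos = 0 THEN 8000   -- last word: take the rest
                          ELSE pos - lastPos - 1 END) AS word
    FROM pos_cte;
GO

-- Placeholder inputs for illustration.
DECLARE @a varchar(8000), @b varchar(8000);
SET @a = 'my word';
SET @b = 'word my';

-- The strings match when neither word list has a word the other lacks.
SELECT CASE WHEN NOT EXISTS (SELECT word FROM dbo.SplitWords(' ', @a)
                             EXCEPT
                             SELECT word FROM dbo.SplitWords(' ', @b))
             AND NOT EXISTS (SELECT word FROM dbo.SplitWords(' ', @b)
                             EXCEPT
                             SELECT word FROM dbo.SplitWords(' ', @a))
            THEN 'Equal' ELSE 'Not Equal' END AS ResultCheck;
```

Some caveats with this sketch: EXCEPT works on distinct values, so repeated words are ignored ('the the' would match 'the'). Consecutive separators produce empty words, which could be filtered with WHERE word <> ''. The comparison follows the database collation, so it may or may not be case-sensitive. The recursive CTE is also subject to SQL Server's default recursion limit of 100 levels, so very long strings would need a MAXRECURSION hint on the calling query.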

Abe Miessler answered Sep 28 '22 17:09


Try this. The StringSorter function splits a string on a separator (a space in the examples here), sorts the words, and puts the string back together in sorted word order.

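CREATE FUNCTION dbo.StringSorter(@sep char(1), @s varchar(8000))
RETURNS varchar(8000)
AS
BEGIN
    DECLARE @ResultVar varchar(8000);

    -- Walk the string with a recursive CTE: each row holds the position of
    -- one separator (pos) and of the separator before it (lastPos).
    -- pos = 0 marks the final chunk and ends the recursion.
    WITH sorter_cte AS (
        SELECT CHARINDEX(@sep, @s) AS pos, 0 AS lastPos
        UNION ALL
        SELECT CHARINDEX(@sep, @s, pos + 1), pos
        FROM sorter_cte
        WHERE pos > 0
    )
    -- Cut out the text between each pair of separator positions.
    , step2_cte AS (
        SELECT SUBSTRING(@s, lastPos + 1,
                 CASE WHEN pos = 0 THEN 8000  -- last chunk: take the rest of the string
                      ELSE pos - lastPos - 1 END) AS chunk
        FROM sorter_cte
    )
    -- Concatenate the chunks in sorted order, each prefixed with a space.
    SELECT @ResultVar = (SELECT ' ' + chunk
                         FROM step2_cte
                         ORDER BY chunk
                         FOR XML PATH(''));
    RETURN @ResultVar;
END
GO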

Here is a test case just trying out the function:

SELECT dbo.StringSorter(' ', 'the quick brown dog jumped over the lazy fox');

which produced these results:

  brown dog fox jumped lazy over quick the the

Then, to run it from a SELECT statement using your strings:

SELECT CASE WHEN dbo.StringSorter(' ', 'my word') =
                 dbo.StringSorter(' ', 'word my')
            THEN 'Equal' ELSE 'Not Equal' END AS ResultCheck;
SELECT CASE WHEN dbo.StringSorter(' ', 'my word') =
                 dbo.StringSorter(' ', 'aaamy word')
            THEN 'Equal' ELSE 'Not Equal' END AS ResultCheck;

The first query returns 'Equal' and the second returns 'Not Equal'.

This should do exactly what you are looking for with a simple function utilizing a recursive CTE to sort your string.
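
If the strings live in a table rather than in literals, the same comparison can be applied per row. A minimal sketch, assuming dbo.StringSorter has already been created as above. The table variable @pairs and its column names are hypothetical and exist only for this illustration.

```sql
-- Hypothetical sample data; table and column names are illustrative only.
DECLARE @pairs TABLE (id int, left_text varchar(100), right_text varchar(100));

INSERT INTO @pairs (id, left_text, right_text)
VALUES (1, 'my word', 'word my'),
       (2, 'my word', 'aaamy word');

-- Compare the sorted-word form of both columns for every row.
SELECT id,
       CASE WHEN dbo.StringSorter(' ', left_text) =
                 dbo.StringSorter(' ', right_text)
            THEN 'Equal' ELSE 'Not Equal' END AS ResultCheck
FROM @pairs;
```

With the two sample rows, row 1 should come back as 'Equal' and row 2 as 'Not Equal', matching the literal tests above. Note that the scalar function runs once per column per row, so on large tables this may be slow.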

Enjoy!

Steve Stedman answered Sep 28 '22 18:09