I am trying to write a query for Stack Exchange's very own Data Explorer. This query will create a temporary table containing a list of commonly misspelled words and their proper spellings and then search through the posts to find them.
Here is what I have so far (minus the comments):
DECLARE @SpellingMistakes TABLE (wrong VARCHAR(255), right VARCHAR(255))
INSERT INTO @SpellingMistakes (wrong, right)
VALUES ('ubunut', 'ubuntu')
SELECT Id as [Post Link]
FROM Posts
WHERE
...
And that's where I get stuck - in the WHERE clause. I need some way of saying "if Posts.Body contains any of @SpellingMistakes.wrong". However, I'm not sure how to do that.
Note: the data explorer uses Microsoft SQL Server 2008 R2.
I don't know MS SQL, but most SQL implementations have a 'LIKE' equivalent. So in that case, you could join the two tables and use LIKE in the JOIN condition.
SELECT Id as link
FROM Posts P JOIN SpellingMistakes S
ON P.Body LIKE '%'+S.wrong+'%'
EDIT: Assuming Posts is a large table (and SpellingMistakes is not too small either), this will take a lot of resources. One way to tackle this is to split table Posts into smaller subsets and construct multiple statements.
SELECT Id as link
FROM (SELECT * FROM Posts WHERE Id<=10000) P
JOIN
SpellingMistakes S
ON P.Body LIKE '%'+S.wrong+'%'
SELECT Id as link
FROM (SELECT * FROM Posts WHERE Id<=20000 and Id>10000) P
JOIN
SpellingMistakes S
ON P.Body LIKE '%'+S.wrong+'%'
And so on.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With