What is the SQL used to do a search similar to "Related Questions" on Stackoverflow

Tags: text, sql, search

I am trying to implement a feature similar to the "Related Questions" list on Stack Overflow.

How do I write a SQL statement that searches the Title and Summary fields of my database for similar questions?

Suppose my question is: "What is the SQL used to do a search similar to "Related Questions" on Stackoverflow".

The steps I can think of are:

  1. Strip the quotation marks
  2. Split the sentence into an array of words and run a SQL search on each word.
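The two steps above can be sketched as follows. This is only a minimal illustration under assumptions: it uses Python with an in-memory SQLite database as a stand-in for SQL Server, the function names `split_words` and `naive_related` are hypothetical, and the `Issue` table with its `PostId`, `Title`, and `Summary` columns is an assumed schema.

```python
import re
import sqlite3
from collections import Counter


def split_words(title):
    """Drop quotation marks and punctuation, then split into lowercase words."""
    return re.findall(r"[a-z0-9]+", title.lower())


def naive_related(conn, title, limit=5):
    """Run one LIKE search per distinct word and rank rows by words matched."""
    hits = Counter()
    for word in set(split_words(title)):
        pattern = f"%{word}%"
        # Assumed schema: Issue(PostId, Title, Summary).
        rows = conn.execute(
            "SELECT PostId FROM Issue WHERE Title LIKE ? OR Summary LIKE ?",
            (pattern, pattern),
        )
        for (post_id,) in rows:
            hits[post_id] += 1
    # Rows that matched the most words come first.
    return [post_id for post_id, _ in hits.most_common(limit)]
```

A leading-wildcard LIKE per word cannot use an ordinary index seek and treats every word as equally important, which is part of why this approach tends to be slow and noisy compared with a full-text index.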

If I do it this way, I suspect I wouldn't get any meaningful results. I am not sure whether Full-Text Search is enabled on the server, so I am not using it. Would there be an advantage to using Full-Text Search?

I found a similar question but there was no answer: similar question

I am using SQL Server 2005.

Picflight asked Jun 01 '09 22:06



3 Answers

Check out this podcast.

One of our major performance optimizations for the “related questions” query is removing the top 10,000 most common English dictionary words (as determined by Google search) before submitting the query to the SQL Server 2008 full text engine. It’s shocking how little is left of most posts once you remove the top 10k English dictionary words. This helps limit and narrow the returned results, which makes the query dramatically faster.
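The filtering step described in that quote might look roughly like the sketch below. It is an illustration only: `COMMON_WORDS` is a tiny hand-picked stand-in for the 10,000-word list the quote mentions, and the function name `distinctive_terms` is hypothetical, not something taken from the podcast.

```python
import re

# Tiny illustrative stand-in; the quoted approach uses a list of the
# 10,000 most common English words.
COMMON_WORDS = frozenset({
    "a", "do", "is", "on", "questions", "related", "search",
    "similar", "the", "to", "used", "what",
})


def distinctive_terms(text, common_words=COMMON_WORDS):
    """Return the words of text that are not common, in first-seen order."""
    seen = set()
    terms = []
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        if word not in common_words and word not in seen:
            seen.add(word)
            terms.append(word)
    return terms


title = ('What is the SQL used to do a search similar to '
         '"Related Questions" on Stackoverflow')
print(distinctive_terms(title))  # ['sql', 'stackoverflow']
```

Only the surviving terms would then be handed to the full-text query, which is what keeps the search string short.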

Nick Dandoulakis answered Oct 11 '22 10:10


They probably relate questions to each other based on the tags added to them...
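If tags were used this way, one possible query ranks other questions by how many tags they share with the current one. The sketch below is a guess at the idea, not a description of what Stack Overflow does: the `QuestionTag(QuestionId, Tag)` table and the `related_by_tags` function are hypothetical, and SQLite stands in for SQL Server (where `TOP` would replace `LIMIT`).

```python
import sqlite3

# Self-join on the tag: every row pairs the given question with another
# question that carries the same tag. SQLite syntax; SQL Server would use
# TOP instead of LIMIT.
RELATED_BY_TAGS = """
SELECT other.QuestionId, COUNT(*) AS SharedTags
FROM QuestionTag AS mine
JOIN QuestionTag AS other
  ON other.Tag = mine.Tag
 AND other.QuestionId <> mine.QuestionId
WHERE mine.QuestionId = ?
GROUP BY other.QuestionId
ORDER BY SharedTags DESC, other.QuestionId
LIMIT ?
"""


def related_by_tags(conn, question_id, limit=5):
    """Return (QuestionId, SharedTags) pairs, most shared tags first."""
    return conn.execute(RELATED_BY_TAGS, (question_id, limit)).fetchall()
```

Tag overlap alone is coarse, since many questions share popular tags, so it would more plausibly serve as one signal alongside a text search than as a replacement for it.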

Ropstah answered Oct 11 '22 09:10


After enabling Full-Text Search on my SQL Server 2005 instance, I am using the following stored procedure to search for text.

ALTER PROCEDURE [dbo].[GetSimilarIssues]
(
    @InputSearch varchar(255)
)
AS
BEGIN
    -- SET NOCOUNT ON added to prevent extra result sets from
    -- interfering with SELECT statements.
    SET NOCOUNT ON;

    -- FREETEXT word-breaks and stems the search string itself, so the
    -- input is passed as-is. Wrapping it in double quotes asks for a
    -- phrase match, and the trailing * prefix syntax belongs to
    -- CONTAINS, not FREETEXT.
    SELECT PostId, Summary, [Description], Created
    FROM Issue
    WHERE FREETEXT(Summary, @InputSearch);
END
Picflight answered Oct 11 '22 08:10
