
SQL Server 2008 Full Text Search (FTS) versus Lucene.NET

I know there have been questions in the past about SQL Server 2005 versus Lucene.NET, but SQL Server 2008 has since come out with a lot of changes to Full Text Search. Can anyone give me the pros and cons of each (or a link to an article)?

ajma asked Jan 31 '09 18:01



1 Answer

SQL Server FTS is going to be easier to manage for a small deployment. Because FTS is integrated with the database, the RDBMS can keep the index up to date automatically as the data changes. The downside is that there is no obvious way to scale out short of replicating databases. So if you don't need to scale, SQL Server FTS is probably the "safer" choice. Politically, most shops are also going to be more comfortable with a pure SQL Server solution.

On the Lucene side, I would favor Solr over straight-up Lucene. With either one you have to do more work yourself: updating the index when the data changes, and mapping your data to the Solr/Lucene index. The pros are that you can easily scale by adding additional indexes, and you can run these indexes on very lean Linux servers, which eliminates some license costs.

If you take the Lucene/Solr route, I would aim to put ALL the data you need directly into the index, rather than storing pointers back to the DB in the index. You can include data in the index that is not searchable; for example, you could store pre-built HTML or XML in the index and serve it up as the search result. With this approach your DB could be down and you would still be able to serve search results in a disconnected mode.
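To make the "stored but not searchable" idea concrete, here is a minimal sketch in plain Python. It is a toy inverted index, not Lucene's or Solr's API: the class `TinyIndex`, its methods, and the sample rows are all hypothetical names made up for illustration. It only shows the shape of the approach: one field is tokenized for search, a second payload (pre-built HTML) is stored alongside it, and results are served from the index alone.

```python
import re
from collections import defaultdict


class TinyIndex:
    """Toy inverted index (hypothetical, not the Lucene API).

    `indexed_text` is searchable; `stored_payload` is only kept for display.
    """

    def __init__(self):
        self.postings = defaultdict(set)  # term -> set of document ids
        self.stored = {}                  # document id -> stored payload

    def add(self, doc_id, indexed_text, stored_payload):
        # Re-adding an id replaces the old entry. This is the manual work of
        # mirroring a changed database row into the index.
        self.delete(doc_id)
        for term in re.findall(r"\w+", indexed_text.lower()):
            self.postings[term].add(doc_id)
        self.stored[doc_id] = stored_payload

    def delete(self, doc_id):
        for ids in self.postings.values():
            ids.discard(doc_id)
        self.stored.pop(doc_id, None)

    def search(self, query):
        # AND semantics: a document must contain every query term.
        terms = re.findall(r"\w+", query.lower())
        if not terms:
            return []
        ids = set.intersection(*(self.postings.get(t, set()) for t in terms))
        # Results come straight from the stored payloads; no database lookup.
        return [self.stored[i] for i in sorted(ids)]


index = TinyIndex()
index.add(1, "Lucene scaling tips", "<li><a href='/posts/1'>Lucene scaling tips</a></li>")
index.add(2, "SQL Server full text search", "<li><a href='/posts/2'>SQL Server FTS</a></li>")

for html in index.search("lucene"):
    print(html)
```

Searching for `href` here finds nothing, because the HTML payload is stored but never tokenized. In a real Lucene or Solr setup the same split is expressed through field options (indexed versus stored) rather than hand-written code like this.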

I've never seen a head-to-head performance comparison between SQL Server 2008 FTS and Lucene, but I would love to see one.

Lee Harold answered Oct 04 '22 20:10
