Exact phrase search using Lucene.net

Tags: search, lucene, lucene.net

I am having trouble searching for an exact phrase using Lucene.NET 2.0.0.4.

For example, I search for "scope attribute sets the variable" (including the quotes) but receive no matches, even though I have confirmed that the phrase exists in the document.

Can anyone suggest where I am going wrong? Is this even supported by Lucene.NET? As usual, the API documentation is not very helpful, and the few CodeProject articles I've read don't specifically touch on this.

I use the following code to create the index:

Directory dir = Lucene.Net.Store.FSDirectory.GetDirectory("Index", true);

Analyzer analyzer = new Lucene.Net.Analysis.SimpleAnalyzer();

IndexWriter indexWriter = new Lucene.Net.Index.IndexWriter(dir, analyzer, true);

//create a document, add in a single field
Lucene.Net.Documents.Document doc = new Lucene.Net.Documents.Document();

Lucene.Net.Documents.Field fldContent = new Lucene.Net.Documents.Field(
    "content", File.ReadAllText(@"Documents\100.txt"),
    Lucene.Net.Documents.Field.Store.YES,
    Lucene.Net.Documents.Field.Index.TOKENIZED);

doc.Add(fldContent);

//write the document to the index
indexWriter.AddDocument(doc);

//close the writer so the document is flushed to the index on disk
indexWriter.Close();

I then search for a phrase using:

//state the file location of the index
Directory dir = Lucene.Net.Store.FSDirectory.GetDirectory("Index", false);

//create an index searcher that will perform the search
IndexSearcher searcher = new Lucene.Net.Search.IndexSearcher(dir);

QueryParser qp = new QueryParser("content", new SimpleAnalyzer());

//txtSearch.Text contains a quoted phrase such as "this is a phrase"
Query q = qp.Parse(txtSearch.Text);

//execute the query
Lucene.Net.Search.Hits hits = searcher.Search(q);

The target document is about 7 MB plain text.

I have seen this previous question; however, I don't want a proximity search, just an exact phrase search.

asked May 12 '09 02:05

Ash

2 Answers

Shashikant Kore's answer is correct: you need to enable term positions.

However, I would recommend not storing the text of the document in the field unless you absolutely need it returned to you in the search results. Setting Field.Store to NO might help reduce the size of your index a bit.

Lucene.Net.Documents.Field fldContent = new Lucene.Net.Documents.Field(
    "content", File.ReadAllText(@"Documents\100.txt"),
    Lucene.Net.Documents.Field.Store.NO,
    Lucene.Net.Documents.Field.Index.TOKENIZED,
    Lucene.Net.Documents.Field.TermVector.WITH_POSITIONS_OFFSETS);

answered Oct 27 '22 15:10

josefresno

You have not enabled term positions. Creating the field as follows should solve your problem.

Lucene.Net.Documents.Field fldContent = new Lucene.Net.Documents.Field(
    "content", File.ReadAllText(@"Documents\100.txt"),
    Lucene.Net.Documents.Field.Store.YES,
    Lucene.Net.Documents.Field.Index.TOKENIZED,
    Lucene.Net.Documents.Field.TermVector.WITH_POSITIONS_OFFSETS);

answered Oct 27 '22 13:10

Shashikant Kore
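
The role of term positions in an exact phrase match can be illustrated without Lucene.NET at all. The following is a minimal standalone sketch, not Lucene.NET code and not a description of how the library stores its index: the class PhrasePositionsSketch and its methods Tokenize, BuildPositions and ContainsPhrase are hypothetical names invented for this illustration, and the tokenizer only approximates SimpleAnalyzer by splitting on non-letters and lowercasing. It shows that a phrase matches only when its terms occur at consecutive positions, which is why position information is needed for this kind of search.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Standalone illustration only; all names here are hypothetical, not Lucene.NET API.
public static class PhrasePositionsSketch
{
    // Split text into lowercase runs of letters (an approximation of SimpleAnalyzer).
    public static List<string> Tokenize(string text)
    {
        var tokens = new List<string>();
        var current = new StringBuilder();
        foreach (char c in text)
        {
            if (char.IsLetter(c))
            {
                current.Append(char.ToLowerInvariant(c));
            }
            else if (current.Length > 0)
            {
                tokens.Add(current.ToString());
                current.Length = 0;
            }
        }
        if (current.Length > 0) tokens.Add(current.ToString());
        return tokens;
    }

    // Map each term to the ascending list of positions at which it occurs.
    public static Dictionary<string, List<int>> BuildPositions(List<string> tokens)
    {
        var index = new Dictionary<string, List<int>>();
        for (int i = 0; i < tokens.Count; i++)
        {
            List<int> positions;
            if (!index.TryGetValue(tokens[i], out positions))
            {
                positions = new List<int>();
                index[tokens[i]] = positions;
            }
            positions.Add(i);
        }
        return index;
    }

    // The phrase matches if, for some start position of its first term,
    // the k-th term of the phrase occurs at position start + k.
    public static bool ContainsPhrase(Dictionary<string, List<int>> index, string phrase)
    {
        List<string> terms = Tokenize(phrase);
        if (terms.Count == 0) return false;

        List<int> starts;
        if (!index.TryGetValue(terms[0], out starts)) return false;

        foreach (int start in starts)
        {
            bool match = true;
            for (int k = 1; k < terms.Count && match; k++)
            {
                List<int> positions;
                match = index.TryGetValue(terms[k], out positions)
                        && positions.BinarySearch(start + k) >= 0;
            }
            if (match) return true;
        }
        return false;
    }

    public static void Main()
    {
        var index = BuildPositions(Tokenize("The scope attribute sets the variable's visibility."));

        // Terms in the indexed order at consecutive positions: prints True
        Console.WriteLine(ContainsPhrase(index, "\"scope attribute sets the variable\""));

        // Same terms in a different order: prints False
        Console.WriteLine(ContainsPhrase(index, "\"attribute scope sets\""));
    }
}
```

Note that the quotes in the query string are simply dropped by this toy tokenizer. In the real QueryParser they are what marks the input as a phrase rather than a set of independent terms.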
