Lucene and Special Characters

I am using Lucene.Net 2.0 to index some fields from a database table. One of the fields is a 'Name' field which allows special characters. When I perform a search, it does not find my document that contains a term with special characters.

I index the field as follows:

Directory DALDirectory = FSDirectory.GetDirectory(@"C:\Indexes\Name", false);
Analyzer analyzer = new StandardAnalyzer();
IndexWriter indexWriter = new IndexWriter(DALDirectory, analyzer, true, IndexWriter.MaxFieldLength.UNLIMITED);

Document doc = new Document();
doc.Add(new Field("Name", "Test (Test)", Field.Store.YES, Field.Index.TOKENIZED));
indexWriter.AddDocument(doc);

indexWriter.Optimize();
indexWriter.Close();

And I search doing the following:

value = value.Trim().ToLower();
value = QueryParser.Escape(value);

Query searchQuery = new TermQuery(new Term(field, value));
Searcher searcher = new IndexSearcher(DALDirectory);

TopDocCollector collector = new TopDocCollector(searcher.MaxDoc());
searcher.Search(searchQuery, collector);
ScoreDoc[] hits = collector.TopDocs().scoreDocs;

If I perform a search for field as 'Name' and value as 'Test', it finds the document. If I perform the same search as 'Name' and value as 'Test (Test)', then it does not find the document.

Even stranger, if I remove the QueryParser.Escape line and search for a GUID (which, of course, contains hyphens), it finds the documents where the GUID value matches, but performing the same search with the value 'Test (Test)' still yields no results.

I am unsure what I am doing wrong. I am using the QueryParser.Escape method to escape the special characters, and I am storing and searching the field the way the Lucene.Net examples do.

Any thoughts?

asked Apr 28 '10 20:04

Brandon

1 Answer

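StandardAnalyzer strips out the special characters (and lowercases the text) during indexing, so 'Test (Test)' ends up in the index as the term 'test', not as the original string. A TermQuery is not run through any analyzer, so the escaped value is compared with the indexed terms verbatim and never matches. You can pass StandardAnalyzer an explicit list of stop words (leaving out the ones you want kept), but note that this only controls which words are dropped; the punctuation is removed by the tokenizer either way.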
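
To make the mismatch concrete, here is a minimal sketch that does not use Lucene.Net at all. Analyze and Escape are hypothetical stand-ins that only approximate what StandardAnalyzer and QueryParser.Escape do (the real analyzer has more rules, for example around numbers and stop words), and the "index" is just a set of strings. It is meant to illustrate the idea, not to reproduce the library's behaviour exactly.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;

// "Index" the field value: only the analyzed terms are kept.
var indexedTerms = new HashSet<string>(Analyze("Test (Test)"));

// What the search code in the question does with the user's input.
string raw = "Test (Test)".Trim().ToLower();
string escaped = Escape(raw);

// A TermQuery-style lookup compares the whole string with the indexed terms.
bool termQueryHit = indexedTerms.Contains(escaped);

// Analyzing the query text the same way as the field yields matching terms.
bool analyzedHit = Analyze(raw).All(t => indexedTerms.Contains(t));

Console.WriteLine(termQueryHit); // prints "False"
Console.WriteLine(analyzedHit);  // prints "True"

// Hypothetical stand-in for StandardAnalyzer (assumption): lowercase the
// text and keep only runs of ASCII letters and digits.
static List<string> Analyze(string text)
{
    return Regex.Matches(text.ToLowerInvariant(), "[a-z0-9]+")
                .Cast<Match>()
                .Select(m => m.Value)
                .ToList();
}

// Hypothetical stand-in for QueryParser.Escape (assumption): put a
// backslash in front of each query-syntax character.
static string Escape(string s)
{
    const string special = "\\+-!():^[]\"{}~*?|&";
    var sb = new StringBuilder();
    foreach (char c in s)
    {
        if (special.IndexOf(c) >= 0) sb.Append('\\');
        sb.Append(c);
    }
    return sb.ToString();
}
```

In Lucene.Net terms, this suggests either building the query with a QueryParser that uses the same analyzer as the index, or indexing the field untokenized if you need exact matches. Which one fits depends on how you want the field to be searchable, so treat both as options to try rather than a definitive fix.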

answered Oct 05 '22 14:10

Mikos

Tags: c#, indexing, lucene, lucene.net
