How can I enable different analyzers for each field in a document I'm indexing with Lucene? Example:
RAMDirectory dir = new RAMDirectory(); IndexWriter iw = new IndexWriter(dir, new StandardAnalyzer(Lucene.Net.Util.Version.LUCENE_CURRENT), true, IndexWriter.MaxFieldLength.UNLIMITED); Document doc = new Document(); Field field1 = new Field("field1", someText1, Field.Store.YES, Field.Index.ANALYZED, Field.TermVector.WITH_POSITIONS_OFFSETS); Field field2 = new Field("field2", someText2, Field.Store.YES, Field.Index.ANALYZED, Field.TermVector.WITH_POSITIONS_OFFSETS); doc.Add(field1); doc.Add(field2); iw.AddDocument(doc); iw.Commit();
The analyzer is an argument to the IndexWriter, but I want to use StandardAnalyzer for field1 and SimpleAnalyzer for field2, how can I do that? The same applies when searching, of course. The correct analyzer must be applied for each field.
In a nutshell an analyzer is used to tell elasticsearch how the text should be indexed and searched. And what you're looking into is the Analyze API, which is a very nice tool to understand how analyzers work. The text is provided to this API and is not related to the index.
An analyzer examines the text of fields and generates a token stream. Analyzers are specified as a child of the <fieldType> element in the schema. xml configuration file (in the same conf/ directory as solrconfig. xml ). In normal usage, only fields of type solr.TextField will specify an analyzer.
PerFieldAnalyzerWrapper is what you are looking for. The equivalent of this in Lucene.net is here.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With