Lucene - Exact string matching

Tags: java, lucene, tokenize

I'm trying to create a Lucene 4.10 index. I just want to save in the index the exact strings that I put into the document, without tokenization.

I'm using the StandardAnalyzer.

    Directory dir = FSDirectory.open(new File("myDire"));
    Analyzer analyzer = new StandardAnalyzer();
    IndexWriterConfig iwc = new IndexWriterConfig(Version.LUCENE_4_10_0, analyzer);
    iwc.setOpenMode(OpenMode.CREATE);
    IndexWriter writer = new IndexWriter(dir, iwc);
    StringField field1 = new StringField("1", content1, Store.YES);
    StringField field2 = new StringField("2", content2, Store.YES);
    StringField field3 = new StringField("3", content3, Store.YES);
    Document doc = new Document();
    doc.add(field1);
    doc.add(field2);
    doc.add(field3);
    writer.addDocument(doc, analyzer);
    writer.close();

If I print the index's content, I can see my data being stored. For example, my document has this field "3":

    stored,indexed,tokenized,omitNorms,indexOptions=DOCS_ONLY<3:"Fuel Tank Capacity"@en>

I'm trying to query the index in order to get it back:

    IndexSearcher searcher = new IndexSearcher(reader);
    Analyzer analyzer = new StandardAnalyzer();
    QueryParser parser = new QueryParser("3", analyzer);
    String queryString = "\"\"Fuel Tank Capacity\"@en\"";
    Query query = parser.createPhraseQuery("3", QueryParser.escape(queryString));
    TopDocs docs = searcher.search(query, null, 20);

I want to search for the term "Fuel Tank Capacity"@en (quotation marks included), so I escaped the inner quotation marks and wrapped the whole string in another pair of quotes to tell Lucene that I'm searching for the entire text.

If I print the query, I get 3:"fuel tank capacity en", but I don't want the text to be split on the @ symbol.

I think that my first problem is the StandardAnalyzer, because it seems to tokenize, if I'm not mistaken. However, I cannot understand how to query the index in order to get exactly "Fuel Tank Capacity"@en (quotation marks included).

Thank you

asked Sep 12 '14 13:09

LucaT

1 Answer

You could simplify matters, and just cut the QueryParser out of the equation entirely. Since you are using a StringField, the whole content of the field is a single term, so a simple TermQuery should work well:

    Query query = new TermQuery(new Term("3", "\"Fuel Tank Capacity\"@en"));
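To illustrate why the parsed query cannot match while a single-term lookup can, here is a minimal sketch in plain Java. It is a toy model, not Lucene code: the class `ExactTermDemo` and its methods `addUntokenized`, `termLookup`, and `roughAnalyze` are hypothetical names made up for this example, and `roughAnalyze` is only a crude approximation of what an analyzer such as StandardAnalyzer does (lowercasing and splitting on non-alphanumeric characters).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Toy model of term-level matching. This is NOT Lucene code. */
public class ExactTermDemo {

    // Maps an indexed term to the ids of the documents that contain it.
    private final Map<String, List<Integer>> postings = new HashMap<>();

    /** Indexes the whole value as one term, the way an untokenized field behaves. */
    public void addUntokenized(int docId, String value) {
        postings.computeIfAbsent(value, k -> new ArrayList<>()).add(docId);
    }

    /** Looks up one term exactly as given, with no analysis applied. */
    public List<Integer> termLookup(String term) {
        return postings.getOrDefault(term, new ArrayList<>());
    }

    /** Assumed stand-in for an analyzer: lowercase, then split on non-alphanumerics. */
    public static List<String> roughAnalyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        ExactTermDemo index = new ExactTermDemo();
        String value = "\"Fuel Tank Capacity\"@en";
        index.addUntokenized(0, value);

        // The analyzed form of the query text: several lowercase tokens.
        System.out.println(roughAnalyze(value));

        // Looking up the untouched value finds the document.
        System.out.println(index.termLookup(value));

        // None of the analyzed tokens exists as a term in the index.
        for (String token : roughAnalyze(value)) {
            System.out.println(token + " -> " + index.termLookup(token));
        }
    }
}
```

In this model the lookup with the whole, unmodified value finds document 0, while each of the tokens fuel, tank, capacity, and en finds nothing. That mirrors the situation in the question: the index holds one term per field, so the query has to supply that same term without passing it through an analyzer.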

answered Sep 19 '22 05:09

femtoRgon
