How to get unique results from a Lucene index?

I am searching a Lucene index and want only unique results, but the search also returns duplicates. From searching on Google, I found that this can be done with the help of a collector. How can I achieve this?

I am using the following code:

File outputdir = new File("path upto lucene directory");
Directory directory = FSDirectory.open(outputdir);
// Open the index in read-only mode
IndexSearcher indexSearcher = new IndexSearcher(directory, true);

QueryParser queryparser = new QueryParser(Version.LUCENE_36, "keyword", new StandardAnalyzer(Version.LUCENE_36));

Query query = queryparser.parse("central");

// maxhits is the maximum number of hits to return
TopDocs topdocs = indexSearcher.search(query, maxhits);
ScoreDoc[] score = topdocs.scoreDocs;
int length = score.length;

asked Nov 18 '13 09:11

adesh singh

2 Answers

Are you indexing the content before each search?

If so, I suggest separating the indexing code from the searching code. If you run this script several times without deleting the index folder, Lucene does not overwrite the index; it adds the same content to the index again. I think this is why you get duplicate results.
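
To make the append-versus-replace behaviour concrete, here is a minimal sketch in plain Java. `ToyIndex` and its methods are hypothetical names for an in-memory stand-in; this is not Lucene's API. In Lucene itself, the closest counterparts are opening the writer with `IndexWriterConfig.OpenMode.CREATE` to start from an empty index, or calling `IndexWriter.updateDocument(Term, Document)` with a unique key term so that an existing document is replaced rather than added a second time.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy in-memory stand-in for an index (hypothetical; NOT Lucene's API). */
final class ToyIndex {
    // Each entry is {key, text}; a real index would hold Lucene documents.
    private final List<String[]> docs = new ArrayList<>();

    /** Appends unconditionally, so re-running an indexing job adds copies. */
    void add(String key, String text) {
        docs.add(new String[] {key, text});
    }

    /** Removes any entry with the same key, then appends the new one. */
    void update(String key, String text) {
        docs.removeIf(d -> d[0].equals(key));
        add(key, text);
    }

    int size() {
        return docs.size();
    }

    /** Simulates running the same indexing job several times on one index. */
    static int sizeAfterRuns(int runs, boolean replaceByKey) {
        ToyIndex index = new ToyIndex();
        for (int i = 0; i < runs; i++) {
            if (replaceByKey) {
                index.update("doc-1", "central station");
            } else {
                index.add("doc-1", "central station");
            }
        }
        return index.size();
    }
}
```

With this model, `ToyIndex.sizeAfterRuns(3, false)` returns 3 (one copy per run), while `ToyIndex.sizeAfterRuns(3, true)` returns 1, which is the behaviour you want if the indexing code may run more than once.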

answered Jan 02 '23 13:01

Chavjoh

You could add a field, named for example "duplicate", and set its value to "true" at indexing time whenever the document already has a duplicate in the index.

Then you can exclude the flagged documents when searching:

Query query = queryparser.parse("central -duplicate:true");
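
The flagging step itself could look roughly like the sketch below, written in plain Java so it runs without Lucene. `DuplicateFlagSketch`, `Entry`, `indexWithFlags` and `search` are hypothetical names, and the sketch assumes that two documents count as duplicates when their keyword text is identical; a real indexer would add the flag as a field on the Lucene document instead.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Plain-Java model of flagging duplicates at index time (hypothetical names). */
final class DuplicateFlagSketch {

    /** One indexed entry: its keyword text and whether it was flagged. */
    static final class Entry {
        final String keyword;
        final boolean duplicate;

        Entry(String keyword, boolean duplicate) {
            this.keyword = keyword;
            this.duplicate = duplicate;
        }
    }

    /** Index-time step: flag every keyword that has been seen before. */
    static List<Entry> indexWithFlags(List<String> keywords) {
        Set<String> seen = new HashSet<>();
        List<Entry> entries = new ArrayList<>();
        for (String keyword : keywords) {
            // Set.add returns false when the value was already present.
            boolean duplicate = !seen.add(keyword);
            entries.add(new Entry(keyword, duplicate));
        }
        return entries;
    }

    /** Search-time step: models the query "term -duplicate:true". */
    static List<String> search(List<Entry> entries, String term) {
        List<String> hits = new ArrayList<>();
        for (Entry entry : entries) {
            if (!entry.duplicate && entry.keyword.contains(term)) {
                hits.add(entry.keyword);
            }
        }
        return hits;
    }
}
```

Indexing "central park", "central park", "central station" and then searching for "central" returns only "central park" and "central station", because the second "central park" entry carries the flag and is excluded.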

answered Jan 02 '23 11:01

fatih

Tags: duplicates, lucene
