Lucene 2.9.2: How to show results in random order?

Question

By default, Lucene returns the query results in the order of relevance (score). You can pass a sort field (or multiple), then the results get sorted by that field.

I am looking now for a nice solution to get the search results in random order.

The bad approach:
Of course I could take ALL results and then shuffle the collection, but in case of 5 Mio search results, that's not performing well.

The elegant paged approach:
With this approach you would be able to tell Lucene the following:
a) Give me results 1 to 10 out of 5Mio results in random order
b) Then give me 11 to 20 (based on the same random sequence used in a).
c) Just to clarify: If you call a) twice you get the same random elements.

How can you implement this approach??

Update Jul27 2012: Be aware that the solution described here for Lucene 2.9.x is not working properly. Using the RandomOrderScoreDocComparator will result in having certain results twice in the resulting list.

Adam Paynter · Accepted Answer

You could write a custom FieldComparator:

public class RandomOrderFieldComparator extends FieldComparator<Integer> {

    private final Random random = new Random();

    @Override
    public int compare(int slot1, int slot2) {
        return random.nextInt();
    }

    @Override
    public int compareBottom(int doc) throws IOException {
        return random.nextInt();
    }

    @Override
    public void copy(int slot, int doc) throws IOException {
    }

    @Override
    public void setBottom(int bottom) {
    }

    @Override
    public void setNextReader(IndexReader reader, int docBase) throws IOException {
    }

    @Override
    public Integer value(int slot) {
        return random.nextInt();
    }

}

This doesn't consume any I/O when shuffling the results. Here is my sample program that demonstrates how you use this:

public static void main(String... args) throws Exception {
    RAMDirectory directory = new RAMDirectory();

    Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_33);

    IndexWriter writer = new IndexWriter(
            directory,
            new IndexWriterConfig(Version.LUCENE_33, analyzer).setOpenMode(OpenMode.CREATE_OR_APPEND)
        );

    Document alice = new Document();
    alice.add( new Field("name", "Alice", Field.Store.YES, Field.Index.ANALYZED) );
    writer.addDocument( alice );

    Document bob = new Document();
    bob.add( new Field("name", "Bob", Field.Store.YES, Field.Index.ANALYZED) );
    writer.addDocument( bob );

    Document chris = new Document();
    chris.add( new Field("name", "Chris", Field.Store.YES, Field.Index.ANALYZED) );
    writer.addDocument( chris );

    writer.close();


    IndexSearcher searcher = new IndexSearcher( directory );



    for (int pass = 1; pass <= 10; pass++) {
        Query query = new MatchAllDocsQuery();

        Sort sort = new Sort(
                new SortField(
                        "",
                        new FieldComparatorSource() {

                            @Override
                            public FieldComparator<Integer> newComparator(String fieldname, int numHits, int sortPos, boolean reversed) throws IOException {
                                return new RandomOrderFieldComparator();
                            }

                        }
                    )
            );

        TopFieldDocs topFieldDocs = searcher.search( query, 10, sort );

        System.out.print("Pass #" + pass + ":");
        for (int i = 0; i < topFieldDocs.totalHits; i++) {
            System.out.print( " " + topFieldDocs.scoreDocs[i].doc );
        }
        System.out.println();
    }
}

It yields up this output:

Pass #1: 1 0 2
Pass #2: 1 0 2
Pass #3: 0 1 2
Pass #4: 0 1 2
Pass #5: 0 1 2
Pass #6: 1 0 2
Pass #7: 0 2 1
Pass #8: 1 2 0
Pass #9: 2 0 1
Pass #10: 0 2 1

Bonus! For those of you trapped in Lucene 2

public class RandomOrderScoreDocComparator implements ScoreDocComparator {

    private final Random random = new Random();

    public int compare(ScoreDoc i, ScoreDoc j) {
        return random.nextInt();
    }

    public Comparable<?> sortValue(ScoreDoc i) {
        return Integer.valueOf( random.nextInt() );
    }

    public int sortType() {
        return SortField.CUSTOM;
    }

}

All you have to change is the Sort object:

Sort sort = new Sort(
    new SortField(
        "",
        new SortComparatorSource() {
            public ScoreDocComparator newComparator(IndexReader reader, String fieldName) throws IOException {
                return new RandomOrderScoreDocComparator();
            }
        }
    )
);

Lucene 2.9.2: How to show results in random order?

Tags:

java

algorithm

random

shuffle

lucene

basZero

1 Answers

Bonus! For those of you trapped in Lucene 2

Adam Paynter

Recent Activity

Donate For Us

Lucene 2.9.2: How to show results in random order?

Tags:

java

algorithm

random

shuffle

lucene

basZero

1 Answers

Bonus! For those of you trapped in Lucene 2

Adam Paynter

Related questions

Recent Activity

Donate For Us