Lucene.Net what am I doing wrong?

I'm very new to Lucene.Net. I wrote this simple console app in C# which indexes some fake data. I then wanted to search the index for various terms using a BooleanQuery.

I never get any results back. Here is the code. Any help would be greatly appreciated. Thanks.

    static void Main(string[] args)
    {
        StandardAnalyzer analyzer = new StandardAnalyzer();
        IndexWriter writer = new IndexWriter("Test", analyzer, true);
        Console.WriteLine("Creating index");
        for (int i = 0; i < 1500; i++)
        {
            Lucene.Net.Documents.Document doc = new Lucene.Net.Documents.Document();
            doc.Add(new Lucene.Net.Documents.Field("A", i.ToString(), Lucene.Net.Documents.Field.Store.YES, Lucene.Net.Documents.Field.Index.NO));
            doc.Add(new Lucene.Net.Documents.Field("B", "LALA" + i.ToString(), Lucene.Net.Documents.Field.Store.YES, Lucene.Net.Documents.Field.Index.NO));
            doc.Add(new Lucene.Net.Documents.Field("C", "DODO" + i.ToString(), Lucene.Net.Documents.Field.Store.YES, Lucene.Net.Documents.Field.Index.NO));
            doc.Add(new Lucene.Net.Documents.Field("D", i.ToString() + " MMMMM", Lucene.Net.Documents.Field.Store.YES, Lucene.Net.Documents.Field.Index.NO));
            writer.AddDocument(doc);
        }            
        writer.Optimize();
        writer.Close();

        BooleanQuery query = new BooleanQuery();
        query.Add(new WildcardQuery(new Term("B", "lala*")), Lucene.Net.Search.BooleanClause.Occur.MUST);
        query.Add(new WildcardQuery(new Term("C", "DoDo1*")), Lucene.Net.Search.BooleanClause.Occur.MUST);

        IndexSearcher searcher = new IndexSearcher("Test");
        Hits hits = searcher.Search(query);
        if (hits.Length() > 0)
        {
            for (int i = 0; i < hits.Length(); i++)
            {
                Console.WriteLine("{0} - {1} - {2} - {3}", 
                    hits.Doc(i).GetField("A").StringValue(),
                    hits.Doc(i).GetField("B").StringValue(),
                    hits.Doc(i).GetField("C").StringValue(),
                    hits.Doc(i).GetField("D").StringValue());
            }
        }
        searcher.Close();

        Console.WriteLine("Done");

        Console.ReadLine();
    }

I then got it to work by using MultiFieldQueryParser, like so:

    static void Main(string[] args)
    {
        StandardAnalyzer analyzer = new StandardAnalyzer();            

        IndexWriter writer = new IndexWriter("Test", analyzer, true);
        Console.WriteLine("Creating index");
        for (int i = 0; i < 1500; i++)
        {
            Lucene.Net.Documents.Document doc = new Lucene.Net.Documents.Document();
            doc.Add(new Lucene.Net.Documents.Field("A", i.ToString(), Lucene.Net.Documents.Field.Store.YES, Lucene.Net.Documents.Field.Index.TOKENIZED));
            doc.Add(new Lucene.Net.Documents.Field("B", "LALA" + i.ToString(), Lucene.Net.Documents.Field.Store.YES, Lucene.Net.Documents.Field.Index.TOKENIZED));
            doc.Add(new Lucene.Net.Documents.Field("C", "DODO" + i.ToString(), Lucene.Net.Documents.Field.Store.YES, Lucene.Net.Documents.Field.Index.TOKENIZED));
            doc.Add(new Lucene.Net.Documents.Field("D", i.ToString() + " MMMMM", Lucene.Net.Documents.Field.Store.YES, Lucene.Net.Documents.Field.Index.TOKENIZED));
            writer.AddDocument(doc);
        }            
        writer.Optimize();
        writer.Close();            

        BooleanQuery.SetMaxClauseCount(5000);
        Query query = MultiFieldQueryParser.Parse(new string[] { "LALA*", "DODO*" }, new string[] { "B", "C" }, analyzer); 

        IndexSearcher searcher = new IndexSearcher("Test");
        Hits hits = searcher.Search(query);
        if (hits.Length() > 0)
        {
            for (int i = 0; i < hits.Length(); i++)
            {
                Console.WriteLine("{0} - {1} - {2} - {3}", 
                    hits.Doc(i).GetField("A").StringValue(),
                    hits.Doc(i).GetField("B").StringValue(),
                    hits.Doc(i).GetField("C").StringValue(),
                    hits.Doc(i).GetField("D").StringValue());
            }
        }
        searcher.Close();

        Console.WriteLine("Done");

        Console.ReadLine();
    }

This is possibly the best article I've found for new Lucene developers: http://www.ifdefined.com/blog/post/2009/02/Full-Text-Search-in-ASPNET-using-LuceneNET.aspx

dnoxs asked Dec 04 '09 11:12


1 Answer

I think the problem is in how you build your index. You add four fields to each document; all of them are stored, but none of them is indexed (Lucene.Net.Documents.Field.Index.NO). You should index at least one field, and in particular every field you want to search on.

Beware that the StandardAnalyzer tokenizes each indexed field as follows: it splits the text into tokens, lowercases them, and removes common English stop words. So when building your query, use a lowercase prefix in order to get hits:

    query.Add(new PrefixQuery(new Term("B", "lala")), BooleanClause.Occur.MUST);
    query.Add(new PrefixQuery(new Term("C", "dodo")), BooleanClause.Occur.MUST);
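
Putting both points together, a minimal sketch of the corrected program might look like the following. It assumes the same older Lucene.Net API used in the question (`Hits`, `Field.Index.TOKENIZED`, the string-path constructors). Keeping field A unindexed and raising the clause limit with `BooleanQuery.SetMaxClauseCount(5000)` are choices made for this sketch; the latter is borrowed from the question's second listing, on the assumption that each prefix query may expand to one clause per matching term.

```csharp
using System;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Documents;
using Lucene.Net.Index;
using Lucene.Net.Search;

class Program
{
    static void Main(string[] args)
    {
        StandardAnalyzer analyzer = new StandardAnalyzer();
        IndexWriter writer = new IndexWriter("Test", analyzer, true);
        for (int i = 0; i < 1500; i++)
        {
            Document doc = new Document();
            // A is only displayed, never searched, so stored-but-not-indexed is enough.
            doc.Add(new Field("A", i.ToString(), Field.Store.YES, Field.Index.NO));
            // B and C are searched, so they must be indexed (here: analyzed by StandardAnalyzer).
            doc.Add(new Field("B", "LALA" + i.ToString(), Field.Store.YES, Field.Index.TOKENIZED));
            doc.Add(new Field("C", "DODO" + i.ToString(), Field.Store.YES, Field.Index.TOKENIZED));
            writer.AddDocument(doc);
        }
        writer.Close();

        // Assumption: prefix queries may expand to many term clauses, so raise the
        // limit as the question's second listing does.
        BooleanQuery.SetMaxClauseCount(5000);

        // Query terms are not analyzed, so they are written in lowercase to match
        // the lowercased tokens that StandardAnalyzer put into the index.
        BooleanQuery query = new BooleanQuery();
        query.Add(new PrefixQuery(new Term("B", "lala")), BooleanClause.Occur.MUST);
        query.Add(new PrefixQuery(new Term("C", "dodo1")), BooleanClause.Occur.MUST);

        IndexSearcher searcher = new IndexSearcher("Test");
        Hits hits = searcher.Search(query);
        for (int i = 0; i < hits.Length(); i++)
        {
            Console.WriteLine("{0} - {1} - {2}",
                hits.Doc(i).GetField("A").StringValue(),
                hits.Doc(i).GetField("B").StringValue(),
                hits.Doc(i).GetField("C").StringValue());
        }
        searcher.Close();
    }
}
```

The stored values still print in their original uppercase form, because storing and indexing are independent: only the indexed tokens are lowercased.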

Laurent Etiemble answered Oct 12 '22 09:10