I have managed to create documents and run some fairly complex searches, but I am having trouble grouping the search results.
The search returns books, and that part works fine. Alongside the results, I need a list of authors with the number of matching books per author, based on the same search query.
Example:
Author Name | Count
A | 12
B | 2
I am using Lucene.Net 3.0.3.0, which does not support grouping, but there might be a workaround. I need the same feature for price ranges too.
You can do this by writing a custom Collector. What you describe are facets, and they can be computed by counting the document values yourself. The core part is calling the IndexSearcher.Search overload that accepts a collector. The collector reads the facet field's value for each matching document, usually through a field cache, and does whatever calculation is needed, such as incrementing a counter per value.
Here is a short demonstration that counts matching documents per PostTypeId. DelegatingCollector and SingleFieldCache are helper classes from my demo project Corelicious.Lucene.
var postTypes = new Dictionary<Int32, Int32>();
searcher.Search(query, new DelegatingCollector((reader, doc, scorer) => {
    var score = scorer.Score();
    if (score > 0) {
        var postType = SingleFieldCache.Default.GetInt32(reader, "PostTypeId", doc);
        if (postType.HasValue) {
            if (postTypes.ContainsKey(postType.Value)) {
                postTypes[postType.Value]++;
            } else {
                postTypes[postType.Value] = 1;
            }
        }
    }
}));
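For the author counts in the question, the same idea can probably be written without the Corelicious helpers by subclassing Collector directly. The following is only a sketch, written against the Lucene.Net 3.0.3 API as I understand it (Collector with SetScorer, SetNextReader, Collect and the AcceptsDocsOutOfOrder property, plus FieldCache_Fields.DEFAULT.GetStrings); check the signatures against your version. The class name AuthorCountCollector and the field name "Author" are assumptions, and the field is assumed to be indexed NOT_ANALYZED with a single value per document.

```csharp
using System.Collections.Generic;
using Lucene.Net.Index;
using Lucene.Net.Search;

// Hypothetical collector: counts matching documents per value of one field.
public sealed class AuthorCountCollector : Collector {
    private readonly string field;
    private string[] values; // field-cache values for the current segment reader

    public readonly Dictionary<string, int> Counts = new Dictionary<string, int>();

    public AuthorCountCollector(string field) {
        this.field = field;
    }

    // Scores are not needed for plain counting.
    public override void SetScorer(Scorer scorer) { }

    // Called once per segment; doc ids passed to Collect are relative to this reader.
    public override void SetNextReader(IndexReader reader, int docBase) {
        values = FieldCache_Fields.DEFAULT.GetStrings(reader, field);
    }

    public override void Collect(int doc) {
        var author = values[doc];
        if (author == null) {
            return; // document has no value for the field
        }
        int count;
        Counts.TryGetValue(author, out count); // count stays 0 if the key is missing
        Counts[author] = count + 1;
    }

    // Counting does not depend on the order in which documents arrive.
    public override bool AcceptsDocsOutOfOrder {
        get { return true; }
    }
}
```

Usage would look like `var collector = new AuthorCountCollector("Author"); searcher.Search(query, collector);`, after which collector.Counts holds one entry per author for the same query that produced the book list.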
Full code of the PostTypeId demonstration:
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text.RegularExpressions;
using System.Xml;
using Corelicious.Lucene;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Documents;
using Lucene.Net.Index;
using Lucene.Net.QueryParsers;
using Lucene.Net.Search;
using Lucene.Net.Store;
using Directory = Lucene.Net.Store.Directory;
using Version = Lucene.Net.Util.Version;

namespace ConsoleApplication {
    public static class Program {
        public static void Main(string[] args) {
            Console.WriteLine("Creating directory...");
            var directory = new RAMDirectory();
            var analyzer = new StandardAnalyzer(Version.LUCENE_30);
            CreateIndex(directory, analyzer);

            var userQuery = "calculate pi";
            var queryParser = new QueryParser(Version.LUCENE_30, "Body", analyzer);
            var query = queryParser.Parse(userQuery);
            Console.WriteLine("Query: '{0}'", query);

            var indexReader = IndexReader.Open(directory, readOnly: true);
            var searcher = new IndexSearcher(indexReader);

            // Count matching documents per PostTypeId.
            var postTypes = new Dictionary<Int32, Int32>();
            searcher.Search(query, new DelegatingCollector((reader, doc, scorer) => {
                var score = scorer.Score();
                if (score > 0) {
                    var postType = SingleFieldCache.Default.GetInt32(reader, "PostTypeId", doc);
                    if (postType.HasValue) {
                        if (postTypes.ContainsKey(postType.Value)) {
                            postTypes[postType.Value]++;
                        } else {
                            postTypes[postType.Value] = 1;
                        }
                    }
                }
            }));

            Console.WriteLine("Post type summary");
            Console.WriteLine("Post type | Count");
            foreach (var pair in postTypes.OrderByDescending(x => x.Value)) {
                var postType = (PostType)pair.Key;
                Console.WriteLine("{0,-10} | {1}", postType, pair.Value);
            }
            Console.ReadLine();
        }

        public enum PostType {
            Question = 1,
            Answer = 2,
            Tag = 4
        }

        public static void CreateIndex(Directory directory, Analyzer analyzer) {
            using (var writer = new IndexWriter(directory, analyzer, true, IndexWriter.MaxFieldLength.UNLIMITED))
            using (var xmlStream = File.OpenRead("/Users/sisve/Downloads/Stack Exchange Data Dump - Sept 2011/Content/092011 Mathematics/posts.xml"))
            using (var xmlReader = XmlReader.Create(xmlStream)) {
                while (xmlReader.ReadToFollowing("row")) {
                    var tags = xmlReader.GetAttribute("Tags") ?? String.Empty;
                    var title = xmlReader.GetAttribute("Title") ?? String.Empty;
                    var body = xmlReader.GetAttribute("Body");
                    var doc = new Document();
                    // tags are stored as <tag1><tag2>
                    foreach (Match match in Regex.Matches(tags, "<(.*?)>")) {
                        doc.Add(new Field("Tags", match.Groups[1].Value, Field.Store.NO, Field.Index.NOT_ANALYZED));
                    }
                    doc.Add(new Field("Title", title, Field.Store.NO, Field.Index.ANALYZED));
                    doc.Add(new Field("Body", body, Field.Store.NO, Field.Index.ANALYZED));
                    doc.Add(new Field("PostTypeId", xmlReader.GetAttribute("PostTypeId"), Field.Store.NO, Field.Index.NOT_ANALYZED));
                    writer.AddDocument(doc);
                }
                writer.Optimize();
                writer.Commit();
            }
        }
    }
}
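The price ranges from the question can be handled with the same counting approach: instead of counting the raw value, map each price to a range label first and count the labels. Below is a minimal sketch of just that bucketing step in plain C#. The class PriceRangeFacet, the boundaries 10, 50 and 100, and the label format are all made up for illustration, and prices are assumed to be non-negative. How the price is read for each matching document (for example from a field cache inside the collector) is left out.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: turns prices into range labels and counts them.
public static class PriceRangeFacet {
    // Assumed bucket boundaries (exclusive upper bounds), in ascending order.
    private static readonly decimal[] UpperBounds = { 10m, 50m, 100m };

    // Returns the label of the range containing the price,
    // e.g. "0 - 10" for 5 and "100+" for anything from 100 upwards.
    public static string LabelFor(decimal price) {
        decimal lower = 0m;
        foreach (var upper in UpperBounds) {
            if (price < upper) {
                return lower + " - " + upper;
            }
            lower = upper;
        }
        return lower + "+";
    }

    // Counts how many prices fall into each range.
    public static Dictionary<string, int> Count(IEnumerable<decimal> prices) {
        var counts = new Dictionary<string, int>();
        foreach (var price in prices) {
            var label = LabelFor(price);
            int n;
            counts.TryGetValue(label, out n); // n stays 0 if the label is new
            counts[label] = n + 1;
        }
        return counts;
    }
}
```

Inside a collector you would call PriceRangeFacet.LabelFor on each matching document's price and increment the counter for the returned label, exactly as the post type counter above does for PostTypeId.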