
Lucene.net proximity search

Does anybody have experience with having Lucene.Net index latitude and longitude values and then return a set of results ordered by distance from a single point?

Will the Lucene.Net.Spatial library help me at all with this?

jcon45 asked Oct 05 '22 16:10


1 Answer

A little late to the party, but yes, the Spatial library is the place to start with this. The basic steps are:

1) Add Lat and Long fields to your document

doc.Add(new Field("Latitude", 
                  NumericUtils.DoubleToPrefixCoded(Latitude), 
                  Field.Store.YES, Field.Index.NOT_ANALYZED));

doc.Add(new Field("Longitude", 
                  NumericUtils.DoubleToPrefixCoded(Longitude), 
                  Field.Store.YES, Field.Index.NOT_ANALYZED));

2) Create plotters for each tier of granularity that your search needs to support

IProjector projector = new SinusoidalProjector();
var ctp = new CartesianTierPlotter(0, projector, 
                                   Fields.LocationTierPrefix);
StartTier = ctp.BestFit(MaxKms);
EndTier = ctp.BestFit(MinKms);

Plotters = new Dictionary<int, CartesianTierPlotter>();
for (var tier = StartTier; tier <= EndTier; tier++)
{
    Plotters.Add(tier, new CartesianTierPlotter(tier, 
                                            projector, 
                                            Fields.LocationTierPrefix));
}

3) Use your plotters to index your document for each tier

private static void AddCartesianTiers(double latitude, 
                                      double longitude, 
                                      Document document)
{
    for (var tier = StartTier; tier <= EndTier; tier++)
    {
        var ctp = Plotters[tier];
        var boxId = ctp.GetTierBoxId(latitude, longitude);
        document.Add(new Field(ctp.GetTierFieldName(),
                        NumericUtils.DoubleToPrefixCoded(boxId),
                        Field.Store.YES,
                        Field.Index.NOT_ANALYZED_NO_NORMS));
    }
}
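
To see why each document is indexed once per tier, here is a deliberately simplified sketch of the idea in plain C#. It is not the CartesianTierPlotter algorithm (which goes through the SinusoidalProjector and has its own box ID format); it assumes a flat grid of 2^tier by 2^tier cells over the whole map. GridTiers and CellId are hypothetical names made up for the illustration.

```csharp
using System;

// Simplified illustration only: NOT how Lucene.Net.Spatial computes box IDs.
public static class GridTiers
{
    // Assumed scheme: tier t splits the map into 2^t columns and 2^t rows,
    // and a cell's ID is row * cellsPerSide + column.
    public static long CellId(double latitude, double longitude, int tier)
    {
        if (tier < 0 || tier > 30)
            throw new ArgumentOutOfRangeException(nameof(tier));

        long cellsPerSide = 1L << tier;

        // Normalise both coordinates to the range [0, 1].
        double x = (longitude + 180.0) / 360.0;
        double y = (latitude + 90.0) / 180.0;

        // Clamp so that +180 longitude / +90 latitude fall in the last cell.
        long col = Math.Min(cellsPerSide - 1,
                            Math.Max(0L, (long)Math.Floor(x * cellsPerSide)));
        long row = Math.Min(cellsPerSide - 1,
                            Math.Max(0L, (long)Math.Floor(y * cellsPerSide)));

        return row * cellsPerSide + col;
    }
}
```

Two points that are a few hundred kilometres apart share a cell at a coarse tier and fall into different cells at a fine tier, which is why a search over a large radius can use a coarse tier field and a search over a small radius a fine one.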

With your document indexed, you can move on to building a query. This example uses a ConstantScoreQuery, but you can swap that out for your ranged scoring:

/*  Builder allows us to build a polygon which we will use to limit  
 * search scope on our cartesian tiers, this is like putting a grid 
 * over a map */
var builder = new CartesianPolyFilterBuilder(Fields.LocationTierPrefix);

/*  Bounding area draws the polygon, this can be thought of as working  
 * out which squares of the grid over a map to search */
var boundingArea = builder.GetBoundingArea(Latitude, 
                Longitude, 
                DistanceInKilometres * ProductSearchEngine.KmsToMiles);

/*  We refine, this is the equivalent of drawing a circle on the map,  
 *  within our grid squares, ignoring the parts of the squares we are  
 *  searching that aren't within the circle - ignoring extraneous corners 
 *  and such */
var distanceFilter = new LatLongDistanceFilter(boundingArea, 
                                    DistanceInKilometres * KmsToMiles,
                                    Latitude, 
                                    Longitude, 
                                    ProductSearchEngine.Fields.Latitude,
                                    ProductSearchEngine.Fields.Longitude);

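/*  We add a query stating we will only search against products that have 
 * GeoCode information */
var query = new TermQuery(new Term(Fields.HasGeoCode, 
                                   FieldFlags.HasField));

/*  Combine the clauses; masterQuery was not declared in the original 
 * snippet, so a BooleanQuery is assumed here */
var masterQuery = new BooleanQuery();
masterQuery.Add(query, BooleanClause.Occur.MUST);

/*  Add our filter, this will stream through our results and 
 * determine eligibility */
masterQuery.Add(new ConstantScoreQuery(distanceFilter), 
                BooleanClause.Occur.MUST);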

All of this is taken from a blog post I just wrote whilst looking at a similar problem. You can see it at http://www.leapinggorilla.com/Blog/Read/1005/spatial-search-in-lucenenet

Wolfwyrd answered Oct 10 '22 04:10