Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Azure Search scoring

I have sets of 3 identical (in Text) items in Azure Search varying on Price and Points. Cheaper products with higher points are boosted higher. (Price is boosted more then Points, and is boosted inversely).

However, I keep seeing search results similar to this.

Search is on ‘john milton’.

I get

Product="Id = 2-462109171829-1, Price=116.57, Points=  7, Name=Life of Schamyl / John Milton Mackie, Description=.", Score=32.499783
Product="Id = 2-462109171829-2, Price=116.40, Points=  9, Name=Life of Schamyl / John Milton Mackie, Description=.", Score=32.454872
Product="Id = 2-462109171829-3, Price=115.64, Points=  9, Name=Life of Schamyl / John Milton Mackie, Description=.", Score=32.316270

I expect the scoring order to be something like this, with the lowest price first.

Product="Id = 2-462109171829-3, Price=115.64, Points=  9, Name=Life of Schamyl / John Milton Mackie, Description=.", Score=
Product="Id = 2-462109171829-2, Price=116.40, Points=  9, Name=Life of Schamyl / John Milton Mackie, Description=.", Score=
Product="Id = 2-462109171829-1, Price=116.57, Points=  7, Name=Life of Schamyl / John Milton Mackie, Description=.", Score=

What am I missing or are minor scoring variations acceptable?

The index is defined as

let ProductDataIndex = 

        let fields = 
                    [|
                        new Field (
                            "id", 
                            DataType.String,
                            IsKey           = true, 
                            IsSearchable    = true);


                        new Field (
                            "culture", 
                            DataType.String,
                            IsSearchable    = true);

                        new Field (
                            "gran", 
                            DataType.String,
                            IsSearchable    = true);

                        new Field (
                            "name", 
                            DataType.String,
                            IsSearchable    = true);

                        new Field (
                            "description", 
                            DataType.String, 
                            IsSearchable    = true);

                        new Field (
                            "price", 
                            DataType.Double, 
                            IsSortable      = true,
                            IsFilterable    = true)

                        new Field (
                            "points", 
                            DataType.Int32, 
                            IsSortable      = true,
                            IsFilterable    = true)
                    |]

        let weightsText = 
            new TextWeights(
                Weights =   ([|  
                                ("name",        4.); 
                                ("description", 2.) 
                            |]
                            |> dict))

        let priceBoost = 
            new MagnitudeScoringFunction(
                new MagnitudeScoringParameters(
                    BoostingRangeStart  = 1000.0,
                    BoostingRangeEnd    = 0.0,
                    ShouldBoostBeyondRangeByConstant = true),
                "price",
                10.0)

        let pointsBoost = 
            new MagnitudeScoringFunction(
                new MagnitudeScoringParameters(
                    BoostingRangeStart  = 0.0,
                    BoostingRangeEnd   = 10000000.0,
                    ShouldBoostBeyondRangeByConstant = true),
                "points",
                2.0)

        let scoringProfileMain = 
            new ScoringProfile (
                            "main", 
                            TextWeights =
                                weightsText,
                            Functions = 
                                new List<ScoringFunction>(
                                        [
                                            priceBoost      :> ScoringFunction
                                            pointsBoost     :> ScoringFunction
                                        ]),
                            FunctionAggregation = 
                                ScoringFunctionAggregation.Sum)

        new Index 
            (Name               =   ProductIndexName
            ,Fields             =   fields 
            ,ScoringProfiles    =   new List<ScoringProfile>(
                                        [
                                            scoringProfileMain
                                        ]))
like image 448
hocho Avatar asked Apr 23 '15 05:04

hocho


People also ask

What is scoring in search?

Relevance scoring refers to the computation of a search score that serves as an indicator of an item's relevance in the context of the current query. The higher the score, the more relevant the item. The search score is computed based on statistical properties of the string input and the query itself.

What is Azure Cognitive Search index?

In Azure Cognitive Search, a search index is your searchable content, available to the search engine for indexing, full text search, and filtered queries. An index is defined by a schema and saved to the search service, with data import following as a second step.

How do I do Azure Cognitive Search?

Create a search service in the Azure portal. Start with Import data wizard. Choose a built-in sample or a supported data source to create, load, and query an index in minutes. Finish with Search Explorer, using a portal client to query the search index you just created.

What is skillset in Azure search?

A skillset is a reusable resource in Azure Cognitive Search that's attached to an indexer. It contains one or more skills that call built-in AI or external custom processing over documents retrieved from an external data source.


1 Answers

All indexes in Azure Search are split into multiple shards allowing us for quick scale up and scale downs. When a search request is issued, it’s issued against each of the shards independently. The result sets from each of the shards are then merged and ordered by score (if no other ordering is defined). It is important to know that the scoring function weights query term frequency in each document against its frequency in all documents, in the shard!

It means that in your scenario, in which you have three instances of every document, even with scoring profiles disabled, if one of those documents lands on a different shard than the other two, its score will be slightly different. The more data in your index, the smaller the differences will be (more even term distribution). It’s not possible to assume on which shard any given document will be placed.

In general, document score is not the best attribute for ordering documents. It should only give you general sense of document relevance against other documents in the results set. In your scenario, it would be possible to order the results by price and/or points if you marked price and/or points fields as sortable. You can find more information how to use $orderby query parameter here: https://msdn.microsoft.com/en-us/library/azure/dn798927.aspx

like image 183
Yahnoosh Avatar answered Oct 28 '22 12:10

Yahnoosh