Azure Search Does Not Remove Data After Indexer Run

I have a file in blob storage at folder/new/data1.json.

data1.json contains a JSON array:

[   
    {
        "name": "na",
        "data": {
            "1":"something1",
            "2":"something2"

        }
    },
    {
        "name": "ha",
        "data": {
            "1":"something1",
            "2":"something2"
        }
    }
]

My data source body:

{
    "name" : "datasource",
    "type" : "azureblob",
    "credentials" : { "connectionString" : "MyStorageConnStrning" },
    "container" : { "name" : "mycontaner", "query" : "folder/new" }
}   

My index body:

{
    "name" : "index",
    "fields": [
       { "name": "id", "type": "Edm.String", "key": true, "searchable": false },
       { "name": "name", "type": "Edm.String", "searchable": true, "filterable": false, "sortable": true, "facetable": true},
       { "name": "data", "type": "Edm.String", "searchable": false}
    ]
}

My indexer body:

{
    "name" : "indexer",
    "dataSourceName" : "datasource",
    "targetIndexName" : "index",
    "parameters" : { "configuration" : { "parsingMode" : "jsonArray" } }
}

Once these are created, I can search for na and ha and get results.

But if I delete folder/new/data1.json from blob storage, run the indexer again, and then search for na and ha, I still get results.

I found that if I delete the indexer and recreate it, na and ha disappear from the search results.

Is there any way to remove the previous data without deleting the indexer?

Nafis Islam asked Dec 06 '18 14:12


1 Answer

Deleting documents with an indexer is a bit tricky, especially when a blob contains multiple documents. If you delete the blob directly, the indexer no longer sees it, so it never tries to delete anything from the index.

To make the indexer delete documents, you need to add a soft delete deletion detection policy to the data source, for example:

{
  "@odata.type": "#Microsoft.Azure.Search.SoftDeleteColumnDeletionDetectionPolicy",
  "softDeleteColumnName": "IsDeleted",
  "softDeleteMarkerValue": "true"
}
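
As a rough sketch of where this policy lives: it is set on the data source definition under the dataDeletionDetectionPolicy property. The example below reuses the data source body from the question unchanged (including its placeholder connection string and container name) and only adds the policy; treat it as an illustration rather than a verified configuration for your service.

```json
{
    "name" : "datasource",
    "type" : "azureblob",
    "credentials" : { "connectionString" : "MyStorageConnStrning" },
    "container" : { "name" : "mycontaner", "query" : "folder/new" },
    "dataDeletionDetectionPolicy" : {
        "@odata.type" : "#Microsoft.Azure.Search.SoftDeleteColumnDeletionDetectionPolicy",
        "softDeleteColumnName" : "IsDeleted",
        "softDeleteMarkerValue" : "true"
    }
}
```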

When you want to delete a document, add "IsDeleted": true to its JSON object. Only after all documents in a blob have been soft deleted, and the indexer has picked up those deletes, can you do a hard delete and remove the blob.
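
For illustration, here is what data1.json from the question might look like with only the na document soft deleted. Both elements stay in the array in their original positions; the only change is the added IsDeleted property. This assumes the indexer treats the JSON boolean true as matching the "true" marker value in the policy.

```json
[
    {
        "name": "na",
        "data": {
            "1": "something1",
            "2": "something2"
        },
        "IsDeleted": true
    },
    {
        "name": "ha",
        "data": {
            "1": "something1",
            "2": "something2"
        }
    }
]
```

After the next indexer run removes na from the index, ha remains searchable because its element is untouched.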

One subtlety here is that you must not add, remove, or rearrange elements of the array, because you're using the default document id, which depends on the blob path and the array index. If you use the name field as the key instead, you'll have the flexibility to do partial hard deletes inside the blob.
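
A minimal sketch of an index that uses name as the key, adapted from the index body in the question, could look like the following. It assumes that name values are unique across all blobs, that they contain only characters allowed in document keys (letters, digits, underscores, dashes, and equal signs), and that the indexer's default name-based field mapping populates the key from each array element. Because the key field of an existing index cannot be changed, this means creating a new index or deleting and recreating the current one.

```json
{
    "name" : "index",
    "fields": [
       { "name": "name", "type": "Edm.String", "key": true, "searchable": true, "filterable": false, "sortable": true, "facetable": true },
       { "name": "data", "type": "Edm.String", "searchable": false }
    ]
}
```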

8163264128 answered Nov 15 '22 09:11