I have a file in blob storage, folder/new/data1.json. data1 contains a JSON array:
[
  {
    "name": "na",
    "data": {
      "1": "something1",
      "2": "something2"
    }
  },
  {
    "name": "ha",
    "data": {
      "1": "something1",
      "2": "something2"
    }
  }
]
My data source body:
{
  "name" : "datasource",
  "type" : "azureblob",
  "credentials" : { "connectionString" : "MyStorageConnStrning" },
  "container" : { "name" : "mycontaner", "query" : "folder/new" }
}
My index body:
{
  "name" : "index",
  "fields": [
    { "name": "id", "type": "Edm.String", "key": true, "searchable": false },
    { "name": "name", "type": "Edm.String", "searchable": true, "filterable": false, "sortable": true, "facetable": true },
    { "name": "data", "type": "Edm.String", "searchable": false }
  ]
}
My indexer body:
{
  "name" : "indexer",
  "dataSourceName" : "datasource",
  "targetIndexName" : "index",
  "parameters" : { "configuration" : { "parsingMode" : "jsonArray" } }
}
Once everything is created, I can search for na and ha and get results.
But if I delete folder/new/data1.json from blob storage, run the indexer, and search for na and ha again, I still get results.
I found that if I delete the indexer and recreate it, na and ha go away from search.
Is there any way to remove the previous data without deleting the indexer?
An indexer in Azure Cognitive Search is a crawler that extracts searchable content from cloud data sources and populates a search index using field-to-field mappings between source data and a search index.
Soft delete strategy using custom metadata: this method uses custom metadata to indicate whether a search document should be removed from the index.
Deleting documents using indexers is a bit tricky, especially when your blob contains multiple documents. If you delete the blob directly, the indexer no longer sees the blob and won't try to delete anything from the index.
To make the indexer delete documents you need to use a soft delete deletion detection policy, for example:
{
  "@odata.type": "#Microsoft.Azure.Search.SoftDeleteColumnDeletionDetectionPolicy",
  "softDeleteColumnName": "IsDeleted",
  "softDeleteMarkerValue": "true"
}
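As far as I know, this policy is set on the data source definition (not on the indexer), under the dataDeletionDetectionPolicy property. A minimal sketch, reusing the data source body from the question and assuming the IsDeleted property name from the policy above:

```json
{
  "name": "datasource",
  "type": "azureblob",
  "credentials": { "connectionString": "MyStorageConnStrning" },
  "container": { "name": "mycontaner", "query": "folder/new" },
  "dataDeletionDetectionPolicy": {
    "@odata.type": "#Microsoft.Azure.Search.SoftDeleteColumnDeletionDetectionPolicy",
    "softDeleteColumnName": "IsDeleted",
    "softDeleteMarkerValue": "true"
  }
}
```

The data source would need to be updated with a body like this before the next indexer run, so that the indexer knows which property to check.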
When you want to delete a document, add "IsDeleted": true to the JSON object. Only after all documents in a blob have been soft deleted, and the deletes have been picked up by the indexer, can you do a hard delete and remove the blob.
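For illustration, this is roughly how data1.json from the question might look with the first element marked for deletion. It is only a sketch: the IsDeleted property name is an assumption that has to match softDeleteColumnName in the policy, and the marked element stays at its original position in the array.

```json
[
  {
    "name": "na",
    "IsDeleted": true,
    "data": {
      "1": "something1",
      "2": "something2"
    }
  },
  {
    "name": "ha",
    "data": {
      "1": "something1",
      "2": "something2"
    }
  }
]
```

After re-uploading the blob and running the indexer, a search for na should stop returning results while ha remains searchable.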
One subtlety here is that you must not add, remove, or rearrange elements of the array, because you're using the default document id, which depends on the blob path and array index. If you use the name field as the key, you'll have the flexibility to do partial hard deletes inside the blob.