RavenDB DeleteByIndex

Tags: c#, ravendb

I am working on a web crawler, and the number of results saved to Raven varies with the size of the website. I'm trying to delete the results for a specific site, but there are enough of them that I run into the limit of 30 requests per session. I don't want to raise the limit to 1,000; I want to batch delete instead.

The code I have written, which I think should work, is:

    public void DeleteCrawledLinks(string baseUrl)
    {
        DocumentStore().DatabaseCommands.DeleteByIndex(
            "Auto/UrlContainers/ByBaseUrlAndUrl",
            new IndexQuery
            {
                Query = "BaseUrl:" + baseUrl // where BaseUrl contains baseUrl
            }, allowStale: false);
    }

For this example, the document in Raven has "BaseUrl": "http://localhost:2125/", and the baseUrl argument has the same value. When I run the delete function, I get this error message:

Url: "/bulk_docs/Auto/UrlContainers/ByBaseUrlAndUrl?query=BaseUrl%253Ahttp%253A%252F%252Flocalhost%253A2125%252F&start=0&pageSize=128&aggregation=None&allowStale=False"

System.ArgumentException: The field 'http' is not indexed, cannot query on fields that are not indexed

Is it because of the : in my query? Is there a way around this, or another way to do the delete? I don't want to extend the limit, because the sites I crawl could return more than 1,000 results.

Lewis, asked Feb 14 '26 10:02


1 Answer

When constructing the query yourself, escape search terms as follows:

    Query = "BaseUrl:" + RavenQuery.Escape(baseUrl)
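
The error message follows from how the Lucene query parser reads a colon: the text before an unescaped : is taken as a field name, so in BaseUrl:http://localhost:2125/ the parser sees a second field called http. The sketch below is only an illustration of what escaping does to the term. LuceneEscapeSketch and EscapeTerm are hypothetical names, and the list of special characters is an assumption based on classic Lucene query syntax, not the actual implementation of RavenQuery.Escape.

```csharp
using System;
using System.Text;

public static class LuceneEscapeSketch
{
    // Assumed set of characters that the Lucene query parser treats as syntax.
    // The real list used by RavenQuery.Escape may differ.
    private const string SpecialChars = "+-&|!(){}[]^\"~*?:\\";

    // Hypothetical helper: puts a backslash in front of every special character.
    public static string EscapeTerm(string term)
    {
        var sb = new StringBuilder(term.Length * 2);
        foreach (char c in term)
        {
            if (SpecialChars.IndexOf(c) >= 0)
            {
                sb.Append('\\');
            }
            sb.Append(c);
        }
        return sb.ToString();
    }

    public static void Main()
    {
        string baseUrl = "http://localhost:2125/";

        // Unescaped: the parser would read "http" as a field name.
        Console.WriteLine("BaseUrl:" + baseUrl);

        // Escaped: the colons inside the URL are now part of the term.
        Console.WriteLine("BaseUrl:" + EscapeTerm(baseUrl));
        // prints: BaseUrl:http\://localhost\:2125/
    }
}
```

In real code, prefer the library's own RavenQuery.Escape as shown above, since it knows which characters the server-side parser actually treats as special.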
Matt Johnson-Pint, answered Feb 17 '26 00:02