In Lucene, I can do the following:
doc.GetField("mycustomfield").StringValue();
This retrieves the value of a field (a "column") from a document in the index.
My question: for the same 'doc', is there a way to get the document ID? Luke displays it, so there must be a way to figure this out. I need it to delete documents on updates.
I scoured the docs but have not found the field name to pass to GetField, or whether there is already another method for this.
Turns out you have to do this:
var hits = searcher.Search(query);
var result = hits.Id(0);
As opposed to:
var results = hits.Doc(i);
var docid = results.<...> //there's nothing I could find there to do this
I suspect the reason you're having trouble finding any documentation on determining the id of a particular Lucene Document is because they are not truly "id"s. In other words, they are not necessarily meant to be looked up and stored for later use. In fact, if you do, you will not get the results you were hoping for, as the IDs will change when the index is optimized.
Instead, think of the IDs as the current "offset" of a particular document from the start of the index, which will change when deleted documents are physically removed from the index files.
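To make the "offset" idea concrete, here is a small toy model in plain C#. It is not the Lucene API: ToyIndex and all of its methods are made up for illustration, and it only assumes that deleting marks a slot and that a later compaction (standing in for an optimize) physically removes the marked slots.

```csharp
using System;
using System.Collections.Generic;

// Toy stand-in for an index; NOT the Lucene API. All names are hypothetical.
class ToyIndex
{
    // Each slot holds a document's own key value, or null once it is deleted.
    private readonly List<string> slots = new List<string>();

    // Appends a document and returns its current offset.
    public int Add(string key)
    {
        slots.Add(key);
        return slots.Count - 1;
    }

    // Deleting only marks the slot; other documents keep their offsets for now.
    public void Delete(int docId)
    {
        slots[docId] = null;
    }

    // Physically drops the marked slots, which shifts every later document down.
    public void Compact()
    {
        slots.RemoveAll(s => s == null);
    }

    // Current offset of the document with the given key, or -1 if absent.
    public int IdOf(string key)
    {
        return slots.IndexOf(key);
    }
}

static class Demo
{
    static void Main()
    {
        var index = new ToyIndex();
        index.Add("a");
        index.Add("b");
        index.Add("c");

        int before = index.IdOf("c");
        index.Delete(index.IdOf("a"));
        int afterDelete = index.IdOf("c");
        index.Compact();
        int afterCompact = index.IdOf("c");

        // prints "2 2 1": the offset of "c" changes only once the deleted slot is removed
        Console.WriteLine(before + " " + afterDelete + " " + afterCompact);
    }
}
```

In this sketch, an id saved before the compaction would point at the wrong document afterwards, while looking the document up by a value you stored yourself still finds it.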
Now with that said, the proper way to look up the "id" of a document is:
QueryParser parser = new QueryParser(...);
IndexSearcher searcher = new IndexSearcher(...);
Hits hits = searcher.Search(parser.Parse(...));
for (int i = 0; i < hits.Length(); i++)
{
    // Document number of the i-th hit; only meaningful for the index as it is right now
    int id = hits.Id(i);
    // do stuff
}