I am using Solr's TermsComponent to implement an autocomplete feature. My documents contain tags, which I have indexed in a "tags" field. I can use TermsComponent to find out which tags are used across all stored documents, and this works well so far.
However, there is an additional requirement: every document has an owner field containing the ID of the user who owns it. The autocomplete list should only contain tags from documents owned by the user requesting the autocomplete.
I have tried setting the query parameter, but the TermsComponent seems to ignore it:
    public static List<String> findUniqueTags(String beginningWith, User owner) throws IOException {
        SolrParams q = new SolrQuery().setQueryType("/terms")
                .setQuery("owner:" + owner.id.toString())
                .set(TermsParams.TERMS, true).set(TermsParams.TERMS_FIELD, "tags")
                .set(TermsParams.TERMS_LOWER, beginningWith)
                .set(TermsParams.TERMS_LOWER_INCLUSIVE, false)
                .set(TermsParams.TERMS_PREFIX_STR, beginningWith);
        QueryResponse queryResponse;
        try {
            queryResponse = getSolrServer().query(q);
        } catch (SolrServerException e) {
            Logger.error(e, "Error when querying server.");
            throw new IOException(e);
        }
        NamedList tags = (NamedList) ((NamedList) queryResponse.getResponse().get("terms")).get("tags");
        List<String> result = new ArrayList<String>();
        for (Iterator iterator = tags.iterator(); iterator.hasNext();) {
            Map.Entry tag = (Map.Entry) iterator.next();
            result.add(tag.getKey().toString());
        }
        return result;
    }
So is there a way of limiting the tags returned by TermsComponent, or do I manually have to query all the tags of the user and filter them myself?
The Terms Component provides access to the indexed terms in a field and the number of documents that match each term. This can be useful for building an auto-suggest feature or any other feature that operates at the term level instead of the search or document level.
According to this and that post on the Solr mailing list, filtering on the terms component is not possible because it operates on raw index data.
Apparently, the Solr developers are working on a dedicated autosuggest component that supports this kind of filtering.
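To picture why a query cannot narrow the terms component's output, here is a deliberately simplified in-memory model. It is not how Lucene stores its index; the `Doc` class, the sample data, and both method names are hypothetical and exist only for illustration. The point is that a term-to-document-count map carries no owner information, so any owner restriction has to start from the documents.

```java
import java.util.Arrays;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;
import java.util.TreeSet;

public class TermsVsDocuments {

    // Hypothetical stand-in for an indexed document: an owner ID plus its tags.
    static final class Doc {
        final String owner;
        final List<String> tags;

        Doc(String owner, List<String> tags) {
            this.owner = owner;
            this.tags = tags;
        }
    }

    // Term-level view: each term mapped to the number of documents containing it.
    // Nothing in this map records which owner a count came from.
    static SortedMap<String, Integer> termDictionary(List<Doc> docs) {
        SortedMap<String, Integer> counts = new TreeMap<>();
        for (Doc doc : docs) {
            for (String tag : new TreeSet<>(doc.tags)) { // count each document once per term
                counts.merge(tag, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Document-level view: select the owner's documents first, then count their tags.
    static SortedMap<String, Integer> tagsForOwner(List<Doc> docs, String owner, String prefix) {
        SortedMap<String, Integer> counts = new TreeMap<>();
        for (Doc doc : docs) {
            if (!doc.owner.equals(owner)) {
                continue;
            }
            for (String tag : new TreeSet<>(doc.tags)) {
                if (tag.startsWith(prefix)) {
                    counts.merge(tag, 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    // Made-up sample data: two owners who share one tag.
    static List<Doc> sampleDocs() {
        return Arrays.asList(
                new Doc("alice", Arrays.asList("solr", "search")),
                new Doc("bob", Arrays.asList("solr", "spring")));
    }

    public static void main(String[] args) {
        List<Doc> docs = sampleDocs();
        System.out.println(termDictionary(docs));             // {search=1, solr=2, spring=1}
        System.out.println(tagsForOwner(docs, "alice", "s")); // {search=1, solr=1}
    }
}
```

In this model, "spring" appears in the term-level map even though it belongs only to bob, which mirrors the behavior described in the question.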
Depending on your requirements you might be able to use the faceting component for autocomplete instead of the terms component. It fully supports filter queries for reducing the set of eligible tags to a subset of the documents in the index.
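As a rough sketch of the faceting approach, the following builds the request parameters for a prefix-restricted facet on "tags", filtered by owner. It uses only the JDK so it can be run without SolrJ. The parameter names (`fq`, `facet.field`, `facet.prefix`, `facet.mincount`, `facet.limit`) are standard Solr request parameters; the class name, the `/select` handler path, and the sample values are assumptions made for illustration.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

public class TagFacetRequest {

    // Builds the parameters for an owner-filtered tag autocomplete request.
    static Map<String, String> buildParams(String prefix, String ownerId, int limit) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", "*:*");                 // match everything; the filter does the narrowing
        params.put("fq", "owner:" + ownerId);   // only documents owned by this user
        params.put("rows", "0");                // no documents needed, only facet counts
        params.put("facet", "true");
        params.put("facet.field", "tags");
        params.put("facet.prefix", prefix);     // only tags starting with the typed text
        params.put("facet.mincount", "1");      // drop tags the owner never used
        params.put("facet.limit", String.valueOf(limit));
        return params;
    }

    // Serializes the parameters as a URL-encoded query string.
    static String toQueryString(Map<String, String> params) {
        return params.entrySet().stream()
                .map(e -> encode(e.getKey()) + "=" + encode(e.getValue()))
                .collect(Collectors.joining("&"));
    }

    private static String encode(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8); // requires Java 10+
    }

    public static void main(String[] args) {
        // "/select" is an assumed handler path; adjust to your configuration.
        System.out.println("/select?" + toQueryString(buildParams("so", "42", 10)));
    }
}
```

With SolrJ, the same request can probably be expressed through `SolrQuery` methods such as `addFilterQuery`, `addFacetField`, `setFacetPrefix`, and `setFacetMinCount`, though this is worth checking against your SolrJ version. Two differences from the code in the question: `facet.prefix` also returns a tag that exactly equals the typed text, and it is matched against the indexed terms, so any lowercasing done at index time has to be applied to the prefix as well. An owner ID containing query syntax characters would need escaping before being placed in `fq`.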