I am using solr and solrj for index and search functionality in a web app I am creating. My request handler is configured in solrconfig.xml as follows:
<requestHandler name="/select" class="solr.SearchHandler">
<lst name="defaults">
<str name="echoParams">explicit</str>
<str name="start">0</str>
<int name="rows">10</int>
<str name="defType">edismax</str>
<str name="qf">
title^10.0 subtitle^7.0 abstract^5.0 content^1.0 text^1.0
</str>
<str name="pf">
title^10.0 subtitle^7.0 abstract^5.0 content^1.0 text^1.0
</str>
<str name="df">text</str>
</lst>
</requestHandler>
As it stands, the indexing and searching works well. However, I want to implement pagination. The config file contains "start" and "row" data. However, in solrj, when I run:
SolrQuery query = new SolrQuery(searchTerm);
System.out.println(query.getRequestHandler());
System.out.println(query.getRows());
System.out.println(query.getStart());
The three print statements each show null. I understand each of those 'gets' has a correspond 'set', but I would have imagined that they would be already set via the response handler in the solrconfig.xml. Can someone clue me in?
Cursor-based pagination works by returning a pointer to a specific item in the dataset. On subsequent requests, the server returns results after the given pointer.
Standard solr queries use the "q" parameter in a request. Filter queries use the "fq" parameter. The primary difference is that filtered queries do not affect relevance scores; the query functions purely as a filter (docset intersection, essentially).
Trying a basic queryThe main query for a solr search is specified via the q parameter. Standard Solr query syntax is the default (registered as the “lucene” query parser). If this is new to you, please check out the Solr Tutorial. Adding debug=query to your request will allow you to see how Solr is parsing your query.
Before executing the query on the server, the client would not know about what you have set on the server side, right? So it is not a surprise that they are all null.
To implement pagination you need two parameters from the client side - the page number and the number of items per page. Once you got these two, you can construct your SolrQuery on the client side as follows:
SolrQuery query = new SolrQuery(searchTerm);
query.setStart((pageNum - 1) * numItemsPerPage);
query.setRows(numItemsPerPage);
// execute the query on the server and get results
QueryResponse res = solrServer.query(solrQuery);
As @arun stated in his answer, "the client would not know about what you have set on the server side". So don't be surprise that they are empty. On the other hand I would warn you about pagination problems that can arise in some situations.
Pagination is a simple thing when you have few documents to read and all you have to do is play with start
and rows
parameters.
So for a client that wants 50 results per page, page #1 is requested using start=0&rows=50. Page #2 is start=50&rows=50, page #3 is start=100&rows=50, etc…. But in order for Solr to know which 50 docs to return starting at an arbitrary point N, it needs to build up an internal queue of the first N+50 sorted documents matching the query, so that it can then throw away the first N docs, and return the remaining 50. This means the amount of memory needed to return paginated results grows linearly with the increase in the start param.
So in case you have many documents, I mean hundreds of thousands or even millions this is not a feasible way.
This is the kind of thing that could bring your solr server to their knees.
For typical applications displaying search results to a human user, this tends to not be much of an issue since most users don’t care about drilling down past the first handful of pages of search results — but for automated systems that want to crunch data about all of the documents matching a query, it can be seriously prohibitive.
This means that if you have a website and are paging search results, a real user do not go so further but consider on the other hand what can happen if a spider or a scraper try to read all the website pages. Now we are talking of Deep Paging.
I’ll suggest to read this amazing post:
https://lucidworks.com/blog/2013/12/12/coming-soon-to-solr-efficient-cursor-based-iteration-of-large-result-sets/
And take a look at this document page:
https://cwiki.apache.org/confluence/display/solr/Pagination+of+Results
And here is an example that try to explain how to paginate using the cursors.
SolrQuery solrQuery = new SolrQuery();
solrQuery.setRows(500);
solrQuery.setQuery("*:*");
solrQuery.addSort("id", ORDER.asc); // Pay attention to this line
String cursorMark = CursorMarkParams.CURSOR_MARK_START;
boolean done = false;
while (!done) {
solrQuery.set(CursorMarkParams.CURSOR_MARK_PARAM, cursorMark);
QueryResponse rsp = solrClient.query(solrQuery);
String nextCursorMark = rsp.getNextCursorMark();
for (SolrDocument d : rsp.getResults()) {
...
}
if (cursorMark.equals(nextCursorMark)) {
done = true;
}
cursorMark = nextCursorMark;
}
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With