Partial Results
Partial results occur when Solr cannot complete a query within the maximum allowed query time. Time limits on queries are enforced both to keep Solr from hanging your app and to prevent any one tenant on the server from tying up system resources at the expense of other users. Partial results manifest themselves in several ways:
- Returning 0 documents for a term that is known to be present
- Returning an inconsistent number of results for a given term
- Alternating between returning 0 results and all (or most) results
Detecting Partial Results
You can tell when you are getting partial results because there will be a partialResults flag set to true in the response header. In JSON, this looks like:
{"responseHeader": {"status":0, "partialResults":true, "QTime":200, "params":{"wt":"json"} }, "response":{"numFound":7793,"start":0,"docs":[ ... ]}}
XML responses will also have a partialResults tag present.
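A minimal sketch of that check in Python, assuming the JSON response has already been parsed into a dictionary. The function name has_partial_results is hypothetical and not part of any Solr client library.

```python
import json


def has_partial_results(response: dict) -> bool:
    """Return True if the parsed Solr response is flagged as partial."""
    header = response.get("responseHeader", {})
    # A missing flag is treated as a complete result.
    return header.get("partialResults") is True


# Sample response modeled on the JSON shown above (docs left empty for brevity).
raw = (
    '{"responseHeader": {"status": 0, "partialResults": true, "QTime": 200},'
    ' "response": {"numFound": 7793, "start": 0, "docs": []}}'
)
print(has_partial_results(json.loads(raw)))
```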
Why does this happen?
There are generally two types of situations where partial results can arise:
- You perform an expensive query on a large index. If you ask for a term that occurs 100,000 times in your index (or make use of the wildcard operator *, function queries, etc.), we may only be able to evaluate the first 20,000 matching documents before the maximum allowed query time has elapsed.
- Insufficient system resources. In a cloud environment, the IO of the underlying instances can fluctuate. If we do not have enough available IO to complete your request in a timely manner, we will search as much as we can in a short amount of time, and return what we have found.
Fixing Partial Results
There are a few strategies for dealing with partial results. The quickest and easiest method is to have the application check the results for the partialResults flag, and retry the search if it’s detected. This way the retry can benefit from any query filters and document results that Solr cached during the first attempt.
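The check-and-retry approach might look something like the following Python sketch. It assumes the application supplies its own function that runs the search and returns the parsed JSON response. The names search_with_retry, run_query, max_attempts and delay_seconds are all hypothetical, and the attempt limit and delay are arbitrary starting points rather than recommended values.

```python
import time
from typing import Callable


def search_with_retry(
    run_query: Callable[[], dict],
    max_attempts: int = 3,
    delay_seconds: float = 0.5,
) -> dict:
    """Re-run a search while the response is flagged as partial.

    run_query is a caller-supplied function (an assumption of this sketch)
    that executes the Solr request and returns the parsed JSON response.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    response: dict = {}
    for attempt in range(1, max_attempts + 1):
        response = run_query()
        header = response.get("responseHeader", {})
        if header.get("partialResults") is not True:
            return response  # complete result, stop retrying
        if attempt < max_attempts:
            time.sleep(delay_seconds)  # brief pause before the next attempt

    # Still partial after the final attempt; the caller decides how to handle it.
    return response
```

Capping the number of attempts keeps a persistently slow query from looping forever; if the last response is still partial, the application can show what it has or report an error.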
You could also look at optimizing your queries. Typically, a Solr client will automatically construct a query based on parameters it receives from the app, but this is not always done in an intelligent way. For example, Sunspot has been known to create queries like this:
q=*:*&fq=year_added:[2012+TO+*]&fq=year_added:[*+TO+2014]...
Here, the heavy use of the wildcard operator is completely unnecessary and inefficient. Wildcards are the least efficient means of pattern matching in Solr, and consume a surprisingly large amount of CPU. Generally, we advise users to avoid them where possible. Continuing from the previous example, something like this would be far more effective:
q=*:*&fq=year_added:[2012+TO+2014]...
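If your client lets you build filter parameters yourself, the combined range can be constructed directly. The following is a small Python sketch, not how Sunspot or any other client works internally; the helper name closed_range_filter is hypothetical.

```python
from urllib.parse import urlencode


def closed_range_filter(field: str, lower: int, upper: int) -> str:
    """Build one filter query with both bounds set, e.g. field:[2012 TO 2014]."""
    if lower > upper:
        raise ValueError("lower bound must not exceed upper bound")
    return f"{field}:[{lower} TO {upper}]"


# One fq with a closed range instead of two open-ended ones.
params = [
    ("q", "*:*"),
    ("fq", closed_range_filter("year_added", 2012, 2014)),
]
print(urlencode(params))
```

Note that urlencode percent-encodes the colon, brackets and asterisks, so the printed string looks different from the readable form above but decodes to the same query.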
Our plans’ document recommendations are based on averages. Depending on the data being indexed and your average document size, you may simply need more memory than those averages assume. Please contact us for an analysis of the size of your index so that we can recommend a more appropriate service level for your needs.