Frequently Asked Questions
What is Solr?
Solr is a high-performance search server built on the Apache Lucene information retrieval library. It can search millions of pages of data in a few milliseconds and return highly relevant results. It is trusted by Fortune 500 companies, startups, and high-traffic brands to deliver fast, accurate results.
What is a Solr index?
A Solr index is the data structure Solr uses to store and retrieve documents. It is roughly analogous to a database table, although Solr itself should not be treated as a database.
Why is it not a database?
It gets a bit technical, but essentially Solr stores data in a structure known as an inverted index, which works much like the index at the back of a textbook: each term points to the places where it appears. This means Solr doesn’t retain the source data exactly as a traditional database would, but rather a version of it that has been translated to support fast lookups. Put simply, your data is broken down, analyzed, and stored in a special way that makes extremely fast, accurate searches possible. The tradeoff is that Solr doesn’t offer ACID transactions, which is why it should only be used as a secondary data store.
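To make the idea of an inverted index concrete, here is a minimal sketch in Python. It is a toy illustration only, not how Solr or Lucene is implemented: the function names and sample documents are made up for this example, and real analysis (stemming, stop words, scoring) is left out.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercased term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        # "Analysis" here is just lowercasing and splitting on whitespace.
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents that contain every term in the query."""
    terms = query.lower().split()
    if not terms:
        return set()
    postings = [index.get(term, set()) for term in terms]
    return set.intersection(*postings)


# Hypothetical sample data for illustration.
docs = {
    1: "Solr is fast",
    2: "Lucene powers Solr",
    3: "Databases store rows",
}
index = build_inverted_index(docs)
matches = search(index, "lucene solr")
```

Notice that the index holds terms and document ids rather than the original text, which is why the source data should live in a primary data store.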
How long will it take Solr to crawl my site?
Solr is a passive data store, meaning it can only index data that has been explicitly sent to it. It is not a web crawler. However, another Apache project, Nutch, can be used for this purpose, and Nutch can pipe data to Solr (and Websolr) for indexing and searching.
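As a rough sketch of what "explicitly sent" looks like, the Python below posts JSON documents to Solr's update handler using only the standard library. The base URL, core name, and document fields are placeholders assumed for this example; substitute the URL of your own index.

```python
import json
import urllib.request


def build_update_request(base_url, docs, commit=True):
    """Build a POST request that sends a list of documents to the update handler."""
    url = base_url.rstrip("/") + "/update"
    if commit:
        # Ask Solr to commit so the documents become searchable right away.
        url += "?commit=true"
    body = json.dumps(docs).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_documents(base_url, docs):
    """Send the documents and return the HTTP status code (needs a reachable Solr)."""
    request = build_update_request(base_url, docs)
    with urllib.request.urlopen(request) as response:
        return response.status


# Hypothetical URL and fields, for illustration only.
example_request = build_update_request(
    "http://localhost:8983/solr/mycore",
    [{"id": "1", "title": "Hello Solr"}],
)
```

Calling send_documents with the same arguments would perform the actual request. Nothing is indexed until your application (or a crawler such as Nutch) makes a call like this.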
Can I customize the maxBooleanClauses limit?
The maxBooleanClauses setting is a contentious one in the Solr community. Queries with a large number of Boolean clauses can seriously degrade performance, and developers disagree on whether Solr should be allowed to degrade or should enforce a hard limit as a safeguard. The setting also behaves unexpectedly: a change to maxBooleanClauses for one core takes effect globally, but only after a hard restart, at which point it applies to all cores.
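One way to stay under the limit without changing the setting is to split a long list of values across several smaller queries on the client side. The sketch below assumes a limit of 1024 clauses and values that need no query escaping; the function name and field are hypothetical, and this is only one possible workaround.

```python
def chunked_or_queries(field, values, max_clauses=1024):
    """Split values into OR queries with at most max_clauses terms each."""
    if max_clauses < 1:
        raise ValueError("max_clauses must be at least 1")
    queries = []
    for start in range(0, len(values), max_clauses):
        chunk = values[start:start + max_clauses]
        # Assumes values contain no characters that need escaping in a query.
        queries.append(f"{field}:(" + " OR ".join(chunk) + ")")
    return queries


# Small limit used here only to keep the example readable.
example_queries = chunked_or_queries("id", ["a", "b", "c"], max_clauses=2)
```

Each resulting query can be sent separately and the results merged in your application.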
I want to know something else!
Take a look at our table of contents. We have plenty of documentation covering everything you would want to know. Can’t find what you’re looking for? Shoot us an email, and we’ll be glad to help!