Solr vs. Elasticsearch: Choosing your open source search engine
October 6, 2019
Solr vs. Elasticsearch has been discussed frequently in our client projects and within the enterprise search community. But as traditional enterprise search has evolved into what Gartner calls "Insight Engines," we revisited this topic to provide the latest observations incorporating Cloud, Analytics, and Cognitive Search capabilities to help you evaluate Solr and Elasticsearch.
What is Solr?
Solr is a leading open source search engine from the Apache Software Foundation’s Lucene project. Thanks to its flexibility, scalability, and cost-effectiveness, Solr is widely used by large and small enterprises.
What is Elasticsearch?
Elasticsearch, also based on Lucene, is another leading open source search engine supporting powerful enterprise applications. Elastic - the company behind Elasticsearch and the Elastic Stack - provides enterprise solutions for search, log analytics, and other advanced analytics use cases.
Choosing your open source search engine
When we help clients assess open source search engines for their enterprise solutions, one question comes up often: “Which is better, Solr or Elasticsearch?” While there may be a preconceived notion that one is inherently better than the other, the question is more useful when framed as “Which is better for me?”
There are various search engine technologies available, but the most popular open source variants are those that rely on the underlying core functionality of Apache Lucene, which is, in essence, the piece that makes the search engine work. Solr and Elasticsearch are components on top of the search library providing their own implementations and features for a complete search product. The core functionality of Lucene provides the same experience for basic search functionality across Solr and Elasticsearch, but it’s their implementation approaches surrounding Lucene that create the differentiators.
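To make the shared Lucene core and the differing surfaces concrete, here is a minimal sketch of the same single-field search expressed for each engine's HTTP API. It only builds the requests and sends nothing. The host names, the `articles` collection/index, and the `title` field are assumptions for illustration, and the two queries are roughly rather than strictly equivalent, since analysis and scoring depend on each engine's schema or mapping.

```python
import json
from urllib.parse import urlencode


def solr_select_url(base_url, collection, field, term, rows=10):
    """Build a Solr /select URL using Lucene-style `field:term` syntax."""
    params = urlencode({"q": f"{field}:{term}", "rows": rows, "wt": "json"})
    return f"{base_url}/solr/{collection}/select?{params}"


def elasticsearch_search_request(base_url, index, field, term, size=10):
    """Build the URL and JSON body for an Elasticsearch _search request."""
    url = f"{base_url}/{index}/_search"
    body = {"query": {"match": {field: term}}, "size": size}
    return url, json.dumps(body)


# Hypothetical local endpoints and names, chosen only for this example.
solr_url = solr_select_url("http://localhost:8983", "articles", "title", "lucene")
es_url, es_body = elasticsearch_search_request(
    "http://localhost:9200", "articles", "title", "lucene"
)

print(solr_url)
print(es_url)
print(es_body)
```

Solr takes the query as URL parameters in Lucene query syntax, while Elasticsearch takes a JSON query document. Both are ultimately executed by Lucene.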
The role of a search engine has moved beyond effectively finding information to serving a key role in content analytics, predictive modeling, and integration with cognitive/intelligent search features, such as natural language processing (NLP), machine learning (ML), and relevancy scoring. We've explored and implemented these intelligent capabilities in our client work.
Well, it depends.
There are many use cases surrounding the adoption of one technology over another. But when asked this question, I’ll typically reply with an analogy from an operational management perspective: “Solr is like Linux. Elasticsearch is like Windows.” You can heavily customize and tailor Solr to fit your needs, but managing and deploying it takes considerably more effort and resources than Elasticsearch does. Elasticsearch is very easy to deploy, manage, and monitor (using X-Pack), and its well-designed user interface (Kibana) supports data exploration and the creation of analytical visualizations. However, its customization options are more limited, and extending it through the plugin framework is more difficult.
Elasticsearch could be for you if you want to:
Solr may be for you if you:
This is not to say that a Hadoop platform cannot work with Elasticsearch (we have proposed this scenario to clients), but some platforms, Cloudera and Hortonworks in particular, provide additional tools and methodologies for indexing data and managing Solr within the ecosystem (which is especially the case with the upcoming release of Cloudera’s CDH 6 supporting Solr 7).
From experience, we've seen that assessments can provide tremendous value in helping clients define strategies and implementation roadmaps. During our assessments, we build a search engine comparison matrix that evaluates each engine's suitability against a particular client’s needs and use cases, applying a weighted scoring mechanism based on the priority of certain features. This analysis surfaces the common features and use cases that serve as points of interest when making an overall recommendation for a search engine.
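A weighted comparison matrix of this kind can be sketched in a few lines. The feature names, weights, ratings, and the `engine_a`/`engine_b` labels below are placeholders invented for illustration. They are not findings about Solr or Elasticsearch, and a real assessment would use the client's own priorities and scores.

```python
def weighted_score(weights, ratings):
    """Return the weighted average rating across all weighted features."""
    total_weight = sum(weights.values())
    return sum(weights[name] * ratings[name] for name in weights) / total_weight


# Hypothetical priorities (higher = more important to this client).
weights = {"ease_of_operations": 3, "custom_plugins": 5, "visualization": 2}

# Hypothetical 1-5 ratings per engine; placeholder values only.
ratings = {
    "engine_a": {"ease_of_operations": 2, "custom_plugins": 5, "visualization": 2},
    "engine_b": {"ease_of_operations": 5, "custom_plugins": 3, "visualization": 5},
}

scores = {engine: weighted_score(weights, r) for engine, r in ratings.items()}
for engine, score in sorted(scores.items(), key=lambda item: item[1], reverse=True):
    print(f"{engine}: {score:.2f}")
```

Normalizing by the total weight keeps scores on the same scale as the ratings, so they remain comparable if features are added or reweighted later.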
The chart below captures some of the observations about Solr and Elasticsearch:
| Feature | SOLR | ELASTICSEARCH |
| Use Cases | | |
| Visualization Tools | | |
| Cloud and Big Data | | |
| Cognitive Search Capabilities and Integration | | |
| Management and Operations | | |
| Development Architecture | | |
| Cluster State Management | | |
| Security | | |
| Bulk Indexing Tools | | |
| Near Real Time (NRT) Indexing (not a comprehensive list) | | |
| Analytics | | |
| Nested Data Structures | | |
| Query Operations | | |
| API Interaction | | |
Deciding which search engine is best for your specific use cases and needs should not rest on an “either-or” presumption. The importance of a particular piece of functionality in Solr may outweigh an operational advantage in Elasticsearch, for example:
In one client case, the overhead associated with Solr deployment and having to use an outdated client of SolrNET (at the time) were outweighed by the pluggable nature of Solr. Custom encryption update and request handlers were needed to apply encryption to indexed content using rotating data encryption keys, thereby necessitating the use of Solr over Elasticsearch. The functionality required by the index encryption process was not something that could effectively be implemented within Elasticsearch.
Conversely, when evaluating search engine options for a general search use case without big data or analytics considerations, Elasticsearch becomes a more popular option due to the reduced overhead in maintenance and deployment, as well as the options for fully-hosted and managed environments.
In some scenarios based on what is most important to a client, it is not immediately clear which search engine (including commercial engines) will best serve a client’s needs despite the application of a scoring rubric. In such cases, a “bake-off” can be performed using sample data sets for a client-facing evaluation on how well each engine performs for a specific set of use cases.
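One simple way to score such a bake-off is precision at k: for each test query, the fraction of the top k results that a reviewer judged relevant, averaged over all queries. The sketch below assumes each engine's ranked document IDs have already been collected. The query IDs, document IDs, judgments, and `engine_a`/`engine_b` labels are made-up sample data, and precision at k is only one of several metrics a real evaluation might use.

```python
def precision_at_k(returned_ids, relevant_ids, k):
    """Fraction of the top-k returned documents that are judged relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for doc_id in returned_ids[:k] if doc_id in relevant_ids)
    return hits / k


def mean_precision_at_k(results_by_query, judgments, k):
    """Average precision@k over every judged query."""
    scores = [
        precision_at_k(results_by_query.get(query, []), relevant, k)
        for query, relevant in judgments.items()
    ]
    return sum(scores) / len(scores)


# Hypothetical relevance judgments: query id -> set of relevant document ids.
judgments = {"q1": {"d1", "d2"}, "q2": {"d5"}}

# Hypothetical ranked results captured from each engine under test.
engine_a = {"q1": ["d1", "d3", "d2"], "q2": ["d4", "d5", "d6"]}
engine_b = {"q1": ["d1", "d2", "d3"], "q2": ["d5", "d4", "d6"]}

for name, results in (("engine_a", engine_a), ("engine_b", engine_b)):
    print(name, mean_precision_at_k(results, judgments, k=2))
```

Running both engines against the same judged queries keeps the comparison tied to the client's own content and use cases instead of general impressions.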
At the end of the day, both Solr and Elasticsearch are powerful, flexible, scalable, and extremely capable open source search engines. Your use cases and business requirements, together with your desired features, operational considerations, and integrations with new cognitive search and analytics capabilities, will ultimately drive your decision to select Solr or Elasticsearch.