
Pros and cons of using Lucidworks Fusion instead of regular Solr

I want to know the pros and cons of using Fusion instead of regular Solr. Can you give some examples (such as a problem that can be solved easily using Fusion)?

donthurtme asked Jun 11 '15 06:06



2 Answers

First of all, I should disclose that I am the Product Manager for Lucidworks Fusion.

You seem to already be aware that Fusion works with Solr (or one or more Solr clusters or instances), using Solr for data storage and querying. The purpose of Fusion is to make it easier to use Solr, integrate Solr, and build complex solutions that make use of Solr. Some of the things that Fusion provides that many people find helpful for this include:

  • Connectors and a connector framework. Bare Solr gives you a good API and the ability to push certain types of files at the command line. Fusion comes with several pre-built data source connectors that fetch data from various types of systems, process it as appropriate (including parsing, transformation, and field mapping), and send the results to Solr. These connectors cover common document stores (cloud and on-premise), relational databases, NoSQL data stores, HDFS, enterprise applications, and a very powerful and configurable web crawler.
  • Security integration. Solr does not have any authentication or authorization (though as of version 5.2 this week, it does have a pluggable API and a basic implementation of Kerberos for authentication). Fusion wraps the Solr APIs with a secured version. Fusion has clean integrations with LDAP, Active Directory, and Kerberos for authentication. It also has a fine-grained authorization model for managing and configuring Fusion and Solr. In addition, the Fusion authorization model can automatically link group memberships from LDAP/AD with access control lists from the Fusion Connectors data sources, so that you get document-level access control mirrored from your source systems when you run search queries.
  • Pipelines processing model. Fusion provides a pipeline model with modular stages (in both API and GUI form) to make it easier to define and edit transformations of data and documents. It is analogous to Unix shell pipes. For example, while indexing you can include stages to define mappings of fields, compute new fields, aggregate documents, pull in data from other sources, etc. before writing to Solr. When querying, you can do the same, along with transforming the query, running and returning the results of other analytics, and applying security filtering.
  • Admin GUI. Fusion has a web UI for viewing and configuring the above (as well as the base Solr config). We think this is convenient for people who want to use Solr, but don't use it regularly enough to remember how to use the APIs, config files, and command line tools.
  • Sophisticated search-based features. Using the pipelines model described above, Fusion includes (and makes easy to use) some richer search-based components, including natural language processing and entity extraction modules, and real-time signals-driven relevancy adjustment. We intend to provide more of these in the future.
  • Analytics processing. Fusion includes and integrates Apache Spark for running deep analytics against data stored in Solr (or on its way into Solr). While Solr implicitly includes certain data analytics capabilities, that is not its main purpose. We use Apache Spark to drive Fusion's signals extraction and relevancy tuning, and expect to expose APIs so users can easily run other processing there.
  • Other: many useful miscellaneous features like: dashboarding UI; basic search UI with manual relevancy tuning; easier monitoring; job management and scheduling; real-time alerting with email integration, and more.

A lot of the above can of course be built or written against Solr, without Fusion, but we think that providing these kinds of enterprise integrations will be valuable to many people.
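
To make the pipeline idea concrete, here is a minimal sketch in Python of what "modular stages, like Unix shell pipes" could look like if you built it yourself. This is not Fusion's API: the names `map_field`, `compute_field`, and `run_pipeline`, as well as the sample fields, are hypothetical and chosen only for illustration.

```python
from functools import reduce
from typing import Callable, Dict, Iterable, List

Document = Dict[str, object]
Stage = Callable[[Document], Document]


def map_field(source: str, target: str) -> Stage:
    """Hypothetical stage: rename one field, leaving the others untouched."""
    def stage(doc: Document) -> Document:
        out = dict(doc)  # copy so the input document is not mutated
        if source in out:
            out[target] = out.pop(source)
        return out
    return stage


def compute_field(name: str, fn: Callable[[Document], object]) -> Stage:
    """Hypothetical stage: add a field computed from the current document."""
    def stage(doc: Document) -> Document:
        out = dict(doc)
        out[name] = fn(doc)
        return out
    return stage


def run_pipeline(stages: Iterable[Stage], doc: Document) -> Document:
    """Pass the document through each stage in order, like a shell pipe."""
    return reduce(lambda current, stage: stage(current), stages, doc)


# An index-time pipeline: map a field, then compute a new one from it.
index_pipeline: List[Stage] = [
    map_field("headline", "title"),
    compute_field("title_length", lambda d: len(str(d["title"]))),
]

raw = {"id": "1", "headline": "Fusion vs Solr"}
indexed = run_pipeline(index_pipeline, raw)
print(indexed)
```

Because every stage has the same shape (document in, document out), stages can be reordered or reused across pipelines, which is the property the shell-pipe analogy points at. A real setup would end with a stage that writes the document to Solr.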

gkanapathy answered Oct 21 '22 18:10


Pros:

  • Connectors: Lucidworks provides a wide range of connectors that you can use to connect to data sources and pull data from them.
  • Reusability: In Lucidworks you can create pipelines for data ingestion and data retrieval. You can put common logic into its own pipeline so that it can be reused from other pipelines.
  • Security: You can restrict access to data, i.e. apply security trimming. Lucidworks provides built-in query pipeline stages for security trimming, or you can write a custom pipeline for your use case.
  • Troubleshooting: Lucidworks runs as discrete services, i.e. api, connectors, and solr. You can troubleshoot an issue service by service, since each service has its own logs. You can also configure JVM properties for each service.
  • Support: Lucidworks support is available 24/7. You can open a support case according to its severity, and they will schedule a call with you.
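
As a rough illustration of what security trimming amounts to at the Solr level, the sketch below builds a filter query (`fq`) from a user's group memberships so that only documents tagged with one of those groups are returned. This is an assumption about how one might do it by hand, not how Fusion's stages are implemented: the `acl_groups` field and the function names are hypothetical.

```python
from typing import Iterable
from urllib.parse import urlencode

ACL_FIELD = "acl_groups"  # hypothetical field holding the groups allowed to see a document


def quote_term(value: str) -> str:
    """Wrap a value in double quotes, escaping backslashes and quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def security_filter(groups: Iterable[str]) -> str:
    """Build a filter query that matches documents visible to any of the groups."""
    terms = sorted({quote_term(g) for g in groups})
    if not terms:
        # No groups: use a pure negative query, which should match no documents.
        return "-*:*"
    return f"{ACL_FIELD}:({' OR '.join(terms)})"


def search_params(user_query: str, groups: Iterable[str]) -> str:
    """URL-encode the user's query together with the security filter."""
    return urlencode({"q": user_query, "fq": security_filter(groups)})


print(security_filter(["sales", "engineering"]))
print(search_params("quarterly report", ["sales"]))
```

Keeping the restriction in `fq` rather than appending it to `q` leaves the user's query and its relevance scoring untouched while still narrowing the result set.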

Cons:

  • Not much, but it keeps you away from your normal development work: you don't get much chance to open your IDE and start coding.

Tayyab Hussain answered Oct 21 '22 17:10