 

How much extra space/RAM/CPU is used by Apache Solr?

I am using a MySQL database for my web app. I need to search over multiple tables and multiple columns, which is very similar to full-text searching inside those columns.
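For illustration, a minimal sketch of this kind of multi-table, multi-column search in plain MySQL (table and column names are placeholders, and it assumes FULLTEXT indexes on the searched columns plus the mysql-connector-python package):

    import mysql.connector

    conn = mysql.connector.connect(host="localhost", user="app",
                                   password="secret", database="webapp")
    cur = conn.cursor()

    term = "solr performance"

    # Placeholder tables/columns: search several tables and merge the results.
    # Each MATCH(...) requires a FULLTEXT index over exactly those columns.
    sql = """
        SELECT 'articles' AS src, id FROM articles
         WHERE MATCH(title, body) AGAINST (%s IN NATURAL LANGUAGE MODE)
        UNION ALL
        SELECT 'comments' AS src, id FROM comments
         WHERE MATCH(text) AGAINST (%s IN NATURAL LANGUAGE MODE)
    """
    cur.execute(sql, (term, term))
    for src, doc_id in cur.fetchall():
        print(src, doc_id)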

I would like to know your experience of using a full-text search engine (e.g. Solr/Lucene/MapReduce/Hadoop, etc.) compared to plain SQL, in terms of:

  1. Speed/performance
  2. Extra disk space usage
  3. Extra CPU usage (does it continuously rebuild the index?)
  4. How long does it take to build the index before it is ready for use?
  5. Please share your experience with these frameworks.

Thanks a lot!

SmartSolution asked Jan 03 '12


People also ask

How much RAM does Solr need?

For example, with an 8GB index: if your OS, Solr's Java heap, and all other running programs require 4GB of memory, then an ideal memory size for that server is at least 12GB. You might be able to make it work with 8GB total memory (leaving 4GB for disk cache), but that also might NOT be enough.
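A tiny sketch of that rule of thumb in Python (the figures are taken from the paragraph above; real deployments should be sized by measuring):

    def recommended_ram_gb(index_size_gb, heap_and_other_gb):
        # Rule of thumb: leave enough free RAM for the OS to cache the whole
        # index on disk, on top of the heap and everything else on the box.
        return heap_and_other_gb + index_size_gb

    print(recommended_ram_gb(index_size_gb=8, heap_and_other_gb=4))  # 12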

What is heap memory in Solr?

There are two types of memory Solr can use: heap memory and direct memory (often called off-heap memory). Direct memory is used to cache blocks read from the file system, similar to the Linux file system cache. Heap memory has several major consumers inside Solr itself.
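One way to look at the heap in practice is Solr's metrics API (available in recent Solr versions). A small sketch, assuming a local Solr instance; exact metric names can vary between versions:

    import requests

    # Ask Solr for its JVM metrics group.
    resp = requests.get(
        "http://localhost:8983/solr/admin/metrics",
        params={"group": "jvm", "wt": "json"},
    )
    jvm = resp.json().get("metrics", {}).get("solr.jvm", {})

    # Print heap and off-heap related gauges.
    for name, value in jvm.items():
        if "memory.heap" in name or "memory.non-heap" in name:
            print(name, value)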

Is Solr in-memory?

Solr, like many other data stores, uses memory to speed up processing. In general, the more memory you have, the better. This applies both to Solr's dedicated memory (the JVM heap) and to memory at the OS level. However, in both cases there are some rules you should consider.

How can I make Solr index faster?

Tip #6: commit at the end. The auto-commit settings can be configured in the solrconfig.xml file. If your indexing process only issues an explicit commit at the end, auto-commit can handle the intermediate commits during indexing and limit the performance impact on the indexing process.
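A rough sketch of that pattern with the pysolr client (hypothetical core URL, documents, and field names; intermediate hard commits are left to Solr's autoCommit settings):

    import pysolr

    # Hypothetical core URL; the *_t dynamic field is just a placeholder.
    solr = pysolr.Solr("http://localhost:8983/solr/mycore", timeout=60)
    docs = ({"id": str(i), "title_t": f"document {i}"} for i in range(100000))

    batch = []
    for doc in docs:
        batch.append(doc)
        if len(batch) >= 1000:
            solr.add(batch, commit=False)   # no explicit commit per batch
            batch = []
    if batch:
        solr.add(batch, commit=False)

    solr.commit()  # single explicit (hard) commit at the very end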


1 Answer

To answer your questions:

1.) I have a database with roughly 5 million documents. MySQL full-text search needs 2-3 minutes; Solr/Lucene needs roughly 200-400 milliseconds for the same search.
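If you want to time the Solr side of such a comparison yourself, a minimal sketch with the pysolr client (core name and field are placeholders):

    import time
    import pysolr

    solr = pysolr.Solr("http://localhost:8983/solr/mycore", timeout=30)

    start = time.perf_counter()
    results = solr.search("body_t:performance", rows=10)
    elapsed_ms = (time.perf_counter() - start) * 1000

    # Client-side wall-clock time; Solr also reports its own "QTime" in the response.
    print(f"{results.hits} hits in {elapsed_ms:.1f} ms")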

2.) The space you need depends on your configuration, the number of copyFields, and whether you store the data or only index it. In my configuration the full DB is indexed but only the metadata is stored, so a 30 GB DB needs about 40 GB for Solr/Lucene. Keep in mind that if you want to (re)optimize your index, you temporarily need 100% of the index size again.
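As a sketch of that last point, triggering an optimize from pysolr (the peak-disk figure is just the rule of thumb above, not a measurement; the core URL is a placeholder):

    import pysolr

    index_size_gb = 40
    # While the optimize (segment merge) runs, expect up to roughly twice
    # the index size on disk.
    print(f"Plan for up to ~{2 * index_size_gb} GB of disk during optimize")

    solr = pysolr.Solr("http://localhost:8983/solr/mycore", timeout=600)
    solr.optimize()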

3.) If you migrate from a MySQL full-text index to Lucene/Solr, you save CPU power. MySQL full-text search needs much more CPU power than Solr full-text search -> see answer 1.)

4.) It depends on the number of documents, the size of the documents, and the disk speed. Of course, CPU performance is also very important. Indexing does not scale well over multiple CPUs: 2 big cores are much faster than 8 small cores. Indexing 5 million documents (44 GB) in my environment takes 2-3 hours on a dual-core VMware server.

5.) Migrating from a MySQL full-text index to a Lucene/Solr full-text index was the best idea ever. ;-) But you will probably have to redesign your application.

// Edit to answer the question "Will the Lucene index get updated immediately after some insert statements?"

It depends on your Solr configuration, but it is possible.
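For example, with the pysolr client you can make new documents searchable almost immediately; whether that is appropriate also depends on the autoCommit/autoSoftCommit settings in solrconfig.xml (core URL and field names below are placeholders):

    import pysolr

    solr = pysolr.Solr("http://localhost:8983/solr/mycore", timeout=30)

    # Option 1: ask Solr to make the document searchable within 1 second.
    solr.add([{"id": "new-1", "title_t": "fresh row"}], commitWithin=1000)

    # Option 2: issue a soft commit right away (visible, but not yet flushed to disk).
    solr.add([{"id": "new-2", "title_t": "another row"}], softCommit=True)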

The Bndr answered Oct 28 '22