
Solr Filter Cache (FastLRUCache) takes too much memory and results in out of memory?

I have a Solr setup: one master and two slaves for replication. We have about 70 million documents in the index. The slaves have 16 GB of RAM: 10 GB for the OS and disk cache, 6 GB for the Solr heap.

But from time to time, the slaves run out of memory. When we examined a heap dump taken just before one of them ran out of memory, we could see that the class:

org.apache.solr.util.ConcurrentLRUCache$Stats @ 0x6eac8fb88

is using up to 5 GB of memory. We use the filter cache extensively; it has a 93% hit ratio. Here is the XML for the filter cache in solrconfig.xml:

<property name="filterCache.size" value="2000" />
<property name="filterCache.initialSize" value="1000" />
<property name="filterCache.autowarmCount" value="20" />

<filterCache class="solr.FastLRUCache"
             size="${filterCache.size}"
             initialSize="${filterCache.initialSize}"
             autowarmCount="${filterCache.autowarmCount}"/>

The query result cache has the same settings, but it uses solr.LRUCache and only takes about 35 MB of memory. Is there something wrong with the configuration that needs to be fixed, or do I just need more memory for the filter cache?
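For reference, the query result cache entry looks roughly like this (a sketch assuming property names that mirror the filterCache ones above; they are not copied from the actual config):

<property name="queryResultCache.size" value="2000" />
<property name="queryResultCache.initialSize" value="1000" />
<property name="queryResultCache.autowarmCount" value="20" />

<queryResultCache class="solr.LRUCache"
                  size="${queryResultCache.size}"
                  initialSize="${queryResultCache.initialSize}"
                  autowarmCount="${queryResultCache.autowarmCount}"/>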

— asked by Rowanto, Jan 08 '14

People also ask

Does Solr cache results?

The most typical way Solr uses the filterCache is to cache the result of each fq search parameter, though there are some other cases as well. Subsequent queries using the same fq parameter result in cache hits and rapid returns of results.
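For example, these two queries share the same cached filters (field names are illustrative, not taken from the question):

q=laptop&fq=inStock:true&fq=price:[0 TO 100]
q=phone&fq=inStock:true&fq=price:[0 TO 100]

Each distinct fq value gets its own filterCache entry, so the second query reuses the inStock:true and price filters cached by the first, even though the main query differs.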

What is Solr filter cache?

The query cache — Stores sets of document IDs returned by queries. If your velvet pants query returns 1,000 results, a set of 1,000 document IDs (integers) will be stored in the query cache for that query string. The filter cache — Stores the filters built by Solr in response to filters added to queries.

How do I clear Solr cache?

If you want to disable a cache (or all of them), comment out those sections in solrconfig.xml and restart Solr.
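For example, disabling the filter cache would look something like this (a sketch based on the filterCache config from the question):

<!--
<filterCache class="solr.FastLRUCache"
             size="${filterCache.size}"
             initialSize="${filterCache.initialSize}"
             autowarmCount="${filterCache.autowarmCount}"/>
-->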

What is Autowarm in Solr?

In solrconfig.xml, the autowarmCount of the filterCache (or queryResultCache, etc.) indicates how many cache entries will be copied over when a new searcher is opened. However, if documents are added to or deleted from the index, entries that were valid in the old searcher may be stale.


2 Answers

After a friend explained roughly how the filter cache works, it became clear why we get out-of-memory errors from time to time.

So what does the filter cache do? Basically, it creates something like a bit array that tells which documents matched the filter, something like:

cache = [1, 0, 0, 1, .. 0]

A 1 means a hit and a 0 means no hit, so in this example the filter matches the 0th and 3rd documents. Each cache entry is essentially a bit array whose length is the total number of documents. So if I have 50 million docs, the array length will be 50 million, which means one filter cache entry takes up 50,000,000 bits in RAM.

Since we specified a maximum of 2000 filter cache entries, the RAM they can take is roughly:

50,000,000 * 2000 = 100,000,000,000 bits

Converted to GB, that is:

100,000,000,000 bits / 8 (to bytes) / 1000 (to KB) / 1000 (to MB) / 1000 (to GB) = 12.5 GB

So the total RAM needed just for the filter cache is roughly 12.5 GB, which means that if Solr only has 6 GB of heap space, it will not be able to hold 2000 filter cache entries.

Yes, I know Solr doesn't always create this array; if the filter query matches few documents, it can use a different representation that takes up less memory. This calculation just gives a rough upper limit for the filter cache when it holds 2000 entries in RAM; in better cases it can be lower.

So one solution is to lower the maximum number of filter cache entries in the Solr config. We checked the Solr stats; most of the time we only have about 600 filter cache entries, so we can reduce the maximum to around that number, as sketched below.
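That would look something like this (a sketch; the exact values are a judgment call based on the observed ~600 entries, not taken from our production config):

<property name="filterCache.size" value="600" />
<property name="filterCache.initialSize" value="300" />
<property name="filterCache.autowarmCount" value="20" />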

Another option is, of course, to add more RAM.

— answered by Rowanto, Oct 20 '22


Some options:

  1. decrease the size of the cache, and see if you still have a good hit ratio
  2. replace the LRU with solr.LFUCache (Least Frequently Used); maybe in conjunction with point 1 this would still give a good hit ratio (see the sketch after this list)
  3. if you know at query time that an fq will be very rare, don't cache it, by using

    fq={!cache=false}inStock:true

  4. of course, getting more memory is another option

  5. investigate whether DocValues help here; they do help with memory in other scenarios (faceting, sorting, ...), but I'm not sure whether they help with fq

  6. if you are not on the latest release, upgrade.
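For point 2, the change would look roughly like this (a sketch reusing the property names from the question; solr.LFUCache takes the same size parameters):

<filterCache class="solr.LFUCache"
             size="${filterCache.size}"
             initialSize="${filterCache.initialSize}"
             autowarmCount="${filterCache.autowarmCount}"/>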

— answered by Persimmonium, Oct 20 '22