Configure merge_block_size in sphinx search engine

I am powering a web search with Sphinx, and I get the following warning while building the indexes:

WARNING: sort_hits: merge_block_size=76 kb too low, increasing mem_limit may improve performance

The problem is that I can't find any documentation on where this setting is configured. I'm fairly well versed in Sphinx setup, so I just need to know where the setting lives.

Frank Koehl asked Nov 17 '08 20:11


2 Answers

This is probably happening because you're trying to index too many items at once. Make sure you're using ranged queries. If you already are, increasing mem_limit, as the warning suggests, may help; it is set in the indexer section of sphinx.conf. There is no separate merge_block_size setting: the indexer derives that value from mem_limit and the number of documents.

If you're curious as to how it generates that number, check out the source. It's freely available.

Glen Solsberry answered Nov 09 '22 14:11


In sphinx.conf:

sql_query_range = SELECT MIN(id),MAX(id) FROM documents
sql_range_step  = 1000
sql_query       = SELECT * FROM documents WHERE id>=$start AND id<=$end

If the table contains document IDs from 1 to, say, 2345, then sql_query would be run three times:

  1. with $start replaced with 1 and $end replaced with 1000;
  2. with $start replaced with 1001 and $end replaced with 2000;
  3. with $start replaced with 2001 and $end replaced with 2345.
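
To make the stepping concrete, here is a minimal Python sketch of the arithmetic, assuming each step covers an inclusive block of sql_range_step IDs and the next step starts one past the previous end. The function name ranged_steps is made up for illustration; this is not Sphinx's indexer code, just the same calculation written out.

```python
def ranged_steps(min_id, max_id, step):
    """Return the inclusive (start, end) pairs, one per ranged fetch.

    Hypothetical helper: mirrors the stepping described in the answer,
    not the actual Sphinx implementation.
    """
    ranges = []
    start = min_id
    while start <= max_id:
        # Each block spans `step` IDs, clipped to the largest ID in the table.
        end = min(start + step - 1, max_id)
        ranges.append((start, end))
        # The next block begins right after the previous one ends.
        start = end + 1
    return ranges


# Values taken from the example: IDs 1..2345 with sql_range_step = 1000.
for start, end in ranged_steps(1, 2345, 1000):
    print(f"SELECT * FROM documents WHERE id>={start} AND id<={end}")
```

Because the blocks do not overlap, every document ID is fetched exactly once.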

Obviously, that's not much of a difference for a 2000-row table, but when it comes to indexing a 10-million-row MyISAM table, ranged queries might be of some help.

http://sphinxsearch.com/docs/current.html#ranged-queries

Hope it works for you.

juidanho answered Nov 09 '22 15:11