I want to use fuzzy matching on a query, but with exact matches shown at the top of the results.
I've tried the following:
$return = $this->_client->search(
    array(
        'index' => self::INDEX,
        'type' => self::TYPE,
        'body' => array(
            'query' => array(
                'bool' => array(
                    'must' => array(
                        'multi_match' => array(
                            'query' => $query,
                            'fields' => array('name', 'brand', 'description'),
                            'boost' => 10,
                        ),
                        'fuzzy_like_this' => array(
                            'like_text' => $query,
                            'fields' => array('name', 'brand', 'description'),
                            'fuzziness' => 1,
                        ),
                    ),
                ),
            ),
            'size' => '5000',
        ),
    )
);
This doesn't work because of a malformed query error.
Any ideas?
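A likely cause of the malformed query error is that 'must' holds two query types as keys of a single associative array, whereas a bool query expects a list in which each element is its own clause. The sketch below is one possible way to get exact matches on top, not a confirmed fix: it puts a boosted plain multi_match and a fuzzy multi_match into 'should', so documents that satisfy both clauses score higher. The function name buildExactFirstQuery is hypothetical, and swapping fuzzy_like_this for multi_match with 'fuzziness' is an assumption made because fuzzy_like_this is not available in newer Elasticsearch versions.

```php
<?php
// Hypothetical helper: builds only the 'query' part of the search body.
function buildExactFirstQuery(string $query, array $fields): array
{
    return array(
        'bool' => array(
            // A list of clauses: each element wraps exactly one query type.
            'should' => array(
                // Plain match, boosted so that exact hits rank first.
                array(
                    'multi_match' => array(
                        'query' => $query,
                        'fields' => $fields,
                        'boost' => 10,
                    ),
                ),
                // Fuzzy match, tolerating one edit per term.
                array(
                    'multi_match' => array(
                        'query' => $query,
                        'fields' => $fields,
                        'fuzziness' => 1,
                    ),
                ),
            ),
            // At least one of the two clauses has to match.
            'minimum_should_match' => 1,
        ),
    );
}
```

The result would be passed as 'query' inside 'body', for example 'query' => buildExactFirstQuery($query, array('name', 'brand', 'description')).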
In Elasticsearch, a fuzzy query matches terms that are similar to, but not exact matches of, the terms in the index. You can use the fuzziness parameter of the match query to find the correct word despite a typo. With fuzziness set to AUTO, a term of 6 or more characters is allowed an edit distance of up to 2.
Fuzzy query: returns documents that contain terms similar to the search term, as measured by a Levenshtein edit distance. An edit distance is the number of one-character changes needed to turn one term into another.
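To make edit distance concrete, here is a small sketch using PHP's built-in levenshtein() function. The helper autoFuzziness is hypothetical; it only mirrors the documented length thresholds of Elasticsearch's AUTO fuzziness setting (0 edits for 1 to 2 characters, 1 edit for 3 to 5, 2 edits for longer terms) and is not part of any client library. Note also that PHP's levenshtein() counts a swap of two adjacent characters as 2 edits, while Elasticsearch by default counts a transposition as 1.

```php
<?php
// Hypothetical helper: maximum edits allowed for a term under "AUTO" fuzziness.
function autoFuzziness(string $term): int
{
    $length = strlen($term);
    if ($length <= 2) {
        return 0; // very short terms must match exactly
    }
    if ($length <= 5) {
        return 1; // one edit allowed
    }
    return 2;     // two edits allowed
}

// Hypothetical helper: would $typed fuzzily match $indexed under those thresholds?
function withinAutoFuzziness(string $typed, string $indexed): bool
{
    return levenshtein($typed, $indexed) <= autoFuzziness($typed);
}
```

For example, levenshtein('brwn', 'brown') is 1 (one inserted character), so withinAutoFuzziness('brwn', 'brown') would be true.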
Minimum Should Match is another search technique that allows you to conduct a more controlled search on related or co-occurring topics by specifying the number of search terms or phrases in the query that should occur within the records returned.
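In Elasticsearch this idea corresponds to the minimum_should_match parameter of the match query. A minimal sketch, assuming a single field and a percentage threshold; the function name buildMinimumMatchQuery and the default of '75%' are illustrative choices, not values from the question.

```php
<?php
// Hypothetical helper: a match query that requires a share of the query terms to match.
function buildMinimumMatchQuery(string $field, string $text, string $minimum = '75%'): array
{
    return array(
        'match' => array(
            $field => array(
                'query' => $text,
                // With '75%' and a four-term query, at least three terms must match.
                'minimum_should_match' => $minimum,
            ),
        ),
    );
}
```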
The match query analyzes any provided text before performing a search. This means the match query can search text fields for analyzed tokens rather than an exact term. Its optional analyzer parameter (a string) names the analyzer used to convert the query text into tokens; it defaults to the index-time analyzer mapped for the field.
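As a rough illustration of the analyzer parameter, the sketch below builds a match query and sets 'analyzer' only when one is passed, so the field's mapped analyzer is used otherwise. The function name buildAnalyzedMatchQuery is hypothetical.

```php
<?php
// Hypothetical helper: a match query with an optional query-time analyzer override.
function buildAnalyzedMatchQuery(string $field, string $text, ?string $analyzer = null): array
{
    $clause = array('query' => $text);
    if ($analyzer !== null) {
        // Overrides the analyzer mapped for the field, for this query only.
        $clause['analyzer'] = $analyzer;
    }
    return array('match' => array($field => $clause));
}
```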
I ended up not using fuzzy matching to solve my problem and went with ngrams instead.
/**
 * Map - Create a new index with property mapping
 */
public function map()
{
    $params = array();
    $params['index'] = self::INDEX;

    // Custom analyzer: split on whitespace, lowercase, then emit 3-5 character ngrams
    $params['body']['settings'] = array(
        'index' => array(
            'analysis' => array(
                'analyzer' => array(
                    'product_analyzer' => array(
                        'type' => 'custom',
                        'tokenizer' => 'whitespace',
                        'filter' => array('lowercase', 'product_ngram'),
                    ),
                ),
                'filter' => array(
                    'product_ngram' => array(
                        'type' => 'nGram',
                        'min_gram' => 3,
                        'max_gram' => 5,
                    ),
                )
            ),
        )
    );

    // Field mappings: name and brand use the ngram analyzer and are boosted
    $mapping = array(
        '_source' => array(
            'enabled' => true
        ),
        'properties' => array(
            'id' => array(
                'type' => 'string',
            ),
            'name' => array(
                'type' => 'string',
                'analyzer' => 'product_analyzer',
                'boost' => '10',
            ),
            'brand' => array(
                'type' => 'string',
                'analyzer' => 'product_analyzer',
                'boost' => '5',
            ),
            'description' => array(
                'type' => 'string',
            ),
            'barcodes' => array(
                'type' => 'string'
            ),
        ),
    );

    $params['body']['mappings'][self::TYPE] = $mapping;
    $this->_client->indices()->create($params);
}
/**
 * Search - Return the IDs of products matching the query
 */
public function search($query)
{
    $return = $this->_client->search(
        array(
            'index' => self::INDEX,
            'type' => self::TYPE,
            'body' => array(
                'query' => array(
                    'multi_match' => array(
                        'query' => $query,
                        'fields' => array('id', 'name', 'brand', 'description', 'barcodes'),
                    ),
                ),
                'size' => '5000',
            ),
        )
    );

    // Collect the document IDs in the order Elasticsearch returned them
    $productIds = array();
    if (!empty($return['hits']['hits'])) {
        foreach ($return['hits']['hits'] as $hit) {
            $productIds[] = $hit['_id'];
        }
    }
    return $productIds;
}
The result is exactly what I was looking for. It ranks matches by how many ngram parts of the search query each document contains.
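To give a feel for why this works, here is a rough, self-contained imitation of what the analyzer above does to a piece of text: split on whitespace, lowercase, and cut each token into 3 to 5 character ngrams. The function names productNgrams, productAnalyze and sharedGramCount are hypothetical, the code only handles single-byte (ASCII) text, and real Elasticsearch scoring is more involved than a simple count of shared ngrams.

```php
<?php
// All substrings of $token with lengths from $min to $max.
function productNgrams(string $token, int $min = 3, int $max = 5): array
{
    $grams = array();
    $length = strlen($token);
    for ($size = $min; $size <= $max; $size++) {
        for ($start = 0; $start + $size <= $length; $start++) {
            $grams[] = substr($token, $start, $size);
        }
    }
    return $grams;
}

// Imitates the analyzer: whitespace tokenizer, lowercase filter, ngram filter.
function productAnalyze(string $text): array
{
    $grams = array();
    $tokens = preg_split('/\s+/', strtolower(trim($text)), -1, PREG_SPLIT_NO_EMPTY);
    foreach ($tokens as $token) {
        $grams = array_merge($grams, productNgrams($token));
    }
    return array_values(array_unique($grams));
}

// Number of distinct ngrams of the query that also occur in the field value.
function sharedGramCount(string $query, string $fieldValue): int
{
    return count(array_intersect(productAnalyze($query), productAnalyze($fieldValue)));
}
```

With these helpers, the query 'nike' shares three ngrams ('nik', 'ike', 'nike') with the product name 'Nike Air', while 'nikon' shares only one ('nik'), which is roughly why closer matches end up nearer the top.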