I'm using Elasticsearch in Rails 4 through elasticsearch-rails (https://github.com/elasticsearch/elasticsearch-rails).
I have a User model, with an email attribute.
I'm trying to use the 'uax_url_email' tokenizer described in the docs:
class User < ActiveRecord::Base
  include Elasticsearch::Model
  include Elasticsearch::Model::Callbacks

  settings analysis: { analyzer: { whole_email: { tokenizer: 'uax_url_email' } } } do
    mappings dynamic: 'false' do
      indexes :email, analyzer: 'whole_email'
    end
  end
end
I followed examples in the wiki (https://github.com/elasticsearch/elasticsearch-rails/wiki) and the elasticsearch-model docs (https://github.com/elasticsearch/elasticsearch-rails/wiki) to arrive at this.
It doesn't work: the analyzer never shows up in the mapping. If I query Elasticsearch directly:
curl -XGET 'localhost:9200/users/_mapping'
It returns:
{
  "users": {
    "mappings": {
      "user": {
        "properties": {
          "birthdate": {
            "type": "date",
            "format": "dateOptionalTime"
          },
          "created_at": {
            "type": "date",
            "format": "dateOptionalTime"
          },
          "email": {
            "type": "string"
          },
          "first_name": {
            "type": "string"
          },
          "gender": {
            "type": "string"
          },
          "id": {
            "type": "long"
          },
          "last_name": {
            "type": "string"
          },
          "name": {
            "type": "string"
          },
          "role": {
            "type": "string"
          },
          "updated_at": {
            "type": "date",
            "format": "dateOptionalTime"
          }
        }
      }
    }
  }
}
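Reading the mapping by eye gets tedious, so a small script can check whether a field actually carries an analyzer. This is only a sketch: `field_analyzer` and `SAMPLE` are hypothetical names, and the sample is a trimmed copy of the response above rather than a live call to Elasticsearch.

```ruby
require 'json'

# Returns the analyzer set on a field in a _mapping response,
# or nil when the field has none (or does not exist).
def field_analyzer(mapping_json, index:, type:, field:)
  mapping = JSON.parse(mapping_json)
  mapping.dig(index, 'mappings', type, 'properties', field, 'analyzer')
end

# Trimmed copy of the mapping returned above (assumption: same shape).
SAMPLE = <<~JSON
  { "users": { "mappings": { "user": { "properties": {
    "email": { "type": "string" }
  } } } } }
JSON

analyzer = field_analyzer(SAMPLE, index: 'users', type: 'user', field: 'email')
puts(analyzer ? "email uses analyzer #{analyzer}" : 'email has no analyzer set')
```

For the mapping above this reports that no analyzer is set on email, which matches the problem: the field is a plain string with the default analysis.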
For custom analyzers, use a type of custom or omit the type parameter. The previous example used tokenizer, token filters, and character filters with their default configurations, but it is possible to create configured versions of each and to use them in a custom analyzer.
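As an illustration of a configured tokenizer used inside a custom analyzer, the settings could be written as a plain Ruby hash like the one below. This is a sketch under assumptions: `email_tokenizer` and `ANALYSIS_SETTINGS` are made-up names, and the `lowercase` filter is added only for illustration; it is not part of the original model.

```ruby
require 'json'

# Hypothetical settings hash: a configured uax_url_email tokenizer
# referenced by a custom analyzer named whole_email.
ANALYSIS_SETTINGS = {
  analysis: {
    tokenizer: {
      # Configured version of the built-in tokenizer (name is an assumption).
      email_tokenizer: { type: 'uax_url_email', max_token_length: 255 }
    },
    analyzer: {
      # 'custom' is spelled out here; the text above notes it may be omitted.
      whole_email: { type: 'custom', tokenizer: 'email_tokenizer', filter: ['lowercase'] }
    }
  }
}.freeze

# Print the JSON body that would be sent as index settings.
puts JSON.pretty_generate(ANALYSIS_SETTINGS)
```

In the model, a hash of this shape would take the place of the inline `analysis:` argument passed to `settings`.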
The key difference is that normalizers can only emit a single token while analyzers can emit many. Since they only emit one token, normalizers do not use a tokenizer. They do use character filters and token filters but are limited to those that work on one character at a time.
By default, Elasticsearch uses the standard analyzer for all text analysis. The standard analyzer gives you out-of-the-box support for most natural languages and use cases. If you choose to use the standard analyzer as-is, no further configuration is needed.
This ended up being an issue with how I was creating the index. I was trying:
User.__elasticsearch__.client.indices.delete index: User.index_name
User.import
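I expected this to delete the index, then re-import the values. However, with the index gone, the import let Elasticsearch recreate it with default dynamic mappings, so the settings and mappings from the model were never applied. I needed to recreate the index from the model first: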
User.__elasticsearch__.create_index! force: true
User.import
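The two steps can be wrapped in a small helper so they always run in the right order. This is a sketch, not part of elasticsearch-rails: `rebuild_index` is a hypothetical name, and it assumes the model includes Elasticsearch::Model as in the question.

```ruby
# Hypothetical helper: `model` is any class that includes Elasticsearch::Model.
def rebuild_index(model)
  # Drop the index if it exists and recreate it with the settings and
  # mappings declared in the model, so the custom analyzer is registered.
  model.__elasticsearch__.create_index!(force: true)

  # Re-index every record into the freshly created index.
  model.import
end

# Usage, with the gem loaded in a Rails console:
#   rebuild_index(User)
```

As far as I can tell, elasticsearch-model's `import` also accepts `force: true` (`User.import force: true`), which recreates the index before importing and would make the helper unnecessary.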