How can I make full-text search with the Grails Searchable Plugin accent-insensitive?
I solved this problem with the help of Peter Ledbrook's post, but it took some effort.
The latest Searchable plugin uses Lucene 2.4.1, which does not include ASCIIFoldingFilter (available since 2.9.0), and ISOLatin1AccentFilter doesn't support many languages, so I created a custom filter for stripping accents:
import java.text.Normalizer
import org.apache.lucene.analysis.Token
import org.apache.lucene.analysis.TokenFilter
import org.apache.lucene.analysis.TokenStream

// Token filter that removes diacritics from every term passing through it.
class StripAccentsFilter extends TokenFilter {
    StripAccentsFilter(TokenStream input) {
        super(input)
    }

    public final Token next(Token reusableToken) {
        assert reusableToken
        Token nextToken = input.next(reusableToken)
        if (nextToken) {
            // term() returns only the valid characters; termBuffer() can be longer than the term itself.
            // NFD splits each accented letter into a base letter plus combining marks, which the regex removes.
            nextToken.setTermBuffer(Normalizer.normalize(nextToken.term(), Normalizer.Form.NFD)
                    .replaceAll("\\p{InCombiningDiacriticalMarks}+", ""))
            return nextToken
        }
        return null
    }
}
and the corresponding filter provider:
import org.apache.lucene.analysis.TokenStream
import org.compass.core.config.CompassSettings
import org.compass.core.lucene.engine.analyzer.LuceneAnalyzerTokenFilterProvider

class StripAccentsFilterProvider implements LuceneAnalyzerTokenFilterProvider {
    public void configure(CompassSettings paramCompassSettings) {
    }

    public TokenStream createTokenFilter(TokenStream paramTokenStream) {
        return new StripAccentsFilter(paramTokenStream)
    }
}
Now all you need to do is register this filter provider in the Searchable plugin configuration (grails-app/conf/Searchable.groovy):
compassSettings = [
    'compass.engine.analyzer.default.filters': 'stripAccents',
    'compass.engine.analyzer.search.filters': 'stripAccents',
    'compass.engine.analyzerfilter.stripAccents.type': 'StripAccentsFilterProvider'
]
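The normalization step at the heart of StripAccentsFilter can be tried on its own, outside Lucene. Below is a minimal sketch in plain Java (Groovy makes the same JDK calls). The class name AccentStripDemo and the method stripAccents are hypothetical names used only for illustration; they are not part of the plugin or of the filter above.

```java
import java.text.Normalizer;

// Hypothetical demo class, not part of the Searchable plugin.
class AccentStripDemo {

    // Decompose each character (NFD), then drop the combining diacritical marks.
    static String stripAccents(String text) {
        String decomposed = Normalizer.normalize(text, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{InCombiningDiacriticalMarks}+", "");
    }

    public static void main(String[] args) {
        // Unicode escapes keep the source file independent of its encoding.
        System.out.println(stripAccents("caf\u00e9"));                              // prints "cafe"
        System.out.println(stripAccents("\u017elu\u0165ou\u010dk\u00fd"));          // prints "zlutoucky"
    }
}
```

One limitation worth knowing: letters such as ł (U+0142) or ø (U+00F8) have no canonical decomposition, so this approach leaves them unchanged. If those matter for your language, they would need an explicit mapping in addition to the normalization.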