I'm debugging my Solr schema and I'd like to see the results of tokenizing a specific field.
For a simplified example, if I have:
<fieldType name="text" class="solr.TextField" omitNorms="false">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StandardFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.PorterStemFilterFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="15" side="front"/>
  </analyzer>
</fieldType>
and I indexed a field with the value "Hello, worlds!", I want to see something along the lines of:
hello world he el ll lo hel ell llo hell ello hello wo or rl ld wor orl rld worl orld
to ensure that everything is being tokenized the way I expect.
Is this in any way possible?
Yes, Admin > Analysis is exactly what you want.
But there's another great tool that lets you read the index and see exactly how a field or document was indexed.
It's called Luke, and it's invaluable when troubleshooting and tweaking your schema.
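If you'd rather script the check than click through the admin UI, the Analysis page is backed by Solr's field analysis request handler, which you can call over HTTP. Here is a minimal sketch that only builds the request URL. The host, port and core name (`mycore`) are placeholders I made up, and it assumes the handler is reachable at `/analysis/field` (on older Solr versions it may need to be registered in solrconfig.xml first).

```python
from urllib.parse import urlencode


def field_analysis_url(base_url, core, field_type, field_value):
    """Build a URL for Solr's field analysis request handler.

    base_url and core are placeholders for your own installation.
    Assumes the handler is exposed at /analysis/field.
    """
    params = {
        "analysis.fieldtype": field_type,    # the fieldType name from schema.xml
        "analysis.fieldvalue": field_value,  # text to run through the index analyzer
        "wt": "json",                        # ask for a JSON response
    }
    return f"{base_url}/{core}/analysis/field?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical local instance and core name; adjust to your setup.
    print(field_analysis_url("http://localhost:8983/solr", "mycore",
                             "text", "Hello, worlds!"))
```

Open the printed URL in a browser or fetch it with any HTTP client. The response should list the tokens after each stage of the analyzer chain, which is the same information the Analysis page displays.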
Yes, use the Analysis page in the Solr Admin section. It has exactly that purpose.
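As a rough cross-check for what the Analysis page should report at the last step of the chain in the question, here is a small sketch of front edge n-grams. It is not Solr's code, just an illustration of the idea that `side="front"` emits only prefixes of each token, from `minGramSize` up to `maxGramSize` characters. The function name is made up, and the input tokens are assumed to be already lowercased and stemmed (`hello`, `world`).

```python
def front_edge_ngrams(token, min_gram, max_gram):
    """Return the prefixes of `token` with lengths min_gram..max_gram.

    Illustrative stand-in for a front edge n-gram filter; not Solr's code.
    """
    longest = min(max_gram, len(token))
    return [token[:n] for n in range(min_gram, longest + 1)]


if __name__ == "__main__":
    # Tokens assumed to be the output of the lowercase and stemming steps.
    for token in ["hello", "world"]:
        print(token, "->", front_edge_ngrams(token, 2, 15))
```

If that assumption holds, grams from the middle or end of a word (such as `el` or `llo`) would not appear for this field type, so it is worth comparing against what the Analysis page actually shows.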