I'm working with Solr 3.5.0. I am importing from a JDBC data source and have a delimited field that I would like split into individual values. I'm using the RegexTransformer but my field isn't being split.
Sample field value:
Bob,Carol,Ted,Alice
Data config:
<dataConfig>
<dataSource driver="..." />
<document>
<entity name="ent"
query="SELECT id,names FROM blah"
transformer="RegexTransformer">
<field column="id" />
<field column="names" splitBy="," />
</entity>
</document>
</dataConfig>
Schema:
<schema name="mytest" version="1.0">
<types>
<fieldType name="string" class="solr.StrField" sortMissingLast="true"
omitNorms="true"/>
<fieldType name="integer" class="solr.IntField" omitNorms="true"/>
</types>
<fields>
<field name="id" type="integer" indexed="false" stored="true"
multiValued="false" required="true" />
<field name="name" type="string" indexed="true" stored="true"
multiValued="true" required="true" />
</fields>
</schema>
When I search, I get a result doc element like this:
<doc>
<int name="id">22</int>
<arr name="names">
<str>Bob,Carol,Ted,Alice</str>
</arr>
</doc>
I was hoping to get this instead:
<doc>
<int name="id">22</int>
<arr name="names">
<str>Bob</str>
<str>Carol</str>
<str>Ted</str>
<str>Alice</str>
</arr>
</doc>
It's quite possible I misunderstand the RegexTransformer section of the wiki. I've tried changing my delimiter and I've tried using a different field for the parts (as shown in the wiki)...
<field column="name" splitBy="," sourceColName="names" />
...but that resulted in an empty name field. What am I doing wrong?
I handled a similar issue by creating a fieldtype in the schema file:
<fieldType name="commaDelimited" class="solr.TextField">
<analyzer>
<tokenizer class="solr.PatternTokenizerFactory" pattern=",\s*" />
</analyzer>
</fieldType>
Then I applied that type to the data field like this:
<field name="features" type="commaDelimited" indexed="true" stored="true"/>
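Note that a tokenizer only changes the indexed terms; the stored value is returned exactly as it was submitted, so a search result would still show the single string Bob,Carol,Ted,Alice. As a rough sketch, assuming the commaDelimited type above is declared in the same schema, the name field from the question might be declared like this (the other attributes are copied from the question):

```xml
<!-- Sketch only: assumes the commaDelimited fieldType defined above.
     Each name is indexed as its own term; the stored value stays unsplit. -->
<field name="name" type="commaDelimited" indexed="true" stored="true"
       multiValued="true" required="true" />
```

This makes individual names searchable but does not produce separate str elements in the response.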
Your database column is called names while the Solr field is called name (notice the missing s). One solution is to use the following in your DIH config and then re-index:
<field name="name" column="names" splitBy=","/>
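Put together, the entity from the question might look like the sketch below. The driver and query are copied from the question unchanged; only the names mapping differs. Here column refers to the database column and name to the Solr field that receives the split values.

```xml
<dataConfig>
  <dataSource driver="..." />
  <document>
    <entity name="ent"
            query="SELECT id,names FROM blah"
            transformer="RegexTransformer">
      <field column="id" />
      <!-- column = database column, name = Solr field in schema.xml -->
      <field name="name" column="names" splitBy="," />
    </entity>
  </document>
</dataConfig>
```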