I am confused about the available field types: string vs. strings, int vs. ints, and the like for other data types.
What are the differences between the following four declarations?
<field name="string_multi" type="string" multiValued="true" indexed="true" stored="true"/>
<field name="string_single" type="string" indexed="true" stored="true"/>
<field name="strings_multi" type="strings" multiValued="true" indexed="true" stored="true"/>
<field name="strings_single" type="strings" indexed="true" stored="true"/>
Given the document below, what should I declare for my field named hashtags: string with multiValued, strings with multiValued, or strings without multiValued?
{
"polarity":0.0,
"text":"RT @socialistudents: Vlad - we go to NUS conference not just as individuals but as members of Socialist Students #SocStu17",
"created_at":"Sun Feb 12 19:28:34 +0000 2017",
"hashtags":[
"hashtag1",
"hashtag2"
],
"subjectivity":0.0,
"retweet_recount":4,
"id":830861171582439424,
"favorite_count":0
}
If you mean the default field types that come with Solr's default schema, the fieldType definitions look like this:
<fieldType name="string" class="solr.StrField" sortMissingLast="true" docValues="true" />
<fieldType name="strings" class="solr.StrField" sortMissingLast="true" multiValued="true" docValues="true" />
Edited: the second example should be strings instead of string.
Both types use the same class (Solr's default string class, solr.StrField), so they hold the same kind of data. The only difference is that 'strings' is declared with multiValued="true", which means you can store multiple discrete values in the one field.
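To relate this back to the four declarations in the question, here is a rough sketch of how each one should resolve. It assumes the default fieldType definitions shown above, and it relies on Solr's rule that a property set on a field overrides the default inherited from its fieldType. The comments are my annotations, not part of any shipped schema.

```xml
<!-- 'string' is single-valued by default, but the field overrides that: multi-valued -->
<field name="string_multi" type="string" multiValued="true" indexed="true" stored="true"/>

<!-- nothing overrides the 'string' default: single-valued -->
<field name="string_single" type="string" indexed="true" stored="true"/>

<!-- 'strings' is already multi-valued, so the attribute is redundant: multi-valued -->
<field name="strings_multi" type="strings" multiValued="true" indexed="true" stored="true"/>

<!-- inherits multiValued="true" from 'strings' despite the field name: multi-valued -->
<field name="strings_single" type="strings" indexed="true" stored="true"/>
```

So under these assumptions only string_single is actually single-valued; the other three all accept multiple values.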
In your example, hashtags is an array of individual hashtag values. Since you want to store multiple discrete strings in one field, 'strings' is the right choice because it is multiValued.
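A minimal sketch of what the hashtags declaration could look like, assuming your schema still contains the default 'strings' fieldType shown above. The indexed and stored values are simply copied from the examples in the question; adjust them to your needs.

```xml
<!-- multiValued is inherited from the 'strings' fieldType, so it is not repeated here -->
<field name="hashtags" type="strings" indexed="true" stored="true"/>
```

Declaring the field with type="string" and multiValued="true" should behave the same way; using 'strings' just keeps the declaration shorter.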