Fieldtype string vs strings

Tags: solr, solr6

I am confused about the available field types: string vs. strings, int vs. ints, and the equivalent pairs for other data types.

What are the differences between the following four field declarations?

<field name="string_multi" type="string" multiValued="true" indexed="true" stored="true"/>
<field name="string_single" type="string" indexed="true" stored="true"/>
<field name="strings_multi" type="strings" multiValued="true" indexed="true" stored="true"/>
<field name="strings_single" type="strings" indexed="true" stored="true"/>

Given a document like the one below, what should I declare for the field named hashtags?

string with multiValued, strings with multiValued, or strings without multiValued?

{
      "polarity":0.0,
      "text":"RT @socialistudents: Vlad - we go to NUS conference not just as individuals but as members of Socialist Students #SocStu17",
      "created_at":"Sun Feb 12 19:28:34 +0000 2017",
      "hashtags":[
         "hashtag1",
         "hashtag2"
      ],
      "subjectivity":0.0,
      "retweet_recount":4,
      "id":830861171582439424,
      "favorite_count":0
}
Gavin asked Feb 13 '17 04:02


1 Answer

If you mean the default field types that ship with Solr's default schema, their fieldType definitions look like this:

<fieldType name="string" class="solr.StrField" sortMissingLast="true" docValues="true" />
<fieldType name="strings" class="solr.StrField" sortMissingLast="true" multiValued="true" docValues="true" />

Edited: The 2nd example should be strings instead of string

Both use the same class (Solr's standard string class, solr.StrField), so they hold the same type of data. The only difference is that 'strings' sets multiValued="true" at the type level, which means a field of that type can store multiple discrete values.
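Applied to the four fields from the question, this works out roughly as annotated below. This is a sketch that assumes the stock string/strings definitions shown above and a schema version of 1.1 or later (where multiValued defaults to false); it relies on Solr's rule that an attribute set on a field overrides the default from its fieldType.

```xml
<!-- type "string" does not set multiValued; the field sets it explicitly: multi-valued -->
<field name="string_multi" type="string" multiValued="true" indexed="true" stored="true"/>
<!-- neither the type nor the field sets multiValued: single-valued -->
<field name="string_single" type="string" indexed="true" stored="true"/>
<!-- type "strings" is already multiValued, so the attribute is redundant: multi-valued -->
<field name="strings_multi" type="strings" multiValued="true" indexed="true" stored="true"/>
<!-- inherits multiValued="true" from type "strings": multi-valued -->
<field name="strings_single" type="strings" indexed="true" stored="true"/>
```

Under the same assumptions, a field of type strings would only become single-valued if it set multiValued="false" itself.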

In your example, hashtags is an array of individual hashtag values. Since you want to store multiple discrete strings in one field, 'strings' is the right choice because it is multiValued.
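A minimal sketch of that declaration, assuming the default strings type is present in your schema. The indexed and stored values are simply copied from the question's examples and may need adjusting for your use case.

```xml
<!-- multiValued is inherited from the "strings" type, so it need not be repeated here -->
<field name="hashtags" type="strings" indexed="true" stored="true"/>
```

Declaring it as type="string" with multiValued="true" should behave the same way; using strings just keeps the field definition shorter.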

Jayce444 answered Sep 19 '22 17:09