
In Solr, what is the maximum size of a "text" field?

Tags: solr

When using a Solr client in your app, what is the maximum size of a multi-line text field?

Can I send huge XML documents as text?

For example:

SolrInputDocument document = new SolrInputDocument();
document.addField("id", rec.getId());
document.addField("hugeTextFile_txt", hugeTextFile);        
UpdateResponse response = solr.add(document);
solr.commit();  
asked Oct 04 '15 17:10 by Nicholas DiPiazza




1 Answer

Update

I reran the same unit test with a text fieldType, using the declaration below. Note that I removed the analyzer section from the declaration.

<fieldType name="text" class="solr.TextField"/>

I was able to add 500,000,000 characters and index them successfully. For larger values I got a Java heap space error, which is unrelated to Solr itself.


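I ran a simple test that adds a large value to a field. The limit I found is 32,766 bytes; beyond that, it throws an IllegalArgumentException. As the exception message indicates, the limit applies to a single indexed term, and a string field indexes its whole value as one term. The fieldType for the email field was string.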

<fieldType name="string" class="solr.StrField" sortMissingLast="true" />

@Test
public void test() throws IOException, SolrServerException {
  SolrInputDocument document = new SolrInputDocument();
  document.addField("profileId", TestConstants.PROFILE_ID);
  // Build a 32,767-character ASCII value, one byte over the limit.
  StringBuilder builder = new StringBuilder();
  for (int i = 0; i < 32767; i++) {
    builder.append((char) ((i % 26) + 'a'));
  }
  // "email" uses the string fieldType declared above.
  document.addField("email", builder.toString());
  solrClient.add(document);
  solrClient.commit();
}

Exception thrown by the test above for values of 32,767 bytes or more:

Caused by: java.lang.IllegalArgumentException: Document contains at least one immense term in field="email" (whose UTF8 encoding is longer than the max length 32766), all of which were skipped. Please correct the analyzer to not produce such terms. The prefix of the first immense term is: '[97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 97, 98, 99, 100]...', original message: bytes can be at most 32766 in length; got 32767
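
One way to avoid this exception is to check the UTF-8 encoded length of a value before adding it to a string field. The following is a minimal sketch, assuming the 32,766-byte limit quoted in the message above. The class and method names (TermLengthCheck, fitsInSingleTerm, repeatAlphabet) are hypothetical and not part of Solr or SolrJ; only the Java standard library is used.

```java
import java.nio.charset.StandardCharsets;

public class TermLengthCheck {
    // Limit quoted in the exception message above (bytes of UTF-8, not characters).
    static final int MAX_TERM_BYTES = 32766;

    // Returns true if the value's UTF-8 encoding fits within the limit.
    static boolean fitsInSingleTerm(String value) {
        return value.getBytes(StandardCharsets.UTF_8).length <= MAX_TERM_BYTES;
    }

    // Builds the same repeating a..z value used in the unit test above.
    static String repeatAlphabet(int length) {
        StringBuilder builder = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            builder.append((char) ((i % 26) + 'a'));
        }
        return builder.toString();
    }

    public static void main(String[] args) {
        System.out.println(fitsInSingleTerm(repeatAlphabet(32766))); // prints "true"
        System.out.println(fitsInSingleTerm(repeatAlphabet(32767))); // prints "false"
    }
}
```

Because the limit is measured in bytes, a value containing non-ASCII characters can exceed it with fewer than 32,766 characters, which is why the sketch measures the encoded length rather than String.length().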

I hope this helps.

answered Dec 05 '22 07:12 by YoungHobbit