Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

facet dynamic fields with apache solr

I have defined dynamic field in ApacheSolr:

I use it to store products features like: color_feature, diameter_feature, material_feature and so on. Number of those fields are not constant becouse products are changing.

Is it possible to get facet result for all those dynamic fields with the same query or do I need to write always all fields in a query like ... facet.field=color_feature&facet.field=diameter_feature&facet.field=material_feature&facet.field=...

like image 729
krinn Avatar asked Sep 22 '11 09:09

krinn


People also ask

What are dynamic fields in Solr?

Dynamic fields allow Solr to index fields that you did not explicitly define in your schema. This is useful if you discover you have forgotten to define one or more fields. Dynamic fields can make your application less brittle by providing some flexibility in the documents you can add to Solr.

What is facet field in SOLR?

Faceting is the arrangement of search results into categories based on indexed terms. Searchers are presented with the indexed terms, along with numerical counts of how many matching documents were found were each term.

How do Facets work in SOLR?

Faceted search is the dynamic clustering of items or search results into categories that let users drill into search results (or even skip searching entirely) by any value in any field. Each facet displayed also shows the number of hits within the search that match that category.

What are dynamic fields?

A dynamic field is a type of field in which the value of one field controls the values that a user can choose in another field. In a dynamic field, selecting a value for a parent field determines what value are available in the "child" fields.


2 Answers

Solr currently does not support wildcards in the facet.field parameter.
So *_feature won't work for you.

May want to check on this - https://issues.apache.org/jira/browse/SOLR-247

If you don't want to pass parameters, you can easily add these to your request handler defaults.

The qt=requesthandler in request would always include these facets.

like image 133
Jayendra Avatar answered Oct 24 '22 02:10

Jayendra


I was in a similar situation when working on an e-commerce platform. Each item had static fields (Price, Name, Category) that easily mapped to SOLR's schema.xml, but each item could also have a dynamic amount of variations.

For example, a t-shirt in the store could have Color (Black, White, Red, etc.) and Size (Small, Medium, etc.) attributes, whereas a candle in the same store could have a Scent (Pumpkin, Vanilla, etc.) variation. Essentially, this is an entity-attribute-value (EAV) relational database design used to describe some features of the product.

Since the schema.xml file in SOLR is flat from the perspective of faceting, I worked around it by munging the variations into a single multi-valued field ...

<field
  name="variation"
  type="string"
  indexed="true"
  stored="true"
  required="false"
  multiValued="true" />

... shoving data from the database into these fields as Color|Black, Size|Small, and Scent|Pumpkin ...

  <doc>
    <field name="id">ITEM-J-WHITE-M</field>
    <field name="itemgroup.identity">2</field>
    <field name="name">Original Jock</field>
    <field name="type">ITEM</field>
    <field name="variation">Color|White</field>
    <field name="variation">Size|Medium</field>
  </doc>
  <doc>
    <field name="id">ITEM-J-WHITE-L</field>
    <field name="itemgroup.identity">2</field>
    <field name="name">Original Jock</field>
    <field name="type">ITEM</field>
    <field name="variation">Color|White</field>
    <field name="variation">Size|Large</field>
  </doc>
  <doc>
    <field name="id">ITEM-J-WHITE-XL</field>
    <field name="itemgroup.identity">2</field>
    <field name="name">Original Jock</field>
    <field name="type">ITEM</field>
    <field name="variation">Color|White</field>
    <field name="variation">Size|Extra Large</field>
  </doc>

... so that when I tell SOLR to facet, then I get results that look like ...

<lst name="facet_counts">
  <lst name="facet_queries"/>
  <lst name="facet_fields">
    <lst name="variation">
      <int name="Color|White">2</int>
      <int name="Size|Extra Large">2</int>
      <int name="Size|Large">2</int>
      <int name="Size|Medium">2</int>
      <int name="Size|Small">2</int>
      <int name="Color|Black">1</int>
    </lst>
  </lst>
  <lst name="facet_dates"/>
  <lst name="facet_ranges"/>
</lst>

... so that my code that parses these results to display to the user can just split on my | delimiter (assuming that neither my keys nor values will have a | in them) and then group by the keys ...

Color
    White (2)
    Black (1)
Size
    Extra Large (2)
    Large (2)
    Medium (2)
    Small (2)

... which is good enough for government work.

One disadvantage of doing it this way is that you'll lose the ability to do range facets on this EAV data, but in my case, that didn't apply (the Price field applying to all items and thus being defined in schema.xml so that it can be faceted in the usual way).

Hope this helps someone!

like image 39
Nicholas Piasecki Avatar answered Oct 24 '22 03:10

Nicholas Piasecki