
solrj: how to store and retrieve List<POJO> via multivalued field in index

My use case is an index which holds titles of online media. The provider of the data associates a list of categories with each title. I am using SolrJ to populate the index via an annotated POJO class

e.g.

@Field("title")
private String title;

@Field("categories")
private List<Category> categoryList;

The associated POJO is

public class Category {
    private Long id;
    private String name;
...

}

My question has two parts:

a) Is this possible via SolrJ? The docs only contain an example of @Field using a List of String, so I assume the serialization/marshalling only supports simple types.

b) How would I set up the schema to hold this? My naive assumption is that I just need to set multiValued=true on the required field and it will all work by magic.

I'm just starting to implement this so any response would be highly appreciated.

David Victor asked Jul 09 '11 08:07


1 Answer

The answer is as you thought:

a) Only simple types are available, so a multivalued field maps to a List of a single simple type, e.g. String. A Lucene document cannot represent complex types, so they cannot be deserialized from it either.

b) The problem is that you are trying to apply relational thinking to a "document store". That will probably work only up to a certain point. If you want to represent categories inside a Lucene document, just store the name string; it is not necessary to store an id as well.
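As a minimal sketch of that suggestion, the bean that goes into the index could carry only the category names. The class and method names here (CategoryNamesSketch, TitleDocument, toDocument) are hypothetical and not part of SolrJ. With SolrJ on the classpath, the two bean fields would carry the @Field("title") and @Field("categories") annotations from the question, and "categories" would be declared multiValued in the schema.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class CategoryNamesSketch {

    // Stand-in for the Category POJO from the question.
    static class Category {
        final Long id;
        final String name;

        Category(Long id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    // Hypothetical index bean. Assumption: with SolrJ these fields would be
    // annotated with @Field("title") and @Field("categories").
    static class TitleDocument {
        String title;
        List<String> categories = new ArrayList<>();
    }

    // Flatten the complex type to the simple type the index can hold.
    static TitleDocument toDocument(String title, List<Category> categoryList) {
        TitleDocument doc = new TitleDocument();
        doc.title = title;
        for (Category c : categoryList) {
            doc.categories.add(c.name);
        }
        return doc;
    }

    public static void main(String[] args) {
        TitleDocument doc = toDocument("Some title",
                Arrays.asList(new Category(1L, "Drama"), new Category(2L, "Comedy")));
        System.out.println(doc.title + " " + doc.categories);
    }
}
```

The ids are dropped on purpose here; the sketch only covers the case where the name alone is enough for searching and display.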

The only reason to store an id as well is if, alongside the search, you want to do a lookup in an RDBMS. In that case you need to make sure that the id and the category name stay soft-linked. This does not work for every 1:n relation. (It is possible for every 1:n relation where the related table consists only of required fields. If there is an optional field, you need to fill it with something like an empty placeholder constant, if possible.)

However, if these 1:n relations are not sparse, it is actually possible, provided you maintain the order in which you add fields to the document. So the category relation can probably be represented as long as you don't sort the lists.

You could implement a method that returns a Category by instantiating it with the values found at a given position 0...n. For example, the values of the first category are at position 0 of every list related to the category.
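The positional approach could look roughly like the sketch below. It assumes two parallel multivalued fields (the names "category_ids" and "category_names" are hypothetical) whose values come back in the order they were added, and it uses a made-up placeholder constant NO_NAME so that an optional name never leaves a gap in one of the lists. None of these names come from SolrJ.

```java
import java.util.ArrayList;
import java.util.List;

class ParallelFieldsSketch {

    // Hypothetical placeholder for a missing optional value, so both
    // lists always have the same length.
    static final String NO_NAME = "_none_";

    // Stand-in for the Category POJO from the question.
    static class Category {
        final Long id;
        final String name;

        Category(Long id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    // Hypothetical index bean. Assumption: with SolrJ the two lists would be
    // annotated with @Field("category_ids") and @Field("category_names").
    static class TitleDocument {
        String title;
        List<Long> categoryIds = new ArrayList<>();
        List<String> categoryNames = new ArrayList<>();

        // Always append to both lists together so positions stay aligned.
        void addCategory(Category c) {
            categoryIds.add(c.id);
            categoryNames.add(c.name == null ? NO_NAME : c.name);
        }

        // Rebuild the Category stored at position i of every list.
        Category categoryAt(int i) {
            String name = categoryNames.get(i);
            return new Category(categoryIds.get(i), NO_NAME.equals(name) ? null : name);
        }

        List<Category> categories() {
            List<Category> result = new ArrayList<>();
            for (int i = 0; i < categoryIds.size(); i++) {
                result.add(categoryAt(i));
            }
            return result;
        }
    }

    public static void main(String[] args) {
        TitleDocument doc = new TitleDocument();
        doc.title = "Some title";
        doc.addCategory(new Category(7L, "Drama"));
        doc.addCategory(new Category(9L, null)); // optional name missing
        for (Category c : doc.categories()) {
            System.out.println(c.id + " -> " + c.name);
        }
    }
}
```

The alignment only holds if nothing reorders or deduplicates the values of one list independently of the other, which is why the lists must not be sorted.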

fyr answered Sep 21 '22 10:09