Solr and postgresql integration

I want to add the search engine Solr to my Java application. I want to index some information in Solr, but not everything, because my database is very specific.

I don't want to explain everything because it is complex, so I'll use a simple example.

Let's say I have a table named T, with two columns col1 and col2:

 col1             | col2
------------------|----------
 some text...     |  123
 another text...  |  41
 bla bla...       |  124

I want to index only the col1 column in Solr. I don't want to index the col2 column in Solr. I know it's possible, but I don't want to do this.

In my application's search I want to filter on both columns. For example, I need to get rows with "Lorem ipsum dolorem" in col1 and with values in the range [5, 163] in col2.

How can I do this?

I use PostgreSQL and Hibernate, but perhaps I will change it to MongoDB.

Simon asked Jul 30 '15 20:07


1 Answer

First, in your example, if you don't want to index col2 but you do want to search with filters on col2, do you plan to hand-code a filter on top of the Solr results? Because for Solr to filter on something, it would have to be part of the index... right?
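If you do go the hand-coded route, it could look roughly like the sketch below: take the rows that matched the col1 text search and keep only those whose col2 falls in the requested range. This is only an illustration under assumptions. The Row class, its id field (a hypothetical primary key used to map Solr hits back to database rows) and the class name Col2RangeFilter are made up for the example, and the in-memory list stands in for rows you would load from PostgreSQL using the ids Solr returned.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class Col2RangeFilter {

    /** One row of table T. The id field is a hypothetical primary key, not part of the question. */
    public static final class Row {
        public final long id;
        public final String col1;
        public final int col2;

        public Row(long id, String col1, int col2) {
            this.id = id;
            this.col1 = col1;
            this.col2 = col2;
        }
    }

    /** Keeps only the rows whose col2 lies in [min, max], both ends inclusive. */
    public static List<Row> filterByCol2Range(List<Row> rows, int min, int max) {
        List<Row> kept = new ArrayList<>();
        for (Row row : rows) {
            if (row.col2 >= min && row.col2 <= max) {
                kept.add(row);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Stand-in for rows fetched from PostgreSQL by the ids of the Solr hits on col1.
        List<Row> solrMatches = Arrays.asList(
                new Row(1, "some text...", 123),
                new Row(2, "another text...", 41),
                new Row(3, "bla bla...", 124));

        // Range taken from the question: col2 in [5, 163].
        for (Row row : filterByCol2Range(solrMatches, 5, 163)) {
            System.out.println(row.id + " | " + row.col1 + " | " + row.col2);
        }
    }
}
```

In practice you would more likely push the range into SQL (a WHERE col2 BETWEEN ... clause restricted to the ids from Solr) than filter in memory, and paging gets awkward either way because Solr's hit counts no longer match what you display.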

I found a blog post about hooking up Solr to MySQL via a JDBC handler jar, and I found an example of the specific syntax for the PostgreSQL JDBC jar.

Putting the two together I would speculate that the steps would be (adjust as needed, because likely you've got something that partially works):

  1. In solrconfig.xml put:
    <lib dir="../../../dist/" regex="solr-dataimporthandler-\d.*\.jar" />
    
    <requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
    <lst name="defaults">
    <str name="config">db-data-config.xml</str>
    </lst>
    </requestHandler>
    
  2. Add this to your schema.xml:
    <dynamicField name="*_name" type="text_general" multiValued="false" indexed="true" stored="true" />
    
  3. Add a db-data-config.xml similar to the one in the linked blog post, but with something like the following (I've removed the limit; PostgreSQL does accept LIMIT n, but MySQL's "LIMIT offset, count" form has to be written as "LIMIT count OFFSET offset"):
    <dataConfig>
      <dataSource type="JdbcDataSource"
                driver="org.postgresql.Driver"
                url="jdbc:postgresql://host/db"
                user="user"
                password="password" /> 
      <document>
        <entity name="T" query="SELECT col1 AS col1_name FROM T" />
      </document>
    </dataConfig>
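
Once those three pieces are in place, the import still has to be started by calling the /dataimport handler registered in step 1 with command=full-import. Since your application is Java, here is a rough sketch of doing that from code. The base URL http://localhost:8983/solr, the core name mycore and the class name DataImportTrigger are assumptions for illustration; substitute your own.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class DataImportTrigger {

    /** Builds the full-import URL for the /dataimport handler of the given core. */
    public static URI fullImportUri(String solrBaseUrl, String coreName) {
        return URI.create(solrBaseUrl + "/" + coreName + "/dataimport?command=full-import");
    }

    /** Sends the request to Solr and returns the HTTP status code of the response. */
    public static int triggerFullImport(URI uri) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(uri).GET().build();
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
        return response.statusCode();
    }

    public static void main(String[] args) throws Exception {
        // Assumed host, port and core name; adjust to your installation.
        URI uri = fullImportUri("http://localhost:8983/solr", "mycore");
        System.out.println("Import URL: " + uri);

        // Only contact Solr when explicitly asked, so the sketch runs without a server.
        if (args.length > 0 && args[0].equals("--send")) {
            System.out.println("HTTP status: " + triggerFullImport(uri));
        }
    }
}
```

The same URL can simply be opened in a browser or called from a scheduled job; the Java wrapper only makes sense if the application itself should decide when to re-index.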
    

Also look at the follow-up to that blog post for details on facets, which might help you accomplish some of the filtering you were hoping to do.

dlamblin answered Oct 03 '22 15:10