
Data-config.xml and mysql - I can load only "id" column

I'm running Solr 5.0.0 on Windows Server 2012 and would like to load all the data from my MySQL table into Solr.

My data-config.xml looks like this:

<?xml version="1.0" encoding="UTF-8" ?>
<!--# define data source -->
<dataConfig>
  <dataSource type="JdbcDataSource"
              driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://localhost:3306/database"
              user="root"
              password="root"/>
  <document>
    <entity name="my_table"
            pk="id"
            query="SELECT ID, LASTNAME FROM my_table limit 2">
      <field column="ID" name="id" type="string" indexed="true" stored="true" required="true" />
      <field column="LASTNAME" name="lastname" type="string" indexed="true" stored="true"/>
    </entity>
  </document>
</dataConfig>

When I run the data import, I get this response:

Indexing completed. Added/Updated: 2 documents. Deleted 0 documents    
Requests: 1, Fetched: 2, Skipped: 0, Processed: 2 

And the raw debug response:

{
  "responseHeader": {
    "status": 0,
    "QTime": 280
  },
  "initArgs": [
    "defaults",
    [
      "config",
      "data-config.xml"
    ]
  ],
  "command": "full-import",
  "mode": "debug",
  "documents": [
    {
      "id": [
        1983
      ],
      "_version_": [
        1497798459776827400
      ]
    },
    {
      "id": [
        1984
      ],
      "_version_": [
        1497798459776827400
      ]
    }
  ],
  "verbose-output": [
    "entity:my_table",
    [
      "document#1",
      [
        "query",
        "SELECT ID,LASTNAME FROM my_table limit 2",
        "time-taken",
        "0:0:0.8",
        null,
        "----------- row #1-------------",
        "LASTNAME",
        "Gates",
        "ID",
        1983,
        null,
        "---------------------------------------------"
      ],
      "document#2",
      [
        null,
        "----------- row #1-------------",
        "LASTNAME",
        "Doe",
        "ID",
        1984,
        null,
        "---------------------------------------------"
      ],
      "document#3",
      []
    ]
  ],
  "status": "idle",
  "importResponse": "",
  "statusMessages": {
    "Total Requests made to DataSource": "1",
    "Total Rows Fetched": "2",
    "Total Documents Skipped": "0",
    "Full Dump Started": "2015-04-07 15:05:22",
    "": "Indexing completed. Added/Updated: 2 documents. Deleted 0 documents.",
    "Committed": "2015-04-07 15:05:22",
    "Optimized": "2015-04-07 15:05:22",
    "Total Documents Processed": "2",
    "Time taken": "0:0:0.270"
  }
}

Finally, when I query Solr:

http://localhost:8983/solr/test/query?q=*:*

I get this response:

{
  "responseHeader":{
    "status":0,
    "QTime":0,
    "params":{
      "q":"*:*"}},
  "response":{"numFound":2,"start":0,"docs":[
      {
        "id":"1983",
        "_version_":1497798459776827392},
      {
        "id":"1984",
        "_version_":1497798459776827393}]
  }}

I would like to see the lastname column too. Why can't I?

Kot Szutnik asked Apr 07 '15 13:04




1 Answer

That warning in the logs is actually the real issue.

If you look in the solrconfig.xml file you will have a section:

<schemaFactory class="ManagedIndexSchemaFactory">
  <bool name="mutable">true</bool>
  <str name="managedSchemaResourceName">managed-schema</str>
</schemaFactory>

This means that your schema.xml file is being ignored. Instead, the managed-schema file in the same folder is used.

There are a couple of ways to solve this. You can comment out the managed schema section and replace it with

<schemaFactory class="ClassicIndexSchemaFactory"/>

Another way is to delete the managed-schema file. Solr will then read the schema.xml file on restart and generate a new managed-schema from it. If that works, you should see your fields at the bottom of the new file.
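
Either approach only helps if schema.xml actually declares the fields that the import produces. Here is a minimal sketch of what those declarations might look like for the two columns in the question. It assumes a fieldType named string is defined in the schema, as it is in the stock example configs; the attribute values are illustrative and not taken from the asker's actual schema:

```xml
<!-- schema.xml fragment (illustrative): goes inside the <schema> element -->
<!-- Assumes <fieldType name="string" class="solr.StrField" .../> exists elsewhere in the file -->
<field name="id" type="string" indexed="true" stored="true" required="true" multiValued="false"/>
<!-- Must match the name="lastname" mapping used in data-config.xml -->
<field name="lastname" type="string" indexed="true" stored="true"/>
```

After restarting Solr and re-running the full import, a stored lastname field should appear in the query results alongside id.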

For more information please see:

https://cwiki.apache.org/confluence/display/solr/Managed+Schema+Definition+in+SolrConfig

applefish answered Sep 18 '22 18:09
