When performing a data import from MongoDB, Solr throws the following error:
org.apache.solr.common.SolrException: TransactionLog doesn't know how to serialize class org.bson.types.ObjectId; try implementing ObjectResolver?
at org.apache.solr.update.TransactionLog$1.resolve(TransactionLog.java:100)
at org.apache.solr.common.util.JavaBinCodec.writeVal(JavaBinCodec.java:234)
at org.apache.solr.common.util.JavaBinCodec.writeSolrInputDocument(JavaBinCodec.java:589)
at org.apache.solr.update.TransactionLog.write(TransactionLog.java:395)
at org.apache.solr.update.UpdateLog.add(UpdateLog.java:532)
at org.apache.solr.update.UpdateLog.add(UpdateLog.java:516)
at org.apache.solr.update.DirectUpdateHandler2.doNormalUpdate(DirectUpdateHandler2.java:320)
at org.apache.solr.update.DirectUpdateHandler2.addDoc0(DirectUpdateHandler2.java:239)
at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:194)
at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:67)
at org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:55)
at org.apache.solr.update.processor.DistributedUpdateProcessor.doLocalAdd(DistributedUpdateProcessor.java:979)
at org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:1192)
at org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:748)
at org.apache.solr.update.processor.LogUpdateProcessorFactory$LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:103)
at org.apache.solr.handler.dataimport.SolrWriter.upload(SolrWriter.java:80)
at org.apache.solr.handler.dataimport.DataImportHandler$1.upload(DataImportHandler.java:254)
at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:526)
at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:414)
at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:329)
at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:232)
at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:415)
at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:474)
at org.apache.solr.handler.dataimport.DataImporter.lambda$runAsync$0(DataImporter.java:457)
at java.lang.Thread.run(Thread.java:748)
My Solr version is 6.6.0. What could be the reason for the error and how can it be resolved?
I came across this issue while trying to import data from multiple collections in MongoDB.
Assuming you are not using mongo-connector, here is how I imported the data.
Since the returned '_id' is of type ObjectId, my workaround was to convert '_id' to a String before indexing it into Solr and, when querying by '_id', to convert it back to an ObjectId before running the query.
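To make the round trip concrete: the String form of an ObjectId is a 24-character hexadecimal rendering of its 12 bytes, and that string is what ends up in Solr. The sketch below uses only the JDK and a hypothetical helper class, ObjectIdHex (my own name, not part of the importer or the MongoDB driver), to show validating such a string before converting it back. In the real code the conversion back is done with new ObjectId(id).

```java
// Hypothetical JDK-only stand-in that mimics the String form of org.bson.types.ObjectId.
class ObjectIdHex {

    // True when s looks like an ObjectId string: exactly 24 hexadecimal digits.
    static boolean isValid(String s) {
        return s != null && s.matches("[0-9a-fA-F]{24}");
    }

    // Decode the 24-character hex string into its 12 raw bytes.
    static byte[] toBytes(String hex) {
        if (!isValid(hex)) {
            throw new IllegalArgumentException("not a 24-character hex string: " + hex);
        }
        byte[] out = new byte[12];
        for (int i = 0; i < 12; i++) {
            out[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    // Encode raw bytes back into the lowercase hex form.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String id = "56e9c892e4b0355017b2fa0f";
        // The id survives the String -> bytes -> String round trip unchanged.
        System.out.println(toHex(toBytes(id)).equals(id));
    }
}
```

Checking the string first gives a clear error for a malformed id instead of a failure deep inside the query code.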
Download the solr mongo importer and make the following changes.
MongoMapperTransformer.java
// Keep the file's existing package declaration and imports.
public class MongoMapperTransformer extends Transformer {

    public static final String MONGO_FIELD = "mongoField";
    // Marks a field (typically _id) whose ObjectId value should be indexed as a String
    public static final String MONGO_ID = "objectIdToString";

    @Override
    public Object transformRow(Map<String, Object> row, Context context) {
        for (Map<String, String> map : context.getAllEntityFields()) {
            String mongoFieldName = map.get(MONGO_FIELD);
            if (mongoFieldName == null)
                continue;

            String columnFieldName = map.get(DataImporter.COLUMN);
            Object value = row.get(mongoFieldName);

            // If the field is flagged as an ObjectId, index its String form instead.
            // The null check avoids a NullPointerException when the value is missing.
            if (Boolean.parseBoolean(map.get(MONGO_ID)) && value != null) {
                row.put(columnFieldName, value.toString());
            } else {
                row.put(columnFieldName, value);
            }
        }
        return row;
    }
}
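The mapping logic of the transformer can be tried out without Solr on the classpath. The following is a minimal sketch, assuming plain java.util maps in place of the DIH Context: RowMappingSketch, its field(...) helper and the COLUMN constant are stand-ins I made up for illustration ("column" is simply the attribute name used in data-config.xml), and the anonymous object plays the role of an ObjectId.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical JDK-only stand-in for the row mapping done by MongoMapperTransformer.
class RowMappingSketch {
    static final String COLUMN = "column";            // stand-in for DataImporter.COLUMN
    static final String MONGO_FIELD = "mongoField";
    static final String MONGO_ID = "objectIdToString";

    // Builds the attribute map of one <field .../> element from data-config.xml.
    static Map<String, String> field(String column, String mongoField, boolean idToString) {
        Map<String, String> f = new HashMap<>();
        f.put(COLUMN, column);
        f.put(MONGO_FIELD, mongoField);
        if (idToString) {
            f.put(MONGO_ID, "true");
        }
        return f;
    }

    // Copies each mongoField into its column, stringifying the flagged fields.
    static Map<String, Object> transformRow(Map<String, Object> row,
                                            List<Map<String, String>> fields) {
        for (Map<String, String> f : fields) {
            String mongoField = f.get(MONGO_FIELD);
            if (mongoField == null) {
                continue;                              // not a Mongo-backed field
            }
            Object value = row.get(mongoField);
            if (Boolean.parseBoolean(f.get(MONGO_ID)) && value != null) {
                row.put(f.get(COLUMN), value.toString());
            } else {
                row.put(f.get(COLUMN), value);
            }
        }
        return row;
    }

    public static void main(String[] args) {
        // Anonymous object standing in for an ObjectId: only toString() matters here.
        Object fakeId = new Object() {
            @Override public String toString() { return "56e9c892e4b0355017b2fa0f"; }
        };
        Map<String, Object> row = new HashMap<>();
        row.put("_id", fakeId);
        row.put("phone", "123456789");

        List<Map<String, String>> fields = new ArrayList<>();
        fields.add(field("_id", "_id", true));
        fields.add(field("phone", "phone", false));

        transformRow(row, fields);
        System.out.println(row.get("_id") instanceof String);
    }
}
```

After the call, the _id entry holds a plain String, which is a type the transaction log already knows how to serialize.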
Next, replace the method
public Iterator<Map<String, Object>> getData(String query) {...}
in MongoDataSource.java with the following:
// Add "import org.bson.types.ObjectId;" to the file's imports.
@Override
public Iterator<Map<String, Object>> getData(String query) {
    DBObject queryObject;
    /* When querying by _id, the id arrives as a String (the transformer
     * stringified it), so it has to be converted back to an ObjectId
     * using the ObjectId(String) constructor.
     */
    if (query.contains("_id")) {
        @SuppressWarnings("unchecked")
        Map<String, String> queryWithId = (Map<String, String>) JSON.parse(query);
        String id = queryWithId.get("_id");
        queryObject = new BasicDBObject("_id", new ObjectId(id));
    } else {
        queryObject = (DBObject) JSON.parse(query);
    }
    LOG.debug("Executing MongoQuery: " + query);
    long start = System.currentTimeMillis();
    mongoCursor = this.mongoCollection.find(queryObject);
    LOG.trace("Time taken for mongo: " + (System.currentTimeMillis() - start) + " ms");
    ResultSetIterator resultSet = new ResultSetIterator(mongoCursor);
    return resultSet.getIterator();
}
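One caveat with the check above: query.contains("_id") is also true for queries on keys such as user_id, which would then be treated as _id lookups. A stricter test is sketched below, assuming the sub-entity query always has the shape {_id:'<24 hex characters>'} as in the data-config.xml example. IdQuerySketch and extractObjectIdHex are hypothetical names of my own, and the pattern is an assumption about the query format, not something the importer provides.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: recognises queries that consist only of an _id lookup.
class IdQuerySketch {

    // Accepts {_id:'<hex>'} and {"_id": "<hex>"}; the key and value quotes may be ' or ".
    private static final Pattern ID_QUERY = Pattern.compile(
            "^\\s*\\{\\s*[\"']?_id[\"']?\\s*:\\s*[\"']([0-9a-fA-F]{24})[\"']\\s*\\}\\s*$");

    // Returns the 24-character hex id if the whole query is an _id lookup, else empty.
    static Optional<String> extractObjectIdHex(String query) {
        if (query == null) {
            return Optional.empty();
        }
        Matcher m = ID_QUERY.matcher(query);
        return m.matches() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(extractObjectIdHex("{_id:'56e9c892e4b0355017b2fa0f'}").isPresent());
        System.out.println(extractObjectIdHex("{user_id:'42'}").isPresent());
    }
}
```

In getData, the branch that builds new ObjectId(id) could then run only when the Optional is present, and every other query would go through JSON.parse as before.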
After these changes, build the jar using ant.
Copy the jars (solr mongo importer and the mongo-java-driver) into the lib directory. I copied them into ${solr-install-dir}/contrib/dataimporthandler/lib
Add a lib directive for these jars in solrconfig.xml:
<lib dir="${solr.install.dir:../../../..}/contrib/dataimporthandler/lib" regex=".*\.jar" />
Finally, here's an example of the mongo collections and data-config.xml
User collection
{
    "_id" : ObjectId("56e9c892e4b0355017b2fa0f"),
    "name" : "User1",
    "phone" : "123456789"
}
Address collection
{
    "_id" : ObjectId("56e9c892e4b0355017b2fa0f"),
    "address" : "#666, Maiden street"
}
data-config.xml
Do not forget to set objectIdToString="true" on the _id field so that the MongoMapperTransformer can stringify the id.
<dataConfig>
    <dataSource name="MyMongo"
                type="MongoDataSource"
                database="test"
    />
    <document name="UserDetails">
        <!-- if query="" then it imports everything -->
        <entity name="users"
                processor="MongoEntityProcessor"
                query=""
                collection="user"
                datasource="MyMongo"
                transformer="MongoMapperTransformer">
            <field column="_id" name="id" mongoField="_id" objectIdToString="true" />
            <field column="phone" name="phone" mongoField="phone"/>
            <entity name="address"
                    processor="MongoEntityProcessor"
                    query="{_id:'${users._id}'}"
                    collection="address"
                    datasource="MyMongo"
                    transformer="MongoMapperTransformer">
                <field column="address" name="address" mongoField="address"/>
            </entity>
        </entity>
    </document>
</dataConfig>
In the managed-schema, the id field is of type string. Also, if you have nested objects in MongoDB, you will have to use script transformers to index them in Solr.
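Whatever transformer hosts it, handling nested objects mostly comes down to flattening them into top-level fields. The sketch below shows that flattening in Java on plain maps. It is an illustration only: FlattenSketch is a hypothetical class, and the dotted key naming (address.city) is my assumption, so the resulting names would still need matching fields in the schema.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: flattens nested maps into dotted top-level keys.
class FlattenSketch {

    // Returns a new map in which nested maps are replaced by "parent.child" keys.
    static Map<String, Object> flatten(Map<String, Object> doc) {
        Map<String, Object> out = new LinkedHashMap<>();
        flattenInto("", doc, out);
        return out;
    }

    private static void flattenInto(String prefix, Map<?, ?> node, Map<String, Object> out) {
        for (Map.Entry<?, ?> e : node.entrySet()) {
            String key = prefix.isEmpty()
                    ? String.valueOf(e.getKey())
                    : prefix + "." + e.getKey();
            Object value = e.getValue();
            if (value instanceof Map) {
                flattenInto(key, (Map<?, ?>) value, out);   // descend into the nested object
            } else {
                out.put(key, value);                        // leaf value: keep as-is
            }
        }
    }

    public static void main(String[] args) {
        Map<String, Object> address = new LinkedHashMap<>();
        address.put("city", "Springfield");
        Map<String, Object> user = new LinkedHashMap<>();
        user.put("name", "User1");
        user.put("address", address);
        System.out.println(flatten(user).keySet());
    }
}
```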
Hope this helps. Good luck!
According to the error message, you need to implement JavaBinCodec.ObjectResolver for the org.bson.types.ObjectId type, so Solr will know how to serialize instances of this class.
JavaBinCodec.ObjectResolver documentation:
public static interface JavaBinCodec.ObjectResolver
Allows extension of JavaBinCodec to support serialization of arbitrary data types. Implementors of this interface write a method to serialize a given object using an existing JavaBinCodec
Once you have written your JavaBinCodec.ObjectResolver implementation, you should register it with JavaBinCodec.
JavaBinCodec documentation:
public class JavaBinCodec extends Object
Defines a space-efficient serialization/deserialization format for transferring data. JavaBinCodec has built in support many commonly used types. This includes primitive types (boolean, byte, short, double, int, long, float), common Java containers/utilities (Date, Map, Collection, Iterator, String, Object[], byte[]), and frequently used Solr types (NamedList, SolrDocument, SolrDocumentList). Each of the above types has a pair of associated methods which read and write that type to a stream.
Classes that aren't supported natively can still be serialized/deserialized by providing an JavaBinCodec.ObjectResolver object that knows how to work with the unsupported class. This allows JavaBinCodec to be used to marshall/unmarshall arbitrary content.
NOTE -- JavaBinCodec instances cannot be reused for more than one marshall or unmarshall operation.
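The resolver idea described in the documentation can be sketched as follows. This is not the Solr API: Resolver, TinyCodec and FakeObjectId are hypothetical JDK-only stand-ins for JavaBinCodec.ObjectResolver, JavaBinCodec and org.bson.types.ObjectId, written so the example runs on its own. The exact contract of the real interface should be checked against the Solr Javadoc.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical JDK-only illustration of the "object resolver" extension pattern.
class ResolverSketch {

    // Stand-in for JavaBinCodec.ObjectResolver: maps an unsupported value to a supported one.
    interface Resolver {
        // Returns a replacement the codec can write, or null if the value is not handled.
        Object resolve(Object value);
    }

    // Stand-in for the codec: natively supports only null, String, Number and Boolean.
    static final class TinyCodec {
        private final Resolver resolver;
        final List<Object> written = new ArrayList<>();   // records what was "serialized"

        TinyCodec(Resolver resolver) {
            this.resolver = resolver;
        }

        private boolean writeKnownType(Object v) {
            if (v == null || v instanceof String || v instanceof Number || v instanceof Boolean) {
                written.add(v);
                return true;
            }
            return false;
        }

        // Writes a value, consulting the resolver when the type is not natively supported.
        void writeVal(Object v) {
            if (writeKnownType(v)) {
                return;
            }
            if (resolver != null) {
                Object replacement = resolver.resolve(v);
                if (replacement != null && writeKnownType(replacement)) {
                    return;
                }
            }
            throw new IllegalStateException("don't know how to serialize " + v.getClass());
        }
    }

    // Stand-in for org.bson.types.ObjectId: only its String form matters here.
    static final class FakeObjectId {
        private final String hex;

        FakeObjectId(String hex) {
            this.hex = hex;
        }

        @Override
        public String toString() {
            return hex;
        }
    }

    // Resolver that serializes the id type as its String form.
    static final Resolver OBJECT_ID_TO_STRING =
            v -> (v instanceof FakeObjectId) ? v.toString() : null;

    public static void main(String[] args) {
        TinyCodec codec = new TinyCodec(OBJECT_ID_TO_STRING);
        codec.writeVal(new FakeObjectId("56e9c892e4b0355017b2fa0f"));
        System.out.println(codec.written);
    }
}
```

Without a resolver, the stand-in codec fails in the same spirit as the error in the question; with one, the id is written as a String.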