 

No content to map due to end-of-input when parsing json


I was using the Play JSON library to parse JSON data in Spark and got the error message below. Does anyone have a clue about the possible cause of this error? If it is due to a bad JSON record, how can I identify that record? Thanks!

Here is the major script I used to parse the JSON data:

import play.api.libs.json._
val jsonData = distdata.map(line => Json.parse(line)) //line 194 of script parseJson_v14.scala
val filteredData = jsonData.map(json => (json \ "QueryStringParameters" \ "pr").asOpt[String].orNull).countByValue()

The variable distdata is an RDD of JSON text lines, and jsonData is an RDD of JsValue data. Since Spark transformations are lazy, the error did not surface until the 2nd command was executed to create filteredData, but according to the stack trace, the error actually comes from the 1st command, where jsonData is created.
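To pinpoint which records fail to parse, one option is to wrap Json.parse in scala.util.Try so malformed lines are captured instead of failing the whole task. This is only a sketch against the question's distdata RDD (the variable names parsed, badLines, and jsonData below are illustrative, not from the original script):

```scala
import scala.util.{Try, Success, Failure}
import play.api.libs.json._

// Keep each raw line alongside its parse attempt, so failures
// can be traced back to the offending input.
val parsed = distdata.map(line => (line, Try(Json.parse(line))))

// Collect a small sample of the lines that failed to parse.
val badLines = parsed.collect { case (line, Failure(_)) => line }
badLines.take(10).foreach(println)

// Keep only the records that parsed successfully.
val jsonData = parsed.collect { case (_, Success(js)) => js }
```

A blank line triggers exactly this "No content to map due to end-of-input" error, so the sample printed by badLines should show whether the culprit is empty lines or genuinely malformed JSON.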


[2017-03-29 14:55:39.616]-[Logging$class.logWarning]-[WARN]: Lost task 42.0 in stage 1.0 (TID 90, 10.119.126.114): com.fasterxml.jackson.databind.JsonMappingException: No content to map due to end-of-input
     at [Source: ; line: 1, column: 1]
            at com.fasterxml.jackson.databind.JsonMappingException.from(JsonMappingException.java:148)
            at com.fasterxml.jackson.databind.ObjectMapper._initForReading(ObjectMapper.java:3110)
            at com.fasterxml.jackson.databind.ObjectMapper._readValue(ObjectMapper.java:3024)
            at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:1652)
            at play.api.libs.json.jackson.JacksonJson$.parseJsValue(JacksonJson.scala:226)
            at play.api.libs.json.Json$.parse(Json.scala:21)
            at parseJson_v14$$anonfun$1$$anonfun$3$$anonfun$apply$1.apply(parseJson_v14.scala:194)
            at parseJson_v14$$anonfun$1$$anonfun$3$$anonfun$apply$1.apply(parseJson_v14.scala:194)
            at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
            at scala.collection.Iterator$$anon$14.hasNext(Iterator.scala:389)
            at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
            at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
            at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
            at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1$$anonfun$13$$anonfun$apply$6.apply$mcV$sp(PairRDDFunctions.scala:1197)
            at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1$$anonfun$13$$anonfun$apply$6.apply(PairRDDFunctions.scala:1197)
            at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1$$anonfun$13$$anonfun$apply$6.apply(PairRDDFunctions.scala:1197)
            at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1250)
            at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1$$anonfun$13.apply(PairRDDFunctions.scala:1205)
            at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1$$anonfun$13.apply(PairRDDFunctions.scala:1185)
            at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
            at org.apache.spark.scheduler.Task.run(Task.scala:89)
            at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
            at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
            at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
            at java.lang.Thread.run(Thread.java:745)
asked Mar 29 '17 by xyin


1 Answer

Check that there are no blank lines in distdata and that each JSON object sits on a single line, like

{"id":"121", "name":"robot 1"}
{"id":"122", "name":"robot 2"}

as opposed to

{"id":"121", "name":
"robot 1"}
{"id":"122", "name":
"robot 2"}
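A hedged way to guard against blank lines in particular is to drop empty or whitespace-only lines before parsing; such lines produce exactly the "No content to map due to end-of-input" error seen in the question. A minimal sketch, assuming the same distdata RDD:

```scala
import play.api.libs.json._

// Filter out empty/whitespace-only lines before handing them
// to Json.parse, which throws on empty input.
val jsonData = distdata
  .filter(_.trim.nonEmpty)
  .map(line => Json.parse(line))
```

Note this only handles empty input; multi-line objects like the second example above would still need the source data rewritten one object per line (or read with a JSON-aware input format).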
answered Oct 11 '22 by Andriy Kuba