Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to parse JSON in Scala using standard Scala classes?

Tags:

json

scala

People also ask

What is the best JSON library for Scala?

Argonaut is a great library. It's by far the best JSON library for Scala, and the best JSON library on the JVM. If you're doing anything with JSON in Scala, you should be using Argonaut.

What is JSON parse () method?

parse() The JSON. parse() method parses a JSON string, constructing the JavaScript value or object described by the string. An optional reviver function can be provided to perform a transformation on the resulting object before it is returned.

How does spark read JSON?

Spark SQL can automatically infer the schema of a JSON dataset and load it as a DataFrame. using the read. json() function, which loads data from a directory of JSON files where each line of the files is a JSON object. Note that the file that is offered as a json file is not a typical JSON file.


This is a solution based on extractors which will do the class cast:

class CC[T] { def unapply(a:Any):Option[T] = Some(a.asInstanceOf[T]) }

object M extends CC[Map[String, Any]]
object L extends CC[List[Any]]
object S extends CC[String]
object D extends CC[Double]
object B extends CC[Boolean]

val jsonString =
    """
      {
        "languages": [{
            "name": "English",
            "is_active": true,
            "completeness": 2.5
        }, {
            "name": "Latin",
            "is_active": false,
            "completeness": 0.9
        }]
      }
    """.stripMargin

val result = for {
    Some(M(map)) <- List(JSON.parseFull(jsonString))
    L(languages) = map("languages")
    M(language) <- languages
    S(name) = language("name")
    B(active) = language("is_active")
    D(completeness) = language("completeness")
} yield {
    (name, active, completeness)
}

assert( result == List(("English",true,2.5), ("Latin",false,0.9)))

At the start of the for loop I artificially wrap the result in a list so that it yields a list at the end. Then in the rest of the for loop I use the fact that generators (using <-) and value definitions (using =) will make use of the unapply methods.

(Older answer edited away - check edit history if you're curious)


This is the way I do the pattern match:

val result = JSON.parseFull(jsonStr)
result match {
  // Matches if jsonStr is valid JSON and represents a Map of Strings to Any
  case Some(map: Map[String, Any]) => println(map)
  case None => println("Parsing failed")
  case other => println("Unknown data structure: " + other)
}

I like @huynhjl's answer, it led me down the right path. However, it isn't great at handling error conditions. If the desired node does not exist, you get a cast exception. I've adapted this slightly to make use of Option to better handle this.

class CC[T] {
  def unapply(a:Option[Any]):Option[T] = if (a.isEmpty) {
    None
  } else {
    Some(a.get.asInstanceOf[T])
  }
}

object M extends CC[Map[String, Any]]
object L extends CC[List[Any]]
object S extends CC[String]
object D extends CC[Double]
object B extends CC[Boolean]

for {
  M(map) <- List(JSON.parseFull(jsonString))
  L(languages) = map.get("languages")
  language <- languages
  M(lang) = Some(language)
  S(name) = lang.get("name")
  B(active) = lang.get("is_active")
  D(completeness) = lang.get("completeness")
} yield {
  (name, active, completeness)
}

Of course, this doesn't handle errors so much as avoid them. This will yield an empty list if any of the json nodes are missing. You can use a match to check for the presence of a node before acting...

for {
  M(map) <- Some(JSON.parseFull(jsonString))
} yield {
  map.get("languages") match {
    case L(languages) => {
      for {
        language <- languages
        M(lang) = Some(language)
        S(name) = lang.get("name")
        B(active) = lang.get("is_active")
        D(completeness) = lang.get("completeness")
      } yield {
        (name, active, completeness)
      }        
    }
    case None => "bad json"
  }
}

I tried a few things, favouring pattern matching as a way of avoiding casting but ran into trouble with type erasure on the collection types.

The main problem seems to be that the complete type of the parse result mirrors the structure of the JSON data and is either cumbersome or impossible to fully state. I guess that is why Any is used to truncate the type definitions. Using Any leads to the need for casting.

I've hacked something below which is concise but is extremely specific to the JSON data implied by the code in the question. Something more general would be more satisfactory but I'm not sure if it would be very elegant.

implicit def any2string(a: Any)  = a.toString
implicit def any2boolean(a: Any) = a.asInstanceOf[Boolean]
implicit def any2double(a: Any)  = a.asInstanceOf[Double]

case class Language(name: String, isActive: Boolean, completeness: Double)

val languages = JSON.parseFull(jstr) match {
  case Some(x) => {
    val m = x.asInstanceOf[Map[String, List[Map[String, Any]]]]

    m("languages") map {l => Language(l("name"), l("isActive"), l("completeness"))}
  }
  case None => Nil
}

languages foreach {println}

val jsonString =
  """
    |{
    | "languages": [{
    |     "name": "English",
    |     "is_active": true,
    |     "completeness": 2.5
    | }, {
    |     "name": "Latin",
    |     "is_active": false,
    |     "completeness": 0.9
    | }]
    |}
  """.stripMargin

val result = JSON.parseFull(jsonString).map {
  case json: Map[String, List[Map[String, Any]]] =>
    json("languages").map(l => (l("name"), l("is_active"), l("completeness")))
}.get

println(result)

assert( result == List(("English", true, 2.5), ("Latin", false, 0.9)) )