
How do I turn a JSON file into a Java 8 Object Stream?

I have a very large (> 1 GB) JSON file containing an array. The real data is confidential, but this file of sleep-duration data is a proxy:

    [
        {
            "date": "August 17, 2015",
            "hours": 7,
            "minutes": 10
        },
        {
            "date": "August 19, 2015",
            "hours": 4,
            "minutes": 46
        },
        {
            "date": "August 19, 2015",
            "hours": 7,
            "minutes": 22
        },
        {
            "date": "August 21, 2015",
            "hours": 4,
            "minutes": 48
        },
        {
            "date": "August 21, 2015",
            "hours": 6,
            "minutes": 1
        }
    ]

I've used JSON2POJO to produce a "Sleep" object definition.

Now, one could use Jackson's ObjectMapper to convert the whole file to an array and then call Arrays.stream(array), except that this crashes (yes, it's a BIG file).

The obvious thing is to use Jackson's Streaming API. But that's super low level. In particular, I still want Sleep Objects.

How do I use the Jackson Streaming JSON reader and my Sleep.java class to generate a Java 8 Stream of Sleep Objects?

asked Jan 25 '16 19:01 by Witbrock




2 Answers

I couldn't find a good solution to this, and I needed one for a particular case: I had a > 1 GB JSON file (a top-level JSON array with tens of thousands of largish objects), and using the normal Jackson mapper caused crashes when accessing the resulting Java object array.

The examples I found for the Jackson Streaming API gave up the object mapping that makes Jackson so appealing, and certainly didn't expose the objects through the (obviously appropriate) Java 8 Stream API.

The code is now on GitHub

Here's a quick example of use:

    // Use the JSON file included as a resource
    ClassLoader classLoader = SleepReader.class.getClassLoader();
    File dataFile = new File(classLoader.getResource("example.json").getFile());

    // Simple example of getting the Sleep objects from that JSON
    new JsonArrayStreamDataSupplier<>(dataFile, Sleep.class)
            .getStream() // a sequential Stream<Sleep>, read from the file on demand
            .forEach(nightsRest -> System.out.println(nightsRest));
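
Once the objects arrive as a Stream, the usual Java 8 operations apply. The sketch below only illustrates that part. Its Sleep class is a hand-written stand-in (the generated class isn't shown here, so its shape is an assumption), and it builds the stream from the five sample records with Stream.of rather than from the JSON reader.

```java
import java.util.stream.Stream;

public class SleepStreamDemo {

    // Hypothetical stand-in for the generated Sleep class
    static final class Sleep {
        private final String date;
        private final int hours;
        private final int minutes;

        Sleep(String date, int hours, int minutes) {
            this.date = date;
            this.hours = hours;
            this.minutes = minutes;
        }

        String getDate() {
            return date;
        }

        int totalMinutes() {
            return hours * 60 + minutes;
        }
    }

    // The five sample records, as an in-memory stream
    static Stream<Sleep> sampleNights() {
        return Stream.of(
                new Sleep("August 17, 2015", 7, 10),
                new Sleep("August 19, 2015", 4, 46),
                new Sleep("August 19, 2015", 7, 22),
                new Sleep("August 21, 2015", 4, 48),
                new Sleep("August 21, 2015", 6, 1));
    }

    // Counts the nights strictly shorter than the given number of minutes
    static long countShortNights(Stream<Sleep> nights, int thresholdMinutes) {
        return nights.filter(n -> n.totalMinutes() < thresholdMinutes).count();
    }

    // Mean sleep duration in minutes, or 0 for an empty stream
    static double averageMinutes(Stream<Sleep> nights) {
        return nights.mapToInt(Sleep::totalMinutes).average().orElse(0);
    }

    public static void main(String[] args) {
        System.out.println("Nights under 5 hours: " + countShortNights(sampleNights(), 300));
        System.out.println("Average minutes: " + averageMinutes(sampleNights()));
    }
}
```

Because a Stream can be consumed only once, each computation above asks for a fresh stream. With a file-backed source that would mean re-reading the file, or combining the computations into a single pass.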

Here's some JSON from example.json

    [
        {
            "date": "August 17, 2015",
            "hours": 7,
            "minutes": 10
        },
        {
            "date": "August 19, 2015",
            "hours": 4,
            "minutes": 46
        },
        {
            "date": "August 19, 2015",
            "hours": 7,
            "minutes": 22
        },
        {
            "date": "August 21, 2015",
            "hours": 4,
            "minutes": 48
        },
        {
            "date": "August 21, 2015",
            "hours": 6,
            "minutes": 1
        }
    ]

and, in case you don't want to go to GitHub (you should), here's the wrapper class itself:

    /**
     * @license APACHE LICENSE, VERSION 2.0 http://www.apache.org/licenses/LICENSE-2.0
     * @author Michael Witbrock
     */
    package com.michaelwitbrock.jacksonstream;

    import com.fasterxml.jackson.core.JsonFactory;
    import com.fasterxml.jackson.core.JsonParser;
    import com.fasterxml.jackson.core.JsonToken;
    import com.fasterxml.jackson.databind.ObjectMapper;
    import java.io.File;
    import java.io.IOException;
    import java.io.UncheckedIOException;
    import java.util.Iterator;
    import java.util.NoSuchElementException;
    import java.util.Spliterator;
    import java.util.Spliterators;
    import java.util.stream.Stream;
    import java.util.stream.StreamSupport;

    public class JsonArrayStreamDataSupplier<T> implements Iterator<T> {
        /*
         * This class wraps the Jackson streaming API for arrays (a common kind of
         * large JSON file) in a Java 8 Stream. The initial motivation was that
         * use of a default ObjectMapper to a Java array was crashing for me on
         * a very large JSON file (> 1GB). And there didn't seem to be good example
         * code for handling Jackson streams as Java 8 streams, which seems natural.
         */

        private static final ObjectMapper mapper = new ObjectMapper();
        private final JsonParser parser;
        private final Class<T> type;
        // True once hasNext() has advanced the parser; reset by next()
        private boolean lookedAhead = false;
        // Result of the last look-ahead: is the parser sitting on a '{'?
        private boolean atObject = false;

        public JsonArrayStreamDataSupplier(File dataFile, Class<T> type) {
            this.type = type;
            try {
                // Set up and get into a state to start iterating
                parser = new JsonFactory().createParser(dataFile);
                parser.setCodec(mapper);
                JsonToken token = parser.nextToken();
                if (token == null) {
                    throw new IllegalArgumentException("Can't get any JSON token from "
                            + dataFile.getAbsolutePath());
                }
                // The first token is supposed to be the start of the array, '['
                if (token != JsonToken.START_ARRAY) {
                    throw new IllegalArgumentException("Can't get the JSON array start token from "
                            + dataFile.getAbsolutePath());
                }
            } catch (IOException e) {
                // Fail loudly instead of silently producing an empty stream
                throw new UncheckedIOException("Can't read " + dataFile.getAbsolutePath(), e);
            }
        }

        /*
         * This method returns the stream, and is the only method other
         * than the constructor that should be used.
         */
        public Stream<T> getStream() {
            return StreamSupport.stream(
                    Spliterators.spliteratorUnknownSize(this, Spliterator.ORDERED), false);
        }

        /*
         * The remaining methods are what enable this to be passed to the
         * spliterator generator, since they make it an Iterator.
         */
        @Override
        public boolean hasNext() {
            // Advance the parser at most once per element, so that repeated
            // hasNext() calls do not skip tokens
            if (!lookedAhead) {
                try {
                    // Anything other than '{' (normally END_ARRAY) ends the iteration
                    atObject = parser.nextToken() == JsonToken.START_OBJECT;
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
                lookedAhead = true;
            }
            return atObject;
        }

        @Override
        public T next() {
            if (!hasNext()) {
                throw new NoSuchElementException();
            }
            lookedAhead = false;
            try {
                // The parser is on START_OBJECT; bind that single object to T
                return parser.readValueAs(type);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }
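
The Iterator contract allows hasNext() to be called any number of times before next(), so a wrapper around a pull-style source such as a parser usually buffers one element. Here is a Jackson-free sketch of that look-ahead pattern. LookAheadIterator, pullFrom, and the convention that the source returns null at the end of input are all hypothetical choices made for illustration, not part of the GitHub project.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.function.Supplier;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

public class LookAheadIterator<T> implements Iterator<T> {

    private final Supplier<T> source; // returns null once exhausted (assumption of this sketch)
    private T buffered;               // element fetched by hasNext() but not yet handed out
    private boolean finished;

    public LookAheadIterator(Supplier<T> source) {
        this.source = source;
    }

    @Override
    public boolean hasNext() {
        // Pull from the source only when nothing is buffered
        if (buffered == null && !finished) {
            buffered = source.get();
            finished = (buffered == null);
        }
        return buffered != null;
    }

    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        T result = buffered;
        buffered = null;
        return result;
    }

    // Exposes the remaining elements as a sequential, ordered Stream
    public Stream<T> stream() {
        return StreamSupport.stream(
                Spliterators.spliteratorUnknownSize(this, Spliterator.ORDERED | Spliterator.NONNULL),
                false);
    }

    // Demo helper: a pull-style source over a fixed list
    static <T> Supplier<T> pullFrom(List<T> items) {
        Iterator<T> it = items.iterator();
        return () -> it.hasNext() ? it.next() : null;
    }

    public static void main(String[] args) {
        LookAheadIterator<String> nights =
                new LookAheadIterator<>(pullFrom(Arrays.asList("Aug 17", "Aug 19", "Aug 21")));
        nights.hasNext();
        nights.hasNext(); // a repeated call must not skip an element
        System.out.println(nights.stream().collect(Collectors.toList()));
    }
}
```

With a JSON parser as the source, the buffered value would be the object bound from the current START_OBJECT token, and "finished" would correspond to reaching END_ARRAY.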
answered Oct 09 '22 21:10 by Witbrock


Remove implementation of Iterator

I think you can get rid of the whole Iterator implementation using Jackson's API.

The key here is that readValuesAs can return an iterator. The only thing I did not figure out completely is why I have to consume the JSON array start before I can let Jackson do its work.

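    import com.fasterxml.jackson.core.JsonParser;
    import com.fasterxml.jackson.core.JsonToken;
    import com.fasterxml.jackson.databind.ObjectMapper;
    import java.io.IOException;
    import java.io.InputStream;
    import java.io.UncheckedIOException;
    import java.util.Iterator;
    import java.util.Spliterator;
    import java.util.Spliterators;
    import java.util.function.Supplier;
    import java.util.stream.Stream;
    import java.util.stream.StreamSupport;

    public class InputStreamJsonArrayStreamDataSupplier<T> implements Supplier<Stream<T>> {

        private final ObjectMapper mapper = new ObjectMapper();
        private final JsonParser jsonParser;
        private final Class<T> type;

        // 'data' was not declared in the original snippet; taking it as an
        // InputStream constructor argument is an assumption based on the class name.
        public InputStreamJsonArrayStreamDataSupplier(InputStream data, Class<T> type) throws IOException {
            this.type = type;

            // Set up and get into a state to start iterating
            jsonParser = mapper.getFactory().createParser(data);
            jsonParser.setCodec(mapper);
            JsonToken token = jsonParser.nextToken();
            if (JsonToken.START_ARRAY.equals(token)) {
                // Input wrapped in an array is fine: step past '[' to the first element
                token = jsonParser.nextToken();
            }
            if (!JsonToken.START_OBJECT.equals(token)) {
                throw new RuntimeException("Can't get any JSON object from input " + data);
            }
        }

        @Override
        public Stream<T> get() {
            try {
                // Jackson supplies the iterator, so no hand-written hasNext()/next()
                Iterator<T> values = jsonParser.readValuesAs(type);
                return StreamSupport.stream(
                        Spliterators.spliteratorUnknownSize(values, Spliterator.ORDERED), false);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }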
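
Neither wrapper explicitly closes the parser, so a stream that is abandoned early (for example after findFirst) may keep the underlying file or input stream open. One option, sketched here with the standard library only, is to register a close handler with Stream.onClose and consume the stream in try-with-resources. ClosingStreamDemo and closingStream are hypothetical names, and the demo uses an in-memory iterator plus a flag in place of a real parser. Since JsonParser implements Closeable, it could be passed as the resource.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

public class ClosingStreamDemo {

    // Wraps an iterator in a Stream whose close() also closes the given resource
    static <T> Stream<T> closingStream(Iterator<T> iterator, AutoCloseable resource) {
        return StreamSupport
                .stream(Spliterators.spliteratorUnknownSize(iterator, Spliterator.ORDERED), false)
                .onClose(() -> {
                    try {
                        resource.close();
                    } catch (Exception e) {
                        throw new RuntimeException(e);
                    }
                });
    }

    public static void main(String[] args) {
        AtomicBoolean closed = new AtomicBoolean(false);
        Iterator<String> nights = Arrays.asList("Aug 17", "Aug 19", "Aug 21").iterator();

        // try-with-resources calls Stream.close(), which runs the handler
        try (Stream<String> stream = closingStream(nights, () -> closed.set(true))) {
            System.out.println(stream.findFirst().orElse("none"));
        }
        System.out.println("resource closed: " + closed.get());
    }
}
```

The close handler runs only when close() is called on the stream. Terminal operations such as forEach do not call it, which is why the try-with-resources block matters.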
answered Oct 09 '22 19:10 by Chris