I have a very large (> 1GB) JSON file containing an array. The real file is confidential, but this file of sleep duration data is a proxy:
[
{
"date": "August 17, 2015",
"hours": 7,
"minutes": 10
},
{
"date": "August 19, 2015",
"hours": 4,
"minutes": 46
},
{
"date": "August 19, 2015",
"hours": 7,
"minutes": 22
},
{
"date": "August 21, 2015",
"hours": 4,
"minutes": 48
},
{
"date": "August 21, 2015",
"hours": 6,
"minutes": 1
}
]
I've used JSON2POJO to produce a "Sleep" object definition.
Now, one could use Jackson's ObjectMapper to just convert the file to an array and then use Arrays.stream(array), except that this crashes (yes, it's a BIG file).
The obvious thing is to use Jackson's Streaming API. But that's super low level. In particular, I still want Sleep Objects.
How do I use the Jackson Streaming JSON reader and my Sleep.java class to generate a Java 8 Stream of Sleep Objects?
I couldn't find a good solution to this, and I needed one for a particular case: I had a >1GB JSON file (a top level JSON array, with tens of thousands of largish objects), and using the normal Jackson mapper caused crashes when accessing the resulting Java object array.
The examples I found for using the Jackson Streaming API lost the object mapping that is so appealing, and certainly didn't allow access to the objects via the (obviously appropriate) Java 8 Streaming API.
The code is now on GitHub
Here's a quick example of use:
// Use the JSON file included as a resource
ClassLoader classLoader = SleepReader.class.getClassLoader();
File dataFile = new File(classLoader.getResource("example.json").getFile());
// Simple example of getting the Sleep objects from that JSON
new JsonArrayStreamDataSupplier<>(dataFile, Sleep.class)
    .getStream() // got the Stream
    .forEach(nightsRest -> System.out.println(nightsRest.toString()));
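Since getStream() hands back an ordinary java.util.stream.Stream, the usual stream operations should apply to the parsed objects one at a time. Here is a minimal sketch of that, runnable without Jackson. SleepRecord and SleepStats are hypothetical stand-ins (the real JSON2POJO-generated Sleep class will look different, most likely with getters), and the sample values are copied from the proxy data:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical stand-in for the JSON2POJO-generated Sleep class.
final class SleepRecord {
    final String date;
    final int hours;
    final int minutes;

    SleepRecord(String date, int hours, int minutes) {
        this.date = date;
        this.hours = hours;
        this.minutes = minutes;
    }

    // Duration of this night's sleep expressed in minutes.
    int totalMinutes() {
        return hours * 60 + minutes;
    }
}

final class SleepStats {
    // The five entries from the proxy JSON shown in the question.
    static List<SleepRecord> sample() {
        return Arrays.asList(
            new SleepRecord("August 17, 2015", 7, 10),
            new SleepRecord("August 19, 2015", 4, 46),
            new SleepRecord("August 19, 2015", 7, 22),
            new SleepRecord("August 21, 2015", 4, 48),
            new SleepRecord("August 21, 2015", 6, 1));
    }

    // Sums the minutes slept; accepts any Stream, so a lazily parsed one works too.
    static int totalMinutes(Stream<SleepRecord> nights) {
        return nights.mapToInt(SleepRecord::totalMinutes).sum();
    }

    // Counts the nights shorter than the given number of hours.
    static long nightsShorterThan(Stream<SleepRecord> nights, int hours) {
        return nights.filter(n -> n.totalMinutes() < hours * 60).count();
    }
}
```

With the five sample nights, SleepStats.totalMinutes(SleepStats.sample().stream()) comes to 1807 and SleepStats.nightsShorterThan(SleepStats.sample().stream(), 5) comes to 2. Neither call needs the whole data set in memory when the stream is backed by a lazy reader.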
Here's some JSON from example.json
[
{
"date": "August 17, 2015",
"hours": 7,
"minutes": 10
},
{
"date": "August 19, 2015",
"hours": 4,
"minutes": 46
},
{
"date": "August 19, 2015",
"hours": 7,
"minutes": 22
},
{
"date": "August 21, 2015",
"hours": 4,
"minutes": 48
},
{
"date": "August 21, 2015",
"hours": 6,
"minutes": 1
}
]
and, in case you don't want to go to GitHub (you should), here's the wrapper class itself:
/**
* @license APACHE LICENSE, VERSION 2.0 http://www.apache.org/licenses/LICENSE-2.0
* @author Michael Witbrock
*/
package com.michaelwitbrock.jacksonstream;

import com.fasterxml.jackson.core.JsonFactory;
import com.fasterxml.jackson.core.JsonParser;
import com.fasterxml.jackson.core.JsonToken;
import com.fasterxml.jackson.databind.ObjectMapper;
import java.io.File;
import java.io.IOException;
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.Spliterators;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

public class JsonArrayStreamDataSupplier<T> implements Iterator<T> {
    /*
     * This class wraps the Jackson streaming API for arrays (a common kind of
     * large JSON file) in a Java 8 Stream. The initial motivation was that
     * use of a default ObjectMapper to a Java array was crashing for me on
     * a very large JSON file (> 1GB). And there didn't seem to be good example
     * code for handling Jackson streams as Java 8 streams, which seems natural.
     */
    private static final ObjectMapper mapper = new ObjectMapper();
    private final JsonFactory factory = new JsonFactory();
    private final Class<T> type;
    private JsonParser parser;
    // True while the parser may still hold unread array elements.
    private boolean maybeHasNext = false;

    public JsonArrayStreamDataSupplier(File dataFile, Class<T> type) {
        this.type = type;
        try {
            // Set up and get into a state to start iterating
            parser = factory.createParser(dataFile);
            parser.setCodec(mapper);
            JsonToken token = parser.nextToken();
            if (token == null) {
                throw new RuntimeException("Can't get any JSON token from "
                        + dataFile.getAbsolutePath());
            }
            // The first token is supposed to be the start of the array '['
            if (token != JsonToken.START_ARRAY) {
                throw new RuntimeException("Can't get a JSON array start token from "
                        + dataFile.getAbsolutePath());
            }
        } catch (IOException e) {
            // Report the failure instead of swallowing it and iterating anyway
            throw new RuntimeException("Can't read JSON from "
                    + dataFile.getAbsolutePath(), e);
        }
        maybeHasNext = true;
    }

    /*
     * This method returns the stream, and is the only method other
     * than the constructor that should be used.
     */
    public Stream<T> getStream() {
        return StreamSupport.stream(Spliterators.spliteratorUnknownSize(this, 0), false);
    }

    /*
     * The remaining methods are what enables this to be passed to the
     * spliterator generator, since they make it an Iterator.
     */
    @Override
    public boolean hasNext() {
        if (!maybeHasNext) {
            return false; // didn't get started, or already reached the end
        }
        try {
            // Only advance when not already positioned on an unread element, so
            // repeated hasNext() calls don't skip tokens. currentToken() needs
            // Jackson 2.8+; older versions have getCurrentToken().
            if (parser.currentToken() != JsonToken.START_OBJECT) {
                maybeHasNext = (parser.nextToken() == JsonToken.START_OBJECT);
                if (!maybeHasNext) {
                    parser.close(); // end of array: release the file handle
                }
            }
            return maybeHasNext;
        } catch (IOException e) {
            throw new RuntimeException("Error while reading JSON", e);
        }
    }

    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        try {
            // The codec set in the constructor lets the parser bind straight to T
            return parser.readValueAs(type);
        } catch (IOException e) {
            throw new RuntimeException("Can't map JSON object to " + type.getName(), e);
        }
    }
}
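The part of this class that does the real work, getStream(), doesn't depend on Jackson at all: any Iterator can be turned into a lazy Stream through Spliterators.spliteratorUnknownSize. Below is a small sketch of just that step, using only the JDK. CountingIterator is a hypothetical stand-in for the parser-backed iterator; it counts how many elements were actually pulled so the laziness is visible:

```java
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

final class LazyIteratorStream {

    // Hypothetical iterator standing in for a parser that yields one object per call.
    static final class CountingIterator implements Iterator<Integer> {
        private final int limit;
        private int nextValue = 0;
        int pulled = 0; // how many elements consumers have actually requested

        CountingIterator(int limit) {
            this.limit = limit;
        }

        @Override
        public boolean hasNext() {
            return nextValue < limit; // no side effects, safe to call repeatedly
        }

        @Override
        public Integer next() {
            if (!hasNext()) {
                throw new NoSuchElementException();
            }
            pulled++;
            return nextValue++;
        }
    }

    // Wraps an iterator in a sequential Stream without knowing its size up front.
    static <T> Stream<T> toStream(Iterator<T> iterator) {
        return StreamSupport.stream(
            Spliterators.spliteratorUnknownSize(
                iterator, Spliterator.ORDERED | Spliterator.NONNULL),
            false); // false = sequential; a single parser can't be read in parallel
    }

    // Takes the first n elements; the rest of the source is never visited.
    static <T> List<T> firstN(Iterator<T> iterator, int n) {
        return toStream(iterator).limit(n).collect(Collectors.toList());
    }
}
```

Calling firstN(new CountingIterator(1_000_000), 3) returns the first three values while pulling only a handful of elements from the iterator, which is the property that keeps a > 1GB file from being loaded all at once.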
I think you can get rid of the whole Iterator implementation using Jackson's API.
The trick here is that readValuesAs can return an iterator. The only thing I did not figure out completely is why I have to consume the JSON array start token before I can let Jackson do its work.
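import com.fasterxml.jackson.core.JsonParser;
import com.fasterxml.jackson.core.JsonToken;
import com.fasterxml.jackson.databind.ObjectMapper;
import java.io.IOException;
import java.io.InputStream;
import java.util.Spliterators;
import java.util.function.Supplier;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

public class InputStreamJsonArrayStreamDataSupplier<T> implements Supplier<Stream<T>> {
    private final ObjectMapper mapper = new ObjectMapper();
    private final JsonParser jsonParser;
    private final Class<T> type;

    // 'data' was not declared in the original snippet; passing it in as an
    // InputStream is an assumption based on the class name.
    public InputStreamJsonArrayStreamDataSupplier(InputStream data, Class<T> type) throws IOException {
        this.type = type;
        // Set up and get into a state to start iterating
        jsonParser = mapper.getFactory().createParser(data);
        jsonParser.setCodec(mapper);
        JsonToken token = jsonParser.nextToken();
        if (JsonToken.START_ARRAY.equals(token)) {
            // If it starts with START_ARRAY that's fine; step to the first element
            token = jsonParser.nextToken();
        }
        if (!JsonToken.START_OBJECT.equals(token)) {
            throw new RuntimeException("Can't get any JSON object from input");
        }
    }

    @Override
    public Stream<T> get() {
        try {
            // readValuesAs(Class<T>) already returns an Iterator<T>, so no cast is needed
            return StreamSupport.stream(
                    Spliterators.spliteratorUnknownSize(jsonParser.readValuesAs(type), 0), false);
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }
}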
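One thing the snippet above leaves open is who closes the underlying input once the stream has been consumed. A possible approach, sketched here with the JDK only, is to register the cleanup with Stream.onClose and consume the stream in try-with-resources. FakeParser is a hypothetical stand-in for a JsonParser (which is also Closeable); how this fits into the supplier is an assumption, not something the code above does:

```java
import java.io.Closeable;
import java.util.Arrays;
import java.util.Iterator;
import java.util.Spliterators;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

final class ClosingStreams {

    // Hypothetical resource standing in for a JsonParser over a large file.
    static final class FakeParser implements Closeable {
        boolean closed = false;
        final Iterator<String> values = Arrays.asList("a", "b", "c").iterator();

        @Override
        public void close() {
            closed = true; // a real parser would release its file handle here
        }
    }

    // Builds a lazy stream and ties the parser's lifetime to the stream's.
    static Stream<String> stream(FakeParser parser) {
        return StreamSupport
            .stream(Spliterators.spliteratorUnknownSize(parser.values, 0), false)
            .onClose(parser::close); // runs when the stream itself is closed
    }

    // Streams are AutoCloseable, so try-with-resources triggers the onClose hook.
    static long countAndClose(FakeParser parser) {
        try (Stream<String> s = stream(parser)) {
            return s.count();
        }
    }
}
```

Terminal operations such as count() or forEach() do not close a stream on their own, so the onClose hook only runs if the caller closes the stream, which is what the try-with-resources block is for.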