Jackson multiple objects and huge json files

Tags: java, json, jackson

This may be a duplicate of "Jackson - Json to POJO With Multiple Entries", but I think the question is different enough. Also, I'm using raw data binding rather than full data binding.

So like the asker of that question, I have multiple objects in a file and I'm trying to turn them into POJOs and stuff them into a database of my design so I can access the data quickly rather than slowly.

The files here are in the order of tens of GB, with up to millions of objects in each file. Anyway here is what I have so far:

ObjectMapper mapper = new ObjectMapper();
Map<String,Object> data = mapper.readValue(new File("foo.json"), Map.class);
System.out.println(data.get("bar"));

And this works great for printing the bar element of the first object in foo, but I need a way to iterate through every element in a way that won't eat up all my memory.

Thanks.

Tom Carrick asked May 02 '12 09:05



2 Answers

You don't have to choose between streaming (JsonParser) and ObjectMapper; use both. Traverse the input with the parser, then call JsonParser.readValueAs(MyType.class) to bind each individual JSON object.
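As a rough sketch of that mixed approach, assuming Jackson 2.x package names and raw Map binding as in the question: the class name StreamingBind and the helper readBars are hypothetical, and the sample input is made up for illustration. The parser walks the top-level tokens, and readValueAs() binds one object at a time so only a single object is held in memory.

```java
import com.fasterxml.jackson.core.JsonParser;
import com.fasterxml.jackson.core.JsonToken;
import com.fasterxml.jackson.databind.ObjectMapper;

import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class StreamingBind {
    // Hypothetical helper: collects the "bar" value of every top-level object.
    // A real import would write each object to the database instead of a list.
    static List<Object> readBars(Reader input) throws IOException {
        ObjectMapper mapper = new ObjectMapper();
        List<Object> bars = new ArrayList<>();
        // A parser created through the mapper's factory has a codec attached,
        // which readValueAs() requires.
        try (JsonParser parser = mapper.getFactory().createParser(input)) {
            JsonToken token = parser.nextToken();
            // Step past a wrapping '[' if the input is one big array
            // rather than a sequence of concatenated objects.
            if (token == JsonToken.START_ARRAY) {
                token = parser.nextToken();
            }
            while (token == JsonToken.START_OBJECT) {
                // Binds only the current object; the parser ends up on its END_OBJECT.
                Map<?, ?> obj = parser.readValueAs(Map.class);
                bars.add(obj.get("bar"));
                token = parser.nextToken();
            }
        }
        return bars;
    }

    public static void main(String[] args) throws IOException {
        // Made-up sample data: two concatenated top-level objects.
        String sample = "{\"bar\":1} {\"bar\":2}";
        System.out.println(readBars(new StringReader(sample)));
    }
}
```

For a real file, a FileReader or FileInputStream could be passed to createParser() in place of the StringReader.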

Alternatively, call ObjectMapper's readValue() method, passing the JsonParser at the appropriate points. Or use ObjectMapper.reader(Type.class).readValues() and iterate over the results.
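The readValues() route might look roughly like the following, assuming Jackson 2.6 or later, where readerFor(Type.class) is the non-deprecated spelling of reader(Type.class). The class ReadValuesExample, the helper forEachObject, and the handler callback are hypothetical names used only for illustration.

```java
import com.fasterxml.jackson.databind.MappingIterator;
import com.fasterxml.jackson.databind.ObjectMapper;

import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.Map;
import java.util.function.Consumer;

public class ReadValuesExample {
    // Hypothetical helper: passes each top-level object to the handler
    // and returns how many objects were processed.
    static long forEachObject(Reader input, Consumer<Map<String, Object>> handler)
            throws IOException {
        ObjectMapper mapper = new ObjectMapper();
        long count = 0;
        // MappingIterator binds lazily: one object per next() call,
        // so the whole input is never held in memory at once.
        try (MappingIterator<Map<String, Object>> it =
                 mapper.readerFor(Map.class).readValues(input)) {
            while (it.hasNext()) {
                handler.accept(it.next());
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // Made-up sample data: one object per line.
        String sample = "{\"bar\":\"a\"}\n{\"bar\":\"b\"}";
        long n = forEachObject(new StringReader(sample),
                obj -> System.out.println(obj.get("bar")));
        System.out.println(n + " objects read");
    }
}
```

In the question's scenario the handler would insert each object into the database; reusing one ObjectMapper across files avoids rebuilding it for every call.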

StaxMan answered Oct 05 '22 22:10


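This code sample shows the basic idea.

import com.fasterxml.jackson.core.JsonFactory;
import com.fasterxml.jackson.databind.ObjectMapper;
import java.io.FileInputStream;
import java.io.InputStream;
import java.util.Iterator;
import java.util.Map;

final InputStream in = new FileInputStream("json.json");
try {
  // createParser() is the Jackson 2.x name; Jackson 1.x called it createJsonParser()
  for (Iterator<Map> it = new ObjectMapper().readValues(
      new JsonFactory().createParser(in), Map.class); it.hasNext();)
    System.out.println(it.next()); // one top-level object per iteration
}
finally { in.close(); }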
Marko Topolnik answered Oct 05 '22 23:10