
Parsing Huge JSON with Jackson

Tags: java, json, jackson

Consider a huge JSON document with a structure like this:

{"text": "very HUGE text here.."}

I am storing this JSON as an ObjectNode called, say, json.

Now I try to extract the text from the ObjectNode:

String text = json.get("text").asText();

This JSON can be around 4-5 MB in size. When I run this code, I don't get a result (the program keeps executing forever).

The above method works fine for small and normal-sized strings. Is there a better practice for extracting huge values from JSON?

hyades asked Oct 19 '22 13:10


2 Answers

I tested with Jackson (FasterXML): a 7 MB JSON node can be parsed in about 200 milliseconds.

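    // Assumes /test.json on the classpath holds {"value": "<several MB of text>"}
    ObjectMapper objectMapper = new ObjectMapper();
    InputStream is = getClass().getResourceAsStream("/test.json");
    // Time only the parse itself, not the mapper setup
    long begin = System.currentTimeMillis();
    Map<String,String> obj = objectMapper.readValue(is, HashMap.class);
    long end = System.currentTimeMillis();
    // Print the length of the parsed string and the elapsed milliseconds
    System.out.println(obj.get("value").length() + "\t" + (end - begin));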

The output is: 7888888 168 (string length in characters, then elapsed milliseconds).

Have you tried upgrading your Jackson version?

yegong answered Oct 22 '22 04:10


Perhaps your default heap size is too small: if the input is 5 MB of UTF-8, the Java String holding it will usually need about 10 MB of memory (a char is 16 bits, while most English characters take a single byte in UTF-8). There isn't much you can do about this, regardless of JSON library, if the value has to be handled as a Java String; you need enough memory for the value plus the rest of the processing. Further, since the Java heap is divided into generations, 64 MB may or may not be enough: the 10 MB has to be contiguous, so it probably gets allocated in the old generation.
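
To make the arithmetic concrete, here is a minimal sketch that compares the UTF-8 size of a 5 MB ASCII text with the space its characters need at 16 bits each. The class and method names (StringFootprint, utf8Bytes, utf16Bytes) are made up for illustration, and the estimate ignores object headers. Note also that newer JVMs (Java 9 and later) may store Latin-1-only strings more compactly, so treat the 2x figure as an upper-bound estimate rather than a measurement.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class StringFootprint {
    // Bytes the text occupies when encoded as UTF-8 (as in the JSON input).
    static long utf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Bytes needed to hold the same text as 16-bit chars (headers ignored).
    static long utf16Bytes(String s) {
        return (long) s.length() * Character.BYTES;
    }

    public static void main(String[] args) {
        // Build 5 MB of plain ASCII text as a stand-in for the "text" value.
        char[] chars = new char[5 * 1024 * 1024];
        Arrays.fill(chars, 'a');
        String text = new String(chars);

        System.out.println("UTF-8 bytes:  " + utf8Bytes(text));
        System.out.println("UTF-16 bytes: " + utf16Bytes(text));
    }
}
```

For ASCII-only text the second number is exactly twice the first, which is where the "5 megs in, 10 megs in memory" estimate comes from.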

So: try a bigger heap size and see how much you need.
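
One way to experiment is to print the heap limit the JVM is actually running with and then rerun with a larger -Xmx value, for example java -Xmx256m HeapCheck. The following is only a rough sketch: the class name HeapCheck and the 10 MB figure are assumptions for illustration, and the "available" number is an estimate that says nothing about whether a contiguous block of that size can be allocated.

```java
public class HeapCheck {
    // Upper limit the heap may grow to (controlled by -Xmx).
    static long maxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    // Rough estimate of heap still available before hitting the limit.
    static long availableBytes() {
        Runtime rt = Runtime.getRuntime();
        long used = rt.totalMemory() - rt.freeMemory();
        return rt.maxMemory() - used;
    }

    public static void main(String[] args) {
        long needed = 10L * 1024 * 1024; // assumed: ~10 MB for the String
        System.out.println("Max heap (bytes):  " + maxHeapBytes());
        System.out.println("Available (bytes): " + availableBytes());
        System.out.println("Room for ~10 MB:   " + (availableBytes() > needed));
    }
}
```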

StaxMan answered Oct 22 '22 04:10