I have been searching for an answer to this for the past few days, but I have not been able to find the right pointer. Please merge this with the appropriate question if it turns out to be a duplicate.
I am pretty new to working with JSON, and as part of one of my projects I need to decode a JSON file and do further processing on it. However, when I decode it using the json-simple library, I get some weird question marks in the parsed object instead of the actual characters. Sample code is shown below:
String str = "{\"alias\": [\"Evr\u00f3pa\", \"\u05d0\u05d9\u05e8\u05d5\u05e4\"]}";
JSONParser parser = new JSONParser();
JSONObject jsonObject = (JSONObject)parser.parse(str);
System.out.println(jsonObject) gives {"alias":["Evrópa","?????"]}
I tried using Json-lib too with the same result.
Thanks for the help.
The problem isn't with your JSON, it's with your System.out.println(). Those characters can't be represented either in the character encoding of your terminal (or of your IDE's console, if that is where you ran it) or in the encoding that System.out uses in your environment.
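One way to check this is to ask the JVM which encoding it is using and whether that encoding can represent the characters at all. The following is a minimal sketch, not a definitive diagnostic: the class name EncodingCheck and the helper canRepresent are made up for illustration, and the stdout.encoding system property is only set on newer JDKs, so it may print null on older ones.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodingCheck {

    // Hypothetical helper: true if every character of text can be
    // encoded in the given charset without substitution.
    static boolean canRepresent(String text, Charset charset) {
        return charset.newEncoder().canEncode(text);
    }

    public static void main(String[] args) {
        String hebrew = "\u05d0\u05d9\u05e8\u05d5\u05e4";

        // The JVM's default charset, used by many APIs when none is given.
        Charset platform = Charset.defaultCharset();
        System.out.println("default charset: " + platform);

        // Assumption: this property exists only on newer JDKs; null otherwise.
        System.out.println("stdout.encoding: " + System.getProperty("stdout.encoding"));

        System.out.println("default can represent Hebrew: " + canRepresent(hebrew, platform));
        System.out.println("ISO-8859-1 can represent Hebrew: "
                + canRepresent(hebrew, StandardCharsets.ISO_8859_1));
    }
}
```

If the default charset cannot represent the Hebrew letters, that would explain why "Evrópa" prints correctly (ó exists in many Western encodings) while the second alias comes out as question marks.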
Files cannot contain Unicode characters directly. Files are streams of bytes, whereas a Unicode character is an abstract code point with no fixed byte size (inside Java, a String stores its characters as 16-bit UTF-16 code units). This is where character encodings become relevant: Unicode characters must be converted to a sequence of bytes before they can be written to a file (including System.out). One of the most commonly used encodings for Unicode characters is UTF-8. The trick for programmers is to always use the correct character encoding when converting between bytes and characters. Using the wrong encoding in even a single place, for example in a debug println() call, will give erroneous and misleading output.
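To make the byte conversion concrete, here is a small sketch that sends the same string through a PrintStream with two different encodings and looks at the bytes that come out. The class name Utf8OutputDemo and the helper printWith are hypothetical names chosen for this example. It also assumes that your terminal or IDE console actually interprets its input as UTF-8; if it does not, wrapping System.out as in the last lines will not help by itself.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class Utf8OutputDemo {

    // Hypothetical helper: prints text through a PrintStream that uses the
    // given charset and returns the bytes that would reach the file or terminal.
    static byte[] printWith(String text, Charset charset) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (PrintStream out = new PrintStream(sink, true, charset.name())) {
            out.print(text);
        } catch (UnsupportedEncodingException e) {
            // Not expected: the name comes from a Charset that already exists.
            throw new IllegalStateException(e);
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        String hebrew = "\u05d0\u05d9\u05e8\u05d5\u05e4";

        // ISO-8859-1 has no Hebrew letters, so each one is replaced by '?'.
        byte[] latin1 = printWith(hebrew, StandardCharsets.ISO_8859_1);
        System.out.println(new String(latin1, StandardCharsets.ISO_8859_1)); // prints ?????

        // UTF-8 can encode them; each of these letters takes two bytes.
        byte[] utf8 = printWith(hebrew, StandardCharsets.UTF_8);
        System.out.println(utf8.length); // prints 10

        // Wrap System.out so that println() emits UTF-8 bytes.
        // Assumption: the console displaying the output decodes UTF-8.
        PrintStream utf8Out = new PrintStream(System.out, true, "UTF-8");
        utf8Out.println(hebrew);
    }
}
```

The first case reproduces the question marks from the question without involving any JSON library, which suggests the parsed object itself holds the right characters and only the output step loses them.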