How to parse the malformed JSON in Apple's IAP receipt?

I got JSON from Apple that looks like this:

{
    "original-purchase-date-pst" = "2012-06-28 02:46:02 America/Los_Angeles";
    "original-transaction-id" = "1000000051960431";
    "bvrs" = "1.0";
    "transaction-id" = "1000000051960431";
    "quantity" = "1";
    "original-purchase-date-ms" = "1340876762450";
    "product-id" = "com.x";
    "item-id" = "523404215";
    "bid" = "com.x";
    "purchase-date-ms" = "1340876762450";
    "purchase-date" = "2012-06-28 09:46:02 Etc/GMT";
    "purchase-date-pst" = "2012-06-28 02:46:02 America/Los_Angeles";
    "original-purchase-date" = "2012-06-28 09:46:02 Etc/GMT";
}

This is not the JSON we know. The JSON specification clearly states:

Each name is followed by : (colon) and the name/value pairs are separated by , (comma).

How can I parse it with Python's json (or simplejson) module?

json accepts custom separators only in json.dumps(), not in json.loads(), and in simplejson/decoder.py the JSONObject() function hard-codes : and , as the delimiters.
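
To make the asymmetry concrete, here is a small sketch. The `data` dict and the `parses_as_json` helper are made up for illustration. `json.dumps()` can emit the `=` and `;` separators, but `json.loads()` has no matching option and rejects the result:

```python
import json

data = {"bvrs": "1.0", "quantity": "1"}  # illustrative sample only

# dumps() takes (item_separator, key_separator), so it can write this style.
text = json.dumps(data, separators=("; ", " = "))


def parses_as_json(s):
    """Return True if json.loads() accepts s (hypothetical helper)."""
    try:
        json.loads(s)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return False
    return True


print(text)
print(parses_as_json(text))
```

So the separators can be written but not read back, which is why some preprocessing or a custom parser is needed.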

What can I do? Write my own parser?

est asked Jan 17 '23 06:01



2 Answers

That is indeed rather messed up. A quick fix would be to rewrite the offending separators using a regular expression:

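import re
# Turn each '"key" = "value";' pair into '"key": "value",'
line = re.compile(r'("[^"]*")\s*=\s*("[^"]*");')
result = line.sub(r'\1: \2,', result)  # result holds the receipt text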

You'll also need to remove the last comma:

trailingcomma = re.compile(r',(\s*})')
result = trailingcomma.sub(r'\1', result)

With these operations the example loads as JSON:

>>> import json, re
>>> result = '''\
... {
...     "original-purchase-date-pst" = "2012-06-28 02:46:02 America/Los_Angeles";
...     "original-transaction-id" = "1000000051960431";
...     "bvrs" = "1.0";
...     "transaction-id" = "1000000051960431";
...     "quantity" = "1";
...     "original-purchase-date-ms" = "1340876762450";
...     "product-id" = "com.x";
...     "item-id" = "523404215";
...     "bid" = "com.x";
...     "purchase-date-ms" = "1340876762450";
...     "purchase-date" = "2012-06-28 09:46:02 Etc/GMT";
...     "purchase-date-pst" = "2012-06-28 02:46:02 America/Los_Angeles";
...     "original-purchase-date" = "2012-06-28 09:46:02 Etc/GMT";
... }
... '''
>>> line = re.compile(r'("[^"]*")\s*=\s*("[^"]*");')
>>> trailingcomma = re.compile(r',(\s*})')
>>> corrected = trailingcomma.sub(r'\1', line.sub(r'\1: \2,', result))
>>> json.loads(corrected)
{u'product-id': u'com.x', u'purchase-date-pst': u'2012-06-28 02:46:02 America/Los_Angeles', u'transaction-id': u'1000000051960431', u'original-purchase-date-pst': u'2012-06-28 02:46:02 America/Los_Angeles', u'bid': u'com.x', u'purchase-date-ms': u'1340876762450', u'original-transaction-id': u'1000000051960431', u'bvrs': u'1.0', u'original-purchase-date-ms': u'1340876762450', u'purchase-date': u'2012-06-28 09:46:02 Etc/GMT', u'original-purchase-date': u'2012-06-28 09:46:02 Etc/GMT', u'item-id': u'523404215', u'quantity': u'1'}

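Key/value pairs inside nested mappings are rewritten too, but the pattern only matches quoted-string values, so a key whose value is itself a mapping is left untouched. It also assumes there are no escaped " quotes in the values themselves. If there are, you'll need a parser anyway.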
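
A full parser is the robust answer, but if escaped quotes are the only complication, a widened pattern may be enough. The following is a minimal sketch, not a tested solution for every receipt: `loads_receipt` and its helpers are hypothetical names, and it assumes that every value is a quoted string and that any backslash escapes in the receipt are also valid JSON escapes. It rewrites each pair in a single pass and keeps the closing brace with the last pair, so no trailing comma is ever produced.

```python
import json
import re

# A double-quoted string that may contain backslash escapes such as \" or \\.
_STRING = r'"(?:[^"\\]|\\.)*"'

# One '"key" = "value";' pair, plus the closing brace when the pair is the
# last one in its mapping.
_PAIR = re.compile(r'(%s)\s*=\s*(%s)\s*;(\s*\})?' % (_STRING, _STRING))


def _to_json_pair(match):
    """Rewrite one matched pair as JSON; only add a comma if more pairs follow."""
    key, value, closing = match.groups()
    if closing is not None:
        return '%s: %s%s' % (key, value, closing)
    return '%s: %s,' % (key, value)


def loads_receipt(text):
    """Convert receipt-style text to JSON and parse it (hypothetical helper)."""
    return json.loads(_PAIR.sub(_to_json_pair, text))


sample = r'''{
    "product-id" = "com.x";
    "note" = "a \"quoted\" word; with = and ;";
}'''
print(loads_receipt(sample)["note"])
```

Because each match consumes the whole quoted value, separators that appear inside a value are not rewritten.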

Martijn Pieters answered Jan 18 '23 23:01



If you add the following header to your HTTP request to iTunes:

{'Content-Type': 'application/json'}

it will return a properly formatted JSON response that works with json.loads.
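
As a rough sketch of what that request could look like with the standard library: the endpoint URL, the `receipt-data` field, and the function names below are assumptions that do not come from this answer, so check them against Apple's documentation before relying on them.

```python
import json
import urllib.request

# Assumed endpoint for receipt verification; not stated in this answer.
VERIFY_URL = "https://buy.itunes.apple.com/verifyReceipt"


def build_verify_request(receipt_b64, url=VERIFY_URL):
    """Build (but do not send) a POST request carrying the JSON header."""
    # Assumed body shape: the base64 receipt under the "receipt-data" key.
    body = json.dumps({"receipt-data": receipt_b64}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
    )


def verify_receipt(receipt_b64, url=VERIFY_URL):
    """Send the request and decode the reply with json.loads."""
    request = build_verify_request(receipt_b64, url)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Splitting the request construction from the network call makes it possible to inspect the headers without contacting the server.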

Tico Ballagas answered Jan 18 '23 23:01
