I have a file consisting of JSON objects, one per line, and I want to sort the file by update_time in descending order.
sample JSON file:
    { "page": { "url": "url1", "update_time": "1415387875"}, "other_key": {} }
    { "page": { "url": "url2", "update_time": "1415381963"}, "other_key": {} }
    { "page": { "url": "url3", "update_time": "1415384938"}, "other_key": {} }
desired output:
    { "page": { "url": "url1", "update_time": "1415387875"}, "other_key": {} }
    { "page": { "url": "url3", "update_time": "1415384938"}, "other_key": {} }
    { "page": { "url": "url2", "update_time": "1415381963"}, "other_key": {} }
my code:
    #!/bin/env python
    #coding: utf8
    import sys
    import os
    import json
    import operator

    #load json from file
    lines = []
    while True:
        line = sys.stdin.readline()
        if not line: break
        line = line.strip()
        json_obj = json.loads(line)
        lines.append(json_obj)

    #sort json
    lines = sorted(lines, key=lambda k: k['page']['update_time'], reverse=True)

    #output result
    for line in lines:
        print line
The code works fine with the sample JSON file, but if a JSON object has no 'update_time', it raises a KeyError exception. Is there a way to do this without exceptions?
Using the json.dumps() function is one way to sort a JSON object. With the sort_keys argument set to True, it serializes each object with its keys in sorted order. Note that this orders the keys inside each object; it does not reorder the objects in an array by a field such as update_time.
The parsed JSON is an array of objects, so the sort method cannot be used directly to sort the array by one of its fields. However, we can pass a comparer function as the argument of the sort method to implement the sorting.
One option might be to make your data look like this:
    var json = [{ "name": "user1", "id": 3 }, { "name": "user2", "id": 6 }, { "name": "user3", "id": 1 }];
Now you have an array of objects, and we can sort it.
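Since the question is about Python, here is a minimal sketch of the same array-of-objects idea using a list of dicts. The variable names (users, by_id, names_by_id) are illustrative only, and the data is the sample from the snippet above.

```python
from operator import itemgetter

# The same sample data as a Python list of dicts.
users = [
    {"name": "user1", "id": 3},
    {"name": "user2", "id": 6},
    {"name": "user3", "id": 1},
]

# sorted() returns a new list ordered by the "id" value of each dict.
by_id = sorted(users, key=itemgetter("id"))

# Collect the names in their new order to inspect the result.
names_by_id = [user["name"] for user in by_id]
print(names_by_id)
```

itemgetter("id") raises KeyError if a dict lacks "id", so it only suits data where every record has the key.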
Use the sort_keys parameter to specify if the result should be sorted or not:
    json.dumps(x, indent=4, sort_keys=True)
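It may help to see what sort_keys actually changes. The sketch below, using two records shaped like the question's sample, suggests that it orders the keys inside each object but leaves the order of the records untouched, so on its own it would not sort by update_time. The variable names are illustrative.

```python
import json

# Two records, deliberately listed with the older update_time first.
records = [
    {"page": {"url": "url2", "update_time": "1415381963"}, "other_key": {}},
    {"page": {"url": "url1", "update_time": "1415387875"}, "other_key": {}},
]

# sort_keys=True sorts the keys of every dict during serialization.
dumped = json.dumps(records, sort_keys=True)

# Round-trip to check the record order after dumping.
reloaded = json.loads(dumped)
first_url = reloaded[0]["page"]["url"]
print(dumped)
print(first_url)
```

The first record is still the one with url2, even though its update_time is the smaller of the two.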
Write a function that uses try...except to handle the KeyError, then use this as the key argument instead of your lambda.
    def extract_time(json):
        try:
            # Also convert to int since update_time will be string. When comparing
            # strings, "10" is smaller than "2".
            return int(json['page']['update_time'])
        except KeyError:
            # Records without update_time get 0, so they sort last when reversed.
            return 0

    # lines.sort() is more efficient than lines = sorted(lines)
    lines.sort(key=extract_time, reverse=True)
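If you would rather avoid try...except entirely, as the question asks, one possible alternative is to chain dict.get() calls with defaults. This is only a sketch: the helper names update_time_or_zero and sort_json_lines are made up for illustration, and it assumes that "page", when present, is an object and that update_time, when present, is a numeric string.

```python
import json

def update_time_or_zero(record):
    # .get() returns the default instead of raising KeyError when
    # "page" or "update_time" is missing; 0 sorts last in descending order.
    return int(record.get("page", {}).get("update_time", 0))

def sort_json_lines(text_lines):
    # Parse each non-empty line as one JSON object.
    records = [json.loads(line) for line in text_lines if line.strip()]
    # Sort in place, newest update_time first.
    records.sort(key=update_time_or_zero, reverse=True)
    return records

sample = [
    '{"page": {"url": "url1", "update_time": "1415387875"}, "other_key": {}}',
    '{"page": {"url": "url2"}, "other_key": {}}',
    '{"page": {"url": "url3", "update_time": "1415384938"}, "other_key": {}}',
]

sorted_urls = [r["page"]["url"] for r in sort_json_lines(sample)]
print(sorted_urls)
```

To run it on a real file, you could pass sys.stdin (or an open file object) to sort_json_lines and print each record with json.dumps() so the output stays valid JSON.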