
How to read JSON files in TensorFlow?

I'm trying to write a function that reads JSON files in TensorFlow. The JSON files have the following structure:

{
    "bounding_box": {
        "y": 98.5,
        "x": 94.0,
        "height": 197,
        "width": 188
    },
    "rotation": {
        "yaw": -27.97019577026367,
        "roll": 2.206029415130615,
        "pitch": 0.0
    },
    "confidence": 3.053506851196289,
    "landmarks": {
        "1": {
            "y": 180.87722778320312,
            "x": 124.47326660156205
        },
        "0": {
            "y": 178.60653686523438,
            "x": 183.41931152343795
        },
        "2": {
            "y": 224.5936889648438,
            "x": 141.62365722656205
        }
    }
}

I only need the bounding box information. There are a few examples of how to write read_and_decode functions, and I'm trying to adapt them into a function for JSON files, but I still have a lot of open questions:

def read_and_decode(filename_queue):

  reader = tf.WhichKindOfReader() # ??? 
  _, serialized_example = reader.read(filename_queue)
  features = tf.parse_single_example( 
      serialized_example,

      features={

          'bounding_box':{ 

              'y': tf.VarLenFeature(<whatstheproperdatatype>) ???
              'x': 
              'height': 
              'width': 

          # I only need the bounding box... - do I need to write 
          # the format information for the other features...???

          }
      })

  y=tf.decode() # decoding necessary?
  x=
  height=
  width= 

  return x,y,height,width

I've searched the internet for hours, but I can't find anything really detailed on how to read JSON in TensorFlow...

Maybe someone can give me a clue...

asked Jul 14 '16 at 18:07 by meridius



2 Answers

Update

The solution below gets the job done, but it is not very efficient; see the comments for details.

Original answer

You can use standard Python JSON parsing with TensorFlow if you wrap the parsing functions with tf.py_func:

import json
import numpy as np
import tensorflow as tf

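def get_bbox(json_bytes):
    # tf.py_func hands each string element over as a bytes object
    obj = json.loads(json_bytes.decode('utf-8'))
    bbox = obj['bounding_box']
    return np.array([bbox['x'], bbox['y'], bbox['height'], bbox['width']], dtype='f')

def get_multiple_bboxes(json_batch):
    # Return a one-element list: a single output of shape [batch, 4]
    return [[get_bbox(x) for x in json_batch]]

# A batch of JSON documents, one string per element
raw = tf.placeholder(tf.string, [None])
[parsed] = tf.py_func(get_multiple_bboxes, [raw], [tf.float32])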

Note that tf.py_func returns a list of tensors rather than a single tensor, which is why the result is unpacked with [parsed] on the left-hand side of the assignment. Without the unpacking, parsed would be a one-element list and would behave like a tensor of shape [1, None, 4] rather than the desired shape [None, 4] (where None is the batch size).
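The shape difference can be checked without TensorFlow by calling the wrapped function directly. The following is only a sketch: it assumes NumPy is installed, repeats the two functions from above so it runs on its own, and uses a made-up batch of two identical documents.

```python
import json
import numpy as np

def get_bbox(json_bytes):
    # Parse one JSON document and keep only the bounding box values
    bbox = json.loads(json_bytes.decode('utf-8'))['bounding_box']
    return np.array([bbox['x'], bbox['y'], bbox['height'], bbox['width']], dtype='f')

def get_multiple_bboxes(json_batch):
    # One-element list holding a single [batch, 4] output
    return [[get_bbox(x) for x in json_batch]]

# Hypothetical sample data: two copies of a reduced document
sample = json.dumps(
    {"bounding_box": {"y": 98.5, "x": 94.0, "height": 197, "width": 188}}
).encode('utf-8')
batch = [sample, sample]

outputs = get_multiple_bboxes(batch)

# Treating the whole return value as one array keeps the extra leading axis
whole_shape = np.asarray(outputs).shape

# Unpacking the one-element list first gives the [batch, 4] shape
[first] = outputs
unpacked_shape = np.asarray(first).shape

print(whole_shape, unpacked_shape)
```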

Using your data you get the following results:

json_string = """{
    "bounding_box": {
        "y": 98.5,
        "x": 94.0,
        "height": 197,
        "width": 188
     },
    "rotation": {
        "yaw": -27.97019577026367,
        "roll": 2.206029415130615,
        "pitch": 0.0},
        "confidence": 3.053506851196289,
        "landmarks": {
            "1": {
                "y": 180.87722778320312,
                "x": 124.47326660156205},
            "0": {
                "y": 178.60653686523438,
                "x": 183.41931152343795},
            "2": {
                "y": 224.5936889648438,
                "x": 141.62365722656205
}}}"""
my_data = np.array([json_string, json_string, json_string])

# No variables are defined, so no initializer op is needed
with tf.Session() as sess:
    print(sess.run(parsed, feed_dict={raw: my_data}))
    print(sess.run(tf.shape(parsed), feed_dict={raw: my_data}))

Output:

[[  94.    98.5  197.   188. ]
 [  94.    98.5  197.   188. ]
 [  94.    98.5  197.   188. ]]
[3 4]
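The example above feeds JSON strings that are already in memory, while the question asks about files. A minimal sketch of one way to bridge that gap is to read each file's raw bytes first. Here load_json_strings and bbox_from_bytes are hypothetical helpers (not part of TensorFlow), the file name is made up, and one JSON document per file is assumed.

```python
import json
import os
import tempfile

def load_json_strings(paths):
    """Read every file as raw bytes, assuming one JSON document per file."""
    contents = []
    for path in paths:
        with open(path, 'rb') as f:
            contents.append(f.read())
    return contents

def bbox_from_bytes(json_bytes):
    # Same extraction as in the answer, returned as a plain Python list
    bbox = json.loads(json_bytes.decode('utf-8'))['bounding_box']
    return [bbox['x'], bbox['y'], bbox['height'], bbox['width']]

# Write a small demo file so the sketch is self-contained
tmp_dir = tempfile.mkdtemp()
demo_path = os.path.join(tmp_dir, 'face_0.json')
with open(demo_path, 'w') as f:
    json.dump(
        {"bounding_box": {"y": 98.5, "x": 94.0, "height": 197, "width": 188},
         "confidence": 3.05},
        f,
    )

json_strings = load_json_strings([demo_path])
print(bbox_from_bytes(json_strings[0]))
```

A list of byte strings like json_strings should then be usable as the value fed for the string placeholder, in place of my_data.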
answered Oct 10 '22 at 10:10 by Backlin


This might be skirting the issue, but you could preprocess your data with a command-line tool like jq (https://stedolan.github.io/jq/tutorial/) into a line-based data format such as CSV. That would possibly be more efficient as well.
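As a rough illustration of that preprocessing idea, here is a sketch written in Python rather than jq (a swapped-in tool, used only to keep the examples in one language). The function name bboxes_to_csv and the x, y, height, width column order are assumptions, and the input documents are shortened made-up samples.

```python
import csv
import io
import json

def bboxes_to_csv(json_documents, out_file):
    """Write one CSV row (x, y, height, width) per JSON document."""
    writer = csv.writer(out_file, lineterminator='\n')
    for doc in json_documents:
        bbox = json.loads(doc)['bounding_box']
        writer.writerow([bbox['x'], bbox['y'], bbox['height'], bbox['width']])

# Hypothetical input: two reduced documents with the structure from the question
document = json.dumps(
    {"bounding_box": {"y": 98.5, "x": 94.0, "height": 197, "width": 188},
     "confidence": 3.05}
)
documents = [document, document]

# An in-memory buffer stands in for a real output file
buffer = io.StringIO()
bboxes_to_csv(documents, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

A line-based file produced this way could then be consumed with TensorFlow's text-line and CSV ops (for example tf.TextLineReader with tf.decode_csv in the queue-based API the question uses), so no JSON parsing would be needed inside the graph.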

answered Oct 10 '22 at 08:10 by Shan Carter