How can I save string data to a TFRecord?

When saving to a TFRecord, I use:

def _int64_feature(value):
    return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))


def _bytes_feature(value):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))


def _float_feature(value):
    return tf.train.Feature(float_list=tf.train.FloatList(value=value))

and

one_example = tf.train.Example(
    features=tf.train.Features(
        feature={
            "image": _bytes_feature(img.tobytes()),
            "label": _bytes_feature(label.tobytes()),
            "file_name": _bytes_feature(this_city_file_name), #this line doesn't work
            "nb_rows": _int64_feature(nb_rows), 
            "nb_cols": _int64_feature(nb_cols), 
            "index_i": _int64_feature(i), 
            "index_j": _int64_feature(j),
        }
    )
)

Here this_city_file_name is a str. When I run this code, it results in an error:

TypeError: 'xxxxxxx' has type <class 'str'>, but expected one of: ((<class 'bytes'>,),)

Simply using bytes(this_city_file_name) also results in an error:

TypeError: string argument without an encoding
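
For context, this second error comes from Python 3 itself rather than from TensorFlow: bytes() called on a str needs an explicit encoding. A minimal sketch of the difference, assuming Python 3 and UTF-8 as the encoding (the file name used here is a made-up placeholder):

```python
# Hypothetical file name, used only for illustration.
file_name = "city_001.png"

# In Python 3, bytes() refuses a str unless an encoding is given.
try:
    bytes(file_name)
    error_message = None
except TypeError as err:
    error_message = str(err)

# Either of these produces a bytes object that BytesList accepts.
as_bytes_1 = bytes(file_name, "utf-8")
as_bytes_2 = file_name.encode("utf-8")

print(error_message)
print(as_bytes_1 == as_bytes_2)
```

Both forms give the same bytes object; str.encode is the more common spelling.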

When loading from the TFRecord, I use:

features = tf.parse_single_example(serialized_example,
                                   features={
                                       "image": tf.FixedLenFeature([], tf.string),
                                       "label": tf.FixedLenFeature([], tf.string),
                                       "file_name": tf.FixedLenFeature([], tf.string),
                                       "nb_rows": tf.FixedLenFeature([], tf.int64),
                                       "nb_cols": tf.FixedLenFeature([], tf.int64),
                                       "index_i": tf.FixedLenFeature([], tf.int64),
                                       "index_j": tf.FixedLenFeature([], tf.int64),
                                   },
                                   )

I know how to save int and np.array values to a TFRecord and read them back. But how can I save string data to a TFRecord and load it again?

Shouyu Chen asked Aug 18 '18 03:08


1 Answer

I know this is old, but you have to convert this_city_file_name to a bytes object before passing it to _bytes_feature, for example with this_city_file_name.encode('utf-8'). Check out this guide.

Here is the relevant code:

print(_bytes_feature(b'test_string'))
print(_bytes_feature(u'test_bytes'.encode('utf-8')))
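
To cover the loading half of the question as well, here is a minimal round-trip sketch. It assumes TensorFlow 2.x with eager execution (so tf.io.parse_single_example and tf.io.FixedLenFeature stand in for the tf.parse_single_example and tf.FixedLenFeature used in the question), and the helper names encode_file_name and decode_file_name and the file name are hypothetical:

```python
import tensorflow as tf


def _bytes_feature(value):
    # value must already be a bytes object, not a str.
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))


def encode_file_name(file_name):
    # Hypothetical helper: str -> UTF-8 bytes -> serialized tf.train.Example.
    example = tf.train.Example(
        features=tf.train.Features(
            feature={"file_name": _bytes_feature(file_name.encode("utf-8"))}
        )
    )
    return example.SerializeToString()


def decode_file_name(serialized_example):
    # Hypothetical helper: parse the Example and turn the bytes back into a str.
    features = tf.io.parse_single_example(
        serialized_example,
        features={"file_name": tf.io.FixedLenFeature([], tf.string)},
    )
    # In eager mode the parsed value is a scalar string tensor holding raw bytes.
    return features["file_name"].numpy().decode("utf-8")


serialized = encode_file_name("berlin_000123.png")  # made-up file name
restored = decode_file_name(serialized)
print(restored)
```

The serialized bytes are what you would hand to a TFRecord writer. In TF 1.x graph mode the parsed tensor has no .numpy(); there you would evaluate it in a session and decode the resulting bytes the same way.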

J.Vo answered Sep 23 '22 21:09