TensorFlow provides three different formats for storing data in a tf.train.Feature. These are:
tf.train.BytesList
tf.train.FloatList
tf.train.Int64List
I often struggle to choose between tf.train.Int64List / tf.train.FloatList and tf.train.BytesList.
I see some examples online where ints/floats are converted to bytes and then stored in a tf.train.BytesList. Is this preferable to using one of the other formats? If so, why does TensorFlow even provide tf.train.Int64List and tf.train.FloatList as optional formats when you could just convert everything to bytes and use tf.train.BytesList?
Thank you.
The TFRecord format is a simple format for storing a sequence of binary records. Converting your data into TFRecord has several advantages, such as more efficient storage: the TFRecord data can take up less space than the original data, and it can also be partitioned into multiple files.
The tf.train.Example message (or protobuf) is a flexible message type that represents a {"string": value} mapping. It is designed for use with TensorFlow and is used throughout the higher-level APIs such as TFX.
Ideally, you should shard the data to ~10*N files (where N is the number of hosts reading the data in parallel and X is the total data size), as long as ~X/(10*N) is 10+ MB (and ideally 100+ MB). If it is less than that, you might need to create fewer shards to trade off parallelism benefits and I/O prefetching benefits.
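The rule of thumb above can be turned into a small calculation. `suggest_num_shards` is a hypothetical helper (not part of TensorFlow) that starts from the 10*N target and backs off until each shard meets a minimum size:

```python
# Hypothetical helper illustrating the sharding rule of thumb:
# aim for ~10 * num_hosts shards, but reduce the count so that
# each shard stays at or above min_shard_mb.
def suggest_num_shards(total_mb, num_hosts, min_shard_mb=10):
    shards = 10 * num_hosts
    while shards > 1 and total_mb / shards < min_shard_mb:
        shards -= 1
    return shards

print(suggest_num_shards(1000, 4))  # 40 shards of 25 MB each
print(suggest_num_shards(100, 4))   # capped at 10 shards of 10 MB
```

For a small dataset the helper collapses toward a single shard, reflecting the trade-off the quote describes: too many tiny shards lose the I/O prefetching benefit.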
TFRecord is a binary format for efficiently encoding long sequences of tf.Example protos. TFRecord files are easily loaded by TensorFlow through the tf.data package.
Because a bytes list will require more memory. It is designed to store string data, or, for example, NumPy arrays converted to a single bytestring. Consider this example:
from sys import getsizeof

import numpy as np
import tensorflow as tf

def int64_feature(value):
    if type(value) != list:
        value = [value]
    return tf.train.Feature(int64_list=tf.train.Int64List(value=value))

def float_feature(value):
    if type(value) != list:
        value = [value]
    return tf.train.Feature(float_list=tf.train.FloatList(value=value))

def bytes_feature(value):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))

writer = tf.python_io.TFRecordWriter('file.tfrecords')

# Three candidate encodings of the same kind of scalar:
bytes_val = np.array(1.1).tostring()
int_val = 1
float_val = 1.1

example = tf.train.Example(features=tf.train.Features(feature={'1': float_feature(float_val)}))
writer.write(example.SerializeToString())
writer.close()

for str_rec in tf.python_io.tf_record_iterator('file.tfrecords'):
    example = tf.train.Example()
    example.ParseFromString(str_rec)
    parsed_val = example.features.feature['1'].float_list.value[0]
    print(getsizeof(parsed_val))
For dtype float it will output 24 bytes, the lowest value. However, you can't pass an int to a tf.train.FloatList. An int dtype will occupy 28 bytes in this case, while bytes will be 41 bytes undecoded (before applying np.fromstring) and even more after.