
Convert string tensor to lower case

Is there a way to convert a string tensor to lower case without evaluating it in a session? Some sort of tf.string_to_lower op?

More specifically, I am reading data from tfrecords files, so my data is made of tensors. I then want to use tf.contrib.lookup.index_table_from_* to look up indices for the words in the data, and the lookup needs to be case-insensitive. Lowercasing the data before writing it to tfrecords is not an option, because it has to be kept in its original form. I could store both the original and the lowercased text, but I'd like to avoid that if possible.

RaduK asked Jun 28 '17 00:06


2 Answers

Here's an implementation with tensorflow ops:

import tensorflow as tf  # TensorFlow 1.x API (tf.contrib, tf.Session)

def lowercase(s):
    # Lookup vocabulary 'A'..'Z' and the matching replacements 'a'..'z'.
    upchars = tf.constant([chr(i) for i in range(65, 91)], dtype=tf.string)
    lchars = tf.constant([chr(i) for i in range(97, 123)], dtype=tf.string)

    # Uppercase letters map to indices 0..25; any other character lands in the single OOV bucket (index 26).
    upcharslut = tf.contrib.lookup.index_table_from_tensor(mapping=upchars, num_oov_buckets=1, default_value=-1)
    # Split the input into single characters.
    splitchars = tf.string_split(tf.reshape(s, [-1]), delimiter="").values
    upcharinds = upcharslut.lookup(splitchars)
    # Keep OOV characters unchanged, replace uppercase ones, then join everything back into one string.
    return tf.reduce_join(tf.map_fn(lambda x: tf.cond(x[0] > 25, lambda: x[1], lambda: lchars[x[0]]), (upcharinds, splitchars), dtype=tf.string))

if __name__ == "__main__":
    s = "komoDO DragoN "
    sess = tf.Session()
    x = lowercase(s)
    sess.run(tf.tables_initializer())  # the lookup table must be initialized before use
    print(sess.run([x]))

This returns [b'komodo dragon ']
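
To make the lookup logic above easier to follow, here is a minimal plain-Python sketch of the same idea, without TensorFlow. The names UPPER, LOWER, OOV_INDEX and lookup are illustrative and not part of the answer's code; lookup only imitates what the index table does with one out-of-vocabulary bucket.

```python
import string

# Vocabulary of uppercase ASCII letters; position i pairs with the i-th lowercase letter.
UPPER = string.ascii_uppercase
LOWER = string.ascii_lowercase
OOV_INDEX = len(UPPER)  # 26, the single out-of-vocabulary bucket


def lookup(ch):
    """Imitate the table lookup: 0..25 for 'A'..'Z', 26 for everything else."""
    idx = UPPER.find(ch)
    return idx if idx >= 0 else OOV_INDEX


def lowercase(s):
    """Lowercase ASCII letters one character at a time, leaving other characters unchanged."""
    out = []
    for ch in s:
        idx = lookup(ch)
        # Same branch as the tf.cond in the graph version: index > 25 means "not an uppercase letter".
        out.append(ch if idx > 25 else LOWER[idx])
    return "".join(out)


print(lowercase("komoDO DragoN "))
```

Like the graph version, this only touches the 26 ASCII letters. If you are on TensorFlow 2.x, the built-in tf.strings.lower op is probably the simpler choice.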

src answered Sep 22 '22 17:09


You can use tf.py_func to wrap a Python function that manipulates your string; the function is then executed within the graph.

You can do something like:

# Assuming your string tensor is tensorA (a scalar string, so the lambda receives a bytes object)
lower = tf.py_func(lambda x: x.lower(), [tensorA], tf.string, stateful=False)

# As of TF 2.0, `tf.py_func` is deprecated; use `tf.py_function` instead
lower = tf.py_function(lambda x: x.numpy().lower(), [tensorA], tf.string)
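
One thing to keep in mind: the wrapped function receives the string as bytes, and bytes.lower() only lowercases ASCII letters. If your data contains non-ASCII text, something along these lines may work better as the function you wrap. This is a sketch that assumes the strings are UTF-8 encoded; lower_utf8 is a hypothetical helper, not a TensorFlow function.

```python
def lower_utf8(value):
    """Lowercase a bytes value (or a list/tuple of them), assuming UTF-8 encoding."""
    if isinstance(value, (list, tuple)):
        return [lower_utf8(v) for v in value]
    # Decode first so that non-ASCII letters are lowercased too, then re-encode.
    return value.decode("utf-8").lower().encode("utf-8")


raw = "Komodo \u00c9COLE".encode("utf-8")  # \u00c9 is an uppercase E with acute accent
print(raw.lower())      # ASCII letters are lowered, the bytes of the accented letter are untouched
print(lower_utf8(raw))  # decoding first lowers the accented letter as well
```

For plain ASCII data the two give the same result, so the simpler lambda above is enough in that case.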
nessuno answered Sep 22 '22 17:09