From the TensorFlow docs it's clear how to use tf.feature_column.categorical_column_with_vocabulary_list
to create a feature column which takes as input some string and outputs a one-hot vector. For example
vocabulary_feature_column = tf.feature_column.categorical_column_with_vocabulary_list(
    key="vocab_feature",
    vocabulary_list=["kitchenware", "electronics", "sports"])
Let's say "kitchenware" maps to [1,0,0] and "electronics" maps to [0,1,0]. My question is related to having a list of strings as a feature. For example, if the feature value was ["kitchenware", "electronics"], then the desired output would be [1,1,0]. The input list length is not fixed, but the output dimension is.
The use case is a straight bag-of-words type model (obviously with a much larger vocabulary list!).
What is the correct way to implement this?
Feature columns are the bridge between raw data and the model or estimator. They are very rich, letting you transform a diverse range of raw data into formats the models or estimators can use, which makes experimentation easy.
The num_oov_buckets argument is a non-negative integer giving the number of out-of-vocabulary buckets, i.e. extra slots reserved for values that do not appear in vocabulary_list.
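As a minimal sketch (the variable name letter_with_oov is just illustrative; the vocabulary matches the example below), two OOV buckets on a three-letter vocabulary give the indicator vector 3 + 2 = 5 slots, and unknown strings hash into the last two instead of being dropped:

import tensorflow as tf

# Illustrative only: unknown letters hash into one of the two extra
# OOV slots, so the indicator vector has 5 positions instead of 3.
letter_with_oov = tf.feature_column.categorical_column_with_vocabulary_list(
    "letter", ["A", "B", "C"], dtype=tf.string, num_oov_buckets=2)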
Here is an example of how to feed data to the indicator column:
import tensorflow as tf

# Each example is a list of letters; only "A", "B", "C" are in the vocabulary.
features = {'letter': [['A','A'], ['C','D'], ['E','F'], ['G','A'], ['X','R']]}
letter_feature = tf.feature_column.categorical_column_with_vocabulary_list(
    "letter", ["A", "B", "C"], dtype=tf.string)
indicator = tf.feature_column.indicator_column(letter_feature)
tensor = tf.feature_column.input_layer(features, [indicator])

with tf.Session() as session:
    session.run(tf.global_variables_initializer())
    session.run(tf.tables_initializer())  # needed for the vocabulary lookup table
    print(session.run([tensor]))
Which outputs:
[array([[2., 0., 0.],
[0., 0., 1.],
[0., 0., 0.],
[1., 0., 0.],
[0., 0., 0.]], dtype=float32)]
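The first row is [2., 0., 0.] because 'A' appears twice in ['A','A'], so indicator_column produces counts rather than a strict 0/1 vector. Letters outside ["A", "B", "C"] ('D', 'E', 'F', 'G', 'X', 'R') contribute nothing, which is why several rows are partly or entirely zero; that is where num_oov_buckets comes in if you want unknown values to keep their own slots.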
You should use tf.feature_column.indicator_column; see https://www.tensorflow.org/versions/master/api_docs/python/tf/feature_column/indicator_column
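Applied to the vocabulary from the question, a minimal sketch along the same lines (TF 1.x style, as above) could look like this; the feature key vocab_feature and the two example rows are taken from the question, and the expected output assumes the counting behaviour shown above:

import tensorflow as tf

# Lists of strings in, fixed-length multi-hot vectors out.
# Note: a plain Python list of lists must be rectangular here; truly
# ragged inputs would be fed as a tf.SparseTensor instead.
features = {"vocab_feature": [["kitchenware", "electronics"],
                              ["sports", "sports"]]}

column = tf.feature_column.categorical_column_with_vocabulary_list(
    key="vocab_feature",
    vocabulary_list=["kitchenware", "electronics", "sports"])
indicator = tf.feature_column.indicator_column(column)
tensor = tf.feature_column.input_layer(features, [indicator])

with tf.Session() as session:
    session.run(tf.tables_initializer())
    print(session.run(tensor))
    # Expected: [[1. 1. 0.]
    #            [0. 0. 2.]]   (counts, so repeated words add up)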