
Difference between feature_column.embedding_column and keras.layers.Embedding in TensorFlow

I have been using keras.layers.Embedding for almost all of my projects. Recently, though, I wanted to fiddle around with tf.data and found feature_column.embedding_column.

From the documentation:

feature_column.embedding_column - DenseColumn that converts from sparse, categorical input. Use this when your inputs are sparse, but you want to convert them to a dense representation (e.g., to feed to a DNN).

keras.layers.Embedding - Turns positive integers (indexes) into dense vectors of fixed size, e.g. [[4], [20]] -> [[0.25, 0.1], [0.6, -0.2]]. This layer can only be used as the first layer in a model.

My question is: are both APIs doing a similar thing on different types of input data (e.g. input [0, 1, 2] for keras.layers.Embedding and its one-hot-encoded representation [[1,0,0],[0,1,0],[0,0,1]] for feature_column.embedding_column)?

thisisbhavin asked Nov 07 '19 11:11



1 Answer

After reviewing the source code for both operations, here is what I found:

  • both operations rely on the functionality in tensorflow.python.ops.embedding_ops;
  • keras.layers.Embedding works on dense inputs (a tensor of integer indices) and contains generic Keras code for handling shapes, initializing variables, etc.;
  • feature_column.embedding_column works on sparse inputs and contains functionality to cache results.

So your guess seems to be right: the two do similar things but rely on distinct input representations, and each contains some extra logic that doesn't change the essence of what it does.
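The equivalence the question asks about can be illustrated without TensorFlow at all. The following is a minimal, library-free sketch in plain Python, not the code either API actually runs: the table values and the helper names lookup_by_index and lookup_by_one_hot are made up for illustration. It only shows that picking rows of an embedding table by integer index gives the same vectors as multiplying the one-hot representation by that table.

```python
# Hypothetical embedding table: 3 categories, each mapped to a 2-d vector.
# In a real model these values would be trainable weights.
embedding_table = [
    [0.25, 0.1],
    [0.6, -0.2],
    [-0.3, 0.9],
]


def lookup_by_index(ids, table):
    """Index-style lookup: each integer id selects one row of the table."""
    return [list(table[i]) for i in ids]


def lookup_by_one_hot(one_hot_rows, table):
    """One-hot-style lookup: a plain matrix product of one-hot rows and the table."""
    dim = len(table[0])
    return [
        [sum(weight * table[k][d] for k, weight in enumerate(row)) for d in range(dim)]
        for row in one_hot_rows
    ]


ids = [0, 1, 2]                                  # input as in the question
one_hot = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]      # its one-hot representation

by_index = lookup_by_index(ids, embedding_table)
by_one_hot = lookup_by_one_hot(one_hot, embedding_table)

# Both routes select the same rows of the table.
print(by_index == by_one_hot)
```

Note that this is a conceptual picture only: as far as the source review above goes, both APIs end up in embedding_ops lookups rather than literally multiplying by a one-hot matrix, and the sparse input to feature_column.embedding_column is a sparse categorical representation rather than a dense one-hot array.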

y.selivonchyk answered Oct 23 '22 00:10
