How to use tensorflow feature_columns as input to a keras model

Tags:

Tensorflow's feature_columns API is quite useful for non-numerical feature processing. However, the current API doc is more about using feature_columns with tensorflow Estimator. Is there a possible way to use feature_columns for categorical features representation and then build a model based on tf.keras?

The only reference I found is the following tutorial. It shows how to feed feature columns to a Keras Sequential model: Link

The code snippet is as follows:

from tensorflow.python.feature_column import feature_column_v2 as fc

feature_columns = [fc.embedding_column(ccv, dimension=3), ...]
feature_layer = fc.FeatureLayer(feature_columns)
model = tf.keras.Sequential([
    feature_layer,
    tf.keras.layers.Dense(128, activation=tf.nn.relu),
    tf.keras.layers.Dense(64, activation=tf.nn.relu),
    tf.keras.layers.Dense(1, activation=tf.nn.sigmoid)
])
...
model.fit(dataset, steps_per_epoch=8) # dataset is created from tensorflow Dataset API

The question is how to use a customed model with keras functional model API. I tried the following, but it did not work (tensorflow version 1.12)

feature_layer = fc.FeatureLayer(feature_columns)
dense_features = feature_layer(features) # features is a dict of ndarrays in dataset
layer1 = tf.keras.layers.Dense(128, activation=tf.nn.relu)(dense_features)
layer2 = tf.keras.layers.Dense(64, activation=tf.nn.relu)(layer1)
output = tf.keras.layers.Dense(1, activation=tf.nn.sigmoid)(layer2)
model = Model(inputs=dense_features, outputs=output)

The error log:

ValueError: Input tensors to a Model must come from `tf.layers.Input`. Received: Tensor("feature_layer/concat:0", shape=(4, 3), dtype=float32) (missing previous layer metadata).

I don't kown how to transform feature columns to keras model's input.

306

asked Jan 26 '19 02:01

JM ZHU

1 Answers

The behavior you desire could be achieved and it's able to combine tf.feature_column and keras functional API. And, actually, is not mentioned in TF docs.

This works at least in TF 2.0.0-beta1, but may being changed or even simplified in further releases.

Please check out issue in TensorFlow github repository Unable to use FeatureColumn with Keras Functional API #27416. There you will find useful comments about tf.feature_column and Keras Functional API.

Because you ask about general approach I would just copy the snippet with example from the link above. update: the code below should work

from __future__ import absolute_import, division, print_function

import numpy as np
import pandas as pd

#!pip install tensorflow==2.0.0-alpha0
import tensorflow as tf

from tensorflow import feature_column
from tensorflow import keras
from tensorflow.keras import layers
from sklearn.model_selection import train_test_split

csv_file = tf.keras.utils.get_file('heart.csv', 'https://storage.googleapis.com/download.tensorflow.org/data/heart.csv')
dataframe = pd.read_csv(csv_file, nrows = 10000)
dataframe.head()

train, test = train_test_split(dataframe, test_size=0.2)
train, val = train_test_split(train, test_size=0.2)
print(len(train), 'train examples')
print(len(val), 'validation examples')
print(len(test), 'test examples')

# Define method to create tf.data dataset from Pandas Dataframe
# This worked with tf 2.0 but does not work with tf 2.2
def df_to_dataset_tf_2_0(dataframe, label_column, shuffle=True, batch_size=32):
    dataframe = dataframe.copy()
    #labels = dataframe.pop(label_column)
    labels = dataframe[label_column]

    ds = tf.data.Dataset.from_tensor_slices((dict(dataframe), labels))
    if shuffle:
        ds = ds.shuffle(buffer_size=len(dataframe))
    ds = ds.batch(batch_size)
    return ds

def df_to_dataset(dataframe, label_column, shuffle=True, batch_size=32):
    dataframe = dataframe.copy()
    labels = dataframe.pop(label_column)
    #labels = dataframe[label_column]

    ds = tf.data.Dataset.from_tensor_slices((dataframe.to_dict(orient='list'), labels))
    if shuffle:
        ds = ds.shuffle(buffer_size=len(dataframe))
    ds = ds.batch(batch_size)
    return ds


batch_size = 5 # A small batch sized is used for demonstration purposes
train_ds = df_to_dataset(train, label_column = 'target', batch_size=batch_size)
val_ds = df_to_dataset(val,label_column = 'target',  shuffle=False, batch_size=batch_size)
test_ds = df_to_dataset(test, label_column = 'target', shuffle=False, batch_size=batch_size)

age = feature_column.numeric_column("age")

feature_columns = []
feature_layer_inputs = {}

# numeric cols
for header in ['age', 'trestbps', 'chol', 'thalach', 'oldpeak', 'slope', 'ca']:
  feature_columns.append(feature_column.numeric_column(header))
  feature_layer_inputs[header] = tf.keras.Input(shape=(1,), name=header)

# bucketized cols
age_buckets = feature_column.bucketized_column(age, boundaries=[18, 25, 30, 35])
feature_columns.append(age_buckets)

# indicator cols
thal = feature_column.categorical_column_with_vocabulary_list(
      'thal', ['fixed', 'normal', 'reversible'])
thal_one_hot = feature_column.indicator_column(thal)
feature_columns.append(thal_one_hot)
feature_layer_inputs['thal'] = tf.keras.Input(shape=(1,), name='thal', dtype=tf.string)

# embedding cols
thal_embedding = feature_column.embedding_column(thal, dimension=8)
feature_columns.append(thal_embedding)

# crossed cols
crossed_feature = feature_column.crossed_column([age_buckets, thal], hash_bucket_size=1000)
crossed_feature = feature_column.indicator_column(crossed_feature)
feature_columns.append(crossed_feature)



feature_layer = tf.keras.layers.DenseFeatures(feature_columns)
feature_layer_outputs = feature_layer(feature_layer_inputs)

x = layers.Dense(128, activation='relu')(feature_layer_outputs)
x = layers.Dense(64, activation='relu')(x)

baggage_pred = layers.Dense(1, activation='sigmoid')(x)

model = keras.Model(inputs=[v for v in feature_layer_inputs.values()], outputs=baggage_pred)

model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])

model.fit(train_ds)

117

answered Oct 02 '22 17:10

Egor B Eremeev

Related questions
                            
                                How to implement pixel-wise classification for scene labeling in TensorFlow?
                            
                                Keras Binary Classification - Sigmoid activation function
                            
                                Tensorflow `set_random_seed` not working [duplicate]
                            
                                How can visualize tensorflow convolution filters?
                            
                                How do you convert a .onnx to tflite?
                            
                                Parallelization strategies for deep learning
                            
                                How to index a list with a TensorFlow tensor?
                            
                                CNN Image Recognition with Regression Output on Tensorflow
                            
                                Use attribute and target matrices for TensorFlow Linear Regression Python
                            
                                How to deploy and serve prediction using TensorFlow from API?
                            
                                Limit Tensorflow CPU and Memory usage
                            
                                Difference between Dense(2) and Dense(1) as the final layer of a binary classification CNN?
                            
                                TensorFlow error: logits and labels must be same size
                            
                                Unsuccessful TensorSliceReader constructor: Failed to find any matching files for bird-classifier.tfl.ckpt-50912
                            
                                Why is tf.Variable uppercase but tf.constant lowercase?
                            
                                Building an SVM with Tensorflow
                            
                                TensorFlow in production for real time predictions in high traffic app - how to use?
                            
                                TensorFlow: does tf.train.batch automatically load the next batch when the batch has finished training?
                            
                                How to run multiple graphs in a Session - Tensorflow API
                            
                                Where is the tensorflow session in Keras

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

How to use tensorflow feature_columns as input to a keras model

Tags:

tensorflow

keras

JM ZHU

People also ask

1 Answers

Egor B Eremeev

Recent Activity

Donate For Us