
Calculating Cross Entropy in TensorFlow

I am having a hard time calculating cross entropy in TensorFlow. In particular, I am using the function:

tf.nn.softmax_cross_entropy_with_logits()

Using seemingly simple code, I can only get it to return zero:

import tensorflow as tf
import numpy as np

sess = tf.InteractiveSession()

a = tf.placeholder(tf.float32, shape =[None, 1])
b = tf.placeholder(tf.float32, shape = [None, 1])
sess.run(tf.global_variables_initializer())
c = tf.nn.softmax_cross_entropy_with_logits(
    logits=b, labels=a
).eval(feed_dict={b:np.array([[0.45]]), a:np.array([[0.2]])})
print(c)

returns

0

My understanding of cross entropy is as follows:

H(p,q) = -sum_x p(x)*log(q(x))

Where p(x) is the true probability of event x and q(x) is the predicted probability of event x.

Therefore, if any two numbers are used for p(x) and q(x) such that

0<p(x)<1 AND 0<q(x)<1

there should be a nonzero cross entropy. I suspect that I am using TensorFlow incorrectly. Thanks in advance for any help.

David Kaftan asked Mar 01 '17 00:03



2 Answers

Here is an implementation in TensorFlow 2.0, in case somebody else (probably me) needs it in the future.

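import tensorflow as tf

@tf.function
def cross_entropy(x, y, epsilon=1e-9):
    # Sum over the class axis; dividing by log(2) converts nats to bits.
    # epsilon guards against log(0).
    return -tf.reduce_sum(y * tf.math.log(x + epsilon), -1) / tf.math.log(2.)

# Each row is a probability distribution over two classes.
x = tf.constant([
    [1.0, 0],
    [0.5, 0.5],
    [.75, .25],
], dtype=tf.float32)

with tf.GradientTape() as tape:
    tape.watch(x)
    # Cross entropy of each row with itself, i.e. its entropy.
    y = cross_entropy(x, x)

tf.print(y)
tf.print(tape.gradient(y, x))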

Output

[-0 1 0.811278105]
[[-1.44269502 29.8973541]
 [-0.442695022 -0.442695022]
 [-1.02765751 0.557305]]
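
The first line of that output can be sanity-checked without TensorFlow. The following is only a rough sketch in plain Python: the helper name cross_entropy_bits is made up for illustration, and it assumes each row is a probability distribution compared against itself, as in the example above.

```python
import math

def cross_entropy_bits(p, q, epsilon=1e-9):
    """Cross entropy in bits between a true distribution p and a predicted q."""
    # epsilon keeps log2 defined when q contains an exact zero.
    return -sum(pi * math.log2(qi + epsilon) for pi, qi in zip(p, q))

# Same rows as the TensorFlow example; each row is compared with itself,
# so the result is the entropy of that row.
rows = [[1.0, 0.0], [0.5, 0.5], [0.75, 0.25]]
entropies = [cross_entropy_bits(row, row) for row in rows]
print(entropies)
```

The three values should come out close to 0, 1 and 0.811 bits, up to the small perturbation introduced by epsilon.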
Souradeep Nanda answered Oct 08 '22 12:10


In addition to Don's answer (+1), this answer written by mrry may interest you, as it gives the formula to calculate the cross entropy in TensorFlow:

An alternative way to write:

xent = tf.nn.softmax_cross_entropy_with_logits(logits, labels)

...would be:

softmax = tf.nn.softmax(logits)
xent = -tf.reduce_sum(labels * tf.log(softmax), 1)

However, this alternative would be (i) less numerically stable (since the softmax may compute much larger values) and (ii) less efficient (since some redundant computation would happen in the backprop). For real uses, we recommend that you use tf.nn.softmax_cross_entropy_with_logits().
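
To make the quoted formula concrete, here is a minimal sketch in plain Python rather than TensorFlow. The helper name softmax_cross_entropy is hypothetical, it handles a single example, and it uses the usual subtract-the-max trick for stability; it is not how TensorFlow implements the op. It also suggests why the code in the question returns zero: with shape [None, 1] the softmax is taken over a single class, so it is always 1, and log(1) = 0 whatever the label is.

```python
import math

def softmax_cross_entropy(logits, labels):
    """Cross entropy (in nats) between labels and softmax(logits), one example."""
    # Subtract the largest logit before exponentiating so exp() cannot overflow.
    m = max(logits)
    log_norm = m + math.log(sum(math.exp(z - m) for z in logits))
    log_softmax = [z - log_norm for z in logits]
    return -sum(p * lq for p, lq in zip(labels, log_softmax))

# One class, as in the question: softmax([0.45]) is [1.0], so the loss is zero.
single = softmax_cross_entropy([0.45], [0.2])

# Two classes: the labels now form a distribution and the loss is nonzero.
two_class = softmax_cross_entropy([0.45, 0.0], [0.2, 0.8])

print(single)
print(two_class)
```

Under this reading, feeding logits and labels with one column per class (for example shape [None, 2]) is what produces a nonzero value.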

Franck Dernoncourt answered Oct 08 '22 11:10