
How can I get around Keras pad_sequences() rounding float values to zero?

I have a text classification model built with Keras. I've been trying to pad my variable-length sequences, but the Keras function pad_sequences() just returns zeros.

I've figured out that with a list of integer sequences like the first one below, it works just fine. But once the elements are floats, like in the second one, everything turns to zeros.

from keras.preprocessing.sequence import pad_sequences  # Keras 2.x import path

x = [[1, 2], [3, 4, 5], [4], [7, 8, 9, 10]]
print(pad_sequences(x, padding='post'))

outputs:

[[ 1  2  0  0]
 [ 3  4  5  0]
 [ 4  0  0  0]
 [ 7  8  9 10]]

But

x = [[.1, .2], [.3, .4, .5], [.4], [.7, .8, .9, .010]]
print(pad_sequences(x, padding='post'))

outputs:

[[ 0  0  0  0]
 [ 0  0  0  0]
 [ 0  0  0  0]
 [ 0  0  0  0]]

And this:

x = [[.1, .2], [.3, .4, .5], [.4], [.7, .8, .9, .010]]
print(pad_sequences(x, padding='post', value=99))

outputs:

[[ 0  0 99 99]
 [ 0  0  0 99]
 [ 0 99 99 99]
 [ 0  0  0  0]]

So I guess this function just ignores floats/decimals. Is there a way I can get around this?

asked Jan 03 '19 23:01 by th4t gi


1 Answer

This happens because the default data type used by the pad_sequences function is int32. All values are therefore cast to integers, and since the cast truncates toward zero, floats between 0 and 1 become 0. To resolve this, pass the dtype='float32' argument:

pad_sequences(x, padding='post', value=99, dtype='float32')
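
To see why the cast matters without installing Keras, here is a minimal pure-Python sketch of post-padding. The helper pad_post and its cast parameter are hypothetical stand-ins for pad_sequences and its dtype argument, not the library's actual implementation; the only assumption carried over is that each element is converted to the target type before being stored.

```python
def pad_post(sequences, value=0, cast=int):
    """Hypothetical stand-in for pad_sequences(..., padding='post').

    `cast` plays the role of the dtype argument: every element and the
    padding value are converted with it before being stored.
    """
    maxlen = max(len(seq) for seq in sequences)
    padded = []
    for seq in sequences:
        # The conversion happens here: int(0.1) truncates to 0,
        # while float(0.1) keeps the value.
        row = [cast(v) for v in seq]
        row += [cast(value)] * (maxlen - len(row))
        padded.append(row)
    return padded


x = [[.1, .2], [.3, .4, .5], [.4], [.7, .8, .9, .010]]

as_int = pad_post(x, value=99, cast=int)      # mimics the default dtype='int32'
as_float = pad_post(x, value=99, cast=float)  # mimics dtype='float32'

print(as_int)
print(as_float)
```

With cast=int the result reproduces the zeros and 99s from the question, and with cast=float the original values survive alongside the padding.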
answered Sep 30 '22 10:09 by today