
Discrete Weights and Activations in Tensorflow or Keras

Do you know of a way to constrain TensorFlow or Keras to a set of discrete weights and to use discrete/rigid activation functions (e.g. sign or hard-tanh)?

The APIs seem to have only smooth activation functions.

What I also thought about is to discretize the weights via a custom regularization function, but I don't know how to make the frameworks take this into account.

Probably I'll have to extend, for example, the Dense layer class of the respective framework and define a custom forward-propagation function (and its derivative). Do you have any examples of this?
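For reference, the "discretize via a custom function" idea from the previous paragraph amounts to projecting each weight onto the nearest allowed value. A minimal NumPy sketch of that projection (the function name `project_to_levels` is my own, not a framework API):

```python
import numpy as np

def project_to_levels(w, levels):
    # Replace each weight with the nearest value from a discrete set.
    levels = np.asarray(levels, dtype=float)
    idx = np.argmin(np.abs(w[..., None] - levels), axis=-1)
    return levels[idx]

w = np.array([[0.12, -0.7],
              [0.45, 0.98]])
print(project_to_levels(w, [-1.0, 0.0, 1.0]))
# [[ 0. -1.]
#  [ 0.  1.]]
```

In Keras, this kind of projection could plausibly be wired in as a custom weight constraint (applied after each update) rather than as a regularizer, but that choice is an assumption on my part.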

asked Nov 09 '22 by ndrizza


1 Answer

In my opinion, changing weights and activations from smooth to discrete ones would be a huge problem in Keras. I see at least two major difficulties with this approach:

  1. The optimization framework would have to be completely different: the main reason Keras / Theano do such a good job with ANNs is that they can automatically differentiate tensor functions. This is the main building block of most of today's optimization algorithms. Changing the domain from continuous to discrete changes the rules of optimization and, as far as I know, Keras and Theano are not prepared for this.
  2. Mathematical issues: you may wonder whether simply rounding every weight and activation might be a good solution to your problem. But you have to remember that high-dimensional discrete grids have counterintuitive properties which may be really misleading. E.g. the diameter of a 28 x 28 x 3 = 2352-dimensional unit cube is √2352 ≈ 48.5, and it has an enormous number of vertices (2^dimension).
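The numbers in point 2 are easy to verify with the standard library alone:

```python
import math

dim = 28 * 28 * 3              # 2352 dimensions
diameter = math.sqrt(dim)      # longest diagonal of the unit hypercube
vertices = 2 ** dim            # number of corners of the grid

print(round(diameter, 1))      # about 48.5
print(len(str(vertices)))      # the vertex count has over 700 decimal digits
```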

These are the reasons why solving your problem might be really difficult.
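That said, a common workaround in the binarized-network literature is the straight-through estimator: use the discrete function (e.g. sign) in the forward pass, but backpropagate through a smooth surrogate (identity, usually clipped to [-1, 1]) as if the activation were differentiable. A minimal NumPy sketch of the idea (function names are my own):

```python
import numpy as np

def sign_forward(x):
    # Discrete forward pass: hard sign activation.
    return np.where(x >= 0, 1.0, -1.0)

def sign_backward_ste(x, upstream):
    # Straight-through estimator: treat the activation as the identity
    # when backpropagating, but zero the gradient outside [-1, 1]
    # so saturated units stop learning.
    return upstream * (np.abs(x) <= 1.0)

x = np.array([-2.0, -0.3, 0.4, 1.5])
print(sign_forward(x))                          # [-1. -1.  1.  1.]
print(sign_backward_ste(x, np.ones_like(x)))    # [0. 1. 1. 0.]
```

In TensorFlow, this forward/backward override can be expressed with `tf.custom_gradient` inside a custom layer, which matches the question's idea of defining a custom forward-propagation function and its derivative.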

answered Nov 15 '22 by Marcin Możejko