Neural network: estimating sine wave frequency

With the objective of learning Keras LSTMs and RNNs, I thought I'd create a simple problem to work on: given a sine wave, can we predict its frequency?

I wouldn't expect a simple feed-forward neural network to be able to predict the frequency, given that the notion of time is important here. However, even with LSTMs I am unable to learn the frequency; the model only learns to predict a trivial zero as the estimated frequency (even on training samples).

Here's the code to create the train set.

import numpy as np
import matplotlib.pyplot as plt

def create_sine(frequency):
    return np.sin(frequency*np.linspace(0, 2*np.pi, 2000))

train_x = np.array([create_sine(x) for x in range(1, 300)])
train_y = list(range(1, 300))

Now, here's a simple neural network for this example.

from keras.models import Model
from keras.layers import Dense, Input, LSTM

input_series = Input(shape=(2000,),name='Input')
dense_1 = Dense(100)(input_series)
pred = Dense(1, activation='relu')(dense_1)
model = Model(input_series, pred)
model.compile('adam','mean_absolute_error')
model.fit(train_x[:100], train_y[:100], epochs=100)

As expected, this NN doesn't learn anything useful. Next, I tried a simple LSTM example.

input_series = Input(shape=(2000,1),name='Input')
lstm = LSTM(100)(input_series)
pred = Dense(1, activation='relu')(lstm)
model = Model(input_series, pred)
model.compile('adam','mean_absolute_error')
model.fit(train_x[:100].reshape(100, 2000, 1), train_y[:100], epochs=100)

However, this LSTM-based model doesn't learn anything useful either.

asked Dec 21 '17 by Nipun Batra

3 Answers

Why doesn't it learn?

You think it's a simple problem to train an RNN on, but actually your setup isn't easy for the network at all:

  • As already mentioned, there's a lack of samples. You throw a lot of data at the network (300 * 2000 points), but the actual target (the frequency) is seen only once per series. Even if the network does learn something, there's a high chance it will overfit.

  • Inconsistent data. Remember that RNNs are good at capturing similar patterns in series data. For instance, in NLP all sentences in the corpus are governed by the same language rules, and more sentences help the RNN understand these rules better, i.e., more data helps.

    In your case, the series with different frequencies aren't very much alike: compare the sine with frequency=1 and the one with frequency=100. This kind of diversity in the data makes it harder to learn, not easier. It doesn't mean the frequency is impossible for an RNN to learn; it simply means you shouldn't be surprised that a trivial RNN like yours has a hard time.

  • Data scale. Changing the frequency from 1 to 300 changes the scale of both x and y by two orders of magnitude, which may be problematic for any neural network (a minimal sketch of target scaling follows this list).
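
To illustrate the scaling point, here is a minimal sketch (my own illustration, not code from this answer) of rescaling the targets to [0, 1] before training and mapping predictions back afterwards; max_freq is assumed to be the largest frequency in the training set, mirroring the question's setup:

import numpy as np

# assumed setup mirroring the question: integer frequencies 1..299
max_freq = 300
train_y = np.arange(1, max_freq)

# scale targets to [0, 1] so the output range is well-behaved
train_y_scaled = train_y / float(max_freq)

# ... train the model on train_y_scaled instead of train_y ...

# map predictions back to the original frequency scale:
# predicted_freq = model.predict(test_x) * max_freq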

Solution

Since your goal is rather educational, I solved the second and third items simply by limiting the target frequency to 10, so that scaling and distribution diversity aren't much of an issue (you are welcome to try different values here: you should see that increasing this one parameter to, say, 50 makes the task much more complex).

The first item is solved by giving the RNN 10 examples of each frequency instead of just one. I've also added one more hidden layer to increase network flexibility, plus a simple regularizer (a Dropout layer).

The complete code:

import numpy as np
from keras.models import Model
from keras.layers import Input, Dense, Dropout, LSTM

max_freq = 10
time_steps = 100

def create_sine(frequency, offset):
  return np.sin(frequency * np.linspace(offset, 2 * np.pi + offset, time_steps))

# ten examples of each frequency, each with a random phase offset
train_y = list(range(1, max_freq)) * 10
train_x = np.array([create_sine(freq, np.random.uniform(0, 1)) for freq in train_y])
train_y = np.array(train_y)

input_series = Input(shape=(time_steps, 1), name='Input')
lstm = LSTM(units=100)(input_series)
hidden = Dense(units=100, activation='relu')(lstm)
dropout = Dropout(rate=0.1)(hidden)
output = Dense(units=1, activation='relu')(dropout)

model = Model(input_series, output)
model.compile('adam', 'mean_squared_error')
model.fit(train_x.reshape(-1, time_steps, 1), train_y, epochs=200)

# Trying the network on the same data
test_x = train_x.reshape(-1, time_steps, 1)
test_y = train_y
predicted = model.predict(test_x).reshape([-1])
print()
print((predicted - train_y)[:12])
print(np.mean(np.abs(predicted - train_y)))

The output:

max_freq=10

[-0.05612183 -0.01982236 -0.03744316 -0.02568841 -0.11959982 -0.0770483
  0.04643679  0.12057972 -0.00625324 -0.00724655 -0.16919005 -0.04512954]
0.0503574344847

max_freq=20 (everything else is the same)

[ 0.51365542  0.09269333 -0.009691    0.0619092   0.09852839  0.04378462
  0.01430321 -0.01953268  0.00722599  0.02558327 -0.04520988 -0.0614748 ]
0.146024380232

max_freq=30 (everything else is the same)

[-0.28205156 -0.28922796 -0.00569081 -0.21314907  0.1068716   0.23497915
  0.23975039  0.25955486  0.26333141  0.24235058  0.08320332 -0.03686047]
0.406703719805

Note that the results are somewhat random, and increasing max_freq actually increases the chances of divergence. But even when training converges, performance doesn't improve despite the extra data; instead it gets worse, and quite quickly.

answered Oct 11 '22 by Maxim


The number of samples is very low: only one per frequency.

Add a small amount of noise and use more data.

Normalize the output data to the [-1, 1] range, then try again (a rough sketch of these suggestions follows).
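
A minimal sketch of these suggestions (my own illustration, not code from this answer), assuming the same Keras setup as in the question: several noisy examples per frequency, with targets normalized to the [-1, 1] range:

import numpy as np

# illustrative values; not from the original answer
max_freq = 50
samples_per_freq = 20
time_steps = 500

def create_noisy_sine(frequency, noise_std=0.05):
    t = np.linspace(0, 2 * np.pi, time_steps)
    return np.sin(frequency * t) + np.random.normal(0, noise_std, time_steps)

# several noisy examples per frequency instead of a single clean one
freqs = np.repeat(np.arange(1, max_freq + 1), samples_per_freq)
train_x = np.array([create_noisy_sine(f) for f in freqs])

# normalize targets to [-1, 1]: frequency 1 maps to -1, max_freq maps to 1
train_y = 2.0 * (freqs - 1) / (max_freq - 1) - 1.0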

answered Oct 11 '22 by Birol Kuyumcu


As you said, you want to predict the frequency, and you want to use an LSTM. First we generate enough data to train on, then we build the network. Sorry that my example is not in Keras; I'm using tflearn.

import numpy as np
import tflearn
from random import shuffle

# parameters
n_input=100
n_train=2000
n_test = 500
# generate data
xs=[]
ys=[]
frequencies = np.linspace(1,50,n_train+n_test)
shuffle(frequencies)

t=np.linspace(0,2*np.pi,n_input)
for freq in frequencies:
    xs.append(np.sin(t*freq))
    ys.append(freq)

xs_train=np.array(xs[:n_train]).reshape(n_train,n_input,1)
xs_test=np.array(xs[n_train:]).reshape(n_test,n_input,1)
ys_train = np.array(ys[:n_train]).reshape(-1,1)
ys_test = np.array(ys[n_train:]).reshape(-1,1)

# LSTM network prediction
net = tflearn.input_data(shape=[None, n_input, 1])
net = tflearn.lstm(net, 10)
net = tflearn.fully_connected(net, 100, activation="relu")
net = tflearn.fully_connected(net, 1)
net = tflearn.regression(net, optimizer='adam', loss='mean_square')
model = tflearn.DNN(net)
model.fit(xs_train, ys_train, n_epoch=100)

print(np.hstack((model.predict(xs_test),ys_test))[:10])
# [[ 13.08494568  12.76470588]
#  [ 22.23135376  21.98039216]
#  [ 39.0812912   37.58823529]
#  [ 15.77548409  15.66666667]
#  [ 26.57996941  25.58823529]
#  [ 26.57759476  25.11764706]
#  [ 16.42217445  15.8627451 ]
#  [ 32.55020905  30.80392157]
#  [ 44.16622925  43.01960784]
#  [ 26.18071365  25.45098039]]

If you have the data in this form, where every sample shares the same phase alignment, you don't actually need an LSTM; you can easily replace the LSTM part with a plain deep neural network (note the input is now 2-D, so the reshaped arrays need to be flattened back):

# Deep network instead of LSTM
net = tflearn.input_data(shape=[None, n_input])
net = tflearn.fully_connected(net, 100)
net = tflearn.fully_connected(net, 100)
net = tflearn.fully_connected(net, 1)
net = tflearn.regression(net, optimizer='adam', loss='mean_square')

model = tflearn.DNN(net)
# the arrays were reshaped to 3-D for the LSTM; flatten them back to 2-D here
model.fit(xs_train.reshape(n_train, n_input), ys_train)
print(np.hstack((model.predict(xs_test.reshape(n_test, n_input)), ys_test))[:10])

Both versions will give you the predicted value of the frequency as the result. I also created a gist with the program.
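
As a quick sanity check (my own addition, not part of the original answer), one could also compute the mean absolute error over the held-out test set; this assumes the deep-network variant above, which takes 2-D inputs:

# mean absolute error over the held-out test set
preds = np.array(model.predict(xs_test.reshape(n_test, n_input))).reshape(-1)
mae = np.mean(np.abs(preds - ys_test.reshape(-1)))
print("test MAE:", mae)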

answered Oct 11 '22 by silgon