 

neuralnet: overcoming non-convergence of the algorithm

Tags: r, convergence

I want to train a neural network using the package "neuralnet" in R. The training data set is a data frame of 8 predictor variables (x1,x2,x3,...,x8) and 1 response variable (y). The data is given below:

data
      x1   x2   x3    x4     x5     x6      x7      x8       y
1   1.50 1.48 1.47 0.490 13.000 14.091 -0.1554 -0.1167 0.00000
2   1.50 1.51 1.44 0.484 17.379 25.286  0.0745  0.0746 0.00000
3   2.46 2.50 2.43 0.492 13.333 12.767 -0.1043 -0.1200 0.00000
4   1.50 1.53 1.46 0.491 19.897 23.255  0.0661  0.0650 1.00000
5   1.76 1.82 1.70 0.493 21.765 24.684  0.0933  0.0855 1.00000
6   1.50 1.49 1.43 0.498 11.071 11.297 -0.1567 -0.1200 0.66865
7   1.50 1.46 1.44 0.482 16.607 23.700  0.0750  0.0721 0.40079
8   1.49 1.52 1.48 0.485 21.583 23.225  0.0733  0.0700 1.00000
9   1.50 1.46 1.41 0.481 17.250 24.052  0.0743  0.0750 0.95040
10  2.55 2.57 2.57 0.483 13.778 12.796 -0.0970 -0.1145 0.00000
11  1.71 1.74 1.70 0.488 20.700 23.133  0.0855  0.0833 0.92063
12  2.54 2.57 2.57 0.491 13.038 12.140 -0.0960 -0.1143 0.00000
13  1.50 1.47 1.43 0.479 19.886 24.833  0.0757  0.0742 0.91667
14  1.50 1.46 1.43 0.488 17.036 21.792  0.0750  0.0750 1.00000
15  1.50 1.48 1.45 0.493 13.333 14.012 -0.1625 -0.1250 0.00000
16  1.49 1.52 1.49 0.486 21.988 24.579  0.0833  0.0761 1.00000
17  1.50 1.48 1.44 0.501 12.593 12.833 -0.1654 -0.1300 0.00992
18  1.50 1.48 1.45 0.493 14.536 16.946 -0.1454 -0.1092 0.61310
19  1.49 1.50 1.48 0.496 13.191 14.208 -0.1655 -0.1257 0.76389
20  1.73 1.76 1.72 0.489 20.591 23.219  0.0854  0.0839 0.99405
21  1.51 1.53 1.49 0.484 20.018 23.173  0.0704  0.0683 1.00000
22  1.50 1.47 1.44 0.480 19.310 24.704  0.0749  0.0739 1.00000
23  1.50 1.46 1.44 0.488 17.438 21.643  0.0744  0.0742 0.97222
24  1.71 1.75 1.69 0.485 18.875 22.255  0.0646  0.0580 0.05952
25  1.50 1.46 1.43 0.480 17.302 21.281  0.0744  0.0750 0.91667
26  1.50 1.46 1.45 0.478 19.040 23.250  0.0751  0.0718 1.00000
27  1.51 1.46 1.45 0.484 16.696 22.400  0.0667  0.0638 0.75794
28  1.50 1.46 1.43 0.491 17.071 21.474  0.0740  0.0650 1.00000
29  1.51 1.49 1.46 0.502 13.045 14.341 -0.1567 -0.1250 0.00000
30  1.51 1.49 1.45 0.494 13.500 15.223 -0.1600 -0.1250 0.50397
31  1.79 1.83 1.77 0.488 20.212 23.296  0.0855  0.0850 0.81151
32  1.61 1.63 1.59 0.485 20.250 23.315  0.0748  0.0733 1.00000
33  1.51 1.49 1.47 0.469 20.064 25.050  0.0755  0.0740 1.00000
34  1.50 1.51 1.48 0.480 19.636 26.605  0.0742  0.0743 0.00000
35  1.50 1.48 1.45 0.489 14.286 15.844 -0.0850 -0.0533 0.61310
36  3.10 3.14 3.14 0.491 14.100 14.120 -0.0960 -0.1131 0.00000
37  1.49 1.49 1.40 0.491 16.645 20.267  0.0645  0.0645 0.56746
38  1.50 1.49 1.45 0.499 12.398 13.096 -0.1650 -0.1333 0.24802
39  1.51 1.51 1.49 0.493 14.264 15.808 -0.1550 -0.1200 0.51984
40  1.49 1.47 1.42 0.501 11.571 12.648 -0.1660 -0.1300 0.14881
41  1.50 1.49 1.45 0.496 13.543 15.075 -0.1633 -0.1290 0.39881
42  2.51 2.55 2.51 0.488 12.692 12.611 -0.0956 -0.1100 0.00000
43  2.50 2.52 2.53 0.487 12.920 12.562 -0.0945 -0.1067 0.00000
44  2.25 2.28 2.27 0.490 13.962 14.962 -0.0900 -0.1047 0.61508
45  1.50 1.49 1.44 0.494 13.500 15.262 -0.1595 -0.1244 0.62500
46  1.50 1.47 1.42 0.496 13.560 14.618 -0.1550 -0.1220 0.30357
47  1.49 1.48 1.44 0.491 12.676 13.000 -0.1633 -0.1264 0.12103
48  2.58 2.62 2.60 0.486 14.200 13.275 -0.1000 -0.1159 0.00000
49  1.50 1.48 1.45 0.488 13.012 13.548 -0.1550 -0.1230 0.00000
50  1.49 1.50 1.47 0.482 20.508 23.194  0.0775  0.0747 0.94841
51  1.50 1.48 1.44 0.495 11.125 11.189 -0.1600 -0.1236 0.67063
52  2.59 2.63 2.64 0.483 13.038 13.370 -0.0920 -0.1100 0.00000
53  1.49 1.46 1.42 0.485  0.973  0.727  0.1507  0.1522 0.00000
54  1.50 1.47 1.44 0.487 13.327 13.917 -0.1550 -0.1200 0.00000
55  1.50 1.47 1.40 0.486 19.300 23.393  0.0864  0.0845 1.00000
56  1.50 1.48 1.45 0.498 13.250 15.443 -0.1632 -0.1250 0.23810
57  1.50 1.49 1.45 0.498 13.500 14.684 -0.1605 -0.1250 0.03373
58  1.50 1.47 1.45 0.486 20.100 23.477  0.0861  0.0844 0.72222
59  1.50 1.52 1.49 0.484 21.132 23.220  0.0716  0.0694 1.00000
60  2.31 2.34 2.30 0.490 14.143 15.000 -0.0900 -0.1033 0.49405
61  1.50 1.46 1.43 0.473 17.049 20.914  0.0753  0.0750 0.91667
62  1.50 1.48 1.45 0.495 13.650 14.643 -0.1583 -0.1250 0.00000
63  2.41 2.44 2.41 0.490 15.950 17.957 -0.0860 -0.1050 0.27183
64  1.50 1.48 1.46 0.497 13.272 14.392 -0.1553 -0.1231 0.00000
65  1.51 1.52 1.49 0.477 19.404 22.692  0.0705  0.0703 0.00000
66  2.59 2.61 2.61 0.486 14.000 12.635 -0.0967 -0.1100 0.00000
67  1.50 1.52 1.49 0.483 19.586 22.875  0.0702  0.0691 0.00000
68  1.50 1.51 1.47 0.479 17.836 21.496  0.0652  0.0647 0.00000
69  1.50 1.50 1.47 0.486 18.975 26.470  0.0744  0.0750 0.00000
70  2.63 2.65 2.65 0.482 12.900 12.696 -0.0967 -0.1133 0.00000
71  1.51 1.48 1.45 0.480 20.237 23.366  0.0933  0.0867 0.71429
72  1.50 1.47 1.45 0.485 17.265 21.600  0.0752  0.0745 0.94444
73  1.50 1.47 1.42 0.464 19.988 24.459  0.0758  0.0752 1.00000
74  1.50 1.47 1.44 0.488 11.333 12.622 -0.0936 -0.0567 1.00000
75  3.09 3.13 3.13 0.490 12.852 12.703 -0.0950 -0.1150 0.00000
76  1.51 1.50 1.47 0.496 12.581 12.632 -0.1664 -0.1300 0.24802
77  2.32 2.35 2.34 0.486 14.067 15.200 -0.0867 -0.1033 0.38095
78  1.50 1.46 1.46 0.481 17.337 21.726  0.0750  0.0741 0.94444
79  1.66 1.69 1.63 0.491 14.121 15.000 -0.0857 -0.1000 1.00000
80  1.50 1.48 1.44 0.493 13.327 15.032 -0.1608 -0.1250 0.00000
81  1.50 1.48 1.47 0.487 11.523 11.957 -0.1556 -0.1200 0.02579
82  1.50 1.46 1.42 0.485 18.000 21.857  0.0738  0.0656 0.91667
83  2.51 2.55 2.54 0.496 13.500 12.812 -0.0967 -0.1138 0.00000
84  1.50 1.50 1.47 0.490 17.217 23.744  0.0743  0.0750 0.00000
85  1.51 1.49 1.45 0.498 13.550 14.686 -0.1611 -0.1257 0.00000
86  2.58 2.62 2.60 0.496 14.056 14.062 -0.1000 -0.1163 0.00000
87  1.71 1.74 1.70 0.489 20.665 22.944  0.0714  0.0688 0.40278
88  1.50 1.53 1.46 0.480 21.022 23.259  0.0815  0.0753 1.00000
89  1.49 1.51 1.48 0.487 19.924 23.154  0.0745  0.0748 1.00000
90  1.50 1.48 1.45 0.489 13.618 15.154 -0.1565 -0.1207 0.59127
91  1.50 1.48 1.46 0.495 13.700 14.786 -0.1579 -0.1214 0.09921
92  1.50 1.45 1.44 0.482 17.605 22.105  0.0750  0.0745 0.91667
93  1.50 1.50 1.49 0.489 12.981 14.446 -0.1550 -0.1158 1.00000
94  1.49 1.46 1.43 0.491 17.375 21.110  0.0685  0.0650 0.91667
95  1.50 1.50 1.47 0.498 14.292 15.960 -0.1556 -0.1200 0.53571
96  1.50 1.48 1.44 0.497 13.708 15.214 -0.1650 -0.1247 0.36706
97  1.49 1.50 1.46 0.488 17.155 23.509  0.0644  0.0653 0.09722
98  1.50 1.48 1.44 0.497 13.100 14.837 -0.1594 -0.1250 0.00000
99  2.51 2.55 2.58 0.486 13.172 12.780 -0.0952 -0.1075 0.00000
100 1.49 1.46 1.41 0.478 16.650 23.000  0.0800  0.0750 0.00000
101 1.50 1.46 1.44 0.488 17.232 21.703  0.0756  0.0742 1.00000
102 1.50 1.49 1.47 0.495 11.471 13.333 -0.1560 -0.1250 0.82540
103 3.08 3.12 3.10 0.489 12.726 12.469 -0.0959 -0.1133 0.00000
104 1.67 1.70 1.66 0.488 21.480 23.315  0.0900  0.0850 1.00000
105 3.08 3.11 3.10 0.492 13.000 12.907 -0.0957 -0.1138 0.00000
106 1.50 1.54 1.45 0.490 18.833 22.880  0.0595  0.0541 1.00000
107 1.50 1.54 1.46 0.480 19.385 22.981  0.0691  0.0577 1.00000
108 1.50 1.47 1.46 0.485 17.318 21.800  0.0663  0.0660 0.94444
109 2.49 2.52 2.51 0.487 12.792 12.562 -0.1000 -0.1133 0.00000
110 1.50 1.49 1.44 0.500 12.750 15.000 -0.1650 -0.1324 0.48016
111 1.57 1.60 1.54 0.481 22.386 24.684  0.0946  0.0847 1.00000
112 1.50 1.49 1.46 0.501 14.250 16.364 -0.1533 -0.1157 0.07540
113 1.50 1.47 1.45 0.491 11.100 10.406 -0.1500 -0.1162 0.75794
114 1.67 1.70 1.66 0.486 22.253 24.324  0.0942  0.0855 1.00000
115 1.50 1.47 1.45 0.485 19.585 23.810  0.0782  0.0742 1.00000
116 1.49 1.48 1.44 0.497 13.853 15.366 -0.1643 -0.1250 0.36508
117 1.50 1.45 1.44 0.479 17.029 23.311  0.0742  0.0700 0.78175
118 1.67 1.70 1.63 0.488 22.455 24.994  0.0869  0.0851 1.00000
119 1.50 1.46 1.44 0.487 16.962 21.357  0.0663  0.0645 1.00000
120 2.41 2.45 2.39 0.493 12.702 12.375 -0.0950 -0.1143 0.00000 

The model specification is given as:

net <- neuralnet(y~x1+x2+x3+x4+x5+x6+x7+x8, data=data, hidden=10)       

When the execution is complete, a warning message is produced as given below:

Warning message:
algorithm did not converge in 1 of 1 repetition(s) within the stepmax

When I attempt to plot the network, an error message comes up:

plot(net)
Error in plot.nn(net) : weights were not calculated

I have tried different numbers of hidden neurons, from 1 to 10 and well over 10. A model was produced when hidden was 1 or 2, but not for other values. I have also tried different activation functions to smooth the results. The data does not have NA values. Can anyone please help me understand why this happens and how it might be resolved?

Tunde Awosanya asked Oct 14 '13 13:10



2 Answers

Most machine learning algorithms aren't just going to work "out of the box". There are parameters that need to be tuned to get them to work properly with your particular dataset. If you type in ?neuralnet you will see a whole bunch of parameters that can be adjusted.

One thing you could try is to increase stepmax, for example

net <- neuralnet(y~x1+x2+x3+x4+x5+x6+x7+x8, data=data, hidden=10,stepmax=1e6)  

to give the algorithm more time to converge. However, the better thing to do is to figure out exactly what is going on, by adjusting the different parameters and examining how they affect the output.
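As a rough sketch of that kind of experiment, the snippet below tries a few step budgets and checks whether weights were actually stored before doing anything else with the fit. The `toy` data frame is a synthetic stand-in for the poster's data, and `has_converged` is a hypothetical helper name, not part of the package. It relies on the fact that a fit with no converged repetition has no `weights` component, which is what triggers the "weights were not calculated" error in plot().

```r
# TRUE if neuralnet() stored fitted weights, i.e. at least one repetition
# converged. plot() stops with an error when this component is missing.
has_converged <- function(net) !is.null(net$weights)

if (requireNamespace("neuralnet", quietly = TRUE)) {
  set.seed(1)
  # Synthetic stand-in for the real data frame (assumption for illustration).
  toy <- data.frame(x1 = runif(60), x2 = runif(60))
  toy$y <- 0.3 * toy$x1 + 0.5 * toy$x2

  # Try progressively larger step budgets and stop at the first success.
  for (steps in c(1e4, 1e5, 1e6)) {
    # The non-convergence warning is silenced because the result is
    # checked explicitly right after.
    net <- suppressWarnings(
      neuralnet::neuralnet(y ~ x1 + x2, data = toy, hidden = 3,
                           stepmax = steps)
    )
    if (has_converged(net)) {
      cat("converged with stepmax =", steps, "after",
          net$result.matrix["steps", 1], "steps\n")
      break
    }
  }
}
```

Guarding plot(net) with a check like this avoids the error shown in the question when training runs out of steps.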

mrip answered Oct 22 '22 17:10


I agree with mrip. You can increase stepmax to give the algorithm more time to converge. The other option is to adjust the threshold parameter, which defaults to 0.01. Try increasing it to 0.1 or 0.5. If you set lifesign to 'full', you can see the threshold values reached during training. Set your threshold just above the value reported at the last step, since training only stops once the reached value drops below the threshold. Remember: the higher the threshold, the lower the accuracy of the model.
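A minimal sketch of this suggestion, again on a synthetic `toy` data frame rather than the poster's data: it loosens `threshold`, turns on progress output with `lifesign = "full"`, and then reads the stopping diagnostics from the fit. `fit_summary` is a hypothetical helper name, and the values 0.1 and 500 are only illustrative choices.

```r
# Extract the stopping diagnostics from a fitted neuralnet object.
# Returns NULL when no repetition converged (no result matrix stored).
fit_summary <- function(net) {
  if (is.null(net$result.matrix)) return(NULL)
  net$result.matrix[c("error", "reached.threshold", "steps"), 1]
}

if (requireNamespace("neuralnet", quietly = TRUE)) {
  set.seed(1)
  # Synthetic stand-in for the real data frame (assumption for illustration).
  toy <- data.frame(x1 = runif(60), x2 = runif(60))
  toy$y <- 0.3 * toy$x1 + 0.5 * toy$x2

  # A looser stopping threshold than the 0.01 default, with progress
  # printed every 500 steps.
  net <- neuralnet::neuralnet(y ~ x1 + x2, data = toy, hidden = 3,
                              threshold = 0.1,
                              lifesign = "full", lifesign.step = 500)
  print(fit_summary(net))
}
```

If a run stops at stepmax, the last progress line shows how small the gradient measure got, which gives a starting point for choosing a threshold that the training can actually reach.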

derp92 answered Oct 22 '22 18:10