
Python, why is my probabilistic neural network (PNN) always predicting zeros?

I'm working with Python 2.7.5 on a Linux CentOS 7 machine. I'm trying to apply a probabilistic neural network (PNN) to my dataset, to solve a binary classification problem.

I'm using the following Python packages: numpy, sklearn, and neupy. I'm trying to follow this example, which is applied to the iris dataset.

The issue is that my PNN always predicts zero values (elements classified as zeros), and I cannot understand why...

Here's my dataset ("dataset_file.csv"). There are 34 features and 1 target label (the last column, which can be 0 or 1):

47,1,0,1,0,20,1,0,1,24,1,1,0,2,1,8050,9,1,274,60,258,65,7,3.2,105,289,0,0,79,1,0,0,0,34,0
55,1,0,1,0,45,1,0,0,1,1,1,1,3,0,11200,7,0,615,86,531,97,5.4,2.6,96,7541,1.6,0.8,6,1,1,1,1,42,0
29,1,1,1,0,23,0,1,0,1,0,0,0,2,1,5300,12,1,189,30,203,72,7,3.5,93,480,0,0,90,1,0,0,0,43,1
39,1,0,1,0,10,1,0,0,3,0,1,1,0,1,7910,14,1,462,28,197,50,8,4.5,93,459,5,2.8,45,1,1,0,0,21,0
47,1,0,1,0,10,1,1,1,1.5,1,1,0,3,1,9120,4,0,530,71,181,60,6.2,3.8,83,213,3.6,1.95,53,1,1,0,0,11,0
57,1,0,1,0,50,0,1,0,24,1,0,1,3,1,16000,9,0,330,78,172,74,5.9,2.9,112,332,4.1,2.1,82,1,1,0,0,23,1
44,1,0,1,0,15,1,1,0,0.5,1,1,1,2,0,5800,14,0,275,44,155,105,7.2,3.5,84,360,3.44,1.6,55,1,1,0,0,24,0
49,1,3,1,0,25,1,1,1,1,0,1,0,3,1,8200,12,1,441,74,237,111,6.2,3.6,79,211,0,0,91,1,0,0,0,43,0
56,1,0,1,0,5,1,0,0,3,1,0,0,3,1,5100,7,1,188,58,185,62,7.8,3.9,112,472,0,0,83,1,0,0,0,34,0
33,1,4,1,0,20,1,0,1,3,0,0,0,3,1,7300,10,1,329,40,139,80,6.9,3.7,89,122,3.4,1.2,75,1,1,0,0,33,0
22,0,0,1,0,15,1,0,0,1,1,1,1,0,1,3700,8,0,617,53,267,128,6.2,3.8,91,3060,3.1,1.9,63,1,1,0,0,54,0
82,0,5,1,0,60,1,0,1,3,1,1,1,0,0,8900,11,1,275,83,255,93,5.9,3.1,95,455,4.8,1.9,68,1,1,0,0,55,0
49,0,2,1,0,20,1,0,1,2,1,0,0,0,1,8500,6,1,292,84,241,79,6.8,3.9,100,158,3.4,1.25,75,1,1,1,0,65,0
51,1,4,1,0,51,1,1,1,2,1,0,1,0,1,18300,14,1,522,91,268,105,6.1,3.1,98,758,4.2,2.5,19,1,1,1,1,67,0
61,1,2,1,0,20,1,0,0,3,1,0,0,3,1,6600,9,1,563,101,268,78,6.4,3.7,115,694,5.2,3,29,1,1,1,1,77,0
48,0,1,1,0,28,1,0,0,12,1,0,0,3,1,9100,22,0,114,18,165,63,7.2,3.6,103,429,0,0,84,1,0,0,0,34,0
57,0,0,1,0,40,1,0,1,1,0,0,0,3,1,8100,8,0,264,15,120,69,6.8,3.4,91,390,0,0,91,1,0,0,0,23,0
57,0,0,1,0,25,1,0,0,12,0,1,0,0,0,6900,16,0,847,111,289,78,5.3,2.4,105,162,3.1,1.9,68,1,1,1,0,78,0
47,1,4,1,0,40,0,1,1,6,1,1,1,2,1,21500,10,0,632,121,219,108,7.5,2.8,149,1158,3.17,1.77,8,1,1,1,0,58,1
52,0,0,1,0,30,1,1,1,2,1,0,1,0,0,14600,5,1,405,88,280,140,5.8,3.1,121,983,3.9,1.8,17,1,1,1,1,76,0
50,1,2,1,0,16,1,1,0,1,1,1,1,0,1,12200,9,1,280,7,176,71,7.4,4.2,105,293,4.5,2.7,68,1,1,0,0,67,0
63,1,4,1,0,18,1,0,1,0.5,1,1,1,3,0,16400,8,0,479,93,140,64,5.8,3.7,226,1286,6.22,3.6,18,1,1,0,0,19,0
54,0,0,1,0,20,0,0,1,8,0,0,1,0,1,7200,10,0,366,71,284,73,6.4,3.7,114,384,4.1,2.8,65,1,1,0,0,24,1
31,0,3,1,0,10,0,1,0,1,1,1,1,1,1,3800,8,0,568,102,236,59,6.4,3.7,99,387,0,0,78,1,0,0,0,45,1
44,0,6,1,0,10,1,1,0,2,1,1,1,0,1,7700,15,1,274,44,139,62,6.7,4.1,93,129,0,0,76,1,0,0,0,24,0
50,0,6,1,0,20,1,0,0,3,1,1,0,0,1,5200,6,0,403,90,224,79,6.3,3.1,109,151,3.1,1.4,79,1,0,0,0,34,0
61,1,3,1,0,30,0,1,0,3,1,0,1,2,1,11500,7,0,668,88,178,65,6.7,3.08,104,680,4.1,2.5,22,1,1,0,0,23,1

And here's my Python code:

import numpy as np
from sklearn import datasets
from sklearn.metrics import matthews_corrcoef
from sklearn.cross_validation import StratifiedKFold

from neupy.algorithms import PNN

fileName="dataset_file.csv"
TARGET_COLUMN=35

from numpy import genfromtxt
input_dataset_data = genfromtxt(fileName, delimiter=',', skip_header=0, usecols=(range(0, TARGET_COLUMN-1)))
#print(input_dataset_data)
input_dataset_target = genfromtxt(fileName, delimiter=',', skip_header=0, usecols=(TARGET_COLUMN-1))
#print(input_dataset_target)

kfold_number = 2
skfold = StratifiedKFold(input_dataset_target, kfold_number, shuffle=True)
average_result = 0

print("> Start classify input_dataset dataset")

for i, (train, test) in enumerate(skfold, start=1):

    pnn_network = PNN(std=0.1, step=0.2, verbose=True)

    pnn_network.train(input_dataset_data[train], input_dataset_target[train])
    predictions = pnn_network.predict(input_dataset_data[test])

    print(predictions)
    #print(input_dataset_target[test])

    mcc = matthews_corrcoef(input_dataset_target[test], predictions)
    print("The Matthews correlation coefficient is %f" % mcc)

    print("kfold #{:<2}: Guessed {} out of {}".format(
        i, np.sum(predictions == input_dataset_target[test]), test.size
    ))

Does anyone know why I am getting only predictions with value 0? Can you give me some suggestions to solve this issue?

Thanks!

EDIT: Here's the normalized dataset (normalized by column):

0.55,1,0,1,0,0.29,1,0,1,0.46,1,1,0,0.67,1,0.37,0.41,1,0.08,0.47,0.23,0.13,0.82,0.46,0.25,0.04,0,0,0.52,1,0,0,0,0.33,0
0.65,1,0,1,0,0.64,1,0,0,0.02,1,1,1,1,0,0.52,0.32,0,0.18,0.67,0.47,0.2,0.64,0.38,0.23,1,0.24,0.18,0.04,1,1,1,1,0.41,0
0.34,1,0.13,1,0,0.33,0,0.5,0,0.02,0,0,0,0.67,1,0.25,0.55,1,0.06,0.23,0.18,0.15,0.82,0.51,0.22,0.06,0,0,0.6,1,0,0,0,0.42,1
0.46,1,0,1,0,0.14,1,0,0,0.06,0,1,1,0,1,0.37,0.64,1,0.14,0.22,0.17,0.1,0.94,0.65,0.22,0.06,0.75,0.64,0.3,1,1,0,0,0.2,0
0.55,1,0,1,0,0.14,1,0.5,1,0.03,1,1,0,1,1,0.42,0.18,0,0.16,0.55,0.16,0.12,0.73,0.55,0.2,0.03,0.54,0.44,0.35,1,1,0,0,0.11,0
0.67,1,0,1,0,0.71,0,0.5,0,0.46,1,0,1,1,1,0.74,0.41,0,0.1,0.6,0.15,0.15,0.69,0.42,0.27,0.04,0.61,0.48,0.54,1,1,0,0,0.22,1
0.52,1,0,1,0,0.21,1,0.5,0,0.01,1,1,1,0.67,0,0.27,0.64,0,0.08,0.34,0.14,0.21,0.85,0.51,0.2,0.05,0.51,0.36,0.36,1,1,0,0,0.23,0
0.58,1,0.38,1,0,0.36,1,0.5,1,0.02,0,1,0,1,1,0.38,0.55,1,0.13,0.57,0.21,0.23,0.73,0.52,0.19,0.03,0,0,0.6,1,0,0,0,0.42,0
0.66,1,0,1,0,0.07,1,0,0,0.06,1,0,0,1,1,0.24,0.32,1,0.06,0.45,0.16,0.13,0.92,0.57,0.27,0.06,0,0,0.55,1,0,0,0,0.33,0
0.39,1,0.5,1,0,0.29,1,0,1,0.06,0,0,0,1,1,0.34,0.45,1,0.1,0.31,0.12,0.16,0.81,0.54,0.21,0.02,0.51,0.27,0.5,1,1,0,0,0.32,0
0.26,0,0,1,0,0.21,1,0,0,0.02,1,1,1,0,1,0.17,0.36,0,0.19,0.41,0.24,0.26,0.73,0.55,0.22,0.41,0.46,0.43,0.42,1,1,0,0,0.52,0
0.96,0,0.63,1,0,0.86,1,0,1,0.06,1,1,1,0,0,0.41,0.5,1,0.08,0.64,0.23,0.19,0.69,0.45,0.23,0.06,0.72,0.43,0.45,1,1,0,0,0.53,0
0.58,0,0.25,1,0,0.29,1,0,1,0.04,1,0,0,0,1,0.4,0.27,1,0.09,0.65,0.21,0.16,0.8,0.57,0.24,0.02,0.51,0.28,0.5,1,1,1,0,0.63,0
0.6,1,0.5,1,0,0.73,1,0.5,1,0.04,1,0,1,0,1,0.85,0.64,1,0.16,0.71,0.24,0.21,0.72,0.45,0.23,0.1,0.63,0.57,0.13,1,1,1,1,0.65,0
0.72,1,0.25,1,0,0.29,1,0,0,0.06,1,0,0,1,1,0.31,0.41,1,0.17,0.78,0.24,0.16,0.75,0.54,0.27,0.09,0.78,0.68,0.19,1,1,1,1,0.75,0
0.56,0,0.13,1,0,0.4,1,0,0,0.23,1,0,0,1,1,0.42,1,0,0.03,0.14,0.15,0.13,0.85,0.52,0.24,0.06,0,0,0.56,1,0,0,0,0.33,0
0.67,0,0,1,0,0.57,1,0,1,0.02,0,0,0,1,1,0.38,0.36,0,0.08,0.12,0.11,0.14,0.8,0.49,0.22,0.05,0,0,0.6,1,0,0,0,0.22,0
0.67,0,0,1,0,0.36,1,0,0,0.23,0,1,0,0,0,0.32,0.73,0,0.25,0.86,0.26,0.16,0.62,0.35,0.25,0.02,0.46,0.43,0.45,1,1,1,0,0.76,0
0.55,1,0.5,1,0,0.57,0,0.5,1,0.12,1,1,1,0.67,1,1,0.45,0,0.19,0.94,0.19,0.22,0.88,0.41,0.35,0.15,0.47,0.4,0.05,1,1,1,0,0.56,1
0.61,0,0,1,0,0.43,1,0.5,1,0.04,1,0,1,0,0,0.68,0.23,1,0.12,0.68,0.25,0.29,0.68,0.45,0.29,0.13,0.58,0.41,0.11,1,1,1,1,0.74,0
0.59,1,0.25,1,0,0.23,1,0.5,0,0.02,1,1,1,0,1,0.57,0.41,1,0.08,0.05,0.16,0.15,0.87,0.61,0.25,0.04,0.67,0.61,0.45,1,1,0,0,0.65,0
0.74,1,0.5,1,0,0.26,1,0,1,0.01,1,1,1,1,0,0.76,0.36,0,0.14,0.72,0.12,0.13,0.68,0.54,0.54,0.17,0.93,0.82,0.12,1,1,0,0,0.18,0
0.64,0,0,1,0,0.29,0,0,1,0.15,0,0,1,0,1,0.33,0.45,0,0.11,0.55,0.25,0.15,0.75,0.54,0.27,0.05,0.61,0.64,0.43,1,1,0,0,0.23,1
0.36,0,0.38,1,0,0.14,0,0.5,0,0.02,1,1,1,0.33,1,0.18,0.36,0,0.17,0.79,0.21,0.12,0.75,0.54,0.24,0.05,0,0,0.52,1,0,0,0,0.44,1
0.52,0,0.75,1,0,0.14,1,0.5,0,0.04,1,1,1,0,1,0.36,0.68,1,0.08,0.34,0.12,0.13,0.79,0.59,0.22,0.02,0,0,0.5,1,0,0,0,0.23,0
0.59,0,0.75,1,0,0.29,1,0,0,0.06,1,1,0,0,1,0.24,0.27,0,0.12,0.7,0.2,0.16,0.74,0.45,0.26,0.02,0.46,0.32,0.52,1,0,0,0,0.33,0
0.72,1,0.38,1,0,0.43,0,0.5,0,0.06,1,0,1,0.67,1,0.53,0.32,0,0.2,0.68,0.16,0.13,0.79,0.45,0.25,0.09,0.61,0.57,0.15,1,1,0,0,0.22,1
Asked Sep 21 '16 by DavideChicco.it

People also ask

How does a probabilistic neural network work?

For PNN there is one pattern neuron for each category of the target variable. The actual target category of each training case is stored with each hidden neuron; the weighted value coming out of a hidden neuron is fed only to the pattern neuron that corresponds to the hidden neuron's category.
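The mechanism described above (Gaussian kernel responses summed per class) can be sketched in a few lines of numpy. This is a minimal illustration of the Parzen-window idea behind a PNN, not neupy's actual implementation:

```python
import numpy as np

def pnn_predict(X_train, y_train, X_test, std=0.1):
    """Classify each test point by averaging Gaussian (Parzen) kernel
    responses over the training points of every class and picking
    the class with the largest average response."""
    classes = np.unique(y_train)
    preds = []
    for x in X_test:
        scores = []
        for c in classes:
            members = X_train[y_train == c]
            d2 = np.sum((members - x) ** 2, axis=1)   # squared distances
            scores.append(np.mean(np.exp(-d2 / (2.0 * std ** 2))))
        preds.append(classes[int(np.argmax(scores))])
    return np.array(preds)

# two well-separated clusters, one per class
X = np.array([[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.9, 1.0]])
y = np.array([0, 0, 1, 1])
print(pnn_predict(X, y, np.array([[0.05, 0.0], [0.95, 1.0]]), std=0.3))  # → [0 1]
```

Note that if std is far too small relative to the feature scale, every kernel response underflows to zero and the argmax tie breaks toward the first class, which is one plausible way a PNN ends up predicting only zeros on unscaled data.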

How does a neural network work in Python?

With neural networks, the process is very similar: you start with some random weights and bias vectors, make a prediction, compare it to the desired output, and adjust the vectors to predict more accurately the next time.
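The predict-compare-adjust loop described above can be sketched with a single logistic neuron trained by gradient descent on a toy dataset (an illustration only; the data, learning rate, and iteration count here are arbitrary choices):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# toy data: label 1 when the feature values are large
X = np.array([[0.0, 0.1], [0.2, 0.1], [0.9, 1.0], [1.0, 0.8]])
y = np.array([0.0, 0.0, 1.0, 1.0])

rng = np.random.RandomState(0)
w = rng.randn(2) * 0.01   # random initial weights
b = 0.0                   # bias
lr = 1.0                  # learning rate

for _ in range(500):
    pred = sigmoid(X.dot(w) + b)       # make a prediction
    error = pred - y                   # compare with the desired output
    w -= lr * X.T.dot(error) / len(y)  # adjust the weight vector
    b -= lr * error.mean()             # adjust the bias

print((sigmoid(X.dot(w) + b) > 0.5).astype(int))  # → [0 0 1 1]
```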



1 Answer

I have tried your code, and it seems to me that your network does predict both classes:

import numpy as np
from sklearn.cross_validation import StratifiedKFold
from neupy.algorithms import PNN

fileName="dataset_file.csv"
TARGET_COLUMN=35

from numpy import genfromtxt
input_dataset_data = genfromtxt(fileName, delimiter=',', skip_header=0, usecols=(range(0, TARGET_COLUMN-1)))
input_dataset_target = genfromtxt(fileName, delimiter=',', skip_header=0, usecols=(TARGET_COLUMN-1))

kfold_number = 5
skfold = StratifiedKFold(input_dataset_target, kfold_number, shuffle=True)

print("> Start classify input_dataset dataset")
for std in [0.2, 0.4, 0.6, 0.8, 1]:
    average_results = []
    for i, (train, test) in enumerate(skfold, start=1):
        pnn_network = PNN(std=std, step=0.2, verbose=False, batch_size=2)
        pnn_network.train(input_dataset_data[train], input_dataset_target[train])
        predictions = pnn_network.predict(input_dataset_data[test])
        print("Positive in predictions:", 1 in predictions)
        average_results.append(np.sum(predictions == input_dataset_target[test]) / float(len(predictions)))
    print std, np.average(average_results)

An example output while tuning std:

('Positive in predictions:', True)
('Positive in predictions:', True)
('Positive in predictions:', True)
('Positive in predictions:', True)
('Positive in predictions:', True)
1 0.881558441558

That is, for std=1 the average accuracy across the folds is ~0.88, and the positive class (1) is predicted in each round of the stratified cross-validation. I used the normalized data from your edit.

Answered Oct 18 '22 by geompalik