Normalization VS. numpy way to normalize?

I'm supposed to normalize an array. I've read about normalization and come across a formula:

[Image: the min-max formula, x' = (x - min(x)) / (max(x) - min(x))]

I wrote the following function for it:

def normalize_list(list):
    max_value = max(list)
    min_value = min(list)
    for i in range(0, len(list)):
        list[i] = (list[i] - min_value) / (max_value - min_value)

That is supposed to normalize an array of elements.

Then I came across this answer: https://stackoverflow.com/a/21031303/6209399, which says you can normalize an array by simply doing this:

import numpy as np

def normalize_list_numpy(list):
    normalized_list = list / np.linalg.norm(list)
    return normalized_list

If I normalize this test array test_array = [1, 2, 3, 4, 5, 6, 7, 8, 9] with my own function and with the numpy method, I get these answers:

My own function: [0.0, 0.125, 0.25, 0.375, 0.5, 0.625, 0.75, 0.875, 1.0]
The numpy way: [0.059234887775909233, 0.11846977555181847, 0.17770466332772769, 0.23693955110363693, 0.29617443887954614, 0.35540932665545538, 0.41464421443136462, 0.47387910220727386, 0.5331139899831830

Why do the two functions give different answers? Are there other ways to normalize an array of data? What does numpy.linalg.norm(list) do? What am I getting wrong?

asked Oct 24 '17 16:10

OuuGiii

1 Answer

The question/answer that you reference doesn't explicitly relate your own formula to the np.linalg.norm(list) version that you use here.

One NumPy solution would be this:

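import numpy as np

test_array = [1, 2, 3, 4, 5, 6, 7, 8, 9]

def normalize(x):
    # Min-max scaling: map the smallest value to 0 and the largest to 1.
    x = np.asarray(x)
    return (x - x.min()) / np.ptp(x)

print(normalize(test_array))
# [ 0.     0.125  0.25   0.375  0.5    0.625  0.75   0.875  1.   ]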

Here np.ptp is "peak-to-peak", i.e.:

Range of values (maximum - minimum) along an axis.

This approach scales the values to the interval [0, 1] as pointed out by @phg.
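
One edge case worth noting: if every element is the same, np.ptp(x) is 0 and the division is undefined. A minimal sketch of a guarded version follows. The name normalize_safe and the choice to return all zeros for a constant array are assumptions made for illustration, not part of NumPy:

```python
import numpy as np

def normalize_safe(x):
    """Min-max scale x to [0, 1]; hypothetical helper, not a NumPy function."""
    x = np.asarray(x, dtype=float)
    value_range = np.ptp(x)  # max - min
    if value_range == 0:
        # Assumption: map a constant array to all zeros instead of dividing by 0.
        return np.zeros_like(x)
    return (x - x.min()) / value_range

scaled = normalize_safe([1, 2, 3, 4, 5, 6, 7, 8, 9])
constant = normalize_safe([3, 3, 3])
print(scaled)
print(constant)
```

Whether zeros, NaN, or an exception is the right answer for constant input depends on what the downstream code expects.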

A more traditional definition of normalization, often called standardization or z-scoring, is to rescale to zero mean and unit variance:

x = np.asarray(test_array)
res = (x - x.mean()) / x.std()
print(res.mean(), res.std())
# 0.0 1.0

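Or use a pre-canned function from sklearn.preprocessing: scale (or StandardScaler) for zero mean and unit variance, or minmax_scale (or MinMaxScaler) for the [0, 1] interval. Note that sklearn.preprocessing.normalize does something else: it rescales each sample to unit norm, like the np.linalg.norm approach.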

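By default, np.linalg.norm(test_array) returns the Euclidean (L2) norm of a 1-D array, that is, the square root of the sum of the squared elements. Using test_array / np.linalg.norm(test_array) therefore creates a result that is of unit length; you'll see that np.linalg.norm(test_array / np.linalg.norm(test_array)) equals 1 (up to floating-point rounding). So you're talking about two different fields here: min-max scaling and standardization come from statistics, while dividing by the norm comes from linear algebra.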
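
To make the difference concrete, here is a small sketch that puts the two operations side by side on the same test array. The variable names are my own, and it only checks the properties described above:

```python
import numpy as np

x = np.asarray([1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=float)

# For a 1-D array, np.linalg.norm defaults to the Euclidean (L2) norm.
l2_norm = np.linalg.norm(x)
manual_norm = np.sqrt(np.sum(x ** 2))  # sqrt(285) for this array

unit_vector = x / l2_norm             # linear algebra: vector length becomes 1
min_max = (x - x.min()) / np.ptp(x)   # statistics: values span [0, 1]

print(np.isclose(l2_norm, manual_norm))
print(np.isclose(np.linalg.norm(unit_vector), 1.0))
print(min_max.min(), min_max.max())
```

The unit-length version never reaches 0 or 1 for this input because it only divides by a constant; it does not shift the minimum to 0.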

answered Sep 29 '22 01:09

Brad Solomon

Tags: python, numpy, normalization
