
Converting all non-numeric to 0 (zero) in Python

Tags:

python

I'm looking for the easiest way to convert all non-numeric data (including blanks) in Python to zeros. Take the following, for example:

someData = [[1.0,4,'7',-50],['8 bananas','text','',12.5644]]

I would like the output to be as follows:

desiredData = [[1.0,4,7,-50],[0,0,0,12.5644]]

So '7' should be 7, but '8 bananas' should be converted to 0.

user1882017 asked Sep 20 '15 14:09



2 Answers

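import numbers

def mapped(x):
    # Values that are already numeric (int, float, Decimal, ...) pass through.
    if isinstance(x, numbers.Number):
        return x
    # Try int first so '7' becomes 7, then float for strings like '12.5'.
    for tpe in (int, float):
        try:
            return tpe(x)
        except ValueError:
            continue
    # Nothing parsed: treat the value as non-numeric.
    return 0

# Replace the contents of each sublist in place.
for sub in someData:
    sub[:] = map(mapped, sub)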

print(someData)
[[1.0, 4, 7, -50], [0, 0, 0, 12.5644]]

It will work for different numeric types:

In [4]: from decimal import Decimal

In [5]: someData = [[1.0,4,'7',-50 ,"99", Decimal("1.5")],["foobar",'8 bananas','text','',12.5644]]

In [6]: for sub in someData:
   ...:         sub[:] = map(mapped,sub)
   ...:     

In [7]: someData
Out[7]: [[1.0, 4, 7, -50, 99, Decimal('1.5')], [0, 0, 0, 0, 12.5644]]

The isinstance(x, numbers.Number) check catches elements that are already ints, floats, etc. and returns them unchanged. If the element is not a numeric type, we first try casting it to int, then to float; if neither succeeds, we simply return 0.
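
To make the int-then-float fallthrough concrete, here is a minimal sketch of the same idea applied to a few edge cases. The name to_number and its default parameter are hypothetical additions, not part of the answer above, and catching TypeError is an assumption made here so that None is also treated as non-numeric.

```python
import numbers

def to_number(x, default=0):
    """Return x as a number, or `default` if it cannot be parsed."""
    # Already numeric: return unchanged.
    if isinstance(x, numbers.Number):
        return x
    # int first keeps '7' as 7 rather than 7.0; float handles the rest.
    for cast in (int, float):
        try:
            return cast(x)
        except (ValueError, TypeError):  # TypeError covers None and similar
            continue
    return default

row = ['7', '7.5', '1e3', ' 42 ', None, '8 bananas', '']
print([to_number(v) for v in row])
# [7, 7.5, 1000.0, 42, 0, 0, 0]
```

Because float() does the parsing, strings in scientific notation such as '1e3' are accepted, and so are 'nan' and 'inf'; filter those out separately if they should count as non-numeric.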

Padraic Cunningham answered Sep 21 '22 17:09


Another solution, using regular expressions:

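import re

def toNumber(e):
    # Leave values that are not strings (already numeric) untouched.
    if not isinstance(e, str):
        return e
    # Decimal number such as '-12.5'.
    if re.match(r"^-?\d+?\.\d+?$", e):
        return float(e)
    # Integer such as '7' or '-50'.
    if re.match(r"^-?\d+?$", e):
        return int(e)
    return 0

someData = [[1.0,4,'7',-50],['8 bananas','text','',12.5644]]
# Build real lists; in Python 3 a bare map() would print as a map object.
someData = [[toNumber(e) for e in row] for row in someData]
print(someData)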

You get:

[[1.0, 4, 7, -50], [0, 0, 0, 12.5644]]

Note: this does not work for numbers in scientific notation.
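
If scientific notation matters, one possible extension is to widen the float pattern with an optional exponent. This is only a sketch: the names to_number_sci, INT_RE and FLOAT_RE are hypothetical, and the patterns are an assumption about which spellings should count as numbers (an optional sign, digits with an optional decimal point, an optional e/E exponent).

```python
import re

# Whole-string patterns; used with fullmatch, so no ^ or $ is needed.
INT_RE = re.compile(r"[+-]?\d+")
FLOAT_RE = re.compile(r"[+-]?(?:\d+\.?\d*|\.\d+)(?:[eE][+-]?\d+)?")

def to_number_sci(e):
    # Non-strings are assumed to be numeric already.
    if not isinstance(e, str):
        return e
    # Check the integer pattern first so '7' stays an int.
    if INT_RE.fullmatch(e):
        return int(e)
    if FLOAT_RE.fullmatch(e):
        return float(e)
    return 0

values = ['7', '-2.5', '1e3', '6.02E+23', '.5', '8 bananas', '', 12.5644]
print([to_number_sci(v) for v in values])
```

Here '1e3' and '6.02E+23' are converted to floats, while '8 bananas' and the empty string still fall through to 0.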

Jose Ricardo Bustos M. answered Sep 25 '22 17:09