Element-wise test of whether a NumPy array is numeric

I have an array like the following:

In [1]: x = array(['1.2', '2.3', '1.2.3'])

I want to test whether each element of the array can be converted to a numeric value. That is, a function is_numeric(x) should return a True/False array like the following:

In [2]: is_numeric(x)
Out[2]: array([True, True, False])

How can I do this?

asked Jun 23 '16 15:06

Wei Li

2 Answers

I find the following works well for my purpose.

First, save the isNumeric function from https://rosettacode.org/wiki/Determine_if_a_string_is_numeric#C in a file called ctest.h, then create a .pyx file as follows:

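import numpy as np
cimport numpy as np
from numpy cimport ndarray, uint8_t

# isNumeric is the C helper saved in ctest.h (see the Rosetta Code link above)
cdef extern from "ctest.h":
    int isNumeric(const char * s)

def is_numeric_elementwise(ndarray x):
    # Assumes a 1-D array whose elements coerce to char* (byte strings);
    # on Python 3, str elements would probably need to be encoded to bytes first.
    cdef Py_ssize_t i
    # uint8 buffer for the per-element C results, converted to bool on return
    cdef ndarray[uint8_t, mode='c', cast=True] y = np.empty_like(x, dtype=np.uint8)

    for i in range(x.size):
        y[i] = isNumeric(x[i])

    return y > 0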

The above Cython function runs quite fast.

In [4]: is_numeric_elementwise(array(['1.2', '2.3', '1.2.3']))
Out[4]: array([ True,  True, False], dtype=bool)

In [5]: %timeit is_numeric_elementwise(array(['1.2', '2.3', '1.2.3'] * 1000000))
1 loops, best of 3: 695 ms per loop

Compared with the is_numeric_3 method in https://stackoverflow.com/a/37997673/4909242, it is about 5 times faster.

In [6]: %timeit is_numeric_3(array(['1.2', '2.3', '1.2.3'] * 1000000))
1 loops, best of 3: 3.45 s per loop

There is probably still room for improvement.

answered Sep 18 '22 13:09

Wei Li

import numpy as np

def is_float(val):
    """Return True if float() can parse val, otherwise False."""
    try:
        float(val)
    except ValueError:
        return False
    else:
        return True

a = np.array(['1.2', '2.3', '1.2.3'])

# list(...) is needed on Python 3, where map() returns a lazy iterator
is_numeric_1 = lambda x: list(map(is_float, x))            # returns a Python list
is_numeric_2 = lambda x: np.array(list(map(is_float, x)))  # returns a NumPy array
is_numeric_3 = np.vectorize(is_float, otypes=[bool])       # returns a NumPy array

Depending on the size of the array and the type of the return value, these functions run at different speeds.

In [26]: %timeit is_numeric_1(a)
100000 loops, best of 3: 2.34 µs per loop

In [27]: %timeit is_numeric_2(a)
100000 loops, best of 3: 3.13 µs per loop

In [28]: %timeit is_numeric_3(a)
100000 loops, best of 3: 6.7 µs per loop

In [29]: a = np.array(['1.2', '2.3', '1.2.3']*1000)

In [30]: %timeit is_numeric_1(a)
1000 loops, best of 3: 1.53 ms per loop

In [31]: %timeit is_numeric_2(a)
1000 loops, best of 3: 1.6 ms per loop

In [32]: %timeit is_numeric_3(a)
1000 loops, best of 3: 1.58 ms per loop
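The timings above come from IPython's %timeit. Outside IPython, a rough equivalent can be sketched with the standard timeit module. The helper name time_call and its default repeat counts are assumptions made for this illustration, not part of the original answer, and the numbers it prints will differ by machine and Python version.

```python
import timeit

import numpy as np


def is_float(val):
    """Return True if float() can parse val, otherwise False."""
    try:
        float(val)
    except ValueError:
        return False
    return True


is_numeric_3 = np.vectorize(is_float, otypes=[bool])


def time_call(func, arr, number=10, repeat=3):
    """Hypothetical helper: best per-call time in seconds.

    Runs `number` calls per measurement, takes `repeat` measurements,
    and reports the fastest one divided by `number`.
    """
    timer = timeit.Timer(lambda: func(arr))
    return min(timer.repeat(repeat=repeat, number=number)) / number


if __name__ == "__main__":
    for n in (1, 1000):
        a = np.array(['1.2', '2.3', '1.2.3'] * n)
        print(n, time_call(is_numeric_3, a))
```

Taking the minimum over several repeats mirrors the "best of 3" figure that %timeit reports here.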

If a list is acceptable, use is_numeric_1.

If you want a NumPy array and a is small, use is_numeric_2.

Otherwise, use is_numeric_3.
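As a further option, here is a minimal sketch that builds the boolean array with np.fromiter and keeps the input's shape. The name is_numeric follows the question; catching TypeError as well as ValueError is an assumption added so that non-string entries such as None count as non-numeric. Note that float() also accepts strings like 'nan' and '1e5', so they count as numeric under this test.

```python
import numpy as np


def is_float(val):
    """Return True if float() can parse val, otherwise False."""
    try:
        float(val)
    except (ValueError, TypeError):  # TypeError covers None and similar (assumption)
        return False
    return True


def is_numeric(x):
    """Return a boolean array with the same shape as x."""
    arr = np.asarray(x)
    # Test each element lazily and collect the results into a flat bool array
    flat = np.fromiter((is_float(v) for v in arr.ravel()), dtype=bool)
    return flat.reshape(arr.shape)


if __name__ == "__main__":
    print(is_numeric(np.array(['1.2', '2.3', '1.2.3'])))
```

This has not been timed against the three variants above, so treat it as an alternative in style rather than a faster one.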

answered Sep 19 '22 13:09

dragon2fly

Tags: python, arrays, numpy
