Check whether a DataFrame or ndarray contains digits

Tags: python, pandas, dataframe, numpy

I have been stuck on this for a few hours. I have a DataFrame containing a list of email addresses, and I want to check whether each address contains a number, e.g. [email protected]. If it does, I want that number to be appended to an array.

I have tried both with a DataFrame and with a NumPy ndarray, but neither works. This is what I am trying to do:

mail_addresses = pd.DataFrame(customers_df.iloc[:,0].values)
mail_addresses = mail_addresses.dropna(axis = 0, how= 'all')
mail_addresses_toArray = mail_addresses.values

for i in mail_addresses:
    dates =[]
    if any(i.isdigit()) == True:
        dates.append(i)
        print(dates)

I think my problem is that I don't know how to convert all the elements in this array to strings, so that isdigit() works, and how to iterate through all the elements (825 mail addresses).

When running the code above, this is the error I get:

AttributeError: 'numpy.int64' object has no attribute 'isdigit'

Meanwhile, if I try with the NumPy array (mail_addresses_toArray), this is the error:

AttributeError: 'numpy.ndarray' object has no attribute 'isdigit'
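
For context, a minimal sketch of where these two errors most likely come from. The addresses below are made up, and the frame is only a stand-in for the one built above. Iterating over a DataFrame walks its column labels, and looping over the 2-D .values array walks row arrays, so in neither case does the loop see a string. Note also that any() needs an iterable, so the digit check is written per character here.

```python
import pandas as pd

# Hypothetical stand-in for the one-column frame built in the question.
mail_addresses = pd.DataFrame(["anna123@example.com", "bob@example.com"])

# Iterating over a DataFrame yields its column labels (here the integer 0),
# not the cell values, so the loop variable has no isdigit() method.
labels = list(mail_addresses)

# .values on a one-column frame is 2-D, so looping over it yields row arrays.
shape = mail_addresses.values.shape

# Selecting the column as a Series and casting to str gives plain strings,
# which can then be checked character by character.
column = mail_addresses.iloc[:, 0].astype(str)
with_digits = [m for m in column if any(ch.isdigit() for ch in m)]

print(labels, shape, with_digits)
```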

asked Apr 11 '18 11:04

roberto.sannazzaro

2 Answers

Use extract if each address contains only one number, or findall if there may be several:

import numpy as np
import pandas as pd

customers_df = pd.DataFrame({'A':['[email protected]','[email protected]',
                                  '[email protected]','[email protected]'],
                             'B':[4,5,4,5],
                             'C':[7,8,9,4]})

print (customers_df)
                        A  B  C
0  [email protected]  4  7
1          [email protected]  5  8
2             [email protected]  4  9
3           [email protected]  5  4

# first run of digits in each address; rows without digits become NaN and are dropped
L = customers_df.iloc[:,0].str.extract(r'(\d+)', expand=False).dropna().astype(int).tolist()
print (L)
[123, 123, 23]

# every run of digits in each address, flattened into one list
L = np.concatenate(customers_df.iloc[:,0].str.findall(r'(\d+)')).astype(int).tolist()
print (L)
[123, 123, 23, 55]
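
To make the difference between the two calls concrete, here is a small sketch on made-up addresses (the Series name and its values are assumptions for illustration, not the data above). It uses explode instead of np.concatenate to flatten the findall result, which is a swapped-in alternative rather than the approach shown above.

```python
import pandas as pd

# Hypothetical addresses, invented for illustration only.
mails = pd.Series([
    "anna123@example.com",
    "bob@example.com",
    "carl45x67@example.com",
])

# extract: only the first run of digits per row, NaN where there is none.
first = mails.str.extract(r"(\d+)", expand=False).dropna().astype(int).tolist()

# findall: every run of digits per row, as a list per row.
# explode puts each list item on its own row; empty lists become NaN.
every = mails.str.findall(r"\d+").explode().dropna().astype(int).tolist()

print(first)
print(every)
```

With this data, the third address contributes one number to first and two numbers to every.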

answered Oct 03 '22 10:10

jezrael

Here is one way.

import pandas as pd

df = pd.DataFrame({'A': ['[email protected]', '[email protected]',
                         '[email protected]', None]})

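# drop missing addresses before processing
s = df['A'].dropna()

# keep only the digit characters of each address, joined into one string
t = s.map(lambda x: ''.join([i for i in x if i.isdigit()]).strip())
# discard addresses with no digits and convert the rest to int
res = t.loc[t != ''].map(int).tolist()

# [123, 43]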
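
If only a yes/no check per address is needed, a boolean mask may be enough. This is a sketch on made-up addresses (the names and values are assumptions), using str.contains with a digit pattern; na=False treats missing addresses as having no digits.

```python
import pandas as pd

# Hypothetical addresses, invented for illustration only.
mails = pd.Series(["anna123@example.com", "bob@example.com", None])

# True where the address contains at least one digit; missing values count as False.
has_digit = mails.str.contains(r"\d", na=False)

# Keep only the addresses that contain a digit.
with_digits = mails[has_digit].tolist()

print(has_digit.tolist())
print(with_digits)
```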

answered Oct 03 '22 09:10

jpp
