I am trying to check whether a certain value is contained in a pandas column. I'm using df.date.isin(['07311954']), which I do not doubt is a good tool. The problem is that I have over 350K rows, and the output doesn't show all of them, so I can't see whether the value is actually present. Put simply, I just want to know (Y/N) whether a specific value is contained in a column. My code follows:
import numpy as np
import pandas as pd
import glob

df = pd.read_csv(
    '/home/jayaramdas/anaconda3/Thesis/FEC_data/itpas2_data/itpas214.txt',
    sep='|', header=None, low_memory=False,
    names=['1', '2', '3', '4', '5', '6', '7',
           '8', '9', '10', '11', '12', '13', 'date', '15', '16', '17', '18', '19', '20',
           '21', '22'])

df.date.isin(['07311954'])
To find duplicates in a specific column, call the duplicated() method on that column. The result is a boolean Series in which True marks a duplicate, that is, an entry identical to an earlier one.
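As a minimal sketch of the duplicated() call described above, the following uses a small made-up DataFrame (the sample values are assumptions for illustration, only the column name date comes from the question):

```python
import pandas as pd

# Hypothetical sample data; only the column name "date" mirrors the question.
df = pd.DataFrame({"date": ["07311954", "01011960", "07311954", "12251970"]})

# True marks a row whose value already appeared in an earlier row.
dup_mask = df["date"].duplicated()

# Rows that repeat an earlier value.
repeated_rows = df[dup_mask]

# Collapse the mask to a single yes/no answer.
has_duplicates = bool(dup_mask.any())
print(has_duplicates)
```

Only the second occurrence is flagged here, because duplicated() keeps the first occurrence by default.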
You can also determine whether a pandas column contains a particular value using Series.str.contains(). This function tests whether a pattern or regex is contained within each string of a Series or Index.
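A short sketch of Series.str.contains() on made-up string data follows. Note that it matches substrings or regex patterns inside each string, so it is a looser test than exact equality:

```python
import pandas as pd

# Hypothetical sample values, stored as strings.
dates = pd.Series(["07311954", "01011960", "12311954"])

# Literal substring test: regex=False takes the pattern as plain text.
contains_1954 = dates.str.contains("1954", regex=False)

# Regex test: "^" anchors the match to the start of each string.
starts_0731 = dates.str.contains(r"^0731")

# Single yes/no answer: does any entry contain the substring?
found = bool(contains_1954.any())
print(found)
```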
Code 1: Find duplicate columns in a DataFrame. To find duplicate columns, iterate over all columns of the DataFrame and, for each one, check whether any other column has the same contents. If so, add that column's name to the set of duplicate columns.
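The iteration described above could be sketched roughly as follows. The helper name find_duplicate_columns and the sample data are hypothetical, and the sketch assumes column names are unique:

```python
import pandas as pd


def find_duplicate_columns(frame):
    """Return the names of columns whose contents equal an earlier column's."""
    duplicates = set()
    columns = list(frame.columns)
    for i, first in enumerate(columns):
        if first in duplicates:
            continue  # already known to copy an earlier column
        for other in columns[i + 1:]:
            # Series.equals requires the same dtype and the same values.
            if frame[first].equals(frame[other]):
                duplicates.add(other)
    return duplicates


# Hypothetical sample data: column "c" repeats column "a".
df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6], "c": [1, 2, 3]})
duplicate_names = find_duplicate_columns(df)
print(duplicate_names)
```

Comparing every pair of columns is quadratic in the number of columns, which is usually acceptable for a frame with a few dozen columns.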
You can simply use this:
'07311954' in df.date.values
which returns True or False.
Here is a further explanation:
In pandas, using in directly on a DataFrame or Series (e.g. val in df or val in series) checks whether val is contained in the index labels (the row labels of a Series, the column labels of a DataFrame), not in the values.
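A small sketch of this behaviour, using made-up data (the index labels and values are assumptions for illustration):

```python
import pandas as pd

# Hypothetical Series with integer index labels and string values.
series = pd.Series(["x", "y", "z"], index=[10, 20, 30])
df = pd.DataFrame({"col_name": ["x", "y", "z"]})

label_in_series = 10 in series    # 10 is an index label
value_in_series = "x" in series   # "x" is a value, not a label

column_in_df = "col_name" in df   # for a DataFrame, "in" looks at column labels
value_in_df = "x" in df           # "x" is a cell value, not a column label

print(label_in_series, value_in_series, column_in_df, value_in_df)
```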
However, you can still use in to check the values instead of the index: just write val in df.col_name.values or val in series.values. This way, you are actually checking val against a NumPy array.
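As a minimal sketch of the values check, assuming the date column holds strings (the sample rows are made up):

```python
import pandas as pd

# Hypothetical sample data; the column is stored as strings.
df = pd.DataFrame({"date": ["07311954", "01011960", "12251970"]})

# Membership test against the column's underlying array of values.
present = "07311954" in df.date.values
absent = "99999999" in df.date.values

print(present, absent)
```

If read_csv inferred an integer type for the column, the leading zero would be lost and the string '07311954' would not match, so it may be worth checking df.date.dtype first.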
And .isin(vals) works the other way around: it checks whether each of the DataFrame/Series values is in vals. Here vals must be a set or list-like. So this is not the most natural way to answer the question.