
Pandas Filtering: Returning True/False vs the Actual Value

My dataframe:

df_all_xml_mfiles_tgther

      file_names     searching_for                                 everything
0          a.txt             where              Dave Ran Away. Where is Dave?
1          a.txt             candy                                mmmm, candy
2          b.txt              time                We are looking for the book.
3          b.txt             where                   where the red fern grows

My problem:

I am trying to filter for records whose text contains the words in my search criteria. I need to go through one record at a time and return the actual record text instead of just the value True.

What I have tried:

search_content_array = ['where', 'candy', 'time']
file_names_only = ['a.txt', 'b.txt']


for cc in range(0, len(file_names_only), 1):
     for bb in range(0, len(search_content_array), 1):

            regex_stuff = df_all_xml_mfiles_tgther[cc:cc+1].everything.str.contains(search_content_array[bb], flags=re.IGNORECASE, na=False, regex=True)

            if not regex_stuff.empty:
                 regex_stuff_new = pd.DataFrame([regex_stuff.rename(None)])
                 regex_stuff_new.columns = ['everything']
                 regex_stuff_new['searched_for_found'] = search_content_array[bb]
                 regex_stuff_new['file_names'] = file_names_only[cc]

            regex_stuff_new = regex_stuff_new[['file_names', 'searched_for_found', 'everything']] ##This rearranges the columns

            df_regex_test =  df_regex_test.append(regex_stuff_new, ignore_index=True, sort=False)

The results I am getting are this:

    file_names  searched_for_found  everything
0        a.txt               where        True
1        a.txt               candy        True
2        b.txt               where        True

The results I want are this:

    file_names  searched_for_found                           everything
0        a.txt               where        Dave Ran Away. Where is Dave?
1        a.txt               candy                          mmmm, candy
3        b.txt               where             where the red fern grows

How do I get the actual value for returned results instead of just true/false?

Chicken Sandwich No Pickles asked Nov 17 '25 18:11


2 Answers

Do this elementwise using a list comprehension.

df[[y.lower() in x.lower() for x, y in zip(df['everything'], df['searching_for'])]]

Or,

df[[y.lower() in x.lower() 
    for x, y in df[['everything', 'searching_for']].values.tolist()]]


  file_names searching_for                     everything
0      a.txt         where  Dave Ran Away. Where is Dave?
1      a.txt         candy                    mmmm, candy
3      b.txt         where       where the red fern grows
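
For reference, here is a self-contained sketch of the list-comprehension approach above, with the sample frame rebuilt from the question. The names `mask` and `matches` are illustrative and not part of the original answer. The point it shows is that the comprehension only produces True/False values, and indexing the frame with them is what returns the actual rows.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "file_names": ["a.txt", "a.txt", "b.txt", "b.txt"],
    "searching_for": ["where", "candy", "time", "where"],
    "everything": [
        "Dave Ran Away. Where is Dave?",
        "mmmm, candy",
        "We are looking for the book.",
        "where the red fern grows",
    ],
})

# One True/False per row: does this row's search term occur in this row's text?
# Lowercasing both sides makes the substring test case-insensitive.
mask = [
    term.lower() in text.lower()
    for text, term in zip(df["everything"], df["searching_for"])
]

# Indexing with the boolean list keeps the matching rows (and their original
# index labels) instead of returning the booleans themselves.
matches = df[mask]
print(matches)
```

Because this is a plain substring test, regex metacharacters in a search term are matched literally.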
cs95 answered Nov 19 '25 07:11


Another option is to use replace together with str.contains. (PS: I think cold's method is more succinct.)

s=df.everything.replace(regex=r'(?i)'+ df.searching_for,value='OkIFINDIT')
df[s.str.contains('OkIFINDIT')]
Out[405]: 
  file_names searching_for                  everything
0      a.txt         where Dave Ran Away Where is Dave
1      a.txt         candy                  mmmm,candy
3      b.txt         where    where the red fern grows
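
If a regex-based check per row is preferred, one possible sketch pairs each row's own search term with its own text using `re.search`. This is not the code from the answer above. `row_matches` is a hypothetical helper name, and escaping the term with `re.escape` assumes the search terms should be matched literally rather than as patterns.

```python
import re

import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "file_names": ["a.txt", "a.txt", "b.txt", "b.txt"],
    "searching_for": ["where", "candy", "time", "where"],
    "everything": [
        "Dave Ran Away. Where is Dave?",
        "mmmm, candy",
        "We are looking for the book.",
        "where the red fern grows",
    ],
})


def row_matches(row):
    """Return True if this row's search term occurs in this row's text."""
    # Assumption: terms are literal text, so escape any regex metacharacters.
    pattern = re.escape(row["searching_for"])
    return re.search(pattern, row["everything"], flags=re.IGNORECASE) is not None


# apply(axis=1) calls row_matches once per row and yields a boolean Series.
mask = df.apply(row_matches, axis=1)

# Use the boolean Series to select the matching rows.
matches = df[mask]
print(matches)
```

Dropping the `re.escape` call would let the search terms act as real regular expressions, at the cost of errors or surprises when a term contains characters such as `(` or `.`.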
BENY answered Nov 19 '25 07:11


