Filter Rows by Condition
You can use df[df["Courses"] == 'Spark'] to filter rows by a condition in a pandas DataFrame. Note that this expression returns a new DataFrame containing the selected rows.
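As a minimal sketch of that boolean-mask filter: only the "Courses" column name comes from the text above; the "Fee" column and all the values are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data; only the "Courses" column name comes from the text.
df = pd.DataFrame({
    "Courses": ["Spark", "PySpark", "Spark", "Pandas"],
    "Fee": [20000, 25000, 22000, 30000],
})

mask = df["Courses"] == "Spark"   # boolean Series, one entry per row
spark_rows = df[mask]             # new DataFrame; df itself is left unchanged

print(spark_rows)
```

The mask is an ordinary Series, so it can be stored, inspected, or combined with other masks before indexing.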
If you need to extract data that matches a regex pattern from a column in a pandas DataFrame, you can use the Series.str.extract method.
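A small sketch of Series.str.extract, assuming a hypothetical "code" column and a pattern with one capture group. Rows with no match (or with a missing value) come back as NaN.

```python
import pandas as pd

# Hypothetical data: the column name and values are illustrative only.
df_codes = pd.DataFrame({"code": ["item-123", "item-45", "no digits", None]})

# With a single capture group and expand=False, extract returns a Series
# holding the captured text, or NaN where the pattern did not match.
df_codes["number"] = df_codes["code"].str.extract(r"(\d+)", expand=False)

print(df_codes)
```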
Slicing Rows and Columns by Index Position
When slicing by index position in pandas, the start index is included in the output, but the stop index is excluded, so it must be one past the last row you want to select. A slice such as df.iloc[0:2, :] therefore returns row 0 and row 1, but does not return row 2. The second slice [:] indicates that all columns are required.
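To illustrate the half-open slice, here is a short sketch using iloc on a made-up four-row frame (the data is an assumption, not from the original text):

```python
import pandas as pd

# Hypothetical four-row frame used only to demonstrate positional slicing.
df = pd.DataFrame({"a": [1, 2, 3, 4], "b": ["hi", "foo", "fat", "cat"]})

# Rows at positions 0 and 1 (stop position 2 is excluded), all columns.
first_two = df.iloc[0:2, :]

print(first_two)
```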
Use contains instead:
In [10]: df.b.str.contains('^f')
Out[10]:
0    False
1     True
2     True
3    False
Name: b, dtype: bool
There is already a string-handling function for this: Series.str.startswith().
You should try foo[foo.b.str.startswith('f')].
Result:
   a    b
1  2  foo
2  3  fat
I think this is what you expect.
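For anyone who wants to reproduce this, here is a self-contained sketch. Rows 1 and 2 match the output shown above; the values in rows 0 and 3 are my assumption, since they are not shown in the answer.

```python
import pandas as pd

# Rows 1 and 2 match the output shown above; rows 0 and 3 are assumed values.
foo = pd.DataFrame({"a": [1, 2, 3, 4], "b": ["hi", "foo", "fat", "cat"]})

by_prefix = foo[foo.b.str.startswith("f")]   # plain prefix test, no regex
by_regex = foo[foo.b.str.contains("^f")]     # same rows via an anchored regex

print(by_prefix)
print(by_prefix.equals(by_regex))
```

When the test is just a literal prefix, startswith avoids regex escaping issues altogether.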
Alternatively you can use contains with regex option. For example:
foo[foo.b.str.contains('oo', regex=True, na=False)]
Result:
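   a    b
1  2  foo
na=False prevents errors when the column contains NaN, None or other missing values. Without it, those rows can come back as missing in the mask, and a mask with missing values cannot be used for indexing.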
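A quick sketch of na=False on a column that contains a missing value. The frame here is hypothetical; the point is only that the resulting mask is strictly True/False and safe to index with.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with one missing value in column "b".
foo = pd.DataFrame({"a": [1, 2, 3], "b": ["hi", "foo", np.nan]})

# na=False makes the missing row count as "no match".
safe_mask = foo.b.str.contains("oo", regex=True, na=False)
result = foo[safe_mask]

print(result)
```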
It may be a bit late, but this is now easier to do in pandas by calling Series.str.match. The docs explain the difference between match, fullmatch and contains.
Note that in order to use the results for indexing, set the na=False argument (or na=True if you want to include NaNs in the results).
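Roughly, contains behaves like re.search, match like re.match, and fullmatch like re.fullmatch. A small sketch on made-up strings to show the difference:

```python
import pandas as pd

# Illustrative values only.
s = pd.Series(["foo", "afoo", "foobar", None])

anywhere = s.str.contains("foo", na=False)    # pattern found anywhere
at_start = s.str.match("foo", na=False)       # pattern anchored at the start
whole = s.str.fullmatch("foo", na=False)      # pattern must cover the whole string

print(pd.DataFrame({"value": s, "contains": anywhere,
                    "match": at_start, "fullmatch": whole}))
```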
Multiple-column search on a DataFrame (use a raw string and escape the backslashes in the Windows path):
frame[frame.filename.str.match('.*' + MetaData + '.*') & frame.file_path.str.match(r'C:\\test\\test\.txt')]
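Here is a hedged, runnable variant of the multi-column filter. The data and the value of MetaData are invented for the example, and I am using re.escape so that the backslashes and dots are treated literally, which may or may not be what the original author intended.

```python
import re
import pandas as pd

# Hypothetical data; only the column names follow the snippet above.
frame = pd.DataFrame({
    "filename": ["report_2020.csv", "notes.txt", "report_2021.csv"],
    "file_path": [r"C:\test\test.txt", r"C:\test\test.txt", r"D:\other\x.txt"],
})
MetaData = "report"  # assumed to be a plain substring, not a regex

# re.escape turns literal text into a pattern that matches it exactly.
name_mask = frame.filename.str.contains(re.escape(MetaData), na=False)
path_mask = frame.file_path.str.fullmatch(re.escape(r"C:\test\test.txt"), na=False)

selected = frame[name_mask & path_mask]
print(selected)
```

Building each mask separately makes it easier to check which condition is dropping a row.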
Building on the great answer by user3136169, here is an example of how that might be done while also dropping None values.
import re

def regex_filter(val):
    # `regex` is assumed to be a pattern string defined elsewhere
    if val:
        mo = re.search(regex, val)
        if mo:
            return True
        else:
            return False
    else:
        # None (or an empty string) never matches
        return False

df_filtered = df[df['col'].apply(regex_filter)]
You can also pass the regex as an argument (the keyword must match the function's parameter name):
def regex_filter(val, myregex):
    ...
df_filtered = df[df['col'].apply(regex_filter, myregex=myregex)]
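A complete sketch of the parameterised version, under a couple of assumptions: the sample data and pattern are made up, and I swapped the truthiness check for an isinstance test so that NaN values (which are truthy floats) are skipped as well as None.

```python
import re
import pandas as pd

def regex_filter(val, myregex):
    # Non-string values (None, NaN) never match.
    return isinstance(val, str) and re.search(myregex, val) is not None

# Hypothetical data and pattern for illustration.
df = pd.DataFrame({"col": ["foo", None, "bar", "food"]})
myregex = r"^fo"

# Extra keyword arguments to Series.apply are forwarded to the function.
df_filtered = df[df["col"].apply(regex_filter, myregex=myregex)]
print(df_filtered)
```

Passing args=(myregex,) instead of the keyword works too. For large frames the vectorised str.contains with na=False is usually the simpler choice.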
Write a Boolean function that checks the regex and use apply on the column:
foo[foo['b'].apply(regex_function)]