Python Regex Match Before Character AND Ignore White Space

Tags:

python

regex

I'm trying to write a regex to match part of a string that comes before '/' but also ignores any leading or trailing white space within the match.

So far I've got ^[^\/]* which matches everything before the '/' but I can't figure out how to ignore the white space.

      123 / some text 123

should yield

123

and

     a test / some text 123

should yield

a test

448

asked May 17 '19 20:05

harryk

4 Answers

That's a little bit tricky. Start matching at the first non-whitespace character, then continue lazily up to the position that is immediately followed by an optional run of spaces and a slash:

\S.*?(?= *\/)

If a slash could be the first non-whitespace character in the input string, replace \S with [^\s\/]:

[^\s\/].*?(?= *\/)
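
As a rough illustration, the first pattern can be tried from Python with re.search. This is only a sketch: the helper name before_slash and the sample strings are made up for the example, and the slash is left unescaped because Python does not need it escaped.

```python
import re

# Lazy match from the first non-whitespace character up to the point
# that is followed by optional spaces and a slash.
PATTERN = re.compile(r"\S.*?(?= */)")


def before_slash(text):
    """Return the trimmed text before the first '/', or None if nothing matches.

    before_slash is a hypothetical helper name used only for this sketch.
    """
    match = PATTERN.search(text)
    return match.group(0) if match else None


for sample in ("      123 / some text 123", "     a test / some text 123"):
    print(repr(before_slash(sample)))
```

Note that the lookahead uses a literal space, so a tab directly before the slash would end up inside the match; swapping the space for \s is one way around that.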

127

answered Oct 24 '22 19:10

revo

This expression is what you might want to explore:

^(.*?)(\s+\/.*)$

Here we have two capturing groups: the first collects your desired output, and the second matches the undesired part. The pattern is bounded by start and end anchors just to be safe; they can be removed if you want:

(.*?)(\s+\/.*)

Python Test

import re

# Group 1 lazily captures the text before the slash; group 2 consumes the
# whitespace, the slash, and the rest of the line.
regex = r"^(.*?)(\s+\/.*)$"

test_str = ("123 / some text 123\n"
    "anything else    / some text 123")

# Keep only the first group.
subst = r"\1"

# count=0 replaces every match; re.MULTILINE makes ^ and $ match on each line.
# Passing count and flags by keyword avoids mixing them up.
result = re.sub(regex, subst, test_str, count=0, flags=re.MULTILINE)

if result:
    print(result)

JavaScript Demo

const regex = /^(.*?)(\s+\/.*)$/gm;
const str = `123 / some text 123
anything else    / some text 123`;
const subst = `$1`;

// The substituted value will be contained in the result variable
const result = str.replace(regex, subst);

console.log('Substitution result: ', result);

RegEx

If this wasn't your desired expression, you can modify your expressions at regex101.com, and you can visualize them at jex.im.

Spaces

For spaces before your desired output, we can simply add an optional capturing group that consumes the leading whitespace:

^(\s+)?(.*?)(\s+\/.*)$

JavaScript Demo

const regex = /^(\s+)?(.*?)(\s+\/.*)$/gm;
const str = `      123 / some text 123
             anything else    / some text 123
123 / some text 123
anything else    / some text 123`;
const subst = `$2`;

// The substituted value will be contained in the result variable
const result = str.replace(regex, subst);

console.log('Substitution result: ', result);
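
For Python, a roughly equivalent sketch of the same substitution might look like the following. It assumes the same four sample lines as the JavaScript demo; the variable names are chosen for the example only.

```python
import re

# Group 1: optional leading whitespace, group 2: the wanted text,
# group 3: whitespace, the slash, and the rest of the line.
regex = r"^(\s+)?(.*?)(\s+/.*)$"

test_str = (
    "      123 / some text 123\n"
    "             anything else    / some text 123\n"
    "123 / some text 123\n"
    "anything else    / some text 123"
)

# Keep only group 2 on every line; re.MULTILINE anchors ^ and $ per line.
cleaned = re.sub(regex, r"\2", test_str, flags=re.MULTILINE)

print(cleaned)
```

Each line should come out reduced to the trimmed text before the slash.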

26

answered Oct 24 '22 19:10

Emma

Here is a possible solution

Regex

(?<!\/)\S.*\S(?=\s*\/)

Example

import re  # the third-party regex module works the same way here

string = ' 123 / some text 123'
test = re.search(r'(?<!\/)\S.*\S(?=\s*\/)', string)
print(test.group(0))
# prints 123

string = 'a test / some text 123'
test = re.search(r'(?<!\/)\S.*\S(?=\s*\/)', string)
print(test.group(0))
# prints a test

Short explanation

(?<!\/) says a possible match cannot start immediately after a / symbol.
\S.*\S matches greedily anything (.*) while making sure the match does not start or end with white space (\S). Because of the two \S, the match must be at least two characters long.
(?=\s*\/) means a possible match must be followed by a / symbol or by white space and then a /.
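
To see what the greedy .* implies in practice, here is a small sketch comparing this pattern with a lazy one on an input containing two slashes. The sample strings are invented for the comparison, and the lazy pattern is included only as a point of reference.

```python
import re

# Greedy pattern from this answer (slash left unescaped, which Python allows).
greedy = re.compile(r"(?<!/)\S.*\S(?=\s*/)")
# A lazy pattern for comparison.
lazy = re.compile(r"\S.*?(?= */)")

text = "a / b / c"

# Greedy .* runs up to the last slash that still satisfies the lookahead.
greedy_match = greedy.search(text).group(0)
# Lazy .*? stops at the first slash.
lazy_match = lazy.search(text).group(0)

print(repr(greedy_match))  # 'a / b'
print(repr(lazy_match))    # 'a'

# The two \S tokens require at least two characters, so a single
# character before the slash is not matched by the greedy pattern.
single = greedy.search("a / b")
print(single)  # None
```

So this pattern fits inputs with exactly one slash and at least two characters before it; other inputs may need a different expression.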

24

answered Oct 24 '22 18:10

user101

You could do it without a regex

my_string = "      123 / some text 123"
match = my_string.split("/")[0].strip()
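
The same idea can be wrapped in a small function. This is just a sketch: text_before_slash is a made-up name, and str.partition is used here instead of split because it stops at the first slash without splitting the rest of the string.

```python
def text_before_slash(text: str) -> str:
    """Return the text before the first '/', stripped of surrounding whitespace.

    If there is no '/', the whole string is returned stripped.
    text_before_slash is a hypothetical helper name for this sketch.
    """
    head, _sep, _tail = text.partition("/")
    return head.strip()


print(text_before_slash("      123 / some text 123"))   # 123
print(text_before_slash("     a test / some text 123")) # a test
```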

31

answered Oct 24 '22 18:10

Boris
