How to extract the substring between two markers?

Tags: python, string, substring

Let's say I have a string 'gfgfdAAA1234ZZZuijjk' and I want to extract just the '1234' part.

I only know the few characters directly before and after the part I am interested in (1234): AAA comes right before it and ZZZ right after it.

With sed it is possible to do something like this with a string:

echo "$STRING" | sed -e "s|.*AAA\(.*\)ZZZ.*|\1|"

And this will give me 1234 as a result.

How to do the same thing in Python?

821

asked Jan 12 '11 09:01

ria

2 Answers

Using regular expressions (see the documentation of the re module for further reference):

import re
text = 'gfgfdAAA1234ZZZuijjk'
m = re.search('AAA(.+?)ZZZ', text)
if m:
    found = m.group(1)
# found: 1234

or:

import re
text = 'gfgfdAAA1234ZZZuijjk'
try:
    found = re.search('AAA(.+?)ZZZ', text).group(1)
except AttributeError:
    # AAA, ZZZ not found in the original string
    found = ''  # apply your error handling
# found: 1234
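If the markers are not fixed literals, they may contain characters that are special in a regular expression (for example `.` or `(`). The following is a minimal sketch of a reusable helper that escapes the markers with `re.escape` first. The function name `between` and its `default` parameter are hypothetical, chosen only for illustration.

```python
import re


def between(text, start, end, default=''):
    """Return the text between the first `start` and the next `end` marker.

    `between` and `default` are illustrative names, not part of any library.
    """
    # re.escape makes the markers match literally, even if they contain
    # regex metacharacters such as '.', '(' or '$'.
    pattern = re.escape(start) + '(.+?)' + re.escape(end)
    m = re.search(pattern, text)
    return m.group(1) if m else default


print(between('gfgfdAAA1234ZZZuijjk', 'AAA', 'ZZZ'))   # 1234
print(between('x(a)12.b.34.b.', '(a)', '.b.'))         # 12
print(repr(between('no markers here', 'AAA', 'ZZZ')))  # ''
```

Because `.+?` is lazy, the match stops at the first end marker that follows the start marker. The greedy `.*` in the sed expression from the question extends to the last `ZZZ` instead. Note also that `.` does not match newlines unless `re.DOTALL` is passed.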

119

answered Sep 20 '22 05:09

eumiro

>>> s = 'gfgfdAAA1234ZZZuijjk'
>>> start = s.find('AAA') + 3
>>> end = s.find('ZZZ', start)
>>> s[start:end]
'1234'

Then you can use regexps with the re module as well, if you want, but that's not necessary in your case.
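One caveat with the slicing approach: `str.find` returns -1 when the substring is absent, so a missing marker silently yields a wrong slice rather than an error. For example, with 'gfgfd1234ZZZuijjk' (no AAA) the snippet above returns 'gfd1234'. A minimal sketch that checks for this follows. The helper name `extract_between` and the choice to return `None` on failure are assumptions made for illustration.

```python
def extract_between(s, start_marker, end_marker):
    """Return the text between the markers, or None if either is missing.

    `extract_between` is an illustrative name, not a standard function.
    """
    i = s.find(start_marker)
    if i == -1:
        return None  # start marker not found
    start = i + len(start_marker)  # generalizes the hard-coded "+ 3"
    end = s.find(end_marker, start)
    if end == -1:
        return None  # end marker not found after the start marker
    return s[start:end]


print(extract_between('gfgfdAAA1234ZZZuijjk', 'AAA', 'ZZZ'))  # 1234
print(extract_between('gfgfd1234ZZZuijjk', 'AAA', 'ZZZ'))     # None
```

Using `len(start_marker)` instead of a literal 3 keeps the offset correct if the marker changes length.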

answered Sep 20 '22 05:09

Lennart Regebro
