
Split string based on a regular expression

Tags:

python

regex

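By wrapping part of the pattern in parentheses, ( and ), you create a capturing group, and re.split includes the text matched by each capturing group in its result. If you simply remove the parentheses, you will not have this problem.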

>>> import re
>>> str1 = "a    b     c      d"
>>> re.split(" +", str1)
['a', 'b', 'c', 'd']

However, there is no need for regex here: str.split without any delimiter specified splits on runs of whitespace for you. That is the best approach in this case.

>>> str1.split()
['a', 'b', 'c', 'd']

If you really want a regex, you can use this (\s matches any whitespace character, and it's clearer):

>>> re.split(r"\s+", str1)
['a', 'b', 'c', 'd']

Or you can find all runs of non-whitespace characters:

>>> re.findall(r'\S+', str1)
['a', 'b', 'c', 'd']
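
The two regex approaches above agree on this input, but they can differ when the string starts or ends with whitespace. A minimal sketch, assuming a hypothetical padded input that is not part of the question:

```python
import re

# Hypothetical input with leading and trailing whitespace (not from the question).
padded = "  a   b  "

# re.split treats the leading and trailing whitespace as separators,
# so an empty string appears at each end of the result.
split_result = re.split(r"\s+", padded)

# re.findall only collects the runs of non-whitespace characters.
found = re.findall(r"\S+", padded)

print(split_result)  # ['', 'a', 'b', '']
print(found)         # ['a', 'b']
```

If the input may be padded, findall (or plain str.split) avoids having to filter out the empty strings afterwards.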

Called with no arguments, the str.split method automatically discards all whitespace between items:

>>> str1 = "a    b     c      d"
>>> str1.split()
['a', 'b', 'c', 'd']
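
This behaviour applies only when no separator is passed. As a rough illustration, assuming a hypothetical input string, passing a single space explicitly splits on every individual space and leaves empty strings behind:

```python
# Hypothetical input (not from the question): padded and unevenly spaced.
text = "  a    b  "

# No argument: runs of whitespace are one separator, ends are trimmed.
no_arg = text.split()

# Explicit " ": every single space is a separator, so empty strings appear.
single_space = text.split(" ")

print(no_arg)        # ['a', 'b']
print(single_space)  # contains 'a' and 'b' plus several empty strings
```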

Docs are here: http://docs.python.org/library/stdtypes.html#str.split

When you use re.split and the split pattern contains capturing groups, the text matched by those groups is retained in the output. If you don't want this, use a non-capturing group, (?:...), instead.
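
A small sketch of the difference. The pattern from the question is not shown here, so the capturing pattern ( +) below is an assumption chosen for illustration:

```python
import re

str1 = "a    b     c      d"

# Capturing group (assumed pattern): the matched separators are kept in the result.
with_group = re.split(r"( +)", str1)

# Non-capturing group: same match, but the separators are dropped.
without_group = re.split(r"(?: +)", str1)

print(with_group)     # ['a', '    ', 'b', '     ', 'c', '      ', 'd']
print(without_group)  # ['a', 'b', 'c', 'd']
```

For a pattern this simple the group is not needed at all; a non-capturing group is mainly useful when the pattern needs grouping for alternation or repetition.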

It's very simple, actually. Try this:

str1 = "a    b     c      d"
splitStr1 = str1.split()
print(splitStr1)  # parentheses work in both Python 2 and Python 3
