Use Split () Function This function splits the string into smaller sections. This is the opposite of merging many strings into one. The split () function contains two parameters. In the first parameter, we pass the symbol that is used for the split.
String parsing in java can be done by using a wrapper class. Using the Split method, a String can be converted to an array by passing the delimiter to the split method. The split method is one of the methods of the wrapper class. String parsing can also be done through StringTokenizer.
"2.7.0_bf4fda703454".split("_")
gives a list of strings:
In [1]: "2.7.0_bf4fda703454".split("_")
Out[1]: ['2.7.0', 'bf4fda703454']
This splits the string at every underscore. If you want it to stop after the first split, use "2.7.0_bf4fda703454".split("_", 1)
.
If you know for a fact that the string contains an underscore, you can even unpack the LHS and RHS into separate variables:
In [8]: lhs, rhs = "2.7.0_bf4fda703454".split("_", 1)
In [9]: lhs
Out[9]: '2.7.0'
In [10]: rhs
Out[10]: 'bf4fda703454'
An alternative is to use partition()
. The usage is similar to the last example, except that it returns three components instead of two. The principal advantage is that this method doesn't fail if the string doesn't contain the separator.
Split a string on space, get a list, show its type, print it out:
el@apollo:~/foo$ python
>>> mystring = "What does the fox say?"
>>> mylist = mystring.split(" ")
>>> print type(mylist)
<type 'list'>
>>> print mylist
['What', 'does', 'the', 'fox', 'say?']
If you have two delimiters next to each other, empty string is assumed:
el@apollo:~/foo$ python
>>> mystring = "its so fluffy im gonna DIE!!!"
>>> print mystring.split(" ")
['its', '', 'so', '', '', 'fluffy', '', '', 'im', 'gonna', '', '', '', 'DIE!!!']
Split a string on underscore and grab the 5th item in the list:
el@apollo:~/foo$ python
>>> mystring = "Time_to_fire_up_Kowalski's_Nuclear_reactor."
>>> mystring.split("_")[4]
"Kowalski's"
Collapse multiple spaces into one
el@apollo:~/foo$ python
>>> mystring = 'collapse these spaces'
>>> mycollapsedstring = ' '.join(mystring.split())
>>> print mycollapsedstring.split(' ')
['collapse', 'these', 'spaces']
When you pass no parameter to Python's split method, the documentation states: "runs of consecutive whitespace are regarded as a single separator, and the result will contain no empty strings at the start or end if the string has leading or trailing whitespace".
Hold onto your hats boys, parse on a regular expression:
el@apollo:~/foo$ python
>>> mystring = 'zzzzzzabczzzzzzdefzzzzzzzzzghizzzzzzzzzzzz'
>>> import re
>>> mylist = re.split("[a-m]+", mystring)
>>> print mylist
['zzzzzz', 'zzzzzz', 'zzzzzzzzz', 'zzzzzzzzzzzz']
The regular expression "[a-m]+" means the lowercase letters a
through m
that occur one or more times are matched as a delimiter. re
is a library to be imported.
Or if you want to chomp the items one at a time:
el@apollo:~/foo$ python
>>> mystring = "theres coffee in that nebula"
>>> mytuple = mystring.partition(" ")
>>> print type(mytuple)
<type 'tuple'>
>>> print mytuple
('theres', ' ', 'coffee in that nebula')
>>> print mytuple[0]
theres
>>> print mytuple[2]
coffee in that nebula
If it's always going to be an even LHS/RHS split, you can also use the partition
method that's built into strings. It returns a 3-tuple as (LHS, separator, RHS)
if the separator is found, and (original_string, '', '')
if the separator wasn't present:
>>> "2.7.0_bf4fda703454".partition('_')
('2.7.0', '_', 'bf4fda703454')
>>> "shazam".partition("_")
('shazam', '', '')
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With