I have a problem with some Scala code in which I found this split call. Until now I have only used split like this:
var newLine = line.split(",")
But what does this split mean?
var newLine2 = line.split(",(?=([^\"]*\"[^\"]*\")*[^\"]*$)")
The line I need to split looks like this:
1966, "Green, Green Grass of Home", Tom Jones, 850000
Thanks in advance!
The string passed to the split method defines a regular expression.
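The group (?=([^\"]*\"[^\"]*\")*[^\"]*$) is a positive lookahead assertion. That means: split on a comma, but only if the pattern ([^\"]*\"[^\"]*\")*[^\"]*$ follows the comma. The lookahead only checks what comes next and consumes no characters, so only the comma itself is removed by the split. Broken down, the pattern reads: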
([^\"]* # a series of non double quote characters
\" # a double quote
[^\"]* # a series of non double quote characters
\") # a double quote
* # repeat that whole group 0 or more times
[^\"]*$ # a series of non double quote characters till the end of the string
That means it will only split on a comma when an even number of double quotes follows the comma; in other words, it splits only if the comma is not inside double quotes. (This works as long as the quotes in the string are balanced, i.e. every opening quote has a closing one.)
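To see the difference on the line from the question, here is a minimal sketch. The object name SplitDemo and the helpers splitNaive and splitQuoteAware are made up for illustration; trimming the fields is an extra step that is not part of the original code.

```scala
object SplitDemo {
  // The pattern from the question: a comma that is followed by an
  // even number of double quotes up to the end of the string.
  val QuoteAwareComma = ",(?=([^\"]*\"[^\"]*\")*[^\"]*$)"

  // Plain split on every comma (hypothetical helper; trims each field).
  def splitNaive(line: String): Array[String] =
    line.split(",").map(_.trim)

  // Split only on commas outside double quotes (hypothetical helper; trims each field).
  def splitQuoteAware(line: String): Array[String] =
    line.split(QuoteAwareComma).map(_.trim)

  def main(args: Array[String]): Unit = {
    val line = "1966, \"Green, Green Grass of Home\", Tom Jones, 850000"

    // 5 fields: the song title is cut in two at its inner comma
    println(splitNaive(line).mkString(" | "))

    // 4 fields: the quoted title stays in one piece
    println(splitQuoteAware(line).mkString(" | "))
  }
}
```

With an unbalanced quote, as in a, "b, c, the comma inside the unterminated quote is followed by zero quotes (an even number) and is split on, while the comma before the quote is not. This is why the approach relies on balanced quotes.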
This is a regular expression ("RegEx"); see http://en.wikipedia.org/wiki/Regular_expression for an explanation.