Java String split not returning the right values

I'm trying to parse a txt file that represents a grammar to be used in a recursive descent parser. The txt file would look something like this:

SPRIME ::= Expr eof
Expr ::= Term Expr'
Expr' ::= + Term Expr' | - Term Expr' | e

To isolate the left-hand side and split the right-hand side into separate production rules, I take each line and call:

String[] firstSplit = line.split("::=");
String LHS = firstSplit[0];
String[] productionRules = firstSplit[1].split("|");

However, the second split call does not return an array of the Strings separated by the "|" character. Instead I get an array of each individual character on the right-hand side, including "|". So for instance, if I was parsing the Expr' rule and printed the productionRules array, it would look like this:

"+"
"Term"
"Expr'"
""
"|"

When what I really want should look like this:

"+ Term Expr'"
"- Term Expr'"
"e"

Anyone have any ideas what I'm doing wrong?

Richard Stokes asked Apr 15 '11 11:04


2 Answers

The parameter to String.split() is a regular expression, and the vertical bar character is special.

Try escaping it with a backslash:

String[] productionRules = firstSplit[1].split("\\|");

NB: two backslashes are required, since the backslash character itself is special within string literals.
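As a rough end-to-end sketch of the fix, the snippet below splits the Expr' line from the question on the escaped bar. The class name GrammarSplit, the helper alternatives, and the trim() call that strips the spaces around each alternative are assumptions added for illustration, not part of the original code.

```java
import java.util.Arrays;

public class GrammarSplit {
    // Hypothetical helper: takes a line of the form "LHS ::= alt1 | alt2 | ..."
    // and returns the alternatives on the right-hand side.
    static String[] alternatives(String line) {
        String[] firstSplit = line.split("::=");
        // The Java literal "\\|" is the regex \| which matches a literal vertical bar.
        String[] productionRules = firstSplit[1].split("\\|");
        // Assumption: surrounding whitespace is not wanted, so trim each alternative.
        for (int i = 0; i < productionRules.length; i++) {
            productionRules[i] = productionRules[i].trim();
        }
        return productionRules;
    }

    public static void main(String[] args) {
        String line = "Expr' ::= + Term Expr' | - Term Expr' | e";
        System.out.println(Arrays.toString(alternatives(line)));
        // prints [+ Term Expr', - Term Expr', e]
    }
}
```

Without the trim() call, each element would keep the spaces that surround the bar in the input line.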

Alnitak answered Sep 19 '22 06:09


Since split takes a regex as its argument, you have to escape every regex metacharacter that you mean literally.
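If the delimiter is not known in advance, one option is to let the standard library do the escaping with java.util.regex.Pattern.quote, which turns any string into a regex that matches it literally. The sketch below assumes a hypothetical helper named splitLiteral in a class named QuoteSplit; neither name comes from the question.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class QuoteSplit {
    // Hypothetical helper: splits text on a delimiter that is treated as plain text,
    // whatever regex metacharacters it happens to contain.
    static String[] splitLiteral(String text, String delimiter) {
        return text.split(Pattern.quote(delimiter));
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitLiteral("a|b|c", "|")));
        // prints [a, b, c]
        System.out.println(Arrays.toString(splitLiteral("1.2.3", ".")));
        // prints [1, 2, 3]
    }
}
```

This avoids having to remember which characters are special, at the cost of a slightly longer call than a hand-escaped literal such as "\\|".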

dcn answered Sep 20 '22 06:09