
Java: how to parse double from regex


I have a string that looks like "A=1.23;B=2.345;C=3.567"

I am only interested in "C=3.567"

What I have so far is:

    Matcher m = Pattern.compile("C=\\d+.\\d+").matcher("A=1.23;B=2.345;C=3.567");
    while (m.find()) {
        double d = Double.parseDouble(m.group());
        System.out.println(d);
    }

The problem is that it shows the 3 as separate from the 567.

output:

3.0

567.0

I am wondering how I can include the decimal so that it outputs "3.567".

EDIT: I would also like to match C if it does not have a decimal point, so I would like to capture 3567 as well as 3.567.

Since the C= is built into the pattern as well, how can I strip it out before parsing the double?

Will asked Sep 09 '10 23:09



2 Answers

I may be mistaken on this part, but the reason it's separating the two is that group() returns only the subsequence matched by the most recent call to find(), so each call to find() yields its own piece. Thanks, Mark Byers.

For sure, though, you can solve this by placing the entire part you want inside a "capturing group", which is done by placing it in parentheses. This makes it so that you can group together matched parts of your regular expression into one substring. Your pattern would then look like:

Pattern.compile("C=(\\d+\\.\\d+)") 

To parse either 3567 or 3.567, your pattern would be C=(\\d+(\\.\\d+)?), with group 1 holding the whole number. Also note that since you specifically want to match a period, you need to escape the . (period) character so that it isn't interpreted as the "any character" token. For this input, though, it doesn't matter.

Then, to get your 3.567, you would call m.group(1) to grab the first specified group (groups are counted from 1). Your Double.parseDouble call would then essentially become Double.parseDouble("3.567").
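Putting the capturing group and m.group(1) together, a minimal sketch might look like the following. The class name CValueExtractor and the method extractC are made up for illustration, and the inner group is written as non-capturing, (?:...), so that group 1 is the only numbered group; that is a small variation on the pattern above, not a requirement.

```java
import java.util.OptionalDouble;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original question or answer.
final class CValueExtractor {
    // "C=" is matched but sits outside group 1, so group 1 holds only the number.
    // The fractional part is optional, so both "3567" and "3.567" match.
    private static final Pattern C_VALUE = Pattern.compile("C=(\\d+(?:\\.\\d+)?)");

    // Returns the value after the first "C=" in the input, or empty if there is none.
    static OptionalDouble extractC(String input) {
        Matcher m = C_VALUE.matcher(input);
        if (m.find()) {
            // group(1) is the parenthesized part only, without the "C=" prefix.
            return OptionalDouble.of(Double.parseDouble(m.group(1)));
        }
        return OptionalDouble.empty();
    }

    public static void main(String[] args) {
        System.out.println(extractC("A=1.23;B=2.345;C=3.567").getAsDouble());
        System.out.println(extractC("A=1.23;B=2.345;C=3567").getAsDouble());
    }
}
```

Because the pattern is not anchored, a key that merely ends in C (say, ABC=1.0) would also match; if that can occur in your data, the split-based approach may be the safer choice.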

As for taking C= out of your pattern, since I'm not that well-versed with RegExp, I might recommend that you split your input string on the semi-colons and then check to see if each of the splits contain the C; then you could apply the pattern (with the capturing groups) to get the 3.567 from your Matcher.
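The split-based idea could be sketched roughly as below. Note one substitution on my part: instead of re-applying the regex to each piece, this version splits each piece again on the = sign, which avoids the regex altogether. The names SplitLookup and valueOf are hypothetical.

```java
import java.util.OptionalDouble;

// Hypothetical helper class illustrating the split-on-semicolons approach.
final class SplitLookup {
    // Looks up the numeric value stored under the given key in a "K=V;K=V" string.
    // Assumes the value, if present, is something Double.parseDouble accepts;
    // a malformed value would throw NumberFormatException.
    static OptionalDouble valueOf(String input, String key) {
        for (String pair : input.split(";")) {
            // Limit of 2 keeps everything after the first '=' together.
            String[] parts = pair.split("=", 2);
            if (parts.length == 2 && parts[0].trim().equals(key)) {
                return OptionalDouble.of(Double.parseDouble(parts[1].trim()));
            }
        }
        return OptionalDouble.empty();
    }

    public static void main(String[] args) {
        System.out.println(valueOf("A=1.23;B=2.345;C=3.567", "C").getAsDouble());
    }
}
```

Comparing the key with equals rather than contains means that only a key that is exactly C is picked up.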

Edit: For the more general (and likely more useful!) cases in gawi's comment, please use the following (from http://www.regular-expressions.info/floatingpoint.html):

Pattern.compile("[-+]?[0-9]*\\.?[0-9]+([eE][-+]?[0-9]+)?") 

This supports an optional sign, an optional integer part (so .5 matches), an optional fractional part (so 5 matches), and an optional positive or negative exponent. Insert capturing groups where desired to pick out parts individually. The exponent is wrapped in its own group so that it can be made optional as a whole.
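As a rough illustration of the general pattern in use, the sketch below scans a string and parses every number it finds. The class FloatScanner, the method findAll, and the sample input are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class built around the general floating-point pattern.
final class FloatScanner {
    // Same pattern as above: optional sign, optional integer part,
    // optional decimal point, required digits, optional exponent.
    private static final Pattern FLOAT =
            Pattern.compile("[-+]?[0-9]*\\.?[0-9]+([eE][-+]?[0-9]+)?");

    // Collects every number found in the input, in order of appearance.
    static List<Double> findAll(String input) {
        List<Double> numbers = new ArrayList<>();
        Matcher m = FLOAT.matcher(input);
        while (m.find()) {
            // group() with no argument is the whole match, exponent included.
            numbers.add(Double.parseDouble(m.group()));
        }
        return numbers;
    }

    public static void main(String[] args) {
        System.out.println(findAll("A=1.23;B=-2.5e3;C=3567"));
    }
}
```

Since this pattern is not tied to a key, it returns every number in the string; combine it with a C= prefix and a capturing group if only one value is wanted.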

Brian answered Oct 09 '22 11:10



Your regular expression is only matching numeric characters. To match the decimal point as well, you will need:

Pattern.compile("\\d+\\.\\d+") 

The . is escaped because an unescaped . matches any character.

Note: this will then only match numbers with a decimal point, which is what you have in your example.

Shadwell answered Oct 09 '22 09:10
