Searching for number after a specific word that does not immediately precede the number

Tags: java, regex

I am trying to use a pattern to search for a Zip Code within a string. I cannot get it to work correctly.

A sample of the inputLine is

What is the weather in 75042?

What I am trying to use for a pattern is

public String getZipcode(String inputLine) {

        Pattern pattern = Pattern.compile(".*weather.*([0-9]+).*");
        Matcher matcher = pattern.matcher(inputLine);

        if (matcher.find()) {

            return matcher.group(1).toString();
        }

        return "Zipcode Not Found.";

    }

If I am looking to only get 75002, what do I need to change? This only outputs the last digit in the number, 2. I am terribly confused and I do not completely understand the Javadocs for the Pattern class.

Thomas Walker asked Jul 05 '18 08:07


3 Answers

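The reason is that the .* in front of your capturing group is greedy: it consumes as many characters as it can, including the leading digits, and backtracks only far enough to leave a single digit for [0-9]+. You have to stop it from swallowing the digits.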
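
To see where the digits go, here is a small sketch that wraps each part of the question's pattern in its own capturing group and runs it on the sample input from the question. The extra groups and the class name `GreedyBacktrackingDemo` are illustrative additions, not part of the original code.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyBacktrackingDemo {
    // Same pattern as in the question, with every part captured
    // so we can inspect what each piece consumed.
    static final Pattern PATTERN = Pattern.compile("(.*weather.*)([0-9]+)(.*)");

    // Returns the three captured parts, or null when nothing matches.
    static String[] parts(String input) {
        Matcher matcher = PATTERN.matcher(input);
        if (!matcher.find()) {
            return null;
        }
        return new String[] { matcher.group(1), matcher.group(2), matcher.group(3) };
    }

    public static void main(String[] args) {
        String[] parts = parts("What is the weather in 75042?");
        System.out.println("greedy prefix: [" + parts[0] + "]");
        System.out.println("digits group:  [" + parts[1] + "]");
        System.out.println("rest:          [" + parts[2] + "]");
    }
}
```

The greedy prefix ends up holding everything through the second-to-last digit, so the digits group receives only the final one.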

A simpler pattern can be used here: \D+(\d+)\D+, which means:

  • some non-digits \D+, then some digits to capture (\d+), then some non-digits \D+

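public String getZipcode(String inputLine) {
    // Non-digits, then the captured digit run, then at least one non-digit.
    Pattern pattern = Pattern.compile("\\D+(\\d+)\\D+");
    Matcher matcher = pattern.matcher(inputLine);

    if (matcher.find()) {
        // group(1) already returns a String, so no toString() call is needed.
        return matcher.group(1);
    }
    return "Zipcode Not Found.";
}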
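
Two properties of this pattern are worth checking against your data: it does not look for the word "weather" at all, and the trailing \D+ requires at least one non-digit after the number. The sketch below compares it with a hypothetical alternative, `weather\D*(\d+)`, which anchors on the word instead. The alternative pattern, the class name `ZipAfterWordDemo`, and the extra sample inputs are assumptions for illustration, not something proposed in the answer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ZipAfterWordDemo {
    // Pattern from the answer: a digit run surrounded by non-digits.
    static final Pattern ANY_DIGITS = Pattern.compile("\\D+(\\d+)\\D+");
    // Hypothetical alternative: the word, then any non-digits, then the digits.
    static final Pattern AFTER_WEATHER = Pattern.compile("weather\\D*(\\d+)");

    // Returns the first captured digit run, or null when nothing matches.
    static String firstGroup(Pattern pattern, String input) {
        Matcher matcher = pattern.matcher(input);
        return matcher.find() ? matcher.group(1) : null;
    }

    public static void main(String[] args) {
        String[] inputs = {
            "What is the weather in 75042?",
            "What is the weather in 75042",   // number at the very end
            "Route 66 weather in 75042?"      // another number before the word
        };
        for (String input : inputs) {
            System.out.println(input
                + " -> any digits: " + firstGroup(ANY_DIGITS, input)
                + ", after word: " + firstGroup(AFTER_WEATHER, input));
        }
    }
}
```

If the input can end with the number or contain other numbers, anchoring on the word is the safer choice; for inputs shaped like the sample, both patterns return the same result.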

azro answered Jan 04 '23 04:01


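The problem is that your middle .* is too greedy and eats away every digit except the last one (7500 of 75002). One easy fix is to require a space right before the capturing group: .*weather.* ([0-9]+).* (or use \\s instead of the literal space). The better fix is to make the middle quantifier non-greedy (reluctant) by writing .*? so that it matches as little as possible. The regexp then becomes .*weather.*?([0-9]+).*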
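
As a quick check, the following sketch runs the original pattern and the suggested variants against the sample input from the question. The class name `GreedyVsReluctantDemo` and the helper `capture` are made up for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyVsReluctantDemo {
    // Returns group 1 of the first match, or null when the regex does not match.
    static String capture(String regex, String input) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        return matcher.find() ? matcher.group(1) : null;
    }

    public static void main(String[] args) {
        String input = "What is the weather in 75042?";
        String[] regexes = {
            ".*weather.*([0-9]+).*",     // original: greedy middle .*
            ".*weather.* ([0-9]+).*",    // literal space before the group
            ".*weather.*\\s([0-9]+).*",  // same idea with \s
            ".*weather.*?([0-9]+).*"     // reluctant middle .*?
        };
        for (String regex : regexes) {
            System.out.println(regex + " -> " + capture(regex, input));
        }
    }
}
```

Only the first pattern returns a single digit; the other three return the whole number. Note that the space-based variants rely on whitespace directly preceding the number, while the reluctant version does not.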

Damir Kovačić answered Jan 04 '23 05:01


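Your regex does not account for the whitespace (\s) between the words. You can use \s* or \s+, depending on your data. In the pattern below, \w+ matches the word between "weather" and the number ("in" in your sample):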

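Pattern pattern = Pattern.compile("weather\\s*\\w+\\s*(\\d+)");
Matcher matcher = pattern.matcher(inputLine);
// find() must succeed before group(1) can be read.
String zipcode = matcher.find() ? matcher.group(1) : "Zipcode Not Found.";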

Yati Sawhney answered Jan 04 '23 06:01