
Extract an ISBN with regex

Tags: java, regex

I have a very long string that I want to parse for a numeric value that occurs after the substring "ISBN". The 13 digits can be grouped in different ways with the "-" character. For example, these are all valid ISBNs: 123-456-789-123-4, 1-2-3-4-5-67891234, and 12-34-56-78-91-23-4. I want to run a regex matcher over the potential ISBN to check whether it contains a valid 13-digit ISBN. How do I ignore the "-" character so that I can simply match a \d{13} pattern? My function:

public String parseISBN (String sourceCode) {
  int location = sourceCode.indexOf("ISBN") + 5;
  String ISBN = sourceCode.substring(location); //substring after "ISBN" occurs
  int i = 0;
  while ( ISBN.charAt(i) != ' ' )
    i++;
  ISBN = ISBN.substring(0, i); //should contain potential ISBN value
  Pattern pattern = Pattern.compile("\\d{13}"); //this clearly will find 13 consecutive numbers, but I need it to ignore the "-" character
  Matcher matcher = pattern.matcher(ISBN); 
  if (matcher.find()) return ISBN;
  else return null;
}
Adam Storm asked Nov 30 '22 07:11


2 Answers

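Try this pattern, which matches one digit followed by twelve more digits, each optionally preceded by a single hyphen: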

Pattern.compile("\\d(-?\\d){12}")
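For context, here is a minimal sketch of how that pattern could replace the manual substring scanning in the question. The class name `IsbnExtractor` and the `ISBN[:\s]*` prefix (an optional colon and whitespace between the label and the digits) are assumptions made for illustration, not part of the original answer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, named here only for illustration.
final class IsbnExtractor {
    // Assumption: the label "ISBN" is followed by optional colons/whitespace.
    // Group 1 is the answer's pattern: a digit, then 12 x (optional hyphen + digit).
    private static final Pattern ISBN_13 =
            Pattern.compile("ISBN[:\\s]*(\\d(?:-?\\d){12})");

    /** Returns the ISBN as written in the text (hyphens kept), or null if none is found. */
    static String parseISBN(String sourceCode) {
        Matcher matcher = ISBN_13.matcher(sourceCode);
        return matcher.find() ? matcher.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(parseISBN("Title, ISBN 123-456-789-123-4 (paperback)")); // 123-456-789-123-4
        System.out.println(parseISBN("ISBN: 1-2-3-4-5-67891234"));                  // 1-2-3-4-5-67891234
        System.out.println(parseISBN("ISBN 12345"));                                // null
    }
}
```

Because the regex locates the label and the digits in one pass, the `indexOf` arithmetic and the character loop from the question are no longer needed, and a missing "ISBN" label or a missing trailing space simply yields `null` instead of an exception.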
Jonathan M answered Dec 09 '22 19:12


  • Alternative 1: Strip the hyphens before matching, so that \d{13} applies directly:

    pattern.matcher(ISBN.replace("-", ""))
    
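  • Alternative 2: Something like the following, which lets each digit be followed by an optional hyphen (note that it therefore also accepts a trailing hyphen after the 13th digit):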

    Pattern.compile("(\\d-?){13}")
    

Demo of second alternative:

String ISBN = "ISBN: 123-456-789-112-3, ISBN: 1234567891123";

Pattern pattern = Pattern.compile("(\\d-?){13}");
Matcher matcher = pattern.matcher(ISBN);

while (matcher.find())
    System.out.println(matcher.group());

Output:

123-456-789-112-3
1234567891123
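The first alternative can be sketched in the same way. The example below is only an illustration under a few assumptions: the class name `IsbnNormalizer` and the method `normalize` are hypothetical, the input is assumed to be a single candidate token rather than the whole source string, and `matches()` is used instead of `find()` so that exactly 13 digits are required.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, named here only for illustration.
final class IsbnNormalizer {
    private static final Pattern THIRTEEN_DIGITS = Pattern.compile("\\d{13}");

    /** Returns the 13 bare digits if the candidate is a 13-digit number once hyphens are removed, else null. */
    static String normalize(String candidate) {
        String digits = candidate.replace("-", "");
        // matches() must cover the whole string, so 12 or 14 digits are rejected.
        return THIRTEEN_DIGITS.matcher(digits).matches() ? digits : null;
    }

    public static void main(String[] args) {
        System.out.println(normalize("123-456-789-112-3")); // 1234567891123
        System.out.println(normalize("1234567891123"));     // 1234567891123
        System.out.println(normalize("123-456"));           // null
    }
}
```

The trade-off is that stripping hyphens first is more lenient than the hyphen-aware patterns: leading, trailing, or doubled hyphens are all accepted, and the returned value no longer shows how the ISBN was grouped in the source.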
aioobe answered Dec 09 '22 18:12