
regex differentiating between ISBN-10 and ISBN-13

I have an If-else statement which checks a string to see whether there is an ISBN-10 or ISBN-13 (book ID).

The problem I am facing is that the ISBN-10 check runs before the ISBN-13 check. The ISBN-10 pattern matches any run of 10 or more digits, so it can mistake an ISBN-13 for an ISBN-10.

Here is the code:

    $str = "ISBN:9780113411436";

    if (preg_match("/\d{9}(?:\d|X)/", $str, $matches)) {
        echo "ISBN-10 FOUND\n";
        // isbn returned will be 9780113411
        return 0;
    } else if (preg_match("/\d{12}(?:\d|X)/", $str, $matches)) {
        echo "ISBN-13 FOUND\n";
        // isbn returned will be 9780113411436
        return 1;
    }

How do I make sure I avoid this problem?

mk_89 asked Dec 30 '12 23:12


2 Answers

You really only need one regex for this, followed by a cheap strlen() check to see which format was matched. The following will match ISBN-10 and ISBN-13 values within a string, with or without hyphens, and optionally preceded by the string ISBN:, ISBN:(space) or ISBN(space). The \b word boundaries keep the 10-digit form from matching just the first ten digits of a 13-digit number.

Finding ISBNs:

    function findIsbn($str)
    {
        $regex = '/\b(?:ISBN(?:: ?| ))?((?:97[89])?\d{9}[\dx])\b/i';

        if (preg_match($regex, str_replace('-', '', $str), $matches)) {
            return (10 === strlen($matches[1]))
                ? 1   // ISBN-10
                : 2;  // ISBN-13
        }
        return false; // No valid ISBN found
    }

    var_dump(findIsbn('ISBN:0-306-40615-2'));     // return 1
    var_dump(findIsbn('0-306-40615-2'));          // return 1
    var_dump(findIsbn('ISBN:0306406152'));        // return 1
    var_dump(findIsbn('0306406152'));             // return 1
    var_dump(findIsbn('ISBN:979-1-090-63607-1')); // return 2
    var_dump(findIsbn('979-1-090-63607-1'));      // return 2
    var_dump(findIsbn('ISBN:9791090636071'));     // return 2
    var_dump(findIsbn('9791090636071'));          // return 2
    var_dump(findIsbn('ISBN:97811'));             // return false

This will search a provided string to see if it contains a possible ISBN-10 value (returns 1) or an ISBN-13 value (returns 2). If it does not it will return false.
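
The comments in the question suggest the matched ISBN itself is wanted, not only its type. A minimal sketch of a variant that returns both, reusing the same regex; the function name `extractIsbn` and the array keys are hypothetical, chosen only for illustration:

```php
<?php
// Hypothetical helper: returns the ISBN type and the matched characters, or null.
// Same pattern as findIsbn(); hyphens are stripped before matching.
function extractIsbn(string $str): ?array
{
    $regex = '/\b(?:ISBN(?:: ?| ))?((?:97[89])?\d{9}[\dx])\b/i';

    if (!preg_match($regex, str_replace('-', '', $str), $matches)) {
        return null; // no ISBN-shaped substring found
    }

    return [
        'type' => (10 === strlen($matches[1])) ? 'ISBN-10' : 'ISBN-13',
        'isbn' => strtoupper($matches[1]), // normalise a trailing "x" to "X"
    ];
}

var_dump(extractIsbn('ISBN:9780113411436')); // type ISBN-13, isbn 9780113411436
var_dump(extractIsbn('0-306-40615-2'));      // type ISBN-10, isbn 0306406152
var_dump(extractIsbn('ISBN:97811'));         // NULL
```

Like findIsbn(), this only checks the shape of the string; it does not verify the check digit.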



Validating ISBNs:

For strict validation, the Wikipedia article on ISBN has some PHP validation functions for ISBN-10 and ISBN-13. Below are those examples, copied, tidied up, and modified to work with a slightly modified version of the above function.

Change the return block to this:

    return (10 === strlen($matches[1]))
        ? isValidIsbn10($matches[1])  // ISBN-10
        : isValidIsbn13($matches[1]); // ISBN-13

Validate ISBN-10:

    function isValidIsbn10($isbn)
    {
        $check = 0;

        for ($i = 0; $i < 10; $i++) {
            if ('x' === strtolower($isbn[$i])) {
                $check += 10 * (10 - $i);
            } elseif (is_numeric($isbn[$i])) {
                $check += (int)$isbn[$i] * (10 - $i);
            } else {
                return false;
            }
        }

        return (0 === ($check % 11)) ? 1 : false;
    }

Validate ISBN-13:

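    function isValidIsbn13($isbn)
    {
        // An ISBN-13 check digit is always 0-9; the finder regex would
        // otherwise let a trailing "x" through and count it as 0.
        if (!ctype_digit($isbn)) {
            return false;
        }

        $check = 0;

        // Digits at even offsets (including the check digit) have weight 1.
        for ($i = 0; $i < 13; $i += 2) {
            $check += (int)$isbn[$i];
        }

        // Digits at odd offsets have weight 3.
        for ($i = 1; $i < 12; $i += 2) {
            $check += 3 * (int)$isbn[$i];
        }

        return (0 === ($check % 10)) ? 2 : false;
    }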


cryptic ツ answered Sep 20 '22 20:09


Use ^ and $ to anchor the pattern to the beginning and end of the string. With both anchors in place, the whole string must match, so the order in which you test for the 10-digit and 13-digit codes no longer matters.

10 digits

/^ISBN:(\d{9}(?:\d|X))$/ 

13 digits

/^ISBN:(\d{12}(?:\d|X))$/ 

Note: According to http://en.wikipedia.org/wiki/International_Standard_Book_Number, it appears as though ISBNs can have a - in them as well. But based on the $str you're using, it looks like you've removed the hyphens before checking for 10 or 13 digits.
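
If the input might still contain hyphens or spaces, one option is to strip them before applying the anchored patterns. A minimal sketch, assuming the `ISBN:` prefix used in the question; the helper name `normalizeIsbnString` is hypothetical:

```php
<?php
// Hypothetical helper: remove hyphens and whitespace so that
// "ISBN:0-306-40615-2" becomes "ISBN:0306406152".
function normalizeIsbnString(string $str): string
{
    return preg_replace('/[\s-]+/', '', $str);
}

$clean = normalizeIsbnString('ISBN:0-306-40615-2');

// The anchored 10-digit pattern now matches the cleaned string.
var_dump($clean);                                         // string(15) "ISBN:0306406152"
var_dump(preg_match('/^ISBN:(\d{9}(?:\d|X))$/', $clean)); // int(1)
```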

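Additional note: Because the last character of the ISBN is a checksum over the preceding digits, regular expressions alone cannot confirm that an ISBN is valid. They can only check for the 10- or 13-character format. Also note that only an ISBN-10 can end in X; an ISBN-13 check digit is always 0-9, so the 13-digit pattern above is more permissive than it needs to be.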


    $isbns = array(
        'ISBN:1234567890',       // 10-digit
        'ISBN:123456789X',       // 10-digit ending in X
        'ISBN:1234567890123',    // 13-digit
        'ISBN:123456789012X',    // 13-digit ending in X
        'ISBN:1234'              // invalid
    );

    function get_isbn($str) {
        if (preg_match('/^ISBN:(\d{9}(?:\d|X))$/', $str, $matches)) {
            echo "found 10-digit ISBN\n";
            return $matches[1];
        }
        elseif (preg_match('/^ISBN:(\d{12}(?:\d|X))$/', $str, $matches)) {
            echo "found 13-digit ISBN\n";
            return $matches[1];
        }
        else {
            echo "invalid ISBN\n";
            return null;
        }
    }

    foreach ($isbns as $str) {
        $isbn = get_isbn($str);
        echo $isbn."\n\n";
    }

Output

    found 10-digit ISBN
    1234567890

    found 10-digit ISBN
    123456789X

    found 13-digit ISBN
    1234567890123

    found 13-digit ISBN
    123456789012X

    invalid ISBN
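
As the additional note says, a regex only confirms the format. The arithmetic behind the check character can be sketched as follows, assuming the standard rules: ISBN-10 uses weights 10 down to 1 modulo 11, and ISBN-13 uses alternating weights 1 and 3 modulo 10. The function names are hypothetical, and the inputs are assumed to be digit-only strings of the right length.

```php
<?php
// Hypothetical helper: check character for the first 9 digits of an ISBN-10.
// Weights run 10, 9, ..., 2; the check value makes the total divisible by 11.
function isbn10CheckChar(string $first9): string
{
    $sum = 0;
    for ($i = 0; $i < 9; $i++) {
        $sum += (10 - $i) * (int) $first9[$i];
    }
    $check = (11 - $sum % 11) % 11;

    return (10 === $check) ? 'X' : (string) $check; // a value of 10 is written as "X"
}

// Hypothetical helper: check digit for the first 12 digits of an ISBN-13.
// Weights alternate 1, 3, 1, 3, ...; the check digit makes the total divisible by 10.
function isbn13CheckDigit(string $first12): string
{
    $sum = 0;
    for ($i = 0; $i < 12; $i++) {
        $sum += (($i % 2) ? 3 : 1) * (int) $first12[$i];
    }

    return (string) ((10 - $sum % 10) % 10);
}

echo isbn10CheckChar('030640615'), "\n";     // 2  (ISBN 0-306-40615-2)
echo isbn13CheckDigit('979109063607'), "\n"; // 1  (ISBN 979-1-090-63607-1)
```

To validate a candidate, compare its last character with the value computed from the preceding digits.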

maček answered Sep 20 '22 20:09