I have an if-else statement which checks a string to see whether it contains an ISBN-10 or an ISBN-13 (book ID).
The problem I am facing is that the ISBN-10 check occurs before the ISBN-13 check. The ISBN-10 check will match anything with 10 or more digits, and so may mistake an ISBN-13 for an ISBN-10.
Here is the code:
$str = "ISBN:9780113411436";

if (preg_match("/\d{9}(?:\d|X)/", $str, $matches)) {
    echo "ISBN-10 FOUND\n";
    // isbn returned will be 9780113411
    return 0;
} else if (preg_match("/\d{12}(?:\d|X)/", $str, $matches)) {
    echo "ISBN-13 FOUND\n";
    // isbn returned will be 9780113411436
    return 1;
}
How do I make sure I avoid this problem?
ISBN-10 and ISBN-13 indicate how many digits are in the ISBN and are two separate systems for identifying books. Before 2007, there were only 10-digit ISBNs; thereafter, 13-digit ISBNs were introduced and used to increase the availability of ISBNs worldwide.
For more than thirty years, ISBNs were 10 digits long. On January 1, 2007 the ISBN system switched to a 13-digit format. Now all ISBNs are 13 digits long. If you were assigned 10-digit ISBNs, you can convert them to the 13-digit format at the converter found at this website.
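The conversion mentioned above can be sketched in a few lines of PHP. This is only an illustration, assuming the usual rule of prefixing 978 to the first nine digits and recalculating the check digit; the function name isbn10ToIsbn13 is made up for this example and does not come from any library.

```php
<?php
// Hypothetical helper: convert an ISBN-10 to its ISBN-13 form.
// Returns null when the input does not look like an ISBN-10.
function isbn10ToIsbn13(string $isbn10): ?string
{
    // Strip hyphens and spaces so "0-306-40615-2" is accepted.
    $isbn10 = str_replace(['-', ' '], '', $isbn10);
    if (!preg_match('/^\d{9}[\dXx]$/', $isbn10)) {
        return null;
    }

    // Drop the old check digit and prepend the 978 prefix.
    $body = '978' . substr($isbn10, 0, 9);

    // ISBN-13 check digit: weights alternate 1, 3, 1, 3, ...
    $sum = 0;
    for ($i = 0; $i < 12; $i++) {
        $sum += (int)$body[$i] * (($i % 2 === 0) ? 1 : 3);
    }
    $check = (10 - ($sum % 10)) % 10;

    return $body . $check;
}

echo isbn10ToIsbn13('0-306-40615-2'), "\n"; // prints 9780306406157
```

Note that the old ISBN-10 check digit is not verified here; the sketch only checks the format before converting.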
The algorithm for checking an ISBN-13 beginning with '979' is the same as the algorithm for ISBN-13s beginning with '978'. To validate any ISBN-13, drop the given check digit, recalculate it, and compare the result of the new calculation to the original check digit. If they match, the number is a valid ISBN-13.
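The "drop, recalculate, compare" procedure described above might look like the following in PHP. The function names are invented for this sketch, and it assumes the standard ISBN-13 weighting (1 and 3 alternating over the first twelve digits).

```php
<?php
// Hypothetical helper: compute the check digit for the first
// twelve digits of an ISBN-13 (works for 978 and 979 alike).
function isbn13CheckDigit(string $first12): int
{
    $sum = 0;
    for ($i = 0; $i < 12; $i++) {
        // Weights alternate 1, 3, 1, 3, ...
        $sum += (int)$first12[$i] * (($i % 2 === 0) ? 1 : 3);
    }
    return (10 - ($sum % 10)) % 10;
}

// Hypothetical helper: drop the given check digit, recalculate it,
// and compare the result with the original.
function isbn13HasValidCheckDigit(string $isbn13): bool
{
    if (!preg_match('/^\d{13}$/', $isbn13)) {
        return false;
    }
    return isbn13CheckDigit(substr($isbn13, 0, 12)) === (int)$isbn13[12];
}

var_dump(isbn13HasValidCheckDigit('9780113411436')); // bool(true)
var_dump(isbn13HasValidCheckDigit('9791090636071')); // bool(true)
var_dump(isbn13HasValidCheckDigit('9780113411437')); // bool(false)
```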
To verify an ISBN-10, calculate 10 times the first digit, plus 9 times the second digit, plus 8 times the third digit, and so on down to 1 times the last digit (a final X counts as 10). If the total leaves no remainder when divided by 11, the code is a valid ISBN-10.
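As a worked example of the weighted sum described above, here is a small sketch in PHP. The function name isbn10WeightedSum is hypothetical, and the input is assumed to be exactly ten characters with hyphens already removed.

```php
<?php
// Hypothetical helper: weighted sum used by the ISBN-10 check.
// Assumes $isbn10 is 10 characters: nine digits plus a digit or X.
function isbn10WeightedSum(string $isbn10): int
{
    $sum = 0;
    for ($i = 0; $i < 10; $i++) {
        $char = $isbn10[$i];
        // X is only meaningful as the last character, where it counts as 10.
        $value = ($i === 9 && strtoupper($char) === 'X') ? 10 : (int)$char;
        // Weights run 10, 9, 8, ..., 1.
        $sum += (10 - $i) * $value;
    }
    return $sum;
}

// 0306406152: 0 + 27 + 0 + 42 + 24 + 0 + 24 + 3 + 10 + 2 = 132 = 11 * 12
$sum = isbn10WeightedSum('0306406152');
echo $sum, "\n";            // prints 132
var_dump($sum % 11 === 0);  // bool(true)
```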
You really only need one regex for this. Then do a more efficient strlen() check to see which one was matched. The following will match ISBN-10 and ISBN-13 values within a string, with or without hyphens, and optionally preceded by the string ISBN:, ISBN:(space) or ISBN(space).
function findIsbn($str)
{
    $regex = '/\b(?:ISBN(?:: ?| ))?((?:97[89])?\d{9}[\dx])\b/i';

    if (preg_match($regex, str_replace('-', '', $str), $matches)) {
        return (10 === strlen($matches[1]))
            ? 1   // ISBN-10
            : 2;  // ISBN-13
    }
    return false; // No valid ISBN found
}

var_dump(findIsbn('ISBN:0-306-40615-2'));     // return 1
var_dump(findIsbn('0-306-40615-2'));          // return 1
var_dump(findIsbn('ISBN:0306406152'));        // return 1
var_dump(findIsbn('0306406152'));             // return 1
var_dump(findIsbn('ISBN:979-1-090-63607-1')); // return 2
var_dump(findIsbn('979-1-090-63607-1'));      // return 2
var_dump(findIsbn('ISBN:9791090636071'));     // return 2
var_dump(findIsbn('9791090636071'));          // return 2
var_dump(findIsbn('ISBN:97811'));             // return false
This will search a provided string to see if it contains a possible ISBN-10 value (returns 1) or an ISBN-13 value (returns 2). If it does not, it will return false.
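Since the code in the question also wants the ISBN itself, not just its type, the same regex can be reused to return both. This is only a sketch; the function name extractIsbn and the shape of the returned array are assumptions made for this example.

```php
<?php
// Hypothetical variant of findIsbn() that returns the matched
// ISBN together with its type, or null when nothing matches.
function extractIsbn(string $str): ?array
{
    $regex = '/\b(?:ISBN(?:: ?| ))?((?:97[89])?\d{9}[\dx])\b/i';

    // Hyphens are removed first, as in findIsbn().
    if (!preg_match($regex, str_replace('-', '', $str), $matches)) {
        return null;
    }

    $isbn = strtoupper($matches[1]); // normalise a trailing x to X
    return [
        'type' => (10 === strlen($isbn)) ? 'ISBN-10' : 'ISBN-13',
        'isbn' => $isbn,
    ];
}

print_r(extractIsbn('ISBN:9780113411436'));
// type is ISBN-13 and isbn is 9780113411436
```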
For strict validation the Wikipedia article for ISBN has some PHP validation functions for ISBN-10 and ISBN-13. Below are those examples copied, tidied up and modified to be used against a slightly modified version of the above function.
Change the return block to this:
return (10 === strlen($matches[1]))
    ? isValidIsbn10($matches[1])   // ISBN-10
    : isValidIsbn13($matches[1]);  // ISBN-13
Validate ISBN-10:
function isValidIsbn10($isbn)
{
    $check = 0;
    for ($i = 0; $i < 10; $i++) {
        if (9 === $i && 'x' === strtolower($isbn[$i])) {
            // X stands for 10 and is only allowed as the check digit (weight 1)
            $check += 10;
        } elseif (is_numeric($isbn[$i])) {
            // Weights run 10, 9, 8, ..., 1
            $check += (int)$isbn[$i] * (10 - $i);
        } else {
            return false;
        }
    }
    return (0 === ($check % 11)) ? 1 : false;
}
Validate ISBN-13:
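function isValidIsbn13($isbn)
{
    // An ISBN-13 consists of digits only (X is never a valid check digit)
    if (!ctype_digit($isbn)) {
        return false;
    }
    $check = 0;
    // Digits at even indices (0, 2, ..., 12) have weight 1
    for ($i = 0; $i < 13; $i += 2) {
        $check += (int)$isbn[$i];
    }
    // Digits at odd indices (1, 3, ..., 11) have weight 3
    for ($i = 1; $i < 12; $i += 2) {
        $check += 3 * (int)$isbn[$i];
    }
    return (0 === ($check % 10)) ? 2 : false;
}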
Use ^ and $ to match the beginning and end of the string. With these anchors in place, the order in which you test for the 10-digit or the 13-digit code will not matter.
/^ISBN:(\d{9}(?:\d|X))$/
/^ISBN:(\d{12}(?:\d|X))$/
Note: According to http://en.wikipedia.org/wiki/International_Standard_Book_Number, it appears as though ISBNs can have a - in them as well. But based on the $str you're using, it looks like you've removed the hyphens before checking for 10 or 13 digits.
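Additional note: Because the last digit of the ISBN is used as a sort of checksum for the prior digits, regular expressions alone cannot validate that the ISBN is a valid one. They can only check for the 10-digit or 13-digit format. Also, only an ISBN-10 can end in X; the check digit of an ISBN-13 is always 0-9, so the X alternative in the 13-digit pattern is not strictly needed.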
$isbns = array(
    'ISBN:1234567890',     // 10-digit
    'ISBN:123456789X',     // 10-digit ending in X
    'ISBN:1234567890123',  // 13-digit
    'ISBN:123456789012X',  // 13-digit ending in X
    'ISBN:1234'            // invalid
);

function get_isbn($str)
{
    if (preg_match('/^ISBN:(\d{9}(?:\d|X))$/', $str, $matches)) {
        echo "found 10-digit ISBN\n";
        return $matches[1];
    } elseif (preg_match('/^ISBN:(\d{12}(?:\d|X))$/', $str, $matches)) {
        echo "found 13-digit ISBN\n";
        return $matches[1];
    } else {
        echo "invalid ISBN\n";
        return null;
    }
}

foreach ($isbns as $str) {
    $isbn = get_isbn($str);
    echo $isbn."\n\n";
}
Output
found 10-digit ISBN
1234567890

found 10-digit ISBN
123456789X

found 13-digit ISBN
1234567890123

found 13-digit ISBN
123456789012X

invalid ISBN
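If the ISBN can appear inside a longer string, so that ^ and $ are not usable, lookarounds can play the same role as the anchors by requiring that the match is not surrounded by further digits. The following is a sketch of that idea, not part of the answers above; the function name classifyIsbn is invented for the example, and it assumes hyphens have already been removed.

```php
<?php
// Hypothetical helper: classify the first ISBN-like number in a string.
// (?<!\d) and (?!\d) make sure the digits are not part of a longer run,
// so a 13-digit number can no longer be mistaken for a 10-digit one.
function classifyIsbn(string $str): ?string
{
    if (preg_match('/(?<!\d)\d{9}[\dX](?!\d)/', $str)) {
        return 'ISBN-10';
    }
    if (preg_match('/(?<!\d)\d{13}(?!\d)/', $str)) {
        return 'ISBN-13';
    }
    return null;
}

var_dump(classifyIsbn('ISBN:9780113411436')); // string(7) "ISBN-13"
var_dump(classifyIsbn('ISBN:123456789X'));    // string(7) "ISBN-10"
var_dump(classifyIsbn('ISBN:1234'));          // NULL
```

The ISBN-10 test is deliberately kept first, as in the question, to show that the order no longer matters once the boundaries are checked.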