Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Checking if an ISBN number is correct

Tags:

python

I'm given some ISBN numbers e.g. 3-528-03851 (not valid) , 3-528-16419-0 (valid). I'm supposed to write a program which tests if the ISBN number is valid.

Here' my code:

def check(isbn):
    check_digit = int(isbn[-1])
    match = re.search(r'(\d)-(\d{3})-(\d{5})', isbn[:-1])

    if match:
        digits = match.group(1) + match.group(2) + match.group(3)
        result = 0

        for i, digit in enumerate(digits):
          result += (i + 1) * int(digit)

        return True if (result % 11) == check_digit else False

    return False

I've used a regular expression to check a) if the format is valid and b) extract the digits in the ISBN string. While it seems to work, being a Python beginner I'm eager to know how I could improve my code. Suggestions?

like image 348
helpermethod Avatar asked Oct 28 '10 22:10

helpermethod


2 Answers

First, try to avoid code like this:

if Action():
    lots of code
    return True
return False

Flip it around, so the bulk of code isn't nested. This gives us:

def check(isbn):
    check_digit = int(isbn[-1])
    match = re.search(r'(\d)-(\d{3})-(\d{5})', isbn[:-1])

    if not match:
        return False

    digits = match.group(1) + match.group(2) + match.group(3)
    result = 0

    for i, digit in enumerate(digits):
      result += (i + 1) * int(digit)

    return True if (result % 11) == check_digit else False

There are some bugs in the code:

  • If the check digit isn't an integer, this will raise ValueError instead of returning False: "0-123-12345-Q".
  • If the check digit is 10 ("X"), this will raise ValueError instead of returning True.
  • This assumes that the ISBN is always grouped as "1-123-12345-1". That's not the case; ISBNs are grouped arbitrarily. For example, the grouping "12-12345-12-1" is valid. See http://www.isbn.org/standards/home/isbn/international/html/usm4.htm.
  • This assumes the ISBN is grouped by hyphens. Spaces are also valid.
  • It doesn't check that there are no extra characters; '0-123-4567819' returns True, ignoring the extra 1 at the end.

So, let's simplify this. First, remove all spaces and hyphens, and make sure the regex matches the whole line by bracing it in '^...$'. That makes sure it rejects strings which are too long.

def check(isbn):
    isbn = isbn.replace("-", "").replace(" ", "");
    check_digit = int(isbn[-1])
    match = re.search(r'^(\d{9})$', isbn[:-1])
    if not match:
        return False

    digits = match.group(1)

    result = 0
    for i, digit in enumerate(digits):
      result += (i + 1) * int(digit)

    return True if (result % 11) == check_digit else False

Next, let's fix the "X" check digit problem. Match the check digit in the regex as well, so the entire string is validated by the regex, then convert the check digit correctly.

def check(isbn):
    isbn = isbn.replace("-", "").replace(" ", "").upper();
    match = re.search(r'^(\d{9})(\d|X)$', isbn)
    if not match:
        return False

    digits = match.group(1)
    check_digit = 10 if match.group(2) == 'X' else int(match.group(2))

    result = 0
    for i, digit in enumerate(digits):
      result += (i + 1) * int(digit)

    return True if (result % 11) == check_digit else False

Finally, using a generator expression and max is a more idiomatic way of doing the final calculation in Python, and the final conditional can be simplified.

def check(isbn):
    isbn = isbn.replace("-", "").replace(" ", "").upper();
    match = re.search(r'^(\d{9})(\d|X)$', isbn)
    if not match:
        return False

    digits = match.group(1)
    check_digit = 10 if match.group(2) == 'X' else int(match.group(2))

    result = sum((i + 1) * int(digit) for i, digit in enumerate(digits))
    return (result % 11) == check_digit
like image 99
Glenn Maynard Avatar answered Oct 08 '22 13:10

Glenn Maynard


Pointless improvement: replace return True if (result % 11) == check_digit else False with return (result % 11) == check_digit

like image 23
Brian Avatar answered Oct 08 '22 13:10

Brian