Reverse engineering a check digit algorithm

I am trying to reverse engineer an algorithm used to generate a check digit.

Numbers are 8 digits long and the last digit is the check digit. I have thousands of valid numbers to test it on.

I have tried the standard Luhn, Verhoeff and modulo-10 algorithms (brute-force checking of all possible weights), but could not find an answer!
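For reference, the kind of brute-force weight search mentioned above can be sketched roughly as follows. This is only an illustration, not the code actually used: the names `parse`, `RULES` and `find_weights` are made up for the example, and it assumes the check digit is either the weighted sum modulo some modulus or that value's complement.

```python
from itertools import product

# Two common ways to turn a weighted sum into a check digit (assumed, not confirmed).
RULES = {
    "remainder": lambda total, modulus: total % modulus,
    "complement": lambda total, modulus: (modulus - total % modulus) % modulus,
}


def parse(number):
    """Split '1002784-5' into ([1, 0, 0, 2, 7, 8, 4], 5)."""
    body, check = number.split("-")
    return [int(ch) for ch in body], int(check)


def find_weights(numbers, modulus=10, weights=range(10)):
    """Return every (rule name, weight vector) that reproduces all check digits."""
    parsed = [parse(n) for n in numbers]
    length = len(parsed[0][0])
    hits = []
    for candidate in product(weights, repeat=length):
        for name, rule in RULES.items():
            # all() stops at the first sample that does not match.
            if all(
                rule(sum(w * d for w, d in zip(candidate, body)), modulus) == check
                for body, check in parsed
            ):
                hits.append((name, candidate))
    return hits
```

For 7-digit bodies and weights 0 to 9 this loops over 10^7 weight vectors, so a call such as `find_weights(samples, modulus=10)` takes a while in pure Python. Trying `modulus=11` as well costs nothing extra to code; an empty result only rules out this particular family of schemes.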

Is it possible to calculate this? Any ideas?

Here are some examples of valid numbers:

1002784-5
1000514-7
1001602-8
1001255-2
1001707-1
1003355-5
1005579-1
1004535-0
1004273-1
1001695-9
1004565-9
1000541-9
1001291-1
1005866-1
1004352-7

EDIT: Thanks guys - I don't have access to the code, unfortunately. The number is a tax number, and I need to be able to verify that it was typed in correctly. From my research, it looks like most countries use a pretty standard modulo-10 type system. I've got access to about 60,000 numbers.

I understand that the problem could be impossible to solve; it was more of an academic concern.

Neil asked Nov 21 '12 14:11



1 Answer

First check your context:

If the context is credit cards, driver's licenses, or government licensing numbers (not SSNs), think Luhn or mod 10. If it is some other industry, does that industry have a de facto standard? If not, is the developer of the system that uses the numbers also a player in an industry that has a de facto standard?

Nobody likes to reinvent the wheel if they don't have to.

If that doesn't help, keep in mind:

Don't assume that all the digits in the keys you are testing against are used to arrive at the check digit. It's possible that only 4 of the 8 digits (or any other combination) are used to calculate it. It's also possible that some external PREFIX number is combined with the other digits to arrive at the check digit. So... line up all your numbers that share the same check digit and see what the similarities are. Can you add a number to them and then always reach the check digit? Can you test only the first few digits? The last few digits? Every other digit?
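One possible way to try these ideas in code is sketched below. The function names `group_by_check_digit` and `subsets_with_constant_offset` are invented for this example. The second function assumes the simplest case only: an unweighted sum of some subset of digits plus a fixed number (such as a hidden prefix), taken modulo 10.

```python
from collections import defaultdict
from itertools import combinations


def group_by_check_digit(numbers):
    """Map each check digit to the bodies that carry it, e.g. {5: ['1002784', ...]}."""
    groups = defaultdict(list)
    for number in numbers:
        body, check = number.split("-")
        groups[int(check)].append(body)
    return dict(groups)


def subsets_with_constant_offset(numbers, modulus=10):
    """Find digit positions whose plain sum differs from the check digit
    by the same amount (mod modulus) in every sample."""
    parsed = []
    for number in numbers:
        body, check = number.split("-")
        parsed.append(([int(ch) for ch in body], int(check)))
    length = len(parsed[0][0])
    hits = []
    for size in range(1, length + 1):
        for positions in combinations(range(length), size):
            # Offset needed to turn the subset sum into the check digit.
            offsets = {
                (check - sum(body[i] for i in positions)) % modulus
                for body, check in parsed
            }
            if len(offsets) == 1:
                hits.append((positions, offsets.pop()))
    return hits
```

With 7-digit bodies there are only 127 subsets of positions, so this runs instantly even on tens of thousands of numbers. A hit on a handful of samples can be a coincidence, so any candidate should be confirmed against the full data set; no hits only means this simple additive model does not apply.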

Good luck.

Carol Susie Odiorne answered Nov 13 '22 08:11