I am trying to reverse engineer an algorithm used to generate a check digit.
Numbers are 8 digits long and the last digit is the check digit. I have thousands of valid numbers to test it on.
I have tried a standard Luhn, Verhoeff and modulo-10 algorithms (brute force checking of all possible weights), but could not find an answer!
Is it possible to calculate this? Any ideas?
Here is some examples of valid numbers:
1002784-5
1000514-7
1001602-8
1001255-2
1001707-1
1003355-5
1005579-1
1004535-0
1004273-1
1001695-9
1004565-9
1000541-9
1001291-1
1005866-1
1004352-7
EDIT: Thanks guys - I don't have access to the code unfortunately. The number is a tax number, I need to be able to verify that the number was typed in correctly. From my research is looks like most countries use a pretty standard modulo-10 type system. I've got access to about 60 000 numbers.
I understand that the problem could be impossible to solve, it was more of academic concern.
A check digit algorithm calculates a check digit based on an original character string, such as an account number. The receiver recalculates the check digit to verify data entry accuracy. If the recalculated character string contains the correct check digit, the data is error-free and may be used.
At the same time, algorithms must always have an input and output; the black box actually has two little openings. We can take advantage of those inputs and outputs to reverse engineer what's going on inside.
To calculate the check digit, take the remainder of (5 / 10), which is also known as (5 modulo 10), and if not 0, subtract from 10: i.e. (5 / 10) = 0 remainder 5; (10 - 5) = 5. Therefore, the check digit x value is 5.
A check digit is a digit that is appended onto an identifier using a set algorithm so a vendor can quickly verify whether the identifier you have given is a valid one. For example, if you have a credit card with a 16 digit account number, the number is generated as a 15 digit number.
First check your context:
If context is credit cards, driver's licenses, government licensing numbers (not SSN) think Luhn or Mod 10. If some other industry, does that industry have a defacto standard? If not, is the developer of the system using the numbers also a player in an industry that has a de facto standard?
Nobody likes to reinvent the wheel if they don't have to.
If that doesn't help keep in mind:
Don't assume that all the numbers in the keys you are testing against are used to arrive at the check digit. It's possible only 4 or the 8 digits are being used to calculate the check digit (or any other combination). It's also possible there is some external PREFIX number that is used with the other digits to arrive at the check digit. So... line up all your numbers with the same check digit, and see what the similarities are. Can you add a number to them and then always reach check digit? Can you test only the first few digits? Last few digits? every other digit?
Good luck.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With