First of all, I'll state that this is a university assignment, so I'm not asking for someone to write the code for me; I just need to be pointed in the right direction. :)
OK, so I need to write an algorithm to solve any (solvable) sudoku board of arbitrary size. I've written a recursive function that can solve any 9x9 board quickly (~1ms), but it struggles with larger boards (16x16) that are hard to solve. I've had one test going for 20 minutes and it can't seem to solve it. It can solve easy 16x16 puzzles or even a blank 16x16 board, so I don't think the dimensions are the problem. It's more likely that the algorithm is the problem, I think.
Anyway, this is the basic logic of my program..
Then my solve function is basically:
bool solve() {
    if (there are no unfilled squares)
        return true
    if (the board is unsolvable - there are empty squares that have no possible values)
        return false
    while (there are empty squares)
    {
        int squaresFilled = fillSquaresWithOnlyOneChoice(); //this method updates the possible results vector whenever it fills a square
        if (squaresFilled == 0)
            break;
    }
    //exhausted all of the 'easy' squares (squares with only one possible choice), need to make a guess
    while (there are empty squares that have choices left) {
        find the square with the least number of choices
        if (the square with the least number of choices has 0 choices)
            return false; //not solvable.
        remove that choice from the 3D vector (vector that has the choices for each square)
        make a copy of the board and the 3D choices vector
        fill the square with the choice
        if (solve())
            return true; //we're done
        restore the board and choices vector
        //the guess didn't work so keep looping and make a new guess with the restored board and choices -- the choice we just made has been removed though so it won't get made again.
    }
    return false; //can't go any further
}
Is there anything inefficient about this? Is there any way I could get it to work better? I'm guessing that a 16x16 board takes so long because the decision tree is so large for a board that isn't filled in very much. It's weird though, because a 9x9 board solves really fast.
Any ideas or suggestions would be absolutely awesome. If there's any information I've missed let me know too!
A fast algorithm for solving sudoku is Algorithm X by Donald Knuth. You represent the sudoku as an exact cover problem and then use Algorithm X to solve that exact cover problem. Use DLX (Dancing Links) as an efficient implementation of Algorithm X.
There is a great explanation on Wikipedia of how to apply exact cover to solving sudoku.
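To make the exact cover idea a bit more concrete, here is a minimal C++ sketch of how one candidate placement maps to the four constraints it satisfies, following the usual formulation (one constraint per cell, per row-digit pair, per column-digit pair and per box-digit pair). The function name `constraintColumns`, the 0-based indexing and the column ordering are my own assumptions for illustration; a real DLX implementation may lay the columns out differently.

```cpp
#include <array>
#include <cassert>

// Returns the four exact cover column indices covered by placing `digit`
// at (row, col). Hypothetical helper; all arguments are 0-based.
// boxSize is 3 for a 9x9 board and 4 for a 16x16 board.
std::array<int, 4> constraintColumns(int boxSize, int row, int col, int digit) {
    const int n = boxSize * boxSize;  // side length and number of digits
    assert(row >= 0 && row < n && col >= 0 && col < n);
    assert(digit >= 0 && digit < n);
    const int box = (row / boxSize) * boxSize + col / boxSize;
    return {
        row * n + col,                // "cell (row, col) is filled"
        n * n + row * n + digit,      // "row contains digit"
        2 * n * n + col * n + digit,  // "column contains digit"
        3 * n * n + box * n + digit   // "box contains digit"
    };
}
```

With this layout the matrix has n^3 rows (one per candidate placement) and 4n^2 columns: 729 by 324 for a 9x9 board and 4096 by 1024 for a 16x16 board. Each given clue simply pre-selects its row before the search starts.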
I can tell you that DLX is extremely fast for solving sudoku and is commonly used in the fastest solvers.
http://www.setbb.com/phpbb/index.php?mforum=sudoku is a great forum with probably the best sudoku programmers.
Between filling the squares with only one choice and going full recursive on the board, there are more advanced actions you can take. Let's say that a "region" is one row, one column, or one square region (3x3 or 4x4).
Tactic 1: If there are K squares in a region that can take only the same K numbers (for instance two squares that can take only 2 and 5, or three squares that can take only 1, 7 and 8), then no other square in that region can take those numbers. You need to iterate over each region to weed out the "taken" numbers, so that you can find a square with only one logical choice (for instance, a third square with 2, 4 and 5 can logically take only 4, or a fourth square with 1, 3, 7 and 8 can logically take only 3).
This has to be solved iteratively, as the following example shows. A region has squares with these possible numbers:
A: 1 2 3
B: 2 3
C: 2 3 4 5
D: 4 5
E: 4 5
The algorithm should detect that squares D and E hold numbers 4 and 5, so 4 and 5 are excluded from the other squares in the region, which reduces C to 2 and 3. On the next pass the algorithm detects that squares B and C hold numbers 2 and 3, and so excludes them from the other squares. This leaves square A with only number 1.
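The iteration above can be sketched in C++ roughly as follows. This assumes each unfilled square's choices are stored as a bitmask (bit d - 1 set means digit d is still possible) and that `region` holds the masks of one row, column or box; the name `eliminateIdenticalSets` and the bitmask layout are my own choices for illustration, not something from your program.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Mask = std::uint32_t;  // bit (d - 1) set: digit d is still possible

// Repeats the "K squares with the same K numbers" rule on one region until
// nothing changes. Returns true if any choice was removed.
bool eliminateIdenticalSets(std::vector<Mask>& region) {
    assert(region.size() <= 32);  // one bit per digit, so at most 32 digits
    bool changedAny = false;
    bool changed = true;
    while (changed) {
        changed = false;
        for (std::size_t i = 0; i < region.size(); ++i) {
            const Mask m = region[i];
            if (m == 0) continue;  // no choices left: let the caller handle it
            std::size_t same = 0;  // squares whose choices are exactly m
            for (Mask other : region) {
                if (other == m) ++same;
            }
            if (same != std::bitset<32>(m).count()) continue;
            // K squares share the same K numbers: strip them from the rest.
            for (Mask& other : region) {
                if (other != m && (other & m) != 0) {
                    other &= ~m;
                    changed = true;
                    changedAny = true;
                }
            }
        }
    }
    return changedAny;
}
```

On the example region (A to E) the first pass reduces C to 2 and 3, and the second pass reduces A to 1. A square with a single choice is just the K = 1 case of the same rule.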
Tactic 2: If a number occurs in only one square of the region, then logically that square holds that number.
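A minimal C++ sketch of this rule, again assuming bitmask choices (bit d - 1 set means digit d is possible) and assuming that `region` contains only unfilled squares from which already placed digits have been removed. The name `applyHiddenSingles` is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Mask = std::uint32_t;  // bit (d - 1) set: digit d is still possible

// If a digit is possible in exactly one square of the region, that square
// is reduced to that digit. Returns true if any square was narrowed.
bool applyHiddenSingles(std::vector<Mask>& region, int digits) {
    assert(digits >= 1 && digits <= 32);
    bool changed = false;
    for (int d = 0; d < digits; ++d) {
        const Mask bit = Mask{1} << d;
        std::size_t holder = 0;  // last square seen that can take the digit
        int count = 0;           // how many squares can take the digit
        for (std::size_t i = 0; i < region.size(); ++i) {
            if (region[i] & bit) {
                holder = i;
                ++count;
            }
        }
        if (count == 1 && region[holder] != bit) {
            region[holder] = bit;  // only this square can hold the digit
            changed = true;
        }
    }
    return changed;
}
```

For example, with choices {1 2}, {1 2} and {1 2 3}, digit 3 occurs only in the last square, so that square becomes {3}.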
Tactic 3: Tactics 1 and 2 are only special cases of a more general rule. Tactic 1 needs K squares that hold exactly the same K numbers, but more generally you can have K squares and a set of K numbers, where each of those K squares holds any subset of those K numbers. Consider the following example of a region:
A: 1 2
B: 2 3
C: 1 3
D: 1 2 3 4
Squares A, B and C can hold only numbers 1, 2 and 3. That's K for K. That means that any other square can't logically hold these numbers, which leaves square D with only number 4.
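Tactic 2 is a special case of Tactic 3 with K = N - 1, where N is the number of squares in the region: if a number occurs in only one square, the other N - 1 squares hold only the other N - 1 numbers.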
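Here is a rough C++ sketch of the general K-for-K rule. It brute-forces every proper subset of squares in a region, which is 2^N subsets, so it is only an illustration under the assumption that a region has at most 16 unfilled squares; in practice you would probably cap K at a small value. The bitmask layout and the name `eliminateSubsets` are my assumptions.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Mask = std::uint32_t;  // bit (d - 1) set: digit d is still possible

// One pass of the general rule: if K squares hold only K distinct numbers
// between them, remove those numbers from every other square in the region.
// Returns true if anything was removed; call again until it returns false.
bool eliminateSubsets(std::vector<Mask>& region) {
    const std::size_t n = region.size();
    assert(n <= 16);  // keeps the 2^n subset enumeration affordable
    for (Mask m : region) {
        if (m == 0) return false;  // contradiction: leave it to the caller
    }
    bool changed = false;
    const std::uint32_t all = std::uint32_t{1} << n;
    // Every non-empty proper subset of squares, encoded as a bit set.
    for (std::uint32_t subset = 1; subset + 1 < all; ++subset) {
        Mask unionMask = 0;
        for (std::size_t i = 0; i < n; ++i) {
            if ((subset >> i) & 1u) unionMask |= region[i];
        }
        const std::size_t k = std::bitset<32>(subset).count();
        if (std::bitset<32>(unionMask).count() != k) continue;
        for (std::size_t i = 0; i < n; ++i) {
            if (!((subset >> i) & 1u) && (region[i] & unionMask) != 0) {
                region[i] &= ~unionMask;
                changed = true;
            }
        }
    }
    return changed;
}
```

On the example region, the subset {A, B, C} has the union 1 2 3, which is three numbers for three squares, so those numbers are removed from D and D is left with 4.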
Take advantage of region overlap. Suppose that some number can exist only in certain squares of a region. If all those squares also belong to another, overlapping region, then that number should be excluded from all other squares in that other region.
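As an illustration, here is a C++ sketch of one direction of the overlap rule: a number that is confined to a single row inside a box gets removed from the rest of that row. It assumes a hypothetical `Grid` of bitmask choices in which filled squares have mask 0 and placed digits have already been removed from their peers. The other directions (box to column, row or column to box) follow the same pattern.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Mask = std::uint32_t;                   // bit d set: digit d + 1 possible
using Grid = std::vector<std::vector<Mask>>;  // cand[row][col]

// Box-to-row overlap: if a digit's possible squares inside a box all lie in
// one row, remove that digit from that row outside the box.
bool pointingRows(Grid& cand, int boxSize) {
    const int n = boxSize * boxSize;
    assert(cand.size() == static_cast<std::size_t>(n));
    bool changed = false;
    for (int boxRow = 0; boxRow < n; boxRow += boxSize) {
        for (int boxCol = 0; boxCol < n; boxCol += boxSize) {
            for (int d = 0; d < n; ++d) {
                const Mask bit = Mask{1} << d;
                int onlyRow = -1;    // row holding the digit inside this box
                bool single = true;  // false once a second row shows up
                for (int r = boxRow; r < boxRow + boxSize; ++r) {
                    for (int c = boxCol; c < boxCol + boxSize; ++c) {
                        if (!(cand[r][c] & bit)) continue;
                        if (onlyRow == -1) onlyRow = r;
                        else if (onlyRow != r) single = false;
                    }
                }
                if (onlyRow == -1 || !single) continue;
                for (int c = 0; c < n; ++c) {
                    const bool inBox = c >= boxCol && c < boxCol + boxSize;
                    if (!inBox && (cand[onlyRow][c] & bit)) {
                        cand[onlyRow][c] &= ~bit;
                        changed = true;
                    }
                }
            }
        }
    }
    return changed;
}
```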
Cache results. Every region should have a "dirty" flag that denotes that something in the region has changed since the last time the region was processed. You don't have to process a region whose flag is not set.
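One possible shape for the dirty-flag loop, as a C++ sketch. The callback `process` is hypothetical: it stands for "apply the tactics to region r" and is assumed to return the indices of every region that contains a square it changed.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Processes only the regions whose dirty flag is set, until no flag remains.
// Returns how many times a region was processed.
int propagate(std::vector<bool>& dirty,
              const std::function<std::vector<int>(int)>& process) {
    int processed = 0;
    bool any = true;
    while (any) {
        any = false;
        for (std::size_t r = 0; r < dirty.size(); ++r) {
            if (!dirty[r]) continue;  // unchanged since last time: skip it
            dirty[r] = false;
            ++processed;
            any = true;
            for (int touched : process(static_cast<int>(r))) {
                assert(touched >= 0 &&
                       static_cast<std::size_t>(touched) < dirty.size());
                dirty[touched] = true;  // this region must be looked at again
            }
        }
    }
    return processed;
}
```

Initially every region is dirty. After a guess, only the row, column and box of the guessed square need to be flagged.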
Human beings use all of these tactics and really hate to guess a number, because backtracking is a real pain. In fact, the difficulty of a board can be measured by the minimum number of guesses one has to make to solve it. For most "extreme" boards, one good guess is enough.