Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Multi-Sudoku AI approach

I'm conceptualizing a solver for a variant of sudoku called multi-sudoku, where multiple boards overlap like so:

image of a multi-sudoku

If I understand the game correctly, you must solve each grid in such a way that the overlap between any two or more grids is part of the solution for each.

I'm unsure as to how I should be thinking about this. Anybody got any hints/conceptual clues? Additionally, if any topics in artificial intelligence come to mind, I'd like to hear those too.

like image 509
Regress.arg Avatar asked Dec 13 '15 01:12

Regress.arg


People also ask

How do you write an algorithm for Sudoku?

The first step is easy, list out all possible numbers in each cell. If you have used an app to play Sudoku, there should already be a function allowing you to write down some possible numbers in a cell. In the first step of the algorithm, for each cell, you follow the rules and write down all possible numbers.

How do you solve Sudoku programmatically?

Take a square and an intersecting column (or row). The values in the intersecting cells must match (3 cells, same 3 numbers). This implies the values in the remaining, non-intersecting, cells in the column (6 cells) must be the same as the remaining, non-intersecting cells in the square (6 cells).


2 Answers

Constraint programming (CP)

Sudoku is a typical constraint programming problem. You have a set of variables (the fields in the grid) each with a domain (here the digits 0 to 9) and a set of constraints over these variables (the fact that a number occurs only once in a row, column, block,...).

A generic way to solve constraint programming problems is arc consistency (AC): you propagate constraints. By variables that are (partly) filled in, you can reduce the domain of the remaining variables, etc. Finally if propagation no longer can make the domains smaller, you have to make a choice.

With a choice, you select a value for a certain variable. A good strategy is to select a variable with a small amount of possible values left. Next you propagate again and possibly make another choice and so on.

It is possible that your program finds out that a choice was wrong: it makes the domain of one or more variables empty. In that case, you backtrack: you undo a choice you've made earlier (as well as the propagation that was done after that choice) and pick another value.

This answer evidently does not aim to provide an in-depth overview of the topic, but the Wikipedia page can give a better overview and pointers to more information.

There are constraint programming solvers like ECLiPSe (not the IDE), MiniZinc, etc. where one can simply define the variables, domains and constraints.

Solving the problem with ECLiPSe

On the ECLiPSe website, you can find a model for sudoku. Given you read some documentation about ECLiPSe, you can turn this file into a model for multi-sudoku. I've made some small modifications resulting in the following quick-and-dirty solution:

% credits to Joachim Schimpf for his model of sudoku
% http://eclipseclp.org/examples/sudoku.ecl.txt
:- lib(ic).
:- import alldifferent/1 from ic_global.

solve(ProblemName) :-
    problem(ProblemName,BA,BB),
    multi_sudoku(3,BA,BB),
    print_board(BA),
    print_board(BB).

multi_sudoku(N,BA,BB) :-
    sudoku(N,BA,VA),
    sudoku(N,BB,VB),
    N2 is N*N,
    Inc is N2-N,
    (multifor([I,J],1,N,1),param(BA,BB,Inc) do
        BA[I+Inc,J+Inc] #= BB[I,J]
    ),
    append(VA,VB,Vars),
    labeling(Vars).

sudoku(N,Board,Vars) :-
    N2 is N*N,
    dim(Board,[N2,N2]),
    Board[1..N2,1..N2] :: 1..N2,
    ( for(I,1,N2), param(Board,N2) do
        Row is Board[I,1..N2],
        alldifferent(Row),
        Col is Board[1..N2,I],
        alldifferent(Col)
    ),
    ( multifor([I,J],1,N2,N), param(Board,N) do
        ( multifor([K,L],0,N-1), param(Board,I,J), foreach(X,SubSquare) do
        X is Board[I+K,J+L]
        ),
        alldifferent(SubSquare)
    ),
    term_variables(Board, Vars).


print_board(Board) :-
    dim(Board, [N,N]),
    ( for(I,1,N), param(Board,N) do
        ( for(J,1,N), param(Board,I) do
            X is Board[I,J],
        ( var(X) -> write("  _") ; printf(" %2d", [X]) )
        ), nl
    ), nl.


%----------------------------------------------------------------------
% Sample data
%----------------------------------------------------------------------

problem(1, [](
    [](_, _, _, _, 6, _, _, _, _),
    [](_, _, _, 4, _, 9, _, _, _),
    [](_, _, 9, 7, _, 5, 1, _, _),
    [](_, 5, 2, _, 7, _, 8, 9, _),
    [](9, _, _, 5, _, 2, _, _, 4),
    [](_, 8, 3, _, 4, _, 7, 2, _),
    [](_, _, _, 2, _, 8, _, _, _),
    [](_, _, _, 6, _, 4, _, _, _),
    [](_, _, _, _, 5, _, _, _, _)
           ),
           [](
    [](_, _, _, _, 3, _, _, _, _),
    [](_, _, _, 8, _, 7, _, _, _),
    [](_, _, _, 1, _, 6, 3, _, _),
    [](_, 9, 8, _, _, _, 1, 2, _),
    [](2, _, _, _, _, _, _, _, 3),
    [](_, 4, 3, _, _, _, 6, 5, _),
    [](_, _, 7, 3, _, 5, 9, _, _),
    [](_, _, _, 4, _, 2, _, _, _),
    [](_, _, _, _, 6, _, _, _, _)
           )
    ).

I "borrowed" the model of the sudoku from Joachim Schimpf, so credits to him. Furthermore note that this answer does not advice to use ECLiPSe over another tool. I'm simply more familiar with the Prolog toolchain when it comes to constraint programming. But if you are more into C++, Gecode will do the trick with approximately the same (or even better) performance.

which generates the output:

ECLiPSe Constraint Logic Programming System [kernel]
Kernel and basic libraries copyright Cisco Systems, Inc.
and subject to the Cisco-style Mozilla Public Licence 1.1
(see legal/cmpl.txt or http://eclipseclp.org/licence)
Source available at www.sourceforge.org/projects/eclipse-clp
GMP library copyright Free Software Foundation, see legal/lgpl.txt
For other libraries see their individual copyright notices
Version 6.1 #199 (x86_64_linux), Sun Mar 22 09:34 2015
[eclipse 1]: solve(1).
lists.eco  loaded in 0.00 seconds
WARNING: module 'ic_global' does not exist, loading library...
queues.eco loaded in 0.00 seconds
ordset.eco loaded in 0.01 seconds
heap_array.eco loaded in 0.00 seconds
graph_algorithms.eco loaded in 0.03 seconds
max_flow.eco loaded in 0.00 seconds
flow_constraints_support.eco loaded in 0.00 seconds
ic_sequence.eco loaded in 0.01 seconds
ic_global.eco loaded in 0.07 seconds
  2  1  4  8  6  3  9  5  7
  8  7  5  4  1  9  2  6  3
  6  3  9  7  2  5  1  4  8
  4  5  2  3  7  1  8  9  6
  9  6  7  5  8  2  3  1  4
  1  8  3  9  4  6  7  2  5
  5  4  1  2  3  8  6  7  9
  7  2  8  6  9  4  5  3  1
  3  9  6  1  5  7  4  8  2

  6  7  9  5  3  4  2  8  1
  5  3  1  8  2  7  4  6  9
  4  8  2  1  9  6  3  7  5
  7  9  8  6  5  3  1  2  4
  2  6  5  7  4  1  8  9  3
  1  4  3  2  8  9  6  5  7
  8  2  7  3  1  5  9  4  6
  9  1  6  4  7  2  5  3  8
  3  5  4  9  6  8  7  1  2

It took my machine approximately 0.11 seconds. Furthermore there are 60 valid solutions in total.

The last two "matrices" show the solution for the two Sudoku's. As you can see (I haven't checked it fully), they share a block (the same output), and all sudoku constraints are valid. A more convenient representation of the solution is shown below:

+-----+-----+-----+
|2 1 4|8 6 3|9 5 7|
|8 7 5|4 1 9|2 6 3|
|6 3 9|7 2 5|1 4 8|
+-----+-----+-----+
|4 5 2|3 7 1|8 9 6|
|9 6 7|5 8 2|3 1 4|
|1 8 3|9 4 6|7 2 5|
+-----+-----+-----+-----+-----+
|5 4 1|2 3 8|6 7 9|5 3 4|2 8 1|
|7 2 8|6 9 4|5 3 1|8 2 7|4 6 9|
|3 9 6|1 5 7|4 8 2|1 9 6|3 7 5|
+-----+-----+-----+-----+-----+
            |7 9 8|6 5 3|1 2 4|
            |2 6 5|7 4 1|8 9 3|
            |1 4 3|2 8 9|6 5 7|
            +-----+-----+-----+
            |8 2 7|3 1 5|9 4 6|
            |9 1 6|4 7 2|5 3 8|
            |3 5 4|9 6 8|7 1 2|
            +-----+-----+-----+

I don't know about a constraint programming library in Python, nor do I know of a port of ECLiPSe to Python. But my experience is that all modern programming languages have such tool.

The advantage of using a constraint programming tool like ECLiPSe, Gecode,... is first of all that you only have to specify your problem, how it is solved is something you don't have to worry. Furthermore such libraries implement 30 years of research into constraint programming: they are extremely optimized, exploit constraints and structures in a way most people cannot imagine and are less likely to contain bugs (than a custom made algorithm). Furthermore if new strategies, algorithms,... are found, an update of ECLiPSe will result in faster processing of the model.

A disadvantage is that some hard problems can still not be solved using constraint programming: the search space is simply too large, the constraints are that complex that the domains are not reduced to small sets, and there is not enough processing power to make enough choices in order to find a valid solution. Another disadvantage is that it is not always easy to specify a problem: although programmers aim to design good constraints, there are always complex problems where there are no easy-to-use constraints defined for.

Other techniques

Evidently there are other AI techniques to solve problems. A technique that is commonly used to tackle hard search and optimization problems is evolutionary computing: one starts by filling in the sudoku allowing some values to be wrong, then at each time step they aim to fix one or more fields. Sometimes they introduce new errors in order to find a valid solution eventually.

like image 198
Willem Van Onsem Avatar answered Oct 25 '22 12:10

Willem Van Onsem


A different approach would be to use a genetic algorithm. Which is based on biological evolution. Genetic algorithms are part of evolutionary algorithms and therefore share the analogy "fitness-function". A valid solution is found via recombination, selection and mutation.

The basic concept is to initialize a number of solutions "individuals" which consist of "chromosomes" by random (Solutions can be wrong of course). Define a "fitness-function" which evaluates the quality of every individual within the current "generation". Take with a higher probability the individuals with better fitness to recombine them into a "child" solution and mutate, with a low probability, some individuals within the new generation. Repeat until some criteria (max_iteration_count, valid_solution_found) is true.

To fit a genetic algorithm to your problem. You could do something like the following. Create for every 3x3 or 9x9 sudoku a local correct solution. These are the chromosomes. Put them in one data structure. Thats an individual. Create a bunch of them to form a generation. Evaluate every individual by the constraints its violating. Calculate corresponding probability. Recombine the individuals, mutate some chromosomes, repeat.

You could see it as combination of local and global optimization. Since you would need some kind of greedy algorithm (or whatever you want to use to solve the local sudoku) to find a local, and the GA to find the overall global optima. I think it would not make sense to only use the GA and try to solve this Problem. I implemented a GA on a combinatorial Problem recently and without local optimization the results were, most of the time, quite terrible.

The good thing about GA though, is that they make large steps at the beginning of the search converging towards an optima but still cover big parts of the search space. But as always with EA, tuning the parameters can be quiet tricky.

like image 42
Jesse James Avatar answered Oct 25 '22 14:10

Jesse James