Ruminations on matters of some importance. Also programming and AI.

Wednesday, May 5, 2010

Solving Sudoku with genetic algorithms

I recently wrote a small Python library for genetic algorithms (GA), called optopus. One thing I tried when I played around with it was to solve a Sudoku puzzle. There are plenty of efficient ways to solve Sudoku, but with my shiny new hammer, all problems look like nails.

Also, I remember that I once read something from someone who tried a GA C-library on Sudoku and concluded that it was not a suitable problem. If I could solve it with my slick library, that random person on the internet, whose web page I might never find again but who may exist as far as you know, would certainly be proven wrong. A worthy cause.

Genetic algorithms
A genetic algorithm is a general way to solve optimization problems. The basic algorithm is very simple:
  1. Create a population (vector) of random solutions (represented in a problem specific way, but often a vector of floats or ints)
  2. Pick a few solutions and sort them according to fitness
  3. Replace the worst solution with a new solution, which is either a copy of the best solution, a mutation (perturbation) of the best solution, an entirely new randomized solution or a cross between the two best solutions. These are the most common evolutionary operators, but you could dream up others that use information from existing solutions to create new potentially good solutions.
  4. Check if you have a new global best fitness, if so, store the solution.
  5. If too many iterations go by without improvement, the entire population might be stuck in a local minimum (at the bottom of a local valley, with a possible chasm somewhere else, so to speak). If so, kill everyone and start over at 1.
  6. Go to 2.
Fitness is a measure of how good a solution is, lower meaning better. This measure is performed by a fitness function that you supply. Writing a fitness function is how you describe the problem to the GA. The magnitude of the fitness values returned does not matter (in sane implementations), only how they compare to each other.

There are other, subtly different, ways to perform the evolutionary process. Some are good and some are popular but bad. The one described above is called tournament selection and it is one of the good ways. Much can be said about the intricacies of GA but it will have to be said somewhere else, lest I digress completely.

Optopus and Sudoku
Using optopus is easy:

from optopus import ga, stdgenomes

#Now we choose a representation. We know that the answer to the puzzle must be some permutation of the digits 1 to 9, each used nine times.

init_vect = sum([range(1,10)] * 9, []) # A vector of 81 elements
genome = stdgenomes.PermutateGenome (init_vect)

#I made a few functions to calculate how many conflicts a potential Sudoku solution has. I'll show them later, but for now let us just import the package. I also found a puzzle somewhere and put it in the PUZZLE constant.

import sudoku
solver = ga.GA(sudoku.ga_sudoku(sudoku.PUZZLE) , genome)

#And now, when we have supplied the GA with a fitness function (ga_sudoku, which counts Sudoku conflicts) and a representation (genome), let us just let the solver do its magic.


And in a few minutes (about 2.6 million iterations when I tried) the correct answer pops out!

The nice thing about this method is that you do not have to know anything about how to solve a Sudoku puzzle or even think very hard at all. Note that I did not even bother to just let it search for the unknown values - it also has to find the digits that we already know (which should not be too hard with a decent fitness function, see below). The only bit of thinking we did was to understand that a Sudoku solution has to be a permutation of [1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3, 4, 5, 6, 7, 8, 9], but this merely made the evolution part faster. If we wanted to make it faster still, we could make a genome type that let us say that there are actually nine separate vectors who are each guaranteed to be a permutation of 1 to 9. We could have thought even less and represented the solution by 81 ints who are all in the range 1 to 9, by using another genome type:

>> genome = stdgenomes.EnumGenome(81, range(1,10))

The range argument to EnumGenome does not have to be a vector of integers, it could be a vector of any objects, since they are never treated like numbers.

In my experiment this took maybe 15 - 30 minutes to solve. For more difficult Sudoku puzzles, I would definitely go with the permutation genome, since using EnumGenome increases the search space to 9^81 or 196627050475552913618075908526912116283103450944214766927315415537966391196809 possible solutions.

FYI, this is the puzzle in sudoku.PUZZLE:

  4|8  | 17
67 |9  |   
5 8| 3 |  4
3  |74 |1  
 69|   |78 
  1| 69|  5
1  | 8 |3 6
   |  6| 91
24 |  1|5  

I think a Sudoku puzzle that is harder for humans would not be that much harder for optopus to solve, but I have not tested it.

Sudoku fitness function
OK, so that was a ridiculously easy way to solve a Sudoku puzzle, but I skipped one part that is crucial to all GA - describing the problem using a fitness function. I had to do the following:

DIM = 9

def one_box(solution, i):
    """Extract the 9 elements of a 3 x 3 box in a 9 x 9 Sudoku solution."""
    return solution[i:i+3] + solution[i+9:i+12] + solution[i+18:i+21]

def boxes(solution):
    """Divide a flat vector into vectors with 9 elements, representing 3 x 3
    boxes in the corresponding 9 x 9 2D vector. These are the standard
    Sudoku boxes."""
    return [one_box(solution, i) for i in [0, 3, 6, 27, 30, 33, 54, 57, 60]]

def splitup(solution):
    """Take a flat vector and make it 2D"""
    return [solution[i * DIM:(i + 1) * DIM] for i in xrange(DIM)]

def consistent(solution):
    """Check how many different elements there are in each row.
    Ideally there should be DIM different elements, if there are no duplicates."""
    return sum(DIM - len(set(row)) for row in solution)

def compare(xs1, xs2):
    """Compare two flat vectors and return how much they differ"""
    return sum(1 if x1 and x1 != x2 else 0 for x1, x2 in zip(xs1, xs2))

def sudoku_fitness(flatsolution, puzzle, flatpuzzle=None):
    """Evaluate the fitness of flatsolution."""
    if not flatpuzzle:
        flatpuzzle = sum(puzzle, [])
    solution = splitup(flatsolution)
    fitness = consistent(solution) #check rows
    fitness += consistent(zip(*solution)) #check columns
    fitness += consistent(boxes(flatsolution)) #check boxes
    fitness += compare(flatpuzzle, flatsolution) * 10 #check that it matches the known digits
    return fitness

def ga_sudoku(puzzle):
    """Return a fitness function wrapper that extracts the .genes attribute from
    an individual and sends it to sudoku_fitness."""
    flatpuzzle = sum(puzzle, []) #just a precalcing optimization
    def fit(guy):
        return sudoku_fitness(guy.genes, puzzle, flatpuzzle)
    return fit

I know. This made the solution less clean. Still, I made it verbose for readability, so it is perhaps less code than it looks.

Take that, random internet guy!