|
mathematics of Sudoku
The class of
Sudoku puzzles consists of
a partially completed row-column grid of cells partitioned
into N regions each of size N cells, to be filled
in so that each row, column and region contains each of the
numbers {1, ..., N}. The puzzle can be investigated
using mathematics.
Overview
The mathematical analysis of Sudoku falls into two main areas:
analyzing the properties of a) completed grids and b) puzzles.
Grid analysis has largely focused on counting (enumerating)
possible solutions for different variants. Puzzle analysis
centers on the initial given values. The techniques
used in either are largely the same: combinatorics
and permutation
group theory, augmented by the dextrous application of
programming tools.
There are many Sudoku variants, (partially)
characterized by the size (N) and shape of their regions.
For classic Sudoku, N=9 and the regions are 3x3 squares
(blocks). A rectangular Sudoku uses rectangular regions of
row-column dimension R×C. For R×1 (and
1×C), i.e. where the region is a row or column, Sudoku becomes
a Latin square.
Other Sudoku variants also exist, such as those with irregularly-shaped
regions or with additional constraints (hypercube)
or different (Samunamupure)
constraint types. See Sudoku - Variants for
a discussion of variants and Sudoku terms and jargon for an expanded listing.
Mathematics uses elegant and succinct symbolism, often meaningful
only with formal training. For this article, an abstract mathematical
statement is used at the upper levels. Detail descriptions
are deferred to sub-sections, usually with the term 'detail'
in the section heading, and can be skipped where the condensed
mathematical statement is (deemed) sufficient. The extended
explanations can also be used as a practical introduction
to the mathematical concepts involved.
The mathematics of Sudoku is a relatively new area of exploration,
mirroring the popularity of Sudoku itself. NP-completeness
was documented late 2002 [1], enumeration results began appearing in May 2005 [2]. This article is incomplete and will undoubtedly remain so
as these efforts continue.
Mathematical context
The general problem of solving Sudoku puzzles on n2
× n2 boards of n × n blocks
is known to be NP-complete [3]. This gives some indication of why Sudoku is difficult
to solve, although on boards of finite size the problem is
finite and can be solved by a deterministic finite automaton that knows the entire game tree.
Solving Sudoku puzzles can be expressed as a graph colouring
problem. Let us talk about the 9 × 9 = 32 × 32
case. The aim of the puzzle in its standard form is to construct
a proper 9-colouring of a particular graph, given a partial
9-colouring. The graph in question has 81 vertices, one vertex
for each cell of the grid. The vertices can be labelled with
the ordered pairs (x, y), where x and
y are integers between 1 and 9. In this case, two distinct
vertices labelled by (x, y) and (x′,
y′) are joined by an edge if and only if:
- x = x′ (same column) or,
- y = y′ (same row) or,
- ? x/3 ? = ? x′/3 ? and ? y/3 ? =
? y′/3 ? (same 3 × 3 cell)
The puzzle is then completed by assigning an integer between
1 and 9 to each vertex, in such a way that vertices that are
joined by an edge do not have the same integer assigned to
them.
A valid Sudoku solution grid is also a Latin square.
There are significantly fewer valid Sudoku solution
grids than Latin squares because Sudoku imposes the
additional regional constraint. Nonetheless, the number of
valid Sudoku solution grids for the standard 9×9 grid
was calculated by Bertram Felgenhauer in 2005 to be 6,670,903,752,021,072,936,960
[4] (sequence
A107739 in OEIS). This number is equal to 9! × 722 × 27
× 27,704,267,971, the last factor of which is prime. The result
was derived through logic and brute force
computation. The
derivation of this result was considerably simplified by analysis
provided by Frazer Jarvis and the figure has been confirmed
independently by Ed Russell. Russell and Jarvis also showed
that when symmetries were taken into account, there were 5,472,730,538
solutions [5] (sequence A109741 in OEIS). The number of valid Sudoku solution grids for the
16×16 derivation is not known.
The maximum number of givens that can be provided while still
not rendering the solution unique, regardless of variation,
is four short of a full grid; if two instances of two numbers
each are missing and the cells they are to occupy are the
corners of an orthogonal rectangle, and exactly two of these
cells are within one region, there are two ways the numbers
can be added. The inverse of this—the fewest givens that render
a solution unique—is an unsolved problem, although the lowest number yet found for the
standard variation without a symmetry constraint is 17, a
number of which have been found by Japanese puzzle enthusiasts
[6] [7],
and 18 with the givens in rotationally symmetric cells.
Terminology
A puzzle is a partially completed grid. The
initial puzzle values are known as givens or clues.
The regions are also called boxes or blocks.
The use of the term square is generally avoided because
of ambiguity. Horizontally adjacent blocks constitute a band
(the vertical equivalent is called a stack).
A proper puzzle has a unique solution. The constraint
'each digit appears in each row, column and region'
is called the One Rule
See basic terms
or the List of Sudoku terms and jargon for an expanded list of terminology.
Type identification and classification
Sudoku types can be identified by their region dimensions.
"3x3" Sudokus denotes the classic format. RxC
denotes a rectangular region with R rows and C
columns. The implied grid configuration has R blocks
per band (C blocks per stack), C×R bands×stacks and
grid dimensions NxN, with N = R · C. Region dimensions are
preferable to region size for characterisation, as size can
be ambiguous, e.g. 2×6 and 3×4 regions both have size 12.
The following framework can be used to structure the relation
of Sudoku variants:
- region size
- rectangular regions, with a sub-types ordered by region
perimeter length as needed
- irregular (polyomino)
regions, used for prime region sizes and other variants
The application of additional constraints can be used to
generate further specializations.
The three principal (sub)classes of Sudoku are rectangular,
square, prime (region). The size ordering of each can be used
to define an integer series, e.g. for square Sudoku, the integer
series of possible solutions ((sequence A107739 in OEIS)).
Sudoku with square NxN regions are more symmetrical
than immediately obvious from the One Rule. Each row
and column intersects N regions and shares N cells with each.
The number of bands and stacks also equals N. Rectangular
Sudoku do not have these properties. The "3x3" Sudoku is additionally
unique: N is also the number of row-column-region constraints
from the One Rule (i.e. there are N=3 types of units).
See the List of Sudoku terms and jargon for an expanded list and classification
of variants.
Definition of terms and labels
Let
- s be a solution to a Sudoku grid with specific dimensions,
satisfying the One Rule constraints
- S = {s}, be the set of all solutions
- |S|, the cardinality of S, is the number of elements
in S, i.e. the number of solutions, also known as the size
of S.
The number of solutions depends on the grid dimensions, rules
applied and the definition of distinct solutions. For the
3×3 region grid, the conventions for labeling the rows, columns,
blocks (boxes) are as shown. Bands are numbered top to bottom,
stacks left to right. By extension the labeling scheme applies
to any rectangular Sudoku grid.
Term labels for box-row and box-column triplets are
also shown.
- triplet - an unordered combination of 3 values used in a box-row (or box-column), e.g.
a triplet = {3, 5, 7} means the values 3, 5, 7 occur in
a box-row (column) without specifying their location order.
A triplet has 6 (3!) ordered permutations. By convention, triplet values are represented
by their ordered digits. Triplet objects are labeled as:
- rBR, identifies a row triplet for box B
and (grid) row R, e.g. r56 is the triplet for box
5, row 6, using the grid row label.
- cBC, identifies similarly a column triplet for
box B and (grid) column row C.
The {a, b, c} notation also reflects that fact a triplet
is a subset of the allowed
digits. For regions of arbitrary dimension, the related object
is known as a minicol(umn) or minirow.
Enumerating Sudoku solutions
The answer to the question 'How many Sudokus are there?'
depends on the definition of when similar solutions are considered
different.
Enumerating all possible Sudoku solutions
For the enumeration of all possible solutions, two
solutions are considered distinct if any of their corresponding
(81) cell values differ. Symmetry relations between similar
solutions are ignored., e.g. the rotations of a solution are
considered distinct. Symmetries play a significant role in
the enumeration strategy, but not in the count of all
possible solutions.
Enumerating the Sudoku 9×9 grid solutions directly
The first approach taken historically to enumerate Sudoku
solutions (Enumerating possible Sudoku grids by Felgenhauer and Jarvis)
was to analyze the permutations of the top band used in valid solutions. Once the
Band1 symmetries and equivalence
classes for the partial grid solutions were identified,
the completions of the lower two bands were constructed and
counted for each equivalence class. Summing completions over
the equivalence classes, weighted by class size, gives the
total number of solutions as 6,670,903,752,021,072,936,960
(6.67×1021). The value was subsequently confirmed numerous
times independently. The Algorithm
details section (below) describes the method.
Enumeration using band generation
[A second enumeration technique based on band generation
was later developed that is significantly less computationally
intensive (requires < 1 second!). See the discussion
page, until the results are integrated here.]
Enumerating essentially different Sudoku solutions
Two valid grids are essentially the same if one can
be derived from the other.
Sudoku preserving symmetries
The following operations always translate a valid Sudoku
grid into another valid grid: (values represent permutations
for classic Sudoku)
- Relabeling symbols (9!)
- Band permutations (3!)
- Row permutations within a band (3!3)
- Stack permutations (3!)
- Column permutations within a stack (3!3)
- Reflection, transposition or (1/4 turn) rotation (2) (Given
any of these 3 (in conjunction with the above permutations),
the other 2 can be produced, so this operation only contributes
a factor of 2).
These operations define a symmetry relation between equivalent
grids. Excluding relabeling, and with respect to the 81 grid
cell values, the operations form a subgroup of the symmetric group
S81, of order 3!8×2 = 3359232.
Identifying distinct solutions with Burnside's Lemma
For a solution, the set of equivalent solutions which can
be reached using these operations (excluding relabeling),
form an orbit of the
symmetric group.
The number of essentially different solutions is then the
number of orbits, which can be computed using Burnside's lemma.
The Burnside fixed points are solutions that differ
only by relabeling. Using this technique, Jarvis/Russell computed the number of essentially different (symmetrically
distinct) solutions as 5,472,730,538.
Enumeration results
The number of ways of filling in a blank Sudoku grid was
shown in May 2005 to be 6,670,903,752,021,072,936,960 (~6.67×1021)
(original announcement). The paper Enumerating possible Sudoku grids, by Felgenhauer and Jarvis,
describes the calculation.
Since then, enumeration results for many Sudoku variants
have been calculated: these are summarised below.
Sudoku with rectangular regions
In the table, "Dimensions" are those of the regions (e.g.
3x3 in normal Sudoku). The "Rel Err" column indicates how
a simple approximation, using the generalised method of Kevin Kilfoil, compares to the true grid
count: it is an underestimate in all cases evaluated so far.
| Dimensions |
Nr Grids |
Attribution |
Verified? |
Rel Err |
| 1x? |
see Latin squares |
n/a |
| 2×2 |
288 |
various |
Yes |
-11.1% |
| 2×3 |
28200960 = c. 2.8×107 |
Pettersen |
Yes |
-5.88% |
| 2×4 |
29136487207403520 = c. 2.9×1016 |
Russell |
Yes |
-1.91% |
| 2×5 |
1903816047972624930994913280000 = c. 1.9×1030 |
Pettersen |
Yes |
-0.375% |
| 3×3 |
6670903752021072936960 = c. 6.7×1021 |
Felgenhauer/Jarvis |
Yes |
-0.207% |
| 3×4 |
81171437193104932746936103027318645818654720000 =
c. 8.1×1046 |
Pettersen/Silver |
No |
-0.132% |
| 3×5 |
unknown, estimated c. 3.5086×1084 |
Silver |
n/a |
| 4×4 |
unknown, estimated c. 5.9584×1098 |
Silver |
n/a |
| 4×5 |
unknown, estimated c. 3.1764×10175 |
Silver |
n/a |
| 5×5 |
unknown, estimated c. 4.3648×10308 |
Silver/Pettersen |
n/a |
The standard 3×3 calculation can be carried out in less than
a second on a PC. The 3×4 (= 4×3) problem is much harder and
took 2568 hours to solve, split over several computers.
Sudoku bands
For large (R,C), the method of Kevin Kilfoil (generalised here) is used to estimate the number of grid completions. The
method asserts that the Sudoku row and column constraints
are, to first approximation, conditionally independent given the box constraint. Omitting a
little algebra, this gives the Kilfoil-Silver-Pettersen formula:
where bR,C is the number of ways of completing
a Sudoku band of R horizontally adjacent R×C
boxes. Pettersen's algorithm, as implemented by Silver, is currently the fastest known technique for exact evaluation
of these bR,C.
The band counts for problems whose full Sudoku grid-count
is unknown are listed below. As in the previous section,
"Dimensions" are those of the regions.
-
| Dimensions |
Nr Bands |
Attribution |
Verified? |
| 2×C |
(2C)! (C!)2 |
(obvious result) |
Yes |
| 3×C |
 |
Pettersen |
Yes |
| 4×C |
(long expression: see below) |
Pettersen |
Yes |
| 4×4 |
16! × 4!12 × 1273431960 = c. 9.7304×1038 |
Silver |
Yes |
| 4×5 |
20! × 5!12 × 879491145024 = c. 1.9078×1055 |
Russell |
Yes |
| 4×6 |
24! × 6!12 × 677542845061056 = c. 8.1589×1072 |
Russell |
Yes |
| 4×7 |
28! × 7!12 × 563690747238465024 = c.
4.6169×1091 |
Russell |
Yes |
| (calculations up to 4×100
have been performed by Silver, but are not listed here) |
| 5×3 |
15! × 3!20 × 324408987992064 = c. 1.5510×1042 |
Silver |
Yes# |
| 5×4 |
20! × 4!20 × 518910423730214314176
= c. 5.0751×1066 |
Silver |
Yes# |
| 5×5 |
25! × 5!20 × 1165037550432885119709241344
= c. 6.9280×1093 |
Pettersen/Silver |
No |
| 5×6 |
30! × 6!20 × 3261734691836217181002772823310336
= c. 1.2127×10123 |
Pettersen/Silver |
No |
| 5×7 |
35! × 7!20 × 10664509989209199533282539525535793414144
= c. 1.2325×10154 |
Pettersen/Silver |
No |
| 5×8 |
40! × 8!20 × 39119289737902332276642894251428026550280700096
= c. 4.1157×10186 |
Pettersen/Silver |
No |
- # : same author, different method
The expression for the 4×C case is:

where:
the outer summand is taken over all a,b,c such that 0<=a,b,c
and a+b+c=2C
the inner summand is taken over all k12,k13,k14,k23,k24,k34
≥ 0 such that
k12,k34 ≤ a and
k13,k24 ≤ b and
k14,k23 ≤ c and
k12+k13+k14 = a-k12+k23+k24 = b-k13+c-k23+k34 = c-k14+b-k24+a-k34
= C
Sudoku with additional constraints
The following are all restrictions of 3x3 Sudokus. The
type names have not been standardised: click on the attribution
links to see the definitions.
All Sudokus remain valid (no repeated numbers in any row,
column or region) under the action of the Sudoku
preserving symmetries (see also Jarvis). Some Sudokus are special in that some operations merely
have the effect of relabelling the digits; several of these
are enumerated below.
| Transformation |
Nr Grids |
Attribution |
Verified? |
| Transposition |
10980179804160 |
Russell |
Indirectly |
| Quarter Turn |
4737761280 |
Indirectly |
| Half Turn |
56425064693760 |
Indirectly |
| Band cycling |
5384326348800 |
Indirectly |
| Within-band row cycling |
39007939461120 |
Indirectly |
Further calculations of this ilk combine to show that the
number of essentially different Sudoku grids is 5,472,730,538,
for example as demonstrated by Jarvis/Russell and verified by Pettersen. Similar methods have been applied to the 2x3 case,
where Jarvis/Russell showed that there are 49 essentially different
grids (see also the article by Bailey, Cameron and Connelly), to the 2x4 case, where Russell showed that there are 1,673,187 essentially different
grids (verified by Pettersen), and to the 2x5 case where Pettersen showed that there are 4,743,933,602,050,718 essentially
different grids (not verified).
Minimum number of givens
Proper puzzles have a unique solution. An irreducible
puzzle (or minimal puzzle) is a proper puzzle
from which no givens can removed leaving it a proper puzzle
(with a single solution). It is possible to construct minimal
puzzles with different numbers of givens. This section discusses
the minimum number of givens for proper puzzles.
Ordinary Sudoku
The lowest known is 17 givens in general Sudoku, or 18
when the positions of the givens are constrained to be half-turn
rotationally symmetric. It is conjectured that these are
the best possible, evidence for which stems from extensive
randomised searching:
- Gordon Royle has compiled a list of (currently) 36628 17-clue
puzzles, each of which is unique up to isomorphism.
None of these was isomorphic to a symmetric puzzle, nor
contained a 16-clue puzzle.
- An independent construction of 700 distinct 17-clue
puzzles found only 33 not already on an earlier, size
32930, version of Royle's list. From that, the MLE estimate
of the number of 17-clue grids was c. 34550. If
the construction methods were truly random and independent
then this would imply a negligible probability of there
being a 16-clue puzzle waiting to be discovered -- since
any such "16" would give rise to 65 17-clue puzzles, all
of which were somehow missed in the search.
- The most fruitful set of clue positions, in terms of
number of distinct 17-clue puzzles they admit, from Royle's
list have been exhaustively searched for 17-clue puzzles.
All 36 puzzles found by this process were already in Royle's
list. All 34 puzzles on the next most fruitful set of
clue positions were also in Royle's list.
- The most fruitful solution grids (in the same sense)
have been exhaustively searched for 16-clue puzzles using
CHECKER with no success. This includes one, "strangely familiar", grid that yields at least 29 different
17-clue puzzles.
Sudoku with additional constraints
Additional constraints (here, on 3×3 Sudokus) lead to a
smaller minimum number of clues.
- 3doku: no results for this variant
- Disjoint Groups: some 12-clue puzzles have been demonstrated by Glenn Fowler.
It is not known if this is the best possible.
- Hypercube: various 8-clue puzzles
(the best possible) have been demonstrated by Guenter
Stertenbrink.
- Magic Sudoku: a 7-clue example has been provided by Guenter Stertenbrink.
It is not known if this is the best possible.
- Sudoku X: a 13-clue
example has been provided by Gordon Royle. It is not
known if this is the best possible.
- NRC Sudoku: an 11-clue
example has been provided by Andries Brouwer. It is
not known if this is the best possible.
Sudoku with irregular regions
"Du-sum-oh" (a.k.a. "geometry number place") puzzles replace
the 3×3 (or R×C) regions of Sudoku with irregular
shapes of a fixed size. Bob Harris has proved that it is always possible to create N-1 clue du-sum-ohs
on an N×N grid, and has constructed several
examples.
Sum number place ("Killer Sudoku")
In sum number place (Samunamupure), the regions are of
irregular shape and various sizes. The usual constraints
of no repeated value in any row, column or region apply.
The clues are given as sums of values within regions (e.g.
a 4-cell region with sum 10 must consist of values 1,2,3,4
in some order). The minimum number of clues for Samunamupure
is not known, nor even conjectured.
A variant
on Miyuki Misawa's web site replaces sums with relations:
the clues are symbols =, < and >
showing the relative values of (some but not all) adjacent
region sums. She demonstrates an example with only eight
relations. It is not known whether this is the best possible.
Method and algorithm details for the 9×9 grid direct enumeration
The approach described here was the historically first
strategy employed to enumerate the Sudoku 9×9 grid solutions,
as published by Felgenhauer and Jarvis in Enumerating possible Sudoku grids. The methods and algorithms
used are very straight forward and provide a practical introduction
to several mathematical concepts. The development is presented
here for those wishing to explore these topics.
The strategy begins by analyzing the permutations of the top band used in valid solutions. Once the
Band1 symmetries and equivalence
class for the partial solutions are identified, the
completions of the lower two bands are constructed and counted
for each equivalence class. Summing the completions over
the equivalence classes gives the total number of solutions
as 6,670,903,752,021,072,936,960 (c. 6.67×1021).
Counting the top band permutations
The Band1 algorithm proceeds as follows:
- Choose a canonical labeling of the digits by assigning
values for B1, e.g.
1 2 3
4 5 6
7 8 9
- Compute the rest of the Band1 permutations relative
to the B1 canonical choice.
- Compute the permutations of B2 by partitioning
the B1 cell values over the B2 row triplets.
From the triplet combinations compute the B2 permutations.
There are k=0..3 ways to choose the:
- B1 r11 values for B2 r22, the rest must go to r23,
- B1 r12 values for B2 r23, the rest must go to r21,
- B1 r13 values for B2 r21, the rest must go to r22, i.e.
(This expression may be generalized to any R×3 box
band variant. (Pettersen). Thus B2 contributes 56 × 63 permutations.
- The choices for B3 triplets are row-wise determined
by the B1 B2 row triplets. B3 always contributes 6^3 permutations.
The permutations for Band1 are 9! × 56 × 66
= 9! × 2612736 ~ 9.48 × 1011.
Band1 permutation details
Triplet rBR(box/row) Labels
| |
|
|
The permutations of B1 are the number of ways to relabel
the 9 digits, 9! = 362880. Counting the permutations for
B2 is more complicated, because the choices for B2 depend
on the values in B1. (This is a visual representation of
the expression given above.) The conditional calculation
needs a branch (sub-calculation) for each alternative. Fortunately,
there are just 4 cases for the top B2 triplet (r21): it
contains either 0, 1, 2, or 3 of the digits from the B1
middle row triplet(r12). Once this B2 top row choice is
made, the rest of the B2 combinations are fixed. The Band1
row triplet labels are shown on the right.
(Note: Conditional combinations becomes an increasingly
difficult as the computation progresses through the grid.
At this point the impact is minimal.)
Case 0 Matching Cells Triplets
| |
|
|
Case 0: No Overlap. The choices for the triplets can be
determined by elimination.
- r21 can't be r11 or r12 so it must be = r13; r31 must
be = r12 etc.
The Case 0 diagram shows this configuration, where the
pink cells are triplet values that can be arranged in any
order within the triplet. Each triplet has 3! = 6 permutations.
The 6 triplets contribute 66 permutations.
Case 3: 3 Digits Match: triplet r21 = r12. The same logic
as case 0 applies, but with a different triplet usage. Triplet
r22 must be = r13, etc. The number of permutations is again
66. (Felgenhauer/Jarvis call the cases 0 and 3 the pure match
case.
Case 1 Match - Triplet Cell Options
| |
|
|
Case 1: 1 Match for r21 from r12
In the Case 1 diagram, B1 cells show canonical values,
which are color coded to show their row-wise distribution
in B2 triplets. Colors reflect distribution but not location
or values. For this case: the B2 top row triplet (r21) has
1 value from B1 middle triplet, the other colorings can
now be deduced. E.g. the B2 bottom row triplet (r23) coloring
is forced by r21: the other 2 B1 middle values must go to
bottom, etc. Fill in the number of B2 options for each color,
3..1, beginning top left. The B3 color coding is omitted
since the B3 choices are row-wise determined by B1, B2.
B3 always contributes 3! permutations per row triplet, or
63 for the block.
For B2, the triplet values can appear in any position,
so a 3! permutation factor still applies, for each triplet.
However, since some of the values were paired relative to
their origin, using the raw option counts would overcount
the number of permutations, due to interchangeability within
the pairing. The option counts need to be divided by the
permuted size of their grouping (2), here 2!.= 2 (See n Choose k) The
pair in each row cancels the 2s for the B2 option counts,
leaving a B2 contribution of 33 × 63.
The B2×B3 combined contribution is 33 × 66.
Case 2 Match - Triplet Cell Options
| |
|
| |