Gudoku is a differentiable Sudoku solver built with PyTorch. Instead of traditional backtracking or constraint propagation, it encodes the Sudoku rules as differentiable loss functions, so a puzzle can be solved either by direct gradient descent or by a trained neural network.
This notebook is divided into two main phases:
- The first phase uses gradient descent and continuous relaxation to solve Sudoku by minimizing constraint violations directly. No training is involved; the puzzle is solved purely through optimization.
- The second phase uses a trainable neural network to learn a mapping from partially filled Sudoku grids to complete solutions. A custom loss function enforces the Sudoku rules during training, and fixed clues are respected during both training and inference.
Sudoku is a combinatorial logic puzzle composed of a $9 \times 9$ grid, which must be filled so that:
- Each row contains the digits $1$ through $9$ (no repeats)
- Each column contains the digits $1$ through $9$ (no repeats)
- Each $3 \times 3$ box contains the digits $1$ through $9$ (no repeats)
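For reference, the three rules above can be checked on a completed grid with a few lines of plain Python. This is only an illustrative sketch: the function name `is_valid_solution` is hypothetical, and it assumes the grid is a 9 x 9 list of lists of integers.

```python
def is_valid_solution(grid):
    """Return True if a 9x9 grid (list of lists of ints) satisfies all Sudoku rules.

    Assumes the grid really is 9x9; no shape checking is done here.
    """
    digits = set(range(1, 10))
    rows = [set(row) for row in grid]
    cols = [set(col) for col in zip(*grid)]
    boxes = [
        {grid[r][c] for r in range(br, br + 3) for c in range(bc, bc + 3)}
        for br in (0, 3, 6)
        for bc in (0, 3, 6)
    ]
    # Every row, column, and box must contain exactly the digits 1..9.
    return all(group == digits for group in rows + cols + boxes)
```

A discrete check like this is useful for verifying the output of either solver after the relaxed tensor has been rounded back to digits.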
This project explores differentiable programming techniques for constraint satisfaction problems. Sudoku rules are expressed as soft differentiable constraints, allowing gradient-based methods to operate on a problem typically solved by discrete logic.
Let the Sudoku grid be a $9 \times 9 \times 9$ tensor X, where:
- X[i,j,k] = 1 if cell (i, j) is assigned number k + 1
- X[i,j,k] ∈ [0, 1] during optimization or training
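As a rough illustration of this representation, the sketch below encodes an integer grid as the tensor X and decodes it back. The helper names `encode_grid` and `decode_grid` are hypothetical, and initializing blank cells to a uniform distribution is an assumption made here, not necessarily what the notebook does.

```python
import torch
import torch.nn.functional as F

def encode_grid(grid):
    """Encode a 9x9 integer grid (0 = blank) as a 9x9x9 tensor X plus a clue mask."""
    grid = torch.as_tensor(grid, dtype=torch.long)
    mask = grid > 0  # True where a clue is given
    # Shift digits 1..9 to channels 0..8; blanks are temporarily mapped to channel 0.
    one_hot = F.one_hot((grid - 1).clamp(min=0), num_classes=9).float()
    # Assumption: blank cells start as a uniform distribution over the nine digits.
    uniform = torch.full((9, 9, 9), 1.0 / 9.0)
    X = torch.where(mask.unsqueeze(-1), one_hot, uniform)
    return X, mask

def decode_grid(X):
    """Pick the most likely digit in each cell (channel k maps to digit k + 1)."""
    return X.argmax(dim=-1) + 1
```

Clue cells come out as exact one-hot vectors, while blank cells hold values in [0, 1] that sum to 1 across the digit axis.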
With zero-based indices $i, j, k \in \{0, \dots, 8\}$ (so channel $k$ stands for digit $k + 1$), the constraints are:
- Cell Constraint: each cell must contain exactly one number. For all $i, j$:
  $\sum_{k=0}^{8} X[i,j,k] = 1$
- Row Constraint: each number must appear exactly once in every row. For all $i, k$:
  $\sum_{j=0}^{8} X[i,j,k] = 1$
- Column Constraint: each number must appear exactly once in every column. For all $j, k$:
  $\sum_{i=0}^{8} X[i,j,k] = 1$
- Box Constraint: each number must appear exactly once in every $3 \times 3$ box. For all $a, b \in \{0, 1, 2\}$ and all $k$:
  $\sum_{i=3a}^{3a+2} \sum_{j=3b}^{3b+2} X[i,j,k] = 1$
- Clue Constraint (for given values): fixed cells are clamped to their given number. If cell $(i, j)$ is fixed with number $k + 1$:
  $X[i,j,k] = 1$ and $X[i,j,l] = 0$ for all $l \neq k$
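The cell, row, column, and box constraints translate fairly directly into tensor reductions. The following is a minimal sketch of a sum-of-squared-violations loss; the name `constraint_loss` and the equal weighting of the four terms are assumptions for illustration, and the notebook's actual loss may differ.

```python
import torch

def constraint_loss(X):
    """Sum of squared violations of the cell, row, column, and box constraints.

    X has shape (9, 9, 9) with axes (row i, column j, digit channel k).
    """
    cell = X.sum(dim=2)  # indexed by (i, j): sum over digits
    row = X.sum(dim=1)   # indexed by (i, k): sum over columns
    col = X.sum(dim=0)   # indexed by (j, k): sum over rows
    # View the grid as (a, r, b, c, k) with i = 3a + r and j = 3b + c,
    # then sum inside each 3x3 box to get a tensor indexed by (a, b, k).
    box = X.reshape(3, 3, 3, 3, 9).sum(dim=(1, 3))
    # Each of these sums should equal 1; penalize the squared deviation.
    return sum(((s - 1.0) ** 2).sum() for s in (cell, row, col, box))
```

One thing worth noting: a tensor with every entry equal to 1/9 also gives zero loss under these four terms, so the clue constraint is what rules out that trivial relaxed solution.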
The dataset is stored in a Parquet file:
- grid: The initial Sudoku state (0s indicate blanks)
- solution: The full correct solution
Data is parsed into tensors for training and optimization.
You can find and download the dataset here
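Parsing a single record into a tensor might look like the sketch below. The storage format of the grid and solution columns is not specified here, so this assumes each entry is a sequence of 81 digits (a string or a list of integers) in row-major order; `parse_grid` and the file name in the comment are hypothetical.

```python
import torch

def parse_grid(cells):
    """Turn 81 digits (a string of digits or a sequence of ints) into a 9x9 LongTensor.

    Assumption: cells are listed row by row, with 0 marking a blank.
    """
    values = [int(ch) for ch in cells]
    if len(values) != 81:
        raise ValueError(f"expected 81 cells, got {len(values)}")
    return torch.tensor(values, dtype=torch.long).reshape(9, 9)

# Hypothetical usage with pandas (the file name is a placeholder):
#   import pandas as pd
#   df = pd.read_parquet("sudoku.parquet")
#   puzzles = torch.stack([parse_grid(g) for g in df["grid"]])
#   solutions = torch.stack([parse_grid(s) for s in df["solution"]])
```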
Optimization phase:
- Inputs: One-hot relaxed representation of unknown cells
- Loss: Sum of squared constraint violations
- Method: Gradient descent (e.g., Adam)
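The three ingredients listed above can be combined into a short optimization loop. This is a sketch under several assumptions rather than the notebook's implementation: the relaxation is a softmax over free logits, clues are clamped by overwriting those cells on every step, and the step count, learning rate, and the names `solve_by_descent` and `constraint_loss` are illustrative. Convergence to a valid solution is not guaranteed.

```python
import torch
import torch.nn.functional as F

def constraint_loss(X):
    """Sum of squared violations of the cell, row, column, and box constraints."""
    cell, row, col = X.sum(dim=2), X.sum(dim=1), X.sum(dim=0)
    box = X.reshape(3, 3, 3, 3, 9).sum(dim=(1, 3))
    return sum(((s - 1.0) ** 2).sum() for s in (cell, row, col, box))

def solve_by_descent(puzzle, steps=500, lr=0.1, seed=0):
    """puzzle: 9x9 LongTensor with 0 for blanks. Returns (digits, final loss)."""
    torch.manual_seed(seed)
    mask = (puzzle > 0).unsqueeze(-1)                       # clue positions
    clues = F.one_hot((puzzle - 1).clamp(min=0), 9).float()  # one-hot clue values
    # Free parameters: one logit vector per cell, started near uniform.
    logits = (0.01 * torch.randn(9, 9, 9)).requires_grad_()
    optimizer = torch.optim.Adam([logits], lr=lr)

    def relaxed_grid():
        # Softmax keeps each cell in [0, 1]; clue cells are clamped to one-hot.
        return torch.where(mask, clues, torch.softmax(logits, dim=-1))

    for _ in range(steps):
        optimizer.zero_grad()
        loss = constraint_loss(relaxed_grid())
        loss.backward()
        optimizer.step()

    with torch.no_grad():
        X = relaxed_grid()
        return X.argmax(dim=-1) + 1, constraint_loss(X).item()
```

Because the softmax already makes every cell sum to 1, the cell constraint is satisfied by construction in this variant, and the optimizer only has to reduce the row, column, and box violations.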
Neural network phase:
- Inputs: Partial grid (with clues)
- Network: MLP or CNN (a simple ResNet model is used in this notebook)
- Loss: Same constraint loss as in the optimization phase, plus an optional supervised loss on known solutions
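To make the setup above concrete, here is a small residual CNN with a combined loss. It is a sketch only: the class names, layer sizes, 10-channel input encoding, and the use of a negative log-likelihood term for the supervised part are all assumptions, and the ResNet in the notebook may be structured differently.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ResidualBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)

    def forward(self, x):
        # Skip connection around two 3x3 convolutions.
        return F.relu(x + self.conv2(F.relu(self.conv1(x))))

class SudokuNet(nn.Module):
    def __init__(self, channels=64, num_blocks=4):
        super().__init__()
        self.stem = nn.Conv2d(10, channels, kernel_size=3, padding=1)
        self.blocks = nn.Sequential(*[ResidualBlock(channels) for _ in range(num_blocks)])
        self.head = nn.Conv2d(channels, 9, kernel_size=1)

    def forward(self, puzzle):
        """puzzle: (B, 9, 9) LongTensor, 0 = blank. Returns X of shape (B, 9, 9, 9)."""
        # Input encoding: 10 channels, channel 0 marks blanks.
        x = F.one_hot(puzzle, num_classes=10).permute(0, 3, 1, 2).float()
        logits = self.head(self.blocks(F.relu(self.stem(x))))      # (B, k, i, j)
        probs = torch.softmax(logits, dim=1).permute(0, 2, 3, 1)   # (B, i, j, k)
        # Respect fixed clues by overwriting those cells with one-hot vectors.
        mask = (puzzle > 0).unsqueeze(-1)
        clues = F.one_hot((puzzle - 1).clamp(min=0), 9).float()
        return torch.where(mask, clues, probs)

def batched_constraint_loss(X):
    """Mean over the batch of the summed squared constraint violations."""
    cell, row, col = X.sum(dim=3), X.sum(dim=2), X.sum(dim=1)
    box = X.reshape(-1, 3, 3, 3, 3, 9).sum(dim=(2, 4))
    return sum(((s - 1.0) ** 2).sum() for s in (cell, row, col, box)) / X.shape[0]

def training_loss(X, solution=None, supervised_weight=1.0):
    """Constraint loss, plus an optional supervised term when solutions are known."""
    loss = batched_constraint_loss(X)
    if solution is not None:
        log_probs = torch.log(X.clamp_min(1e-9)).reshape(-1, 9)
        loss = loss + supervised_weight * F.nll_loss(log_probs, (solution - 1).reshape(-1))
    return loss
```

Clamping clues inside `forward` means the same code path enforces them during training and inference, and no gradient flows through the clamped cells.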
Further implementation details are available in the notebook.
This project is licensed under the MIT License.