Puzzle Solving mostly using SMT (integer) solvers

This blog started as collection of descriptions about how to reframe common puzzles into the input language of an SMT (integer) solvers. One or two other problem-solving articles are included. Scripts to transform puzzles, to the language, suitable for a solver, and for re-formatting the output to an HTML display page are available on request from the author. Those items mostly result from a lockdown project, since then the wider field of logic programming and languages has been explored.

Picat - First steps

Getting started with a new software system can always be a little daunting. Just a little bit of help to get over the curb can make that journey a lot easier. Picat is no exception, a quick review of the available resources, and then a review of these points hints will get you started in your journey of discovery. Picat is available on PC, Mac, Cygwin and Linux. Starting with the assumption that you know how to drive your environment and can create, open and edit files from the command prompt.

If you like to just explore a new environment, just follow along with these simple steps:

Resources page -> Lots of links and resources here

Installing Picat - > Go here and download a suitable version. Note the location of the main program. Usually called picat.exe or just picat on unix systems.

Creating a Picat program -> Use a text Editor to create a files called fib.pi and include just this text ...

main =>
S = 0,

    I = 1,
    F = fib(I),
    while (F <= 4000000)

        if (F mod 2 == 0) then
            S := S+F

        end,
        I := I+1,
        F := fib(I)

    end,
    printf("Sum of the even-valued terms is %w%n",S).

main([A1]) =>
    printf("fib(%s)=%w%n",A1,A1.to_integer().fib()).

table
fib(1) = 1.
fib(2) = 2.
fib(N) = fib(N-1)+fib(N-2).

Running Picat ->

From the command prompt/terminal either run inside the environment

% picat
Picat 3.6, (C) picat-lang.org, 2013-2023.
Type 'help' for help.
Picat> cl(fib)
Compiling:: fib.pi
fib.pi compiled in 4 milliseconds
loading...

yes

Picat> main
Sum of the even-valued terms is 4613732

yes

Picat> X = fib(2500)
X = 2131097222364817258963242995170479743181820536370126875362868255835307759626590863125487300020126214968986354893376810558993752038316785973052343747096937468058048464282949473925420848907002689994007955245760613229271988138914634517152250189728338958657820391875946096623727383020868089975480618277687427319326526655563351096370488572490223863066558332769194471114405709169250300091519443513355074402828988912117721507439260724419664834993923257783450437301079831914800243767657600808784351534033062411236406608421704602501
yes

Picat> exit

OR as a stand alone command starting at the main entry point

% picat fib.pi
Sum of the even-valued terms is 4613732

OR as a stand alone command, with parameter starting at the main([A1]) entry point

% picat fib.pi 2500

fib(2500)=2131097222364817258963242995170479743181820536370126875362868255835307759626590863125487300020126214968986354893376810558993752038316785973052343747096937468058048464282949473925420848907002689994007955245760613229271988138914634517152250189728338958657820391875946096623727383020868089975480618277687427319326526655563351096370488572490223863066558332769194471114405709169250300091519443513355074402828988912117721507439260724419664834993923257783450437301079831914800243767657600808784351534033062411236406608421704602501

Understanding Picat notes ->

Upper case / lower case names for variables/functions are significant. Use Uppercase for variables and lower case for function names apparently.
The structure of a program is , (comma) separated phrases, ending in .(dot) for end of sentence.
There can be multiple entry points in to a program as show above.
Running interactively. Picat does not seem to like cut&paste to add a program into a session unless preceded by go =>. Seems unfortunate that the interpreter format is slightly different from the in-file format. (still looking into this)
To get the program generate all the possible answers (rather than just the first found answer) add in fail close to the end of the function.
The definition of more variables than are explicitly used in the program can cause the solver to run long and generate values for the unset variables.
Picat has a rich syntax that uses some symbols in novel ways. Check out this post of comparisons with other languages.
The meanings of words such as "mode", "predicate", "function" and pattern matching require some study and should not be assumed as understood from study of other programming languages.
Looks like Picat has very long number arithmetic "bigint" built in. (see arbitrary precision numbers )

Notes and code extracted from this getting started guide.

Picat Stands for

Picat is a general-purpose language that incorporates features from logic programming, functional programming, and scripting languages. The letters in the name summarize Picat’s features:

Pattern-matching: A predicate defines a relation, and can have zero, one, or multiple answers. A function is a special kind of a predicate that always succeeds with one answer. Picat is a rule-based language. Predicates and functions are defined with pattern-matching rules.
Intuitive: Picat provides assignment and loop statements for programming everyday things. An assignable variable mimics multiple logic variables, each of which holds a value at a different stage of computation. Assignments are useful for computing aggregates and are used with the foreach loop for implementing list and array comprehensions.
Constraints: Picat supports constraint programming. Given a set of variables, each of which has a domain of possible values, and a set of constraints that limit the acceptable set of assignments of values to variables, the goal is to find an assignment of values to the variables that satisfies all of the constraints. Picat provides three solver modules: cp, sat, and mip. These three modules follow the same interface, which allows for seamless switching from one solver to another.
Actors: Actors are event-driven calls. Picat provides action rules for describing event- driven behaviors of actors. Events are posted through channels. An actor can be attached to a channel in order to watch and to process its events.
Tabling: Tabling can be used to store the results of certain calculations in memory, allowing the program to do a quick table lookup instead of repeatedly calculating a value. As computer memory grows, tabling is becoming increasingly important for offering dynamic programming solutions for many problems. The planner module, which is implemented by the use of tabling, has been shown to be a more efficient tool than ASP and PDDL for solving many planning problems.

Picat - A new avenue for exploration

The field of constraint programming CP and Satisfaction Modular theory SMT appear to have a high barriers to entry. Many of the concepts that lie behind the solvers, rely on mathematical explanations and verified computational theories. However when it gets down to it, understanding a problem thoroughly, and being able to express it in precise logical terms is all that is needed for making progress in this area.

What has been very rewarding is seeing a computer generate a solution, after giving it just the description of the problem, rather than providing a method to solve the problem. This is very different from the current mode of artificial intelligence included in ChatGPT et cetera which provides solutions by looking for answers near places that textually look like the question being asked.

The general method used up to this point, has been encoding a problem into a textual / numerical description, then transforming that description into the language required by either Z3, or MathSat5. This has been a successful approach, providing answers to all shapes and sizes of Soduku, Gogen, Hidato, and other similar numerical puzzles. Whilst this has proved a satisfactory avenue of exploration, it did feel like spoon feeding the solver with a rather low level and basic language. (SMT-2) being their language of choice.

I felt there must be a better way, and I think I found it. Picat is a problem text processor that encompasses in its logical language how to describe various shapes and sizes of problems and includes different solvers. I am just taking the very first steps in understanding how to use Picat, but I am encouraged by its compact syntax and flexibility.

Peoples way of learning a new language or skill vary. Some like to tinker explore tryout, poke with a stick and others like to read the technical descriptions and before getting their hands dirty. Personally I like to tinker around with new things but have a goal in mind. Picat was very amenable to trying out things once a couple of fundamentals were understood. I'll document my first efforts in the next post.

A shortlist of resources is provided.

The introduction paper

My First Look At Picat as a Modeling Language for Constraint Solving and

Planning by Håkan Kjellerstrand1

http://picat-lang.org/download/a_first_look_at_picat.pdf

This is a great starting point for existing users of solver type programs.

The Website

From which the "Getting started" and "User guides" are available along with install downloads for PC, Mac and Linux platforms.

http://picat-lang.org/

The Book

"Constraint Solving and Planning with Picat"

Whilst there is a .pdf available having the book makes access easier.

The jumbo boxes of Picat examples

There is a set of 99 converted problems here

and

A big set of other example including numerical puzzles here.

The Guy

I had a couple of email dialogs with Haken Kjellerstrand, the Sweden based co-author of the book and major contributor to Picat. He was just so helpful, engaged, and willing to put effort into looking at the questions I had that it gave me a warm, fuzzy feeling about Picat community.

The other Guys

Picat is created by Neng-Fa Zhou and Jonathan Fruhman.

ChatGPT - not so clever after all - on logical tasks

I really like the work of Adrian Groza who explores the puzzle solving using first order logic.

In this recently printed paper, Adrian explores the reasoning capabilities of ChatGPT. Starting with a large quantity of logical puzzles reformed in a way that can be presented to ChatGPT. The large language model then tries to present reasoned answers. No spoilers, but it doesn't go well.

Link to paper on ResearchGate here:

(PDF) Measuring reasoning capabilities of ChatGPTresearchgate.net

The paper goes the extra mile beyond just testing ChatGPT problem solving capabilities. The paper looks in detail at the dialogue and extracts the type of nature of the logical fallacies that are included in chat GPT's reasoning. There appears to be lack of common sense, better answers are presented as being worse answers. In some cases extra constraints are unnecessarily added to problems, leading to the inability to find the correctly available answer.

One conclusion of the paper correctly identifies that ChatGPT does have strengths in the language processing area, but falls very short when required to do detailed factual, correct problem-solving work.

The dangerous/great thing about "trusting an Oracle" without substantive background and showing the workings is that it can learn from feedback....

From the paper ... Preprint · October 2023

Run today 3 December 2023

Good to see that ChatGPT can make a fair assessment of its own abilities on logical problems and puzzles.

After concluding that 693 is equal to 729 and a father aged 29 can have children aged 24 and 21 Chat GPT provided the admission that "For the most accurate and reliable results on logical puzzles, it's often better to use dedicated problem-solving tools or consult with experts in the relevant field."

*** Update Jan 2024 -- See also ChatGPT bombs test on diagnosing kids’ medical cases with 83% error rate

Solving Minesweeper on paper using SMT (integer) solver

Solving Minesweeper on paper using an SMT (integer) solver

Sunday 25 September 2022

An Article by Clive England, CEng, MBCS

See Original and updates on https://gannett-hscp.blogspot.com September 2022

Continuing the series of solving puzzles using an integer SMT solver we look now at the paper version of the Minesweeper puzzle. Originally Minesweeper was an active game shipped with various versions of Windows operating systems. This version is static as published in newspapers. See Minesweeper link for more information about the active game.

This popular logic puzzle is published in the i newspaper. The puzzle offers a rectangular grid with numeric clues from which the locations of hidden mines must be deduced. The initial clue numbers show the quantity of mines adjacent to the clue square. The numbers are generated by counting the hidden mines one step in any direction (kings move). No edge wraparound is allowed. No mines are located under clue squares.

Encoding

The puzzle is simply encoded into text then a script is used to generate the underlying logic of the puzzle. These logic lines are then passed to an SMT solver for analysis and resolution. Microsoft's Z3 and MathSAT solvers are used. The above puzzles are encoded as follows.

#iMine iMine_220917.txt

.0..00...2...1..

...3..3....2..0.

2....1....43.1..

3.......3.3.....

.3..2..3.3...3.1

..1.....2....3..

0....0..2..1.1..

....1.0.....1..1

.2.0...1..3.....

...0..0.0..3.22.

1.1.3....13.....

......2.....3..2

#iMine iMine_test03.txt

0...0

..1..

0...0

The puzzle logic for iMine_test03.txt is generated from the encoding as follows:

After some pre-amble to set the solvers into “linear integer” mode a value is declared for each of the empty “.” squares.

The range of that value then limited to 0 or 1. In the results of the solver a 1 indicates the presences of a mine in that square. The numbering of the values is critical to the simplicity and generation of the logic to solve the problem.

Each cell in each row is numbered sequentially but there is a jump in numbers between the sequential rows.

#iMine iMine_test03_Cell_numbers.txt

V0 V1 V2 V3 V4

V9 V10 V11 V12 V13

V18 V19 V20 V21 V22

The jump in value numbering is to avoid wrap around at the row edges. For example the first cell on the second row is V9, to check the cell to left of V9 the logic generation script checks to see if V8 exists. As V8 is not defined the cell is treated as on the edge and V8 is not included in the logic definition. The same applies on the right hand edge of Row 1, where counting from cell V0, V4 is not defined as it is a clue square and V5 is not defined as being off the edge of the board. The second row starts with V9.

Words after a ; on the logic line are comments and are ignored.

A value V000 is defined and set to value zero to assist with the case where the contents of an empty cell are defined by just one clue cell.

(set-logic LIA)

(set-option :produce-models true)

(set-option :produce-assignments true)

(declare-const V000 Int)

(assert (= 0 V000))

(declare-const V1 Int) ; Row 1

(assert (or (= 1 V1) (= 0 V1)))

(declare-const V2 Int)

(assert (or (= 1 V2) (= 0 V2)))

(declare-const V3 Int)

(assert (or (= 1 V3) (= 0 V3)))

(declare-const V9 Int) ; Row 2

(assert (or (= 1 V9) (= 0 V9)))

(declare-const V10 Int)

....

(declare-const V21 Int)

(assert (or (= 1 V21) (= 0 V21)))

The main logic of the puzzle is generated, one line for each of the numeric clue squares. Each of the potential mine squares has to be 0 or 1 with the total number of 1s in the surrounding 8 squares being the same as the provided clue. Mines cannot be under clue squares.

(assert (or (= 1 V21) (= 0 V21)))

(assert ( = 0 ( + V000 V1 V9 V10 ))) ; sqr=0

(assert ( = 0 ( + V000 V3 V12 V13 ))) ; sqr=4

(assert ( = 1 ( + V000 V1 V2 V3 V10 V12 V19 V20 V21 ))) ; sqr=11

(assert ( = 0 ( + V000 V9 V10 V19 ))) ; sqr=18

(assert ( = 0 ( + V000 V12 V13 V21 ))) ; sqr=22

After testing it was found that some boards generate logic such as

(assert ( = 1 ( + V11 ))) ; sqr=10

giving an SMT error

(error "ERROR: + takes at least 2 arguments (1 given) (line: 256)")

The was fixed by this defining V000 being value 0 and then generating the logic line

(assert ( = 1 ( + V000 V11 ))) ; sqr=10

Finally, after triggering the satisfiability of the logic, the values of the possible mine holding squares are output.

(check-sat)

(get-value ( V1 V2 V3 V9 V10 V12 V13 V19 V20 V21 ))

(exit)

The above logic is provided to the SMT logic solver Mathsat5 ( or Microsoft’s 3 ) and the results are obtained.

sat

( (V1 0)

(V2 1)

(V3 0)

(V9 0)

(V10 0)

(V12 0)

(V13 0)

(V19 0)

(V20 0)

(V21 0) )

The results are read, alongside re-reading the puzzle file, to generate answer display table in html. The Red “<->” squares indicate the placement of a hidden mine. Green “.” squares are empty, clue squares show the clue value.

Result validation

Using the results from the solver each of the clue number squares are checked to see if the final solution is validated.

To check the validation one of the provided answer values is changed and .

$ ./iMine_smt.pl < iMine_220917.txt | ~/CodeProjects/solver_m5 | sed 's/(V43 0)/(V43 1)/'|

./iMine_smt.pl -f iMine_220917.txt > iMine_220917.html

Two dreamboards

In active Minesweeper a dreamboard is a very easy intermediate game that can be solved in record times. Full exploration here Two are solved here in about real 0m0.048s each.

Paper Versus active game

The dream boards, taken from the active game, are showing up the gap between the active game and the paper version of the game. In the active game there are situations where you just have to hold breath and guess because not all the clues are exposed at the beginning. In the paper game all the mines / no mines should be deductible from the provided clues. The solver has suggested placed mines in the open ground where there is no evidence from the clues that there are or are not mines. Because the clues in the paper game are incomplete the results are somewhat indeterminate.

In this section half way down on the right hand side of dreamboard_001

.	.	.
.	.	.
.	1	1
.	2	.
.	3	.
.	2	.
.	1	1
.	.	.

There are two possible valid answers from the clues provided, the one given by the solver suitable for the paper game

and the one shown in the active game solution.

In the active game the has the "No mines can have naked edges" rule that does not apply in the paper version of the game. This causes the anomaly. A few carefully placed 0s would guide the solver away from the large open areas. A 0 diagonally near the 3 in the section above would resolve the dilemma in this case.

Cartoons

Some Minesweeper cartoons here.

References

Programming Z3 Sudoku section from Section 8.1 page 139

Building a Sudoku Solver with SAT,

A SAT-based Sudoku Solver,

Sudoku as a SAT Problem,

Optimized CNF Encoding for Sudoku Puzzles

Mathsat5 solver

Usage for the Minesweeper to SMT generation utility

Usage: Minesweeper {options} < puzzle.txt or

Minesweeper {options} -f puzzle.txt < SolverOutput

-v N :Be verbose to level N ( use v=1 for detailed output )

-f puzzle.txt :puzzle file needed when reading Solver output use STDIN otherwise

Programs reads STDIN looking for either puzzleFile format lines or Solver output lines

If the input is in puzzleFile format the program will generate solver input lines.

If the input is in Solver output lines format the program will expect a -f puzzleFile paremeter.

Using both these input streams program will then generate display .html output.

Game Rules

The layout of a minefield is described using a rectangular grid with integers and gaps.

Each number indicates the proximity of mines on directly adjcent squares (8 directions)

Mines cannot be on numbered squares and there is no board wraparound.

Each Minesweeper puzzle Input as follows can be any size rectangle

#iMine iMine_220917.txt

.0..00...2...1..

...3..3....2..0.

2....1....43.1..

3.......3.3.....

.3..2..3.3...3.1

..1.....2....3..

0....0..2..1.1..

....1.0.....1..1

.2.0...1..3.....

...0..0.0..3.22.

1.1.3....13.....

......2.....3..2

or this minimal test

#iMine iMine_test03.txt

0...0

..1..

0...0

to generate an SMT2 input file.

OR Process results from solver in the following format

sat

( (V0 9)

(V1 7)

(V2 5)

(V3 3)

....

to generate an html table as output.

The Solver results are read,

alongside re-reading the puzzle file, to generate answer

display table in html. The Red “<->” squares indicate the

placement of a hidden mine. Green “.” squares are empty,

clue squares show the clue value.

The results are validated by checking each of the clue numbers

has the correct numbers of mines in the surrounding squares.

VALIDATION FAIL or VALIDATION PASS is provided.

Picat - First steps

Picat Stands for

Picat - A new avenue for exploration

This is a great starting point for existing users of solver type programs.

The Website

The jumbo boxes of Picat examples

The Guy

The other Guys

ChatGPT - not so clever after all - on logical tasks

Solving Minesweeper on paper using SMT (integer) solver

Encoding

Two dreamboards

Paper Versus active game

SMT Solvers, introduction and links (Start here with the readme)

Picat - First steps