ChatGPT and other OCR performance tested on grid data.

ChatGPT OCR fails in June 2025; Excel does better.


The image above is a Samurai Sudoku, which consists of five interlocked normal sudoku grids, each of which follows the normal sudoku rules. This puzzle has been solved using SMT integer solvers, but the tedious part is typing in the initial puzzle. Extracting the numbers into a matrix seems like the kind of work that would be suitable for AI. It turns out that neither AI nor other "free" web OCR offerings are suitable.

ChatGPT just can't read the grid (or the numbers)


That was a rather disappointing fail. It seems obvious that the grid was not correctly removed before the numbers were extracted, and the alignment of the numbers has not been preserved. There are also numbers missing from the top right-hand corner of the puzzle. Refining the ask produces hardly any better results.

We follow up with a refinement ... 

And the results .... It seems a bit like a sulky teenager who knows what work needs to be done but can't actually put in sufficient effort to get correct and accurate results.

The best ChatGPT managed in two passes


Other "Free OCR" services

After a quick search, a couple of "free" image-to-text OCR services were tried out. They offered to do the conversion work but would not deliver the results until a fee was paid, and the fee was often tied to some kind of subscription to the service. Only one service would deliver the results by email, but it provided this useless file, in which all the structure of the numbers has been lost.


Excel has the answer (almost)

After a bit of further searching for a solution, the following Excel feature was found in this support note with a handy demo video: "Insert data from a picture". This would seem to be the exact answer required, until the killjoy text:
Important: Data from Picture in Excel for Windows is only supported on Windows 11 or Windows 10 version >=1903 (must have Microsoft Edge WebView2 Runtime installed).

However, Excel for Office365, being a web platform, was more accommodating, even when driven from a Mac. Using Data -> Data from Picture, the image was loaded and examined automatically, then choices were offered to correct the data. After correction (1 seemed to be misread), the data was accepted into the sheet.

**Update, June 2025: with Excel for Mac 16.98 the Data from Picture feature is available, offering the choice of reading the source picture from a file or from the clipboard. The read from the clipboard was a fail, but the read from file gave similar results to the Office365 method. The OCR engine may be different, so results might vary.**




After some coloured boxes are added, the result is a very presentable and nearly accurate-looking sheet. Can you see what's wrong?


Solution

This far down the workflow, it's just a short push to the solution using an SMT integer solver. First some minor changes are required to prepare the input text before exporting as a .csv for use by the solving scripts.


Some "x" gap markers and end of data "z" markers were added to help with the later editing of the exported .csv file.


Once exported, the "x" and "z" markers are removed and ,, is converted to ,0, using vi text editor commands. After the first pass at solving, it becomes clear there is an image-to-grid transcription error. An extra line has been inserted and some of the data (5, then 1 & 2) was split between those lines. This may have been caused by the slight tilt in the original image, but it is not very satisfactory.
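
The same clean-up can also be scripted. The following is a minimal Python sketch, not the vi commands actually used: the function names `clean_rows` and `shape_problems` are made up for illustration, and it assumes that each marker occupies a cell of its own, that a removed marker should become an empty cell, and that the finished grid is 21 x 21 (five 9 x 9 grids overlapping at their corner boxes).

```python
import csv
import io

MARKERS = {"x", "z"}  # assumed marker spellings, matched case-insensitively
SIZE = 21             # a Samurai Sudoku spans a 21 x 21 grid


def clean_rows(text):
    """Blank out marker cells and turn every empty cell into "0"."""
    rows = []
    for fields in csv.reader(io.StringIO(text)):
        if not fields:
            continue  # skip completely blank lines
        row = []
        for field in fields:
            value = field.strip()
            if value.lower() in MARKERS:
                value = ""  # assumption: a removed marker leaves an empty cell
            row.append(value or "0")
        rows.append(row)
    return rows


def shape_problems(rows, size=SIZE):
    """Describe every deviation from a size x size grid."""
    problems = []
    if len(rows) != size:
        problems.append(f"expected {size} rows, found {len(rows)}")
    for number, row in enumerate(rows, start=1):
        if len(row) != size:
            problems.append(
                f"row {number}: expected {size} columns, found {len(row)}"
            )
    return problems
```

Working on parsed fields rather than on the raw text avoids a known trap with a single-pass substitution of ,, by ,0,: matches do not overlap, so a run such as ,,, comes out as ,0,, and needs a second pass. A shape check like `shape_problems` would also flag an extra inserted line before the solver is ever run.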



After correcting the misalignment and running the previously described sudoku .csv-to-SMT integer conversion script, the MathSAT solver completed the task in about 0.34 seconds on an M2 Mac.

% time ./sudoku_GiantX_smt2.pl < RG_002.csv | ../../../solver_mathsat5 | ./sudoku_GiantX_smt2.pl -f RG_002.csv > RG_002.html

./sudoku_GiantX_smt2.pl < RG_002.csv  0.01s user 0.01s system 88% cpu 0.021 total

../../../solver_mathsat5  0.34s user 0.02s system 98% cpu 0.369 total

./sudoku_GiantX_smt2.pl -f RG_002.csv > RG_002.html  0.01s user 0.00s system 4% cpu 0.370 total

giving this answer image. 
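
For readers who have not seen the earlier post, here is a rough Python sketch of how a 21 x 21 grid of givens could be turned into SMT-LIB 2 text for an integer solver. It is not the `sudoku_GiantX_smt2.pl` script used above, whose output format is not shown here. It assumes one integer variable per cell, 0 for an empty cell, and the usual Samurai layout with sub-grids starting at (0,0), (0,12), (6,6), (12,0) and (12,12); the names `cell_name`, `used_cells`, `units` and `to_smt2` are invented for this example.

```python
# Top-left (row, column) of each 9 x 9 sub-grid in the 21 x 21 layout.
ORIGINS = [(0, 0), (0, 12), (6, 6), (12, 0), (12, 12)]


def cell_name(r, c):
    return f"c_{r}_{c}"


def used_cells():
    """All cells that belong to at least one sub-grid (gaps are excluded)."""
    cells = set()
    for r0, c0 in ORIGINS:
        for r in range(r0, r0 + 9):
            for c in range(c0, c0 + 9):
                cells.add((r, c))
    return sorted(cells)


def units():
    """Yield each row, column and 3 x 3 box of each sub-grid as 9 cells."""
    for r0, c0 in ORIGINS:
        for i in range(9):
            yield [(r0 + i, c0 + j) for j in range(9)]  # row i
            yield [(r0 + j, c0 + i) for j in range(9)]  # column i
        for br in range(0, 9, 3):
            for bc in range(0, 9, 3):
                yield [(r0 + br + dr, c0 + bc + dc)
                       for dr in range(3) for dc in range(3)]


def to_smt2(grid):
    """grid: 21 x 21 list of ints, where 0 means empty (or a gap cell)."""
    lines = ["(set-option :produce-models true)", "(set-logic QF_LIA)"]
    for r, c in used_cells():
        name = cell_name(r, c)
        lines.append(f"(declare-fun {name} () Int)")
        lines.append(f"(assert (and (>= {name} 1) (<= {name} 9)))")
        if grid[r][c] != 0:
            lines.append(f"(assert (= {name} {grid[r][c]}))")
    # Shared corner boxes are asserted twice, which is redundant but harmless.
    for unit in units():
        names = " ".join(cell_name(r, c) for r, c in unit)
        lines.append(f"(assert (distinct {names}))")
    lines += ["(check-sat)", "(get-model)"]
    return "\n".join(lines) + "\n"
```

The generated text declares 369 cell variables (5 x 81 cells minus the 4 x 9 shared ones) and 135 `distinct` constraints (27 per sub-grid).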


Mac native method also failed

A Mac native method was suggested but did not go well.

Using Photos to find text in an image.
Only the highlighted cells are found and extracted.

Grid format and number order are lost when pasted into Numbers.

In conclusion

ChatGPT did not get anywhere near the required level of accuracy for this relatively simple-looking OCR task. Other web-based "text extraction from an image" services required upfront payments, even though they said they were free.

Digging deep into the reliable and proven, if idiosyncratic, Microsoft Excel found an answer. Even that wasn't perfect, with a line split incorrectly in the data. Further refinement of the Excel extraction workflow will probably provide a close enough solution, removing the need to type in the Samurai Sudoku grids. To get the best capture of numbers and positions, as with any OCR process, the text should stand out from the background. Using a picture editor to change the background from grey to white helped make the image more readable.

The Mac native "Find text in an image" was also disappointing, but as the "Data from Picture" feature is now available in the latest versions of Excel for Mac, this platform can be used.

Reworking the process on a similar image

From the original image after being "de-grayed" ...

Original "de-grayed" image

As recognised by Excel's import from picture, this becomes (after a bit of text editing) ...



In the above text, 3 numbers were out of place in the grid; they were found by manual inspection.
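
Some of that manual inspection could be helped along by a script. The sketch below was not part of the workflow described here; it assumes the same 21 x 21 layout with 0 for empty cells, and the name `duplicate_givens` is invented for the example. It reports any digit that appears twice among the givens of a row, column or box of one of the five sub-grids.

```python
from collections import Counter

# Top-left (row, column) of each 9 x 9 sub-grid in the 21 x 21 layout.
ORIGINS = [(0, 0), (0, 12), (6, 6), (12, 0), (12, 12)]


def units():
    """Yield (kind, cells) for each row, column and box of each sub-grid."""
    for r0, c0 in ORIGINS:
        for i in range(9):
            yield "row", [(r0 + i, c0 + j) for j in range(9)]
            yield "column", [(r0 + j, c0 + i) for j in range(9)]
        for br in range(0, 9, 3):
            for bc in range(0, 9, 3):
                yield "box", [(r0 + br + dr, c0 + bc + dc)
                              for dr in range(3) for dc in range(3)]


def duplicate_givens(grid):
    """Return (kind, first_cell_of_unit, digit) for each repeated given.

    A box shared by two sub-grids is visited twice, so a clash inside it
    is reported twice.
    """
    problems = []
    for kind, cells in units():
        counts = Counter(grid[r][c] for r, c in cells if grid[r][c] != 0)
        for digit, count in counts.items():
            if count > 1:
                problems.append((kind, cells[0], digit))
    return problems
```

A check like this only catches a misplaced number when it happens to clash with another given; a number moved to a harmless-looking cell would still need the solver, or a human eye, to find it.
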
After passing through the translation and solving process, it becomes


And finally, it only seemed fair to give ChatGPT another go using the cleaned-up picture ...


Slightly better on the shape, but the contents are woefully incorrect and all over the place.
