My Nonogram-solving software tackles a puzzle by
considering only one line at a time, and attempting to
partially solve that line. Information it gets from one
line can be fed into the processing of a perpendicular
line, and so on, until the puzzle is solved.
Until recently, I've been using two algorithms to
partially solve lines.
The original algorithm is called ‘complete’. It
systematically finds all valid arrangements of solids,
and generates a complete result from that.
That is, a certain amount of information can be obtained
logically from a line in a given state and its clue, and
this algorithm gets all of that. (It doesn't mean that it
necessarily determines the states of all unknown
cells!)
But it is a little too diligent, wasting time on
arrangements which do not need to be considered.
Consider a line of 10 cells, all unknown, and the clue
3.3. The first block has 4 possible positions, and in
each of these, it leaves the second 4, 3, 2 and 1
positions, hence 10 arrangements in total:
1 >###-###---<
2 >###--###--<
3 >###---###-<
4 >###----###<
5 >-###-###--<
6 >-###--###-<
7 >-###---###<
8 >--###-###-<
9 >--###--###<
10 >---###-###<
This pattern has an arithmetic progression on the
length of the line minus the minimum space required by
the blocks, x, so the algorithm has complexity
O(x2) for two blocks, and probably
turns into O(xn) for
n blocks. Consequently, the algorithm can be
very slow on the more difficult lines.
In fact, we only need to consider the following seven
arrangements:
1 >###-###---<
2 >###--###--<
3 >###---###-<
4 >###----###<
7 >-###---###<
9 >--###--###<
10 >---###-###<
Note that no block needs to move leftwards.
I realised that another algorithm existed, that would
be much simpler and faster. I refer to it as the ‘fast’
algorithm.
This pushes all blocks to their left-most positions,
then to their right-most, and compares these two
extremes. Consider a line of 10 unknown cells, and the
clue 4.4:
leftwards >####-####-<
rightwards >-####-####<
solution > ### ### <
Where a block overlaps itself in its extremes, the
algorithm can deduce a solid; similarly, dots, where the
same gap overlaps itself.
This is much faster than the ‘complete’ algorithm, but
it can miss some dots occasionally.
So, which algorithm should we choose? Here are some
options:
-
Use ‘fast’ alone, but back it up with
bifurcation.
I implemented this ages ago, since bifurcation is
necessary in general anyway, even when using a
complete linesolver.
-
Use ‘fast’ first, then ‘complete’ only if ‘fast’
failed to determine anything.
I also implemented this some time ago, partly to
make it easier to replace and combine as-yet
unwritten algorithms. You can configure the software
to use a stack of any number of linesolvers.
-
Write a new complete algorithm which avoids many
arrangements by not moving any blocks towards the
left (except at the start). It should be similar to
‘fast’ in speed.
I've been struggling to realise the third option for
some time, while requesting ideas and help here. Over the
years, a number of people have offered their thoughts (or
implementations) on the matter:
-
Mikael Lagerkvist suggested
the use of regular expressions as a comprehensive
implementation.
Gecode: Nonogram Class
Reference —
A Nonogram-solving part of a
constraint library in C++
-
Petr & Mirek Olšák adapted
the fast algorithm to make it complete by making
extra guesses locally within the line.
Griddlers
solver, nonogram solver — A solver by
Czechs Petr Olšák and Mirek Olšák (does colours
too!) — portable GPL C source and Windows
binary
-
Most recently, Thomas
Nygreen sent me some ideas which helped an
implementation to crystalize in my mind.
-
I recall that a number of others have emailed in
the past, with ideas, or with implementations in
Haskell or OCaml (which, I regret, I'm still not
familiar with), but there are too many details to go
over here!
My thanks go to all those people, along with my
apologies for not responding in all (a lot of) cases!
Anyway, I have now implemented what appears to be a
fast and complete algorithm, which I have named
‘fast-complete’, and labelled fcomp:
It is already incorporated into the software,
including the on-line solver, but not the Java
applet.