Solving a Regular Expression Crossword with Haskell, Part 1

    _   _              _    ___  _            _     _                  _
   /_\ | |_ __  ___ __| |_ / _ \| |__ ___ ___| |___| |_ ___   _ _  ___| |_
  / _ \| | '  \/ _ (_-<  _| (_) | '_ (_-</ _ \ / -_)  _/ -_)_| ' \/ -_)  _|
 /_/ \_\_|_|_|_\___/__/\__|\___/|_.__/__/\___/_\___|\__\___(_)_||_\___|\__|
                                                   In Glorious ASCII-VISION

UPDATE: Part 2 is now available

2014-02-03 Mon Solving a Regular Expression Crossword with Haskell, Part 1

The Challenge

Last Tuesday I arrived at my desk at The Skiff to find this sheet of paper had been left on my desk:

achallenge.jpg click here for a clearer version

It's a Regular Expression Crossword. And it clearly needed to be solved.

I briefly considered trying it by hand before realising that there was far more entertainment to be had by writing a program to solve it. To makes things more entertaining I decided that there would be no brute forcing and no back tracking.

By the way, it turned out the leaver-of-crossword-printouts was @jhugman, so he shares the blame for this.

The Problem


The objective of the game is to match regular expressions on a hexagonal grid with each hexagon matching up to 3 of the expressions. It's made a little harder by the prevalence of match all (".*") patterns in the expressions.

You can split the Regular Expressions round the sides into 3 groups and each hexagon will have be in the path of one from each of those groups. The matching string always starts at the side with the label.


The Regular Expressions themselves use only the capital letters A-Z. They do make things more complicated by throwing in Back References making them not actually true Regular Expressions (but that ship sailed long ago so I'll continue to refer to them as that).

If you want to follow along at home here are the Regular Expressions from the puzzle in text form.


The strategy for completing it by the hand seems to be to first find the easy hexagons. For example, at the bottom one of the expressions is "[CR]*" which in Regular Expression speak means "only Cs and Rs". This on its own doesn't give you any answers but when you that one of the intersecting expressions is "R*D*M*" it's clear that the hexagon where they intersect is going to R. As you find more letters it should get easier, but where's the fun in that?

A Program to Solve It

An obvious way to solve it programmatically is to use a similar technique to the one for solving it by hand. We can keep a set of possible letters for each hexagon which starts as the set of all letters then iteratively gets narrowed done until, hopefully, we're left with only one possible letter per hexagon.

Clearly we'll need to present the game board, more on that below, but we also need the method of narrowing. For this I quickly realised that a normal regular expression library wasn't going to do what I wanted. What I need is something that can take a regular expression and my currently narrowed down possibilities for a set of hexagon and return a further narrowed set of possibilities. I'll write about the Massive Yak Shaving Expedition VM Based Regex Engine in Haskell I wrote soon.

The complete program is up on Github and you can an example of its output in a Gist.

Tune in next time

In the next parts I'll show how I represented the game state (hint: it's to do with Q*Bert) and how I solved it using a custom VM based Regular Expression engine (which I think is kind of cool) written in Haskell.

So tune in next time for Haskell, Regular Expressions, Q*Bert and more Yak Shaving than you can shake a stick at.

Stick your email in the box below if you want an email when I add new articles to this site (including the next part).

< follow me on Twitter: @almostobsolete >
        \   ^__^
         \  (OO)\_______
            (__)\       )\/\
                 |   | ----w |
                 ||     ||

About me: I'm Tom, a freelance developer in Brighton, UK. You can hire me to write software if you want. I do Python and JavaScript mainly but I'm pretty flexible if you've got something interesting.

Date: 2014-02-04 20:03:02 GMT

Author: Thomas Parslow

Org version 7.7 with Emacs version 23

Validate XHTML 1.0