Isolating Side-Effects with State Machines

Greetings, readers, and welcome to another examination of the approaches to and benefits of writing software using explicit state machines.

In case you missed my introductory post or the follow-up posts about the general and test-specific benefits of state machines, you may want to read those before continuing with this post.

Today I want to discuss another way in which explicit state machines can help make your software easier to write and maintain: isolating side-effects so that they are easier to test and understand.

If you recall, in an earlier post I gave an example implementation of how you might process inputs in an explicit state machine. It looked something like this:

def input(self, what):
    try:
        output, next_state = self._transitions[self.state][what]
    except KeyError:
        # No transition is defined for this input in the current state.
        pass
    else:
        self.state = next_state
        if output is Outputs.ENGAGE_LOCK:
            # Signal the motor to engage the lock
            pass
        elif output is Outputs.DISENGAGE_LOCK:
            # Signal the motor to disengage the lock
            pass

This is a very simple way to handle inputs. The current state is used to look up a mapping from inputs to (output, next state) pairs. Then the input is looked up in this mapping to find the output to generate and the state to move to. The output is handled by the if-elif tree at the end of the function.

One drawback of this approach is that it puts all of the output logic into a single method. Even for an application as simple as the one in this example, this causes problems. The most obvious is that it combines somewhat unrelated functionality (engage a lock, disengage a lock) in a single function, which makes it difficult to invoke either piece of functionality directly. The function may also end up so long that it isn’t easily understood, and the if-elif tree has to grow any time a new output is introduced.

Fortunately, a simple refactoring eliminates these drawbacks. A common pattern in Python for replacing this kind of if-elif construct is to use a mapping from some kind of input (names or objects; in this case, objects that themselves represent outputs) to callable objects. There are a lot of variations on this pattern, but one implementation of it might look like this:

def engage_lock(self):
    # ...
    pass

def disengage_lock(self):
    # ...
    pass

output_actions = {
    Outputs.ENGAGE_LOCK: engage_lock,
    Outputs.DISENGAGE_LOCK: disengage_lock,
    }

def input(self, what):
    try:
        output, next_state = self._transitions[self.state][what]
    except KeyError:
        pass
    else:
        self.state = next_state
        action = self.output_actions[output]
        action(self)

One clear consequence of this refactoring is that the input method is now entirely independent of the definition of the state machine. New outputs can be introduced and the only change that input's dispatch requires is the addition of an item to output_actions (alongside the method that implements the new output).
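To make that concrete, here is a minimal, self-contained sketch of the refactored class with one extra output added. The two-state transition table, the ARM_FORCED input, the SOUND_ALARM output, and the log list that stands in for real side-effects are all assumptions made for illustration; they are not part of the turnstile from the earlier posts.

```python
from enum import Enum


class States(Enum):
    LOCKED = "locked"
    UNLOCKED = "unlocked"


class Inputs(Enum):
    FARE_PAID = "fare_paid"
    ARM_TURNED = "arm_turned"
    ARM_FORCED = "arm_forced"  # hypothetical input, for illustration only


class Outputs(Enum):
    ENGAGE_LOCK = "engage_lock"
    DISENGAGE_LOCK = "disengage_lock"
    SOUND_ALARM = "sound_alarm"  # hypothetical new output


class Turnstile(object):
    # Simplified table (an assumption): state -> input -> (output, next state)
    _transitions = {
        States.LOCKED: {
            Inputs.FARE_PAID: (Outputs.DISENGAGE_LOCK, States.UNLOCKED),
            Inputs.ARM_FORCED: (Outputs.SOUND_ALARM, States.LOCKED),
        },
        States.UNLOCKED: {
            Inputs.ARM_TURNED: (Outputs.ENGAGE_LOCK, States.LOCKED),
        },
    }

    def __init__(self):
        self.state = States.LOCKED
        self.log = []  # stands in for real side-effects

    def engage_lock(self):
        self.log.append("engage")

    def disengage_lock(self):
        self.log.append("disengage")

    def sound_alarm(self):
        self.log.append("alarm")

    output_actions = {
        Outputs.ENGAGE_LOCK: engage_lock,
        Outputs.DISENGAGE_LOCK: disengage_lock,
        Outputs.SOUND_ALARM: sound_alarm,  # the only change the dispatch needs
    }

    def input(self, what):
        # Identical to the version above; untouched by the new output.
        try:
            output, next_state = self._transitions[self.state][what]
        except KeyError:
            pass
        else:
            self.state = next_state
            action = self.output_actions[output]
            action(self)
```

Note that input is unchanged from the listing above: the new behavior comes entirely from the new method, the new table entry, and the new item in output_actions.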

Another consequence is that it is very easy to call engage_lock and disengage_lock directly. This is quite beneficial for the unit tests for these two methods because they can now focus solely on verifying their behavior: they don’t have to jump through any hoops (like getting the state machine into the “right” state and then sending it the “right” input), they can just call the method they want to call.
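A test written this way might look something like the following sketch. The RecordingMotor class, its signal method, and the idea that the turnstile is handed a motor object are all hypothetical; they only stand in for whatever really drives the lock.

```python
import unittest


class RecordingMotor(object):
    """Hypothetical stand-in for the lock motor; records signals sent to it."""
    def __init__(self):
        self.signals = []

    def signal(self, what):
        self.signals.append(what)


class Turnstile(object):
    """Only the output methods are shown; transition handling is omitted."""
    def __init__(self, motor):
        self._motor = motor

    def engage_lock(self):
        self._motor.signal("engage")

    def disengage_lock(self):
        self._motor.signal("disengage")


class OutputActionTests(unittest.TestCase):
    # No state setup and no inputs: each test calls the output method directly.
    def test_engage_lock(self):
        motor = RecordingMotor()
        Turnstile(motor).engage_lock()
        self.assertEqual(motor.signals, ["engage"])

    def test_disengage_lock(self):
        motor = RecordingMotor()
        Turnstile(motor).disengage_lock()
        self.assertEqual(motor.signals, ["disengage"])
```

Neither test mentions states, inputs, or the transition table, which is exactly the point.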

It’s possible to extend this idea even further and implement the output actions on an object entirely separate from the one responsible for handling the state transitions.

Perhaps you already have a library for controlling the lock on the turnstile and you just want to use it. A simple helper class can translate from state machine outputs to method calls on the domain-specific object (i.e., the class that represents the lock):

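class StateMachineOutputs(object):
    def __init__(self, model, actions):
        self.model = model      # the domain-specific object, e.g. the lock
        self.actions = actions  # maps output symbols to method names

    def output(self, symbol):
        if symbol is None:
            # Some transitions produce no output; nothing to call.
            return
        method_name = self.actions[symbol]
        method = getattr(self.model, method_name)
        method()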

Combine this with another dictionary mapping outputs to the right methods and the entire state machine class might become application-agnostic:

from collections import namedtuple

class StateMachine(object):
    def __init__(self, transitions, outputer, initial_state):
        self._transitions = transitions
        self._outputer = outputer
        # input() reads self.state, so it needs a value before the first input.
        self.state = initial_state

    def input(self, what):
        try:
            output, next_state = self._transitions[self.state][what]
        except KeyError:
            pass
        else:
            self.state = next_state
            self._outputer.output(output)

t = namedtuple("transition", "output next_state")
transitions = {
    States.UNLOCKED: {
        Inputs.ARM_TURNED: t(Outputs.ENGAGE_LOCK, States.ACTIVE),
        },
    States.LOCKED: {
        Inputs.FARE_PAID: t(Outputs.DISENGAGE_LOCK, States.ACTIVE),
        },
    States.ACTIVE: {
        Inputs.ARM_LOCKED: t(None, States.LOCKED),
        Inputs.ARM_UNLOCKED: t(None, States.UNLOCKED),
        },
    }
turnstile = StateMachine(
    transitions,
    StateMachineOutputs(
        Lock(), {
            Outputs.ENGAGE_LOCK: "engage",
            Outputs.DISENGAGE_LOCK: "disengage",
        }),
    # Starting in the locked state is an assumption made for this example.
    States.LOCKED)

Using this approach, the turnstile state machine is now an instance of a generic class that can represent all manner of different state machines, complete with their different side-effects.
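As a rough illustration of that reuse, the sketch below drives a completely different (and entirely hypothetical) application, a door with a bell, using the same two generic classes. The Door* enums and the Bell class are invented for this example. The sketch also assumes that StateMachine takes an explicit initial state and that StateMachineOutputs skips transitions whose output is None.

```python
from collections import namedtuple
from enum import Enum


class StateMachineOutputs(object):
    def __init__(self, model, actions):
        self.model = model
        self.actions = actions

    def output(self, symbol):
        if symbol is None:  # assumption: None means "no side-effect"
            return
        getattr(self.model, self.actions[symbol])()


class StateMachine(object):
    def __init__(self, transitions, outputer, initial_state):
        self._transitions = transitions
        self._outputer = outputer
        self.state = initial_state  # assumption: the caller supplies it

    def input(self, what):
        try:
            output, next_state = self._transitions[self.state][what]
        except KeyError:
            pass
        else:
            self.state = next_state
            self._outputer.output(output)


# Everything below is a hypothetical second application.
class DoorStates(Enum):
    OPEN = "open"
    CLOSED = "closed"


class DoorInputs(Enum):
    PUSHED = "pushed"
    PULLED = "pulled"


class DoorOutputs(Enum):
    RING_BELL = "ring_bell"


class Bell(object):
    """Domain object that knows nothing about state machines."""
    def __init__(self):
        self.rings = 0

    def ring(self):
        self.rings += 1


t = namedtuple("transition", "output next_state")
bell = Bell()
door = StateMachine(
    {
        DoorStates.CLOSED: {
            DoorInputs.PUSHED: t(DoorOutputs.RING_BELL, DoorStates.OPEN),
        },
        DoorStates.OPEN: {
            DoorInputs.PULLED: t(None, DoorStates.CLOSED),
        },
    },
    StateMachineOutputs(bell, {DoorOutputs.RING_BELL: "ring"}),
    DoorStates.CLOSED)

door.input(DoorInputs.PUSHED)  # rings the bell; the door is now open
door.input(DoorInputs.PULLED)  # no output; the door is closed again
```

Nothing in StateMachine or StateMachineOutputs had to change to support the door. Only the transition table, the output-to-method mapping, and the model object are specific to the application.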

This version of the turnstile example demonstrates many of the important features of the state machine library I developed at ClusterHQ for our replicator functionality.
