Logging as Storytelling #3: Types

In my previous two blog posts I talked about how logging should be like storytelling. Specifically, structured as a series of causally-related actions. The next step is making sure log messages are consistent: objects that are used in different log messages should be referred to consistently, messages should be organized in a consistent manner to ease searching, etc.. The way the Eliot logging library does this is by providing a type system for messages, built out of “fields” that know how to serialize arbitrary Python objects.

For example, let’s declare a logging action that describes a state machine transition. The start of the action will include the identity of the state machine, its current state, and the input. The end of the action, if successful, will include the new state and some outputs.

from eliot import Field, ActionType

# A Field that knows how to serialize an object to a loggable format:
FSM_IDENTIFIER = Field(u"fsm_identifier", lambda fsm: fsm.identifier(),
    u"An unique identifier for the FSM to which the event pertains.")
# Some fields that merely expect to receive inputs of specified types:
FSM_STATE = Field.forTypes(
    u"fsm_state", [unicode], u"The state of the FSM prior to the transition.")
FSM_INPUT = Field.forTypes(
    u"fsm_input", [unicode],
    u"The string representation of the input symbol delivered to the FSM.")
FSM_NEXT_STATE = Field.forTypes(
    u"fsm_next_state", [unicode],
    u"The string representation of the state of the FSM after the transition.")
FSM_OUTPUT = Field.forTypes(
    u"fsm_output", [list], # of unicode
    u"A list of the string representations of the outputs produced by the "

# The definition of an action:
    # The name of the action:
    # Fields included in the start message of the action:
    # Fields included in the successful end message of the action:
    # Fields (beyond built-in exception and reason) included in the failure end
    #  message of the action:
    # Description of the action:
    u"A finite state machine received an input made a transition.")

We can now use this to log actions in our state machine implementation:

from eliot import Logger

class FiniteStateMachine(object):
    logger = Logger()

    def __init__(self, name, state, transitions, handler):
        self.name = name
        self.state = state
        self.transitions = transitions
        self.handler = handler

    def identifier(self):
        return self.name

    def input(self, what):
        with LOG_FSM_TRANSITION(self.logger, fsm_identifier=self,
                                fsm_input=what) as action:
             # Look up the state machine transition:
             outputs, next_state = self.transitions[self.state][what]
             # Tell the action what fields to put in the success message:
             # Handler's logging will be in the context of the
             # LOG_FSM_TRANSITION action:
             for output in outputs:
             self.state = next_state

What benefits do we get from having an explicit type and fields?

  1. Re-usable serialization. The fsm_identifier field for example may be used by other log messages that refer to the FiniteStateMachine, ensuring references to the objects are consistent. This will allow us to find all log messages for a specific state machine by finding messages where the fsm_identifer field has a specific value.
  2. Easier log searching. We can search log messages by type of message, e.g. find all fsm:transition messages.
  3. Validation of log messages. Field optionally supports validation, e.g. as we see in the Field.forTypes variant. While doing validation of inputs is disabled in production code due to performance overhead, it can be done when unit testing to ensure log messages are created correctly.
  4. Easier unit testing. In my next post I’ll demonstrate how having types for messages makes it much easier to make assertions about log messages.
  5. Documentation. Each message or action type describes what it means, as does every field it includes.

In my next post I’ll talk about unit testing, ensuring your logging code is written correctly and actually being run. Meanwhile why not read more about ClusterHQ’s technology underlying our cloud platform for web hosting, the reason this logging system is being written.

Get Involved

Sign up for email updates about Flocker