Event Sourcing Overview

Duncan Edwards Jones

5 Mar 2015

An introduction to Event Sourcing for the relational database savvy developer

Introduction

Event sourcing is a way of storing data in computer systems in a forward-only, write-once manner, such that every state-affecting event that occurs to a thing is stored. The current state of that thing can then be recreated by replaying the history of state changes.

A good way to illustrate how this works is by the example of a game of chess. If you start from a known state - for example, the game set up with the pieces in the opening positions - and record each move made in turn, it is possible to store and then replay those moves to recreate the current state of the chess game.

Starting from our initial state, we record the events that occur in the game. In this example, the events are WhitePawnMoved, BlackPawnMoved, WhitePawnMoved, BlackPawnTaken.

We record the order in which these events occurred as a sequence number. The aggregate is the thing that the events occurred to, in this case the game, and it is identified by a unique identifier.

Any additional information provided for an event is saved as the payload attached to each event.

At any given point in time, you can recreate the current state of the game by replaying the events in sequence. This state of the board is therefore a projection of the event stream. However, you can also use the same event stream to produce other projections - for example, if you wanted to know how far any given piece has moved at any stage of the game, you can create another projection that tracks the move events occurring to the piece.

If you are feeling generous, you might allow an inexperienced player to take back a move. In an event stream, no events may be deleted so this is done by adding a "reverse previous move" type of event.

As you can see from this example, the event stream holds absolutely every event that has occurred in the game.
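
The chess example can be sketched in code. This is only a minimal illustration under stated assumptions, not part of the code shown later in this article: the names RecordedEvent, GameEventStream and MoveListProjection, and the "PreviousMoveReversed" event type, are made up for this sketch, and the stream is held in memory purely to keep it short.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Linq

' Hypothetical event record: names are illustrative only
Public NotInheritable Class RecordedEvent
    Public ReadOnly AggregateId As String      ' which game the event occurred to
    Public ReadOnly SequenceNumber As Integer  ' order within that game's stream
    Public ReadOnly EventType As String        ' e.g. "WhitePawnMoved"
    Public ReadOnly Payload As String          ' e.g. "e2-e4"

    Public Sub New(aggregateId As String, sequenceNumber As Integer,
                   eventType As String, payload As String)
        Me.AggregateId = aggregateId
        Me.SequenceNumber = sequenceNumber
        Me.EventType = eventType
        Me.Payload = payload
    End Sub
End Class

' In-memory stand-in for the event stream of one game
Public Class GameEventStream
    Private ReadOnly m_gameId As String
    Private ReadOnly m_events As New List(Of RecordedEvent)

    Public Sub New(gameId As String)
        m_gameId = gameId
    End Sub

    ' Append-only: there is deliberately no update or delete method
    Public Sub Append(eventType As String, payload As String)
        m_events.Add(New RecordedEvent(m_gameId, m_events.Count + 1, eventType, payload))
    End Sub

    Public Function ReadAll() As IEnumerable(Of RecordedEvent)
        Return m_events.AsReadOnly()
    End Function
End Class

Public Module MoveListProjection
    ' Replays the stream in sequence order and returns the moves still in effect
    Public Function CurrentMoves(events As IEnumerable(Of RecordedEvent)) As List(Of String)
        Dim moves As New List(Of String)
        For Each evt As RecordedEvent In events.OrderBy(Function(x As RecordedEvent) x.SequenceNumber)
            If evt.EventType = "PreviousMoveReversed" Then
                ' Taking a move back is itself an event; nothing is deleted from the stream
                If moves.Count > 0 Then moves.RemoveAt(moves.Count - 1)
            Else
                moves.Add(evt.Payload)
            End If
        Next
        Return moves
    End Function
End Module
```

The stream still holds every event, including the move that was taken back; only the projection treats that move as no longer in effect.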

Glossary of Terms

Aggregate - any uniquely identified thing for which events can occur
Projection - Events played through to determine the state of an entity. This can either be the current state, or the state as it was at a given point in time.

Event Source Sanity Savers

Event sourcing can be made a lot easier and the sanity of the developer(s) can be preserved by following these few guidelines:

Each aggregate has its own event stream. These may or may not be physically distinct but logically any event stream may only contain events pertaining to one aggregate.
The event needs to record the event type (or record type) and also the version of the system that wrote the event. This allows the software to discern if an attribute can be expected to be present for any given event.
Classes that represent events should be final (sealed/not inheritable) to minimize the risk of system changes overflowing their intended target.
The payload (or information content) of each event is variable. In practice, this lends itself to the use of a NoSQL system, linked JSON documents, or another object storage mechanism.
The event stream is a forward-only, write-once data store.
“Adjustment” style events are not allowed. If an incorrect event is written, then a cancellation-rebook pair of events must be written to effect the change.
Events should be made idempotent, so that processing the same event more than once leaves the resulting state unchanged.
All the data pertaining to any given event should be stored - whether or not there is a current business need for that data. Ideally, no data should be discarded.
All reads should operate against projections.
Projections can be cached as at a given event and this can be used to speed up generating the current state by starting from that cached state and only applying events occurring since the cache.
Multiple projections can be generated from the same event stream.
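
As a rough sketch of the cancellation-rebook and idempotency guidelines above, consider a simple ledger. Everything here is an assumption made for illustration: the AmountBooked and BookingCancelled events, the BookingId key and the BalanceProjection class do not come from any real library. An incorrect booking is corrected by writing a cancellation followed by a new booking, and because the projection keys its state on the booking identifier, applying the same event twice gives the same balance.

```vbnet
Imports System
Imports System.Collections.Generic

' Hypothetical ledger events, for illustration only
Public NotInheritable Class AmountBooked
    Public ReadOnly BookingId As String
    Public ReadOnly Amount As Decimal

    Public Sub New(bookingId As String, amount As Decimal)
        Me.BookingId = bookingId
        Me.Amount = amount
    End Sub
End Class

Public NotInheritable Class BookingCancelled
    Public ReadOnly BookingId As String

    Public Sub New(bookingId As String)
        Me.BookingId = bookingId
    End Sub
End Class

Public Class BalanceProjection
    ' Bookings currently in effect, keyed by booking identifier
    Private ReadOnly m_live As New Dictionary(Of String, Decimal)

    Public Sub Apply(evt As Object)
        If TypeOf evt Is AmountBooked Then
            Dim booked As AmountBooked = DirectCast(evt, AmountBooked)
            ' Keyed assignment: applying the same booking again changes nothing
            m_live(booked.BookingId) = booked.Amount
        ElseIf TypeOf evt Is BookingCancelled Then
            ' Removing an already removed key is harmless
            m_live.Remove(DirectCast(evt, BookingCancelled).BookingId)
        End If
        ' Any other event type is ignored by this projection
    End Sub

    Public ReadOnly Property Balance As Decimal
        Get
            Dim total As Decimal = 0D
            For Each amount As Decimal In m_live.Values
                total += amount
            Next
            Return total
        End Get
    End Property
End Class
```

To correct a booking of 100 that should have been 10, you would append BookingCancelled for the original booking and then AmountBooked for a new booking of 10, rather than editing the original event.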

Projections

A projection can be imagined as the combination of a filter that decides whether or not an event is of interest to the projection, and a process to apply the event to the current state. Running the stream of events through this projection will give you a view of the object's current state.

The following guidelines are useful when coding projections:

A projection needs to know which event types (record types) impact its state and it needs to ignore all other event types.
A projection needs a rule for missing attributes - either assigning a default value or ignoring them.
A projection should know the aggregate it is run against and the sequence number of the most recent event that it has processed. This can be used in the creation of cached snapshots of the state.

A projection can also be run up until a given point in an event stream which allows you to get the state of an aggregate as at a given point in time. This historical query functionality is extremely powerful in some application domains, for example, financial systems.
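
These ideas can be sketched as follows. This is a simplified illustration rather than the implementation used later in the article: SequencedEvent, EnabledStateProjection and the string event type names are assumptions, and the events are assumed to arrive in sequence order.

```vbnet
Imports System
Imports System.Collections.Generic

' Hypothetical minimal event record, assumed for this sketch only
Public NotInheritable Class SequencedEvent
    Public ReadOnly SequenceNumber As Integer
    Public ReadOnly EventType As String

    Public Sub New(sequenceNumber As Integer, eventType As String)
        Me.SequenceNumber = sequenceNumber
        Me.EventType = eventType
    End Sub
End Class

Public Class EnabledStateProjection
    ' Current state, plus the sequence number of the last event processed.
    ' Setting both from a cached snapshot lets a later run skip older events.
    Public Property Enabled As Boolean
    Public Property LastSequenceNumber As Integer

    ' Filter: only these event types affect this projection
    Private Shared Function IsOfInterest(eventType As String) As Boolean
        Return eventType = "AccountEnabled" OrElse eventType = "AccountDisabled"
    End Function

    ' Applies events (assumed to arrive in sequence order) up to and
    ' including asOfSequence, giving the state as at that point in the stream
    Public Sub RunUpTo(events As IEnumerable(Of SequencedEvent), asOfSequence As Integer)
        For Each evt As SequencedEvent In events
            If evt.SequenceNumber > asOfSequence Then Exit For
            ' Skip anything already reflected in the current (or cached) state
            If evt.SequenceNumber <= LastSequenceNumber Then Continue For
            If IsOfInterest(evt.EventType) Then
                Enabled = (evt.EventType = "AccountEnabled")
            End If
            LastSequenceNumber = evt.SequenceNumber
        Next
    End Sub
End Class
```

Passing an earlier sequence number gives the historical state, while passing Integer.MaxValue gives the current state. Creating the projection with Enabled and LastSequenceNumber loaded from a cache resumes from that snapshot.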

Querying

One of the objections raised to the use of event sourcing is that it is slow to query on an ad-hoc basis as every event has to be played into every projection in order to generate the state to query against.

This is, to an extent, true but it is also very easily mitigated by simply storing the cached projections in a relational database system. These cached states can then be queried as if they were a relational database.

You can also reduce the overhead of running a projection by sending the projection code to wherever the data for the event stream is held rather than bringing the entire event stream over to run the projection over it. This is conceptually similar to a map-reduce system.

Advantages

Event sourcing has a number of advantages over systems derived from relational databases, but chief amongst these is that you have a built-in, guaranteed audit trail. All systems of record need an audit trail, and bolting one on top of a relational database is suboptimal - it tends to complicate the database and slow down operations on that system.

It is also the case that a large number of different types of business already operate on an equivalent to an event source - for example, any business that operates a ledger would map across.

On the technology side, the extra cost of consistency checking in a relational database system can slow it down considerably. This can be a major headache when scaling the system to cope with rapid growth of a business. Event sourcing systems do not have these same consistency check issues because records can be neither deleted nor amended.

Disadvantages

The tooling for working with event sources lags behind that available for relational database systems. Additionally, the limited availability of skilled technologists with event sourcing knowledge is an impediment to its adoption by businesses.

Code Example (VB.NET)

The following (abridged) code example shows how this can be coded:

Aggregates

In order to make the aggregates type safe, I add an interface to create a specific aggregate identifier:

VB.NET

Public Interface IAggregateIdentity

    ''' <summary>
    ''' Get the unique name that identifies this aggregate item
    ''' </summary>
    Function GetAggregateIdentity() As String

End Interface

And a concrete class that implements this identity interface:

VB.NET

''' <summary>
''' Class to identify events pertaining to a known user
''' </summary>
''' <remarks>
''' The user is uniquely identified by a user name (or handle) which must
''' be unique system-wide.
''' </remarks>
Public Class UserAggregateIdentity
    Implements IAggregateIdentity

    ReadOnly m_userName As String

    Public Function GetAggregateIdentity() As String _
           Implements IAggregateIdentity.GetAggregateIdentity
        Return m_userName
    End Function

    Public Sub New(ByVal userName As String)
        m_userName = userName
    End Sub

End Class

This is used to restrict any given event such that it can only apply to one aggregate root type.

Events

For a similar reason, all events are derived from an interface that is bound to the specific aggregate type that the event pertains to:

VB.NET

''' <summary>
''' Bare bones interface to identify an event
''' </summary>
''' <remarks>
''' The payload (data) is provided by the concrete implementation of the event
''' </remarks>
Public Interface IEvent(Of In TAggregate As IAggregateIdentity)

End Interface

In turn, each distinct event type implements this interface:

VB.NET

''' <summary>
''' A new user was added to the system
''' </summary>
Public NotInheritable Class CreatedEvent
    Inherits EventBase
    Implements IEvent(Of AggregateIdentifiers.UserAggregateIdentity)

    ''' <summary>
    ''' The unique identifier of the user that was created
    ''' </summary>
    Public Property UserIdentifier As String

    ''' <summary>
    ''' The email address the user was initially created with
    ''' </summary>
    Public Property EmailAddress As String

    ''' <summary>
    ''' How was this user created - self service, administrator or bulk import
    ''' </summary>
    Public Property Source As String


    Public Overrides Function ToString() As String
        Return "User was created - " & UserIdentifier
    End Function

End Class

Of course, not every event type needs an additional payload. The event to indicate that a user was disabled could just be:

VB.NET

''' <summary>
''' A user account was disabled
''' </summary>
Public NotInheritable Class AccountDisabledEvent
    Inherits EventBase
    Implements IEvent(Of AggregateIdentifiers.UserAggregateIdentity)


    Public Overrides Function ToString() As String
        ' This event carries no payload, so there is nothing further to append
        Return "User account was disabled"
    End Function
End Class

Each event class is declared NotInheritable so that there can be no ambiguity as to what event happened.

Projections

A projection can run over many different types of events, but these must all pertain to the same aggregate, so we build that restriction into a base class:

VB.NET

''' <summary>
''' Base class for all projections
''' </summary>
''' <remarks>
''' A projection can only operate on a single aggregate by design
''' </remarks>
Public MustInherit Class ProjectionBase(Of TAggregate As Event.IAggregateIdentity)
    Implements IEventConsumer(Of TAggregate, IEvent(Of TAggregate))

    ''' <summary>
    ''' The aggregate identity that this projection is operating against
    ''' </summary>
    MustOverride ReadOnly Property Identity As TAggregate

    ''' <summary>
    ''' Process an event of the given aggregate to get the current point-in-time state
    ''' </summary>
    ''' <param name="eventToConsume">
    ''' The event with the payload that might impact the state of the projection
    ''' </param>
    MustOverride Sub ConsumeEvent(eventToConsume As IEvent(Of TAggregate)) _
      Implements IEventConsumer(Of TAggregate, IEvent(Of TAggregate)).ConsumeEvent

End Class

And a concrete projection could look like:

VB.NET

''' <summary>
''' Summary projection over the User event stream
''' </summary>
''' <remarks></remarks>
Public Class UserSummaryProjection
    Inherits ProjectionBase(Of AggregateIdentifiers.UserAggregateIdentity)

    ReadOnly m_identity As AggregateIdentifiers.UserAggregateIdentity

    ''' <summary>
    ''' Create the projection for the given user aggregate
    ''' </summary>
    Public Sub New(ByVal userIdentity As AggregateIdentifiers.UserAggregateIdentity)
        m_identity = userIdentity
    End Sub

    ''' <summary>
    ''' The aggregate identifier of the user to which this projection applies
    ''' </summary>
    Public Overrides ReadOnly Property Identity _
                   As AggregateIdentifiers.UserAggregateIdentity
        Get
            Return m_identity
        End Get
    End Property

    Public Overrides Sub ConsumeEvent(eventToConsume _
                   As IEvent(Of AggregateIdentifiers.UserAggregateIdentity))

        If (TypeOf (eventToConsume) Is Events.User.CreatedEvent) Then
            ' Explicit cast so that this also compiles with Option Strict On
            Dim userCreated As Events.User.CreatedEvent = _
                DirectCast(eventToConsume, Events.User.CreatedEvent)
            m_userIdentifier = userCreated.UserIdentifier
            m_emailAddress = userCreated.EmailAddress
        End If

        If (TypeOf (eventToConsume) Is Events.User.AccountEnabledEvent) Then
            m_enabled = True
        End If

        If (TypeOf (eventToConsume) Is Events.User.AccountDisabledEvent) Then
            m_enabled = False
        End If

    End Sub

    ''' <summary>
    ''' The public unique identifier of the user
    ''' (could be a user name or company employee code etc.)
    ''' </summary>
    Private m_userIdentifier As String
    Public ReadOnly Property UserIdentifier As String
        Get
            Return m_userIdentifier
        End Get
    End Property

    ''' <summary>
    ''' The email address of the user
    ''' </summary>
    Private m_emailAddress As String
    Public ReadOnly Property EmailAddress As String
        Get
            Return m_emailAddress
        End Get
    End Property

    ''' <summary>
    ''' Is the user enabled or not
    ''' </summary>
    ''' <remarks>
    ''' This allows users to be removed from the system without any data integrity issues
    ''' </remarks>
    Private m_enabled As Boolean
    Public ReadOnly Property Enabled As Boolean
        Get
            Return m_enabled
        End Get
    End Property

End Class

It is worth noting that the projection has no dependency on where or how the event stream is actually stored. This makes it very easy to put together unit tests that run the projection over a hard-coded event stream and check that the expected state emerges.
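
A test of that kind might look like the sketch below. To keep it self-contained, it uses simplified stand-ins (IUserEvent, UserCreated, UserEnabled, UserDisabled and UserSummary) rather than the abridged classes above, and it returns a Boolean instead of assuming any particular unit test framework; all of these names are hypothetical.

```vbnet
Imports System

' Simplified stand-ins for the article's types so the sketch compiles on its own
Public Interface IUserEvent
End Interface

Public NotInheritable Class UserCreated
    Implements IUserEvent
    Public Property UserIdentifier As String
End Class

Public NotInheritable Class UserEnabled
    Implements IUserEvent
End Class

Public NotInheritable Class UserDisabled
    Implements IUserEvent
End Class

Public Class UserSummary
    Public Property UserIdentifier As String
    Public Property Enabled As Boolean

    ' Applies one event to the state; unknown event types are ignored
    Public Sub ConsumeEvent(eventToConsume As IUserEvent)
        If TypeOf eventToConsume Is UserCreated Then
            UserIdentifier = DirectCast(eventToConsume, UserCreated).UserIdentifier
        ElseIf TypeOf eventToConsume Is UserEnabled Then
            Enabled = True
        ElseIf TypeOf eventToConsume Is UserDisabled Then
            Enabled = False
        End If
    End Sub
End Class

Public Module UserSummaryTests
    ' Returns True when the projection reaches the expected state
    Public Function DisabledUserIsReportedAsDisabled() As Boolean
        ' The hard-coded event stream: no database or event store is involved
        Dim stream As IUserEvent() = {
            New UserCreated With {.UserIdentifier = "duncan"},
            New UserEnabled(),
            New UserDisabled()
        }

        Dim projection As New UserSummary()
        For Each evt As IUserEvent In stream
            projection.ConsumeEvent(evt)
        Next

        Return projection.UserIdentifier = "duncan" AndAlso Not projection.Enabled
    End Function
End Module
```

In a real project, the body of DisabledUserIsReportedAsDisabled would typically sit inside a test method of whichever unit test framework you use, with the final comparison expressed as assertions.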

History

5th March, 2015: Initial version

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)