We are launching a series of articles on the Saga pattern. Our goal is to blend theoretical concepts with practical implementation (in C#), offering clear illustrations of the associated challenges.
Introduction
Microservices are becoming increasingly prevalent, aided by the growing adoption of cloud platforms. It is now feasible to promptly implement architectural best practices and instantiate multiple datastores effortlessly. Using Domain-Driven Design (DDD) terminology, this has facilitated the implementation of distinct bounded contexts with their own code and storage. Consequently, this paradigm has streamlined deployments and reduced delivery times. These benefits are now widely acknowledged.
On the flip side, microservices bring their own set of challenges. As data is now distributed across multiple repositories rather than centralized in a single datastore, we can no longer depend on SQL mechanisms to guarantee integrity and coherence, as developers have traditionally done for many years. In simpler terms, the transactional mechanisms previously provided by Oracle or Microsoft allowed us to confidently trust that our data was always in a consistent state. However, with distributed systems, maintaining data consistency has become a daily concern.
In this series of articles, our goal is to address these challenges by implementing a variation of the Saga pattern on the Azure platform.
Disclaimer
The definitions we offer are not official, and the implementations we present should be subject to scrutiny. The primary objective is to illustrate the issues and demonstrate a potential approach to resolving them.
This article was initially published here. Refer to it for a comprehensive overview.
What is a Transaction?
Transactions are a fundamental concept in relational databases and one of their defining capabilities. Developers have traditionally placed implicit trust in the internal mechanisms implemented by Oracle or SQL Server to guarantee data integrity and consistency.
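-- SQL Server (T-SQL) syntax. XACT_ABORT makes any run-time error roll back the whole transaction.
SET XACT_ABORT ON;
CREATE TABLE ValueTable (id INT);
BEGIN TRANSACTION;
INSERT INTO ValueTable VALUES (1);
INSERT INTO ValueTable VALUES (2);
COMMIT;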
In the example provided, a table is created, and two insertions are attempted.
- If one of these insertions fails, no rows will be present in the table.
- Conversely, if both insertions succeed, the two rows will be present in the table.
This behavior is so taken for granted that the complexity behind this seemingly straightforward operation often goes unnoticed.
As long as traditional architectures were predominantly monolithic, characterized by a single codebase and a unified SQL repository, this wasn't a significant issue. Many successful applications were implemented without concerning themselves with these matters. However, with the advent of the microservices paradigm, some previously overlooked challenges have resurfaced and must now be addressed.
Indeed, consider a modern ecommerce platform designed with a microservices architecture in mind, delineating bounded contexts as illustrated in the figure below. What unfolds when a customer initiates an order?
In this scenario, resorting to a traditional transaction is not feasible due to the involvement of two distinct datastores. The most straightforward course of action is to omit the transactional mechanism altogether.
Most of the time, it will function as expected (given that network issues are relatively uncommon). When an order is placed, it should appear in both datastores. However, consider a scenario where a problem arises just after the insertion into the Order database.
In such a scenario, the issue can be quite consequential: a record is inserted into the Order database, and the merchant includes it in their accounting. However, the customer will never receive their order, leading to all the potential consequences one can imagine.
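The failure described above can be reproduced with a minimal sketch. The `OrderStore` and `DeliveryStore` classes are hypothetical in-memory stand-ins for the two datastores (they are not part of the original implementation), and the `FailNextInsert` flag is an assumption used only to simulate an outage on the second write.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory stand-in for the Order datastore.
public sealed class OrderStore
{
    public List<string> Orders { get; } = new();
    public void Insert(string orderId) => Orders.Add(orderId);
}

// Hypothetical in-memory stand-in for the Delivery datastore.
public sealed class DeliveryStore
{
    public List<string> Deliveries { get; } = new();

    // Assumption for illustration: lets a caller simulate an outage.
    public bool FailNextInsert { get; set; }

    public void Insert(string orderId)
    {
        if (FailNextInsert)
            throw new InvalidOperationException("Delivery store unavailable");
        Deliveries.Add(orderId);
    }
}

public static class NaiveCheckout
{
    // Two independent writes with no transaction spanning them.
    public static void PlaceOrder(OrderStore orders, DeliveryStore deliveries, string orderId)
    {
        orders.Insert(orderId);      // succeeds and is immediately visible
        deliveries.Insert(orderId);  // may throw, leaving an order without a delivery
    }
}
```

If `PlaceOrder` is called while `FailNextInsert` is set, the exception propagates to the caller, but the order has already been written: the two stores now disagree, and nothing in this code repairs that.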
Why is This Problem Arising Now?
In reality, this problem never went away; it was simply concealed by the intricate machinery of existing databases. With the rise of distributed architectures and the myriad technologies involved, it can no longer be delegated to the database and becomes the developers' responsibility.
Who claimed that microservices were a simpler solution? Like every engineering paradigm, they simplify certain aspects but introduce other, very acute problems. Distributed transaction management is one of them.
Caution
It is more intricate than that. With the advent of NoSQL databases, such as document-oriented ones, a transaction cannot always span arbitrary items even within the same datastore. For instance, Azure Cosmos DB only supports transactions scoped to a single logical partition of a single container (never across containers) and imposes further limitations.
In the rest of this series, we introduce an approach to tackle this issue. Conventionally, such recurring challenges, along with their solutions, are called patterns, and this one is no exception. Enter the Saga pattern.
What is the Saga Pattern?
The Saga pattern is a design pattern used in distributed systems to manage long-running transactions. It breaks down a transaction into a series of smaller, self-contained steps or activities, each with its own compensating transaction. This approach allows for better resilience and fault tolerance in distributed environments.
- If one step fails, the compensating transactions of the preceding steps are executed to undo the changes and maintain consistency.
- Each step in the Saga represents an individual atomic transaction, and the entire sequence of steps ensures the overall transaction's integrity.
This definition is somewhat abstract and might seem a bit elusive, but what does it actually mean in concrete terms?
What is the Saga Pattern in Concrete Terms?
We will dissect each term in the definition using illustrative examples.
The Saga Pattern is Used in Distributed Systems
This is the straightforward part of the definition. The Saga pattern is predominantly employed in systems made of multiple isolated services that need to interconnect and collaborate. It is less relevant in monolithic applications, where there is typically a single source of truth and a built-in transaction mechanism.
The Saga pattern is particularly valuable in microservices architectures where traditional ACID transactions might be challenging to implement due to the distributed and decentralized nature of the system. It provides a more flexible and scalable way to manage complex, multi-step transactions in such environments.
It Breaks Down a Transaction Into a Series of Smaller Steps
Instead of executing a large transaction that spans multiple systems and might impact performance, the Saga pattern relies on local transactions. Each subsystem, such as the Order microservice or the Delivery module, is responsible for running its own local transaction against its own datastore. This approach is inherently logical: each module has an intimate understanding of the intricacies of its technology and is, therefore, best placed to determine how to execute a transaction within its environment.
However, if the Saga pattern were merely a sequence of more or less independent local transactions executed by each subsystem at its convenience, it would offer no guarantee of overall consistency and would be of little value. This is why a mechanism called compensation is introduced.
Each Step Has Its Own Compensating Transaction
The compensating transaction pattern involves defining and implementing a compensating transaction for each step or activity within a Saga. Each compensating transaction is designed to undo the effects of its corresponding original transaction in case of a global failure or error. When a step within the sequence encounters an issue, the compensating transaction of each preceding step is triggered. These compensating transactions are carefully crafted to reverse or compensate for the changes made during the successful execution of their corresponding steps.
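To make the pairing of a step with its compensation concrete, here is a minimal sketch. The `OrderService` class, its method names, and the status strings are hypothetical and chosen for illustration only. The sketch also assumes that compensation marks the order as cancelled rather than deleting it, which is one possible design among others.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical Order service: one local transaction plus its compensation.
public sealed class OrderService
{
    private readonly Dictionary<string, string> _statusByOrderId = new();

    // Local transaction: record the order.
    public void CreateOrder(string orderId) => _statusByOrderId[orderId] = "Created";

    // Compensating transaction: semantically undo CreateOrder.
    // Assumption: the record is marked as cancelled instead of deleted,
    // so that the history of what happened is preserved.
    // Calling it twice, or for an unknown order, changes nothing further.
    public void CancelOrder(string orderId)
    {
        if (_statusByOrderId.ContainsKey(orderId))
            _statusByOrderId[orderId] = "Cancelled";
    }

    public string GetStatus(string orderId) =>
        _statusByOrderId.TryGetValue(orderId, out var status) ? status : "Unknown";
}
```

Making the compensating action safe to call more than once is a deliberate choice in this sketch: it allows the action to be retried without side effects if its first attempt appears to fail.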
In Summary
The compensating transaction pattern ensures that, in the event of a failure, the system can be brought back to a consistent state by systematically applying the compensating transactions. This approach provides a way to maintain data integrity and consistency despite failures in the distributed environment.
Who is Responsible for Overseeing the Execution Throughout the Entire Process?
This mechanism requires a central coordinator, often referred to as an orchestrator, which instructs other services to execute their local transactions and, if necessary, to roll them back in the event of a failure. In the context of a serverless architecture, such as on Azure, this orchestrator could be implemented as an Azure Function.
Acting as the executor, the Azure Function runs the first local transaction and then, if it succeeds, the second one. If the second one fails, the executor rolls back the first transaction using the previously implemented compensation logic.
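The orchestration logic can be sketched in plain C#, independently of Azure Functions. This is not the implementation presented later in the series: `SagaStep` and `SagaOrchestrator` are hypothetical names, and the sketch assumes that a failing step leaves no partial change behind because it is itself a local, atomic transaction.

```csharp
using System;
using System.Collections.Generic;

// One saga step: a local transaction paired with its compensating action.
public sealed record SagaStep(string Name, Action Execute, Action Compensate);

public static class SagaOrchestrator
{
    // Runs the steps in order and returns true if all of them succeed.
    // On the first failure, compensates the completed steps in reverse
    // order and returns false.
    public static bool Run(IEnumerable<SagaStep> steps)
    {
        var completed = new Stack<SagaStep>();
        foreach (var step in steps)
        {
            try
            {
                step.Execute();
                completed.Push(step);
            }
            catch (Exception)
            {
                // Assumption: the failed step rolled back its own local
                // transaction, so only earlier steps need compensation.
                while (completed.Count > 0)
                    completed.Pop().Compensate();
                return false;
            }
        }
        return true;
    }
}
```

A stack is used so that compensations run in the reverse order of execution, which mirrors how nested changes are usually undone. Note that this sketch does not handle a compensation that throws.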
Information 1
For completeness, note that the Saga pattern can be implemented not only with an orchestrator but also with a mechanism known as choreography. In this approach, each local transaction publishes an event upon success, and other transactions subscribe to these events, executing when their conditions are met. However, choreography won't be covered in this series.
Information 2
The orchestrator-based approach is more straightforward to comprehend and oversee since the saga execution logic is centralized. However, the orchestrator can also become a bottleneck and a single point of failure if it is not meticulously designed.
Give Me an Example
We return to the previously mentioned scenario and imagine once more that an order is successfully placed in the Order database, but an error prevents the delivery from being recorded. The following diagram illustrates how this situation is addressed using the Saga pattern.
What Happens if the Compensation Logic Fails?
There are instances when the compensation logic may fail. This situation can be addressed through a combination of the following approaches:
- Implementing the Retry pattern for the compensating action
- Utilizing exception handling
If an automated process is unable to resolve the issue, the fallback option is to generate an exception report, which can be reviewed manually. This manual review allows for appropriate actions to be taken based on the identified issue.
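The combination of retries and a manual fallback can be sketched as follows. `CompensationRunner` and the `exceptionReport` list are hypothetical: the list merely stands in for whatever durable store (a queue, a table, a ticketing system) would hold the items awaiting manual review.

```csharp
using System;
using System.Collections.Generic;

public static class CompensationRunner
{
    // Tries the compensating action up to maxAttempts times.
    // Returns true on success. Otherwise adds an entry to the exception
    // report for manual review and returns false.
    public static bool CompensateWithRetry(
        string stepName, Action compensate, int maxAttempts, List<string> exceptionReport)
    {
        if (maxAttempts < 1)
            throw new ArgumentOutOfRangeException(nameof(maxAttempts));

        string lastError = string.Empty;
        for (int attempt = 1; attempt <= maxAttempts; attempt++)
        {
            try
            {
                compensate();
                return true;
            }
            catch (Exception ex)
            {
                // A real implementation would likely wait (with backoff)
                // before the next attempt; omitted here for brevity.
                lastError = ex.Message;
            }
        }

        exceptionReport.Add(
            $"{stepName}: compensation failed after {maxAttempts} attempts ({lastError})");
        return false;
    }
}
```

Retrying is only safe if the compensating action can be executed several times without additional side effects, which is an assumption this sketch places on the caller.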
Important
Resorting to a manual process may appear cumbersome, but nothing comes without a cost in this world. We trade the advantages of microservices, such as modularity and ease of deployment, against very rare and occasional instances where we need to resort to manual methods.
But let's leave theory behind. It is time to focus on practical implementation, demonstrating the Saga pattern on a serverless architecture using Azure and C#. Please visit this link to see it in action. Alternatively, you can download the source code enclosed.
History
- 21st December, 2023: Initial version