Optimizing Performance with Denormalization, Observer Design Pattern and Asynchronous Technique

24 Aug 2009 | CPOL | 6 min read
Combining de-normalization, the Observer design pattern and asynchronous updates can improve performance for systems with very large databases.

Introduction

Most developers have heard of de-normalization, the Observer design pattern and asynchronous techniques, but many never look deeper into what these techniques can do together. The purpose of this article is to give you a working understanding of how to combine them to optimize performance.

Optimize Performance with De-normalization

Let’s start with the definition of de-normalization:

De-normalization is the process of attempting to optimize the performance of a database by adding redundant data or by grouping data.  

If the amount of data is small, de-normalization does not help much, because there is little performance to gain. It pays off with large data sets. Large data sets typically hurt modules that run many complicated calculations, such as a reporting module. In reports, summary data such as a YTD (Year-To-Date) or QTD (Quarter-To-Date) report does not change frequently, and end users rarely change historical data at all. So why recalculate it every time?

Reporting usually relies heavily on aggregate calculations such as sums and averages. These calculations incur processing overhead on every request, which reduces performance. De-normalization addresses this problem: the results of the calculations are stored in the database.

Specifically, we’re first looking at the following database design:

Image 1

The requirements are straightforward. Each Order is associated with one Customer and one SalesPerson, and each Order includes multiple OrderDetail transactions (a one-to-many relationship).

The end user wants to view a report or dashboard showing the total sales amount for each product, the total amount each sales person has sold, and the total amount each customer has paid. Computing these sums on every request might take a long time and hurt performance. Instead, we compute each sum once and store it in the database. This leads to the solution: adding total fields.

  • Customer: Add field “TotalPaymentAmount”
  • Product: Add field “TotalSalesAmount”
  • SalesPerson: Add field “TotalSoldAmount”

These total fields are redundant, but they improve read performance considerably. Whenever something changes in the order processing life cycle, the total fields must be updated to match.
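To make the trade-off concrete, here is a minimal in-memory sketch. It is not the article's sample code: `OrderStore`, `AddDetail`, `SumFromDetails` and `StoredTotal` are hypothetical names, and plain collections stand in for the database tables. It only illustrates how a redundant total field turns a scan over detail rows into a single field read.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory stand-ins for the OrderDetail and Customer tables.
public class OrderDetail
{
    public int CustomerId;
    public decimal Amount;
}

public class Customer
{
    public int Id;
    // Redundant (de-normalized) field: kept in sync on every write.
    public decimal TotalPaymentAmount;
}

public class OrderStore
{
    private readonly List<OrderDetail> _details = new List<OrderDetail>();
    private readonly Dictionary<int, Customer> _customers = new Dictionary<int, Customer>();

    public Customer AddCustomer(int id)
    {
        var customer = new Customer { Id = id };
        _customers[id] = customer;
        return customer;
    }

    // Write path: store the detail row and adjust the stored total by the same amount.
    public void AddDetail(int customerId, decimal amount)
    {
        _details.Add(new OrderDetail { CustomerId = customerId, Amount = amount });
        _customers[customerId].TotalPaymentAmount += amount;
    }

    // Normalized read: scans every detail row each time the report is opened.
    public decimal SumFromDetails(int customerId)
    {
        return _details.Where(d => d.CustomerId == customerId).Sum(d => d.Amount);
    }

    // De-normalized read: a single field lookup.
    public decimal StoredTotal(int customerId)
    {
        return _customers[customerId].TotalPaymentAmount;
    }
}
```

Both read methods return the same value as long as every write goes through `AddDetail`. Keeping that true is the synchronization problem the rest of the article deals with.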

That’s it for de-normalization. Now, let's take a deeper look at how we update total fields.

The Problem – Updating/Synchronizing Total Data

The problem is how to keep the total fields up to date when the underlying data changes. When should these fields be updated? There are basically two options. The first is to update the total fields when the report or dashboard is opened; this requires a flag field that records that something has changed and the totals need to be recalculated. The drawback is that the report is slow every time a recalculation is needed. The second is to update the total fields right after the data changes, without waiting until the report or dashboard is opened.

In this article, I’ll work through the second solution. Specifically, when a SalesPerson updates Order data, the system updates the three total fields too. This suggests a publisher-subscriber model, and the Observer pattern fits this design well.
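The difference between the two options can be sketched in a few lines. The classes below are hypothetical (`LazyTotal`, `EagerTotal` and their members are illustrative names, not part of the article's download) and use an in-memory list in place of the database.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Option 1: recalculate when the report is opened, guarded by a "stale" flag.
public class LazyTotal
{
    private readonly List<decimal> _amounts = new List<decimal>();
    private decimal _total;
    private bool _isStale; // plays the role of the flag field

    public int RecalculationCount { get; private set; }

    public void AddAmount(decimal amount)
    {
        _amounts.Add(amount);
        _isStale = true; // cheap write, the cost is deferred to the next read
    }

    public decimal ReadTotal()
    {
        if (_isStale)
        {
            _total = _amounts.Sum(); // full recalculation on the report's request
            _isStale = false;
            RecalculationCount++;
        }
        return _total;
    }
}

// Option 2: adjust the total right after each data change.
public class EagerTotal
{
    public decimal Total { get; private set; }

    public void AddAmount(decimal amount)
    {
        Total += amount; // small cost on every write, reads are always cheap
    }
}
```

With `LazyTotal`, the first read after a change pays for the whole recalculation. With `EagerTotal`, each write pays a small cost instead.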

What is Observer Design Pattern?

According to the definition at www.dofactory.com, the Observer pattern defines a one-to-many dependency between objects so that when one object changes state, all its dependents are notified and updated automatically.

Image 2

The Observer pattern simplifies the problem and fosters low coupling and scalability. Its beauty is that you can attach or detach any observer object at run time.

The above UML diagram is abstract and may be a bit difficult to understand. Here’s a clearer picture of a real-world scenario:

Image 3

In this figure, I add two objects called “Subject State” and “Observer State”. Subject State tells the observers what the subject object has done. Likewise, Observer State tells other objects how the observer object fared (i.e., whether its update succeeded or failed). This is very helpful in some business scenarios, for example when dealing with transactions: we need to know which observer failed to update its data so that we can take a suitable action.

Now, let’s go back to the above database design with de-normalization applied. Here’s how the class diagram looks:

Image 4

It’s better to define one state object for the subject and one for the observers. I implement them as two enums called “SubjectState” and “ObserverState”:

C#
public enum SubjectState
{
    // Only general information is updated, such as
    // changing customer information, changing sales person information...
    UPDATE_ORDER,

    // Detail information is updated, including
    // product price, quantity...
    UPDATE_ORDER_DETAIL
}

public enum ObserverState
{
   NO_UPDATE,
   UPDATE_SUCCESSFUL,
   UPDATE_FAILED
}

Each observer uses the state value passed on from the subject object to decide whether it needs to update. Here’s how it looks:

C#
/// <summary>
/// The 'ConcreteObserver' class
/// </summary>
class ProductObserver : Observer
{
    …
    public override void Update()
    {
        // The subject notifies this observer of its current state.
        // ProductObserver only updates its data when the state is UPDATE_ORDER_DETAIL.
        if (_subject.State == SubjectState.UPDATE_ORDER_DETAIL)
        {
            // YOUR CODE GOES HERE: send a request to the database to update
            // the total field in the Product table.
            bool updateResult = true; // Placeholder for the result of that request

            if (updateResult) // Updated successfully
            {
                this.State = ObserverState.UPDATE_SUCCESSFUL; // Update observer state

                Console.WriteLine(
                    "'Total Sales Amount' for Products which are chosen in Order {0} " +
                    "has been updated successfully too.",
                    _subject.OrderID);
                Console.WriteLine();
            }
            else
            {
                this.State = ObserverState.UPDATE_FAILED;
            }
        }
        else
        {
            this.State = ObserverState.NO_UPDATE;
        }

        Console.WriteLine("Observer '{0}'s state is '{1}'", _name, this.State);
        Console.WriteLine();
    }
    …
}
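The subject side is not shown above, so here is a rough, self-contained sketch of how attaching, detaching and notifying could be wired together. It is an assumption-laden illustration, not the code from the download: `OrderSubject` and `TotalFieldObserver` are hypothetical names, the subject state is pushed into `Update` instead of being read from `_subject`, and a `Func<bool>` stands in for the database call.

```csharp
using System;
using System.Collections.Generic;

// Same enums as in the article, repeated so the sketch compiles on its own.
public enum SubjectState { UPDATE_ORDER, UPDATE_ORDER_DETAIL }
public enum ObserverState { NO_UPDATE, UPDATE_SUCCESSFUL, UPDATE_FAILED }

// Hypothetical observer: reacts only to the subject state it was created for.
public class TotalFieldObserver
{
    private readonly SubjectState _trigger;
    private readonly Func<bool> _updateTotal; // stands in for the database call

    public string Name { get; private set; }
    public ObserverState State { get; private set; }

    public TotalFieldObserver(string name, SubjectState trigger, Func<bool> updateTotal)
    {
        Name = name;
        _trigger = trigger;
        _updateTotal = updateTotal;
        State = ObserverState.NO_UPDATE;
    }

    public void Update(SubjectState subjectState)
    {
        if (subjectState != _trigger)
        {
            State = ObserverState.NO_UPDATE;
            return;
        }
        State = _updateTotal() ? ObserverState.UPDATE_SUCCESSFUL
                               : ObserverState.UPDATE_FAILED;
    }
}

// Hypothetical subject: observers can be attached or detached at run time.
public class OrderSubject
{
    private readonly List<TotalFieldObserver> _observers = new List<TotalFieldObserver>();

    public void Attach(TotalFieldObserver observer) { _observers.Add(observer); }
    public void Detach(TotalFieldObserver observer) { _observers.Remove(observer); }

    // Notifies every attached observer and returns those that reported a failure.
    public List<TotalFieldObserver> Notify(SubjectState state)
    {
        var failed = new List<TotalFieldObserver>();
        foreach (var observer in _observers)
        {
            observer.Update(state);
            if (observer.State == ObserverState.UPDATE_FAILED)
            {
                failed.Add(observer);
            }
        }
        return failed;
    }
}
```

Returning the failed observers from `Notify` is one way to use the observer state: the caller can decide whether to retry, roll back or flag the totals for recalculation.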

You can download the complete sample included in this article. The table below shows the output after running this sample:

                    |                             | Observer State
Subject State       | Actions                     | Customer Observer | Product Observer  | SalesPerson Observer
--------------------+-----------------------------+-------------------+-------------------+---------------------
UPDATE_ORDER        |                             | NO_UPDATE         | UPDATE_SUCCESSFUL | UPDATE_SUCCESSFUL
UPDATE_ORDER_DETAIL |                             | UPDATE_SUCCESSFUL | UPDATE_SUCCESSFUL | UPDATE_SUCCESSFUL
UPDATE_ORDER_DETAIL | Detach SalesPerson Observer | UPDATE_SUCCESSFUL | UPDATE_SUCCESSFUL |

Update Observer Objects Asynchronously

There is one drawback to the above solution: every update to the subject object now also triggers updates on its observers, which adds processing overhead to the request. How do we solve this performance problem? Notice that updating the observers has nothing to do with rendering the UI. There is no need to make the end user wait for the observers to finish their updates; otherwise the pages feel less responsive. One often-overlooked option is to update objects asynchronously. Observer objects can be designed to run asynchronously, independent of the UI thread. In the ASP.NET page life cycle, we can do this easily by hooking up the Unload event handler.

Image 5
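As a rough illustration of running observer updates off the calling thread, the sketch below uses `Task.Run` from the Task Parallel Library (.NET 4.5 or later). This is a different mechanism from the Unload event handler described above, and `AsyncNotifier` with its members is a hypothetical name; the observers are reduced to plain `Action` delegates to keep the example short.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical notifier: observers are represented by their update actions.
public class AsyncNotifier
{
    private readonly List<Action> _observerUpdates = new List<Action>();

    public void Attach(Action observerUpdate)
    {
        _observerUpdates.Add(observerUpdate);
    }

    // Starts the observer updates on a thread-pool thread and returns immediately,
    // so the caller (for example the code rendering the page) is not blocked.
    public Task NotifyAsync()
    {
        // Copy the list so later Attach calls do not affect a running notification.
        Action[] snapshot = _observerUpdates.ToArray();

        return Task.Run(() =>
        {
            foreach (Action update in snapshot)
            {
                update();
            }
        });
    }
}
```

The returned `Task` should still be observed somewhere (awaited, or checked for faults) so that a failed observer update does not go unnoticed. That is what the next section is about.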

Safely Update Observer Objects in Transaction

The Observer pattern works behind de-normalization to keep the total fields up to date. What happens if some observers fail to update their data? To protect the data, we need to make sure that either every observer succeeds or none of the updates is kept. One solution is to put all of these updates in a transaction, and retry if the transaction fails. The downside of this approach is that transaction handling can require complicated code. The other solution is to set a flag field elsewhere in the database that records whether the totals were updated safely. The reporting functionality then checks this flag to decide whether the total fields need to be recalculated. Each solution has its pros and cons. It’s up to you to decide which one best fits your business requirements.
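The all-or-nothing idea can be sketched in memory as follows. This is only an illustration under stated assumptions: `TotalsTable`, `TryApplyAll` and `NeedsRecalculation` are hypothetical names, a dictionary stands in for the total fields, and copying the dictionary imitates what a real database transaction would do. On failure the sketch also sets the flag from the second solution, which is one possible way of combining the two.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory stand-in for the total fields and the flag field.
public class TotalsTable
{
    public Dictionary<string, decimal> Totals = new Dictionary<string, decimal>();

    // Set when an update failed and the totals must be recalculated by the report.
    public bool NeedsRecalculation;

    // Applies every update or none of them. Each update works on a private copy.
    public bool TryApplyAll(IEnumerable<Action<Dictionary<string, decimal>>> updates)
    {
        // Work on a copy so a failure leaves the committed totals untouched.
        var working = new Dictionary<string, decimal>(Totals);
        try
        {
            foreach (var update in updates)
            {
                update(working);
            }
        }
        catch (Exception)
        {
            // "Roll back": discard the copy and flag the totals as unreliable.
            NeedsRecalculation = true;
            return false;
        }

        Totals = working; // "Commit": publish all changes at once.
        return true;
    }
}
```

Against a real database, the copy-and-swap step would be replaced by the database's own transaction support, and the flag would live in a table the reporting module can read.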

Conclusion

The Observer pattern is great. You can get more creative with it, for example by defining a series of observers and making them execute in a pre-defined order. Combined skillfully with other techniques, it can deliver significant gains. Hopefully this article has given you a few good ideas on how to enhance performance.

History

  • Version 1.0 (24 August, 2009) - Initial release

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)