Introduction
Caching: no web application should be without it, but there are many ways to achieve it, and everyone seems to have an opinion on what works and what doesn’t. Read on for an overview of Windows Azure caching, and to determine how you can use it in your platform or application.
Cache Terminology
Before diving in, let’s cover some definitions. The differences between caching options often come down to fine details, and small differences in definitions can mean big differences in behaviour.
Cache: For the
purpose of this article, a cache is a temporary store of data that is volatile,
but fast to access. For example, reading in a table of data from a database and
storing it in a cache allows an application to run faster, because reading a
cache is faster than reading a database. But the data in the cache is not
guaranteed to exist.
In Process vs. Out of Process: In process simply means that the data stored in the cache stays in the same memory space as the application itself. Out of process means that the data is stored in a different process from the application; that process could be on the same server, or on another server entirely. In process vs. out of process is the dividing line that determines whether cached items add to the amount of memory the application process is using.
In Memory vs. Out of Memory Cache: In an in-memory cache, the cached data is kept within the running memory of a server (either in-process or out-of-process). In-memory caching is the fastest option, because data is retrieved directly from memory, but memory is also the most expensive way of storing data. In an out-of-memory cache, the data is persisted to some other type of storage: usually disk storage in the form of a file or a storage table.
Application Cache vs. Page Output Cache: As web applications run and serve up web pages to visitors, caching is generally used heavily. There are two styles of caching pieces of the application: caching data specific to running the application (such as a group of application settings), and caching the output of a web page, where chunks of, or the entire, HTML output of a page is saved. For a page cache, application logic generally reads the HTML from the cache and sends it straight back to the browser. For data stored in the application cache, more processing must be done before the data is transformed into HTML for the browser to load.
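The distinction between the two styles can be sketched as follows. This is a simplified, hypothetical model (the `render` function and cache keys are illustrative): the page output cache stores finished HTML, while the application cache stores raw data that still needs rendering on every request that misses the page cache.

```python
# Two styles of cache side by side:
#   page_cache stores finished HTML, returned to the browser as-is;
#   data_cache stores raw application data that still needs rendering.
page_cache = {}   # url -> finished HTML
data_cache = {}   # key -> raw data

def render(settings):
    # Transform application data into HTML (the "extra processing").
    return f"<p>Theme: {settings['theme']}</p>"

def serve(url):
    # Page output cache hit: no processing, HTML goes straight back.
    if url in page_cache:
        return page_cache[url]
    # Otherwise, read data from the application cache and render it.
    settings = data_cache.get("settings", {"theme": "light"})
    html = render(settings)
    page_cache[url] = html
    return html

data_cache["settings"] = {"theme": "dark"}
first = serve("/home")    # rendered from application-cache data
second = serve("/home")   # served from the page output cache
```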
Serialization/Deserialization: Refers to transforming a piece of data from its representation in
the memory of an application, to a format suitable for storage in a cache. Serialization
can be done by the programming language/platform, or can be custom-written by a
developer. When you write down some notes on a to-do list, you serialize your
‘to-do’ data onto a notepad. When you read it back later, you de-serialize that
to-do data back into your memory. It’s the same concept for web applications: there
are many different formats of serialization.
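The to-do list analogy maps directly onto code. As one example of the many serialization formats mentioned above, here is a sketch using JSON, via Python's standard `json` module: the in-memory objects are flattened to a string suitable for an out-of-process cache, then rebuilt on the way back.

```python
import json

# To-do data as it lives in application memory.
todos = [{"task": "buy milk", "done": False},
         {"task": "write article", "done": True}]

# Serialize: transform the in-memory objects into a string
# suitable for storage in an out-of-process cache.
serialized = json.dumps(todos)

# Deserialize: read the string back from the cache and
# rebuild the in-memory objects.
restored = json.loads(serialized)
```

A round trip through serialize/deserialize should leave the data unchanged; what varies between formats is speed, size, and which data types survive the trip.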
Distributed vs. Single Locale: A distributed cache is one that is accessible from different processes running on different servers. The alternative is a cache that only a single process on a single server can access. Distributed caches are important when building scalable, flexible cloud applications.
Other caching products can be installed on Windows Azure VMs in IaaS; these do exist, but are beyond the scope of this article.
Overview: Azure Caching Models
Azure
Cache Service
The Azure Cache Service is a distributed, in-memory cache that can easily scale and provide fast access to data for applications running within Azure. A Cache Service is created separately from specific Azure platform implementations, and can be written to and read from by different application platforms within Azure.
The Cache Service has the advantage of being
available to all different computing platforms within the Azure environment, so
is ideal for data sharing where an application utilizes a range of Azure
technologies. As an example, you may design an application that has worker
roles processing items from a queue, and an Azure Website that reads results
from a database for display to the visitors.
In this scenario, data stored in the Azure Cache Service would be available to both the Azure Website(s) and the Worker Role(s). An Azure Cache Service is created through the Azure management console, or via the Azure API, and lives beyond application/server rebuilds and restarts.
Azure
In-Role Caching
In the Azure Cloud Services environment,
there are two types of application platform – Web Roles and Worker Roles. Web
Roles are typically used for delivering websites, and Worker Roles are used for
delivering background processing. In-Role caching is caching delivered within
this environment, either in the memory space of a web or worker role
(Co-located Role caching), or by creating a worker role with the sole purpose
of keeping cached items (Dedicated Role caching).
In-Role caching is automatically distributed: cached items are available across multiple roles in the same deployment. This means the cache automatically expands to become available to new roles when an application is scaled up.
If you’re running IIS/ASP.NET, you have access to the in-process cache that is native to ASP.NET. However, if you’re using either Cloud Services Web/Worker Roles or IaaS VMs to host your application, you must have two instances to qualify for an uptime SLA. This means that any application built on these platforms is naturally a distributed application, and so you must implement some type of caching (or application) synchronization so that the two or more instances communicate changes to the data with each other.
This is because the native Windows Azure load balancing does not have server affinity: there is no guarantee that each subsequent page requested by a visitor will be processed and returned by the same server each time.
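The problem is easy to demonstrate with a toy model. In this hypothetical sketch, two application instances each hold a private in-process cache, so an update made on one is invisible to the other; pointing both at a shared store (a plain dict standing in for a distributed cache such as the Azure Cache Service) removes the split.

```python
# Two application instances behind a load balancer with no server
# affinity: any request may land on either one.
class AppInstance:
    def __init__(self, shared_cache=None):
        self.local_cache = {}        # in-process: private to this instance
        self.shared = shared_cache   # distributed: visible to all instances

    def set(self, key, value):
        if self.shared is not None:
            self.shared[key] = value
        else:
            self.local_cache[key] = value

    def get(self, key):
        if self.shared is not None:
            return self.shared.get(key)
        return self.local_cache.get(key)

# In-process caches only: instance B never sees A's update.
a, b = AppInstance(), AppInstance()
a.set("banner", "Sale on now!")
stale = b.get("banner")           # None: the caches have diverged

# Shared store (stand-in for a distributed cache): both instances agree.
shared = {}
a, b = AppInstance(shared), AppInstance(shared)
a.set("banner", "Sale on now!")
fresh = b.get("banner")           # the update is visible everywhere
```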
Out-of-memory custom caching
Azure provides multiple types of persistent storage which can be leveraged for caching. Azure Table Storage is a simple key/value store that can hold very large volumes of data at very fast access speeds. Azure SQL Database can also be used, but the performance and size limits are more restrictive than Table Storage, and there is a higher likelihood of being throttled if excessive requests are made during periods of high usage.
Finally, there is the option to write cache
files out to Windows Azure Storage, or to the temp locations available in
Windows Azure VMs. Using Azure Storage has the extra capability of being
geographically distributed, but this is only suitable for cached data where
changes are infrequent and cache freshness is not an issue. In all of these
cases the caching solution must be coded manually for each custom project.
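To give a flavour of what "coded manually" means, here is a hypothetical file-based, out-of-memory cache sketch (file layout and TTL handling are illustrative choices, not a prescribed design). Serialization and expiration, which an in-memory cache product handles for you, both have to be written by hand.

```python
import json
import os
import tempfile
import time

# Out-of-memory cache sketch: each entry is serialized to its own file,
# and expiration must be managed manually by the application.
CACHE_DIR = tempfile.mkdtemp(prefix="filecache-")

def cache_put(key, value, ttl_seconds=300):
    path = os.path.join(CACHE_DIR, key)
    with open(path, "w") as f:
        # Store the expiry alongside the value, since the file system
        # will not expire entries for us.
        json.dump({"expires_at": time.time() + ttl_seconds,
                   "value": value}, f)

def cache_get(key):
    path = os.path.join(CACHE_DIR, key)
    if not os.path.exists(path):
        return None
    with open(path) as f:
        entry = json.load(f)
    if time.time() >= entry["expires_at"]:
        os.remove(path)   # manual expiration on read
        return None
    return entry["value"]

cache_put("settings", {"theme": "dark"})
```

The same put/get shape applies whether the backing store is a local temp directory, Azure Blob Storage, or a Table Storage row; only the read/write calls change.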
| Caching Approach | Characteristics | Advantages | Disadvantages | Comments |
| --- | --- | --- | --- | --- |
| Caching Service | Out of Process, In Memory, Distributed | Accessible by all types of Azure technologies. Scalable. | Extra cost for the service. | For building high-performance, scalable cloud systems. |
| Azure In-Role Caching | Out of Process, In Memory, Distributed | Can run in the memory space of existing Roles. | Only available to Web/Worker Roles within Cloud Services. | Suitable for adaptation to existing applications migrated to Azure. |
| ASP.NET/IIS Cache | In Process, In Memory, Single Locale | Runs in the memory space of the existing application. | Requires cache synchronization to work with scalable Azure applications. | Will require adaptation to use distributed caching when used with Azure load balancing. |
| Azure SQL Database | Out of Process, Out of Memory, Distributed | Data is persistently stored in a database table. | Requires custom coding to serialize data, and for management such as expiration. | Slower than in-memory caches; may be throttled in periods of high demand. |
| Azure Table Storage | Out of Process, Out of Memory, Distributed | Data is persistently stored in Azure Table storage. High performance for very large datasets; cost-effective. | Requires custom code. | Slower than in-memory, but faster than SQL. May be throttled in periods of high demand. |
| Azure Blob Storage | Out of Process, Out of Memory, Distributed | Data is persistently stored in Azure Blob storage. Can leverage geo-redundant persistent storage. | Requires custom code. | Slower than in-memory, but easy to make highly distributed. |
Real-life implementations of different Azure caching models
At DNN,
our Cloud Services environment runs on Windows Azure and leverages many
different types of caching to deliver a range of products and internal
services.
Azure Cache Service
We use this as part of the provisioning process for delivering Evoq Trials. Free trials are available for Evoq Social and Evoq Content, two of the commercial solutions offered by DNN. The free trials are built with Windows Azure Pack and interface with Worker Roles in the cloud services environment for provisioning.
The provisioning service collects customer
details (name, email, product type) from the signup pages, and then allocates a
pre-provisioned Windows Azure Website with Evoq Social or Evoq Content already
installed and ready to go. The Azure Caching Service is leveraged to provide
high performance provisioning between the different Azure environments. This
allows us to deliver a new, personalised Trial to a customer in less than 30
seconds.
You can try this process for yourself by
signing up for an Evoq
Social trial or Evoq
Content trial – you’ll be running various Azure technologies together to go
from ‘Go’ to new website in 30 seconds or less.
Azure In-Role Caching
The Evoq in the Cloud
product runs on the Azure Cloud Services platform, and delivers high
performance Content and/or Social websites with dedicated Web Roles. These are
scalable, high-availability customizable websites built on the DNN Platform.
The DNN Platform comes with a standard
caching provider that utilizes file-based storage, but this is not
distributed. For the Windows Azure implementation, In-Role caching has been
leveraged by building a caching provider that snaps into the DNN modular
architecture. This gives high performance, out-of-process, distributed caching
between scaled Web Roles running DNN.
IIS/ASP.NET caching
The Evoq Trials previously mentioned run on Azure Websites technology. This means they use a single-instance model of deployment: trials are expected to be fast, but do not need to be scalable. For this reason, they can use the standard in-process caching provider delivered with the DNN Platform. In-process caching works well for this type of Windows Azure Website, because there is no scaling involved. Deploying DNN on a scalable version of Windows Azure Websites would require the Caching Service or another custom solution.
Summary
This article covered the different types of
application caching available on the Windows Azure platform. When building
applications on a cloud platform such as Windows Azure, it is important to take
advantage of the scalability and flexibility of the underlying technologies. Obtaining
the best results from the cloud involves building distributed, scalable
applications. Building these types of applications always necessitates a
careful caching strategy. The information in this article will help with the
decision process when planning or reviewing a Windows Azure based application.