Click here to Skip to main content
65,938 articles
CodeProject is changing. Read more.
Articles
(untagged)

Ten Ways to Ensure Windows Azure-Based Applications Are Architected for Success

26 Nov 2012 1  
This whitepaper describes the top ten issues that arise when deploying to SQL Database and explains how to mitigate these issues.

This article is in the Product Showcase section for our sponsors at CodeProject. These articles are intended to provide you with information on products and services that we consider useful and of value to developers.

Introduction

Windows Azure SQL Database is a managed version of Microsoft SQL Server 2012, hosted by Microsoft in a Windows Azure datacenter. SQL Database is a core feature of the Windows Azure Platform.

SQL Database is a multi-tenanted system in which many database instances are hosted in a single SQL Server instance running on a physical server or node. Therefore, it has different performance characteristics than pure SQL Server. In particular, high data scalability and high performance are achieved by horizontally scaling the data into multiple instances or shards. To provide high availability and satisfy demanding SLAs, each SQL Database instance is stored as one primary replica and two secondary replicas. When a node fails, the primary and secondary copies it hosts are re-created on other nodes. SQL Database uses a quorum commit in which a commit is deemed successful as soon as the primary and one of the two secondary replicas have completed the commit.

This whitepaper describes the top ten issues that arise when deploying to SQL Database and explains how to mitigate these issues.

Issue #1: Security

SQL Database is offered as service in the Windows Azure cloud, and therefore security must be approached differently than for on-premises enterprise database. Of particular concern for the security of business data in the cloud are

  • The use of secure connections
  • Management of network access
  • User authorization and authentication

Ensuring secure connections

To ensure secure connections, support only TDS protocol secure encrypted connecting on port 1433.

Note that if you want to allow connection from an on-premises database to SQL Database, the local network should not block outbound traffic to SQL Database on that port (for example,there should be no restrictions in the company’s router ACL that prevent outbound traffic to port 1433).

Managing network access

To manage network access, use the SQL Database service firewall that handles the network access control. You can configure firewall rules that grant or deny access to specific IP or range of IPs. The firewall rules can be created on two separate levels:

  • Server level – You can manage the access to the entire SQL Database server and all its associated instances.
  • Database instance level – You can fine tune the access connection granularity for each distinct database. These rules can provide a nice, logical isolation of multiple applications that use the same SQL Database server.

If you are using SQL Database Federations in relation to the database firewall rules, remember that each rule must be configured on each database separately.

Handling user authorization and authentication

SQL Database provides security administration to create logins and users in a way similar to SQL Server. Security administration for the database level is almost the same as in SQL Server. However, the server level administration is different because SQL Database is assembled from distinct physical machines. Therefore, SQL Database uses the master database for server level administration in order to manage the users and logins.

To manage network access, use the SQL Database service firewall that handles the network access control. You can configure firewall rules that grant or deny access to specific IP or range of IPs.

Issue #2: Maintaining high availability

Avoiding connection timeouts

SQL Database offers high availability (HA) out of the box by maintaining three replicas spread out on different nodes in the datacenter and considering a transaction to be complete as soon as two of the three replicas have been updated. In addition, a fault detection mechanism will automatically launch one of the database copies if needed: when a fault is detected, the primary replica is substituted with the second replica. However, this can trigger a short-term configuration modification in the SQL Database management and result in a short connection timeout (up to 30 seconds) to the database.

To mitigate the risk of this connection timeout, it is a best practice to implement an application retry policy for reconnecting to the database. To reduce the overall reconnection time, consider a back-off reconnection strategy that increases the amount of time for each connection attempt.

Backing up the SQL Database instance to another datacenter

HA is also enforced in the scope of the datacenter itself; there is no data redundancy across geographic locations. This means that any major datacenter fault can cause a permanent loss of data.

To protect your data, be sure to back up the SQL Database instance to Windows Azure storage in a different datacenter. To reduce data transfer costs, you can choose a datacenter in the same region.

Issue #3: Using Entity Framework (EF) with SQL Database

What is Entity Framework (EF)?

Entity Framework (EF) is an objectrelational mapper that enables .NET developers to work with relational data using domain-specific objects. Because EF eliminates the need for most of the data-access code that developers usually need to write, it is frequently used with SQL Server in applications. However, there are some issues that developers need be aware off when using EF.

Preventing exceptions resulting from closed connections in the connection pool

First, the EF uses ADO.Net to handle its database connections. Because creating database connections can be time-consuming, a connection pool is used, which can lead to an issue. Specifically, SQL Database and the cloud environment can cause a database connection to be closed for various reasons, such as a network problem or resource shortage. But even though the connection was closed, it still remains in the connection pool, and the EF ObjectContext will try to grab the closed connection from the pool, resulting in an exception.

To mitigate this issue, use a retry policy for the entity connection as offered by the Transient Fault Handling Application Block so that multiple attempts can occur in order to accomplish the command.

Eager loading prevents performance problems

Second, performance degradation can be significant in Windows Azure because many developers are not aware of how a query in EF can be performed. EF offers two options: lazy loading and eager loading.

Lazy loading enables the query to acquire data just as is actually needed. While that approach often sounds optimal, it is actually likely to cause latency problems if the data is spread across separate tables because multiple round trips will be required to traverse each object.

This problem can be eliminated by using eager loading, which enables joining information in separate tables (connected by foreign key) in a single query. To use eager loading, use the Include word in the LINQ query.

Avoid the EF LINQ query, which uses distributed transactions

Finally, SQL Database does not currently support distributed transactions. A distributed transaction includes one or more statements that, individually or as a group, update data on two or more distinct nodes of a distributed database. (This is in contrast to a local transaction, which is performed on a single database.)

Accordingly, avoid using the EF LINQ query, which involves distributed transactions and which will therefore encounter an exception. Instead, use alternatives such an explicit SqlTransaction connection.

Performance degradation can be significant in Windows Azure because many developers are not aware of how a query in EF can be performed. This problem can be eliminated by using eager loading, which enables joining information in separate tables in a single query.

Issue #4: Handling connection failures

Understanding the types of SQL Database connection failures

SQL Database is a distributed system which applications access over a network in a Windows Azure datacenter. Connections across this network are subject to failures that can lead to the connections being killed.

Specifically, when there is a failure in either a data node or the SQL Server instance it hosts, SQL Database moves activity off of that node or instance. Each primary replica it hosts is demoted and an associated secondary replica is promoted. As part of this process, connections to the now demoted primary server are killed. However, it can take several seconds for the information about the new primary replica to propagate through SQL Database, so it is essential that applications handle this transient failure appropriately.

In addition SQL Database is a multitenanted system in which each data node hosts many instances. Connections to these instances compete for the resources provided by the data node. In times of high load, SQL Database can throttle connections that are consuming a lot of resources. This throttling represents another transient failure that applications need to handle appropriately.

(Applications can also encounter connection issues with SQL Database that last too long to be considered transient (such as a SQL Database outage), but these require different handling than described here.)

Designing applications to handle connection failures

The first step in handling connection failures is to determine whether the failure is transient. If it is, the application should wait a brief time for the transient problem to be resolved and then retry the operation until it succeeds.

The details of handling transient connection failures can be complex. Valery Mizonov of the Windows Azure Customer Advisory Team developed a Transient Failure Handling mechanism that the Microsoft Patterns & Practices Team has now released as the Transient Fault Handling Application Block in the Enterprise Library. The Transient Fault Handling Application Block supports various standard ways of generating the retry delay time interval, including fixed interval, incremental interval (the interval increases by a standard amount), and exponential back-off (the interval doubles with some random variation).

The first step in handling connection failures is to determine whether the failure is transient. If it is, the application should wait a brief time for the transient problem to be resolved and then retry the operation until it succeeds.

Issue #5: Throttling

When are applications throttled?

The multi-tenant nature of SQL Database means that the physical resources of a data node are shared among many applications. Microsoft currently does not provide any way to reserve a guaranteed level of resource availability. Instead, SQL Database throttles connections to instances that consume too many resources.

SQL Database considers resource use in 10-second intervals, referred to as throttling sleep intervals. Instances that make excessive use of resources in these intervals may be throttled for one or more throttling sleep intervals until overall resource usage reaches acceptable levels. There are two classes of throttling, soft and hard, depending on how severely resource usage limits are exceeded. The impact of throttling varies from an application being unable to perform any inserts up to it being unable to perform any reads and writes.

Diagnosing throttling problems

An application subject to throttling will suffer transient connection failures. The error number associated with the failure indicates the specific cause of throttling. This information can be logged and, if necessary, the application can be modified to reduce the risk of throttling.

Issue #6: Chatty applications

Latency is higher for SQL Database than SQL Server

SQL Database is a distributed system with applications connecting (at best) from within a Windows Azure datacenter and (at worst) across the internet. Furthermore, the use of commodity hardware means that SQL Database has performance characteristics that are not as good as they would be on a high-end server. The result is that the latency in completing a single database operation is likely to be higher in a SQL Database application than in a SQL Server application.

Making applications less chatty

This increased latency should be taken into account when migrating an application from SQL Server to SQL Database. Care should be taken that the application is not overly chatty - making many unnecessary database calls, for example. This may require that the application be re-architected to modify the number and nature of database operations.

One technique is to cache data so that the SQL Database instance does not have to be queried for frequently used data. Windows Azure Caching (Preview) supports the creation of a cache either in a dedicated role or by sharing available memory in an existing role. In the cache aside pattern, a data retrieval operation accesses the SQL Database instance only when the relevant data is not found in the cache, and whenever data is retrieved from the SQL Database, it is added to the cache. This pattern reduces the chattiness of the data layer in an application, enhancing application performance.

Another technique is to coalesce multiple operations into a single database call. For example, a table valued parameter can be used to parameterize a single database call so that it can insert many rows into a table instead of using one database call per row. This can significantly reduce the amount of network traffic to SQL Database, which helps to reduce application latency.

Issue #7: Monitoring limitations

SQL Database has fewer monitoring options than SQL Server

Various monitoring methods available in SQL Server, such as audit login, running traces and performance counters, are not supported in SQL Database. However, one monitoring option, Dynamic Management Views (DMVs), is supported, although not to the same extent as in SQL Server.

SQL Database supports only a subset of Dynamic Management Views (DMVs)

To discover what DMVs are supported in SQL Database, use the following query:
"select * from sys.all_views where name
like ‘%dm%’.

The DMVs in SQL Database can be roughly divided into three categories:

  • Database-related (prefixed with sys.dm_db)
  • Execution-related (prefixed sys.dm_exec)
  • Transaction-related (prefixed with sys.dm_tran)

In general, SQL Database exposes monitoring information in the logical level hides the physical part.

For example, in the database-related DMVs, notice the absence of the sys. dm_db_file_space_usage DMV, which enables monitoring of the exact disk usage of every database file. (Note that you can use sys.dm_db_partition_state to estimate the amount of data in the database.) Similarly, the tempdb pages allocation DMV is also not available (sys. dm_db_session_space_usage and sys. dm_db_task_space_usage), since, once again the physical part is masked out.

Missing from the execution-related DMVs is sys.dm_exec_query_optimizer_ info, which means that optimization counters like average elapsed time per optimization and number of optimizations performed by the query optimization cannot be displayed.

Finally, missing from the transaction related DMVs are the statistics on the current session transaction and the current snapshot transaction.

Issue #8: Backup and restore

SQL Database provides fault tolerance internally by using triplet copies of each data committed. However, even the strongest database box won’t prevent data corruption due to hardware malfunctions, internal application faults or human errors. Therefore, the DBA for any application needs to be concerned with database backup and restore. We recommend the following practices.

First consider scheduling a backup task every day to create recent restore points.

Second, consider setting the backup target to Azure storage by creating and exporting a BACPAC file from the SQL Database to Windows Azure blob storage; you can either use the Windows Azure portal or an API command (check out sqldacexamples for more information). Be sure to make the link specific. If you do choose to back up to Windows Azure storage, make sure that the storage account is located at a different datacenter (but on the same region to minimize data transfer rates) to prevent loss in the event of a major datacenter failure.

Third, consider using Microsoft SQL Data Sync to sync data between SQL Database instances (copy redundancy) or to sync a SQL Database instance to an on-premises Microsoft SQL Server database (be aware that SQL Data Sync currently does not provide versioning). Finally it’s worth mentioning that if you are planning on a major application upgrade, you should manually back up your databases to prevent an unexpected regression.

SQL Database provides fault tolerance internally by using triplet copies of each data committed. However, even the strongest database box won’t prevent data corruption. Therefore, the DBA for any application needs to be concerned with database backup and restore.

Issue #9: Scaling out the database

SQL Database instance size is limited and performance is not guaranteed

SQL Database is a multi-tenanted system in which the physical resources are shared among many tenants. This resource sharing affects both the maximum instance size supported by SQL Database and the performance characteristics of each instance. Microsoft currently limits the instance size to 150 GB and does not guarantee a specific performance level.

Using sharding to scale out the database with SQL Database Federations

The solution to both the database size problem and the performance problem is to scale the database horizontally into more than one database instance using a technique known as sharding.

SQL Database Federations is the managed sharing feature of SQL Database. A federated database comprises a root database to which all connections are made and one or more federations. Each federation comprises one or more SQL Database instances to which federated data is distributed depending on the value of a federation key that must be present in every federated table. A restriction is that the federation key must be present in each clustered or unique index in the federated tables. The only distribution algorithm currently supported is range, with the federation key restricted to one of a small number of data types.

SQL Database Federations provides explicit support to split a federation instance in two and ensure that the federated data is allocated to the correct database in a transactionally consistent manner. SQL Federations also provides support to merge two instances, but this causes the data in one of the instances to be lost.

An application using a federated database connects to the root database and specifies the USE FEDERATION statement to indicate which instance the connection should be routed to. This provides the benefit of connection pooling on both the client and the server.

SQL Federations provides the ability to scale out a SQL Database application to a far larger aggregate size than can be provided by a single instance. Since each individual instance has the same performance characteristics, SQL Federations allows an application to scale out performance by using many instances.

The solution to both the database size problem and the performance problem is to scale the database horizontally into more than one database instance using a technique known as sharding.

Issue #10: Synchronizing data to maximize performance

Where should the SQL Database instance be located?

Windows Azure is a global cloud service available in eight datacenters on three continents. A website can be hosted in multiple Windows Azure datacenters, and Windows Azure Traffic Manager can be configured to allow users to access the closest datacenter. The question therefore arises of where to locate the SQL Database instance to store application data. Which datacenter should it be in?

There is an increasing interest in hybrid solutions, in which part of the application remains on-premises and part is migrated to Windows Azure. Again, the problem arises of how to handle data. Should it be stored on premises and a VPN set up to allow cloud services hosted in Windows Azure to access it? Or should it be stored in the cloud?

Using Microsoft SQL Data Sync

Microsoft SQL Data Sync provides a solution for both of these situations. It can be used to configure bi-directional data synchronization between two SQL Database instances, or between a Microsoft SQL Server database and a SQL Database instance. It uses a hub-and-spoke topology in which the hub must be a SQL Database instance.

Consequently, Microsoft SQL Data Sync can be used together with Windows Azure Traffic Manager to create truly global applications where both the cloud service and the SQL Database instance are local to each datacenter. This minimizes application latency, which improves the user experience.

Similarly, Microsoft SQL Data Sync can be configured to synchronize data between an on-premises SQL Server database and a SQL Database instance. This removes the need to privilege one location over the other, and again improves application performance by keeping the database close to the application.

Summary

Understanding the major issues developers and DBAs face with Windows Azure SQL Database is critical to a successful deployment. This whitepaper explained the following top ten issues:

SQL Database uses traditional SQL Server security techniques, but also allows the specification of server and database instance level firewalls to restrict access to instances. SQL Database stores three replicas of all data to provide high availability in the event of the failure of a data node. However, to minimize performance problems, applications must handle the transient failures that occur when SQL Database demotes the primary replica. Entity Framework can be used with SQL Database, although care should be taken to avoid unnecessary round trips to the database instance.

Connection failures are more common in SQL Database than in traditional SQL Server, but the Transient Fault Handling Application Block can be used to limit their impact. SQL Database is a multi-tenanted system that throttles connections to ensure a fair allocation of resources and these transient failures can also be handled by the Transient

Fault Handling Application Block. Since SQL Database is a distributed system, it is important to limit chattiness between the application and SQL Database. As a managed service, SQL Database does not need to support the entire set of management DMVs. However, it does provide a number of DMVs that can be used to monitor performance.

SQL Database does not provide point-in-time restore, so it is important to consider alternative backup/restore strategies. SQL Database Federations is a managed sharding feature that supports horizontal scale-out for both data and performance to applications using SQL Database. Microsoft SQL Data Sync provides the ability to synchronize data between Microsoft SQL Server and SQL Database instances.

Understanding the major issues developers and DBAs face with Windows Azure SQL Database is critical to a successful deployment.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.

A list of licenses authors might use can be found here