Click here to Skip to main content
65,938 articles
CodeProject is changing. Read more.
Articles
(untagged)

Optimize SharePoint Storage with BLOB Externalization

2 Apr 2012 1  
This document is intended to provide a comprehensive and balanced examination of SharePoint storage concepts and issues, and to equip you to fully understand, communicate and make informed decisions about the role of BLOB externalization in your storage architecture.

This article is in the Product Showcase section for our sponsors at CodeProject. These articles are intended to provide you with information on products and services that we consider useful and of value to developers.

and the SharePoint Zone

As organizations scale the use of SharePoint as a content management system, supporting document management, records management, collaboration, and archiving, they are moving workloads traditionally supported by file shares and by email attachments to SharePoint. In the process, the storage burden for document-centric content is placed on SharePoint’s data tier—the content databases hosted by SQL Server, and often located on the most highly performing and expensive Tier 1 storage resources in the enterprise.

The results include the obvious—skyrocketing storage costs—and the counterintuitive—potentially degraded SharePoint performance. Organizations struggle to architect an optimized, scalable, cost-effective storage architecture for SharePoint that supports potentially terabytes or tens of terabytes (TBs) of content.

There has been ongoing debate in the SharePoint and SQL communities among customers, between experts and MVPs, and even within Microsoft itself about content scalability, the impact of documents on storage capacity and performance, and the role of BLOB externalization, which allows you to relocate the binary large object (BLOB)—the unstructured data that represents the content of a document—to more cost-effective storage tiers.

The debate was reignited when Microsoft updated its guidance for SharePoint sizing in July 2011 to support content databases of up to 4 TB in collaborative workloads, and of unlimited size for document archives. Microsoft’s new guidance contained directives toward thoughtful architecture, high-performance storage subsystems, comprehensive governance, and effective management tools beyond those provided with SharePoint to support Service Level Agreements (SLAs) for performance, backup, recovery, and availability.

This white paper will make sense of the issues as well as the varied perspectives and guidance related to BLOB externalization.

We will begin by reviewing the default configuration of SharePoint—with which SharePoint stores a document’s content as a BLOB in the content database—and by examining the perhaps surprising multiplier effect of a BLOB on the storage tier through the entire lifecycle of the document. We will then detail the two options that Microsoft has provided for BLOB externalization: external BLOB store (EBS) and remote BLOB store (RBS).

The remainder of the white paper will explore in detail the potential benefits of BLOB externalization, which include:

  • Reduced cost of storage
  • Optimized performance
  • Improved storage management and reduced storage footprint
  • Efficient content restructure
  • Greater scalability

And we will look at the potential downsides of BLOB externalization: increased architectural complexity and, in particular, the considerations that must be taken for backup, restore, high availability and disaster recovery.

This white paper is the first of a series of storage optimization and management resources to be authored by a team of SharePoint MVPs. The goals of this white paper are to provide a comprehensive and balanced examination of the concepts and issues, and to equip you to fully understand, to communicate with your peers and your management, and to make an informed decision about the role of BLOB externalization in your storage architecture.

Fundamentals

Let’s begin by examining the default configuration of—and the options for—document storage in SharePoint.

Documents, Databases, and BLOBs

In most organizations, SharePoint hosts myriad documents: Office file formats including Word, PowerPoint, and Excel files; PDFs; media files such as images, podcasts and videos; and other files including maps, engineering specifications, scanned documents, and more.

Users upload or save documents to one or more document libraries in a site collection, or attach documents to list items. By default, these documents are stored in the content database of the site collection containing the document library or list.

Within the content database, the metadata for a document or list item is stored in the AllUserData table. If a document is stored in a library, or as an attachment, SharePoint maintains internally used metadata about the document in the AllDocs table. The content of the document is stored as an unstructured data format called a BLOB in the AllDocStreams table. Globally unique identifiers (GUIDs) are used to link the records in these three tables. If versioning is enabled, metadata about historical versions of the document is stored in the AllDocVersions table, and historical versions of the BLOB are stored in AllDocStreams.

SQL Server is a database service that is optimized for performance of structured, relational data, where records are less than 8 kilobytes (KB) in size. Microsoft moved BLOBs to a separate table in order to optimize performance of SQL and therefore of SharePoint. While there is a small performance penalty when a document is opened or saved, because SQL must join the AllUserData table with other tables before the document binary data can be accessed, myriad other processes that rely on efficient SQL Server performance benefit tremendously. So, the split design of content database tables is a net gain for SharePoint.

The Storage Optimization Challenge

Numerous constituencies are affected by the storage infrastructure that supports an enterprise SharePoint service. Storage architects, database administrators, SharePoint administrators, end users of SharePoint, and organizational management are each concerned, at various levels, about whether the storage infrastructure is available, recoverable, scalable, manageable, cost effective, and able to meet performance expectations.

SharePoint’s out-of-box configuration places all content in the content databases of SQL Server. This presents a potential scalability problem. SQL Server is typically hosted on Tier 1 storage—the fastest, most feature rich, and most costly storage tier in an enterprise. If the entire capacity burden is placed on SQL Server, the cost of scaling Tier 1 storage as enterprise content management usage scales might be prohibitive.

The problem increases over time, because as years pass, a typical SharePoint implementation will host increasing amounts of content, but much of that content will be inactive. While total content often increases geometrically, the amount of active content increases at a much lower, linear rate. The result is that expensive Tier 1 storage is used as an archive for skyrocketing amounts of content that does not benefit from the performance or features of Tier 1 storage.

You might think that moving content to cheaper tiers of storage might penalize performance. Even if this is the case, there is the opportunity to optimize storage by creating a hierarchical or tiered storage architecture in which less important or less active data is moved to cheaper tiers of storage, while more important or more active data is maintained on Tier 1 storage. But, as you will see later in this white paper, it is possible in many scenarios to reduce storage cost and increase performance.

BLOB Externalization

There are those in the SharePoint and SQL community who argue that SharePoint should never have been designed to store documents as BLOBs in a structured relational database like SQL Server. In fact, SharePoint version 1.0 (SharePoint Portal Server 2001) used the Web Storage System, not SQL Server, for content storage. Beginning with SharePoint version 2.0 (Windows SharePoint Services and SharePoint Portal Server 2003), storage was moved to SQL Server.

In recent years, Microsoft has built into SharePoint and SQL Server options that enable moving BLOBs out of SQL Server to other storage tiers, while maintaining metadata in the content database with pointers to the associated documents.

EBS and RBS

In SharePoint 2007 Service Pack 1 (SP1), Microsoft introduced External BLOB Storage (EBS). In SQL Server 2008, Microsoft added Remote BLOB Store (RBS) to SQL Server. At a very high level, both EBS and RBS perform the same basic task: they enable SharePoint to store BLOBs outside of the SQL Server content database.

EBS is part of the SharePoint product, in SharePoint 2007 Service Pack 1 and later, and in SharePoint 2010. SharePoint 2007 supports BLOB externalization only with EBS. 4

In SharePoint 2010, BLOB externalization is supported using either EBS or RBS. EBS, however, is deprecated and is likely to be removed from a future version of SharePoint in favor of RBS.

RBS can be obtained by downloading the Feature Pack for SQL Server 2008 R2 at http://go.microsoft.com/fwlink/?LinkID=177388. This version (or a later version) of RBS is required by SharePoint 2010. The server components of RBS can be installed on either SQL Server 2008 R2 or SQL Server 2008 SP1. The client components are installed on all SharePoint web front end (WFE) servers.

Modules and Providers

Both EBS and RBS require additional components to manage the communication between SharePoint and the non-SQL Server location for BLOB storage, called the BLOB store. EBS uses a plug-in framework that requires a third-party module to implement. RBS exposes a set of Application Programming Interfaces (APIs) on top of which developers and ISVs can build providers. A provider is the interface between RBS and a specific type of BLOB store.

Microsoft created the FILESTREAM provider, which is included free-of-charge with the RBS installation files. The FILESTREAM provider allows you to externalize BLOBs to the local file system of the SQL Server. This can include direct-attached storage and iSCSI-attached SAN and NAS volumes, assuming those volumes meet SharePoint’s storage performance requirements of at least .25 input/output operations per second (IOPS) per gigabyte (GB) stored, and no more than 20ms time-to-first-byte (TTFB).

While many in the industry maligned the performance of the FILESTREAM provider, experience has shown that its performance is at least better than expected. Some ISV providers outperform FILESTREAM, some underperform. An RBS provider for a cloud storage platform, for example, is likely to be slower due to the nature of the cloud storage itself. It is recommended that you test BLOB externalization providers in your environment to identify performance differences and, more importantly, to determine whether such differences have a material impact on overall SharePoint performance.

Performance aside, many organizations turn to ISV providers for their broad support for diverse storage platforms, business rules for externalization, management features, and maintenance. ISVs have built providers for BLOB stores in shared folders on a remote server, on cloud storage platforms, and on specific NAS and SAN platforms. ISV storage solutions provide various levels of rule-based management of BLOBs, so that an organization can specify what types of BLOBs are and are not externalized and thus create a hierarchical storage management platform. Installation and configuration of ISV providers is generally a significantly better experience than the FILESTREAM provider. Maintenance tasks are generally automated. And ISV providers typically deliver monitoring and reporting capabilities. Such features come at a cost, of course, which must also be considered. Later in this white paper, we will discuss the capabilities of third-party BLOB externalization solutions.

Functional Parity

It is important to note that BLOB externalization is transparent to the rest of SharePoint. SharePoint considers BLOBs part of its content repository, regardless of where the content is physically located.

Therefore, whether BLOBs are stored in a content database or are externalized, SharePoint manages SharePoint permissions to the content. SharePoint continues to index the content. All SharePoint document management features, including check-out and versioning, continue to be available. None of the business-value functionality of SharePoint changes when BLOBs are externalized.

Migrating to EBS or RBS

Generally speaking, you can enable EBS or RBS at any point in your SharePoint journey. Similarly, you can modify the rules that govern which BLOBs are externalized at any time. There are tools you can use—Windows PowerShell in the case of RBS—to scan the content database and move content out to the BLOB store.

The following sections will examine the pertinent factors—the benefits and considerations—that you should take into account as you plan for an optimized storage infrastructure for SharePoint. Due to the fact that EBS and RBS deliver conceptually similar benefits and challenges, we will refer generically to BLOB externalization solutions.

Benefits

BLOB externalization has the potential to deliver significant benefits in many workloads. Benefits include reduced cost, optimized performance, improved storage management, reduced storage footprint, efficient content restructure, and greater scalability.

Reduced Cost of Storage

The first, and most obvious consideration related to SharePoint storage optimization is cost, which is directly related to capacity and to the tier(s) of storage that host SharePoint content. Just how much do BLOBs impact capacity and cost? The answer might surprise you.

The 80 Percent Estimate

In a typical content database, documents—their metadata and the BLOBs that contain their content—tend to consume significant real estate. The content database becomes bloated by the BLOBs it stores.

Typically, as much as 80 percent of data for an enterprise-scale deployment of SharePoint Foundation consists of file-based data streams that are stored as BLOB data. These BLOB objects comprise data associated with SharePoint files.

http://msdn.microsoft.com/en-us/library/bb802976.aspx

This “80 percent estimate” doesn’t tell the full story, however. What this 80 percent estimate fails to illuminate is the scale of the impact of BLOB storage. In a typical collaborative environment, one often finds that a document is not stored just one time—that the amount of storage required for a single document is significantly more than you would expect—in fact, many multiples of the document’s size.

Let’s examine the total impact of a single document and its BLOB storage through the entire lifecycle of a document.

Document and Metadata Storage

SharePoint supports documents up to 2 GB in size (http://technet.microsoft.com/en-us/library/cc262787.aspx#ListLibrary), a software boundary that results from a 32-bit pointer used in SQL Server. There is no way to exceed that limit and to store larger documents in the content database—with or without BLOB externalization.

SharePoint includes a file size upload limitation that's configurable per web application. The default maximum upload size is 50 megabytes (MB)—considerably smaller than the 2 GB hard limit. This lower limit reflects practical concerns including network performance, the performance of transferring large files over HTTP, and user expectations for performance of file transfer. Many organizations retain this default upload size or raise the limit. If you choose to raise the maximum upload size, do so slowly and after careful testing.

Each document in a SharePoint library has metadata associated with it. Some metadata is user-configured, such as columns in the library. Other metadata is used internally by SharePoint. The amount of metadata associated with a document will vary based primarily upon user-configured metadata.

It's easy to understand that scenarios with larger documents see a higher ratio of BLOB-to-metadata storage, and scenarios with smaller documents and more metadata will see lower BLOB-to-metadata ratios. The 80 percent estimate is based on an average across multiple SharePoint environments.

But here’s the rub: A document is rarely, if ever, stored only once.

Document Versions

When version history is enabled for a document library, any change to the document or its metadata results in additional storage utilization. Two points are often misunderstood and have significant impact on storage when versioning is enabled:

1. No differential compression is used within SharePoint. When a new version is saved, the amount of storage represents the entire size of the file—not just the differences or deltas between versions. Conceptually, two versions of a document with minor changes will occupy 2 x (document size + metadata) of storage.

2. A new version of the document is created if the document—or its metadata—is modified. If a document is uploaded to a library and is never changed, but the metadata associated with that document is changed five times over the course of a month, the storage occupied by that document is approximately 5 x (document size + metadata).

When versioning is enabled, the impact of a document on storage is multiplied by the number of versions of that document. Therefore, it's critical to enforce limits to version history—unlimited version retention can lead to significant database bloat.

Recycle Bin Contents

When a document is deleted, the document and its versions are retained based on the web application’s settings for the SharePoint Recycle Bin. A user can restore a document she deleted from the Recycle Bin.

When a user empties the Recycle Bin, the document and its versions continue to be retained, and can be restored by site collection administrators, from what is referred to as the second-stage Recycle Bin.

Each site collection has a Recycle Bin. However, the Recycle Bin has two configurable settings that are both scoped to a web application. These settings apply to all site collection Recycle Bins in the web application.

The first Recycle Bin setting specifies the total number of days that a deleted document will be retained by the Recycle Bin. This setting applies from the moment the document is deleted. It doesn’t matter whether the document is in the user Recycle Bin or the second-stage Recycle Bin. X days after a document was originally deleted by the user, it's deleted from the Recycle Bin and the document is removed from the content database.

The second setting applies a storage quota to the second-stage Recycle Bin. When items are moved to the second-stage Recycle Bin, they count against this quota. When the quota is reached, the oldest items in the second-stage Recycle Bin are removed to make room for newly deleted items. The quota is configured as relative to the quota of the site collection. So if a site collection is subject to a 50 GB quota, and the second-stage Recycle Bin is limited to 50 percent of the quota, then the second-stage Recycle Bin for that site collection is effectively capped at 25 GB.

Therefore, the total storage impact of a document on a content database must take into account the fact that, until a document is purged from the second-stage Recycle Bin, the document—its BLOB and metadata—and those of the document’s versions continue to impact the content database.

Auditing

Audited activities generate entries in the audit log. The amount of storage required for auditing can be significant, particularly if you are auditing view activities. However, audit entry size and the size of audit logs is not related to document size, or to whether BLOBs are stored in SQL or are externalized. Therefore, while you should consider auditing when estimating total storage requirements for a content database (http://technet.microsoft.com/en-us/library/cc298801.aspx#Section1), we will not examine auditing in more depth in this white paper. 8

Office Web Apps Cache

In order to improve performance of SharePoint when the Microsoft Word web app and Microsoft PowerPoint web app are used, the web apps create renditions of a document in a cache called the Office Web Apps cache. When a document is rendered, it can be pulled from the cache. A document is re-rendered only if it doesn’t exist in the cache, or if the document has changed after the rendition in the cache was created. A timer job removes documents from the cache after a configurable expiration period.

If a web application is associated with the Microsoft Word or PowerPoint web apps, one content database will contain the cache for all content in the web application. In a document-heavy web application, the cache can grow quite large. By default, the cache is capped at 100 GB. It's a best practice to configure Office Web Apps to use a separate, dedicated content database in a SharePoint web application, and to manage the size of the cache to optimize performance and storage. You can learn more about this at http://technet.microsoft.com/en-us/library/ee837422.aspx.

The size of the Office Web Apps cache doesn’t depend on whether BLOBs are stored in SQL Server or are externalized. Rather, it’s based purely upon the number and size of documents, frequency of access to those documents, and on administrator configuration. So while the Office Web Apps cache should be considered as part of the estimate of storage required for a web application, it will not, if in a dedicated content database, affect the storage required for other content databases in a web application.

Service Databases

A document indirectly affects the storage required by service applications. For example, access to a document might be tracked by the Web Analytics service application. Tagging, commenting and rating activities consume approximately 9 KB per entry in the social tagging database of the User Profile service application. Such data are relatively negligible, and is neither dependent on whether a document’s BLOB is stored in SQL Server or is externalized, nor directly dependent on the document’s size.

However, the Search service application is affected directly by both the number of documents and their size. The crawl database, properties database, and index partitions each have a direct relationship to the number and size of documents. Search capacity planning is both a science and an art, but very rough estimates from typical implementations fall around 20 percent of the total size of indexed content (the corpus). 9

So, if you are indexing 1 TB of typical content, you can expect approximately 200 GB of storage utilization by search-related databases and the index. For more information, see http://technet.microsoft.com/en-us/library/gg750251.aspx.

Search and other service database sizes aren’t dependent on whether BLOBs are stored in SQL Server or are externalized.

Transaction Logs

SQL Server logs all activity to the transaction log for a database before committing the transaction to the data portion of the database. Transaction logs grow until a log backup, at which point space used by the log is cleared, but the file size does not shrink. You can shrink a SharePoint transaction log manually, which can be helpful if a transaction log has grown out of control, but the best practice is to manage transaction log size by managing transaction log backups.

When a document is uploaded or modified, the document BLOB and metadata are first written to the transaction log. Then, the transaction is committed to the appropriate tables in the content database itself. Therefore, the true impact of a document on total content database size, including the transaction log, can be approximated as document size x (creation + modifications) x 2 during the window between log backups.

The transaction log size is directly related to BLOB storage. If BLOBs are externalized, and aren’t stored in the content database, then the BLOB is also not written to the SQL Server transaction log.

BLOB Externalization, Capacity, and Cost

As you can see, the storage required for just one document can vary greatly, based on version retention, modification of the document or its associated metadata, web application settings such as Recycle Bin, auditing settings, the use of Office Web Apps and other service applications, and even backup policies. In a typical, highly collaborative scenario, an active document may be consuming storage equivalent to many multiples of the document’s actual size.

BLOB storage inside the content database can be expensive, from both a hard cost and total cost perspective. Many organizations have traditionally stored documents for business collaboration on file servers, which are relatively cheap compared to the typical, high-IOPS, Tier 1 storage that is allocated to support SQL Server for SharePoint. Moving such content to SharePoint and into Tier 1 storage—especially with the understanding that a document may require multiples of its size to support both the business and internal functionality of SharePoint throughout the lifecycle of that document—can be a costly proposition. 10

BLOB externalization does not by itself reduce the total storage footprint of your SharePoint infrastructure, except for reduced impact on transaction logs. But it does enable you to transfer the storage burden to more cost effective tiers. The cost savings can be tremendous. Some organizations report savings of tens of millions of dollars a year from storage optimization efforts focused on BLOB externalization. And some storage platforms have features that can, in combination with BLOB externalization, reduce the total storage footprint of SharePoint. Such features will be discussed later in this white paper.

Optimized Performance

It might seem that all BLOBs should be moved to cheaper tiers of storage in order to reduce cost of storage. But you must also consider the impact of BLOB externalization on SharePoint performance.

For example, if you use a third-party provider to externalize BLOBs to a cloud storage platform, such as Amazon or Azure, it stands to reason that reads and writes of documents will be slower than local SQL Server storage.

Interestingly, it is not true that all BLOB externalization reduces performance. In fact, it is possible to increase performance through BLOB externalization in certain scenarios and configurations.

The question is: At what point does performance of SharePoint improve with BLOB externalization, and at what point does performance degrade?

The answers to these questions require close examination of two performance-related issues:

  • The performance of read and write access to a single document
  • The performance of the entire SharePoint service—across all site collections in a content database and across all content hosted in SQL Server—in a real-world scenario

Most documentation in the SharePoint community focuses on only the first issue, which unfortunately clouds decision making and leads to designs that sacrifice performance in a typical, real-world production environment.

In the following sections, we will explore these two aspects of performance. It is critical to remember, however, that the only real way to know how SharePoint will perform with or without BLOB externalization in your enterprise is to test it with workloads that are realistic and representative of your environment.

In addition, you must consider whether performance degradation is noticeable to users, and whether—even with reduced performance—SharePoint can still meet user expectations and SLAs for performance. Finally, you must consider each scenario or workload that you are supporting. Users might find it acceptable for access to an archived record to be slower than access to documents around which they are actively collaborating. 11

Performance of Access to a Single Document

The out-of-box configuration of SharePoint stores BLOBs in SharePoint content databases. This can provide optimal performance in certain scenarios.

  • Tier 1 Storage Performance – Many organizations dedicate high-performance, Tier 1 storage subsystems to SQL Server. The performance of the SQL Server storage platform directly impacts SharePoint performance. Documents accessed from a high-performance SAN will perform better than documents with BLOBs externalized to public cloud storage.
  • Small Files – When a small document is read from—or written to—a content database, performance of that single activity is often optimal.
  • Frequently Read Files – When a small document is accessed regularly for read access, SQL caching can further improve performance of access to that document. A recently accessed, cached document can be retrieved from memory, rather than from disk.

SharePoint reaches a cross-over at which point performance is better when the BLOB is externalized for a particular workload.

Reads and writes of a single small document might be faster when stored in SQL. Read performance can be further improved when a document is in memory due to SQL caching. But as files grow in size or are accessed less frequently, performance can be improved by storing BLOBs in a platform that is better suited for file storage.

Write performance crosses over more quickly than read performance because a BLOB must be written twice—first to the transaction log, and then committed to the database table. There is also no caching benefit to a new or modified document. So as a document’s size increases, the performance penalty of writing the document’s BLOB once, let alone twice, reduces performance of documents stored in a content database.

Fortunately, RBS enables you to specify a file size threshold, above which a document’s BLOB is externalized, and below which the BLOB is stored in the content database. EBS solutions as well as custom or third-party RBS solutions can use other rules, such as file type, to determine whether or not to externalize a BLOB.

There is no formula with which to determine the file size threshold at which performance increases with BLOB externalization. There are simply too many factors in the performance equation, including the characteristics and access pattern of the document, and the performance characteristics of underlying storage platforms.

Our testing has shown that BLOBs greater than 1 MB generally perform better when externalized—assuming the BLOB store itself performs well—whereas very small files smaller than 256 KB generally perform better when stored in the content database.

Our findings align with the general consensus among consultants that documents greater than 512 KB or 1 MB should be externalized, at which point performance is improved for both reads and writes, given similar performance characteristics of the underlying storage platform; whereas access to a document smaller than 256 KB is faster with the BLOB stored in the SQL Server content database.

Between 256KB and 512KB or 1MB is the cross-over point that is heavily dependent on size and access patterns. For example, a 512KB file that is accessed frequently for reads may benefit from SQL caching, and therefore perform better when stored in the content database. The same file, archived and accessed infrequently for modification, might perform better when externalized. Write-heavy access patterns are improved by externalizing BLOBs to a greater extent than read-heavy access patterns.

Performance is also heavily dependent upon the specific EBS or RBS solution that you employ, and the characteristics of the BLOB store—the performance of the underlying storage subsystem. For example, a BLOB store on a SAN will perform better than a BLOB store hosted in the public cloud.

The threshold at which performance of file access improves depends on a variety of factors, and should be tested with your EBS or RBS storage solution as well as with content and access patterns that reflect expected workloads in your environment.

Performance of the Content Database, SQL Server and SharePoint Farm

We do not think it is helpful to get too caught up in discussions about—or testing—the performance threshold of access to a single BLOB unless your production SharePoint environment consists of one user accessing one document. In a real-world SharePoint environment, multiple users are accessing documents as well as lists, pages, workflows, services, and applications. Therefore, it is more important that you test the SharePoint environment with appropriate and complex workloads that are representative of day-to-day usage.

As multiple users access SharePoint, the impact of BLOBs on SQL Server is exacerbated, and BLOB storage in a content database can significantly impact performance of the entire content database, of the server running SQL Server and subsequently, the SharePoint farm.

As mentioned earlier, SQL Server is a database service that is optimized for performance of structured data. When you store unstructured data as BLOBs, SQL Server needs to work harder to save and retrieve the data. SQL Server is simply not optimized to be a file server.

The degradation of performance due to BLOBs is also true outside of SQL Server and SharePoint. One customer reported that they had a custom engineering application built on Oracle. When they used functionality in Oracle to remove BLOBs from the database, performance skyrocketed.

The broader performance effects of BLOBs stored in SQL Server are seen in real-world SharePoint environments, as users access documents, and other users access pages or list views.

If a content database has a large number of BLOBs (e.g. a document library with many files), and a reasonable level of activity that involves BLOB access, all other performance is degraded—for example, the performance of list views is slowed. As some users load and store BLOBs, CPU and memory utilization skyrocket, and performance suffers for other users across all content databases stored on the SQL Server. Because SQL Server is the most common bottleneck in the request pipeline, SharePoint performance suffers.

We have observed that the mere existence of BLOBs can visibly degrade performance of content access, even if the content access you are measuring has nothing to do with a document or BLOB. This is partly due to the fact that SharePoint and SQL services and administrators perform operations in the background—including database indexing, search indexing, and management operations such as database backup—that involve BLOB access.

When you externalize BLOBs from a content database, performance of the content database and of the SQL Server—its CPU and memory utilization—can improve. This means that performance of list views and other day-to-day user and maintenance activities—even those not related to the BLOBs themselves—can be significantly faster.

So while it is interesting to discuss whether a single document of 512 KB or 1 MB should be externalized to improve the performance of access to that one document, our experience is that such a discussion detracts from the more significant impact of BLOBs—even small BLOBs—in more complex environments.

Microsoft reported a 25 percent performance increase by externalizing BLOBs, across a mix of user workloads that reflect a real-world collaborative environment. Microsoft’s findings are summarized in the table below. SQL BLOB RBS Gain
Database Size - 1 TB 2292 GB 26 GB 98.9%
Database Backup Size - 100 GB 217 GB 7 GB 96.8%
Database Backup Time - 217 GB 2490 sec 38 sec 98.5%
Database Defrag Time - 100 GB 120 sec 4 sec 96.7%
Avg. SharePoint Response Time 28 msec 21 msec 25.0%
Large File Upload - 500 MB 55 seconds 29 seconds 47.6%

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.

A list of licenses authors might use can be found here