Introduction
Since sandboxed solutions are deprecated in SharePoint 2013, and Microsoft is pushing SharePoint Online and the App model, many developers are finding that they must move from the server-side object model to the client-side object model (CSOM). This brings a number of challenges: many features are missing in the client object model, and many tasks that were once easy are now difficult.
Presumably as time goes on, the client object model will continue to mature and these issues will improve. However, one thing that won't change is the requirement to write efficient code: we are now, after all, talking across a network - our code is usually executing on a different machine whereas we used to be executing right there on the SharePoint server. And if your experience is anything like mine, you'll soon be spending time on improving your app latency and other performance issues.
In this article, I'll outline a number of techniques to make sure your code runs as fast as possible. My examples are written using the managed CSOM; however, many (if not all) of the concepts can also be applied to the JavaScript version. If this is of interest to you, let me know in the comments and I might just write up a new JavaScript version.
- Profiling Your Code
- Only Request What You Want
- Call ExecuteQuery Sparingly
- Caching Data in the Session
- Using CAML
- Advanced: Parallel and Async Requests
Profiling Your Code
If you've got a performance problem, then before you jump into micro-optimising everything in sight, you need to take some measurements. Use the Stopwatch class to do this.
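System.Diagnostics.Stopwatch s = new System.Diagnostics.Stopwatch();
s.Start();
// ... the code you want to measure goes here ...
s.Stop();
System.Diagnostics.Debug.WriteLine("Elapsed: " + s.ElapsedMilliseconds + " ms");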
Note down your app's performance before and after and use this information to justify your effort.
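To make before-and-after measurements repeatable, you could wrap the Stopwatch in a small helper. This is only a sketch: the Profiler class and its Measure method are hypothetical names of my own, not part of any library, and Thread.Sleep merely stands in for the CSOM code you would really be timing.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

public static class Profiler
{
    // Runs the action once and returns how long it took.
    public static TimeSpan Measure(Action action)
    {
        Stopwatch stopwatch = Stopwatch.StartNew();
        action();
        stopwatch.Stop();
        return stopwatch.Elapsed;
    }
}

public static class ProfilerDemo
{
    public static void Main()
    {
        // Thread.Sleep is a placeholder for the code under test,
        // for example a block that ends in ExecuteQuery().
        TimeSpan elapsed = Profiler.Measure(() => Thread.Sleep(50));
        Console.WriteLine("Elapsed: " + elapsed.TotalMilliseconds + " ms");
    }
}
```

Measuring the same block before and after a change gives you the two numbers to compare.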
We'll start with the simplest and most obvious methods and gradually ramp up the complexity.
Only Request What You Want
You must explicitly request every property of every object you want. This is fundamental to CSOM programming: after all, it's designed to be used across a network. If you only need a user's Title and LoginName, then only request those:
var spUser = clientContext.Web.CurrentUser;
clientContext.Load(spUser, user => user.Title, user => user.LoginName);
However, if later on in your code you sometimes need to send the user an email, then add their email address to the earlier request. Don't go back to the server twice! The cost of always requesting an additional property is minuscule in comparison to going all the way to the server and back an extra time later on.
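As a rough illustration of that advice, the request might look like the sketch below. It assumes a reference to the Microsoft.SharePoint.Client assembly and an already authenticated ClientContext; the CurrentUserLoader class and LoadCurrentUser method are hypothetical names used only for this example.

```csharp
using Microsoft.SharePoint.Client;

public static class CurrentUserLoader
{
    // Requests Title, LoginName and Email together, so that a later
    // "send an email" step does not need a round trip of its own.
    public static User LoadCurrentUser(ClientContext clientContext)
    {
        User spUser = clientContext.Web.CurrentUser;
        clientContext.Load(spUser,
            user => user.Title,
            user => user.LoginName,
            user => user.Email);
        clientContext.ExecuteQuery(); // one round trip for all three properties
        return spUser;
    }
}
```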
Call ExecuteQuery Sparingly
Again, an obvious one. But there are scenarios where you will be calling ExecuteQuery when you don't need to. In case you didn't know, ExecuteQuery is the method that sends all of your queued requests to the server in a single batch, so every call costs a full network round trip and is slow!
Take a look at this scenario. If you're creating a list, you might think you need to write code as follows:
List list = web.Lists.Add(...);
ctx.ExecuteQuery();
ctx.Load(list, l => l.DefaultViewUrl);
ctx.ExecuteQuery();
In fact, you don't need that first ExecuteQuery. It's not intuitive, but you can create the list, get its URL, and submit both requests in one go:
List list = web.Lists.Add(...);
ctx.Load(list, l => l.DefaultViewUrl);
ctx.ExecuteQuery();
A slightly more convoluted example involves a scenario where you are indirectly calling some CSOM code, but it's behind an interface and you may be calling it multiple times. How can you protect against an ExecuteQuery call every time this interface method is called? For example:
public interface IData { }
public class MyDataClass : IData { }
public interface IDataRetriever
{
    IData GetData(string id);
}
public class SPDataRetriever : IDataRetriever
{
    public IData GetData(string id)
    {
        ListItem li = _list.GetItemById(id);
        ctx.Load(li); // without Load, the item's field values are not retrieved
        ctx.ExecuteQuery();
        return new MyDataClass(li);
    }
}
In our scenario, it's being consumed as follows:
data = ids.Select(id => dataRetriever.GetData(id));
Clearly, this is very inefficient since ExecuteQuery is called for every item in the enumerable. Let's refactor the code to remove the ExecuteQuery call:
public interface IDataRetriever
{
    void RequestData(string id);
    IEnumerable<IData> GetAvailableData();
}
public class SPDataRetriever : IDataRetriever
{
    private Queue<ListItem> _queue = new Queue<ListItem>();
    public void RequestData(string id)
    {
        ListItem li = _list.GetItemById(id);
        ctx.Load(li); // queued only; nothing is sent until ExecuteQuery
        _queue.Enqueue(li);
    }
    public IEnumerable<IData> GetAvailableData()
    {
        var result = _queue.Select(li => new MyDataClass(li)).ToArray();
        _queue.Clear();
        return result;
    }
}
You can see it's now split into two methods: RequestData, which "enqueues" requests ready to be sent to the server, and GetAvailableData, which returns data assuming that ExecuteQuery has now been called.
We consume this code as follows:
IDataRetriever dataRetriever1 = new SPDataRetriever(ctx);
IDataRetriever dataRetriever2 = new SPDataRetriever(ctx);
foreach (string id in new[] { "id1", "id2" })
    dataRetriever1.RequestData(id);
foreach (string id in new[] { "id1", "id2" })
    dataRetriever2.RequestData(id);
ctx.ExecuteQuery();
IEnumerable<IData> allData = dataRetriever1.GetAvailableData().Concat(dataRetriever2.GetAvailableData());
This is an example of the kind of creative approach that helps prevent performance issues by minimising the number of times you call ExecuteQuery.
Caching Data in the Session
If every page in your app is requesting the same data from SharePoint, then you can store it temporarily in the user session cache. This saves you a round trip to the SharePoint server on every page request. Additionally, since it's in the user session cache, it is scoped to each user individually. If you want to cache application-wide data, you can store it in the Application cache. See this MSDN article for more information.
So, let's assume a scenario where each page checks whether or not the user is allowed access to it based on whether they are a site administrator:
public bool CheckPrivileges(HttpContextBase httpContext)
{
    var spContext = SharePointContextProvider.Current.GetSharePointContext(httpContext);
    using (var clientContext = spContext.CreateUserClientContextForSPHost())
    {
        var currentUser = clientContext.Web.CurrentUser;
        clientContext.Load(currentUser, u => u.IsSiteAdmin);
        clientContext.ExecuteQuery();
        return currentUser.IsSiteAdmin;
    }
}
We can simply wrap that CheckPrivileges method in another that performs session caching:
public bool CheckPrivilegesWithSessionCaching(HttpContextBase httpContext)
{
    string key = "IsSiteAdmin";
    // The session indexer returns null when the key is not present.
    object cached = httpContext.Session[key];
    if (cached is bool)
    {
        return (bool)cached;
    }
    else
    {
        bool result = CheckPrivileges(httpContext);
        httpContext.Session[key] = result;
        return result;
    }
}
Note that if you're storing large amounts of data, this solution won't scale well (since it is 'In Memory' caching) - you could store cached data in your database instead.
Additionally, you cannot assume that data will be available in the session cache - it could be cleared by ASP.NET at any time, or, due to load balancing it could be cached differently on separate servers. As long as you go back to SharePoint to retrieve it when necessary, this shouldn't be a problem.
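The "go back to SharePoint when necessary" rule can be captured in one reusable helper. The following is a minimal sketch rather than production code: CacheHelper and GetOrLoad are hypothetical names, and a plain dictionary stands in for the ASP.NET session so that the example runs on its own.

```csharp
using System;
using System.Collections.Generic;

public static class CacheHelper
{
    // Returns the cached value for the key, or calls load() and caches the result.
    // The dictionary is a stand-in for the ASP.NET session in this sketch.
    public static T GetOrLoad<T>(IDictionary<string, object> cache, string key, Func<T> load)
    {
        object cached;
        if (cache.TryGetValue(key, out cached) && cached is T)
        {
            return (T)cached;
        }

        T value = load(); // for example, the CSOM round trip
        cache[key] = value;
        return value;
    }
}

public static class CacheHelperDemo
{
    public static void Main()
    {
        var session = new Dictionary<string, object>();
        int roundTrips = 0;
        Func<bool> checkPrivileges = () => { roundTrips++; return true; };

        CacheHelper.GetOrLoad(session, "IsSiteAdmin", checkPrivileges); // loads
        CacheHelper.GetOrLoad(session, "IsSiteAdmin", checkPrivileges); // cached

        session.Clear(); // simulate the session being cleared
        CacheHelper.GetOrLoad(session, "IsSiteAdmin", checkPrivileges); // loads again

        Console.WriteLine("Round trips: " + roundTrips); // prints "Round trips: 2"
    }
}
```

Because the loader is only invoked on a miss, a cleared or missing cache entry simply costs one extra round trip instead of causing an error.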
Using CAML
If you're retrieving some items from a list, it is tempting to retrieve all items and then filter them in your code. However, you can use CAML queries to perform the filtering server side. It can be a little bit awkward (coding in XML) but it's worth getting right for the potential speed increases you'll see, especially for large lists.
For example, this is the lazy way to get list items:
CamlQuery query = CamlQuery.CreateAllItemsQuery();
var items = list.GetItems(query);
And here's a formatted CAML query with a where clause:
CamlQuery query = new CamlQuery()
{
    // 'Text' is the CAML value type for a single-line text field.
    ViewXml = string.Format("<View><Query><Where><Eq><FieldRef Name='{0}' /><Value Type='Text'>{1}</Value></Eq></Where></Query></View>",
        "FirstName", "Eric")
};
var items = list.GetItems(query);
Note the outer View tag, which is required when querying with CSOM, unlike the server object model version.
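If hand-writing the XML feels error-prone, one option is to build it with LINQ to XML, which also escapes special characters in the value for you. This is a sketch under my own assumptions: CamlBuilder and BuildEqualsViewXml are hypothetical names, not part of CSOM, and only the simple "field equals value" case is covered.

```csharp
using System;
using System.Xml.Linq;

public static class CamlBuilder
{
    // Builds <View><Query><Where><Eq>...</Eq></Where></Query></View>
    // for a single field/value comparison.
    public static string BuildEqualsViewXml(string fieldName, string valueType, string value)
    {
        XElement view =
            new XElement("View",
                new XElement("Query",
                    new XElement("Where",
                        new XElement("Eq",
                            new XElement("FieldRef", new XAttribute("Name", fieldName)),
                            new XElement("Value", new XAttribute("Type", valueType), value)))));

        // DisableFormatting keeps the XML on one line, like the hand-written version.
        return view.ToString(SaveOptions.DisableFormatting);
    }
}

public static class CamlBuilderDemo
{
    public static void Main()
    {
        // The ampersand in the value is escaped to &amp; automatically.
        Console.WriteLine(CamlBuilder.BuildEqualsViewXml("FirstName", "Text", "Tom & Jerry"));
    }
}
```

The returned string can be assigned to the ViewXml property of a CamlQuery in the same way as the formatted string above.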
Here's another trick: you can actually get all folders, subfolders and/or files from a document library, in a single query, by specifying RecursiveAll:
CamlQuery allFoldersCamlQuery = new CamlQuery()
{
    ViewXml = "<View Scope='RecursiveAll'>"
        + "<Query>"
        + "<Where>"
        + "<Eq><FieldRef Name='FSObjType' /><Value Type='Integer'>1</Value></Eq>"
        + "</Where>"
        + "</Query>"
        + "</View>"
};
In the above query, Scope is set to RecursiveAll. Also, I'm setting the field FSObjType=1, which means that only folders are returned. If you want only items, set FSObjType=0. If you want both files and folders, omit the Where clause entirely.
You can actually go even further, retrieving all items from multiple lists, by enumerating through the lists and using a CAML query on each. The important thing is that you only call ExecuteQuery once, at the end.
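A sketch of that multi-list idea might look like this. It assumes you already know the list titles (so no extra round trip is needed to discover them), a reference to Microsoft.SharePoint.Client, and an authenticated ClientContext; MultiListReader and GetItemsFromLists are hypothetical names.

```csharp
using System.Collections.Generic;
using Microsoft.SharePoint.Client;

public static class MultiListReader
{
    // Queues one GetItems request per list, then sends them all in one batch.
    public static IDictionary<string, ListItemCollection> GetItemsFromLists(
        ClientContext ctx, IEnumerable<string> listTitles, CamlQuery query)
    {
        var results = new Dictionary<string, ListItemCollection>();

        foreach (string title in listTitles)
        {
            List list = ctx.Web.Lists.GetByTitle(title);
            ListItemCollection items = list.GetItems(query);
            ctx.Load(items);        // queued, not yet sent
            results[title] = items;
        }

        ctx.ExecuteQuery();         // single round trip for every list
        return results;
    }
}
```

The collections in the returned dictionary are only populated once ExecuteQuery has returned, which is why the call sits after the loop.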
Advanced: Parallel and Async Requests
If you're using JavaScript or the CSOM library in Silverlight or Windows Phone, then you'll see that you have access to ExecuteQueryAsync. Unfortunately for the rest of us in .NET, there's only a synchronous ExecuteQuery method. I don't know why.
Update: SharePoint Online CSOM assemblies now have support for ExecuteQueryAsync.
What if you want to do a database request, or a different web request, or get user input, or something else at the same time as your CSOM request? Wouldn't it be handy to have that ExecuteQueryAsync? Let's see if we can create one:
public static class CSOMExtensions
{
    public static Task ExecuteQueryAsync(this ClientContext clientContext)
    {
        return Task.Factory.StartNew(() =>
        {
            clientContext.ExecuteQuery();
        });
    }
}
This moves the blocking call onto another thread (by default a thread-pool worker), which is then itself blocked until the query completes. This is acceptable, for example, when you want to avoid blocking a UI thread.
Note that this code might need a little optimisation for your particular circumstances. For example, using Task.Factory.StartNew doesn't always create a new thread. If you're using a lot of these concurrently, you might want to avoid using the thread pool. You can read more about Task here.
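One way to keep blocking calls off the regular pool workers is the LongRunning creation option, which is a hint to the scheduler rather than a guarantee. The sketch below assumes that is what you want; BlockingTaskHelper and RunBlocking are hypothetical names, and Thread.Sleep stands in for clientContext.ExecuteQuery().

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class BlockingTaskHelper
{
    // Starts a blocking action with the LongRunning hint, which asks the
    // default scheduler not to tie up a regular thread-pool worker.
    public static Task RunBlocking(Action blockingAction)
    {
        return Task.Factory.StartNew(
            blockingAction,
            CancellationToken.None,
            TaskCreationOptions.LongRunning,
            TaskScheduler.Default);
    }
}

public static class BlockingTaskDemo
{
    public static void Main()
    {
        // Thread.Sleep is a placeholder for the synchronous CSOM call.
        Task task = BlockingTaskHelper.RunBlocking(() => Thread.Sleep(100));
        Console.WriteLine("Doing other work while the call is in flight...");
        task.Wait();
        Console.WriteLine("Blocking call finished: " + task.IsCompleted);
    }
}
```

A thread is still blocked for the duration of the call, so this trades pool pressure for thread count; it is not true asynchronous I/O.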
Update: For a true async version, try this:
https://gist.github.com/johnnycardy/9e1671cf5087dcd8f4e7892fc3c2cfb8
Now we can do super cool parallel and async stuff! Check this out:
public async Task<ActionResult> Index()
{
    var spContext = SharePointContextProvider.Current.GetSharePointContext(HttpContext);
    using (var clientContext = spContext.CreateUserClientContextForSPHost())
    {
        if (clientContext != null)
        {
            var currentUser = clientContext.Web.CurrentUser;
            clientContext.Load(currentUser, u => u.Title);
            Task t1 = clientContext.ExecuteQueryAsync();
            clientContext.Load(currentUser, u => u.Email);
            Task t2 = clientContext.ExecuteQueryAsync();
            await t1.ContinueWith((t) =>
            {
                ViewBag.UserName = currentUser.Title;
            });
            await t2.ContinueWith((t) =>
            {
                ViewBag.Email = currentUser.Email;
            });
        }
    }
    return View();
}
OK, so this is a contrived and pointless example because, if you'd been paying attention, you'd know we should only call ExecuteQuery once! But let's walk through it:
- Firstly, the method signature has changed. This controller method is now async, and it returns a Task. This means we can now use the await keyword within the method.
- Within the body, we're loading Title and Email. We're calling ExecuteQueryAsync twice, and each call starts (and returns) a new Task object.
- We call ContinueWith on each Task object to run code when it completes, namely, using the property that the CSOM code requested.
- We use the await keyword to signify that the code is asynchronous and the Index controller method should depend on the result of this code.
Consider this next example, which is a controller method called List. It's a bit more sensible: we're retrieving both database data and SharePoint list items concurrently, and combining them into ViewModel objects to return to the client.
public async Task<ActionResult> List()
{
    List<ViewModel> result = new List<ViewModel>();
    var spContext = SharePointContextProvider.Current.GetSharePointContext(HttpContext);
    using (var clientContext = spContext.CreateUserClientContextForSPHost())
    {
        var listItems = clientContext.Web.Lists.GetByTitle(listTitle).GetItems(camlQuery);
        clientContext.Load(listItems); // queue the items for retrieval
        // Start both operations before awaiting either of them.
        var clientTask = DB.Clients.ToListAsync();
        var spTask = clientContext.ExecuteQueryAsync();
        await Task.WhenAll(clientTask, spTask);
        result = clientTask.Result.Select(c => new ViewModel(listItems)).ToList();
    }
    return View(result);
}
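In the code above, we wait once for both operations instead of waiting for each in turn. The total time is roughly that of the slower operation rather than the sum of the two, so if they take about the same time, the method could run in close to half the time.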
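To see the effect without a database or SharePoint site to hand, here is a self-contained sketch in which Task.Delay stands in for both requests. ConcurrencyDemo, FakeRequestAsync and RunBothAsync are hypothetical names, and the 200 ms delays are arbitrary.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

public static class ConcurrencyDemo
{
    // Task.Delay stands in for a database query or a CSOM round trip.
    private static async Task<string> FakeRequestAsync(string name, int milliseconds)
    {
        await Task.Delay(milliseconds);
        return name;
    }

    public static async Task<TimeSpan> RunBothAsync()
    {
        Stopwatch stopwatch = Stopwatch.StartNew();

        // Start both operations before awaiting either of them.
        Task<string> dbTask = FakeRequestAsync("database", 200);
        Task<string> spTask = FakeRequestAsync("sharepoint", 200);
        await Task.WhenAll(dbTask, spTask);

        stopwatch.Stop();
        return stopwatch.Elapsed;
    }

    public static void Main()
    {
        TimeSpan elapsed = RunBothAsync().GetAwaiter().GetResult();
        // Expect a figure close to 200 ms rather than 400 ms.
        Console.WriteLine("Both finished in about " + (int)elapsed.TotalMilliseconds + " ms");
    }
}
```

Awaiting each task immediately after starting it would run the two delays back to back instead.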
Well, that's all I've got for now. Clearly the client-side object model is easily abused in terms of performance, and it's very easy to suddenly realise that performance is a problem. Hopefully, this article will provide some ideas and inspiration to keep your app fast and responsive. If you have any suggestions or performance tips for CSOM, then please let me know in the comments!