You might have reached the threshold of a SharePoint List and you’re afraid to increase the limit as you might affect the performance of your server. Now you are thinking of other solutions and one thing came to mind, why not archive the data? Well, that is a good idea but how are you going to achieve this? Well, there are different ways and here are some suggestions:
- Export it manually to Excel and save the Excel document elsewhere – Caveman’s way
- Create an identical list and save it there – So you create a workflow to do this but you still reach the threshold for the archive list, do you create another identical list every exceeded threshold limit?
- You use a third-party open source data connector for SSIS – now you are going somewhere but what if it has bugs? What if newer versions of SharePoint are not supported?
- Implement a retention policy and delete old data – too harsh for your users.
- Create a new solution! – You’re brave but at least you can freely manipulate what you want for your solution and that’s what we are going to do today, make our own solution in SSIS without any third-party dependence apart from what you already have, Visual Studio, SQL Server and Sharepoint.
Let's start and create a solution that will migrate List Items in a SharePoint List to SQL Server using SharePoint Web Services and SSIS.
Let's assume you have a List
named Test
on your SharePoint instance which contains the following fields on the image below.
Go to and create a view not using a datasheet format. This will be the data interface for your SSIS, once created, go below and you will see the Web address for mobile view, get the ListGUID
and ViewGUID
, keep this as you will be using them later.
Now create a new Business Intelligence – Integration Services Project. Let's call it ListArchive
. In case you don’t have the option in your Visual Studio 2012, download it here.
Now on your Control Flow canvas, drag a Script Task.
Create a variable, we only need one for this example. Name it OutputXML
with data type string.
Double click you Script Task, then indicate the variable you created in the ReadWriteVariables
of the script.
Now add a Service Reference to your SharePoint instance. Right click on your Project References and choose Add Service Reference.
When the window opens, go to Advanced.
Then to Add Web Reference.
Now point your Web Reference to URL to the List web service (other services are explained here), in this example we go to http://yoursharepointserver.com/sandbox/_vti_bin/Lists.asmx. Click the arrow beside the URL address bar, then your service methods should show below. Give it a Web Reference name then click Add Reference.
That’s all you need for now, so let's start coding. Copy and paste this on your ScriptMain.cs replacing what's contained in Main()
. Do it now and I will explain the code below, it is commented anyways so it's easy to understand.
public void Main()
{
var sharepointList = new sharepointListWebService.Lists();
sharepointList.UseDefaultCredentials = true;
sharepointList.Url = "http://yoursharepointserver.com/sandbox/_vti_bin/lists.asmx";
var listGuid = "{cb6aedb3-d549-45a0-b3c5-e831cb0b51d0}";
var viewGuid = "{dfbcb0dc-a21f-4f71-94e8-b0a1adbd717a}";
var outputNode = sharepointList.GetListItems
(listGuid, viewGuid, null, null, "10000", null, null);
var outputString = new System.Text.StringBuilder();
outputString.Append("<data xmlns:z=\"#RowsetSchema\">");
outputString.Append(outputNode["rs:data"].InnerXml.Replace
("<z:row ", "<row "));
outputString.Append("</data>");
var output = outputString.ToString();
var xmlResultsDocument = new XmlDocument();
xmlResultsDocument.LoadXml(output);
xmlResultsDocument.Save(@"C:\Users\Raymund\Desktop\Export.xml");
DeleteListItems(output, listGuid, sharepointList);
Dts.Variables["User::OutputXML"].Value = output;
Dts.TaskResult = (int)ScriptResults.Success;
}
private void DeleteListItems(string listContents,
string listGuid, sharepointListWebService.Lists sharepointList)
{
var xmlResultsDocument = new XmlDocument();
xmlResultsDocument.LoadXml(listContents);
var xmlNamespace = new XmlNamespaceManager(xmlResultsDocument.NameTable);
XmlNodeList rows = xmlResultsDocument.SelectNodes("//row", xmlNamespace);
if (rows.Count != 0)
{
foreach (XmlNode row in rows)
{
System.Xml.XmlDocument doc = new System.Xml.XmlDocument();
System.Xml.XmlElement batchElement = doc.CreateElement("Batch");
batchElement.SetAttribute("OnError", "Continue");
batchElement.SetAttribute("ListVersion", "1");
int id = int.Parse(row.Attributes["ows_ID"].Value);
batchElement.InnerXml = "<Method ID='1'
Cmd='Delete'><Field Name='ID'>" +
id + "</Field></Method>";
sharepointList.UpdateListItems(listGuid, batchElement);
}
}
}
First is you need to instantiate that Sharepoint webservice and give it credentials to run on, you might want to change this on a live environment.
The two GUIDs that you kept a while ago will be used here, you can do that programmatically but that’s not the focus of this guide.
Then execute the GetListItems
given your ListID
and ViewID
, it will be presented in an XML format. That result will then be passed to the variable you created earlier.
Then we delete the items since it will be saved on the SQL Server anyway, that’s the whole point of archiving isn’t it? While you may might argue why I am doing it in here and not after saving, well I know the fact and I will just include the code in one place to easily explain things. In a real scenario, you need a lot of try catch
es here and perform only the delete once the data is safely stored on the database.
Now run your code and get that XML Export, you need this to create a schema. Once it runs successfully, open that exported XML file in your Visual Studio, go to XML tab then choose Create Schema.
This then creates a schema for you with XSD extension, save it in your project folder.
Now going back to your Control Flow canvas, drag a Data Flow Task, your’e now ready to perform a data dump from your XML variable to SQL Server.
Now before you continue, make sure you have the right tables sorted already in SQL Server, I suggest to name the fields exactly the same as the fields you get in the schema generated, take note of data types as well.
Once you got that sorted, go to the Data Flow Tab then drag an XML Source and an ADO NET Destination, connect them together.
Configure your XML Source with the following. Data access mode should come from the variable and XSD will be the one that was autogenerated earlier.
Now go to columns to confirm, it will warn you of the maximum length. We ignore that first but again on live scenario, you need to give this the proper data types and associated lengths.
Check if you got all Columns you need.
Then let's ignore any failure, just for this demo. If we don’t do this, there will be a lot of truncation error specially if you have long column lengths.
XML Source is finished. Now let's configure the destination, create a new connection manager.
Choose ADO.NET.
Then point to your database.
Then the necessary table.
Click mappings, if you have named the database table fields the same as the XSD, then it would automap beautifully, otherwise you will manually do it.
Now, you will see the red x icon disappears on ADO NET Destination but still on the XML Source, this is because it validates the External Metadata from the XML and you don’t have it as it is from a variable populated on run time. Just disable that checking on the Properties of your XML Source.
Run your project, then you’re all good to go.
To confirm, check your database if the data is there.
Also check your SharePoint list, data should be already deleted at this moment.
This is tested to work using SharePoint 2013, SQL Server 2012 and Visual Studio 2012.