Transfer Large Files using BizTalk - Receive Side

19 Apr 2011
How to transfer large files using BizTalk - Receive side

Introduction

Dealing with large files (200MB+) on BizTalk can be a major performance bottleneck when a file is parsed by a receive pipeline and stored into the MessageBox. The CPU usage of both the BizTalk Server host instance that receives the file and the SQL Server can go very high and slow down the system, especially when more large files are received at the same time.

One way to solve this problem is to create a custom pipeline component that receives the large file, stores it to disk and creates a small XML message that contains the path to the stored file. The small XML message is stored in the MessageBox instead of the large file. It carries the same context information as the large file and can be picked up by an orchestration or a send pipeline. The sending component then reads the location of the large file from the message and sends the file. Deleting the large files from disk can be done in a scheduled task. In this article, I describe how to receive the large file, store it to disk and create the small XML message.

The Custom Pipeline Component

Much of this code is standard boilerplate for a custom pipeline component and can also be generated by wizards. I nevertheless list all the code below with a short explanation. The essential code is in the Execute method, which comes from the IComponent interface.

The custom pipeline component is created as a new Class Library project in Visual Studio and signed with a strong name key file.

A reference to Microsoft.BizTalk.Pipeline.dll is added.

The code starts by adding the following namespaces to create a custom pipeline component and to store files to disk.

C#
using System;
using System.Collections.Generic;
using System.Text;
using Microsoft.BizTalk.Message.Interop;
using Microsoft.BizTalk.Component.Interop;
using System.IO;

The code at the beginning of the class is listed below. One attribute marks the class as a pipeline component and a second restricts the component to the decode stage of a pipeline. A decoder pipeline component needs to implement the interfaces IBaseComponent, IComponentUI and IComponent. In addition, the class below implements the interface IPersistPropertyBag, but this is optional.

C#
namespace Stm.LargeFileDecoder
{
   [ComponentCategory(CategoryTypes.CATID_PipelineComponent)]
   [ComponentCategory(CategoryTypes.CATID_Decoder)]
   [System.Runtime.InteropServices.Guid("53fd04d5-8337-42c2-99eb-32ac96d1105a")]
   public class LargeFileDecoder : IBaseComponent,
      IComponentUI, IComponent, IPersistPropertyBag
   {

The IBaseComponent interface exposes three properties that give basic information about the component: its description, name and version.

C#
#region IBaseComponent
private const string _description =
   "Pipeline component used to save large files to disk";
private const string _name = "LargeFileDecoder";
private const string _version = "1.0.0.0";

public string Description
{
   get { return _description; }
}
public string Name
{
   get { return _name; }
}
public string Version
{
   get { return _version; }
}
#endregion

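The interface IComponentUI defines a method and a property that are used within the Pipeline Designer environment. To keep it simple, I have provided only a minimal implementation: the icon is an empty handle and Validate returns null, which reports no validation errors.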

C#
#region IComponentUI
private IntPtr _icon = new IntPtr();
public IntPtr Icon
{
   get { return _icon; }
}
public System.Collections.IEnumerator Validate(object projectSystem)
{
   return null;
}
#endregion

The IPersistPropertyBag interface is implemented in order to store property information for the pipeline component. When used in a receive pipeline, this component allows the user to type values into property fields in the receive pipeline configuration in the BizTalk Administration Console.

There are two properties available for users. LargeFileLocation is a path to the directory where the large files are going to be stored. ThresholdSize is a value in bytes to determine whether or not to treat a file as large.

C#
#region IPersistPropertyBag
private string _largeFileLocation;
private int _thresholdSize;

public string LargeFileLocation
{
   get { return _largeFileLocation; }
   set { _largeFileLocation = value; }
}
        
public int ThresholdSize
{
   get { return _thresholdSize; }
   set { _thresholdSize = value; }
}

public void GetClassID(out Guid classID)
{
   classID = new Guid("53fd04d5-8337-42c2-99eb-32ac96d1105a");
}
public void InitNew()
{
}
public void Load(IPropertyBag propertyBag, int errorLog)
{
   object val1 = null;
   object val2 = null;
   try
   {
      propertyBag.Read("LargeFileLocation", out val1, 0);
      propertyBag.Read("ThresholdSize", out val2, 0);
   }
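   catch (ArgumentException)
   {
      // Ignored on purpose: typically thrown when a property
      // is not yet present in the property bag
   }
   catch (Exception ex)
   {
      // Keep the original exception as InnerException for diagnostics
      throw new ApplicationException("Error reading PropertyBag: " + ex.Message, ex);
   }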
   if (val1 != null)
      _largeFileLocation = (string)val1;            

   if (val2 != null)
      _thresholdSize = (int)val2;

}
public void Save(IPropertyBag propertyBag, bool clearDirty, bool saveAllProperties)
{
   object val1 = (object)_largeFileLocation;
   propertyBag.Write("LargeFileLocation", ref val1);

   object val2 = (object)_thresholdSize;
   propertyBag.Write("ThresholdSize", ref val2);
}
#endregion

The Load method reads the values from the PropertyBag into the properties. I got an error message when adding the component to a pipeline in the Pipeline Designer, although the component itself worked fine. To avoid this error message, I added a catch block that swallows the ArgumentException.

The Save method writes the values of the properties into the PropertyBag.

GetClassID returns the component's unique identifier, the same GUID that is given in the Guid attribute on the class.

InitNew is used to initialize the object to be persisted in component properties. This is not required in this project.

The core interface is IComponent. It contains a single method, Execute, which in this component stores the input message to disk and creates a small XML message for further processing.

C#
#region IComponent
public IBaseMessage Execute(IPipelineContext pContext, IBaseMessage pInMsg)
{
   // Fall back to defaults when the properties are not configured
   if (string.IsNullOrEmpty(_largeFileLocation))
      _largeFileLocation = Path.GetTempPath();

   // _thresholdSize is an int, so it can never be null
   if (_thresholdSize <= 0)
      _thresholdSize = 4096;

   Stream originalStream = pInMsg.BodyPart.GetOriginalDataStream();

   // Treat as large file only if the size of the file is greater than the threshold size.
   // Note: Length is only available on seekable streams.
   if (originalStream.Length > _thresholdSize)
   {
      // Path.Combine copes with a missing trailing backslash in LargeFileLocation
      string largeFilePath = Path.Combine(_largeFileLocation, pInMsg.MessageID.ToString() + ".msg");

      // Write message to disk; 'using' closes the file even if an exception occurs
      using (FileStream fs = new FileStream(largeFilePath, FileMode.Create))
      {
         byte[] buffer = new byte[4096];
         int bytesRead;
         while ((bytesRead = originalStream.Read(buffer, 0, buffer.Length)) > 0)
         {
            // Write only the bytes actually read in this pass
            fs.Write(buffer, 0, bytesRead);
         }
      }

      // Create a small xml message; escape the path so characters like '&' stay valid XML
      string xmlInfo =
         "<MsgInfo xmlns='http://Stm.LargeFileTransfer'><LargeFilePath>" +
         System.Security.SecurityElement.Escape(largeFilePath) + "</LargeFilePath></MsgInfo>";
      byte[] byteArray = System.Text.Encoding.UTF8.GetBytes(xmlInfo);
      MemoryStream ms = new MemoryStream(byteArray);
      pInMsg.BodyPart.Data = ms;
   }
   return pInMsg;
}
#endregion

First, the properties LargeFileLocation and ThresholdSize get default values if they were not set by the user in the BizTalk Administration Console. The incoming file is processed as a large file only if its size is greater than the configured threshold. If so, a FileStream creates a new file with a unique name, based on the message ID, and the whole incoming stream is written to that file. This is done in a loop by reading a part of the incoming stream into a buffer and then writing from that buffer to the FileStream.

Then a small XML message containing the path to the large file is created as a string. This string is converted to an array of bytes, which is used to create a MemoryStream. This MemoryStream is the new XML message and is assigned to the Data property of the BodyPart of the incoming message. The whole message is then returned.
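The read and write loop described above can be tried outside BizTalk with plain streams. The following is a minimal sketch, assuming a hypothetical helper named StreamSpooler.CopyToFile, which is not part of the component in this article or of any BizTalk API. It writes only the number of bytes actually read in each pass, which matters as soon as the buffer is larger than one byte, because the last chunk is usually shorter than the buffer.

```csharp
using System;
using System.IO;

// Hypothetical helper for illustration only; not part of the article's component.
public static class StreamSpooler
{
    // Copies 'source' to a new file at 'targetPath' and returns the number of bytes written.
    public static long CopyToFile(Stream source, string targetPath, int bufferSize = 4096)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (bufferSize <= 0) throw new ArgumentOutOfRangeException(nameof(bufferSize));

        long total = 0;
        byte[] buffer = new byte[bufferSize];

        // 'using' closes the file even if a read or write throws.
        using (FileStream target = new FileStream(targetPath, FileMode.Create, FileAccess.Write))
        {
            int bytesRead;
            while ((bytesRead = source.Read(buffer, 0, buffer.Length)) > 0)
            {
                // Write only what was read in this pass, not the whole buffer.
                target.Write(buffer, 0, bytesRead);
                total += bytesRead;
            }
        }
        return total;
    }
}
```

Inside Execute, a call could look like StreamSpooler.CopyToFile(originalStream, largeFilePath). The buffer size is a tuning knob; 4096 is only a common starting point, not a measured optimum.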

Schema and Receive Pipeline

For BizTalk Server to recognize the small XML message returned by the component above, a schema must be created and deployed. (Whether the message needs to be recognized at all depends on how the send side processes it.) A schema can easily be created by saving the XML string as an XML file and using that file as input to the schema generation wizard in Visual Studio. The field that contains the path to the large file needs to be a distinguished field if it is going to be accessed from an orchestration, or a promoted property if it is going to be accessed in a pipeline.
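To make the shape of the small XML message concrete, here is a minimal sketch that builds the message and reads the path back with LINQ to XML. The helper class MsgInfo and its methods are hypothetical names used only for illustration; the element names and the namespace are the ones used by the component above. Building the message with XElement, instead of string concatenation, has the side effect that characters such as '&' in the path are escaped.

```csharp
using System;
using System.Xml.Linq;

// Hypothetical helper for illustration only; not part of the article's component.
public static class MsgInfo
{
    // Namespace taken from the XML string created in the Execute method.
    private static readonly XNamespace Ns = "http://Stm.LargeFileTransfer";

    // Builds the small XML message for a given file path.
    public static string Create(string largeFilePath)
    {
        XElement root = new XElement(Ns + "MsgInfo",
            new XElement(Ns + "LargeFilePath", largeFilePath));
        return root.ToString(SaveOptions.DisableFormatting);
    }

    // Reads the path back, as a send-side component might do.
    public static string ReadPath(string xml)
    {
        XElement root = XElement.Parse(xml);
        XElement path = root.Element(Ns + "LargeFilePath");
        if (path == null) throw new FormatException("LargeFilePath element not found.");
        return path.Value;
    }
}
```

The string returned by Create can be saved to a file and used as the instance from which the schema is generated.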

When the component above is built, the DLL can be copied to the Pipeline Components folder of the BizTalk Server installation, next to the other pipeline components. In a BizTalk receive pipeline project, this component DLL must be added to the Toolbox and placed in the decode stage. In the disassemble stage that follows, an XML Disassembler component can be used so that the small XML message returned by the custom component is recognized.

The Receive Side on BizTalk

When the components above are built and deployed, the receive side on BizTalk contains a schema (optional) and a receive pipeline that uses the custom component. The two properties mentioned above can be set in the configuration of the pipeline in the BizTalk Server Administration Console. When a large file is received, a .msg file is created in the configured location and a small XML message is passed through the MessageBox. This message can be picked up either by an orchestration or by a send pipeline component for further processing.

Related Articles

The article Transfer large files using BizTalk - Send side describes the send side on BizTalk and how to delete the large files in a scheduled task.

The article Transfer extremely large files using Windows Service and BizTalk Server describes how to transfer extremely large files (up to 2GB) using Windows Service and BizTalk.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)