Introduction
This article describes a recent assessment of a client's BizTalk environment and the changes that were applied to improve server performance.
Background
The environment ran BizTalk Server 2010 with a SQL Server 2008 database. The BizTalk server interfaced with several external systems using orchestrations and send ports. The orchestrations were exposed as web services and invoked by external systems. They called the external systems or web services and then returned the response to the calling system. The orchestrations also stored the request and response files in folders and logged the transactions in a SQL Server log table.
The main issues were general slowness in the production environment and application timeouts in some of the web service calls, which caused transactions to fail.
Strategy for Assessment and Steps for Improvement
The first step in assessing the production system was an analysis with Message Box Viewer. We ran the tool in production and analyzed the report it generated. The major findings were:
- The archive and purge job for the DTA tracking database had not run successfully, so the tracking database kept growing.
- More than 60,000 suspended messages had accumulated in production, which degraded performance.
- Tracking was enabled on several send ports and orchestrations, which reduced performance.
- XML Receive and XML Transmit pipelines were used in the receive location
- The SOAP adapter was used in the receive location rather than the WCF-* adapters
- Only a single host instance, BizTalkServerApplication, was used for executing orchestrations, send ports, tracking, etc.
Addressing the Findings
We made the following updates to address the findings.
Script to terminate suspended messages: https://dl.dropboxusercontent.com/u/23405583/terminate.vbs
dim objBtsWmiNS, objMsg, svcinsts, inst, msg, ndx, size, savemessages
Dim aryClassIDs()
Dim aryTypeIDs()
Dim aryInstanceIDs()
Dim aryHostNames()
Dim aryObjQueues()
Dim aryHostBatchSize()
Dim strKey2Instance
Dim strQuery2Msg
Dim daysOld
maxBatchSize = 200
On Error Resume Next
Dim objNamedArgs: Set objNamedArgs = WScript.Arguments.Named
If objNamedArgs.Count = 0 OR objNamedArgs.Count > 3 Then
    PrintUsage()
    wscript.quit 0
End If
If Not objNamedArgs.Exists("Operation") Then
    PrintUsage()
    wscript.quit 0
End If
' Pick the WMI query that matches the requested operation
wmiQuery = ""
If UCase(objNamedArgs("Operation")) = "Z" Then
    wmiQuery = "select * from MSBTS_serviceinstance where ServiceStatus=16"
End If
If UCase(objNamedArgs("Operation")) = "A" Then
    wmiQuery = "select * from MSBTS_serviceinstance where ServiceStatus=4 " & _
        "OR ServiceStatus=32 OR ServiceStatus=16 OR ErrorId='0xC0C01B4C' OR ServiceClass=64"
End If
If UCase(objNamedArgs("Operation")) = "SR" Then
    wmiQuery = "select * from MSBTS_serviceinstance where ServiceStatus=4"
End If
If UCase(objNamedArgs("Operation")) = "SNR" Then
    wmiQuery = "select * from MSBTS_serviceinstance where ServiceStatus=32"
End If
If UCase(objNamedArgs("Operation")) = "DIS" Then
    wmiQuery = "select * from MSBTS_serviceinstance where ServiceClass=32 AND ServiceStatus=8"
End If
If(wmiQuery = "") Then
    PrintUsage()
    wscript.quit 0
End If
' Validate the optional /NoSave and /DaysOld switches
argCount = 1
saveMessagesBeforeTermination = True
If objNamedArgs.Exists("NoSave") Then
    If UCase(objNamedArgs("NoSave")) = "TRUE" Then
        saveMessagesBeforeTermination = False
    ElseIf UCase(objNamedArgs("NoSave")) <> "FALSE" Then
        PrintUsage()
        wscript.quit 0
    End If
    argCount = argCount + 1
End If
daysOld = 0
If objNamedArgs.Exists("DaysOld") Then
    If IsNumeric(objNamedArgs.Item("DaysOld")) Then
        If CLng(objNamedArgs.Item("DaysOld")) = CDbl(objNamedArgs.Item("DaysOld")) Then
            daysOld = CLng(objNamedArgs.Item("DaysOld"))
        Else
            PrintUsage()
            wscript.quit 0
        End If
    Else
        PrintUsage()
        wscript.quit 0
    End If
    argCount = argCount + 1
End If
If objNamedArgs.Count <> argCount Then
    PrintUsage()
    wscript.quit 0
End If
wscript.echo "-->Connecting to BizTalk WMI namespace"
Set objBtsWmiNS = GetObject("WinMgmts:{impersonationLevel=impersonate, (security)}" & _
    "\\.\root\MicrosoftBizTalkServer")
If Err <> 0 Then
    CheckWMIError
    wscript.quit 0
End If
wscript.echo "-->Getting BizTalk host collection"
Set hosts = objBtsWmiNS.ExecQuery("select * from MSBTS_HostSetting")
If Err <> 0 Then
    CheckWMIError
    wscript.quit 0
End If
hostCount = hosts.count
ReDim aryHostNames(hostCount - 1)
ReDim aryObjQueues(hostCount - 1)
ReDim aryHostBatchSize(hostCount - 1)
wscript.echo "-->Retrieve BizTalk host names and loading host queues"
ndx = 0
For Each host in hosts
    wscript.echo "Found host " & host.Properties_("Name")
    aryHostNames(ndx) = host.Properties_("Name")
    Set aryObjQueues(ndx) = objBtsWmiNS.Get("MSBTS_HostQueue.HostName=""" _
        & aryHostNames(ndx) & """")
    If Err <> 0 Then
        CheckWMIError
        wscript.quit 0
    End If
    ndx = ndx + 1
Next
wscript.echo "-->Getting collection of service instances"
Set svcinsts = objBtsWmiNS.ExecQuery(wmiQuery)
ReDim aryClassIDs(hostCount, maxBatchSize-1)
ReDim aryTypeIDs(hostCount, maxBatchSize-1)
ReDim aryInstanceIDs(hostCount, maxBatchSize-1)
wscript.echo "-->Start iterating service instances"
totalCount = 0
saveMessages = saveMessagesBeforeTermination
For Each inst in svcinsts
    ' SuspendTime is a WMI datetime string that starts with yyyymmdd
    sSuspendDate = inst.Properties_("SuspendTime")
    sSuspendDay = Left(sSuspendDate,4) & "-" & _
        Mid(sSuspendDate, 5, 2) & "-" & Mid(sSuspendDate, 7, 2)
    dtSuspendDate = CDate(sSuspendDay)
    If DateDiff("d", dtSuspendDate, Date()) >= daysOld Then
        saveMessagesBeforeTermination = saveMessages
        wscript.echo "Found suspended instance """ & _
            inst.Properties_("ServiceName") & """ on host " & _
            inst.Properties_("HostName")
        ' Find the index of the host that owns this instance
        For hostIdx = 0 To hostCount-1
            If aryHostNames(hostIdx) = inst.Properties_("HostName") Then
                Exit For
            End If
        Next
        If 16 = inst.Properties_("ServiceClass") Then
            wscript.echo "Skipping BizTalk internal service instances " & _
                "(they cannot be terminated anyway)"
        Else
            If 64 = inst.Properties_("ServiceClass") _
                Or 16 = inst.Properties_("ServiceClass") Then
                saveMessagesBeforeTermination = False
            End If
            errorCountSavingMessages = 0
            If saveMessagesBeforeTermination Then
                strQuery2Msg = "select * from MSBTS_MessageInstance " & _
                    "where ServiceInstanceID=""" & _
                    inst.Properties_("InstanceId") & """"
                Set msgInsts = objBtsWmiNS.ExecQuery(strQuery2Msg)
                For Each msg in msgInsts
                    msg.SaveToFile "C:\Temp"
                    If Err <> 0 Then
                        CheckWMIError
                        wscript.echo "Failed to save MSBTS_MessageInstance"
                        wscript.echo Err.Description & Err.Number
                        errorCountSavingMessages = errorCountSavingMessages + 1
                    Else
                        wscript.echo "Saved message " & _
                            msg.Properties_("MessageInstanceID")
                    End If
                Next
            End If
            If 0 = errorCountSavingMessages Then
                aryClassIDs(hostIdx, aryHostBatchSize(hostIdx)) = inst.Properties_("ServiceClassId")
                aryTypeIDs(hostIdx, aryHostBatchSize(hostIdx)) = inst.Properties_("ServiceTypeId")
                aryInstanceIDs(hostIdx, aryHostBatchSize(hostIdx)) = inst.Properties_("InstanceId")
                aryHostBatchSize(hostIdx) = aryHostBatchSize(hostIdx) + 1
            Else
                wscript.echo "Skipping the instance since couldn't save its messages"
            End If
            totalCount = totalCount + 1
            ' Send a termination request as soon as a host's batch is full
            If(aryHostBatchSize(hostIdx) = maxBatchSize) Then
                TerminateAccumulatedInstacesForHost hostIdx
            End If
        End If
    End If
Next
' Terminate whatever is left in each host's partially filled batch
For hostIdx = 0 To hostCount-1
    If aryHostBatchSize(hostIdx) > 0 Then
        TerminateAccumulatedInstacesForHost hostIdx
    End If
Next
wscript.echo "SUCCESS> " & totalCount & _
    " instances were found and attempted to be terminated"
Sub TerminateAccumulatedInstacesForHost(hostIdx)
    wscript.echo "Sending termination request for host " _
        & aryHostNames(hostIdx) & " service instances"
    Dim aryClassIDs4Host()
    Dim aryTypeIDs4Host()
    Dim aryInstanceIDs4Host()
    ReDim aryClassIDs4Host(aryHostBatchSize(hostIdx)-1)
    ReDim aryTypeIDs4Host(aryHostBatchSize(hostIdx)-1)
    ReDim aryInstanceIDs4Host(aryHostBatchSize(hostIdx)-1)
    For i = 0 to aryHostBatchSize(hostIdx)-1
        aryClassIDs4Host(i) = aryClassIDs(hostIdx, i)
        aryTypeIDs4Host(i) = aryTypeIDs(hostIdx, i)
        aryInstanceIDs4Host(i) = aryInstanceIDs(hostIdx, i)
    Next
    aryObjQueues(hostIdx).TerminateServiceInstancesByID aryClassIDs4Host, _
        aryTypeIDs4Host, aryInstanceIDs4Host
    CheckWMIError
    aryHostBatchSize(hostIdx) = 0
End Sub
Sub CheckWMIError()
    If Err <> 0 Then
        On Error Resume Next
        Dim strErrDesc: strErrDesc = Err.Description
        Dim ErrNum: ErrNum = Err.Number
        Dim WMIError : Set WMIError = CreateObject("WbemScripting.SwbemLastError")
        If (TypeName(WMIError) = "Empty" ) Then
            wscript.echo strErrDesc & " (HRESULT: " & Hex(ErrNum) & ")."
        Else
            wscript.echo WMIError.Description & "(HRESULT: " & Hex(ErrNum) & ")."
            Set WMIError = nothing
        End If
    End If
End Sub
Sub PrintUsage()
    wscript.echo "Usage:"
    wscript.echo "cscript Terminate.vbs < /Operation:Z | " & _
        "A | DIS | SR | SNR > [/NoSave:true | false] [/DaysOld:n]"
    wscript.echo
    wscript.echo " Z terminates all ""Zombie"" " & _
        "instances (e.g. completed with discarded messages)"
    wscript.echo " A terminates all suspended and zombie instances " & _
        "as well as all routing failure reports"
    wscript.echo " SR terminates suspended resumable instances only"
    wscript.echo " SNR terminates suspended non-resumable instances only"
    wscript.echo " DIS terminates all dehydrated 'isolated adapter' instances"
    wscript.echo " NoSave:true terminates instances without saving messages " & _
        "they reference. Default is false"
    wscript.echo " DaysOld:n specifies number of days back to terminate. " & _
        "ie: ""/DaysOld:2"" terminates instances suspended more than 2 days ago."
    wscript.echo
    wscript.echo " Default action is to save instances to the C:\Temp folder on the local computer"
    wscript.echo
    wscript.echo " Ensure that the C:\Temp folder exists before " & _
        "running terminate if you want to save instances"
    wscript.echo
    wscript.echo " Example: cscript Terminate.vbs /Operation:z /NoSave:true /DaysOld:7"
    wscript.echo
End Sub
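The two ideas at the core of the script, filtering instances by how long ago they were suspended and sending termination requests in per-host batches of at most 200, can be tried out without a BizTalk server. The following Python sketch is only an illustration of that logic, not a replacement for the script: the tuple layout, the function names, and the sample IDs are assumptions made for this example, and it flushes leftover batches in first-seen host order rather than in the order of the host collection.

```python
from datetime import date

MAX_BATCH_SIZE = 200  # same limit as maxBatchSize in Terminate.vbs


def suspend_day(suspend_time):
    """Extract the calendar day from a WMI datetime string (yyyymmddHHMMSS...)."""
    return date(int(suspend_time[0:4]), int(suspend_time[4:6]), int(suspend_time[6:8]))


def is_old_enough(suspend_time, days_old, today):
    """Mirror the script's test: DateDiff("d", suspended, today) >= daysOld."""
    return (today - suspend_day(suspend_time)).days >= days_old


def batch_by_host(instances, days_old, today, max_batch=MAX_BATCH_SIZE):
    """Group qualifying instance IDs per host and cut each group into batches.

    `instances` is an iterable of (host_name, instance_id, suspend_time)
    tuples, a simplified stand-in for the MSBTS_ServiceInstance objects.
    Returns a list of (host_name, [instance_id, ...]) in flush order.
    """
    pending = {}  # host name -> IDs collected since the last flush
    batches = []
    for host, instance_id, suspend_time in instances:
        if not is_old_enough(suspend_time, days_old, today):
            continue
        pending.setdefault(host, []).append(instance_id)
        # A full batch is sent immediately, as in the script's main loop.
        if len(pending[host]) == max_batch:
            batches.append((host, pending.pop(host)))
    # Flush the partially filled batches that remain after the loop.
    for host, ids in pending.items():
        batches.append((host, ids))
    return batches


# Hypothetical sample data: two instances are older than 7 days, one is not.
sample = [
    ("BizTalkServerApplication", "id-1", "20140101120000.000000+000"),
    ("BizTalkServerApplication", "id-2", "20140109120000.000000+000"),
    ("BizTalkSend", "id-3", "20140102120000.000000+000"),
]
print(batch_by_host(sample, days_old=7, today=date(2014, 1, 10)))
# [('BizTalkServerApplication', ['id-1']), ('BizTalkSend', ['id-3'])]
```

Batching keeps each termination request to a bounded size, which matters when tens of thousands of suspended instances have built up.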
- Created a dedicated tracking host and a dedicated send host. Since all the applications used BizTalkServerApplication, it is advisable to move some of the send ports to the BizTalkSend host.
- Created a 64-bit host and application pool and used them for some of the receive locations of the applications that were facing performance issues.
- Disabled orchestration start and end shape tracking, along with other tracking, in production. Since we log the transactions in a separate table and in files, tracking is not critical for us.
- Got the DTA Purge and Archive job running successfully. It had previously failed because there was not enough disk space to save the backup of the DTA database. We freed up space and ran the job again, and it completed successfully.
- Cleared the suspended messages that were more than about a week old. For this we used the Terminate.vbs script shown in this section, which can clear suspended messages older than a specified number of days. It can be run on a schedule to clear messages that are older than, say, 7 days.
- By default, BizTalk opens at most 2 concurrent outgoing connections per destination. We therefore increased the limit selectively for some of the web service calls by changing the BTSNTSvc.exe.config file as follows. All other destinations were kept at the default of 2 connections.
<system.net>
<connectionManagement>
<add address = "http://<external server>" maxconnection = "12" />
<add address = "*" maxconnection = "2" />
</connectionManagement>
</system.net>
Note: Do not set maxconnection to a very large value; keep it at around 10-15. With a large value, when the BizTalk Server is restarted it will, once it is back up, send all the queued or pending requests to the destination server concurrently. This may bring the external server down, make it queue up the requests, or cause timeouts. To be on the safe side, stay within 10-15.
- Change the pipelines to use PassThruReceive and PassThruTransmit in some of the places where XML Receive/Transmit were used.
- The following updates were planned for the installed products:
  - BizTalk Cumulative Update 8 is not installed on the server.
  - BizTalk Adapter Pack 2010 CU3 is not installed on the server.
  - The SQL Server 2008 R2 installation on the DB server is below SP2 and is to be updated to SP2.
- Replace the SOAP and SQL adapters with WCF-based adapters.
The last three items (the pipeline change, the product updates, and the adapter replacement) have not been completed yet. Even so, after completing the first six we saw improved performance and a drastic reduction in timeout issues.
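The advice to keep maxconnection within 10-15 is easy to lose track of once several destinations have their own entries. As a rough aid, the Python sketch below reads a connectionManagement section and lists the addresses whose limit exceeds a ceiling. It is only an illustration under stated assumptions: the function names and the ceiling constant are made up for this example, and partner.example.com and bulk.example.com are hypothetical hosts standing in for the external servers.

```python
import xml.etree.ElementTree as ET

SUGGESTED_CEILING = 15  # upper end of the 10-15 range suggested in the note


def connection_limits(config_xml):
    """Return {address: maxconnection} for every connectionManagement entry."""
    root = ET.fromstring(config_xml.strip())
    limits = {}
    # Only look inside connectionManagement, so that unrelated <add>
    # elements elsewhere in a full config file are ignored.
    for section in root.iter("connectionManagement"):
        for entry in section.findall("add"):
            limits[entry.get("address")] = int(entry.get("maxconnection"))
    return limits


def oversized_limits(config_xml, ceiling=SUGGESTED_CEILING):
    """List, in sorted order, the addresses whose limit exceeds the ceiling."""
    limits = connection_limits(config_xml)
    return sorted(addr for addr, count in limits.items() if count > ceiling)


# Hypothetical configuration fragment used only for this example.
SAMPLE = """
<system.net>
  <connectionManagement>
    <add address="http://partner.example.com" maxconnection="12" />
    <add address="http://bulk.example.com" maxconnection="100" />
    <add address="*" maxconnection="2" />
  </connectionManagement>
</system.net>
"""

print(connection_limits(SAMPLE))
print(oversized_limits(SAMPLE))  # ['http://bulk.example.com']
```

A check like this could be run against the configuration file before a deployment, so that an overly generous limit is noticed before a restart floods the destination server.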