Turn off tracking globally in BizTalk Server 2006

The BizTalk Tracking (BizTalkDTADb) database grows in size as BizTalk Server processes data on your system. If the size of the BizTalk Tracking database causes poor disk performance or fills up the disk subsystem, you can manually purge the data from the Tracking database. See my previous post about how to purge and maintain the BizTalkDTADb database.
If you repeatedly have issues with the BizTalk tracking database, you may want to configure BizTalk to no longer collect tracking information. This is possible by turning off global tracking for the whole BizTalk Server.

Here is the procedure to turn off tracking globally for BizTalk Server 2006 and 2006 R2:

  • Open SQL Server Management Studio and connect to the database server hosting the BizTalk Management Database (BizTalkMgmtDb).
  • Expand the BizTalkMgmtDb database, expand Tables, right-click the adm_Group table, and then click Open Table.
  • In the GlobalTrackingOption column, change the value from 1 to 0 and then press ENTER. A value of 0 turns off global tracking for the whole BizTalk Server while a value of 1 turns it on. The same change can also be scripted, as shown after this list.
  • Restart all your BizTalk hosts for the change to take effect.
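
If you prefer to script the change instead of editing the table by hand, the equivalent T-SQL is a one-line update (a minimal sketch, assuming the default Management Database name):

USE BizTalkMgmtDb;
UPDATE dbo.adm_Group SET GlobalTrackingOption = 0;  -- 0 = global tracking off, 1 = on

As with the manual edit, the BizTalk hosts must be restarted for the change to take effect.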

When tracking is turned off, tracking information is no longer collected, so no information will be available in HAT (Health and Activity Tracking) anymore. You should weigh this side effect against the alternative of simply keeping tracking data for a shorter time before deciding to turn tracking off globally.

An alternative to turning off tracking globally is to turn it off on an application-by-application basis.
I think that once a BizTalk application has been deployed and running smoothly for a while, there is no reason to keep tracking turned on at all times.
If you regularly deploy new BizTalk applications or keep updating existing ones on your production server, you will probably want to be able to consult HAT right after an application is deployed. For that to be possible, you have to keep global tracking on.
In that case, I think a good practice is to keep tracking turned on for the freshly deployed application for a few days, until you deem it to be running fine and you no longer need to consult HAT regularly. At that point, you can use the BizTalk Server Administration Console to disable tracking for all the artifacts belonging to the application (orchestrations, ports, etc.). Should you need to turn it on again, it only takes a few minutes to configure tracking back on for the application's artifacts.

A reference can be found in the MSDN BizTalk documentation: How to Turn Off Global Tracking.

BizTalk Messaging architecture

The core of the BizTalk Server product is the BizTalk messaging subsystem, called the Message Bus. As stated in the BizTalk documentation, the Message Bus follows a publisher/subscriber model: messages are published into the BizTalk Message Box database, and the Message Bus queries them looking for messages that match a particular subscription.

The most important point to understand about the publisher/subscriber model is that the publishing and subscription concepts are relative to the database: messages are published into the Message Box and delivered from it to subscribers.

Here is a picture illustrating the concept:

[Image: BizTalk Server publisher/subscriber architecture]
The messaging infrastructure is composed of the Message Box database and of several software components, called the messaging components. Together, the database and the components make up the publisher/subscriber BizTalk messaging subsystem, the Message Bus.

An important side effect of this architecture is that in BizTalk Server, messages are immutable once published (in the Message Box database). This is because more than one endpoint can subscribe to the same message. Were messages mutable, some endpoints might no longer match the subscription rules after the message had been modified. Having the subscription query result vary over time for the same message (because its payload changed) would break the publisher/subscriber architecture, making the whole BizTalk product unpredictable (and therefore useless).

1. The Message Box.

The Message Box is a SQL Server database which stores XML messages as well as metadata related to each message. The message's metadata is called the message context. Each metadata item (a key/value pair) of the message context is called a context property. The most important thing to know about the message context is that it holds all the information necessary for message routing, that is, for subscription matching.

2. Messaging Components.

While the Message Box database is the message storage facility of the Message Bus, the messaging components are software components that actually move messages between publishers and subscribers. They receive and send messages in and out of the BizTalk Server system.

2.1 Host Services.

A BizTalk host is a logical container. It provides the ability to structure a BizTalk application into groups that can be spread across multiple processes or machines.
When you create a host in BizTalk Server, you create a logical unit in which you can run different BizTalk applications or different types of BizTalk artifacts.
For example, if your BizTalk applications are fairly small, you could create one host per BizTalk application you develop. If, on the contrary, your applications are big, you can create different hosts to separate logical groupings within an application, such as adapters, orchestrations, ports and so on. If each host runs on a separate physical machine, this helps balance the load between machines (a sort of manual load balancing).

A host instance is simply a running instance of the host's logical grouping. It runs as a Windows service, each host instance being a separate Windows process: a separate instance of BTSNTSvc.exe (or BTSNTSvc64.exe for 64-bit BizTalk Server).

As explained before, the host's raison d'être is to provide logical grouping units. A host instance does not itself implement the BizTalk runtime; it is a container in which the BizTalk subservices run. Together, the subservices running inside the host instances implement the actual runtime of the BizTalk Message Bus.

Host instances can run all BizTalk subservices or only some of them, depending on what type of BizTalk artifacts they are running. To understand which subservice is used by which type of artifact, here is a list of the different subservices running inside a BizTalk host instance – note that the list of subservices can be found in the adm_HostInstance_SubServices table in the Management Database:

  • Caching – Service used to cache information that is loaded into the host. Examples of cached information are loaded assemblies, adapter configuration information, custom configuration information, etc.
  • End Point Manager (EPM) – Go-between for the Message Agent and the Adapter Framework. The EPM hosts send/receive ports and is responsible for executing pipelines and BizTalk transformations. The Message Agent is responsible for finding messages that match subscriptions and routing them to the EPM.
  • Tracking – Service that moves tracking information from the Message Box to the Tracking database.
  • XLANG/s – Host engine for BizTalk Server orchestrations.
  • MSMQT – MSMQT adapter service; serves as a replacement for the MSMQ protocol when interacting with BizTalk Server. The MSMQT protocol has been deprecated in BizTalk Server 2006 and should only be used to resolve backward-compatibility issues.
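
If you want to check which subservices are configured for the host instances of your own group, you can query that table directly. A minimal sketch, assuming the default Management Database name:

USE BizTalkMgmtDb;
SELECT * FROM dbo.adm_HostInstance_SubServices;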

2.2 Subscriptions.

In a publish/subscribe design, you have three components:

  • Publishers
  • Subscribers
  • Events

Publishers include:

  • Receive ports that publish messages that arrive in their receive locations
  • Orchestrations that publish messages when sending messages (orchestration send shape)
  • Orchestrations that start another orchestration asynchronously (Start Orchestration shape). On a side note, the Call Orchestration shape does not publish the message into the Message Box; the message is just passed as a parameter.
  • Solicit/response send ports, which publish messages when they receive a response from the target application or transport.

Subscriptions:

Subscription is the mechanism by which ports and orchestrations are able to receive and send messages within BizTalk server (see picture above).

A subscription is a collection of comparison statements, known as predicates, comparing the values of message context properties with values specific to the subscription.

There are two types of subscriptions: activation and instance.

An activation subscription is one specifying that a message fulfilling a subscription should create a new instance of the subscriber when it is received. Examples of things that create activation subscriptions include:

  • Send ports with filters
  • Send ports that are bound to orchestrations
  • Orchestration receive shapes that have their Activate property set to True.

An instance subscription indicates that messages fulfilling the subscription should be routed to an already-running instance of the subscriber. Examples of things that create instance subscriptions are:

  • Orchestrations with correlated receives.
  • Request/response-style ports waiting for a response.

It is also important to know that when you define filter criteria on a send port, you are actually modifying the subscription of the port. As a reminder, filter expressions determine which messages are routed to the send port from the Message Box.

Enlisting:

The process of enlisting a port simply means that a subscription is written for that port in the Message Box. Consequently, un-enlisted ports do not have subscriptions in the Message Box.
The same is true for other BizTalk artifacts. An un-enlisted orchestration is an orchestration ready to process messages but with no way to receive messages from the messaging engine, as no subscription has been created for it yet.

The difference between an un-enlisted artifact and a stopped artifact is that for ports and orchestrations that are enlisted but not started, messages with matching subscription information are queued within the Message Box, ready to be processed once the artifact is started. If the port or orchestration is not enlisted, message routing fails since no subscription is available, and the message produces a “No matching subscriptions were found for the incoming message” error in the Windows event log.

Typical port usage with an orchestration:

When an orchestration has a send shape connected to a logical port which is in turn bound to a physical port, the message sent by the send shape has a TransportID context property set to a value matching the physical port's TransportID. As the TransportID uniquely identifies the port, this mechanism ensures that the physical port always receives the messages coming from the orchestration. It does not mean that only that port receives the message: due to the nature of a publisher/subscriber architecture, any other port with a subscription matching the message context also receives it.

2.3 Messages

As said earlier, a message is more than just an XML document: it contains both data and context. To be more precise, a message is composed of context properties and zero or more message parts.

Keep in mind that message parts are not always XML documents. If the message is received through a port using the pass-through pipeline, it can be any kind of data, including binary data. On a side note, a pass-through pipeline does not promote context properties; this makes sense since the message is not even assumed to be XML in a pass-through pipeline, so it is not possible to evaluate XPath expressions against the message to determine property values.

As said earlier, a message is immutable once it is published. This means that once stored in the Message Box database, it cannot be changed. A message can nevertheless be changed before it enters or after it leaves the database: in a receive pipeline component, a message can be modified before it is published to the Message Box; in a send pipeline component, a message can be modified after it has been retrieved from the Message Box. An orchestration is also a typical place to create or modify a message.

2.4 Message Context Properties

Message context properties are used by the subscription mechanism (routing the message to its appropriate endpoint). They are defined in a property schema. At runtime, the property values are stored in a context property bag.

The property schema is associated with the message schema within BizTalk so that every inbound schema-based message has a schema and a property schema attached to it.

The property schemas consist of a global property schema, which every message can use by default, and of optional custom property schemas, which can be created to define application-specific properties. Both types of properties are essentially the same at runtime and both are stored in the context property bag.

So, both types of properties can be used by the subscription mechanism to evaluate which endpoints have a subscription matching the message. The most common subscription is based on a global property called the MessageType, which is a combination of the XML namespace of the message and its root node name, separated by a # character. Example: http://www.abc.com#RootElementName.

Using subscriptions to route documents to the proper endpoint is called Content Based Routing (CBR).
For information, if the message is not schema-based, there is no MessageType property value; such is the case for binary data messages.

Message context properties are populated by the BizTalk runtime in two places:

  • The adapter writes and promotes into the message context properties related to the receive location, the adapter type, and other adapter-specific information.
  • The receive pipeline can write and promote properties into the message context in any of its pipeline components. Disassembling components are of particular interest because they promote the MessageType property, which is commonly used for Content Based Routing.

Property bag.

It is possible to use the BizTalk API in pipeline component code to read and write context properties from the property bag. The property bag is an object implementing the IBasePropertyBag interface. If you intend to use that interface in a custom pipeline to write properties that will be used for routing, keep in mind that properties that are simply written into the property bag using the Write() method are not available for routing. To make a property available for routing, you need to promote it with a different API call, the Promote() method. This method writes the property and its value into the property bag but ALSO flags the property as promoted, thereby making it available for routing.
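
To illustrate the difference, here is a minimal sketch of the Execute method of a custom pipeline component; the property name and namespace used below are hypothetical and only serve the example:

using Microsoft.BizTalk.Component.Interop;
using Microsoft.BizTalk.Message.Interop;

public class PromotionSampleComponent // IComponent plumbing omitted for brevity
{
    public IBaseMessage Execute(IPipelineContext pContext, IBaseMessage inmsg)
    {
        // The message context implements IBasePropertyBag (plus promotion support).
        IBaseMessageContext context = inmsg.Context;

        // Read a system property, e.g. the message type promoted by a disassembler.
        object msgType = context.Read("MessageType",
            "http://schemas.microsoft.com/BizTalk/2003/system-properties");

        // Write(): the property is stored in the context but is NOT available for routing.
        context.Write("MyProperty", "http://MyCompany.PropertySchema", "SomeValue");

        // Promote(): the property is stored AND flagged as promoted, so the subscription
        // engine can use it for routing (it must exist in a deployed property schema).
        context.Promote("MyProperty", "http://MyCompany.PropertySchema", "SomeValue");

        return inmsg;
    }
}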

BizTalkDTADb grows too large – How to purge and maintain the database?

The BizTalkDTADb is a BizTalk database that stores health monitoring data tracked by the BizTalk Server tracking engine. It is commonly called the “BizTalk Tracking database”. This database can grow relatively quickly in size depending on the kind of load your server is under.

I will first explain what should be done to keep the database healthy (by which I mean keeping it at a reasonable size), and then how to clean it up if it has grown so large that the normal cleanup method no longer works.

1. How to maintain the BizTalkDTADb?

Each running BizTalk service instance processes data, and while the data is processed, BizTalk tracks it and saves it in the BizTalk Tracking database. This means that the Tracking database grows indefinitely over time, which is obviously not a viable option.
A SQL job called “DTA Purge and Archive (BizTalkDTADb)” is installed on the BizTalk SQL Server and is used to clean up the BizTalkDTADb (deleting old tracking information). That job is not enabled by default, so the first thing to do after installing BizTalk Server is to configure and enable it. See here for information about how the cleanup process works and here for information on how to configure the SQL job. Basically, the job calls a single stored procedure on the BizTalkDTADb and, once edited, should look like the following:

exec [dbo].[dtasp_BackupAndPurgeTrackingDatabase] 1, 0, 1, '\\MyBizTalkServer\backup', null, 0

The first four parameters are the ones you need to know about. The first two are the number of hours and days after which completed instances are cleaned up. The third is the number of days after which even incomplete instances are cleaned up. The fourth is the location of the backup folder.
This means that the SQL job backs up the BizTalkDTADb each time it runs, so the backup files will fill up your disk subsystem pretty quickly if nothing is done about it! Backups are important in case the database crashes and the Tracking database needs to be restored.

If you do not consider the Tracking database important enough to be backed up, and do not want the extra burden of managing the backups, you can modify the “DTA Purge and Archive (BizTalkDTADb)” SQL job as explained here. This way, the job only purges the tracking database without backing it up. This is especially applicable to development and QA environments, and might also apply to your production environment.
In short, the only change needed in the SQL job is to modify the T-SQL statement it runs: it needs to execute the stored procedure dtasp_PurgeTrackingDatabase instead of dtasp_BackupAndPurgeTrackingDatabase.

The final T-SQL statement executed by the SQL job will be similar to the following:

declare @now as datetime
set @now = GetUTCDate()
-- parameters: hours and days to keep completed instances, hard-delete days for all instances, date of the last backup
exec [dbo].[dtasp_PurgeTrackingDatabase] 0, 3, 6, @now

In this case I keep completed instances in the Tracking DB for 3 days and incomplete ones for 6 days; everything older is purged. As you can see, there is no backup location path to specify since no database backup is performed.

Instead of modifying the original SQL job, you can alternatively disable it and create a new job with the appropriate T-SQL call. That is how I do it myself and I consider it a best practice.
Moreover, I schedule the job to run every 5 minutes, which has proven to be a good interval. I used to run the job only every 30 minutes, but I encountered cases where the cleanup procedure did not keep up with the amount of tracked data and I ended up with a huge tracking database that I had to purge manually, as I will explain next.
So, from my experience, a 5-minute interval for running the job also seems to be a best practice.
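
For reference, here is a minimal sketch of how such a dedicated job could be created with the msdb stored procedures; the job and schedule names are only examples, and the retention values should be adjusted to your needs:

EXEC msdb.dbo.sp_add_job
    @job_name = N'Custom DTA Purge (BizTalkDTADb)';

EXEC msdb.dbo.sp_add_jobstep
    @job_name = N'Custom DTA Purge (BizTalkDTADb)',
    @step_name = N'Purge tracking data',
    @subsystem = N'TSQL',
    @database_name = N'BizTalkDTADb',
    @command = N'declare @now datetime; set @now = GetUTCDate();
                 exec dbo.dtasp_PurgeTrackingDatabase 0, 3, 6, @now;';

EXEC msdb.dbo.sp_add_schedule
    @schedule_name = N'Every 5 minutes',
    @freq_type = 4,               -- daily
    @freq_interval = 1,
    @freq_subday_type = 4,        -- unit = minutes
    @freq_subday_interval = 5;    -- every 5 minutes

EXEC msdb.dbo.sp_attach_schedule
    @job_name = N'Custom DTA Purge (BizTalkDTADb)',
    @schedule_name = N'Every 5 minutes';

EXEC msdb.dbo.sp_add_jobserver
    @job_name = N'Custom DTA Purge (BizTalkDTADb)';  -- target the local server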

2. How to manually purge the BizTalkDTADb?

You will have to manually purge the BizTalkDTADb database if it has grown too large, either because the cleanup job was never enabled or because the cleanup procedure could not keep up with the amount of data saved in the Tracking database.
This is explained in detail here, but in short, the important points are:

– All the services used by BizTalk need to be stopped. This means all the BizTalk host instances, Enterprise SSO, the BizTalk Rules Engine, the EDI service, BAM, BAS and IIS, if they are used.

– Open Microsoft SQL Server Management Studio and run the following SQL statement on the BizTalkDTADb: exec dtasp_PurgeAllCompletedTrackingData

Once the procedure is executed, a lot of space will have been freed in the Tracking database. The database will nevertheless still occupy the same amount of space on the disk subsystem, because deleting data in a database does not reduce the size the database files take on disk. If you want to reduce its size on disk, you need to shrink it. You can do that in two ways:

1. Through SQL Server Management Studio: right-click the BizTalkDTADb database, then click Tasks > Shrink > Database

[Image: How to shrink the BizTalkDTADb database using SQL Server Management Studio]

2. Through T-SQL using the DBCC SHRINKDATABASE command:
DBCC SHRINKDATABASE (BizTalkDTADb);
The reference of the T-SQL DBCC SHRINKDATABASE command can be found here.

Another useful trick is truncating the log file (which should not be done on production as it “breaks” the backup chain). See some information about it here: Truncating the Database Log File.
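
For example, on SQL Server 2005 (the version typically used with BizTalk Server 2006), a minimal sketch would look like the following; the logical log file name is an assumption (check it with sp_helpfile first), and again, do not do this on a production server:

USE BizTalkDTADb;
EXEC sp_helpfile;                             -- confirm the logical name of the log file used below
BACKUP LOG BizTalkDTADb WITH TRUNCATE_ONLY;   -- discards the log content and breaks the log backup chain
DBCC SHRINKFILE (BizTalkDTADb_log, 100);      -- shrink the log file to roughly 100 MB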

Custom Pipeline Component deployment gotcha.

When you create a custom pipeline, you might want to compile and install the custom pipeline component into the GAC before adding the component to the toolbox. That way, the custom pipeline solution references the custom pipeline component from the GAC instead of from its location on disk.
The reason is that if you install the pipeline component library on your production servers only in the GAC, and you did not GAC the pipeline component before using it in the custom pipeline, an exception is raised saying that the custom pipeline cannot be found (because the custom pipeline refers to a custom pipeline component on the local drive instead of in the GAC). See a full explanation of the problem on Stephen W. Thomas' blog.

Note that the BizTalk documentation on Deploying Pipeline Components advises deploying the assembly both in the GAC and in the <installation directory>\Pipeline Components folder. I would advise following this approach: it then does not matter how you or your developers create and compile the pipelines and their components; the code will always work when moved to the production environment, which avoids unnecessary stress should something go wrong because one developer referenced his component in a different way.

EDIT for BizTalk Server 2006 R2:
I just noticed, after publishing this post, that the documentation has changed for BizTalk Server 2006 R2 and that it now says:

All the .NET pipeline component assemblies (native and custom) must be located in the \Pipeline Components folder to be executed by the server.

and:

You do not need to add a custom pipeline component to be used by the BizTalk Runtime to the Global Assembly Cache (GAC).

So now, Microsoft officially advises putting the custom pipeline component libraries only in the \Pipeline Components folder and no longer in the GAC. If you still have BizTalk Server 2006 installed on your machine like I do, you will see that the local BizTalk documentation still reads as mentioned in my original post.

BizTalk Server 2006 Custom Functoid Documentation Mistake

When you develop a custom functoid, there are 5 important tasks to do:

  1. Create a resource file with the various resources such as functoid’s name, description, tooltip and icon.
  2. Create the class that will implement the functoid. This class must derive from the Microsoft.BizTalk.BaseFunctoids.BaseFunctoid class. The constructor must call the base class constructor and a number of methods and properties so that the custom functoid runs properly at run time and integrates well into the Visual Studio mapper.
  3. Create a member method in the functoid class that will actually implement the functionality of the custom functoid (its business logic).
  4. Compile and sign the functoid project into an assembly with a strong name key file (so that the assembly containing the custom functoid can be deployed in the GAC).
  5. Copy the assembly to C:\Program Files\Microsoft BizTalk Server 2006\Developer Tools\Mapper Extensions and add the functoid in Visual Studio’s ToolBox. You must also install the assembly into the GAC so that the functoid is available to BizTalk at runtime.

Today I found out that the BizTalk Server 2006 documentation has a mistake in the custom functoid development section, Using BaseFunctoid, which relates to point number 2 in my list of tasks above.
That section says that the constructor of the functoid must make a call to the SetupResourceAssembly method to point to the resource assembly. It details:

Include a resource file with your project. If building with Visual Studio, the resource assembly must be ProjectName.ResourceName.

In my experience this is not true: when I build my custom functoid assembly in Visual Studio, I have to call the SetupResourceAssembly method with FunctoidNameSpace.ResourceName as the parameter instead of ProjectName.ResourceName. If I follow the documentation's advice, the resource information such as the functoid's name and icon does not appear in the Visual Studio toolbox, showing that the IDE could not find the resources.

The only case where it works as described in the documentation is when the project name and the namespace of the functoid are identical, which happens by default in Visual Studio. Indeed, when you create a new project with Visual Studio 2005, a default namespace is created for the project, its value being the project name. You can see the default namespace value by right-clicking your project and clicking “Properties”.
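
To make the corrected call concrete, here is a minimal sketch of such a functoid class; the namespace, resource names and functoid ID are hypothetical and only meant to show the namespace-qualified parameter passed to SetupResourceAssembly:

using System.Reflection;
using Microsoft.BizTalk.BaseFunctoids;

namespace MyCompany.Functoids                        // the functoid namespace
{
    public class UpperCaseFunctoid : BaseFunctoid
    {
        public UpperCaseFunctoid() : base()
        {
            this.ID = 6001;                          // custom functoid IDs start at 6000

            // Namespace-qualified resource name (FunctoidNameSpace.ResourceName),
            // not ProjectName.ResourceName as the 2006 documentation states.
            SetupResourceAssembly("MyCompany.Functoids.FunctoidResources",
                                  Assembly.GetExecutingAssembly());

            SetName("IDS_UPPERCASE_NAME");           // resource IDs defined in the .resx file
            SetTooltip("IDS_UPPERCASE_TOOLTIP");
            SetDescription("IDS_UPPERCASE_DESCRIPTION");
            SetBitmap("IDB_UPPERCASE_BITMAP");

            this.Category = FunctoidCategory.String;
            SetMinParams(1);
            SetMaxParams(1);
            AddInputConnectionType(ConnectionType.AllExceptRecord);
            this.OutputConnectionType = ConnectionType.AllExceptRecord;

            // Points the mapper to the method implementing the functoid's logic (task 3).
            SetExternalFunctionName(GetType().Assembly.FullName,
                                    "MyCompany.Functoids.UpperCaseFunctoid",
                                    "ToUpper");
        }

        // The business logic of the functoid.
        public string ToUpper(string value)
        {
            return value == null ? string.Empty : value.ToUpper();
        }
    }
}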