BizTalk Server Pipeline Technology Explained

In this post I will introduce the pipeline technology in BizTalk Server.

1. What is a BizTalk Server Pipeline?

As an integration platform, BizTalk Server will almost certainly have to communicate with systems that use various document formats. Because the BizTalk Server 2006 engine works with XML documents internally, it must provide a way to convert other formats to and from XML. Other services may also be required when converting formats, such as authentication of the sender of a message, encryption or decryption, property promotion and so on.
To handle these tasks in a modular and customizable way, a pipeline is constructed from some number of stages; each stage contains one or more .NET or COM (Component Object Model) components. Each component handles a particular part of message processing. It is because all these components are run in sequence that the whole process is called a pipeline.
A stage is a container of components; each stage is itself a component with metadata. Unlike pipeline components, stages contain no execution code.

The BizTalk Server 2006 engine provides several standard components that address the most common cases. If these aren’t sufficient, developers can also create custom components for both receive and send pipelines.

To summarize, pipelines enable the developer to define a series of transformations that will be performed on a message as it is being received or sent.

Message processing workflow:

[Figure: BizTalk message processing workflow]
The message is passed from the adapter to the receive pipeline where it is transformed to XML. The message can then be used by orchestrations, or passed to a send pipeline, and then to a send adapter.

As seen in the picture above, there are 2 types of pipelines:
– Receive pipeline: prepares a message for processing by the server after it has been received by an adapter.
– Send pipeline: prepares a message for sending to an adapter after it has been processed by the server.

Execution mode

Each pipeline stage can have its own Execution Mode setting, so different stages within a pipeline can have different execution modes.

When the Execution Mode property is set to All, all the components within the stage are run in the configured sequence. A run-time error occurs if any component of the stage encounters an error while processing a message. Use this mode when several components must be run to complete a logical task.

When the Execution Mode property is set to FirstMatch, only the first component that recognizes the message is run. If no components in the stage recognize the message, a run-time error results. Use this mode when the pipeline stage receives messages in several formats.

In the standard pipelines delivered with BizTalk Server 2006, every stage has its execution mode set to All, except for the Disassemble stage of the receive pipeline, which is set to FirstMatch.
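The difference between the two execution modes can be sketched in a few lines. This is only a conceptual model written in Python, not BizTalk code: the `Component` class, the `run_stage` function and the two sample disassemblers are hypothetical names made up for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Component:
    """Hypothetical stand-in for a pipeline component."""
    name: str
    probe: Callable[[str], bool]   # does this component recognize the message?
    execute: Callable[[str], str]  # transform the message


def run_stage(mode: str, components: List[Component], message: str) -> str:
    """Run one stage in 'All' or 'FirstMatch' mode."""
    if not components:
        return message  # an empty stage passes the message through
    if mode == "All":
        # Every component runs, in the configured order.
        for component in components:
            message = component.execute(message)
        return message
    if mode == "FirstMatch":
        # Only the first component that recognizes the message runs.
        for component in components:
            if component.probe(message):
                return component.execute(message)
        raise RuntimeError("no component recognized the message")
    raise ValueError(f"unknown execution mode: {mode}")


# Two made-up disassemblers: one recognizes XML, the other comma-separated text.
xml_dasm = Component("xml", lambda m: m.lstrip().startswith("<"),
                     lambda m: m.strip())
csv_dasm = Component("flat-file", lambda m: "," in m,
                     lambda m: "<row>" + m.strip().replace(",", "|") + "</row>")
```

With `run_stage("FirstMatch", [xml_dasm, csv_dasm], "1,2")` only the flat-file component runs, while a message that neither component recognizes raises an error, mirroring the run-time error described above.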

2. Receive Pipeline.

The receive pipeline is the pipeline executed by a receive port; it runs after an adapter has read the message from a physical location through a particular protocol (FTP, HTTP, MSMQT …).
The receive pipeline takes the initial message, performs some transformations, and disassembles the raw data into zero, one, or multiple messages which are then processed by the BizTalk server engine.

Details of tasks executed by a Receive Pipeline:

[Figure: BizTalk receive pipeline]
As shown, there are 4 stages executed by a receive pipeline:

2.1 Decode Stage.
– This stage is used for components that decode or decrypt the message. Decoding is the process of transforming information from one format into another.
– This stage takes one message and produces one message.
– This stage can contain between zero and 255 components.
– All components in this stage are run.

2.2 Disassemble Stage.
– This stage is used for components that parse and disassemble the message.
– Disassembling is the process of breaking up a large interchange message into smaller messages by removing the envelope. It is also often called “debatching”.
For example, instead of sending n messages of the same type sequentially, a system might send 1 large message containing n messages, which are the actual purpose of the communication. The message containing the sub-messages is called the envelope.
This feature requires an envelope schema, a special type of schema that can be created within BizTalk. It defines the structure of the enveloping message.

– This stage creates a BizTalk Message Object and a Message Context Object.
– This stage promotes properties from the interchange message (envelope and individual messages) to the Message Context of each individual message generated.

– The components within this stage probe the message to see if the format of the message is recognized. Based on the recognition of the format, one of the components disassembles the message.
– If this stage contains more than one component, only the first component that recognizes the message format is run. If none of the components within the stage recognize the message format, the message processing fails.
– This stage should include any custom components that implement special behavior to disassemble the message contents.
– This stage can contain between zero and 255 components. If there are no components in the stage, the message is passed through.
– This stage takes one message and can produce one or more messages.
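
Debatching and property promotion can be illustrated with a small conceptual sketch. It is written in Python and does not use any BizTalk API: the `Orders` envelope, the `batchId` and `id` attributes, and the `debatch` function are all invented for this example.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List, Tuple

# Made-up envelope carrying two individual messages.
ENVELOPE = """<Orders batchId="B-17">
  <Order id="1"><Item>pen</Item></Order>
  <Order id="2"><Item>ink</Item></Order>
</Orders>"""


def debatch(envelope_xml: str) -> List[Tuple[str, Dict[str, str]]]:
    """Split an envelope into (body, context) pairs, one per child message."""
    root = ET.fromstring(envelope_xml)
    messages = []
    for child in root:
        context = {
            # Promoted from the envelope: shared by every message.
            "BatchId": root.get("batchId", ""),
            # Promoted from the individual message.
            "OrderId": child.get("id", ""),
        }
        child.tail = None  # drop whitespace that followed the element
        body = ET.tostring(child, encoding="unicode")
        messages.append((body, context))
    return messages
```

Calling `debatch(ENVELOPE)` yields two messages, each with its own context dictionary holding one value taken from the envelope and one taken from the message itself.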

2.3 Validate Stage.
– This stage is used for components that validate the message format, that is, XML document validation against an XML schema (XSD).
A pipeline component processes only messages that conform to the schemas specified in that component.
– Components in this stage are used to validate the XML messages produced by the Disassemble stage. The XML is validated through the use of XML schemas defined in the component.
– This stage can contain between zero and 255 components.
– All components in this stage are run.
– This stage may be run more than once, because it runs once per message created by the Disassemble stage. So if the Disassemble stage produces 10 messages, it will run 10 times.

2.4 Resolve Party Stage.
– This stage is a placeholder for the Party Resolution Pipeline Component.
– This stage may be run more than once. It runs once per message created by the Disassemble stage.
– This stage can contain between zero and 255 components.
– All components in this stage are run.

3. Send Pipeline.

A send pipeline is responsible for processing documents before sending them to their final destinations. The send pipeline takes one message and produces one message to send.

Details of tasks executed by a Send Pipeline:

[Figure: BizTalk send pipeline]
There are 3 stages executed by a send pipeline:

3.1 Pre-assemble Stage
– This stage is a placeholder for custom components that should perform some action on the message before the message is serialized.
– This stage is run once per message.
– This stage can contain between zero and 255 components.
– All components in this stage are run.

3.2 Assemble Stage
– Components in this stage are responsible for assembling or serializing the message and converting it to or from XML. It is the inverse operation of the disassemble stage in a receive pipeline.
– This stage assembles the message and prepares it to be transmitted by taking steps such as adding envelopes, moving context properties into the message, and other tasks complementary to the Disassemble stage in a receive pipeline.
– This stage accepts zero components or one component.
– All components in this stage are run.

3.3 Encode Stage
– This stage is used for components that encode or encrypt the message. Encoding is the process of transforming information from one format into another. The encoding stage is the inverse operation of the decoding stage in a receive pipeline.
– This stage is run once per message.
– This stage can contain between zero and 255 components.
– All components in this stage are executed.
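
The three send stages run one after the other on a single message, which can be modeled as a simple chain of functions. The sketch below is conceptual Python, not BizTalk code: the function names are hypothetical, the `Envelope` element is invented, and Base64 merely stands in for a real encoder.

```python
import base64
from typing import Callable, List

Stage = Callable[[str], str]


def pre_assemble(message: str) -> str:
    # Placeholder for custom work before serialization; here: trim whitespace.
    return message.strip()


def assemble(message: str) -> str:
    # Wrap the message in a made-up envelope element.
    return f"<Envelope>{message}</Envelope>"


def encode(message: str) -> str:
    # Base64 stands in for a real encoding component.
    return base64.b64encode(message.encode("utf-8")).decode("ascii")


def run_send_pipeline(message: str, stages: List[Stage]) -> str:
    """One message in, one message out: each stage feeds the next."""
    for stage in stages:
        message = stage(message)
    return message


SEND_STAGES = [pre_assemble, assemble, encode]
```

Decoding the result of `run_send_pipeline("  <Order/>  ", SEND_STAGES)` gives back the trimmed message wrapped in its envelope.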

4. Standard Pipelines and Pipeline Components shipped with BizTalk.

BizTalk is shipped with a series of existing pipeline components ready to use.
BizTalk is also shipped with 2 pairs of default pipelines, which are composed of standard pipeline components.

4.1 Standard Pipeline Components

4.1.1 Standard components provided for each stage of a receive pipeline:

[Figure: BizTalk receive pipeline standard components]
Decode Stage: BizTalk Server 2006 provides one standard component for this stage, the MIME/SMIME Decoder. This component can handle messages and any attachments they contain in either MIME or Secure MIME (S/MIME) format. The component converts both types of messages into XML, and it can also decrypt S/MIME messages and verify their digital signatures.

Disassemble Stage: BizTalk Server 2006 provides three standard components for this stage.

1. The Flat File Disassembler component turns flat files into XML documents. These files can be positional, where each record has the same length and structure, or delimited, with a designated character used to separate records in the file. A typical example of such a flat file is the CSV (Comma Separated Values) file format.

2. The XML Disassembler parses incoming messages that are already described using XML.
Main roles of the XML Disassembler component:
– Removes envelopes.
– Disassembles the interchange. The interchange is the envelope and the documents contained in it. Disassembling is the process of creating individual messages from the messages contained in an envelope.
– Promotes properties from the interchange (the envelope) and the individual documents to the message context.

3. The BizTalk Framework Disassembler. It accepts messages sent using the reliable messaging mechanism defined by the BizTalk Framework, which was implemented in BizTalk Server 2000.

Validate Stage: BizTalk Server 2006 provides one standard component for this stage, the XML Validator. This component validates an XML document produced by the Disassemble stage against a specified schema or group of schemas, returning an error if the document doesn’t conform to one of those schemas.

Resolve Party: BizTalk Server 2006 provides one standard component for this stage, Party Resolution, which attempts to determine an identity for the message’s sender. If the message was digitally signed, the signature is used to look up a Windows identity in the Management database of BizTalk Server 2006. If the message carries the authenticated security identifier (SID) of a Windows user, this identity is used. If neither mechanism succeeds, the message’s sender is assigned a default anonymous identity.

4.1.2 Standard components provided for each stage of a send pipeline:

[Figure: BizTalk send pipeline standard components]
Pre-assemble Stage: No standard components are provided but custom components can be inserted here as needed.

Assemble Stage: Like the Disassemble stage in a receive pipeline, this stage has three standard components. They implement the inverse operations of the receive pipeline standard components.

1. The Flat File Assembler converts an XML message into a positional or delimited flat file.

2. The XML Assembler supports adding an envelope and making other changes to an outgoing XML message.
Main roles of the XML Assembler:
– It creates the envelope by using the specified envelope schema.
– It copies the context property values to the XML message by using the predefined XPaths coded as annotations in the XSD schemas of the message.
– It copies the context properties to the envelope by using the predefined XPaths coded as annotations in the XSD schemas associated with envelopes.
– It appends the message to the envelope.

3. The BizTalk Framework Assembler packages messages for reliable transmission using the BizTalk Framework messaging technology.

Encode Stage: The MIME/SMIME Encoder. It implements the inverse operation of the MIME/SMIME Decoder component of the receive pipeline.
This component packages outgoing messages in either MIME or S/MIME format. If S/MIME is used, the message can also be digitally signed and/or encrypted.

4.2 Standard Pipelines

The standard pipelines are shipped with the BizTalk server product. They cannot be modified in the Pipeline Designer. The pipeline designer is a tool to create new pipelines from within Visual Studio 2005.
These default pipelines are ready to use and can be selected when configuring a send port or receive location in BizTalk Explorer or during the configuration of a send or receive shape within the orchestration designer.
The two pairs of default pipelines available are the pass-through pipelines and the XML pipelines.

Pass-Through Pipelines.
The pass-through pipelines have no components. They are used for simple pass-through scenarios when no message payload processing is necessary. These pipelines are generally used when the source and the destination of the message are known, and the message requires no validation, encoding, or disassembling.

Particularity of the pass-through receive pipeline:
Because it does not contain a disassembler, the pass-through receive pipeline cannot be used to route messages to orchestrations (as it is the disassemble stage that does the property promotion).
The pass-through receive pipeline does not support property promotion.

Particularity of the pass-through send pipeline:
No particularity, it just sends the message to the adapter as it was received from the MessageBox.

XML Pipelines.
The XML pipelines are the default pipelines that should be used when the messages sent or received are already in XML and must be fully functional within the BizTalk system (for example, orchestrations must be able to subscribe to them, and property promotion must be possible).

Particularity of the XML receive pipeline:
The XML receive pipeline consists of the following stages:
1. Decode. Empty
2. Disassemble. Contains the XML Disassembler component.
3. Validate. Empty
4. Resolve Party. Runs the Party Resolution component.

Particularity of the XML send pipeline:
The XML send pipeline consists of the following stages:
1. Pre-assemble. Empty
2. Assemble. Contains the XML Assembler component
3. Encode. Empty

Pictures in this post were taken from the BizTalk documentation. Hope that’s alright!

BizTalk process and Service name relation.

Under Windows, the simplest way to see the CPU usage and Memory usage of a process is by using the Windows Task Manager.

If your BizTalk server hosts many BizTalk applications and has no performance monitoring system (such as WMI) installed or configured, a quick way to check which BizTalk process is using the most resources is through the Task Manager.
The only problem with this method is that all you will see under the Task Manager is something like:

Under Windows XP and Windows Server 2003, the Task Manager shows the executable name of the process and the PID (the Process ID, a unique number across all processes) but does not show the name of the Windows Service(s) that the process is running.

The Windows Services browser (found in the Computer Management application or by running “services.msc”) shows service names:

In the picture above, you can see that I have many BizTalk Windows Services started; each of them is in fact what is called a “BizTalk host”. A BizTalk host is the Windows Service process that hosts 1 or more BizTalk applications. As you can see, each BizTalk host is assigned a different service name: a concatenation of the BizTalk group name and the BizTalk host name.

To know which BizTalk application uses the most server resources, I need to relate my readings from Task Manager (the process PID) to the service name in the services browser. There are 2 ways to discover this relationship: the first is to use a standard command line utility (tasklist) and the second is to download a tool (Process Explorer).

1. Tasklist command.

The simplest way to know which BizTalk host is run by which process is to use the command line utility tasklist.

Tasklist.exe is a command line utility included as standard with both Windows XP Pro and Windows Server 2003. XP Home does not seem to have this utility, but that is a non-issue for BizTalkers.
Tasklist displays a list of applications (processes) running on a system. One interesting thing is that it has an option to display the name of the windows services running under each process. Note that some processes, such as svchost.exe, can host more than 1 Windows Service. Svchost is a special process hosting Windows Services that don’t have their own executable host (process). Basically, svchost is used to run Windows Services which are encapsulated as a dll, such as drivers, network management and other basic services.
If a process does not run any Windows Service, such as normal executables, tasklist will display N/A instead of a service name. For Example the process of MS Word, WINWORD, displays N/A for the service name.

Tasklist usage example:

“tasklist /svc” displays all the processes together with the names of the Windows Services running in each process (if any).

The following command displays all the BizTalk host processes and the service (host) name running in each process:

tasklist /svc /fi "imagename eq btsntsvc.exe"

In my environment I have the following result:

In this way, through the PID, I can relate which process is running which Windows Service (BizTalk host in my instance). So, I can go to Task Manager and see which BizTalk host is consuming the most CPU and memory resource.
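
If you want to automate this PID-to-service lookup, the tasklist output can be parsed by a short script. The following Python sketch assumes that tasklist is asked for CSV output without a header (the `/fo csv` and `/nh` switches) and that each row holds the image name, the PID and the services, in that order. The function names and the sample rows are made up for illustration; the PIDs and service names are not real output.

```python
import csv
import io
import subprocess
from typing import Dict, List

TASKLIST_CMD = ["tasklist", "/svc", "/fo", "csv", "/nh",
                "/fi", "imagename eq btsntsvc.exe"]


def parse_tasklist_svc(csv_text: str) -> Dict[int, List[str]]:
    """Map PID -> service names from `tasklist /svc /fo csv /nh` output."""
    result: Dict[int, List[str]] = {}
    for row in csv.reader(io.StringIO(csv_text)):
        # Assumed columns: image name, PID, comma-separated services.
        if len(row) < 3 or not row[1].isdigit():
            continue  # skip blank lines and informational messages
        services = [s.strip() for s in row[2].split(",") if s.strip()]
        result[int(row[1])] = services
    return result


def biztalk_hosts_by_pid() -> Dict[int, List[str]]:
    """Run tasklist (Windows only) and parse its output."""
    completed = subprocess.run(TASKLIST_CMD, capture_output=True,
                               text=True, check=True)
    return parse_tasklist_svc(completed.stdout)


# Made-up sample output; the PIDs and service names are illustrative only.
SAMPLE = (
    '"BTSNTSvc.exe","2412","ServiceForHostA"\n'
    '"BTSNTSvc.exe","3080","ServiceForHostB"\n'
)
```

`parse_tasklist_svc(SAMPLE)` returns a dictionary keyed by PID, which can then be matched against the PID column in Task Manager.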

2. Process Explorer.

If you don’t like command line tools and prefer to use a Windows application, you can download and install Process Explorer, available for download on Technet.

Process Explorer is a kind of “Task Manager on steroids” that shows much more information than Task Manager does. For our particular need, it can show which Windows Services are running under each process.

To see the services running under a process, open Process Explorer, find the services.exe node (under which are all processes running Windows Services), locate the BTSNTSvc.exe processes and right click on one of them, then click “properties” in the context menu and finally click on the “Services” tab; you will see the name of the Windows Service run by that process.

Screenshot of the properties window of a BizTalk process showing the name of the BizTalk Windows Service host running.

Using the Enterprise Library in a BizTalk Functoid.

Enterprise Library.

The Enterprise Library is a set of .NET application blocks providing many useful features that can be reused across projects. I now find it a mature library that developers can rely on. At the time of writing, Enterprise Library 3.1 is available at the Microsoft Patterns & Practices site.

Custom Database BizTalk Functoid implementing Caching.

I created a BizTalk functoid responsible for looking up values in a database. Because this lookup was not a simple query, a custom functoid was required. With the database lookups becoming a bottleneck when mapping larger messages, I had to implement a caching strategy within the functoid.
I chose the Enterprise Library to implement the caching functionality as it provides nice features such as scavenging and expirations (similar to the ASP.NET cache engine). Some people might think of using the ASP.NET Cache instead, but I did not and would not advise it. Although some people say it can be used outside an ASP.NET application, Microsoft does not recommend it, so I did not even try to go down that road. See the System.Web.Caching.Cache reference.
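
To make the caching idea concrete, here is a minimal sketch of a lookup cache with absolute expiration. It is plain Python and deliberately not the Enterprise Library API: the `ExpiringCache` class and its `get_or_load` method are hypothetical names, and the sketch leaves out scavenging entirely.

```python
import time
from typing import Callable, Dict, Tuple


class ExpiringCache:
    """Tiny cache with absolute expiration (no scavenging)."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._items: Dict[str, Tuple[float, str]] = {}

    def get_or_load(self, key: str, loader: Callable[[str], str]) -> str:
        now = self._clock()
        entry = self._items.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]          # fresh hit: skip the database
        value = loader(key)          # miss or expired: query once
        self._items[key] = (now + self._ttl, value)
        return value
```

The `loader` argument plays the role of the database query: it is only called when the key is missing or its entry has expired, which is what removes the bottleneck when the same value is looked up many times during one mapping.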

Enterprise Library application block configuration.

The Enterprise Library instantiates objects through static factories. The application blocks’ factories create objects and configure them using information from a configuration file, by default the web.config or app.config XML config file, depending on whether the code is running in an ASP.NET application or in an executable.

The BizTalk runtime is also a .NET application, so it has an XML config file called BTSNTSvc.exe.config, which is located in the root folder where BizTalk is installed.
As the functoid using the Enterprise Library Caching application block will run from the BizTalk runtime, the application block configuration needs to reside in BTSNTSvc.exe.config. It is important to remember that all BizTalk hosts running on a BizTalk server use the same BTSNTSvc.exe.config file, which is something of a limitation: a configuration change made for one host applies to every host on that server.
Note that it is possible to use configuration sources other than the default .NET .config file; see the Enterprise Library documentation for more details.

Adding a component in the Visual Studio Toolbox.

Adding a custom functoid in the Visual Studio Toolbox is pretty straightforward. See http://msdn2.microsoft.com/en-us/library/aa559309.aspx
Note that if you deploy the BizTalk solution on the same machine you develop on, you will have to add the functoid to the GAC so that the map using it can find it at runtime.

One important thing to know is that when adding a component in the Visual Studio Toolbox, the component constructor is called. This is because the Visual Studio design time service needs to retrieve some of the component properties to be able to display and modify them in the properties window.
Most of you know that when working with .Net Windows Forms, all components or controls have a constructor which calls the InitializeComponent() method where all the properties and members of the component are initialized.
This initialization is of course needed for runtime but also for design time, so that’s why a component’s constructor is called during design time as well.

Example of design time properties for a Windows Application Form control (Button).

The Button control properties are displayed in the “properties” windows correctly because an instance of the component has been initialized by the Visual Studio design time service, so its properties have been initialized and can be read and displayed by the IDE.

In the case of a BizTalk functoid, the same mechanism exists: the properties needed at design time are initialized in the constructor.
For example, the functoid’s name, tooltip, description and bitmap (used to display the functoid in the toolbox and in the mapper editor) are set in the constructor using calls to inherited methods:
SetName("ResourceKeyForFunctoidName");
SetTooltip("ResourceKeyForFunctoidToolTip");
SetDescription("ResourceKeyForFunctoidDescription");
SetBitmap("ResourceKeyForFunctoidBitmap");

Example of design time properties for a BizTalk functoid (Cumulative Sum)

Example of design time properties for a custom BizTalk functoid

As you can see, the Visual Studio design time environment needs to create an instance of the component so that it can evaluate all its properties and display them in the Visual Studio IDE.

Adding the custom functoid in the Toolbox is not working / Testing a map using the custom functoid throws an exception.

During design time and when testing a map, the Visual Studio runtime is used to instantiate and call methods on the functoid. This means that until the application block is configured for Visual Studio, the application block’s static factory won’t find its configuration, and an error will occur when the call to the factory runs.
This error will happen either when:
– adding the functoid in the toolbox, if the call to the application block factory is in the functoid constructor
– testing a map using the functoid, if the call to the application block factory is in the functoid method.
The solution to this problem is to add the application block configuration in the Visual Studio IDE config file, devenv.exe.config, typically located at C:\Program Files\Microsoft Visual Studio 8\Common7\IDE\
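
The two failure points can be modeled with a small sketch. This is conceptual Python, not functoid code: `create_cache` is a stand-in for an application block's static factory, and the `caching_section` key, the two functoid classes and the `lookup` method are all hypothetical.

```python
class ConfigurationMissing(Exception):
    """Raised when the host process has no configuration for the block."""


def create_cache(host_config: dict) -> dict:
    # Stand-in for an application block's static factory.
    if "caching_section" not in host_config:
        raise ConfigurationMissing("caching section not found")
    return {}


class EagerFunctoid:
    def __init__(self, host_config: dict) -> None:
        # Factory call in the constructor: fails as soon as the designer
        # instantiates the functoid (for example when adding it to the Toolbox).
        self.cache = create_cache(host_config)


class LazyFunctoid:
    def __init__(self, host_config: dict) -> None:
        self._host_config = host_config
        self._cache = None

    def lookup(self, key: str) -> str:
        # Factory call in the method: construction succeeds, but the
        # first call fails until the configuration exists.
        if self._cache is None:
            self._cache = create_cache(self._host_config)
        return self._cache.setdefault(key, key.upper())
```

With an empty host configuration, `EagerFunctoid({})` fails on construction while `LazyFunctoid({})` only fails on its first `lookup` call; once the section is present in the host's configuration, both work.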

Conclusion:

To summarize, when using the Enterprise Library in a BizTalk project (in a functoid or any other piece of custom code), the application block configuration information must be stored in both BTSNTSvc.exe.config (for the BizTalk runtime) and devenv.exe.config (for the design time experience under the Visual Studio runtime).