Multiple Routing Failures Moving from BizTalk 2004 to 2006
Question:
When an application was upgraded from BizTalk 2004 to 2006, the number of routing failures for a particular situation changed.
When a message is received by the BizTalk receive port, it is picked up by a subscribing orchestration (OrchA). This orchestration performs a transform that converts the message to a generic format and drops it to the message box using a direct binding send port. Another orchestration (OrchB) subscribes to this generic format message and does its thing.
When I have unenlisted OrchB, I get 3 messages suspended in the msgbox. Two of them are non-resumable and contain only the context information of the generic message. One is resumable and contains the actual message. This message is NOT in the generic format, which I would have expected since OrchA completed its processing. Also, I noticed that the map in OrchA is being called twice.
This is very interesting, and I would appreciate it if someone could shed light on these two points:
1) Why is the message that got suspended (resumable) not the generic message, as it should be after OrchA?
2) Why is the map in OrchA being called twice when the subscribing orchestration (OrchB) is unenlisted?
Answer, from Lee Graber:
Sure, I have an explanation.
1) In 2006 the behavior has changed with regard to the handling of uncaught exceptions. In 2004, if an exception was thrown, we passed it along down the flow of your orchestration looking for someone to handle it. If you had no exception handling block, we would eventually suspend the orchestration non-resumable, as we had advanced the orchestration all the way to its termination. That behavior was seen as sub-optimal by most customers, who indicated that they might simply have missed catching the exception and would like a chance to fix a faulty component and drop the new version in so that they can get it to work. As such, in 2006, if an exception is thrown for which we do not find any exception handling block (and we don’t have anything to compensate), instead of suspending non-resumable and advancing the orchestration all the way to the end, we throw out all information about what we have done and suspend resumable. By not persisting state about the advancement, we will simply resume at the previous persistence point.
2) By default, all orchestrations run in what we call “optimized” mode. This allows the engine to make certain decisions about combining actions, which can improve performance. However, if an exception occurs at the wrong moment, this can mess up our tracking stream and exception handling. The engine can then decide that it is not in a good position to handle this exception. It will throw out what it has done and indicate to the messagebox that it would like to be restarted in non-optimized mode. At that point it will run back through the steps and be able to correctly handle the exception that was thrown.
a. Also, to be clear, routing failures are probably the most common cause of us having to switch to non-optimized mode. The problem is that we try to batch the send with some subsequent activity (like maybe an end-of-service) and so we have to just assume the send will succeed and add the “send completed” to our tracking stream. If the send fails, then we have completely ruined all of our tracking information and so have to redo it to get a correct tracking stream which could be committed if you handled the exception.
Hence, the suspended orchestration would have only the original message, since we suspended it at the first persistence point, which was before the map was run. If you looked in the instances view in the MMC, you would see two routing failure reports, which are what the context-only messages are, since the orchestration was run once in optimized mode and once non-optimized. Those two runs are also why the map was called twice.
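To make the sequence of events easier to follow, here is a small toy model in Python. It is only a sketch of the behavior described above, not BizTalk code: `ToyMessageBox`, `run_orch_a`, `engine_run`, and the message dictionaries are all hypothetical names invented for illustration, and it assumes the only persistence point before the failure is the activating receive.

```python
from dataclasses import dataclass, field


class RoutingFailure(Exception):
    """Raised when a published message has no matching subscriber."""


@dataclass
class ToyMessageBox:
    # Message types that currently have an enlisted subscriber (hypothetical).
    subscribers: set = field(default_factory=set)
    # Suspended items as (kind, resumable, body) tuples.
    suspended: list = field(default_factory=list)

    def publish(self, msg_type, body, context):
        if msg_type not in self.subscribers:
            # A routing failure report carries only the context and is
            # non-resumable.
            self.suspended.append(("routing_failure_report", False, context))
            raise RoutingFailure(msg_type)


def run_orch_a(box, original, stats):
    """Toy OrchA: run the map, then do a direct-bound send. No catch block."""
    stats["map_calls"] += 1
    generic = {"format": "generic", "payload": original["payload"]}
    box.publish("generic", generic, context={"MessageType": "generic"})


def engine_run(box, original):
    stats = {"map_calls": 0, "modes": []}
    # State saved at the last persistence point: just the original message.
    persisted = dict(original)
    for mode in ("optimized", "non-optimized"):
        stats["modes"].append(mode)
        try:
            run_orch_a(box, dict(persisted), stats)
            return stats
        except RoutingFailure:
            if mode == "optimized":
                # Tracking stream can no longer be trusted: discard the work
                # and restart from the persistence point in non-optimized mode.
                continue
            # Uncaught exception in non-optimized mode: discard the progress
            # and suspend resumable at the last persistence point.
            box.suspended.append(("orchestration", True, persisted))
    return stats


# OrchB is unenlisted, so nothing subscribes to the generic message.
box = ToyMessageBox()
stats = engine_run(box, {"format": "original", "payload": "order-1"})

print(stats["map_calls"])                           # 2
print([(kind, res) for kind, res, _ in box.suspended])
print(box.suspended[-1][2]["format"])               # original
```

In this model the map runs twice, two non-resumable context-only reports are recorded, and the one resumable item holds the original message, which matches the three suspended messages from the question. The real engine's decision logic is of course more involved than a two-pass loop.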