I have a Windows Server 2008 R2 machine that hosts many back-end NServiceBus endpoints. All of the services that use the NServiceBus.Host.exe host (installed as Windows Services) interact with MSDTC without problems, averaging a small handful of concurrent distributed transactions throughout the day. However, two small Web API applications that self-host NServiceBus endpoints (as publishers) constantly receive the following error when trying to process subscription requests:
NServiceBus.Transports.Msmq.MsmqDequeueStrategy Error in receiving messages.
System.Transactions.TransactionAbortedException: The transaction has aborted. ---> System.Transactions.TransactionManagerCommunicationException: Communication with the underlying transaction manager has failed. ---> System.Runtime.InteropServices.COMException: The Transaction Manager is not available. (Exception from HRESULT: 0x8004D01B)
   at System.Transactions.Oletx.IDtcProxyShimFactory.ConnectToProxy(String nodeName, Guid resourceManagerIdentifier, IntPtr managedIdentifier, Boolean& nodeNameMatches, UInt32& whereaboutsSize, CoTaskMemHandle& whereaboutsBuffer, IResourceManagerShim& resourceManagerShim)
   at System.Transactions.Oletx.DtcTransactionManager.Initialize()
   --- End of inner exception stack trace ---
   at System.Transactions.Oletx.OletxTransactionManager.ProxyException(COMException comException)
   at System.Transactions.Oletx.DtcTransactionManager.Initialize()
   at System.Transactions.Oletx.DtcTransactionManager.get_ProxyShimFactory()
   at System.Transactions.Oletx.OletxTransactionManager.CreateTransaction(TransactionOptions properties)
   at System.Transactions.TransactionStatePromoted.EnterState(InternalTransaction tx)
   --- End of inner exception stack trace ---
   at System.Transactions.TransactionStateAborted.CheckForFinishedTransaction(InternalTransaction tx)
   at System.Transactions.Transaction.Promote()
   at System.Transactions.TransactionInterop.ConvertToOletxTransaction(Transaction transaction)
   at System.Transactions.TransactionInterop.GetDtcTransaction(Transaction transaction)
   at System.Messaging.MessageQueue.StaleSafeReceiveMessage(UInt32 timeout, Int32 action, MQPROPS properties, NativeOverlapped* overlapped, ReceiveCallback receiveCallback, CursorHandle cursorHandle, IntPtr transaction)
   at System.Messaging.MessageQueue.ReceiveCurrent(TimeSpan timeout, Int32 action, CursorHandle cursor, MessagePropertyFilter filter, MessageQueueTransaction internalTransaction, MessageQueueTransactionType transactionType)
   at System.Messaging.MessageQueue.Receive(TimeSpan timeout, MessageQueueTransactionType transactionType)
   at NServiceBus.Transports.Msmq.MsmqDequeueStrategy.ReceiveMessage(Func`1 receive) in c:\BuildAgent\work\31f8c64a6e8a2d7c\src\NServiceBus.Core\Transports\Msmq\MsmqDequeueStrategy.cs:line 313
Some other notes:
Additional notes from below conversations
Not really an answer, but too long for a comment.
What part of your operation requires DTC? A Distributed Transaction gets enlisted automatically when needed, usually when you are talking to two different DTC-supporting bits of infrastructure (e.g. MSMQ and a database).
You said you tested via DTC tracing. Do you mean DTCPing? Did you run it on both machines (or on all machines, if more than two are involved in the transaction)? The tool is pretty esoteric, and its output can be confusing.
Also, if it did work before the reboot, is it possible the reboot reset firewall settings? Firewalls are a common cause of DTC problems.
Also, I assume you checked and rechecked your DTC settings on the local machine? Did you ensure that your MSMQ queues are set up to be transactional?
From your comments:
Note that this particular failure occurs when attempting to dequeue a message from a local private MSMQ queue [...]
The stack trace makes it appear that that's all it's doing, but I suspect that as it attempts to dequeue, it is also trying to enlist a transaction that spans multiple servers. See below.
Why MSDTC? It's the original way to support exactly-once messaging in NServiceBus (see here).
Right, but what I'm asking is why the particular operation requires a distributed transaction. If all a handler is doing is reading from a queue and (for example) writing output to the console, MSDTC will never be enlisted, even though the handler is wrapped in a transaction scope. It will simply use a local transaction to read from the queue. The escalation to a distributed transaction is automatic, and only happens when it is needed to support multiple bits of infrastructure.
So if you recently deployed code in a handler that writes data to a new database server, you may be getting a failure because you are now enlisting a transaction that includes the new server, which may be where the failure is happening.
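One way to find out which operation triggers the escalation is to log it from inside the process. The following is only a sketch, meant to be run somewhere DTC is working (for example a test box), since on the failing server the promotion aborts before it completes. The class and method names (`DtcEscalationProbe`, `Install`, `IsDistributed`) are made up for illustration; the `System.Transactions` members are the standard ones.

```csharp
using System;
using System.Transactions;

// Hypothetical helper, not part of NServiceBus.
static class DtcEscalationProbe
{
    // Call once at startup, e.g. from the self-hosting application's
    // initialization code.
    public static void Install()
    {
        // Raised when System.Transactions creates a distributed (MSDTC) transaction.
        TransactionManager.DistributedTransactionStarted += (sender, e) =>
        {
            Guid id = e.Transaction.TransactionInformation.DistributedIdentifier;
            Console.Error.WriteLine("Transaction escalated to MSDTC: " + id);
        };
    }

    // Call from inside a handler: true if the ambient transaction
    // has already been promoted to a distributed transaction.
    public static bool IsDistributed()
    {
        Transaction tx = Transaction.Current;
        return tx != null
            && tx.TransactionInformation.DistributedIdentifier != Guid.Empty;
    }
}
```

Calling `DtcEscalationProbe.IsDistributed()` before and after each resource access in a handler (queue receive, database write, and so on) should narrow down which call causes the promotion.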
So determining all the servers involved in the distributed transaction is the first step. The next step would be to check the DTC settings on all involved servers. If DTC settings aren't the problem, I'd recommend testing communication between the servers using DTCPing. The NServiceBus documentation has some good instructions for using DTCPing.
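To compare DTC settings between a server that works and one that doesn't, it can help to dump them as text rather than click through the Component Services UI on each box. A rough sketch, assuming a local (non-clustered) DTC whose security settings live under `HKLM\SOFTWARE\Microsoft\MSDTC\Security`, .NET 4 or later, and an account with read access to that key; `DtcSettingsDump` is just an illustrative name:

```csharp
using System;
using Microsoft.Win32;

// Hypothetical console utility: prints the local MSDTC security settings
// so the output from two servers can be diffed.
static class DtcSettingsDump
{
    static void Main()
    {
        // Open the 64-bit registry view so a 32-bit process
        // is not redirected to Wow6432Node.
        using (RegistryKey hklm = RegistryKey.OpenBaseKey(
                   RegistryHive.LocalMachine, RegistryView.Registry64))
        using (RegistryKey security = hklm.OpenSubKey(
                   @"SOFTWARE\Microsoft\MSDTC\Security"))
        {
            if (security == null)
            {
                Console.WriteLine("MSDTC security key not found.");
                return;
            }

            // Print every value under the key instead of assuming specific names.
            foreach (string name in security.GetValueNames())
            {
                Console.WriteLine("{0} = {1}", name, security.GetValue(name));
            }
        }
    }
}
```

Run it on each server involved in the transaction and diff the output; any value that differs is a candidate to check before moving on to DTCPing.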
What "fixed" this for us in the production environment was adding the application pool identity user to the local Administrators group on the server. Unfortunately we don't have time to determine what setting required that security setup, as this isn't a required configuration in other similar servers. Also, this isn't the most desirable solution from a security perspective, but in our particular situation, we're willing to live with it.