Get on the Bus—The Azure Service Bus

Azure Service Bus is Microsoft's cloud offering for developing message-based applications. In this article, I'll explore some fundamental Azure Service Bus techniques like how to configure queues, how to send and receive messages, and how to make sure you're properly handling exceptions and poison messages. But first, let's take a moment to think about why you'd want to build a message-based application in the first place.

What's a Message-Based Architecture For?

The purpose of a message-based architecture is to decouple the components of a system. What does that really mean? To see what decoupling is all about, let's first look at the opposite: a synchronous, tightly coupled system.

A Tightly Coupled System

Imagine you're feeling hungry. You walk over to a nearby vending machine, put in some money, and push a button. Voila! The vending machine gives you your snack. This is a traditional, tightly coupled call sequence, as shown in Figure 1.

Figure 1: A tightly coupled call sequence

If you think of this in terms of software, perhaps the vending machine is a web service that exposes an “OrderSnack” endpoint. Your interaction with the web service as a customer is tightly coupled and synchronous. You POST a request message to the service and get back a response: your snack (or maybe a “404: Snack not found” error).

A Loosely Coupled, Message-Based System

Compare this to the following sequence: now you are really hungry, so you walk into your favorite cafe, go to the counter, and order a full breakfast. Back in the kitchen, someone prepares your meal and then someone brings it out to you. In this sequence, the process of placing an order and preparing the order are decoupled and are connected by a message, as shown in Figure 2.

Figure 2: A decoupled, message-based call sequence

The key feature of this diagram is the presence of the Order Queue. This queue decouples the Counter and the Kitchen. The Counter doesn't cook your food or even interact with the Kitchen directly. Instead, the Counter enqueues the order and later a cook in the Kitchen dequeues it and starts preparing it. (In a real kitchen, there's likely a second “delivery queue,” i.e., a shelf with heat lamps and a bell, to signal that there is food ready to be brought out).

This decoupling allows the Counter and the Kitchen to be implemented and scaled separately. In the real world, the people you hire for the Counter and the Kitchen do totally different jobs, have different skills and require different equipment. It makes sense to “implement,” i.e., staff and equip, these departments separately. Likewise, if the Kitchen is falling behind, you can “scale” it, i.e., hire more cooks, without having to hire more counter clerks. Overall, the system is much more flexible and scalable and can efficiently serve a nice hot meal to many more people than a set of vending machines.

Software systems are filled with these kinds of queue-based systems. Think of an e-commerce platform: the place order, pick-and-pack, ship, and invoice functions are all going to be chained together with messages, not tightly coupled. Azure Service Bus allows you to develop this kind of architecture. It provides the plumbing—the queues and messages—that make a messaging-based architecture possible.
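
To make the Counter and Kitchen idea concrete before bringing Azure into the picture, here is a minimal in-process sketch that uses System.Threading.Channels as a stand-in for the Order Queue. The Cafe class and its method names are hypothetical, and a Channel only decouples components inside a single process; Azure Service Bus plays the same role across processes and machines.

```csharp
using System.Collections.Generic;
using System.Threading.Channels;
using System.Threading.Tasks;

public static class Cafe
{
    // The Counter only writes orders to the queue; it never calls the Kitchen.
    public static async Task TakeOrdersAsync(
        ChannelWriter<string> orderQueue, IEnumerable<string> orders)
    {
        foreach (string order in orders)
        {
            await orderQueue.WriteAsync(order);
        }
        orderQueue.Complete(); // signal that no more orders are coming
    }

    // The Kitchen only reads from the queue, at its own pace.
    public static async Task<List<string>> CookAsync(ChannelReader<string> orderQueue)
    {
        List<string> served = new();
        await foreach (string order in orderQueue.ReadAllAsync())
        {
            served.Add($"Cooked: {order}");
        }
        return served;
    }

    // Wires the two sides together through the queue and returns what was served.
    public static async Task<List<string>> RunAsync(params string[] orders)
    {
        Channel<string> orderQueue = Channel.CreateUnbounded<string>();
        Task counter = TakeOrdersAsync(orderQueue.Writer, orders);
        Task<List<string>> kitchen = CookAsync(orderQueue.Reader);
        await counter;
        return await kitchen;
    }
}
```

Neither method knows the other exists. You could start several CookAsync tasks against the same reader to "hire more cooks" without touching the Counter code.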

Getting Set Up

Before you can start sending and receiving messages, you need to get a few things ready. Specifically, you need to:

Create an Azure Service Bus Namespace
Create an Azure Service Bus Queue
Install the Azure.Messaging.ServiceBus NuGet Package in Visual Studio
Connect to the Service Bus Namespace using a ServiceBusClient instance

Create an Azure Service Bus Namespace

An Azure Service Bus Namespace is a container that holds one or more Queues, Topics, and Subscriptions. Before you create the first Queue, you need to provision a Namespace for it to reside in. The Azure CLI code below assumes that you already have an active Azure subscription and have created a resource group.

To create an Azure Service Bus namespace, use an Azure CLI command like this (the trailing backticks are PowerShell line continuations):

az servicebus namespace create `
    --resource-group Resource_group_1 `
    --name TestBusNamespace `
    --location westus

Create an Azure Service Bus Queue

Once you've created the Azure Service Bus Namespace, you can create your first Azure Service Bus Queue using an Azure CLI command like this:

az servicebus queue create `
    --resource-group Resource_group_1 `
    --namespace-name TestBusNamespace `
    --name MyFirstQueue

Looking in the Azure Portal, you can see how the newly created queue looks (Figure 3).

Figure 3: The new queue, shown on the Azure Portal

Note that the queue is immediately active and has no messages. You'll soon fix that by sending your first message!

Install the NuGet Package

All of the Azure Service Bus code will rely on a NuGet package called Azure.Messaging.ServiceBus. You can install this package using the Package Manager Console in Visual Studio. To access the Package Manager Console, click Tools > NuGet Package Manager > Package Manager Console. Once in the Package Manager Console, run this command:

Install-Package Azure.Messaging.ServiceBus

Then, in any file where you're writing Azure Service Bus code, be sure to include this line in your list of using statements:

using Azure.Messaging.ServiceBus;

Connect to the Service Bus Namespace

For both sending and receiving messages, you'll first need to instantiate a ServiceBusClient that manages the authentication and access to resources within a Service Bus Namespace. Here's an example of what creating a ServiceBusClient might look like:

// DefaultAzureCredential comes from the separate Azure.Identity NuGet package.
using Azure.Identity;

string nameSpace = "TestBusNamespace.servicebus.windows.net";

DefaultAzureCredential creds = new();

ServiceBusClientOptions options = new()
{
    TransportType = ServiceBusTransportType.AmqpWebSockets
};

ServiceBusClient client = new(nameSpace, creds, options);

The use of DefaultAzureCredential is great for a production environment with sophisticated Azure authentication flows in place. But for this little demonstration app, it's overkill. Instead, use a connection string. Like a SQL connection string, an Azure Service Bus connection string contains all the elements needed to identify and authenticate to an Azure Service Bus Namespace, including a key, so you'll want to take care to store these securely.

You can access a connection string for the Namespace via the Azure Portal by navigating to the Namespace and then selecting Settings > Shared access policies > RootManageSharedAccessKey and then copying the Primary Connection String from the panel that appears on the right. You can also access the keys and connection strings via the Azure CLI with a command like this:

az servicebus namespace authorization-rule `
    keys list `
        --resource-group "Resource_group_1" `
        --namespace-name "TestBusNamespace" `
        --name "RootManageSharedAccessKey" `
        --query "primaryConnectionString" `
        --output tsv

Once you have your connection string (which will be quite long) you can save it to a configuration file or other secure storage and load it as needed. Now, the ServiceBusClient constructor will be quite a bit simpler:

string constr = ConfigurationManager.AppSettings["ConStr"];

ServiceBusClient client = new(constr);

That is much easier.

Send and Receive

The two fundamental actions when using Azure Service Bus are sending and receiving messages. Later in the article, I'll explore some of the complexities involved in constructing useful messages and building a production-quality message receiver, but to start with, let's just send and receive a single message.

Send a Message

The basic pattern for sending a message is very simple: create a sender, create a message, send.

ServiceBusClient client = new(constr);

string queue = "MyFirstQueue";

ServiceBusSender sender = client.CreateSender(queue);

ServiceBusMessage msg = new("This is my first message!");

await sender.SendMessageAsync(msg);

Once you run this code, the message should show up as an Active message in the Azure Portal, as shown in Figure 4.

Receive a Message

Receiving a message is also very simple: create a receiver, receive, complete.

ServiceBusClient client = new(constr);

string queue = "MyFirstQueue";

ServiceBusReceiver receiver = client.CreateReceiver(queue);

ServiceBusReceivedMessage msg = await receiver.ReceiveMessageAsync();

if (msg != null)
{
    await receiver.CompleteMessageAsync(msg);
}

Once you run this code, the Active message count in the queue returns to zero and you can see both the send and receive activity recorded in a handy graph, shown in Figure 5.

Figure 5: Send and receive events graphed on the Azure Portal

The one part of that code you might not have anticipated was the need to Complete the message after receiving it. You'll be looking at exactly what that's for later when I talk about Peek-Locks.

Message Delivery Order

What happens if there's more than one message in the queue? Which one does the receiver get? Remember, a queue is a first-in-first-out (FIFO) data structure. That means that if you send multiple messages to a queue, your receiver will receive them in the order they were sent, as shown in Figure 6.

Figure 6: A queue is a FIFO data structure

Azure Service Bus does contain features like peeking and deferring that you can use to get around strict FIFO ordering. But as long as you stick with typical receiver patterns, it's easy to preserve strict FIFO if you want to.

Topics and Subscriptions

I'm sure you've heard the term “pub-sub,” which is short for publish-subscribe or publisher-subscriber. The idea with a pub-sub architecture is that a publisher can publish a message once and multiple subscribers can then get the message. Azure Service Bus supports this pattern very naturally with Topics and Subscriptions.

A Topic is an Azure Service Bus Queue that has been configured to allow multiple receivers, which are called Subscriptions. You create Topics with the Azure CLI the same way you create Queues, except you use the topic create command:

az servicebus topic create `
    --resource-group Resource_group_1 `
    --namespace-name TestBusNamespace `
    --name MyFirstTopic

Once a Topic is created, it shows up in the Topics list of your Namespace, right next to the Queues, as shown in Figure 7.

Note that unlike a Queue, your Topic does not have a Message count property, but it does have a Subscription count. More on that in a second.

Once your Topic is set up, you can create multiple Subscriptions. Here's the CLI command to create one:

az servicebus topic subscription create `
    --resource-group Resource_group_1 `
    --namespace-name TestBusNamespace `
    --topic-name MyFirstTopic `
    --name Subscription1

Figure 8 shows the result in the Azure Portal.

Note that this Subscription looks just like a Queue, complete with Message count and all the other Queue properties. Why is that? When you send a message to a Topic, what really happens is a copy of that message gets sent to each Subscription, so each Subscription functions like an independent queue, as shown in Figure 9.

Figure 9: Sending to a Topic with three Subscriptions

The Topic itself doesn't really hold any messages. It has no message counts but the Subscriptions do. This pattern continues as you start to receive messages. Let's say that the Sender posted a second message. Meanwhile, Subscription 1 receives the first message. The state would then look like Figure 10.

Figure 10: Receiving from a Subscription

The key point here is that the three Subscriptions function as totally independent message queues. Receiving from one has no effect on the others, but sending to the Topic always sends to all Subscriptions.

The code to send a message to a Topic is identical to the code for sending to a Queue except that when you create the ServiceBusSender, you use the Topic Name:

string topic = "MyFirstTopic";

ServiceBusSender sender = client.CreateSender(topic);

ServiceBusMessage msg = new("Hello to all my subscribers!");

await sender.SendMessageAsync(msg);

Similarly, the code to receive from a Subscription is identical to the code for receiving from a Queue except that when creating the ServiceBusReceiver you specify both the Topic Name and the Subscription name:

string topic = "MyFirstTopic";
string sub = "Subscription1";

ServiceBusReceiver receiver = client.CreateReceiver(topic, sub);

ServiceBusReceivedMessage msg = await receiver.ReceiveMessageAsync();

When receiving from a Subscription, you have all the same choices and options that I discussed for receiving from a Queue.

Message Payload

In this simple send and receive example, you used the message body, “This is my first message!” For a real application, you're going to want to send something a little more useful than that. Before you start building a message-based architecture, it's important to give some thought to the kinds of messages you plan to send and how that relates to the overall pattern and processing logic of your application.

Heavy vs. Light Messages

In the world of messaging-based applications, I generally see messages that fall into one of two categories that I call “heavy” and “light.” Note that these are my terms and others may use different words for the same concepts (another phrasing I've seen is “stateful vs. stateless”).

A heavy message is one where the body of the message contains everything the receiver needs to process the message. Consider a real-world example. Let's say you're a sales representative, and you're working on a bid for a customer. Your company has a strict approval process. Now imagine that it's 30 years ago, so to get your bid approved you need to print it out, put a sticky note on top saying “Please review”, and place it in your manager's inbox, as shown in Figure 11.

Figure 11: An example of a “heavy” message

In this case, the queue is the physical inbox and the message is the printed bid with your post it. This message contains everything the manager needs to review the bid. They don't have to ask you for a copy or pull it from a file cabinet. It's all right there. This is a “heavy” message.

Now fast-forward to the present day and you're using a modern CRM system to prepare your bids. As you work on the bids, the data is saved in a database of some sort. When you're ready for your manager to approve the bid, you click a “Submit for Approval” button, and a message gets posted to a collaboration channel saying Bid 123 for Customer X is ready for your approval. To review and approve the bid, the manager needs to retrieve the item from the database, as shown in Figure 12.

Figure 12: An example of a “light” message

In this case, the queue is the manager's collaboration feed and the message is a simple post notifying them that they have something to approve. The message is light. It does not contain everything needed for the review. Instead, it acts as a signal telling the manager that something happened.

This distinction between heavy and light messages has significant implications for your architecture. Both approaches have their pros and cons. Heavy messages are convenient. They free the application from having to grant every component access to a repository, and they remove a shared dependency that can work against a truly decoupled architecture.

On the downside, heavy messages are stateful and if the real-world state of the data changes, they become stale, which can create data consistency issues, especially if you're not strict with message ordering. In addition, if you use heavy messages and these messages contain the only copy of the data, you need to be extra careful not to lose the messages because there's no way to recover the lost information. Imagine if your boss misplaced the only copy of that bid proposal!

Light messages are simple and easy to compose, and they support an event-based architecture where one component can signal another about an event and the other can then decide what action to take. The downside is that component systems may need to share some form of access to a repository, which can tightly re-couple the systems.

I've used both approaches and sometimes have even mixed them in a single application. The key is to understand the difference and make an intentional choice.

JSON Serialized and Binary Content

A very common way to construct the body of an Azure Service Bus message is to use a simple model class and JSON serialization. This works for both light and heavy messages. Using the example of the bid sent for approval, you might begin with a simple model summarizing the bid:

public class Bid
{
    public int BidId { get; set; }
    public DateTime BidDate { get; set; }
    public string Customer { get; set; }
    public string Description { get; set; }
    public decimal Amount { get; set; }
}

Your code to construct the message would then look like this:

Bid bid = new()
{
    BidId = 1000,
    BidDate = DateTime.UtcNow,
    Customer = "ACME Widgets",
    Description = "Plastic Pellets",
    Amount = 25000M
};

ServiceBusMessage msg = new(JsonSerializer.Serialize(bid));

// or, letting BinaryData do the JSON serialization

ServiceBusMessage msg = new() { Body = BinaryData.FromObjectAsJson(bid) };

For a heavy message, you could simply use a much larger and more complex data model.
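
On the receiving side, the same model class turns the body back into an object. The following is a minimal sketch of that round trip using only System.Text.Json. The BidSerializer helper and its method names are hypothetical, and in a real receiver the JSON string would come from msg.Body.ToString().

```csharp
#nullable enable
using System;
using System.Text.Json;

public class Bid
{
    public int BidId { get; set; }
    public DateTime BidDate { get; set; }
    public string Customer { get; set; } = "";
    public string Description { get; set; } = "";
    public decimal Amount { get; set; }
}

public static class BidSerializer
{
    // Sender side: produce the JSON string that becomes the message body.
    public static string ToJson(Bid bid) => JsonSerializer.Serialize(bid);

    // Receiver side: returns null when the body is the JSON literal "null"
    // and throws JsonException when the body is not valid JSON.
    public static Bid? FromJson(string json) => JsonSerializer.Deserialize<Bid>(json);
}
```

Because both sides share the Bid class, adding a property to the model is the only change needed to carry more data, which is what makes this approach scale from light to heavy messages.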

Interestingly, the Body property of a Service Bus Message is actually BinaryData. This means if you wanted to go with a really heavy message where a complete binary document is in the message, you could do something like this:

byte[] docBytes = await File.ReadAllBytesAsync(bidDocumentPath);

ServiceBusMessage msg = new()
{
    Subject = JsonSerializer.Serialize(bid),
    Body = new BinaryData(docBytes)
};

In this case, you're using the Subject of the message like the sticky note and the complete document is in the Body. Just remember that the Maximum Message Size in Azure Service Bus depends on your pricing tier and protocol. On the Premium Tier using AMQP Web Sockets, messages can be up to 100 MB.

Correlation IDs

Before I move on from the subject of message payload, I want to talk about Correlation IDs. To me, Correlation IDs are one of the most important and useful features of any message-based or microservices architecture. In any decoupled system where there are multiple independent components that handle part of a process, there's often the need to trace the journey of a single request across the entire chain. Consider the example of an e-commerce application.

When you click “Submit Order” in your favorite shopping app, a whole series of actions happens. A credit card service processes your payment. An email service sends you a confirmation. A fulfillment service starts picking and packing your order. A shipping service ships your order. The email service notifies you of the shipment. The billing service may send you a statement. How do you map out the journey of a single order across all these systems, each of which may have its own repository and its own log mechanism?

The answer is to use Correlation IDs. A Correlation ID is an identifier (often a GUID), assigned by the first system in the chain, and then passed through in every subsequent message. This is so important that the Azure Service Bus Message class has an explicit CorrelationId property. To add a Correlation ID to the message is very easy:

ServiceBusMessage msg = new()
{
    Subject = JsonSerializer.Serialize(bid),
    Body = new BinaryData(docBytes),
    CorrelationId = Guid.NewGuid().ToString()
};

I strongly recommend that you use them.
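
The "passed through in every subsequent message" part is the step that is easiest to forget. Here is a small sketch of one way a downstream component might carry the ID forward. The Correlation class and CreateNext method are hypothetical names, not part of the Azure SDK; only ServiceBusMessage and its CorrelationId property come from the library.

```csharp
using System;
using Azure.Messaging.ServiceBus;

public static class Correlation
{
    // Reuses the incoming ID when there is one. Only the first system in the
    // chain, which has no incoming ID, generates a new GUID.
    public static ServiceBusMessage CreateNext(string incomingCorrelationId, string body)
    {
        string id = string.IsNullOrEmpty(incomingCorrelationId)
            ? Guid.NewGuid().ToString()
            : incomingCorrelationId;

        return new ServiceBusMessage(body) { CorrelationId = id };
    }
}
```

In a receiver, you would pass the CorrelationId of the message you just received, for example Correlation.CreateNext(received.CorrelationId, "Order shipped"), and include the same value in every log entry you write while processing it.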

Receiver Modes and Options

Whether you're using simple queues or topics and subscriptions and regardless of your message payload, there's quite a bit to understand and consider when it comes to receiving Azure Service Bus messages. You have a lot of options and behaviors to be aware of.

Receive Timeout

What happens if there are no more messages in the queue when you call ReceiveMessageAsync? The answer is that your code will wait for sixty seconds (the default timeout) and then return null. You can customize the receive timeout by passing a TimeSpan to ReceiveMessageAsync. Regardless of the timeout you use, it's possible that ReceiveMessageAsync may return null, so before processing your message, you should always check that you did indeed get one:

TimeSpan timeout = TimeSpan.FromSeconds(5);

ServiceBusReceivedMessage msg = await receiver.ReceiveMessageAsync(timeout);

if (msg != null)
{
    // process the message...
}

Receive-and-Delete

Azure Service Bus supports two different receive modes: Peek-Lock and Receive-and-Delete. Understanding these receive modes is one of the most important concepts in the world of Azure Service Bus. Let's start with the simpler of the two modes, Receive-and-Delete.

With Receive-and-Delete, the act of receiving a message causes it to be immediately and permanently deleted from the queue. There's no need to do any additional processing, like completing the message. In fact, if you call CompleteMessageAsync() when using Receive-and-Delete mode, you'll get an exception.

The advantage of this approach is that the behavior is simple and easy to understand. The downside is that you can't take advantage of some of the sophisticated retry behaviors outlined below. You get one shot at the message and then it's deleted.

It's important to note that Receive-and-Delete is not the default receive mode. To use it, you need to explicitly configure your receiver as shown:

ServiceBusReceiverOptions options = new()
{
    ReceiveMode = ServiceBusReceiveMode.ReceiveAndDelete
};

ServiceBusReceiver receiver = client.CreateReceiver(queue, options);

ServiceBusReceivedMessage msg = await receiver.ReceiveMessageAsync(timeout);

Peek-Lock

With Peek-Lock, the act of receiving a message is more complicated, and potentially more confusing. When you receive the message, the server sets up a peek-lock on your message. This means that the message is marked as locked and no other receiver can receive it while it's locked. Then the server starts a timer. Your client code needs to take some sort of completion action before that timer expires. You have four options, as shown in Table 1.

Choosing the correct completion option is critical. If you processed the message successfully, you want to Complete it. But what if something goes wrong? This gets tricky and your choice depends a lot on the nature of the problem and the logic of your application. If you're dealing with a “poison message” (i.e., one that will never be able to be processed) it's best to just Dead Letter it right away, otherwise you risk clogging up your queue with bad messages. However, if the message failed but you might be able to recover and process it on a retry, then maybe Abandon is the right move. I'll return to topics related to poison messages and other issues at the end of the article.
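
ServiceBusReceiver exposes four settlement methods: CompleteMessageAsync, AbandonMessageAsync, DeadLetterMessageAsync, and DeferMessageAsync. The sketch below separates the decision from the call so the decision can be reasoned about on its own. The Settlement enum, the SettlementPolicy class, and the rule that a JsonException means poison are all assumptions made for illustration; your application will have its own rules.

```csharp
#nullable enable
using System;
using System.Text.Json;
using System.Threading.Tasks;
using Azure.Messaging.ServiceBus;

public enum Settlement { Complete, Abandon, DeadLetter, Defer }

public static class SettlementPolicy
{
    // Hypothetical rule of thumb: no error means success, a malformed payload
    // is poison, and anything else is assumed to be worth a retry.
    public static Settlement Choose(Exception? error) => error switch
    {
        null => Settlement.Complete,
        JsonException => Settlement.DeadLetter,
        _ => Settlement.Abandon
    };

    // Maps the decision onto the matching ServiceBusReceiver call.
    public static Task ApplyAsync(
        ServiceBusReceiver receiver, ServiceBusReceivedMessage msg, Settlement settlement) =>
        settlement switch
        {
            Settlement.Complete => receiver.CompleteMessageAsync(msg),
            Settlement.Abandon => receiver.AbandonMessageAsync(msg),
            Settlement.DeadLetter => receiver.DeadLetterMessageAsync(msg),
            Settlement.Defer => receiver.DeferMessageAsync(msg),
            _ => throw new ArgumentOutOfRangeException(nameof(settlement))
        };
}
```

This policy never chooses Defer. Deferral is included in ApplyAsync for completeness, since a deferred message can only be retrieved later by its sequence number and needs application logic that remembers it.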

Peek-Lock Expiration

What happens if your peek-lock expires? On the server side, if a peek-lock expires, the server treats it like an Abandon. This means the message is unlocked and the retry counter is incremented. On the client side, if your peek-lock has expired, you'll receive an exception as soon as you call one of the completion methods.

There are a couple of techniques available to you to help manage Peek-Lock expirations. First, you can configure the Message lock duration. On the Azure Portal, locate your queue and click to see the details. Under settings, you can access the Message lock duration, as shown in Figure 13.

Figure 13: Queue settings highlighting message lock duration

Click Change to adjust the duration anywhere from five seconds to five minutes. This approach can be helpful if, on a routine basis, message processing takes a bit longer than the default of one minute. By simply extending the lock duration, you can save yourself from experiencing unwanted timeouts.

The other approach you can take is to renew the lock using RenewMessageLockAsync(). Conceptually, this is simple, but implementing it is a little tricky because your code needs to somehow keep track of how much time has elapsed while you're processing your batch of messages. This involves some parallel processing code, which is beyond the scope of this article.

Receiver Patterns

So far, you've looked at code that receives a single message. A real application is going to receive and process multiple messages over a long period of time. There are several ways you can approach this.

A Message Pump

A message pump is a long-running loop that processes messages as they come in, like this:

while (true)
{
    // long running loop
    var msg = await receiver.ReceiveMessageAsync(timeout);
    if (msg != null)
    {
        // process the message
        await receiver.CompleteMessageAsync(msg);
    }
}

Please note that this loop is missing some very important parts. For production code, you need to add exception handling and a way to gracefully exit the loop with something like a CancellationToken. But this gives you the basic idea of how you could start writing a message-pump based Service Bus Receiver. The Visual Studio Sample Project associated with this article (and which you can download from www.CODEMag.com) has a more fully realized message pump for you to try out.
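
As a rough idea of what those missing parts could look like, here is a sketch of the same loop with a CancellationToken and basic exception handling. It is not the sample project's implementation. The MessagePump class and the handler callback are hypothetical, and abandoning on every processing failure is a simplifying assumption.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Azure.Messaging.ServiceBus;

public static class MessagePump
{
    // 'handler' is a hypothetical callback that holds your processing logic.
    public static async Task RunAsync(
        ServiceBusReceiver receiver,
        Func<ServiceBusReceivedMessage, Task> handler,
        CancellationToken stop)
    {
        TimeSpan timeout = TimeSpan.FromSeconds(5);

        while (!stop.IsCancellationRequested)
        {
            ServiceBusReceivedMessage msg;
            try
            {
                msg = await receiver.ReceiveMessageAsync(timeout, stop);
            }
            catch (OperationCanceledException)
            {
                break; // shutdown was requested while waiting for a message
            }

            if (msg == null)
            {
                continue; // the queue stayed empty for the whole timeout
            }

            try
            {
                await handler(msg);
                await receiver.CompleteMessageAsync(msg);
            }
            catch (Exception ex)
            {
                Console.Error.WriteLine($"Processing failed: {ex.Message}");
                await receiver.AbandonMessageAsync(msg); // unlock it for a retry
            }
        }
    }
}
```

Two gaps remain on purpose: a ServiceBusException thrown by ReceiveMessageAsync still ends the loop, and AbandonMessageAsync can itself throw if the lock has already expired. A production pump would decide how to handle both.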

Batch Processing

All of the receiver code up to this point has operated one message at a time, but it's quite possible to create a Service Bus Receiver that works in a more batch-based fashion by receiving a set of messages:

IReadOnlyList<ServiceBusReceivedMessage> messages = await 
  receiver.ReceiveMessagesAsync(10, timeout);

foreach (ServiceBusReceivedMessage msg in messages)
{
    // process the message
    await receiver.CompleteMessageAsync(msg);
}

In this example, you passed the number 10 to ReceiveMessagesAsync. This is the maximum number of messages you want returned in the batch. If there were 25 messages in the queue, you'd get the first 10. If there were only seven messages in the queue, you'd get all seven. Interestingly, you can combine the Message Pump pattern with Batch Processing and write a long-running loop that pulls batches of messages instead of single messages.

One thing to be aware of with this kind of batch processing technique: When you receive a batch of messages, the peek-lock timer starts for all of the messages in the batch. This can lead to unexpected lock timeouts if you're not careful. Let's take a moment to think this through and see how you might encounter unexpected lock expirations.

Let's say that your lock timeout has the default of one minute. Then, let's say you know that it takes about 10 seconds to process each message. Doing the math, you calculate that you can realistically pull five messages at a time because 5 * 10 = 50 seconds, which is comfortably under your timeout of 60 seconds.

Now, imagine that, for some reason, messages are moving slowly. Maybe there's a dependent service that's lagging, so now, your messages are taking 20 seconds each, rather than 10. Here's what happens:

Message 1: Total elapsed time: 20 seconds
Message 2: Total elapsed time: 40 seconds
Message 3: Total elapsed time: 60 seconds
Message 4: Total elapsed time: Error–Peek-Lock Timeout
Message 5: Total elapsed time: Error–Peek-Lock Timeout

You have several ways to manage this situation, such as lengthening the peek-lock timeout, reducing your batch size, and tracking elapsed time, so you can renew locks if needed. Just be aware of how this works. If you don't plan for it, it can be a major gotcha.
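
The arithmetic above can be turned into a small helper that picks a batch size from the lock duration and a worst-case processing time. This is only a sketch: the BatchPlanner class is hypothetical, and the default safety factor of 0.8 is an arbitrary assumption that leaves 20 percent of the lock as headroom.

```csharp
using System;

public static class BatchPlanner
{
    // Largest batch whose worst-case total processing time fits inside
    // (lockDuration * safetyFactor). Never returns less than 1.
    public static int MaxSafeBatchSize(
        TimeSpan lockDuration, TimeSpan worstCasePerMessage, double safetyFactor = 0.8)
    {
        if (worstCasePerMessage <= TimeSpan.Zero)
        {
            throw new ArgumentOutOfRangeException(nameof(worstCasePerMessage));
        }

        double budgetSeconds = lockDuration.TotalSeconds * safetyFactor;
        int size = (int)Math.Floor(budgetSeconds / worstCasePerMessage.TotalSeconds);

        return Math.Max(1, size);
    }
}
```

With a 60-second lock, a 10-second worst case allows 4 messages per batch and a 20-second worst case allows 2. The key is to plan around the slow case rather than the typical one. A result of 1 can still mean trouble if a single message takes longer than the lock, in which case a longer lock duration or lock renewal is the only fix.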

Event-Based Message Receiver

The Azure.Messaging.ServiceBus library contains a useful class called ServiceBusProcessor. This class implements a message pump for you and provides an elegant, event-based programming model. Under the hood, ServiceBusProcessor is, in fact, a wrapper around one or more ServiceBusReceivers. Many of the Azure Service Bus quick-starts and tutorials you'll find start with the ServiceBusProcessor because it's so easy to use:

ServiceBusProcessor processor = client.CreateProcessor(queue);

processor.ProcessMessageAsync += OnMessageAsync;
processor.ProcessErrorAsync += OnErrorAsync;

await processor.StartProcessingAsync();

In this code, you create a ServiceBusProcessor, wire up two event handlers, and then start the processor. Here's what the event handlers look like:

async Task OnMessageAsync(ProcessMessageEventArgs args)
{
    // process message here
}

async Task OnErrorAsync(ProcessErrorEventArgs args)
{
    // process error here
}

By default, the ServiceBusProcessor uses Peek-Lock as the receive mode and pulls one message at a time. It also uses a feature called autocomplete, which means that when your ProcessMessageAsync handler returns without throwing, the message is automatically completed for you. You don't need to call CompleteMessageAsync. Likewise, if your handler throws an exception, the processor abandons the message for you and invokes your ProcessErrorAsync handler.

You can control all these behaviors by using ServiceBusProcessorOptions. For example, let's say you wanted to use Receive-and-Delete with a batch-based receiver that can pull ten messages at a time. Your code would look like this:

ServiceBusProcessorOptions options = new()
{
    ReceiveMode = ServiceBusReceiveMode.ReceiveAndDelete,
    PrefetchCount = 10
};

ServiceBusProcessor processor = client.CreateProcessor(queue, options);

The Visual Studio sample project associated with this article on www.CODEMag.com also has a ServiceBusProcessor for you to try.
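
As another illustration of ServiceBusProcessorOptions, the sketch below turns autocomplete off so the handler settles each message explicitly, and shows one way to run the processor until a CancellationToken fires. The ProcessorDemo class is hypothetical and the MaxConcurrentCalls value of 2 is an arbitrary assumption. This is not the code from the sample project.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Azure.Messaging.ServiceBus;

public static class ProcessorDemo
{
    public static ServiceBusProcessorOptions BuildOptions() => new()
    {
        ReceiveMode = ServiceBusReceiveMode.PeekLock,
        AutoCompleteMessages = false, // the handler settles messages itself
        MaxConcurrentCalls = 2        // assumed value; tune for your workload
    };

    public static async Task RunAsync(
        ServiceBusClient client, string queue, CancellationToken stop)
    {
        await using ServiceBusProcessor processor =
            client.CreateProcessor(queue, BuildOptions());

        processor.ProcessMessageAsync += async args =>
        {
            Console.WriteLine($"Received: {args.Message.Body}");
            await args.CompleteMessageAsync(args.Message);
        };

        processor.ProcessErrorAsync += args =>
        {
            Console.Error.WriteLine($"{args.ErrorSource}: {args.Exception.Message}");
            return Task.CompletedTask;
        };

        await processor.StartProcessingAsync(stop);

        try
        {
            await Task.Delay(Timeout.Infinite, stop); // run until asked to stop
        }
        catch (OperationCanceledException)
        {
            // expected on shutdown
        }

        await processor.StopProcessingAsync();
    }
}
```

Turning autocomplete off costs one extra line in the handler but makes the settlement decision visible, which helps once you start dead-lettering some messages and abandoning others.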

Errors and Fault Tolerance

If you stopped reading this article right now, you'd likely have enough to go build a workable, messaging-based application using Azure Service Bus. But before I wrap up, I do want to talk about some error conditions that you need to anticipate and have a plan for so that your Azure Service Bus Solution will be robust and fault tolerant.

To facilitate the conversation about errors, it's helpful to have a more specific application example in mind. Imagine that you have a Weather Data collection application. There are weather stations all over that periodically send weather reports to an input queue. From there, a report processor receives the messages, validates them, and writes them to a SQL Server database, as shown in Figure 14.

Figure 14: A queue-based weather reporting app

With this application in mind, let's look at the two most important kinds of error conditions to plan for: poison messages and transient error conditions.

Poison Messages

A poison message is a message that can never be processed by your application. No matter how many times you retry it, it will never succeed. Imagine that the Weather Report application uses a ServiceBusProcessor and the ProcessMessageAsync event handler looks like this:

async Task OnMessageReceived(ProcessMessageEventArgs args)
{
    string json = args.Message.Body.ToString();

    WeatherData report = 
      JsonSerializer.Deserialize<WeatherData>(json); // poison?
    // more message processing code...
}

What happens if the message body isn't valid JSON? When you attempt to deserialize it, the call to Deserialize throws a System.Text.Json.JsonException. This is a poison message. No matter how many times you try to deserialize it, it will never work.

If you leave the code above as is (without any exception handling), that exception will cause your ProcessErrorAsync event handler to fire. Let's say this is your handler:

async Task OnMessageError(ProcessErrorEventArgs args)
{
    await LogError(args.Exception.Message);
}

What's the end result? If you're using Peek-Locks and the default Max Delivery Count of 10, this poison message is delivered 10 times and then, after the tenth failed attempt, it ends up in the Dead Letter Queue. This isn't great behavior. Now imagine that instead of one bad message, you have thousands. Maybe it's a bad actor intentionally sending poison messages. Your ServiceBusProcessor will dutifully attempt every one of them 10 times (by default) and then they will all end up in the Dead Letter Queue. In the best case, your app churns and it costs you money on your Azure subscription. In the worst case, valid requests start getting choked off.

The solution is to identify poison messages on the first attempt to process them and Dead Letter them immediately. One way to do this in the specific example is with a typed exception handler like this:

try
{
    // all the same code to process a message
}
catch (JsonException) // poison message
{
    await args.DeadLetterMessageAsync(args.Message);
}

Now this kind of poison message is removed from the queue on the first attempt instead of the tenth.
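Invalid JSON isn't the only way a report can be unprocessable. Here's a minimal sketch of one way to centralize the poison check so that the handler gets a single yes/no answer plus a reason. The WeatherData shape, the ReportParser class, and the rule that a report must carry a StationId are all assumptions made for illustration; they aren't part of the sample project.

```csharp
#nullable enable
using System;
using System.Text.Json;

// Hypothetical report shape, assumed here only for illustration.
public record WeatherData(string StationId, double TemperatureC);

public static class ReportParser
{
    // Returns true and the parsed report when the body is usable.
    // Returns false and a reason when the body can never succeed (a poison message).
    public static bool TryParse(string body, out WeatherData? report, out string? poisonReason)
    {
        report = null;
        poisonReason = null;

        try
        {
            report = JsonSerializer.Deserialize<WeatherData>(body);
        }
        catch (JsonException ex)
        {
            // Malformed JSON or a type mismatch: retrying will never help.
            poisonReason = "Invalid JSON: " + ex.Message;
            return false;
        }

        // The JSON literal null deserializes to a null reference without throwing.
        if (report is null)
        {
            poisonReason = "Body was JSON null";
            return false;
        }

        // Assumed business rule: a report without a station is unusable.
        if (string.IsNullOrWhiteSpace(report.StationId))
        {
            report = null;
            poisonReason = "Missing StationId";
            return false;
        }

        return true;
    }
}
```

When TryParse returns false, the handler can pass the reason to the DeadLetterMessageAsync overload that accepts a dead-letter reason and description, so whoever inspects the Dead Letter Queue later can see why the message was rejected.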

Transient Error Conditions

Let's consider a different kind of error. Let's say that after deserializing the weather report, you need to write to SQL. That code might look like this:

WeatherData report = JsonSerializer.Deserialize<WeatherData>(json);

using SqlConnection con = new(sqlConStr);

await con.OpenAsync(); // what if SQL is down?

await WriteToSql(con, report);

What happens if SQL Server is unavailable for a while? The call to con.OpenAsync throws an exception, so during the outage every message that comes through fails, is redelivered until it reaches the Max Delivery Count of 10, and then lands in the Dead Letter Queue. In practical terms, you might suddenly find thousands or tens of thousands of messages piled up in the Dead Letter Queue just because you had a SQL Server service interruption. This is a big mess. Your server was churning for hours doing nothing but logging errors, and now you have a pile of viable messages to somehow replay.

The trick to handling these kinds of transient error conditions is to implement a “stand-off and try later” approach. Once you know that SQL Server is unavailable, there's no point processing any more messages. You want to pause your service bus processor and only re-enable it once you have verified SQL availability:

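try
{
    // all the same code to process a message
}
catch (SqlException)  // stand off and try later
{
    await args.AbandonMessageAsync(args.Message);

    // Don't await the pause here: StopProcessingAsync waits for
    // in-flight handlers (including this one) to return.
    _ = Task.Run(() => PauseTheProcessor());
}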

The PauseTheProcessor method might look something like this:

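async Task PauseTheProcessor()
{
    // Stop pulling messages while the dependency is down.
    await processor.StopProcessingAsync();
    while (true)
    {
        // Resume only after SQL Server answers again.
        if (await TestTheSqlConnection())
        {
            await processor.StartProcessingAsync();
            break;
        }
        await Task.Delay(TimeSpan.FromSeconds(60)); // wait before probing again
    }
}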
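The waiting part of that loop can be factored out and made a little friendlier to a struggling dependency. The following is only a sketch under my own assumptions: DependencyGate and WaitUntilHealthyAsync are hypothetical names, and I've swapped the fixed 60-second delay for exponential backoff with a cap, plus a CancellationToken so the wait can be abandoned at shutdown. The probe is expected to return false, rather than throw, while the dependency is still down.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class DependencyGate
{
    // Polls a health probe until it reports healthy, doubling the wait after
    // each failed probe up to maxDelay. Returns the number of probes made.
    public static async Task<int> WaitUntilHealthyAsync(
        Func<Task<bool>> probe,
        TimeSpan initialDelay,
        TimeSpan maxDelay,
        CancellationToken cancellationToken = default)
    {
        // Never start above the cap.
        TimeSpan delay = initialDelay < maxDelay ? initialDelay : maxDelay;
        int attempts = 0;

        while (true)
        {
            cancellationToken.ThrowIfCancellationRequested();

            attempts++;
            if (await probe())
            {
                return attempts;
            }

            // Throws OperationCanceledException if cancelled while waiting.
            await Task.Delay(delay, cancellationToken);

            // Double the delay, but never exceed the cap.
            delay = TimeSpan.FromTicks(Math.Min(delay.Ticks * 2, maxDelay.Ticks));
        }
    }
}
```

With this in place, PauseTheProcessor could stop the processor, await DependencyGate.WaitUntilHealthyAsync(TestTheSqlConnection, TimeSpan.FromSeconds(5), TimeSpan.FromMinutes(5)), and then start the processor again. The five-second and five-minute values are placeholders; pick delays that suit how quickly your dependency typically recovers.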

I should note that this technique applies not only to a SQL Server connection but to any dependencies in your code. Maybe you're pulling data from one or more APIs. Maybe you need to send messages to other queues. Any one of these external dependencies is a possible source of transient error conditions. You need to try to account for them all.
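One way to keep track of all those dependencies is to put the poison-versus-transient decision in a single place. The sketch below is hypothetical: FailureKind and FailureClassifier are illustrative names, and which exception types count as transient is an assumption that depends entirely on your own dependencies. SqlException, for instance, would join the transient group once its package is referenced.

```csharp
using System;
using System.Net.Http;
using System.Text.Json;

public enum FailureKind
{
    Poison,     // will never succeed: dead-letter immediately
    Transient,  // may succeed later: abandon and stand off
    Unknown     // not classified: fall back to the default retry behavior
}

public static class FailureClassifier
{
    // Maps an exception to the handling strategy discussed in this section.
    // The groupings are assumptions for a typical app; adjust them to yours.
    public static FailureKind Classify(Exception ex) => ex switch
    {
        JsonException => FailureKind.Poison,           // malformed message body
        FormatException => FailureKind.Poison,         // unparseable field value
        HttpRequestException => FailureKind.Transient, // downstream API unreachable
        TimeoutException => FailureKind.Transient,     // dependency too slow right now
        _ => FailureKind.Unknown
    };
}
```

A handler can then wrap its processing in a single catch (Exception ex) block and switch on FailureClassifier.Classify(ex), which keeps the list of known failure modes in one file as the number of dependencies grows.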

Conclusion

At a high level, a message-based architecture is easy to conceptualize, and Azure Service Bus provides tools that make one easy to build. Below the surface, you have some important choices to make if you want a robust message-based application that behaves the way you intend:

Use simple Queues or Topics and Subscriptions.
Use heavy (stateful) or light (stateless) messages.
Use Peek-Lock or Receive-and-Delete.
Use the ServiceBusProcessor or roll your own message pump.
Pull one message at a time or use batches.
Enforce strict FIFO ordering or allow techniques that can end up with messages out of order.
Abandon and retry failed messages or Dead Letter them.

In summary, you have a lot to think about when building a Service Bus architecture. I find that it usually takes some careful thinking and a bit of trial and error to find which pattern and combination works best. My hope is that you now have enough information to get started and make solid choices with Azure Service Bus.

Table 1: Four options for receiving a message

Option	Method	Description
Complete	CompleteMessageAsync(msg);	The message is removed from the queue. This is used for normal, successful message processing.
Dead Letter	DeadLetterMessageAsync(msg);	The message is removed from the queue and placed in a special Dead Letter Queue. This is generally used when a message is not processable. It's possible to "replay" messages in the Dead Letter queue using the Azure Portal or CLI.
Abandon	AbandonMessageAsync(msg);	The message is unlocked and its retry counter is incremented. This is used when the message might be able to be processed in the future, so you want to put it back in the queue and retry later. The number of allowed retries is configurable. When a message hits this maximum number of allowed deliveries, the next Abandon will send it to the Dead Letter Queue instead.
Defer	DeferMessageAsync(msg);	The message technically stays in the main queue but is marked as deferred. To receive this message again in the future, you need to save its SequenceNumber and receive it using the special ReceiveDeferredMessageAsync method. This approach creates complex message ordering and access issues and should be used with caution.