Message Queue Telemetry Transport (MQTT) - Note


9 Message Queue Telemetry Transport (MQTT)

MQTT is an OASIS standard messaging protocol for the Internet of Things (IoT). It is designed as an
extremely lightweight publish/subscribe messaging transport that is ideal for connecting remote
devices with a small code footprint and minimal network bandwidth. MQTT today is used in a wide
variety of industries, such as automotive, manufacturing, telecommunications, oil and gas, etc.

9.1 Introduction to MQTT


“MQTT is a Client Server publish/subscribe messaging transport protocol. It is light weight, open,
simple, and designed so as to be easy to implement. These characteristics make it ideal for use in many
situations, including constrained environments such as for communication in Machine to Machine
(M2M) and Internet of Things (IoT) contexts where a small code footprint is required and/or network
bandwidth is at a premium.”

Citation from the official MQTT 3.1.1 specification

The abstract of the MQTT specification does a good job describing what MQTT is all about. It is
a very lightweight, binary protocol, and due to its minimal packet overhead, MQTT excels when
transferring data over the wire in comparison to protocols like HTTP. Another important aspect of
the protocol is that MQTT is extremely easy to implement on the client side. Ease of use was a key
concern in the development of MQTT and makes it a perfect fit for constrained devices with limited
resources today.

9.2 History
The MQTT protocol was invented in 1999 by Andy Stanford-Clark (IBM) and Arlen Nipper (Arcom, now
Cirrus Link). They needed a protocol for minimal battery loss and minimal bandwidth to connect with
oil pipelines via satellite. The two inventors specified several requirements for the future protocol:

• Simple implementation
• Quality of Service data delivery
• Lightweight and bandwidth efficient
• Data agnostic
• Continuous session awareness

These goals are still at the core of MQTT. However, the primary focus of the protocol has
changed from proprietary embedded systems to open Internet of Things (IoT) use cases. This shift in
focus has created a lot of confusion about what the acronym MQTT stands for. The short answer is
that MQTT is no longer considered an acronym. MQTT is simply the name of the protocol.

The longer answer is that the former acronym stood for MQ Telemetry Transport.

“MQ” refers to the MQ Series, a product IBM developed to support MQ telemetry transport.
When Andy and Arlen created their protocol in 1999, they named it after the IBM product. Many
sources label MQTT incorrectly as a message queue protocol. That is simply not true. MQTT is not a
traditional message queuing solution (although it is possible to queue messages in certain cases, a fact
that we discuss in detail in an upcoming post). Over the next ten years, IBM used the protocol
internally until they released MQTT 3.1 as a royalty-free version in 2010. Since then, everyone is
welcome to implement and use the protocol.

HiveMQ became acquainted with MQTT in 2012 and built the first version of HiveMQ that
very same year. In 2013, HiveMQ was released to the public. Along with the release of the protocol
specification, IBM contributed MQTT client implementations to the newly founded Paho project of
the Eclipse Foundation. These events were definitely a big thing for the protocol because there is little
chance for wide adoption without a supportive ecosystem.

9.3 OASIS Standard and current version


Approximately 3 years after the initial publication, it was announced that MQTT would be
standardized under the wings of OASIS, an open organization with the purpose of advancing
standards. AMQP, SAML, and DocBook are just a few of the previously released OASIS standards. The
standardization process took around 1 year. On October 29, 2014 MQTT became an officially
approved OASIS Standard. The minor version change from 3.1 to 3.1.1 shows that few changes were
made to the previous version. For detailed information about these changes, see our blog post on the
advantages of 3.1.1.

In March 2019, OASIS ratified the new MQTT 5 specification. This new MQTT version
introduced new features to MQTT that are required for IoT applications deployed on cloud platforms,
and those that require more reliability and error handling to implement mission-critical messaging.

The use of MQTT 5 is recommended.

Figure 9-1 MQTT timeline

9.3.1 MQTT 5 Design Goals


The OASIS technical committee (TC) that is responsible for specifying and standardizing MQTT faced a
complex balancing act:

• Add features that long-term users want without increasing overhead or decreasing ease of
use.
• Improve performance and scalability without adding unnecessary complexity.

The TC decided on the following functional objectives for the MQTT 5 specification:

• Enhancement for scalability and large-scale systems
• Improved error reporting
• Formalize common patterns including capability discovery and request response
• Extensibility mechanisms including user properties
• Performance improvements and support for small clients

Based on these objectives and the needs of existing MQTT deployments, the TC managed to
specify several extremely useful new features. Sophisticated MQTT brokers like the HiveMQ
Enterprise MQTT Broker already implemented features such as Shared Subscriptions and Time to Live
for messages and client sessions in MQTT 3.1.1. With the release of MQTT 5, these popular features
became part of the official standard.

A key goal of the new specification is enhancement for scalability and large-scale systems.
MQTT 3.1.1 proved that MQTT is a uniquely scalable and stateful IoT protocol. (For example, the
HiveMQ enterprise MQTT broker was benchmarked at 10,000,000 simultaneous MQTT
connections on cloud infrastructure for a single MQTT broker cluster.) The design of MQTT 5 aims to
make it even easier for an MQTT broker to scale to immense numbers of concurrently-connected
clients. In this series, we’ll examine how the new version handles a broad spectrum of IoT use cases
and large-scale deployments of MQTT.

9.3.2 Trivia: What Happened to Four?


You might be curious why the successor to MQTT 3.1.1 is MQTT 5.

The answer is surprisingly simple: The MQTT protocol defines a single-byte value for the protocol
version (the protocol level) in the variable header of the CONNECT packet.

If you inspect a few CONNECT packets on the wire, you’ll notice something interesting: MQTT
3.1 has the value "3" as protocol version and MQTT 3.1.1 has the value "4". To synchronize the
protocol version value on the wire with the official protocol version name, the new MQTT version gets
to use "5" for both the protocol name and value.
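The version values described above can be made visible with a few lines of Python. This is only a sketch: the helper name protocol_header is hypothetical, and the protocol name "MQIsdp" for MQTT 3.1 is taken from the MQTT 3.1 specification rather than from this text. It builds the first bytes of the CONNECT variable header (length-prefixed protocol name followed by the version byte).

```python
import struct

# (protocol name, version byte) as they appear at the start of the
# CONNECT variable header. "MQIsdp" for 3.1 is an assumption taken from
# the MQTT 3.1 specification, not from this chapter.
PROTOCOL_VERSIONS = {
    "3.1": ("MQIsdp", 3),
    "3.1.1": ("MQTT", 4),
    "5": ("MQTT", 5),
}


def protocol_header(version: str) -> bytes:
    """Return the length-prefixed protocol name followed by the version byte."""
    name, level = PROTOCOL_VERSIONS[version]
    encoded = name.encode("utf-8")
    return struct.pack("!H", len(encoded)) + encoded + bytes([level])


for version in PROTOCOL_VERSIONS:
    # The last byte of each line is the version value seen on the wire.
    print(version, protocol_header(version).hex(" "))
```

The last byte printed for each version is the value a packet capture would show: 03 for MQTT 3.1, 04 for MQTT 3.1.1, and 05 for MQTT 5.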

9.3.3 Why Five?


If you are still wondering whether MQTT 5 is worth your while, stay with us as we tackle this question.
In our next MQTT 5 Essentials post, we’ll establish a clear overview of the foundational changes that
MQTT 5 introduces. In part three, we’ll examine the top reasons existing MQTT users have decided
to upgrade to MQTT 5.

https://www.youtube.com/watch?v=r89uHL2wj5Q

https://youtu.be/RPf_rr1ZDvE

9.4 The publish/subscribe pattern


The publish/subscribe pattern (also known as pub/sub) provides an alternative to traditional client-
server architecture. In the client-server model, a client communicates directly with an endpoint. The
pub/sub model decouples the client that sends a message (the publisher) from the client or clients
that receive the messages (the subscribers). The publishers and subscribers never contact each other
directly. In fact, they are not even aware that the other exists. The connection between them is
handled by a third component (the broker). The job of the broker is to filter all incoming messages
and distribute them correctly to subscribers. So, let’s dive a little deeper into some of the general
aspects of pub/sub (we’ll talk about MQTT specifics in a minute).

Figure 9-2 MQTT Publish/Subscribe Architecture

The most important aspect of pub/sub is the decoupling of the publisher of the message from the
recipient (subscriber). This decoupling has several dimensions:

• Space decoupling: Publisher and subscriber do not need to know each other (for example, no
exchange of IP address and port).
• Time decoupling: Publisher and subscriber do not need to run at the same time.
• Synchronization decoupling: Operations on both components do not need to be interrupted
during publishing or receiving.

In summary, the pub/sub model removes direct communication between the publisher of the
message and the recipient/subscriber. The filtering activity of the broker makes it possible to control
which client/subscriber receives which message. The decoupling has three dimensions: space, time,
and synchronization.
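To make the role of the broker concrete, here is a minimal in-process sketch. It is not an MQTT implementation: the class ToyBroker and its methods are hypothetical, topics are matched by exact string comparison, and there is no network, no QoS, and no message storage. It only illustrates space decoupling: the publisher and the subscriber each talk to the broker and never to each other.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

Handler = Callable[[str, bytes], None]


class ToyBroker:
    """Hypothetical in-process broker: exact-match topics, no network, no QoS."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        # The subscriber registers interest with the broker only.
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, payload: bytes) -> int:
        """Deliver payload to every handler on the topic; return the delivery count."""
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(topic, payload)
        return len(handlers)


broker = ToyBroker()
received = []
broker.subscribe("myhome/livingroom/temperature",
                 lambda topic, payload: received.append((topic, payload)))

delivered = broker.publish("myhome/livingroom/temperature", b"21.5")
ignored = broker.publish("myhome/kitchen/temperature", b"19.0")  # nobody listens
print(delivered, ignored, received)
```

The second publish call succeeds even though no subscriber exists for that topic, which is exactly the situation a publisher cannot detect in a pub/sub system.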

9.4.1 Scalability
Pub/Sub scales better than the traditional client-server approach. This is because operations on the
broker can be highly parallelized and messages can be processed in an event-driven way. Message
caching and intelligent routing of messages are often decisive factors for improving scalability.
Nonetheless, scaling up to millions of connections is a challenge. Such a high level of connections can
be achieved with clustered broker nodes to distribute the load over more individual servers using load
balancers. (This topic is beyond the scope of the current article; we’ll cover it in a separate post.)

9.4.2 Message filtering


It’s clear that the broker plays a pivotal role in the pub/sub process. But how does the broker manage
to filter all the messages so that each subscriber receives only messages of interest? As you’ll see, the
broker has several filtering options:
9.4.2.1 OPTION 1: SUBJECT-BASED FILTERING
This filtering is based on the subject or topic that is part of each message. The receiving client
subscribes to the broker for topics of interest. From that point on, the broker ensures that the
receiving client gets all messages published to the subscribed topics. In general, topics are strings with
a hierarchical structure that allow filtering based on a limited number of expressions.

9.4.2.2 OPTION 2: CONTENT-BASED FILTERING


In content-based filtering, the broker filters the message based on a specific content filter-language.
The receiving clients subscribe to filter queries for the messages in which they are interested. A significant
downside to this method is that the content of the message must be known beforehand and cannot
be encrypted or easily changed.

9.4.2.3 OPTION 3: TYPE-BASED FILTERING


When object-oriented languages are used, filtering based on the type/class of a message (event) is a
common practice. For example, a subscriber can listen to all messages that are of type Exception
or any sub-type.

Of course, publish/subscribe is not the answer for every use case. There are a few things you
need to consider before you use this model. The decoupling of publisher and subscriber, which is the
key in pub/sub, presents a few challenges of its own. For example, you need to be aware of how the
published data is structured beforehand. For subject-based filtering, both publisher and subscriber
need to know which topics to use. Another thing to keep in mind is message delivery. The publisher
can’t assume that somebody is listening to the messages that are sent. In some instances, it is possible
that no subscriber reads a particular message.

9.4.3 MQTT
Now that we’ve explored the publish/subscribe model in general, let’s focus on MQTT specifically.
Depending on what you want to achieve, MQTT embodies all the aspects of pub/sub that we’ve
mentioned:

• MQTT decouples the publisher and subscriber spatially. To publish or receive messages,
publishers and subscribers only need to know the hostname/IP and port of the broker.
• MQTT decouples by time. Although most MQTT use cases deliver messages in near-real time,
if desired, the broker can store messages for clients that are not online. (Two conditions must
be met to store messages: the client had connected with a persistent session and subscribed
to a topic with a Quality of Service greater than 0).
• MQTT works asynchronously. Because most client libraries work asynchronously and are
based on callbacks or a similar model, tasks are not blocked while waiting for a message or
publishing a message. In certain use cases, synchronization is desirable and possible. To wait
for a certain message, some libraries have synchronous APIs. But the flow is usually
asynchronous.

Another thing that should be mentioned is that MQTT is especially easy to use on the client-
side. Most pub/sub systems have the logic on the broker-side, but MQTT is really the essence of
pub/sub when using a client library and that makes it a light-weight protocol for small and constrained
devices.
MQTT uses subject-based filtering of messages. Every message contains a topic
(subject) that the broker can use to determine whether a subscribing client gets the message or not.

To handle the challenges of a pub/sub system, MQTT has three Quality of Service (QoS)
levels. You can easily specify that a message gets successfully delivered from the client to the broker
or from the broker to a client. However, there is the chance that nobody subscribes to the particular
topic. If this is a problem, the broker must know how to handle the situation. For example, the HiveMQ
MQTT broker has a plugin system that can resolve such cases. You can have the broker take action or
simply log every message into a database for historical analyses. To keep the hierarchical topic tree
flexible, it is important to design the topic tree very carefully and leave room for future use cases. If
you follow these strategies, MQTT is perfect for production setups.

9.4.4 Distinction from message queues


There is a lot of confusion about the name MQTT and whether the protocol is implemented as a
message queue or not. We will try to shed some light on the topic and explain the differences. As
mentioned in the History section, the “MQ” in MQTT refers to the MQ Series product from IBM and has
nothing to do with “message queue”. Regardless of where the name comes from, it’s useful to understand the
differences between MQTT and a traditional message queue:

A message queue stores messages until they are consumed. When you use a message queue,
each incoming message is stored in the queue until it is picked up by a client (often called a consumer).
If no client picks up the message, the message remains stuck in the queue and waits to be consumed.
In a message queue, it is not possible for a message not to be processed by any client, as it is in MQTT
if nobody subscribes to a topic.

A message is only consumed by one client. Another big difference is that in a traditional
message queue a message can be processed by one consumer only. The load is distributed between
all consumers for a queue. In MQTT the behavior is quite the opposite: every subscriber that
subscribes to the topic gets the message.

Queues are named and must be created explicitly. A queue is far more rigid than a topic.
Before a queue can be used, the queue must be created explicitly with a separate command. Only
after the queue is named and created is it possible to publish or consume messages. In contrast, MQTT
topics are extremely flexible and can be created on the fly.
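The difference in delivery semantics can be sketched in a few lines of Python. The consumer and subscriber names are made up for illustration, and the round-robin distribution is just one possible way a queue might spread load; real queueing systems use their own strategies.

```python
from collections import deque

messages = ["m1", "m2", "m3", "m4"]

# Message queue: each message is removed from the queue by exactly one
# consumer (distributed round-robin in this sketch).
queue = deque(messages)
queue_consumers = {"worker-a": [], "worker-b": []}
names = list(queue_consumers)
turn = 0
while queue:
    queue_consumers[names[turn % len(names)]].append(queue.popleft())
    turn += 1

# Topic: every subscriber receives its own copy of every message.
topic_subscribers = {"dashboard": [], "logger": []}
for message in messages:
    for inbox in topic_subscribers.values():
        inbox.append(message)

print(queue_consumers)    # the load is split between the workers
print(topic_subscribers)  # each subscriber sees all four messages
```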

https://youtu.be/HCzQJMdHcy0

9.5 MQTT Client, Broker/Server, and Connection Establishment
Because MQTT decouples the publisher from the subscriber, client connections are always handled by
a broker. Before we get into the details of these connections, let’s be clear about what we mean by
client and broker.

9.5.1 Client
When we talk about a client, we almost always mean an MQTT client. Both publishers and subscribers
are MQTT clients. The publisher and subscriber labels refer to whether the client is currently
publishing messages or subscribed to receive messages (publish and subscribe functionality can also
be implemented in the same MQTT client). An MQTT client is any device (from a microcontroller up
to a full-fledged server) that runs an MQTT library and connects to an MQTT broker over a
network. For example, the MQTT client can be a very small, resource-constrained device that connects
over a wireless network and has a bare-minimum library. The MQTT client can also be a typical
computer running a graphical MQTT client for testing purposes. Basically, any device that speaks
MQTT over a TCP/IP stack can be called an MQTT client. The client implementation of the MQTT
protocol is very straightforward and streamlined. The ease of implementation is one of the reasons
why MQTT is ideally suited for small devices. MQTT client libraries are available for a huge variety of
programming languages and platforms, for example Android, Arduino, C, C++, C#, Go, iOS, Java,
JavaScript, and .NET.

9.5.2 Broker
The counterpart of the MQTT client is the MQTT broker. The broker is at the heart of any
publish/subscribe protocol. Depending on the implementation, a broker can handle up to millions of
concurrently connected MQTT clients.

The broker is responsible for receiving all messages, filtering the messages, determining who
is subscribed to each message, and sending the message to these subscribed clients. The broker also
holds the session data of all clients that have persistent sessions, including subscriptions and missed
messages. Another responsibility of the broker is the authentication and authorization of clients.
Usually, the broker is extensible, which facilitates custom authentication, authorization, and
integration into backend systems. Integration is particularly important because the broker is
frequently the component that is directly exposed on the internet, handles a lot of clients, and needs
to pass messages to downstream analyzing and processing systems. In brief, the broker is the central
hub through which every message must pass. Therefore, it is important that your broker is highly
scalable, integratable into backend systems, easy to monitor, and (of course) failure-resistant.

9.5.3 MQTT Connection


The MQTT protocol is based on TCP/IP. Both the client and the broker need to have a TCP/IP stack.

Figure 9-3 MQTT TCP/IP stack

The MQTT connection is always between one client and the broker. Clients never connect to
each other directly. To initiate a connection, the client sends a CONNECT message to the broker. The
broker responds with a CONNACK message and a status code. Once the connection is established, the
broker keeps it open until the client sends a disconnect command or the connection breaks.

Figure 9-4 Connect Flow


9.5.4 MQTT connection through a NAT
In many common use cases, the MQTT client is located behind a router that uses network address
translation (NAT) to translate from a private network address (like 192.168.x.x, 10.0.x.x) to a public
facing address. As we already mentioned, the MQTT client initiates the connection by sending a
CONNECT message to the broker. Because the broker has a public address and keeps the connection
open to allow bidirectional sending and receiving of messages (after the initial CONNECT), there is no
problem at all with clients that are located behind a NAT.

9.5.5 Client initiates connection with the CONNECT message


Now let’s look at the MQTT CONNECT command message. To initiate a connection, the client sends a
command message to the broker. If this CONNECT message is malformed (according to the MQTT
specification) or too much time passes between opening a network socket and sending the connect
message, the broker closes the connection. This behavior deters malicious clients that can slow the
broker down. A good-natured MQTT 3 client sends a connect message with the following
content (among other things):

Figure 9-5 Connect

Some information included in a CONNECT message is probably more interesting to
implementers of an MQTT library than to users of that library. For all the details, have a look at
the MQTT 3.1.1 specification.

We will focus on the following options:

9.5.5.1 ClientId
The client identifier (ClientId) identifies each MQTT client that connects to an MQTT broker. The
broker uses the ClientId to identify the client and the current state of the client. Therefore, this ID
should be unique per client and broker. In MQTT 3.1.1 you can send an empty ClientId if you don’t
need a state to be held by the broker. The empty ClientId results in a connection without any state. In
this case, the clean session flag must be set to true or the broker will reject the connection.
9.5.5.2 Clean Session
The clean session flag tells the broker whether the client wants to establish a persistent session or not.
In a persistent session (CleanSession = false), the broker stores all subscriptions for the client and all
missed messages for the client that subscribed with a Quality of Service (QoS) level 1 or 2. If the
session is not persistent (CleanSession = true), the broker does not store anything for the client and
purges all information from any previous persistent session.

9.5.5.3 Username/Password
MQTT can send a user name and password for client authentication and authorization. However, if
this information isn’t encrypted or hashed (either by implementation or TLS), the password is sent in
plain text. We highly recommend the use of user names and passwords together with a secure
transport. Brokers like HiveMQ can authenticate clients with an SSL certificate, so no username and
password is needed.

9.5.5.4 Will Message


The last will message is part of the Last Will and Testament (LWT) feature of MQTT. This message
notifies other clients when a client disconnects ungracefully. When a client connects, it can provide
the broker with a last will in the form of an MQTT message and topic within the CONNECT message. If
the client disconnects ungracefully, the broker sends the LWT message on behalf of the client.

9.5.5.5 KeepAlive
The keep alive is a time interval in seconds that the client specifies and communicates to the broker
when the connection is established. This interval defines the longest period of time that the broker and
client can endure without sending a message. The client commits to sending regular PING Request
messages to the broker. The broker responds with a PING response. This method allows both sides to
determine if the other one is still available.

Basically, that is all the information you need to connect to an MQTT broker from an MQTT
3.1.1 client. Individual libraries often have additional options that you can configure, for example the
way that queued messages are stored in a specific implementation.
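For readers who want to see how these options end up on the wire, the following sketch encodes an MQTT 3.1.1 CONNECT packet by hand. The function names are hypothetical, the byte layout follows our reading of the MQTT 3.1.1 specification, and the last will is always sent with QoS 0 and without the retain flag to keep the example short. In practice a client library does this work for you.

```python
import struct
from typing import Optional


def encode_remaining_length(length: int) -> bytes:
    """Fixed-header length field: 7 bits per byte, top bit means 'more bytes follow'."""
    if not 0 <= length <= 268_435_455:
        raise ValueError("remaining length out of range")
    out = bytearray()
    while True:
        length, digit = divmod(length, 128)
        out.append(digit | (0x80 if length else 0))
        if not length:
            return bytes(out)


def utf8_field(text: str) -> bytes:
    """Two-byte big-endian length followed by the UTF-8 bytes."""
    data = text.encode("utf-8")
    return struct.pack("!H", len(data)) + data


def build_connect(client_id: str, clean_session: bool = True, keep_alive: int = 60,
                  username: Optional[str] = None, password: Optional[str] = None,
                  will_topic: Optional[str] = None, will_payload: bytes = b"") -> bytes:
    if not client_id and not clean_session:
        raise ValueError("an empty ClientId requires clean_session=True")
    if password is not None and username is None:
        raise ValueError("MQTT 3.1.1 does not allow a password without a user name")

    flags = 0x02 if clean_session else 0x00          # bit 1: clean session
    payload = utf8_field(client_id)
    if will_topic is not None:
        flags |= 0x04                                # bit 2: will flag (QoS 0, no retain)
        payload += utf8_field(will_topic)
        payload += struct.pack("!H", len(will_payload)) + will_payload
    if username is not None:
        flags |= 0x80                                # bit 7: user name present
        payload += utf8_field(username)
    if password is not None:
        flags |= 0x40                                # bit 6: password present
        payload += utf8_field(password)

    # Variable header: protocol name, protocol level 4, flags, keep alive in seconds.
    variable_header = utf8_field("MQTT") + bytes([4, flags]) + struct.pack("!H", keep_alive)
    body = variable_header + payload
    return bytes([0x10]) + encode_remaining_length(len(body)) + body


packet = build_connect("sensor-1", clean_session=True, keep_alive=60)
print(packet.hex(" "))
```

Note that the user name and password appear in the packet as plain bytes, which is why a secure transport is recommended above.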

9.5.6 Broker response with a CONNACK message


When a broker receives a CONNECT message, it is obligated to respond with a CONNACK message.

The CONNACK message contains two data entries:

• The session present flag
• A connect return code

9.5.6.1 Session Present flag


The session present flag tells the client whether the broker already has a persistent session available
from previous interactions with the client. When a client connects with Clean Session set to true, the
session present flag is always false because there is no session available. If a client connects with Clean
Session set to false, there are two possibilities: If the broker has stored session information for the
ClientId, the session present flag is true. Otherwise, if the broker
does not have any session information for the ClientId, the session present flag is false. This flag was
added in MQTT 3.1.1 to help clients determine whether they need to subscribe to topics or if the topics
are still stored in a persistent session.

9.5.6.2 Connect return code


The second entry in the CONNACK message is the connect return code. This code
tells the client whether the connection attempt was successful or not.

Figure 9-6 Connack

Here are the return codes at a glance:

Return Code    Return Code Response
0              Connection accepted
1              Connection refused, unacceptable protocol version
2              Connection refused, identifier rejected
3              Connection refused, server unavailable
4              Connection refused, bad user name or password
5              Connection refused, not authorized

For a more detailed explanation of each of these codes, see the MQTT specification.
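As an illustration, a client could decode a CONNACK roughly as follows. This is a sketch under assumptions: the names Connack and parse_connack are hypothetical, and the four-byte layout (packet type 0x20, remaining length 2, flags byte, return code byte) follows our reading of the MQTT 3.1.1 specification.

```python
from typing import NamedTuple

CONNECT_RETURN_CODES = {
    0: "Connection accepted",
    1: "Connection refused, unacceptable protocol version",
    2: "Connection refused, identifier rejected",
    3: "Connection refused, server unavailable",
    4: "Connection refused, bad user name or password",
    5: "Connection refused, not authorized",
}


class Connack(NamedTuple):
    session_present: bool
    return_code: int
    description: str


def parse_connack(packet: bytes) -> Connack:
    """Decode a 4-byte MQTT 3.1.1 CONNACK packet."""
    if len(packet) != 4 or packet[0] != 0x20 or packet[1] != 0x02:
        raise ValueError("not an MQTT 3.1.1 CONNACK packet")
    session_present = bool(packet[2] & 0x01)   # bit 0 of the flags byte
    return_code = packet[3]
    description = CONNECT_RETURN_CODES.get(return_code, "Unknown return code")
    return Connack(session_present, return_code, description)


accepted = parse_connack(bytes([0x20, 0x02, 0x01, 0x00]))
refused = parse_connack(bytes([0x20, 0x02, 0x00, 0x05]))
print(accepted)
print(refused)
```

A client that sees session_present set to true after reconnecting can skip re-subscribing, because the broker still holds its subscriptions.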

https://youtu.be/vVJk5rES5vY

9.6 MQTT Publish, Subscribe & Unsubscribe


9.6.1 Publish
An MQTT client can publish messages as soon as it connects to a broker. MQTT utilizes topic-based
filtering of the messages on the broker. Each message must contain a topic that the broker can use
to forward the message to interested clients. Typically, each message has a payload which contains
the data to transmit in byte format. MQTT is data-agnostic. The use case of the client determines
how the payload is structured. The sending client (publisher) decides whether it wants to send binary
data, text data, or even full-fledged XML or JSON.

A PUBLISH message in MQTT has several attributes that we want to discuss in detail:
Figure 9-7 Publish Packet

Topic Name - The topic name is a simple string that is hierarchically structured with forward slashes as
delimiters. For example, “myhome/livingroom/temperature” or
“Germany/Munich/Octoberfest/people”.

QoS - This number indicates the Quality of Service Level (QoS) of the message. There are three levels:
0, 1, and 2. The service level determines what kind of guarantee a message has for reaching the
intended recipient (client or broker).

Retain Flag - This flag defines whether the message is saved by the broker as the last known good
value for a specified topic. When a new client subscribes to a topic, they receive the last message that
is retained on that topic.

Payload - This is the actual content of the message. MQTT is data-agnostic. It is possible to send
images, text in any encoding, encrypted data, and virtually any other data in binary form.

Packet Identifier - The packet identifier uniquely identifies a message as it flows between the client
and broker. The packet identifier is only relevant for QoS levels greater than zero. The client library
and/or the broker is responsible for setting this internal MQTT identifier.

DUP flag - The flag indicates that the message is a duplicate and was resent because the intended
recipient (client or broker) did not acknowledge the original message. This is only relevant for QoS
greater than 0. Usually, the resend/duplicate mechanism is handled by the MQTT client library or the
broker as an implementation detail.

When a client sends a message to an MQTT broker for publication, the broker reads the message,
acknowledges the message (according to the QoS Level), and processes the message. Processing by
the broker includes determining which clients have subscribed to the topic and sending the message
to them.
Figure 9-8 Publish flow

The client that initially publishes the message is only concerned about delivering the PUBLISH message
to the broker. Once the broker receives the PUBLISH message, it is the responsibility of the broker to
deliver the message to all subscribers. The publishing client does not get any feedback about whether
anyone is interested in the published message or how many clients received the message from the
broker.
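The attributes of a PUBLISH message map onto the packet bytes as sketched below. The function names are hypothetical and the layout follows our reading of the MQTT 3.1.1 specification: DUP, QoS, and retain share the first byte, the topic name comes next, and the packet identifier is present only for QoS 1 and 2.

```python
import struct
from typing import Optional


def encode_remaining_length(length: int) -> bytes:
    """Fixed-header length field: 7 bits per byte, top bit means 'more bytes follow'."""
    out = bytearray()
    while True:
        length, digit = divmod(length, 128)
        out.append(digit | (0x80 if length else 0))
        if not length:
            return bytes(out)


def build_publish(topic: str, payload: bytes, qos: int = 0, retain: bool = False,
                  dup: bool = False, packet_id: Optional[int] = None) -> bytes:
    if qos not in (0, 1, 2):
        raise ValueError("QoS must be 0, 1, or 2")
    if not topic or "+" in topic or "#" in topic:
        raise ValueError("topic names must be non-empty and contain no wildcards")
    if qos == 0 and dup:
        raise ValueError("the DUP flag is only meaningful for QoS 1 and 2")

    # First byte: packet type 3 in the high nibble, then DUP, QoS (2 bits), retain.
    first_byte = 0x30 | (int(dup) << 3) | (qos << 1) | int(retain)

    topic_bytes = topic.encode("utf-8")
    body = struct.pack("!H", len(topic_bytes)) + topic_bytes
    if qos > 0:
        if packet_id is None or not 1 <= packet_id <= 0xFFFF:
            raise ValueError("QoS 1 and 2 need a packet identifier between 1 and 65535")
        body += struct.pack("!H", packet_id)
    body += payload                      # the payload is opaque bytes (data-agnostic)
    return bytes([first_byte]) + encode_remaining_length(len(body)) + body


qos0 = build_publish("myhome/livingroom/temperature", b"21.5")
qos1_retained = build_publish("a/b", b"hi", qos=1, retain=True, packet_id=10)
print(qos0.hex(" "))
print(qos1_retained.hex(" "))
```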

9.6.2 Subscribe
Publishing a message doesn’t make sense if no one ever receives it. In other words, if there are no
clients to subscribe to the topics of the messages. To receive messages on topics of interest, the client
sends a SUBSCRIBE message to the MQTT broker. This subscribe message is very simple: it contains a
unique packet identifier and a list of subscriptions.

Figure 9-9 Subscribe Packet

Packet Identifier - The packet identifier uniquely identifies a message as it flows between the client
and broker. The client library and/or the broker is responsible for setting this internal MQTT identifier.

List of Subscriptions - A SUBSCRIBE message can contain multiple subscriptions for a client. Each
subscription is made up of a topic and a QoS level. The topic in the subscribe message can contain
wildcards that make it possible to subscribe to a topic pattern rather than a specific topic. If there are
overlapping subscriptions for one client, the broker delivers the message that has the highest QoS
level for that topic.
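A SUBSCRIBE message with its packet identifier and list of subscriptions can be encoded as in the following sketch. The function names are hypothetical and the layout (first byte 0x82, packet identifier, then one length-prefixed topic filter plus one QoS byte per subscription) follows our reading of the MQTT 3.1.1 specification.

```python
import struct
from typing import List, Tuple


def encode_remaining_length(length: int) -> bytes:
    """Fixed-header length field: 7 bits per byte, top bit means 'more bytes follow'."""
    out = bytearray()
    while True:
        length, digit = divmod(length, 128)
        out.append(digit | (0x80 if length else 0))
        if not length:
            return bytes(out)


def build_subscribe(packet_id: int, subscriptions: List[Tuple[str, int]]) -> bytes:
    """Encode a SUBSCRIBE packet from (topic filter, requested QoS) pairs."""
    if not 1 <= packet_id <= 0xFFFF:
        raise ValueError("packet identifier must be between 1 and 65535")
    if not subscriptions:
        raise ValueError("a SUBSCRIBE message needs at least one subscription")

    body = struct.pack("!H", packet_id)
    for topic_filter, qos in subscriptions:
        if qos not in (0, 1, 2):
            raise ValueError("QoS must be 0, 1, or 2")
        encoded = topic_filter.encode("utf-8")
        body += struct.pack("!H", len(encoded)) + encoded + bytes([qos])
    return bytes([0x82]) + encode_remaining_length(len(body)) + body


packet = build_subscribe(1, [("myhome/groundfloor/+/temperature", 1),
                             ("myhome/firstfloor/#", 0)])
print(packet.hex(" "))
```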
9.6.3 Suback
To confirm each subscription, the broker sends a SUBACK acknowledgement message to the client.
This message contains the packet identifier of the original Subscribe message (to clearly identify the
message) and a list of return codes.

Figure 9-10 Suback packet

Packet Identifier - The packet identifier is a unique identifier used to identify a message. It is the same
as in the SUBSCRIBE message.

Return Code - The broker sends one return code for each topic/QoS-pair that it receives in the
SUBSCRIBE message. For example, if the SUBSCRIBE message has five subscriptions, the SUBACK
message contains five return codes. The return code acknowledges each topic and shows the QoS
level that is granted by the broker. If the broker refuses a subscription, the SUBACK message contains
a failure return code for that specific topic, for example if the client has insufficient permission to
subscribe to the topic or the topic is malformed.

Return Code    Return Code Response
0              Success - Maximum QoS 0
1              Success - Maximum QoS 1
2              Success - Maximum QoS 2
128            Failure

Figure 9-11 Subscribe flow


After a client successfully sends the SUBSCRIBE message and receives the SUBACK message, it gets
every published message that matches a topic in the subscriptions that the SUBSCRIBE message
contained.
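On the client side, a SUBACK can be checked against the SUBSCRIBE that triggered it. The sketch below uses a hypothetical function parse_suback, assumes a short packet whose remaining length fits in a single byte, and follows our reading of the MQTT 3.1.1 layout (first byte 0x90, packet identifier, then one return code per subscription).

```python
import struct
from typing import List, Optional, Tuple


def parse_suback(packet: bytes) -> Tuple[int, List[Optional[int]]]:
    """Return (packet identifier, granted QoS per subscription or None on failure)."""
    if len(packet) < 5 or packet[0] != 0x90:
        raise ValueError("not a SUBACK packet")
    # Assumption for this sketch: the remaining length fits in one byte (< 128).
    if packet[1] & 0x80 or packet[1] != len(packet) - 2:
        raise ValueError("unexpected remaining length")

    (packet_id,) = struct.unpack("!H", packet[2:4])
    granted: List[Optional[int]] = []
    for code in packet[4:]:
        if code in (0, 1, 2):
            granted.append(code)     # success, maximum QoS granted
        elif code == 128:
            granted.append(None)     # the broker refused this subscription
        else:
            raise ValueError(f"reserved return code {code}")
    return packet_id, granted


# Three subscriptions were requested: two granted (QoS 0 and 1), one refused.
packet_id, granted = parse_suback(b"\x90\x05\x00\x01\x00\x01\x80")
print(packet_id, granted)
```

The position of each entry in the list corresponds to the position of the subscription in the original SUBSCRIBE message.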

9.6.4 Unsubscribe
The counterpart of the SUBSCRIBE message is the UNSUBSCRIBE message. This message deletes
existing subscriptions of a client on the broker. The UNSUBSCRIBE message is similar to the SUBSCRIBE
message and has a packet identifier and a list of topics.

Figure 9-12 Unsubscribe packet

Packet Identifier - The packet identifier uniquely identifies a message as it flows between the client
and broker. The client library and/or the broker is responsible for setting this internal MQTT identifier.

List of Topics - The list of topics can contain multiple topics from which the client wants to unsubscribe.
It is only necessary to send the topic (without QoS). The broker unsubscribes the topic, regardless of
the QoS level with which it was originally subscribed.

9.6.5 Unsuback
To confirm the unsubscribe, the broker sends an UNSUBACK acknowledgement message to the client.
This message contains only the packet identifier of the original UNSUBSCRIBE message (to clearly
identify the message).
Figure 9-13 Unsuback packet

Packet Identifier - The packet identifier uniquely identifies the message. As already mentioned, this is
the same packet identifier that is in the UNSUBSCRIBE message.

Figure 9-14 Unsubscribe flow

After receiving the UNSUBACK from the broker, the client can assume that the subscriptions in the
UNSUBSCRIBE message are deleted.
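The unsubscribe exchange can be sketched in the same style. The helper names are hypothetical, and the layouts (UNSUBSCRIBE starting with 0xA2 and carrying topics without QoS, UNSUBACK being the four bytes 0xB0, 0x02, and the packet identifier) follow our reading of the MQTT 3.1.1 specification.

```python
import struct
from typing import List


def encode_remaining_length(length: int) -> bytes:
    """Fixed-header length field: 7 bits per byte, top bit means 'more bytes follow'."""
    out = bytearray()
    while True:
        length, digit = divmod(length, 128)
        out.append(digit | (0x80 if length else 0))
        if not length:
            return bytes(out)


def build_unsubscribe(packet_id: int, topics: List[str]) -> bytes:
    """Encode an UNSUBSCRIBE packet; topics are sent without a QoS level."""
    if not 1 <= packet_id <= 0xFFFF:
        raise ValueError("packet identifier must be between 1 and 65535")
    if not topics:
        raise ValueError("an UNSUBSCRIBE message needs at least one topic")
    body = struct.pack("!H", packet_id)
    for topic in topics:
        encoded = topic.encode("utf-8")
        body += struct.pack("!H", len(encoded)) + encoded
    return bytes([0xA2]) + encode_remaining_length(len(body)) + body


def unsuback_matches(packet: bytes, packet_id: int) -> bool:
    """True if the packet is an UNSUBACK carrying the given packet identifier."""
    return (len(packet) == 4 and packet[0] == 0xB0 and packet[1] == 0x02
            and struct.unpack("!H", packet[2:4])[0] == packet_id)


request = build_unsubscribe(7, ["myhome/groundfloor/+/temperature"])
print(request.hex(" "))
print(unsuback_matches(b"\xb0\x02\x00\x07", 7))
```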

https://youtu.be/t2b1CwQmDRY

9.7 MQTT Topics & Best Practices


9.7.1 Topics
In MQTT, the word topic refers to a UTF-8 string that the broker uses to filter messages for each
connected client. The topic consists of one or more topic levels. Each topic level is separated by a
forward slash (topic level separator).
Figure 9-15 Topic basics

In comparison to a message queue, MQTT topics are very lightweight. The client does not
need to create the desired topic before they publish or subscribe to it. The broker accepts each valid
topic without any prior initialization.

Here are some examples of topics:

myhome/groundfloor/livingroom/temperature
USA/California/San Francisco/Silicon Valley
5ff4a2ce-e485-40f4-826c-b1a5d81be9b6/status
Germany/Bavaria/car/2382340923453/latitude

Note that each topic must contain at least 1 character and that the topic string permits
spaces. Topics are case-sensitive.

For example, myhome/temperature and MyHome/Temperature are two different topics.


Additionally, the forward slash alone is a valid topic.
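The rules above can be made concrete with a small Python sketch that splits a topic into its levels. The helper name topic_levels is an assumption for this example; it only enforces the "at least one character" rule mentioned above and is not a complete validator.

```python
def topic_levels(topic: str) -> list[str]:
    """Split a topic into its levels; reject the empty topic."""
    if len(topic) < 1:
        raise ValueError("a topic must contain at least one character")
    return topic.split("/")


print(topic_levels("myhome/groundfloor/livingroom/temperature"))
# Spaces are permitted inside a level.
print(topic_levels("USA/California/San Francisco/Silicon Valley"))
# A single forward slash is valid and consists of two empty levels.
print(topic_levels("/"))
# Topics are case-sensitive, so these two are different topics.
print(topic_levels("myhome/temperature") == topic_levels("MyHome/Temperature"))
```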

9.7.2 Wildcards
When a client subscribes to a topic, it can subscribe to the exact topic of a published message or it can
use wildcards to subscribe to multiple topics simultaneously. A wildcard can only be used to subscribe
to topics, not to publish a message. There are two different kinds of wildcards: single-level and multi-
level.

9.7.3 Single Level: +


As the name suggests, a single-level wildcard replaces one topic level. The plus symbol represents a
single-level wildcard in a topic.

Figure 9-16 Topic wildcard plus

A topic matches a subscription with a single-level wildcard if it contains exactly one arbitrary
topic level in place of the wildcard and is otherwise identical.

For example, a subscription to myhome/groundfloor/+/temperature can produce the following results:
Figure 9-17 Topic wildcard plus example
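The single-level rule can also be expressed as a short Python check. This is a minimal sketch, not a broker implementation: the function name matches_single_level is made up, the sample topics are illustrative, and the multi-level wildcard is deliberately not handled.

```python
def matches_single_level(topic_filter: str, topic: str) -> bool:
    """Match a topic against a filter that may contain '+' but not '#'."""
    filter_levels = topic_filter.split("/")
    levels = topic.split("/")
    if len(filter_levels) != len(levels):
        return False  # '+' stands for exactly one level, never zero or several
    return all(f == "+" or f == t for f, t in zip(filter_levels, levels))


subscription = "myhome/groundfloor/+/temperature"
for candidate in [
    "myhome/groundfloor/livingroom/temperature",
    "myhome/groundfloor/kitchen/temperature",
    "myhome/groundfloor/kitchen/brightness",
    "myhome/firstfloor/kitchen/temperature",
    "myhome/groundfloor/kitchen/fridge/temperature",
]:
    print(candidate, matches_single_level(subscription, candidate))
```

Only the first two candidates match: the third differs in the last level, the fourth in the second level, and the fifth has one level too many.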

9.7.4 Multi Level: #


The multi-level wildcard covers many topic levels. The hash symbol represents the multi-level
wildcard in the topic. For the broker to determine which topics match, the multi-level wildcard must be
placed as the last character in the topic and preceded by a forward slash (unless # is the entire topic).

Figure 9-18 Topic wildcard hash

Figure 9-19 Topic wildcard hash example
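A matcher that handles both wildcards might look like the following Python sketch. The function name topic_matches and the sample topics are assumptions for illustration. The sketch follows the MQTT 3.1.1 rule that a trailing /# also matches the parent level itself, and it ignores any special handling of topics that begin with $.

```python
def topic_matches(topic_filter: str, topic: str) -> bool:
    """Match a topic against a filter that may contain '+' and a trailing '#'."""
    filter_levels = topic_filter.split("/")
    levels = topic.split("/")
    if "#" in filter_levels[:-1]:
        raise ValueError("'#' must be the last level of the filter")
    for i, f in enumerate(filter_levels):
        if f == "#":
            return True  # covers all remaining levels, including none
        if i >= len(levels):
            return False  # the filter is longer than the topic
        if f != "+" and f != levels[i]:
            return False
    return len(filter_levels) == len(levels)


subscription = "myhome/groundfloor/#"
for candidate in [
    "myhome/groundfloor/livingroom/temperature",
    "myhome/groundfloor/kitchen/brightness",
    "myhome/groundfloor",
    "myhome/firstfloor/kitchen/temperature",
]:
    print(candidate, topic_matches(subscription, candidate))
```

The first three candidates match; the last one differs in the second level and is rejected before the wildcard is reached.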

9.7.5 Topics beginning with $


Generally, you can name your MQTT topics as you wish. However, there is one exception: Topics that
start with a $ symbol have a different purpose. These topics are not part of the subscription when
you subscribe to the multi-level wildcard as a topic (#). The $-symbol topics are reserved for internal
statistics of the MQTT broker. Clients cannot publish messages to these topics. At the moment, there
is no official standardization for such topics. Commonly, $SYS/ is used as the prefix for all the following
information, but broker implementations vary. One suggestion for $SYS topics is in the MQTT GitHub
wiki. Here are some examples:

$SYS/broker/clients/connected
$SYS/broker/clients/disconnected
$SYS/broker/clients/total
$SYS/broker/messages/sent
$SYS/broker/uptime
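The exclusion of $ topics from wildcard subscriptions can be sketched as a small predicate. The name hidden_from_wildcard is hypothetical. Note one addition that goes beyond the text above: the MQTT 3.1.1 specification applies the same exclusion to filters that start with the single-level wildcard (+), so the sketch covers both.

```python
def hidden_from_wildcard(topic_filter: str, topic: str) -> bool:
    """True if a '$' topic must not be delivered for this wildcard filter."""
    # Filters that begin with '#' or '+' never match topics that begin with '$'.
    return topic.startswith("$") and topic_filter[:1] in ("#", "+")


print(hidden_from_wildcard("#", "$SYS/broker/uptime"))
print(hidden_from_wildcard("$SYS/#", "$SYS/broker/uptime"))
print(hidden_from_wildcard("#", "myhome/temperature"))
```

A client that wants broker statistics therefore has to subscribe explicitly, for example to $SYS/#.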
9.7.6 Summary
These are the basics of MQTT message topics. As you can see, MQTT topics are dynamic and provide
great flexibility. When you use wildcards in real-world applications, there are some challenges you
should be aware of. We have collected the best practices that we have learned from working
extensively with MQTT in various projects.

9.7.7 Best practices


9.7.7.1 Never use a leading forward slash
A leading forward slash is permitted in MQTT. For example, /myhome/groundfloor/livingroom.
However, the leading forward slash introduces an unnecessary, zero-length (empty) topic level at the
front. The empty level does not provide any benefit and often leads to confusion.
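Splitting the topic on the level separator makes the extra level visible. The snippet below is only an illustration using plain Python string handling.

```python
with_slash = "/myhome/groundfloor/livingroom".split("/")
without_slash = "myhome/groundfloor/livingroom".split("/")

# The leading slash produces an empty first level and one level more overall.
print(with_slash)
print(without_slash)
print(len(with_slash), len(without_slash))
```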

9.7.7.2 Never use spaces in a topic


A space is the natural enemy of every programmer. When things are not going the way they should,
spaces make it much harder to read and debug topics. As with leading forward slashes, just because
something is allowed doesn’t mean it should be used. UTF-8 has many different whitespace types, and
such uncommon characters should be avoided.

9.7.7.3 Keep the topic short and concise


Each topic is included in every message in which it is used. Make your topics as short and concise as
possible. When it comes to small devices, every byte counts and topic length has a big impact.
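To get a feeling for the cost, the following sketch estimates how many bytes the topic occupies in each PUBLISH packet. In MQTT 3.1.1 the topic is sent as a 2-byte length followed by its UTF-8 bytes. The abbreviated topic and the rate of one message per minute are assumptions chosen purely for illustration.

```python
def topic_field_size(topic: str) -> int:
    """Bytes the topic occupies in one PUBLISH packet (2-byte length + UTF-8 data)."""
    return 2 + len(topic.encode("utf-8"))


long_topic = "myhome/groundfloor/livingroom/temperature"
short_topic = "mh/gf/lr/temp"  # hypothetical abbreviation of the same hierarchy
messages_per_day = 24 * 60     # assumption: one reading per minute

saved_per_message = topic_field_size(long_topic) - topic_field_size(short_topic)
saved_per_day = saved_per_message * messages_per_day
print(topic_field_size(long_topic), topic_field_size(short_topic), saved_per_day)
```

Whether the saving justifies less readable names depends on the device and the network; the point is only that the topic length is paid on every message.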

9.7.7.4 Use only ASCII characters, avoid non-printable characters


Because non-ASCII UTF-8 characters often display incorrectly, it is very difficult to find typos or issues
related to the character set. Unless it is absolutely necessary, we recommend avoiding the use of non-
ASCII characters in a topic.
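The first recommendations in this list lend themselves to an automated check. The following sketch of a topic "linter" is one possible way to do this; the function name lint_topic and the wording of its messages are invented for this example.

```python
def lint_topic(topic: str) -> list[str]:
    """Return a list of best-practice violations found in a topic."""
    problems = []
    if topic.startswith("/"):
        problems.append("leading forward slash")
    # str.isspace() also catches Unicode whitespace such as the no-break space.
    if any(ch.isspace() for ch in topic):
        problems.append("contains whitespace")
    # Printable, non-space ASCII is the range 0x21..0x7E.
    if any(not ch.isspace() and not (0x21 <= ord(ch) <= 0x7E) for ch in topic):
        problems.append("contains non-ASCII or non-printable characters")
    return problems


print(lint_topic("myhome/groundfloor/livingroom/temperature"))
print(lint_topic("/my home/küche"))
```

A check like this could run in tests or at client start-up, so that problematic topic names are caught before they reach the broker.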

9.7.7.5 Embed a unique identifier or the Client Id into the topic


It can be very helpful to include the unique identifier of the publishing client in the topic. The unique
identifier in the topic helps you identify who sent the message. The embedded ID can be used to
enforce authorization: the broker can be set up so that only a client whose client ID matches the ID in
the topic is allowed to publish to that topic. For example, a client with the client ID client1 is allowed to publish to client1/status,
but not permitted to publish to client2/status.
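Such a rule could be sketched as follows. The function may_publish is hypothetical; real brokers express this through their own access-control configuration or extension mechanism, which is not shown here.

```python
def may_publish(client_id: str, topic: str) -> bool:
    """Allow publishing only below the client's own first topic level."""
    # Compare the whole first level, not a string prefix, so that
    # client1 cannot publish to client10/status.
    return topic.split("/")[0] == client_id


print(may_publish("client1", "client1/status"))
print(may_publish("client1", "client2/status"))
print(may_publish("client1", "client10/status"))
```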

9.7.7.6 Don’t subscribe to #


Sometimes, it is necessary to subscribe to all messages that are transferred over the broker, for
example, to persist all messages into a database. Do not subscribe to all messages on a broker by using
an MQTT client and subscribing to a multi-level wildcard. Frequently, the subscribing client is not able
to process the load of messages that results from this method (especially if you have a massive
throughput). Our recommendation is to implement an extension in the MQTT broker. For example,
with the plugin system of HiveMQ you can hook into the behavior of HiveMQ and add an
asynchronous routine to process each incoming message and persist it to a database.
9.7.7.7 Don’t forget extensibility
Topics are a flexible concept and there is no need to preallocate them in any way. However, both the
publisher and the subscriber need to be aware of the topic. It is important to think about how topics
can be extended to allow for new features or products. For example, if your smart-home solution adds
new sensors, it should be possible to add these to your topic tree without changing the whole topic
hierarchy.

9.7.7.8 Use specific topics, not general ones


When you name topics, don’t use them in the same way as in a queue. Differentiate your topics as
much as possible. For example, if you have three sensors in your living room, create topics
for myhome/livingroom/temperature, myhome/livingroom/brightness and
myhome/livingroom/humidity. Do not send all values over myhome/livingroom. Use of a single topic
for all messages is an anti-pattern. Specific naming also makes it possible for you to use other MQTT
features such as retained messages.
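As a small illustration, the sketch below turns a set of sensor readings into one message per specific topic instead of one combined message. The helper readings_to_messages and the sample values are assumptions; actual publishing through a client library is left out.

```python
def readings_to_messages(base: str, readings: dict[str, float]) -> list[tuple[str, str]]:
    """Map each reading to its own (topic, payload) pair below the base topic."""
    return [(f"{base}/{name}", str(value)) for name, value in readings.items()]


messages = readings_to_messages(
    "myhome/livingroom",
    {"temperature": 21.5, "brightness": 300.0, "humidity": 40.0},
)
for topic, payload in messages:
    print(topic, payload)
```

Adding a new sensor then only adds a new topic below myhome/livingroom and leaves the existing hierarchy untouched.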

https://youtu.be/juq_l70Vg1w

9.7.8 MQTT Infrastructure in the Production Environment


The use of a pub-sub protocol such as MQTT and a message broker as the central component
represents a fundamental change in architecture.

All messages are sent via a central MQTT broker and all MQTT clients connect to the broker
and can subscribe to specific topics.

The MQTT broker takes over the task of the server and handles all communication with a
potentially very large number of MQTT clients.

MQTT clients are implemented directly on gateways, devices, or within applications, and all
of the clients are loosely coupled. There are no direct relationships between the clients. In addition to
functional requirements, the MQTT broker handles needs such as redundancy, failover, high
availability, and scalability within a given infrastructure.

Figure 9-20 MQTT Infrastructure in the Production Environment
