BUG: MQTT (pub/sub/unsub) functions very slow

Hi

The subscribe(), unsubscribe() and publish() functions seem to take a long time.

slowMQTT.toe (4.0 KB)

Start a broker like mosquitto, then just run the sub, unsub and pub DATs a few times: they take 2-8 ms to sub/unsub to 5 topics.

win 10
b 2020.20625
running local mosquitto broker

Hello Achim, I'm working with MQTT these days and I think your problem, like mine, is with mosquitto.

Using https://www.emqx.io/ now works well and is pretty fast in both directions. This week I will be doing some YouTube live streams explaining how it works (in Spanish, sorry) and will post some files.

I made some changes to your file; test it and try switching to another broker.

Regards

Javo

slowMQTT.toe (4.5 KB)

Hey Javo,

I get similarly long pub/sub/unsub cook times with emqx. Same with the hivemq broker. It's maybe a bit faster, but still way too much for just (un)registering a subscription.

The 2-8ms time, is that per operation or for all 3 – subscribe, publish, unsubscribe? I’m getting around 1.5ms, 0.8ms, 1.5ms on average, respectively, when testing with your toe file.

The MQTT Client API provides a way to subscribe/unsubscribe to multiple topics at once – I changed the python subscribe and unsubscribe methods to accept a list and subscribe/unsubscribe to all at once and got much better results:

n.subscribe(['1', '2', '3', '4', '5']) – 0.2ms
n.unsubscribe(['1', '2', '3', '4', '5']) – 0.2ms
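
As a rough illustration of why the list form is cheaper: an MQTT SUBSCRIBE request can carry several topic filters, so one call can replace five separate ones. The sketch below uses a stand-in `FakeClient` class (hypothetical, not the MQTT Client DAT API) that only counts the requests it would send; it does not talk to a broker.

```python
class FakeClient:
    """Hypothetical stand-in for an MQTT client; it only records requests."""

    def __init__(self):
        self.requests = []  # one entry per SUBSCRIBE request that would be sent

    def subscribe(self, topics):
        # Accept either a single topic string or a list of topics.
        if isinstance(topics, str):
            topics = [topics]
        self.requests.append(list(topics))


topics = ['1', '2', '3', '4', '5']

# One call per topic: five separate requests.
per_topic = FakeClient()
for t in topics:
    per_topic.subscribe(t)

# One call with the whole list: a single request carrying all five topics.
batched = FakeClient()
batched.subscribe(topics)

print(len(per_topic.requests), len(batched.requests))  # prints "5 1"
```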

Hey Eric.

The 2-8 ms was per operation, e.g. for subscribe it behaves like this:
1 x sub = 0.1 - 0.2 ms
2 x sub = 0.5 - 2.0 ms
3 x sub = 2.0 - 4.0 ms
4 x sub = 3.0 - 6.0 ms
…
Same for unsubscribe. So the multi-topic sub/unsub will help a lot for these two functions. Thx!

But I still wonder why executing it multiple times increases the cook time so much, especially as I have the same issue when publishing. It's not as bad, but still way too expensive for pushing out a few messages:
1 x pub: 0.07 - 0.13 ms
2 x pub: 0.7 - 1.0 ms
3 x pub: 1.5 - 2.0 ms
…

Feels a bit like something is blocking. The first publish goes out quickly, but if on the same frame I publish a second message, it seems that internally paho is still busy with the first publish. Are you using the mqtt::async_client?

Are you using the mqtt::async_client?

Yep, we are.

Your publish times scale up much more severely than mine running the same file, not sure why that would be. These are my results with 10 topics and 10 published messages:

You could try upping the max in flight parameter, however I had it set to 1 in my tests, and any increases provided only minimal improvement.

EDIT: After looking at the MQTTAsync implementation, I expect what’s happening is this: the send thread that’s responsible for consuming commands (i.e. subscribe, publish, unsubscribe) grabs the lock for the command queue after receiving the first command, and the main thread then has to wait until that’s done before it can add a new command. What the MQTT API needs is a way to add multiple message commands at once, like it currently has for subscribe/unsubscribe.
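
To make that idea concrete, here is a minimal sketch of a lock-guarded command queue. It is a toy model under my own assumptions (the `CommandQueue` class and its methods are hypothetical, not paho's actual code), and it only counts how often the producer has to take the lock rather than measuring real contention.

```python
import threading
from collections import deque


class CommandQueue:
    """Toy command queue guarded by one lock (an assumption, not paho's code)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._commands = deque()
        self.lock_acquisitions = 0  # how often a producer had to take the lock

    def add(self, command):
        # One lock round trip per command.
        with self._lock:
            self.lock_acquisitions += 1
            self._commands.append(command)

    def add_many(self, commands):
        # One lock round trip for the whole batch.
        with self._lock:
            self.lock_acquisitions += 1
            self._commands.extend(commands)

    def drain(self):
        # What a send thread would do: take everything queued so far.
        with self._lock:
            drained = list(self._commands)
            self._commands.clear()
        return drained


messages = [('topic/%d' % i, 'payload') for i in range(10)]

one_by_one = CommandQueue()
for m in messages:
    one_by_one.add(m)

batched = CommandQueue()
batched.add_many(messages)

print(one_by_one.lock_acquisitions, batched.lock_acquisitions)  # prints "10 1"
```

Every `add` is a point where the main thread can end up waiting on the send thread, so ten publishes mean ten chances to stall, while a batched add has only one.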

Thanks for looking into it Eric!

Our backend at the TelekomDesignGallery is completely based on MQTT, so a well-performing node is pretty critical for us (our JavaScript guys don’t seem to have this problem). So please excuse me if I’m suggesting possibly stupid ideas to improve this.

  • The mqtt DAT has a publish callback (which doesn’t seem very useful as it has no args). Could that also slow things down? I.e. it is not called directly after publish, but after the acknowledgement, and is therefore blocking?

  • In the example below, they seem to be able to publish multiple topics by using some kind of queue. Would this help, or is this something you guys are doing anyway? https://github.com/eclipse/paho.mqtt.cpp/blob/master/src/samples/pub_speed_test.cpp

And here’s a proper test file (with a locked Trail CHOP for different numbers of messages):
slowMQTTv2.toe (9.2 KB)

I believe the developers of the JavaScript MQTT API and the C MQTT API are different, so there’s possibly a discrepancy in performance between the two.

The mqtt DAT has a publish callback (which doesn’t seem very useful as it has no args). Could that also slow things down? I.e. it is not called directly after publish, but after the acknowledgement, and is therefore blocking?

That shouldn’t be an issue. In fact I just tested with it not used and it had no effect.

In the example below, they seem to be able to publish multiple topics by using some kind of queue. Would this help, or is this something you guys are doing anyway? https://github.com/eclipse/paho.mqtt.cpp/blob/master/src/samples/pub_speed_test.cpp

This is more or less what we are doing. However, I haven’t tested this specific example to see what kind of timing they get on publish calls.

One suggestion I have is to lower the quality of service (QoS) to 0, if your project allows it. This is the fastest mode of transfer, but there is no guarantee that the message will be delivered: “fire and forget”, as it’s called in the MQTT documentation. QoS 2 is the slowest transfer mode, as it requires acknowledgement from the server that it received the message while also ensuring that no duplicates are delivered.
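
For a rough sense of the difference, the sketch below counts the control packets the MQTT specification defines for delivering one published message at each QoS level. The `packets_for` helper is just an illustration name of mine, and the count ignores retransmissions and TCP-level traffic.

```python
# Control packets exchanged per published message, per the MQTT specification.
QOS_HANDSHAKE = {
    0: ['PUBLISH'],                                 # at most once
    1: ['PUBLISH', 'PUBACK'],                       # at least once
    2: ['PUBLISH', 'PUBREC', 'PUBREL', 'PUBCOMP'],  # exactly once
}


def packets_for(qos, message_count):
    """Total control packets for message_count publishes at a QoS level."""
    return len(QOS_HANDSHAKE[qos]) * message_count


for qos in (0, 1, 2):
    print(qos, packets_for(qos, 10))
# prints:
# 0 10
# 1 20
# 2 40
```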