TMRh20s Project Blog

Thursday

RF24Gateway - What Does It Do And How Do I Use It?

A simplified explanation...

So maybe you've taken a look at the RF24 communication stack and have seen RF24Gateway, along with the RF24Ethernet libraries, and are wondering what they do? I've noticed some confusion among users who aren't sure what the libraries do exactly.

The simplest explanation is that it allows users to use nRF24L01 radios as standard Network Interface Cards (NICs). Once RF24Gateway is installed, it's like having a standard NIC linked into the wireless RF24 radio network.

This means users don't need to understand programming, RF24 APIs, or anything much about how the RF24 comm. stack operates in order to use RF24Gateway and/or RF24Ethernet. It means users can use standard networking tools like the 'ping' command to test network connectivity, or more advanced tools & protocols like MQTT, HTTP, etc. to interact wirelessly with their Arduino devices or other Raspberry Pi/Linux devices, over RF24Gateway & the radio network.

Linux users, for example, can connect to other Linux devices over the RF24 based network by using SSH, SSL or other secure protocols.

A good general understanding of standard networking and protocols is probably going to be beneficial to anyone trying to make use of RF24Gateway/RF24Ethernet. The system uses the proprietary ESB protocol by Nordic for communication, and although it operates on similar frequencies to WiFi, it is not a WiFi based system. However, it does allow your RF24Ethernet & RF24Gateway devices to interact with WiFi devices on your network, because RF24Gateway can link the WiFi & RF24 networks together using standard networking techniques like forwarding and routing, or by using an intermediary like an MQTT server.

This means advanced users can also do things like link RF24 based sensor networks together by using VPNs, SSH tunneling, etc., and create a centralized environment to collect and display data, control devices, or otherwise interact with the networks. I've experimented with automation systems like Home Assistant and Node Red to create these types of networks myself, and there are a number of alternatives that can be easily integrated into the system. Networks can work over very wide geographical areas, covering many km locally via radio link, and spanning the globe using VPNs or SSH tunneling.

RF24Ethernet is companion software to RF24Gateway, which is meant to run on smaller devices like Arduino Uno, Nano, etc. while RF24Gateway is designed solely for Linux devices. RF24Gateway typically requires little to no programming, while RF24Ethernet lets users build web servers, sensor nodes, interactive nodes with on/off switches, lighting controllers, etc. and customize things to the nth degree, or simply build off the included examples.

ESP32 based devices have become extremely popular, but in a large scale sensor network, WiFi is a very power hungry alternative to nRF24 based devices, which can offer similar features, wireless protocols & connectivity.

To sum it up, RF24Gateway & RF24Ethernet provide connectivity & functionality similar to a WiFi network, but using much less power, and are able to support the smallest devices like ATTiny all the way to larger, much more powerful computers like Raspberry Pi and other Linux based devices that support GPIO & SPI capabilities. By default, the system uses the RF24Mesh and RF24Network libraries, so all of the internal networking and routing is handled automatically.

Friday

Creating IoT Sensor Networks & Systems using nRF24 & nRF52840 - An Analysis

A Brief Analysis

In playing around with the nRF24 and nRF52x devices, I've been running and monitoring a couple of large IoT networks of them over the past few years. In that time, I've developed a system that works very well, with nodes staying connected over long periods of time, despite the mesh protocols and the need to re-establish connectivity periodically.

The system is based on TCP/IP & Node-Red, and makes heavy use of HTTP & MQTT protocols. Nodes either interact with an MQTT server, or are polled periodically via HTTP requests. Devices range from Raspberry Pi running RF24Gateway & Node-Red, to a wide range of Arduino devices utilizing RF24 and nrf_to_nrf libraries.

A graph showing connection times for various network nodes

The graph above shows connection times for nodes connecting via MQTT. They report in periodically, showing how long they have been connected to the MQTT server. By default, in the MQTT library I use, the keep-alive interval is 10 seconds. This means each node must communicate with the MQTT server successfully every 10 seconds or be disconnected. Given the mesh nature of the system, this works out very well, with nodes staying connected for hours and days at a time.

With the default logic in place, it is very interesting to see how nodes assemble themselves around the master node over time. At first, it can be very chaotic, with many nodes on a network, each grabbing a unique identifier and place in the network. Over time, the nodes with the best connectivity find their way to a direct connection with the master node, or at least a secondary connection, with only one routing node between them and the master node.
One would almost think there was a more intelligent algorithm handling placement of nodes in the networks, but it is a relatively simple methodology, with nodes contacting the master node first, then secondary (children) nodes, then children of those nodes, etc. in order to establish a connection to the mesh. Nodes will also contact their parent node periodically to verify connectivity to the mesh.

Ping Response Times for a Connected Node

The above picture shows ping response times for a connected Arduino node. At the left of the top graph one can see that the node was able to re-establish its connection with better connectivity to the master node, not having its traffic routed through another node. Response times vary: when there is a direct connection to the master node, the node typically responds in about 8 ms, with some responses taking up to around 275 ms.

"External Network Activity"

The above picture shows the results while "external activity" interferes with the network. During this time, due to the nature of radio and jamming techniques, most nodes on the network lose connectivity to the mesh, but the self-healing nature of the network ensures that they only go offline while the jamming is taking place. Nodes quickly re-converge around the master node as the mesh re-establishes itself, and every sensor or device resumes normal operations.

HTTP based nodes reporting in during the jamming

I recently updated many of my nodes, due to bug fixes and testing, per the above graph. In it, as can be seen, some of my HTTP nodes lost connectivity during the jamming. Others stayed online with only small blips in connectivity. Depending on proximity to other nodes and the master node, as well as the location of the jamming device, different nodes are affected differently, but again, they all resume operation after the interference stops.

Temperature & Humidity Readings from a Connected Node

This system works extremely well, with a distant node reporting in its temperature and humidity over time. You can see in the chart above, it has been reporting in consistently.

Ping Statistics over a Number of Days

In the above graph, you can see the success rate of pinging one of my sensor nodes over the course of a few days. In this picture, the success rate is down a bit from the norm, due to ongoing "external network activity" beyond my control. This has been the case for years and years, with one of my neighbors continuously interacting with my network on an almost daily basis. I can't understand the obsession, but I've witnessed it on many occasions.

Conclusion:

The end results kind of speak for themselves. In a given networking scenario where reliability and consistency are key, making use of the advanced features of TCP/IP networking in combination with HTTP & MQTT allows users to construct networks to suit many different scenarios. This system is highly reliable and can be customized to the nth degree.

The combination of nRF24L01 & nRF52x radios works quite well, and allows even more advanced functionality in a given network. With the 52x devices supporting things like encryption, Bluetooth & I2S, users can make great use of these features, with everything being done wirelessly.

Since we are using a mesh based system, and TCP/IP was designed to operate even on partially damaged networks, the combination works very well. Add that to recent fixes and updates in the RF24 communication stack, and you have a system that works extremely well over long periods of time. I measure uptime on the network, and nodes tend to only go offline when there is a power outage, hardware failure or they are manually taken offline.

Considering that there are great quality nRF24 based radios available with range anywhere from 2.5-5 km (E01-ML01DP5 or E01-2G4M27D), networks can be constructed that cover wide distances and manage themselves automatically. The network will support the smallest of devices, ranging from ATTiny all the way up to Raspberry Pi and other Linux based devices that have GPIO & SPI capabilities.

Libraries & Software Used:

1. RF24, RF24Network, RF24Mesh, RF24Ethernet (Arduino) & RF24Gateway (Linux/Raspberry Pi)

2. nrf_to_nrf radio library (nRF52x devices)

3. Node-Red (Linux/Raspberry Pi)

4. Mosquitto (MQTT broker)

All of the software and systems used are open source, so anybody can create similar systems. Any software that is MQTT compatible like Home Assistant or Node Red will work.

Thursday

Recreating nRF24 greater than 32-byte payload issue with nRF52840

A D.O.S. attack against nRF24L01P devices that do not validate Dynamic Payload Sizes

I made a video demonstrating how to send 33 byte payloads to the nRF24L01P and either cause the device to flush its RX buffer, or create a denial of service attack for any devices that do not validate Dynamic Payload sizes.

I find it interesting that the nRF24L01P devices accept these payloads at all, as well as sending an ACK to the transmitting device.

Sunday

The nrf_to_nrf Radio Library for nRF52x Devices and nRF24L01 Compatible Communication.

A brief tutorial on payload sizes & handling

Max Payload Sizes: The nrf_to_nrf Core Library

The nrf_to_nrf library allows many configuration options, but here I want to talk about payload sizes and communication using the core nrf_to_nrf library, as well as that in combination with RF24Network and higher libraries.

The max payload size of the nRF52840s is 127 bytes. This is with CRC disabled and static payload sizes. The radios support 0, 8, 16 or 24-bit CRC, so that extra data takes up 0, 1, 2 or 3 bytes depending on your chosen options. nRF24 devices only support 0, 8 or 16-bit CRC, so users are limited in their selection when communicating with nRF24 devices.

Similarly, the nRF24 will only handle a 32-byte maximum payload size, with any CRC value and either static or dynamic payloads. The nRF52x devices are slightly different, so one needs to be aware of how changing these options affects the max payload size.

Max Payload Sizes:

1. Static Payloads, CRC Disabled: 127 bytes

2. Static Payloads, 16-bit CRC: 125 bytes

3. Dynamic Payloads, w/16-bit CRC: 123 bytes

4. Static Payloads, Encryption enabled, 16-bit CRC: 113 bytes

5. Dynamic Payloads, Encryption enabled, 16-bit CRC: 111 bytes
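
The figures in the list above follow a simple pattern, which can be sketched as a small calculation. This is only an illustration: maxPayloadBytes() and RadioOptions are hypothetical names, not part of the nrf_to_nrf API, and the 2-byte dynamic payload overhead and 12-byte encryption overhead are inferred from the differences between the listed values rather than taken from the library source.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of nrf_to_nrf): estimates the largest
// usable payload from the figures listed in this post.
struct RadioOptions {
    uint8_t crcBits;       // 0, 8, 16 or 24
    bool dynamicPayloads;  // dynamic payload lengths enabled
    bool encryption;       // CCM encryption enabled
};

constexpr int kMaxFrameBytes = 127;      // static payloads, CRC disabled
constexpr int kDynamicOverhead = 2;      // assumption: inferred from 125 -> 123
constexpr int kEncryptionOverhead = 12;  // assumption: inferred from 125 -> 113

int maxPayloadBytes(const RadioOptions& opt) {
    // Only whole-byte CRC lengths up to 24 bits are supported.
    assert(opt.crcBits % 8 == 0 && opt.crcBits <= 24);

    int size = kMaxFrameBytes - opt.crcBits / 8;  // CRC uses 0-3 bytes
    if (opt.dynamicPayloads) size -= kDynamicOverhead;
    if (opt.encryption) size -= kEncryptionOverhead;
    return size;
}
```

For example, maxPayloadBytes({16, true, false}) gives the 123 bytes listed for dynamic payloads with a 16-bit CRC.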

Max Payload Sizes: The RF24Network Library

The above is with the core nrf_to_nrf driver. Once we get into libraries like the RF24Network library, another 8 bytes of data are required for the network header, which contains information like the destination address, sender address, and a counter.

This means that when communicating solely between nRF52 devices, we almost always want to configure the max payload size to 123, since RF24Network will figure out the max payload size based on this information, and fragment payloads accordingly. By default RF24Network is nRF24 compatible, so it is only configured to handle 32-byte payloads.

To configure RF24Network, before calling network.begin(); we want to call radio.begin(); followed by radio.enableDynamicPayloads(123); This will allow maximum payload sizes prior to fragmentation.

The same is true when using the RF24Mesh library: call radio.begin(); followed by radio.enableDynamicPayloads(123); prior to calling mesh.begin();

If using encryption, the RF24Network layer will take this into account, so the max payload sizes with RF24Network are as follows:

Max Payload Sizes in RF24Network:

1. RF24Network, default: 24 bytes before fragmentation, 144 bytes with fragmentation

2. RF24Network, w/enableDynamicPayloads(123): 115 bytes before fragmentation, 144 bytes with fragmentation

3. RF24Network, w/enableDynamicPayloads(123) + enableEncryption = true: 103 bytes before fragmentation, 144 bytes with fragmentation
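
The per-frame numbers above are just the radio payload size minus the 8-byte network header (and minus the encryption overhead when enabled). A rough sketch of that arithmetic follows, along with an estimate of how many frames a fragmented message needs. The function names are hypothetical and not part of RF24Network, the 12-byte encryption overhead is inferred from 115 vs. 103 above, and the frame count assumes each fragment carries a full header.

```cpp
#include <cassert>

constexpr int kNetworkHeaderBytes = 8;   // RF24Network header, per the text
constexpr int kEncryptionOverhead = 12;  // assumption: inferred from 115 -> 103
constexpr int kMaxMessageBytes = 144;    // max size with fragmentation, per the list

// User data that fits in a single radio frame (hypothetical helper).
int bytesPerFrame(int radioPayloadSize, bool encryption) {
    int n = radioPayloadSize - kNetworkHeaderBytes;
    if (encryption) n -= kEncryptionOverhead;
    return n;
}

// Estimated number of frames needed to send a message (hypothetical helper).
int framesNeeded(int messageBytes, int radioPayloadSize, bool encryption) {
    assert(messageBytes > 0 && messageBytes <= kMaxMessageBytes);
    const int perFrame = bytesPerFrame(radioPayloadSize, encryption);
    assert(perFrame > 0);
    return (messageBytes + perFrame - 1) / perFrame;  // round up
}
```

Under these assumptions, a full 144-byte message takes 6 frames at the default 32-byte radio payload size, but only 2 frames after enableDynamicPayloads(123), which is why the larger setting is attractive between nRF52 devices.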

Wednesday

Maximizing Throughput on Linux Devices using the RF24 communication Stack

How to achieve peak performance via TCP/IP connections

With the recent changes to the core RF24 driver improving stability, I've begun more thorough investigation & testing regarding max throughput and speed on Linux devices.

What I've found is that the typical TCP/IP stack for Linux is designed to increase throughput for high speed, bidirectional communication devices, whereas the RF24 Comm Stack is built around the RF24 and RF52 radio devices, which can only either send or receive at a given time. To work around this problem, users are encouraged to modify the TCP/IP window sizes. This allows the system to send smaller payloads, one at a time, thus increasing throughput and overall speed of communication when using the RF24Gateway layers for Linux to Linux device communication.

To configure the window sizes for RF24Gateway, just run the following commands:

sudo sysctl net.ipv4.tcp_wmem="1500 1500 1500"
sudo sysctl net.ipv4.tcp_rmem="1500 1500 1500"

This sets the window sizes to 1500, which is right around the MAX_PAYLOAD_SIZE configured in the RF24Network layer for Linux devices.

Results can be tested by running the following commands before and after making this change:

With no window limit:

iperf3 -c 10.1.3.134 -4 -t 60

With a 1500 byte TCP/IP window:

iperf3 -c 10.1.3.134 -4 -t 60 -w 1500

These changes are only temporary, but users can edit /etc/sysctl.conf to make them permanent.

Note: These changes can severely impact or disable other network systems & services. Users are advised to put these commands into a script, to enable/disable enhanced RF24 throughput.

With the RF24Gateway ncurses interrupt example, I'm achieving speeds of up to 150-175Kbps or 20-25KB/s over TCP/IP, and up to around 30KB/s using UDP. In order to do this, one needs to modify the line gw.poll(2); and change it to gw.poll(1);. This reduces the delay in the handling of data, allowing maximum throughput.

IPerf3 results over TCP/IP

This works out to around 100-150Kbps on average or 12.5-19KB/s over TCP/IP, which includes the RF24Mesh node renewing its address periodically, interrupting communication slightly.

IPerf3 results over UDP

The results with UDP (shown above) vary depending on what bit-rate you set, in this case 130Kbps or about 16.25KB/s was the chosen bit-rate.

More IPerf3 results over UDP

Here, a slightly higher bit-rate of 145Kbps was chosen; however, there was a slight bit of loss.

(1/14500 Datagrams)
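
For anyone comparing their own iperf3 output with the figures in this post, the conversions are simple: divide Kbps by 8 to get KB/s, and divide lost datagrams by total datagrams for the loss rate. A minimal sketch, with hypothetical helper names:

```cpp
#include <cassert>

// Convert kilobits per second to kilobytes per second (8 bits per byte).
double kbpsToKBps(double kbps) {
    return kbps / 8.0;
}

// Datagram loss as a percentage of datagrams sent.
double lossPercent(long lost, long total) {
    assert(total > 0);
    return 100.0 * static_cast<double>(lost) / static_cast<double>(total);
}
```

With these, kbpsToKBps(130.0) gives the 16.25KB/s quoted above, and lossPercent(1, 14500) works out to well under a hundredth of a percent.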

Saturday

The RF24 Core Library - Recent Bug Fixes Affecting Stability & Auto-Ack/Pipe 0

The RF24 Core Library - Major Bug Fix

Recent Bug Fixes Affecting Auto-Ack/Pipe 0 and RF24 Core Lib w/Dynamic Payloads

So after all this time developing and maintaining the RF24 core library, I found yet another bug affecting the Auto-Acknowledgement functionality of the radios. An issue had long ago been identified and fixed regarding pipe 0, where the assigned reading address would be overwritten when transmitting, since the radios use pipe 0 exclusively for transmission. This affects the RX_ADDR_P0 register.

Somehow, we never realized that the receiving address, when written to the radios after switching from TX to RX mode would interfere with the reception of Acknowledgement packets in TX mode, since it would overwrite the RX_ADDR_P0 register on the switch to RX. Now the radio library caches BOTH the RX & TX address for RX_ADDR_P0, and writes to it when switching between modes as required.
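
The idea of caching both addresses can be illustrated with a small stand-alone model. This is not the RF24 library's actual code: Pipe0Cache is a hypothetical class that only models the RX_ADDR_P0 register, and the method names simply mirror familiar RF24 calls. It relies on the fact that, with auto-ack, the radio receives the ACK on pipe 0, so RX_ADDR_P0 has to hold the transmit address while in TX mode.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

using Address = std::array<uint8_t, 5>;

// Hypothetical model of the fix: keep both addresses cached and write the
// appropriate one to RX_ADDR_P0 on every mode switch.
class Pipe0Cache {
public:
    // Address we want to listen on using pipe 0.
    void setReadingAddress(const Address& a) { rxAddr_ = a; }

    // Address we transmit to; ACKs come back on pipe 0 at this address.
    void setWritingAddress(const Address& a) {
        txAddr_ = a;
        hasTxAddr_ = true;
    }

    // Switching to RX mode: restore the cached reading address.
    void startListening() { rxAddrP0_ = rxAddr_; }

    // Switching to TX mode: pipe 0 must match the TX address for auto-ack.
    void stopListening() {
        assert(hasTxAddr_);
        rxAddrP0_ = txAddr_;
    }

    // Current contents of the simulated RX_ADDR_P0 register.
    const Address& rxAddrP0() const { return rxAddrP0_; }

private:
    Address rxAddr_{};
    Address txAddr_{};
    Address rxAddrP0_{};
    bool hasTxAddr_ = false;
};
```

In this model, neither mode switch can leave a stale address in the register, which is the property the real fix is after.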

This enables full functionality of the radios on all pipes, since previously, auto-ack would not work properly on pipe 0 in some cases. The changes have been committed to the source code and will be included in the next release, after v1.4.11.

There are minor impacts to throughput, but after careful consideration, these changes were included to fully enable the radios' capabilities. We are working on a more efficient resolution.

This bug was mainly discovered due to my work on the nrf_to_nrf driver for nRF52x radios, which I already had caching both the TX & RX addresses. I realized the RF24 driver didn't do that.

***

I also found a bug that appeared to affect SPI functionality on Linux devices, but it turns out it affects all devices using the RF24 core driver with Dynamic Payloads. This includes the entire RF24 Comm Stack.

At first I thought it affected SPI and assumed I was getting bad data, but nothing that I tried to adjust worked. Eventually I narrowed it down to the available() function consistently returning true. Once that was identified, I discovered it had to do with 0-length payloads, and that calling radio.read() has no effect when this happens: the RX buffers need to be flushed. So that's what is being implemented in the RF24 core layer. When using Dynamic Payloads, if the reported payload size is either greater than 32 or 0, the buffers need to be flushed, so the RF24 layer will now do that in both cases.
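
The validation logic described here can be sketched against a mock radio. Everything below is illustrative: MockRadio and getValidatedPayloadSize() are hypothetical stand-ins, not the RF24 implementation, and the mock models a corrupt entry simply as a queued payload whose reported width is 0 or more than 32 bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <vector>

// Hypothetical stand-in for the radio's RX FIFO.
struct MockRadio {
    std::deque<std::vector<uint8_t>> rxFifo;
    unsigned flushCount = 0;

    bool available() const { return !rxFifo.empty(); }

    // Width the radio reports for the payload at the head of the FIFO.
    std::size_t reportedWidth() const { return rxFifo.front().size(); }

    // Discard everything in the RX FIFO.
    void flushRx() {
        rxFifo.clear();
        ++flushCount;
    }
};

constexpr std::size_t kMaxPayloadBytes = 32;

// Returns the payload width if it is valid. Otherwise flushes the RX FIFO
// and returns 0 so the caller knows not to attempt a read.
std::size_t getValidatedPayloadSize(MockRadio& radio) {
    assert(radio.available());
    const std::size_t width = radio.reportedWidth();
    if (width == 0 || width > kMaxPayloadBytes) {
        radio.flushRx();  // a read would not clear this state
        return 0;
    }
    return width;
}
```

A caller would check for a return value of 0 before reading, and could count those events to see how often corruption occurs.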

After all this time and searching for the problem, it was one line of code that had to be changed.

I'd included failure handling in all of the RF24Gateway examples due to this bug, which would intermittently cause the radios to become unresponsive, requiring them to be restarted/reconfigured. I searched high and low for a long time to find it, but at one point I thought it came down to an issue with the network layer not being able to process information fast enough.

The current approach is for the update() function in RF24Network to return a new system type, indicating there has been corruption, and the RX buffers have been flushed. The core RF24 layer will simply return 0 for Dynamic Payload Length.

This is being patched and will be available in the source code very soon, and in the next release. The Linux installer downloads from the source code.

RF24Gateway now displays a count of corrupted payloads

With these changes, I've begun keeping track of how often this happens. On faster devices like RP2040 or Raspberry Pi, it seems to be more prevalent. Of course, it happens way more on a device that is doing more reception than transmitting. In the picture above, the network has detected and flushed 32 corrupt payloads over a short period of time. This was how I was able to replicate the bug, by utilizing RF24Gateway as a testing tool, and hammering it with data from another RPi and from Arduino. Over time, the bug and its workings became clearer, so I was able to narrow it down to the network.update() function in RF24Network, and then further down the stack to the getDynamicPayloadSize() function of the RF24 core library.

I am also now logging the data on Arduino devices, via MQTT and NodeRed. I'm testing on an Arduino Nano, Due and RP2040, to see just how often this affects slower devices. As of writing this, no data is available yet, but I am running the tests long-term, so data will come in eventually, and I will report back here, on this blog post.

Update: The issue affected the RP2040 and AVR devices doing standard communication in my "production" environment after a few days. I've filed a ticket with Nordic in hopes of identifying if this is a known issue, new issue or other.

https://devzone.nordicsemi.com/f/nordic-q-a/121237/nrf24l01-radio-r_rx_pl_wid-returns-0

Thursday

The nrf_to_nrf Library: Recent Updates & Changes

Power Management and Encryption

I've made some fairly significant changes to the nrf_to_nrf library recently. This library allows nRF52x based devices to communicate with each other and with nRF24 based devices.

Power Management:

The nRF52x devices are low power devices, capable of utilizing very little power when all the peripherals are disabled. There was an issue brought to my attention by a user, identifying that the radios were making use of the HF Clock, Random Number Generator and CCM Encryption peripherals, and that the hardware was being enabled in the constructor instead of the begin() function.

I was able to modify the library so that the required peripherals are en/disabled only as required and on calls to powerUp() and powerDown() functions. Users need to keep this in mind, since there may be other needs for these peripherals, which may need to be re-enabled after powering down the radio.

This is fixed in the latest release, v1.2.14.

Authentication & Encryption:

nRF52x devices may also have a CCM mode encryption capability built in. Where present, it was not being configured correctly, and the MAC/MIC verification was actually failing prior to recent updates. Changes were made to correctly copy the MAC/MIC to the receive buffer & the modes are now set prior to encryption/decryption to ensure proper MAC/MIC integrity. A verification is also performed after decryption to ensure the integrity of the MAC/MIC.

This will be fixed in the next release and is available in the source code on GitHub.