
High Performance Decoupled Buses for IoT Displays

18 Feb 2022, MIT license, 9 min read
Run your IoT display driver independent of the bus it uses, whether I2C, SPI or parallel
Herein, I present a method of achieving high performance bus I/O for IoT display drivers such that the display driver code is independent of the bus it operates on, whether it's I2C, SPI, 8-bit parallel, or some future variant. This way, you can write one SSD1306 driver to handle both the I2C and SPI variants of that display. Similarly, you can create one ILI9341 driver and run it in SPI or parallel.

[Image: bus sample]

Introduction

I have a graphics library called GFX that I wrote for IoT devices. It has several drivers for various IoT displays. Most of these displays are SPI based, but some are I2C or even 8-bit parallel. When I went to add the parallel support, I realized I should refactor everything so the bus is independent of the driver or I'd duplicate a ton of code.

That said, ultimately, this was about performance. Adding parallel support was also about performance, not really device support. If you went through the trouble of finding a parallel display, you want the performance to go with it, otherwise what's the point? Furthermore, I realized that I could get quite a bit better framerates by taking advantage of platform specific SPI operations.

As I said, this undertaking started out as a way to increase performance, not as a way to add flexibility. As such, performance is the primary goal of the code, even if it's not the primary goal of the article. Because it's performance oriented, though, the code isn't always the easiest to understand. I'll break it down as well as I understand it myself. A lot of my SPI and parallel code was inspired by TFT_eSPI by Bodmer, and while it looks pretty much nothing like that code, it derives many of its operating principles from it.

Disclaimer: Some of the low-level SPI code was ported directly from TFT_eSPI, and I don't entirely understand the processor-specific optimized portions. After conversing with Bodmer on the subject, it seems he learned from several sources. If there's documentation out there for the layer at which I'm using SPI on the ESP32, I have yet to find it. I normally don't like to release code I don't completely understand, but in this case I'll make an exception, because I don't know whether documentation for this layer of the ESP32 HAL exists, so I may never understand it. The generic code path is pretty understandable. It only gets weird when leveraging particular hardware optimizations. The ESP32 has a hardware SPI controller and the ESP32 code paths basically interface with that. Somehow. It's black magic, to be sure, but I've found if you just wave a dead chicken over it every once in a while, it works flawlessly.

Prerequisites

You'll want an ESP32 to get the most out of this article, along with one or more supported displays. The included wiring_guide.txt shows how to hook up each display to an ESP32. GFX will run on other devices, but an ESP32 is a good all-around unit for testing, and it can leverage GFX fully because it has ample memory and CPU power. GFX was developed and primarily tested on the ESP32 platform, though it works on some STM32 platforms and probably others. If you want to use another platform, it may work as long as it's Arduino compatible. The parallel support probably won't, though. Although the code is implemented for generic Arduino support, I think the timing isn't tight enough for it to work. The ESP32 optimized code does work. TFT_eSPI has a similar limitation.

The code uses PlatformIO to build, so make sure you have that installed. I use it with the VS Code extension, personally. I think most people do.

The included code assumes a 128x32 SSD1306 display, but several drivers are included. Choose the one that matches a screen of your own, and wire it up using the wiring_guide.txt in the root of the project folder. You will have to feed the driver template the appropriate parameters, which may be slightly different from those in the sample.

Make sure you upload it and see that the demo runs before continuing.

Understanding this Mess

GFX is implemented using generic programming, as are the drivers. Get used to using templates. Here, we are going to use template arguments where you might otherwise have used preprocessor #defines for things like the pin assignments.

The idea here is to instantiate the appropriate bus template (tft_spi<>, tft_i2c<>, or tft_p8<>) and then pass it to the driver template as an argument.

Finally, you can instantiate an instance of the driver template and then draw to it. It will use whichever bus style you specify.
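
To make that shape concrete before touching real hardware, here is a minimal sketch that compiles on a desktop. Everything in it is a stand-in: mock_bus, mock_driver, and their members are hypothetical names I made up for illustration, not the actual GFX or bus API. Only the pattern follows the text: pins go in as template arguments, and the bus type goes into the driver template.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical bus: pin assignments are template arguments, not #defines.
template <int8_t PinCS, int8_t PinDC>
struct mock_bus {
    constexpr static const int8_t pin_cs = PinCS;
    constexpr static const int8_t pin_dc = PinDC;
    // Records every byte "sent" so the sketch can be checked on a PC.
    static std::vector<uint8_t>& sent() {
        static std::vector<uint8_t> s;
        return s;
    }
    static void write(uint8_t value) { sent().push_back(value); }
};

// Hypothetical driver: knows the display's size, but not which bus it is on.
template <uint16_t Width, uint16_t Height, typename Bus>
struct mock_driver {
    using bus = Bus;
    constexpr static const uint16_t width = Width;
    constexpr static const uint16_t height = Height;
    // Sends one zero byte per column as a stand-in for a real clear.
    void clear() {
        for (uint16_t x = 0; x < Width; ++x) bus::write(0x00);
    }
};

// Declare the bus first, then feed it to the driver template.
using bus_type = mock_bus<5, 2>;
using tft_type = mock_driver<128, 32, bus_type>;
```

Swapping in a different bus means changing only the bus_type alias; nothing in mock_driver has to change.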

Using the Code

Consider the following configuration in the project's platformio.ini file:

[env:example]
platform = espressif32
board = node32s
board_build.partitions = no_ota.csv
framework = arduino
monitor_speed = 115200
upload_speed = 921600
build_unflags=-std=gnu++11
build_flags=-std=gnu++14
    -DI2C ; for I2C displays

This sets up an example configuration for an I2C based SSD1306. For SPI devices, remove the last line above.

Let's start with declaring the bus. The following is a master template you can use that supports the three styles of buses:

C++
#include "common/tft_io.hpp"
using namespace arduino;
#if defined(PARALLEL8)
#define PIN_NUM_BCKL -1
#define PIN_NUM_CS   33  // Chip select control pin (library pulls permanently low)
#define PIN_NUM_DC   22  // (RS) Data Command control pin - must use a pin in the range 0-31
#define PIN_NUM_RST  32  // Reset pin, toggles on startup
#define PIN_NUM_WR    21 // Write strobe control pin - must use a pin in the range 0-31
#define PIN_NUM_RD    15 // Read strobe control pin
#define PIN_NUM_D0   2   // Must use pins in the range 0-31 for the data bus
#define PIN_NUM_D1   13  // so a single register write sets/clears all bits.
#define PIN_NUM_D2   26  // Pins can be randomly assigned, this does not affect
#define PIN_NUM_D3   25  // TFT screen update performance.
#define PIN_NUM_D4   27
#define PIN_NUM_D5   12
#define PIN_NUM_D6   14
#define PIN_NUM_D7   4
#elif defined(I2C)
#define TFT_PORT 0
#define PIN_NUM_SDA 21
#define PIN_NUM_SCL 22
#define PIN_NUM_RST -1
#define PIN_NUM_DC -1
#define TFT_ADDR 0x3C
#else
#define TFT_HOST VSPI
#define PIN_NUM_CS 5
#define PIN_NUM_MOSI 23
#define PIN_NUM_MISO 19
#define PIN_NUM_CLK 18
#define PIN_NUM_DC 2
#define PIN_NUM_RST 4
#endif

#ifdef PARALLEL8
using bus_type = tft_p8<PIN_NUM_CS,
                        PIN_NUM_WR,
                        PIN_NUM_RD,
                        PIN_NUM_D0,
                        PIN_NUM_D1,
                        PIN_NUM_D2,
                        PIN_NUM_D3,
                        PIN_NUM_D4,
                        PIN_NUM_D5,
                        PIN_NUM_D6,
                        PIN_NUM_D7>;
#elif defined(I2C)
using bus_type = tft_i2c<TFT_PORT,
                        PIN_NUM_SDA,
                        PIN_NUM_SCL>;
#else
using bus_type = tft_spi<TFT_HOST,
                        PIN_NUM_CS,
                        PIN_NUM_MOSI,
                        PIN_NUM_MISO,
                        PIN_NUM_CLK,
                        SPI_MODE0,
                        (PIN_NUM_MISO<0)
#ifdef OPTIMIZE_DMA
                        ,(TFT_WIDTH*TFT_HEIGHT)*2+8
#endif
>;
#endif

The code in the example project is similar to the above. TFT_WIDTH and TFT_HEIGHT are not defined in this excerpt, so define them to match your display.

Now, you can use -DPARALLEL8 or -DI2C as a compiler option to switch from SPI to the selected bus type. Choose the appropriate one for your display device. Notice we pass pin assignments to the bus. Once the bus is declared, no further action is needed to initialize it, as the driver will do that automatically.

Next, we need to choose the appropriate display driver and include it. We'll use an SSD1306 since they are cheap and ubiquitous. They typically come in I2C or SPI variants and although internally, they are capable of parallel I/O, I've never seen a breakout with a parallel interface for this device.

Note the driver include near the top of the actual project, after the previous include:

C++
#include "ssd1306.hpp"

Now below the declaration of bus_type, we instantiate the driver template and then declare an instance, feeding it the bus type:

C++
using tft_type = ssd1306<TFT_WIDTH,
                        TFT_HEIGHT,
                        bus_type,
                        TFT_ADDR,
                        TFT_VDC_3_3,
                        PIN_NUM_DC,
                        PIN_NUM_RST,
                        true>;
tft_type tft;

Finally, the driver is ready to be used, but to draw to it, we include GFX as shown in the example project:

C++
#include "gfx_cpp14.hpp"
using namespace gfx;

Even though this is monochrome, it's typical to declare the X11 colors in the tft_type's native pixel format:

C++
using tft_color = 
  color<typename tft_type::pixel_type>;

Now we can use draw to draw to tft:

C++
draw::filled_rectangle(tft,
                      (srect16)tft.bounds(),
                      tft_color::black);
for(int i = 1;i<100;i+=10) {
    // calculate our extents
    srect16 r(i*(tft.dimensions().width/100.0),
            i*(tft.dimensions().height/100.0),
            tft.dimensions().width-i*
              (tft.dimensions().width/100.0)-1,
            tft.dimensions().height-i*
              (tft.dimensions().height/100.0)-1);

    draw::line(tft,
              srect16(0,
                      r.y1,
                      r.x1,
                      tft.dimensions().height-1),
              tft_color::white);
    draw::line(tft,
              srect16(r.x2,
                      0,
                      tft.dimensions().width-1,
                      r.y2),
              tft_color::white);
    draw::line(tft,
              srect16(0,r.y2,r.x1,0),
              tft_color::white);
    draw::line(tft,
              srect16(tft.dimensions().width-1,
                      r.y1,
                      r.x2,
                      tft.dimensions().height-1),
              tft_color::white);
}

That draws a pattern around the borders of the display.

The drawing isn't really the point of this article, but I felt I'd include it so you get a feel for the code from end to end.

The key point here is your bus type can be whatever you like. For this display, as I mentioned before, it typically comes in SPI and I2C varieties. I2C is selected by including the -DI2C in the platformio.ini file which adds that switch to the compiler, defining I2C to the C/C++ preprocessor. If you don't specify it, SPI is selected. Additionally -DPARALLEL8 can be selected, but I've never seen an SSD1306 with that interface.

However, I do have an ILI9341 with a parallel interface. If you have one, you can use the ili9341.hpp driver with -DPARALLEL8, and the included wiring_guide.txt to hook everything up. Using the driver is exactly the same aside from slightly different setup, and the drawing code will work with any device. Just make sure to set your #defines appropriately, like the width and height. Also keep in mind that some drivers take different template parameters than others.

Using other devices is pretty much the same. Just choose a bus, include the appropriate header and driver instantiation and wire everything up.

How It Works

Now we get into the meat.

We exploit the fact that across almost all devices, there is similar required behavior. For example, devices have commands and data. The data is often parameters to commands, but sometimes it's a stream of pixels, although that is technically a BLOB parameter to a memory write command. Anyway, on an SPI device, you typically have an additional "DC" line that toggles between commands and data. I2C has something similar, except that the toggle is indicated by a code in the first byte of every I2C transaction. Parallel also has a DC line, though it's usually called RS, and it does the same thing as the SPI variant.
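
The difference between the two signaling styles can be sketched with a pair of mock buses. The struct and function names here are hypothetical and are not the library's API. The control byte values (0x00 before commands, 0x40 before data) are the ones the SSD1306 uses over I2C, and DC low for command and high for data is the usual convention for these controllers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical SPI-style bus: a separate DC line selects command or data.
struct mock_spi_bus {
    bool dc_high = false;       // simulated DC pin level
    std::vector<uint8_t> wire;  // bytes clocked out
    void send(bool is_data, const uint8_t* p, size_t n) {
        dc_high = is_data;      // low = command, high = data
        wire.insert(wire.end(), p, p + n);
    }
};

// Hypothetical I2C-style bus: no DC line, so each transaction
// starts with a control byte that says what follows.
struct mock_i2c_bus {
    std::vector<uint8_t> wire;
    void send(bool is_data, const uint8_t* p, size_t n) {
        wire.push_back(is_data ? 0x40 : 0x00);  // SSD1306 control byte
        wire.insert(wire.end(), p, p + n);
    }
};

// Driver-side helpers: identical no matter which bus is underneath.
template <typename Bus>
void send_command(Bus& bus, uint8_t cmd) { bus.send(false, &cmd, 1); }

template <typename Bus>
void send_data(Bus& bus, const uint8_t* p, size_t n) { bus.send(true, p, n); }
```

The driver only ever says "this is a command" or "this is data". How that reaches the display is the bus's problem.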

The idea here is we are going to expand the surface area of our bus API to include everything applicable to any kind of bus, so for example, you may have begin_transaction() and end_transaction() which for SPI define transaction boundaries, but do nothing in the parallel rendition.
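
Here is a rough sketch of that widened API, using made-up mock types rather than the real bus classes. Only the begin_transaction() and end_transaction() names come from the text above; the counters and the fill_rows() helper are assumptions added so the behavior can be checked on a PC.

```cpp
#include <cassert>

// Hypothetical SPI-style bus: transactions are real and must balance.
struct mock_spi_bus {
    static int& open_count() { static int n = 0; return n; }
    static int& total() { static int n = 0; return n; }
    static void begin_transaction() { ++open_count(); ++total(); }
    static void end_transaction() { --open_count(); }
};

// Hypothetical parallel-style bus: same member names, empty bodies.
struct mock_parallel_bus {
    static void begin_transaction() {}
    static void end_transaction() {}
};

// Driver code calls the full API unconditionally, whatever the bus.
template <typename Bus>
int fill_rows(int rows) {
    int written = 0;
    Bus::begin_transaction();
    for (int y = 0; y < rows; ++y) ++written;  // stand-in for pixel writes
    Bus::end_transaction();
    return written;
}
```

Because the empty members are visible at compile time, an optimizing compiler can typically drop the calls entirely, so the parallel path shouldn't pay for transactions it doesn't have.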

The I2C bus is pretty straightforward, but the SPI bus and parallel buses are significantly more complicated due to having processor specific optimizations. It should be noted that the generic implementation for the parallel bus does not function in my tests. I think it's a timing issue, and I have at least one more thing I can try when I get the time and motivation. For now, it's basically a feature of the ESP32 and the STM32 ARMs.

One nice thing about using templates for the bus and the driver is that different arguments yield different concrete classes, meaning any statics are specific to that template instantiation. The upshot is that you can run multiple displays, whether of the same type or of different types. Contrast this with TFT_eSPI, which can only drive displays of a single type since it uses application-wide globals and statics that are not per device.
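
The per-instantiation statics are plain C++ behavior, and a small desktop sketch shows it. The mock_bus and mock_driver types below are hypothetical stand-ins, not the real classes.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical bus with static state, keyed by its chip-select pin.
template <int8_t PinCS>
struct mock_bus {
    static int& writes() { static int n = 0; return n; }
    static void write(uint8_t) { ++writes(); }
};

// Hypothetical driver that only ever talks to its own Bus.
template <typename Bus>
struct mock_driver {
    void draw_pixels(int n) {
        for (int i = 0; i < n; ++i) Bus::write(0xFF);
    }
};

// Two displays on different chip-select pins are two distinct classes,
// so each gets its own copy of the static counter.
using bus_a = mock_bus<5>;
using bus_b = mock_bus<15>;
static_assert(!std::is_same<bus_a, bus_b>::value,
              "different pins give different classes");
```

The flip side is that two instantiations with identical template arguments are the same class and share their statics, so in this sketch each display needs at least one argument that differs.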

tft_core.hpp includes some basic code common to all the bus types. tft_spi.hpp, tft_i2c.hpp, and tft_parallel8.hpp each contain the respective bus type, but tft_io.hpp simply includes all of these, and is the recommended header to use.

tft_driver.hpp provides tft_driver, which is used for driving a bus. It drives the DC line for SPI. It also tells the bus whether it's in command or data mode, though the only bus that currently needs that information is I2C, which has no DC line and instead uses a single leading byte code in each transaction payload, as mentioned. A display driver can also drive the bus directly alongside tft_driver, exploiting the commonalities I mentioned at the start of this section. Some of the display driver code does exactly that rather than going through tft_driver, primarily for performance reasons.

One wrinkle is that SPI can do DMA, which allows asynchronous I/O operations that run in parallel with your CPU task. On non-SPI buses, functions like dma_wait(), which waits for a pending DMA operation, are no-ops, and write_raw_dma() simply forwards to the non-DMA function, performing the I/O synchronously. When DMA is available, the OPTIMIZE_DMA define will be present. To enable DMA, the maximum size of a DMA transfer must be specified as an SPI bus template argument, and it should include 8 bytes of padding.
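
A sketch of the fallback, with assumptions flagged: dma_wait() and write_raw_dma() are named in the text, but the signatures below, the write_raw() name for the non-DMA function, the mock_sync_bus type, and the flush() helper are all my own guesses for illustration. The dma_size() helper simply reproduces the (TFT_WIDTH*TFT_HEIGHT)*2+8 expression from the bus declaration earlier, where the factor of 2 is presumably two bytes per pixel.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Same arithmetic as the bus declaration: width * height * 2, plus 8 bytes of padding.
constexpr size_t dma_size(size_t width, size_t height) {
    return width * height * 2 + 8;
}

// Hypothetical bus without DMA: the DMA entry points still exist.
struct mock_sync_bus {
    static size_t& sync_bytes() { static size_t n = 0; return n; }
    // Assumed name for the synchronous write path.
    static void write_raw(const uint8_t*, size_t length) {
        sync_bytes() += length;
    }
    // No DMA here, so forward to the synchronous path.
    static void write_raw_dma(const uint8_t* data, size_t length) {
        write_raw(data, length);
    }
    // Nothing can be pending, so waiting is a no-op.
    static void dma_wait() {}
};

// Caller code is written once against the DMA-style API.
template <typename Bus>
void flush(const uint8_t* frame, size_t length) {
    Bus::dma_wait();                     // wait for any previous transfer
    Bus::write_raw_dma(frame, length);   // async if the bus can, sync if not
}
```

For a 320x240 panel, dma_size(320, 240) works out to 153,608 bytes.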

Conclusion

I believe decoupling the bus from the IoT display driver is a novel way to add the flexibility to support the myriad of display configurations available for IoT. Using templates for this and for the driver itself allows for a lot of flexibility while maintaining run time performance.

Hopefully, this code inspires you to use GFX for your IoT projects. With GFX, you can do advanced things like alpha blending, JPG display, and TrueType fonts, allowing you to create fresh modern interfaces. With this new driver framework, you get all of this plus better framerates than the previous code.

Enjoy!

History

  • 12th February, 2022 - Initial submission
  • 18th February, 2022 - Updated and simplified code. Added more drivers.

License

This article, along with any associated source code and files, is licensed under The MIT License