Introduction
A while ago, I wrote a program that allows two people to chat over the GPIO pins. The protocol allowed only one bit to be sent at a time and used Morse code to encode the characters. This of course meant that the content was limited to the character set supported by the Morse implementation used. Nevertheless, it worked great. I only have a single Pi at my disposal, so I had to test and debug by connecting the Pi to itself. When everything worked, I achieved a stunning transfer rate of around 50 bit/s. That's right: bits per second, not bytes. A while later, a friend of mine offered to hook his Pi up to mine to test the protocol as it was meant to be used. For some reason, a whopping 25kb/s was achieved! The cause of this dramatic increase is still unclear to me. I would have expected a speedup of maybe a factor of two, but this was simply incredible...
Inspired by these results, I decided to extend the protocol to allow more bits to be sent at a time (limited only by the number of GPIO pins), and to make it independent of character encoding. The result is PiCom, in terms of which PiChat was successfully re-implemented. More interestingly, it allowed PiFTP to be implemented, a file-transfer protocol based on PiCom, which is what this article is about.
I will not assume that you have read my previous article on PiChat. Some parts of the text and code may have been copied directly from that article, as is the case for the next section (Background).
Background
Most GPIO tutorials use the Python library to control the pins. While this is perfectly fine, I have chosen to build my application in C++ using the WiringPi library for C and C++. This is mainly because I'm much more familiar with C++ than with Python, which means I actually dare to show my code. I'd probably be too afraid to do this with any Python code I write... The downside of writing a program in C or C++ for the Pi is that you have to compile it, which can take pretty long on the 700 MHz ARM chip. An additional advantage is that, at least according to this website, the WiringPi library allows for much faster communication than Python. Hopefully, this will lead to higher transfer rates.
On my Pi, I'm using a minimal Debian (Raspbian) image that I downloaded here. I then upgraded to Jessie to be able to use C++11 to its full extent in GCC 4.9 (Wheezy comes with 4.6).
PiCom
What follows will be a description of the PiCom protocol and its C++ implementation. You can use it to implement your own applications that need GPIO communication.
Protocol
Lines
No need for any code yet; let's first discuss the protocol. The PiCom protocol considers two parties, a sender (A) and a listener (B), attached to each other through lines, each of which may consist of multiple channels. Except for the message-line, each line has an input and an output variant. The inputs of A are connected to the outputs of B and vice versa.
- SRI/SRO: Send Receive In/Out
  - Used to notify the other party of the current activity. When A sets SRO high, B will read high on SRI, indicating that A is about to send something.
- SLI1/SLO1, SLI2/SLO2: Sync Line In/Out
  - Used for synchronization.
- ML1, ML2, ... , MLn: Message Line
  - This line is used for both output (sending) and input (receiving), and can contain multiple channels. The number of channels determines the number of bits sent each iteration. The transfer-rate generally scales linearly with the number of channels.
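The cross-wiring of inputs and outputs can be written down as a small lookup. This is only a sketch: the function peer() is hypothetical, the enum simply mirrors the LineName enum of the PiCom class, and it is assumed that each message-line channel is wired to the same channel on the other device.

```cpp
#include <cassert>

// Mirrors the LineName enum of the PiCom class.
enum LineName { SRI, SRO, SLI1, SLO1, SLI2, SLO2, ML, N_LINES };

// Hypothetical helper: the line on the other device that `line` is wired to.
LineName peer(LineName line)
{
    assert(line >= SRI && line < N_LINES);
    switch (line)
    {
        case SRI:  return SRO;
        case SRO:  return SRI;
        case SLI1: return SLO1;
        case SLO1: return SLI1;
        case SLI2: return SLO2;
        case SLO2: return SLI2;
        default:   return ML;   // assumed: ML channels connect to each other
    }
}
```

Applying peer() twice always returns the original line, which is a quick way to check a wiring table for mistakes.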
Synchronization
The protocol depends heavily on a reliable synchronization method. In order to send information across, the two Pis have to communicate to each other that they are ready to send/read the next bit of information. The sync-lines SLI1, SLI2, SLO1 and SLO2 are used only for this purpose.
Each Pi keeps track of two variables: lineSelect and syncVal[2]. The former determines which of the sync-lines is used, and alternates between SLI(O)1 and SLI(O)2 on subsequent calls to the sync-function. The latter holds the synchronization-values: an array of 2 boolean flags, one per sync-line, each of which flips (true/false) every time its line is used. While syncing, device A will wait for the appropriate line to assume the appropriate value, which has to be set by B, and vice versa. By alternating lines and values in this manner, it is ensured that multiple calls in quick succession are handled correctly.
The schematic below shows which lines and values are used on subsequent calls to the sync-function:
Call  Line  Value
1     1     0
2     2     0
3     1     1
4     2     1
5     1     0
...
The sync-function then boils down to 2 simple statements:
- Write the current value to the current output-line (SLO1 or SLO2).
- Wait for the current input-line (SLI1 or SLI2) to assume the current value.
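The alternation of lines and values can be sketched in a few lines. This is only an illustration: the struct SyncState, its member next() and the initial values are assumptions, chosen so that the first calls reproduce the table above. The actual bookkeeping happens inside the sync() member.

```cpp
#include <cassert>
#include <utility>

// Hypothetical helper (not part of PiCom): tracks which sync-line and which
// value the next call to the sync-function uses.
struct SyncState
{
    // Assumed initial values, chosen so that call 1 uses line 1 with value 0.
    int lineSelect = 1;
    bool syncVal[2] = {true, true};

    // Advances to the next call and returns {line (1 or 2), value (0 or 1)}.
    std::pair<int, int> next()
    {
        lineSelect ^= 1;                              // alternate between lines
        assert(lineSelect == 0 || lineSelect == 1);
        syncVal[lineSelect] = !syncVal[lineSelect];   // flip that line's value
        return {lineSelect + 1, syncVal[lineSelect] ? 1 : 0};
    }
};
```

Because each line has its own flag, a line only returns to a previous value after both lines have been used once more, so a device that is one call behind can never mistake an old value for the current one.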
Sending
Now that we have a reliable synchronization scheme at our disposal, we can formulate the way information is sent. Information in this context is any stream of bits and can represent anything, unlike in the previous implementation of PiChat. The information is sent in chunks: packets of multiple bits sent in parallel across multiple channels of the message-line. The number of bits that can be sent in parallel depends on the number of message-line channels available, which might not be the same on device A and B. For this reason, and because the length of the bitstream need not be an exact multiple of the chunk-size, the sending device has to communicate how many message-line channels will be used during the upcoming transmission. A very crude description of the send-algorithm is therefore as follows:
- Set SRO high, indicating that a message will be coming through.
- Configure the Message-Line as output.
- Calculate the number of bits that can be sent in parallel.
- Communicate this number to the receiving device.
- Send the bits on ML1, ML2, ..., MLn.
- Repeat until all bits have been sent.
- Set SRO to low, indicating that this was the entire message.
Each of these steps requires careful synchronization in order to work as expected. This will become apparent in the implementation.
Receiving
The receiving algorithm intertwines with the sending algorithm. It is assumed that the receiving device is constantly monitoring its SRI channel, and the algorithm is triggered as soon as this becomes high (which is step 1 of the sender).
- Configure the Message-Line as input.
- Listen for the number of bits that can be expected, and check whether the local setup supports this number.
- Listen for the bits on ML1, ML2, ... , MLn and append them to the result-vector.
- Repeat until SRI becomes low.
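To see how the two algorithms interlock without any hardware, the bus can be modelled as a list of samples. The sketch below is such a model, not the PiCom implementation: Tick, encode() and decode() are hypothetical names, and each Tick stands for what the listener reads on SRI and the message-line after one pair of synchronization barriers.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// One sample of the bus as seen by the listener.
struct Tick
{
    bool sr;                // level on SRI
    std::vector<bool> ml;   // levels on the message-line channels
};

// Model of the sender: turns a bitstream into a sequence of bus samples.
std::vector<Tick> encode(std::vector<bool> const &bits, int nML)
{
    assert(nML > 0);
    std::vector<Tick> ticks;
    std::vector<bool> const idle(nML, false);
    int const n = static_cast<int>(bits.size());
    int idx = 0;
    while (idx < n)
    {
        int chunkSize = std::min(nML, n - idx);
        for (int i = 0; i != chunkSize; ++i)
            ticks.push_back({true, idle});    // announce the chunk-size
        ticks.push_back({false, idle});       // end of the announcement
        while (idx + chunkSize <= n)
        {
            std::vector<bool> ml(nML, false);
            for (int i = 0; i != chunkSize; ++i)
                ml[i] = bits[idx++];
            ticks.push_back({true, ml});      // one chunk of data
        }
        ticks.push_back({false, idle});       // end of this run of chunks
    }
    ticks.push_back({false, idle});           // chunk-size 0: end of message
    return ticks;
}

// Model of the listener: reconstructs the bitstream from the samples.
std::vector<bool> decode(std::vector<Tick> const &ticks, int nML)
{
    std::vector<bool> ret;
    std::size_t t = 0;
    while (true)
    {
        int chunkSize = 0;
        while (ticks.at(t++).sr)              // count until SRI becomes low
            ++chunkSize;
        if (chunkSize == 0)
            break;
        if (chunkSize > nML)
            throw std::runtime_error("Not enough message-lines available");
        while (ticks.at(t).sr)                // read chunks while SRI is high
        {
            for (int i = 0; i != chunkSize; ++i)
                ret.push_back(ticks[t].ml.at(i));
            ++t;
        }
        ++t;                                  // skip the low sample
    }
    return ret;
}
```

In this model, decode(encode(bits, 2), 2) gives back the original bits, and a listener with fewer channels than the announced chunk-size throws instead of silently losing data.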
Code
Class-Interface
The implementation of the PiCom protocol is summarized by the class-interface below. Applications can inherit from PiCom, or just include it as a member to use its facilities. A class inheriting from PiCom also inherits the protected members, allowing the child to implement extra functionality, possibly replacing the existing send and listen methods.
    class PiCom
    {
    public:
        explicit PiCom(std::string const &pinFile);
        virtual ~PiCom() = default;

        enum LineName
        {
            SRI, SRO,
            SLI1, SLO1,
            SLI2, SLO2,
            ML,
            N_LINES
        };

        void send(std::vector<bool> const &bitstream);
        std::vector<bool> listen();

    protected:
        void sync(int num = 1, int timeout = TIMEOUT);
        bool wait(int line, int val = 1, int timeout = TIMEOUT);
        bool wait(int line, int channel, int val, int timeout);
        void reset();

        void write(int line, bool value);
        void write(int line, int channel, bool value);
        bool read(int line);
        bool read(int line, int channel);
        void mode(int line, int mode);
        void mode(int line, int channel, int mode);
        void pull(int line, int mode);
        void pull(int line, int channel, int mode);
    };
The constructor takes the filename of a pin-file, i.e. a file of a specific format that links the lines (SRI, SRO, etc.) to pin-numbers following the WiringPi pin-numbering convention. For example, a pin-file might look like this:
SRO = 0
SLO1 = 2
SLO2 = 3
SRI = 8
SLI1 = 9
SLI2 = 12
ML1 = 7
ML2 = 13
The parser responsible for parsing such files is very simple. It just finds the equality-symbol on each line (ignoring empty lines), strips blanks from the left- and right-hand side, and passes the result on to the PiCom-object that called the parser. It's not possible to put comments in such a file. Note that this specific setup uses 2 message-line channels (ML1, ML2). This means that, while sending, 2 bits can be sent in parallel in each iteration.
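A parser along these lines could look as follows. This is a minimal sketch and not the parser used by PiCom: the names strip(), parsePinFile() and parsePinText() are made up for the example, the result is returned as a list of (name, value) strings rather than being passed to a PiCom-object, and a line without an equality-symbol simply trips an assertion.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

typedef std::vector<std::pair<std::string, std::string>> PinList;

// Strips blanks (spaces, tabs, carriage returns) from both ends of a string.
std::string strip(std::string const &str)
{
    std::size_t first = str.find_first_not_of(" \t\r");
    if (first == std::string::npos)
        return "";
    std::size_t last = str.find_last_not_of(" \t\r");
    return str.substr(first, last - first + 1);
}

// Returns the (name, value) pairs found in a pin-file, in file order.
PinList parsePinFile(std::istream &in)
{
    PinList result;
    std::string line;
    while (std::getline(in, line))
    {
        if (strip(line).empty())
            continue;                          // ignore empty lines
        std::size_t eq = line.find('=');
        assert(eq != std::string::npos);       // sketch only: no error reporting
        result.emplace_back(strip(line.substr(0, eq)),
                            strip(line.substr(eq + 1)));
    }
    return result;
}

// Convenience overload for parsing pin-file contents held in a string.
PinList parsePinText(std::string const &text)
{
    std::istringstream in(text);
    return parsePinFile(in);
}
```

For a real pin-file, parsePinFile() can be called on a std::ifstream. Converting the value strings to pin-numbers (for example with std::stoi) is left to the caller.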
sync()
The sync() member follows the description above, alternating between lines and values. It uses 2 (private) data-members to keep track of the current line and value: d_syncLineSelect and d_syncVal (all data-members are prefixed with d_, like Stroustrup's m_):
    void PiCom::sync(int num, int timeout)
    {
        static int const in[2] = {SLI1, SLI2};
        static int const out[2] = {SLO1, SLO2};

        for (int i = 0; i != num; ++i)
        {
            d_syncLineSelect ^= 1;
            int s = (d_syncVal[d_syncLineSelect] ^= 1);

            write(out[d_syncLineSelect], s);
            if (!wait(in[d_syncLineSelect], s, timeout))
                throw Exception<TimeOut>("connection timed out");
        }
    }
send()
The send member implements the sending protocol. Because the message-lines are used both for transmitting and receiving data, they first have to be configured as outputs, using the WiringPi wrapper functions:
    int nML = channels(ML);
    for (int c = 0; c != nML; ++c)
        mode(ML, c, OUTPUT);
The std::vector<bool> named bitstream that was passed to send() is then sent in chunks of at most nML bits, using the nML channels available.
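    int n = static_cast<int>(bitstream.size());
    int idx = 0;
    while (idx < n)
    {
        // Use all channels, or fewer if less than nML bits remain.
        int chunkSize = (n - idx >= nML) ? nML : (n - idx);

        // Announce the chunk-size: keep SRO high for chunkSize double-syncs.
        write(SRO, 1);
        for (int i = 0; i != chunkSize; ++i)
            sync(2);
        write(SRO, 0);
        sync(2);

        // Send chunks for as long as a full chunk remains.
        write(SRO, 1);
        while ((idx + chunkSize) <= n)
        {
            for (int i = 0; i != chunkSize; ++i)
                write(ML, i, bitstream[idx++]);
            sync(2);
        }
        write(SRO, 0);
        sync(2);
    }
    // SRO is still low here: a chunk-size of 0 ends the message.
    sync(2);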
Every call to sync() serves a very specific purpose, but I'm well aware that this purpose is not always easily visible from this piece of code alone. Also, every call to sync() in this piece of code takes the argument '2', meaning that it's actually 2 subsequent syncs (equivalent to sync(); sync();). The reason behind this is that the listening device has to monitor SRI (connected to SRO of the sending device) to detect changes. A single synchronization barrier would be insufficient, because there would be no guarantee that the listening device has had the opportunity to read the correct value on SRI before it is changed again. The first barrier can be seen as insurance that the value has been written, whereas the second ensures that the value has been read.
The first loop is executed chunkSize times, giving the listening device the opportunity to count the number of iterations before SRI becomes low. Once the chunk-size has been communicated, chunks of bits are repeatedly written to the message-lines until the number of remaining bits is less than the chunk-size. When this happens, the outer loop repeats and the chunk-size is recalculated. When all the bits have been sent, the outer loop ends and SRO (SRI on the other side) remains low. The final sync(2) therefore communicates a chunk-size of 0, which indicates to the listener that the message has been sent completely.
listen()
The listen() member starts by waiting in an endless loop for the SRI channel to become high. It should be called in daemon-like applications, or by a separate thread (as was the case in PiChat, where the application was listening and sending simultaneously). Otherwise, the program will stall until it receives input from the sender:
    wait(SRI, 0, 1, -1); // line SRI, channel 0, value 1, no timeout
Like send(), listen() has to configure the message-lines. Only this time, they are set to be used for input instead of output:
    for (int channel = 0; channel != nML; ++channel)
    {
        mode(ML, channel, INPUT);
        pull(ML, channel, PUD_DOWN);
    }
The loop that follows runs in parallel with the main-loop of the sending algorithm:
    while (true)
    {
        // Count the number of double-syncs during which SRI stays high.
        int chunkSize = 0;
        while (true)
        {
            sync();
            if (!read(SRI))
                break;
            ++chunkSize;
            sync();
        }
        sync();

        if (chunkSize == 0)     // end of message
            break;
        else if (chunkSize > nML)
            throw Exception<LineError>("Not enough message-lines available");

        // Read chunks for as long as SRI stays high.
        while (true)
        {
            sync();
            if (!read(SRI))
                break;
            for (int i = 0; i != chunkSize; ++i)
                ret.push_back(read(ML, i));
            sync();
        }
        sync();
    }
The first loop is executed as long as SRI remains high. The number of iterations is counted, effectively computing the chunk-size. It's then checked whether this number makes sense. If its value is 0, the loop is broken and if the value exceeds the number of message lines, an exception is thrown. If all is well, the next loop is entered to repeatedly gather all the bits.
PiFTP
An example of an application that uses PiCom to provide the interface between 2 Pis is PiFTP. I will only show the public class-interface and its implementation:
    class PiFTP
    {
        PiCom d_piCom;

    public:
        PiFTP(std::string const &pinfile);
        void send(std::string const &fname);
        void send(std::string const &source, std::string const &dest);
        void listen();
    };
send()
The send member is overloaded to take either 1 or 2 strings:
- With 1 string, the destination filename is identical to the source filename.
- With 2 strings, the destination filename can differ from the source filename.
In the implementation below, the timing-code is left out. The sending algorithm only performs 3 significant actions:
- Send the destination filename as a vector of bits (the path is relative to the directory from which the listening program was run).
- Send the file as a vector of bits (using the conversion function file2bits, provided by the private interface).
- Reset the interface, in order to make subsequent calls to the program work without problems.
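    void PiFTP::send(string const &fname)
    {
        send(fname, fname);
    }

    void PiFTP::send(string const &source, string const &dest)
    {
        d_piCom.send(string2bits(dest));
        d_piCom.send(file2bits(source));
        // reset() is listed as protected in the PiCom interface shown earlier;
        // it has to be accessible to PiFTP for this call to compile.
        d_piCom.reset();
    }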
listen()
    void PiFTP::listen()
    {
        vector<bool> fname = d_piCom.listen();
        vector<bool> content = d_piCom.listen();

        try
        {
            bits2file(bits2string(fname), content);
        }
        catch (Exception<NoSuchFile> const &ex)
        {
            cerr << ex.what() << '\n';
        }
        d_piCom.reset();
    }
The listener first listens for the filename, and then for the content. When both have arrived, it constructs a file from the stream of bits at the specified location. If this fails, an exception is thrown, which is caught and reported on the standard error stream. The interface is reset to make sure the sync-lines are back in their initial state.
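The conversion functions belong to the private interface and are not shown in this article. To give an idea of what they involve, here is one possible way to write string2bits and bits2string. The bit order is an assumption made for this sketch (8 bits per character, most significant bit first); the real implementations may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Converts a string to bits: 8 bits per character, most significant bit first
// (an assumed convention, not necessarily the one used by PiFTP).
std::vector<bool> string2bits(std::string const &str)
{
    std::vector<bool> bits;
    bits.reserve(str.size() * 8);
    for (unsigned char ch : str)
        for (int bit = 7; bit >= 0; --bit)
            bits.push_back((ch >> bit) & 1);
    return bits;
}

// Inverse of string2bits: packs every group of 8 bits into one character.
std::string bits2string(std::vector<bool> const &bits)
{
    assert(bits.size() % 8 == 0);    // only whole bytes are expected
    std::string str;
    for (std::size_t i = 0; i != bits.size(); i += 8)
    {
        unsigned char ch = 0;
        for (int bit = 0; bit != 8; ++bit)
            ch = static_cast<unsigned char>((ch << 1) | (bits[i + bit] ? 1 : 0));
        str.push_back(static_cast<char>(ch));
    }
    return str;
}
```

Both sides have to agree on this convention. With it, a filename of k characters takes 8k bits on the wire, and bits2string(string2bits(s)) returns s unchanged.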
Results
As mentioned before, I'm only able to test the code on a single Pi, forcing me to connect it to itself. This results in very poor performance, but at least I'm able to see some scaling behavior. Unfortunately, the 17 GPIO pins of my Model B only allow for 2 message-line channels per side in this setup (2x4 sync-lines + 2x2 SRx lines take up 12 pins). However, I did notice that the transfer rate doubles when I move from 1 channel to 2 channels.
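The pin budget behind these numbers is simple arithmetic. The function below is a sketch (maxChannels() is a made-up name) that assumes 6 control lines per device (SRI, SRO, SLI1, SLO1, SLI2, SLO2), one pin per message-line channel, and that every GPIO pin counted is actually usable.

```cpp
#include <cassert>

// Number of message-line channels available to each device.
// loopback == true means both devices live on the same Pi.
int maxChannels(int gpioPins, bool loopback)
{
    int const controlLines = 6;          // SRI, SRO, SLI1, SLO1, SLI2, SLO2
    int const devices = loopback ? 2 : 1;
    assert(gpioPins >= devices * controlLines);
    return (gpioPins - devices * controlLines) / devices;
}
```

With 17 pins, the loopback setup leaves (17 - 12) / 2 = 2 channels per side, as observed. Under the same assumptions, two separate Pis would each have 17 - 6 = 11 pins left for message-line channels.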
I'll be able to scale it up soon, when another Pi becomes available again. I will try to generate some nice performance-graphs as soon as I get the chance.
History
Oct 13, first draft