This article describes a C++ framework for defining, parsing, and executing commands entered from the console or another stream. It covers the framework's capabilities, provides examples of its use, and discusses how to implement a command.
Introduction
Command line interfaces (CLIs) need little introduction. They're the usual interface to *nix systems, and MS-DOS (the precursor of Windows) used a CLI exclusively. Although graphical user interfaces (GUIs) have made significant inroads, CLIs are still in widespread use because some platforms only support console applications.
Using the Code
The CLI that we will look at is that of the Robust Services Core (RSC). If this is the first time that you're reading an article about an aspect of RSC, please take a few minutes to read this preface.
Although RSC's repository contains two sample applications, they are not central to its purpose. As a framework, RSC has little need for a GUI. In fact, a GUI would be inappropriate because an application that needs one will usually want to provide its own. Other applications might be targeted at platforms that don't even support GUIs.
If you want to build an application on RSC, this article should help you to write your own CLI commands. If you don't want to use RSC but need a CLI, you can copy and modify the code to meet your needs, subject to RSC's GPL-3.0 license.
Capabilities
Before diving into code, it is helpful to understand what it's trying to do, so let's look at the capabilities that RSC's CLI provides.
Streams. The CLI can take its input from the console or a file (a "script") and also send its results to the console or a file. Input from, and output to, a socket will eventually be supported as well.
Help. A command's syntax is defined using CLI base classes that force a command and each of its parameters to provide basic documentation. This allows all commands to be listed, along with a description of each one's parameters. If further explanation is required, a more detailed help file can be displayed.
Parameters. The CLI's basic parameters are booleans, characters, integers, pointers, and strings. Whenever possible, CLI base classes screen out invalid inputs. Each parameter can be mandatory or optional, and more specialized parameters, such as filenames and IP addresses, can be constructed using the basic parameters.
Break character. The ability to enter multiple commands on the same line is useful when the user doesn't want to wait for a previous one to complete before entering the next one. A semicolon (;) separates commands on the same line.
Escape character. Now that a character (;) has a special meaning, we need an escape character that can precede a character to suppress its special meaning and just use it literally. A backslash (\) is used for this.
Strings. By default, asking the command line parser for a string skips any whitespace and then assembles a string until more whitespace or an endline is reached. A string that contains whitespace must therefore be enclosed in quotation marks (").
Comments. The CLI can execute scripts (files that contain commands), so it should be possible to comment them. The forward slash (/) serves this purpose; anything that follows it on the same line is ignored.
Defaults. Each command takes its parameters in a fixed order, and mandatory parameters must precede optional parameters. This is different from the Unix approach, which tags parameters (e.g., with -p) so they can be entered in a flexible order, with some of them omitted. Because a fixed order can lead to ambiguous parses, it is useful to define a "skip" character that can be used for any optional parameter. This causes the parameter's default value to be used so that the next optional parameter can be entered. A tilde (~) is used for this purpose.
Symbols. The use of magic constants can be avoided by defining symbols that represent them. When using trace tools, for example, it is usually desirable to exclude the work of some threads from the trace. RSC assigns each thread to a faction based on the type of work that it performs. An enum defines these factions, but rather than use the integral values of its enumerators in trace tool commands, it is better to define symbols like faction.system and faction.audit to represent them. To simplify the parsing of CLI commands, these symbols are called out by prefixing them with a special character. An ampersand (&) is used as the prefix.
Transcripts. It can be useful to record all CLI input and output in a single transcript file, so the CLI supports this capability.
Layers. RSC is implemented in static libraries so that a build only has to include the capabilities that an application needs. The CLI must therefore allow each static library to add its own commands, as well as extend the commands provided by other libraries that it uses.
Aborts. The CLI should be able to abort the command in progress and prompt the user for a new command. This is done in the usual way, with ctrl-C.
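The break, escape, quotation, and comment characters described above can be illustrated with a small line splitter. This is only a sketch and not RSC's parser: the function name SplitCommands is hypothetical, and the handling of quotes (kept in the output, with ; and / treated literally inside them) is an assumption made for the example.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper, not part of RSC: splits one input line into commands.
//   ';'  separates commands
//   '\'  makes the next character literal
//   '"'  toggles quoting, so ';' and '/' inside quotes are literal
//   '/'  starts a comment that runs to the end of the line
std::vector<std::string> SplitCommands(const std::string& line)
{
   std::vector<std::string> commands;
   std::string curr;
   bool quoted = false;

   for(std::size_t i = 0; i < line.size(); ++i)
   {
      char c = line[i];

      if(c == '\\' && i + 1 < line.size())
      {
         curr.push_back(line[++i]);  // escaped: take the next char literally
      }
      else if(c == '"')
      {
         quoted = !quoted;
         curr.push_back(c);  // keep the quotes for a later parsing stage
      }
      else if(!quoted && c == ';')
      {
         commands.push_back(curr);  // end of one command
         curr.clear();
      }
      else if(!quoted && c == '/')
      {
         break;  // the rest of the line is a comment
      }
      else
      {
         curr.push_back(c);
      }
   }

   if(!curr.empty()) commands.push_back(curr);
   return commands;
}
```

Each returned string would then be handed to whatever code parses a single command; leading and trailing whitespace is left in place for that stage to skip.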
Overview of the Classes
This article takes a bottom-up approach. It starts by looking at parameters, then commands, and then groups of commands before finally looking at the thread that implements the CLI. The goal is to give you the background required to add your own CLI commands while at the same time covering many of the CLI's capabilities.
CLI commands are created during system initialization and are never deleted. There is some boilerplate involved in creating a CLI command, but I hope you'll find that the payback is worthwhile.
The CLI was originally written around 2006. Although it often evolved to support new capabilities, it was never thoroughly revisited. To prepare for this article, I rewrote parts of it, although its overall design stayed the same. But you may still find some cringeworthy things. I certainly did when taking a fresh look at it!
CliParm
This is the virtual base class for CLI parameters, so its constructor defines what is common to all parameters:
CliParm::CliParm(c_string help, bool opt, c_string tag) :
   help_(help),
   opt_(opt),
   tag_(tag)
{
   Debug::Assert(help_ != nullptr);
   auto size = strlen(help_);
   auto total = ParmWidth + strlen(ParmExplPrefix) + size;
   if(size == 0)
      Debug::SwLog(CliParm_ctor, "help string empty", size);
   else if(total >= COUT_LENGTH_MAX)
      Debug::SwLog
         (CliParm_ctor, "help string too long", total - COUT_LENGTH_MAX + 1);
}
help, a C-style string, is mandatory and must be concise enough to fit on the same line as a column that describes the parameter's legal values. The default value of COUT_LENGTH_MAX is 80. opt specifies whether the parameter is mandatory or optional. tag is optional; it allows a parameter's value to be preceded by tag=, similar to what Unix does with its -p style of prefix mentioned earlier. Only a few parameters define a tag, but doing so allows preceding optional parameters to be omitted.
CliBoolParm
This is the base class for a boolean parameter. A subclass only has to provide a constructor that invokes the CliBoolParm constructor to set its own attributes:
CliBoolParm::CliBoolParm(c_string help, bool opt, c_string tag) : CliParm(help, opt, tag) { }
This constructor only takes the arguments already defined by CliParm. So why does CliBoolParm exist? The answer is that it parses input for a boolean parameter. We will look at how the command line is parsed later.
CliCharParm
This is the base class for a character parameter, which accepts a single character from a specified list. A subclass only has to provide a constructor that invokes the CliCharParm constructor to set its own attributes:
CliCharParm::CliCharParm
   (c_string help, c_string chars, bool opt, c_string tag) :
   CliParm(help, opt, tag),
   chars_(chars)
{
   Debug::Assert(chars_ != nullptr);
}
chars is a string that contains the legal characters.
CliIntParm
This is the base class for an integer parameter. A subclass only has to provide a constructor that invokes the CliIntParm constructor to set its own attributes:
CliIntParm::CliIntParm(c_string help, word min, word max, bool opt, c_string tag, bool hex) :
   CliParm(help, opt, tag),
   min_(min),
   max_(max),
   hex_(hex)
{
}
min is the parameter's minimum legal value. max is the parameter's maximum legal value. hex is set if the parameter must be entered in hex.
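To make the role of min, max, and hex concrete, here is a minimal sketch of the kind of screening an integer parameter could perform on a token. The function ParseIntInRange is hypothetical and is not RSC's implementation; it uses std::intptr_t because word is described later in this article as a typedef for intptr_t.

```cpp
#include <cerrno>
#include <cstdint>
#include <cstdlib>
#include <string>

// Hypothetical stand-in for the screening done by an integer parameter.
// Returns true and sets VALUE only if TOKEN is a complete integer (decimal,
// or hex when HEX is set) that lies within [min, max].
bool ParseIntInRange(const std::string& token,
   std::intptr_t min, std::intptr_t max, bool hex, std::intptr_t& value)
{
   if(token.empty()) return false;

   errno = 0;
   char* end = nullptr;
   long long n = std::strtoll(token.c_str(), &end, hex ? 16 : 10);

   if(errno == ERANGE) return false;                      // overflow
   if(end != token.c_str() + token.size()) return false;  // trailing junk
   if(n < min || n > max) return false;                   // out of range

   value = static_cast<std::intptr_t>(n);
   return true;
}
```

Rejecting the token here, before the command's own logic runs, is what allows a command to assume that any integer it receives is already legal.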
CliPtrParm
This is the base class for a pointer parameter. A subclass only has to provide a constructor that invokes the CliPtrParm constructor to set its own attributes:
CliPtrParm::CliPtrParm(c_string help, bool opt, c_string tag) : CliParm(help, opt, tag) { }
As with CliBoolParm, no additional arguments are needed. Any value up to uintptr_max is acceptable, and it must be entered in hex.
CliText
This class defines a specific string that can be a valid parameter. A subclass only has to provide a constructor that invokes the CliText constructor to set its own attributes:
CliText::CliText(c_string help, c_string text, bool opt, uint32_t size) :
   CliParm(help, opt, nullptr),
   text_(text)
{
   if(text_ == nullptr) text_ = EMPTY_STR;
   parms_.Init(size, CliParm::CellDiff(), MemImmutable);
}
text is the specific string that can be used as a parameter. size is the maximum size of parms_, which contains any additional parameters that can follow text. parms_ is of type Registry<CliParm>. RSC's Registry template is similar to vector but allows an element to specify the index where it will be placed. Classes in a registry can be invoked polymorphically, each one via its index. The parms_ registry uses MemImmutable, which means that it is write-protected once the system has initialized, very much like a static const data member.
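The idea of a container whose elements can choose their own index can be sketched as follows. SimpleRegistry is a hypothetical, much-reduced illustration and not RSC's Registry template: it ignores memory types such as MemImmutable, and its convention that index 0 means "use the first free slot" is an assumption made for the example.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, simplified registry: a fixed-capacity array of non-owning
// pointers in which an item may request a specific index.
template<typename T> class SimpleRegistry
{
public:
   explicit SimpleRegistry(std::size_t max) : items_(max + 1, nullptr) { }

   // Places ITEM at INDEX, or in the first free slot if INDEX is 0.
   // Returns the index that was used, or 0 on failure.
   std::size_t Insert(T& item, std::size_t index = 0)
   {
      if(index == 0)
      {
         for(std::size_t i = 1; i < items_.size(); ++i)
         {
            if(items_[i] == nullptr) { index = i; break; }
         }
         if(index == 0) return 0;  // registry is full
      }
      else if(index >= items_.size() || items_[index] != nullptr)
      {
         return 0;  // out of range or already occupied
      }

      items_[index] = &item;
      return index;
   }

   // Returns the item at INDEX, or nullptr if the slot is empty or invalid.
   T* At(std::size_t index) const
   {
      return (index < items_.size()) ? items_[index] : nullptr;
   }
private:
   std::vector<T*> items_;  // slot 0 is unused so that 0 can mean "none"
};
```

Because an item's index is stable, it can double as an identifier, which is how a string's index can later be used as a case label.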
If parameters can follow the string, the constructor also creates those parameters. An example of this will appear soon.
CliTextParm
This is a container for a set of strings, any one of which is a valid input for this parameter:
CliTextParm::CliTextParm(c_string help, bool opt, uint32_t size,
   c_string tag) : CliParm(help, opt, tag)
{
   strings_.Init(size, CliParm::CellDiff(), MemImmutable);
}
size is the maximum size of strings_, which contains the legal strings for this parameter. strings_ is of type Registry<CliText>. The last string in the registry may be empty (see CliText's constructor), in which case it will match any string.
A CliTextParm subclass defines an index for each valid string and registers a string against its index. The CLI's parser will return the string's index, which effectively allows the string to be used in a switch statement. For example, the following class is used by CLI commands that allow the user to enable or disable a configuration parameter by entering on or off:
class SetHowParm : public CliTextParm
{
public:
   static const id_t On = 1;
   static const id_t Off = 2;
   SetHowParm() : CliTextParm("setting...")
   {
      BindText(*new CliText("on", "on"), On);
      BindText(*new CliText("off", "off"), Off);
   }
};
Note that the SetHowParm constructor created two instances of CliText directly. If you prefer, you can create instances of CliBoolParm, CliCharParm, CliIntParm, CliPtrParm, CliText, and CliTextParm this way, by simply passing the appropriate arguments to their constructors. You would then define an actual subclass for a parameter only when it takes additional parameters or when it is used in several commands.
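The way a string's index ends up in a switch statement can be shown with a standalone sketch. TextChoices and Apply are hypothetical names that do not exist in RSC; the only behavior borrowed from the text above is that each string is registered against an index and that an empty registered string matches anything.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical miniature of a text parameter: each legal string is bound
// to an index, and a lookup returns that index (0 means "no match").
class TextChoices
{
public:
   void Bind(const std::string& text, int index)
   {
      choices_.push_back(std::make_pair(text, index));
   }

   int Find(const std::string& input) const
   {
      for(const auto& c : choices_)
      {
         // An empty registered string acts as a wildcard.
         if(c.first.empty() || c.first == input) return c.second;
      }
      return 0;
   }
private:
   std::vector<std::pair<std::string, int>> choices_;
};

const int On = 1;
const int Off = 2;

// Uses the index returned by the lookup as a case label.
std::string Apply(const TextChoices& how, const std::string& input)
{
   switch(how.Find(input))
   {
   case On: return "enabled";
   case Off: return "disabled";
   default: return "invalid";
   }
}
```

Since a wildcard matches everything, it only makes sense as the last entry, which is consistent with the remark that only the last string in the registry may be empty.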
CliCommand
Finally we arrive at the class that is used to implement a CLI command. A command is invoked through a specific string, so it derives from CliText and is followed by the additional parameters that a command often takes:
CliCommand::CliCommand(c_string comm, c_string help, uint32_t size) :
   CliText(help, comm, false, size)
{
   if((comm != nullptr) && (strlen(comm) > CommandWidth))
   {
      Debug::SwLog(CliCommand_ctor, "command name length", strlen(comm));
   }
}
The length of a command name is limited for display purposes and to avoid annoying users.
A CliCommand subclass creates and registers the parameters that follow the command name. For example:
DisplayCommand::DisplayCommand() :
   CliCommand("display", "Displays an object derived from NodeBase::Base.")
{
   BindParm(*new CliPtrParm("pointer to an object derived from Base"));
   BindParm(*new CliCharParm("'b'=brief 'v'=verbose (default='b')", "bv", true));
}
This command invokes the Display function that most objects provide. Its parameters are a mandatory pointer to the object and an optional character that specifies whether the output should be brief or verbose.
A CliCommand subclass also overrides a pure virtual ProcessCommand function that obtains its parameters from the command line and executes the command. We will look at this later.
CliCommandSet
When a command would have a large ProcessCommand function, splitting it into subcommands, each with its own ProcessCommand function, can make things more manageable. This is the purpose of CliCommandSet, which allows CliCommand instances to register directly under an encompassing command:
CliCommandSet::CliCommandSet(c_string comm, c_string help, uint32_t size) :
   CliCommand(comm, help, size) { }
As an example, NtIncrement.cpp contains commands for testing various RSC classes, and the commands that test each class are grouped into a CliCommandSet. Here's the one for testing NbHeap:
HeapCommands::HeapCommands() :
   CliCommandSet("heap", "Tests an NbHeap function.")
{
   BindCommand(*new HeapCreateCommand);
   BindCommand(*new HeapDestroyCommand);
   BindCommand(*new HeapAllocCommand);
   BindCommand(*new HeapBlockToSizeCommand);
   BindCommand(*new HeapDisplayCommand);
   BindCommand(*new HeapFreeCommand);
   BindCommand(*new HeapValidateCommand);
}
This allows each heap function to be executed through its own ProcessCommand implementation. For example,
>heap alloc 256
ends up invoking HeapAllocCommand::ProcessCommand. The base class CliCommandSet retrieves the next parameter ("alloc") and uses it to delegate to the second-level command.
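The delegation step can be sketched in isolation. The CommandSet class below is hypothetical and only mimics the behavior just described (read the next token, find the matching subcommand, hand it the rest of the line); it uses std::function handlers instead of RSC's CliCommand subclasses to stay self-contained.

```cpp
#include <functional>
#include <map>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical sketch of a command set: subcommands register under a name,
// and the set reads the next token to decide which one should run.
class CommandSet
{
public:
   using Handler = std::function<long(std::istringstream&)>;

   void Bind(const std::string& name, Handler handler)
   {
      subcommands_[name] = std::move(handler);
   }

   // Returns the subcommand's result, or -1 if no subcommand matches.
   long Process(std::istringstream& input) const
   {
      std::string name;
      if(!(input >> name)) return -1;  // no subcommand was entered

      auto it = subcommands_.find(name);
      if(it == subcommands_.end()) return -1;

      return it->second(input);  // delegate the rest of the line
   }
private:
   std::map<std::string, Handler> subcommands_;
};
```

A handler bound under "alloc" would then receive a stream positioned just before 256, in the same way that a second-level command only sees the parameters that follow its own name.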
CliIncrement
A subclass of CliIncrement contains commands available in a given static library. Right now, each static library has only one increment, but a library could contain more than one if this were desirable.
RSC's only mandatory library is the one implemented by the namespace NodeBase. All the others are optional, though some depend on others. A command must be implemented in a library that can use all of the code items that the command needs. Although all of the system's commands could register in a common location, whether they were grouped into increments or not, this could lead to conflicts between command names in a large code base. It would also make visible many commands that were not relevant to what a user was currently doing.
Therefore, when RSC initializes and the CLI prompts the user for input, its command prompt is nb>. This indicates that only the commands in the NodeBase increment (NbIncrement.cpp) are available. To access a command in another increment, the increment's name must precede the command. However, an increment's name can also be entered as a standalone command, which pushes that increment onto the stack of available increments. This makes all of its commands accessible without the need for the prefix, with any name conflict being resolved in favor of the increment that is highest on the stack.
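Resolving a command name against a stack of increments can be sketched as a search from the top of the stack downward. The Increment struct and Resolve function are hypothetical simplifications; RSC's increments hold CliCommand objects rather than a map of names.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical model of an increment: a name plus the commands it owns.
struct Increment
{
   std::string name;
   std::map<std::string, std::string> commands;  // command name -> help text
};

// Searches from the top of the stack downward, so the most recently entered
// increment wins any name conflict. Returns the owning increment's name, or
// an empty string if no increment on the stack defines COMMAND.
std::string Resolve
   (const std::vector<const Increment*>& stack, const std::string& command)
{
   for(auto it = stack.rbegin(); it != stack.rend(); ++it)
   {
      if((*it)->commands.count(command) != 0) return (*it)->name;
   }
   return std::string();
}
```

Entering an increment's name corresponds to a push_back on the stack, and quit corresponds to a pop_back, after which commands in the popped increment are no longer found without a prefix.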
Here is an example of entering the nw increment, which supports RSC's network layer, and then the sb increment, which supports its session processing layer. The quit command removes the top increment from the stack and returns to the previous one. Note how the command prompt changes to indicate which increment is currently on top of the stack. After quit has been entered twice, we're back in the nb increment:
Although CliIncrement does not derive from CliParm, its constructor's arguments are similar to ones that we've already seen:
CliIncrement::CliIncrement(c_string name, c_string help, uint32_t size) :
   name_(name),
   help_(help)
{
   Debug::Assert(name_ != nullptr);
   Debug::Assert(help_ != nullptr);
   commands_.Init(size, CliParm::CellDiff(), MemImmutable);
   Singleton<CliRegistry>::Instance()->BindIncrement(*this);
}
name is the increment's name, in the same way that each command has a name. help indicates which static library is supported by the increment's commands. size is the maximum size of commands_, which contains all of the increment's commands. commands_ is of type Registry<CliCommand>.
The constructor registers the new increment with CliRegistry, which contains all of the system's increments. The incrs command lists them:
A subclass of CliIncrement only has to provide a constructor that creates and registers its commands:
NbIncrement::NbIncrement() : CliIncrement("nb", "NodeBase Increment", 48)
{
   BindCommand(*new HelpCommand);
   BindCommand(*new QuitCommand);
   BindCommand(*new IncrsCommand);
}
Walkthroughs
Getting Help
Before discussing how to implement a CLI command, let's see what the help command does. Entering this command without any parameters provides the user with an overview of the CLI:
That told us how to list the commands that are available in the nb increment:
Let's investigate the logs command:
Here, we see an example of SetHowParm, the CliTextParm subclass that was shown earlier, which calls for the string parameter on or off as part of the command's suppress option.
The above was automatically generated from the parameters bound against LogsCommand. A command and its parameters form a parse tree that can also be traversed to generate help documentation by displaying the help argument that each CLI object must provide. The punctuation in the left column was described in the CLI's general help documentation that appeared at the beginning of this section.
- Parameters indented to the same depth belong to the first parameter above and at the previous depth.
- Parentheses (…) surround the alternatives for a mandatory parameter.
- Brackets […] surround the alternatives for an optional parameter.
- A vertical bar | separates the alternatives for a CliCharParm.
- A colon : separates the minimum and maximum values for a CliIntParm.
- <str> indicates that any string which satisfies the parameter's description is acceptable.
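The punctuation rules above lend themselves to a small formatting sketch. The functions Wrap, CharChoices, and IntRange are hypothetical, and the exact layout of RSC's help output is not reproduced here; the code only applies the listed conventions to a character list and an integer range.

```cpp
#include <string>

// Hypothetical formatters for the left column of a help display.
// Mandatory alternatives go in parentheses, optional ones in brackets.
std::string Wrap(const std::string& body, bool opt)
{
   return opt ? "[" + body + "]" : "(" + body + ")";
}

// The legal characters for a character parameter are separated by '|'.
std::string CharChoices(const std::string& chars, bool opt)
{
   std::string body;

   for(char c : chars)
   {
      if(!body.empty()) body.push_back('|');
      body.push_back(c);
   }

   return Wrap(body, opt);
}

// An integer parameter's minimum and maximum are separated by ':'.
std::string IntRange(long min, long max, bool opt)
{
   return Wrap(std::to_string(min) + ":" + std::to_string(max), opt);
}
```

Under these assumptions, the optional "bv" character parameter of the display command shown earlier would be rendered as [b|v].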
When RSC starts up, this is what appears on the console:
What's this NET500 and NODE500 arcana? The logs command will tell us:
RSC's help directory contains documentation for its logs, alarms, and static analysis warnings, as well as more detailed help on a few CLI topics. Its files are used by the CLI commands that provide help on those topics. The file help.cli in the docs directory contains the help output for all of RSC's CLI increments and commands.
Implementing a Command
When CliThread receives input, it parses the command name and invokes the function that implements the command, which will be an override of the pure virtual function CliCommand::ProcessCommand:
virtual word CliCommand::ProcessCommand(CliThread& cli) const = 0;
A ProcessCommand function receives a reference to the CliThread that is executing the command, and it returns a word, which is a typedef for intptr_t (an int that matches the platform's word size, whether 32 or 64 bits). A command can use its return code however it pleases, but almost all of them return 0 or a positive value on success, and a negative value on failure.
A ProcessCommand function therefore has direct access to two things: functions declared by its base classes (CliCommand, CliText, and CliParm), and functions defined by CliThread. The command knows the order in which its command line arguments appear and uses the following functions to obtain them:
- For an integer: GetIntParm or GetIntParmRc
- For a boolean: GetBoolParm or GetBoolParmRc
- For a character: GetCharParm or GetCharParmRc
- For a pointer: GetPtrParm or GetPtrParmRc
- For a string in a list: GetTextIndex or GetTextIndexRc
- For an arbitrary string: GetString or GetStringRc
- For a filename: GetFileName or GetFileNameRc
- For an identifier: GetIdentifier or GetIdentifierRc
All of these are declared by CliParm, although they are selectively overridden by subclasses. The first function in each pair returns a bool and is used when looking for a mandatory parameter. The second function returns the enum CliParm::Rc and is used when looking for an optional parameter. It returns one of
- Ok, when a valid parameter was found
- None, when a valid parameter was not found (so the parameter's default value should be used)
- Error, when an error occurred (e.g., ill-formed input, failure of the input stream)
Each Get… function has two arguments: the CliThread that was passed to the ProcessCommand function, and a reference to data (word&, bool&, char&, void*&, or string&) that is updated when the result is Ok.
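The three-way result for an optional parameter can be sketched with a standalone reader. The Rc enum here only mirrors the three values named above, and GetOptionalInt is a hypothetical function, not one of RSC's Get… functions. Treating a tilde as "skipped" follows the description of the skip character earlier in this article.

```cpp
#include <cerrno>
#include <cstdlib>
#include <sstream>
#include <string>

// Mirrors the three outcomes described for an optional parameter.
enum class Rc { Ok, None, Error };

// Hypothetical reader for an optional integer. A "~" token or the end of
// the input means "use the default"; any other token that is not a complete
// decimal integer is an error. VALUE is only updated when Ok is returned.
Rc GetOptionalInt(std::istringstream& input, long& value)
{
   std::string token;
   if(!(input >> token)) return Rc::None;  // nothing left on the line
   if(token == "~") return Rc::None;       // explicitly skipped

   errno = 0;
   char* end = nullptr;
   long n = std::strtol(token.c_str(), &end, 10);
   if(errno == ERANGE || *end != '\0') return Rc::Error;

   value = n;
   return Rc::Ok;
}
```

A caller would typically set value to the parameter's default before the call, return a failure code on Error, and continue with whatever value holds on either Ok or None.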
CliThread knows which command was invoked, so it initializes the command's parse tree, which contains the parameters that were bound against the command. When ProcessCommand invokes a Get… function, it actually ends up invoking that function on the current element in the parse tree. Thus, if it tries to read a parameter whose type does not match what was previously defined in the parse tree, the parse will fail. This ensures that the parse tree matches the logic in the ProcessCommand function and that the automatically generated help documentation is correct. When a parameter is extracted, it updates the position in the parse tree, which is tracked by CliThread's cookie_ member.
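The consistency check between a command's bound parameters and the reads performed by its ProcessCommand function can be sketched with a simple cursor. ParseCursor and ParmType are hypothetical; the real parse tree is nested and its position is kept in CliThread, so this flat list is a deliberate simplification.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

enum class ParmType { Int, Bool, Char, Ptr, Text };

// Hypothetical cursor over the parameters bound against a command. Each
// read must name the type bound at the current position or the parse fails.
class ParseCursor
{
public:
   explicit ParseCursor(std::vector<ParmType> bound) :
      bound_(std::move(bound)),
      index_(0) { }

   bool Expect(ParmType type)
   {
      if(index_ >= bound_.size()) return false;  // read too many parameters
      if(bound_[index_] != type) return false;   // type mismatch
      ++index_;                                  // advance the position
      return true;
   }
private:
   std::vector<ParmType> bound_;
   std::size_t index_;
};
```

With this kind of check, a command that binds a pointer and then a character but tries to read an integer first fails immediately, so a mismatch between the help output and the implementation shows up the first time the command is run.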
Besides the Get… functions, ProcessCommand implementations often use these CliThread members:
- EndOfInput is always invoked after parsing the last parameter. It outputs an error message and returns false if the input line still contains unparsed characters.
- obuf is a public ostringstream member for assembling output. When ProcessCommand returns to CliThread, obuf is written to the actual output stream (i.e., the console or a file).
- BoolPrompt, CharPrompt, IntPrompt, and StrPrompt allow a ProcessCommand function to query the user for input after displaying a prompt such as, "Do you really want to do this (y|n)?"
- Report is used to return the command's word result and update obuf with a string that will be written to the output stream.
Let's look at the code that was invoked when logs explain… was entered in the console image that appeared earlier. The first thing to point out is that there are two ways to implement a command (like logs) that has subcommands (like explain):
- Define a CliCommandSet subclass for logs. Define a CliCommand subclass for each subcommand, and bind these against the first class.
- Define a CliCommand subclass for logs. Define a CliTextParm subclass for the subcommands, which will contain a CliText instance for each of them. The CliCommand subclass parses the string associated with a subcommand. This returns the string's index, which is then used in a switch statement as previously described.
Which approach to use is largely a matter of taste. The first one results in many shorter ProcessCommand functions, and the second one results in a longer ProcessCommand function with a switch statement. Most of RSC's subcommands are implemented the second way, which was the only way until CliCommandSet was added.
The logs command uses the second way; code for subcommands other than explain has been removed:
class LogsCommand : public CliCommand
{
public:
   static const id_t ListIndex = 1;
   static const id_t ExplainIndex = 2;
   static const id_t SuppressIndex = 3;
   static const id_t ThrottleIndex = 4;
   static const id_t CountIndex = 5;
   static const id_t BuffersIndex = 6;
   static const id_t WriteIndex = 7;
   static const id_t FreeIndex = 8;
   static const id_t LastNbIndex = 8;
   LogsCommand();
protected:
   word ProcessSubcommand(CliThread& cli, id_t index) const override;
private:
   word ProcessCommand(CliThread& cli) const override;
};
class LogsExplainText : public CliText
{
public:
   LogsExplainText() : CliText("displays documentation for a log", "explain")
   {
      BindParm(*new CliTextParm("log group name", false, 0));
      BindParm(*new CliIntParm("log number", TroubleLog, Log::MaxId));
   }
};
class LogsAction : public CliTextParm
{
public:
   LogsAction() : CliTextParm("subcommand...")
   {
      BindText(*new LogsExplainText, LogsCommand::ExplainIndex);
   }
};
LogsCommand::LogsCommand() :
   CliCommand("logs", "Interface to the log subsystem.")
{
   BindParm(*new LogsAction);
}
bool FindGroupAndLog
   (const string& name, word id, LogGroup*& group, Log*& log, string& expl)
{
   auto reg = Singleton<LogGroupRegistry>::Instance();
   group = reg->FindGroup(name);
   if(group == nullptr)
   {
      expl = NoLogGroupExpl;
      return false;
   }
   log = nullptr;
   if(id == 0) return true;
   log = group->FindLog(id);
   if(log == nullptr)
   {
      expl = NoLogExpl;
      return false;
   }
   return true;
}
word LogsCommand::ProcessCommand(CliThread& cli) const
{
   id_t index;
   if(!GetTextIndex(index, cli)) return -1;
   return ProcessSubcommand(cli, index);
}
word LogsCommand::ProcessSubcommand(CliThread& cli, id_t index) const
{
   word rc = 0;
   string name, expl, key, path;
   word id;
   Log* log;
   LogGroup* group;
   switch(index)
   {
   case ExplainIndex:
      if(!GetString(name, cli)) return -1;
      if(!GetIntParm(id, cli)) return -1;
      if(!cli.EndOfInput()) return -1;
      if(!FindGroupAndLog(name, id, group, log, expl))
         return cli.Report(-1, expl);
      key = group->Name() + std::to_string(id);
      path = Element::HelpPath() + PATH_SEPARATOR + "logs.txt";
      rc = cli.DisplayHelp(path, key);
      switch(rc)
      {
      case -1: return cli.Report(-1, "This log has not been documented.");
      case -2: return cli.Report(-2, "Failed to open file " + path);
      }
      break;
   }
   return rc;
}
Besides the classes and functions already mentioned, this code used CliThread::DisplayHelp, which uses a key to retrieve help text from a file.
Getting a Transcript
All commands executed through the CLI, and all console output, are automatically recorded in a file whose name is console, suffixed by the time when the executable was launched (e.g., console200530-071350-959.txt). These transcripts are occasionally useful but mostly just clutter the output directory.
Running a Script
The read command executes CLI commands from a file that can, in turn, contain other read commands. This capability is used to write test scripts and to automate sequences of frequently used commands, such as the buildlib script that defines the code directories to be used by RSC's static analysis tool.
The CLI provides some commands that are primarily used by scripts:
- send redirects CLI output to a file, which is useful for recording the results of a test script. This command can be nested, with send prev restoring output to the previous file and send cout popping all files and restoring output to the console.
- echo writes a string to the current output stream.
- delay waits before executing the script's next command, which is useful when a test script needs to give the system time to react to the previous command.
- if…else executes a command conditionally. A test script, for example, might check the result of the most recently executed CLI command, which is saved in the symbol cli.result:
if &cli.result >= 0 echo "passed" else echo "failed"
RSC's input directory contains many scripts, almost all of them for tests.
Extending a Command
A static library may want to add capabilities to a command that is defined by one of the libraries that it uses. For example, the include and exclude commands in the NodeBase namespace/library allow a user to specify the thread(s) whose work should be captured by trace tools. The namespace/library NetworkBase extends these commands so that the user can specify the IP port(s) whose events should be captured by trace tools. This is implemented by having NwIncludeCommand derive from IncludeCommand. See NbIncrement.cpp and NwIncrement.cpp for further details.
Implementing a CLI Application
The class CliAppData provides thread-local storage for CLI applications whose data needs to survive from one CLI command to the next. An application subclasses CliAppData to add its data and functions. CliThread defines a few functions that allow the application to manage this data through a unique_ptr that CliThread provides. For example, NtTestData uses this capability to support test scripts and a database that records which tests have passed or failed.
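The pattern of keeping application data alive between commands through an owning pointer can be sketched as follows. AppData, AppDataStore, and TestData are hypothetical names, and keying the data by an integer application identifier is an assumption made for the example rather than a description of CliThread's interface.

```cpp
#include <map>
#include <memory>
#include <utility>

// Hypothetical base class for data that must outlive a single command.
struct AppData
{
   virtual ~AppData() = default;
};

// Hypothetical owner that keeps at most one AppData per application.
class AppDataStore
{
public:
   void Set(int appId, std::unique_ptr<AppData> data)
   {
      data_[appId] = std::move(data);  // replaces and frees any old data
   }

   AppData* Get(int appId) const
   {
      auto it = data_.find(appId);
      return (it != data_.end()) ? it->second.get() : nullptr;
   }

   void Clear() { data_.clear(); }  // frees the data of all applications
private:
   std::map<int, std::unique_ptr<AppData>> data_;
};

// An application adds its own fields by subclassing.
struct TestData : AppData
{
   int passed = 0;
   int failed = 0;
};
```

Because the store owns the objects, clearing it releases everything in one step, which is convenient when the work in progress has to be abandoned.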
Points of Interest
Aborting a Command
When an exception occurs on an RSC thread, its override of the Recover function is invoked. CliThread always chooses to resume execution so that it can try to continue whatever it was doing. But if it receives a break signal (typically ctrl-C), it clears the data associated with the work in progress so that it will immediately re-prompt the user for a new command as soon as it is reentered:
bool CliThread::Recover()
{
   auto sig = GetSignal();
   auto reg = Singleton<PosixSignalRegistry>::Instance();
   if(reg->Attrs(sig).test(PosixSignal::Break))
   {
      appsData_.clear();
      outFiles_.clear();
      stream_.reset();
      ibuf->Reset();
   }
   return true;
}
How RSC handles a ctrl-C is discussed here.
History
- 2nd June, 2020: Initial version