This article presents logging and trace tools that can be used in live systems. These tools gather information to help identify and reproduce bugs. One of the tools captures function calls and displays them in an indented Who invoked whom? format. A new tool is implemented by deriving from TraceRecord, which allows a common framework to manage all of a system's trace tools.
Introduction
During development, setting breakpoints and writing information to the console are probably the two most commonly used debugging techniques. But once software is released and running on a customer's system, these techniques are no longer acceptable. This article describes alternatives that can be used to debug live systems and that are also useful during development. They actually need to be sufficient during development, because they will be the only things available in a live system.
When released software contains a bug, the goal is to reproduce it and develop a fix offline. Sometimes the customer can tell you how to reproduce the problem. But when the customer can't, you need to gather enough data to figure out how to recreate it. There may be times when the customer doesn't even know there is a bug, but your software does. It should then collect the necessary data on its own.
Tracepoint debugging is sometimes available in a live system. Many debuggers support tracepoints as well as breakpoints. When the debugger hits a tracepoint, it collects information and continues without halting. Most often, you tell it to collect the contents of certain data structures. Because this technique is closely related to breakpoint debugging, this article does not discuss it in further detail.
Walkthroughs
The code in this article is taken from the Robust Services Core (RSC). If this is the first time that you're reading an article about an aspect of RSC, please take a few minutes to read this preface.
Software Error Log
RSC produces a software error log when Thread::Start
catches an exception. This function is inherited by all threads that derive from RSC's Thread
class. Exceptions have significant overhead, so applications should throw them very sparingly and catch them even less often. In the vast majority of cases, an exception represents a bug that needs to be fixed.
When Thread::Start
catches an exception, the log that it produces includes a stack trace, as described in Capturing a Thread's Stack When An Exception Occurs. All exceptions in RSC derive from Exception
, whose constructor captures the stack when the exception is thrown. This allows Thread::LogTrap
to include a stack trace when it generates its log. This log, for example, occurs when using a bad pointer:
THR902 2-Aug-2019 09:45:43.183 on Reigi {5}
in NodeTools.RecoveryTestThread (tid=15, nid=0x0000bc60): trap number 2
type=Signal
signal : 11 (SIGSEGV: Invalid Memory Reference)
errval : 0xc0000005
Function Traceback:
NodeBase.SignalException.SignalException @ signalexception.cpp + 40[12]
NodeBase.Thread.HandleSignal @ thread.cpp + 1801[16]
NodeBase.SE_Handler @ systhread.win.cpp + 126[13]
_unDNameEx @ <unknown file> (err=487)
is_exception_typeof @ <unknown file> (err=487)
is_exception_typeof @ <unknown file> (err=487)
is_exception_typeof @ <unknown file> (err=487)
_CxxFrameHandler3 @ <unknown file> (err=487)
RtlInterlockedCompareExchange64 @ <unknown file> (err=487)
RtlInterlockedCompareExchange64 @ <unknown file> (err=487)
KiUserExceptionDispatcher @ <unknown file> (err=487)
NodeTools.RecoveryTestThread.Enter @ ntincrement.cpp + 3584[0]
NodeBase.Thread.Start @ thread.cpp + 2799[15]
NodeBase.Thread.EnterThread @ thread.cpp + 1564[0]
BaseThreadInitThunk @ <unknown file> (err=487)
RtlGetAppContainerNamedObjectPath @ <unknown file> (err=487)
RtlGetAppContainerNamedObjectPath @ <unknown file> (err=487)
The stack must be captured when the exception is thrown, not when it is caught. By the time it's caught, the stack has unwound (in this case all the way to Thread::Start
), so there is no longer anything useful to capture. This is why Exception
's constructor captures the stack. Contrast this with exceptions thrown in the STL, where all you get is a string like "invalid string position
", which says nothing about who caused the problem. The stack trace is vital because it not only identifies where the bug occurred (e.g., in a function that didn't check for a nullptr
argument), but also how that code was reached (i.e., from a function that actually provided the bad argument).
When an error is so severe that the work in progress should be aborted, an application should either
- invoke the function
Debug::Assert
, or throw
a SoftwareException
or one of its subclasses.
Both of these produce a log that includes a stack trace. RSC handles these exception logs in the same way as all others. It puts them in a log buffer that is soon written to a file and, optionally, to the console. In released software, users can collect these logs and send them to you. Better still, your software can include code to automatically send them to you over the internet. Each of these logs highlights a bug that needs to be fixed.
Capturing a stack is platform specific; the Windows targets are in SysThreadStack.win.cpp. The above log only captured the chain of function calls that led to the exception. A useful enhancement is to also capture each stack frame and the CPU's registers. Within the log, these simply appear as hex values, but an offline tool can then tie into the compiler's object files and interpret the raw hex so that the values of each function's arguments and local variables can be studied at the time of the exception. But this is more platform-specific code.
Software Warning Log
Unlike the software error log, which is produced after a problem has occurred, a software warning log is used when a problem is detected proactively. Application software generates these logs, most often when it receives invalid input. It invokes Debug::SwLog
to generate them:
static void SwLog(fn_name_arg func,
const std::string& errstr, debug64_t errval, bool stack = true);
func
identifies the function that is generating the log errstr
explains the reason for the log errval
is a value to assist with debugging (debug64_t
is an alias for uint64_t
) stack
is usually true
so that the log will include a stack trace to assist with debugging
Let's look at an example of a software warning log. The framework for RSC's command line interface (CLI) defines the base class CliParm
, from which all CLI parameters ultimately derive. When a parameter registers with the CLI, it must provide a help
string so that the CLI's help
command can explain each of the parameters that a CLI command expects. We want to limit the length of these strings so that, after they are prefixed with some other information, they will still fit on a console that is 80 characters wide. CliParm
's constructor therefore looks like this:
CliParm::CliParm(c_string help, bool opt, c_string tag) :
help_(help),
opt_(opt),
tag_(tag)
{
Debug::Assert(help_ != nullptr);
auto size = strlen(help_);
auto total = ParmWidth + strlen(ParmExplPrefix) + size;
if(size == 0)
Debug::SwLog(CliParm_ctor, "help string empty", size);
else if(total >= COUT_LENGTH_MAX)
Debug::SwLog
(CliParm_ctor, "help string too long", total - COUT_LENGTH_MAX + 1);
}
Debug::Assert
throws an AssertionException
if help
is not provided. This would actually cause system initialization to fail, so it won't occur in released software: a developer must provide the value before the code can even be checked in. After that, the code generates a warning log if help
is empty or too long to fit on one line of the console. So if we edit NbIncrement.cpp by changing
fixed_string LogsListTextExpl =
"shows info for all logs or the logs in a specific group";
to this:
fixed_string LogsListTextExpl =
"shows information for all logs or the logs in a specific group";
The following log then appears during system initialization:
SW900 19-Feb-2020 18:08:29.915 on Reigi {1}
in CliParm.ctor
errval=help string too long offset=0x3
Function Traceback:
NodeBase.Debug.GenerateSwLog @ debug.cpp + 102[22]
NodeBase.Debug.SwLog @ debug.cpp + 214[25]
NodeBase.CliParm.CliParm @ cliparm.cpp + 65[45]
NodeBase.CliText.CliText @ clitext.cpp + 42[19]
NodeBase.LogsListText.LogsListText @ nbincrement.cpp + 1435[25]
NodeBase.LogsAction.LogsAction @ nbincrement.cpp + 1560[31]
NodeBase.LogsCommand.LogsCommand @ nbincrement.cpp + 1575[39]
NodeBase.NbIncrement.NbIncrement @ nbincrement.cpp + 3700[33]
NodeBase.Singleton<NodeBase.NbIncrement>.Instance @ singleton.h + 81[34]
NodeBase.NbModule.Startup @ nbmodule.cpp + 149[5]
NodeBase.ModuleRegistry.Startup @ moduleregistry.cpp + 434[19]
NodeBase.ModuleRegistry.Restart @ moduleregistry.cpp + 263[22]
NodeBase.InitThread.InitializeSystem @ initthread.cpp + 354[0]
NodeBase.InitThread.Enter @ initthread.cpp + 225[0]
NodeBase.Thread.Start @ thread.cpp + 3232[15]
NodeBase.Thread.EnterThread @ thread.cpp + 1755[0]
BaseThreadInitThunk @ <unknown file> (err=487)
RtlGetAppContainerNamedObjectPath @ <unknown file> (err=487)
RtlGetAppContainerNamedObjectPath @ <unknown file> (err=487)
The log complains that the help
string is too long, and the culprit is easily identified as a parameter used by LogsCommand
. Looking at the code, 0x3
is the number of characters that need to be trimmed from the string.
The 225K lines of code in RSC's repository include over 700 calls to Debug::SwLog
. It should be used often! But as with the software error log, each occurrence of a software warning log highlights a bug that needs to be fixed.
Object Dump
Some threads service a queue of work items one at a time. When an error occurs during one of the thread's transactions, a dump containing all the objects involved in the transaction can be very helpful during debugging. Logging objects takes a fair bit of time, so it is usually done only in conjunction with a software error log.
The following design approach supports an object dump during a software error scenario:
- The application thread must track the object(s) involved in its current transaction.
- The base
Thread
class must catch all exceptions, as described in Robust C++: Safety Net. - During its handling of an exception,
Thread
must invoke a virtual function that allows the application thread to generate a log containing the objects involved in the transaction. - Each object involved in the transaction must implement a virtual
Display
function (unless you're happy to decrypt a hex dump or write an offline tool that converts hex to symbolic values).
Here is a walkthrough of an object dump that follows a software error log. The initial log occurs during a testcase that prompts the telephony application included with RSC to throw an exception when it receives a message indicating that the far end has answered the call. InvokerThread
is the thread that is scheduled to handle transactions for the telephony application:
THR902 2-Feb-2020 13:04:23.037 on Reigi {3}
in SessionBase.InvokerThread (tid=11, nid=0x000089b0): trap number 1
type=SessionBase Exception
errstr : trap recovery test
offset : 0x0
context : 000001D15B508178
Function Traceback:
SessionBase.SbException.SbException @ context.cpp + 113[25]
SessionBase.Context.Kill @ context.cpp + 608[26]
CallBase.BcOaAnalyzeRemoteMessage.ProcessEvent @ bchandlers.cpp + 131[30]
SessionBase.SbAnalyzeMessage.ProcessEvent @ sbhandlers.cpp + 74[40]
SessionBase.ServiceSM.ProcessEvent @ servicesm.cpp + 559[46]
SessionBase.SsmContext.ProcessIcMsg @ ssmcontext.cpp + 211[30]
SessionBase.Context.ProcessMsg @ context.cpp + 700[0]
SessionBase.Context.ProcessWork @ context.cpp + 773[40]
SessionBase.InvokerPool.ProcessWork @ invokerpool.cpp + 554[0]
SessionBase.InvokerThread.Enter @ invokerthread.cpp + 176[0]
NodeBase.Thread.Start @ thread.cpp + 3235[0]
NodeBase.Thread.EnterThread @ thread.cpp + 1755[0]
BaseThreadInitThunk @ <unknown file> (err=487)
RtlUserThreadStart @ <unknown file> (err=487)
The log that provides the object dump drones on and on, so you may want to hit the "Shrink" button. The log displays 9 objects:
- the processing context (
SsmContext
) that aggregates the objects involved in the call; - a protocol state machine (
PotsCallPsm
) for messages to/from the user's telephone; - a protocol state machine (
CipPsm
) for messages to/from the far end of the call; - the message (
CipMessage
) from the far end of the call, along with its contents (SbIpBuffer
); - two ports (
MsgPort
) through which the protocol state machines receive and send messages; - the service state machine for a basic call (
PotsBcSsm
); and - the event that the state machine is currently handling (
AnalyzeMsgEvent
).
SESS900 2-Feb-2020 13:04:23.037 on Reigi {4}
SessionBase.SsmContext
this : 000001D15B508178
patchArea : 0x0
pid : 6
seq : 1
link : 0000000000000000
assigned : true
orphaned : 0
corrupt : false
logged : false
whichq : 0000000000000000
link :
next : 0000000000000000
prev : 0000000000000000
priMsgq :
tail : 0000000000000000
diff : 16
stdMsgq :
tail : 0000000000000000
diff : 16
enqTime : 3505468504310
pool : 000001D15B0F4AD0
thread : 000001D15C8113C8
faction : 5
state : 3
prio : 2
traceOn : true
trans : 000001D15C8DBB00
trace : 4,9> 4,9> 3,2> 4,9>
portq :
tail : 000001D15D4E02D0
diff : 16
[->]: 000001D15D4E0218 SessionBase.MsgPort
[->]: 000001D15D4E02C0 SessionBase.MsgPort
psmq :
tail : 000001D15DC126B0
diff : 16
[->]: 000001D15DC12438 PotsBase.PotsCallPsm
[->]: 000001D15DC126A0 CallBase.CipPsm
root : 000001D15EB2A3F8
--------------------------------------------------------
PotsBase.PotsCallPsm
this : 000001D15DC12438
patchArea : 0x0
pid : 9
seq : 1
link : 000001D15DC126B0
assigned : true
orphaned : 0
corrupt : false
logged : false
ctx : 000001D15B508178
upper : 0000000000000000
lower : 000001D15D4E0218
rcvdMsgq :
tail : 0000000000000000
diff : 16
ogMsgq :
tail : 0000000000000000
diff : 16
sentMsgq :
tail : 0000000000000000
diff : 16
timerq :
tail : 0000000000000000
diff : 16
fid : 7
state : 1
ogMsg : 0000000000000000
sendRing : false
sendScan : false
sendCause : false
sendFacility : false
header :
signal : 0 (nullptr)
port : 11 (POTS 20000)
ring :
on : false
scan :
digits : false
flash : false
cause :
cause : 0 (Invalid Cause)
facility :
sid : 0 (nullptr)
ind : 0
--------------------------------------------------------
CallBase.CipPsm
this : 000001D15DC126A0
patchArea : 0x0
pid : 9
seq : 1
link : 000001D15DC12448
assigned : true
orphaned : 0
corrupt : false
logged : false
ctx : 000001D15B508178
upper : 0000000000000000
lower : 000001D15D4E02C0
rcvdMsgq :
tail : 000001D15D8F4288
diff : 16
[->]: 000001D15D8F4278 CallBase.CipMessage
ogMsgq :
tail : 0000000000000000
diff : 16
sentMsgq :
tail : 0000000000000000
diff : 16
timerq :
tail : 0000000000000000
diff : 16
fid : 2
state : 8
edge : false
ogPsm : 000001D15DC12438
ogTone : 255
icTone : 255
icMedia :
rxFrom : 12 (POTS 20001)
ogMediaSent :
rxFrom : 11 (POTS 20000)
ogMediaCurr :
rxFrom : 11 (POTS 20000)
mep : 0000000000000000
iamTimer : false
--------------------------------------------------------
CallBase.CipMessage
this : 000001D15D8F4278
patchArea : 0x0
pid : 7
seq : 1
link : 000001D15D8F4288
assigned : true
orphaned : 0
corrupt : false
logged : false
buff : 000001D15B0CBE78
bt : 0000000000000000
handled : false
saves : 0
psm : 000001D15DC126A0
whichq : 000001D15DC126D8
--------------------------------------------------------
SessionBase.SbIpBuffer
this : 000001D15B0CBE78
patchArea : 0x0
pid : 4
seq : 1
link : 0000000000000000
assigned : true
orphaned : 0
corrupt : false
logged : false
rxTicks : 3505468487443
buff : 000001D15ED27A90
hdrSize : 24
buffSize : 42
txAddr : 192.168.0.10: 40000
rxAddr : 192.168.0.10: 40000
dir : 0
external : false
queued : false
length : 8
00: 03 00 00 00 01 08 03 00 - 02 00 00 00 01 08 02 00 ................
01: 02 80 03 00 04 00 08 00 - 00 00 02 00 0c 00 0b 00 .€..............
MsgHeader:
txAddr : bid=3 seq=1 pid=8 fid=3
rxAddr : bid=2 seq=1 pid=8 fid=2
priority=2 initial=0 final=0 join=0 self=0
injected=0 kill=0 route=2 length=8 spare=0x0
protocol=3 (CipProtocol) signal=4 (CipAnmSignal)
Parameters:
Deleted parameter: pid=0, length=2
--------------------------------------------------------
SessionBase.MsgPort
this : 000001D15D4E0218
patchArea : 0x0
pid : 8
seq : 1
link : 000001D15D4E02D0
assigned : true
orphaned : 0
corrupt : false
logged : false
ctx : 000001D15B508178
upper : 000001D15DC12438
lower : 0000000000000000
locAddr :
SessionBase.GlobalAddress
this : 000001D15D4E0250
patchArea : 0x0
v4Addr : 192.168.0.10: 40002
port : 40002
proto : Any
socket : 0000000000000000
sbAddr : bid=1 seq=1 pid=8 fid=7
remAddr :
SessionBase.GlobalAddress
this : 000001D15D4E0280
patchArea : 0x0
v4Addr : 192.168.0.10: 40001
port : 40001
proto : Any
socket : 0000000000000000
sbAddr : bid=0 seq=0 pid=0 fid=6
msgRcvd : true
msgSent : true
--------------------------------------------------------
SessionBase.MsgPort
this : 000001D15D4E02C0
patchArea : 0x0
pid : 8
seq : 1
link : 000001D15D4E0228
assigned : true
orphaned : 0
corrupt : false
logged : false
ctx : 000001D15B508178
upper : 000001D15DC126A0
lower : 0000000000000000
locAddr :
SessionBase.GlobalAddress
this : 000001D15D4E02F8
patchArea : 0x0
v4Addr : 192.168.0.10: 40000
port : 40000
proto : Any
socket : 0000000000000000
sbAddr : bid=2 seq=1 pid=8 fid=2
remAddr :
SessionBase.GlobalAddress
this : 000001D15D4E0328
patchArea : 0x0
v4Addr : 192.168.0.10: 40000
port : 40000
proto : Any
socket : 0000000000000000
sbAddr : bid=3 seq=1 pid=8 fid=3
msgRcvd : true
msgSent : true
--------------------------------------------------------
PotsBase.PotsBcSsm
this : 000001D15EB2A3F8
patchArea : 0x0
pid : 12
seq : 1
link : 0000000000000000
assigned : true
orphaned : 0
corrupt : false
logged : false
sid : 2
currState : 8
nextState : 8
idled : false
nextSap : 0
nextSnp : 0
triggered :
1 2 4 5 7 8 9 12 13 15 16 18 19 20 23 24
ssmq :
tail : 0000000000000000
diff : 16
parentSsm : 0000000000000000
eventq[Active] :
tail : 000001D15ECA0BF8
diff : 16
[->]: 000001D15ECA0BE8 SessionBase.AnalyzeMsgEvent
eventq[Pending] :
tail : 0000000000000000
diff : 16
eventq[Saved] :
tail : 0000000000000000
diff : 16
ctx : 000001D15B508178
mgwPsm : 0000000000000000
model : 1
upsm : 000001D15DC12438
npsm : 000001D15DC126A0
dialed :
count : 5
digits : 20001
analysis :
selector : 1 (Directory Number)
identifier : 20001
route :
selector : 7 (PotsCallFactory)
identifier : 20001
proxyCount : 0
prof : 000001D15B162310
tid : 0
trmt : 0000000000000000
--------------------------------------------------------
SessionBase.AnalyzeMsgEvent
this : 000001D15ECA0BE8
patchArea : 0x0
pid : 11
seq : 1
link : 000001D15ECA0BF8
assigned : true
orphaned : 0
corrupt : false
logged : false
eid : 1
owner : 000001D15EB2A3F8
location : 0
msg : 000001D15D8F4278
The object dump was generated when Thread::Start
invoked InvokerThread::Recover
while handling the exception. InvokerThread
tracks the running context in its ctx_
member. After it dumps the context and its objects, it deletes them:
bool InvokerThread::Recover()
{
if(ctx_ != nullptr)
{
ctx_->Dump();
auto ctx = ctx_.get();
delete ctx;
ctx_.release();
}
return true;
}
Context::Dump
creates the log and uses Base::LogSubtended
to dump all of the objects:
void Context::Dump() const
{
auto log = Log::Create(SessionLogGroup, SessionError);
if(log == nullptr) return;
LogSubtended(*log, Log::Tab, NoFlags);
Log::Submit(log);
}
Base::LogSubtended
uses Base::GetSubtended
to populate an array with a root object and all of the objects that it transitively owns, after which it writes them to the log:
void Base::LogSubtended(ostream& stream,
const string& prefix, const Flags& options) const
{
std::vector<Base*> objects;
GetSubtended(objects);
for(size_t i = 0; i < objects.size(); ++i)
{
if(i > 0) stream << prefix << string(60 - prefix.size(), '-') << CRLF;
objects[i]->Display(stream, prefix, options);
}
}
A class whose objects own other objects overrides Base::GetSubtended
. It first invokes the base class version to add itself to the array and then invokes GetSubtended
on the objects that it owns, so that they can also add themselves and their objects to the array:
void Base::GetSubtended(std::vector<Base*>& objects) const
{
objects.push_back(const_cast<Base*>(this));
}
Trace Buffer
The debugging techniques that we've looked at so far—software error logs, software warning logs, and object dumps—gather information passively. But when they prove inadequate, accessing the customer's system to enable trace tools is the next step. Although trace tools could be enabled by default, their overhead in a live system is usually prohibitive when used outside of specific scenarios, so they are used only when they might help to uncover how a bug occurs. A packet sniffer is an example of a trace tool that may be familiar to you.
To support trace tools, RSC provides a framework whose key classes are
TraceRecord
, a virtual base class for defining events that can be recorded, and TraceBuffer
, for saving recorded events.
TraceBuffer
was recently rewritten to be non-blocking. It's a simple circular buffer that contains pointers to a series of TraceRecord
s. After an application constructs a subclass of TraceRecord
, it invokes TraceBuffer::Insert
to add the record to the buffer. By default, a TraceRecord
is deleted when the buffer is full. However, wraparound can be enabled so that newer records overwrite older ones.
bool TraceBuffer::Insert(TraceRecord* record)
{
auto slot = AllocSlot();
if(slot == SIZE_MAX)
{
delete record;
return false;
}
auto prev = buff_[slot];
if(prev != nullptr)
{
buff_[slot] = nullptr;
delete prev;
}
record->slot_ = slot;
buff_[slot] = record;
return true;
}
It is TraceBuffer::AllocSlot
that makes tracing non-blocking and thread-safe by incrementing the index bnext_
atomically when allocating the next slot in the buffer:
size_t TraceBuffer::AllocSlot()
{
if(buff_ == nullptr) return SIZE_MAX;
if(softLocks_ > 0)
{
++blocks_;
return SIZE_MAX;
}
auto slot = bnext_.fetch_add(1);
if(bnext_ >= size_)
{
ovfl_ = true;
if(wrap_) return (slot & (size_ - 1));
bnext_ = size_;
return SIZE_MAX;
}
return slot;
}
The buffer can be locked (softLocks_
), but this is only done by code that is iterating over the buffer's records, something that never occurs while an application that needs to be traced is running.
To control tracing, RSC's CLI provides the commands tools
, query
, set
, include
, exclude
, start
, stop
, save
, and clear
. These are documented, along with all other CLI commands, in the file docs/help.cli. The include
and exclude
commands allow tracing to be restricted to specific threads, IP ports, IP addresses, protocols, or services, so that time isn't wasted constructing irrelevant trace records that quickly fill up the trace buffer. Although this feature is important even during development, it is essential before trace tools can be used in a live system.
The virtual base class TraceRecord
is quite simple and contains three data members:
- The record's owner, which identifies a specific trace tool.
- The record's slot (index) once it has been added to the trace buffer.
- An identifier whose interpretation is tool-specific and which allows the tool to use
switch
instead of dynamic_cast
when it displays different types of events that it has collected.
Function Tracer
RSC's function tracer is one example of a trace tool. It's a good example because it's very general-purpose. And it's a bad example because it slows the system down by a factor of about 4x if used indiscriminately. But if used with suitable filtering or during off-peak times, it's a candidate for use in a live system. And during development, it can be a very useful tool indeed.
The interface to the function tracer is Debug::ft
, which most RSC functions invoke in their first statement. These calls were edited out of the code previously shown in this article, but if we went back to the first code example and looked at the full version of CliParm
's constructor, we would see this:
fn_name CliParm_ctor = "CliParm.ctor";
CliParm::CliParm(c_string help, bool opt, c_string tag) :
help_(help),
opt_(opt),
tag_(tag)
{
Debug::ft(CliParm_ctor); }
The call to Debug::ft
simply indicates that a function was invoked:
void Debug::ft(fn_name_arg func)
{
if(FcFlags_.none()) return;
Thread::FunctionInvoked(func);
}
Simple. If no one cares about function calls, do nothing. Otherwise, call Thread::FunctionInvoked
.
Who cares that a function was invoked?
- The function trace tool, which records a sequence of function calls.
- The
Thread
class, when it needs to deliver a signal to the thread that is now running. - The
Thread
class, when it wants to check if the running thread is at risk of overrunning its stack.
The ubiquity of Debug::ft
allows it to support the latter two capabilities, which would otherwise be impossible to implement. Which brings us to this:
void Thread::FunctionInvoked(fn_name_arg func)
{
Thread* thr = nullptr;
if(Debug::FcFlags_.test(Debug::TracingActive))
{
auto& lock = AccessFtLock();
if(!lock.test_and_set())
{
if(TraceRunningThread(thr))
FunctionTrace::Capture(func);
lock.clear();
}
}
if(Debug::FcFlags_.test(Debug::TrapPending))
{
if(thr == nullptr) thr = RunningThread(std::nothrow);
if(thr == nullptr) return;
thr->TrapCheck();
}
if(Debug::FcFlags_.test(Debug::StackChecking))
{
if(StackCheckCounter_ <= 1)
{
if(thr == nullptr) thr = RunningThread(std::nothrow);
if(thr == nullptr) return;
thr->StackCheck();
}
else
{
--StackCheckCounter_;
}
}
}
The first if
statement checks if the function trace tool is enabled. If so, it sets a per-thread lock before it checks whether the running thread should be traced. If it should, it adds the function invocation to the trace buffer. The rest of the code deals with delivering signals to the running thread and checking if it is close to overrunning its stack.
The per-thread lock prevents nested calls to Debug::ft
. If these aren't blocked, such that a function invoked during Debug::ft
also invokes Debug::ft
, the result is a stack overflow. You can remove the lock and try it yourself if you're not convinced. When it crashed and I saw the thread stack, I facepalmed. More recently, I added calls to Debug::noft
to functions invoked during Debug::ft
, to indicate that they don't bother invoking Debug::ft
because doing so would be a waste of time. Besides, who wants to trace functions that are invoked to trace a function?
Assuming that FunctionTrace::Capture
gets invoked, what does it do?
void FunctionTrace::Capture(fn_name_arg func)
{
if(Scope_ == CountsOnly)
{
auto buff = Singleton<TraceBuffer>::Instance();
buff->RecordInvocation(func);
return;
}
auto buff = Singleton<TraceBuffer>::Instance();
auto depth = SysThreadStack::FuncDepth() - 3;
auto rec = new FunctionTrace(func, depth);
buff->Insert(rec);
}
The function tracer has two modes. One of these, CountsOnly
, is fairly efficient because it simply counts how many times each function was invoked. This is useful when deciding which functions to optimize, because a small speed-up in a frequently invoked function usually provides much greater gains than a large speed-up in a function that is rarely invoked.
If we want to record more than the invocation count for each function, the platform-specific SysThreadStack
is used to find the function's depth on the stack. This, along with the function's name, is passed to the FunctionTrace
constructor, which also records the running thread's identifier and the current tick time (i.e., the time at which the function was invoked).
After tracing is stopped and the trace records are to be streamed, FunctionTrace
records are processed to calculate the gross and net time spent in each function. The number of stack frames (depth) when a function was invoked supports a "staircase" output that shows how function calls were nested. And if more than one thread was traced, it is possible to see which thread was running when each function was invoked.
Here is an excerpt from a function trace. This is actually the first code to be captured during RSC's initialization, even before the entry to main
. It shows the creation of three mutexes that register with MutexRegistry
, which must be created before it can register the first mutex:
START OF TRACE: 30-Jan-2020 18:36:00.492 on Reigi
mm:ss.ttt Thr Event TotalTime NetTime Function
--------- --- ----- --------- ------- --------
36:00.492 1> 134 1 : : : SysMutex.ctor
36:00.492 1 44 44 : : : Permanent.ctor
36:00.492 1 47 5 : : : Singleton.Instance
36:00.492 1 17 3 : : : : Permanent.operator new
36:00.492 1 14 4 : : : : Memory.Alloc
36:00.492 1 10 10 : : : : : SysHeap.Alloc
36:00.492 1 24 2 : : : : MutexRegistry.ctor
36:00.492 1 7 7 : : : : Permanent.ctor
36:00.492 1 9 9 : : : : Registry.ctor
36:00.492 1 6 2 : : : : Registry.Init
36:00.492 1 4 2 : : : : : Memory.Alloc
36:00.492 1 2 2 : : : : : SysHeap.Alloc
36:00.492 1 1 1 : : : : Singletons.BindInstance
36:00.492 1 42 18 : : : MutexRegistry.BindMutex
36:00.492 1 24 24 : : : : MutexRegistry.Find
36:00.492 1 26 0 : : : SysMutex.ctor
36:00.492 1 7 7 : : : Permanent.ctor
36:00.492 1 19 4 : : : MutexRegistry.BindMutex
36:00.492 1 15 15 : : : : MutexRegistry.Find
36:00.492 1 149 0 : : : SysMutex.ctor
36:00.492 1 6 6 : : : Permanent.ctor
36:00.492 1 143 11 : : : MutexRegistry.BindMutex
36:00.492 1 132 132 : : : : MutexRegistry.Find
The trace of the entire initialization is quite long and can be seen here, along with the invocation count for each function that appeared in the trace. If you search for ">" in the trace, you will find entries in the Thr
column. That column displays the identifier of the running thread, with a ">" appended when a context switch occurs. Whenever RSC creates a thread during initialization, Windows starts to initiate context switches at a pace that is beyond manic! It only subsides because new threads suspend execution soon after being entered, and are only signaled to resume once initialization is completed.
Most non-trivial functions invoke Debug::ft
so that they will appear in a function trace. Those that don't usually fall into one of the following categories:
- functions defined in a header file
- simple
const
functions, such as "getters" - functions that write data to a stream
- iterators (e.g.,
First
, Next
, Prev
, Last
) - functions that construct or process trace records
- functions that are invoked so often that they would fill the trace with noise
Other Trace Tools
The function tracer is one of RSC's many trace tools. Here is a brief description of the others:
MemoryTracer
records alloc and free events for a heap and is typically used in combination with other trace tools. ObjPoolTracer
records dequeue, enqueue, claim, and recover events for the blocks in an object pool. NetworkTracer
records 18 different types of events on sockets. TransTracer
records the costs of receiving a message over the network and later processing it. BufferTracer
records the full contents of incoming and outgoing messages. ContextTracer
records 20 different types of events within RSC's state machine framework. ParserTracer
records 14 different types of events in the parser/interpreter that is part of RSC's C++ static analysis tool, resulting in a trace that resembles pseudo-code for a stack machine.
You can enable any subset of tools before starting a trace. After stopping the trace, you can disable some of them to exclude their records before using the CLI's >save
command to generate a trace file. Each trace file contains events from all of the enabled tools, in the order in which they occurred. For example, this trace is of a basic call in RSC's telephony application, where A calls B, B answers, and they both disconnect. The Event
column contains events recorded by NetworkTracer
, TransTracer
, and ContextTracer
, and complete messages recorded by BufferTracer
also appear in the trace. When TransTracer
and ContextTracer
are both used to record events, their events can also be processed to produce a message sequence chart.
In the Lab
The above trace tools are also useful in the lab, where they can be used with less caution. Two other tools are also useful during focused debugging:
- To look for memory leaks, use the subcommands available through the CLI's
>heap trace
command. - To see context switches, use the subcommands available through the CLI's
>sched
command, which generates a timeline of threads that looks like this.
Points of Interest
Training
Although trace tools are primarily intended for debugging, they are also useful teaching aids that can help new developers understand how the code works. For example, the function tracer makes it easy to see which override of a virtual function gets invoked. And a message sequence chart, coupled with a state machine trace, provides an extremely good view of how the components of a reactive system collaborate.
Compiler Optimizations
Aggressive code optimization by the compiler can make it very difficult to debug a release build. It's one thing to inline a function defined in a header, but quite another to inline unrelated functions and eliminate local variables. Even the output from the function tracer can be hard to follow. If you need the ability to debug your software in live systems, it is highly recommended that you disable many compiler optimizations. Some of them yield little improvement and simply obfuscate the code.
Function Tracing and Object Creation/Deletion
If you look at the full version of FunctionTrace::Capture
, you will come across code that inserts a call to a "C++.delete
" at the start of a destructor chain. This function represents compiler-generated code that invokes the destructor chain, followed by the appropriate operator
delete
. When the first destructor is invoked, the function tracer doesn't know if a delete operator will subsequently be invoked, so it inserts a C++.delete
. Later, when the function trace records are processed before being streamed, spurious calls to C++.delete
(i.e., those that were not followed by a delete operator) are removed from the trace.
Here is a C++.delete
that was added to a trace:
36:01.304 3 39 1 : : C++.delete
36:01.304 3 18 6 : : FileRequest.dtor
36:01.304 3 12 7 : : : StreamRequest.dtor
36:01.304 3 5 5 : : : MsgBuffer.dtor
36:01.304 3 20 7 : : Pooled.operator delete
36:01.304 3 13 7 : : : ObjectPool.EnqBlock
36:01.304 3 6 6 : : : Q1Way.Enq
Constructors are even more problematic. First, a chain of constructor calls gets captured bottom to top, from base class to leaf class. This is somewhat confusing and results in misleading timing information when calculating the gross and net times spent in each function. The processing phase therefore reorders constructors by putting the leaf class first, followed by the base class, and thence up the class hierarchy. This more clearly shows which class is being constructed and provides the correct gross time for its constructor.
Another source of confusion is that functions invoked in a member initialization list appear before the constructor to which they belong. As with the basic constructor chain, this inverted ordering occurs because the body of a constructor must be entered before Debug::ft
records its invocation.
Finally, as constructors step out (in depth) from base class to leaf class, there are cases where it is impossible to determine the correct leaf class. Provided that no constructor in the chain fails to invoke Debug::ft
, each of them (from base class to leaf class) has a depth that is one less than the previous constructor. But if another constructor follows the true leaf constructor at a depth that is also one less, it will appear to be the leaf class. There seems to be no general way to fix this without knowledge of the class hierarchy. (When an object is allocated on the heap, however, the correct leaf class can always be determined, because its constructor must lie at the same depth as the call to operator
new
.)
Here is an excerpt from a trace in which no processing has been done to reorder constructors:
00:32.972 1 1011 63 : : : CxxStrLiteral.CreateRef
00:32.972 1 155 122 : : : Base.operator new
00:32.972 1 23 23 : : : : CxxToken.ctor
00:32.972 1 33 33 : : : : CxxNamed.ctor
00:32.972 1 169 11 : : : QualName.ctor(string)
00:32.972 1 29 22 : : : : Base.operator new
00:32.972 1 6 6 : : : : : CxxToken.ctor
00:32.972 1 7 7 : : : : CxxNamed.ctor
00:32.972 1 129 129 : : : : TypeName.ctor
00:32.972 1 104 54 : : : Base.operator new
00:32.972 1 10 10 : : : : : CxxToken.ctor
00:32.972 1 17 17 : : : : CxxNamed.ctor
00:32.972 1 44 44 : : : : TypeSpec.ctor
00:32.972 1 6 6 : : : : TypeTags.ctor
00:32.972 1 43 43 : : : DataSpec.ctor
00:32.972 1 5 5 : : : TypeTags.SetPointer
00:32.972 1 305 213 : : : Base.operator new
00:32.972 1 9 9 : : : : : : CxxToken.ctor
00:32.972 1 17 17 : : : : : CxxNamed.ctor
00:32.972 1 21 21 : : : : : CxxScoped.ctor
00:32.972 1 75 75 : : : : CxxScope.ctor
00:32.972 1 92 24 : : : : Data.ctor
00:32.972 1 68 8 : : : : DataSpec.SetUserType
00:32.972 1 23 23 : : : : : TypeSpec.SetUserType
00:32.972 1 37 15 : : : : : QualName.SetUserType
00:32.972 1 22 22 : : : : : TypeName.SetUserType
00:32.972 1 53 53 : : : FuncData.ctor
00:32.972 1 114 8 : : : Singleton.Instance
00:32.972 1 54 54 : : : : Base.operator new
00:32.972 1 13 13 : : : : CxxRoot.ctor
00:32.972 1 39 39 : : : : Singletons.BindInstance
And here is the "corrected" version of the same excerpt:
36:00.493 1 710 61 : : : CxxStrLiteral.CreateRef
36:00.493 1 41 41 : : : Base.operator new
36:00.493 1 192 21 : : : QualName.ctor(string)
36:00.493 1 20 20 : : : : CxxToken.ctor
36:00.493 1 39 39 : : : : CxxNamed.ctor
36:00.493 1 13 13 : : : : Base.operator new
36:00.493 1 119 4 : : : : TypeName.ctor
36:00.493 1 3 3 : : : : : CxxToken.ctor
36:00.493 1 115 115 : : : : CxxNamed.ctor
36:00.493 1 14 14 : : : Base.operator new
36:00.493 1 80 7 : : : DataSpec.ctor
36:00.493 1 3 3 : : : : : CxxToken.ctor
36:00.493 1 3 3 : : : : CxxNamed.ctor
36:00.493 1 34 34 : : : : TypeSpec.ctor
36:00.493 1 39 39 : : : : TypeTags.ctor
36:00.493 1 3 3 : : : TypeTags.SetPointer
36:00.493 1 47 47 : : : Base.operator new
36:00.493 1 195 82 : : : FuncData.ctor
36:00.493 1 3 3 : : : : : : CxxToken.ctor
36:00.493 1 11 11 : : : : : CxxNamed.ctor
36:00.493 1 10 10 : : : : : CxxScoped.ctor
36:00.493 1 56 56 : : : : CxxScope.ctor
36:00.493 1 113 9 : : : : Data.ctor
36:00.493 1 104 8 : : : : DataSpec.SetUserType
36:00.493 1 9 9 : : : : : TypeSpec.SetUserType
36:00.493 1 87 13 : : : : : QualName.SetUserType
36:00.493 1 74 74 : : : : : TypeName.SetUserType
36:00.493 1 77 3 : : : Singleton.Instance
36:00.493 1 41 41 : : : : Base.operator new
36:00.493 1 6 6 : : : : CxxRoot.ctor
36:00.493 1 27 27 : : : : Singletons.BindInstance
History
- 17th August, 2020: Add "In the Lab" section
- 4th February, 2020: Initial version