Introduction
Several weeks ago, when I was writing and testing a function for use in a library, I inadvertently gave a variable inside a block the same name as a variable defined in the containing block. Initially stunned, I soon realized my error, and understood why the values I saw in the watch window changed suddenly when the execution path entered the inner scope block.
Background
Since I have written more code in C than C++ in the last few months, I had grown accustomed to the compiler emitting a fatal error when a name is reused in the same function.
- In C89, all local variables must be defined at the top of a block, ahead of its first executable statement; defining a variable below that point is a syntax error. C99 and later lift this restriction. When every local is defined at the top of the function body, as is customary, they all share one scope, and reusing a name there is a redefinition error.
- In C++, a function body is a scope that includes every variable defined at that level. Anything enclosed in braces is a block, and each block nested in the function acts like a subsidiary namespace that inherits the names defined in each containing block, up to the block that delimits the function body.
However, if the outer and inner blocks both define a variable named Foo, the Foo defined in the inner block hides the outer Foo until execution passes the closing brace of the inner block.
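The hiding rule described above can be seen in a few lines. This is a minimal sketch, not code from the library discussed in this article; the names FooValues and ShadowDemo are made up for the illustration.

```cpp
// Hypothetical illustration: records what each Foo holds at two points.
struct FooValues
{
    int inner;   // value of the inner Foo just before its block ends
    int outer;   // value of the outer Foo after the inner block ends
};

FooValues ShadowDemo()
{
    FooValues seen = { 0, 0 };
    int Foo = 1;              // outer Foo, visible throughout the function body

    {
        int Foo = 2;          // a second variable that hides the outer Foo
        Foo += 40;            // changes only the inner Foo
        seen.inner = Foo;     // reads the inner Foo
    }                         // the inner Foo goes out of scope here

    seen.outer = Foo;         // the outer Foo is visible again, and unchanged
    return seen;
}
```

Calling ShadowDemo() should report 42 for the inner Foo and 1 for the outer Foo, because the addition never touches the outer variable.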
Demonstration of Problem and Its Solution
The first statement in the body of function FB_ReplaceW defines lngFoundPos as a long integer, and initializes it to UNICODE_STRING_MAX_CHARS (32767). Things get really interesting as execution enters the do/while block, in which the first statement defines another long integer, which it also names lngFoundPos, and assigns the value returned by function StrIndex_P6C to it. If lngFoundPos is nonzero, a new value for lngTCharsToCopy is computed, memcpy is invoked to copy a substring, and lngTCharsToCopy is added to lngOutPos, the third variable defined with function scope.
When execution reached the end of the do/while block, I had a brain teaser on my hands. Which lngFoundPos does the while clause evaluate? No fair peeking at the answer below.
LPTSTR __stdcall FB_ReplaceW
(
    LPCTSTR plpStrData ,
    LPCTSTR plpToFind ,
    LPCTSTR plpToReplace ,
    PUINT   puintNewLength
)
{
    #define UNICODE_STRING_MAX_CHARS    32767
    #define BUFFER_BEGINNING_P6C        0
    #define STRLEN_EMPTY_P6C            0
    #define NONE_P6C                    0
    #define STRPOS_FOUND_P6C            1
    #define TRAILING_NULL_ALLOWANCE_P6C 1

    long lngFoundPos     = UNICODE_STRING_MAX_CHARS ;
    long lngInPos        = BUFFER_BEGINNING_P6C ;
    long lngOutPos       = BUFFER_BEGINNING_P6C ;
    long lngLenToRepl    = StringIsNullOrEmptyWW ( plpToReplace )
                           ? STRLEN_EMPTY_P6C
                           : _tcslen ( plpToReplace ) ;
    long lngInStrLen     = _tcslen ( plpStrData ) ;
    long lngLenToFind    = _tcslen ( plpToFind ) ;
    long lngTCharsToCopy = STRLEN_EMPTY_P6C ;

    do
    {
        long lngFoundPos = StrIndex_P6C ( ( plpStrData + ( LONG_PTR ) lngInPos ) ,
                                          plpToFind ) ;

        if ( lngFoundPos )
        {
            lngTCharsToCopy = lngFoundPos - 1 ;

            if ( lngTCharsToCopy )
            {
                memcpy ( ( LPTSTR ) m_lpFBReplaceBuff + ( LONG_PTR ) lngOutPos ,
                         ( LPCTSTR ) plpStrData + ( LONG_PTR ) lngInPos ,
                         TcharsToBytesP6C ( lngTCharsToCopy ) ) ;
                lngOutPos += lngTCharsToCopy ;
            }

            if ( lngLenToRepl )
            {
                memcpy ( ( LPTSTR ) m_lpFBReplaceBuff + ( LONG_PTR ) lngOutPos ,
                         ( LPCTSTR ) plpToReplace + ( LONG_PTR ) lngInPos ,
                         TcharsToBytesP6C ( lngLenToRepl ) ) ;
                lngOutPos += lngLenToRepl ;
            }

            lngInPos = lngInPos
                       + lngFoundPos
                       + lngLenToFind
                       - TRAILING_NULL_ALLOWANCE_P6C ;
        }
        else
        {
            lngTCharsToCopy = lngInStrLen != lngInPos
                              ? lngInStrLen - lngInPos
                              : NONE_P6C ;

            if ( lngTCharsToCopy )
            {
                memcpy ( ( LPTSTR ) m_lpFBReplaceBuff + ( LONG_PTR ) lngOutPos ,
                         ( LPCTSTR ) plpStrData + ( LONG_PTR ) lngInPos ,
                         TcharsToBytesP6C ( lngTCharsToCopy ) ) ;
            }
        }
    } while ( lngFoundPos > STRLEN_EMPTY_P6C ) ;

    return m_lpFBReplaceBuff ;
}
I discovered the hard way that the while clause evaluates the outer variable, because the inner variable goes out of scope at the closing brace that precedes the while keyword. The result was an infinite loop, because the outer lngFoundPos is initialized to UNICODE_STRING_MAX_CHARS, and never changes thereafter.
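The behavior can be reproduced on a much smaller scale. The sketch below is not the library code: FindPos, CountPassesShadowed, and CountPassesFixed are hypothetical names, std::strstr stands in for StrIndex_P6C, and the maxPasses argument is a safety cap added only so that the buggy version terminates. Both functions assume a non-empty search string.

```cpp
#include <cstring>

// Hypothetical helper: 1-based position of toFind in text, or 0 if absent.
static long FindPos(const char* text, const char* toFind)
{
    const char* hit = std::strstr(text, toFind);
    return hit ? static_cast<long>(hit - text) + 1 : 0;
}

// Buggy version: the loop body declares its own foundPos.
int CountPassesShadowed(const char* text, const char* toFind, int maxPasses)
{
    long foundPos = 32767;                        // outer variable
    int passes = 0;
    do
    {
        long foundPos = FindPos(text, toFind);    // second variable, same name
        if (foundPos)
            text += foundPos - 1 + std::strlen(toFind);   // skip past the match
        ++passes;
    } while (foundPos > 0 && passes < maxPasses); // tests the outer foundPos
    return passes;
}

// Fixed version: the loop body assigns to the one and only foundPos.
int CountPassesFixed(const char* text, const char* toFind, int maxPasses)
{
    long foundPos = 32767;
    int passes = 0;
    do
    {
        foundPos = FindPos(text, toFind);         // assignment, no declaration
        if (foundPos)
            text += foundPos - 1 + std::strlen(toFind);
        ++passes;
    } while (foundPos > 0 && passes < maxPasses);
    return passes;
}
```

With the arguments ("a-b", "-", 10), the shadowed version should run until the cap stops it at 10 passes, while the fixed version should stop after 2 passes: one that finds the hyphen and one that finds nothing.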
The solution was obvious and simple. The first statement in the buggy block was as follows.
long lngFoundPos = StrIndex_P6C ( ( plpStrData + ( LONG_PTR ) lngInPos ) ,
plpToFind ) ;
Eliminating the first keyword (long) turns the declaration into an assignment to the original lngFoundPos, allowing the loop to stop when it should, rather than run off into deep space (high memory, actually). Consolidating the statement into the if statement that followed it in the original code, simplifying the while expression, and switching to a pointer, lpFoundPos, initialized to NULL (zero), yields a working loop that looks like this.
LPTSTR lpFoundPos = NULL ;
...
if ( lpFoundPos = _tcsstr ( lpInPos , plpToFind ) )
...
} while ( lpFoundPos ) ;
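For readers who want to see the shape of the whole loop, here is a self-contained sketch built around the same pattern. It is an assumption-laden stand-in rather than the real FB_ReplaceW: the name ReplaceAllSketch is hypothetical, it works on narrow strings with std::strstr instead of _tcsstr, it returns a std::string instead of writing into a shared buffer, and it assumes that none of its arguments is null.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper, not the author's FB_ReplaceW: returns a copy of data
// in which every occurrence of toFind is replaced with toReplace.
std::string ReplaceAllSketch(const char* data,
                             const char* toFind,
                             const char* toReplace)
{
    const std::size_t lenToFind = std::strlen(toFind);
    if (lenToFind == 0)
        return data;                      // nothing to find; avoids an endless loop

    std::string out;
    const char* inPos    = data;
    const char* foundPos = nullptr;       // defined once, outside the loop

    do
    {
        // Assignment, not a definition: the while clause tests this variable.
        if ((foundPos = std::strstr(inPos, toFind)) != nullptr)
        {
            out.append(inPos, static_cast<std::size_t>(foundPos - inPos));
            out.append(toReplace);
            inPos = foundPos + lenToFind; // resume just past the match
        }
        else
        {
            out.append(inPos);            // copy the unmatched tail
        }
    } while (foundPos);                   // stops after the first failed search

    return out;
}
```

Because foundPos is defined only once, the value that the if statement assigns is the same value that the while clause tests, so the loop ends on the first pass that finds nothing.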
Points of Interest
Although the role of braces as scope boundary markers is familiar to me, because other languages that borrowed heavily from C++ exhibit the same behavior, the example made crystal clear that the braces form a Chinese wall around the code that they enclose. Any variable defined inside the block doesn't exist until execution passes the opening brace, and it ceases to exist the instant execution passes the closing brace. Since the while clause lies outside the braces, it can't use any variable that was defined inside them, even if a like-named variable exists in its scope. Technically, they are two different variables.
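One way to watch a variable come into existence and cease to exist at the braces is to give it a constructor and a destructor that leave a trail. The following is only an illustrative sketch; ScopeTracer and TraceBlockLifetime are hypothetical names invented for it.

```cpp
#include <string>

// Hypothetical tracer: logs "<tag>+" when created and "<tag>-" when destroyed.
struct ScopeTracer
{
    std::string& log;
    char         tag;

    ScopeTracer(std::string& l, char t) : log(l), tag(t) { log += tag; log += '+'; }
    ~ScopeTracer() { log += tag; log += '-'; }
};

std::string TraceBlockLifetime()
{
    std::string log;
    {
        ScopeTracer tracer(log, 'a');       // outer variable created
        {
            ScopeTracer tracer(log, 'b');   // like-named inner variable created
        }                                   // inner variable destroyed here
        log += '|';                         // runs after the inner block has ended
    }                                       // outer variable destroyed here
    return log;
}
```

The returned log should read "a+b+b-|a-": the inner variable is gone before the statement that follows its closing brace runs, while the outer variable of the same name lives on until its own closing brace.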
Numerous other languages follow these rules, or something very close to them. I know of the following languages, and I am certain that this list is far from exhaustive.
Other popular languages that I suspect follow the same rules include Python, PHP, and Pascal.
History
Monday, 01 June 2015, Initial Publication