Introduction
A macro is a symbolic name that you give to a series of characters (a text macro) or to one or more statements (a macro procedure or macro function). As the assembler evaluates each line of your program, it scans the source code for the names of previously defined macros and substitutes the macro definition for each macro name. A macro procedure is a named block of assembly language statements. Once defined, it can be invoked (called) many times, each time with different arguments, so you avoid writing the same code repeatedly in different places.
In this article, we’ll talk about some usages that are not thoroughly discussed or not clearly documented. We'll show and analyze examples in the Microsoft Visual Studio IDE. The topics cover the ECHO directive, checking the parameter type and size in a macro procedure, and generating memory in repetitions with the current location counter $.
All the materials presented here came from my years of teaching [2]. To read this article, a general understanding of Intel x86-64 assembly language is assumed, and familiarity with Visual Studio 2010 or above is required. Preferably, you have read a typical textbook such as [3], or the MASM Programmer's Guide [5], which Microsoft originally published in 1992 but which is still valuable for learning MASM today. If you are taking an Assembly Language Programming class, this could serve as supplemental reading or a study reference.
Using ECHO to the Output Window
According to MSDN, the ECHO directive displays a message to the standard output device. In Visual Studio, you can use ECHO to send a string to the IDE’s Output pane to show assembling/compiling messages, such as warnings or errors, similar to what MSBuild does.
A simplified snippet to test ECHO would be like this:
.386
.model flat,stdcall
ExitProcess proto,dwExitCode:DWORD
mTestEcho MACRO
ECHO ** Test Echo with Nothing **
ENDM
.code
main PROC
mTestEcho
invoke ExitProcess,0
main ENDP
END main
This code worked fine in VS 2008. Unfortunately, since VS 2010, the ECHO directive no longer writes to the Output window in Visual Studio by default. As seen in [4], it does so only if you configure the build to generate verbose output when assembling your code. To do this, go to:
Tools->Options->Projects and Solutions->Build and Run
Then, from the "MSBuild project build output verbosity" dropdown box, choose the "Detailed" option (notice that the default is "Minimal"):
To test the macro, you only need to compile an individual .ASM file, instead of building the whole project. Simply right click on your file and choose Compile:
To verify, watch the display in the VS Output pane. The ECHO directive is really working; however, the message you are interested in, "** Test Echo with Nothing **", is buried somewhere in the hundreds of lines that MSBuild generated. You must search tediously to find it:
When learning MASM assembly programming, this is definitely not a preferred way to practice. What I suggest is to use the text "Error:" or "Warning:" with ECHO, while leaving the default MSBuild output setting "Minimal" unchanged.
1. Output as Error
Simply add "Error:" in the ECHO statement in the macro and name it mTestEchoError:
mTestEchoError MACRO
ECHO Error: ** Test Echo with Error **
ENDM
Now let’s call mTestEchoError in main PROC. Compiling the code, you can see concise minimal output as below. Notice that because of the word "Error" here, the build is reasonably reported as failed.
2. Output as Warning
Simply add "Warning:" in the ECHO statement and name it mTestEchoWarning:
mTestEchoWarning MACRO
ECHO Warning: ** Test Echo with Warning **
ENDM
Then, calling mTestEchoWarning in main PROC and compiling it, you can see much simpler minimal output as below. Since only a warning is designated, the compilation succeeded.
As you can see, this way the ECHO directive generates concise and clear messages without you having to search the output. The sample is in TestEcho.asm for download.
Checking Parameter Type and Size
When you pass an argument to a macro procedure, the procedure receives it through the parameter, although this is just a text substitution. Usually, you will check some conditions on the parameter to act accordingly. Since this happens at assembly time, the assembler chooses some instructions if a condition is satisfied and, if needed, provides other instructions otherwise. You can certainly check constant argument values, either strings or numbers. Yet another useful check is based on the parameter type or size for registers and variables. For example, one macro procedure may accept only unsigned integers and bypass signed ones, while a second macro may deal with 16- and 32-bit arguments but not 8- and 64-bit ones.
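As a minimal sketch of checking a constant argument value, the hypothetical macro below (mShiftLeftEAX and its parameter amount are illustrative names, not part of the downloadable samples) assumes that its argument is a numeric constant. At assembly time it either emits one shl instruction or only prints a warning:

```masm
; Hypothetical example: validate a numeric constant argument at assembly time.
; Assumes the argument is an immediate value such as 4; a register argument
; would not be a constant expression that IF can evaluate.
mShiftLeftEAX MACRO amount
    IF amount GT 31
        ; Too large for a 32-bit shift: report it and generate nothing
        ECHO Warning: ** shift value is larger than 31 and no instruction is generated **
    ELSE
        ; Valid constant: generate the shift instruction
        shl eax, amount
    ENDIF
ENDM
```

With this definition, mShiftLeftEAX 4 would expand to shl eax, 4, while mShiftLeftEAX 40 would generate no instruction and only the warning text.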
1. Argument as a Memory Variable
Let’s define three variables here:
.data
swVal SWORD 1
wVal WORD 2
sdVal SDWORD 3
When applying the TYPE and SIZEOF operators to these variables, we simply have:
mov eax, TYPE swVal
mov eax, SIZEOF swVal
mov eax, TYPE wVal
mov eax, SIZEOF wVal
mov eax, TYPE sdVal
mov eax, SIZEOF sdVal
As seen above, there is no numeric difference either between TYPE and SIZEOF, or between WORD and SWORD. The first four instructions all move the byte count 2 to EAX. However, TYPE can do more than just return byte counts. Let’s try to check the SWORD type and size with the parameter par:
mParameterTYPE MACRO par
IF TYPE par EQ TYPE SWORD
ECHO warning: ** TYPE par is TYPE SWORD
ELSE
ECHO warning: ** TYPE par is NOT TYPE SWORD
ENDIF
ENDM
mParameterSIZEOF MACRO par
IF SIZEOF par EQ SIZEOF SWORD
ECHO warning: ** SIZEOF par is SIZEOF SWORD
ELSE
ECHO warning: ** SIZEOF par is NOT SIZEOF SWORD
ENDIF
ENDM
Then call the two macros, passing the variables defined above:
ECHO warning: --- Checking TYPE and SIZEOF for wVal ---
mParameterTYPE wVal
mParameterSIZEOF wVal
ECHO warning: --- Checking TYPE and SIZEOF for swVal ---
mParameterTYPE swVal
mParameterSIZEOF swVal
ECHO warning: --- Checking TYPE and SIZEOF for sdVal ---
mParameterTYPE sdVal
mParameterSIZEOF sdVal
See the following results in Output:
Obviously, the TYPE operator can be used to differentiate signed from unsigned arguments, as SWORD and WORD are different types, while SIZEOF is simply a comparison of byte counts, as SWORD and WORD are both 2 bytes. The last two checks mean that the type of SDWORD is not SWORD and that the size of SDWORD is 4 bytes, not 2.
Furthermore, let’s make direct checks, since both operators can also be applied to data type names:
mCheckTYPE MACRO
IF TYPE SWORD EQ TYPE WORD
ECHO warning: ** TYPE SWORD EQ TYPE WORD
ELSE
ECHO warning: ** TYPE SWORD NOT EQ TYPE WORD
ENDIF
ENDM
mCheckSIZEOF MACRO
IF SIZEOF SWORD EQ SIZEOF WORD
ECHO warning: ** SIZEOF SWORD EQ SIZEOF WORD
ELSE
ECHO warning: ** SIZEOF SWORD NOT EQ SIZEOF WORD
ENDIF
ENDM
The following result is intuitive and straightforward:
2. Argument as a Register
Since an argument can be a register, let’s call the two previous macros to check its TYPE and SIZEOF:
mParameterTYPE AL
mParameterSIZEOF AL
mParameterTYPE AX
mParameterSIZEOF AX
We receive the following messages:
As we see here, for the type check, neither AL nor AX (even though it is 16-bit) is a signed WORD. In fact, you cannot apply SIZEOF to a register; doing so causes assembly error A2009. You can verify it directly:
mov ebx, SIZEOF al
mov ebx, TYPE al
But which type do registers have? The answer is that all registers are unsigned by default. Simply write this:
mParameterTYPE2 MACRO par
IF TYPE par EQ WORD
ECHO warning: ** TYPE par is WORD
ELSE
ECHO warning: ** TYPE par is NOT WORD
ENDIF
ENDM
And call:
mParameterTYPE2 AL
mParameterTYPE2 AX
Also notice that I directly use the data type name WORD here, which is equivalent to using TYPE WORD.
3. An Example in Practice
Now let’s take a look at a concrete example that requires moving an 8-, 16-, or 32-bit signed integer argument into EAX. To create such a macro, we have to use either the mov instruction or the sign-extension movsx, based on the parameter size. The following is one possible solution that compares the parameter's type with the required sizes. %OUT is an alternative to ECHO and works the same way.
mParToEAX MACRO intVal
IF TYPE intVal LE SIZEOF WORD
movsx eax, intVal
ELSEIF TYPE intVal EQ SIZEOF DWORD
mov eax,intVal
ELSE
%OUT Error: ***************************************************************
%OUT Error: Argument intVal passed to mParToEAX must be 8, 16, or 32 bits.
%OUT Error:****************************************************************
ENDIF
ENDM
Test it with different sizes and types for variables and registers:
mParToEAX bVal
mParToEAX swVal
mParToEAX wVal
mParToEAX sdVal
mParToEAX qVal
mParToEAX AH
mParToEAX BX
mParToEAX EDX
mParToEAX RDX
As expected, the Output shows the following messages, reasonably rejecting qVal. An error is also correctly reported for RDX, as our 32-bit project doesn’t recognize a 64-bit register.
You can try the downloadable code in ParToEAX.asm. Furthermore, let’s generate its listing file to see what instructions the assembler has created to substitute the macro calls. As expected, code is generated for bVal, swVal, wVal, and sdVal but not for qVal, and for AH, BX, and EDX but not for RDX:
00000000 .data
00000000 03 bVal BYTE 3
00000001 FFFC swVal SWORD -4
00000003 0005 wVal WORD 5
00000005 FFFFFFFA sdVal SDWORD -6
00000009 qVal QWORD 7
0000000000000007
00000000 .code
00000000 main_pe PROC
mParToEAX bVal
00000000 0F BE 05 1 movsx eax, bVal
00000000 R
mParToEAX swVal
00000007 0F BF 05 1 movsx eax, swVal
00000001 R
mParToEAX wVal
0000000E 0F BF 05 1 movsx eax, wVal
00000003 R
mParToEAX sdVal
00000015 A1 00000005 R 1 mov eax,sdVal
mParToEAX qVal
mParToEAX AH
0000001A 0F BE C4 1 movsx eax, AH
mParToEAX BX
0000001D 0F BF C3 1 movsx eax, BX
mParToEAX EDX
00000020 8B C2 1 mov eax,EDX
mParToEAX RDX
1 IF TYPE RDX LE SIZEOF WORD
AsmCode\ParToEAX.asm(45) : error A2006:undefined symbol : RDX
mParToEAX(1): Macro Called From
AsmCode\ParToEAX.asm(45): Main Line Code
1 ELSE
AsmCode\ParToEAX.asm(45) : error A2006:undefined symbol : RDX
mParToEAX(3): Macro Called From
AsmCode\ParToEAX.asm(45): Main Line Code
invoke ExitProcess,0
00000029 main_pe ENDP
END
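Building on the earlier observation that TYPE distinguishes signed from unsigned types, the following sketch extends the same idea so that only SBYTE and SWORD arguments are sign-extended, while other 8- and 16-bit arguments (including registers, which are treated as unsigned) are zero-extended with movzx. The macro name mParToEAX2 is hypothetical and the macro is not part of ParToEAX.asm; it assumes the type comparisons behave as shown with mParameterTYPE.

```masm
; Hypothetical variant of mParToEAX: choose movsx or movzx from the argument's type.
; Assumes TYPE comparisons tell SBYTE/SWORD apart from BYTE/WORD, as in mParameterTYPE.
mParToEAX2 MACRO intVal
    IF TYPE intVal EQ TYPE SBYTE
        movsx eax, intVal       ; signed 8-bit: sign-extend
    ELSEIF TYPE intVal EQ TYPE SWORD
        movsx eax, intVal       ; signed 16-bit: sign-extend
    ELSEIF TYPE intVal LE SIZEOF WORD
        movzx eax, intVal       ; remaining 8/16-bit arguments: zero-extend
    ELSEIF TYPE intVal EQ SIZEOF DWORD
        mov eax, intVal         ; 32-bit: plain move
    ELSE
        ECHO Error: Argument passed to mParToEAX2 must be 8, 16, or 32 bits.
    ENDIF
ENDM
```

Under these assumptions, mParToEAX2 swVal would still produce movsx, whereas mParToEAX2 wVal and mParToEAX2 BX would produce movzx instead.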
Generating Data in Repetitions
In this section, we’ll talk about using macros to generate a memory block, an array of integers in the data segment, rather than calling macros in code. We’ll show three ways to create the same linked list: using an unchanged location counter $, retrieving changed values from the counter $, and calling a macro in the data segment.
1. Using an Unchanged Location Counter $
I simply borrowed the LinkedList snippet from the textbook [3] and modified it to have eight nodes:
LinkedList ->11h ->12h ->13h ->14h ->15h ->16h ->17h ->18h ->00h
I added six extra DWORDs of 01111111h at the end for padding; they are unnecessary but make the Memory window easier to read:
ListNode STRUCT
NodeData DWORD ?
NextPtr DWORD ?
ListNode ENDS
TotalNodeCount = 8
.data
Counter = 0
LinkedList LABEL PTR ListNode
REPT TotalNodeCount
Counter = Counter + 1
ListNode <Counter+10h, ($ + Counter * SIZEOF ListNode)>
ENDM
ListNode <0,0>
DWORD 01111111h, 01111111h, 01111111h, 01111111h, 01111111h, 01111111h
The memory is created. The list header is the label LinkedList, an alias that points to 0x00404010:
Each node contains a 4-byte DWORD for NodeData and another DWORD for NextPtr. As Intel IA-32 is little-endian, the first integer in memory, 11 00 00 00, is 00000011 in hexadecimal, and its next pointer, 18 40 40 00, is 0x00404018. So the first two rows cover all eight list nodes. In the third row, the first node, with two zero DWORDs, acts as the tail (although it wastes a node). It is immediately followed by the padding of six 01111111 values.
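To make the pointer arithmetic concrete, here is a hand-unrolled sketch of the same layout, assuming each NextPtr is the address of the list header plus Counter * SIZEOF ListNode (8 bytes per node). The label Unrolled is a hypothetical name used only for this illustration, and only the first three nodes and a tail are written out:

```masm
ListNode STRUCT
    NodeData DWORD ?
    NextPtr  DWORD ?
ListNode ENDS

.data
; Hypothetical label: three nodes written by hand instead of using REPT
Unrolled LABEL PTR ListNode
    ListNode <11h, (Unrolled + 1 * SIZEOF ListNode)>    ; node at +00h, NextPtr = header + 08h
    ListNode <12h, (Unrolled + 2 * SIZEOF ListNode)>    ; node at +08h, NextPtr = header + 10h
    ListNode <13h, (Unrolled + 3 * SIZEOF ListNode)>    ; node at +10h, NextPtr = header + 18h
    ListNode <0,0>                                      ; tail node at +18h
```

If the header sat at 0x00404010, the first NextPtr here would be 0x00404018, which is the same value as the bytes 18 40 40 00 seen in memory.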
Now let’s see what happens to the current location counter $. As mentioned in [3]:
The expression ($ + Counter * SIZEOF ListNode) tells the assembler to multiply the counter by the ListNode size and add their product to the current location counter. The value is inserted into the NextPtr field in the structure. [It’s interesting to note that the location counter’s value ($) remains fixed at the first node of the list.]
It is indeed true that the value of $ always remains 0x00404010, without changing in each iteration of the REPT block. The NextPtr address calculated by ($ + Counter * SIZEOF ListNode) links the nodes one by one to eventually generate LinkedList. However, you might ask whether we could get the actual current memory address to use in each iteration. Yes, as shown next.
2. Retrieving Changed Values From the Location Counter $
.data
Counter = 0
LinkedList2 LABEL PTR ListNode
REPT TotalNodeCount
Counter = Counter + 1
ThisPointer = $
ListNode <Counter+20h, (ThisPointer + SIZEOF ListNode)>
ENDM
ListNode <0,0>
DWORD 02222222h, 02222222h, 02222222h, 02222222h, 02222222h, 02222222h
len = ($ - LinkedList)/TYPE DWORD
Almost nothing has changed except for a new symbolic constant, ThisPointer = $, which assigns the current memory address in $ to ThisPointer. Now we can use ThisPointer in a similar calculation to initialize the NextPtr field of each ListNode object with a simpler expression, (ThisPointer + SIZEOF ListNode). This also links the nodes one by one, this time generating LinkedList2. You can check LinkedList2’s memory at 0x00404070:
To differentiate it from the first LinkedList, I use Counter+20h for NodeData, which gives:
LinkedList2 ->21h ->22h ->23h ->24h ->25h ->26h ->27h ->28h ->00h
Comparing the two memory blocks, both provide exactly the same functionality. Notice that at the end, I purposely calculate len to see how many DWORDs have been generated so far.
len = ($ - LinkedList)/TYPE DWORD
As an interesting exercise, work out the value of len in your head. In code, move len to a register to verify.
3. Calling a Macro in Data Segment
By making the third linked list, we can see that you can call a macro not only in code but also in the data segment. For this purpose, I define a macro named mListNode with a parameter called start, where a ListNode object is simply initialized. To differentiate it from the previous two, I use Counter+30h for NodeData and assign NextPtr as (start + Counter * SIZEOF ListNode).
.data
mListNode MACRO start
Counter = Counter + 1
ListNode <Counter+30h, (start + Counter * SIZEOF ListNode)>
ENDM
LinkedList3 = $
Counter = 0
REPT TotalNodeCount
mListNode LinkedList3
ENDM
ListNode <0,0>
DWORD 03333333h, 03333333h, 03333333h, 03333333h, 03333333h, 03333333h
The third list looks like:
LinkedList3 ->31h ->32h ->33h ->34h ->35h ->36h ->37h ->38h ->00h
We now apply the lesson from LinkedList2 by setting LinkedList3 = $ at the beginning. Notice that I simply use the symbolic constant LinkedList3 as the third list header, instead of the LABEL directive. The REPT repetition now contains only one macro call, passing the header address LinkedList3 to mListNode. That’s it! See the memory at 0x004040D0:
What would happen if you passed $ as an argument to mListNode, without LinkedList3 = $?
4. Checking an Address and Traversing a Linked List
Finally, let us put the generation of all three lists together and run LinkedList.asm (available for download). In the code segment, I first retrieve the three list headers’ addresses as below:
mov edx,OFFSET LinkedList
mov ebx,OFFSET LinkedList2
mov esi,OFFSET LinkedList3
mov eax, len
As expected, EDX, 00404010, is for LinkedList; EBX, 00404070, for LinkedList2; and ESI, 004040D0, for LinkedList3. The memory blocks of the three lists are adjacent to each other, as shown:
Notice that because LinkedList3 is a symbolic constant, we don’t even have to use the OFFSET operator here. Let’s keep ESI for LinkedList3 and traverse this list to see every NodeData value with a loop like this:
NextNode:
mov eax, (ListNode PTR [esi]).NextPtr
cmp eax, 0
je quit
mov eax, (ListNode PTR [esi]).NodeData
mov esi, (ListNode PTR [esi]).NextPtr
jmp NextNode
quit:
Unfortunately, we haven’t implemented an output procedure to call here to show the NodeData value moved into EAX. But while debugging, simply setting a breakpoint there to watch EAX should be enough to verify the values from 31h, 32h, …, to 38h.
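If stepping through every node is inconvenient, one alternative is to accumulate the values during the traversal and inspect the registers once at the end. The following is only a sketch under assumptions: the label SumList and the use of ECX and EDX as accumulators are hypothetical choices, and the list is rebuilt here with the textbook-style REPT block so that the program is self-contained.

```masm
.386
.model flat,stdcall
.stack 4096
ExitProcess proto,dwExitCode:DWORD

ListNode STRUCT
    NodeData DWORD ?
    NextPtr  DWORD ?
ListNode ENDS

TotalNodeCount = 8

.data
Counter = 0
SumList LABEL PTR ListNode              ; hypothetical list header for this sketch
REPT TotalNodeCount
    Counter = Counter + 1
    ListNode <Counter+30h, ($ + Counter * SIZEOF ListNode)>
ENDM
ListNode <0,0>                          ; tail node with a null NextPtr

.code
main PROC
    mov esi, OFFSET SumList             ; ESI points to the current node
    xor ecx, ecx                        ; ECX counts the visited nodes
    xor edx, edx                        ; EDX accumulates the NodeData values
NextNode:
    mov eax, (ListNode PTR [esi]).NextPtr
    cmp eax, 0                          ; a null NextPtr marks the tail node
    je quit
    add edx, (ListNode PTR [esi]).NodeData
    inc ecx
    mov esi, eax                        ; advance to the next node
    jmp NextNode
quit:
    ; Set a breakpoint here. If the list was generated as described,
    ; ECX holds 8 and EDX holds 1A4h (31h + 32h + ... + 38h).
    invoke ExitProcess,0
main ENDP
END main
```

A single breakpoint at quit is then enough to confirm both the node count and the node contents.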
Summary
By scrutinizing the above examples, we exposed something that you may not know about macros in MASM assembly programming. An assembly language program executes Intel- or AMD-specified instructions at runtime. On the other hand, MASM provides many directives, operators, and symbols to control and organize the instructions and memory variables at assembly time, similar to preprocessing in other programming languages. In fact, with all these features, the MASM macro facility could itself be considered a mini programming language with three control mechanisms: sequence, condition, and repetition.
However, some usages of MASM macros have not been discussed in detail. In this article, we first introduced a better way to output your error or warning text, making it easy to trace macro behavior. Then, with if-else structures, we presented how to check the type and size of a macro’s parameter, a usual practice for both memory and register arguments. Finally, we discussed macro repetitions with three examples that generate the same linked list, along with a better understanding of how to use the current location counter symbol $. The downloadable zip file contains all samples in .asm files. The project file MacroTest.vcxproj was created in VS 2010, and it can be opened and upgraded in any more recent VS version.
This article does not involve hot technologies like .NET or C#. Assembly language is comparatively traditional and attracts little sensation. But according to the TIOBE Programming Community Index, the ranking of assembly language has been rising recently, which suggests that different types of assembly languages play an important role in new device development. Academically, Assembly Language Programming is considered a demanding course in Computer Science. Therefore, I hope this article can serve as a set of problem-solving examples for students and, perhaps, developers as well.
References
History
- February 26, 2016 -- Original version posted