The AVR Assembler brush, shBrushAsm.js.zip, is available as a download and is meant to be used with the SyntaxHighlighter script by Alex Gorbatchev. It has not been thoroughly tested, so if you find any problems with it give me a shout.
To learn assembler programming you need to know something about the hardware you
are using. In this tutorial we will start with a brief introduction to the inner
workings of the AVR microcontroller, then move on to pure assembler, and finally
show how to mix 'C' and assembler.
- Introduction
- Memory Configuration
- Accessing Memory
- Program Memory
- Data Memory
- Data Direct
- Data Indirect with Displacement
- Data Indirect
- Data Indirect with Pre-decrement
- Data Indirect with Post-increment
- EEPROM Memory
- Input/Output
- The Language
- Register Usage
- Macros
- Mixing languages
- Calling Assembler subroutine from C
- Calling C subroutine from Assembler
- Complete example assembler program
- Conclusion
- References
Why on earth would anyone want to program in a low level language like Assembler
when there are languages such as C, C++ and others that provide a layer of abstraction
that takes all the drudgery out of programming?
- Some people are masochists.
- The code generated by the high level compiler won't fit in the MPU's memory space.
- We have a need for speed.
- We are control freaks and need to control every aspect of the application's flow.
There is another good reason to learn assembler: the more you know about the
inner workings of the processor, the more capable a programmer you will become. And
even if you do decide that you need to write portions of your code in assembler,
you are not restricted to using only assembler or only a higher level language. We can mix
them as long as we observe a few simple rules. For instance, we could use 'C' as
our main language but write the interrupt routines in assembler.
AVR uses a Harvard architecture, which is an architecture with separate memories
and buses for program and data. Instructions in the program memory are executed
with single-level pipelining: while one instruction is being executed,
the next instruction is pre-fetched from the program memory. This allows most
instructions to execute in a single clock cycle. The figure below illustrates the
memory map for a typical AVR device. The actual memory configuration depends
on the particular MPU being used, so check the data sheet.
Because program memory, data memory and EEPROM are separate in the AVR architecture,
each is accessed in a different fashion. Instructions are provided for program and
data memory access, and data can be read or written in much the same way as
in most other processors. EEPROM access will be covered in its own
section because it is a different beast in Atmel and most other microcontrollers.
Program Memory should be thought of as read-only memory. There are only two
instructions for working with it, Load Program Memory (LPM) and Store Program Memory
(SPM), and unless you are writing self-modifying code there is really no need to
write to Program Memory.
In this section we will cover the many instructions dedicated to working with Data
Memory. The information was taken from the ATmega1280 data sheet. Not all processors
support these instructions in the same fashion, so refer to the data sheet
for the particular microcontroller you are using to be sure they are available.
The instructions listed in this section can address at most a 64K data segment, and some address less. On processors that have a larger data space, the special register RAMPD extends the direct addressing instructions, and the registers RAMPX, RAMPY and RAMPZ extend the indirect instructions that use X, Y and Z, so that memory beyond the 64K limit can be reached. RAMPZ is also used, together with the ELPM instruction, to read Program memory above 64K.
The LDS instruction loads a single byte of data from the data
space into a register. Depending on whether it uses a 7-bit or a 16-bit address, the opcode size is 16 bits or 32 bits respectively. The short form is not available on all devices, so check the data sheet. The STS instruction is the counterpart of LDS and transfers a single byte of data from a register to the data space.
Instructions that use this format are:
LDS - Load Direct from Data Space
Syntax: LDS Rd,k 0≤d≤31, 0≤k≤65535
Example: LDS R0,0x0100
LDS (16 bit) - Load Direct from Data Space
Syntax: LDS Rd,k 16≤d≤31, 0≤k≤127
Example: LDS R16,0x00
STS - Store Direct to Data Space
Syntax: STS k,Rr 0≤r≤31, 0≤k≤65535
Example: STS 0x0100,R2
STS (16 bit) - Store Direct to Data Space
Syntax: STS k,Rr 16≤r≤31, 0≤k≤127
Example: STS 0x00,R16
Loads or stores a single byte of data between data memory and a register. As can be seen in the image below, an immediate displacement is added to the value in the Y or Z register to arrive at the final address of the desired data byte; the Y or Z register itself is left unchanged. On devices that have more than 64K of data memory, the RAMPY and RAMPZ registers allow for 24-bit addressing.
LDD - Load Indirect using Y or Z
Syntax: LDD Rd,Y+q ;0≤d≤31, 0≤q≤63
Example: LDD R4,Y+2 ;Load R4 with loc. Y+2
STD - Store Indirect using Y or Z
Syntax: STD Y+q,Rr ;0≤r≤31, 0≤q≤63
Example: STD Y+2,R4 ;Store R4 at loc. Y+2
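A common use of displacement addressing is reaching a field inside a record: Y or Z holds the record's base address and q is the field's fixed offset. The host-side C sketch below models that address calculation. The `sample` struct and both function names are hypothetical and exist only for illustration; the byte pointer stands in for data memory.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record layout, used only for illustration. */
struct sample {
    uint8_t id;      /* offset 0 */
    uint8_t flags;   /* offset 1 */
    uint8_t value;   /* offset 2 */
};

/* Model of LDD Rd,Y+q: read the byte at (base + q).
 * The base pointer is not modified, just as Y is left unchanged by LDD. */
static uint8_t load_with_displacement(const uint8_t *base, uint8_t q)
{
    return base[q];
}

/* Read the 'value' field through its displacement from the record start. */
static uint8_t sample_value(const struct sample *s)
{
    return load_with_displacement((const uint8_t *)s,
                                  (uint8_t)offsetof(struct sample, value));
}
```

With Y pointing at such a record, reading the `value` field would correspond to LDD Rd,Y+2.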
This mode is similar to Data Indirect with Displacement, except that no displacement is used: the address is taken directly from the X, Y or Z register.
LD - Load Indirect using X, Y or Z
Syntax: LD Rd,X 0≤d≤31
LD Rd,Y
LD Rd,Z
Example: ;XH, YH and ZH (R27, R29, R31) are assumed to be zero
LDI R26,0x20 ;XL = 0x20
LD R2,X ;Load R2 with byte at loc. 0x20
LDI R28,0x40 ;YL = 0x40
LD R3,Y ;Load R3 with byte at loc. 0x40
LDI R30,0x60 ;ZL = 0x60
LD R4,Z ;Load R4 with byte at loc. 0x60
Similar to the Data Indirect mode, except that the X, Y or Z register is decremented before the data is accessed. As with Data Indirect, any of the three pointer registers can be used.
LD - Load Indirect using X, Y or Z with Pre-decrement
Syntax: LD Rd,-X 0≤d≤31
LD Rd,-Y
LD Rd,-Z
Example: ;XH, YH and ZH (R27, R29, R31) are assumed to be zero
LDI R26,0x20
LD R2,-X ;Load R2 with loc. 0x1F
LDI R28,0x40
LD R3,-Y ;Load R3 with loc. 0x3F
LDI R30,0x60
LD R4,-Z ;Load R4 with loc. 0x5F
Similar to the Data Indirect mode, except that the X, Y or Z register is incremented after the data is accessed. As with Data Indirect, any of the three pointer registers can be used.
LD - Load Indirect using X, Y or Z with Post-increment
Syntax: LD Rd,X+ 0≤d≤31
LD Rd,Y+
LD Rd,Z+
Example: ;XH, YH and ZH (R27, R29, R31) are assumed to be zero
LDI R26,0x20
LD R2,X+ ;Load R2 with loc. 0x20
LD R2,X ;Load R2 with loc. 0x21
LDI R28,0x40
LD R3,Y+ ;Load R3 with loc. 0x40
LD R3,Y ;Load R3 with loc. 0x41
LDI R30,0x60
LD R4,Z+ ;Load R4 with loc. 0x60
LD R4,Z ;Load R4 with loc. 0x61
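C programmers already know these two addressing modes as the pointer idioms `*p++` and `*--p`. The sketch below is a host-side model, not AVR code; the function names `load_post_inc` and `load_pre_dec` are hypothetical, and the pointer passed by address plays the role of the X, Y or Z register.

```c
#include <stdint.h>

/* Post-increment, as in LD Rd,X+:
 * fetch the byte the pointer refers to, then advance the pointer. */
static uint8_t load_post_inc(const uint8_t **p)
{
    return *(*p)++;
}

/* Pre-decrement, as in LD Rd,-X:
 * move the pointer back one byte, then fetch the byte it now refers to. */
static uint8_t load_pre_dec(const uint8_t **p)
{
    return *--(*p);
}
```

Post-increment suits walking forward through a buffer, while pre-decrement suits walking backward from one past the end, which is also how a downward-growing stack behaves.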
Although the program and data memories are fairly straightforward to understand
and program, the EEPROM is quite another story. In assembly this is not a trivial
pursuit and is better done in 'C', where library code is provided that handles the reading
and writing of EEPROM.
But for those brave souls who are intent on using assembler to read and write
EEPROM, I have provided code (lifted straight from the data sheet) that performs the minimal
functionality. Refer to the data sheet for your particular device for further information.
Avoid using the lowest EEPROM address; in some instances this lowest address can
be trashed and you will lose your data. Since data is written in the order you declare
your variables, just declare a bogus variable before any other.
;
; The EEPROM_Write routine
;
EEPROM_write:
;Wait for completion of previous write
sbic eecr,eepe
rjmp EEPROM_write
;Set up the address (r18:r17) to address register
out eearh,r18
out eearl,r17
;Write data (r16) to Data register
out eedr,r16
;Write logical one to eempe
sbi eecr,eempe
;Start eeprom write by setting eepe
sbi eecr,eepe
ret
;
; The EEPROM_Read routine
;
EEPROM_Read:
;Wait for the completion of the previous write
sbic eecr,eepe
rjmp EEPROM_Read
;Set up address (r18:r17) in address register
out eearh,r18
out eearl,r17
;Start eeprom read by writing eere
sbi eecr,eere
;Read data from Data register
in r16,eedr
ret
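For comparison, the 'C' route mentioned at the start of this section might look like the sketch below. It assumes the avr-libc header `avr/eeprom.h` with its `EEMEM` attribute and the `eeprom_read_byte` and `eeprom_write_byte` functions. The variable and function names are hypothetical, and the `#else` branch contains stand-ins that exist only so the sketch can be tried on a PC.

```c
#include <stdint.h>

#ifdef __AVR__
#include <avr/eeprom.h>   /* EEMEM, eeprom_read_byte, eeprom_write_byte */
#else
/* Host stand-ins (assumption, for off-target experiments only):
 * "EEPROM" variables become ordinary RAM variables. */
#define EEMEM
static uint8_t eeprom_read_byte(const uint8_t *addr) { return *addr; }
static void eeprom_write_byte(uint8_t *addr, uint8_t value) { *addr = value; }
#endif

/* Bogus variable intended to occupy the lowest EEPROM address.
 * The final placement is decided by the toolchain, so it is worth
 * confirming the addresses in the linker map file. */
uint8_t EEMEM ee_unused_guard;

/* Hypothetical setting that we actually want to keep. */
uint8_t EEMEM ee_brightness;

/* Store the setting in EEPROM. */
static void save_brightness(uint8_t level)
{
    eeprom_write_byte(&ee_brightness, level);
}

/* Read the setting back from EEPROM. */
static uint8_t load_brightness(void)
{
    return eeprom_read_byte(&ee_brightness);
}
```

The library routines take care of waiting for a previous write to finish, which is the part the assembler routines above do by polling EEPE.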
The I/O register space is mapped into regular data memory with an offset of 0x20 on
most devices, meaning that it can be accessed just like any other data memory. This
includes the registers for all peripherals such as Timers, USART, Watchdog Timer,
etc.
When used as general I/O ports, all ports have read-modify-write functionality and
each pin has symmetrical drive capability, able to both sink and source current. In addition, individual
pins may be configured as either input or output, have selectable pull-up resistors
and have protection diodes to both VCC and GND.
Two special instructions (IN and OUT) are provided for working with I/O registers.
Examples of how these instructions are used can be seen in the EEPROM example
code.
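The 0x20 offset is easy to get wrong when switching between IN/OUT, which take an I/O address, and LDS/STS, which take a data memory address. The host-side sketch below spells out the arithmetic. The function names are hypothetical, and the offset is assumed to apply to your device, so check the register summary in the data sheet.

```c
#include <stdint.h>

/* The 32 general purpose registers occupy data addresses 0x00-0x1F,
 * so the I/O space starts at data address 0x20 (assumed for the device). */
#define IO_DATA_OFFSET 0x20u

/* Convert an I/O address (as used by IN/OUT) to the data memory
 * address of the same register (as used by LDS/STS). */
static uint16_t io_to_data_address(uint8_t io_addr)
{
    return (uint16_t)(io_addr + IO_DATA_OFFSET);
}

/* IN and OUT can only encode I/O addresses 0x00-0x3F. */
static int reachable_with_in_out(uint8_t io_addr)
{
    return io_addr <= 0x3F;
}

/* SBI, CBI, SBIC and SBIS can only encode I/O addresses 0x00-0x1F. */
static int bit_addressable(uint8_t io_addr)
{
    return io_addr <= 0x1F;
}
```

For example, a register at I/O address 0x1F sits at data address 0x3F and is still reachable with SBIC and SBI, which is what the EEPROM routines rely on when they test and set bits in EECR.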
As a general rule, registers used in conjunction with 'C' code follow the
guidelines listed in the following table. We will be taking a look at these registers
when we start mixing languages, as they play a very important part in the integration.
r0 | Temporary register - use in interrupts not recommended.
r1 | Zero register - can be used for temporary data but must be zeroed after use.
r18-r27, r30-r31 | General purpose registers that don't need to be saved when used in conjunction with 'C' code.
r2-r17, r28-r29 | General purpose registers that do need to be saved when used in conjunction with 'C' code.
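The table can be turned into a small lookup, which is handy when reviewing an assembler routine to decide which registers it has to push and pop. This is a host-side sketch; the enum and function names are hypothetical, and the ranges are taken directly from the table above.

```c
#include <stdint.h>

enum reg_class {
    REG_TEMP,            /* r0: temporary register */
    REG_ZERO,            /* r1: must be zero again before returning to C */
    REG_CALL_CLOBBERED,  /* free to use without saving */
    REG_CALL_SAVED,      /* must be saved and restored if used */
    REG_INVALID          /* not a register number */
};

/* Classify register number r (0-31) according to the table. */
static enum reg_class classify_register(uint8_t r)
{
    if (r > 31) {
        return REG_INVALID;
    }
    if (r == 0) {
        return REG_TEMP;
    }
    if (r == 1) {
        return REG_ZERO;
    }
    if ((r >= 18 && r <= 27) || r >= 30) {
        return REG_CALL_CLOBBERED;   /* r18-r27, r30-r31 */
    }
    return REG_CALL_SAVED;           /* r2-r17, r28-r29 */
}
```

Note that the Y pointer (r28-r29) is in the saved group, while X (r26-r27) and Z (r30-r31) are not.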
By definition a macro is a group of instructions that you code once and are able
to use as many times as necessary. The main difference between a macro and a subroutine
is that the macro is expanded at the place where it is used. A macro can take up
to 10 parameters, referred to as @0-@9 and given as a comma-delimited list.
;PUSH_REGS macro
;Example macro that accepts 2 parameters that define the
;registers that are to be pushed onto the stack.
.macro PUSH_REGS
push @0
push @1
.endmacro
;
;Then to use the PUSH_REGS macro
label:
ldi R18,0x00
ldi R17,0x02
PUSH_REGS R18,R17
.
.
;
;And in reality what you end up with is
label:
ldi R18,0x00
ldi R17,0x02
push R18 ;macro code
push R17 ;macro code
.
.
;
Macros are generally made up of code that gets executed on a routine basis and are
kept in libraries so that they may be included where and as needed.
The gcc 'C' compiler uses registers in a very consistent manner to pass parameters
to and return values from subroutines. If we observe a few simple rules when mixing
languages such as 'C' and assembler, the integration of the two languages is fairly
straightforward. Only 'C' is referenced in this tutorial, but I would imagine that
many high level languages that use the gcc compiler can be handled in a similar
manner.
When passing parameters to a subroutine, registers r25 through r8 are used, in that
order. If there are more parameters than registers, the remainder are passed on
the stack, which is not recommended because of the substantial hit to resources.
As an additional note, register pairs are used regardless of the size of the parameter
being passed. This concept and others will be discussed further in the next two
sections. Values returned from a subroutine follow the guidelines shown in the
following table.
R24 | 8 bit values
R24-R25 | 16 bit values
R22-R25 | 32 bit values
R18-R25 | 64 bit values
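The parameter passing rule described above (start at r25, work downward to r8, always consume whole register pairs) can be modelled in a few lines. This is a host-side sketch of that rule only; it does not model variable argument lists or other special cases, and the names `assign_arg_registers` and `ARG_ON_STACK` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define ARG_ON_STACK (-1)

/* For each argument size (in bytes) compute the lowest numbered register
 * it occupies, which holds its least significant byte, or ARG_ON_STACK
 * when the registers r25..r8 are exhausted. */
static void assign_arg_registers(const size_t *sizes, size_t count, int *low_reg)
{
    int next = 26;      /* one above r25, the first register used */
    int on_stack = 0;   /* once one argument spills, the rest follow */
    size_t i;

    for (i = 0; i < count; i++) {
        /* Round the size up to a whole number of register pairs. */
        int slots = (int)((sizes[i] + 1u) & ~(size_t)1);

        if (!on_stack && next - slots >= 8) {
            next -= slots;
            low_reg[i] = next;
        } else {
            on_stack = 1;
            low_reg[i] = ARG_ON_STACK;
        }
    }
}
```

Two 16 bit parameters come out as r24 and r22 (that is, R25:R24 and R23:R22), and two 8 bit parameters also come out as r24 and r22, with r25 and r23 left unused.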
By now you should have a pretty good idea of what to expect so I will demonstrate
calling an assembly subroutine from 'C' by providing a couple of examples. Each
example will have the 'C' code, followed by the resulting disassembled code and
finally the assembler subroutine.
In the first example the assembler subroutine adds two 16 bit numbers passed as
parameters iParam1 (R25:R24) and iParam2 (R23:R22) and returns the result (R25:R24)
to the main 'C' routine.
int AsmSubroutine(int iParam1, int iParam2);
int main()
{
int iRetVal = 0;
iRetVal = AsmSubroutine(1024, 16);
}
Resulting disassembled code
iRetVal = AsmSubroutine(1024, 16);
318: 80 e0 ldi r24, 0x00 ; 0
31a: 94 e0 ldi r25, 0x04 ; 4
31c: 60 e1 ldi r22, 0x10 ; 16
31e: 70 e0 ldi r23, 0x00 ; 0
320: 0e 94 ae 01 call 0x35c ; 0x35c <AsmSubroutine>
324: 90 93 01 06 sts 0x0601, r25
328: 80 93 00 06 sts 0x0600, r24
Assembler subroutine code
.section .text
;The global directive declares AsmSubroutine as global for linker.
;The AsmSubroutine label must follow the global directive.
.global AsmSubroutine
AsmSubroutine:
add R24,R22 ;Add the low bytes first, this sets the carry flag
adc R25,R23 ;Then add the high bytes plus the carry
ret
.end
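On an 8 bit core a 16 bit addition has to be done a byte at a time: ADD on the low bytes, then ADC on the high bytes so that the carry from the low byte is included. The host-side sketch below models those two steps; the function name `add16_bytewise` is hypothetical.

```c
#include <stdint.h>

/* Add two 16 bit values the way an 8 bit core does it:
 * low bytes first (ADD), then high bytes plus carry (ADC). */
static uint16_t add16_bytewise(uint16_t a, uint16_t b)
{
    uint8_t a_lo = (uint8_t)a;
    uint8_t a_hi = (uint8_t)(a >> 8);
    uint8_t b_lo = (uint8_t)b;
    uint8_t b_hi = (uint8_t)(b >> 8);

    uint16_t lo_sum = (uint16_t)(a_lo + b_lo);     /* ADD: low bytes */
    uint8_t carry = (uint8_t)(lo_sum >> 8);        /* carry flag, 0 or 1 */
    uint8_t hi = (uint8_t)(a_hi + b_hi + carry);   /* ADC: high bytes + carry */

    return (uint16_t)(((uint16_t)hi << 8) | (uint8_t)lo_sum);
}
```

The values used in the example, 1024 and 16, never produce a carry, so they give the right answer even if the bytes are added in the wrong order. A pair such as 0x00FF and 0x0001 is a better test because the result, 0x0100, depends on the carry.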
In the second example the assembler subroutine adds two 8 bit numbers passed as
parameters iParam1 (R24) and iParam2 (R22) and returns the result (R24) to the main
'C' routine.
unsigned char AsmSubroutine(unsigned char, unsigned char);
int main()
{
unsigned char ucRetVal = 0;
ucRetVal = AsmSubroutine(32, 16);
}
Resulting disassembled code
ucRetVal = AsmSubroutine(32, 16);
318: 80 e2 ldi r24, 0x20 ; 32
31a: 60 e1 ldi r22, 0x10 ; 16
31c: 0e 94 aa 01 call 0x354 ; 0x354 <AsmSubroutine>
320: 80 93 00 06 sts 0x0600, r24
Assembler subroutine code
.section .text
;The global directive declares AsmSubroutine as global for linker.
;The AsmSubroutine label must follow the global directive.
.global AsmSubroutine
AsmSubroutine:
add R24,R22
ret
.end
As can be seen from the two examples, each parameter passed in uses a register pair,
so in the second example, even though we are passing two 8 bit values,
the compiler puts each 8 bit value in the lower register of its pair.
When calling a 'C' subroutine from assembler the same rules and registers apply:
load the parameters into the parameter registers (R25 down to R8) and expect the result in the corresponding
return registers. To illustrate this concept we will add two 16 bit numbers as we did in
the first example above, but this time the assembler subroutine called from C will
simply call a C routine that adds the two numbers and returns the result.
As you will see, the same results are obtained.
int AsmSubroutine(int, int);
int AddCSubroutine(int, int);
int main()
{
int iRetVal = 0;
iRetVal = AsmSubroutine(1024, 16);
}
int AddCSubroutine(int p1, int p2)
{
return p1 + p2;
}
If you compare the disassembled code below with the first example above you will
notice that, apart from the addresses, they are identical.
iRetVal = AsmSubroutine(1024, 16);
320: 80 e0 ldi r24, 0x00 ; 0
322: 94 e0 ldi r25, 0x04 ; 4
324: 60 e1 ldi r22, 0x10 ; 16
326: 70 e0 ldi r23, 0x00 ; 0
328: 0e 94 b2 01 call 0x364 ; 0x364 <AsmSubroutine>
32c: 90 93 01 06 sts 0x0601, r25
330: 80 93 00 06 sts 0x0600, r24
The assembler subroutine merely calls the 'C' subroutine demonstrating that the same
registers are used throughout the process.
.section .text
.global AsmSubroutine
AsmSubroutine:
call AddCSubroutine
ret
.end
This simple but complete assembler program demonstrates the basic components needed
for an assembler application. The application reads data from program memory and
writes it in reverse order into data memory, demonstrating how the program and data
memories are accessed. The example is well commented so no further explanation is
provided.
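.NOLIST
.include "m1280def.inc"
.LIST
;Macro to point the stack at the top of RAM
.macro SET_STACK
    ldi r16, LOW(RAMEND)
    out spl, r16
    ldi r16, HIGH(RAMEND)
    out sph, r16
.endmacro
.dseg
msgd:
    .byte 0x20              ;Reserve 32 bytes for the reversed string
.cseg
.org 0
    rjmp start              ;Reset vector
.org 0x20
start:
    SET_STACK               ;Invoke our macro to set stack ptr
    ldi ZH,high(msg<<1)     ;Set Z pointer to message
    ldi ZL,low(msg<<1)
    rcall get_length        ;call subroutine to get length in r17
    ldi XH,high(msgd)       ;Set X pointer to destination in
    ldi XL,low(msgd)        ; data memory.
    clr r24
    add XL,r17              ;Add count to X pointer,
    adc XH,r24              ; carrying into the high byte.
    st X,r24                ;Terminate the reversed string with 0
loop:
    lpm r24,Z+              ;Read next character from program memory
    st -X,r24               ;Step X back one byte, then store
    dec r17                 ;One less character to copy
    brne loop
done:
    rjmp done               ;Nothing to return to, so idle here
;
;get_length - count the characters before the terminating 0
;In:  Z = byte address of the string in program memory
;Out: r17 = length, Z preserved, r24 clobbered
get_length:
    push ZH
    push ZL
    ldi r17,0
loop1:
    lpm r24,Z+
    cpi r24,0
    breq exit
    inc r17
    rjmp loop1
exit:
    pop ZL
    pop ZH
    ret
msg:
    .db "String to be reversed",0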
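As a cross-check on the logic, here is a host-side C sketch of the same idea: measure the string, then read the source forward while writing the destination backward. It is not a line-for-line translation of the assembler; the function name `reverse_copy` is hypothetical, and the caller is assumed to supply a destination buffer of at least the string length plus one byte.

```c
#include <stddef.h>
#include <string.h>

/* Copy src into dst in reverse order and NUL-terminate the result.
 * dst must have room for strlen(src) + 1 bytes. */
static void reverse_copy(char *dst, const char *src)
{
    size_t len = strlen(src);   /* the job done by get_length */
    char *p = dst + len;        /* one past the last destination byte */

    *p = '\0';                  /* terminate the reversed string */
    while (*src != '\0') {
        *--p = *src++;          /* read forward, write backward */
    }
}
```

Writing the routine in C first and then comparing it with the compiler's list file is a quick way to check a hand-written loop like the one above.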
In this article I have attempted to touch on the important aspects of the
AVR assembler language, but it is such a broad subject that it would be impossible
to cover all of it in one sitting.
The best way to learn assembler is to go through code and see what others have done, or to write a segment of code in C and then view the assembler listing in the list file. But the bottom line is you
have to get your hands dirty.