Writing a boot loader in Assembly and C - Part 1

20 Apr 2015 · CPOL
How to boot a floppy image with your own hand written code in C and Assembly

Introduction

I consider this article an introduction to writing a boot loader in C and Assembly, so I do not compare the performance of boot code written in C against boot code written in Assembly. In this article, I will only brief you on how to boot a floppy image by writing your own code (a boot loader program) and injecting it into the boot sector of a device. Along the way, I will break the article down into several sections. It is hard to explain computers, bootable devices, and how to write code all in a single article, so I have done my best to explain the most common aspects of computers and what is meant by booting. I have tried to generalize the meaning and importance of each stage so that it is easy to understand and remember. If you need a more detailed explanation, you can browse the many articles available on the internet.

What is the scope of the article?

I will limit the scope of the article to how to write program code, how to copy it to the boot sector of a floppy disk image, and how to test whether the floppy disk image boots with your code, using an x86 emulator such as bochs on Linux.

What I have not explained in the article

I have not explained whether a boot loader can be written in languages other than Assembly and C, or the drawbacks of writing a boot loader in one language compared to another. As this is an introductory article on how to write boot code, I do not want to bother you with more advanced topics like speed, writing smaller code, etc.

How the article is organized

I want to introduce some of the basics first and then follow them with code. Here is the breakdown of the contents, in the order in which I walk you through writing a boot loader.

  • Introduction to Bootable Devices.
  • Introduction to the development environment.
  • Introduction to microprocessor.
  • Writing code in an Assembler.
  • Writing code in a Compiler.
  • A mini-project to display rectangles.

Note:

  • This article helps most if you have prior programming experience in any language. Although it is fairly introductory, writing programs in Assembly and C that run at boot time can be a daunting task. If you are new to computer programming, I suggest you read a few tutorials on programming and computer fundamentals and then come back to this article.
  • Throughout this article, I introduce various computer-related terms by way of questions and answers. To be frank, I write this article as if I were explaining it to myself, so I have put in many question-and-answer conversations to make sure I understand the importance and purpose of each term. Example: What do you mean by computers? or Why do I need them when I am much smarter than they are?

So, let us begin then....:)

Introduction to Bootable Devices

What happens when a typical computer is powered on?

Normally, when a computer is turned on, the power button signals the power supply to send the proper voltage to the computer and its components, such as the CPU, monitor, keyboard, and mouse. The CPU then starts executing a special program stored in the Basic Input Output System (BIOS) read-only memory chip. This program is called the BIOS, and its functionality is described below.

  • The BIOS is a special program that is embedded in the BIOS chip.
  • When the BIOS program is executed, it performs the following tasks.
  • Runs the Power-On Self-Test (POST).
  • Checks the clocks and the various buses available.
  • Checks the system clock and hardware information in CMOS RAM.
  • Verifies the pre-configured system settings, hardware settings, etc.
  • Tests the attached hardware, starting with devices like RAM, disk drives, optical drives, hard drives and so on.
  • Depending on the boot device order configured in the BIOS settings, it searches for a boot drive and starts initializing it to proceed further.

Note: All x86 compatible CPUs start in an operating mode called Real Mode during booting.

What is a bootable device?

A device is bootable if it contains a boot sector (or boot block). The BIOS reads that device by first loading the boot sector into memory (RAM) for execution and then proceeds further.

What is a sector?

A sector is a fixed-size division of a disk. A sector is usually 512 bytes in size. I will explain more about how computer memory is measured, and the terminology associated with it, in the coming sections.

What is a boot sector?

A boot sector or a boot block is a region on a bootable device that contains machine code to be loaded into RAM by a computer system’s built-in firmware during its initialization. It is 512 bytes long on a floppy disk. You will learn more about bytes in the coming sections.

How does a bootable device work?

Whenever a bootable device is initialized, the BIOS finds the first sector, known as the boot sector or boot block, loads it into RAM, and starts executing it. The code that resides in the boot sector is the first program you can edit to define what the computer does from then on. What I mean here is that you can write your own code and copy it to the boot sector to make the computer work according to your requirements. The program code that you write to the boot sector of a device is also called a boot loader.

What is a Boot Loader?

In computing, a boot loader is a special program that is executed each time a bootable device is initialized by the computer during power on or reset. It is executable machine code, which is very specific to the hardware architecture of the CPU or microprocessor.

How many types of microprocessor are available?

I will list the main ones.

  • 16 bit
  • 32 bit
  • 64 bit

Normally, the more bits a microprocessor has, the more memory space programs can access and the more performance they gain in terms of temporary storage, etc. There are two major manufacturers of microprocessors in business today: Intel and AMD. Throughout the rest of this article, I will refer only to the Intel-based (x86) family of microprocessors.

What is the difference between Intel based microprocessors and AMD based microprocessors?

Each company has its own way of designing microprocessors in terms of hardware. AMD's x86 processors run the same x86 instruction set as Intel's, although each vendor has added extensions of its own over time.

Introduction to the development environment.

What is Real Mode?

Earlier, in the section “What happens when a typical computer is powered on?”, I mentioned that all x86 CPUs start in Real Mode while booting from a device. It is very important to keep this in mind while writing boot code for any device. Real Mode works with 16-bit instructions by default, so the code you load into the boot record or boot sector of a device should be compiled to 16-bit compatible code. In Real Mode, an instruction works with a maximum of 16 bits at once. For example, a 16-bit CPU has an instruction that adds two 16-bit numbers in a single step; if a program needs to add two 32-bit numbers, it takes more steps, each of which uses 16-bit addition.
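To make the 32-bit addition example concrete, here is a small sketch in C of how two 32-bit numbers can be added using only 16-bit additions: add the low halves first, then add the high halves together with the carry. The function name add32_using_16bit_steps is made up for illustration; on a real 16-bit CPU the two steps would be an add instruction followed by an add-with-carry instruction.

```c
#include <stdint.h>

/* Adds two 32-bit values using only 16-bit additions.
   Hypothetical helper, written for illustration only. */
uint32_t add32_using_16bit_steps(uint32_t a, uint32_t b)
{
    uint16_t a_lo = (uint16_t)(a & 0xFFFFu);
    uint16_t a_hi = (uint16_t)(a >> 16);
    uint16_t b_lo = (uint16_t)(b & 0xFFFFu);
    uint16_t b_hi = (uint16_t)(b >> 16);

    /* Step 1: add the low halves; the result wraps around at 16 bits. */
    uint16_t sum_lo = (uint16_t)(a_lo + b_lo);

    /* If the low sum wrapped around, it is smaller than an input: that is the carry. */
    uint16_t carry = (uint16_t)(sum_lo < a_lo);

    /* Step 2: add the high halves plus the carry from step 1. */
    uint16_t sum_hi = (uint16_t)(a_hi + b_hi + carry);

    return ((uint32_t)sum_hi << 16) | sum_lo;
}
```

So one 32-bit addition costs two 16-bit additions plus the bookkeeping for the carry, which is why it takes more steps on a 16-bit CPU.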

What is an instruction set?

An instruction set is a heterogeneous collection of entities, very specific to the architecture (design) of a microprocessor, that a user can use to interact with that microprocessor. By a collection of entities I mean the native data types, instructions, registers, addressing modes, memory architecture, interrupt and exception handling, and external I/O. Usually a group of instructions is made common to a whole family of microprocessors. The 8086 microprocessor is a member of the family that includes the 8086, 80286, 80386, 80486, Pentium, Pentium II, Pentium III and so on, also referred to as the x86 family. Throughout this article, when I mention the instruction set, I mean that of the x86 family of microprocessors.

How to write your own code to boot sector of a device?

To accomplish this task, we need to know about the following.

  • Operating system (GNU Linux)
  • Assembler (GNU Assembler)
  • Instruction set(x86 family)
  • Writing x86 Instructions on GNU Assembler for x86 Microprocessor.
  • Compiler (C programming language - optional)
  • Linker (GNU linker ld)
  • An x86 emulator like bochs used for our testing purposes.

What is an Operating System?

I will explain this in a very simple way. An operating system is a big collection of programs, written by hundreds or thousands of professionals, that includes applications and utilities to help people across the globe. Apart from the technical point of view, an operating system is mainly written to provide applications that help people in their daily activities, like connecting to the internet, chatting, browsing the web, creating and saving files, processing data, and a lot more. Still not clear? What I mean is that you may want to chat with your friends, watch the news online, write some personal information to a file, watch movies, solve mathematical equations, play games, write programs and more. All these tasks can be achieved by means of an operating system. The job of an operating system is to provide enough tools to help and serve you. You may want to do some of these activities at the same time, and it is the job of the operating system to manage the hardware and give you the best experience it can.

Also, please make a note that, all modern operating systems operate in protected mode.

What are the different types of Operating System?

  • Windows
  • Linux
  • Mac OS

And more…

What is protected mode?

Unlike Real Mode, protected mode supports 32-bit instructions. Do not worry much about it now, as we are not concerned with how an operating system works.

What is an Assembler?

An assembler converts the instructions given by a user into machine code.

Even a compiler does the same...doesn't it?

At a higher level, yes...but it is actually the assembler embedded inside the compiler that does this job.

Then why can't a compiler generate machine code directly?

The primary job of a compiler is to convert the instructions written by a user into an intermediate set of instructions called assembly language instructions. The assembler then consumes these instructions and converts them into the corresponding machine code.

Why do I need an operating system to write a code for boot sector?

Right now, I do not want to get into a very detailed explanation, so let me explain within the scope of this article. Earlier I mentioned that in order to write instructions that a microprocessor can understand, we need a compiler, and this compiler is provided as a utility on an operating system. As I told you, operating systems are designed to help people by providing various utilities, and compilers are one of those utilities.

Which Operating System may I use?

I wrote the programs that boot from a floppy device on the Ubuntu operating system, so I recommend Ubuntu for this article.

Which compiler should I use?

I have written the programs using the GNU GCC compiler, and I will show how to compile the code with it.

How do I test hand-written code in the boot sector of a device?

I will introduce you to an x86 emulator, which helps a great deal because we do not have to restart the computer each time we edit the boot sector of the device.

Introduction to microprocessor

In order to learn to program a microprocessor, we first need to learn how to use registers.

What are registers?

Registers are like utilities of a microprocessor that store data temporarily so that we can manipulate it as required. Suppose the user wants to add 3 and 2: the user asks the computer to store the number 3 in one register and the number 2 in another register, and then to add the contents of these two registers. The CPU places the result in another register, which is the output we want to see. There are four types of registers, listed below.

  • General purpose registers
  • Segment registers
  • Stack registers
  • Index registers

Let me brief you about each of the types.

General purpose registers: These are used to store temporary data required by the program during its lifecycle. Each of these registers is 16 bits wide, or 2 bytes long.

  • AX - the accumulator register
  • BX - the base address register
  • CX - the count register
  • DX - the data register

Segment Registers: To represent a memory address to a microprocessor, there are two terms we need to be aware of:

  • Segment: It identifies the beginning of a block of memory. The block starts at the physical address segment * 16.
  • Offset: It is the index into that block of memory.

Example: Suppose there is a byte whose value is 'X' in a block of memory whose start address is 0x7c00, and the byte is located at the 10th position from the beginning. In this situation, we represent the segment as 0x07c0 (the start address divided by 16) and the offset as 10.
The absolute address is 0x07c0 * 16 + 10 = 0x7c00 + 10 = 0x7c0a.

There are four segment registers that I want to list.

  • CS - code segment
  • SS - stack segment
  • DS - data segment
  • ES - extra segment

But there is a limitation with these registers: you cannot directly assign a constant address to them. What we can do is copy the address to a general purpose register and then copy it from that register to the segment register. Example: To solve the problem of locating byte 'X', we do the following.

ASM
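movw $0x07c0, %ax     #load the segment value 0x07c0 into AX
movw %ax    , %ds     #copy AX into DS (segment base = 0x07c0 * 16 = 0x7c00)
movw (0x0A) , %ax     #read the word at DS:0x0A (address 0x7c0a) into AX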

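In our case, what happens is

  • set AX = 0x07c0
  • set DS = AX = 0x07c0, so that the data segment starts at 0x07c0 * 16 = 0x7c00
  • load into AX the word stored at offset 0x0a of that segment, i.e. at address 0x7c00 + 0x0a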
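The segment and offset arithmetic can be sketched in C as follows. This is only an illustration of the Real Mode rule that the physical address is the segment multiplied by 16 plus the offset; the function name real_mode_physical_address is my own and does not come from any library.

```c
#include <stdint.h>

/* Real Mode address translation: physical = segment * 16 + offset.
   Hypothetical helper, written for illustration only. */
uint32_t real_mode_physical_address(uint16_t segment, uint16_t offset)
{
    /* Shifting left by 4 bits is the same as multiplying by 16. */
    return ((uint32_t)segment << 4) + offset;
}
```

With segment 0x07c0 and offset 0x0a this gives 0x7c0a, the address of the byte 'X' in the example. Note that the same physical address can be written with different segment and offset pairs, for example 0x0000:0x7c0a.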

I will describe the various addressing modes that we need to understand while writing programs.

Stack Registers:

  • BP - base pointer
  • SP - stack pointer

Index Registers:

  • SI - source index register.
  • DI - destination index register.

Here is a brief summary of what each register is used for:

  • AX: CPU uses it for arithmetic operations.
  • BX: It can hold the address of a procedure or variable (SI, DI, and BP can also). And also perform arithmetic and data movement.
  • CX: It acts as a counter for repeating or looping instructions.
  • DX: It holds the high 16 bits of the product in multiply (also handles divide operations).
  • CS: It holds base location for all executable instructions in a program.
  • SS: It holds the base location of the stack.
  • DS: It holds the default base location for variables.
  • ES: It holds additional base location for memory variables.
  • BP: It contains an assumed offset from the SS register. Often used by a subroutine to locate variables that were passed on the stack by a calling program.
  • SP: Contains the offset of the top of the stack.
  • SI: Used in string movement instructions. The source string is pointed to by the SI register.
  • DI: Acts as the destination for string movement instructions.

What is a bit?

In computing, a bit is the smallest unit in which data can be stored. A bit stores data in binary form: either 1 (On) or 0 (Off).

More about registers:

The general purpose registers are further divided into two 8-bit halves:

  • AX: The low 8 bits of AX are identified as AL and the high 8 bits as AH
  • BX: The low 8 bits of BX are identified as BL and the high 8 bits as BH
  • CX: The low 8 bits of CX are identified as CL and the high 8 bits as CH
  • DX: The low 8 bits of DX are identified as DL and the high 8 bits as DH
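As a rough illustration, the relationship between a 16-bit register and its two 8-bit halves can be modelled in C with shifts and masks. The function names below are hypothetical; they only mimic how AH and AL relate to AX.

```c
#include <stdint.h>

/* Low 8 bits of a 16-bit register value (the AL part of AX). */
uint8_t low_byte(uint16_t reg)
{
    return (uint8_t)(reg & 0xFFu);
}

/* High 8 bits of a 16-bit register value (the AH part of AX). */
uint8_t high_byte(uint16_t reg)
{
    return (uint8_t)(reg >> 8);
}

/* Rebuild the 16-bit value from its two halves. */
uint16_t join_bytes(uint8_t high, uint8_t low)
{
    return (uint16_t)(((uint16_t)high << 8) | low);
}
```

For example, placing 0x0e in AH and the character 'X' (0x58) in AL is the same as placing 0x0e58 in AX.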

How to access BIOS functions?

The BIOS provides a set of functions that we can call from our own code. We access these BIOS features through interrupts.

What are interrupts?

We use interrupts to interrupt the ordinary flow of a program and process events that require a prompt response. The hardware of a computer provides a mechanism called interrupts to handle events. For example, when a mouse is moved, the mouse hardware interrupts the current program to handle the mouse movement (to move the mouse cursor, etc.). Interrupts cause control to be passed to an interrupt handler. Interrupt handlers are routines that process the interrupt. Each type of interrupt is assigned an integer number. At the beginning of physical memory resides a table of interrupt vectors that contains the segmented addresses of the interrupt handlers. The interrupt number is essentially an index into this table. We can also think of an interrupt as a service offered by the BIOS.
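Here is a small sketch in C of how the interrupt number indexes the table of interrupt vectors. In Real Mode each entry in the table is 4 bytes long, a 16-bit offset followed by a 16-bit segment, and the table starts at physical address 0. The type and function names are made up for illustration, and the handler address used in the usage note is an arbitrary example value, not a real BIOS address.

```c
#include <stdint.h>

/* One entry of the Real Mode interrupt vector table (hypothetical type name). */
typedef struct {
    uint16_t offset;   /* stored first in memory  */
    uint16_t segment;  /* stored second in memory */
} ivt_entry;

/* Physical address of the table entry for a given interrupt number. */
uint32_t ivt_entry_address(uint8_t interrupt_number)
{
    return (uint32_t)interrupt_number * 4u;
}

/* Physical address of the handler that an entry points to. */
uint32_t handler_physical_address(ivt_entry entry)
{
    return ((uint32_t)entry.segment << 4) + entry.offset;
}
```

For interrupt 0x10 the entry sits at address 0x40. If that entry held, say, segment 0xF000 and offset 0x1234, the handler would start at physical address 0xF1234.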

Which interrupt service are we going to use in our programs?

BIOS interrupt 0x10, which provides the video services.

Writing code in an Assembler

What are the various Data types available in GNU Assembler?

Each data type is a group of bits that is treated as one unit.

What is a data type?

A data type identifies the characteristics of a piece of data. The various data types are listed below.

  • byte
  • word
  • int
  • ascii
  • asciz

 

byte: It is eight bits long. A byte is the smallest unit of data that a program can address and store on a computer.

word: It is a unit of data that is 16 bits long.

What is an int?

An int is a data type that is 32 bits long. Four bytes, or two words, make up an int.

What is an ascii?

A data type that represents a group of bytes without a null terminator.

What is an asciz?

A data type that represents a group of bytes terminated with a null character.
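The sizes of these data types can be compared with their closest C counterparts. The sketch below is only an analogy: the typedef names gas_byte, gas_word and gas_int and the two arrays are made up for illustration. An array initialized character by character behaves like .ascii (no terminator), while a C string literal behaves like .asciz (one extra zero byte at the end).

```c
#include <stddef.h>
#include <stdint.h>

/* C types with the same sizes as .byte, .word and .int (1, 2 and 4 bytes). */
typedef uint8_t  gas_byte;
typedef uint16_t gas_word;
typedef uint32_t gas_int;

/* Like .ascii "Hello": exactly five bytes, no null terminator. */
static const char ascii_like[5] = { 'H', 'e', 'l', 'l', 'o' };

/* Like .asciz "Hello": five characters plus a terminating zero byte. */
static const char asciz_like[] = "Hello";

size_t ascii_like_size(void) { return sizeof ascii_like; }
size_t asciz_like_size(void) { return sizeof asciz_like; }
```

The one-byte difference matters later on, because a routine that prints a string until it finds a zero byte only works with the null-terminated form.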

How do I generate code for real mode through an assembler?

When the CPU starts in Real Mode (16-bit), all we can do while booting from a device is use the built-in functions provided by the BIOS to proceed further. What I mean here is that we can use the BIOS functions to write our own boot loader code, dump it onto the boot sector of the device, and then boot from it. Let us see how to write a small piece of assembly code that the GNU Assembler turns into 16-bit CPU code.

Example: test.S
ASM
.code16                   #generate 16-bit code
.text                     #executable code location
     .globl _start;
_start:                   #code entry point
     . = _start + 510     #mov to 510th byte from 0 pos
     .byte 0x55           #append boot signature
     .byte 0xaa           #append boot signature

Let me explain each statement in the code above.

  • .code16: It is a directive, or command, that tells the assembler to generate 16-bit code rather than 32-bit code. Why is this hint necessary? Remember that you are using an operating system to run the assembler and compiler that build the boot loader code. However, I have also mentioned that an operating system works in 32-bit protected mode. So when you use an assembler on a protected mode operating system, it is configured by default to produce 32-bit code rather than 16-bit code, which does not serve our purpose. To stop the assembler and compiler from generating 32-bit code, we use this directive.
  • .text: The .text section contains the actual machine instructions that make up your program.
  • .globl _start: .globl <symbol> makes the symbol visible to the linker. If you define the symbol in your partial program, its value is made available to other partial programs that are linked with it. Otherwise, the symbol takes its attributes from a symbol of the same name in another file linked into the same program.
  • _start: The entry to the main code; _start is the default entry point for the linker.
  • . = _start + 510: Moves the location counter to offset 510 from _start, so that the next two bytes are the last two bytes of the 512-byte sector.
  • .byte 0x55: It is the first byte of the boot signature (the 511th byte).
  • .byte 0xaa: It is the last byte of the boot signature (the 512th byte).

How to compile an assembly program?

Save the code as test.S. On the command prompt, type the following:

  • as test.S -o test.o
  • ld -Ttext 0x7c00 --oformat=binary test.o -o test.bin

What do the above commands mean?

  • as test.S -o test.o: This command converts the given assembly code into object code, an intermediate file that contains the machine code produced by the assembler before the linker assigns its final addresses.
  • The --oformat=binary switch tells the linker that you want the output file to be a plain binary image (no startup code, no relocations, ...).
  • The -Ttext 0x7c00 switch tells the linker that your "text" (code segment) will be loaded at address 0x7c00, so that it calculates the correct addresses for absolute addressing.

What is a boot signature?

Remember that earlier I was briefing you about the boot record or boot sector loaded by the BIOS program. How does the BIOS recognize whether a device contains a boot sector or not? A boot sector is 512 bytes long; counting from zero, the byte at offset 510 is expected to be 0x55 and the byte at offset 511 is expected to be 0xaa. So the BIOS verifies whether the last two bytes of the boot sector are 0x55 and 0xaa. If they are, it identifies that sector as a boot sector and proceeds to execute the boot sector code; otherwise it reports an error that the device is not bootable. Using a hexadecimal editor such as hexedit, you can view the contents of the binary file in a more readable way.
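The signature check itself is easy to express in C. The sketch below is not the code of any real BIOS; the function name has_boot_signature is made up, and it simply tests the two bytes at offsets 510 and 511 of a 512-byte buffer.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Returns true if the buffer holds at least one 512-byte sector whose
   last two bytes are 0x55 and 0xaa. Hypothetical helper for illustration. */
bool has_boot_signature(const uint8_t *sector, size_t length)
{
    if (sector == NULL || length < 512) {
        return false;
    }
    return sector[510] == 0x55 && sector[511] == 0xaa;
}
```

You could use a function like this in a small tool that reads test.bin and confirms that the signature is in place before you copy it to the floppy image.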

How to copy the executable code to a bootable device and then test it?

To create a floppy disk image of 1.44 MB size, type the following on the command prompt.

  • dd if=/dev/zero of=floppy.img bs=512 count=2880
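The numbers bs=512 and count=2880 are not arbitrary. They follow from the geometry of a standard 3.5-inch 1.44 MB floppy disk (80 cylinders, 2 heads, 18 sectors per track), which I have not spelled out above. A minimal sketch of the arithmetic in C, with made-up constant and function names:

```c
#include <stdint.h>

/* Geometry of a standard 3.5-inch 1.44 MB floppy disk. */
enum {
    BYTES_PER_SECTOR  = 512,
    SECTORS_PER_TRACK = 18,
    HEADS             = 2,
    CYLINDERS         = 80
};

/* Total number of sectors on the disk: the "count" passed to dd. */
uint32_t floppy_total_sectors(void)
{
    return (uint32_t)CYLINDERS * HEADS * SECTORS_PER_TRACK;
}

/* Total size of the disk image in bytes. */
uint32_t floppy_total_bytes(void)
{
    return floppy_total_sectors() * BYTES_PER_SECTOR;
}
```

This gives 2880 sectors and 1,474,560 bytes, which is the size of the floppy.img file that dd creates.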

To copy the code to the boot sector of the floppy disk image file, type the following on the command prompt. The conv=notrunc option keeps dd from truncating the image to the size of test.bin.

  • dd if=test.bin of=floppy.img conv=notrunc

To test the program, type the following on the command prompt.

  • bochs

If bochs is not installed, you can install it with the command below.

  • sudo apt-get install bochs-x

Sample bochsrc.txt file
ASM
megs: 32
#romimage: file=/usr/local/bochs/1.4.1/BIOS-bochs-latest, address=0xf0000
#vgaromimage: /usr/local/bochs/1.4.1/VGABIOS-elpin-2.40
floppya: 1_44=floppy.img, status=inserted
boot: a
log: bochsout.txt
mouse: enabled=0 

You should see a typical emulating window of bochs as below.

Image 1

Observation:  

Now if you view the test.bin file in a hexadecimal editor, you will see that the boot signature is placed right after the first 510 bytes. Here is the screenshot for your reference.

 Image 2

Nothing much happened, as our code does not write anything to the screen. So you only see the message “Booting from Floppy”. Let us see a few more examples of writing assembly code.

Example: test2.S    
ASM
.code16                    #generate 16-bit code
.text                      #executable code location
     .globl _start;
_start:                    #code entry point

     movb $'X' , %al       #character to print
     movb $0x0e, %ah       #bios service code to print
     int  $0x10            #interrupt the cpu now

     . = _start + 510      #mov to 510th byte from 0 pos
     .byte 0x55            #append boot signature
     .byte 0xaa            #append boot signature 

After typing the above, save it as test2.S and then compile it as instructed before, changing the source file name. Once you have compiled the code and copied it to the boot sector, type bochs on the command prompt. You should see the letter ‘X’ on the screen, as shown in the screenshot below.

Image 3

Congrats!!!

Observation:

If viewed in a hexadecimal editor, you will see that the character 'X' is in the second position from the start address.

Image 4
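To see why 'X' lands in the second position, here is a small sketch in C that lists the machine code bytes that the three instructions of test2.S assemble to. The array name test2_code and the helper functions are made up for illustration; the byte values are the standard x86 encodings of these instructions (0xB0 and 0xB4 load an 8-bit constant into AL and AH, and 0xCD is the int instruction).

```c
#include <stddef.h>
#include <stdint.h>

/* The first six bytes of the boot sector built from test2.S. */
static const uint8_t test2_code[] = {
    0xB0, 'X',    /* movb $'X' , %al */
    0xB4, 0x0E,   /* movb $0x0e, %ah */
    0xCD, 0x10    /* int  $0x10      */
};

/* Number of code bytes at the start of the sector. */
size_t test2_code_size(void)
{
    return sizeof test2_code;
}

/* Byte at a given position, counting from 0; the caller keeps index in range. */
uint8_t test2_code_byte(size_t index)
{
    return test2_code[index];
}
```

The first byte is the opcode of the mov instruction and the second byte is its operand, the character 'X' (0x58), which is exactly what the hexadecimal editor shows.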

Now let's do something different, like printing text onto the screen.

Example: test3.S  
ASM
.code16                  #generate 16-bit code
.text                    #executable code location
     .globl _start;

_start:                  #code entry point

     #print letter 'H' onto the screen
     movb $'H' , %al
     movb $0x0e, %ah
     int  $0x10

     #print letter 'e' onto the screen
     movb $'e' , %al
     movb $0x0e, %ah
     int  $0x10

     #print letter 'l' onto the screen
     movb $'l' , %al
     movb $0x0e, %ah
     int  $0x10

     #print letter 'l' onto the screen
     movb $'l' , %al
     movb $0x0e, %ah
     int  $0x10

     #print letter 'o' onto the screen
     movb $'o' , %al
     movb $0x0e, %ah
     int  $0x10

     #print letter ',' onto the screen
     movb $',' , %al
     movb $0x0e, %ah
     int  $0x10

     #print space onto the screen
     movb $' ' , %al
     movb $0x0e, %ah
     int  $0x10

     #print letter 'W' onto the screen
     movb $'W' , %al
     movb $0x0e, %ah
     int  $0x10

     #print letter 'o' onto the screen
     movb $'o' , %al
     movb $0x0e, %ah
     int  $0x10

     #print letter 'r' onto the screen
     movb $'r' , %al
     movb $0x0e, %ah
     int  $0x10

     #print letter 'l' onto the screen
     movb $'l' , %al
     movb $0x0e, %ah
     int  $0x10

     #print letter 'd' onto the screen
     movb $'d' , %al
     movb $0x0e, %ah
     int  $0x10

     . = _start + 510      #mov to 510th byte from 0 pos
     .byte 0x55            #append boot signature
     .byte 0xaa            #append boot signature 

Save it as test3.S. When you compile and successfully copy this code to the boot sector and run bochs you should see the below screen.

Image 5

Observation:

Image 6 

Okay...now let us do something a bit different from the previous programs.

Let us write an assembly program to print the string "Hello, World" onto the screen.

We will also define functions and macros and use them to print the string.

Example: test4.S     
ASM
#generate 16-bit code
.code16

#hint the assembler that here is the executable code located
.text
.globl _start;
#boot code entry
_start:
      jmp _boot                           #jump to boot code
      welcome: .asciz "Hello, World\n\r"  #here we define the string

     .macro mWriteString str              #macro which calls a function to print a string
          leaw  \str, %si
          call .writeStringIn
     .endm

     #function to print the string
     .writeStringIn:
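          lodsb                           #load the byte at DS:SI into AL and advance SI
          orb  %al, %al                   #set the zero flag if AL is 0 (end of string)
          jz   .writeStringOut            #stop at the null terminator
          movb $0x0e, %ah                 #bios service code to print
          int  $0x10                      #interrupt the cpu now
          jmp  .writeStringIn             #continue with the next character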
     .writeStringOut:
     ret

_boot:
     mWriteString welcome

     #move to 510th byte from the start and append boot signature
     . = _start + 510
     .byte 0x55
     .byte 0xaa  

Save it as test4.S. When you compile and successfully copy this code to the boot sector and run bochs you should see the below screen.

Image 7

 

Well!!! If you understood what I have done and were able to write a similar program, then congratulations again!

Observation:

Image 8
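If the loop in .writeStringIn is hard to follow, it may help to see the same logic modelled in C. This is only a sketch that runs on an ordinary operating system, not boot code: the BIOS call is replaced by a callback, and the names write_string, put_char_fn, text_buffer and buffer_put_char are all made up for illustration.

```c
#include <stddef.h>

/* Stand-in for "movb $0x0e, %ah; int $0x10": outputs one character. */
typedef void (*put_char_fn)(char c, void *context);

/* Mirrors .writeStringIn: fetch a byte, stop at the null terminator,
   otherwise output the byte and move on to the next one. */
size_t write_string(const char *str, put_char_fn put_char, void *context)
{
    size_t written = 0;
    while (*str != '\0') {            /* lodsb; orb %al, %al; jz */
        put_char(*str++, context);    /* movb $0x0e, %ah; int $0x10 */
        written++;                    /* jmp .writeStringIn */
    }
    return written;
}

/* Hypothetical output target: collects the characters in a buffer. */
typedef struct {
    char   data[64];
    size_t length;
} text_buffer;

void buffer_put_char(char c, void *context)
{
    text_buffer *buf = context;
    if (buf->length + 1 < sizeof buf->data) {
        buf->data[buf->length++] = c;
        buf->data[buf->length] = '\0';
    }
}
```

Calling write_string with the string "Hello, World\n\r" and buffer_put_char outputs 14 characters, one call per character, just as the assembly routine issues one interrupt 0x10 per character.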

What is a function?

A function is a block of code that has a name and it has a property that it is reusable.

What is a macro?

A macro is a fragment of code, which has been given a name. Whenever the name is used, it is replaced by the contents of the macro.

What is the difference between a macro and a function in terms of syntax?

To call a function we use the below syntax.

  • push <argument>
  • call <function name>

To call a macro we use the below syntax

  • macroname <argument>

As you can see, the syntax for calling and using a macro is very simple compared to that of a function. So I preferred to write a macro and use it in the main code instead of calling the function directly. You can refer to more material online on how to write assembly code with the GNU Assembler.
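The same distinction exists in C, which may make it easier to picture. The sketch below is an analogy only and is not taken from the boot code: square_function is compiled once and reached through a call, while SQUARE_MACRO is replaced by its body at every place where it is used. Both names are made up for illustration.

```c
/* A function: one copy of the code, reached with a call instruction. */
int square_function(int x)
{
    return x * x;
}

/* A macro: the preprocessor pastes the body in wherever the name is used. */
#define SQUARE_MACRO(x) ((x) * (x))
```

Both give the same result for the same input, but the macro makes the program slightly larger each time it is used, because its body is copied again.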

Writing code in a C-Compiler

What is C?

In computing, C is a general-purpose programming language initially developed by Dennis Ritchie between 1969 and 1973 at AT&T Bell Labs.

Why use C?

C is closely tied to the machine it runs on, yet programs written in C are usually small and fast to execute. The language includes low-level features that are normally available only in assembly or machine language. C is a structured programming language.

Why do I need to write code in C?

Well, if you want to write small programs and want them to be really fast, then go for it.

What do I need to write code in C language?

Well, we will be using the GNU C compiler, called gcc, to compile our C code.

How to write programs in C for the GCC compiler?

Let us write a program to see what it looks like.

Example: test.c
C++
__asm__(".code16\n");
__asm__("jmpl $0x0000, $main\n");

void main() {
}

File: test.ld
C++
ENTRY(main);
SECTIONS
{
    . = 0x7C00;
    .text : AT(0x7C00)
    {
        *(.text);
    }
    .sig : AT(0x7DFE)
    {
        SHORT(0xaa55);
    }
} 
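
Two numbers in the linker script deserve a second look: the address 0x7DFE and the value 0xaa55. The sketch below, in C, shows where they come from. The names are made up for illustration, and it assumes a little-endian machine such as the x86, where the low byte of a 16-bit value is stored first.

```c
#include <stdint.h>

enum {
    LOAD_ADDRESS = 0x7C00,  /* where the BIOS loads the boot sector */
    SECTOR_SIZE  = 512
};

/* Address of the two signature bytes: the last two bytes of the sector. */
uint32_t signature_address(void)
{
    return LOAD_ADDRESS + SECTOR_SIZE - 2;
}

/* Stores a 16-bit value the way SHORT() does on x86: low byte first. */
void store_short_little_endian(uint16_t value, uint8_t out[2])
{
    out[0] = (uint8_t)(value & 0xFFu);
    out[1] = (uint8_t)(value >> 8);
}
```

So 0x7C00 + 510 is 0x7DFE, and SHORT(0xaa55) ends up in the file as the byte 0x55 followed by the byte 0xaa, which is the same boot signature that the assembly examples write with two .byte directives.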

How to compile a C program?

On the command prompt, type the following:

  • gcc -c -g -Os -march=i686 -ffreestanding -Wall -Werror test.c -o test.o
  • ld -static -Ttest.ld -nostdlib --nmagic -o test.elf test.o
  • objcopy -O binary test.elf test.bin

What do the above commands mean?

  • gcc -c -g -Os -march=i686 -ffreestanding -Wall -Werror test.c -o test.o:

This command converts the given C code into object code, an intermediate file that the compiler generates before the linker produces the final machine code image.

What does each flag mean?

  • -c: It is used to compile the given source code without linking.
  • -g: Generates debug information to be used by GDB debugger.
  • -Os: Optimizes for code size.
  • -march: Generates code for the specific CPU architecture (in our case i686).
  • -ffreestanding: A freestanding environment is one in which the standard library may not exist, and program startup may not necessarily be at ‘main’.
  • -Wall: Enables the compiler's warning messages. This option should always be used, in order to produce better code.
  • -Werror: Treats warnings as errors.
  • test.c: The input source file name.
  • -o: Specifies the name of the output file.
  • test.o: The output object code file name.

With all the above flags, we generate object code in a way that helps us identify errors and warnings and also produces more efficient code for the target CPU. If you do not specify -march=i686, the compiler generates code for the machine type you have; for portability, it is always better to specify which type of CPU you are targeting.

  • ld -static -Ttest.ld -nostdlib --nmagic -o test.elf test.o:

This is the command that invokes the linker from the command prompt. I explain below what we are doing with the linker.

What does each flag mean?

  • -static: Do not link against shared libraries.
  • -Ttest.ld: Tells the linker to follow the commands in the linker script test.ld.
  • -nostdlib: Tells the linker to search only the library directories given explicitly on the command line, so that no standard C library or startup code is pulled in.
  • --nmagic: Turns off page alignment of sections and disables linking against shared libraries.
  • -o: Specifies the name of the output file.
  • test.elf: The output file name (the executable file format is platform dependent. Windows: PE, Linux: ELF).
  • test.o: The input object code file name.

What is a linker?

It is the final stage of compilation. The linker (ld) takes one or more object files or libraries as input and combines them to produce a single (usually executable) file. In doing so, it resolves references to external symbols, assigns final addresses to procedures/functions and variables, and revises code and data to reflect new addresses (a process called relocation).

Also remember that we have no standard libraries and none of their fancy functions to use in our code.

  • objcopy -O binary test.elf test.bin

This command strips the ELF headers and metadata from test.elf and writes only the raw machine code to test.bin, a flat binary. Note that Linux stores executables in a different format than Windows. Each has its own way of storing files, but the BIOS understands neither of them. We are just developing a small piece of code to boot, and it does not depend on any operating system, so we need the raw bytes without any operating-system-specific file format.
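Before copying the flat binary anywhere, it can help to check it on the host. The sketch below assumes that the linker script from earlier in the article produces exactly one 512-byte sector ending with the BIOS boot signature bytes 0x55 and 0xAA; the function name check_boot_sector and its return codes are hypothetical and only serve this illustration.

```c
#include <stddef.h>

#define SECTOR_SIZE 512

/* returns 0 if buf looks like a bootable sector,                       */
/* -1 if the size is wrong, -2 if the signature is missing              */
int check_boot_sector(const unsigned char* buf, size_t len) {
     if(len != SECTOR_SIZE) {
          return -1;
     }
     /* the BIOS expects 0x55 at offset 510 and 0xAA at offset 511      */
     if(buf[510] != 0x55 || buf[511] != 0xAA) {
          return -2;
     }
     return 0;
}
```

To use it, read test.bin into a buffer and pass the buffer together with the number of bytes read; a result of 0 means the size and signature are as expected.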

Why use assembly statements inside a C program?

In Real Mode, the BIOS functions can be easily accessed through software interrupts, using Assembly language instructions. This has led to the usage of inline assembly in our C code.

How to copy the executable code to a bootable device and then test it?

To create a floppy disk image of 1.44 MB size (2880 sectors of 512 bytes each), type the following on the command prompt.

  • dd if=/dev/zero of=floppy.img bs=512 count=2880

To copy the code to the boot sector of the floppy disk image file, type the following on the command prompt. The conv=notrunc option keeps dd from truncating the image to the size of test.bin.

  • dd if=test.bin of=floppy.img conv=notrunc
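The two copy steps can also be sketched in C, which makes it explicit that only the first sector is overwritten while the image keeps its full size. This is a host-side illustration, not a replacement for dd; the function names create_blank_floppy, install_boot_sector and image_size are hypothetical, and the image is assumed to be a file opened for reading and writing in binary mode.

```c
#include <stdio.h>

#define SECTOR_SIZE    512
#define FLOPPY_SECTORS 2880

/* fill the image with 2880 zeroed sectors                              */
int create_blank_floppy(FILE* img) {
     static const unsigned char zero[SECTOR_SIZE];
     int i;
     rewind(img);
     for(i = 0; i < FLOPPY_SECTORS; ++i) {
          if(fwrite(zero, 1, SECTOR_SIZE, img) != SECTOR_SIZE) return -1;
     }
     return fflush(img) == 0 ? 0 : -1;
}

/* overwrite the start of the first sector, leave the rest in place     */
int install_boot_sector(FILE* img, const unsigned char* code, size_t len) {
     if(len > SECTOR_SIZE) return -1;
     if(fseek(img, 0L, SEEK_SET) != 0) return -1;
     if(fwrite(code, 1, len, img) != len) return -1;
     return fflush(img) == 0 ? 0 : -1;
}

/* size of the image in bytes, or -1 on error                           */
long image_size(FILE* img) {
     if(fseek(img, 0L, SEEK_END) != 0) return -1L;
     return ftell(img);
}
```

After both calls the image should still be 2880 * 512 = 1474560 bytes long, with the boot code sitting at offset 0.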

To test the program, type the following on the command prompt.

  • bochs

You should see a typical bochs emulation window as below.

Image 9

Observation: Nothing else happens, as we did not write anything in our code to display on the screen. So you only see the message “Booting from Floppy”. Congrats!!!

  • We use the __asm__ keyword to embed assembly language statements into a C program. This keyword tells the compiler that what follows is an assembly instruction given by the user.
  • We also use __volatile__ to tell the compiler not to optimize away or move our assembly code, but to leave it as it is.

This way of embedding assembly code inside C code is called inline assembly.

Let us see a few more examples of writing code with the compiler.

Let us write a program that uses inline assembly to print the letter ‘X’ onto the screen.

Example: test2.c

C++
__asm__(".code16\n");
__asm__("jmpl $0x0000, $main\n");

void main() {
     __asm__ __volatile__ ("movb $'X'  , %al\n");
     __asm__ __volatile__ ("movb $0x0e, %ah\n");
     __asm__ __volatile__ ("int $0x10\n");
}

After typing the above, save it as test2.c and compile as instructed before, changing the source file name. Once you have copied the compiled code to the boot sector, type bochs on the command prompt. You should see the letter ‘X’ on the screen, as shown in the screenshot below.

Image 10

Now, let us write a C program to print the letters “Hello, World” onto the screen.

For now we print the string one character at a time, repeating the same three instructions for each character. Functions come in the example after this one.

Example: test3.c

C++
/*generate 16-bit code*/
__asm__(".code16\n");
/*jump boot code entry*/
__asm__("jmpl $0x0000, $main\n");

void main() {
     /*print letter 'H' onto the screen*/
     __asm__ __volatile__("movb $'H' , %al\n");
     __asm__ __volatile__("movb $0x0e, %ah\n");
     __asm__ __volatile__("int  $0x10\n");

     /*print letter 'e' onto the screen*/
     __asm__ __volatile__("movb $'e' , %al\n");
     __asm__ __volatile__("movb $0x0e, %ah\n");
     __asm__ __volatile__("int  $0x10\n");

     /*print letter 'l' onto the screen*/
     __asm__ __volatile__("movb $'l' , %al\n");
     __asm__ __volatile__("movb $0x0e, %ah\n");
     __asm__ __volatile__("int  $0x10\n");

     /*print letter 'l' onto the screen*/
     __asm__ __volatile__("movb $'l' , %al\n");
     __asm__ __volatile__("movb $0x0e, %ah\n");
     __asm__ __volatile__("int  $0x10\n");

     /*print letter 'o' onto the screen*/
     __asm__ __volatile__("movb $'o' , %al\n");
     __asm__ __volatile__("movb $0x0e, %ah\n");
     __asm__ __volatile__("int  $0x10\n");

     /*print letter ',' onto the screen*/
     __asm__ __volatile__("movb $',' , %al\n");
     __asm__ __volatile__("movb $0x0e, %ah\n");
     __asm__ __volatile__("int  $0x10\n");

     /*print letter ' ' onto the screen*/
     __asm__ __volatile__("movb $' ' , %al\n");
     __asm__ __volatile__("movb $0x0e, %ah\n");
     __asm__ __volatile__("int  $0x10\n");

     /*print letter 'W' onto the screen*/
     __asm__ __volatile__("movb $'W' , %al\n");
     __asm__ __volatile__("movb $0x0e, %ah\n");
     __asm__ __volatile__("int  $0x10\n");

     /*print letter 'o' onto the screen*/
     __asm__ __volatile__("movb $'o' , %al\n");
     __asm__ __volatile__("movb $0x0e, %ah\n");
     __asm__ __volatile__("int  $0x10\n");

     /*print letter 'r' onto the screen*/
     __asm__ __volatile__("movb $'r' , %al\n");
     __asm__ __volatile__("movb $0x0e, %ah\n");
     __asm__ __volatile__("int  $0x10\n");

     /*print letter 'l' onto the screen*/
     __asm__ __volatile__("movb $'l' , %al\n");
     __asm__ __volatile__("movb $0x0e, %ah\n");
     __asm__ __volatile__("int  $0x10\n");

     /*print letter 'd' onto the screen*/
     __asm__ __volatile__("movb $'d' , %al\n");
     __asm__ __volatile__("movb $0x0e, %ah\n");
     __asm__ __volatile__("int  $0x10\n");
}

Now save the above code as test3.c, compile it as before with the new source file name, and copy the compiled code to the boot sector of the floppy. Now observe the result. You should see the screen output below if everything went fine.

Image 11

Let us now write a C program that prints the letters “Hello, World” onto the screen through a function.

We will define a function that takes a string and prints it character by character.

Example: test4.c

C++
/*generate 16-bit code*/
__asm__(".code16\n");
/*jump boot code entry*/
__asm__("jmpl $0x0000, $main\n");

/* user defined function to print series of characters terminated by null character */
void printString(const char* pStr) {
     while(*pStr) {
          __asm__ __volatile__ (
               "int $0x10" : : "a"(0x0e00 | *pStr), "b"(0x0007)
          );
          ++pStr;
     }
}

void main() {
     /* calling the printString function passing string as an argument */
     printString("Hello, World");
} 

Now save the above code as test4.c, compile it as before with the new source file name, and copy the compiled code to the boot sector of the floppy. Now observe the result. You should see the screen output below if everything went fine.

Image 12

I would like to bring one point to your notice. All we are doing is converting the assembly programs written earlier into C programs, by way of learning. By now you should be comfortable writing programs in Assembly and C, and well aware of how to compile and then test them.
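The printString function in test4.c passes 0x0e00 | *pStr in AX, which packs the BIOS function code 0x0e into AH and the character into AL; the value 0x0007 in BX selects display page 0 in BH and color 7 in BL. The host-side sketch below shows that packing with ordinary C. The names teletype_ax, high_byte and low_byte are hypothetical, and the cast through unsigned char is an extra precaution that is not in the original code: it keeps characters above 0x7F from sign-extending into AH on compilers where char is signed.

```c
#include <stdint.h>

/* build the AX value for interrupt 0x10, function 0x0e                 */
/* ah = 0x0e, al = character to print                                   */
uint16_t teletype_ax(char ch) {
     return (uint16_t)(0x0e00u | (unsigned char)ch);
}

/* upper half of a 16-bit register value (ah, bh, ...)                  */
uint8_t high_byte(uint16_t reg) {
     return (uint8_t)(reg >> 8);
}

/* lower half of a 16-bit register value (al, bl, ...)                  */
uint8_t low_byte(uint16_t reg) {
     return (uint8_t)(reg & 0xffu);
}
```

For the letter ‘H’ (0x48), teletype_ax returns 0x0e48, so AH holds 0x0e and AL holds 0x48.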

Now we will move on to writing loops and making them work inside a function, and also see more BIOS services.

A mini-project to display rectangles

Now let us move on to something bigger, like displaying graphics.

Example: test5.c

C++
/* generate 16 bit code                                                 */
__asm__(".code16\n");
/* jump to main function or program code                                */
__asm__("jmpl $0x0000, $main\n");

#define MAX_COLS     320 /* maximum columns of the screen               */
#define MAX_ROWS     200 /* maximum rows of the screen                  */

/* function to print string onto the screen                             */
/* input ah = 0x0e                                                      */
/* input al = <character to print>                                      */
/* interrupt: 0x10                                                      */
/* we use interrupt 0x10 with function code 0x0e to print               */
/* a byte in al onto the screen                                         */
/* this function takes a string as an argument and then                 */
/* prints character by character until it finds a null                  */
/* character                                                            */
void printString(const char* pStr) {
     while(*pStr) {
          __asm__ __volatile__ (
               "int $0x10" : : "a"(0x0e00 | *pStr), "b"(0x0007)
          );
          ++pStr;
     }
}

/* function to get a keystroke from the keyboard                        */
/* input ah = 0x00                                                      */
/* input al = 0x00                                                      */
/* interrupt: 0x16                                                      */
/* we use this function to wait for the user to hit a key               */
/* before continuing                                                    */
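void getch() {
     /* the BIOS returns the key in ax, so tell the compiler that ax    */
     /* is overwritten                                                  */
     __asm__ __volatile__ (
          "xorw %%ax, %%ax\n"
          "int $0x16\n"
          : : : "ax"
     );
}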

/* function to print a colored pixel onto the screen                    */
/* at a given column and at a given row                                 */
/* input ah = 0x0c                                                      */
/* input al = desired color                                             */
/* input cx = desired column                                            */
/* input dx = desired row                                               */
/* interrupt: 0x10                                                      */
void drawPixel(unsigned char color, int col, int row) {
     __asm__ __volatile__ (
          "int $0x10" : : "a"(0x0c00 | color), "c"(col), "d"(row)
     );
}

/* function to clear the screen and set the video mode to               */
/* 320x200 pixel format                                                 */
/* function to clear the screen as below                                */
/* input ah = 0x00                                                      */
/* input al = 0x03                                                      */
/* interrupt = 0x10                                                     */
/* function to set the video mode as below                              */
/* input ah = 0x00                                                      */
/* input al = 0x13                                                      */
/* interrupt = 0x10                                                     */
void initEnvironment() {
     /* clear screen                                                    */
     __asm__ __volatile__ (
          "int $0x10" : : "a"(0x03)
     );
     __asm__ __volatile__ (
          "int $0x10" : : "a"(0x0013)
     );
}

/* function to print rectangles in descending order of                  */
/* their sizes                                                          */
/* I follow the below sequence                                          */
/* (left, top)     to (left, bottom)                                    */
/* (left, bottom)  to (right, bottom)                                   */
/* (right, bottom) to (right, top)                                      */
/* (right, top)    to (left, top)                                       */
void initGraphics() {
     int i = 0, j = 0;
     int m = 0;
     int cnt1 = 0, cnt2 =0;
     unsigned char color = 10;

     for(;;) {
          if(m < (MAX_ROWS - m)) {
               ++cnt1;
          }
          if(m < (MAX_COLS - m - 3)) {
               ++cnt2;
          }

          if(cnt1 != cnt2) {
               cnt1  = 0;
               cnt2  = 0;
               m     = 0;
               ++color; /* unsigned char wraps from 255 back to 0 */
          }

          /* (left, top) to (left, bottom)                              */
          j = 0;
          for(i = m; i < MAX_ROWS - m; ++i) {
               drawPixel(color, j+m, i);
          }
          /* (left, bottom) to (right, bottom)                          */
          /* the bottom row is MAX_ROWS - m - 1; i is one past it here  */
          for(j = m; j < MAX_COLS - m; ++j) {
               drawPixel(color, j, MAX_ROWS - m - 1);
          }

          /* (right, bottom) to (right, top)                            */
          for(i = MAX_ROWS - m - 1 ; i >= m; --i) {
               drawPixel(color, MAX_COLS - m - 1, i);
          }
          /* (right, top)   to (left, top)                              */
          for(j = MAX_COLS - m - 1; j >= m; --j) {
               drawPixel(color, j, m);
          }
          m += 6;
          ++color; /* unsigned char wraps from 255 back to 0 */
     }
}

/* boot code entry point: print a message asking the user to            */
/* hit a key, wait for the key, then display rectangles in              */
/* descending order of size                                             */
void main() {
     printString("Now in bootloader...hit a key to continue\n\r");
     getch();
     initEnvironment();
     initGraphics();
}

Now save the above code as test5.c, compile it as before with the new source file name, and copy the compiled code to the boot sector of the floppy.

Now observe the result. You should see the screen output below if everything went fine.

Image 13

Now hit a key to see what will happen further.

Image 14

Image 15

Image 16

Image 17
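The rectangle logic can be tried out on the host before it goes into a boot sector. The sketch below is a simplified stand-in, not the code from test5.c: it draws one rectangle outline, inset by m pixels from each screen edge, into an in-memory array instead of calling the BIOS, with all four edges kept inside the 320x200 area. The names screen, plot, draw_outline and count_color are hypothetical.

```c
#define MAX_COLS 320
#define MAX_ROWS 200

/* host-side stand-in for the 320x200 screen, one byte per pixel        */
static unsigned char screen[MAX_ROWS][MAX_COLS];

/* stand-in for drawPixel: stores the color, rejects off-screen pixels  */
int plot(unsigned char color, int col, int row) {
     if(col < 0 || col >= MAX_COLS || row < 0 || row >= MAX_ROWS) {
          return -1;
     }
     screen[row][col] = color;
     return 0;
}

/* draw one outline inset by m pixels; returns the number of pixels     */
/* that fell outside the screen (0 when all edges are in range)         */
int draw_outline(unsigned char color, int m) {
     int left = m, right  = MAX_COLS - m - 1;
     int top  = m, bottom = MAX_ROWS - m - 1;
     int i, j, rejected = 0;

     for(i = top; i <= bottom; ++i)   rejected += (plot(color, left, i)   != 0);
     for(j = left; j <= right; ++j)   rejected += (plot(color, j, bottom) != 0);
     for(i = bottom; i >= top; --i)   rejected += (plot(color, right, i)  != 0);
     for(j = right; j >= left; --j)   rejected += (plot(color, j, top)    != 0);
     return rejected;
}

/* count how many pixels currently have the given color                 */
int count_color(unsigned char color) {
     int row, col, n = 0;
     for(row = 0; row < MAX_ROWS; ++row) {
          for(col = 0; col < MAX_COLS; ++col) {
               if(screen[row][col] == color) ++n;
          }
     }
     return n;
}
```

Calling draw_outline with m = 0, 6, 12 and so on mirrors the m += 6 step of initGraphics; a non-zero return value would point to an edge that left the visible area.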

Observation:

If you look closely at the contents of the executable, you will observe that we have almost run out of space. As the boot sector is only 512 bytes, we were able to embed only a few functions into our program, such as initializing the environment and printing colored rectangles. Anything more would require more than 512 bytes of space. Below is a snapshot for your reference.

Image 18
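A rough way to see how much room is left is to scan the sector backwards for the last non-zero byte. The sketch below assumes that the unused part of the sector is zero-filled and that the last two bytes are reserved for the boot signature; code or data that happens to end in zero bytes would be undercounted. The names boot_sector_used and boot_sector_free are hypothetical.

```c
#include <stddef.h>

#define SECTOR_SIZE     512
#define SIGNATURE_BYTES 2

/* number of bytes in use before the signature, assuming the padding    */
/* after the code is all zeros                                          */
size_t boot_sector_used(const unsigned char* sector) {
     size_t n = SECTOR_SIZE - SIGNATURE_BYTES;
     while(n > 0 && sector[n - 1] == 0) {
          --n;
     }
     return n;
}

/* number of bytes still available for code and data                    */
size_t boot_sector_free(const unsigned char* sector) {
     return (SECTOR_SIZE - SIGNATURE_BYTES) - boot_sector_used(sector);
}
```

Reading test.bin into a 512-byte buffer and passing it to boot_sector_free gives an estimate of how many bytes remain before the program no longer fits into the boot sector.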

That’s all for this article. Have fun and write more programs to explore real mode; you will find that programming in real mode using BIOS interrupts is real fun. In the next article, I will try to explain the addressing modes used to access data, reading a floppy disk and its architecture, why a boot loader is mostly written in Assembly rather than C, and what the constraints are in writing a boot loader in C in terms of code generation :)

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)