Introduction
- Environment setup
- C++ support code and the console
- Descriptor tables and interrupts
- The Real-Time Clock, Programmable Interval Timer and Keyboard Controller
If you’re reading this article, chances are you want to know more about how to create your own operating system. The first thing you need to know is that not everything can be covered in only a few articles. I’ll cover some of the basics, but you’ll need to know how to research and look through technical documents in order to truly understand why your operating system works the way it does. Still reading? Good. Then let’s begin.
The primary thing to understand about OSDev is that you really are starting from scratch, and working with the bare hardware. You’ll probably encounter bugs in the hardware itself, and inconsistencies in places you wouldn’t expect. There are no standard libraries available and no .NET Framework. You’re on your own.
To start, you need to understand what you are creating, and how control is given to you. Strangely, there is no well-defined standard about the state of a computer when it is switched on. The only thing you can really guarantee is that execution will start with a boot sector, which is located at the beginning of a bootable storage medium, is 512 bytes long, is loaded at 0x7C00, and has the signature 0xAA55 in its last two bytes. This is commonly written in assembly language. To save time, we will use GrUB, the Grand Unified Bootloader. This gives us a good base to work from, with the computer in a standardised state.
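To make the boot signature a little more concrete, here is a minimal C++ sketch of the check that is effectively performed on a boot sector. The names signature_word and has_boot_signature are my own, purely for illustration. Since GrUB handles booting for us, nothing like this is needed in the kernel itself.

```cpp
#include <cstddef>
#include <cstdint>

// A classic BIOS boot sector is exactly 512 bytes long.
constexpr std::size_t kBootSectorSize = 512;

// Combines two bytes into the little-endian 16-bit word they represent.
constexpr std::uint16_t signature_word(std::uint8_t low, std::uint8_t high) {
    return static_cast<std::uint16_t>(low | (high << 8));
}

// Returns true if the sector ends with the signature 0xAA55.
// x86 is little-endian, so byte 510 holds 0x55 and byte 511 holds 0xAA.
// The caller must pass a buffer of at least kBootSectorSize bytes.
constexpr bool has_boot_signature(const std::uint8_t* sector) {
    return signature_word(sector[kBootSectorSize - 2],
                          sector[kBootSectorSize - 1]) == 0xAA55;
}
```

Note the byte order: a hex dump of a valid boot sector ends in 55 AA, not AA 55.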
The state the computer gets put into by GrUB is as follows:
- Protected mode
- A20 Gate enabled
- EBX contains a pointer to the Multiboot information structure
- EAX contains the value 0x2BADB002
- Paging off
- Stack somewhere in memory
- Interrupts disabled
I'll go through these one by one. Protected mode allows a kernel developer to access up to 4 gigabytes of memory per virtual address space, and introduces the concept of rings. You’ll find more information about these in a few paragraphs' time. The A20 gate is a legacy address line which was historically controlled through the keyboard controller. It was originally designed for compatibility with the 8086. When it is disabled, any attempt to reference memory just above one megabyte wraps around to the start of memory.
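The wrap-around is easier to see in numbers. The sketch below models, in ordinary C++, the address which actually reaches memory. effective_address is a hypothetical helper of mine, not something the hardware or GrUB provides, and it assumes the usual behaviour in which a disabled A20 gate forces address bit 20 to zero.

```cpp
#include <cstdint>

// Models the physical address seen by memory for a given requested address.
// Assumption: with the A20 gate disabled, address line 20 is held at zero;
// with it enabled, the address passes through unchanged.
constexpr std::uint32_t effective_address(std::uint32_t address, bool a20_enabled) {
    return a20_enabled ? address
                       : (address & ~(std::uint32_t{1} << 20));
}
```

With the gate disabled, a reference to 0x100000 (exactly one megabyte) lands on address 0, which is why GrUB enabling it for us is so convenient.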
The Multiboot information structure is a very nice thing indeed. It tells us a lot about the system, such as program information (if you want to load symbol tables for debugging), a memory map, modules which GrUB may have loaded, the boot device, and other pieces of information. The full specification is here.
The Multiboot magic value is largely a safety mechanism. By comparing the value to the required value, you can see if the kernel’s been loaded by a Multiboot-compliant bootloader. It's nice, but not terribly useful. Paging is a very complex subject, which I won't go into here. Suffice to say that it allows you to give any process its own address space, which has been mapped to a physical memory address. It also prevents the process from rewriting kernel memory using privilege levels.
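Before moving on, the magic value check and a first peek into the information structure can be sketched in C++ as follows. The field names follow the Multiboot (version 1) specification, but only the first three fields are shown. The helper names loaded_by_multiboot and upper_memory_kib are my own, and the sketch assumes that your assembly entry stub (not shown) has passed the contents of EAX and EBX on to C++ code.

```cpp
#include <cstdint>

// Value a Multiboot-compliant bootloader leaves in EAX.
constexpr std::uint32_t kMultibootBootloaderMagic = 0x2BADB002;

// The first fields of the Multiboot information structure pointed to by EBX.
// The real structure continues with further fields, omitted here.
struct MultibootInfo {
    std::uint32_t flags;      // bit 0 set: mem_lower and mem_upper are valid
    std::uint32_t mem_lower;  // kilobytes of memory starting at address 0
    std::uint32_t mem_upper;  // kilobytes of memory starting at 1 megabyte
};

// True if the value found in EAX shows that a Multiboot loader started us.
constexpr bool loaded_by_multiboot(std::uint32_t eax) {
    return eax == kMultibootBootloaderMagic;
}

// Returns the amount of upper memory in kilobytes, or 0 if the bootloader
// did not fill in the memory fields.
constexpr std::uint32_t upper_memory_kib(const MultibootInfo& info) {
    return (info.flags & 1u) != 0 ? info.mem_upper : 0u;
}
```

Checking the flags before trusting a field is the important habit here: the bootloader only promises the fields whose flag bits it has set.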
A privilege level, also known as a processor ring, is a hardware-based protection mechanism which accompanies protected mode. It prevents code executing in the less privileged rings from executing privileged instructions like HLT, CLI and STI. When such code attempts one of these instructions, a General Protection Fault is raised and the OS must deal with this in an appropriate manner.
The stack is the main problem here. It could be absolutely anywhere in usable memory. For now it's fine, but this will probably be relocated in most kernels when paging and multitasking are active. You will probably find it interesting to know that it grows downwards, so can in theory overwrite kernel code.
Interrupts are off. This is a very good thing, as you don't have the infrastructure needed to handle them. When you do build this infrastructure, you'll be able to respond to requests from hardware such as keyboards, mice, network cards, clocks, etc. When an interrupt occurs, normal program flow is suspended and the function which you provide the CPU with an offset to is invoked (the actual process is a little more complex, but this is the gist of it.) When the interrupt has been acknowledged, normal program flow continues.
Environment
Now that some of the theory is in place, we can start building a tool chain. This is a fairly simple process, but you won’t be using Microsoft’s compilers; this is mostly because they can’t generate ELF files which can be loaded by GrUB.
In order to compile and use the tool chain, you’ll need the Cygwin environment. This is effectively an emulation layer which allows us to compile and use POSIX-compliant software under Windows. When you install this software, make sure you install GCC, NASM, Make, Bison, GenIsoImage and Flex. These will allow us to compile our source code.
A cross-compiler is an ordinary compiler which runs on one system but produces executables which run on another. Our compilers will be version 4.2.4 of GCC and version 2.02 of NASM. To support the compilers, we’ll also need version 2.18 of Binutils. Binutils is basically a set of low-level tools, such as the assembler, the linker and utilities for inspecting object files, which provide the infrastructure needed for compilation.
To save time, you can download the full complement of necessary compilers from this article’s web page. However, should you wish to compile them yourselves, the instructions you need can be found here. I won’t go into detail.
If you followed the instructions on the OSDev Wiki, then your compilers will have their directories added to the PATH environment variable. We’ll want this as well, so extract all of the compilers you find in the zip file to C:\Cygwin\usr\cross\bin, and add that to the aforementioned environment variable. This means that you don’t have to specify full paths to the cross-compilers anymore, saving time.
Provided everything’s done, you now have a working cross-compiler. It still needs some cajoling to get rid of the operating system dependent #includes, but this will suffice. Look in C:\Cygwin\usr\cross\bin. There’s a whole raft of executables there, but here are the main ones and their functions:
AS | The GNU assembler. We’ll be using NASM for our own assembly code, but GCC will stick to this one |
G++ | The C++ compiler. GCC should pass through to this automatically |
GCC | Our main compiler. It compiles everything down to assembly code, then passes that through to AS |
LD | The linker. This resolves the references between our object files and gives us our final executable |
An important note is that no elements of the cross-compiler will operate outside of Cygwin – they’ll complain about the absence of cygwin1.dll.
You now have the tools to completely link your source code into an independent file. What’s now needed is a way to package and run this file. Traditionally, we would copy the file onto a virtual floppy disk, and use an emulator such as Virtual PC or Bochs. However, with the advent of optical storage, we can now put the compiled kernel onto a CD image (an ISO file). Then, we can simply boot from it.
The first thing which we’ll need is GrUB (unless you feel like a challenge and want to write a Multiboot-compliant bootloader which can parse the file system). To get this, simply download this file and copy the file “stage2_eltorito” to the folder which will contain the uncompressed contents of the CD image.
This is a very important step: make sure you have a very well-defined folder structure. Without one of these, you’ll find it very difficult to organise your code, and separate your header files from your actual source code. Additionally, you’ll find it far more difficult to use genisoimage, and the make file you’ll be setting up will be overly complex.
To make the ISO image, you’ll need that well-defined folder structure I told you about in the previous paragraph. You’ll also need a program which comes with Cygwin called genisoimage. Before we start, you need a directory which will be copied into the ISO image. We’ll call this IsoSource for the sake of simplicity. GrUB will see this as a CDROM drive when we boot from it. Inside IsoSource, you need two directories, one inside the other. The first-level directory must be called boot, and inside that should be a directory named grub. The grub directory should have the file “stage2_eltorito” inside it, accompanied by a “menu.lst” file.
To make this a little easier to understand, this diagram represents the directory structure you need:
IsoSource
    boot
        grub
            stage2_eltorito
            menu.lst
“menu.lst” is the menu file, which GrUB parses. A vanilla menu file will have two lines; one which states the title of the kernel, and one which contains the path. In our case, the menu.lst file will look like this:
title TutorialKernel
kernel /kernel
It’s fairly self-describing, but the documentation for it is located here. As we can see, the first line simply states the title, which will appear in the list of operating systems to load, and the second line gives us the path to the kernel file, starting from the root of the CD. You could easily make a more complex one, which reduces the timeout for the menu or adds modules (files which are loaded into memory and passed to the kernel), but this is all you really need.
Generating the Image
Now that we have the folder structure sorted out, we simply need to generate the ISO. The command that we use is similar to this:
genisoimage -R -b boot/grub/stage2_eltorito -no-emul-boot -boot-load-size 4 \
    -boot-info-table -o "ISO Image/Compressed image.iso" "ISO Image/IsoSource"
A quick run-through of this will be beneficial. I’ll go through the parameters individually.
-R | Generate Rock Ridge information |
-b | The next file path contains the boot image |
-no-emul-boot | Don’t try to emulate a floppy disk |
-boot-load-size | The next parameter contains the number of 512-byte sectors to load from the boot image |
-boot-info-table | Patch a table describing the CD layout into the boot image |
-o | The next file path is the output file |
“ISO Image/IsoSource” | This is the last parameter, and tells GenIsoImage where to find the directory structure |
Provided you have the directory structure exactly right, that will churn out an ISO image, from which you can boot either a real computer or a virtual machine.
Now it’s simply a matter of putting it all together. In short, to create a kernel from source, we need to:
- Compile every ASM and CPP file using NASM and i586-elf-g++
- Link the resulting .o files into a single file using our specialised link file and i586-elf-ld
- Copy that file to the ISO image’s source folder
- Run GenIsoImage with the parameters above
Now that the task has been broken down, we can easily use a Make file. I’ll leave that as an exercise for the reader; just remember that you must perform the steps in the exact order they’re given in. Don’t mix them up!
Now, you’ll probably want to test your kernel. For this, we use a virtual machine. It may be best to test on more than one – different machines have different capabilities, and react to your code in different ways. On my computer I have Bochs and Microsoft Virtual PC installed; Bochs for the graphical debugging capability and Virtual PC for the real-time simulations.
I won’t cover setting up a Virtual PC image here; you can do that quite easily yourself. Our kernel runs on any setup right now, so it doesn’t matter too much. The only quirk is that you probably won’t need a Virtual Hard Disk file yet – you’ve got no way of using it.
Bochs, on the other hand, is quite different. There are hundreds of options to set up the configuration just how you need it. These options can either be manually entered or loaded from a text file. You can download Bochs from here; you just need to install it.
When you’ve installed it, you can run it for the first time. You’ll see a command line window and a window titled ‘Bochs Start Menu’. Here is where you actually make your configuration file. I’ll go through the options you will probably want to use and will find useful.
Logfile | Very useful. If you’ve got a problem with your operating system, this provides you with a lot of information. Set the filename to somewhere you can easily get to, possibly in your project directory |
CPU | If you’re checking your processor identification function, you’ll find it useful to check your results here |
Memory | You can alter the amount of memory your guest operating system will use here. Very helpful and equally important |
Disk & Boot | The most important part of all. By working your way through here, you can add a CD drive to the simulation, which loads its data from an ISO image. The last main tab, Boot Options, also allows you to specify the boot order |
Other | This allows you to enable the port 0xE9 hack. Basically, any character you write to this port is printed on the Bochs console. Very useful for debugging |
The graphical interface will only take you so far though – it doesn’t seem to have an option to enable the graphical debugger. Save the options you’ve selected to a BXRC file, and open the file in Notepad. There should be a line which starts with ‘display_library’. Alter that line to ‘display_library: win32, options="gui_debug"’. You now have Bochs set up to use the graphical debugger, with the ISO image to boot from. All you need now is some code to compile and run!
I’ve said this before, and will say it once again: you have no standard library. C++ is only supported if you use neither exceptions nor Runtime Type Information. You’ll have to set up the compiler to stop these from getting referred to by the linker, so you’ll have to add multiple command line options. There’s no built-in memory allocation, so you’ll have to implement your own new and delete operators.
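As a rough idea of what that involves, here is a minimal sketch of a "bump" allocator with class-level new and delete operators built on top of it. Everything in it (BumpAllocator, kernel_heap, kernel_allocator, Console) is hypothetical and is not the code used later in this series; a real kernel would more likely replace the global operators and use a proper heap. The <cassert> include is only there so that the sketch can be tried out in an ordinary hosted program.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Rounds value up to the next multiple of alignment (a power of two).
constexpr std::size_t align_up(std::size_t value, std::size_t alignment) {
    return (value + alignment - 1) & ~(alignment - 1);
}

// Hands out memory from a fixed buffer and never reuses it.
class BumpAllocator {
public:
    constexpr BumpAllocator(std::uint8_t* buffer, std::size_t size)
        : buffer_(buffer), size_(size), used_(0) {}

    // Returns a pointer to `bytes` bytes, or nullptr when the buffer is full.
    // Assumes the buffer itself is aligned at least as strictly as requested.
    void* allocate(std::size_t bytes,
                   std::size_t alignment = alignof(std::max_align_t)) {
        // Hosted-only sanity check; a kernel would call its own panic routine.
        assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
        const std::size_t start = align_up(used_, alignment);
        if (start > size_ || bytes > size_ - start) {
            return nullptr;
        }
        used_ = start + bytes;
        return buffer_ + start;
    }

    std::size_t used() const { return used_; }

private:
    std::uint8_t* buffer_;
    std::size_t size_;
    std::size_t used_;
};

// A statically reserved 4 KB "heap" (the size is an arbitrary assumption).
alignas(std::max_align_t) static std::uint8_t kernel_heap[4096];
static BumpAllocator kernel_allocator(kernel_heap, sizeof(kernel_heap));

// Example class which takes its memory from the allocator above.
struct Console {
    int cursor_x = 0;
    int cursor_y = 0;

    // noexcept tells the compiler that failure is reported by returning
    // nullptr, which matters because exceptions are switched off.
    static void* operator new(std::size_t bytes) noexcept {
        return kernel_allocator.allocate(bytes);
    }

    // A bump allocator cannot free individual blocks, so this does nothing.
    static void operator delete(void*) noexcept {}
};
```

With this in place, `Console* console = new Console;` takes its memory from kernel_heap and yields a null pointer once the buffer is exhausted. Never freeing anything is obviously a limitation, but it is often enough to get an early kernel off the ground.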
The general format of the compiler options is as follows:
i586-elf-g++ -W -Werror -Wall -Wpointer-arith -Wcast-align \
    -Wno-unused-parameter -nostdlib -fno-builtin -fno-rtti -fno-exceptions \
    -c "Input file" -o "Output file"
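Every parameter starting with ‘-W’ controls warnings. -Wpointer-arith and -Wcast-align warn you about questionable pointer arithmetic and about casts which increase alignment requirements, and -Werror treats all warnings as errors, to prevent you from ignoring mistakes. -nostdlib and -fno-builtin keep the standard library and the compiler's built-in versions of its functions out of the picture, while -fno-rtti and -fno-exceptions switch off Runtime Type Information and exceptions.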
Now we simply need to put it all together. Whether you use a batch file or a make file is up to you; I personally recommend a make file for scalability as you create new files. This is the basic template:
i586-elf-g++ -W -Werror -Wall -Wpointer-arith -Wcast-align \
    -Wno-unused-parameter -nostdlib -fno-builtin -fno-rtti -fno-exceptions -c \
    "Main.cpp" -o "Main.o"
i586-elf-ld -T "Link.ld" -o "kernel" "Main.o"
cp "kernel" "ISO Image/IsoSource"
genisoimage -R -b boot/grub/stage2_eltorito -no-emul-boot -boot-load-size 4 \
    -boot-info-table -o "ISO Image/Compressed image.iso" "ISO Image/IsoSource"
And hey presto, your kernel is compiled, linked, copied and compressed. Boot Virtual PC from that ISO image and your code will be executed. The only thing you need now is some code to compile! You can find a starting point for this code in the next article, which you can find a link to below.
Next: C++ support code and the console