Articles in this series
- Environment setup
- C++ support code and the console
- Descriptor tables and interrupts
- The Real-Time Clock, Programmable Interval Timer, and Keyboard Controller
Introduction
After the last article, you might be a little intimidated by the sheer scope of writing an Operating System. To an extent, you'd be right; there are very complex elements. But it isn't all like that. Some parts of an Operating System are refreshingly simple. Four of these areas are the console, RTC, PIT, and keyboard. Since we've already looked at the console driver, we've got a chance here to look at the other three.
We're taking a break from the demanding stuff and moving at a slower pace. We also need to be able to access MSRs when we initialise some more interesting hardware, so that is covered here too. There's also a bunch of links to important and interesting specifications toward the end, which provide context and hardware details that are outside the scope of this article but could be helpful during troubleshooting.
RTC
Let's have a look at the Real-Time Clock. It's a simple clock which keeps track of the current time while the computer is turned off. It is powered by a small battery; in my computers, it has always been a watch battery (CR2032), but your mileage may vary.
Of course, things aren't quite that basic. There's also a simple low-precision timer and an alarm. The data which makes up the current and alarm time and date is stored in a small amount of battery-backed memory (commonly between 64 and 256 bytes, depending on the chipset) known as the CMOS, after the Complementary Metal Oxide Semiconductor technology it was originally built with. We can access this memory through I/O ports 0x70 and 0x71.
To access the CMOS, we need to send two bytes. The first byte is sent to output port 0x70, and states which register we want to access. The second byte is either sent to or read from port 0x71, and it either writes or reads the byte contained in that register.
Now that we know how to read and write registers, we would probably find it useful to know what each register is supposed to contain. The first fourteen bytes (offsets 0x0 to 0xD) belong to the RTC, and after that is information about the system boot state, configuration settings, shutdown status, and hard drive information. The registers important for this tutorial are listed below; other parts of the layout can be found here.
Offset | Name |
0x0 | Seconds |
0x2 | Minutes |
0x4 | Hours |
0x7 | Day |
0x8 | Month |
0x9 | Year |
There is also a 'Day of week' field at offset 0x6, but it is often wrong and its behaviour varies between BIOSes. The reason lies in the Year field: it is a single byte (maximum value 255), so it stores only the last two digits of the year. For example, the Year field for 2010 would be 10. If you look at your calendar, you will see that the day of the week depends on the year; without the full year, this field cannot be trusted. As a side note, when you begin to parse the ACPI tables, the CENTURY field (108 bytes in) of the Fixed ACPI Description Table will contain an offset into the CMOS which you can use to build the correct year. Until then, you will probably want to make the relatively safe assumption that the century is 20 and calculate the day of the week yourself.
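As a minimal sketch of calculating the day of the week yourself, the snippet below uses Sakamoto's method on a full Gregorian date. The function names fullYear and dayOfWeek are made up for this example, and the default century of 20 is the assumption discussed above rather than something read from the hardware.

```cpp
// Per-month offsets used by Sakamoto's day-of-week method (January first).
constexpr int monthOffsets[12] = {0, 3, 2, 5, 0, 3, 5, 1, 4, 6, 2, 4};

// Combines the two-digit CMOS year with a century (assumed to be 20 by default).
constexpr int fullYear(int twoDigitYear, int century = 20)
{
    return century * 100 + twoDigitYear;
}

// Returns 0 for Sunday through 6 for Saturday.
// month is 1 to 12 and day is 1 to 31, as stored in the CMOS.
constexpr int dayOfWeek(int year, int month, int day)
{
    // January and February are counted as part of the previous year,
    // so that a leap day falls at the end of the counted year.
    const int y = (month < 3) ? year - 1 : year;
    return (y + y / 4 - y / 100 + y / 400 + monthOffsets[month - 1] + day) % 7;
}
```

With these helpers, dayOfWeek(fullYear(10), 1, 1) gives the weekday of 1 January 2010 without touching the register at offset 0x6.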
Now that you've got the theory, let's get to work. If you make a simple DateTime structure which supports the AddXXX methods so that you can manipulate the internal fields, you can easily read the correct fields.
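// Select a CMOS register through port 0x70, then read its value from port 0x71.
unsigned char readByte(unsigned char offset)
{
    outportByte(0x70, offset);
    return inportByte(0x71);
}

// Select a CMOS register through port 0x70, then write its new value to port 0x71.
void writeByte(unsigned char offset, unsigned char value)
{
    outportByte(0x70, offset);
    outportByte(0x71, value);
}

// Read the fields from the largest unit of time down to the smallest.
DateTime *currentDate = new DateTime();
currentDate->Year = readByte(0x9) + 2000;   // assumes the century is 20
currentDate->Month = readByte(0x8);
currentDate->Day = readByte(0x7);
currentDate->Hour = readByte(0x4);
currentDate->Minute = readByte(0x2);
currentDate->Second = readByte(0x0);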
Notice that I work from the largest unit of time to the smallest; this is because if the time changes while we're reading from the ports, our data will become inaccurate. While there is always a degree of inaccuracy in the RTC, this tries to avoid it to the greatest degree possible.
While we're considering loss of accuracy, note that we only read the date and time once. It would be an understatement to say that this is a problem; what we need is a way to always stay up to date. If we had multi-tasking, then we might be able to get away with a thread which continuously polls the ports. But even this is inefficient. What we need is something which will interrupt the flow of the program and tell us whenever we need to update the clock. Conveniently, we've written the base for all this in the previous tutorial, where we set up IRQ support.
The IRQ for the RTC is 8. To enable it, we just need to set bit six of the byte at offset 0xB. The process is remarkably simple, as you may have guessed:
unsigned char registerB = readByte(0xB) | 0x40;
installIRQHandler(8, rtcHandler);
writeByte(0xB, registerB);
An interrupt will fire almost immediately, so we have to install an IRQ handler before we write the byte which switches on the interrupt. Another point is the timing of the interrupt. The default frequency is 1024 Hz, which means that it fires 1024 times per second. It can be changed by changing the bottom 4 bits of the register at offset 0xA, which hold the rate. The frequency is determined from the rate as follows:
frequency = 32768 >> (rate - 1)
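As a rough sketch of the formula above, the helpers below turn a rate value into a frequency and merge a new rate into the register at offset 0xA without disturbing its upper four bits. The names rtcFrequency and withRate are made up for this example, and a rate of 0 is treated as "periodic interrupt off". Very low rates (1 and 2) are widely reported not to follow the formula on real hardware, so treat this sketch as valid for rates 3 to 15 only.

```cpp
// Frequency in Hz for a given rate (the low four bits of register 0xA).
// A rate of 0 means the periodic interrupt is switched off.
constexpr unsigned int rtcFrequency(unsigned char rate)
{
    return (rate & 0x0F) == 0 ? 0u : 32768u >> ((rate & 0x0F) - 1);
}

// Returns the new value for register 0xA: the upper four bits are kept,
// and the lower four bits are replaced by the requested rate.
constexpr unsigned char withRate(unsigned char registerA, unsigned char rate)
{
    return static_cast<unsigned char>((registerA & 0xF0) | (rate & 0x0F));
}
```

In a kernel, this would be used as writeByte(0xA, withRate(readByte(0xA), 15)) to slow the interrupt down to rtcFrequency(15) ticks per second.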
We could very easily read the time every time the IRQ fires, but what would be the point? The granularity we use is down to the second, so we would end up performing reads and writes which aren't necessary. Instead, we could only read the time every 1024th time. This is quite a large optimisation, and in itself would be very good. But, we can go further. If we read the time every second, we already know how long it's been since the last interrupt. The only thing we have to do is add a second to the current time and let that take the strain (hopefully, you've remembered that a second can also tip you over to the next minute, hour, day, month, and year).
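The tick counting and rollover described above can be sketched as follows. The DateTime here is a hypothetical stand-in with the same field names as the one used earlier, and SoftwareClock, addSecond, and onRtcTick are names invented for this example; a real handler would update a global clock instead of returning a copy.

```cpp
// Hypothetical stand-in for the article's DateTime structure.
struct DateTime { int Year; int Month; int Day; int Hour; int Minute; int Second; };

constexpr bool isLeapYear(int year)
{
    return (year % 4 == 0 && year % 100 != 0) || year % 400 == 0;
}

constexpr int daysInMonth(int year, int month)
{
    return month == 2 ? (isLeapYear(year) ? 29 : 28)
         : (month == 4 || month == 6 || month == 9 || month == 11) ? 30 : 31;
}

// Adds one second and carries into minutes, hours, days, months, and years.
constexpr DateTime addSecond(DateTime t)
{
    if (++t.Second < 60) return t;
    t.Second = 0;
    if (++t.Minute < 60) return t;
    t.Minute = 0;
    if (++t.Hour < 24) return t;
    t.Hour = 0;
    if (++t.Day <= daysInMonth(t.Year, t.Month)) return t;
    t.Day = 1;
    if (++t.Month <= 12) return t;
    t.Month = 1;
    ++t.Year;
    return t;
}

// The current time plus the number of ticks seen since the last full second.
struct SoftwareClock { DateTime now; unsigned int ticks; };

// Called once per RTC interrupt; only every ticksPerSecond-th call changes the time.
constexpr SoftwareClock onRtcTick(SoftwareClock clock, unsigned int ticksPerSecond = 1024)
{
    if (++clock.ticks >= ticksPerSecond)
    {
        clock.ticks = 0;
        clock.now = addSecond(clock.now);
    }
    return clock;
}
```

The early returns keep the common case (only the seconds change) cheap, which matters for code that runs inside an interrupt handler.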
There's also another register. The register at offset 0xC tells the interrupt handler what happened. In order to receive further interrupts, we need to read it. Whether we do anything with it or not is irrelevant.
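If you do want to act on the register at offset 0xC, the sketch below decodes the three cause flags as laid out in the MC146818-style RTC (bit 6 periodic, bit 5 alarm, bit 4 update-ended). The RtcInterruptCause structure and decodeRegisterC function are names invented for this example.

```cpp
// Which events caused the RTC interrupt, according to register 0xC.
struct RtcInterruptCause
{
    bool periodic;     // bit 6: the periodic timer fired
    bool alarm;        // bit 5: the alarm time was reached
    bool updateEnded;  // bit 4: the clock finished updating its time fields
};

constexpr RtcInterruptCause decodeRegisterC(unsigned char registerC)
{
    return RtcInterruptCause{(registerC & 0x40) != 0,
                             (registerC & 0x20) != 0,
                             (registerC & 0x10) != 0};
}
```

Inside the handler, this would be fed with readByte(0xC); the read itself is what allows the next interrupt to arrive, so it has to happen even if the result is ignored.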
Unfortunately, some of the data in the CMOS registers may be encoded as Binary Coded Decimal (BCD). While this is not the case on every BIOS, you will want to deal with it. You can check which format is in use by looking at bit 2 (mask 0x04) of the register at offset 0xB; if it's set, the values are plain binary and you can proceed as usual. If it's zero, then you need to decode the data which makes up the date and time. The process of decoding is very simple.
unsigned char decoded = ((encoded >> 4) * 10) + (encoded & 0xF);
For convenience, you might want to put that in a function or macro.
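One way to wrap the decoding in a function is sketched below. The names bcdToBinary and decodeRtcValue are invented for this example, and the caller is assumed to have already worked out from the register at offset 0xB whether the values are plain binary.

```cpp
// Converts one BCD byte (for example 0x59) into its binary value (59).
constexpr unsigned char bcdToBinary(unsigned char encoded)
{
    return static_cast<unsigned char>(((encoded >> 4) * 10) + (encoded & 0x0F));
}

// Decodes a raw CMOS time or date value only when the RTC is not in binary mode.
constexpr unsigned char decodeRtcValue(unsigned char raw, bool valuesAreBinary)
{
    return valuesAreBinary ? raw : bcdToBinary(raw);
}
```

Every field read for the DateTime structure would then pass through decodeRtcValue before being stored.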
When we put all this together, we've got a simple clock. This is one of multiple clocks available in the system (RTC, PIT, HPET, and LAPIC), so you might want to work on a robust base class which every timing device inherits from.
PIT
The Intel 8254, also known as the Programmable Interval Timer, is another timer. Unlike the RTC, however, it allows for a far more granular interval between interrupts. It's commonly used for keeping track of seconds, and for pre-emptive multi-tasking.
On its own, the crystal oscillator at the heart of the chip would only be able to run at one speed. However, there are also three dividers, which allow for three arbitrary frequencies to be used. Our Operating System can then be notified at the resulting frequency (measured in Hertz). The frequency is a hardwired number divided by the divider. This number is 1193182 hertz, or roughly 1.193182 megahertz. The reasons for this pertain to the legacy portions of a computer, for which a television oscillator was used, divided and ANDed, resulting in an odd frequency.
Even though the PIT has three outputs, we only have channels 0 and 2 available to us. This is because channel 1 was used to refresh some memory, to prevent it from losing its state. In modern computers, this is no longer necessary, but we still leave it alone because (like many portions of the x86 architecture) if our code is run on an older machine, we'll mess up memory, causing (what appear to be) random errors.
So our channels are 0 and 2. We can use these channels for two purposes: timing and sound. This is a relatively easy device to set up; we simply send a command byte, then the lower and upper bytes of the divider, in that order. For channel 0, this command byte is 0x36, and for channel 2, this command byte is 0xB6. So, to set up the timer on either channel, we use this code:
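void activateChannel(unsigned char channel, unsigned int frequency)
{
    // The divider must fit into 16 bits, so frequency should be at least 19 Hz.
    unsigned int divider = 1193182 / frequency;
    // Command byte first, then the low byte and the high byte of the divider.
    outportByte(0x43, channel == 0 ? 0x36 : 0xB6);
    outportByte(0x40 + channel, (unsigned char)(divider & 0xFF));
    outportByte(0x40 + channel, (unsigned char)((divider >> 8) & 0xFF));
}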
We can see from this that the base port for the PIT is 0x40. If we want to use the timer, we only have to run this method after installing a handler for IRQ0, and we'll be getting the requested number of interrupts (the frequency argument) per second. To save time, you could link the PIT and RTC together so that you only have to retrieve the time every now and again, and add one second to the current time on the necessary IRQs, instead of performing costly port I/O.
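Because the divider is only 16 bits wide, very low or very high requested frequencies need clamping before they are sent to the PIT. The following is a minimal sketch of that calculation, assuming the commonly quoted base frequency of 1193182 Hz; pitDivider, lowByte, and highByte are names made up for this example.

```cpp
// Commonly quoted input frequency of the PIT, in Hz (an assumption of this sketch).
constexpr unsigned int PIT_BASE_HZ = 1193182;

// Computes a divider for the requested frequency, clamped to 1..65535
// so that it always fits into the two bytes sent to the PIT.
constexpr unsigned int pitDivider(unsigned int frequency)
{
    if (frequency == 0) return 65535;             // avoid dividing by zero
    const unsigned int divider = PIT_BASE_HZ / frequency;
    if (divider < 1) return 1;                    // requested frequency too high
    if (divider > 65535) return 65535;            // requested frequency too low
    return divider;
}

// The two bytes of the divider, in the order they are written to the channel port.
constexpr unsigned char lowByte(unsigned int divider)
{
    return static_cast<unsigned char>(divider & 0xFF);
}

constexpr unsigned char highByte(unsigned int divider)
{
    return static_cast<unsigned char>((divider >> 8) & 0xFF);
}
```

The clamped result means a request for 18 Hz or less quietly becomes the slowest rate the hardware supports, instead of wrapping around to an unrelated frequency.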
If you've tried to activate channel 2 using the code snippet above, you'll notice that on Microsoft or Windows Virtual PC, the computer doesn't stop emitting a tone at the frequency you set. To stop it, simply retrieve the byte at port 0x61, switch off the lowest two bits (& 0xFC), and write it back. So, to beep for a given time, you just need a sleep function, which isn't that difficult, and is left as an exercise for the reader.
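The bit manipulation on port 0x61 can be sketched as two small helpers. Bit 0 gates PIT channel 2 and bit 1 connects its output to the speaker, so both are set to start a tone and cleared to stop it. The names speakerOn and speakerOff are invented for this example; they only compute the byte to write back, leaving the port access to the caller.

```cpp
// Returns the value to write to port 0x61 to start the tone:
// the two lowest bits are set, all other bits are preserved.
constexpr unsigned char speakerOn(unsigned char port61)
{
    return static_cast<unsigned char>(port61 | 0x03);
}

// Returns the value to write to port 0x61 to stop the tone:
// the two lowest bits are cleared, all other bits are preserved.
constexpr unsigned char speakerOff(unsigned char port61)
{
    return static_cast<unsigned char>(port61 & 0xFC);
}
```

A beep would then be outportByte(0x61, speakerOn(inportByte(0x61))), a sleep, and outportByte(0x61, speakerOff(inportByte(0x61))).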
KBC
If you've been paying a little more than the usual attention, you'll have noticed that we're moving from the hardest to the easiest out of the three drivers here. Logically then, the KBC should be easy. And it is; all you need to do is to hook up an interrupt handler on IRQ1. The only exception is that Bochs doesn't always flush the keyboard buffer, which might interfere with a command prompt. We can do this manually by reading bytes from port 0x60 until there are two identical values in a row.
The contents of the IRQ handler are also simple. You simply have to read a byte from 0x60 (necessary to receive more than one IRQ) and map it to a character. To perform this mapping, it is common to either load a text file as a module and use the byte read as an index into it, or to have a large collection of character arrays which map every possibility of num lock, caps lock, and shift. However, you don't need to struggle through this bit by trial and error. This page contains the bulk of the key map you need to create; since we're in C++ and might want to support localisation, you might consider creating a derived class with a TranslateScancode method. Note, however, that not every byte corresponds to a printable character. There are also other keys, like Shift, Tab, Caps lock, and the Fn keys. Every key also has an 'on' and 'off' value (often called make and break codes), which tell us when the user presses and releases each key. When the user holds down a key, the interrupt fires repeatedly until they release it.
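The mapping step can be sketched as below, assuming scancode set 1 (where the 'off' value of a key is its 'on' value with bit 7 set) and a US layout. Only a handful of keys are mapped to keep the example short; KeyboardState, KeyResult, lookupUsLayout, and translateScancode are names invented for this sketch, and a free function is used here in place of the derived class suggested above.

```cpp
// Modifier state that has to survive between interrupts.
struct KeyboardState { bool shiftHeld; };

// The updated state plus the character produced ('\0' if nothing printable).
struct KeyResult { KeyboardState state; char character; };

// A deliberately tiny US-layout key map for scancode set 1.
constexpr char lookupUsLayout(unsigned char makeCode, bool shift)
{
    switch (makeCode)
    {
        case 0x02: return shift ? '!' : '1';
        case 0x1E: return shift ? 'A' : 'a';
        case 0x30: return shift ? 'B' : 'b';
        case 0x2E: return shift ? 'C' : 'c';
        case 0x39: return ' ';
        default:   return '\0';   // unmapped or non-printable key
    }
}

constexpr KeyResult translateScancode(KeyboardState state, unsigned char scancode)
{
    const bool released = (scancode & 0x80) != 0;   // bit 7 marks the 'off' value
    const unsigned char makeCode = scancode & 0x7F;

    // Left shift (0x2A) and right shift (0x36) only change the state.
    if (makeCode == 0x2A || makeCode == 0x36)
    {
        state.shiftHeld = !released;
        return KeyResult{state, '\0'};
    }

    // Releasing an ordinary key produces no character.
    if (released) return KeyResult{state, '\0'};

    return KeyResult{state, lookupUsLayout(makeCode, state.shiftHeld)};
}
```

The IRQ1 handler would call translateScancode with the byte read from port 0x60 and keep the returned state for the next interrupt. Caps lock and num lock could be tracked in the same structure.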
Should you decide to write a bootloader, you could also use the KBC to prevent accesses at addresses higher than 1 MiB from wrapping around, using the A20 gate. As it is, we only really need to enable and disable the LEDs on the keyboard. However, the logical organisation of a computer continues to manifest itself; we can also reset the computer by writing the byte 0xFE to the port 0x64. Isn't legacy stuff wonderful?
So, to enable the LEDs, we send a command byte (0xED) to the port 0x60, followed by a data byte with a precise bit format.
Bits | Field size (bits) | Description |
0 | 1 | Scroll lock |
1 | 1 | Num lock |
2 | 1 | Caps lock |
3 : 7 | 5 | Unused |
This will switch on the LEDs corresponding to the bits set in the byte we send to port 0x60, and switch off the rest.
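Building the data byte from the table above is a matter of setting the right bits. The helper name ledByte is made up for this sketch.

```cpp
// Builds the data byte that follows the 0xED command.
// Bit 0 is scroll lock, bit 1 is num lock, bit 2 is caps lock; bits 3 to 7 stay zero.
constexpr unsigned char ledByte(bool scrollLock, bool numLock, bool capsLock)
{
    return static_cast<unsigned char>((scrollLock ? 0x01 : 0x00) |
                                      (numLock    ? 0x02 : 0x00) |
                                      (capsLock   ? 0x04 : 0x00));
}
```

Switching on only the caps lock LED would then be outportByte(0x60, 0xED) followed by outportByte(0x60, ledByte(false, false, true)).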
What now?
We've covered everything we set out to; hopefully, you've now got three fully functional drivers which you can use in your kernel. From here, you could move on to multi-tasking. However, we need to cover memory management first, so that we can provide each process with its own perspective of memory and provide memory to our programs. Since I'm not going to be able to post another article for a little while due to exams, the adventurous may want to continue reading, while I cover some snippets, filling in a useful gap which will make debugging easier.
MSRs
Because sometime in the future you will need this snippet (LAPICs have some very interesting features), here's a basic code snippet to read and write a Model Specific Register. We use the unsigned long long datatype here because all MSRs are 64 bits wide.
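struct MSR
{
    unsigned int Offset;

    // wrmsr takes the MSR index in ECX and the value in EDX:EAX (high half : low half).
    void SetValue(unsigned long long value)
    {
        asm volatile ("wrmsr" : : "a"((unsigned int)(value & 0xFFFFFFFF)),
            "d"((unsigned int)(value >> 32)), "c"(Offset));
    }

    // rdmsr takes the MSR index in ECX and returns the value in EDX:EAX.
    unsigned long long GetValue()
    {
        unsigned int low = 0, high = 0;
        asm volatile ("rdmsr" : "=a"(low),
            "=d"(high) : "c"(Offset));
        return ((unsigned long long)high << 32) | low;
    }
};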
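The rdmsr and wrmsr instructions move the 64-bit value through two 32-bit registers, with the low half in EAX and the high half in EDX. The splitting and joining can be checked separately from the inline assembly; the helper names lowHalf, highHalf, and combineHalves below are invented for this sketch.

```cpp
// The half of the value that travels in EAX.
constexpr unsigned int lowHalf(unsigned long long value)
{
    return static_cast<unsigned int>(value & 0xFFFFFFFFull);
}

// The half of the value that travels in EDX.
constexpr unsigned int highHalf(unsigned long long value)
{
    return static_cast<unsigned int>(value >> 32);
}

// Rebuilds the 64-bit value. The cast happens before the shift,
// because shifting a 32-bit integer by 32 bits is not defined.
constexpr unsigned long long combineHalves(unsigned int high, unsigned int low)
{
    return (static_cast<unsigned long long>(high) << 32) | low;
}
```

Keeping these as plain functions makes the easy-to-miss mistakes (swapped halves, shifting before widening) testable without running in ring 0.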
Stack trace
To understand the code snippet that follows, you need to know the layout of the stack under the x86 CDECL calling convention. If GCC decides to change the calling convention, you'll also need to rewrite this. You can find a comprehensive listing here; it's a little beyond the scope of this tutorial, but recommended reading nonetheless.
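void printStackTrace(unsigned int ebp)
{
    // Each frame starts with the caller's saved EBP, followed by the return address.
    unsigned int *stackPosition = (unsigned int *)ebp;
    while(stackPosition != 0)
    {
        unsigned int methodLocation = *(stackPosition + 1);
        writeHexadecimal(methodLocation);
        if(*stackPosition != 0)
            writeLine();
        // Follow the saved EBP to the caller's frame; a zero value ends the chain.
        stackPosition = (unsigned int *)(*stackPosition);
    }
}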
Now that you have a list of addresses, you just have to resolve them to kernel symbols and method names. This isn't too difficult to do; you just need to use the Multiboot header (your Main function received a pointer to it) to get the ELF section header string table, then move on from there to the symbol and string tables. Scan through the symbol table until there's a symbol that covers each value, then look it up in the string table. If this sounds complex, then have no fear because the ELF specifications are below.
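The "scan through the symbol table" step can be sketched as follows. The Symbol structure is a simplified, hypothetical stand-in for an entry that has already been parsed out of the ELF symbol and string tables (in a real 32-bit ELF symbol, the corresponding fields are st_value, st_size, and st_name), and both findSymbol and the exampleSymbols table are invented for this example.

```cpp
// Simplified, already-parsed view of one function symbol (hypothetical).
struct Symbol
{
    unsigned int address;   // where the function starts
    unsigned int size;      // how many bytes it occupies
    const char *name;       // name looked up from the string table
};

// Returns the index of the symbol whose range covers the address, or -1 if none does.
constexpr int findSymbol(const Symbol *symbols, unsigned int count, unsigned int address)
{
    for (unsigned int i = 0; i < count; ++i)
    {
        if (address >= symbols[i].address &&
            address - symbols[i].address < symbols[i].size)
            return static_cast<int>(i);
    }
    return -1;
}

// Made-up table used only to demonstrate the lookup.
constexpr Symbol exampleSymbols[] = {
    {0x00100000, 0x40, "kernelMain"},
    {0x00100040, 0x20, "rtcHandler"},
    {0x00100060, 0x80, "printStackTrace"},
};
```

A return address taken from the stack trace, such as 0x00100050, falls inside the second entry, so the trace could print exampleSymbols[1].name next to the raw address.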
References