*** In the older source below, Red and Blue in the RGBA macro are reversed. Sorry, I didn't notice.
Introduction
Previous article:
Well, this is not the article that was originally intended, which was supposed to cover getting the USB hub up and running baremetal with the keyboard and mouse. While I achieved that with the Pi1 (as per the updated screenshot added to my previous article), I got a present for Christmas in the shape of a Pi3 B+ and a Pi camera board.
Finally, having access to such a board, I immediately went to try out my code and it was then that all hell broke loose. Not only did my Pi3 specific code not work but I could not get any baremetal code running on the Pi3 at all and that set the stage for a very frustrating 3 weeks working out why.
I also leave a warning that, as with many Pi things, there are some answers I just don't know because they are shrouded in undisclosed detail.
This is the result of my running code. We have some visual board details and a few running graphics tests (IFS fern, matrix rain, shading) as well as blinking the activity LED ... okay, I know, not very exciting.
Background
The Pi3 B+ uses a 1.2GHz 64-bit quad-core ARMv8 CPU. However, the startup firmware firmly delivers the CPU to you running in 32-bit mode, and the default Linux distros it runs are 32-bit. Once I got everything working, I did find it trivial to switch the CPU into 64-bit mode, and I may take that further at a later stage in another article. For this article, we will have the CPU running in 32-bit mode, as the intention is to have one set of code that will run on any of the Pi versions.
The Pi Foundation has supposedly made the firmware follow a specific load sequence with the aim of making all Pi versions compatible. I say supposedly because, as we will discuss, I could not get it to work. There is some missing detail. The firmware boot sequence is supposed to follow this load sequence, do some memory shuffling and boot from address 0x8000.
RPi3 Boot sequence:
- Check for kernel8.img and if found, load it and boot in 64bit mode.
- If not found, check for kernel8-32.img and if found, load it and boot in 32bit mode
- If not found, check for kernel7.img and if found, load it and boot in 32bit mode
- If not found, check for kernel.img and if found, load it and boot in 32bit mode
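The fallback order above can be sketched in a few lines of C. This is only an illustration of the search order as listed, not the firmware's actual code. The names kernel_order and pick_kernel are hypothetical, and the file list is simply passed in as an array of names.

```c
#include <stddef.h>
#include <string.h>

/* Kernel names in the order the Pi3 firmware is described as trying them. */
static const char *const kernel_order[] = {
    "kernel8.img",    /* booted in 64-bit mode */
    "kernel8-32.img", /* booted in 32-bit mode */
    "kernel7.img",    /* booted in 32-bit mode */
    "kernel.img",     /* booted in 32-bit mode */
};

/* Return the first name from kernel_order that is present in files[],
 * or NULL if none of the four kernel names is on the card. */
const char *pick_kernel(const char *const files[], size_t count)
{
    for (size_t i = 0; i < sizeof kernel_order / sizeof kernel_order[0]; i++)
        for (size_t j = 0; j < count; j++)
            if (strcmp(kernel_order[i], files[j]) == 0)
                return kernel_order[i];
    return NULL;
}
```

Given a card holding kernel.img, kernel7.img and kernel8-32.img, this order selects kernel8-32.img, because it is the highest-priority name present.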
That sequence is correct and worked, but what would not work for me was picking up the processor from 0x8000. I could get no baremetal image reliably booted from that address. I tried almost every bit of baremetal code on the internet that supposedly ran on the Pi3 and all failed. I had actually thought the board was dead, but a quick test with a Raspbian image disk and it burst into life. Even code from sites like David Welch's (https://github.com/dwelch67/raspberrypi) would not pick the Pi3 up. Hitting the forums, it was clear I was not alone: there were a number of us who could not get the Pi3 B+ up. Testing quickly showed the processor either never started from 0x8000 or, if it did, the firmware had not put my file in memory correctly.
Update: Some in the Pi community have taken exception to me suggesting that there is a problem or bug with the loader on the Pi3, even going so far as to basically call me a liar. They insist that XYZ code or repository works on the Pi3 without problem. All I can say is that the boards I have seem to struggle with most code that starts at 0x8000 and are very flaky at loading. I don't have any reason to spend many more hours on the issue, as the loader provides no extra features that I have been able to glean over just starting the processor at 0x0 using the config.txt file.
After a lot of frustration, I finally found one site whose code would pick up the Pi3, and in fact every sample on it worked. The site (https://github.com/PeterLemon/RaspberryPi) is by Peter Lemon, AKA krom. What all of his code did was introduce a config.txt file onto the SD card with these lines:
kernel_old=1
disable_commandline_tags=1
disable_overscan=1
framebuffer_swap=0
The first line is the most important line. It tells the firmware to deliver the processor for startup at the memory address 0x0. The moment I had my code loading from 0x0, everything burst into life. I honestly don't know if this is a bug with the firmware which may be fixed soon or something permanent. Peter Lemon doesn't detail how or why he knew to do this. All I can say is for now it is required. So your SD card for a Pi3 in baremetal ends up with 4 files on it: bootcode.bin, start.elf, config.txt and kernel8-32.img. The kernel8-32.img could also be kernel7.img. There appears to be no real difference between the two if you look at the boot sequence above. I am using kernel8-32.img so I know it is a Pi3 file, leaving kernel7.img for Pi2 kernel files.
Updated: I have been advised that fixup.dat needs to go on the SD card or the Pi3B will report only 256MB of memory. None of our play samples goes anywhere near that amount of memory so I will leave that up to you if you intend to write larger code.
Updated: The new composite SD image contains 3 kernel files, kernel.img (Pi1), kernel7.img (Pi2) and kernel8-32.img (Pi3). The SD card will now have 6 files, but will work on any Pi board it is placed in.
Using the Code
The Arm tools have been updated. They have advanced to 6.2, from the old 5.4 used in the previous articles. When I was having trouble, I updated my tools just in case (https://developer.arm.com/open-source/gnu-toolchain/gnu-rm/downloads).
As per previous articles, I just set up a batch file for my compiling like this (a single command line):
g:\pi\gcc_pi_6_2\bin\arm-none-eabi-gcc -O2 -mfpu=neon-fp-armv8 -mfloat-abi=hard -march=armv8-a+crc -mtune=cortex-a53 -nostartfiles -g3 -Wa,--defsym,BCM2837=1 start.s main.c -Wl,-T,rpi3.ld -o kernel.elf -lc -lm
You can see the processor options matching the Arm8 processor on the Pi3:
-mfpu=neon-fp-armv8 -mfloat-abi=hard -march=armv8-a+crc -mtune=cortex-a53
UPDATE for SRC 2a: The new source 2a does not require the symbol to be pushed. Its Start.S file does all the auto-detection of CPU type and peripheral base address and corrects for everything. The batch file included in that code does not push the symbol, so take care not to mix code between the original and 2a zip files.
There is a new command that pushes the CPU type to the assembler code in Start.S:
-Wa,--defsym,BCM2837=1
It pushes "BCM2837=1" into the assembler for Start.S, which controls whether certain blocks of assembler are included (Update: Pi1 BCM2835=1, Pi2 BCM2836=1). This is in preparation to rejoin the code with the standard Pi1 code from the previous articles, as I will need Start.S to add/delete code depending on which Pi we are compiling for.
Start.S is aware the Pi2/Pi3 are quad core and it starts and parks cores 1, 2 and 3 in a deadloop. Only core 0 continues on and runs into the main code in this example, so it really looks like a Pi1. Again, all this is done so I can merge the Pi3 code back into the older article code.
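For readers who prefer C to assembler, the parking decision can be sketched as below. This is not the code in Start.S, which does the job in assembler. It assumes the usual Cortex-A7/A53 convention that the low two bits of the MPIDR register hold the core number, and the helper names (core_id_from_mpidr, core_should_park, read_mpidr, park_secondary_cores) are hypothetical.

```c
#include <stdint.h>
#include <stdbool.h>

/* Assumption: the low two bits of MPIDR give the core number (0..3)
 * on the quad-core Pi2/Pi3 processors. */
static inline uint32_t core_id_from_mpidr(uint32_t mpidr)
{
    return mpidr & 0x3u;
}

/* Only core 0 carries on into the main code; cores 1, 2 and 3 are parked. */
static inline bool core_should_park(uint32_t mpidr)
{
    return core_id_from_mpidr(mpidr) != 0;
}

#if defined(__arm__)
/* Read MPIDR via coprocessor 15 (32-bit ARM targets only). */
static inline uint32_t read_mpidr(void)
{
    uint32_t value;
    __asm__ volatile("mrc p15, 0, %0, c0, c0, 5" : "=r"(value));
    return value;
}

/* Hypothetical C equivalent of the parking step: secondary cores spin
 * in a deadloop and never return, core 0 returns immediately. */
void park_secondary_cores(void)
{
    if (core_should_park(read_mpidr())) {
        for (;;) {
            /* deadloop */
        }
    }
}
#endif
```

The pure helper functions are separated from the register read so the decision logic can be checked on a desktop compiler, while the part guarded by __arm__ only builds for the 32-bit target.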
The more advanced users may want to open Start.S and look at the assembler code that is run before it jumps to the C start point called kernel_main in main.c.
UPDATED: In the original code, there was a requirement to run two critical sections of code immediately in the initial C code. When merging Start.S with the ability to run the Pi1/Pi2, I was able to move these two critical sections down into the assembler code. The new Start.S both clears the .BSS and runs the automatic peripheral base detection, as well as a new feature of reading the CPUID from the Arm processor and placing the results in two global variables:
RPi_IO_Base_Addr = Peripheral Base Address auto detected (0x3F000000 or 0x20000000)
RPi_CpuId = ARM processor CPUID
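One plausible way to get from the CPUID to the peripheral base is sketched below. I am assuming here that the decision is made on the ARM part number field (bits 15:4 of the CPUID), with the ARM1176 of the Pi1 mapping to 0x20000000 and everything else to 0x3F000000. The real Start.S may do it differently, and the names cpu_part_number and guess_io_base are hypothetical.

```c
#include <stdint.h>

/* ARM part numbers as they appear in bits [15:4] of the CPUID. */
#define PART_ARM1176    0xB76u /* Pi1 */
#define PART_CORTEX_A7  0xC07u /* Pi2 */
#define PART_CORTEX_A53 0xD03u /* Pi3 */

/* Extract the 12-bit primary part number from a CPUID value. */
static inline uint32_t cpu_part_number(uint32_t cpuid)
{
    return (cpuid >> 4) & 0xFFFu;
}

/* Assumed rule: the ARM1176 (Pi1) has its peripherals at 0x20000000,
 * every later part has them at 0x3F000000. */
uint32_t guess_io_base(uint32_t cpuid)
{
    return cpu_part_number(cpuid) == PART_ARM1176 ? 0x20000000u
                                                  : 0x3F000000u;
}
```

For example, a Cortex-A53 CPUID such as 0x410FD034 has part number 0xD03 and so maps to 0x3F000000, while an ARM1176 CPUID such as 0x410FB767 maps to 0x20000000.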
You can access these in C code by declaring an external like so:
extern uint32_t RPi_IO_Base_Addr; extern uint32_t RPi_CpuId;
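As a small sketch of why the auto-detected base is handy, the snippet below builds a register address from it. The GPIO block offset of 0x200000 is the usual one for these chips, but the helper periph_address and the macro PERIPH_REG are hypothetical names of mine and not part of the supplied source.

```c
#include <stdint.h>

/* Defined by Start.S on the real target (names as given in this article). */
extern uint32_t RPi_IO_Base_Addr;
extern uint32_t RPi_CpuId;

/* Offset of the GPIO register block from the peripheral base. */
#define GPIO_OFFSET 0x200000u

/* Absolute address of a peripheral register for a given base address. */
static inline uintptr_t periph_address(uint32_t io_base, uint32_t offset)
{
    return (uintptr_t)io_base + offset;
}

/* Convenience macro: pointer to a register using the detected base.
 * Only meaningful on the Pi itself, where Start.S has set the global. */
#define PERIPH_REG(offset) \
    ((volatile uint32_t *)periph_address(RPi_IO_Base_Addr, (offset)))
```

On the target you would write something like PERIPH_REG(GPIO_OFFSET) to reach the first GPIO register, and the same binary then resolves to 0x20200000 on a Pi1 and 0x3F200000 on a Pi2 or Pi3.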
The slightly modified source will now display the detected peripheral base address as well as the CPU.
Here are those details from my Pi3 B+:
Compare that to the Pi1 details:
UPDATE for SRC 2a: Technically, the Arm6 code produced by Src 2a in the single Kernel.img will run on every Pi because of all the auto-detection code. It does, however, run slower on the Pi2 and Pi3 because, being compiled for Arm6, it does not use the more advanced instructions available on those processors. If you want to try it, put only the one file Kernel.img on the card and you will see that behaviour. This is a common embedded trick: we would often have something like a bootstrap loader which would go in, sort out what board we are dealing with and then download the correct image file for that board.
As per the above, the new SD composite has 3 kernels for each of the Pi boards and the boot sequence discussed above will ensure the card will work on any Pi.
I have provided 3 batch files, Pi1.bat (builds kernel.img), Pi2.bat (builds Kernel7.img) and Pi3.bat (builds Kernel8-32.img) which will make the appropriate Kernel image file for each of the different Pi boards. I copy whichever version I am working on to make.bat which my compiler links with.
There is a special linker file "rpi3.ld" which sets the start address, the .BSS section and .end, which defines where the heap starts. It's pretty straightforward, and any introduction to simple GCC linker scripts will explain what it does.
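To show what the .BSS symbols from a linker script are typically used for, here is a minimal C sketch of a BSS clear. In the current source this job is done in assembler inside Start.S, so treat this purely as an illustration. The function name clear_words is hypothetical, and the symbol names in the usage note are assumptions, since the names used by rpi3.ld are not listed in this article.

```c
#include <stdint.h>

/* Zero every 32-bit word in the half-open range [start, end).
 * A startup routine would call this once with the two addresses
 * that the linker script exports for the .BSS section. */
void clear_words(uint32_t *start, const uint32_t *end)
{
    while (start < end)
        *start++ = 0;
}
```

On the target, the call would look something like clear_words(&__bss_start__, &__bss_end__), with both symbols declared extern in C and defined by the linker script (again, those two names are assumed here). The C standard expects uninitialised globals to read as zero, which is why this has to happen before kernel_main runs.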
I have tried to keep this article brief as it is really just a temporary report, but it may be useful for those trying to specifically baremetal the Pi3.
Points of Interest
With the Pi3, I now have further things I may look at in future and do articles on:
- Using all 4 cores
- 64 Bit coding
- Baremetal access to the Camera
Hopefully, the next article will be as intended on baremetal access to the USB with keyboard and mouse drivers and tested on the Pi1 and Pi3.
History
- Version 0.4: Added new version 2a src code with extended auto-detect code
- Version 0.3: Objections about 0x8000 loading and fixup.dat from some in Pi Community added.
- Version 0.2: Code modified to run Pi1 & 2 (Start.S heavily modified)
- Version 0.1: Initial release
Old Code