PL and OS

Hacking software and hardware at BU

The Dark Art of Linker Scripts

There are many ways in which writing operating system code is like writing ordinary application code: you still need libraries, runtime support, and you use many common algorithms. However, there are also many ways in which it is different: for example, you are responsible for providing the libraries and runtime support. Most importantly, a kernel does not have the luxury of taking a simplistic view of memory.

A typical application assumes a flat memory layout with standardized addresses for the program entry point, stack, and heap. That illusion is provided by the operating system through virtual memory, which is implemented in hardware by the MMU. The system compiler produces object files that contain symbol entries, and the system linker combines them into a single image under the direction of a linker script. The linker writes the final address of every symbol in the program (for dynamic linking, this step is deferred until runtime). You can list the symbols and addresses of any unstripped image with the “nm” tool. You’ll notice that for ordinary applications the addresses are very similar from program to program. That is because there is a default system-wide linker script for regular programs. You can see it by typing “ld --verbose”, but it is likely to be very long and very confusing. Linker scripts are written in what would now be termed a “domain-specific language”, whose sole purpose is to describe the layout of sections and symbols in an output file that combines many inputs.

There are myriad options and features, mostly tacked on here and there to support different platforms, formats, or architectures. I will focus on our barebones ARM image in the ubiquitous Executable and Linking Format (ELF). You can read all about ELF and its use in ordinary circumstances elsewhere; I am going to focus on Puppy’s needs. When examining ELF files you will want to be familiar with the tools “readelf” and “objdump” (and keep in mind that we are using the “arm-none-linux-gnueabi-” tool-chain). Let’s run “arm-none-linux-gnueabi-readelf -l” on “puppy.elf”:

Elf file type is EXEC (Executable file)
Entry point 0x80008000
There are 5 program headers, starting at offset 52

Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  EXIDX          0x00c2c0 0xc000c2c0 0x8000c2c0 0x000f8 0x000f8 R   0x4
  LOAD           0x008000 0x80008000 0x80008000 0x0017c 0x0017c R E 0x8000
  LOAD           0x009000 0xc0009000 0x80009000 0x040d0 0x040d0 RWE 0x8000
  LOAD           0x010000 0xc0010000 0x8000d0d0 0x00000 0x08000 RW  0x8000
  GNU_STACK      0x000000 0x00000000 0x00000000 0x00000 0x00000 RWE 0x4

 Section to Segment mapping:
  Segment Sections...
   00     .ARM.exidx
   01     .startup .stubtext .stubARMidx
   02     .text .rodata .ARM.extab .ARM.exidx .rodata.str1.4 .data
   03     .bss

Program headers list “segments”, which describe the memory layout. Section headers describe “sections”, which partition the image file itself. Not all sections are loaded into memory; some just contain metadata for debuggers and the like. In addition, sections that appear consecutively on disk may be loaded into disparate memory locations. For example, although “.data” and “.bss” are neighbors in the file, they are loaded into different program segments. In fact, “.bss” is interesting because it takes up zero space on disk but some amount of space in memory. That is because it consists of space for global variables that have no initial contents to store. If you look carefully, you can see how that is represented in the program headers: the last LOAD segment has a FileSiz of 0x00000 but a MemSiz of 0x08000.
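To make the FileSiz/MemSiz distinction concrete, here is a minimal C sketch of how a generic ELF loader might populate one LOAD segment. It is not Puppy’s code; the function name load_segment and its parameters are hypothetical, chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: populate one LOAD segment in memory.
 * The first file_size bytes are copied from the image; the remaining
 * (mem_size - file_size) bytes exist only in memory and are zeroed.
 * A segment holding just .bss has file_size == 0. */
void load_segment(unsigned char *dest, const unsigned char *image,
                  size_t file_size, size_t mem_size)
{
    assert(file_size <= mem_size);
    if (file_size > 0)
        memcpy(dest, image, file_size);
    memset(dest + file_size, 0, mem_size - file_size);
}
```

Called with file_size 0 and mem_size 0x8000, this would do nothing but clear 32kB, which is all the last LOAD segment above asks for.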

The most important aspect of the program header to note is that there is a difference between VirtAddr and PhysAddr. This is one of the distinguishing aspects of a kernel image vs an ordinary program image. A kernel has to deal with the real physical memory underlying the virtual memory abstraction. It also has to live inside the virtual memory world that it ends up creating. In other words, kernel symbols must be in two places at the same time. This trick can be pulled off by linkers because they understand that the memory addresses assigned to symbols may be completely unrelated to where the symbol ends up living inside the image. Doing this, however, requires work on your part: you must bootstrap the virtual memory system so that these symbol addresses are actually meaningful and not wild references off into unmapped areas.

In Puppy, the job of bootstrapping virtual memory is handled by the “stub”. The stub is placed at the entry address. There is a small bit of assembler which sets up a temporary stack — enough to run bare metal C code. The stub C code then does the necessary initialization of the memory hardware. This needs to be done in so-called “identity-mapped” space: where virtual and physical addresses are equivalent. Why? Because the moment you toggle on the MMU, the program counter is now working in virtual memory. The simplest way to keep the kernel running (and not worry about pipeline issues) is to make sure that the next instruction is the same regardless of whether you are addressing virtually or physically. Once the stub returns, the assembler sets up the permanent system stacks (in high virtual memory) and then jumps to the real kernel entry point, in high virtual memory.

Therefore, we lay out the sections for the stub according to this portion of the linker script:

. = 0x80008000;
_physical_start = .;
_stub_start = .;
/* The stub runs in identity-mapped space */
.startup : { init/startup.o (.text) }
.stubtext : { init/stub.o (.text) }
.stubARMtab : { init/stub.o (.ARM.extab) }
.stubARMidx : { init/stub.o (.ARM.exidx) }
_stub_end = ALIGN(0x1000);
_stub_len = _stub_end - _stub_start;

I have selected the starting address, in physical memory, to be the same as Linux. Keep in mind that on the Beagle Board (OMAP353x), there are only two banks of SDRAM, and the first one has physical addresses beginning at 0x80000000. This is a bit weird if you are coming from x86, where physical memory starts at 0, but it makes sense because many other addresses are used for memory-mapped device registers. The linker script maintains a running counter of “virtual memory” addresses, and you can set it to a value using a line such as “. = 0x80008000;”. The primary job of the linker script is to create output sections and symbols with addresses derived from the counter. The output sections can mix and match input sections from the various object files. Here I have specified that the input section “.text” of “init/startup.o” should go into the output section named “.startup”. Note that “.text” is a standardized section name which is used for executable code. The output section “.stubtext” contains the stub’s C code. The stack unwinding information in “.stubARMtab” and “.stubARMidx” is not really necessary for me, because I am not using C++ exceptions, but due to the vagaries of ARM linking, I must put it somewhere. Finally, I define a few convenient symbols for finding the starting and ending addresses of the stub, as well as its length.
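The arithmetic behind the counter and ALIGN can be modelled in a few lines of C. This is only a sketch of the bookkeeping, not of how LD is implemented; align_up and stub_len are hypothetical names, and the 0x17c used in the note below is the FileSiz of the stub segment in the readelf output.

```c
#include <assert.h>
#include <stdint.h>

/* Round addr up to the next multiple of align (a power of two).
 * This mirrors what ALIGN(align) computes from the location counter. */
uint32_t align_up(uint32_t addr, uint32_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + align - 1) & ~(align - 1);
}

/* Model of the stub fragment: the counter starts at stub_start,
 * the stub's output sections advance it by stub_bytes in total,
 * and _stub_end is the counter rounded up to a page boundary. */
uint32_t stub_len(uint32_t stub_start, uint32_t stub_bytes)
{
    uint32_t dot = stub_start;                  /* . = 0x80008000;        */
    dot += stub_bytes;                          /* output sections placed */
    uint32_t stub_end = align_up(dot, 0x1000u); /* _stub_end = ALIGN(0x1000); */
    return stub_end - stub_start;               /* _stub_len              */
}
```

With a start of 0x80008000 and 0x17c bytes of stub, the model gives a _stub_end of 0x80009000 and a _stub_len of 0x1000, which agrees with the PhysAddr of the second LOAD segment.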

The rest of the kernel is placed into high virtual memory. The reason for this is to leave the low virtual memory addresses for the use of ordinary programs. This is a pretty standard technique for OS kernels that use paged virtual memory. The main problem is that PhysAddr and VirtAddr can no longer be equivalent. That means we have to override the default. This can be accomplished with the “AT” option. It is also important that the physical addresses be kept as close together as possible. This is because the final image is not going to be ELF — the bootloader doesn’t understand it — but rather a raw binary image. Any discontinuities between sections will be filled in with zeroes, resulting in a much larger image file than you probably intended.
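To see why the physical addresses matter for the raw image, consider this rough C sketch. It assumes the flat binary simply spans from the lowest load address to the end of the highest region that has file contents; the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct load_region {
    uint32_t phys_addr;  /* PhysAddr: where the bytes are loaded      */
    uint32_t file_size;  /* FileSiz: bytes actually present in a file */
};

/* Size of a flat binary that covers every region with file contents.
 * Gaps between regions become padding, so the result depends on how
 * far apart the load addresses are, not just on how much code exists. */
uint32_t raw_image_size(const struct load_region *r, size_t n)
{
    uint32_t lo = UINT32_MAX, hi = 0;

    assert(r != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (r[i].file_size == 0)
            continue;                 /* e.g. .bss has nothing to store */
        if (r[i].phys_addr < lo)
            lo = r[i].phys_addr;
        if (r[i].phys_addr + r[i].file_size > hi)
            hi = r[i].phys_addr + r[i].file_size;
    }
    return (hi > lo) ? hi - lo : 0;
}
```

Fed the three LOAD segments from the readelf output, the model yields 0x50d0 bytes. If the kernel segment were instead loaded at its virtual address 0xc0009000, the same model yields 0x400050d0 bytes, a little over 1GB of mostly padding.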

. = 0xC0008000;
/* From now on, virtual and physical addresses are separate. */
. += _stub_len; /* skip stub code */

We need to skip over the stub in virtual memory. The reason is simply that we are mapping a 1MB region that begins at 0xC0000000 to the physical address 0x80000000. Therefore, the stub code is going to take up the page at 0xC0008000. It can be reclaimed later.
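The two mappings involved (identity for the stub, high memory for the kernel) can be modelled on a host machine with a plain array standing in for the first-level page table. This is a simplified sketch, not Puppy’s stub: the descriptor flag bits are an assumption based on the ARM short-descriptor 1MB section format, and map_section and translate are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

#define SECTION_MASK  0xFFF00000u  /* top 12 bits select a 1MB section */
/* Assumed flags: bits[1:0] = 0b10 marks a section entry and
 * bits[11:10] = 0b11 grants full access. Real code needs more care. */
#define SECTION_FLAGS 0x00000C02u

/* Map the 1MB section at virt to the 1MB section at phys.
 * l1 must have 4096 entries, one per megabyte of address space. */
void map_section(uint32_t *l1, uint32_t virt, uint32_t phys)
{
    assert((virt & ~SECTION_MASK) == 0 && (phys & ~SECTION_MASK) == 0);
    l1[virt >> 20] = (phys & SECTION_MASK) | SECTION_FLAGS;
}

/* Software table walk: returns 1 and stores the physical address
 * if virt is covered by a section entry, otherwise returns 0. */
int translate(const uint32_t *l1, uint32_t virt, uint32_t *phys)
{
    uint32_t entry = l1[virt >> 20];

    if ((entry & 0x3u) != 0x2u)
        return 0;
    *phys = (entry & SECTION_MASK) | (virt & ~SECTION_MASK);
    return 1;
}
```

After mapping 0x80000000 to itself and 0xC0000000 to 0x80000000, the entry address 0x80008000 translates to itself, while a kernel address such as 0xc0009018 translates to 0x80009018. The stub page is therefore visible at both 0x80008000 and 0xC0008000, which is why the counter has to skip over it.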

_kernel_start = .;
/* The virtual address is taken from the counter, but the physical
* address must be specified with AT. */
.text : AT ( _stub_end ) { *(.text) }
.rodata : AT ( LOADADDR (.text) + SIZEOF (.text) ) { *(.rodata) }
. = ALIGN(0x1000);
_kernel_readonly_end = .;
.data : AT ( LOADADDR (.text) + _kernel_readonly_end - _kernel_start ) { *(.data) }
.bss : AT ( LOADADDR (.data) + SIZEOF (.data) ) { *(.bss) }
. = ALIGN(0x1000);
. = . + 0x1000; /* 4kB of stack memory */
svc_stack_top = .;

Since _stub_end is a physical address, “AT (_stub_end)” should be fairly self-explanatory. The subsequent sections are somewhat more difficult. Since we do not have the advantage of a physical memory counter, we must calculate the appropriate locations ourselves. Fortunately, we are given the LOADADDR and SIZEOF functions, which return a section’s PhysAddr and length respectively. Using these tools we lay out each output section consecutively in physical memory, combining the input sections from all object files (the ones not already used). Then we advance the counter by a page and define a symbol representing the top of the supervisor stack, which is used by the startup assembler.
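The chain of AT expressions is easier to follow when written out as ordinary arithmetic. The C sketch below mirrors the fragment line by line under a simplifying assumption: input section alignment is ignored. The struct, the function name, and the section sizes in the note are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

struct placement {
    uint32_t kernel_start;         /* _kernel_start (virtual)        */
    uint32_t kernel_readonly_end;  /* _kernel_readonly_end (virtual) */
    uint32_t text_lma, rodata_lma, data_lma, bss_lma; /* physical    */
};

/* Recompute the load addresses the way the AT expressions do. */
struct placement place_kernel(uint32_t stub_end, uint32_t stub_len,
                              uint32_t text_size, uint32_t rodata_size,
                              uint32_t data_size)
{
    struct placement p;
    uint32_t dot = 0xC0008000u + stub_len;  /* . = 0xC0008000; . += _stub_len; */

    assert((stub_end & 0xFFFu) == 0);       /* _stub_end is page aligned */
    p.kernel_start = dot;                   /* _kernel_start = .;        */
    p.text_lma = stub_end;                  /* AT ( _stub_end )          */
    dot += text_size;
    p.rodata_lma = p.text_lma + text_size;  /* LOADADDR(.text) + SIZEOF(.text) */
    dot += rodata_size;
    dot = (dot + 0xFFFu) & ~0xFFFu;         /* . = ALIGN(0x1000);        */
    p.kernel_readonly_end = dot;
    /* LOADADDR(.text) + _kernel_readonly_end - _kernel_start */
    p.data_lma = p.text_lma + (p.kernel_readonly_end - p.kernel_start);
    p.bss_lma = p.data_lma + data_size;     /* LOADADDR(.data) + SIZEOF(.data) */
    return p;
}
```

For example, with a _stub_end of 0x80009000, a _stub_len of 0x1000, and made-up sizes of 0x3000 for .text, 0x7f0 for .rodata and 0xd0 for .data, the model places .data at 0x8000d000 and .bss at 0x8000d0d0. The sizes were picked so that the last value lines up with the PhysAddr of the .bss segment in the readelf output; the real per-section sizes are not shown there.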

_kernel_pages = (_kernel_end - _kernel_start) / 0x1000;
_kernel_readonly_pages = (_kernel_readonly_end - _kernel_start) / 0x1000;
_kernel_readwrite_pages = (_kernel_end - _kernel_readonly_end) / 0x1000;
/* compute the physical address of the l1table */
_l1table_phys = l1table - ((_kernel_start - _stub_len) - _stub_start);
/* To make arm-none-linux-gnueabi-gcc shut up: */
__aeabi_unwind_cpp_pr0 = .;
__aeabi_unwind_cpp_pr1 = .;

Linker scripts let us define arbitrary symbols, so I add some useful metadata about the kernel itself. In addition, there is another exception-related workaround: the two symbols at the end aren’t used by my code, but I can’t convince GNUEABI GCC to stop emitting references to them. This makes the linker stop complaining about undefined symbols.
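The metadata expressions are plain integer arithmetic, so they can be checked on a host machine. The following C sketch restates two of them; the function names and the sample table address in the note are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE 0x1000u

/* (end - start) / 0x1000, as used for the _kernel_*_pages symbols.
 * The division truncates, so both ends should be page aligned. */
uint32_t pages_between(uint32_t start, uint32_t end)
{
    assert(start <= end);
    return (end - start) / PAGE_SIZE;
}

/* _l1table_phys = l1table - ((_kernel_start - _stub_len) - _stub_start)
 * The parenthesised term is the distance between the virtual and the
 * physical base of the image. */
uint32_t l1table_phys(uint32_t l1table_virt, uint32_t kernel_start,
                      uint32_t stub_len, uint32_t stub_start)
{
    return l1table_virt - ((kernel_start - stub_len) - stub_start);
}
```

With a _kernel_start of 0xC0009000, a _stub_len of 0x1000 and a _stub_start of 0x80008000, the subtracted term is 0x40000000, so a table at the made-up virtual address 0xC0014000 would come out at 0x80014000. The formula only gives the right answer for symbols whose virtual and load addresses differ by exactly that constant.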

Let’s check the output of “arm-none-linux-gnueabi-nm puppy.elf”:

80008000 T _reset
c0009018 T c_entry

Looks good, for now. But this won’t be the end of tweaking the linker script. I’ve already spotted some small details that I would like to fix. And you can do other interesting things with linker scripts, like creating sections for constructors, per-CPU variables, or other special purposes. While there is formal documentation for LD, it isn’t much help, and the implementation is not always flawless. I’ve had to be careful with the linker version I use, as some language features change as bugs are fixed or new ones are introduced. Since the linker script language is considered more of a configuration file format than a programming language, it doesn’t get the same level of careful management that might be given to C compiler front-ends. It also doesn’t help that the primary writers of linker scripts are the developers of LD itself, who are intimately familiar with the inner workings of their interpreter.

