This section deals with the steps needed to get a machine from power-up to running your OS code.
How does a machine boot?
Computers are controlled by programs. Some early computers had no program in memory when they were powered up and yet needed a program to tell them how to load programs from disk. This chicken-and-egg scenario came to be known as bootstrapping from the deliberately absurd notion of lifting oneself up by one's bootstraps or bootlaces.
Early machines often required an operator to enter a tiny program into memory by means of a keypad. Such a program would be just enough to load another, larger program from disk. That loaded program would then be big enough to load further code and thus the computer would have been bootstrapped.
Fortunately, manual program loading is no longer required. Computers come with some program code already built in and stored in read-only memory (ROM). ROMs may contain more than just the initial loader. PCs, for example, come with a set of ROM-based routines called the basic input/output system or BIOS which can also be used by initial programs once they have been loaded. The BIOS is limited but sufficient to support access to disks, the display and the keyboard. Unfortunately, network cards are not normally supported so will not be accessible until a driver has been loaded which can control the card.
The steps an IBM-compatible PC takes when being booted are enumerated below.
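- When the CPU receives power or is reset it will start executing the instructions at a predefined address. The original 8086 uses CS:IP address 0xffff:0000. This results in address 0xffff0 which is 16 bytes below 1Mbyte. Later x86 CPUs reset with CS:IP set to 0xf000:fff0, which names the same real-mode address, although a hidden segment base initially places the first fetch 16 bytes below the top of the physical address space. In each case PCs have BIOS code in ROM at that address.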
- Because 16 bytes is too small to contain much code the code at that address will jump to somewhere else in the BIOS.
- The BIOS code will then carry out a number of steps: machine self-tests, scans for bootable devices, allocation of things like device address space, DMA channels and IRQ vectors for those devices, possibly scans and allocations for non-bootable devices (if set to do so in the BIOS setup) and so on. Then it will look for one bootsector to load into memory.
- The bootsector can come from floppy disk, hard disk, plugged-in non-volatile memory or even from a remote device somewhere on a network. The order in which the BIOS searches for a suitable bootsector can be altered in BIOS setup screens. It will load the first valid one it finds. If it fails to find any it will usually print a message to the screen and either stop or wait for a keypress.
- Regardless of where the BIOS finds a valid bootsector it will load it to address 0x7c00. There is nothing magical about this address; it is just an address that was chosen. But bootsectors often rely on it being 0x7c00 so BIOSes use it consistently. The bootsector is 512 bytes in size so will occupy addresses 0x7c00 to 0x7dff.
- Once the bootsector has been loaded the BIOS transfers control (jumps) to its first byte. That address is normally formed by setting CS:IP to 0x0000:7c00 but some BIOSes have been reported to set those registers to 0x07c0:0000. Whichever form of address is used the resulting physical address is the same, 0x7c00.
- Prior to this point the machine has been controlled by BIOS program code stored in ROM and thus not changeable by an OS developer. From this point onward, aside from interrupt handling which continues in the background, user-written code is in control of the machine. What happens next depends on the program code in that bootsector and thus cannot be described in detail. However, some generic notes on how the boot normally progresses may be helpful.
- Before it started the bootsector code, the BIOS should have set register DL to the BIOS-assigned number of the disk from which the boot sector was loaded. This is usually 0x00 for a floppy disk and 0x80 for a hard disk. Some rare BIOSes apparently fail to set this correctly which is why you might find bootloaders which ignore the disk number and assume they were booted from the first diskette or hard disk.
- Bootsectors often contain data as well as code. The data provides information that the bootsector code requires. For the common FAT-formatted floppy disk the bootsector data will contain the diskette's geometry such as number of tracks, cluster size, reserved sectors and so on. For a hard disk which has been set up with the common four-partition scheme the bootsector is called the master boot record (MBR) and includes a table which specifies the locations and sizes of the four primary partitions.
- With hard-disk partitioning each primary partition has space for its own volume boot record (VBR). The job of the MBR is to load and start one of the four VBRs. By tradition, the MBR will load the selected VBR to 0x7c00, with the MBR moving itself out of the way first. VBRs may be written to expect to have been loaded to that location.
- Because each partition can contain a different operating system it would be unusual to have a custom MBR. Instead, there are a few de-facto standard MBRs that can interoperate with different OSes. Whatever MBR is there will have to work with any of the bootable partitions on the disk. If you install your own MBR you may prevent the other bootable partitions from starting.
- An OS developer will, however, normally write a VBR.
- A handy nomenclature is to call each partition's boot record a partition boot record (PBR, i.e. a term which applies only to partitioned media) and to call the boot record of any volume - whether a partition or a non-partitioned medium such as a floppy disk - a volume boot record (VBR).
- A VBR will need to know how to read further code from within the volume. It can do this by following file-system structures or by loading from fixed and immovable locations within the volume. A VBR will normally be designed for the specific type of file system with which the volume is formatted. Also, even if they use the same type of file system, the VBR on a floppy may not be the same as the VBR in the partition of a hard disk.
- The VBR of a volume is the first sector of that volume. When a hard disk partition or a floppy is reformatted the VBR will be overwritten.
- Because a bootloader in a VBR is limited in size to 512 bytes it may be too small to load the OS proper so it loads another program which might be called the OS loader and which is large enough to load and start the OS.
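The segment:offset addresses quoted in the steps above can be checked with a little arithmetic. The following is a minimal host-side sketch in Python, not boot code; the function name `physical_address` is made up for illustration. It assumes plain real-mode addressing, where the address is the segment times 16 plus the offset.

```python
def physical_address(segment: int, offset: int) -> int:
    """Real-mode address: segment * 16 + offset (no A20 masking applied)."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must each fit in 16 bits")
    return (segment << 4) + offset


# The 8086 reset address and the two CS:IP forms a BIOS may use for the bootsector.
reset_vector = physical_address(0xFFFF, 0x0000)
boot_a = physical_address(0x0000, 0x7C00)
boot_b = physical_address(0x07C0, 0x0000)

# A 512-byte bootsector occupies boot_a up to and including this address.
boot_last_byte = boot_a + 512 - 1

print(hex(reset_vector), hex(boot_a), hex(boot_b))  # prints 0xffff0 0x7c00 0x7c00
```

Although both CS:IP forms reach the same byte, the value left in CS differs, which is one reason bootsector code that cares about CS usually sets it explicitly rather than trusting the BIOS.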
Bootsector tutorials and guides
- For multiboot involving Windows Vista see http://www.multibooters.co.uk
Pre-written bootsector examples
- Check the bootsector example at http://fasm.sourceforge.net/archive/oldsite/examples
What do I have to do at bootup?
On the PC, all of the above happens in real mode (RM) which is the traditional mode that the BIOS operates in. Before transitioning to protected mode (PM) it is a good idea to gather some information from the BIOS and store it where your PM code can find it.
How do I find out how much RAM there is in a PC?
You use the BIOS for that. From about 1994 BIOSes began to support int 0x15 function 0xe820. Try that first. If it is supported you have the information you need and need try no further options. If it comes back with the carry flag set or one of the other returned values is not as documented then you may have a very old machine. Try int 0x15 with AX=0xe801 and then with AH=0x88. If these are also not supported you might try CMOS registers 0x17 and 0x18. To find out how much RAM is available below the 1 MB mark (of the above only 0xe820 can provide that), you can use interrupt 0x12.
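Once your real-mode code has stored the 0xe820 results somewhere, later code has to interpret them. The sketch below is a host-side Python illustration of that step only. It assumes the basic 20-byte entry layout (64-bit base, 64-bit length, 32-bit type, little-endian) with type 1 meaning usable RAM. The names `parse_e820` and `usable_bytes` and the sample map are invented for the example and do not come from any real machine.

```python
import struct

# One basic 0xe820 entry: base (u64), length (u64), type (u32), little-endian.
E820_ENTRY = struct.Struct("<QQI")
E820_USABLE = 1  # type value for RAM available to the OS


def parse_e820(buffer: bytes):
    """Split a buffer of packed 20-byte entries into (base, length, type) tuples."""
    entries = []
    last_start = len(buffer) - E820_ENTRY.size
    for pos in range(0, last_start + 1, E820_ENTRY.size):
        entries.append(E820_ENTRY.unpack_from(buffer, pos))
    return entries


def usable_bytes(entries):
    """Total length of all entries marked as usable RAM."""
    return sum(length for _, length, kind in entries if kind == E820_USABLE)


# Invented sample map: low memory, a small reserved area, then RAM above 1 MB.
sample = b"".join([
    E820_ENTRY.pack(0x00000000, 0x0009FC00, 1),
    E820_ENTRY.pack(0x0009FC00, 0x00000400, 2),
    E820_ENTRY.pack(0x00100000, 0x07EF0000, 1),
])

memory_map = parse_e820(sample)
print(len(memory_map), hex(usable_bytes(memory_map)))
```

A real map may list entries out of order or with overlaps, so an OS would normally sort and merge the entries before trusting the totals.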
It is possible to start an OS under DOS or some other operating system that allows full access to hardware. If you start your OS under DOS and intend to return to it or use it in some way you may want to use DOS functions and Himem.sys (if loaded) to find available memory.
You should not generally probe the RAM on an x86 PC because this is unsafe and may lead to hanging, resetting and possibly even damaging the PC. That is because some of the PC's devices are memory-mapped, that is, accessible through memory locations. Writing garbage to those locations may force them to do something which is not always safe.
What is the A20 line that needs to be switched on on the PC?
- A20 is an address line that needs to be enabled before an x86 CPU can be safely switched to protected mode and before a program can access all of memory.
- Early PCs used 8086 CPUs which had exactly 20 address lines, named A19 (highest) down to A0 (lowest). These gave access to a maximum of 1MiByte of memory. For these CPUs there was no such thing as an A20 line or anything higher but they could generate addresses that would have used one if it was available, i.e. addresses which were slightly over 1MiByte. A program trying to address memory just beyond the 1MiByte limit - which would need an A20 (if there had been one) to be set to 1(one) - would instead lose this address line information and just use A19 down to A0. This would cause a large address to wrap round and become a small address from address zero onwards. It is a little like counting in decimal to 127 and losing the hundreds digit. Since this would leave just the bottom two digits, numbers up to 99 would work normally but after that the numbers 100, 101, 102, … 127 would lose the top digit and result in numbers 00, 01, 02, … 27. On the early PCs this meant that address 1MiByte + N resulted in location N - i.e. the "1MiByte" part of the address was lost. Unfortunately, some older code relied on this wrap-around. For compatibility with old code even though modern machines have A20 and higher address lines some still start in the same way - i.e. with A20 disabled. In this respect they effectively start as if they were fast versions of the early CPUs.
- Since your OS will be new it will not need this legacy behaviour. Therefore you will want to ensure that A20 has been enabled before you try to access memory above 1MiByte.
- To be clear, even without A20 being enabled you would still be able to access any memory address which had A20 equal to 0 (zero). In other words the logic gate in the PC controls only address line A20, not lines A21 upwards. Since A20 equals zero anyway on all even MiByte ranges - i.e. from 0MiByte, 2MiByte, 4MiByte etc - they are accessible even without A20 being enabled. Only addresses where A20 needs to equal one - 1MiByte, 3MiByte, 5MiByte etc - would wrap back to the preceding MiByte.
- To control the A20 line the original IBM Personal Computer/AT used a tri-state multiplexer (mux) F257 chip to gate the A20 output from its 80286 CPU before it reached the rest of the machine. An otherwise-unused pin in the keyboard controller Intel 8042 chip was fed to control the mux. Thus the A20 line from the CPU passed through the mux, not through the keyboard controller.
- An IBM-compatible PC will use the same method to control A20 today even though the functions of controller and mux have long been unified into a chipset.
- To remain compliant A20 must be enabled while in real mode before switching to protected mode if the code is to run successfully on Intel and other CPUs. See, for example, the data sheet for an Intel 486.
- Some systems have a fast A20 gate option at system control port A (at I/O address 0x92).
- It should not be a problem to enable A20 while booting and remove this relic of the past. To find out how to ensure that A20 has been enabled see the code used in the following.
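The wrap-around described above can be modelled on a host machine. The following Python sketch only illustrates the address arithmetic; it does not touch hardware and is not A20-enabling code. The function name `bus_address` is invented, and the model assumes the gate forces address line A20 to zero while leaving A21 and above alone.

```python
MIB = 1 << 20          # 1MiByte
A20_BIT = 1 << 20      # the address bit carried by line A20


def bus_address(address: int, a20_enabled: bool) -> int:
    """Address that reaches memory: with the gate off, bit 20 is forced to 0."""
    if a20_enabled:
        return address
    return address & ~A20_BIT


# 0xffff:0x0010 is segment * 16 + offset = 0x100000, i.e. exactly 1MiByte.
just_over_1mib = (0xFFFF << 4) + 0x0010

print(hex(bus_address(just_over_1mib, a20_enabled=False)))  # prints 0x0
print(hex(bus_address(just_over_1mib, a20_enabled=True)))   # prints 0x100000
```

With the gate off, odd MiByte ranges alias the preceding even ones, so a simple test for A20 is to write different values to two addresses exactly 1MiByte apart and see whether they collide.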
Can I read the CMOS/BIOS data?
- Check for cmos references in http://www.aarongray.org/AODUBL/bookmarks.html
How do I disable NMI on a PC?
If you want to disable NMI on a PC you have to set a bit in the address port for the CMOS such as with
out 0x70, al
where AL contains 0x80 or above. However, doing nothing more than writing to that port has the potential to cause problems. The following code wraps that write in some code to address those potential problems.
    ;The top bit of output port 0x70 is the NMI disable bit. Set to one to
    ;disable NMI. Return to zero to restore normal conditions, i.e. NMI enabled.

    ;Define a safe location in the CMOS/RTC chip. Various options:
    ;%define CMOS_SAFE_LOCATION 0x00    ;Seconds
    ;%define CMOS_SAFE_LOCATION 0x06    ;Day of the week (1 to 7)
    ;%define CMOS_SAFE_LOCATION 0x0d    ;Status register D
    ;%define CMOS_SAFE_LOCATION 0x0e    ;POST diagnostic status
    ;%define CMOS_SAFE_LOCATION 0x0f    ;Shutdown status
    ;One option has to be active for the listing to assemble. Status
    ;register D is picked here only as an example.
    %define CMOS_SAFE_LOCATION 0x0d

    %if CMOS_SAFE_LOCATION >= 0x80
      %error Error: CMOS_SAFE_LOCATION cannot use more than seven bits
    %endif

    disable_nmi:
      push ax                               ;Save AX
      mov al, 0x80 + CMOS_SAFE_LOCATION     ;Set the disable bit
      pushf                                 ;Save interrupt-enable state
      cli                                   ;Disable interrupts
      mov [cmos_current_location], al       ;Note the location 0x70 points at
      out 0x70, al                          ;Disable NMI
      in al, 0x71                           ;Keep the chip happy with a dummy access
      popf                                  ;Restore incoming interrupt-enable state
      pop ax                                ;Restore the incoming AX
      ret

    enable_nmi:
      push ax                               ;Save AX
      mov al, CMOS_SAFE_LOCATION            ;Clear the disable bit
      pushf                                 ;Save interrupt-enable state
      cli                                   ;Disable interrupts
      mov [cmos_current_location], al       ;Note the location 0x70 points at
      out 0x70, al                          ;Enable NMI
      in al, 0x71                           ;Keep the chip happy with a dummy access
      popf                                  ;Restore incoming interrupt-enable state
      pop ax                                ;Restore the incoming AX
      ret

    cmos_current_location: db 0             ;Last value written to port 0x70

    ;Notes
    ;
    ; * Only setting or clearing port 0x70 bit 7 is required but the
    ;   routines include code to make them practical to use.
    ; * Opinions vary as to the best location to leave 0x70 pointing at once the
    ;   routines have finished. Some options are presented for
    ;   CMOS_SAFE_LOCATION.
    ; * The routines are 16-bit for use in real mode. Modify appropriately for
    ;   use in protected mode.
    ; * The routines modify no registers.
    ; * The routines disable normal interrupts over the operation to
    ;   prevent any timeslicing, if running. The routines disable
    ;   interrupts for the shortest time possible.
    ; * The last CMOS address written to 0x70 is recorded for information as
    ;   the 0x70 register itself is write-only.
    ; * Some devices expect a read or write of 0x71 after the CMOS address is
    ;   set up on 0x70. The routines include a dummy read of the port to
    ;   satisfy this requirement.
    ; * No delay between port accesses is shown but may be required on
    ;   some machines.
How do I go into protected mode on ia32?
Once you have set up relevant data structures - IDT, GDT etc - then changing to protected mode is easy.
    ;Enable A20 (while in real mode, as shown above)
    ;Disable NMI (shown above)
    cli                     ;Disable normal interrupts
    lgdt <GdtLocation>      ;Point to the Global Descriptor Table
    mov eax,cr0             ;Fetch CR0
    or eax,1                ;Set the protected mode bit...
    mov cr0,eax             ;...in CR0
    jmp seg:offset          ;Far jump must /immediately/ follow load of CR0
Different generations of Intel CPU had slightly different requirements for the jump which follows the move to CR0. The above should work with all of them.
For full information see the Intel Software Developer’s Manuals for 80386, 80486 and Pentium.
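The GDT mentioned above is a table of 8-byte segment descriptors whose fields are scattered across the entry. As a rough host-side illustration, the Python sketch below packs a descriptor and builds a minimal flat table (null, code, data). The function name `gdt_descriptor` and the choice of a flat 4GiByte layout are assumptions made for this example and are not taken from the text above.

```python
import struct


def gdt_descriptor(base: int, limit: int, access: int, flags: int) -> bytes:
    """Pack one 8-byte segment descriptor from its logical fields."""
    if not 0 <= limit <= 0xFFFFF:
        raise ValueError("limit is a 20-bit field")
    if not 0 <= base <= 0xFFFFFFFF:
        raise ValueError("base is a 32-bit field")
    value = limit & 0xFFFF                    # limit bits 0..15
    value |= (base & 0xFFFFFF) << 16          # base bits 0..23
    value |= (access & 0xFF) << 40            # access byte
    value |= ((limit >> 16) & 0xF) << 48      # limit bits 16..19
    value |= (flags & 0xF) << 52              # granularity and size flags
    value |= ((base >> 24) & 0xFF) << 56      # base bits 24..31
    return struct.pack("<Q", value)


# Assumed flat layout: base 0, limit 0xFFFFF in 4KiB units, 32-bit segments.
null_entry = bytes(8)
code_entry = gdt_descriptor(0, 0xFFFFF, access=0x9A, flags=0xC)
data_entry = gdt_descriptor(0, 0xFFFFF, access=0x92, flags=0xC)

gdt = null_entry + code_entry + data_entry
gdtr_limit = len(gdt) - 1   # the size field loaded by lgdt is the table size minus one

print(code_entry.hex(), data_entry.hex(), gdtr_limit)
```

With this layout the code selector would be 0x08 and the data selector 0x10, since each selector is the byte offset of its entry in the table.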
Can I go back to real mode after entering protected mode?
You won't normally need to do this when booting your OS but it is possible. See http://www.sudleyplace.com/pmtorm.html
How do I enable paging on ia32?
To enable paging on a base ia32 system (80386 with not more than 4GiBytes of accessible memory) you need to set up the following
- a page fault handler
- interrupt 14 pointing to the page fault handler
- page table entries for the page fault handler
- page table entries to support the address range in which you are currently executing (these must be identity-mapped - i.e. the physical address and the linear address must be identical)
- a 4k page directory
Once the above are in place, paging can be switched on:
    ;After the data structures have been set up and the page fault handler is ready
    mov eax, pagedir        ;The /physical/ address of the page directory
    mov cr3, eax            ;CR3 can only be loaded from a general register
    mov eax, cr0            ;Switch
    or eax, 0x8000_0000     ; to
    mov cr0, eax            ; page
    jmp paging              ; mode
    paging:
    ;
    ;Paging is enabled from here on
    ;
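To make the identity-mapping requirement more concrete, here is a small host-side Python model of the two-level 4KiB paging structures. It is a sketch under stated assumptions: only the present and writable bits are set, a single page table covers the first 4MiByte, and the table is imagined to live at physical address 0x2000. The function names and that address are invented for the example.

```python
PAGE_SIZE = 4096
ENTRIES = 1024
PRESENT = 0x1
WRITABLE = 0x2


def identity_page_table(first_frame: int = 0):
    """One page table whose entry i maps to physical frame first_frame + i."""
    return [((first_frame + i) * PAGE_SIZE) | PRESENT | WRITABLE
            for i in range(ENTRIES)]


def page_directory(table_physical: int):
    """A page directory with only entry 0 present, pointing at one page table."""
    if table_physical % PAGE_SIZE:
        raise ValueError("page tables must be 4KiB aligned")
    directory = [0] * ENTRIES
    directory[0] = table_physical | PRESENT | WRITABLE
    return directory


def split_linear(address: int):
    """Directory index, table index and byte offset of a 32-bit linear address."""
    return address >> 22, (address >> 12) & 0x3FF, address & 0xFFF


def translate(directory, tables, linear: int) -> int:
    """Walk the structures the way the CPU would; tables maps physical address to table."""
    dir_index, table_index, offset = split_linear(linear)
    pde = directory[dir_index]
    if not pde & PRESENT:
        raise LookupError("page fault: directory entry not present")
    pte = tables[pde & ~0xFFF][table_index]
    if not pte & PRESENT:
        raise LookupError("page fault: table entry not present")
    return (pte & ~0xFFF) | offset


TABLE_AT = 0x2000   # hypothetical physical address of the page table
table = identity_page_table()
directory = page_directory(TABLE_AT)
print(hex(translate(directory, {TABLE_AT: table}, 0x7C00)))  # prints 0x7c00
```

Because the code that sets the paging bit in CR0 is fetched through these tables on the very next instruction, the range it runs in must translate to itself, as the model does for every address below 4MiByte.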
How do I read from a disk after moving to Pmode?
Good question since the part of the BIOS that handles disk access will generally only work in real mode. There are some options. Whichever method you use you will need to load the code for that method before switching to protected mode.
- Use unreal mode and the BIOS. If you switch to unreal mode, you can use all of the memory address space as well as using the BIOS. Setting up unreal mode simply means setting the segment limits to 4GiByte (or whatever you wish) while in protected mode and then going back to real mode. As long as you don't modify the segment registers afterwards, you can use the BIOS and, by using 32-bit registers, access more than 1MiByte of RAM. Don't forget to enable the A20 line.
- A virtual 8086 task. You can run a virtual 8086 task and use BIOS inside it.
- Using port I/O. Learn how to program the controlling chip registers and deal with them directly.
- By using port I/O you can send commands to the floppy controller to read from or write to the disk, bypassing the BIOS. To do so, you will need to know that the first floppy controller's base address is 0x3F0 while the second is at 0x370. Then, by reading and writing to this address and each of its registers (0x3F1, 0x3F2, etc), you can control the floppy disk that is attached to the controller. Have a look at the following specs/pages for more on controlling the controller.
- For the Intel 82078: http://www.intel.com/design/archives/periphrl/docs/29047403.htm
- For the Intel 82077AA: http://www.isdaman.com/alsos/hardware/fdc/floppy.htm
- See Hale Landis' Public Domain ATA/ATAPI code: http://www.ata-atapi.com
- Doors IDE: http://www.freedoors.org/idework/idework.htm