Can A20 be enabled reliably?
A20 Enable Attempts
Early attempts by the computer industry to find a consistent way to enable A20 had little success. Himem.sys introduced a switch with no fewer than seventeen or nineteen different options to cover the variety of machine types. Linux failed to boot on some laptops, notably Toshiba Tecras, because its attempt to enable A20 failed. This leads to the question: Is there a reliable way to enable A20?
This page suggests a way to enable A20 by the traditional and most widely supported means: the keyboard controller. There are two key components to the proposal: letting the keyboard controller set the pace, and being prepared to wait long enough for it to do its work.
As will be seen, the time taken even on a machine reputed to be slow is only three or four milliseconds. Being patient is not going to make a noticeable impact on your operating system boot time.
To understand the timing differences that can be found in the field it is worth bearing in mind the different ways that the hardware can be designed to allow A20 to be enabled by a real or virtual keyboard controller. We want to support timing that is compatible with all of these.
With a real 8042
This is the original and oldest way. The machine has an Intel 8042 or compatible on the motherboard acting as keyboard controller. The 8042 is a microcontroller that runs its own built-in program. It only has to be fast enough to exchange serial data with the keyboard and to interact with the main CPU over the system buses. Especially on laptops, a slow chip clock rate is desirable to save energy, so there is no need to make it anything like as fast as the main CPU.
The implication of this is that once the keyboard controller (KBC) has been told to enable A20 it will go away and do what it has been told to do at its own pace. Being simple, it is single-threaded and will not be able to accept another command until it has enabled A20.
With a chipset
The essential functions of a keyboard controller can be implemented in logic. This would be much faster, although speed is not the reason for doing it. Rather, with a stable specification there is no need for a separate microcontroller: as chip costs fell, functions that were once implemented by code could be implemented by dedicated logic.
By the use of System Management Mode and a little chipset support it is possible to have no keyboard controller at all. Instead the computer can be made to emulate the 8042, trapping to software any time the virtual 8042 is accessed.
In practice, there may be both a real keyboard controller (or some hardware to replace it) as well as emulation software. The machine can be set up to handle some functions in software and pass others through to hardware.
Mapping to and from USB
USB provides a keyboard model that is completely different to those above. When an OS is up and running it may well use the USB system to communicate with a USB keyboard. But during the boot process the USB system may emulate the original keyboard interface.
With no keyboard controller
There may be machines which neither have nor emulate a keyboard controller. If they exist, these machines will likely be rare: industrial PCs, or perhaps newer machines which only support USB with no legacy emulation. If there is genuinely no keyboard controller function, the algorithm below suggests falling back to System Control Port B. From tests, however, the keyboard controller seems to be more widely supported than SCP-B, and both seem more widely supported than the BIOS routines.
How can we handle the timing differences inherent in the designs above? Basically by adjusting our pace to that of the controller. The steps are as follows. All of the odd-numbered steps are waits.
|Step||Action||Notes|
|0||Check whether A20 is already enabled||Newer machines may start with A20 already enabled. If so there is no need to go through any of the steps below, although you might prefer to do so anyway to ensure that A20 has been enabled in the way you expect.|
|1||Read 0x64 waiting until the KBC is ready to be written to||Wait until the KBC status register (read from port 0x64) has bit 1 set to zero|
|2||Write command 0xd1 to KBC port 0x64||Command 0xd1 tells the KBC that we are going to write a byte to its output port|
|3||Read 0x64 waiting until the KBC is ready to be written to||On the original 8042 the 'occupied' bit (status bit 1) was set by hardware as soon as the 8042 received a command or data byte, so there should be no need for a delay between writing a command and checking whether the bit is set or clear.|
|4||Write a suitable byte value to the KBC data port 0x60||The essential requirement is that the low two bits of the value both be set to 1. One option is 0x87. The rationale for the remaining bits is: the data and clock lines are open collector, so set them high to prevent interference with other operations; the clock lines are inverted, so set their bits to zero; and do not assert the IRQ outputs. For the meaning of the output bits see the KBC output port.|
|5||Read 0x64 waiting until the KBC is ready to be written to||This is a key step in adjusting to the speed of the controller. Once the controller indicates it can be written to then the previous command - to enable A20 - should have been executed. An A20 check after this wait should show that A20 has been enabled. However, there is more to do.|
|6||Write command 0xff to port 0x64||This is a null command that accomplishes nothing on the KBC. It is not required on the original AT or PS/2 computers, but it is expected by UHCI, one of the USB standards, and until it has been written some USB emulation of legacy functions may be affected. Notably, UHCI is poorly specified here: its sequence does not include all of the waits listed above, and it says specifically that any variation in the sequence (which would include the waits needed on early machines) breaks the sequence.|
|7||Read 0x64 waiting until the KBC is ready to be written to||This final step is, again, to regulate the speed of the algorithm to that of the hardware. Once the controller again indicates it can be written to then it has presumably completed anything it might need to do for the Null operation and the entire operation to enable A20 will have been completed.|
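Step 0 of the table can be sketched as a wrap-around test: if A20 is disabled, an address and the address 1 MiB above it refer to the same byte. The following is a minimal sketch, not a definitive implementation. The accessor types, the scratch address 0x000500 and the `sim_*` helpers are assumptions made for illustration; on real hardware the accessors would be plain physical-memory reads and writes, and the scratch address must be one your code is free to modify.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical memory accessors, passed in so the check can be run
   against either real memory or the simulation below. */
typedef uint8_t (*mem_read_fn)(uint32_t addr);
typedef void (*mem_write_fn)(uint32_t addr, uint8_t value);

/* Returns true if two addresses 1 MiB apart refer to different bytes,
   i.e. address line 20 is not being forced low. */
bool a20_is_enabled(mem_read_fn rd, mem_write_fn wr)
{
    const uint32_t low = 0x000500u;         /* assumed scratch address   */
    const uint32_t high = low + 0x100000u;  /* same offset, 1 MiB higher */
    uint8_t saved_low = rd(low);
    uint8_t saved_high = rd(high);
    bool enabled;

    wr(low, 0x00);
    wr(high, 0xFF);               /* overwrites 'low' if A20 is masked */
    enabled = (rd(low) == 0x00);

    wr(high, saved_high);         /* restore the original contents */
    wr(low, saved_low);
    return enabled;
}

/* --- Simulated 2 MiB of memory with a switchable A20 line (demo only) --- */
static uint8_t sim_ram[1u << 21];
bool sim_a20_on;

static uint32_t sim_map(uint32_t addr)
{
    if (!sim_a20_on)
        addr &= ~(UINT32_C(1) << 20);       /* force address bit 20 low */
    return addr & ((UINT32_C(1) << 21) - 1);
}

uint8_t sim_read(uint32_t addr) { return sim_ram[sim_map(addr)]; }
void sim_write(uint32_t addr, uint8_t value) { sim_ram[sim_map(addr)] = value; }
```

Running the same check after each later step is how the "(e)" markers in a timing table like the one on this page could be produced.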
To stress, the algorithm requires the necessary patience to wait regardless of the speed of the hardware. Some algorithms wait in a loop a maximum of 65,536 times. Depending on the relative speeds of CPU and KBC this may not be enough. Use the figures below as a guide to how long to wait.
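Steps 1 to 7 can be sketched as below. This is an illustration under stated assumptions rather than a reference implementation: the `port_io` hooks, the function names and the `max_polls` budget are all hypothetical, and the `sim_*` functions stand in for a slow controller so the code can be exercised without hardware. In a real boot path the hooks would wrap the CPU's `in` and `out` instructions.

```c
#include <stdbool.h>
#include <stdint.h>

#define KBC_DATA_PORT    0x60
#define KBC_STATUS_PORT  0x64   /* read: status register, write: command */
#define KBC_STATUS_IBF   0x02   /* status bit 1: KBC not ready for a write */

/* Hypothetical port-I/O hooks supplied by the caller. */
struct port_io {
    uint8_t (*in8)(uint16_t port);
    void (*out8)(uint16_t port, uint8_t value);
};

/* Poll the status register until bit 1 is clear or the budget runs out. */
bool kbc_wait_writable(const struct port_io *io, uint32_t max_polls)
{
    while (max_polls--) {
        if ((io->in8(KBC_STATUS_PORT) & KBC_STATUS_IBF) == 0)
            return true;
    }
    return false;
}

/* Returns true if every wait completed within max_polls status reads. */
bool kbc_enable_a20(const struct port_io *io, uint32_t max_polls)
{
    if (!kbc_wait_writable(io, max_polls)) return false;  /* step 1 */
    io->out8(KBC_STATUS_PORT, 0xD1);                      /* step 2 */
    if (!kbc_wait_writable(io, max_polls)) return false;  /* step 3 */
    io->out8(KBC_DATA_PORT, 0x87);                        /* step 4 */
    if (!kbc_wait_writable(io, max_polls)) return false;  /* step 5 */
    io->out8(KBC_STATUS_PORT, 0xFF);                      /* step 6 */
    return kbc_wait_writable(io, max_polls);              /* step 7 */
}

/* --- Simulated slow KBC: busy for sim_latency status reads per write --- */
uint32_t sim_latency = 1000;
uint32_t sim_busy_polls;
uint8_t sim_last_cmd, sim_output_port;

uint8_t sim_in8(uint16_t port)
{
    if (port == KBC_STATUS_PORT && sim_busy_polls > 0) {
        sim_busy_polls--;
        return KBC_STATUS_IBF;
    }
    return 0;
}

void sim_out8(uint16_t port, uint8_t value)
{
    if (port == KBC_STATUS_PORT)
        sim_last_cmd = value;
    else if (port == KBC_DATA_PORT && sim_last_cmd == 0xD1)
        sim_output_port = value;
    sim_busy_polls = sim_latency;   /* controller is now busy */
}
```

The poll budget is deliberately a parameter: a count that is ample on one machine can expire too early when a fast CPU polls a slow controller, so a time-based limit may be a better choice where a timer is available.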
If your code waits for long enough but does eventually time out you may then want to try System Control Port B at address 0x92.
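A fallback through port 0x92 might look like the sketch below. It assumes the commonly documented layout of that port, in which bit 1 gates A20 and bit 0 requests a fast CPU reset, so the code sets bit 1, keeps bit 0 clear and leaves the other bits as read. The `port_io` hooks, the macro and function names and the `sim_*` helpers are hypothetical and exist only for illustration.

```c
#include <stdint.h>

#define PORT_92            0x92
#define PORT_92_FAST_RESET 0x01   /* bit 0: writing 1 may reset the CPU */
#define PORT_92_A20        0x02   /* bit 1: A20 gate */

/* Hypothetical port-I/O hooks supplied by the caller. */
struct port_io {
    uint8_t (*in8)(uint16_t port);
    void (*out8)(uint16_t port, uint8_t value);
};

/* Set the A20 bit in port 0x92 without touching unrelated bits and
   without ever writing a 1 to the fast-reset bit. */
void port92_enable_a20(const struct port_io *io)
{
    uint8_t value = io->in8(PORT_92);

    if (value & PORT_92_A20)
        return;                          /* already set: skip the write */

    value |= PORT_92_A20;
    value &= (uint8_t)~PORT_92_FAST_RESET;
    io->out8(PORT_92, value);
}

/* --- Simulated port 0x92 register (demo only) --- */
uint8_t sim_port92;
unsigned sim_write_count;

uint8_t sim_in8(uint16_t port)
{
    (void)port;
    return sim_port92;
}

void sim_out8(uint16_t port, uint8_t value)
{
    (void)port;
    sim_port92 = value;
    sim_write_count++;
}
```

After calling this it is worth repeating the A20 check, since a machine that ignores the port gives no other indication that the write had no effect.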
It is always good to see real timings to back up the above theory. Here we see - left to right - how long each of the above operations took when tested on real hardware. All times are in microseconds and indicate how long the operation that heads the column took on the machine listed on the left. After each operation a test was run to determine whether A20 had been enabled or not. This test and various overheads amount to something of the order of 1.5 microseconds. Bear this in mind when considering the small-value times in the table.
|Machine||CPU frequency (Hz)||1: Wait until we can write||2: Write command 0xd1||3: Wait until we can write||4: Write data 0x87||5: Wait until we can write||6: Write command 0xff||7: Wait until we can write|
|Toshiba Tecra 710cdt, Pentium CPU||133,000,000||2.4||2.4||to be added||2.3||(e) to be added||2.4||to be added|
|AOpen P3||601,000,000||2.0||1.7||1.6||(e) 2.0||2.3||1.6||1.7|
|Viglen MPC-L||398,000,000||1.6||1.7||1.6||(e) 65.6||2.7||1.8||1.6|
|Asus EEE 4G||630,000,000||2.1||2.2||2.0||(e) 2.9||2.6||2.3||to be added|
|Jetway, Atom CPU||1,600,000,000||1.6||1.7||1.7||(e) 1.9||2.1||1.6||1.6|
- In each row, (e) marks the operation after which A20 was first found to be enabled. In the cells before it A20 was not enabled.
- Most machines showed A20 enabled immediately after the value 0x87 was written to the data port but the Tecra enabled A20 at the end of the subsequent wait-for-write step. In the case of the Tecra the A20 line was enabled and the KBC indicated it could be written to at the same time. This backs up the theory suggesting that the KBC carries out an operation and then indicates that it is ready to receive another command.
- Some machines start with A20 already enabled. To test these A20 was disabled prior to starting the first step.
- Many machines enable A20 effectively immediately when the command to enable it is issued. This could be because they have a fast modern chipset which updates the A20 bit in hardware. This appears to be the case where the state changes in just one or two microseconds.
- Other machines emulate the keyboard controller. This might be the case with machines such as the Viglen MPC-L, which takes around 65 microseconds.
- Early machines had a real 8042 or compatible keyboard controller. With these the command to enable A20 does not take effect immediately. The 8042 is a separate microcontroller running its own internal program. As such, after the command to enable A20 has been written to the microcontroller it needs time to run its internal code and adjust its output A20 enable bit.
- The Toshiba Tecra appears to have a real keyboard controller. After the command to enable A20 has been issued we must wait for the keyboard controller to carry out the command.
- The Toshiba Tecra range was reported to have a slow keyboard controller. However, in the tests above it took less than 1ms to enable A20. Also, its other times are not very far off some of the others. So slowness is a relative concept.
- Given that in most OSes A20 will be enabled once and then left enabled it is worth taking the extra few milliseconds to give the KBC enough time to do its job.
- Because a KBC runs an internal program it is possible that the value written for the output port will require the KBC to carry out more than one operation. Therefore the KBC may take more or less time than mentioned above to set the output port. There are two ways to address this. One is to read the value of the port before writing it, though the port values change over time, so there is no guarantee that writing the value back will not alter a bit. The other is to wait long enough for the KBC to carry out any and all work it has been given to do.
The above information suggests that the algorithm is reliable. In short, the steps are:
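The read-before-write option in the last note can be sketched with KBC command 0xd0, which asks the controller to place the current output port value in its data register (status bit 0 then reports that a byte is waiting at port 0x60). This is a minimal sketch under assumptions: the `port_io` hooks, the function names and the `sim_*` controller are hypothetical, and the code ignores the possibility that an unrelated keyboard byte is already waiting to be read.

```c
#include <stdbool.h>
#include <stdint.h>

#define KBC_DATA_PORT    0x60
#define KBC_STATUS_PORT  0x64
#define KBC_STATUS_OBF   0x01   /* status bit 0: byte waiting at port 0x60 */
#define KBC_STATUS_IBF   0x02   /* status bit 1: KBC not ready for a write */

/* Hypothetical port-I/O hooks supplied by the caller. */
struct port_io {
    uint8_t (*in8)(uint16_t port);
    void (*out8)(uint16_t port, uint8_t value);
};

/* Poll until (status & mask) == want, or the poll budget runs out. */
static bool wait_status(const struct port_io *io, uint8_t mask,
                        uint8_t want, uint32_t max_polls)
{
    while (max_polls--) {
        if ((io->in8(KBC_STATUS_PORT) & mask) == want)
            return true;
    }
    return false;
}

/* Read the output port, set its low two bits, and write it back. */
bool kbc_enable_a20_preserving(const struct port_io *io, uint32_t max_polls)
{
    uint8_t current;

    if (!wait_status(io, KBC_STATUS_IBF, 0, max_polls)) return false;
    io->out8(KBC_STATUS_PORT, 0xD0);          /* read output port */
    if (!wait_status(io, KBC_STATUS_OBF, KBC_STATUS_OBF, max_polls))
        return false;
    current = io->in8(KBC_DATA_PORT);

    if (!wait_status(io, KBC_STATUS_IBF, 0, max_polls)) return false;
    io->out8(KBC_STATUS_PORT, 0xD1);          /* write output port */
    if (!wait_status(io, KBC_STATUS_IBF, 0, max_polls)) return false;
    io->out8(KBC_DATA_PORT, (uint8_t)(current | 0x03));
    if (!wait_status(io, KBC_STATUS_IBF, 0, max_polls)) return false;
    io->out8(KBC_STATUS_PORT, 0xFF);          /* null command */
    return wait_status(io, KBC_STATUS_IBF, 0, max_polls);
}

/* --- Simulated KBC that answers instantly (demo only) --- */
uint8_t sim_output_port = 0xCD;               /* A20 bit initially clear */
uint8_t sim_last_cmd, sim_data;
bool sim_data_ready;

uint8_t sim_in8(uint16_t port)
{
    if (port == KBC_STATUS_PORT)
        return sim_data_ready ? KBC_STATUS_OBF : 0;
    sim_data_ready = false;                   /* reading 0x60 empties it */
    return sim_data;
}

void sim_out8(uint16_t port, uint8_t value)
{
    if (port == KBC_STATUS_PORT) {
        sim_last_cmd = value;
        if (value == 0xD0) {
            sim_data = sim_output_port;
            sim_data_ready = true;
        }
    } else if (sim_last_cmd == 0xD1) {
        sim_output_port = value;
    }
}
```

The value read can be stale by the time it is written back, as the note above warns, so this variant trades a fixed byte such as 0x87 for a snapshot that may itself be out of date.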
- wait for the KBC; write command 0xd1
- wait for the KBC; write data 0x87 or similar
- wait for the KBC; write command 0xff
- wait for the KBC again.