The LOADALL Instruction by Robert Collins --------------------------------------------------------------------------- Of the few undocumented instructions in the 80286 and 80386 microprocessors, the LOADALL instruction is the most widely known. Nevertheless very few people understand how to use it. Using LOADALL is not as simple as merely knowing the LOADALL opcode and its format, because knowing how to use LOADALL requires a knowledge of many aspects if the CPU behavior that aren't documented in their respective data sheets. The '286 LOADALL is widely known because a 15 page Intel-confidential document describing its use was given to many developers. '286 LOADALL is so commonly used in production code that DOS 3.3 (and above) and OS/2 have provisions for using LOADALL built in them; every '386 and '486 BIOS emulates '286 LOADALL; and even Microsoft CODEVIEW recognizes the '286 LOADALL opcode and disassembles it. On the other hand, the '386 LOADALL is not widely known, and very few developers even know it exists. In this article, I will explain how to use both the '286 and '386 LOADALL instructions and present source code to demonstrate the various aspects of CPU behavior that become apparent, or can be proven, when using LOADALL. LOADALL was originally included in the CPU mask for testing purposes and In Circuit Emulator (ICE) support. As its name implies, LOADALL loads all of the CPU registers, including the "hidden" software-invisible registers. At the completion of a LOADALL instruction, the entire CPU state is defined according to the LOADALL data table. LOADALL loads all of the softwarevisible registers such as AX, and all of the software-invisible registers such as the segment descriptor cache's (see the side bar "Descriptor Cache Registers"). Direct manipulation of the descriptor cache base registers allows the CPU to access its entire address space without switching to protected mode. By using LOADALL, it is possible to address memory above 1M. By being allowed to access memory above 1M without switching to protected mode, the heavy time penalty involved in resetting the CPU (as a means of returning to real mode ) is eliminated. In this respect, LOADALL is most significant to '286 programmers because it provides them with a new capability that isn't available by any other means. LOADALL is closely coupled with the CPU hardware. Both the '286 and '386 have different internal hardware, and therefore Intel implemented LOADALL using different opcodes on the '286 and '386. 80286 LOADALL (opcode 0F05) produces an invalid opcode exception when executed on the '386, and 80386 LOADALL (opcode 0F07) produces an invalid opcode exception when executed on the '286. LOADALL Details: All CPU registers (including MSW, GDTR, CSBASE, ESACCESS) are loaded by LOADALL from a memory image. Software-invisible descriptor cache registers associated with each segment register are also loaded. LOADALL may be executed in real or protected mode, but only at privilege level 0 (CPL=0). If LOADALL is executed at any other privilege level, the CPU generates an exception. By directly loading the descriptor cache registers with LOADALL, a program has explicit control over the base address, segment limit, and access rights associated with each memory segment. Normally, these values are loaded each time a segment register is loaded, but LOADALL allows these "hidden" registers to be loaded independently of their segment register counterparts. In real mode, it is possible to access a memory segment that isn't associated with any segment register. Likewise in protected mode, it is possible to access memory that has no descriptor table entry.LOADALL performs no protection checks against any of the loaded values. When executed at CPL 0, LOADALL can generate no exceptions. The segment access rights and limit portions may be values that would otherwise be illegal in the context of real mode or protected mode, but LOADALL willingly loads these values with no checks. Once loaded, full access checks are performed when accessing a segment. For example, it is possible to load a segment whose access is marked "not present." Normally, this condition would generate exception 11, "segment not present." LOADALL does not generate exception 11. Instead, any attempt to access this segment will generate exception 13. LOADALL does not check coherency between the software-visible segment registers and the software-invisible segment descriptor cache registers. Any segment descriptor base register may point to any area in the CPU address space, while the software-visible segment register may contain any other arbitrary value. All memory references are made according to the descriptor cache registers, not the software-visible segment registers. All subsequent segment register loads will reload the descriptor cache register. Beware of using values in CS that don't perfectly match a code segment descriptor table entry, or a real mode code segment -- as an interrupt return (IRET) may either cause an exception or execution to resume at an unexpected location. Likewise, pushing and subsequently popping any segment register will force the descriptor cache register to reload according to the CPU's conventional protocol, thereby obviating any real mode extended memory references. 80286 LOADALL: 80286 LOADALL is encoded as a two byte opcode 0F05h. LOADALL reads its table from a fixed memory location at 800h (80:0 in real-mode addressing). LOADALL performs 51 bus cycles (WORD cycles), and takes 195 clocks with no wait states. Table 1 shows the '286 LOADALL format. All CPU register entries in the LOADALL table conform to the standard Intel format, where the least significant byte is at the lowest memory address. The '286 descriptor cache entries have the following format: Offset: Description: 0-2 24-bit physical address of the segment in memory. These bytes are stored in standard Intel format with the least significant byte at the lowest memory address. 3 Access rights. The format of this byte is the same as that in the descriptor table. This access byte is loaded in the descriptor cache register regardless of its validity. Therefore the "present" bit in the access rights field becomes a "descriptor valid" bit. When this bit is cleared, the descriptor is considered invalid, and any memory reference using this descriptor generates exception 13, with error code 0. The Descriptor Privilege Level (DPL) of the SS and CS descriptor caches determines the Current Privilege Level (CPL). The CS descriptor cache may be loaded as a read/write data segment. 4-5 Segment limit. The standard 16-bit segment limit stored in standard Intel format. The GDTR and IDTR (GDT and IDT descriptor caches) have the following format: Offset: Description: 0-2 24-bit physical address of the segment in memory. 3 Should be 0. 4-5 Segment limit. Intel recommends the following guidelines for proper execution following LOADALL: * The Stack Segment should be a read/write data segment. * The Code Segment can be execute only (access=95h), read/execute (access=9bh), or read/write/execute (access=93h). Proper protected mode operation also requires that: * DPL of CS and DPL of SS must be equal. These attributes determine the CPL of the processor. * The DPL fields of ES and DS should be 3 to prevent RETF or IRET instructions from zeroing these registers. The code in listing 1 demonstrates how to explore the various operating modes with '286 LOADALL and how to access extended memory while in real mode. The LOADALL test performs various functions that would be impossible to duplicate without using LOADALL. 80386 LOADALL: 80386 LOADALL is encoded as a two byte opcode (0F07). LOADALL reads its data from a table pointed to by ES:EDI. Segment overrides are allowed, but apparently ignored. LOADALL performs 51 bus cycles (DWORD cycles) and takes 122 clocks with no wait states. Table 2 shows the '386 LOADALL format. Not shown in table 2 is the fact that prior to reading the LOADALL table, LOADALL reads 10 DWORDs exactly 100h bytes beyond the beginning of the table (ES:EDI+100h). This data is NOT used to load any of the registers LOADALL doesn't load (CR2, CR3, DR0-DR3, TR6, TR7), or the Numeric Processor eXtension (NPX). At this time, the purpose of reading this data and its destination is a mystery. Figure 1 shows an ICE trace showing all the bus cycles associated with LOADALL's execution. As with '286 LOADALL, all CPU register entries in the LOADALL table are in the standard Intel format where the least significant byte is at the lowest memory address. The '386 descriptor cache entries have the following format: Offset: Description: 0-3 Access rights. The access rights dword consumes 11 bits of this 32-bit field. See figure 2 for a complete description of this field. 4-7 32-bit base address of the segment in memory. 8-11 32-bit segment limit. The GDTR and IDTR (GDT and IDT descriptor caches) have the following format: Offset: Description: 0-3 Should be 0. 4-7 32-bit base address of GDTR or IDTR. 8-11 32-bit limit of GDTR or IDTR. Listing 2 shows how to test '386 LOADALL. This test is more comprehensive than the '286 LOADALL test because of the expanded capabilities of the '386 microprocessor. During the LOADALL test, the CPU is put in various states that are illegal and are impossible to duplicate through any other software means. LOADALL EMULATION Due to the large number of systems' programs that use '286 LOADALL, all '386 and '486 BIOS's must emulate the '286 LOADALL instruction (opcode 0F05). On the '386 and '486, the '286 LOADALL instruction generates an invalid opcode exception. The BIOS traps this exception and does its best to emulate the functionality of the LOADALL instruction. But perfect emulation is impossible without using LOADALL itself. Using '386 LOADALL to emulate '286 LOADALL can be done, but has its risks. First of all, the '486 does not have a LOADALL instruction. Second, Intel has threatened to remove LOADALL from the '386 mask. But perfect emulation is possible on the '386 by using '386 LOADALL to emulate '286 LOADALL. Listing 3 shows a TSR program that uses '386 LOADALL to emulate '286 LOADALL. The program first tests that you are a '386 before installing itself. By using this emulation program, you can guarantee perfect '286 LOADALL emulation. CONCLUSION LOADALL is a very powerful instruction, but using it has its risks. The features that make it so powerful also cause many of these risks. For example, the processor can be put in states that are otherwise impossible to duplicate through any other software means. Using LOADALL requires a thorough understanding of how the CPU processes register loads, the ramifications of those register loads, and careful planning. The illegally induced processor states can easily cause system crashes if not properly planned for. The best avoidance technique toward system crashes is to avoid using LOADALL unless you are totally confident in your understanding of the CPU, and in your programming skills. The '286 LOADALL is described in a 15-page Intel-confidential document. The document describes in detail how to use the instruction, and also describes many of its possible uses. LOADALL can be used to access extended memory while in real mode, and to emulate real mode while in protected mode. Programs such as RAMDRIVE, ABOVEDISC, and OS/2 use LOADALL. DOS 3.3 has provisions for using LOADALL by leaving a 102-byte 'hole' at 80:0. If you are a system's programmer and have a need to know this information, Intel will provide it, along with source code to emulate '286 LOADALL on the '386 (without using '386 LOADALL). The '386 LOADALL is still 'TOP-SECRET' from Intel. I don't know of any document that describes its use, format, or acknowledges its existence. Very few people at Intel will acknowledge that LOADALL even exists in the '386 mask. The official Intel line is that due to U.S. Military pressure, LOADALL was removed from the '386 mask over a year ago. However, running the program in listing 2 will demonstrate that LOADALL is alive, well, and still available on the latest stepping of the 80386. --------------------------------------------------------------------------- View source code for 286 LOADALL: ftp://ftp.x86.org/pub/x86/source/286load/286load.asm ftp://ftp.x86.org/pub/x86/source/286load/loadfns.286 ftp://ftp.x86.org/pub/x86/source/286load/macros.286 ftp://ftp.x86.org/pub/x86/source/include/cpu_type.asm View source code for 386 LOADALL: ftp://ftp.x86.org/pub/x86/source/386load/386load.asm ftp://ftp.x86.org/pub/x86/source/386load/loadfns.386 ftp://ftp.x86.org/pub/x86/source/386load/macros.386 ftp://ftp.x86.org/pub/x86/source/include/cpu_type.asm View source code for EMULOAD (286 LOADALL emulation using 386 LOADALL): ftp://ftp.x86.org/pub/x86/source/emuload/emuload.asm ftp://ftp.x86.org/pub/x86/source/include/cpu_type.asm Download entire source code archive for 286LOAD, 386LOAD, and EMULOAD: ftp://ftp.x86.org/pub/x86/dloads/LOADALL.ZIP --------------------------------------------------------------------------- Back to Books and Articles home page [Image][Image] --------------------------------------------------------------------------- © 1991, 1995, 1996 Intel Secrets(TM) home page. PGP key available. Make no mistake! The Intel Secrets web site is proud to provide superior information and service without any affilation to Intel. "Intel Secrets" and "What Intel doesn't want you to know" are trademarks of Robert Collins. Pentium and Intel are trademarks of Intel Corporation. 386, 486, 586, P6, and all other numbers…are not! All other trademarks are those of their respective companies. See Trademarks and Disclaimers for more info. This page has been access [Lotsa] times. Click here, to download the counter program. Robert Collins is a Senior Design Engineer and Manager of some sort of Research in Dallas, TX. Robert may be reached via email at rcollins@x86.org or via phone at (214) 491-7718.