Transition from Protected Mode to Real Mode

Introduction

The transition from Protected Mode (PM) to Real Mode (RM) can be difficult if you don't appreciate all the registers, data structures, and hidden elements you need to manage. This page attempts to deal with every element necessary to accomplish this mode switch.

In the process, we'll mention things you need to do outside the scope of the code presented here. Also, the code is built up step by step from the initial instructions, with new portions added at each stage. Finally, some places in the code require you to substitute something of your own.

The Simplest Case

Consider the following barebones code which makes the transition from PM to RM:

        cli

        mov     eax,cr0
        and     al,not 1
        mov     cr0,eax

        sti

Indeed this code does clear bit 0 (the Protection Enable (PE) bit) in Control Register 0, but there's a lot it misses. That is, except for certain rare circumstances (see footnote 1), if the above code is all one does to make the PM-to-RM transition, then quite likely the system will reboot immediately.

However, it's a start and as more issues arise we'll add to this code to make it more robust.

Interrupts

The above code correctly disables and enables interrupts around the transition. However, PM and RM each use a different format for the interrupt table: PM uses an 8-byte entry per interrupt and RM uses a 4-byte entry. If you set up a PM IDT (Interrupt Descriptor Table) to use when in PM, then someplace you need to load the RM IDT before enabling interrupts in RM:

        cli

        mov     eax,cr0
        and     al,not 1
        mov     cr0,eax

        lidt    fword ptr RM_IDT

        sti

where RM_IDT is the address of a six-byte structure (a 16-bit limit followed by a 32-bit linear base address) that describes the RM IDT. Note that in order for the IDT to be usable in RM it must reside in the first one megabyte of memory, typically at 0:0.
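
As a rough sketch, the structure that RM_IDT names could be defined as follows. It assumes the usual real-mode vector table of 256 four-byte entries at linear address 0; if your table lives elsewhere, the base would differ.

```asm
RM_IDT  label   fword           ; 6-byte operand for LIDT (assumed layout)
        dw      256*4-1         ; Limit: 256 vectors * 4 bytes each, less 1 = 3FFh
        dd      0               ; Linear base address of the RM vector table
```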

Code Selector vs. Segment

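A code selector in PM can have several different attributes: it can be 16- or 32-bit, have a limit of up to 4GB, be byte- or page-granular, conforming or non-conforming, and readable or execute-only.

A code segment in RM can have only one of each of the above choices (16-bit, 64KB limit, byte-granular, non-conforming, readable), so someplace the above RM attributes must be acquired by CS while still in PM. Typically this is done by allocating a selector (in either the GDT or LDT) with the desired attributes and whose limits encompass the above code so your code can jump to it to complete the job of PM-to-RM.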
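
For illustration, a GDT entry carrying the RM code attributes (use16, 64KB, byte-granular, non-conforming, readable) might be laid out like this. The label Code16Desc is hypothetical, and the base fields are left at zero on the assumption that your setup code fills them in with the linear address of the segment holding the transition code.

```asm
Code16Desc label qword          ; Hypothetical name for the GDT entry
        dw      0FFFFh          ; Limit bits 0-15: 64KB less 1
        dw      0               ; Base bits 0-15 (assumed filled in at run time)
        db      0               ; Base bits 16-23 (likewise)
        db      10011010b       ; Present, DPL 0, code, non-conforming, readable
        db      00000000b       ; Byte-granular, 16-bit, limit bits 16-19 = 0
        db      0               ; Base bits 24-31 (likewise)
```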

Data Selectors vs. Segments

A data selector in PM can have several different attributes: it can be 16- or 32-bit, have a limit of up to 4GB, be byte- or page-granular, expand-up or expand-down, and writable or read-only.

A data segment in RM can have only one of each of the above choices (16-bit, 64KB limit, byte-granular, expand-up, writable), so someplace the above RM attributes must be acquired by DS, ES, FS, GS, and SS while still in PM. Typically this is done by allocating one or more selectors (in either the GDT or LDT) with the desired attributes. The actual addresses to which these selectors point are immaterial as, at this point, we're interested in the attributes only.

        cli

        mov     eax,DataSelector
        mov     ds,eax
        assume  ds:nothing

        mov     es,eax
        mov     fs,eax
        mov     gs,eax
        mov     ss,eax
        assume  es:nothing,fs:nothing,gs:nothing,ss:nothing

        mov     eax,cr0
        and     al,not 1
        mov     cr0,eax

        lidt    fword ptr RM_IDT

        sti
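
A descriptor for DataSelector might look like the following sketch. The label Data16Desc is hypothetical; the attribute values (64KB limit, byte-granular, expand-up, writable, present) are the ones Intel's documentation recommends for the data segment registers on the way back to real mode, and the base can be anything.

```asm
Data16Desc label qword          ; Hypothetical name for the GDT entry
        dw      0FFFFh          ; Limit bits 0-15: 64KB less 1
        dw      0               ; Base bits 0-15 (any value will do)
        db      0               ; Base bits 16-23
        db      10010010b       ; Present, DPL 0, data, expand-up, writable
        db      00000000b       ; Byte-granular, 16-bit, limit bits 16-19 = 0
        db      0               ; Base bits 24-31
```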

While in Real Mode

So far we've talked about what you need to do in PM only. However, there's work to do in RM as well. In particular, the segment registers need to be changed to their RM values. First, we'll do CS.

Note that just because we shifted to RM by clearing the PE bit (that's bit 0) in CR0 doesn't mean that the value in CS changed. That can be changed only by executing a far jump (or a far ret, but we don't have a valid stack at this point, so a ret is out).

        cli

        mov     eax,DataSelector
        mov     ds,eax
        assume  ds:nothing

        mov     es,eax
        mov     fs,eax
        mov     gs,eax
        mov     ss,eax
        assume  es:nothing,fs:nothing,gs:nothing,ss:nothing

        mov     eax,cr0
        and     al,not 1
        mov     cr0,eax

        lidt    fword ptr RM_IDT

        db      0EAh
        dw      offset cs:L1
        dw      seg CODE
L1:
        sti

The db statement defines a byte with the opcode for a far jump (0EAh). The first dw defines the word which the far jump uses as the instruction offset, and the second dw defines the RM segment of the label L1. Together, these three statements create a far jump to label L1.

Although the above technique is commonly used, there is a cleaner way as follows (assuming your assembler supports it):

        cli

        mov     eax,DataSelector
        mov     ds,eax
        assume  ds:nothing

        mov     es,eax
        mov     fs,eax
        mov     gs,eax
        mov     ss,eax
        assume  es:nothing,fs:nothing,gs:nothing,ss:nothing

        mov     eax,cr0
        and     al,not 1
        mov     cr0,eax

        lidt    fword ptr RM_IDT

        jmp     far ptr L1
L1:
        sti

Note that the far jump is placed inside the cli/sti instructions as we can't tolerate an RM interrupt with a PM selector value in CS — the interrupt handler would return to the wrong segment.

After returning to RM, the RM segment values can be loaded into the segment registers. The only one you need be careful about is SS which needs to be set before interrupts are enabled, as we can't tolerate an RM interrupt with a PM selector value in SS — the interrupt handler would use the wrong stack segment.

Putting it all together now ...

CR0FLAGS record  $PG:1,$CD:1,$NW:1,$CR0RSV0:10,$AM:1,$CR0RSV1:1,$WP:1,\
                 $CR0RSV2:10,$NE:1,$ET:1,$TS:1,$EM:1,$MP:1,$PE:1

CODE    segment use16           ; The PM selector which encompasses CODE
                                ;   must be use16, 64KB, byte-granular,
                                ;   non-conforming, readable.
                                ; You might also want to define it as
                                ;   byte-aligned and public in a class.
        assume  cs:CODE         ; Tell the assembler about it

PM2RM   proc                    ; near or far, your choice
        assume  ds:nothing,es:nothing,fs:nothing,gs:nothing,ss:nothing

        cli                     ; We can't afford an interrupt here

        mov     eax,DataSelector ; Get a selector with RM attrs
        mov     ds,eax          ; Load all selectors
        mov     es,eax          ; ...
        mov     fs,eax          ; ...
        mov     gs,eax          ; ...
        mov     ss,eax          ; ...
        assume  ds:nothing,es:nothing,fs:nothing,gs:nothing,ss:nothing

        mov     eax,cr0         ; Get current value
        and     al,not (mask $PE) ; Tell the CPU to enter RM
        mov     cr0,eax         ; Set current value, we're now in RM

        mov     ax,DataSegment  ; Get valid data segment
        mov     ds,ax           ; Set to known value
        assume  ds:RMDATA       ; Tell the assembler about it

        lidt    fword ptr RM_IDT ; Load the RM Interrupt Descriptor Table

        jmp     far ptr L1      ; Load CS with RM value
L1:
        lss     sp,StackPointer ; SS:SP ==> valid stack

        sti                     ; OK to interrupt now

; From here on, the instructions are optional (except for the ret)

        xor     ax,ax           ; A convenient zero
        mov     es,ax           ; Set to known value
        mov     fs,ax           ; ...
        mov     gs,ax           ; ...
        assume  es:nothing,fs:nothing,gs:nothing

        ret                     ; near or far return depending upon initial proc

        assume  ds:nothing,es:nothing,fs:nothing,gs:nothing,ss:nothing

PM2RM   endp
CODE    ends

Note that we set DS before loading the RM IDT in case it's needed to address that structure and/or the stack pointer.
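
The StackPointer operand used by lss above is a far pointer in memory. A minimal sketch of how it might be declared, assuming it lives in RMDATA and that the RM stack address was saved there before entering PM (SavedSP and SavedSS are hypothetical names):

```asm
RMDATA  segment use16           ; Assumed to be the segment DS addresses in RM
StackPointer label dword        ; Far pointer operand for LSS
SavedSP dw      ?               ; Offset, loaded into SP
SavedSS dw      ?               ; Segment, loaded into SS
RMDATA  ends
```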

Paging

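When paging is enabled, there are a few more bases to touch in a PM-to-RM transition: the code must be executing from identity-mapped memory (linear address equal to physical address), the PG bit in CR0 must be cleared before the PE bit, and CR3 should then be zeroed to flush the TLB.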

Putting it all together now ...

CR0FLAGS record  $PG:1,$CD:1,$NW:1,$CR0RSV0:10,$AM:1,$CR0RSV1:1,$WP:1,\
                 $CR0RSV2:10,$NE:1,$ET:1,$TS:1,$EM:1,$MP:1,$PE:1

CODE    segment use16           ; The PM selector which encompasses CODE
                                ;   must be use16, 64KB, byte-granular,
                                ;   non-conforming, readable.
                                ; You might also want to define it as
                                ;   byte-aligned and public in a class.
        assume  cs:CODE         ; Tell the assembler about it

PM2RM   proc                    ; near or far, your choice
        assume  ds:nothing,es:nothing,fs:nothing,gs:nothing,ss:nothing

        cli                     ; We can't afford an interrupt here

        mov     eax,DataSelector ; Get a selector with RM attrs
        mov     ds,eax          ; Load all selectors
        mov     es,eax          ; ...
        mov     fs,eax          ; ...
        mov     gs,eax          ; ...
        mov     ss,eax          ; ...
        assume  ds:nothing,es:nothing,fs:nothing,gs:nothing,ss:nothing

; We should be in identity-mapped memory at this point

        mov     eax,cr0         ; Get current value
        and     eax,not (mask $PG) ; Disable paging
        mov     cr0,eax         ; Set current value, paging is now disabled

        xor     eax,eax         ; A convenient zero
        mov     cr3,eax         ; Flush the TLB

        mov     eax,cr0         ; Get current value
        and     al,not (mask $PE) ; Tell the CPU to enter RM
        mov     cr0,eax         ; Set current value, we're now in RM

        mov     ax,DataSegment  ; Get valid data segment
        mov     ds,ax           ; Set to known value
        assume  ds:RMDATA       ; Tell the assembler about it

        lidt    fword ptr RM_IDT ; Load the RM Interrupt Descriptor Table

        jmp     far ptr L1      ; Load CS with RM value
L1:
        lss     sp,StackPointer ; SS:SP ==> valid stack

        sti                     ; OK to interrupt now

; From here on, the instructions are optional (except for the ret)

        xor     ax,ax           ; A convenient zero
        mov     es,ax           ; Set to known value
        mov     fs,ax           ; ...
        mov     gs,ax           ; ...
        assume  es:nothing,fs:nothing,gs:nothing

        ret                     ; near or far return depending upon initial proc

        assume  ds:nothing,es:nothing,fs:nothing,gs:nothing,ss:nothing

PM2RM   endp
CODE    ends

Author

This page was written by Bob Smith. Please send all comments and corrections to me.

Footnote

1 The rare circumstances mentioned above are those in which nothing set up in PM needs undoing. If you initialize any of various data structures in PM, you must, in a sense, uninitialize them in the process of transitioning from PM to RM. For example, if you load an IDT for use in PM, you must load an IDT for use in RM; if you load segment registers with PM selectors, you must load the same segment registers with RM segment values, etc.