Transition from Protected Mode to Real Mode

Introduction

The transition from Protected Mode (PM) to Real Mode (RM) can be difficult if you don't appreciate all the registers, data structures, and hidden elements you need to manage. This page attempts to deal with every element necessary to accomplish this mode switch.

In the process, we'll mention things you need to do outside the scope of the code presented here. The code is built up step by step from the initial instructions, and names such as RM_IDT, DataSelector, DataSegment, and StackPointer stand for values you need to substitute.

The Simplest Case

Consider the following barebones code which makes the transition from PM to RM:

        cli


        mov     eax,cr0
        and     al,not 1
        mov     cr0,eax


        sti

Indeed, this code does clear bit 0 (the Protect Enable (PE) bit) in Control Register 0, but there's a lot it misses. That is, except for certain rare circumstances¹, if the above code is all one does to make the PM-to-RM transition, then quite likely the system will reboot immediately.

However, it's a start and as more issues arise we'll add to this code to make it more robust.

Interrupts

The above code correctly disables and enables interrupts around the transition. However, PM and RM each use a different format for the interrupt table: PM uses an 8-byte entry per interrupt and RM uses a 4-byte entry. If you set up a PM IDT (Interrupt Descriptor Table) to use when in PM, then someplace you need to load the RM IDT before enabling interrupts in RM:

        cli


        mov     eax,cr0
        and     al,not 1
        mov     cr0,eax

        lidt    fword ptr RM_IDT

        sti

where RM_IDT is the address of the RM IDT. Note that in order for the IDT to be usable in RM it must reside in the first one megabyte of memory, typically at 0:0.
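
As a sketch of what RM_IDT might look like, assuming the RM interrupt table sits at its usual location 0:0 and holds all 256 vectors, the six-byte operand for lidt could be defined as follows (where you place this structure in your data is up to you):

RM_IDT  label   fword           ; Six-byte operand for lidt
        dw      03FFh           ; Limit: 256 vectors * 4 bytes, minus 1
        dd      00000000h       ; Base: linear address of 0:0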

Code Selector vs. Segment

A code selector in PM can have several different attributes:

use16 vs. use32
byte- vs. page-granular
non-conforming vs. conforming
readable vs. not-readable

A code segment in RM can have only one of each of the above choices:

use16
byte-granular
non-conforming
readable

so someplace the above RM attributes must be acquired by CS. Typically this is done by allocating a selector (in either the GDT or LDT) with the desired attributes and whose limit encompasses the above code, so your code can jump to it to complete the job of PM-to-RM.
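
As an illustration, a descriptor with these attributes might be laid out as shown here. This is a sketch only: the name CodeDesc is hypothetical, the 64KB limit is an assumption chosen to match an RM segment, and the base fields are left as zero to be filled in with the linear address of your code (segment value times 16).

CodeDesc label  qword           ; Hypothetical GDT or LDT entry
        dw      0FFFFh          ; Limit bits 15-0: 64KB minus 1
        dw      0               ; Base bits 15-0 (fill in for your code)
        db      0               ; Base bits 23-16 (fill in for your code)
        db      9Ah             ; Present, DPL 0, code, non-conforming, readable
        db      00h             ; Byte-granular, use16, limit bits 19-16 = 0
        db      0               ; Base bits 31-24 (zero below one megabyte)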

Data Selectors vs. Segments

A data selector in PM can have several different attributes:

small vs. big (meaningful for stacks and expand-down data segments only)
byte- vs. page-granular
expand-up vs. expand-down
writable vs. read-only

A data segment in RM can have only one of each of the above choices:

small
byte-granular
expand-up
writable

so someplace the above RM attributes must be acquired by DS, ES, FS, GS, and SS while still in PM. Typically this is done by allocating one or more selectors (in either the GDT or LDT) with the desired attributes. The actual addresses to which these selectors point are immaterial as, at this point, we're interested in the attributes only.

        cli

        mov     eax,DataSelector
        mov     ds,eax
        assume  ds:nothing


        mov     es,eax
        mov     fs,eax
        mov     gs,eax
        mov     ss,eax
        assume  es:nothing,fs:nothing,gs:nothing,ss:nothing

        mov     eax,cr0
        and     al,not 1
        mov     cr0,eax


        lidt    fword ptr RM_IDT


        sti
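
For reference, a data descriptor carrying the RM attributes listed above might be defined as follows. This is a sketch under assumptions: the name DataDesc is hypothetical, DataSelector would be the selector that refers to this entry, and the 64KB limit is assumed here so that the segment limit also matches RM.

DataDesc label  qword           ; Hypothetical GDT or LDT entry
        dw      0FFFFh          ; Limit bits 15-0: 64KB minus 1
        dw      0               ; Base bits 15-0 (value is immaterial)
        db      0               ; Base bits 23-16 (value is immaterial)
        db      92h             ; Present, DPL 0, data, expand-up, writable
        db      00h             ; Byte-granular, small, limit bits 19-16 = 0
        db      0               ; Base bits 31-24 (value is immaterial)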

While in Real Mode

So far we've talked only about what you need to do in PM; however, there's work to do in RM as well. In particular, the segment registers need to be changed to their RM values. First, we'll do CS.

Note that just because we shifted to RM by clearing the PE bit (bit 0) in CR0 doesn't mean that the value in CS changed. It can be changed only by executing a far jump (or a far ret, but we don't have a valid stack at this point, so a ret is out).

        cli


        mov     eax,DataSelector
        mov     ds,eax
        assume  ds:nothing


        mov     es,eax
        mov     fs,eax
        mov     gs,eax
        mov     ss,eax
        assume  es:nothing,fs:nothing,gs:nothing,ss:nothing


        mov     eax,cr0
        and     al,not 1
        mov     cr0,eax


        lidt    fword ptr RM_IDT

        db      0EAh
        dw      offset cs:L1
        dw      seg CODE
L1:

        sti

The db statement defines a byte with the opcode for a far jump (0EAh). The first dw defines the word used by the far jump as the instruction offset, and the second dw defines the RM segment of the label L1. Together, these three statements create a far jump to label L1.

Although the above technique is commonly used, there is a cleaner way as follows (assuming your assembler supports it):

        cli


        mov     eax,DataSelector
        mov     ds,eax
        assume  ds:nothing


        mov     es,eax
        mov     fs,eax
        mov     gs,eax
        mov     ss,eax
        assume  es:nothing,fs:nothing,gs:nothing,ss:nothing


        mov     eax,cr0
        and     al,not 1
        mov     cr0,eax


        lidt    fword ptr RM_IDT

        jmp     far ptr L1
L1:

        sti

Note that the far jump is placed inside the cli/sti instructions as we can't tolerate an RM interrupt with a PM selector value in CS — the interrupt handler would return to the wrong segment.

After returning to RM, the RM segment values can be loaded into the segment registers. The only one you need be careful about is SS which needs to be set before interrupts are enabled, as we can't tolerate an RM interrupt with a PM selector value in SS — the interrupt handler would use the wrong stack segment.
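
One way to have a valid SS:SP on hand is to record it in RM before the switch to PM and reload both registers with a single lss instruction afterwards. The fragments below are a minimal sketch, assuming a hypothetical RM data segment named RMDATA that DS addresses at both points; the names StackPointer, SaveSP, and SaveSS are placeholders.

RMDATA  segment use16
StackPointer label dword        ; Operand for lss: offset word, then segment word
SaveSP  dw      ?               ; RM SP, recorded before entering PM
SaveSS  dw      ?               ; RM SS, recorded before entering PM
RMDATA  ends

; Fragment 1: in RM, before entering PM (DS addresses RMDATA)
        mov     SaveSP,sp       ; Record the RM stack offset
        mov     SaveSS,ss       ; Record the RM stack segment

; Fragment 2: in RM again, after the far jump and before sti
        lss     sp,StackPointer ; Load SS and SP together

Loading both registers with one instruction avoids a window in which SS and SP disagree.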

Putting it all together now ...

CR0FLAGS record  $PG:1,$CD:1,$NW:1,$CR0RSV0:10,$AM:1,$CR0RSV1:1,$WP:1,\
                 $CR0RSV2:10,$NE:1,$ET:1,$TS:1,$EM:1,$MP:1,$PE:1


CODE    segment use16           ; The PM selector which encompasses CODE
                                ; must be use16, 64KB, byte-granular,
                                ; non-conforming, readable.
                                ; You might also want to define it as
                                ; byte-aligned and public in a class.
        assume  cs:CODE         ; Tell the assembler about it


PM2RM   proc                    ; near or far, your choice
        assume  ds:nothing,es:nothing,fs:nothing,gs:nothing,ss:nothing

        cli                     ; We can't afford an interrupt here


        mov     eax,DataSelector ; Get a selector with RM attrs
        mov     ds,eax          ; Load all selectors
        mov     es,eax          ; ...
        mov     fs,eax          ; ...
        mov     gs,eax          ; ...
        mov     ss,eax          ; ...
        assume  ds:nothing,es:nothing,fs:nothing,gs:nothing,ss:nothing


        mov     eax,cr0         ; Get current value
        and     al,not (mask $PE) ; Tell the CPU to enter RM
        mov     cr0,eax         ; Set current value, we're now in RM

        mov     ax,DataSegment  ; Get valid data segment
        mov     ds,ax           ; Set to known value
        assume  ds:RMDATA       ; Tell the assembler about it

        lidt    fword ptr RM_IDT ; Load the RM Interrupt Descriptor Table


        jmp     far ptr L1      ; Load CS with RM value
L1:

        lss     sp,StackPointer ; SS:SP ==> valid stack

        sti                     ; OK to interrupt now

; From here on, the instructions are optional (except for the ret)


        xor     ax,ax           ; A convenient zero
        mov     es,ax           ; Set to known value
        mov     fs,ax           ; ...
        mov     gs,ax           ; ...
        assume  es:nothing,fs:nothing,gs:nothing


        ret                     ; near or far return depending upon initial proc


        assume  ds:nothing,es:nothing,fs:nothing,gs:nothing,ss:nothing


PM2RM   endp


CODE    ends

Note that we set DS before loading the RM IDT in case it's needed to address that structure and/or the stack pointer.
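
For readers unfamiliar with MASM records: the fields of CR0FLAGS are listed from the high-order bit down, and the mask operator yields the bit mask of a field. As an illustration (the equate names are hypothetical and are not used elsewhere on this page):

PE_MASK equ     mask $PE        ; 00000001h, bit 0 of CR0
PG_MASK equ     mask $PG        ; 80000000h, bit 31 of CR0

Because $PE lies in the low byte, it can be cleared by operating on al alone.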

Paging

When paging is enabled, there are a few more bases to touch in a PM-to-RM transition.

The CODE segment with the RM attributes must have its linear and physical addresses the same (called identity-mapped memory). This is necessary because when we disable paging, the next instruction in the linear address space must be the same as the next instruction in the physical address space. Usually, this condition is met in the process of initially defining the CODE segment.
In case there are any references (direct or indirect) to the GDT, LDT, or IDT after paging is disabled, those data structures must be moved to identity-mapped memory. The easiest way to do this is to define them that way in the first place.
Disable paging.
Flush the TLB (Translation Lookaside Buffer).

Putting it all together now ...

CR0FLAGS record  $PG:1,$CD:1,$NW:1,$CR0RSV0:10,$AM:1,$CR0RSV1:1,$WP:1,\
                 $CR0RSV2:10,$NE:1,$ET:1,$TS:1,$EM:1,$MP:1,$PE:1


CODE    segment use16           ; The PM selector which encompasses CODE
                                ; must be use16, 64KB, byte-granular,
                                ; non-conforming, readable.
                                ; You might also want to define it as
                                ; byte-aligned and public in a class.
        assume  cs:CODE         ; Tell the assembler about it


PM2RM   proc                    ; near or far, your choice
        assume  ds:nothing,es:nothing,fs:nothing,gs:nothing,ss:nothing


        cli                     ; We can't afford an interrupt here


        mov     eax,DataSelector ; Get a selector with RM attrs
        mov     ds,eax          ; Load all selectors
        mov     es,eax          ; ...
        mov     fs,eax          ; ...
        mov     gs,eax          ; ...
        mov     ss,eax          ; ...
        assume  ds:nothing,es:nothing,fs:nothing,gs:nothing,ss:nothing

; We should be in identity-mapped memory at this point


        mov     eax,cr0         ; Get current value
        and     eax,not (mask $PG) ; Disable paging
        mov     cr0,eax         ; Set current value, paging is now disabled


        xor     eax,eax         ; A convenient zero
        mov     cr3,eax         ; Flush the TLB

        mov     eax,cr0         ; Get current value
        and     al,not (mask $PE) ; Tell the CPU to enter RM
        mov     cr0,eax         ; Set current value, we're now in RM


        mov     ax,DataSegment  ; Get valid data segment
        mov     ds,ax           ; Set to known value
        assume  ds:RMDATA       ; Tell the assembler about it


        lidt    fword ptr RM_IDT ; Load the RM Interrupt Descriptor Table


        jmp     far ptr L1      ; Load CS with RM value
L1:
        lss     sp,StackPointer ; SS:SP ==> valid stack


        sti                     ; OK to interrupt now


; From here on, the instructions are optional (except for the ret)


        xor     ax,ax           ; A convenient zero
        mov     es,ax           ; Set to known value
        mov     fs,ax           ; ...
        mov     gs,ax           ; ...
        assume  es:nothing,fs:nothing,gs:nothing


        ret                     ; near or far return depending upon initial proc


        assume  ds:nothing,es:nothing,fs:nothing,gs:nothing,ss:nothing


PM2RM   endp


CODE    ends

Author

This page was written by Bob Smith. Please send all comments and corrections to me.

Footnote

¹ The rare circumstances mentioned above come down to this: if you initialize any of various data structures in PM, you must, in a sense, uninitialize them in the process of transitioning from PM to RM. For example, if you load an IDT for use in PM, you must load an IDT for use in RM; if you load segment registers with PM selectors, you must load the same segment registers with RM segment values, etc.

NARS2000 © 2006-2020