Memory Management - Part 1: Virtual memory and Paging concepts


Memory is the part of a computer system that stores the code and data that running programs work with. Because the amount of physical memory is limited, managing it well is critical, and the memory manager is one of the most important components of an operating system kernel. Its job is to make memory available to units of execution (such as processes and threads), allocating it to them and cooperating with the memory management unit (MMU) in the processor to do so.

Virtual Memory

Virtual memory is one of the memory management techniques used by operating systems. It gives each process the illusion of having a very large memory space at its disposal, even when the main memory is much more limited. In this model, the operating system divides the virtual address space into fixed-size pieces called pages, which simplifies mapping and accessing different memory locations. Each page can be backed by a location in main memory or in secondary storage, so some pages of a process can reside in RAM while others are stored on disk. Figure 1 illustrates this idea.

Figure 1: Abstraction of process pages and their mapped locations in main memory and secondary memory

This approach offers two main benefits:

  1. Isolation: Each process operates within its own virtual address space, isolating it from other processes. This prevents one process from accessing or modifying the memory of another process unless explicitly permitted, enhancing security and stability.
  2. Efficiency: The operating system loads only the parts of a process’s memory that are actively being used into RAM, leaving less frequently used parts on disk. This optimizes the use of physical memory and ensures that more memory is available for other tasks.

Page Sizes in Virtual Memory

Modern operating systems use different page sizes to balance performance and flexibility. The size of a page affects how memory is allocated and managed. For example, there are three common page sizes on Intel’s 64-bit architecture:

  1. Standard/Small Pages (4 KB): In Intel’s 64-bit architecture, the size of these pages is 4 kilobytes. They are the most commonly used pages in virtual memory systems and serve as the basic unit of allocation for most processes. Their small size lets the operating system manage memory at a fine granularity. For programs that work with a large amount of memory, however, small pages reduce efficiency, because many more page table entries are needed to cover the same range.
  2. Large Pages (2 MB): These pages are usually suitable for programs that need to access large chunks of memory. Because each page is larger, fewer page tables are needed for mapping, which reduces the operating system’s memory management burden. However, larger pages reduce the flexibility of the operating system, because smaller parts of memory can no longer be controlled individually.
  3. Huge Pages (1 GB): These very large pages are used by applications that access very large volumes of data, typically on servers or in high-performance computing environments. Huge pages remove the need for a large number of page tables, which reduces the operating system’s memory management burden. They suit extremely large workloads such as very large databases, virtual machines and big data analysis. For the same reason as large pages, they are inflexible and are used only in special cases.

The table below shows the size of pages in different architectures.

Architecture   Small Pages   Large Pages                      Huge Pages
x86-32         4 KB          2 MB (with PAE; 4 MB without)    -
x86-64         4 KB          2 MB                             1 GB
ARM            4 KB          4 MB                             -
ARM64          4 KB          2 MB                             512 MB

Memory Page States

In addition to different sizes, memory pages can exist in various states within the system. These states determine how the pages are used and whether they are available for allocation:

  1. Committed Pages: Committed pages are backed by physical storage, either a frame in physical memory or space in the paging file. These pages are available for use by the application, which stores and processes its data in them. When a unit of execution requests memory from the kernel, the operating system commits those pages so that they can be used to store and reuse data.
  2. Reserved Pages: Reserved pages occupy part of the virtual address space, but are not yet mapped to physical memory or paging space. They are set aside for future use and are not committed until the application asks the operating system to commit them. This keeps the address range available to the application without using physical memory unnecessarily. If the program accesses these pages without first committing them, the access raises an access violation exception, and the operating system terminates the process if the exception is not handled.
  3. Free Pages: Free pages are neither reserved nor committed. They are available to be allocated by the operating system as needed. When an application requests memory, free pages can be reserved first and then committed as needed. As with reserved pages, any access to these pages before they have been reserved and committed results in an access violation.

By managing these page states, the operating system ensures that memory is allocated efficiently, reusing and swapping pages between RAM and disk when necessary.

Guard Pages

Guard pages are a specialized type of virtual memory page used to protect memory regions and manage dynamic memory growth. They are marked with a special protection mask that triggers an exception (STATUS_GUARD_PAGE_VIOLATION) when accessed, signaling that the boundary of committed memory has been reached.

This exception can be handled by the program using a try/except block, allowing it to take action such as committing more memory. Once the guard page is accessed and the exception is handled, the system typically commits more memory to meet the program’s needs, and the previously protected guard page becomes part of the committed memory. A new guard page is then set at the next boundary to continue the protection.

Guard pages are commonly used for managing stack growth in programs. A stack can reserve more memory than is immediately needed, committing additional pages dynamically as the program requires them, with guard pages ensuring that memory overruns are caught safely.

Reserved memory for guard pages has a minimal impact on system resources, as the memory is only committed when it is accessed, ensuring efficient resource usage.

Windows VMMap Example

Figure 2 shows a screenshot from VMMap, a Windows tool for visualizing memory usage. It highlights how guard pages are visually distinct from other memory types and how they are used to manage the growth of memory regions dynamically.

Figure 2: Display of pages and guard page mechanism in VMMAP software of Windows operating system

Page Table Hierarchy and Address Translation

In a virtual memory system there are two kinds of addresses: virtual (or logical) addresses and physical addresses. A virtual address belongs to the virtual address space dedicated to a process, while a physical address is the exact location of the data in physical memory. Although allocating memory and creating and managing pages is the responsibility of the operating system, translating virtual addresses to physical addresses is the job of a hardware component called the Memory Management Unit (MMU). The MMU performs address translation, access control and cache control. For translation, the processor hands the MMU a virtual address, and the MMU uses the page tables set up by the operating system to find the corresponding physical address. Figure 3 shows an arrangement in which the MMU is a separate hardware unit that the processor uses to access memory.

Figure 3: Abstraction of an old architecture for communication between MMU and addresses

In the x86-64 architecture, virtual addresses are nominally 64 bits wide, but only the lower 48 bits are used because of limitations in current implementations. This means that 256 TB of virtual memory is addressable. The upper 16 bits must be copies of bit 47, which splits the usable range into a lower half and an upper half; that is why there is an unused gap between user space and kernel space. The architecture also defines a 5-level paging mode, but this article examines only the four levels used by the Windows systems discussed here.

What’s the use of page levels?

Systems with a large address space, such as x86-64, need a way to manage the whole virtual address space economically. Splitting the translation into levels means that large unused parts of the address space need no page tables at all: lower-level tables are created only for the regions that are actually in use.

Page Levels in the Intel x86-64 Architecture

As mentioned, the page tables used by the Windows operating system have four levels:

  1. Page Map Level 4 (PML4)
  2. Page Directory Pointer Table (PDPT)
  3. Page Directory Table (PDT)
  4. Page Table (PT)

PML4

This table is the first level of the page table hierarchy on x86-64 systems. It has 512 entries, and each entry can point to a PDP table on the second level. In effect, the PML4 holds the overall map of the largest parts of the address space. Its main use is to divide a very large address space into smaller parts, so that the operating system does not need to manage the entire space directly. Each entry in the PML4 covers 512 GB of virtual address space.

PDPT

PDPT is the second level of the page table and each entry from this level points to a PD table. The PDP table also has 512 entries and each entry can manage 1 GB of virtual space. If Huge pages are used, this level can directly point to physical addresses without the need for lower levels, and in this case, each entry in the PDP table can handle a 1 GB page.

PDT

The PDT is the third level of the page table hierarchy and, like the previous levels, has 512 entries. Each entry in the PD table covers 2 MB of virtual address space and points to a PT. If large 2 MB pages are used, a PD entry can point directly to a physical page and there is no need for a PT level. In this case, the translation process becomes faster and simpler because fewer levels are required.

PT

This is the last level of the page table hierarchy. The PT also has 512 entries, and each entry points to a 4 KB physical frame. This hierarchical structure manages large virtual address spaces efficiently and avoids the cost of maintaining one huge flat page table.

Figure 4: Paging Levels from AMD Architecture programmer’s manual, volume 2

The virtual address is divided into 5 parts. Each of the first four is used to index into a different level of the page table hierarchy, and the last is the offset within the page. On x86-64 systems with 4-level paging, the bits are used as follows:

  • Bits 47-39: Used to index into the PML4.
  • Bits 38-30: Used to index into the PDPT.
  • Bits 29-21: Used to index into the PDT.
  • Bits 20-12: Used to index into the PT.
  • Bits 11-0: Represent the offset within the 4 KB page.

Example: Translating a 64-bit Virtual Address

Consider a 64-bit virtual address like 0x00007FFFFFFFFFFF. In this address bit 47 is 0 and bits 46-0 are all 1. The address is divided into segments, with each segment used to index into a different level of the page table hierarchy. The breakdown is as follows:

Segment   Binary Value    Purpose
PML4      011111111       Index into the PML4 table
PDPT      111111111       Index into the PDPT
PDT       111111111       Index into the PDT
PT        111111111       Index into the PT
Offset    111111111111    Offset within the page

This structure means that the processor uses these bits to navigate the multi-level page table hierarchy. The PML4 entry is used to locate the corresponding PDPT entry, and so on, until the final physical page is identified, and the offset is applied to access the specific data within the page.

Control Registers for Paging

On x86 and x86-64 architectures, control registers (specifically CR0, CR2, CR3, and CR4) play a crucial role in managing the paging system and overall memory management. These registers are used to configure how the processor handles virtual memory, paging, and other essential tasks related to memory protection and execution.

CR0 Register

The CR0 register controls several critical operating modes of the CPU, including enabling and disabling paging. Its relevant flags include:

  • PG (Paging Enable): This flag controls whether paging is enabled or disabled. When set, the processor translates virtual addresses into physical addresses using the page tables.
  • PE (Protection Enable): This flag, when set, enables protected mode, which allows for the use of paging and access control based on privilege levels.
  • WP (Write Protect): This flag, when set, prevents supervisor-level code from writing to read-only pages, so that page-level write protection also applies to the kernel. This provides an additional layer of memory protection.

CR2 Register

The CR2 register holds the faulting address in the event of a page fault. When a page fault occurs (such as when a process tries to access a non-committed or non-existent page), the memory address that caused the fault is stored in CR2, allowing the operating system to handle the exception properly and decide how to proceed (e.g., by loading the page from disk or terminating the process).

CR3 Register

The CR3 register contains the physical address of the base of the page directory. This register is crucial in paging because it allows the CPU to locate the page tables needed for virtual address translation. The page directory is a top-level data structure that the memory management unit (MMU) uses to start the process of translating a virtual address into a physical address. The contents of CR3 are updated whenever a context switch occurs to change the address space of the running process.

In x86-64 systems, CR3 is responsible for pointing to the base of the PML4 (Page Map Level 4) table, which is the top level of the page table hierarchy.

CR4 Register

The CR4 register controls various advanced CPU features, many of which are related to memory management and protection mechanisms. Some of the relevant bits in CR4 include:

  • PAE (Physical Address Extension): This flag enables page table entries that can hold physical addresses wider than 32 bits (36 bits on the first processors that supported it), which allows more physical memory to be addressed than the 4 GB limit of 32-bit addresses. It must also be set before the processor can enter 64-bit long mode.
  • PSE (Page Size Extension): This flag enables 4 MB pages in 32-bit paging without PAE. With PAE or in x86-64 long mode, 2 MB pages are available regardless of this flag. Large pages can improve performance by reducing the overhead of managing many smaller 4 KB pages.
  • SMEP (Supervisor Mode Execution Protection): This bit prevents supervisor-mode code (e.g., kernel code) from executing instructions located in user-mode pages, which enhances security by reducing the risk of privilege escalation attacks.
  • NXE (No-Execute Enable): Strictly speaking, this flag lives in the IA32_EFER model-specific register rather than in CR4, but it belongs to the same group of protections. It enables support for the NX (No-Execute) bit in page tables, which marks pages as non-executable, preventing code execution from regions of memory not intended to contain executable code (such as the stack or heap).

Together, these control registers form the backbone of paging and memory protection mechanisms in modern x86 and x86-64 systems. They ensure efficient and secure handling of memory by controlling how virtual addresses are translated, how exceptions are handled, and how memory protection is enforced.

Example 1: Examining virtual and physical memory using Windbg

To practically work with addresses in kernel mode using Windbg on a Windows 10 virtual machine, the first step is to change the context to the desired process. Here’s a detailed step-by-step explanation:

Step 1: Get the List of Processes

To begin, I attached to my Windows 10 VM via Windbg in kernel mode. To check the pages of a process in kernel mode, we first need to change the context to that process. Use the command !process 0 0 to get a list of all processes running on the system.

Figure 5: The processes list in windbg

Step 2: Change Context to a Specific Process

For this example, I am changing the context to cmd.exe. To do so, I need the address of the EPROCESS object for cmd.exe, which is ffffdd8291b29300. Then, I pass that address to the .process /i command to change the context (Figure 6). Windbg will prompt me to enter g to complete the context switch.

Figure 6: Changing process context in windbg

Step 3: Find Virtual Address Descriptors (VAD)

Next, we need to find the page addresses. The !vad command provides details about the virtual address descriptors (VAD) of the process.

Figure 7: Virtual Address Descriptors in windbg

Step 4: Retrieve Page Table Entry (PTE)

From the VAD list, I pick one address at random and pass it to the !pte command to retrieve its page table entries, as shown in Figure 8.

Figure 8: The Page Table entries

Step 5: Determine PML4 and CR3 Values

To find the physical address for the page, we need the PML4 entry address (FFFFEFF7FBFDFDD8) and the CR3 register value. The CR3 value is obtained by running:

r cr3
Figure 9: The cr3 register value

The CR3 value is 00000000ab994002. The low 12 bits are not part of the table address, so to get the physical base of the PML4 they should be set to zero, resulting in 00000000AB994000.

Step 6: Convert Virtual Address to Physical Address

Using the !vtop command, I can convert the virtual address to a physical address:

!vtop 00000000AB994000 FFFFEFF7FBFDFDD8
Figure 10: Converting virtual address to the physical address using vtop command in windbg

The output shows the translation of the virtual address FFFFEFF7FBFDFDD8 to the physical address ab994dd8.

Step 7: Verify Using Physical and Virtual Dump Commands

Finally, to verify the data in both the physical and virtual addresses, use the !dq command for the physical address:

!dq 00000000ab994dd8

And for the virtual address:

dq FFFFEFF7FBFDFDD8

Both commands will display the corresponding memory content:

Figure 11: Comparing virtual memory and physical memory contents

Example 2: Finding the Physical Address of a Page Using the PFN Formula in Windows

In the context of Windows memory management, PFN stands for Page Frame Number. It’s an index that represents a page’s location within physical memory, used by the memory manager to map virtual memory addresses to physical addresses. Each PFN points to a specific page in physical memory, helping the system to translate a virtual address into the actual physical address in RAM.

When dealing with page table entries in Windows, the PFN typically occupies a part of the PTE. For example, in a typical 4 KB page configuration, the PFN, combined with the page offset, helps calculate the exact physical address.

To translate a virtual address to a physical address using PFN, we can use the following formula:

Physical address = (PFN * Page_Size) + Offset

Let’s examine how this applies with a sample C program that allocates memory at a specific virtual address:

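#include <windows.h>
#include <stdio.h>
#include <string.h>  /* memset */

int main(void) {
    /* Requested base address; it should be aligned to the allocation
       granularity (typically 64 KB), which 0x50000 is. */
    LPVOID lpAddress = (LPVOID)0x50000;

    /* Reserve and commit one 4 KB page at the requested address. */
    LPVOID allocatedMemory = VirtualAlloc(
        lpAddress,
        4096,
        MEM_COMMIT | MEM_RESERVE,
        PAGE_READWRITE
    );

    if (allocatedMemory == NULL) {
        /* GetLastError returns a DWORD, so print it as an unsigned long. */
        printf("Memory allocation failed at 0x50000. Error: %lu\n", GetLastError());
    }
    else {
        printf("Memory successfully allocated at address: 0x%p\n", allocatedMemory);
        /* Fill the page with a recognizable pattern to look for in the debugger. */
        memset(allocatedMemory, 0x7f, 4096);
    }

    /* Keep the process alive so that it can be inspected from WinDbg. */
    getchar();

    if (allocatedMemory != NULL) {
        /* With MEM_RELEASE the size must be 0; the whole allocation is freed. */
        VirtualFree(allocatedMemory, 0, MEM_RELEASE);
        printf("Memory freed.\n");
    }

    return 0;
}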

This code allocates a 4KB page at virtual address 0x50000 and sets each byte to 0x7f. After running this code, you can check the PTE for this address in WinDbg:

2: kd> !pte 0x0000000000050000
                                           VA 0000000000050000
PXE at FFFFD8EC763B1000    PPE at FFFFD8EC76200000    PDE at FFFFD8EC40000000    PTE at FFFFD88000000280
contains 8A00000024489867  contains 0A0000011CCE4867  contains 0A00000114BE5867  contains 81000000360E6867
pfn 24489     ---DA--UW-V  pfn 11cce4    ---DA--UWEV  pfn 114be5    ---DA--UWEV  pfn 360e6     ---DA--UW-V

Here, the PFN for the page table entry (PTE) is 0x360e6. Since each page is 4KB, we can calculate the physical address:

Physical Address = 0x360e6 * 0x1000 + 0 = 0x360e6000

To verify, use the dq command to view the contents at the virtual address and the !dq command to view the contents at the physical address:

2: kd> dq 0x50000
00000000`00050000  7f7f7f7f`7f7f7f7f 7f7f7f7f`7f7f7f7f
00000000`00050010  7f7f7f7f`7f7f7f7f 7f7f7f7f`7f7f7f7f
00000000`00050020  7f7f7f7f`7f7f7f7f 7f7f7f7f`7f7f7f7f
00000000`00050030  7f7f7f7f`7f7f7f7f 7f7f7f7f`7f7f7f7f
00000000`00050040  7f7f7f7f`7f7f7f7f 7f7f7f7f`7f7f7f7f
00000000`00050050  7f7f7f7f`7f7f7f7f 7f7f7f7f`7f7f7f7f
00000000`00050060  7f7f7f7f`7f7f7f7f 7f7f7f7f`7f7f7f7f
00000000`00050070  7f7f7f7f`7f7f7f7f 7f7f7f7f`7f7f7f7f
2: kd> !dq 0x360e6000
#360e6000 7f7f7f7f`7f7f7f7f 7f7f7f7f`7f7f7f7f
#360e6010 7f7f7f7f`7f7f7f7f 7f7f7f7f`7f7f7f7f
#360e6020 7f7f7f7f`7f7f7f7f 7f7f7f7f`7f7f7f7f
#360e6030 7f7f7f7f`7f7f7f7f 7f7f7f7f`7f7f7f7f
#360e6040 7f7f7f7f`7f7f7f7f 7f7f7f7f`7f7f7f7f
#360e6050 7f7f7f7f`7f7f7f7f 7f7f7f7f`7f7f7f7f
#360e6060 7f7f7f7f`7f7f7f7f 7f7f7f7f`7f7f7f7f
#360e6070 7f7f7f7f`7f7f7f7f 7f7f7f7f`7f7f7f7f
