Prior to 1990, most PCs were DOS-based, and the physical memory installed was enough to hold the DOS operating system and whatever application program was running. However, application programs kept getting larger and more complex, and the engineers at Microsoft were developing a new operating system that would allow "multi-tasking", i.e., the ability to have more than one application program in memory at the same time. Although physical memory continued to increase in size and decrease in price, a multi-tasking operating system would need much more memory than was physically installed. The development of "virtual memory" allowed PC multi-tasking operating systems to become a reality, and helped establish Windows as the dominant PC operating system in the world.

Virtual memory is a memory management scheme that allows more memory to be used than is physically available. In fact, virtual memory allows the (virtual) use of the PC's theoretical maximum amount of memory, which depends on the width of the processor's address bus.

Recall that the address bus in an Intel Core 2 Duo processor is 36 bits wide, giving a maximum theoretical RAM size of 2^36 bytes, or 64 GB, even though most PCs can only accommodate 2-4 GB. In a virtual memory system, the theoretical limit becomes a reality. In other words, you could (conceivably) be running 32 different application programs, each 2 GB in size, or working on a document in MS Word that is almost 64 GB in size, even though your computer only has 2 GB of RAM!
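The arithmetic behind these figures can be checked with a short Python sketch. The function name addressable_bytes and the constant GIB are made up for this illustration; the only input taken from the text is the 36-bit address width.

```python
def addressable_bytes(address_bits: int) -> int:
    """Number of distinct byte addresses available with the given address width."""
    return 2 ** address_bits


GIB = 2 ** 30  # one gigabyte, counted in binary units

max_bytes = addressable_bytes(36)

print(max_bytes // GIB)        # theoretical maximum in GB: 64
print(max_bytes // (2 * GIB))  # how many 2 GB programs fit in that space: 32
```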

 

Paging

Virtual memory is accomplished by using a paging technique similar to the one that was used to implement expanded memory, except that the hard disk is now used to hold the portions of the data that are not active, rather than a memory expansion card. 

Application programs and data are broken up into small, fixed-size blocks called pages. Only a small number of these pages are actually resident in RAM; the rest are stored in a temporary area on the hard disk known as a swap file. When the application refers to data that is not currently in a page in RAM, a page fault occurs. The operating system then decides which pages in RAM are least likely to be needed soon, and removes these pages from RAM (writing them back to the swap file if they have been modified). The required data is then moved from the swap file into the page frames left empty by the removed pages.
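The page-fault sequence described above can be sketched as a small simulation. This is only an illustration under stated assumptions: the class name PagedMemory is hypothetical, pages are represented by plain integers, and "least likely to be needed soon" is approximated with a least-recently-used (LRU) rule, which is one common choice and not necessarily what any particular operating system does.

```python
from collections import OrderedDict


class PagedMemory:
    """Toy model of RAM holding a fixed number of page frames (hypothetical)."""

    def __init__(self, num_frames):
        self.num_frames = num_frames
        # Resident pages, ordered from least recently used to most recently used.
        self.resident = OrderedDict()
        self.page_faults = 0

    def access(self, page):
        if page in self.resident:
            # The page is already in RAM: mark it as recently used.
            self.resident.move_to_end(page)
            return "hit"
        # Page fault: the page must be brought in from the swap file.
        self.page_faults += 1
        if len(self.resident) >= self.num_frames:
            # No free frame: evict the least recently used page (assumed policy).
            self.resident.popitem(last=False)
        self.resident[page] = True
        return "fault"


memory = PagedMemory(num_frames=3)
results = [memory.access(p) for p in [0, 1, 2, 0, 3, 0, 1]]
print(results)
print(memory.page_faults)  # 5
```

With three frames, the reference to page 3 evicts page 1 (the least recently used), so the later reference to page 1 faults again.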

Thrashing

Operating systems like Windows can automatically vary the amount of disk space used for the swap file in a virtual memory system. Occasionally, in systems where the hard disk is nearly full, or the amount of RAM is small, the operating system spends more time swapping pages in and out of memory than executing the application. This is known as thrashing, and is characterized by an excessive amount of disk activity, even when it appears that your application is idle.
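A rough way to see thrashing is to count page faults when a program repeatedly touches slightly more pages than there are free frames. The sketch below assumes a least-recently-used eviction rule and a made-up reference pattern; count_page_faults is a hypothetical helper, not an operating system facility.

```python
from collections import OrderedDict


def count_page_faults(references, num_frames):
    """Count page faults for a sequence of page references (LRU eviction assumed)."""
    resident = OrderedDict()  # least recently used page first
    faults = 0
    for page in references:
        if page in resident:
            resident.move_to_end(page)
            continue
        faults += 1
        if len(resident) >= num_frames:
            resident.popitem(last=False)  # evict the least recently used page
        resident[page] = True
    return faults


# A program that loops over the same four pages, 100 references in total.
references = [0, 1, 2, 3] * 25

print(count_page_faults(references, num_frames=4))  # 4
print(count_page_faults(references, num_frames=3))  # 100
```

With four frames the program faults only while loading its pages for the first time. With one frame fewer, every single reference causes a page fault, so nearly all the time would be spent on disk activity.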

 

Disk Caching

In a virtual memory system, if the required instruction or data is not found in RAM, then a page fault occurs. A disk access is now required, since the operating system must locate the required instruction or data in the swap file on the hard disk, and load the required page into RAM.

To reduce disk accesses, the operating system may reserve a portion of memory to be used as a disk cache.  The disk cache works on the same principle as the L1 and L2 caches – that certain instructions and data in a program are used more frequently than others.

When a page fault occurs, the operating system must remove some pages from RAM in order to make room for the incoming pages. Instead of simply discarding these pages, the operating system moves them to the disk cache, in the hope that they will be needed again soon. Thus, when a page fault occurs, the operating system will look in the disk cache before fetching pages from the swap file on the hard disk.
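The idea of parking evicted pages in a disk cache can be sketched as follows. Everything here is an assumption made for illustration: the class name PagedMemoryWithDiskCache, the least-recently-used eviction rule, and the sizes used in the example. The sketch only counts how many page faults end up requiring a real disk read.

```python
from collections import OrderedDict


class PagedMemoryWithDiskCache:
    """Toy model: pages evicted from RAM are kept in a small disk cache (hypothetical)."""

    def __init__(self, num_frames, cache_size):
        self.num_frames = num_frames
        self.cache_size = cache_size
        self.resident = OrderedDict()    # pages in RAM, least recently used first
        self.disk_cache = OrderedDict()  # evicted pages, oldest first
        self.disk_reads = 0

    def access(self, page):
        if page in self.resident:
            self.resident.move_to_end(page)
            return "ram"
        # Page fault: check the disk cache before going to the hard disk.
        if page in self.disk_cache:
            del self.disk_cache[page]
            source = "disk cache"
        else:
            self.disk_reads += 1
            source = "swap file"
        if len(self.resident) >= self.num_frames:
            # Make room, but keep the evicted page in the disk cache.
            victim, _ = self.resident.popitem(last=False)
            self.disk_cache[victim] = True
            if len(self.disk_cache) > self.cache_size:
                self.disk_cache.popitem(last=False)  # cache full: drop oldest entry
        self.resident[page] = True
        return source


memory = PagedMemoryWithDiskCache(num_frames=2, cache_size=2)
sources = [memory.access(p) for p in [0, 1, 2, 0, 1]]
print(sources)
print(memory.disk_reads)  # 3
```

In this run the last two references are page faults, but both are satisfied from the disk cache, so only the first three references need the hard disk.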

The disk cache is created and maintained by the operating system.  Computers with large amounts of RAM may set aside several MB for disk caching.  (An operating system like Windows can use up to one-quarter of the total installed RAM as a disk cache!)

When a program is executing, the CPU is continuously fetching and decoding millions of instructions every second. To obtain the next instruction or piece of data, the system uses the following search path:
 
1. Look for the instruction in the L1 cache (built into the CPU).
2. If not found in the L1 cache, look in the L2 cache (larger, slightly slower SRAM).
3. If not found in the L2 cache, look for the instruction in RAM (where the program is resident).
4. If not found in resident RAM ("page fault"), look for the required page in the disk cache (also held in RAM).
5. If not found in the disk cache, go to the HDD and get the appropriate page(s) from the swap file.
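The search path can be sketched as a walk over an ordered list of storage levels that stops at the first level holding the requested item. The function find_level, the level names, and the contents of each level are all hypothetical. The order used places resident RAM ahead of the disk cache, in line with the rule that the disk cache is consulted when a page fault occurs, and it can be changed simply by reordering the list.

```python
def find_level(item, levels):
    """Return the name of the first level that holds the item, or None."""
    for name, contents in levels:
        if item in contents:
            return name
    return None


# Fastest level first. Contents are made-up placeholders.
levels = [
    ("L1 cache", {"a"}),
    ("L2 cache", {"a", "b"}),
    ("resident RAM", {"a", "b", "c"}),
    ("disk cache", {"d"}),
    ("swap file", {"a", "b", "c", "d", "e"}),
]

for item in ["a", "b", "c", "d", "e"]:
    print(item, "found in", find_level(item, levels))
```

Each step further down the list stands for a slower lookup, which is why keeping frequently used instructions and data near the top matters.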