Tuesday, November 27, 2007

Research Topic 2 - IT 222 OPERATING SYSTEMS

Consider both Windows and UNIX operating systems.
Compare and contrast how each implements virtual memory.

Virtual memory in UNIX
Virtual memory is an internal “trick” that relies on the fact that not every executing task is referencing its RAM memory region at all times. Since not all RAM regions are constantly in use, UNIX uses a paging algorithm that moves RAM memory pages to the swap disk when it appears that they will not be needed in the immediate future.

RAM demand paging in UNIX

As memory regions are created, UNIX will not refuse a new task whose RAM requests exceed the amount of physical RAM. Rather, UNIX will page out the least recently referenced RAM memory pages to the swap disk to make room for the incoming request. Once the physical limit of RAM is exceeded, UNIX can reuse those RAM regions because their contents have already been written to the swap disk.

When a RAM region has been moved to swap, any subsequent reference by the originating program requires UNIX to page the region back into RAM to make the memory accessible. UNIX page-in operations involve disk I/O and are a source of slow performance. Hence, avoiding UNIX page-in operations is an important concern for administrators of memory-intensive software, such as the Oracle DBA.

The main functions of paging are performed when a program tries to access a page that does not currently reside in RAM, a situation that causes a page fault. The operating system then:

1. Takes control and handles the page fault in a manner invisible to the program that caused it.

2. Determines the location of the data in auxiliary storage.

3. Determines the page frame in RAM to use as a container for the data.

4. If the page currently residing in the chosen frame has been modified since loading (if it is dirty), writes that page to auxiliary storage.

5. Loads the requested data into the now available page frame.

6. Returns control to the program, transparently retrying the instruction that caused the page fault.
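
The six steps above can be sketched as a toy simulation. This is only an illustration under stated assumptions: the `Pager` class, its method names, and the dict standing in for the swap disk are all hypothetical, and the victim choice in step 3 is a simple placeholder (the oldest-loaded page), not what any real UNIX kernel does.

```python
class Pager:
    """Toy pager: RAM holds a few page frames, 'disk' is a plain dict."""

    def __init__(self, num_frames, disk):
        self.num_frames = num_frames
        self.disk = disk      # auxiliary storage: page number -> data
        self.ram = {}         # resident pages: page number -> [data, dirty]
        self.faults = 0
        self.writebacks = 0

    def _handle_fault(self, page):
        self.faults += 1                          # step 1: take control
        data = self.disk[page]                    # step 2: locate the data
        if len(self.ram) >= self.num_frames:      # step 3: choose a frame
            victim = next(iter(self.ram))         # placeholder: oldest-loaded page
            victim_data, dirty = self.ram.pop(victim)
            if dirty:                             # step 4: write back if dirty
                self.disk[victim] = victim_data
                self.writebacks += 1
        self.ram[page] = [data, False]            # step 5: load the page

    def read(self, page):
        if page not in self.ram:
            self._handle_fault(page)
        return self.ram[page][0]                  # step 6: retried access succeeds

    def write(self, page, value):
        if page not in self.ram:
            self._handle_fault(page)
        self.ram[page] = [value, True]            # mark the page dirty


disk = {0: "a", 1: "b", 2: "c"}
pager = Pager(num_frames=2, disk=disk)
pager.write(0, "A")      # fault: page 0 is loaded, then modified
pager.read(1)            # fault: page 1 is loaded
pager.read(2)            # fault: page 0 is evicted and written back
print(disk[0], pager.faults, pager.writebacks)
```

With two frames, the third access forces an eviction, and because page 0 was dirty its new contents reach the disk before the frame is reused. The calling code never sees any of this; it simply gets its data back from `read` and `write`.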

The need to reference memory at a particular address arises from two main sources:

  • The processor fetching the program's own instructions in order to execute them.
  • Data being accessed by one of the program's instructions.

In step 3, when a page has to be loaded and all page frames in RAM are currently in use, one of the resident pages must be swapped out to make room for the requested page. The paging system must choose the page that is least likely to be needed in the near future. Various page replacement algorithms try to make this choice well.

Most operating systems use the least recently used (LRU) page replacement algorithm or an approximation of it. The theory behind LRU is that the page that has gone unused the longest is the one least likely to be needed soon; when a frame is needed, that page is discarded. This guess is usually right but not always: for example, a process that moves sequentially forward through memory never again accesses the pages it touched most recently, yet LRU keeps exactly those pages.
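
To make the LRU idea concrete, here is a minimal sketch that counts page faults for a sequence of page references. The function name `lru_faults` and the two sample traces are made up for illustration; real kernels approximate LRU rather than tracking exact order like this.

```python
from collections import OrderedDict


def lru_faults(references, num_frames):
    """Count page faults for a reference string under exact LRU."""
    frames = OrderedDict()            # resident pages, least recent first
    faults = 0
    for page in references:
        if page in frames:
            frames.move_to_end(page)  # hit: page becomes most recent
        else:
            faults += 1
            if len(frames) >= num_frames:
                frames.popitem(last=False)  # evict the least recently used
            frames[page] = True
    return faults


small_loop = [0, 1, 2] * 3        # loop that fits in three frames
long_scan = [0, 1, 2, 3] * 3      # repeated scan one page too large

print(lru_faults(small_loop, num_frames=3))
print(lru_faults(long_scan, num_frames=3))
```

The first trace faults only three times, once per page. The second is a variant of the sequential case mentioned above: a scan that repeats over four pages with only three frames faults on all twelve references, because LRU always evicts the page that will be needed next.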

Most programs that become active reach a steady state in their demand for memory, showing locality both in the instructions fetched and in the data accessed. The memory needed in this steady state is usually much less than the total memory required by the program. It is sometimes referred to as the working set: the set of memory pages that are most frequently accessed.
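
One common way to estimate a working set is to look at the distinct pages touched during a recent window of references. The sketch below assumes that sliding-window definition; the function name, the window length, and the sample trace are all illustrative choices rather than anything a particular operating system uses.

```python
def working_set(references, window):
    """Distinct pages touched in the last `window` references."""
    if window <= 0:
        return set()
    return set(references[-window:])


# Start-up touches pages 5-8 once, then the program settles into a loop.
trace = [5, 6, 7, 8, 0, 1, 2, 0, 1, 2, 0, 1, 2]

total_pages = len(set(trace))
steady_state = working_set(trace, window=6)
print(total_pages, sorted(steady_state))
```

Here the program has touched seven pages in total, but its steady-state working set is only pages 0, 1 and 2, which is the sense in which the working set is much smaller than the total memory required.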

Virtual memory systems work most efficiently when the ratio of the working set to the total number of pages that can be stored in RAM is low enough to minimize the number of page faults. A program that works with huge data structures will sometimes require a working set that is too large to be efficiently managed by the paging system, resulting in constant page faults that drastically slow down the system. This condition is referred to as thrashing: a page is swapped out and then almost immediately accessed again, causing frequent faults.

An interesting characteristic of thrashing is that as the working set grows, there is very little increase in the number of faults until a critical point, after which faults go up dramatically and the majority of the system's processing power is spent handling them.
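
The critical point can be seen in a small experiment. The sketch below assumes a deliberately extreme workload: a program that loops over its working set in order, with exact LRU replacement and a fixed number of frames. The function name `fault_rate` and the parameter values are hypothetical, and real workloads show a less abrupt cliff than this one.

```python
from collections import OrderedDict


def fault_rate(working_set_size, num_frames, passes=50):
    """Fraction of references that fault while looping over the working set."""
    frames = OrderedDict()            # resident pages, least recent first
    faults = 0
    total = 0
    for _ in range(passes):
        for page in range(working_set_size):
            total += 1
            if page in frames:
                frames.move_to_end(page)
                continue
            faults += 1
            if len(frames) >= num_frames:
                frames.popitem(last=False)  # evict least recently used
            frames[page] = True
    return faults / total


for size in (4, 6, 8, 9, 10):
    print(size, round(fault_rate(size, num_frames=8), 2))
```

With eight frames, working sets of 4, 6 and 8 pages all fault on only 2% of references (the initial loads). At 9 pages, one more than fits, every single reference faults. Nothing about the program changed except that its working set crossed the size of RAM.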

Windows, in addition to RAM, uses a part or parts of the hard disk to store temporary files and information. This is data that is not required immediately, for example when you minimize a window or have an application running in the background. Although Windows' management of virtual memory has grown more efficient, it still tends to access the hard disk very often, and in many cases unnecessarily, because it is programmed to keep RAM free. With a little tuning, you can optimize this access, not only making sure that Windows uses this feature sparingly and sensibly but also speeding up file access generally.

Virtual Memory in Windows NT

The virtual-memory manager (VMM) in Windows NT is nothing like the memory managers used in previous versions of the Windows operating system. Relying on a 32-bit address model, Windows NT is able to drop the segmented architecture of previous versions of Windows. Instead, the VMM employs 32-bit virtual addresses for directly manipulating the entire 4-GB address space of a process. At first this appears to be a restriction because, without segment selectors for relative addressing, there is no way to move a chunk of memory without having to change the address that references it. In reality, the VMM is able to do exactly that by implementing virtual addresses. Each application is able to reference a physical chunk of memory at a specific virtual address throughout the life of the application. The VMM decides whether the memory should be moved to a new location or swapped to disk, completely independently of the application, much like updating a selector entry in the local descriptor table (LDT).
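
A small sketch may help show why the application's address never has to change. It assumes 4-KB pages and a flat dict as the page table; the `AddressSpace` class and its methods are invented for illustration and are not the Windows NT data structures.

```python
PAGE_SIZE = 4096  # assumed page size in bytes


class AddressSpace:
    """Toy per-process page table: virtual page -> where the page lives."""

    def __init__(self):
        self.page_table = {}   # vpn -> ("ram", frame) or ("disk", slot)

    def place(self, vpn, location, index):
        # Called by the (hypothetical) memory manager, never by the program.
        self.page_table[vpn] = (location, index)

    def translate(self, vaddr):
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        location, index = self.page_table[vpn]
        if location != "ram":
            raise LookupError("page fault: page is on disk")
        return index * PAGE_SIZE + offset


space = AddressSpace()
vaddr = 5 * PAGE_SIZE + 12        # the address the application keeps using

space.place(5, "ram", 10)         # page initially lives in frame 10
before = space.translate(vaddr)

space.place(5, "ram", 3)          # the manager moves the page to frame 3
after = space.translate(vaddr)

print(vaddr, before, after)
```

The virtual address stays the same across the move while the physical address it resolves to changes from 40972 to 12300. Only the table entry was updated, which is the role the selector entry in the LDT played under the segmented model.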

Windows versions 3.1 and earlier employed a scheme for moving segments of memory to other locations in memory, both to maximize the amount of available contiguous memory and to place executable segments in the location where they could be executed. An equivalent operation is unnecessary in Windows NT's virtual memory management system for three reasons. One, code segments are no longer required to reside in the 0-640K range of memory in order for Windows NT to execute them. Windows NT requires that the hardware have at least a 32-bit address bus, so it is able to address all of physical memory, regardless of location. Two, the VMM virtualizes the address space such that two processes can use the same virtual address to refer to distinct locations in physical memory. Virtual addresses are not a scarce commodity, especially considering that 2 GB of each process's 4-GB address space is available to the application (the rest is reserved for the system). So, each process may use any or all of its virtual addresses without regard to other processes in the system. Three, contiguous virtual memory in Windows NT can be allocated discontiguously in physical memory. So, there is no need to move chunks to make room for a large allocation.
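
The second and third reasons can be illustrated together. The sketch below assumes 4-KB pages, one dict per process as its page table, and a scattered list of free physical frames; the names `allocate`, `translate`, `proc_a` and `proc_b` are hypothetical and do not correspond to any Windows NT API.

```python
PAGE_SIZE = 4096  # assumed page size in bytes


def allocate(page_table, free_frames, start_vpn, num_pages):
    """Map a contiguous run of virtual pages onto whatever frames are free."""
    for i in range(num_pages):
        page_table[start_vpn + i] = free_frames.pop(0)


def translate(page_table, vaddr):
    vpn, offset = divmod(vaddr, PAGE_SIZE)
    return page_table[vpn] * PAGE_SIZE + offset


free_frames = [7, 2, 19, 4, 11, 30]   # free physical frames, scattered
proc_a, proc_b = {}, {}               # one page table per process

# Both processes allocate three pages at the same virtual location.
allocate(proc_a, free_frames, start_vpn=16, num_pages=3)
allocate(proc_b, free_frames, start_vpn=16, num_pages=3)

vaddr = 16 * PAGE_SIZE
print(proc_a)
print(translate(proc_a, vaddr), translate(proc_b, vaddr))
```

Process A's virtual pages 16, 17 and 18 are contiguous, yet they land in frames 7, 2 and 19, so nothing had to be moved to find a contiguous physical region. The same virtual address resolves to physical address 28672 in process A and 16384 in process B, so the two processes never collide.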
