Subject: VIRTUAL MEMORY ON UNIX
Content Type: TEXT/PLAIN
Creation Date: 23-AUG-1995

---------------------------
II. Virtual memory on Unix
---------------------------

The discussion above of virtual memory and paging is a very general one, and all of the statements in it apply to any system that implements virtual memory and paging.

A full discussion of paging and virtual memory implementation on UNIX is beyond the scope of this article. In addition, different UNIX vendors have implemented different paging subsystems, so you need to contact your UNIX vendor for precise information about the paging algorithms on your UNIX machine. However, there are certain key features of the UNIX paging system which are consistent among UNIX ports.

Processes run in a virtual address space, and the UNIX kernel transparently manages the paging of physical memory for all processes on the system. Because UNIX uses virtual memory and paging, typically only a portion of the process is in RAM, while the remainder of the process is on disk.

1) The System Memory Map

The physical memory on a UNIX system is divided among three uses. Some portion of the memory is dedicated for use by the operating system kernel. Of the remaining memory, some is dedicated for use by the I/O subsystem (this is called the buffer cache) and the remainder goes into the page pool.

Some versions of UNIX statically assign the sizes of system memory, the buffer cache, and the page pool at system boot time, while other versions dynamically move RAM between these three at run time, depending on system load. (Consult your UNIX system vendor for details on your particular version of UNIX.)

The physical memory used by processes comes out of the page pool. In addition, the UNIX kernel allocates a certain amount of system memory for each process for data structures that allow it to keep track of that process. This memory is typically not more than a few pages.
If your system memory size is fixed at boot time, you can completely ignore this usage, as it does not come out of the page pool. If your system memory size is adjusted dynamically at run time, you can also typically ignore this usage, as it is dwarfed by the page pool requirements of Oracle software.

2) Global Paging Strategy

UNIX systems implement a global paging strategy. This means that the operating system will look at all processes on the system when it is searching for a page of physical memory on behalf of a process. This strategy has a number of advantages, and one key disadvantage.

The advantages of a global paging strategy are:

(1) An idle process can be completely paged out, so it does not hold memory pages that can be better used by another process.
(2) A global strategy allows for better utilization of system memory; each process's page allocation will be closer to its actual working set size.
(3) The administrative overhead of managing process or user page quotas is completely absent.
(4) The implementation is smaller and faster.

The disadvantage of a global strategy is that it is possible for a single ill-behaved process to affect the performance of all processes on the system, simply by allocating and using a large number of pages.

3) Text and Data Pages

A UNIX process can be conceptually divided into two portions: text and data. The text portion contains the machine instructions that the process executes; the data portion contains everything else. These two portions occupy different areas of the process's virtual address space. Both text and data pages are managed by the paging subsystem. This means that at any point in time, only some of the text pages and only some of the data pages of any given process are in RAM.

UNIX treats text pages and data pages differently. Since text pages are typically not modified by a process while it executes, text pages are marked read-only.
This means that the operating system will generate an error if a process attempts to write to a text page. (Some UNIX systems provide the ability to compile a program which does not have read-only text: consult the man pages on 'ld' and 'a.out' for details.)

The fact that text pages are read-only allows the UNIX kernel to perform two important optimizations: text pages are shared between all processes running the same program, and text pages are paged from the filesystem instead of from the paging area.

Sharing text pages between processes reduces the amount of RAM required to run multiple instances of the same program. For example, if five processes are running Oracle Forms, only one set of text pages is required for all five processes. The same is true if there are fifty or five hundred processes running Oracle Forms.

Paging from the filesystem means that no paging space needs to be allocated for any text pages. When a text page is paged out it is simply over-written in RAM; if it is paged in at a later time the original text page is available in the program image in the file system.

On the other hand, data pages must be read/write, and therefore cannot (in general) be shared between processes. This means that each process must have its own copy of every data page. Also, since a process can modify its data pages, when a data page is paged out it must be written to disk before it is over-written in RAM.

Data pages are written to specially reserved sections of the disk. For historical reasons, this paging space is called "swap space" on UNIX. Don't let this name confuse you: the swap space is used for paging.

4) Swap Space Usage

The UNIX kernel is in charge of managing which data pages are in RAM and which are in the swap space. The swap space is divided into swap pages, which are the same size as the RAM pages. For example, if a particular system has a page size of 4K, and 40M devoted to swap space, this swap space will be divided up into 10240 swap pages.
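The page-count arithmetic in this example can be checked with a few lines of Python. This is only an illustrative sketch: the helper name swap_page_count is made up for this note, and the 4K and 40M figures are the ones from the example, not values read from a real system.

```python
def swap_page_count(swap_bytes, page_bytes):
    """Return the number of whole swap pages in a swap area.

    Hypothetical helper: swap pages are the same size as RAM pages,
    so the count is the swap size divided by the page size.
    """
    if page_bytes <= 0:
        raise ValueError("page size must be positive")
    return swap_bytes // page_bytes


KB = 1024
MB = 1024 * KB

# Figures from the example in the text: 4K pages, 40M of swap.
pages = swap_page_count(40 * MB, 4 * KB)
print(pages)  # 10240
```

On a real machine the page size should be taken from the system rather than assumed; on most UNIX systems Python can report it with os.sysconf("SC_PAGE_SIZE").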
A page of swap can be in one of three states: it can be free, allocated, or used. A "free" page of swap is available to be allocated as a disk page. An "allocated" page of swap has been allocated to be the disk page for a particular virtual page in a particular process, but no data has been written to the disk page yet; that is, the corresponding memory page has not yet been paged out. A "used" page of swap is one where the swap page contains the data which has been paged out from RAM. A swap page is not freed until the process which "owns" it frees the corresponding virtual page.

On most UNIX systems, swap pages are allocated when virtual memory is allocated. If a process requests an additional 1M of (virtual) memory, the UNIX kernel finds 1M of pages in the swap space, and marks those pages as allocated to that process. If at some future time a particular page of RAM must be paged out, swap space is already allocated for it. In other words, every virtual data page is "backed with" a page of swap space.

An important consequence of this strategy is that if all the swap space is allocated, no more virtual memory can be allocated. In other words, the amount of swap space on a system limits the maximum amount of virtual memory on the system. If there is no swap space available, and a process makes a request for more virtual memory, then the request will fail. The request will also fail if there is some swap space available, but the amount available is less than the amount requested.

There are four system calls which allocate virtual memory: these are fork(), exec(), sbrk(), and shmget(). When one of these system calls fails for lack of swap space, the system error code is set to EAGAIN. (Some UNIX variants report ENOMEM instead; check the man page for each call on your system.) The text message associated with EAGAIN is often "No more processes". (This is because EAGAIN is also used to indicate that the per-user or system-wide process limit has been reached.)
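The three swap page states and the allocation-time reservation rule described in this section can be sketched as a toy model. This is not kernel code and not a real API: the class SwapModel and its method names are invented for illustration, and the model raises EAGAIN only because that is the error code this article describes.

```python
import errno
import os


class SwapModel:
    """Toy model of allocation-time swap reservation (not a real kernel API)."""

    def __init__(self, total_pages):
        self.free = total_pages   # available to back new virtual pages
        self.allocated = 0        # reserved, but nothing paged out yet
        self.used = 0             # reserved and holding paged-out data

    def allocate(self, pages):
        """Reserve swap for new virtual data pages, or fail with EAGAIN."""
        if pages > self.free:
            # The request fails even if some swap is free, as long as
            # the free amount is less than the amount requested.
            raise OSError(errno.EAGAIN, os.strerror(errno.EAGAIN))
        self.free -= pages
        self.allocated += pages

    def page_out(self, pages):
        """Write data pages into swap pages that were reserved earlier."""
        if pages > self.allocated:
            raise ValueError("cannot page out more pages than are reserved")
        self.allocated -= pages
        self.used += pages

    def release(self, allocated_pages=0, used_pages=0):
        """The owning process frees virtual pages; their swap becomes free."""
        if allocated_pages > self.allocated or used_pages > self.used:
            raise ValueError("cannot release more pages than are held")
        self.allocated -= allocated_pages
        self.used -= used_pages
        self.free += allocated_pages + used_pages


swap = SwapModel(total_pages=10240)   # 40M of swap with 4K pages
swap.allocate(256)                    # a 1M request reserves 256 pages
swap.page_out(100)                    # 100 of those pages are paged out later

failed_with = None
try:
    swap.allocate(10000)              # only 9984 pages are still free
except OSError as exc:
    failed_with = exc.errno

print(swap.free, swap.allocated, swap.used)  # 9984 156 100
print(failed_with == errno.EAGAIN)           # True
```

Note that the second request fails even though almost all of the swap space is still free, because the free amount is smaller than the amount requested.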
If you ever run into a situation where processes are failing because of EAGAIN errors, be sure to check the amount of available swap as well as the number of processes.

If a system has run out of swap space, there are only two ways to fix the problem: you can either terminate some processes (preferably ones that are using a lot of virtual memory) or you can add swap space to your system. The method for adding swap space to a system varies between UNIX variants: consult your operating system documentation or vendor for details.

5) Shared Memory

UNIX systems implement, and the Oracle server uses, shared memory. In the UNIX shared memory implementation, processes can create and attach shared memory segments. Shared memory segments are attached to a process at a particular virtual address. Once a shared memory segment is attached to a process, memory at that address can be read from and written to, just like any other memory in the process's address space. Unlike "normal" virtual memory, changes written to an address in the shared memory segment are visible to every process that has attached to that segment.

Shared memory is made up of data pages, just like "conventional" memory. Other than the fact that multiple processes are using the same data pages, the paging subsystem does not treat shared memory pages any differently than conventional memory. Swap space is reserved for a shared memory segment at the time it is allocated, and the pages of memory in RAM are subject to being paged out if they are not in use, just like regular data pages. The only difference between the treatment of regular data pages and shared data pages is that shared pages are allocated only once, no matter how many processes are using the shared memory segment.

6) Memory Usage of a Process

When discussing the memory usage of a process, there are really two types of memory usage to consider: the virtual memory usage and the physical memory usage.
The virtual memory usage of a process is the sum of the virtual text pages allocated to the process, plus the sum of the virtual data pages allocated to the process. Each non-shared virtual data page has a corresponding page allocated for it in the swap space. There is no system-wide limit on the number of virtual text pages, and the number of virtual data pages on the system is limited by the size of the swap space. Shared memory segments are allocated on a system-wide basis rather than on a per-process basis, but are allocated swap pages and are paged from the swap device in exactly the same way as non-shared data.

The physical memory usage of a process is the sum of the physical text pages of that process, plus the sum of the physical data pages of that process. Physical text pages are shared among all processes running the same executable image, and physical data pages used for shared memory are shared among all processes attached to the same shared memory segment. Because UNIX implements virtual memory, the physical memory usage of a process will generally be lower than the virtual memory usage.

The actual amount of physical memory used by a process depends on the behavior of the operating system paging subsystem. Unlike the virtual memory usage of a process, which will be the same every time a particular program runs with a particular input, the physical memory usage of a process depends on a number of other factors.

First: since the working set of a process changes over time, the amount of physical memory needed by the process will change over time.
Second: if the process is waiting for user input, the amount of physical memory it needs will drop dramatically. (This is a special case of the working set size changing.)
Third: the amount of physical memory actually allocated to a process depends on the overall system load.
If a process is being run on a heavily loaded system, then the global page allocation policy will tend to keep the number of physical memory pages allocated to that process very close to the size of its working set. If the same program is run with the same input on a lightly loaded system, the number of physical memory pages allocated to that process will tend to be much larger than the size of the working set: the operating system has no need to reclaim physical pages from that process, and will not do so.

The net effect of this is that any measure of physical memory usage will be inaccurate unless you are simulating both the input and the system load of the final system you will be testing. For example, the physical memory usage of an Oracle Forms process will be very different if a user is rapidly moving between three large windows, infrequently moving between the same three windows, rapidly typing into a single window, slowly typing into the same window, or reading data off of the screen while the process sits idle, even though the virtual memory usage of the process will remain the same. By the same token, the physical memory usage of an Oracle Forms process will be different if it is the only active process on a system, or if it is one of fifty active Oracle Forms processes on the same system.

7) Key Points

There are a number of key points to understand about the UNIX virtual memory implementation.

(1) Every data page in every process is "backed" by a page in the swap space. The size of the swap space limits the amount of virtual data space on the system; processes are not able to allocate memory if there is not enough swap space available to back it up, regardless of how much physical memory is available on the system.

(2) UNIX implements a global paging strategy. This means that the amount of physical memory allocated to a process varies greatly over time, depending on the size of the process's working set and the overall system load.
Idle processes may be paged out completely on a busy system. On a lightly loaded system, processes may be allocated much more physical memory than they require for their working sets.

(3) The amount of virtual memory available on a system is determined by the amount of swap space configured for that system. The amount of swap space needed is equal to the sum of the virtual data allocated by all processes on the system at the time of maximum load.

(4) Physical memory is allocated for processes out of the page pool, which is the memory not allocated to the operating system kernel and the buffer cache. The amount of physical memory needed for the page pool is equal to the sum of the physical pages in the working sets of all processes on the system at the time of maximum load.
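The sizing rules in key points (3) and (4) can be written out as a short calculation. The sketch below is only an illustration under stated assumptions: the ProcessGroup structure, the function names, and every page count in the example are made up, and real figures have to be measured on your own system at peak load. Shared text pages and shared memory segments are counted once, as described in the sections on text pages and shared memory.

```python
from dataclasses import dataclass


@dataclass
class ProcessGroup:
    """Hypothetical description of identical processes at peak load."""
    count: int                    # processes running the same program
    virtual_data_pages: int       # private virtual data pages per process
    working_set_data_pages: int   # private data pages resident per process
    working_set_text_pages: int   # resident text pages, shared by the group


def swap_pages_needed(groups, shared_segment_pages=0):
    """Key point (3): sum of virtual data; shared segments counted once."""
    private = sum(g.count * g.virtual_data_pages for g in groups)
    return private + shared_segment_pages


def page_pool_pages_needed(groups, shared_working_set_pages=0):
    """Key point (4): sum of working sets; shared pages counted once."""
    total = shared_working_set_pages
    for g in groups:
        total += g.working_set_text_pages             # one copy per program
        total += g.count * g.working_set_data_pages   # one copy per process
    return total


# Made-up figures: fifty processes running the same program.
forms = ProcessGroup(
    count=50,
    virtual_data_pages=500,
    working_set_data_pages=120,
    working_set_text_pages=300,
)

swap_needed = swap_pages_needed([forms], shared_segment_pages=2000)
pool_needed = page_pool_pages_needed([forms], shared_working_set_pages=800)
print(swap_needed)  # 27000
print(pool_needed)  # 7100
```

Text pages do not appear in the swap calculation at all, because they are paged from the filesystem rather than from the swap space.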