Operating Systems Technology



System Administration

Course Notes #11

Processes

The term process, when discussing operating systems, describes a program in the state of execution. That is, it is a program that has begun execution but has not yet completed. We differentiate between a program and a process because a program is a passive entity – it is a body of code – whereas a process includes a state: the current values of its data, the process’s status (executing, sleeping, waiting, suspended), and the resources currently allocated to it. This set of notes concentrates on processes, their various states, and the terminology that you might need to know about them.

Starting a Process

In most operating systems, starting a process is a matter of double-clicking on the appropriate shortcut icon or selecting the program’s name from a menu. As you have seen in Linux, you can also start a process from the command line. The shortcut approach in a GUI (menu or double-clicking) is merely a convenience: upon selecting the program, the OS translates your mouse action into a call to start the program, much like the command line approach. If you look at a shortcut in the Windows environment (select Properties from the right-button menu) you will see the actual command that executes. From the command line, however, you can provide parameters that are not available from the GUI. For instance, you might type emacs foobar.txt rather than just starting emacs. Or you can supply any number of parameters to commands such as ls or useradd.

Once you have issued your command to start a program, the OS takes over. First, it must locate the executable code stored somewhere on the hard disk. Next, it must load the executable code onto the swap space and copy a portion of it (usually the first few pages) into memory. It then creates a process ID and a data structure that describes the process’s status, and inserts that data structure into a queue (queues are described below). The processor will eventually get to the process and begin executing it. This may be immediate, or it may take a few milliseconds, depending on what other processes are being executed at the time.
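The bookkeeping described above can be sketched in a few lines of Python. This is only an illustration, not how any real OS is written: the names ProcessRecord, start_process and ready_queue are made up for this example, and a real process data structure holds far more information.

```python
from collections import deque
from dataclasses import dataclass, field
from itertools import count

_next_pid = count(1)  # hypothetical PID allocator: 1, 2, 3, ...

@dataclass
class ProcessRecord:
    """Simplified stand-in for the OS's per-process data structure."""
    program: str
    pid: int = field(default_factory=lambda: next(_next_pid))
    state: str = "ready"       # a new process waits for the CPU
    program_counter: int = 0   # next instruction to execute

ready_queue = deque()

def start_process(program):
    """Create a record for the program and add it to the back of the queue."""
    record = ProcessRecord(program)
    ready_queue.append(record)
    return record

first = start_process("emacs")
second = start_process("ls")
print([(p.pid, p.program, p.state) for p in ready_queue])
```

Each new process receives the next free PID and waits in arrival order until the processor gets to it.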

Process Execution

Once started, the process is added to a list of processes that need CPU and OS attention. Depending on the type of operating system, the process may start executing immediately or be forced to wait. The various ways that OSs execute programs are:

• Single tasking

• Batch

• Multiprogramming

• Multitasking

• Multithreaded

A single tasking system, the oldest and simplest, merely executes one program until it concludes and then switches back to the OS to wait for the user to request another program to execute. MS-DOS was such an OS. All programs are executed in the order that they were requested.

A batch processing system is similar to a single tasking system except that there is a queue (waiting line) for processes. Users submit their programs to the OS, and the OS adds each process to a queue. When the processor finishes with one process, the OS is invoked to decide which of the waiting processes to bring to the CPU next. Some form of scheduling is needed to make this decision. Options include a priority scheme (processes are given priorities depending on the users who submitted them; for instance, administration processes would have the highest priority, faculty and staff next, and students last), shortest job first, and first come, first served.

A multiprogramming system is similar to a batch system except that, if the current process requires I/O (which is time consuming), it is moved to another queue (an I/O waiting queue) and the OS selects another process to begin (or resume). When the original process finishes its I/O, it is resumed and the process that was running in the meantime is moved back into a queue. This leads to a distinction between queues. There is the ready queue (the queue of processes waiting for the CPU; these processes have already been moved into memory and/or swap space), the I/O queues (one for each I/O device), and the waiting queue (the queue of processes that have been requested to run but have not yet been moved into the ready queue). Multiprogramming is more efficient than simple batch processing because the processor never has to wait for I/O as long as there are other processes around to run.
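The three batch scheduling policies named above differ only in how they order the waiting queue. The Python sketch below shows this with three invented jobs; the Job fields, the job names and the convention that a lower priority number means higher priority are all assumptions made for the illustration.

```python
from dataclasses import dataclass

@dataclass
class Job:
    name: str
    arrival: int    # order in which the job was submitted
    burst: int      # estimated run time, in arbitrary units
    priority: int   # assumption: lower number = higher priority

def first_come_first_served(jobs):
    return sorted(jobs, key=lambda j: j.arrival)

def shortest_job_first(jobs):
    # Ties on run time fall back to arrival order.
    return sorted(jobs, key=lambda j: (j.burst, j.arrival))

def by_priority(jobs):
    # Ties on priority fall back to arrival order.
    return sorted(jobs, key=lambda j: (j.priority, j.arrival))

jobs = [
    Job("student_report", arrival=0, burst=8, priority=3),
    Job("faculty_grades", arrival=1, burst=3, priority=2),
    Job("admin_backup", arrival=2, burst=5, priority=1),
]

for policy in (first_come_first_served, shortest_job_first, by_priority):
    print(policy.__name__, [j.name for j in policy(jobs)])
```

The same three jobs run in three different orders depending on the policy, which is why the choice of scheduling scheme matters to the users waiting in the queue.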

Multitasking takes the idea of multiprogramming one step further. In multiprogramming, a process is only suspended (forced to wait) when it needs I/O, thus causing the CPU to switch to another process. In multitasking, a timer is also used to count the number of clock cycles that have elapsed since the current process was given the CPU. After a preset number of cycles, the timer interrupts the CPU to force it to switch to another process. The suspended process gets moved to the back of the ready queue and must wait its turn. Because modern processors are so fast, the user does not tend to notice that the running process has been moved to the ready queue, because it is moved back to the CPU within a few milliseconds. Note that an older version of multitasking was called time sharing.
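The timer-driven switching described above is commonly called round-robin scheduling. The following Python sketch simulates it under simplifying assumptions: time is counted in abstract units rather than clock cycles, no process performs I/O, and the process names and run times are invented.

```python
from collections import deque

def round_robin(bursts, quantum):
    """Simulate timer-driven switching.

    bursts maps a process name to the time units it still needs.
    Returns (trace, finished): the time slices in the order they ran,
    and the order in which the processes completed.
    """
    ready = deque(bursts.items())
    trace, finished = [], []
    while ready:
        name, remaining = ready.popleft()
        ran = min(quantum, remaining)   # run until done or the timer fires
        trace.append((name, ran))
        remaining -= ran
        if remaining > 0:
            ready.append((name, remaining))  # back of the ready queue
        else:
            finished.append(name)
    return trace, finished

trace, finished = round_robin({"A": 5, "B": 2, "C": 4}, quantum=2)
print(trace)
print(finished)
```

With a quantum of 2, process A is interrupted twice before it completes, while the short process B finishes in its first turn.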

The idea of switching processes is known as a context switch (the processor is switching its context). During a context switch, the processor does no useful work for any process, because the state of the outgoing process must be saved and the important information of the process being resumed must be restored into memory and registers. To make this efficient, computer hardware supports the switch. The extra hardware support includes registers and a run-time stack in memory.

Today, processes can contain multiple running parts, called threads. For instance, you might run Mozilla but have several open tabs or windows. You are running a single process but several threads of that process. The only difference between the threads is their data (window 1 is open to one URL, window 2 is open to another; the URLs and the data displayed are different, but the program code is the same). A multithreaded OS is the same as a multitasking OS except that the CPU switches between threads of a single process as well as between different processes.
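As a small illustration of threads sharing code but not data, the Python sketch below starts three threads that all run the same function on different URLs. The function name fetch_page and the example.com URLs are placeholders, and no real network access takes place.

```python
import threading

def fetch_page(url, results):
    """Every thread runs this same code; only the data (the URL) differs."""
    results[url] = f"contents of {url}"  # placeholder for real work

urls = [
    "http://example.com/a",
    "http://example.com/b",
    "http://example.com/c",
]
results = {}

threads = [threading.Thread(target=fetch_page, args=(u, results)) for u in urls]
for t in threads:
    t.start()
for t in threads:
    t.join()   # wait for every thread to finish

print(len(results), "pages handled by", len(threads), "threads of one process")
```

All three threads belong to one process and share its memory, which is why they can write into the same results dictionary.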

Some additional terms related to the execution of processes are:

• Thread – a portion of a process

• Process status – the current run-time status of the process, which includes its state (running, ready, suspended, waiting for I/O, terminated), where it currently is (which queue, or in the CPU), the value of the program counter (which indicates the next instruction to be executed), and other information.

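• Run-time stack – an area of memory in which a process’s in-progress function calls (return addresses, parameters, local variables) are kept; together with the saved registers, it lets a suspended process be resumed exactly where it left off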

• Queue – waiting lines, there are usually several maintained by the OS including the ready queue, the waiting queue and I/O queues.

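• Scheduling – an algorithm which selects the next process or thread to resume when the timer elapses or the CPU becomes available. First come, first served is the simplest scheme; multitasking systems today generally use preemptive schemes such as round robin, often combined with priorities.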

• Synchronization – when two or more processes need to communicate with each other, perhaps by passing data between them. It is critical to make sure that this is done properly or else data can become corrupted.

• Foreground – the current running process. In a windowing environment, it is the active window (denoted by a title bar of a different color than other windows and a “pressed” tab at the bottom of the screen)

• Background – processes that have been started but are currently waiting for attention or for the user to select them.
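To illustrate the synchronization entry in the list above, the Python sketch below has four threads update one shared counter. It uses threads rather than separate processes to keep the example short, and the class name SharedCounter is invented. The lock ensures that only one thread at a time performs the read-add-write step, which is the kind of care needed to keep shared data from being corrupted.

```python
import threading

class SharedCounter:
    """A counter that several threads can update safely."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # Only one thread at a time may execute this block.
        with self._lock:
            self.value += 1

def worker(counter, times):
    for _ in range(times):
        counter.increment()

counter = SharedCounter()
workers = [threading.Thread(target=worker, args=(counter, 10_000)) for _ in range(4)]
for w in workers:
    w.start()
for w in workers:
    w.join()

print(counter.value)  # 4 threads x 10,000 increments each
```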

In a multitasking environment, we are used to having multiple processes run at the same time. In fact, they are not executing simultaneously. Instead, the CPU executes one process for a brief time, moves on to the next, and then the next, and eventually returns to the first one, over and over. The CPU is so quick at this that we don’t notice that any given process isn’t getting full attention. There are times, however, when we want to specify which process should have the CPU’s attention or should be the process that receives user input.

Process Status

A process’s status is information about the process: for instance, what state it is in, what resources it is using, and what its process ID is (PID in Linux). To obtain information about a process, you can use ps, which provides the status of active processes. Using ps by itself shows you the active processes in the current window owned by you. Using ps aux gives you all active processes by all users. Note that older versions of Unix required a - for parameters (as in ps -aux), but the - is largely omitted these days. Using ps axf gives you the processes in the shape of a tree to show you which processes spawned other processes. For instance, from bash you might type emacs, and from emacs you might issue another command (say, to compile the current file); thus bash is the parent of emacs, which is the parent of the compile command.

The ps command, when supplied with the parameter u, gives a long listing which includes, for each process, the process’s owner, the PID, CPU usage, memory usage, the terminal from which the command was issued, the current process status, the start time, and the amount of CPU time that has elapsed. Since ps aux lists all active processes, you might want to pipe the output to grep to list only specific processes, such as your own processes, by typing ps aux | grep username. The PID is particularly useful for issuing other commands, including kill (covered later in these notes).
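To show how the columns of a ps aux listing line up, here is a Python sketch that picks one user's processes out of such a listing, much as ps aux | grep username does. The sample listing, the user names and the PIDs are invented for the example, and the function name processes_owned_by is hypothetical. Unlike grep, which matches the name anywhere on the line, this sketch compares only the owner column.

```python
# Invented sample in the general shape of `ps aux` output.
SAMPLE_PS_OUTPUT = """\
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  0.0  0.1  19356  1544 ?        Ss   09:00   0:01 /sbin/init
alice     2301  0.2  1.3 215400 27012 pts/0    S    09:15   0:04 emacs foobar.txt
alice     2317  0.0  0.1  10880  1100 pts/0    R+   09:20   0:00 ps aux
bob       2400  1.5  2.0 300100 41000 pts/1    S    09:21   0:12 firefox
"""

def processes_owned_by(ps_text, username):
    """Return (PID, command) pairs for the rows whose owner is username."""
    rows = []
    for line in ps_text.splitlines()[1:]:   # skip the header row
        # Split into at most 11 fields so COMMAND keeps its spaces.
        fields = line.split(None, 10)
        if len(fields) == 11 and fields[0] == username:
            rows.append((int(fields[1]), fields[10]))
    return rows

for pid, command in processes_owned_by(SAMPLE_PS_OUTPUT, "alice"):
    print(pid, command)
```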

Another command to obtain status information is top. While ps provides a snapshot of what is going on at the time you issue the command, top is interactive in that it updates itself and displays the most recent information, usually every 3 seconds. Because top is interactive, it remains in the window, and so you do not have access to the command line prompt while it is executing. Another difference between ps and top is that top lists only the processes that are most active at the moment. There are, however, a number of parameters that allow you to change what top displays. For the most part, unless you are trying to work on the efficiency of your system, you will find ps easier and more helpful to use. To exit from top, type q (ctrl+c also works).

In a windowing environment, you can select the foreground process by clicking on the window containing that process, or by clicking on the tab at the bottom of the screen to bring the window up. Other processes move to the background, meaning that they are not executed, or that they are executed only when the CPU has time to spend on them. In Linux, we can have multiple processes running in a single terminal window, and we specify which process is in the foreground and which are in the background. To move a running process out of the foreground, press control-z to suspend it; the command bg jobnumber then resumes that job in the background. To see which jobs are currently active in the window, type jobs. To move a job to the foreground, type fg jobnumber, where the job number is the one listed by jobs (or just fg if it is the only job in the terminal window). Note that the job number differs from the PID, and you do not use the PID for fg or bg.

Another idea is that of processes spawning (starting) other processes. For instance, mozilla may spawn a print process (lp). There are two ways that a process can spawn another. The first is to start the new process and wait for it to execute and terminate before resuming. This is what happens at the command prompt when you type a command such as rm or awk: the program runs to completion and then you get control of your bash shell again. (The related exec command goes a step further and replaces the current process with the new program, so the original never resumes.) However, the original process may not want to wait (for instance, would you want to wait until the print job finished before you were able to regain control of your mozilla browser?), in which case the process instead forks the subprocess and carries on. Now both processes can run independently. To do this from the Linux command line, use &. If you want to run a lengthier program while continuing to use your shell, issue the command and follow it with &, as in emacs &, which will start emacs in a window outside of the current one.
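The two ways of spawning can be sketched with the underlying system calls, which Python exposes as os.fork, os.execvp and os.waitpid. This is a minimal sketch for Linux, not a description of how bash itself is written; the function names spawn and run_and_wait are invented, and the child programs are tiny Python one-liners chosen only so that the example runs anywhere Python does.

```python
import os
import sys

def spawn(argv):
    """Fork a child that runs argv; return its PID without waiting."""
    pid = os.fork()
    if pid == 0:
        # Child: replace this copy of the process with the new program.
        try:
            os.execvp(argv[0], argv)
        finally:
            os._exit(127)   # reached only if the exec itself failed
    return pid              # parent continues immediately

def run_and_wait(argv):
    """Spawn a child and block until it terminates; return its exit code."""
    pid = spawn(argv)
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -os.WTERMSIG(status)   # child was killed by a signal

# Wait for the child, as the shell does for an ordinary command.
code = run_and_wait([sys.executable, "-c", "import sys; sys.exit(3)"])
print("child exited with", code)

# Do not wait, as the shell does for a command followed by &.
background_pid = spawn([sys.executable, "-c", "pass"])
print("started background child", background_pid)
os.waitpid(background_pid, 0)   # collect it eventually to avoid a zombie
```

The only difference between the two cases is whether the parent calls waitpid right away or carries on with its own work first.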

In Linux, you can also specify the priority of a process by using the nice command. Nice means to make the process nicer, or to let it share the CPU more readily. The form of nice is nice -n value command, where value is between -20 (highest priority) and 19 (lowest priority). Ordinarily, only root may use negative values.
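A process can also adjust its own nice value. The short Python sketch below uses os.nice, which adds an increment to the calling process's nice value and returns the new value; an increment of 0 simply reports the current value. The increment of 5 is an arbitrary choice for the example.

```python
import os

current = os.nice(0)   # increment of 0: just report the current nice value
raised = os.nice(5)    # become "nicer", i.e. accept a lower priority

print("nice value went from", current, "to", raised)
```

An ordinary user can raise the nice value this way but normally cannot lower it again.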

Process Scheduling

Aside from the OS selecting a process (scheduling), you can also determine when a process should be initiated. In Linux, there are three approaches, at, batch and cron.

The at program allows you to specify when a given process should begin execution. You may either specify a program file in your at command, or you may enter a list of commands to execute at the at prompt. The former approach is easier, although it requires that you add -f followed by the file name in the command. For instance, you might write a shell script which searches through directories, finds any files that are improperly protected (such as 777 or 000), and either alters their permissions or makes a list of those files and their owners. You want to run this program at 2 am when the system is not being used heavily. You write your shell script and issue the at command now, but the script does not execute until the time you set in your at command. There are multiple ways to specify the time when the command should execute. These include now (execute immediately), now + n minutes (execute in n minutes), now + n hours (execute in n hours), and a time of day, either alone (such as 10am, to execute the next time it is 10 am) or with a date (as in 10am Mar 1). There are other options for specifying time as well (see the textbook and at’s man page).

At commands can only be issued by the system administrator or by a user whose name has been added to the file /etc/at.allow (this file can only be edited by the system administrator, so the sysadmin has the ability to extend the use of at to others). There is also a /etc/at.deny file, which lists users who may not use at. At works by using an at daemon (atd). The at daemon runs in the background on the system all of the time and occasionally (once per second) looks to see if the time of any waiting at job has been reached.

A variation of at is batch. Batch does the same thing as at in terms of allowing a user (sysadmin) to execute a process later, but no time is specified. Instead, the process executes once the system’s load average drops below 1.5 (or a value specified when the at daemon is started). To see all pending at and batch jobs, use atq; to delete an at or batch job from the queue, use atrm followed by the number of the job in the queue.
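The load test that batch relies on can be approximated in Python with os.getloadavg, which returns the 1, 5 and 15 minute load averages. The function name load_is_low is invented, and checking only the 1 minute figure against the 1.5 threshold from the text is a simplification made for this sketch.

```python
import os

def load_is_low(threshold=1.5):
    """Report whether the 1 minute load average is below the threshold."""
    one_minute, five_minute, fifteen_minute = os.getloadavg()
    return one_minute < threshold

if load_is_low():
    print("load is low: a batch job would be allowed to start")
else:
    print("system is busy: a batch job would keep waiting")
```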

The third form of scheduling processes is to use crontab to set up a cron job. cron is the name of the daemon used by crontab. A cron job differs from an at job in that it executes more than once. So you use at for a process that should run one time but cron for a process that should run every time the indicated time period is reached. In cron, you specify the time with five fields: minute, hour, day of the month, month, and day of the week. Each field is either a number or *, which matches every value. For instance, to execute a job every day at 12:00 pm, you would say 0 12 * * * (writing * 12 * * * would instead run it every minute from 12:00 to 12:59). To run at midnight on the 15th of every month you would say 0 0 15 * *, and to run at midnight every February 1st you would use 0 0 1 2 *. The last value represents the day of the week with Sunday being 0 (or 7), Monday being 1, etc., so 30 18 * * 1 means to run it every Monday at 6:30 pm.

You place your time and the commands to be executed in a file and issue the file to crontab, much as you did with at. crontab does not, however, require the -f, as it accepts a file name directly. An easier alternative is to use the directories /etc/cron.hourly, /etc/cron.daily, /etc/cron.weekly and /etc/cron.monthly. Placing your script in one of these directories means that it will execute at the interval indicated (e.g., once a day). As with at, you must be root to use crontab, or be given permission by having your username added to /etc/cron.allow. To list your cron jobs, use crontab -l; to delete them, use crontab -r, which removes your entire crontab.
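To make the meaning of the five time fields concrete, here is a Python sketch that tests whether a given moment matches a cron time pattern. It is not cron's own code: the function names are invented, and it understands only * and plain numbers, not the ranges, lists or step values that real crontab entries may also use. The fields are read in the order minute, hour, day of the month, month, day of the week.

```python
from datetime import datetime

def field_matches(pattern, value):
    """A field matches if it is * or equals the value."""
    return pattern == "*" or int(pattern) == value

def cron_matches(expression, when):
    """Return True if the datetime `when` matches the five-field pattern."""
    minute, hour, day_of_month, month, day_of_week = expression.split()

    # cron counts Sunday as 0 (or 7); isoweekday() counts Monday=1..Sunday=7.
    cron_weekday = when.isoweekday() % 7
    dow_ok = day_of_week == "*" or int(day_of_week) % 7 == cron_weekday
    dom_ok = field_matches(day_of_month, when.day)

    # When both day fields are restricted, cron runs if either one matches.
    if day_of_month != "*" and day_of_week != "*":
        day_ok = dom_ok or dow_ok
    else:
        day_ok = dom_ok and dow_ok

    return (field_matches(minute, when.minute)
            and field_matches(hour, when.hour)
            and field_matches(month, when.month)
            and day_ok)

monday_evening = datetime(2024, 1, 1, 18, 30)      # 1 January 2024 was a Monday
print(cron_matches("30 18 * * 1", monday_evening))  # Monday at 6:30 pm
print(cron_matches("* 12 * * *", datetime(2024, 1, 1, 12, 17)))
```

The second call also matches, because a * in the minute field accepts every minute of the hour; a job meant to run once at noon needs 0 in that field.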

Terminating Processes

There are multiple ways for a process to terminate. First, it may complete execution. Second, you may choose to stop it yourself by pressing ctrl+c in the window where it is executing. You might also try to close the window if it is a window-based process. The better way to kill a process, though, is to use the kill command. The kill command takes the process ID and, optionally, a signal (kill level); if none is given, kill sends the TERM signal, which asks the process to terminate. Level -9 (the KILL signal) is the most severe: the process cannot catch or ignore it, so it ensures that the process will be terminated. To kill a process, you must be the process owner or root. The command killall kills processes by name: on Linux, killall name kills every process with that name that you are permitted to kill (root may kill any).
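The usual pattern of asking a process to terminate and escalating only if it does not respond can be sketched in Python as follows. The child here is a throwaway Python process that merely sleeps, and the 5 second grace period is an arbitrary choice for the example. os.kill sends a signal to a PID, just as the kill command does.

```python
import os
import signal
import subprocess
import sys

# Start a child that would otherwise run for a minute.
child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])

# Like `kill PID`: send the TERM signal and give the process time to exit.
os.kill(child.pid, signal.SIGTERM)
try:
    child.wait(timeout=5)
except subprocess.TimeoutExpired:
    # Like `kill -9 PID`: the KILL signal cannot be caught or ignored.
    os.kill(child.pid, signal.SIGKILL)
    child.wait()

# A negative return code is the number of the signal that ended the child.
print("child ended with return code", child.returncode)
```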

The Shutdown Command

The shutdown command, as its name implies, allows you to safely shut down your Linux computer. This should always be executed before you turn off your computer, or else you might corrupt data files (this program is executed for you whenever you choose shut down from the menu). From the command line, shutdown accepts a parameter, -h, which causes the system to also halt. Halting the system actually stops all of the machinery, whereas shutdown alone merely places you into a special mode that permits you to halt the system. In addition to -h, you specify a time to denote when shutdown should occur, either now or a delay in minutes written as +m. This gives you a “grace period” so that all users can log off in time. The command shutdown -h +10 means to shut down and halt the system in 10 minutes. You may also wish to send a message to users and processes to warn them of the imminent shutdown; the message is supplied after the time. It might be something like “warning, system shutdown in 10 minutes, kill all processes NOW and log out!” The -r option not only shuts the system down, but reboots it afterward.
