The Process API
Overview
The Process API consists of the following four calls:
The fork() syscall allows one process, the parent, to create a new process, the child. This is done by making the new child process an (almost) exact duplicate of the parent: the child obtains copies of the parent’s stack, data, heap, and text segments. The term fork derives from the fact that we can envisage the parent process as dividing to yield two copies of itself.
The exit(status) library function terminates a process, making all resources (memory, open file descriptors, and so on) used by the process available for subsequent reallocation by the kernel. The status argument is an integer that determines the termination status for the process. Using the wait() syscall, the parent can retrieve this status.
The wait(&status) syscall has two purposes. First, if a child of this process has not yet terminated by calling exit(), then wait() suspends execution of the process until one of its children has terminated. Second, the termination status of the child is returned in the status argument of wait().
The execve(pathname, argv, envp) syscall loads a new program (pathname, with argument list argv, and environment list envp) into a process’s memory. The existing program text is discarded, and the stack, data, and heap segments are freshly created for the new program. This operation is often referred to as execing a new program. Later, we’ll see that several library functions are layered on top of execve(), each of which provides a useful variation in the programming interface. Where we don’t care about these interface variations, we follow the common convention of referring to these calls generically as exec(), but be aware that there is no system call or library function with this name.
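To make the relationship between exit(status) and wait(&status) concrete, here is a minimal sketch in which a child exits with a chosen status and the parent reads it back. The function name child_exit_status is hypothetical, chosen only for this illustration; WIFEXITED and WEXITSTATUS are the standard macros for decoding the value that wait() fills in.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: forks a child that exits immediately with `code`
 * and returns the exit status the parent observes via wait(), or -1 on
 * error. */
int child_exit_status(int code)
{
    fflush(stdout);             /* keep buffered output out of the child's copy */

    pid_t pid = fork();
    if (pid < 0)
        return -1;              /* fork failed */
    if (pid == 0)
        exit(code);             /* child: terminate with the given status */

    int status;
    if (wait(&status) < 0)      /* parent: block until the child terminates */
        return -1;

    /* The raw value from wait() encodes more than the exit code, so it is
     * decoded with the standard macros. */
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling child_exit_status(42) from main() should hand 42 back to the parent. Only the low 8 bits of the exit status are reported this way.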
fork()
The fork() syscall is used to create a new process. Consider the following program:
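A minimal sketch of such a program, written as a function so it can be called from main(). The name fork_and_report and the printed messages are assumptions made for this illustration, not taken from the original listing.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: forks once. The child prints a line and exits;
 * the parent prints a line and returns the child's PID (-1 on error). */
pid_t fork_and_report(void)
{
    printf("hello (pid:%d)\n", (int)getpid());
    fflush(stdout);     /* flush now so the child does not inherit this line */

    pid_t rc = fork();
    if (rc < 0) {
        /* fork failed: no child was created */
        fprintf(stderr, "fork failed\n");
    } else if (rc == 0) {
        /* child: fork() returned 0 */
        printf("child (pid:%d)\n", (int)getpid());
        exit(0);
    } else {
        /* parent: fork() returned the PID of the child */
        printf("parent of %d (pid:%d)\n", (int)rc, (int)getpid());
    }
    return rc;
}
```

Because the parent does not wait here, the order of the "child" and "parent" lines can differ from run to run.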
Key Ideas:
The process calls the fork() syscall, which the OS provides as a way to create a new process. The process that is created is an (almost) exact copy of the calling process. The calling process is the parent and the newly created process is the child.
The newly created process doesn't start running at main(); rather, it comes into life as if it had called fork() itself.
While the parent receives the PID of the newly created child, the child receives a return code of 0. This differentiation is useful because it makes it simple to write code that handles the two cases separately (as above).
Note that this program is non-deterministic: either the parent or the child may print first, depending on the CPU scheduler.
wait()
The wait() syscall lets the parent wait for a child process to finish what it has been doing. Consider the following program:
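A minimal sketch of such a program, again written as a function. The name fork_then_wait and the messages are hypothetical; the only change from a plain fork() example is the call to wait() in the parent.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if wait() reported the child created
 * here, 0 if it reported some other PID, and -1 if fork() failed. */
int fork_then_wait(void)
{
    fflush(stdout);     /* keep buffered output out of the child's copy */

    pid_t rc = fork();
    if (rc < 0) {
        fprintf(stderr, "fork failed\n");
        return -1;
    }
    if (rc == 0) {
        /* child: print and terminate */
        printf("child (pid:%d)\n", (int)getpid());
        exit(0);
    }

    /* parent: wait() blocks until the child has terminated, so the
     * child's line is printed before the line below */
    pid_t waited = wait(NULL);
    printf("parent of %d (waited:%d) (pid:%d)\n",
           (int)rc, (int)waited, (int)getpid());
    return waited == rc;
}
```

Passing NULL to wait() discards the termination status; a caller that needs it would pass the address of an int instead.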
Key Ideas:
This time the output is deterministic: the child always prints first because of wait().
If the child runs first, it prints and exits before the parent gets to run; if the parent runs first, it immediately calls wait() and blocks until the child has finished.
exec()
The exec() family of calls is useful when you want to run a program that is different from the calling program. Consider the following program:
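A minimal sketch of such a program, assuming the child runs wc on a file chosen by the caller. The function name run_wc is hypothetical, and execvp() is used as one possible member of the exec() family (it searches $PATH for the executable).

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs `wc <path>` in a child process and returns
 * wc's exit status, or -1 if the child could not be created or waited for. */
int run_wc(const char *path)
{
    pid_t rc = fork();
    if (rc < 0)
        return -1;

    if (rc == 0) {
        /* child: replace this program with wc */
        char *argv[] = { "wc", (char *)path, NULL };
        execvp(argv[0], argv);  /* on success this call never returns */
        perror("execvp");       /* reached only if the exec failed */
        _exit(127);
    }

    /* parent: wait for wc to finish and decode its exit status */
    int status;
    if (waitpid(rc, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For instance, run_wc("p3.c") would print the line, word, and byte counts of p3.c, provided that file exists in the current directory.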
Key Ideas:
Given the name of an executable (e.g., wc) and some arguments (e.g., p3.c), exec() loads code (and static data) from that executable and overwrites the current code segment (and current static data) with it.
exec() does not create a new process; rather, it transforms the currently running program (formerly p3) into a different running program (wc).
After the exec() in the child, it is almost as if p3.c never ran; a successful call to exec() never returns.
Motivating the API
The separation of fork() and exec() is essential in building a UNIX shell, because it lets the shell run code after the call to fork() but before the call to exec().
Imagine we are interacting with a UNIX shell. When you type a command into it, the shell does the following:
Figures out where in the file system the executable resides, using the $PATH environment variable.
Calls fork() to create a new child process to run the command.
Calls some variant of exec() to run the command.
Waits for the command to complete by calling wait().
When the child completes, the shell returns from wait() and prints out a prompt again, ready for your next command.
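The steps above can be sketched as a single function that a shell's read loop might call for each line of input. This is only an illustration: the name run_line and the MAX_ARGS limit are hypothetical, the tokenizer splits on whitespace only (no quoting), and waitpid() stands in for wait() so that exactly this child is awaited.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_ARGS 16     /* hypothetical limit for this sketch */

/* Hypothetical helper: splits `line` on whitespace (modifying it in
 * place), runs the resulting command in a child, and returns the
 * command's exit status (-1 on error, 0 for an empty line). */
int run_line(char *line)
{
    char *argv[MAX_ARGS + 1];
    int argc = 0;

    for (char *tok = strtok(line, " \t\n");
         tok != NULL && argc < MAX_ARGS;
         tok = strtok(NULL, " \t\n"))
        argv[argc++] = tok;
    argv[argc] = NULL;
    if (argc == 0)
        return 0;                       /* nothing to run */

    pid_t pid = fork();                 /* create the child */
    if (pid < 0)
        return -1;
    if (pid == 0) {
        execvp(argv[0], argv);          /* looks the executable up via $PATH */
        perror(argv[0]);                /* reached only if the exec failed */
        _exit(127);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)   /* wait for the command to complete */
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A shell built on this would print a prompt, read a line with fgets(), pass it to run_line(), and repeat.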
The separation of fork() and exec() allows the shell to do a whole bunch of useful things rather easily. For example, consider a command that redirects the output of wc into a file, such as wc p3.c > newfile.txt.
When the child is created, before calling exec(), the shell closes standard output and opens the file newfile.txt. By doing so, any output from the soon-to-be-running program wc is sent to the file instead of the screen. This would not be possible if fork() and exec() were merged into a single syscall with no opportunity to run code in between.
We can actually implement this redirection feature using fork() and exec():
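A minimal sketch of this idea, assuming the command is wc and the output file is chosen by the caller. The function name wc_to_file is hypothetical. The close-then-open trick relies on open() returning the lowest unused descriptor; the dup2() fallback covers the case where that is not descriptor 1.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs `wc <input>` with standard output redirected
 * to the file `output`; returns wc's exit status, or -1 on error. */
int wc_to_file(const char *input, const char *output)
{
    pid_t rc = fork();
    if (rc < 0)
        return -1;

    if (rc == 0) {
        /* child: this code runs after fork() but before exec() */
        close(STDOUT_FILENO);
        int fd = open(output, O_CREAT | O_WRONLY | O_TRUNC, S_IRUSR | S_IWUSR);
        if (fd < 0)
            _exit(126);
        if (fd != STDOUT_FILENO) {      /* e.g. if descriptor 0 was closed */
            dup2(fd, STDOUT_FILENO);
            close(fd);
        }

        char *argv[] = { "wc", (char *)input, NULL };
        execvp(argv[0], argv);          /* wc inherits the redirected stdout */
        _exit(127);                     /* reached only if the exec failed */
    }

    /* parent: its own standard output is untouched */
    int status;
    if (waitpid(rc, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling wc_to_file("p3.c", "newfile.txt") would leave the counts in newfile.txt and print nothing to the screen.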
UNIX pipes are implemented in a similar way, but with the pipe() syscall. In this case, the output of one process is connected to an in-kernel pipe, and the input of another process is connected to that same pipe; thus, the output of one process is seamlessly used as input to the next, and long and useful chains of commands can be strung together. For example:
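A two-stage pipeline can be sketched as follows. The function name run_pipeline is hypothetical; it assumes each command is given as a NULL-terminated argv array and uses dup2() to attach the pipe ends to standard output and standard input between fork() and exec().

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs `left | right` and returns the exit status
 * of `right`, or -1 on error. */
int run_pipeline(char *const left[], char *const right[])
{
    int fds[2];                         /* fds[0]: read end, fds[1]: write end */
    if (pipe(fds) < 0)
        return -1;

    pid_t lpid = fork();
    if (lpid == 0) {
        dup2(fds[1], STDOUT_FILENO);    /* stdout now feeds the pipe */
        close(fds[0]);
        close(fds[1]);
        execvp(left[0], left);
        _exit(127);                     /* reached only if the exec failed */
    }

    pid_t rpid = (lpid < 0) ? -1 : fork();
    if (rpid == 0) {
        dup2(fds[0], STDIN_FILENO);     /* stdin now drains the pipe */
        close(fds[0]);
        close(fds[1]);
        execvp(right[0], right);
        _exit(127);
    }

    /* The parent must close both ends; otherwise the reader never sees
     * end-of-file once the writer has finished. */
    close(fds[0]);
    close(fds[1]);

    int status = 0;
    if (lpid > 0)
        waitpid(lpid, NULL, 0);
    if (rpid < 0 || waitpid(rpid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With left set to { "echo", "hello", NULL } and right set to { "grep", "-q", "hello", NULL }, this behaves like echo hello | grep -q hello in a shell.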