Threads
Motivation: Concurrency within a Process
In the "Process" section, we saw that switching between many processes keeps the CPU busy and therefore improves CPU utilization. The same idea applies within a single process. For example, when we visit a web page, the text is displayed first, then small images, and then large images. This design makes sense because images take longer to load. Early in rendering, users should at least see something, even though the page is still incomplete.
In the actual implementation, the algorithm will be:
1. Call GetData() to download the text data.
2. Call ShowText() to display the text data.
3. Call GetData() to download the image data.
4. Call ProcessImage() to decompress the image data.
5. Call ShowImage() to display the image data.
Each function can run in its own thread because the work is largely independent: we can process the image while the text is being handled, and the two don't conflict. This idea is called multithreading.
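The idea can be sketched in C with POSIX threads. This is a simplified illustration, not the page's original pseudocode: it uses two threads (one for the text path, one for the image path) rather than one thread per function, and GetData(), ShowText(), ProcessImage(), and ShowImage() are hypothetical stand-ins that only copy and print strings. The buffers, the `render_page()` helper, and the payload strings are likewise assumptions made for the example.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins for the functions named in the text. */
static char text_buf[64];
static char image_buf[64];
static int text_shown = 0;
static int image_shown = 0;

static void GetData(char *buf, size_t size, const char *payload) {
    snprintf(buf, size, "%s", payload);            /* pretend to download */
}
static void ShowText(const char *buf) {
    printf("text: %s\n", buf);
    text_shown = 1;
}
static void ProcessImage(char *buf, size_t size) {
    /* pretend to decompress by tagging the data */
    strncat(buf, " (decoded)", size - strlen(buf) - 1);
}
static void ShowImage(const char *buf) {
    printf("image: %s\n", buf);
    image_shown = 1;
}

/* Text path: download, then display. */
static void *text_path(void *arg) {
    (void)arg;
    GetData(text_buf, sizeof text_buf, "hello");
    ShowText(text_buf);
    return NULL;
}

/* Image path: download, decompress, then display. */
static void *image_path(void *arg) {
    (void)arg;
    GetData(image_buf, sizeof image_buf, "pixels");
    ProcessImage(image_buf, sizeof image_buf);
    ShowImage(image_buf);
    return NULL;
}

/* Runs both paths concurrently; returns 0 on success, -1 on failure. */
int render_page(void) {
    pthread_t text_thread, image_thread;
    int rc;

    if (pthread_create(&text_thread, NULL, text_path, NULL) != 0)
        return -1;
    if (pthread_create(&image_thread, NULL, image_path, NULL) != 0) {
        pthread_join(text_thread, NULL);
        return -1;
    }
    /* Wait for both paths to finish before reporting success. */
    rc = pthread_join(text_thread, NULL);
    assert(rc == 0);
    rc = pthread_join(image_thread, NULL);
    assert(rc == 0);
    (void)rc;
    return 0;
}
```

Calling `render_page()` from `main` prints one "text" line and one "image" line. Their order is not fixed, because the scheduler decides which thread runs first. On most systems the program is built with something like `cc -pthread`.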
You may ask: why use 4 threads instead of 4 processes? The process model has two major drawbacks:
In our example, GetData() writes data to a buffer and ShowText() reads data from this buffer. In other words, they need shared resources. If we assigned a process to each function, we would have to set up inter-process communication (IPC), which is overkill here.
The cost of creating and maintaining processes is high: resource and PCB allocation/deallocation, context switching, etc.
Threads solve both problems.
Thread vs. Process
A process may contain multiple threads.
A thread is the basic unit of scheduling.
A process is the basic unit of resource allocation.
Detailed comparison:
Property | Thread | Process |
---|---|---|
Concurrency | Yes | Yes |
Cost | Low | High |
Resource Sharing | Shared | Separated |
Security | Insecure | Secure |
Belongs to | Process | OS |
The following data are shared by all threads of a process (they are owned by the process):
Global variables
Heap
Static variables
Code
Open files
The following data are private to each thread:
Local variables
Stack
Registers
Function arguments
Thread Local Storage (TLS) data
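The split between shared and private data can be observed with a small C sketch. The names here (`shared_counter`, `tls_counter`, `worker`, `run_workers`, `struct worker_report`) are hypothetical, and the thread-local variable assumes a C11 compiler that supports `_Thread_local`. Every thread increments one global counter, one local variable, and one TLS variable: the global ends up with the combined total, while each thread's local and TLS copies only reflect that thread's own work.

```c
#include <assert.h>
#include <pthread.h>

#define NUM_THREADS 4
#define INCREMENTS  1000

/* Global: one copy, visible to every thread in the process. */
static int shared_counter = 0;
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;

/* Thread-local storage: each thread gets its own copy (C11). */
static _Thread_local int tls_counter = 0;

/* Hypothetical record each thread fills in with its private counts. */
struct worker_report {
    int local_count;
    int tls_count;
};

static void *worker(void *arg) {
    struct worker_report *report = arg;
    int local_counter = 0;          /* local: lives on this thread's stack */

    for (int i = 0; i < INCREMENTS; i++) {
        local_counter++;            /* private, no lock needed */
        tls_counter++;              /* private, no lock needed */
        pthread_mutex_lock(&counter_lock);
        shared_counter++;           /* shared, so updates are serialized */
        pthread_mutex_unlock(&counter_lock);
    }
    report->local_count = local_counter;
    report->tls_count = tls_counter;
    return NULL;
}

/* Starts the workers, waits for them, and returns the shared total. */
int run_workers(struct worker_report reports[NUM_THREADS]) {
    pthread_t threads[NUM_THREADS];

    for (int i = 0; i < NUM_THREADS; i++) {
        int rc = pthread_create(&threads[i], NULL, worker, &reports[i]);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < NUM_THREADS; i++) {
        int rc = pthread_join(threads[i], NULL);
        assert(rc == 0);
        (void)rc;
    }
    return shared_counter;
}
```

With these constants, the first call to `run_workers()` returns 4000 (4 threads times 1000 increments), while each report holds 1000 for both the local and the TLS count. The mutex is needed only for the global: without it, concurrent increments of `shared_counter` could be lost.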
Thread Implementation
There are three ways to implement threads:
User Thread: threads are created and managed in user mode.
Kernel Thread: threads are created and managed in kernel mode.
Lightweight Process (LWP)
The Thread API
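The POSIX thread (pthreads) API includes the following functions. They are library calls rather than syscalls, although the library relies on system calls internally: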
pthread_create()
pthread_join()
The pthread_create() Function
Definition:
int pthread_create(pthread_t *thread, const pthread_attr_t *attr, void *(*start_routine)(void *), void *arg);
Arguments:
thread: A pointer to a structure of type pthread_t. We'll use this structure to interact with this thread, and thus we need to pass it to pthread_create() in order to initialize it.
attr: Specifies any attributes this thread might have. Some examples include setting the stack size or perhaps information about the scheduling priority of the thread.
start_routine: Which function should this thread start running in?
arg: The argument to be passed to the function where the thread begins execution.
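A minimal sketch of how these four arguments fit together, paired with pthread_join() to wait for the result. The `struct sum_task` type and the `sum_range()` and `sum_in_thread()` helpers are hypothetical names invented for this example. Default attributes are requested by passing NULL for attr. Both calls return 0 on success and an error number otherwise.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical argument type: input range plus a slot for the result. */
struct sum_task {
    int from;
    int to;
    long result;
};

/* start_routine: must take a void * and return a void *. */
static void *sum_range(void *arg) {
    struct sum_task *task = arg;    /* recover the real argument type */

    task->result = 0;
    for (int i = task->from; i <= task->to; i++)
        task->result += i;
    return task;                    /* handed back through pthread_join() */
}

/* Sums from..to in a new thread; returns -1 if the thread cannot start. */
long sum_in_thread(int from, int to) {
    struct sum_task task = { from, to, 0 };
    pthread_t thread;               /* initialized by pthread_create() */
    void *retval = NULL;

    /* thread = &thread, attr = NULL (defaults),
       start_routine = sum_range, arg = &task */
    if (pthread_create(&thread, NULL, sum_range, &task) != 0)
        return -1;

    /* Block until the thread finishes and collect its return value. */
    int rc = pthread_join(thread, &retval);
    assert(rc == 0 && retval == &task);
    (void)rc;
    return task.result;
}
```

For instance, `sum_in_thread(1, 100)` yields 5050. Passing a pointer to a stack variable as arg is safe here only because the caller joins the thread before `task` goes out of scope.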
Thread Safety