Introduction
In today’s programming world, multithreading has become an essential part of any programming platform, whether it is .NET, Java, or C++. To write highly responsive and scalable applications, you need to take advantage of multithreaded programming. While working with the .NET Framework, I came across various Framework Class Library (FCL) features for parallel task processing, such as the Task Parallel Library (TPL), Parallel LINQ (PLINQ), task factories, the thread pool, and the Asynchronous Programming Model, all of which use Windows threads behind the scenes to achieve parallelism. Understanding the basic structure of a Windows thread helps a developer understand and apply advanced features such as TPL and PLINQ, and helps in visualizing how multiple threads work together in a system, especially when troubleshooting multithreaded applications. In this article, I would like to share some basics about Windows threads that may help you understand how the operating system implements them.
What Windows Thread Consists Of
Let’s start by looking at the basic components of a thread. A Windows thread has three basic components:
- Thread Kernel Object
- Stack
- TEB
Windows Thread Components
Together, these three components make up a Windows thread. I explain each of them below, but before looking at them, let's have a brief introduction to the Windows kernel and kernel objects, as these are among the most important parts of the Windows operating system.
What Is Operating System Kernel
The kernel is the main component of any operating system. It is a bridge between applications and hardware, providing a layer of abstraction through which applications can interact with the hardware.
The kernel is the part of the operating system that loads first, and it remains in physical memory. Its primary function is to manage the computer's hardware and resources and to allow other programs to run and use these resources. To learn more about the kernel, visit this link.
What Are Kernel Objects
The kernel needs to maintain a lot of data about numerous resources such as processes, threads, and files. For that, the kernel uses “kernel data structures”, which are known as kernel objects. Each kernel object is simply a memory block allocated by the kernel and accessible only to the kernel. This memory block is a data structure whose members maintain information about the object. Some members (security descriptor, usage count, and so on) are the same across all object types, but most data members are specific to the type of kernel object. The kernel creates and manipulates several types of kernel objects, such as process objects, thread objects, event objects, file objects, file-mapping objects, I/O completion port objects, job objects, mutex objects, pipe objects, and semaphore objects.
Winobj Screenshot
If you are curious to see the list of all kernel object types, you can use the free WinObj tool from Sysinternals, located here.
Thread Kernel Object
The first and most basic component of a Windows thread is the thread kernel object. For every thread in the system, the operating system creates one thread kernel object. The operating system uses these thread kernel objects to manage and execute threads across the system. The kernel object is also where the system keeps all the statistical information about the thread. Below are some of the important properties of the thread kernel object.
Thread Context
Each thread kernel object contains a set of CPU registers, called the thread's context. The context reflects the state of the CPU registers when the thread last executed. The set of CPU registers for the thread is saved in a CONTEXT structure. The instruction pointer and stack pointer registers are the two most important registers in the thread's context. The stack pointer is a register that holds the memory address of the current top of the thread's stack. The instruction pointer points to the next instruction to be executed by the CPU. The operating system uses the context information in the kernel object when performing a thread context switch. A context switch is the process of storing and restoring the state (context) of a thread so that execution can be resumed from the same point at a later time.
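To make the save and restore steps concrete, here is a minimal sketch in Python. It is a toy model, not the real Windows CONTEXT structure: the `Context` and `Cpu` classes, their field names, and the use of plain dictionaries as thread kernel objects are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Context:
    """Hypothetical, simplified stand-in for a saved register set."""
    instruction_pointer: int = 0
    stack_pointer: int = 0
    general: dict = field(default_factory=dict)


@dataclass
class Cpu:
    """Toy CPU holding the registers of whichever thread is running."""
    instruction_pointer: int = 0
    stack_pointer: int = 0
    general: dict = field(default_factory=dict)


def save_context(cpu: Cpu) -> Context:
    # Copy the live register values into a saved context.
    return Context(cpu.instruction_pointer, cpu.stack_pointer, dict(cpu.general))


def load_context(cpu: Cpu, ctx: Context) -> None:
    # Restore saved register values so the thread resumes where it stopped.
    cpu.instruction_pointer = ctx.instruction_pointer
    cpu.stack_pointer = ctx.stack_pointer
    cpu.general = dict(ctx.general)


def context_switch(cpu: Cpu, outgoing: dict, incoming: dict) -> None:
    # 'outgoing' and 'incoming' model thread kernel objects as plain dicts.
    outgoing["context"] = save_context(cpu)
    load_context(cpu, incoming["context"])
```

Switching from thread A to thread B and back again leaves A's instruction pointer and stack pointer exactly as they were, which is what allows A to continue as if it had never been interrupted.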
The table below shows some other important information held in the thread kernel object.

| Property Name | Description |
| --- | --- |
| CreateTime | This field contains the time when the thread was created. |
| ThreadsProcess | This field contains a pointer to the EPROCESS structure of the process that owns this thread. |
| StackBase | This field contains the base address of this thread’s stack. |
| StackLimit | This field contains the end of the kernel-mode stack of the thread. |
| TEB | This field contains a pointer to the thread’s environment block. |
| State | This field contains the thread’s current state. |
| Priority | This field contains the thread’s current priority. |
| ContextSwitches | This field counts the number of context switches that the thread has gone through. |
| WaitTime | This field contains the time until a wait will expire. |
| Queue | This field contains a queue for this thread. |
| Preempted | This field specifies if the thread will be preempted or not. |
| Affinity | This field contains the thread’s processor affinity (the set of CPUs it may run on). |
| KernelTime | This field contains the time that the thread has spent in kernel mode. |
| UserTime | This field contains the time that the thread has spent in user mode. |
| ImpersonationInfo | This field contains a pointer to a structure used when the thread is impersonating another one. |
| SuspendCount | This field contains a count of how many times the thread has been suspended. |
Stack
The second basic component of a thread is its stack. Once the thread kernel object has been created, the system allocates memory for the thread's stack. Every thread has its own stack, which is used for maintaining the local variables of functions and for passing arguments to functions executing inside the thread. When a function executes, it may add some of its state data, such as arguments and local variables, to the top of the stack; when the function exits, it is responsible for removing that data from the stack. In addition, a thread's stack stores the return address of each function call so that return statements return to the correct location.
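The push and pop behavior described above can be sketched with a small Python model. The `ThreadStack` class and its frame layout are hypothetical and exist only to illustrate the idea; a real stack is a contiguous region of memory, not a list of dictionaries.

```python
class ThreadStack:
    """Toy model of one thread's stack (hypothetical, not a Windows structure)."""

    def __init__(self):
        self.frames = []

    def call(self, function_name, return_address, args):
        # Entering a function pushes its arguments and the return address.
        self.frames.append({
            "function": function_name,
            "return_address": return_address,
            "args": args,
            "locals": {},
        })

    def ret(self):
        # Leaving a function pops its frame and yields the address to resume at.
        frame = self.frames.pop()
        return frame["return_address"]
```

For example, after `call("main", 0, ())` followed by `call("helper", 0x401020, (1, 2))`, a call to `ret()` hands back `0x401020` and leaves only the frame for `main` on the stack.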
The operating system allocates two stacks for every thread: a user-mode stack and a kernel-mode stack.
User-mode stack
The user-mode stack is used for local variables and arguments passed to methods. It also contains the address indicating what the thread should execute next when the current method returns. By default, Windows allocates 1 MB of memory for each thread’s user-mode stack.
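Most runtimes let you override the default stack size when you create a thread. As a rough illustration, the sketch below uses Python's `threading.stack_size()` to request a 1 MB stack for a worker thread. The helper names `run_with_stack_size` and `depth` are my own, and the call is only a request to the platform, so treat this as an example of the idea rather than a statement about how Windows itself sizes stacks.

```python
import threading

ONE_MB = 1024 * 1024


def run_with_stack_size(size_bytes, target):
    """Run target() on a new thread created with the requested stack size."""
    previous = threading.stack_size(size_bytes)  # applies to threads created later
    try:
        result = []
        worker = threading.Thread(target=lambda: result.append(target()))
        worker.start()
        worker.join()
        return result[0]
    finally:
        threading.stack_size(previous)  # restore the earlier setting


def depth(n):
    # Each recursive call consumes one more frame on the worker's stack.
    return 0 if n == 0 else 1 + depth(n - 1)
```

Calling `run_with_stack_size(ONE_MB, lambda: depth(100))` runs the recursion on a thread whose stack was requested at 1 MB and then restores the previous setting.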
Kernel-mode stack
The kernel-mode stack is used when application code passes arguments to a kernel function in the operating system. For security reasons, Windows copies any arguments passed from user-mode code to the kernel from the thread’s user-mode stack to the thread’s kernel-mode stack. Once copied, the kernel can verify the arguments’ values, and since the application code can’t access the kernel mode stack, the application can’t modify the arguments’ values after they have been validated and the OS kernel code begins to operate on them. In addition, the kernel calls methods within itself and uses the kernel-mode stack to pass its own arguments, to store a function’s local variables, and to store return addresses. The kernel-mode stack is 12 KB when running on a 32-bit Windows system and 24 KB when running on a 64-bit Windows system.
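The reason for copying before validating can be shown with a short sketch. Everything here is hypothetical (`system_call`, `validate`, and the argument layout); the deep copy simply stands in for the copy onto the kernel-mode stack.

```python
import copy


def validate(args):
    # Hypothetical check: the requested length must fit inside the buffer.
    if not 0 <= args["length"] <= len(args["buffer"]):
        raise ValueError("length out of range")


def system_call(user_args):
    """Toy system call that captures its arguments before validating them."""
    kernel_copy = copy.deepcopy(user_args)  # stands in for the kernel-mode stack
    validate(kernel_copy)

    # Only kernel_copy is used from here on, so the caller cannot change the
    # validated values by modifying user_args afterwards.
    def complete():
        return kernel_copy["buffer"][:kernel_copy["length"]]

    return complete
```

If the caller changes `user_args` after `system_call` returns, `complete()` still works on the values that were validated, which is the property the kernel-mode stack provides.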
Thread Environment Block (TEB)
Another important data structure used by every thread is the Thread Environment Block (TEB). The TEB is a block of memory allocated and initialized in user mode (the user-mode address space is directly accessible to application code, whereas the kernel-mode address space is not). The TEB consumes one page of memory (4 KB on x86 and x64 CPUs).
One important piece of information in the TEB is the exception-handling data used by SEH (Microsoft Structured Exception Handling). The TEB contains the head of the thread’s exception-handling chain. Each try block that the thread enters inserts a node at the head of this chain. The node is removed from the chain when the thread exits the try block. You can learn more about SEH here.
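The chain behaves like a singly linked list whose head lives in the TEB. The sketch below models that with hypothetical `Teb` and `HandlerNode` classes; the names and the use of strings as handlers are assumptions for illustration only.

```python
class HandlerNode:
    """One entry in the exception-handling chain."""

    def __init__(self, handler, next_node):
        self.handler = handler
        self.next = next_node


class Teb:
    """Toy stand-in for a TEB that holds only the head of the handler chain."""

    def __init__(self):
        self.exception_list = None


def enter_try(teb, handler):
    # Entering a try block links a new node in at the head of the chain.
    teb.exception_list = HandlerNode(handler, teb.exception_list)


def leave_try(teb):
    # Leaving the try block unlinks the node that is currently at the head.
    teb.exception_list = teb.exception_list.next


def chain(teb):
    # Walk from the head, so the innermost handler comes first.
    handlers, node = [], teb.exception_list
    while node is not None:
        handlers.append(node.handler)
        node = node.next
    return handlers
```

Entering an outer and then an inner try block gives the chain `["inner", "outer"]`, so the innermost handler is always the first one found when an exception is dispatched.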
In addition, the TEB contains the thread-local storage data. In multithreaded applications, there is often a need to maintain data that is unique to a thread. The place where this thread-specific data is stored is called thread-local storage. You can learn more about thread-local storage here.
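As a small illustration of the idea, here is thread-local storage as exposed by Python's `threading.local`. This is a language-level facility rather than the Windows TLS mechanism itself, and the `worker` and `run_workers` helpers are hypothetical names used only for this example.

```python
import threading

tls = threading.local()  # each thread sees its own attributes on this object


def worker(name, results):
    tls.name = name            # stored in this thread's private slot
    results[name] = tls.name   # read back the same thread's value


def run_workers(names):
    results = {}
    threads = [threading.Thread(target=worker, args=(n, results)) for n in names]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Each worker reads back only the value it stored itself, and the main thread, which never assigned `tls.name`, does not see the attribute at all.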
The table below shows a few important properties of the TEB:

| Property Name | Description |
| --- | --- |
| ThreadLocalStorage | This field contains the thread-specific data. |
| ExceptionList | This field contains the exception handlers list used by SEH. |
| ExceptionCode | This field contains the last exception code generated by the thread. |
| LastErrorValue | This field contains the last DLL error value for the thread. |
| CountOwnedCriticalSections | This field counts the number of critical sections (a synchronization mechanism) that the thread owns. |
| IsImpersonating | This field is a flag indicating whether the thread is doing any impersonation. |
| ImpersonationLocale | This field contains the locale ID that the thread is impersonating. |
Thread kernel object as thread handle
The system keeps all the information required for thread execution and scheduling inside the thread kernel object. In addition, the operating system stores the addresses of the thread's stack and TEB in the thread kernel object, as shown in the figure below:
Thread kernel object mapping
The thread kernel object is the single handle through which the operating system accesses all the information about the thread, and the operating system uses it for thread execution and scheduling.
Thread State
Each thread exists in a particular execution state at any given time. The operating system stores the state of a thread in the "State" field of the thread kernel object. The states that are relevant to performance are:
- Running - thread is using CPU
- Blocked - thread is waiting for input
- Ready - thread is ready to run (not Blocked or Running)
- Exited - thread has exited but not been destroyed
Thread State Diagram
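The four states above can be modeled as a small state machine. The transition table below is my own assumption for this simplified four-state model, and the names `ThreadState`, `ALLOWED`, and `transition` are hypothetical.

```python
from enum import Enum


class ThreadState(Enum):
    READY = "Ready"
    RUNNING = "Running"
    BLOCKED = "Blocked"
    EXITED = "Exited"


# Assumed transitions for the simplified four-state model described above.
ALLOWED = {
    ThreadState.READY: {ThreadState.RUNNING},
    ThreadState.RUNNING: {ThreadState.READY, ThreadState.BLOCKED, ThreadState.EXITED},
    ThreadState.BLOCKED: {ThreadState.READY},
    ThreadState.EXITED: set(),
}


def transition(current, new):
    # Reject any move that the table does not list for the current state.
    if new not in ALLOWED[current]:
        raise ValueError(f"illegal transition {current.value} -> {new.value}")
    return new
```

In this model, a Blocked thread cannot jump straight to Running; it first has to become Ready and wait for the scheduler to pick it.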
Thread Scheduler Queues
The operating system's thread scheduler maintains thread kernel objects in different queues based on the state of each thread:
- Ready queue - The scheduler maintains a list of threads that are in the Ready state and can be scheduled on a CPU. The list is often sorted, and there is generally one queue per CPU.
- Waiting queues - A thread in the Blocked state is put in a wait queue. Here are a few examples of what causes a thread to block:
  - The thread kernel object has a suspend count greater than 0, which means the thread is suspended.
  - The thread is waiting for a lock to be released.
  - The thread is waiting for a reply from a device such as a disk, console, or network.
- Exited queue - A thread in the Exited state is put in this queue.
The thread scheduler uses a doubly linked list to maintain these queues: a list head points to a collection of list entries, and each entry points to the next and previous entries in the list.
Thread kernel object doubly linked list
The scheduler moves threads between queues when their state changes; for example, a thread moves from a wait queue to the ready queue when it wakes up.
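Here is one possible way to sketch such queues in Python. It assumes a circular doubly linked list with a sentinel head, which is my own choice for the illustration; `ListEntry`, `insert_tail`, `remove_entry`, `names`, and `wake` are hypothetical names, not kernel APIs.

```python
class ListEntry:
    """Node with links to the next and previous entries (circular list)."""

    def __init__(self, thread_name=None):
        self.thread_name = thread_name
        self.next = self
        self.prev = self


def insert_tail(head, entry):
    # Link 'entry' just before the head, i.e. at the end of the queue.
    last = head.prev
    entry.prev, entry.next = last, head
    last.next = entry
    head.prev = entry


def remove_entry(entry):
    # Unlink in constant time without walking the list.
    entry.prev.next = entry.next
    entry.next.prev = entry.prev
    entry.next = entry.prev = entry


def names(head):
    # Collect thread names from the first entry to the last.
    out, node = [], head.next
    while node is not head:
        out.append(node.thread_name)
        node = node.next
    return out


def wake(entry, ready_head):
    # A woken thread leaves its wait queue and joins the end of the ready queue.
    remove_entry(entry)
    insert_tail(ready_head, entry)
```

Because every entry knows both of its neighbors, a thread can be removed from the middle of a wait queue without searching for it, which keeps the move to the ready queue cheap.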
How OS Run Threads
As we already know, the thread context structure is maintained inside the thread's kernel object. This context structure reflects the state of the thread's CPU registers when the thread was last executing. Every 20 milliseconds or so, the operating system's thread scheduler looks at all the thread kernel objects currently in the ready queue (a doubly linked list). The scheduler selects one of the thread kernel objects and loads the CPU's registers with the values that were last saved in the thread's context. This action is called a context switch. At this point, the thread is executing code and manipulating data in its process' address space. After another 20 milliseconds or so, the scheduler saves the CPU's registers back into the thread's context. The scheduler then examines the remaining thread kernel objects in the ready queue, selects another thread's kernel object, loads this thread's context into the CPU's registers, and continues.
Thread Scheduler Diagram
This operation of loading a thread's context, letting the thread run, saving the context, and repeating the operation begins when the system boots and continues until the system is shut down.
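The loop described above can be simulated in a few lines. This sketch assumes a plain round-robin policy with a fixed 20 ms time slice and ignores priorities and blocking, so it is a simplification rather than the actual Windows scheduling algorithm; `round_robin` and its parameters are hypothetical.

```python
from collections import deque


def round_robin(work_ms, quantum_ms=20):
    """Simulate time slicing. work_ms maps thread name -> CPU time still needed."""
    ready = deque(work_ms.items())  # the ready queue, in arrival order
    timeline = []                   # (thread, ms run) for each time slice
    while ready:
        name, remaining = ready.popleft()   # pick the next ready thread
        ran = min(quantum_ms, remaining)    # run until the slice ends or work is done
        timeline.append((name, ran))
        if remaining > ran:
            ready.append((name, remaining - ran))  # unfinished: back of the queue
    return timeline
```

With three threads that need 50, 20, and 30 ms of CPU time, the simulated CPU keeps rotating among the unfinished threads until all of them are done.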
Processes and Threads
One more thing I would like to share is the relationship between threads and processes. Every process requires at least one thread. A process never executes anything; it is simply a container for threads. Threads are always created in the context of some process and live their entire life within that process. What this really means is that the thread executes code and manipulates data within its process' address space. So if you have two or more threads running in the context of a single process, the threads share a single address space. The threads can execute the same code and manipulate the same data.
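A quick way to see threads sharing data is to let several of them update the same object. The sketch below uses Python's `threading` module; `shared_counter` is a hypothetical helper, and the lock is there because shared data needs coordinated access.

```python
import threading


def shared_counter(thread_count=4, increments=1000):
    counter = {"value": 0}   # one object, visible to every thread in the process
    lock = threading.Lock()  # serializes updates to the shared data

    def work():
        for _ in range(increments):
            with lock:
                counter["value"] += 1

    threads = [threading.Thread(target=work) for _ in range(thread_count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter["value"]
```

All the workers run the same function and modify the same dictionary, which is only possible because they live in one process and therefore one address space.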
A process gives structural information to the in-memory copy of your executable program, such as which memory is currently allocated, which program is running, and how much memory it is using. The process, however, does not execute any code on its own. It simply allows the OS (and the user) to know which executable program a certain thread belongs to. It also contains all the handles, security rights, and privileges that threads create. Therefore, code actually runs in threads.
To understand this, you can draw an analogy between processes and threads and a regular, everyday object -- a house. A house is really a container, with certain attributes (such as the amount of floor space, the number of bedrooms, and so on). If you look at it that way, the house really doesn't actively do anything on its own -- it's a passive object. This is effectively what a process is.
The people living in the house are the active objects -- they're the ones using the various rooms, watching TV, cooking, taking showers, and so on. We'll soon see that's how threads behave. Just as a house occupies an area of real estate, a process occupies memory. And just as a house's occupants are free to go into any room they want, a process's threads all have common access to that memory.
A process, just like a house, has some well-defined "borders." A person in a house has a pretty good idea when they're in the house, and when they're not. A thread has a very good idea -- if it's accessing memory within the process, it can live. If it steps out of the bounds of the process's address space, it gets killed. This means that two threads, running in different processes, are effectively isolated from each other.
If you want to learn more about process and thread, please read Processes and Threads.
Summary
The three basic components of a thread are:
- The thread kernel object is the primary data structure through which the OS manages a thread.
- The thread stack is used for maintaining the local variables of functions and for passing arguments to functions executing inside a thread. The operating system allocates two stacks for every thread: a user-mode stack and a kernel-mode stack.
- The Thread Environment Block is a block of memory allocated and initialized in user mode, used primarily for exception handling and thread-local storage data.
Thread State
Each thread exists in one of the following execution states at any given time:
- Running - thread is using CPU
- Blocked - thread is waiting for input
- Ready - thread is ready to run (not Blocked or Running)
- Exited - thread has exited but not been destroyed
Thread Scheduler Queues
The operating system's thread scheduler maintains thread kernel objects in different queues based on the state of each thread:
- Ready queue
- Waiting queues
- Exited queue
How OS Run Threads
Every 20 milliseconds or so, the operating system's thread scheduler looks at all the thread kernel objects currently in the ready queue. The scheduler selects one of them, loads the CPU's registers with the values in the thread's context, and executes the thread.
Processes and Threads
Every process requires at least one thread. A process never executes anything; it is simply a container for threads. Threads are always created in the context of some process and live their entire life within that process.
References