A thread is lot like a process, except that it shares memory with other threads.
A thread has its own program counter from which it gets the instructions to execute. Each thread has its own private set of registers it uses.
Switching from one thread to another requires a context switch, where it backs up the contents of the registers, and restores those in the other thread. As you can imagine, this is expensive. The state is saved to a thread control block. However, unlike with processes, the address space remains unchanged.
Below is a diagram of the address space of a single vs two threaded process:
AS you can see, now you have multiple stacks in the address space: One per thread. Anything a thread puts in its stack is considered thread local.
Do note that now you have the problem of the free space between the two stacks potentially running out. In practice, this is not a big problem unless you rely heavily on recursion.
Why Use Threads?
We often use threads to prevent being I/O bound. Let one thread be blocked by I/O while other threads can continue their work.
The Trouble With Threads
Consider this C program:
#include <stdio.h> #include <pthread.h> #include "common.h" #include "common_threads.h" static volatile int counter = 0; // mythread() // // Simply adds 1 to counter repeatedly, in a loop // No, this is not how you would add 10,000,000 to // a counter, but it shows the problem nicely. // void *mythread(void *arg) { printf("%s: begin\n", (char *) arg); int i; for (i = 0; i < 1e7; i++) { counter = counter + 1; } printf("%s: done\n", (char *) arg); return NULL; } // main() // // Just launches two threads (pthread_create) // and then waits for them (pthread_join) // int main(int argc, char *argv[]) { pthread_t p1, p2; printf("main: begin (counter = %d)\n", counter); Pthread_create(&p1, NULL, mythread, "A"); Pthread_create(&p2, NULL, mythread, "B"); // join waits for the threads to finish Pthread_join(p1, NULL); Pthread_join(p2, NULL); printf("main: done with both (counter = %d)\n", counter); return 0; }
This program has each thread increment counter 10 million times. So you’d expect the total to be 20 million, right?
In practice, you often get a number less than 20 million, and a different number each time.
Why?
Because the counter=counter + 1 statement is actually a number of assembly operations:
- Moving the value of counter into a register.
- Adding 1 to the register.
- Moving the value back into counter.
Between 2 and 3, the other thread may start running and clobber our update - so you get 2 threads adding 1 to the same number, resulting in a net add of 1.
This is a race condition. A race condition occurs when multiple threads enter a critical section at roughly the same time.
A critical section of code is where access to a shared resource should not occur by more than one thread at a time. We achieve this via mutual exclusion.
One approach is to make your operations/transactions atomic. This means that your set of actions should either all run without interruption, or all fail and you can never be allowed to be in an intermediate state.