CS 6200 Introduction to Operating Systems

What is an Operating System

Shop Manager Metaphor

A shop manager:

Directs operational resources: use of employee time, parts, tools
Enforces working policies: fairness, safety, cleanup
Mitigates difficulty of complex tasks: simplifies operation, optimizes performance

An operating system:

Directs operational resources: use of CPU, memory, peripheral devices
Enforces working policies: fair resource access, limits to resource usage
Mitigates difficulty of complex tasks: abstract hardware details (system calls)

Attempt do define an operating system

Hardwares (CPU, Memory, GPU, Storage, Network Card) are used by multiple applications. The operating system sits in between the underlying hardwares and the applications. It:

hides hardware complexity
manages resource (schedule tasks on CPU, manages memory usage)
provide isolation and protection (so application do not interfere with each other)

Operating System Definition

An operating system is a layer of systems software that:

directly has privileged access to the underlying hardware
hides the hardware complexity (abstraction)
manages hardware on behalf of one or more applications according to some predefined policies (arbitration)
ensures that applications are isolated and protected from one another

Operating Systems Examples

Desktop:

Microsoft Windows
Unix-based
- Mac OSX (BSD)
- Linux (Ubuntu, Centos, Fedora)

Embedded:

Android (Linux)
iOS
Symbian

Mainframe

Operating System Elements

Abstractions:
- application: process, thread
- hardware: file, socket, memory page
Mechanisms:
- application: create, schedule
- hardware: open, write, allocate
Policies (how mechanisms will be used):
- least-recently used (LRU)
- earliest deadline first (EDF)

Memory management example

Abstractions: memory page
Mechanism: allocate, map to a process
Policies: least recently used (least recently used memory page will be moved to disk, a.k.a swap)

Design Principles

Separation of mechanism and policy

Implement flexible mechanisms to support many policies, e.g. LRU, LFU, random
Optimize for common case
- Where will the OS be used?
- What will user want to execute on that machine?
- What are the workload requirements?

User/Kernel Protection Boundary

Kernel level: OS Kernel, privileged mode, direct hardware access
User level: unprivileged mode, applications

user-kernel switch is supported by hardware:

trap instructions
system call: open (file), send (socket), mmap (memory)
signals

System Call Flowchart

Flow of synchronous mode: wait until the sys call completes

(user process) User process executing
(user process) Call system call
(kernel) Trap
(kernel) Execute system call
(kernel) Return
(user process) Return from system call

To make a system call an application must:

write arguments
save relevant data at well-defined location
make system call

Crossing the User/Kernel Protection Boundary

User/Kernel Transitions

hardware supported: e.g. traps on illegal instructions or memory accesses requiring special privilege
involves a number of instructions: e.g. 50-100ns on a 2Ghz machine running linux
switches locality: affects hardware cache!

Note on cache

Because context switches will swap the data/addresses currently in cache, the performance of applications can benefit or suffer based on how a context switch changes what is in cache at the time they are accessing it.

A cache would be considered hot if an application is accessing the cache when it contains the data/addresses it needs.

Likewise, a cache would be considered cold if an application is accessing the cache when it does not contain the data/addresses it needs – forcing it to retrieve data/addresses from main memory.

Basic OS Services

Scheduler: access to CPU
Memory manager
Block device driver: e.g. disk
File system

Windows vs Unix

Process Control:
- Windows: CreateProcess, ExitProcess, WaitForSingleObject
- Unix: fork, exit, wait
File Manipulation
- Windows: CreateFile, ReadFile, WriteFile, CloseHandle
- Unix: open, read, write, close
Device Manipulation
- Windows: SetConsoleMode, ReadConsole, WriteConsole
- Unix: ioctl, read, write
Information Maintenance
- Windows: GetCurrentProcessID, SetTime, Sleep
- Unix: getpid, alarm, sleep
Communication
- Windows: CreatePipe, CreateFileMapping, MapViewofFile
- Unix: pipe, shmget, mmap
Protection
- Windows: SetFileSecurity, InitializeSecurityDescriptor, SetSecurityDescriptorGroup
- Unix: chmod, umask, chown

Linux System Calls Table

Monolithic OS

Historically, OS had a monolithic design. Every possible service any application can require or hardware will make are already included.

+ everything included
+ inlining, compile-time optimization
- customization, portability, manageability
- memory footprint
- performance

Modular OS

e.g. Linux

OS specifies interface for any module to be included in the OS. Modules can be installed dynamically.

+ maintainability
+ smaller footprint
+ less resource needs
- indirection can impact performance
- maintenance can still be an issue

Microkernel

Only require the most basic primitives at OS level. File system, disk driver, etc will run at unprivileged application level.

+ size
+ verifiability
- portability
- complexity of software development
- cost of user/kernel crossing (more frequent)

Processes and Process Management

Simple Process Definition:

Instance of an executing program
Synonymous with “task” or “job”

Toy Shop Metaphor

An order of toys:

state of execution: completed toys, waiting to be built
parts and temporary holding area: plastic pieces, containers
may require special hardware: sewing machine, glue gun

A process:

state of execution: program counter, stack
parts and temporary holding area: data, register state occupies state in memory
may require special hardware: I/O devices

What is a process

OS manages hardware on behalf of applications. Applications are programs on disk, flash memory, etc. They are static entities. Once an application is loaded, it becomes an active entity and is called a process. Process represents the execution state of an active entity.

What does a process look like

Process encapsulate all states from an application, e.g. text, data, heap, stack. Every state has to be uniquely identified by its address. Address is defined as a range of address from $V_0$ to $V_\text{max}$.

Types of state:

text and data: static state when process first loads
heap: dynamically created during execution
stack: grows and shrinks (LIFO queue)

What have been said above regarding the encapsulation can be called the address space. It is the “in memory” representation of a process. We call $V_0$ to $V_\text{max}$ used by the process virtual addresses, in contrast with physical addresses which correspond to locations in physical memory (e.g. DRAM). A Page table created by the OS maintains a mapping of virtual to physical addresses.

32bits: Virtual address space up to 4GB.

Some times applications can share the same physical addresses, and additional states will be swapped in and out of the disk.

How does the OS know what a process is doing

OS has to have idea of what a process is doing, so it can stop and restart the process at the same point.

Program Counter: where in the instruction sequence the current is with
CPU register: maintains program counter while the process is runner
Stack pointer: where the top of the stack is. It’s important because stack is LIFO.

What is a Process Control Block (PCB)

A data structure that the operating system maintains for every one of the processes that it manages.

process state, process number, program counter, registers, memory limits, list of open files, priority, signal mask, CPU scheduling info
PCB created when process is created
Certain fields are updated when process state changes
Other fields change too frequently (e.g. registers)

What is a context switch

A context switch is the mechanism used by the operating system to switch the CPU from the context of one process to the context of another.

They are expensive:

direct costs: number of cycles for load and store instructions
indirect costs: cold cache! cache misses

Process Lifecycle

States of a process:

New: when a process is created. OS will perform admission control, and create PCB.
Ready: process is admitted, and ready to start executing. It isn’t actually running on the CPU and has to wait until the scheduler is ready to move it into a running state when it schedules it on the CPU.
Running: ready state process is dispatched by the scheduler. It can be interuppted so that a context switch can be performed. This will move running process back into the ready state.
Waiting: when a longer operation is initiated (e.g. reading data from disk, waiting for keyboard input), process enters waiting state. When the operation is complete, it will be moved back to ready state.
Terminated: when all instructions are complete or errors are encountered, process will exit.

Process creation

Mechanism for process creation

fork: copies the parent PCB into new child PCB. Child continues execution at instruction after fork.
EXEC: replace child image. Load new program and start from first instruction.

Note:
The parent of all processes: init

What is the role of the CPU scheduler

A CPU scheduler determines which one of the currently ready processes will be dispatched to the CPU to start running, and how long it should run for.

OS must be able to do the following tasks efficiently:

preempt: interrupt and save current context
schedule: run scheduler to choose next process
dispatch: dispatch process and switch into its context

How long should a process run for/ How frequently should we run the scheduler

Longer the process runs means less often we are invoking the task scheduler. Consider the following case:

running process for amount of time $T_p$
schedular takes amount of $T_\text{sched}$

$$\text{Useful CPU Work} = \frac{\text{Total Processing Time}}{\text{Total time}} = \frac{T_p}{T_p + T_\text{sched}}$$

Can Processes Interact

An operating system must provide mechanisms to allow processes to interact with each other. Many applications we see today are structured as multiple processes.

Inter Process Communication (IPC) mechanisms:

transfer data/info between address spaces
maintain protection and isolation
provide flexibility and performance

Message-passing IPC

OS provides communication channel, like shared buffer
Processes write/read messages to.from channel
+ OS manages
- overheads: everything needs to go through the OS

Shared memory IPC

OS establishes a shared channel and maps it into each process address space
Processes directly read/write from this memory
+: OS is out of the way
-: No shared and well defined API because OS is not involved. Can be more error-prone, and developers may need to re-implement code. Mapping memory between processes can also be expensive.

Threads and Concurrency

Toy shop worker metaphor

A worker in a toy shop:

is an active entity: executing unit of toy order
works simultaneously with others: many workers completing toy orders
requires coordination: sharing of tools, parts, workstations

A thread:

is an active entity: executing unit of a process
works simultaneously with others: many threads executing
requires coordination: sharing of I/O devices, CPUs, memory

Process vs Thread

Threads are part of the same virtual address space. Each thread will have a separate program counter, registers, and stack pointer.

Why are threads useful

Each thread can execute the same code but for different portions of the input. Parallelization helps to speed up processes.
Threads can execute different parts of the program.
Specialization with multiple cores: hotter cache.
Efficiency: lower memory requirement and cheaper IPC.

Are threads useful on a single CPU (or # of threads > # of CPUs)

If $t_\text{idle} > 2 t_\text{context switch}$, then context switch to hide idling time.
When switching between threads, it is not necessary to recreate virtual to physical memory maps (the most costly step). Therefore, $t_\text{thread context switch} << t_\text{process context switch}$.
Multithreading is specially useful to high latency in IO operations.

Benefits to applications and OS code

Multithreaded OS kernel

threads working on behalf of apps
OS-level services like daemons or device drivers

What do we need to support threads?

thread data structure: identify threads, keep track of resource usage
mechanisms to create and manage threads
mechanisms to safely coordinate among threads running concurrently in the same address space

Threads and concurrency

Processes: Operating in own address spaces. Opearting system prevents overwriting physical addresses.

Threads: They both use the same virtual and physical addresses. Multiple threads can attempt to access the same data at the same time.

Concurrency Control and Coordination

Synchronization Mechanisms

Mutual Exclusion
- exclusive access to only one thread at a time
- mutex
Waiting on other threads
- specific condition before proceeding
- condition variable
Waking up other threads from wait state.

Threads and Thread Creation

Thread type: thread data structure (threadId, PC, SP, registers, stack, attributes)
Fork(proc, args): create a thread, not UNIX fork. t1 = fork(proc, args) creates a new thread.
Join(thread): terminate a thread.

Thread Creation Example

Thread thread1;
Shared_list list;
thread1 = fork(safe_insert, 4);
safe_insert(6);
join(thread1); // Optional, blocks parent thread

Result can be:

4 $\rightarrow$ 6 $\rightarrow$ NIL
6 $\rightarrow$ 4 $\rightarrow$ NIL

How is the list updated?

create new list element e
set e.value = X
read list and list.p_next
set e.point = list.p_next
set list.p_next = e

Mutual Exclusion

Mutex data structure:

whether it’s locked
owner of mutex
blocked threads

Birrell’s Lock API

lock(m) {
  // critical section
} // unlock;

Common Thread API

lock(m);
// critical section
unlock(m);

Making safe_insert safe

list<int> my_list;
Mutex m;
void safe_insert(int i) {
  lock(m) {
    my_list.insert(i);
  } // unlock;
}

Producer/Consumer Example

What if the processing you wish to perform with mutual exclusion needs to occur only under certain conditions?

//main
for i=0..10
   producers[i] = fork(safe_insert, NULL) //create producers
consumer = fork(print_and_clear, my_list) //create consumer

// producers: safe_insert
lock(m) {
  list->insert(my_thread_id)
} // unlock;

// consumer: print_and_clear
lock(m) {
  if my_list.full -> print; clear up to limit of elements of list (WASTEFUL!)
  else -> release lock and try again later
} // unlock;

Condition Variable

// consumer: print and clear
lock(m) {
  while (my_list.not_full()) Wait(m, list_full);
  my_list.print_and_remove_all();
}

// producers: safe_insert
lock(m) {
  my_list.insert(my_thread_id);
  if my_list.full()
    Signal(list_full)
}

Condition Variable API

Wait(mutex, cond): mutex is automatically released and re-acquired on wait
Signal(cond): notify only one thread waiting on condition
Broadcast(cond): notify all waiting threads

Reader/Writer Problem

Readers: 0 or more can access
Writer: 0 or 1 can access

if read_counter == 0 and write_counter == 0:
  read okay, write okay
if read_counter > 0:
  read okay
if write_counter > 0:
  read no, write no

State of shared file/resource:

free: resource.counter = 0
reading: resource.counter > 0
writing: resource.counter = -1

Reader/Writer Example

Mutex counter_mutex;
Condition read_phase, write_phase;
int resource_counter = 0;

READERS

Lock(counter_mutex) {
  while (resource_counter == -1)
    Wait(counter_mutex, read_phase);
  resource_counter++;
} // unlock;

// ... read data ...

Lock(counter_mutex) {
  resource_counter--;
  if (resource_counter == 0)
    Signal(write_phase)
} // unlock;

WRITER

Lock(counter_mutex) {
  while (resource_counter != 0)
    Wait(counter_mutex, write_phase);
  resource_counter = -1;
} // unlock;

// ... write data ...

Lock(counter_mutex) {
  resource_counter = 0;
  Broadcast(read_phase);
  Broadcast(write_phase);
}

Critical Sections in Readers/Writer Example

Critical Section

Critical Section Structure with Proxy Variable

Critical Section Proxy Variable

Avoiding Common Mistakes

Keep track of mutex/conditional variables used with a resource
- e.g. mutex_type m1; //mutex for file 1
Check that you are always (and correctly) using lock and unlock
- e.g. did you forget to lock/unlock? What about compilers?
Use a single mutex to access a single resource!
- e.g. read and write of the same file has 2 mutexs
Check that you are signaling correct condition
Check that you are not using signal when broadcast is needed
- signal: only 1 thread will proceed… remaining threads will continue to wait… possibly indefinitely!
Ask yourself: do you need priority guarantees?
- thread execution order not controlled by signals to condition variables!

Spurious Wake Ups

Spurious Wake-Ups
Unlock Before Broadcast

Deadlocks

Definition:

Two or more competing threads are waiting on each other to complete, but none of them ever do.

To avoid deadlocks: Maintain lock order across threads, so there is no dead cycles.

In summary:

A cycle in the wait graph is necessary and sufficient for a deadlock to occur. (edges from thread waiting on a resource to thread owning a resource)

What can we do about it?

Deadlock prevention (EXPENSIVE)
Deadlock detection & recovery (ROLLBACK)
Apply the ostrich Algorithm (DO NOTHING!) => if all else fails… just REBOOT

Kernel vs User-Level Threads

One to One Model

One to One

+ OS sees/understands threads, synchronization, blocking
-
- Must go to OS for all operations (may be expensive)
- OS may have limits on policies, thread #
- portability

Many to One model

Many to One

+ totally portable, doesn’t depend on OS limits and policies
-
- OS has no insights into application needs
- OS may block entire process if one user-level thread blocks on I/O

Many to Many Model

Many to Many

+
- can be best of both worlds
- can have bound or unbound threads
-
- requires coordination between user and kernel level thread managers

Scope

System Scope: System-wide thread management by OS-level thread managers (e.g. CPU scheduler)
Process Scope: User-level library manages threads within a single process

Multithreading Patterns

Toy shop example: for each wooden toy order, we:

accept the order
parse the order
cut wooden parts
paint and add decorations
assemble the wooden toys
ship the order

Boss/Workers Pattern

Boss/Workers

Boss-Workers:

boss: assigns work to workers (step 1)
workers: performs entire task (step 2-6)

Throughput of the system limited by boss thread => must keep boss efficient
$$\text{Throughput} = \frac{1}{\text{boss time per order}}$$

Boss assigns work by

directly signalling specific worker
- + workders don’t need to synchronize
- - boss must track what each worker is doing, and throughput will go down
placing work in producer/consumer queue (more often used)
- + boss doesn’t need to know details about workers
- - queue synchronization: workers need to synchronize access to the queue, etc

How many workers

on demand
pool of workers
- static or dynamic

Boss-Workers:

boss: assigns work to workers (step 1)
workers: performs entire task (step 2-6)
placing work in producer/consumer queue (more often used)
pool of workers
+ simplicity
- thread pool management
- locality

Boss-Workers Variants:

all workders created equal vs workers specialized for certain tasks
+ better locality (hotter cache); quality of service management
- load balancing

Pipeline Pattern

Pipeline Patter

Pipeline:

threads assigned one subtask in the system
entire tasks == pipeline of threads
multiple tasks concurrently in the system, in different pipeline stages
throughput == weakest link => pipeline stage == thread pool
shared-buffer based communication between stages

In summary:

sequence of stages
stage == subtask
each stage == thread pool
buffer-based communication
+ specialization and locality
- balancing and synchronization overheads

Layered Pattern

Layered:

each layer group of related subtasks
end-to-end task must pass up and down through all layers
+ specialization and locality, but less fine-grained than pipeline
- not suitable for all applications
- synchronization

Time to finish orders

Process Time

Boss-Worker:

$$ \text{Time to Finish 1 Order} \times \text{Ceiling}(\text{num orders}/\text{num threads}) $$

Pipeline:

$$ \text{time to finish first order} + (\text{remaining orders} \times \text{time to finish last stage})$$

Threads Case Study: PThreads

PThreads: POSIX Threads
POSIX: Portable Operating System Interface

Pthread creation

Thread
```
pthread_t aThread; //type of thread
```

Fork(proc, args)

int pthread_create(
  pthread_t *thread,
  const pthread_attr_t *attr,
  void * (*start_routine)(void *),
  void *arg
);

Join(thread)

int pthread_join(pthread_t thread, void **status)

Pthread Attributes

pthread_attr_t
- specified in pthread_create
- defines features of the new thread
- has default behavior with NULL in pthread_create
attributes
- stack size
- inheritance
- joinable
- scheduling policy
- priority
- system/process scope

methods

int pthread_attr_init(pthread_attr_t *attr);
int pthread_attr_destroy(pthread_attr_t *attr);
pthread_attr_{set/get}{attribute}

Detaching Pthread

Detaching Pthreads

int pthread_detach();

Example

#include <stdio.h>
# include <pthread.h>

void *foo (void *arg) { /* thread main */
  printf("Foobar!\n");
  pthread_exit(NULL);
}

int main (void) {

  int i;
  pthread_t tid;

  pthread_attr_t attr;
  pthread_attr_init(&attr); // required!
  pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
  pthread_attr_setscope(&attr, PTHREAD_SCOPE_SYSTEM); // share resource with all other threads in the system
  pthread_create(&tid, &attr, foo, NULL);

  return 0;

}

Compiling Pthreads

#include <pthread.h> in main file

compile source with -lpthread or -pthread

gcc -o main main.c -lpthread
gcc -o main main.c -pthread

check return values of common functions

Quiz 1

#include <stdio.h>
#include <pthread.h>

#define NUM_THREADS 4

void *hello (void *arg) { /* thread main */
  printf("Hello Thread\n");
  return 0;
}

int main (void) {
  int i;
  pthread_t tid[NUM_THREADS];

  for (i = 0; i < NUM_THREADS; i++) { /* create/fork threads */
    pthread_create(&tid[i], NULL, hello, NULL);
  }

  for (i = 0; i < NUM_THREADS; i++) { /* wait/join threads */
    pthread_join(tid[i], NULL);
  }
  return 0;
}

Quiz 2

/* A Slightly Less Simple PThreads Example */

#include <stdio.h>
#include <pthread.h>

#define NUM_THREADS 4

void *threadFunc(void *pArg) { /* thread main */
  int *p = (int*)pArg;
  int myNum = *p;
  printf("Thread number %d\n", myNum);
  return 0;
}

int main(void) {
  int i;
  pthread_t tid[NUM_THREADS];

  for(i = 0; i < NUM_THREADS; i++) { /* create/fork threads */
    pthread_create(&tid[i], NULL, threadFunc, &i);
  }

  for(i = 0; i < NUM_THREADS; i++) { /* wait/join threads */
    pthread_join(tid[i], NULL);
  }
  return 0;
}

i is defined in main => it’s globally visible variable
when it changes in one thread => all other threads see new value

Quiz 3

/* PThread Creation Quiz 3 */

#include <stdio.h>
#include <pthread.h>

#define NUM_THREADS 4

void *threadFunc(void *pArg) { /* thread main */
  int myNum = *((int*)pArg);
  printf("Thread number %d\n", myNum);
  return 0;
}

int main(void) {

  int i;
  int tNum[NUM_THREADS];
  pthread_t tid[NUM_THREADS];

  for(i = 0; i < NUM_THREADS; i++) { /* create/fork threads */
    tNum[i] = i;
    pthread_create(&tid[i], NULL, threadFunc, &tNum[i]);
  }

  for(i = 0; i < NUM_THREADS; i++) { /* wait/join threads */
    pthread_join(tid[i], NULL);
  }

  return 0;
}

Pthread Mutexes

“to solve mutual exclusion problems among concurrent threads”

mutex
```
pthread_mutext_t aMutex; // mutex type
```

Lock(mutex)

// explicit lock
int pthread_mutex_lock(pthread_mutex_t *mutex);
// run critical section
// explicit unlock
int pthread_mutex_unlock(pthread_mutex_t *mutex);

safe insert example

list<int> my_list;
pthread_mutex_t m;
void safe_insert(int i) {
  pthread_mutex_lock(m);
  my_list.insert(i);
  pthread_mutex_unlock(m);
}

Other Mutex Operations

Attributes

int pthread_mutex_init(pthread_mutex_t *mutex, const pthread_mutexattr_t *attr);
// mutex attributes == specifies mutex behavior when a mutex is shared among processes

Try Lock

int pthread_mutex_trylock(pthread_mutex_t *mutex);

Destroy

int pthread_mutex_destroy(pthread_mutex_t *mutex);

Mutex Safety Tips

Shared data should always be accessed through a single mutex
mutex scope must be visible to all
globally order locks
- for all threads, lock mutexes in order
always unlock a mutex
- always unlock the correct mutex

Pthread Condition Variables

Condition

pthread_cond_t aCond; // type of cond variable

Wait

int pthread_cond_wait(pthread_cond_t *cond, pthread_mutex_t *mutex);

Signal

int pthread_cond_signal(pthread_cond_t *cond);

Broadcast

int pthread_cond_broadcast(pthread_cond_t *cond);

Others
- pthread_cond_init
- pthread_cond_destroy

Condition Variable Safety Tips

Do not forget to notify waiting threads
- predicate change => signal/broadcast correct condition variable
When in doubt broadcast
- but performance loss
You do not need a mutex to signal/broadcast

Producer/Consumer Example

#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>

#define BUF_SIZE 3    /* Size of shared buffer */

int buffer[BUF_SIZE];    /* shared buffer */
int add = 0;        /* place to add next element */
int rem = 0;        /* place to remove next element */
int num = 0;        /* number elements in buffer */

pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;    /* mutex lock for buffer */
pthread_cond_t c_cons = PTHREAD_COND_INITIALIZER; /* consumer waits on this cond var */
pthread_cond_t c_prod = PTHREAD_COND_INITIALIZER; /* producer waits on this cond var */

void *producer (void *param);
void *consumer (void *param);

int main(int argc, char *argv[]) {

  pthread_t tid1, tid2;  /* thread identifiers */
  int i;

  /* create the threads; may be any number, in general */
  if(pthread_create(&tid1, NULL, producer, NULL) != 0) {
    fprintf(stderr, "Unable to create producer thread\n");
    exit(1);
  }

  if(pthread_create(&tid2, NULL, consumer, NULL) != 0) {
    fprintf(stderr, "Unable to create consumer thread\n");
    exit(1);
  }

  /* wait for created thread to exit */
  pthread_join(tid1, NULL);
  pthread_join(tid2, NULL);
  printf("Parent quiting\n");

  return 0;
}

/* Produce value(s) */
void *producer(void *param) {

  int i;
  for (i=1; i<=20; i++) {

    /* Insert into buffer */
    pthread_mutex_lock (&m);
      if (num > BUF_SIZE) {
        exit(1);  /* overflow */
      }

      while (num == BUF_SIZE) {  /* block if buffer is full */
        pthread_cond_wait (&c_prod, &m);
      }

      /* if executing here, buffer not full so add element */
      buffer[add] = i;
      add = (add+1) % BUF_SIZE;
      num++;
    pthread_mutex_unlock (&m);

    pthread_cond_signal (&c_cons);
    printf ("producer: inserted %d\n", i);
    fflush (stdout);
  }

  printf("producer quiting\n");
  fflush(stdout);
  return 0;
}

/* Consume value(s); Note the consumer never terminates */
void *consumer(void *param) {

  int i;

  while(1) {

    pthread_mutex_lock (&m);
      if (num < 0) {
        exit(1);
      } /* underflow */

      while (num == 0) {  /* block if buffer empty */
        pthread_cond_wait (&c_cons, &m);
      }

      /* if executing here, buffer not empty so remove element */
      i = buffer[rem];
      rem = (rem+1) % BUF_SIZE;
      num--;
    pthread_mutex_unlock (&m);

    pthread_cond_signal (&c_prod);
    printf ("Consume value %d\n", i);  fflush(stdout);

  }
  return 0;
}

Thread Design Considerations

Kernel vs User-level Threads

Kernel-level Threading: OS Kernel maintains thread abstraction, scheduling, sync, …
User-level Threading:
- User-level library provides thread abstraction, scheduling, sync
- different process can use different threading libraries
Mappings: one to one, one to many, many to one

thread related

Hard and Light Process State

When two threads point to the same process, there are information in the process control block that are specific to each thread, and there are also information that’s shared between these two threads. We split these information into two states:

“light” process state: specific to each thread. e.g. signal mask, sys call args
“hard” process state: shared across threads belong to the same process e.g. virtual address mapping

Rationale for Multiple Data Structures

Single PCB:

large continuous data structures
- - scalability
private for each entity
- - overheads
saved and restored on each context switch
- - performance
update for any changes
- - flexibility

Multiple data structures:

smaller data structures
- + scalability
easier to share
- + overheads
on context switch only save and restore what needs to change
- + performance
update for any changes
user-level library need only update portion of the state
- + flexibility

SunOS 5.0 Threading Model

Sun Threading

User-level Thread Data Structures

Light Weight Threading

“Implementing Lightweight Threads”

not POSIX threads, but similar
thread creation => thread ID (tid)
tid => index into table of pointers
table pointers point to per thread data structure
stack growth can be dangerous! As OS do not have knowledge of multiple threads, as stack of one thread grows, it can overwrite data for another thread. Debug is hard as cause error in one thread is caused by another thread.
- solution => red zone (portion of virtual address space that’s not allocated. When thread grows large, it will write to red zone, which will in turn cause a fault. Now it is easier to reason about the fault because the problem is caused directly by the same source.)

Kernel-level Data Structures

Process

list of kernel-level threads
virtual address space
user credentials
signal handlers

Light-Weight Process (LWP): similar to ULT, but visible to kernel. Not needed when process is not running

user level registers
system call args
resource usage info
signal mask

Kernel-level Threads: information needed even when process not running => not swappable

kernel-level registers
stack pointer
scheduling info (class, …)
pointers to associated LWP process, CPU structures

CPU

current thread
list of kernel-level thread
dispatching and interrupt handling information

Sun demo

Basic Thread Management Interactions

User-level library does not know what is happening in the kernel
Kernel does not know what is happening in user-level library

=> System calls and special signals allow kernel and ULT library to interact and coordinate

Also see: pthread_setconcurrency()

Lack of Thread Management Visibility

User-Level library Sees:

ULTs
available KLTs

Kernel sees:

KLTs
CPUs
KL scheduler

User level thread has a lock => kernel preempted the thread => none of other user level threads can proceed => kernel cycle through threads, and complete the thread with lock

Many to Many:

UL Scheduling decisions
change ULT-KLT mapping

Invisible to Kernel:

mutex variable
wait queues

Problem: Visibility of state and decisions between kernel and user level library; one to one helps address some of these issues

How/When does the UL library run>

Process jumps to UL library scheduler when:

ULTs explicitly yield
timer set by UL library expires
ULTs call library functions like lock/unlock
blocked threads become runnable

ULT library scheduler

runs on ULT operations
runs on signals from timer or kernel

Issues on multiple CPUs

multiple cpu

spin vs block

Destroying Threads

Instead of destroying… reuse threads
When a thread exits
- put on a death row
- periodically destroyed by reaper thread
- otherwise thread structures/stacks are reused => performance gains

Interrupts vs Signals

Interrupts

events generated externally by components other than the CPU (I/O devices, timers, other CPUs)
determined based on the physical platform
appear asynchronously
external notification to notify CPU some event has occurred

Signals

events triggered by the CPU and software running on it
determined based on the operating system
appear synchronously or asynchronously

Interrupts and signals

have a unique ID depending on the hardware or OS
can be masked and disabled/suspended via corresponding mask
- per-CPU interrupt mask, per-process signal mask
If enabled, trigger corresponding handler
- interrupt handler set for entire system by OS
- signal handlers set on per process basis, by process

Visual Metaphor

visual metaphor

Toyshop example:

hanlded in specific ways
- safety protocols, hazard plans
can be ignored
- continue working
expected or unexpected
- happen regularly or irregularly

Interrupt and signal:

hanlded in specific ways
- interrupt and signal handlers
can be ignored
- interrupt/signal mask
expected or unexpected
- appear synchronously or asynchronously

Interrupts

MSI: message signal interruptor

Interrupts

Signals

SIGNALS

Handlers/Actions

Default Actions
- Terminate, Ignore, Terminate and Core Dump, Stop or Continue
Process Installs Handler
- signal(), sigaction()
- for most signals, some cannot be caught

Synchronous Signals

SIGSEGV (access to protected memory)
SIGFPE (divide by zero)
SIGKILL (kill, id) can be directed to a specific thread

Asynchronous Signals

SIGKILL (kill)
SIGALARM

Why disable interrupts or signals?

Sometimes interrupts/signals can create deadlocks.

interrupt deadlocks

keep handler code simple => too restrictive
control interruptions by handler code => use interrupt/signal masks
- mask is a sequences of 0s and 1s, e.g. 0 0 1 1 1 0 0 1 1 0 ...: 0 - disabled, 1 - enabled

Interrupts on Multicore Systems

Interrupts can be directed to any CPU that has them enabled
may set interrupt on just a single core => avoids overheads and perturbations on all other cores

Types of signals

One-shot signals

“n signals pending == 1 signal pending”: signal will be handled at least once
must be explicitly re-enabled; e.g. the handler has been invoked by signal. Future instances of the signal will be handled by OS default, or ignored.

Real Time Signals

“if n signals raised, then handler is called n times”

POSIX signals

SIGINT terminal interrupt signal
SIGURG high bandwidth data is available on a socket
SIGTTOU background process attempting write
SIGXFSZ file size limit exceeded

Handling Interrupts as Threads

← Build Your Own Jira at Home Python Package Development Best Practices - Managing Environments →

Intro to OS

CS 6200 Introduction to Operating Systems

What is an Operating System

Shop Manager Metaphor

Attempt do define an operating system

Operating System Definition

Operating Systems Examples

Operating System Elements

Memory management example

Design Principles

User/Kernel Protection Boundary

System Call Flowchart

Crossing the User/Kernel Protection Boundary

Note on cache

Basic OS Services

Monolithic OS

Modular OS

Microkernel

Processes and Process Management

Toy Shop Metaphor

What is a process

What does a process look like

How does the OS know what a process is doing

What is a Process Control Block (PCB)

What is a context switch

Process Lifecycle

Process creation

What is the role of the CPU scheduler

How long should a process run for/ How frequently should we run the scheduler

Can Processes Interact

Message-passing IPC

Shared memory IPC

Threads and Concurrency

Toy shop worker metaphor

Process vs Thread

Why are threads useful

Are threads useful on a single CPU (or # of threads > # of CPUs)

Benefits to applications and OS code

What do we need to support threads?

Threads and concurrency

Concurrency Control and Coordination

Threads and Thread Creation

Thread Creation Example

Mutual Exclusion

Making safe_insert safe

Producer/Consumer Example

Condition Variable

Condition Variable API

Reader/Writer Problem

Reader/Writer Example

Critical Sections in Readers/Writer Example

Critical Section Structure with Proxy Variable

Avoiding Common Mistakes

Spurious Wake Ups

Deadlocks

Kernel vs User-Level Threads

One to One Model

Many to One model

Many to Many Model

Scope

Multithreading Patterns

Boss/Workers Pattern

Pipeline Pattern

Layered Pattern

Time to finish orders

Threads Case Study: PThreads

Pthread creation

Pthread Attributes

Detaching Pthread

Compiling Pthreads

Quiz 1

Quiz 2

Quiz 3

Pthread Mutexes

Other Mutex Operations

Mutex Safety Tips

Pthread Condition Variables

Condition Variable Safety Tips

Producer/Consumer Example

Thread Design Considerations