Chapter 4: Threads & Concurrency: Difference Between Multiprocessing and Multithreading


Benefits
 Responsiveness
 Resource Sharing
 Economy
 Scalability
Multicore and multiprocessor systems put pressure on programmers; challenges include:
 Dividing activities
 Balance
 Data splitting
 Data dependency
 Testing and debugging

Difference between Multiprocessing and Multithreading

S.NO Multiprocessing vs. Multithreading

1. In multiprocessing, CPUs are added to increase computing power. In multithreading, many threads of a single process are created to increase computing power.

2. In multiprocessing, many processes are executed simultaneously. In multithreading, many threads of a process are executed simultaneously.

3. Multiprocessing is classified into symmetric and asymmetric. Multithreading is not classified into categories.

4. In multiprocessing, process creation is time-consuming. In multithreading, thread creation is economical.

5. In multiprocessing, every process owns a separate address space. In multithreading, a common address space is shared by all the threads.

 Concurrency is about multiple tasks which start, run, and complete in overlapping time periods, in no specific order.

 Parallelism is about multiple tasks or subtasks of the same task that literally run at the same time on hardware with multiple computing resources, such as a multicore processor.

S.NO Concurrency vs. Parallelism

1. Concurrency is the task of running and managing multiple computations at the same time. Parallelism is the task of running multiple computations simultaneously.

2. Concurrency is achieved through the interleaving of operations of processes on the central processing unit (CPU), in other words by context switching. Parallelism is achieved through multiple central processing units (CPUs).

3. Concurrency can be done using a single processing unit. Parallelism cannot; it needs multiple processing units.

4. Concurrency increases the amount of work finished at a time. Parallelism improves the throughput and computational speed of the system.

5. Concurrency deals with lots of things simultaneously. Parallelism does lots of things simultaneously.

6. Concurrency is a non-deterministic control flow approach. Parallelism is a deterministic control flow approach.

7. In concurrency, debugging is very hard. In parallelism, debugging is also hard, but simpler than in concurrency.

Multicore Programming
Types of parallelism
Data parallelism – distributes subsets of the same data across multiple cores, performing the same operation on each.

Task parallelism – distributes threads across cores, each thread performing a unique operation.

Amdahl’s Law

 Identifies performance gains from adding additional cores to an application that has both serial and parallel components
 If S is the serial portion and N the number of processing cores, then speedup ≤ 1 / (S + (1 − S) / N)
 That is, if an application is 75% parallel / 25% serial, moving from 1 to 2 cores results in a speedup of 1.6 times
 As N approaches infinity, speedup approaches 1/S

User Threads and Kernel Threads


 User threads - management done by user-level
threads library
 Three primary thread libraries:
 POSIX Pthreads
 Windows threads
 Java threads
 Kernel threads - Supported by the Kernel
 Examples – virtually all general-purpose operating
systems, including:
 Windows

 Linux

 Mac OS X

 iOS

 Android

Multithreading Models
 Many-to-One = Many user-level threads mapped to
single kernel thread
 One thread blocking causes all to block
 Multiple threads may not run in parallel on multicore
system because only one may be in kernel at a time
 Few systems currently use this model
 Examples:
 Solaris Green Threads

 GNU Portable Threads

 One-to-One = Each user-level thread maps to a kernel thread
 Creating a user-level thread creates a kernel thread
 More concurrency than many-to-one
 Number of threads per process sometimes restricted
due to overhead
 Examples
 Windows
 Linux

 Many-to-Many = Allows many user-level threads to be mapped to many kernel threads
 Allows the operating system to create a sufficient
number of kernel threads
 Windows with the ThreadFiber package
 Otherwise not very common
Thread Libraries
 Thread library provides programmer with API for
creating and managing threads
 Two primary ways of implementing
 Library entirely in user space

 Kernel-level library supported by the OS

Pthreads
 May be provided either as user-level or kernel-level
 A POSIX standard (IEEE 1003.1c) API for thread
creation and synchronization
 Specification, not implementation
 API specifies behavior of the thread library;
implementation is up to the developers of the library
 Common in UNIX operating systems (Linux & Mac
OS X)

Implicit Threading
 Growing in popularity as numbers of threads
increase, program correctness more difficult with
explicit threads
 Creation and management of threads done by
compilers and run-time libraries rather than
programmers
 Five methods explored
 Thread Pools

 Fork-Join

 OpenMP

 Grand Central Dispatch

 Intel Threading Building Blocks

Thread Pools
 Create a number of threads in a pool where they
await work
 Advantages:
 Usually slightly faster to service a request with an existing thread than to create a new thread
 Allows the number of threads in the application(s) to be bound to the size of the pool
 Separating the task to be performed from the mechanics of creating the task allows different strategies for running the task
 i.e., tasks could be scheduled to run periodically

Fork-Join Parallelism
Multiple threads (tasks) are forked, and then joined.
Intel Threading Building Blocks (TBB)
 Template library for designing parallel C++ programs
 A serial version of a simple for loop
 The same for loop written using TBB with the parallel_for statement:
Threading Issues
 Semantics of fork() and exec() system calls
 Signal handling
 Synchronous and asynchronous

 Thread cancellation of target thread


 Asynchronous or deferred

 Thread-local storage
 Scheduler Activations

Semantics of fork() and exec()


 Does fork() duplicate only the calling thread or all threads?
 Some UNIXes have two versions of fork
 exec() usually works as normal – replaces the running process including all threads

Signal Handling
 Signals are used in UNIX systems to notify a
process that a particular event has occurred.
 A signal handler is used to process signals
1. Signal is generated by particular event
2. Signal is delivered to a process
3. Signal is handled by one of two signal handlers:
1. default
2. user-defined
 Every signal has default handler that kernel runs
when handling signal
 User-defined signal handler can override default
 For single-threaded, signal delivered to process
 Where should a signal be delivered for multi-threaded?
 Deliver the signal to the thread to which the signal applies
 Deliver the signal to every thread in the process
 Deliver the signal to certain threads in the process
 Assign a specific thread to receive all signals for the process

Thread Cancellation
 Terminating a thread before it has finished
 Thread to be canceled is target thread
 Two general approaches:
 Asynchronous cancellation terminates the target thread immediately
 Deferred cancellation allows the target thread to periodically check if it should be cancelled

 Pthread code to create and cancel a thread:
Operating System Examples

 Windows Threads
 Linux Threads

Windows Threads
 Windows API – primary API for Windows
applications
 Implements the one-to-one mapping, kernel-level
 Each thread contains
 A thread id
 Register set representing state of processor
 Separate user and kernel stacks for when thread runs in user mode or kernel mode
 Private data storage area used by run-time libraries and dynamic link libraries (DLLs)
 The register set, stacks, and private storage area are known as the context of the thread.

 The primary data structures of a thread include:
 ETHREAD (executive thread block) – includes pointer to process to which thread belongs and to KTHREAD, in kernel space
 KTHREAD (kernel thread block) – scheduling and synchronization info, kernel-mode stack, pointer to TEB, in kernel space
 TEB (thread environment block) – thread id, user-mode stack, thread-local storage, in user space
Windows Threads Data Structures

Linux Threads
 Linux refers to them as tasks rather than threads
 Thread creation is done through clone() system
call
 clone() allows a child task to share the address
space of the parent task (process)
 Flags control behavior
 struct task_struct points to process data
structures (shared or unique)

Chapter 7: Synchronization Examples


Classical Problems of Synchronization
 Classical problems used to test newly-proposed
synchronization schemes
 Bounded-Buffer Problem
 Readers and Writers Problem
 Dining-Philosophers Problem

Bounded-Buffer Problem
 n buffers, each can hold one item
 Semaphore mutex initialized to the value 1
 Semaphore full initialized to the value 0
 Semaphore empty initialized to the value n
 The structure of the producer process

while (true) {
...
/* produce an item in next_produced */
...
wait(empty);
wait(mutex);
...
/* add next produced to the buffer */
...
signal(mutex);
signal(full);
}

 The structure of the consumer process

while (true) {
wait(full);
wait(mutex);
...
/* remove an item from buffer to next_consumed */
...
signal(mutex);
signal(empty);
...
/* consume the item in next consumed */
...
}
Readers-Writers Problem
 A data set is shared among a number of concurrent
processes
 Readers – only read the data set; they do not

perform any updates


 Writers – can both read and write
 Problem – allow multiple readers to read at the same
time
 Only one single writer can access the shared

data at the same time


 Several variations of how readers and writers are
considered – all involve some form of priorities

 Shared Data
 Data set
 Semaphore rw_mutex initialized to 1
 Semaphore mutex initialized to 1
 Integer read_count initialized to 0
 The structure of a writer process

while (true) {
wait(rw_mutex);
...
/* writing is performed */
...
signal(rw_mutex);
}
 The structure of a reader process

while (true){
wait(mutex);
read_count++;
if (read_count == 1) /* first reader */
wait(rw_mutex);
signal(mutex);
...
/* reading is performed */
...
wait(mutex);
read_count--;
if (read_count == 0) /* last reader */
signal(rw_mutex);
signal(mutex);
}
Readers-Writers Problem Variations
 The solution on the previous slide can result in a situation where a writer process
never writes. It is referred to as the “first readers-writers” problem.
 The “second readers-writers” problem is a variation of the first that states:
 Once a writer is ready to write, no “newly arrived reader” is allowed to read.
 Both the first and second may result in starvation, leading to even more variations
 The problem is solved on some systems by the kernel providing reader-writer locks

Dining-Philosophers Problem
 N philosophers sit at a round table with a bowl of rice in the middle.
 They spend their lives alternating between thinking and eating.
 They do not interact with their neighbors.
 Occasionally they try to pick up 2 chopsticks (one at a time) to eat from the bowl.
 They need both to eat, then release both when done.
 In the case of 5 philosophers, the shared data:
 Bowl of rice (data set)
 Semaphore chopstick[5] initialized to 1

Dining-Philosophers Problem Algorithm

 Semaphore Solution
 The structure of Philosopher i :
while (true) {
wait(chopstick[i]);
wait(chopstick[(i + 1) % 5]);

/* eat for awhile */

signal(chopstick[i]);
signal(chopstick[(i + 1) % 5]);

/* think for awhile */
}

Monitor Solution to Dining Philosophers


monitor DiningPhilosophers
{
enum { THINKING, HUNGRY, EATING } state[5];
condition self [5];

void pickup (int i) {


state[i] = HUNGRY;
test(i);
if (state[i] != EATING) self[i].wait();
}
void putdown (int i) {
state[i] = THINKING;
// test left and right neighbors
test((i + 4) % 5);
test((i + 1) % 5);
}

Solution to Dining Philosophers


void test (int i) {
if ((state[(i + 4) % 5] != EATING) &&
(state[i] == HUNGRY) &&
(state[(i + 1) % 5] != EATING) ) {
state[i] = EATING ;
self[i].signal () ;
}
}

initialization_code() {
for (int i = 0; i < 5; i++)
state[i] = THINKING;
}
}
 Each philosopher “i” invokes the operations pickup() and putdown() in
the following sequence:
 DiningPhilosophers.pickup(i);
/** EAT **/
DiningPhilosophers.putdown(i);
No deadlock, but starvation is possible

Kernel Synchronization - Windows


 Uses interrupt masks to protect access to global resources on
uniprocessor systems
 Uses spinlocks on multiprocessor systems
 A spinlocking thread will never be preempted
 Also provides dispatcher objects in user-land which may act as mutexes, semaphores, events, and timers
 Events
 An event acts much like a condition variable
 Timers notify one or more threads when the time expires
 Dispatcher objects are in either a signaled state (object available) or a non-signaled state (thread will block)

Mutex dispatcher object


Linux Synchronization
 Linux:
 Prior to kernel Version 2.6, disables interrupts to implement
short critical sections
 Version 2.6 and later, fully preemptive
 Linux provides:
 Semaphores
 Atomic integers
 Spinlocks
 Reader-writer versions of both
 On single-CPU system, spinlocks replaced by enabling and
disabling kernel preemption

Atomic variables

atomic_t is the type for atomic integer
Consider the variables

atomic_t counter;
int value;
