CoE 135 Notes 2
CoE 135 Notes 2
CoE 135 Notes 2
, Stack o Shareable: code/text section, data section, signals and open files Multithreading o More responsive, resources sharing, economy, multi-processor utilization o Higher throughput o Disadvantages: inter-process communication, synchronization Single-threaded Model o Signal-Driven I/O Can cope up with slow I/O, but not with fast I/O Multi-threaded Model o Thread support: kernel level, user level o Thread models Many-to-One Multiple user threads -> one kernel thread Thread management done by thread library Entire process blocks if a thread blocks Unable to run in parallel on multi-processors Green threads (Solaris), Portable threads (GNU) One-to-One One user thread -> one kernel thread One thread can run even if another thread blocks Able to run in multi-processors Overhead is creating a kernel thread for every process thread Linux, Windows threads Many-to-Many Multiplexes user threads to an almost equal number of kernel threads Programs can create as many threads as necessary Able to run in multi-processors IRIX, HP-UX, Tru64 UNIX threads Variant: Two-level mode Thread libraries o Could be implemented with/without kernel support o Main thread libraries: POSIX Pthreads Win32 Java Pthreads o POSIX threads as described in POSIX:THR Threads Extension o Implements APIs (Application Programming Interfaces) for Thread creation, Termination, Synchronization, etc
o Unlike fork(), it does not copy the parents page tables POSIX Thread Management o Thread creation pthread_create() pthread_create(thread_id, attr, func_to_run, arg_to_pass); Error codes: EAGAIN lack of resource, EINVAL attr invalid, EPERM caller does not have enough info o Thread termination (similar to calling return or exit) pthread_exit() pthread_exit(ptr_to_return_value); pthread_exit(NULL) can be called to wait for other threads o Cancelling another thread pthread_cancel() pthread_cancel(thread_id); Thread can only be canceled only if it is in PTHREAD_CANCEL_ENABLE Cancellation requests are pended in PTHREAD_CANCEL_DISABLE Changed through pthread_setcancelstate(state, oldstate); o Setting thread to release resources pthread_detach() pthread_detach(thread_id); Memory resources consumed by the thread will be freed immediately when it terminates Prevents other threads from synchronizing on its termination using pthread_join o Test two threads for equality pthread_equal() pthread_equal(thread1_id, thread2_id); returns 0 if equal o Send a signal to a thread pthread_kill() pthread_kill(thread_id, signo); o Wait for a thread pthread_join() pthread_join(thread_id_youre_waiting_for, ptr_to_return_value); o Find out own thread ID pthread_self() o Returns 0 on success, non-zero on error o POSIX thread functions do not use errno and perror They use strerror_r Thread Scheduling Management get/set scheduling policy and parameters o pthread_setschedparam o pthread_getschedparam Thread Safety o Thread safe if multiple threads can execute simultaneous active invocations of the function without interference Re-entrant o Some non-thread safe functions: asctime, dirname, getenv, rand o Use of static variables: common reason for non-thread safety Thread Attributes o pthread_attr_init() o pthread_attr_destroy()
Thread Synchronization o Mutex Special variable that can either be locked/unlocked If locked, a thread holds the mutex Has a queue for the threads waiting to hold it Determined by thread-scheduling pthread_mutex_t PTHREAD_MUTEX_INITIALIZER pthread_mutex_init() pthread_mutex_destroy() pthread_mutex_lock() pthread_mutex_trylock() pthread_mutex_unlock() Voluntary
Lec 5 Scheduler Enables multiple processes to share resources o Multitasking system Scheduler policy o Algorithm that determines schedulers behaviour Scheduler Criteria o Turnaround time time between submission of a process and its completion o Response time time between submission of request and beginning of response o Deadline future time a process must complete its execution o Predictability a job should run with the same time and cost regardless of system load o Throughput amount of processes completed per unit time o Utilization percentage of time the processor is busy o Fairness processes must be treated the same, no starvation o Priorities some processes are more urgent o Balancing all resources must be busy o Waiting time sum of periods spent waiting in ready queue I/O-bound and processor-bound o Linux scheduler favors I/O-bound to increase system response Non-realtime Scheduler o Goal of scheduler is to maximize average throughout Number of tasks completed per unit time Windows and Linux schedulers are examples of this Linux 2.6 and Windows XP have mechanisms to enable real-time scheduling o Minimize average waiting time o Ideal: fast response time with high process throughput (conflicting goals) When CPU becomes idle, OS selects one of the processes in the ready queue o Carried out by the short-term scheduler o Records in the queue are process control blocks (PCBs) of the processes
Dispatcher o Module that gives control of the CPU to the process selected by the short-term scheduler o Overhead Switching context Switching to user mode Jumping to proper location to start the program Multitasking OS Scheduler o Classified according to its decision mode Decision mode instant of time when a new process is selected for execution Cooperative Pre-emptive Cooperative Multitasking o A process does not stop until it decides to suspend itself (yielding) o Scheduler has no control once the process starts running o Examples: Win 3.1, Mac OS9 o First-Come-First-Served (FCFS) also known as FIFO Max waiting time for choosing which process to run next Low overhead, large spread in response time, penalizes short and I/O bound o Shortest-Job-First (SJF) Assigned to the process that has the smallest CPU burst FCFS for tie-breaking Lowest number has highest priority Pre-emptive Multitasking o Scheduler decides when a process should start or stop o Pre-emption: involuntary suspension o Time slice: time allotted for a process before it was pre-empted o Example: 2.6.X Linux kernel o Causes of pre-emption Running state -> waiting state (e.g. I/O request) Running state -> ready state (e.g. Interrupt) Waiting state -> ready state (e.g. I/O completion) Termination o Round-robin (RR) scheduler Reduces penalty for short jobs Next process to execute is selected on a FIFO basis Hard question: What is the value of the timeslice? Small value -> frequent context switching Large value -> will degrade to FIFO Enhances fairness, slight bias against I/O bound o Pre-emptive SJF scheduler If current process does not have the shortest remaining time, it is pre-empted o Pre-emptive Priority-based scheduler If current process does not have the highest priority, it is pre-empted
Linux scheduler o 1.2 Kernel Circular queue, Round-robin o 2.2 Kernel Scheduling classes (RT tasks, non-preemptive, non-RT tasks) SMP support (Symmetric multiprocessing) o 2.4 Kernel O(N) scheduler Tasks are allotted time slice Chooses next task using a goodness of fit function Not suitable for RT applications Linux 2.6.X scheduler o O(1) algorithm Real-time response o Superseded by new algorithm called Completely Fair Scheduler Completely Fair Scheduler (CFS) o Introduced in Kernel 2.6.23 o Relies on red-black trees instead of run queues o Proposed by Con Kolivas as Rotating Staircase Deadline Scheduler (RSDL) o Later incorporated as CFS by Ingo Molnar o O(1) scheduler was difficult to manage Scheduler classes RT and Non-RT o Real-time scheduling policies: SCHED_FIFO SCHED_RR Managed by kernel/sched/rt.c Kernel space priority numbers: Highest 0, Lowest 99 o Non-realtime SCHED_OTHER SCHED_NORMAL Each process is assigned a priority Lower number, higher priority User-space: priority numbers are -20 to 19 o 0 for normal processes o nice() to change process priority number Managed by kernel/sched/fair.c Timeslice Policy o Average timeslice in Linux is 100ms Process timeslice calculation o Lower priority (less interactive): less time slice (10 msec least) o Maximum timeslice: 800msec Preemption before timeslice exhaustion o Process may suspend itself before timeslice expires Timeslice
o Calculated as a function of runnable processes o Uses nice value to weigh proportion of cpu on a process o CFS uses target latency o Min granularity 1ms under heavy load CFS Implementation o Time Accounting Each process is assigned a time slice, decremented every tick Virtual runtime actual runtime normalized by number of runnable processes o Process Selection Scheduler picks process w/smallest vruntime Uses red-black tree Add process: enqueue_entity() Remove process: dequeue_entity() o Scheduler Entry Point schedule() main entry point Decides which process to run Calls pick_next_task() CFS Data Structure Each node represented by rb_node Struct task_struct -> struct sched_entity -> struct rb_node Pick_next_task picks left-most task from RB tree o Sleeping and Waking Up 2.6.X Kernel O(1) scheduler o Every algorithm in the scheduler completes in constant time regardless of the number of tasks; O(1) o SMP scalability o Provides fairness o Good interactive performance even during high load o SMP affinity processes which run one CPU should stay in one CPU o O(1) RT scheduling, load balancing o Data structures runqueue all info on runnable processes is stored in struct rq task_struct o Priority arrays Each runqueue has 2 priority arrays Active o Contains tasks w/timeslice left Expired o Contains tasks w/o timeslice left Store priorities of processes using bitmap When active priority array becomes empty, expired priority array becomes active and the active array becomes expired No recalculation of timeslice and priority
Priority numbers: RT: 0 to 99 Non-RT: 100-139 task_rq_lock, task_rq_unlock Schedule() Picks next task to run context_switch() switches to a new task Examines bitmap from 0 to 139 Selection is done in an RR basis for processes of the same priority I/O bound or processor-bound? Linux keeps a running tab on how much time a process spends sleeping vs now much time in the runnable state sleep_avg Scheduler Macros Non-RT priorities: MAX_RT_PRIO -> MAX_PRIO-1 RT priorities: 0 -> MAX_RT_PRIO-1 MAX_USER_RT_PRIO Maximum real time priority exported to user-space MIN_TIMESLICE, MAX_TIMESLICE
p.76