Windows Internals All Slides
Module 1: Introduction
Pavel Yosifovich
CTO, CodeValue
[email protected]
http://blogs.Microsoft.co.il/blogs/pavely
Contents
Course Objectives
Windows Versions
Tools
Summary
Course Objectives
Built into Windows
Task Manager, Resource Monitor, Performance Monitor, others
SysInternals
Obtained from http://www.sysinternals.com (which redirects to
http://technet.microsoft.com/sysinternals)
Most written by Mark Russinovich
No installation needed
Free
Debugging tools for Windows
Now part of the Windows SDK
No installation needed
Free
Demo
Process
A set of resources used to execute a program
A process consists of
A private virtual address space
An executable program, referring to an image file on disk which
contains the initial code and data to be executed
A table of handles to various kernel objects
A security context (access token), used for security checks when
accessing shared resources
One or more threads that execute code
Demo
Task Manager
Demo
Process Explorer
Threads
Thread
Entity that is scheduled by the kernel to execute code
A thread contains
The state of CPU registers
Current access mode (user mode or kernel mode)
Two stacks, one in user space and one in kernel space
A private storage area, called Thread Local Storage (TLS)
Optional security token
Optional message queue and windows the thread creates
A priority, used in thread scheduling
A state: running, ready, waiting
Demo
Threads
Virtual Memory
[Figure: virtual memory of Process A and Process B, and the disk]
Virtual Memory Layout
32 bit layout (high addresses to low addresses)
System Space: 2 GB
User Process Space: 2 GB
64 bit layout (high addresses to low addresses)
System Space: 6657 GB
Unmapped
User Process Space: 8192 GB (8 TB)
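The sizes in the layout can be checked with a little arithmetic. The following is a minimal sketch; the constant and function names are made up for illustration and are not Windows definitions. Only the sizes come from the slide.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative constants only; the names are hypothetical, the values come from the slide.
constexpr std::uint64_t kGB = 1ULL << 30;       // 1 GB in bytes
constexpr std::uint64_t kUser32 = 2 * kGB;      // 32 bit user process space
constexpr std::uint64_t kSystem32 = 2 * kGB;    // 32 bit system space
constexpr std::uint64_t kUser64 = 8192 * kGB;   // 64 bit user process space (8 TB)

// Number of address bits needed to cover `bytes` (must be a power of two).
inline int addressBits(std::uint64_t bytes) {
    assert(bytes != 0 && (bytes & (bytes - 1)) == 0);
    int bits = 0;
    while ((bytes >> bits) != 1) ++bits;
    return bits;
}
```

Under these definitions the two 32 bit regions add up to the full 4 GB (2^32 bytes) address space, and `addressBits(kUser64)` shows that 8 TB corresponds to 43 address bits.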
Demo
Virtual Memory
Objects and Handles
Windows XP Home
Designed as a replacement for the Windows 9x/ME family
(“Consumer Windows”)
Windows Professional (2000, XP, Vista, 7, 8)
Main desktop (client) OS
Windows Server Standard, Advanced, Datacenter
editions (Windows 2000, 2003/R2, 2008/R2, 2012)
Server platforms
Other variants
XP Starter, XP Home, Media Center, Server Web Edition, Home,
Premium, Ultimate, Business, Enterprise
Professional vs. Server
Windows NT 4 (4.0)
Windows 2000 (5.0)
Windows XP (5.1)
Windows Server 2003, 2003 R2 (5.2)
Windows Vista, Server 2008 (6.0)
Windows 7, Server 2008 R2 (6.1)
Windows 8, Server 2012 (6.2)
Windows 8.1, Server 2012 R2 (6.3)
These values can be obtained using GetVersionEx (Win32) or
RtlGetVersion (WDK)
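As a sketch of how the version numbers might be used once they have been obtained, the helper below maps a (major, minor) pair to the names in the list and compares versions. The function names are hypothetical and this is portable C++, not a call into GetVersionEx or RtlGetVersion.

```cpp
#include <cassert>
#include <string>

// Maps a (major, minor) pair to the names listed on the slide.
// Hypothetical helper for illustration; not a Windows API.
inline std::string windowsNameFor(unsigned major, unsigned minor) {
    if (major == 4 && minor == 0) return "Windows NT 4";
    if (major == 5) {
        if (minor == 0) return "Windows 2000";
        if (minor == 1) return "Windows XP";
        if (minor == 2) return "Windows Server 2003, 2003 R2";
    }
    if (major == 6) {
        if (minor == 0) return "Windows Vista, Server 2008";
        if (minor == 1) return "Windows 7, Server 2008 R2";
        if (minor == 2) return "Windows 8, Server 2012";
        if (minor == 3) return "Windows 8.1, Server 2012 R2";
    }
    return "Unknown";
}

// True when (major, minor) is the same as or newer than (reqMajor, reqMinor).
inline bool isAtLeast(unsigned major, unsigned minor,
                      unsigned reqMajor, unsigned reqMinor) {
    return major > reqMajor || (major == reqMajor && minor >= reqMinor);
}

// Usage example: Windows 7 reports 6.1, so it is at least Vista (6.0).
inline void versionExample() {
    assert(windowsNameFor(6, 1) == "Windows 7, Server 2008 R2");
    assert(isAtLeast(6, 1, 6, 0));
    assert(!isAtLeast(5, 2, 6, 0));
}
```

Comparing major first and minor second avoids the common mistake of testing the two fields independently.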
Demo
[Figure: architecture overview]
User Mode: System Processes, Services, User Applications, Environment Subsystem
Subsystem DLLs
NTDLL.DLL
Kernel Mode: Executive, Kernel, Device Drivers, Graphics (Win32k)
[Figure: flow of a system service call]
User mode
Kernel32.DLL: call NtReadFile, return to caller
NtDll.DLL: sysenter / syscall, return to caller
Kernel mode
NtOskrnl.EXE: call NtReadFile
NtOskrnl.EXE, NtReadFile: call driver, return to caller
driver.sys: initiate I/O, return to caller
Brief Overview of WinDbg
Although there are many Windows editions, the kernel is basically the
same
User mode processes use subsystem DLLs to access OS functionality
A system service call entails transitioning from user mode to kernel
mode (and back)
Module 4: System Architecture (Part 2)
Contents
Ntoskrnl.exe
Executive and kernel on 64 bit systems
NtKrnlPa.exe
Executive and kernel on 32 bit systems (PAE kernel)
Hal.dll
Hardware Abstraction Layer
Win32k.sys
Kernel component of the Windows subsystem
Handles windowing and GDI
NtDll.dll
System support routines and Native API dispatcher to executive services
Kernel32.dll, user32.dll, gdi32.dll, advapi32.dll
Core Windows subsystem DLLs
CSRSS.exe (“Client Server Runtime SubSystem”)
The Windows subsystem process
Demo
SMP
All CPUs are identical, share main memory, and have equal access to
peripheral devices (no master/slave)
Basic architecture supports up to 32 CPUs (32 bit) or 64 CPUs (64 bit)
Windows 7 64 bit & 2008 R2 support up to 256 logical processors
Uses a new concept of a “processor group”
Actual number of CPUs determined by licensing and product type
Multiple cores do not count towards this licensing limit
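The arithmetic behind processor groups can be sketched as follows. This assumes a group holds at most 64 logical processors (the width of a 64 bit affinity mask) and that groups are filled in order, which is a simplification; the type and function names are hypothetical.

```cpp
#include <cassert>

// Assumption: a processor group holds at most 64 logical processors,
// matching the width of a 64 bit affinity mask.
constexpr unsigned kMaxPerGroup = 64;

// Hypothetical pair: which group a processor is in, and its number inside it.
struct ProcessorLocation {
    unsigned group;
    unsigned number;
};

// Number of groups needed for `cpuCount` logical processors.
inline unsigned groupsNeeded(unsigned cpuCount) {
    return (cpuCount + kMaxPerGroup - 1) / kMaxPerGroup;
}

// Splits a flat, system-wide index into (group, number within the group),
// assuming groups are filled in order.
inline ProcessorLocation locate(unsigned flatIndex) {
    return ProcessorLocation{flatIndex / kMaxPerGroup, flatIndex % kMaxPerGroup};
}

// Usage example: 256 logical processors need 4 groups.
inline void groupExample() {
    assert(groupsNeeded(256) == 4);
    assert(locate(130).group == 2 && locate(130).number == 2);
}
```

With this model, 256 logical processors fit in exactly four groups, which is why a single 64 bit mask is no longer enough to name every CPU.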
Demo
SMP
Subsystems
Implemented by NTDLL.DLL
Used by subsystem DLLs and “native” images
Undocumented interface
Lowest layer of user mode code
Contains
Various support functions
Dispatcher to kernel services
Most of them accessible using Windows API “wrappers”
Demo
Idle process
System process
Session Manager (Smss.Exe)
Windows subsystem (Csrss.Exe)
Logon process (Winlogon.Exe)
Service control manager (SCM) (Services.Exe)
Local security authentication server (Lsass.Exe)
Local session manager (Lsm.exe)
Idle Process
Services
Wow64
[Figure: Wow64 layers, from the 32 bit image down to the kernel]
32 bit NtDll.Dll
Wow64Cpu.Dll, Wow64.Dll, Wow64Win.Dll
64 bit NtDll.Dll
NtOsKrnl.Exe, Win32k.Sys
Wow64 Restrictions
Wow64
Summary
Processes
Threads
Thread scheduling
Thread synchronization
Thread pools
Jobs
Summary
Process
Creating a Process
Demo
Process internals
Threads
Creating threads
Module 6: Processes & Threads (Part 2)
Thread Stacks
Thread stacks
Thread Priorities
[Figure: priority scale with marks at 1, 4, 6, 8, 10, 13, 15, 16, 24, 31]
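The marks on the priority scale line up with the commonly documented mapping from a process priority class plus a relative thread priority to a base priority. The sketch below models that mapping in portable C++; the enum and function names are hypothetical and are not the Windows API constants.

```cpp
#include <cassert>

// Hypothetical enums; each class value is its base priority at "Normal" level.
enum class PriorityClass {
    Idle = 4, BelowNormal = 6, Normal = 8, AboveNormal = 10, High = 13, Realtime = 24
};
enum class RelativePriority {
    Idle, Lowest, BelowNormal, Normal, AboveNormal, Highest, TimeCritical
};

// Base priority (1 to 31) for a thread, given its process class and relative level.
inline int basePriority(PriorityClass cls, RelativePriority rel) {
    const bool realtime = (cls == PriorityClass::Realtime);
    int result = 0;
    if (rel == RelativePriority::Idle) {
        result = realtime ? 16 : 1;      // bottom of each range
    } else if (rel == RelativePriority::TimeCritical) {
        result = realtime ? 31 : 15;     // top of each range
    } else {
        // Lowest..Highest map to offsets -2..+2 around the class value.
        const int offset = static_cast<int>(rel) -
                           static_cast<int>(RelativePriority::Normal);
        result = static_cast<int>(cls) + offset;
    }
    assert(result >= 1 && result <= 31);
    return result;
}
```

For example, a Normal class process with a Normal thread gives 8, while the Idle and TimeCritical levels jump to the ends of the dynamic (1 to 15) or real-time (16 to 31) range.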
Demo
Thread Priorities
[Figure: a CPU with its Running thread, Ready threads arranged by priority, and Waiting threads]
Thread Scheduling (single processor)
[Figure: thread states]
Init (0)
Ready (1)
Running (2)
Standby (3)
Terminate (4)
Waiting (5)
Transition (6)
Deferred Ready (7)
Transition labels: preemption, quantum end; preempt; voluntary switch; kernel stack outswap
The Scheduler
Thread Scheduling
The Quantum
Thread Quantum
Module 7: Processes & Threads (Part 3)
Priority Boosts
#define IO_SERIAL_INCREMENT 2
#define EVENT_INCREMENT 1
#define IO_KEYBOARD_INCREMENT 6
Thread Priority Boost and Decay
[Figure: thread priority over time through the sequence Run, Wait, Run, Wait, Run. Labels: priority boost upon wait completion; preempted (before quantum end); base priority. Axes: priority, time.]
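Boost and decay can be modeled in a few lines. This is a simplified sketch, assuming the boost applies only in the dynamic range (1 to 15), is capped at 15, and decays by one level at each quantum end until the base priority is reached. The function names are hypothetical; the increments passed in would be values such as the `#define`s above.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Priority right after a wait completes with the given boost increment.
// Assumption: threads in the real-time range (16 to 31) are never boosted.
inline int boostedPriority(int base, int increment) {
    assert(base >= 1 && base <= 31 && increment >= 0);
    if (base >= 16) return base;
    return std::min(base + increment, 15);
}

// Priority at the start of each successive quantum: the boosted value first,
// then one level lower per quantum end, finishing at the base priority.
inline std::vector<int> decaySequence(int base, int increment) {
    std::vector<int> seq;
    for (int p = boostedPriority(base, increment); p > base; --p) {
        seq.push_back(p);
    }
    seq.push_back(base);
    return seq;
}
```

With a base priority of 8 and an increment of 2, the sequence is 10, 9, 8: the thread gets one quantum at each elevated level before settling back at its base.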
Foreground Process Wait Boost
Foreground process
The process that contains the thread that owns (and created) the
foreground window
After a thread running in the foreground process completes a wait
on a kernel object
It receives a boost equal to the value set in the registry for
foreground priority boost
+2 by default
GUI Thread Wakeup
Priority Inversion
High-priority thread waits on something locked by a lower priority thread
which can’t run because of a middle priority thread running
Boosts thread to avoid priority inversion
Threads staying in ready state a long time (four seconds) get a big boost to
priority 15
Get to run for 3 quantums at this special boost
Then priority drops to base
Technically, starvation avoidance
Implemented by the balance set manager
Scans at most 16 threads per pass
Boosts at most 10 threads per pass
[Figure: priority inversion example with a priority 12 thread in Wait, a priority 7 thread in Run, and a priority 4 thread in Ready]
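The starvation-avoidance rules above can be sketched as one pass over a list of ready threads. This is a toy model, not the balance set manager's actual code: the struct, its fields, and the function names are hypothetical, and only the limits (four seconds, priority 15, 3 quantums, 16 scanned, 10 boosted) come from the slide.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical bookkeeping for one thread in the ready state.
struct ReadyThread {
    int basePriority;
    int priority;
    double secondsReady;      // how long the thread has been waiting to run
    int boostQuantumsLeft;    // quantums remaining at the special boost
};

// One simplified pass: scans at most 16 threads, boosts at most 10.
// Returns how many threads were boosted.
inline int starvationPass(std::vector<ReadyThread>& ready) {
    const std::size_t kMaxScanned = 16;
    const int kMaxBoosted = 10;
    int boosted = 0;
    for (std::size_t i = 0;
         i < ready.size() && i < kMaxScanned && boosted < kMaxBoosted; ++i) {
        ReadyThread& t = ready[i];
        if (t.secondsReady >= 4.0 && t.priority < 15) {
            t.priority = 15;
            t.boostQuantumsLeft = 3;
            ++boosted;
        }
    }
    return boosted;
}

// Called when a thread finishes a quantum; after the third boosted quantum
// the priority drops straight back to base.
inline void onQuantumEnd(ReadyThread& t) {
    assert(t.boostQuantumsLeft >= 0);
    if (t.boostQuantumsLeft > 0 && --t.boostQuantumsLeft == 0) {
        t.priority = t.basePriority;
    }
}
```

Capping the work per pass keeps the scan cheap; threads that are missed in one pass are simply candidates again in the next.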
Demo
Priority boosts
Multiprocessing - Soft affinity
Ideal Processor
Every thread has an ideal processor
Default value set in round-robin within each process
A random starting value
Can override with SetThreadIdealProcessor
On hyper-threaded systems, the next ideal processor selected is from the
next physical core (not logical)
Multiprocessing - Hard Affinity
Threads can run on any CPU unless hard affinity is set for that thread
SetThreadAffinityMask
The mask is a bit mask of allowed CPUs to run that thread
Default is process affinity mask, which defaults to all processors
Calling SetProcessAffinityMask changes the affinity mask for all threads
running under that process
And for threads created in that process in the future
Using hard affinity may result in threads getting less CPU time
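An affinity mask is plain bit arithmetic, which the sketch below illustrates in portable C++. The type alias and helper names are hypothetical and do not call SetThreadAffinityMask; the subset check reflects the documented requirement that a thread's mask must lie within its process's mask.

```cpp
#include <cassert>
#include <cstdint>

// One bit per CPU: bit n set means the thread may run on CPU n.
using AffinityMask = std::uint64_t;

// Mask with only the given CPU allowed.
inline AffinityMask maskFor(unsigned cpu) {
    assert(cpu < 64);
    return AffinityMask(1) << cpu;
}

// True when the mask allows the given CPU.
inline bool canRunOn(AffinityMask mask, unsigned cpu) {
    return cpu < 64 && (mask & (AffinityMask(1) << cpu)) != 0;
}

// A thread mask is usable only if it is non-empty and a subset of the process mask.
inline bool isValidThreadMask(AffinityMask threadMask, AffinityMask processMask) {
    return threadMask != 0 && (threadMask & ~processMask) == 0;
}
```

For instance, `maskFor(0) | maskFor(2)` yields the value 5, which allows CPUs 0 and 2 only. The narrower the mask, the more likely the thread sits in a ready queue while other CPUs are idle, which is the cost the last bullet refers to.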
Multiprocessor Scheduling
[Flowchart: choosing a CPU for a thread]
Is there an idle CPU?
If yes: is ideal CPU idle? Use it
Otherwise: is previous CPU idle? Use it
Otherwise: is current CPU idle? Use it
Otherwise: find and use first numbered idle CPU
If no: is ideal CPU running lower priority thread? Use it
Otherwise: add it to the ideal CPU’s ready queue
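The decision flow for multiprocessor scheduling can be written as a single function. This is a sketch of the flowchart only, assuming "first numbered" means the lowest-numbered idle CPU; the structs and the function name are hypothetical and do not mirror kernel data structures.

```cpp
#include <cassert>
#include <vector>

// Hypothetical per-CPU state; runningPriority is ignored when idle is true.
struct Cpu {
    bool idle;
    int runningPriority;
};

// Result: the chosen CPU, and whether the thread is queued there instead of run.
struct Placement {
    int cpu;
    bool queued;
};

inline Placement chooseCpu(const std::vector<Cpu>& cpus, int ideal, int previous,
                           int current, int threadPriority) {
    const int count = static_cast<int>(cpus.size());
    assert(count > 0);
    assert(ideal >= 0 && ideal < count);
    assert(previous >= 0 && previous < count);
    assert(current >= 0 && current < count);

    // Assumption: "first numbered" is read as the lowest-numbered idle CPU.
    int firstIdle = -1;
    for (int i = 0; i < count; ++i) {
        if (cpus[i].idle) { firstIdle = i; break; }
    }

    if (firstIdle < 0) {
        // No idle CPU: preempt on the ideal CPU if it runs a lower priority
        // thread, otherwise wait in the ideal CPU's ready queue.
        const bool preempt = cpus[ideal].runningPriority < threadPriority;
        return Placement{ideal, !preempt};
    }
    if (cpus[ideal].idle) return Placement{ideal, false};
    if (cpus[previous].idle) return Placement{previous, false};
    if (cpus[current].idle) return Placement{current, false};
    return Placement{firstIdle, false};
}
```

The order ideal, previous, current favors CPUs whose caches are most likely to still hold the thread's data.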
Thread Synchronization
Process
The process has terminated
Thread
The thread has terminated
Mutex
The mutex is free
Event
The event flag is raised
Semaphore
The semaphore count is greater than zero
File, I/O completion port
I/O operation completed
Timer
Interval time expires
Mutex
Mutual exclusion
Called Mutant in kernel terminology
Allows a single thread to enter a critical region
The thread that enters the critical region (its wait has succeeded) is
the owner of the mutex
Releasing the mutex allows one (single) thread to acquire it and
enter the critical region
Recursive acquisition is ok (increments a counter)
If the owning thread does not release the mutex before it terminates, the
kernel releases it and the next wait succeeds with a special code (abandoned
mutex)
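The ownership rules above (single owner, recursive acquisition with a counter, abandonment on thread exit) can be captured in a small single-threaded model. This is not a real lock and not the kernel's mutant object; thread ids are plain integers and every name here is hypothetical.

```cpp
#include <cassert>

// Outcome of a wait attempt in this model.
enum class WaitResult { Acquired, Abandoned, WouldBlock };

// Single-threaded model of mutex ownership; thread id 0 means "no owner".
class MutexModel {
public:
    WaitResult tryAcquire(int thread) {
        assert(thread != 0);
        if (owner_ == 0) {
            owner_ = thread;
            count_ = 1;
            if (abandoned_) {            // previous owner exited while holding it
                abandoned_ = false;
                return WaitResult::Abandoned;
            }
            return WaitResult::Acquired;
        }
        if (owner_ == thread) {          // recursive acquisition: bump the counter
            ++count_;
            return WaitResult::Acquired;
        }
        return WaitResult::WouldBlock;   // a real wait would block here
    }

    // Only the owner may release; the mutex becomes free when the count hits zero.
    bool release(int thread) {
        if (owner_ != thread || count_ == 0) return false;
        if (--count_ == 0) owner_ = 0;
        return true;
    }

    // The owning thread terminated without releasing: free it and remember that.
    void onThreadExit(int thread) {
        if (owner_ == thread) {
            owner_ = 0;
            count_ = 0;
            abandoned_ = true;
        }
    }

    int owner() const { return owner_; }

private:
    int owner_ = 0;
    int count_ = 0;
    bool abandoned_ = false;
};
```

Note that a recursive owner must release as many times as it acquired before another thread can get in, and that the next acquirer after an abandonment still becomes the owner but is told the protected state may be inconsistent.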
Semaphore
Thread Synchronization
More threading
Thread pools
Simplifies thread management
Potentially boosts performance as threads don’t need to be
created/destroyed explicitly
C++ and .NET 4+ provide helpers for fork/join scenarios
parallel_for (Visual C++ Parallel Patterns Library), Parallel.For (.NET)
Simplify operations where order is unimportant
Other higher level threading helpers exist in C++ 11 and .NET 4+
Manual thread management considered “low level”
Understanding threads can help make the right choices and solve problems
Demo
Automatic parallelization
Jobs
Jobs
Summary