SIR PPT CO 3 N 4
• Using paging, the 80386 organizes the available physical memory into pages
of 4 KB each.
• Two versions: 80386DX and 80386SX (16-bit data bus, 24-bit address bus, low power and
low cost).
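With 4 KB pages, the low 12 bits of a 32-bit linear address select a byte within the page. The sketch below also assumes the 80386's two-level scheme of a 10-bit page-directory index and a 10-bit page-table index, which the bullets above do not spell out. The function name is illustrative.

```python
PAGE_SIZE = 4 * 1024  # 4 KB pages, as stated above


def split_linear_address(addr):
    """Split a 32-bit linear address into (directory, table, offset).

    Assumed layout (not stated in the slides): 10-bit directory index,
    10-bit table index, 12-bit offset within the 4 KB page.
    """
    if not 0 <= addr < 2**32:
        raise ValueError("address must fit in 32 bits")
    offset = addr & (PAGE_SIZE - 1)      # low 12 bits: byte within the page
    table = (addr >> 12) & 0x3FF         # next 10 bits: page-table index
    directory = (addr >> 22) & 0x3FF     # top 10 bits: page-directory index
    return directory, table, offset


print(split_linear_address(0x00403004))  # (1, 3, 4)
```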
Instruction cache
• 8 KB of dedicated instruction cache
• 256 lines between instruction cache and prefetch buffers; allows 32 bytes to
be transferred from cache to buffer
Features of Pentium Processors (80586)
Data cache
• 8 KB of dedicated data cache supplies data to the execution units
• 32-bit lines
• A second processor ‘checker’ is used to execute in lock step with the ‘master’ processor
• It checks the master’s output and compares the value with the internal computed values
Superscalar architecture
• Three execution units
• One execution unit executes floating point instructions
• The other two (U pipe and V pipe) execute integer instructions
• Parallel execution of several instructions – superscalar processor
Intel MMX Architecture (introduced with the Pentium MMX)
• Intel introduced MMX (Multimedia Extension) technology when there was a need to
improve 2-D and 3-D imaging for multimedia applications.
• Most of the image processing algorithms in multimedia apps involve operations on several
pixels simultaneously.
• Most of the multimedia apps require Single Instruction Stream Multiple Data Stream
(SIMD) kind of architecture.
• Using a conventional CPU, we can operate on 2 pixels simultaneously, whereas using MMX
we can operate on 8 pixels concurrently.
INTEL MMX ARCHITECTURE
• MMX instructions use 8 FPRs as MMX registers and use only 64-bit mantissa portion of
these registers.
• As these MMX registers are 64 bits wide, one can pack a total of 8 pixel values into one
register, so these 8 pixels can be manipulated simultaneously with MMX technology.
• 4 MMX data types: packed bytes, packed words, packed doublewords and one quadword.
• Some of the MMX instructions are PADD (B,W,D), PSUB, PCMPEQ, PCMPGT, PMULLW,
PMULHW, PMADDWD, PAND/POR/PXOR, PSRA/PSRL.
EXAMPLE OF PCMPGT
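A minimal sketch of what the word form of this instruction (PCMPGTW) computes, written as a plain Python simulation rather than real MMX code. Each of the four 16-bit lanes is compared as a signed value, and the result lane becomes all ones (0xFFFF) when the first operand is greater, otherwise all zeros. The helper names and the sample values are illustrative.

```python
def to_signed16(x):
    """Interpret the low 16 bits of x as a signed two's-complement word."""
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x


def pcmpgtw(a, b):
    """Simulate PCMPGTW on two lists of four 16-bit words.

    Each result lane is 0xFFFF if a[i] > b[i] (signed compare), else 0x0000.
    """
    if len(a) != 4 or len(b) != 4:
        raise ValueError("each operand must hold four 16-bit words")
    return [0xFFFF if to_signed16(x) > to_signed16(y) else 0x0000
            for x, y in zip(a, b)]


mask = pcmpgtw([51, 3, 5, 23], [73, 2, 5, 6])
print([hex(m) for m in mask])  # ['0x0', '0xffff', '0x0', '0xffff']
```

The resulting mask is typically combined with PAND/POR to select lanes without branching.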
DATA TYPES FOR MMX
8 one-byte integers
4 two-byte integers
2 four-byte integers
1 eight-byte integer
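The same 64 bits can be viewed as any of these packed types. The sketch below uses Python's standard `struct` module to reinterpret one 64-bit value in each form. Little-endian lane order is an assumption made here to match x86, and the function name is illustrative.

```python
import struct


def mmx_views(qword):
    """Return the packed-byte, word, doubleword and quadword views of 64 bits."""
    raw = struct.pack("<Q", qword)                  # 8 raw bytes, little-endian
    return {
        "bytes": list(struct.unpack("<8B", raw)),   # 8 one-byte integers
        "words": list(struct.unpack("<4H", raw)),   # 4 two-byte integers
        "dwords": list(struct.unpack("<2I", raw)),  # 2 four-byte integers
        "qword": struct.unpack("<Q", raw)[0],       # 1 eight-byte integer
    }


views = mmx_views(0x0807060504030201)
print(views["bytes"])                      # [1, 2, 3, 4, 5, 6, 7, 8]
print([hex(w) for w in views["words"]])    # ['0x201', '0x403', '0x605', '0x807']
```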
APPLICATIONS OF MMX TECHNOLOGY
Graphics
MPEG video/image processing
Music synthesis
Speech compression
Video conferencing
Matrix and vector calculations
Advanced 3D graphics
Speech recognition
PENTIUM PRO FEATURES
• The Pentium Pro processor has 36 address lines.
• The L2 cache is connected to the bus interface unit (BIU). The BIU generates memory addresses and
control signals and passes data or instructions to, or fetches them from, the L1 data cache or the
L1 instruction cache.
• The Instruction Fetch and Decode Unit (IFDU) contains three separate instruction decoders that
decode three instructions simultaneously.
• It also includes Branch Prediction Logic.
• It predicts if the branch will be taken or not for a conditional jump instruction
• The execute unit consists of three units, namely two integer execution units
and one floating-point unit, so two integer instructions and one floating-point
instruction can be executed simultaneously
• The instruction once executed is retired and the result is written into the
destination location by the retire unit
• It is the last stage of instruction execution.
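To illustrate the idea of predicting whether a conditional jump will be taken, here is a sketch of a textbook 2-bit saturating counter. It is not a description of the Pentium Pro's actual prediction logic, which the bullets above do not detail. The class and function names are illustrative.

```python
class TwoBitPredictor:
    """Textbook 2-bit saturating counter for a single branch."""

    def __init__(self):
        # States 0 and 1 predict "not taken"; states 2 and 3 predict "taken".
        self.state = 1  # start in "weakly not taken"

    def predict(self):
        return self.state >= 2

    def update(self, taken):
        # Move one step toward the observed outcome, saturating at 0 and 3.
        if taken:
            self.state = min(3, self.state + 1)
        else:
            self.state = max(0, self.state - 1)


def accuracy(outcomes):
    """Fraction of branch outcomes that the predictor guesses correctly."""
    predictor = TwoBitPredictor()
    correct = 0
    for taken in outcomes:
        if predictor.predict() == taken:
            correct += 1
        predictor.update(taken)
    return correct / len(outcomes)


# A loop branch taken nine times and then not taken once on loop exit.
print(accuracy([True] * 9 + [False]))  # 0.8
```

Only the first iteration and the loop exit are mispredicted, which is why loop-closing branches suit this scheme well.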
INTEL CORE 2 DUO
Microarchitecture
The Cores
◦ Single die (107 mm²)
◦ Two identical cores (L1 cache 64 KB × 2)
◦ Shared L2 cache of 6 MB
◦ No Hyper-Threading, no L3 cache
◦ Keeps the front-side bus
◦ Larger L2 cache
INTEL CORE 2 DUO MICROARCHITECTURE
CORE 2 DUO FEATURES
• The Intel Core 2 Duo processor family supports the Intel 64 architecture; it is
based on the high-performance, power-efficient Intel Core microarchitecture
built on 65 nm process technology.
• The Intel Core microarchitecture includes the following innovative features:
• Intel Wide Dynamic Execution to increase performance and execution
throughput
• Intel Intelligent Power Capability to reduce power consumption
• Intel Advanced Smart Cache which allows for efficient data sharing between
two processor cores
• Intel Smart Memory Access to increase data bandwidth and hide latency of
memory accesses
• Intel Advanced Digital Media Boost which improves application performance
using multiple generations of Streaming SIMD extensions
Intel® Wide Dynamic Execution enables each processor core to fetch, dispatch, and execute at high bandwidth and
supports retirement of up to four instructions per cycle.
• Fourteen-stage efficient pipeline
• Three arithmetic logical units per port
• Four decoders to decode up to five instructions per cycle
• Macro-fusion and micro-fusion to improve front-end throughput
• Peak issue rate of dispatching up to six micro-ops per cycle
• Peak retirement bandwidth of up to 4 micro-ops per cycle
• Advanced branch prediction
• Stack pointer tracker to improve efficiency of executing function/procedure entries and exits.
Intel® Advanced Smart Cache delivers higher bandwidth from the second level cache to the core, and optimal
performance and flexibility for single-threaded and multi-threaded applications.
• Large second level cache up to 4 MB and 16-way associativity
• Optimized for multicore and single-threaded execution environments
• 256-bit internal data path to improve bandwidth from L2 to first-level data cache
Intel® Smart Memory Access prefetches data from memory in
response to data access patterns and reduces cache-miss exposure of
out-of-order execution.
• Hardware prefetchers to reduce effective latency of second-level
cache misses
• Memory disambiguation to improve efficiency of speculative
execution engine
Intel® Advanced Digital Media Boost executes most 128-bit SIMD
instructions with single-cycle throughput and speeds up floating-point operations.
• Single-cycle throughput for most 128-bit SIMD instructions
• Up to eight floating-point operations per cycle
HYPER-THREADING TECHNOLOGY
Cache hierarchies
o Having frequently used data in the processor caches reduces the average access time
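The effect of caching on average access time can be estimated with the standard average memory access time formula: hit time plus miss rate times miss penalty. This is a minimal sketch. The parameter names and the sample numbers are illustrative assumptions, not figures for any Intel processor.

```python
def average_access_time(hit_time, miss_rate, miss_penalty):
    """Average memory access time in cycles for a single cache level.

    hit_time     -- cycles to read data that is already in the cache
    miss_rate    -- fraction of accesses that miss the cache (0.0 to 1.0)
    miss_penalty -- extra cycles needed to fetch the data from memory
    """
    if not 0.0 <= miss_rate <= 1.0:
        raise ValueError("miss_rate must be between 0 and 1")
    return hit_time + miss_rate * miss_penalty


# Assumed numbers: 2-cycle hit, 12.5% miss rate, 80-cycle miss penalty.
print(average_access_time(2, 0.125, 80))  # 12.0
```

Lowering the miss rate, which is what keeping frequently used data in the cache achieves, pulls the average toward the hit time.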
Pipelining
o Instruction-level parallelism exists whenever the machine instructions that make up a program are
insensitive to the order in which they are executed; if no dependencies exist between them, they may
be executed in parallel.
Thread level parallelism
Chip MultiProcessing
o Two processors, each with full set of execution and architectural resources, reside on a single die.
Time Slice Multi Threading
o A single processor executes multiple threads by switching between them.
Switch on Event Multi Threading
o The processor switches threads on long-latency events such as cache misses.
Simultaneous Multi Threading
o Multiple threads can execute on a single processor without switching.
o The threads execute simultaneously and make much better use of the resources.
o It maximizes the performance vs. transistor count and power consumption.
Next-Instruction Pointer
Sharing of Resources
The major sharing schemes are:
o Partition
o Threshold
o Full Sharing
Partition
Each logical processor uses half the resources
Simple and low in complexity
Ensures fairness and progress
Good for major pipeline queues
Threshold
History Of RISC:
1. The RISC approach developed as a result of developments in the 1970s.
2. Increase in memory size.
3. Decrease in cost.
4. Advanced compilers.
5. In the late 1970s, IBM was the first to start work on it.
6. In 1980, David Patterson began the project
that gave this approach the name RISC.
Concepts of RISC
Characteristics Of RISC:
◦ Simplified instructions, each taking 1 clock cycle.
◦ Large number of general-purpose registers.
◦ Circuit is much simpler.
◦ Fast to decode.
◦ Fast to execute.
◦ Pipelining: the next instruction is fetched while the previous instruction executes.
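The benefit of overlapping fetch and execute can be sketched with a simple cycle count. The functions below assume an ideal pipeline in which every stage takes one clock cycle and there are no stalls. The function names and the 5-stage figure are illustrative.

```python
def cycles_unpipelined(instructions, stages):
    """Each instruction passes through every stage before the next one starts."""
    return instructions * stages


def cycles_pipelined(instructions, stages):
    """Ideal pipeline: the first instruction needs `stages` cycles to fill
    the pipeline, then one instruction completes on every later cycle."""
    if instructions == 0:
        return 0
    return stages + (instructions - 1)


n, k = 100, 5  # assumed: 100 instructions on a 5-stage pipeline
print(cycles_unpipelined(n, k))  # 500
print(cycles_pipelined(n, k))    # 104
```

For long instruction streams the speedup approaches the number of stages, which is why simple, uniform instructions that pipeline well matter so much to RISC.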
• Register-to-Register Operations:
In RISC processors, only LOAD and STORE instructions access memory.
Example:
Load X, Load Y,
add X and Y,
Store in Z
X + Y → Z
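The load/add/store sequence above can be sketched as a tiny interpreter in which only LOAD and STORE touch memory and ADD works purely on registers. The instruction format and the register names R1 to R3 are hypothetical and chosen only for illustration.

```python
def run(program, memory):
    """Execute a list of (opcode, operands...) tuples on a load/store machine."""
    regs = {}
    for op, *args in program:
        if op == "LOAD":        # memory -> register
            reg, addr = args
            regs[reg] = memory[addr]
        elif op == "ADD":       # register + register -> register
            dst, src1, src2 = args
            regs[dst] = regs[src1] + regs[src2]
        elif op == "STORE":     # register -> memory
            reg, addr = args
            memory[addr] = regs[reg]
        else:
            raise ValueError(f"unknown opcode: {op}")
    return memory


memory = {"X": 2, "Y": 3, "Z": 0}
program = [
    ("LOAD", "R1", "X"),
    ("LOAD", "R2", "Y"),
    ("ADD", "R3", "R1", "R2"),
    ("STORE", "R3", "Z"),
]
print(run(program, memory)["Z"])  # 5
```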
Concepts of RISC
Intel was able to spend vast amounts of money on processor development to offset the RISC
advantages enough to maintain PC market share.
New microprocessors can be developed and tested more quickly when simplicity is
one of the design aims.
RISC Vs CISC
RISC stands for Reduced Instruction Set Computer, whereas CISC stands
for Complex Instruction Set Computer. RISC processors have a smaller
set of instructions with few addressing modes. CISC processors have a
larger set of instructions with many addressing modes.
1. Memory Unit
RISC has no microprogram memory and implements instructions directly in
hardware. CISC has a microprogram memory to implement complex instructions.
2. Program
RISC uses a hard-wired control unit. CISC uses a microprogrammed control
unit.
3. Design
RISC requires a more complex compiler design. CISC allows a simpler compiler design.
4. Calculations
RISC calculations are fast because each instruction is simple. CISC calculations
are slower because complex instructions take several clock cycles.
5. Decoding
RISC decoding of instructions is simple. CISC decoding of instructions is complex.
6. Time
Execution time per instruction is very low in RISC and high in CISC.
7. External memory
RISC calculations operate on registers and do not access memory. CISC calculations
may take their operands directly from memory.
8. Pipelining
Pipelining works efficiently in RISC. Pipelining is difficult to implement
efficiently in CISC.
9. Stalling
Stalls are largely reduced in RISC processors. CISC processors stall often.
3. RISC: Performance is optimized with more focus on software.
   CISC: Performance is optimized with more focus on hardware.
4. RISC: It has no memory unit and uses separate hardware to implement instructions.
   CISC: It has a memory unit to implement complex instructions.
6. RISC: The instruction set is reduced, i.e. it has only a few instructions, many of which are very primitive.
   CISC: The instruction set has a variety of different instructions that can be used for complex operations.
7. RISC: It has few addressing modes.
   CISC: It has many different addressing modes and can thus represent higher-level programming language
   statements more efficiently.
8. RISC: Complex addressing modes are synthesized in software.
   CISC: Complex addressing modes are already supported.
9. RISC: Multiple register sets are present.
   CISC: Only a single register set is present.
10. RISC: Processors are highly pipelined.
    CISC: Processors are normally not pipelined or less pipelined.
11. RISC: The complexity lies in the compiler that translates the program.
    CISC: The complexity lies in the microprogram.
12. RISC: Execution time is very low.
    CISC: Execution time is very high.
13. RISC: Code expansion can be a problem.
    CISC: Code expansion is not a problem.
14. RISC: Decoding of instructions is simple.
    CISC: Decoding of instructions is complex.
15. RISC: It does not require external memory for calculations.
    CISC: It requires external memory for calculations.
16. RISC: The most common RISC microprocessors are Alpha, ARC, ARM, AVR, MIPS, PA-RISC, PIC,
    Power Architecture, and SPARC.
    CISC: Examples of CISC processors are the System/360, VAX, PDP-11, Motorola 68000 family, and
    AMD and Intel x86 CPUs.
17. RISC: The architecture is used in high-end applications such as video processing, telecommunications
    and image processing.
    CISC: The architecture is used in low-end applications such as security systems, home automation, etc.
Microcontrollers
ACCORDING TO BITS
4-BIT MICROCONTROLLERS
ALU performs arithmetic and logical operations on a nibble (4 bits) per instruction.
Internal bus width of 4 bits.
Small size, minimum pin count and low cost.
Low power consumption; used for low-end applications such as LED and LCD display drivers and portable
battery chargers.
Examples: Renesas M34501 256 and ATAM862 series from Atmel.
8-BIT MICROCONTROLLER
ALU performs arithmetic and logical operations on a byte (8 bits) per instruction.
Internal bus width of 8-bit.
Examples: Intel 8051 family and Motorola MC68HC11 family.
16-BIT MICROCONTROLLER
ALU performs arithmetic and logical operations on a word (16 bits) per instruction.
Internal bus width of 16 bits.
Enhanced performance, computing capability and greater precision compared to 8-bit
microcontrollers.
Examples: Intel 8096 family, Motorola MC68HC12 and MC68332 families.
32-BIT MICROCONTROLLER
ALU performs arithmetic and logical operations on a double word (32 bits) per instruction.
Internal bus width of 32 bits.
Much higher performance, computing capability and precision compared to
16-bit microcontrollers.
Examples: Intel 80960 family, Motorola M683xx and Intel/Atmel 251 family.
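The practical effect of ALU width can be sketched by adding two 16-bit numbers on an 8-bit ALU. The addition takes two byte-wide steps linked by the carry, whereas a 16-bit ALU would do it in a single operation. This is a Python simulation with illustrative function names, not code for any particular controller.

```python
def add8(a, b, carry_in=0):
    """One 8-bit ALU addition: returns (8-bit result, carry out)."""
    total = a + b + carry_in
    return total & 0xFF, total >> 8


def add16_on_8bit_alu(x, y):
    """Add two 16-bit values using two 8-bit additions and the carry flag."""
    low, carry = add8(x & 0xFF, y & 0xFF)                   # low bytes first
    high, carry = add8((x >> 8) & 0xFF, (y >> 8) & 0xFF, carry)  # then high bytes
    return (high << 8) | low, carry


result, carry = add16_on_8bit_alu(0x12FF, 0x0001)
print(hex(result), carry)  # 0x1300 0
```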
ACCORDING TO MEMORY/DEVICES
EMBEDDED MICROCONTROLLERS
An embedded system has a microcontroller unit that has all the functional blocks (including
program as well as data memory) available on the same chip.
Example: 8051 having Program & Data Memory, I/O Ports, Serial Communication, Counters
and Timers and Interrupt Control logic on the chip.
EXTERNAL MEMORY MICROCONTROLLERS
An external memory system has a microcontroller unit that does not have all the functional blocks
available on the chip.
All or part of the memory units are externally interfaced using an interfacing circuit called the
glue circuit.
Example: 8031 has no program memory on the chip.
ACCORDING TO INSTRUCTION SET
CISC (COMPLEX INSTRUCTION SET COMPUTER)
ARCHITECTURE MICROCONTROLLERS
Has an instruction set that supports many addressing modes for the arithmetic and logical
instructions and for the data transfer and memory access instructions.
Many of the instructions are macro-like.
Allows the programmer to use one instruction in place of many simpler instructions.
Example: Intel 8096 family.
RISC (REDUCED INSTRUCTION SET COMPUTER)
ARCHITECTURE MICROCONTROLLERS
Contains an instruction set that supports fewer addressing modes for the arithmetic and logical instructions and for data
transfer instructions.
Allows simultaneous access of program and data.
Instruction pipelining increases execution speed.
Allows each instruction to operate on any register or use any addressing mode.
Smaller chip and pin count.
Very low power consumption.
ACCORDING TO MEMORY ARCHITECTURE
The architectures of microcontrollers differ in the way data and programs are stored and accessed.
1. VON NEUMANN / PRINCETON ARCHITECTURE
Single data bus that is used to fetch both instructions and data.
Program instructions and data are stored in a common main memory.
When such a controller addresses main memory, it first fetches an instruction, and then it fetches the data to
support the instruction.
1. VON NEUMANN / PRINCETON ARCHITECTURE (cont.)
The watchdog timer is a specialized program, found as part of the microcontroller, designed to prevent
the microcontroller from halting or “locking up” because of a user-written program, since the instructions
are processed step by step.
Uses a routine that is based on timing. If a program has not been completed or repeated as a loop
within a certain amount of time, the watchdog timer issues a reset command.
A system reset sets all the register values to zero.
The reset feature allows the controller to recover from the crash.
It releases the program and sets the controller to start over again.
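The timing-based behaviour described above can be sketched as a small simulation. A counter advances on every tick, a healthy program clears it at the end of each loop pass, and a reset is issued if the counter reaches the timeout first. The class name, method names and tick counts are illustrative assumptions, not a real controller's API.

```python
class Watchdog:
    """Simulated watchdog timer that counts ticks since the last kick."""

    def __init__(self, timeout_ticks):
        self.timeout_ticks = timeout_ticks
        self.counter = 0
        self.resets = 0

    def kick(self):
        """Called by the program to signal that a loop pass has completed."""
        self.counter = 0

    def tick(self):
        """Advance time by one tick; return True if a reset was issued."""
        self.counter += 1
        if self.counter >= self.timeout_ticks:
            self.resets += 1
            self.counter = 0   # the reset clears the timer, like other registers
            return True
        return False


def simulate(pass_lengths, timeout_ticks):
    """Run loop passes of the given tick lengths; return the number of resets."""
    watchdog = Watchdog(timeout_ticks)
    for ticks in pass_lengths:
        for _ in range(ticks):
            if watchdog.tick():
                break          # reset: abandon this pass and start over
        else:
            watchdog.kick()    # pass finished in time
    return watchdog.resets


print(simulate([3, 3, 10, 3], timeout_ticks=5))  # 1
```

Only the 10-tick pass overruns the 5-tick timeout, so a single reset is issued and the later pass runs normally.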
Stack Pointer and Program Counter
Stack pointer - keeps track of the last stack location used while the processor is busy manipulating data
values, checking ports, or checking interrupts.
Program counter - is used to hold the address of the instruction to be executed next.
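How the two registers cooperate during a subroutine call can be sketched as follows. The model assumes that every instruction occupies one address and that the stack pointer holds the index of the last stack location used, growing upward. Both are simplifying assumptions, and the class is hypothetical.

```python
class TinyCPU:
    """Toy model of a program counter and a stack pointer."""

    def __init__(self, stack_size=8):
        self.pc = 0                      # address of the next instruction
        self.stack = [0] * stack_size
        self.sp = -1                     # last stack location used (-1 = empty)

    def step(self):
        """Execute an ordinary instruction: just advance to the next address."""
        self.pc += 1

    def call(self, target):
        """Save the return address on the stack, then jump to the subroutine."""
        if self.sp + 1 >= len(self.stack):
            raise OverflowError("stack overflow")
        self.sp += 1
        self.stack[self.sp] = self.pc + 1
        self.pc = target

    def ret(self):
        """Restore the saved return address and release the stack location."""
        if self.sp < 0:
            raise IndexError("stack is empty")
        self.pc = self.stack[self.sp]
        self.sp -= 1


cpu = TinyCPU()
cpu.pc = 10
cpu.call(100)   # pc -> 100, return address 11 saved on the stack
cpu.step()      # run one instruction inside the subroutine
cpu.ret()       # pc -> 11
print(cpu.pc, cpu.sp)  # 11 -1
```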
Buses
Bus represents a physical connection used to carry a signal from one point to another inside a
microcontroller. The signal carried by a bus may represent address, data, control signal, or power.
Microcontroller operation
MODULE
When a microcontroller is mounted on a circuit board with other components so that they function as a
single unit, the assembly is referred to as a module or a microcontroller board.
A microcontroller module typically consists of microcontroller, a power source, an interface for
connecting to a programming device, I/O ports, and additional memory.
Microcontroller operation: (Cont.)
A power source - powers the microcontroller and any accompanying components located on the printed circuit
board.
An interface - communicates with the controller.
A set of input/output (I/O) ports - send and receive signals from the devices the microcontroller is designed
to control.
When programmed as an output, each I/O pin can send a digital signal; when programmed as an
input, each pin can receive a digital signal.
Digital-to-analog converters change digital values into analog signals, and analog-to-digital converters
change analog signals into digital values.
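The arithmetic behind these conversions can be sketched as below. The 10-bit ADC, the 8-bit DAC, the 5 V reference and the function names are all assumptions chosen for illustration; real parts differ in resolution and reference voltage.

```python
def adc_to_volts(count, v_ref=5.0, bits=10):
    """Convert a raw ADC reading into the voltage it represents."""
    levels = 2 ** bits
    if not 0 <= count < levels:
        raise ValueError("count is out of range for this resolution")
    return count * v_ref / levels


def volts_to_dac(volts, v_ref=5.0, bits=8):
    """Convert a desired output voltage into the nearest lower DAC code."""
    levels = 2 ** bits
    code = int(volts / v_ref * levels)
    return max(0, min(levels - 1, code))   # clamp to the valid code range


print(adc_to_volts(512))   # 2.5
print(volts_to_dac(2.5))   # 128
```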
Internal Operation
The microcontroller consists of thousands of digital circuits that are combined into areas to provide specific functions.
The parts of the microcontroller are used to save data and programs, perform math and logic functions,
and generate timing signals.
The different areas are connected by a bus system. The bus system contains tiny parallel circuits that carry the digital pulse
patterns from section to section.
The ROM stores the program required for the microcontroller to function and controls how the chip components operate and how
data and instructions flow through the chip.
RAM stores programs and data temporarily.
Ports and registers are special memory locations dedicated to a specific function such as a hardware location or a place to
manipulate data.
ADVANTAGEOUS FEATURES
Easy to use and Programmable.
Reusable - Ability to reprogram using Flash, EEPROM or EPROM.
Flexible and dependable.
Design and Simulation.
Energy efficient, small and cost effective.
Ports multifunctionality.
High Integration and can fit inside other devices.
Easy upgrade.
AREAS OF MICROCONTROLLER APPLICATION
Home monitoring system.
Automation applications such as robotics.
Appliances such as microwave oven, refrigerators, television and VCRs, stereos.
Automobiles in engine control, diagnostics, climate control.
Environmental control in greenhouse, temperature, humidity, factory, home.
Instrumentation.
Aerospace.
Basic Features related to Microprocessor and Microcontrollers