COA Important Question Bank With Answers
Computer Organization And Architecture (University of Mumbai)

Q1) Explain the memory segmentation and memory banking of 8086 Microprocessor.
Segmentation is the process in which the main memory of the computer is logically
divided into different segments, each with its own base address. It is used to speed up
execution, allowing the processor to fetch and execute code and data from memory quickly
and easily.
Need for Segmentation –
The Bus Interface Unit (BIU) contains four 16-bit special-purpose registers (listed
below), called segment registers.
 Code segment register (CS): is used for addressing memory location in the code
segment of the memory, where the executable program is stored.
 Data segment register (DS): points to the data segment of the memory where the
data is stored.
 Extra Segment Register (ES): also refers to a segment in the memory which is
another data segment in the memory.
 Stack Segment Register (SS): is used for addressing stack segment of the memory.
The stack segment is that segment of memory which is used to store stack data.
The 8086 has 20 address lines, so the BIU sends out a 20-bit address to access one of the
1 MB memory locations. The four segment registers hold the upper 16 bits of the starting
addresses of the four 64 KB memory segments the 8086 is working with at that instant. A
segment is a logical unit of memory that may be up to 64 KB long; it is made up of
contiguous memory locations and is an independent, separately addressable unit. The
starting address of a segment is not fixed: it is set by the value loaded into the segment
register and can change.
Below is one way of positioning four 64 KB segments within the 1 MB memory space of
an 8086.

Types of Segmentation –
1. Overlapping Segments – A segment starts at a particular address and its
maximum size can go up to 64 KB. If another segment starts before this 64 KB range of
the first segment ends, the two segments share memory locations and are said to be
overlapping segments.
2. Non-Overlapping Segments – If another segment starts only at or after the end of
the first segment's 64 KB range, the two segments share no locations and are said to be
non-overlapping segments.
Rules of Segmentation – Segmentation follows these rules:
 The starting address of a segment must be evenly divisible by 16 (i.e., its lowest four bits must be 0).


 The minimum size of a segment is 16 bytes and the maximum is 64 KB.
The 8086 provides a 16-bit data bus, so it is capable of transferring 16 bits in one cycle;
but each memory location holds only a byte (8 bits), so two cycles would be needed to
access 16 bits (8 bits each) from two different memory locations. The solution to this
problem is memory banking. Through memory banking, the goal is to access two
consecutive memory locations in one cycle (a 16-bit transfer).
The memory chip is divided equally into two parts (banks). One of the banks contains the
even addresses and is called the even bank; the other contains the odd addresses and is
called the odd bank. The even bank always supplies the lower byte, so it is also called the
lower bank (LB), and the odd bank is also called the higher bank (HB).
This banking scheme allows the processor to access two aligned memory locations from
both banks simultaneously and perform a 16-bit data transfer. Memory banking doesn't
make 16-bit transfers compulsory; it merely facilitates them. The choice between an 8-bit
and a 16-bit transfer depends on the instructions written by the programmer.
One memory bank contains all the bytes with even addresses, such as 00000h, 00002h,
00004h, etc.; the data lines of this bank are connected to the lower 8 data lines, D0 to D7,
of the 8086.
The other memory bank contains all the bytes with odd addresses, such as 00001h,
00003h, 00005h, etc.; the data lines of this bank are connected to the upper 8 data lines,
D8 to D15, of the 8086.

The least significant address bit A0 is not used for byte selection within a bank; it is
reserved for bank selection. A0 = 0 selects the even bank, while the BHE' signal is used for
selection of the odd bank. The processor uses the combination of these two signals to
decide the type of data transfer, as the table below shows.

BHE'  A0   Type of transfer
0     0    16-bit data transfer from both HB and LB
0     1    8-bit data transfer from HB (odd bank)
1     0    8-bit data transfer from LB (even bank)
1     1    None (idle)


For a 16-bit transfer that starts at an odd address (A0 = 1), two machine cycles are needed:
the first cycle transfers the lower-order 8 data bits on the higher-order data bus (D8-D15),
and the second cycle transfers the higher-order byte on the lower-order data bus (D0-D7).
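A minimal Python sketch of this bank-selection logic (an illustrative model, not 8086 code; the helper name and return format are invented for the example, and BHE' is written as plain BHE):

def bank_select(address, width_bits):
    # A0, the least significant address bit, selects the even/odd bank
    a0 = address & 1
    if width_bits == 16:
        if a0 == 0:
            return {"BHE": 0, "A0": 0, "cycles": 1}  # aligned word: both banks at once
        return {"cycles": 2, "note": "odd-aligned word needs two 8-bit machine cycles"}
    # 8-bit transfer: enable exactly one bank
    return {"BHE": 0 if a0 == 1 else 1, "A0": a0, "cycles": 1}

print(bank_select(0x00000, 16))  # even address, 16-bit -> one cycle, both banks
print(bank_select(0x00001, 16))  # odd address, 16-bit -> two machine cycles
print(bank_select(0x00002, 8))   # even address, 8-bit -> LB only (BHE=1, A0=0)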
Q2) With the help of diagram, explain 6-stage pipeline architecture and various pipeline
hazards with example.
1. A typical instruction cycle can be split into many sub-cycles, such as fetch
instruction, decode instruction, execute, and store. The instruction cycle and the
corresponding sub-cycles are performed for each instruction. These sub-cycles of
different instructions can be interleaved, i.e. the sub-cycles of many instructions can
be carried out simultaneously, resulting in reduced overall execution time. This is
called instruction pipelining.
2. The more stages there are in the pipeline, the higher the throughput of the CPU.
3. If the instruction processing is split into six phases, the pipelined CPU will have six
different stages for the execution of the sub phases.
4. The six stages are as follows:
 Fetch instruction (FI)
 Decode instruction (DI)
 Calculate operand (CO)
 Fetch operands (FO)
 Execute instruction (EI)
 Write operand (WO)
Fetch instruction: Instructions are fetched from the memory into a temporary buffer
before they are executed.
Decode instruction: The instruction is decoded by the CPU so that the necessary op
codes and operands can be determined.
Calculate operand: Based on the addressing scheme used, either operands are
directly provided in the instruction or the effective address has to be calculated.
Fetch Operand: Once the address is calculated, the operands need to be fetched from
the address that was calculated. This is done in this phase.
Execute Instruction: The instruction can now be executed.
Write operand: Once the instruction is executed, the result from the execution needs
to be stored or written back in the memory.
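The throughput gain can be checked with the standard ideal-pipeline arithmetic (a small sketch; one stage per clock cycle is assumed):

def pipeline_cycles(n_instructions, n_stages):
    # The first instruction needs n_stages cycles; each later one finishes 1 cycle after the previous
    return n_stages + (n_instructions - 1)

print(pipeline_cycles(100, 6))  # 105 cycles with the 6-stage pipeline
print(100 * 6)                  # 600 cycles if each instruction ran all 6 phases serially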


1. Pipeline hazards are situations that prevent the next instruction in the instruction
stream from executing during its designated clock cycles.
2. Any condition that causes a stall in the pipeline operations can be called a hazard.
3. There are primarily three types of hazards:
i. Data Hazards
ii. Control Hazards or instruction Hazards
iii. Structural Hazards.
i. Data Hazards:
A data hazard is any condition in which either the source or the destination operands of an
instruction are not available at the time expected in the pipeline. As a result, some
operation has to be delayed and the pipeline stalls. A data hazard arises whenever there are
two instructions, one of which depends on the data produced by the other. For example:
A=3+A
B=A*4
For the above sequence, the second instruction needs the value of ‘A’ computed in the first
instruction.
Thus the second instruction is said to depend on the first.
If the execution is done in a pipelined processor, it is highly likely that the interleaving of
these two instructions can lead to incorrect results due to data dependency between the
instructions. Thus the pipeline needs to be stalled as and when necessary to avoid errors.
ii. Structural Hazards:
This situation arises mainly when two instructions require a given hardware resource at the
same time and hence for one of the instructions the pipeline needs to be stalled.
The most common case is when memory is accessed at the same time by two instructions:
one instruction may need to access memory as part of its Execute or Write-back phase
while another instruction is being fetched. If both the instructions and data reside in the
same memory, both instructions cannot proceed together and one of them needs to be


stalled till the other is done with the memory access part. Thus in general sufficient hardware
resources are needed for avoiding structural hazards.
iii. Control hazards:
The instruction fetch unit of the CPU is responsible for providing a stream of instructions to
the execution unit. The instructions fetched by the fetch unit are in consecutive memory
locations and they are executed.
However, the problem arises when one of the instructions is a branch instruction to some
other memory location. All the instructions fetched into the pipeline from consecutive
memory locations are then invalid and need to be removed (also called flushing the
pipeline). This induces a stall till new instructions are again fetched from the memory
address specified in the branch instruction.
The time lost as a result of this is called a branch penalty. Often dedicated hardware is
incorporated in the fetch unit to identify branch instructions and compute branch addresses
as soon as possible, reducing the resulting delay.
Q3) Explain different cache mapping techniques.
Cache Mapping:
There are three different types of mapping used for the purpose of cache memory which are
as follows: Direct mapping, Associative mapping, and Set-Associative mapping. These are
explained below.
1. Direct Mapping –
The simplest technique, known as direct mapping, maps each block of main memory
into only one possible cache line.
In direct mapping, each memory block is assigned to a specific line in the cache. If a line
is already occupied by a memory block when a new block needs to be loaded, the old
block is trashed. An address is split into two parts: an index field and a tag field. The
cache stores the tag field, while the index locates the line. Direct mapping's performance
is directly proportional to the hit ratio.
For purposes of cache access, each main memory address can be viewed as consisting
of three fields. The least significant w bits identify a unique word or byte within a block
of main memory; in most contemporary machines, the address is at the byte level. The
remaining s bits specify one of the 2^s blocks of main memory. The cache logic interprets
these s bits as a tag of (s - r) bits (the most significant portion) and a line field of r bits.
This latter field identifies one of the m = 2^r lines of the cache.


Associative Mapping –
In this type of mapping, the associative memory is used to store content and addresses of
the memory word. Any block can go into any line of the cache. This means that the word id
bits are used to identify which word in the block is needed, but the tag becomes all of the
remaining bits. This enables the placement of any word at any place in the cache memory. It
is considered to be the fastest and the most flexible mapping form.

Set-associative Mapping –
This form of mapping is an enhanced form of direct mapping in which the drawbacks of
direct mapping are removed. Set-associative mapping addresses the problem of possible
thrashing in the direct-mapping method. It does this by saying that instead of having
exactly one line that a block can map to in the cache, we group a few lines together,
creating a set; a block in memory can then map to any one of the lines of a specific set.
Set-associative mapping allows each index position in the cache to hold two or more
blocks that share the same index address in main memory. Set-associative cache mapping
combines the best of the direct and associative cache mapping techniques.
In this case, the cache consists of a number of sets, each of which consists of a number of
lines.

Cache Mapping Techniques:-
The different cache mapping techniques are as follows:-
1) Direct Mapping
2) Associative Mapping
3) Set-Associative Mapping
Consider a cache consisting of 128 blocks of 16 words each, for a total of 2048 (2K)
words, and assume that the main memory is addressable by a 16-bit address. Main
memory is 64K words, which will be viewed as 4K blocks of 16 words each.
(1) Direct Mapping:-


1) The simplest way to determine the cache location in which to store a memory block is
the direct mapping technique.
2) Here, block j of main memory maps onto block (j modulo 128) of the cache. Thus main
memory blocks 0, 128, 256, … are stored in cache block 0; blocks 1, 129, 257, … in cache
block 1; and so on.
3) The placement of a block in the cache is determined from the memory address. The
memory address is divided into 3 fields; the lower 4 bits select one of the 16 words in a
block.
4) When a new block enters the cache, the 7-bit cache block field determines the cache
position in which this block must be stored.
5) The higher-order 5 bits of the memory address of the block are stored in the 5 tag bits
associated with its location in the cache. They identify which of the 32 blocks that map
onto this cache position is currently resident in the cache.
6) It is easy to implement, but not flexible.

(2) Associative Mapping:-

1) This is more flexible mapping method, in which main memory block can be placed into
any cache block position.
2) In this, 12 tag bits are required to identify a memory block when it is resident in the cache.
3) The tag bits of an address recevied from the processor are compared to the tag bits of each
block of the cache to see, if the desired block is present. This is known as Associative
Mapping technique.
4) Cost of an associated mapped cache is higher than the cost of direct-mapped because of the
need to search all 128 tag patterns to determine whether a block is in cache. This is known as
associative search.
(3) Set-Associative Mapping:-
1) It is a combination of the direct and associative mapping techniques.
2) Cache blocks are grouped into sets, and the mapping allows a block of main memory to
reside in any block of a specific set. Hence the contention problem of direct mapping is
eased; at the same time, the hardware cost is reduced by decreasing the size of the
associative search.
3) Consider a cache with two blocks per set. In this case, memory blocks 0, 64, 128, …,
4032 map into cache set 0, and they can occupy either of the two blocks within this set.
4) Having 64 sets means that the 6-bit set field of the address determines which set of the
cache might contain the desired block. The tag bits of the address must be associatively
compared to the tags of the two blocks of the set to check if the desired block is present.
This is a two-way associative search.
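The field widths in this example can be sketched in Python (hypothetical helper names; the widths follow the text above: 4-bit word, 7-bit block + 5-bit tag for direct mapping, 6-bit set + 6-bit tag for the two-way set-associative case, and a 12-bit tag for associative mapping):

def split_direct(addr):
    word  = addr & 0xF           # lower 4 bits: word within the block
    block = (addr >> 4) & 0x7F   # next 7 bits: cache block number (0-127)
    tag   = addr >> 11           # upper 5 bits: tag
    return tag, block, word

def split_two_way(addr):
    word = addr & 0xF
    set_ = (addr >> 4) & 0x3F    # 6-bit set field (64 sets)
    tag  = addr >> 10            # 6-bit tag, compared within the chosen set
    return tag, set_, word

def split_associative(addr):
    return addr >> 4, addr & 0xF # 12-bit tag, 4-bit word

addr = 0xABCD                    # an arbitrary 16-bit address
print(split_direct(addr))        # (21, 60, 13)
print(split_two_way(addr))       # (42, 60, 13)
print(split_associative(addr))   # (2748, 13)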

Q5) Explain in detail, with a suitable diagram, the architecture of the 8086 microprocessor.


A microprocessor is an integrated circuit with all the functions of a CPU; however, it
cannot be used stand-alone since, unlike a microcontroller, it has no memory or peripherals.
The 8086 does not have RAM or ROM inside it. However, it has internal registers for
storing intermediate and final results, and it interfaces with memory located outside it
through the system bus.
The 8086 is a 16-bit integer processor in a 40-pin Dual Inline Package (DIP) IC.
The size of the internal registers (present within the chip) indicates how much information
the processor can operate on at a time (in this case 16-bit registers) and how it moves data
around internally within the chip, sometimes also referred to as the internal data bus.
The 8086 provides the programmer with 14 internal registers, each 16 bits or 2 bytes wide.

The internal architecture of the Intel 8086 is divided into 2 units: the Bus Interface Unit
(BIU) and the Execution Unit (EU). These are explained below.

1. The Bus Interface Unit (BIU):

It provides the interface of 8086 to external memory and I/O devices via the System Bus. It
performs various machine cycles such as memory read, I/O read etc. to transfer data
between memory and I/O devices.
BIU performs the following functions-
 It generates the 20 bit physical address for memory access.


 It fetches instructions from the memory.


 It transfers data to and from the memory and I/O.
 Maintains the 6 byte prefetch instruction queue(supports pipelining).
BIU mainly contains the 4 Segment registers, the Instruction Pointer, a prefetch queue
and an Address Generation Circuit.
Instruction Pointer (IP):
 It is a 16-bit register. It holds the offset of the next instruction in the Code Segment.
 IP is incremented after every instruction byte is fetched.
 IP gets a new value whenever a branch instruction occurs.
 CS is multiplied by 10H to give the 20-bit physical address of the Code Segment.
 The address of the next instruction is calculated as CS x 10H + IP.

Code Segment register:


CS holds the base address for the Code Segment. All programs are stored in the Code
Segment and accessed via the IP.
Data Segment register:
DS holds the base address for the Data Segment.
Stack Segment register:
SS holds the base address for the Stack Segment.
Extra Segment register:
ES holds the base address for the Extra Segment.
Address Generation Circuit:
 The BIU has a Physical Address Generation Circuit.
 It generates the 20 bit physical address using Segment and Offset addresses using
the formula:

Physical Address = Segment Address x 10H + Offset Address
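This address computation is easy to sketch in Python (a hypothetical helper; the CS/IP values are just for illustration):

def physical_address(segment, offset):
    # 20-bit physical address = segment x 10H + offset
    return ((segment << 4) + offset) & 0xFFFFF

print(hex(physical_address(0x3000, 0x1234)))  # 0x31234, e.g. CS = 3000H, IP = 1234H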
6 Byte Pre-fetch Queue:
 It is a 6 byte queue (FIFO).
 Fetching the next instruction (by BIU from CS) while executing the current
instruction is called pipelining.
 Gets flushed whenever a branch instruction occurs.

2. The Execution Unit (EU):

The main components of the EU are General purpose registers, the ALU, Special purpose
registers, Instruction Register and Instruction Decoder and the Flag/Status Register.
1. Fetches instructions from the Queue in BIU, decodes and executes arithmetic and
logic operations using the ALU.
2. Sends control signals for internal data transfer operations within the microprocessor.
3. Sends request signals to the BIU to access the external module.
4. It operates with respect to T-states (clock cycles) and not machine cycles.
The 8086 has four 16-bit general-purpose registers: AX, BX, CX and DX. They store
intermediate values during execution. Each of these has two 8-bit parts (higher and lower).


 AX register:
It holds operands and results during multiplication and division operations. It also acts as
an accumulator during string operations.

 BX register:
It holds the memory address (offset address) in indirect addressing modes.

 CX register:
It holds count for instructions like loop, rotate, shift and string operations.

 DX register:
It is used with AX to hold 32 bit values during multiplication and division.

Arithmetic Logic Unit (16 bit):


Performs 8 and 16 bit arithmetic and logic operations.
Special purpose registers (16-bit):
 Stack Pointer:
Points to Stack top. Stack is in Stack Segment, used during instructions like PUSH,
POP, CALL, RET etc.
 Base Pointer:
BP can hold offset address of any location in the stack segment. It is used to access
random locations of the stack.
 Source Index:
It holds offset address in Data Segment during string operations.
 Destination Index:
It holds offset address in Extra Segment during string operations.
Instruction Register and Instruction Decoder:
The EU fetches an opcode from the queue into the instruction register. The instruction
decoder decodes it and sends the information to the control circuit for execution.
Flag/Status register (16 bits):
It has 9 flags that help change or recognize the state of the microprocessor.
6 Status flags:

1. carry flag(CF)
2. parity flag(PF)
3. auxiliary carry flag(AF)
4. zero flag(Z)
5. sign flag(S)
6. overflow flag (O)
Status flags are updated after every arithmetic and logic operation.
3 Control flags:
1. trap flag(TF)
2. interrupt flag(IF)
3. direction flag(DF)
These flags can be set or reset using control instructions like CLC, STC, CLD, STD, CLI,
STI, etc.
The Control flags are used to control certain operations.


Q6) List and explain in detail the characteristics/parameters of memory.


The key characteristics of memory devices or memory systems are as follows:

1. Location
2. Capacity
3. Unit of Transfer
4. Access Method
5. Performance
6. Physical type
7. Physical characteristics
8. Organization

1. Location:
It deals with the location of the memory device in the computer system. There are three
possible locations:

 CPU: This is often in the form of CPU registers and a small amount of cache.
 Internal or main: This is the main memory, like RAM or ROM. The CPU can directly
access the main memory.
 External or secondary: This comprises secondary storage devices like hard disks and
magnetic tapes. The CPU doesn't access these devices directly; it uses device
controllers to access them.

2. Capacity:


The capacity of any memory device is expressed in terms of: i) word size ii) number of words.
 Word size: Words are expressed in bytes (8 bits). A word can however mean any
number of bytes. Commonly used word sizes are 1 byte (8 bits), 2 bytes (16 bits) and 4
bytes (32 bits).
 Number of words: This specifies the number of words available in the particular
memory device. For example, if a memory device is given as 4K x 16, the device has a
word size of 16 bits and a total of 4096 (4K) words in memory.
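The 4K x 16 example works out as follows (plain arithmetic, shown as a Python snippet):

words, word_size = 4 * 1024, 16          # 4096 words of 16 bits each
print(words * word_size // 8, "bytes")   # 8192 bytes = 8 KB total capacity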
3. Unit of Transfer:
It is the maximum number of bits that can be read or written into the memory at a time. In
case of main memory, it is mostly equal to word size. In case of external memory, unit of
transfer is not limited to the word size; it is often larger and is referred to as blocks.
4. Access Methods:
It is a fundamental characteristic of memory devices. It is the sequence or order in which
memory can be accessed. There are three types of access methods:
 Random Access: If the storage locations in a particular memory device can be accessed
in any order, and the access time is independent of the memory location being accessed,
the device is said to have a random access mechanism. RAM (Random Access
Memory) ICs use this access method.
 Serial Access: If memory locations can be accessed only in a certain predetermined
sequence, this access method is called serial access. Magnetic Tapes, CD-ROMs
employ serial access methods.
 Semi random Access: Memory devices such as Magnetic Hard disks use this access
method. Here each track has a read/write head thus each track can be accessed
randomly but access within each track is restricted to a serial access.
5. Performance: The performance of the memory system is determined using three
parameters
 Access Time: In random access memories, it is the time taken by memory to
complete the read/write operation from the instant that an address is sent to the
memory. For non-random access memories, it is the time taken to position the read
write head at the desired location. Access time is widely used to measure performance
of memory devices.
 Memory cycle time: It is defined only for Random Access Memories and is the sum
of the access time and the additional time required before the second access can
commence.
 Transfer rate: It is defined as the rate at which data can be transferred into or out of a
memory unit.
6. Physical type: Memory devices can be either semiconductor memory (like RAM) or
magnetic surface memory (like Hard disks).
7. Physical Characteristics:
 Volatile/Non-volatile: If a memory device continues to hold data even when the power
is turned off, the device is non-volatile; otherwise it is volatile.
 Erasable/Non-erasable: Memories in which data, once programmed, cannot be erased
are called non-erasable memories; memory devices in which the data can be erased are
called erasable memories.
E.g. RAM (erasable), ROM (non-erasable).
8. Organization:
Organization refers to the physical arrangement of bits to form words in memory.

Q7) Explain the working of 8:1 Multiplexer.

8 to 1 Multiplexer
In the 8-to-1 multiplexer, there are a total of eight inputs, i.e., A0, A1, A2, A3, A4, A5, A6,
and A7, 3 selection lines, i.e., S0, S1 and S2, and a single output, i.e., Y. On the basis of
the combination of inputs present at the selection lines S0, S1, and S2, one of these 8
inputs is connected to the output. The block diagram and the truth table of the 8x1
multiplexer are given below.
Block Diagram:

Truth Table:


The logical expression for the output Y is as follows:

Y = S0'.S1'.S2'.A0 + S0.S1'.S2'.A1 + S0'.S1.S2'.A2 + S0.S1.S2'.A3 + S0'.S1'.S2.A4 +
S0.S1'.S2.A5 + S0'.S1.S2.A6 + S0.S1.S2.A7

Logical circuit of the above expression is given below:
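The same selection logic can be sketched behaviorally in Python (a functional model of the expression above, not a gate-level circuit; the function name is invented):

def mux8to1(inputs, s2, s1, s0):
    # Route inputs[(S2 S1 S0)] to the output Y
    assert len(inputs) == 8
    return inputs[(s2 << 2) | (s1 << 1) | s0]

A = [0, 1, 0, 1, 1, 0, 1, 0]   # A0..A7
print(mux8to1(A, 1, 0, 1))     # S2 S1 S0 = 101 selects A5 -> 0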


Q8) Discuss the need of I/O module in computing system.

1. Processor communication – this involves the following tasks:
(a) Exchange of data between the processor and the I/O module.
(b) Command decoding – the I/O module accepts commands sent from the processor;
e.g., the I/O module for a disk drive may accept commands such as READ SECTOR,
WRITE SECTOR, SEEK track, etc.
(c) Status reporting – the device must be able to report its status to the processor, e.g.,
disk drive busy, ready, etc. Status reporting may also involve reporting various errors.
(d) Address recognition – each I/O device has a unique address, and the I/O module
must recognize this address.

2. Device communication – the I/O module must be able to perform device
communication, such as status reporting.

3. Control & timing – The I/O module must be able to co-ordinate the flow of data
between the internal resources (such as processor, memory) and external devices.

4. Data buffering – this is necessary because of the speed mismatch between the rate of
data transfer of the processor and memory on one side and of the external devices on the
other. Data coming from the main memory is sent to an I/O module in a rapid burst; the
data is buffered in the I/O module and then sent to the peripheral device at the device's
own rate.

5. Error detection – The I/O module must also be able to detect errors and report them to
the processor. These errors may be mechanical errors (such as paper jam in a printer),
or changes in the bit pattern of transmitted data. A common way of detecting such
errors is by using parity bits.

Q9) With neat diagram, explain Memory Hierarchy.

In the design of a computer system, a processor and a large amount of memory are used.
However, the main problem is that these parts are expensive, so the memory of the system
is organized as a memory hierarchy. It has several levels of memory with different
performance rates, all serving the same purpose: to reduce the average access time. The
memory hierarchy was developed based on the behavior of programs.
The memory in a computer can be divided into five hierarchies based on speed as well as
use. The processor can move from one level to another based on its requirements. The five
hierarchies in the memory are registers, cache, main memory, magnetic disks, and
magnetic tapes. The first three hierarchies are volatile memories, which means they lose
their stored data when power is removed, whereas the last two are non-volatile, which
means they store data permanently.
The memory hierarchy design in a computer system mainly includes different storage
devices. Most computers are built with extra storage to run more powerfully beyond the
main memory capacity. The following memory hierarchy diagram is a hierarchical pyramid
for computer memory. The memory hierarchy is divided into two types: primary (internal)
memory and secondary (external) memory.


Primary Memory
The primary memory is also known as internal memory, and it is directly accessible by the
processor. This memory includes main memory, cache, and CPU registers.

Secondary Memory
The secondary memory is also known as external memory, and it is accessible by the
processor through an input/output module. This memory includes optical disks, magnetic
disks, and magnetic tape.

Level-0 − Registers
A register is typically static RAM (SRAM) in the processor of the computer, used for
holding a data word, which is typically 64 or 128 bits. The registers are present inside the
CPU, so they have the least access time. Registers are the most expensive and the smallest
in size, generally a few kilobytes at most. They are implemented using flip-flops.
Level-1 − Cache
Cache memory is used to store the segments of a program that are frequently accessed by the
processor. It is expensive and smaller in size generally in Megabytes and is implemented by
using static RAM.
Level-2 − Primary or Main Memory
It directly communicates with the CPU and with auxiliary memory devices through an I/O
processor. Main memory is less expensive than cache memory and larger in size generally in
Gigabytes. This memory is implemented by using dynamic RAM.
Level-3 − Secondary storage
Secondary storage devices like Magnetic Disk are present at level 3. They are used as backup
storage. They are cheaper than main memory and larger in size generally in a few TB.


Level-4 − Tertiary storage


Tertiary storage devices like magnetic tape are present at level 4. They are used to store
removable files and are the cheapest and largest in size (1-20 TB).

Q10) Describe Flynn’s classification of parallel computing in detail


Parallel computing is a form of computing in which jobs are broken into discrete parts that
can be executed concurrently. Each part is further broken down into a series of
instructions, and instructions from each part execute simultaneously on different CPUs.
Parallel systems deal with the simultaneous use of multiple computer resources, which can
include a single computer with multiple processors, a number of computers connected by
a network to form a parallel processing cluster, or a combination of both.

The sequence of instructions read from memory constitutes an instruction stream.

The operations performed on the data in the processor constitute a data stream.

Flynn's classification divides computers into four major groups that are:

1. Single instruction stream, single data stream (SISD)


2. Single instruction stream, multiple data stream (SIMD)
3. Multiple instruction stream, single data stream (MISD)
4. Multiple instruction stream, multiple data stream (MIMD)


Flynn’s classification –
1. Single-instruction, single-data (SISD) systems –
An SISD computing system is a uniprocessor machine which is capable of executing a
single instruction, operating on a single data stream. In SISD, machine instructions are
processed in a sequential manner and computers adopting this model are popularly
called sequential computers. Most conventional computers have SISD architecture. All
the instructions and data to be processed have to be stored in primary memory.

The speed of the processing element in the SISD model is limited by (dependent on) the
rate at which the computer can transfer information internally. Dominant representative
SISD systems are IBM PCs and workstations.
2. Single-instruction, multiple-data (SIMD) systems –
An SIMD system is a multiprocessor machine capable of executing the same instruction
on all the CPUs but operating on different data streams. Machines based on the SIMD
model are well suited to scientific computing, since they involve lots of vector and matrix
operations. The data elements of vectors can be divided into multiple sets (N sets for N PE
systems) so that the information can be passed to all the processing elements (PEs), and
each PE can process one data set.

A dominant representative SIMD system is Cray's vector processing machine.


3. Multiple-instruction, single-data (MISD) systems –
An MISD computing system is a multiprocessor machine capable of executing different
instructions on different PEs, with all of them operating on the same data set.


Example: Z = sin(x) + cos(x) + tan(x)
The system performs different operations on the same data set. Machines built using the
MISD model are not useful in most applications; a few machines have been built, but
none of them are available commercially.
4. Multiple-instruction, multiple-data (MIMD) systems –
An MIMD system is a multiprocessor machine capable of executing multiple instructions
on multiple data sets. Each PE in the MIMD model has separate instruction and data
streams; therefore machines built using this model are suited to any kind of application.
Unlike SIMD and MISD machines, PEs in MIMD machines work asynchronously.

MIMD machines are broadly categorized into shared-memory MIMD and
distributed-memory MIMD based on the way PEs are coupled to the main memory.
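An illustrative sketch of the SISD/SIMD distinction (not from the source; NumPy's vectorized operations are used here only to mimic "one instruction, many data elements"):

import numpy as np

a = np.arange(8)
b = np.arange(8)

# SISD style: one instruction stream handles one data element at a time
c_sisd = [a[i] + b[i] for i in range(len(a))]

# SIMD style: a single "add" is applied to all elements at once
c_simd = a + b
print(c_sisd, list(c_simd))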
Q11) Differentiate between hardwired control unit and microprogrammed control unit.

Attribute | Hardwired Control Unit | Microprogrammed Control Unit
1. Speed | Fast | Slow
2. Cost of implementation | More costly | Cheaper
3. Flexibility | Not flexible; a redesign is required to accommodate a new system specification or new instruction | More flexible; easily accommodates new system specifications or new instruction sets
4. Ability to handle complex instructions | Difficult to handle complex instruction sets | Easier to handle complex instruction sets
5. Decoding | Complex decoding and sequencing logic | Easier decoding and sequencing logic
6. Applications | RISC microprocessors | CISC microprocessors
7. Size of instruction set | Small | Large
8. Control memory | Absent | Present
9. Chip area required | Less | More
10. Occurrence of errors | More | Less
11. Modifiability | It is not possible to change the structure and instruction set once it is developed | Modifications can be made by changing the microprogram stored in the control memory
12. Design complexity | The design of the computer is complex | The design of the computer is simplified
13. Specification | The architecture and instruction set are not specified | The architecture and instruction set are specified
14. Signal generation | Control signals are generated in the right sequence by hardwired logic built from flip-flops, gates and sequential circuits | A microsequencer is used, from which instruction bits are decoded and executed to control sub-devices such as the ALU, registers, buses and instruction registers
Q12) Identify the addressing modes of the following instructions
1. MOV AX,1000
 It uses immediate addressing mode.
 The value 1000 is immediately copied into AX.
2. MOV AX,[1000]
 It uses direct addressing mode.
 The data bytes from memory locations [1001]:[1000] will be copied into the (AH):(AL)
registers.
3. MOV AX,BX
 It uses register addressing mode.
 The 16 bit data word from register (BX) will be copied into (AX)
4. MOV [BX],AX
 It uses register indirect addressing.


 The 16-bit data word from register (AX) will be copied into two successive memory
locations, which are indirectly addressed by BX:
 [BX] ← (AL)
 [BX + 1] ← (AH)

5. MOV AX,[SI+200]
 It uses register relative addressing.
 It copies a 16-bit data word from two successive memory locations into AX as follows:
 (AL) ← [SI + 200]
 (AH) ← [SI + 200 + 1]

Q13) Write short note on DMA


Direct memory access (DMA) is a method that allows an input/output (I/O) device to
send or receive data directly to or from the main memory, bypassing the CPU to speed
up memory operations. The process is managed by a chip known as a DMA controller
(DMAC).
DMA controller registers :
The DMA controller has three registers as follows.
 Address register – It contains the address to specify the desired location in
memory.
 Word count register – It contains the number of words to be transferred.
 Control register – It specifies the transfer mode.


The DMA controller communicates with the CPU through the data bus and control lines.
The CPU selects a register within the DMA controller through the address bus by enabling
the DS (DMA select) and RS (register select) inputs. RD (read) and WR (write) are
bidirectional. When the BG (bus grant) input is 0, the CPU can communicate with the
DMA registers; when BG is 1, the CPU has relinquished the buses and the DMA controller
can communicate directly with the memory.
All registers in the DMA controller appear to the CPU as I/O interface registers. Therefore,
the CPU can both read and write the DMA registers under program control via the data bus.
The CPU initializes the DMA by sending the following information through the data bus:
 The starting address of the memory block where the data is available (for read) or
where the data is to be stored (for write).
 The word count, which is the number of words in the memory block to be read or
written.
 Control to define the mode of transfer, such as read or write.
 A control to begin the DMA transfer.
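A minimal software model of these three registers driving a block transfer (an illustrative sketch only; real DMA controllers are programmed through I/O ports, and the class and method names here are invented):

class DMAC:
    def __init__(self, address, word_count, mode):
        self.address = address        # address register: where in memory to start
        self.word_count = word_count  # word count register: words left to move
        self.mode = mode              # control register: transfer mode

    def transfer_to_memory(self, memory, device_words):
        # Move one word per (stolen) bus cycle until the count reaches zero
        while self.word_count > 0 and device_words:
            memory[self.address] = device_words.pop(0)
            self.address += 1
            self.word_count -= 1
        return "interrupt CPU: transfer complete"

mem = {}
dma = DMAC(address=0x500, word_count=3, mode="write")
print(dma.transfer_to_memory(mem, [10, 20, 30]), mem)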

It is also referred to as cycle stealing, because the DMA module in effect steals a bus cycle.
When the processor wishes to read or write a block of data, it issues a command to the DMA
module, by sending to the DMA module the following information:
i. Whether a read or write is requested, using the read or write control line between
the processor and the DMA module
ii. The address of the I/O device involved, communicated on the data lines
iii. The starting location in memory to read from or write to, communicated on the
data lines and stored by the DMA module in its address register
iv. The number of words to be read or written, again communicated via the data lines
and stored in the data count register

 The Control Logic in the DMA module is responsible for the generation of control
signals.


 The processor then continues with other work. It has delegated this I/O operation to
the DMA module. The DMA module transfers the entire block of data, one word at a
time, directly to or from memory, without going through the processor. When the
transfer is complete, the DMA module sends an interrupt signal to the processor.
 Thus, the processor is involved only at the beginning and end of the transfer. During
the instruction cycle the processor may be suspended; in each case, it is suspended just
before it needs to use the bus. The DMA module then transfers one word and returns
control to the processor.

IEEE 754 Floating-Point Representation
The IEEE Standard for Floating-Point Arithmetic (IEEE 754) is a technical standard for
floating-point computation which was established in 1985 by the Institute of Electrical
and Electronics Engineers (IEEE).
IEEE 754 has 3 basic components:
1. The Sign of Mantissa –
This is as simple as the name. 0 represents a positive number while 1 represents a
negative number.
2. The Biased exponent –
The exponent field needs to represent both positive and negative exponents. A bias is
added to the actual exponent in order to get the stored exponent.
3. The Normalised Mantissa –
The mantissa is the part of a number in scientific notation or a floating-point number
consisting of its significant digits. Here we have only 2 digits, i.e. 0 and 1, so a
normalised mantissa is one with only a single 1 to the left of the binary point.

These floating point numbers can be represented using the IEEE 754 formats given below.
1. Single precision format:
It is a 32-bit format and uses a bias value of (127)10 to calculate the biased exponent.
2. Double precision format:
It is a 64-bit format and uses a bias value of (1023)10 = 3FFH to calculate the biased
exponent.

Example –
85.125
85 = 1010101
0.125 = 001
85.125 = 1010101.001
=1.010101001 x 2^6
sign = 0

1. Single precision:
biased exponent = 127 + 6 = 133
133 = 10000101
Normalised mantissa = 010101001
We add 0's to complete the 23 bits.


The IEEE 754 Single precision is:

SIGN BIASED EXPONENT MANTISSA


0 10000101 01010100100000000000000

This can be written in hexadecimal form 42AA4000


Hence,
(85.125) to single precision IEEE 754 format is (42AA4000) H

2. Double precision:
biased exponent = 1023 + 6 = 1029
1029 = 10000000101
Normalised mantissa = 010101001
We add 0's to complete the 52 bits.

The IEEE 754 Double precision is:

SIGN B.EXPONENT MANTISSA


0 10000000101 0101010010000000000000000000000000000000000000000000

This can be written in hexadecimal form 4055480000000000


Hence,
(85.125) to double precision IEEE 754 format is (4055480000000000) H
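Both encodings can be verified with Python's struct module, which packs floats in IEEE 754 format:

import struct

print(struct.pack(">f", 85.125).hex().upper())  # 42AA4000 (single precision)
print(struct.pack(">d", 85.125).hex().upper())  # 4055480000000000 (double precision)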

Modes of DMA Transfer
Mode-1: Burst Mode –
 In this mode a burst of data (the entire data block, or a burst of the block containing
the data) is transferred before the CPU takes back control of the buses from the DMAC.
 This is the quickest mode of DMA transfer, since a large amount of data is transferred
at once, which saves a correspondingly large amount of time.
Percentage of time the CPU remains blocked:
Let the time taken to prepare the data be Tx and the time taken to transfer the data be Ty.
Then the percentage of time the CPU remains blocked due to DMA is as follows.
Percentage of time CPU remains in blocked state = Ty x 100% / (Tx + Ty)
Mode-2: Cycle Stealing Mode –
 A slow I/O device takes some time to prepare the data (or word), and within that time
the CPU keeps control of the buses.
 Once the data or the word is ready, the CPU gives control of the system buses back to
the DMAC for one cycle, in which the prepared word is transferred to memory.
 Compared to burst mode, this mode is a little slower, since it includes the time
consumed by the I/O device while preparing the data.
Percentage of time the CPU remains blocked:
Let the time taken to prepare the data be Tx and the time taken to transfer the data be Ty.
Then the percentage of time the CPU remains blocked due to DMA is as follows.
Percentage of time CPU remains in blocked state = Ty x 100% / Tx
Mode-3: Interleaving Mode –
 Control of the buses is given to the DMAC only when the CPU does not require the
system buses.
 In this mode, the CPU is not blocked due to DMA at all.
 This is the slowest mode of DMA transfer, since the DMAC may have to wait a long
time just to get access to the system buses from the CPU.
 Consequently, less data is transferred in a given time.
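A quick numeric check of the two formulas above (the Tx and Ty values are made up purely for illustration):

Tx, Ty = 8.0, 2.0              # Tx: time to prepare a word, Ty: time to transfer it
burst = Ty * 100 / (Tx + Ty)   # burst mode
steal = Ty * 100 / Tx          # cycle stealing mode
print(f"burst mode: CPU blocked {burst:.0f}% of the time")      # 20%
print(f"cycle stealing: CPU blocked {steal:.0f}% of the time")  # 25%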

There are three types of techniques under the Direct Memory Access Data Transfer:

1. Burst or block transfer DMA
2. Cycle steal or single-byte transfer DMA
3. Transparent or hidden DMA

Burst or Block Transfer DMA

 This is the fastest DMA mode of data transfer.


 In the Burst mode, at least two or more bytes of data are transferred continuously i.e.,
the entire block of data is transferred in a straight sequence. Here, the microprocessor
is disconnected from the main system during the data transfer, and thus, the processor
is unable to execute any program on its own during the information transfer.
 In fact, in this mode, the DMA controller acts as a Master.
 If there are N bytes to be transferred, then N machine cycles are taken from the
processor for the transfer.
 The DMA controller first sends a HOLD signal to the microprocessor to request
access for the system’s buses, and in turn, wait for the HLDA signal.
 Once the HLDA signal is received, the DMA controller gets access over the system bus
and sends one byte of information.
 Once a single byte is sent, the memory address is incremented, the counter is
decremented, and then the next byte is sent.
 Thus, following this technique, all data bytes are transferred between memory and I/O
devices. Once all the information is sent, the DMA controller disables the HOLD
signal.
 It then enters into the slave mode.
 This method is generally used for loading data files or important programs into the
memory. However, it keeps the CPU inactive for relatively long periods.


Cycle steal or Single-Byte transfer DMA

 In this mode, only a single byte is transferred at a time.


 This is thus much slower than burst DMA.
 In cycle-steal DMA, the DMA controller sends a HOLD signal to the
microprocessor.
 It then waits for the HLDA signal in return.
 Once the HLDA signal is received, it gets access over the system buses and executes
one DMA cycle only.
 After this transfer, the HOLD signal is disabled, and it enters into the slave mode.
 The processor thus gets back its control over the address and data bus and continues
executing the following machine cycle.
 However, if the counter has not touched down to zero, and there is still data to be
exchanged, then the DMA controller sends a HOLD signal again to the processor, and
sends the next byte of the information block.

Thus, only one DMA cycle takes place between every two machine cycles of the processor,
and the execution speed of the instructions in the microprocessor falls back a bit.
The DMA Controller obtains the access for the system buses by repeatedly issuing the
requests using the Bus Request (BR) and Bus Grant (BG) signals until the entire data block
has been transferred.
Though the information bytes are not transferred as fast as in the Burst mode, the only
advantage of the cycle-steal mode is that the CPU does not remain idle for long durations of
time, as in the burst DMA mode. This method is mainly used in controllers which are used in
monitoring data in real-time.
Transparent or Hidden DMA transfer

 The Hidden DMA Transfer method is considered the slowest among all the other
DMA transfers.
 Here, the microprocessor executes some of its states during which it floats the data
bus and the address bus.
 In these states, the microprocessor gets isolated from the main system bus.
 In this isolation, the DMA controller transfers the information byte between the
external peripherals and the memory. This, thus, becomes transparent to the
microprocessor.
 The instruction execution speed of the microprocessor does not get reduced. However,
this DMA mode requires extra logic to sense the states in which the processor is
floating the buses.
 This mode is favored at the time when we do not want the CPU to stop executing its
main program, as the data transfer is performed only when the CPU does not need the
system bus.
 Nevertheless, the hardware needed to check such states can be pretty complex and a
little too expensive as well.
 Cycle Steal:
A read or write signal is generated by the DMAC, and the I/O device either generates
or latches the data. The DMAC effectively steals cycles from the processor in order to
transfer the byte, so single-byte transfer is also known as cycle stealing.


i. Requests by DMA devices for using the bus are always given higher priority than
processor requests.
ii. Among different DMA devices, top priority is given to high-speed peripherals such
as disks, high-speed network interfaces, and graphics display devices.
iii. Since the processor initiates most memory access cycles, it is often stated that
DMA steals memory cycles from the processor (cycle stealing) for its purpose.
iv. If DMA controller is given exclusive access to the main memory to transfer a block
of data without interruption, this is called block or burst mode.
 Burst Transfer:
i. To achieve block transfers, some DMAC's incorporate an automatic sequencing of
the value presented on the address bus. A register is used as a byte count, being
decremented for each byte transfer, and upon the byte count reaching zero, the DMAC
will release the bus. When the DMAC operates in burst mode, the CPU is halted for
the duration of the data transfer.
ii. In burst mode, the processor is stopped completely until the DMA transfer is
completed. Although the processor has no control over its system during such a delay,
this mode appears to be more appropriate when predictability is the main goal. The
main disadvantage being that the CPU is halted for the time when the DMA is in
control of the bus.
 Hidden mode/ Transparent mode:
i. It is possible to perform hidden DMA, which is transparent to the normal operation
of the CPU. In other words, the bus is grabbed by the DMAC when the processor is
not using it. The DMAC monitors the execution of the processor, and when it
recognises the processor executing an instruction which has sufficient empty clock
cycles to perform a byte transfer; it waits till the processor is decoding the op code
and then grabs the bus during this time.
ii. The processor is not slowed down, but continues processing normally. Naturally,
the data transfer by the DMAC must be completed before the processor needs the bus
again.

Amdahl's Law
Amdahl's argument is a formula which gives the theoretical speedup in latency of the
execution of a task at a fixed workload that can be expected of a system whose resources
are improved. In other words, it is a formula used to find the maximum improvement
possible by improving just a particular part of a system. It is often used in parallel
computing to predict the theoretical speedup when using multiple processors.

Amdahl’s law uses two factors to find speedup from some enhancement –
 Fraction enhanced – the fraction of the computation time in the original computer that
can be converted to take advantage of the enhancement. For example, if 10 seconds of the
execution time of a program that takes 40 seconds in total can use an enhancement, the
fraction is 10/40. This value is the fraction enhanced, and it is always less than 1.

 Speedup enhanced – the improvement gained by the enhanced execution mode; that is,
how much faster the task would run if the enhanced mode were used for the entire program.
For example, if the enhanced mode takes 3 seconds for a portion of the program that takes
6 seconds in the original mode, the improvement is 6/3. This value is the speedup
enhanced, and it is always greater than 1.

The overall speedup is the ratio of the original execution time to the enhanced execution
time. Amdahl's law can be formulated the following way:

S_latency = 1 / ((1 - p) + p / s)

where
S_latency is the theoretical speedup of the execution of the whole task;
s is the speedup of the part of the task that benefits from improved system resources;
p is the proportion of execution time that the part benefiting from improved resources
originally occupied.
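Amdahl's law as a function, checked against the fractions used above (p = 10/40 of the run time can be enhanced, and that part runs s = 6/3 = 2 times faster):

def amdahl_speedup(p, s):
    # Overall speedup when fraction p of the work is sped up by factor s
    return 1.0 / ((1.0 - p) + p / s)

print(amdahl_speedup(10 / 40, 6 / 3))  # ~1.14x overall speedup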

Nano Programming
1. In some microprogrammed processors, an instruction fetched from memory is
interpreted by a microprogram stored in a single control memory (CM). In other
microprogrammed processors, the microinstructions are not directly used by the
decoder to generate control signals; instead, a second control memory, called a nano
control memory (nCM), is used.
2. There are thus two levels of control memory: the higher-level control memory is
known as the micro control memory (µCM) and the lower-level one as the nano
control memory (nCM). This is shown in Figure 7.
3. A microinstruction is held in the primary control-store memory; the control signals
for each microinstruction are then generated using a secondary control-store memory.
The output word from the secondary memory is called a nano instruction.
4. The µCM stores microinstructions, whereas the nCM stores nano instructions.
The decoder uses nano instructions from the nCM to generate control signals.


Thus Nano programming gives an alternative strategy to generate control signals.

Nano instruction addresses are generated by a nano program counter, and nano
instructions are placed in a register nIR. The next address is generated either by
incrementing the nano program counter or by loading it from an external source (a branch
field, or an address derived from the microinstruction opcode).

Advantages of Nano programming

1. Reduced total size of the required control memory
In the two-level control design technique, the total control memory size S2 can be
calculated as
S2 = Hm x Wm + Hn x Wn
where Hm is the number of words in the high-level memory,
Wm is the word size in the high-level memory,
Hn is the number of words in the low-level memory, and
Wn is the word size in the low-level memory.
In nano programming we have a highly parallel horizontal organization, which makes
Wn large and Hn small. The comparable size for a single-level control unit is
S1 = Hm x Wn, which is larger than S2. The reduced control memory size reduces the
total chip area.
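A quick size comparison (the Hm, Wm, Hn, Wn values here are hypothetical numbers chosen only to illustrate the S1/S2 formulas above):

Hm, Wm = 2048, 16        # many short microinstructions in the uCM
Hn, Wn = 128, 64         # few wide nano instructions in the nCM
S2 = Hm * Wm + Hn * Wn   # two-level control memory size, in bits
S1 = Hm * Wn             # equivalent single-level size, in bits
print(S2, S1, S1 > S2)   # 40960 131072 True: the two-level design is much smaller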

2. Greater design flexibility
Because of the two-level memory organization, more design flexibility exists between
instructions and hardware.
Disadvantage of Nano programming
1. Increased memory access time:
The main disadvantage of the two-level memory approach is the loss of speed due to the
extra memory access required for the nano control memory.
Write a program in 8086 microprocessor to find the addition of two 8-bit BCD numbers,
where the numbers are stored from starting memory address 2000:500; store the result at
memory address 2000:600 and the carry at 2000:601.
Example –


Algorithm –

1. Load data from offset 500 to register AL (first number)


2. Load data from offset 501 to register BL (second number)
3. Add these two numbers (contents of register AL and register BL)
4. Apply DAA instruction (decimal adjust)
5. Store the result (content of register AL) to offset 600
6. Set register AL to 00
7. Add contents of register AL to itself with carry
8. Store the result (content of register AL) to offset 601
9. Stop
Program


MEMORY ADDRESS MNEMONICS COMMENT

400 MOV AL, [500] AL<-[500]

404 MOV BL, [501] BL<-[501]

408 ADD AL, BL AL<-AL+BL

40A DAA DECIMAL ADJUST AL

40B MOV [600], AL AL->[600]

40F MOV AL, 00 AL<-00

411 ADC AL, AL AL<-AL+AL+cy(prev)

413 MOV [601], AL AL->[601]

417 HLT END

Explanation –
1. MOV AL, [500]: load data from offset 500 to register AL
2. MOV BL, [501]: load data from offset 501 to register BL
3. ADD AL, BL: ADD contents of registers AL AND BL
4. DAA: decimal adjust AL
5. MOV [600], AL: store data from register AL to offset 600
6. MOV AL, 00: set value of register AL to 00
7. ADC AL, AL: add contents of register AL to AL with carry
8. MOV [601], AL: store data from register AL to offset 601
9. HLT: stop

