COA Important Question Bank With Answers
Q1) Explain the memory segmentation and memory banking of 8086 Microprocessor.
Segmentation is the process in which the main memory of the computer is logically
divided into different segments, each with its own base address. It is used to enhance the
speed of execution of the computer system, so that the processor can fetch and execute
data from memory easily and quickly.
Need for Segmentation –
The Bus Interface Unit (BIU) contains four 16-bit special purpose registers (listed
below) called Segment Registers.
Code segment register (CS): is used for addressing memory location in the code
segment of the memory, where the executable program is stored.
Data segment register (DS): points to the data segment of the memory where the
data is stored.
Extra Segment Register (ES): also refers to a segment in the memory which is
another data segment in the memory.
Stack Segment Register (SS): is used for addressing stack segment of the memory.
The stack segment is that segment of memory which is used to store stack data.
The 8086 has 20 address lines, so the BIU sends out a 20-bit address in order to
access one of the 1 MB (2^20) memory locations. The four segment registers hold the
upper 16 bits of the starting addresses of the four memory segments of 64 KB each with
which the 8086 is working at that instant of time. A segment is a logical unit of memory
that may be up to 64 KB long and is made up of contiguous memory locations. It is an
independent, separately addressable unit. A segment's starting address is not fixed; it
changes whenever the corresponding segment register is reloaded.
Below is the one way of positioning four 64 kilobyte segments within the 1M byte memory
space of an 8086.
Types of Segmentation –
1. Overlapping Segments – A segment starts at a particular address and its
maximum size can go up to 64 KB. If another segment starts within this 64 KB
range of the first segment, the two segments are said to be overlapping.
2. Non-Overlapped Segments – A segment starts at a particular address and its
maximum size can go up to 64 KB. If another segment starts at or beyond the end of
this 64 KB range of the first segment, the two segments are said to be non-
overlapped.
Rules of Segmentation – The segmentation process follows these rules:
The starting address of a segment must be evenly divisible by 16.
The minimum size of a segment is 16 bytes and the maximum is 64 KB.
The 8086 processor provides a 16-bit data bus, so it is capable of transferring 16 bits in one
cycle. However, each memory location holds only a byte (8 bits), so two cycles would be
needed to access 16 bits (8 bits each) from two different memory locations. The solution to
this problem is Memory Banking. Through memory banking, the goal is to access two
consecutive memory locations in one cycle (transfer 16 bits).
The memory chip is equally divided into two parts (banks). One of the banks contains the
even addresses and is called the Even bank; the other contains the odd addresses and is
called the Odd bank. The Even bank always supplies the lower byte, so it is also called the
Lower bank (LB), and the Odd bank is also called the Higher bank (HB).
This banking scheme allows the processor to access two aligned memory locations from
both banks simultaneously and perform a 16-bit data transfer. Memory banking doesn't
make 16-bit transfers compulsory; it merely facilitates them. The choice between an 8-bit
and a 16-bit transfer depends on the instructions given by the programmer.
One memory bank contains all the bytes which have even addresses, such as 00000h,
00002h, and 00004h. The data lines of this bank are connected to the lower 8 data lines,
D0 to D7, of the 8086.
The other memory bank contains all the bytes which have odd addresses, such as 00001h,
00003h, and 00005h. The data lines of this bank are connected to the upper 8 data lines,
D8 to D15, of the 8086.
The least significant address bit A0 is not used for byte selection within a bank; it is
reserved for bank selection. A0 = 0 selects the Even bank, while the BHE signal selects the
Odd bank. The processor uses the combination of these two signals to decide the type of
data transfer.
For a 16-bit word starting at an odd address, two machine cycles are needed: the first
machine cycle generates the odd address (A0 = 1) and transfers the lower-order 8 data bits
on the higher-order data bus; in the second machine cycle, the remaining byte is transferred
on the lower-order data bus.
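The possible BHE/A0 combinations can be summarised with a small Python sketch (an illustrative model, not part of the 8086 hardware):

def transfer_type(bhe, a0):
    # BHE = 0 enables the odd (higher) bank; A0 = 0 enables the even (lower) bank
    if bhe == 0 and a0 == 0:
        return "16-bit word transfer (both banks)"
    if bhe == 0 and a0 == 1:
        return "upper byte to/from odd bank (D8-D15)"
    if bhe == 1 and a0 == 0:
        return "lower byte to/from even bank (D0-D7)"
    return "no transfer"

for bhe in (0, 1):
    for a0 in (0, 1):
        print(bhe, a0, "->", transfer_type(bhe, a0))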
Q2) With the help of diagram, explain 6-stage pipeline architecture and various pipeline
hazards with example.
1. A typical instruction cycle can be split into many sub cycles like Fetch instruction,
Decode instruction, Execute and Store. The instruction cycle and the corresponding
sub cycles are performed for each instruction. These sub cycles for different
instructions can thus be interleaved or in other words these sub cycles of many
instructions can be carried out simultaneously, resulting in reduced overall execution
time. This is called instruction pipelining.
2. The more stages the pipeline has, the higher the throughput of the CPU.
3. If the instruction processing is split into six phases, the pipelined CPU will have six
different stages for the execution of the sub phases.
4. The six stages are as follows: Fetch Instruction (FI), Decode Instruction (DI),
Calculate Operand addresses (CO), Fetch Operands (FO), Execute Instruction (EI),
and Write Operand (WO).
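The benefit of pipelining can be estimated with a small Python sketch (a simplified model assuming every stage takes one clock cycle and no hazards occur):

def cycles(n_instructions, n_stages):
    # Without pipelining, every instruction takes n_stages cycles.
    unpipelined = n_instructions * n_stages
    # With pipelining, the first instruction takes n_stages cycles and each
    # following instruction completes one cycle later.
    pipelined = n_stages + (n_instructions - 1)
    return unpipelined, pipelined

u, p = cycles(100, 6)          # 100 instructions on the 6-stage pipeline
print(u, p, round(u / p, 2))   # 600 vs 105 cycles, a speedup of about 5.7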
1. Pipeline hazards are situations that prevent the next instruction in the instruction
stream from executing during its designated clock cycles.
2. Any condition that causes a stall in the pipeline operations can be called a hazard.
3. There are primarily three types of hazards:
i. Data Hazards
ii. Control Hazards or instruction Hazards
iii. Structural Hazards.
i. Data Hazards:
A data hazard is any condition in which either the source or the destination operands of an
instruction are not available at the time expected in the pipeline. As a result, some
operation has to be delayed and the pipeline stalls. This occurs whenever there are two
instructions, one of which depends on the data produced by the other. For example:
A=3+A
B=A*4
For the above sequence, the second instruction needs the value of ‘A’ computed in the first
instruction.
Thus the second instruction is said to depend on the first.
If the execution is done in a pipelined processor, it is highly likely that the interleaving of
these two instructions can lead to incorrect results due to data dependency between the
instructions. Thus the pipeline needs to be stalled as and when necessary to avoid errors.
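The consequence of letting the second instruction read A too early can be shown with a small Python sketch (the initial value of A is an assumption for illustration):

# Correct sequential execution of the two instructions above
A = 5                     # assumed initial value
A = 3 + A                 # instruction 1 -> A = 8
B_correct = A * 4         # instruction 2 -> B = 32

# If the pipeline did not stall, instruction 2 could read the stale A
A_stale = 5
B_wrong = A_stale * 4     # -> 20, an incorrect result
print(B_correct, B_wrong)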
ii. Structural Hazards:
This situation arises mainly when two instructions require a given hardware resource at the
same time and hence for one of the instructions the pipeline needs to be stalled.
The most common case is when memory is accessed at the same time by two instructions.
One instruction may need to access the memory as part of the Execute or Write-back phase
while another instruction is being fetched. If both the instructions and the data reside in
the same memory, both accesses cannot proceed together, and one of them must be
stalled until the other is done with its memory access. Thus, in general, sufficient hardware
resources are needed to avoid structural hazards.
iii. Control hazards:
The instruction fetch unit of the CPU is responsible for providing a stream of instructions to
the execution unit. The instructions fetched by the fetch unit are in consecutive memory
locations and they are executed.
However, a problem arises when one of the instructions is a branch to some other memory
location. All the instructions fetched into the pipeline from consecutive memory locations
are then invalid and need to be removed (also called flushing the pipeline). This induces a
stall until new instructions are fetched from the memory address specified in the branch
instruction.
The time lost as a result of this is called the branch penalty. Often, dedicated hardware is
incorporated in the fetch unit to identify branch instructions and compute branch addresses
as early as possible, reducing the resulting delay.
Q3) Explain different cache mapping techniques.
Cache Mapping:
There are three different types of mapping used for the purpose of cache memory which are
as follows: Direct mapping, Associative mapping, and Set-Associative mapping. These are
explained below.
1. Direct Mapping –
The simplest technique, known as direct mapping, maps each block of main memory
into only one possible cache line. or
In direct mapping, each memory block is assigned to a specific line in the cache. If a line
is already occupied by a memory block when a new block needs to be loaded, the old
block is evicted. The address is split into two parts, an index field and a tag field; the
cache stores the tag field, while the index selects the line. Direct mapping's performance
is directly proportional to the hit ratio.
For purposes of cache access, each main memory address can be viewed as consisting
of three fields. The least significant w bits identify a unique word or byte within a block
of main memory. In most contemporary machines, the address is at the byte level. The
remaining s bits specify one of the 2^s blocks of main memory. The cache logic interprets
these s bits as a tag of s - r bits (the most significant portion) and a line field of r bits. This
latter field identifies one of the m = 2^r lines of the cache.
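A minimal Python sketch of this address interpretation, with parameter values assumed to match the direct-mapping walkthrough later in this answer:

def split_address(addr, w, r):
    # w: bits that select a word within a block; r: bits that select a cache line
    word = addr & ((1 << w) - 1)
    block = addr >> w            # the s-bit block number in main memory
    line = block % (1 << r)      # cache line = block mod 2**r
    tag = block >> r             # the remaining s - r tag bits
    return tag, line, word

# Assumed example: 16-word blocks (w = 4) and a 128-line cache (r = 7)
print(split_address(0xABCD, 4, 7))   # -> (21, 60, 13)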
2. Associative Mapping –
In this type of mapping, the associative memory is used to store content and addresses of
the memory word. Any block can go into any line of the cache. This means that the word id
bits are used to identify which word in the block is needed, but the tag becomes all of the
remaining bits. This enables the placement of any word at any place in the cache memory. It
is considered to be the fastest and the most flexible mapping form.
3. Set-associative Mapping –
This form of mapping is an enhanced form of direct mapping in which the drawbacks of
direct mapping are removed. Set-associative mapping addresses the problem of possible
thrashing in the direct mapping method: instead of having exactly one line that a block can
map to in the cache, a few lines are grouped together to create a set, and a block in memory
can map to any one of the lines of a specific set. Set-associative mapping allows each
index address in the cache to correspond to two or more words in main memory. Set-
associative cache mapping combines the best of the direct and associative cache mapping
techniques.
In this case, the cache consists of a number of sets, each of which consists of a number of
lines.
(1) Direct Mapping:-
1) The simplest way to determine the cache location in which to store a memory block is
the direct mapping technique.
2) Here, block j of the main memory maps onto block j modulo 128 of the cache. Thus
main memory blocks 0, 128, 256, … are stored at cache block 0; blocks 1, 129, 257, …
are stored at cache block 1, and so on.
3) Placement of a block in the cache is determined from the memory address. The memory
address is divided into 3 fields; the lower 4 bits select one of the 16 words in a block.
4) When a new block enters the cache, the 7-bit cache block field determines the cache
position in which this block must be stored.
5) The higher order 5-bits of the memory address of the block are stored in 5 tag bits
associated with its location in cache. They identify which of the 32 blocks that are mapped
into this cache position are currently resident in the cache.
6) It is easy to implement, but not flexible.
(2) Associative Mapping:-
1) This is a more flexible mapping method, in which a main memory block can be placed
into any cache block position.
2) In this case, 12 tag bits are required to identify a memory block when it is resident in the
cache.
3) The tag bits of an address received from the processor are compared to the tag bits of
each block of the cache to see if the desired block is present. This is known as the
Associative Mapping technique.
4) The cost of an associative-mapped cache is higher than the cost of a direct-mapped
cache because of the need to search all 128 tag patterns to determine whether a block is in
the cache. This is known as an associative search.
(3) Set-Associated Mapping:-
1) It is the combination of direct and associative mapping technique.
2) Cache blocks are grouped into sets, and the mapping allows a block of main memory to
reside in any block of a specific set. Hence the contention problem of direct mapping is
eased; at the same time, the hardware cost is reduced by decreasing the size of the
associative search.
3) Consider a cache with two blocks per set. In this case, memory blocks 0, 64, 128, …,
4032 map into cache set 0, and they can occupy either of the two blocks within this set.
4) Having 64 sets means that the 6-bit set field of the address determines which set of the
cache might contain the desired block. The tag bits of the address must be associatively
compared to the tags of the two blocks of the set to check if the desired block is present.
This is a two-way associative search.
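A small Python sketch of the two-way set-associative placement described above (64 sets, values taken from the example):

def cache_set(block, num_sets=64):
    # In a two-way set-associative cache with 64 sets, main memory block j
    # can occupy either line of set (j mod 64)
    return block % num_sets

for j in (0, 64, 128, 4032):
    print("block", j, "-> set", cache_set(j))   # all of these map to set 0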
The internal architecture of the Intel 8086 is divided into 2 units: the Bus Interface Unit
(BIU) and the Execution Unit (EU). These are explained below.
Bus Interface Unit (BIU):
It provides the interface of 8086 to external memory and I/O devices via the System Bus. It
performs various machine cycles such as memory read, I/O read etc. to transfer data
between memory and I/O devices.
BIU performs the following functions-
It generates the 20 bit physical address for memory access.
Physical Address
= Segment Address x 10H + Offset Address
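A minimal Python sketch of this 20-bit physical address calculation (the segment and offset values are assumptions for illustration):

def physical_address(segment, offset):
    # PA = segment x 10H + offset, truncated to 20 bits
    return ((segment << 4) + offset) & 0xFFFFF

print(hex(physical_address(0x1000, 0x0020)))   # 0x10020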
6 Byte Pre-fetch Queue:
It is a 6 byte queue (FIFO).
Fetching the next instruction (by BIU from CS) while executing the current
instruction is called pipelining.
Gets flushed whenever a branch instruction occurs.
The main components of the EU are General purpose registers, the ALU, Special purpose
registers, Instruction Register and Instruction Decoder and the Flag/Status Register.
1. Fetches instructions from the Queue in BIU, decodes and executes arithmetic and
logic operations using the ALU.
2. Sends control signals for internal data transfer operations within the microprocessor.
3. Sends request signals to the BIU to access the external module.
4. It operates with respect to T-states (clock cycles) and not machine cycles.
The 8086 has four 16-bit general purpose registers: AX, BX, CX and DX. They store
intermediate values during execution, and each of them has two 8-bit parts (higher and
lower).
AX register:
It holds operands and results during multiplication and division operations. Also an
accumulator during String operations.
BX register:
It holds the memory address (offset address) in indirect addressing modes.
CX register:
It holds count for instructions like loop, rotate, shift and string operations.
DX register:
It is used with AX to hold 32 bit values during multiplication and division.
6 Status flags:
1. carry flag (CF)
2. parity flag(PF)
3. auxiliary carry flag(AF)
4. zero flag (ZF)
5. sign flag (SF)
6. overflow flag (OF)
Status flags are updated after every arithmetic and logic operation.
3 Control flags:
1. trap flag(TF)
2. interrupt flag(IF)
3. direction flag(DF)
These flags can be set or reset using control instructions like CLC, STC, CLD, STD, CLI,
STI, etc.
The Control flags are used to control certain operations.
The key characteristics of a computer memory system are:
1. Location
2. Capacity
3. Unit of Transfer
4. Access Method
5. Performance
6. Physical type
7. Physical characteristics
8. Organization
1. Location:
It deals with the location of the memory device in the computer system. There are three
possible locations:
CPU : This is often in the form of CPU registers and small amount of cache
Internal or main: This is the main memory like RAM or ROM. The CPU can directly
access the main memory.
External or secondary: It comprises secondary storage devices like hard disks and
magnetic tapes. The CPU doesn't access these devices directly; it uses device
controllers to access secondary storage devices.
2. Capacity:
The capacity of any memory device is expressed in terms of: i)word size ii)Number of words
Word size: Words are expressed in bytes (8 bits). A word can however mean any
number of bytes. Commonly used word sizes are 1 byte (8 bits), 2bytes (16 bits) and 4
bytes (32 bits).
Number of words: This specifies the number of words available in the particular
memory device. For example, if a memory device is given as 4K x 16, this means the
device has a word size of 16 bits and a total of 4096 (4K) words in memory.
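For example, the total capacity of the 4K x 16 device above is 4096 words x 16 bits = 65,536 bits = 8,192 bytes (8 KB).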
3. Unit of Transfer:
It is the maximum number of bits that can be read or written into the memory at a time. In
case of main memory, it is mostly equal to word size. In case of external memory, unit of
transfer is not limited to the word size; it is often larger and is referred to as blocks.
4. Access Methods:
It is a fundamental characteristic of memory devices. It is the sequence or order in which
memory can be accessed. There are three types of access methods:
Random Access: If storage locations in a particular memory device can be accessed
in any order and access time is independent of the memory location being accessed.
Such memory devices are said to have a random access mechanism. RAM (Random
Access Memory) IC’s use this access method.
Serial Access: If memory locations can be accessed only in a certain predetermined
sequence, this access method is called serial access. Magnetic Tapes, CD-ROMs
employ serial access methods.
Semi random Access: Memory devices such as Magnetic Hard disks use this access
method. Here each track has a read/write head thus each track can be accessed
randomly but access within each track is restricted to a serial access.
5. Performance: The performance of the memory system is determined using three
parameters
Access Time: In random access memories, it is the time taken by memory to
complete the read/write operation from the instant that an address is sent to the
memory. For non-random access memories, it is the time taken to position the read
write head at the desired location. Access time is widely used to measure performance
of memory devices.
Memory cycle time: It is defined only for Random Access Memories and is the sum
of the access time and the additional time required before the second access can
commence.
Transfer rate: It is defined as the rate at which data can be transferred into or out of a
memory unit.
6. Physical type: Memory devices can be either semiconductor memory (like RAM) or
magnetic surface memory (like Hard disks).
7. Physical Characteristics:
Erasable/Non-erasable: Memories in which data, once programmed, cannot be
erased are called non-erasable memories. Memory devices in which the data can be
erased are called erasable memories.
E.g. RAM (erasable), ROM (non-erasable).
8. Organization:
This refers to the physical arrangement of bits to form words.
8 to 1 Multiplexer
In the 8 to 1 multiplexer, there are a total of eight inputs, i.e., A0, A1, A2, A3, A4, A5, A6,
and A7, 3 selection lines, i.e., S0, S1 and S2, and a single output, i.e., Y. On the basis of the
combination of inputs present at the selection lines S0, S1, and S2, one of these 8 inputs is
connected to the output. The block diagram and the truth table of the 8x1 multiplexer are
given below.
Block Diagram:
Truth Table:
S2 S1 S0 | Y
0  0  0  | A0
0  0  1  | A1
0  1  0  | A2
0  1  1  | A3
1  0  0  | A4
1  0  1  | A5
1  1  0  | A6
1  1  1  | A7
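The selection logic can also be written as a small Python sketch (an illustrative model, not a hardware description):

def mux_8_to_1(inputs, s2, s1, s0):
    # inputs is the list [A0, A1, ..., A7]; the select lines form a 3-bit index
    return inputs[(s2 << 2) | (s1 << 1) | s0]

A = [0, 1, 0, 1, 1, 0, 0, 1]      # example values on A0..A7
print(mux_8_to_1(A, 1, 0, 1))     # S2 S1 S0 = 101 selects A5 -> 0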
An I/O module performs the following functions:
1. Processor communication – this involves the following tasks: (a) exchange of data
between processor and I/O module; (b) command decoding – the I/O module accepts
commands sent from the processor, e.g., the I/O module for a disk drive may accept
the following commands from the processor: READ SECTOR, WRITE SECTOR,
SEEK track, etc.; (c) status reporting – the device must be able to report its status to
the processor, e.g., disk drive busy, ready, etc. Status reporting may also involve
reporting various errors; (d) address recognition – each I/O device has a unique
address and the I/O module must recognize this address.
3. Control & timing – The I/O module must be able to co-ordinate the flow of data
between the internal resources (such as processor, memory) and external devices.
4. Data buffering – This is necessary as there is a speed mismatch between speed of data
transfer between processor and memory and external devices. Data coming from the
main memory are sent to an I/O module in a rapid burst. The data is buffered in the
I/O module and then sent to the peripheral device at its rate.
5. Error detection – The I/O module must also be able to detect errors and report them to
the processor. These errors may be mechanical errors (such as paper jam in a printer),
or changes in the bit pattern of transmitted data. A common way of detecting such
errors is by using parity bits.
In the design of a computer system, a processor as well as a large amount of memory is
required; however, the main problem is that these parts are expensive. The memory
organization of the system is therefore done as a memory hierarchy. It has several levels
of memory with different performance rates, all serving the same purpose: to reduce the
average access time. The memory hierarchy was developed based on the behavior of
typical programs.
The memory in a computer can be divided into five hierarchies based on speed as well as
use. The processor can move from one level to another based on its requirements. The five
hierarchies in the memory are registers, cache, main memory, magnetic discs, and magnetic
tapes. The first three are volatile memories, meaning they lose their stored data when
power is removed, whereas the last two are non-volatile, meaning they store the data
permanently.
The memory hierarchy design in a computer system mainly includes different storage
devices. Most computers are built with extra storage, beyond the main memory capacity,
to run more powerfully. The following memory hierarchy diagram is a hierarchical
pyramid for computer memory. The memory hierarchy design is divided into two types:
primary (internal) memory and secondary (external) memory.
Primary Memory
The primary memory is also known as internal memory, and this is accessible by the
processor straightly. This memory includes main, cache, as well as CPU registers.
Secondary Memory
The secondary memory is also known as external memory, and this is accessible by the
processor through an input/output module. This memory includes an optical disk, magnetic
disk, and magnetic tape.
Level-0 − Registers
Usually, a register is static RAM (SRAM) in the processor of the computer, used for
holding a data word, which is typically 64 or 128 bits. The registers are present inside
the CPU, and as such they have the least access time. Registers are the most expensive
and the smallest in size, generally measured in kilobytes. They are implemented using
flip-flops.
Level-1 − Cache
Cache memory is used to store the segments of a program that are frequently accessed by the
processor. It is expensive and smaller in size generally in Megabytes and is implemented by
using static RAM.
Level-2 − Primary or Main Memory
It directly communicates with the CPU and with auxiliary memory devices through an I/O
processor. Main memory is less expensive than cache memory and larger in size generally in
Gigabytes. This memory is implemented by using dynamic RAM.
Level-3 − Secondary storage
Secondary storage devices like Magnetic Disk are present at level 3. They are used as backup
storage. They are cheaper than main memory and larger in size generally in a few TB.
The sequence of instructions read from memory constitutes an instruction stream, and the
operations performed on the data in the processor constitute a data stream.
Flynn's classification divides computers into four major groups based on the number of
concurrent instruction and data streams:
1. Single-instruction, single-data (SISD) systems –
An SISD computing system is a uniprocessor machine which is capable of executing a
single instruction, operating on a single data stream. In SISD, machine instructions are
processed in a sequential manner and computers adopting this model are popularly
called sequential computers. Most conventional computers have SISD architecture. All
the instructions and data to be processed have to be stored in primary memory.
The speed of the processing element in the SISD model is limited by (dependent on) the
rate at which the computer can transfer information internally. Dominant representative
SISD systems are the IBM PC and workstations.
2. Single-instruction, multiple-data (SIMD) systems –
An SIMD system is a multiprocessor machine capable of executing the same instruction
on all the CPUs but operating on different data streams. Machines based on an SIMD
model are well suited to scientific computing since they involve lots of vector and
matrix operations. So that the information can be passed to all the processing elements
(PEs), the organized data elements of the vectors can be divided into multiple sets (N sets
for N-PE systems), and each PE can process one data set.
3. Multiple-instruction, single-data (MISD) systems –
An MISD computing system is a multiprocessor machine capable of executing different
instructions on different PEs, all operating on the same data set.
Example: Z = sin(x) + cos(x) + tan(x)
Here the system performs different operations on the same data set. Machines built using
the MISD model are not useful in most applications; a few machines have been built, but
none of them are available commercially.
4. Multiple-instruction, multiple-data (MIMD) systems –
An MIMD system is a multiprocessor machine capable of executing multiple
instructions on multiple data sets. Each PE in the MIMD model has separate instruction
and data streams; therefore, machines built using this model are suited to any kind of
application. Unlike SIMD and MISD machines, the PEs in MIMD machines work
asynchronously.
Comparison of a Hardwired Control Unit and a Microprogrammed Control Unit:
ATTRIBUTE | HARDWIRED CONTROL UNIT | MICROPROGRAMMED CONTROL UNIT
Cost of implementation | Costlier. | Cheaper.
Decoding | Complex decoding and sequencing logic. | Easier decoding and sequencing logic.
Size of instruction set | Small. | Large.
Modifiability | It is not possible to change the structure and instruction set once it is developed. | Modifications can be made by changing the microprogram saved in the control memory.
Design complexity | The design of the computer is complex. | The design of the computer is simplified.
Specification | The architecture and instruction set are not specified. | The architecture and instruction set are specified.
Signal generation | It has a processor to create signals to be executed in the right sequence. | It uses a microsequencer from which instruction bits are decoded and executed.
Hardware | It operates through the use of drums, flip-flops, flip chips, and sequential circuits. | It controls the sub-devices including the ALU, registers, buses, and instruction registers.
Q12) Identify the addressing modes of the following instructions
1. MOV AX,1000
It uses immediate addressing mode.
The value 1000 is immediately copied into AX.
2. MOV AX,[1000]
It uses direct addressing mode.
The data bytes from memory location [1001]:[1000] will be copied in (AH):(AL)
registers.
3. MOV AX,BX
It uses register addressing mode.
The 16-bit data word from register (BX) will be copied into (AX).
4. MOV [BX],AX
It uses register indirect addressing.
The 16-bit data word from register (AX) will be copied into two successive memory
locations which are indirectly addressed by BX:
[BX] ← (AL)
[BX + 1] ← (AH)
5. MOV AX,[SI+200]
It uses register relative addressing.
It copies a 16-bit data word from two successive memory locations into AX as follows:
(AL) ← [SI + 200]
(AH) ← [SI + 200 + 1]
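The effective-address calculations in these examples can be mimicked with a small Python sketch (the register contents and the displacement value are assumptions for illustration):

# Assumed register contents, purely for illustration
BX, SI = 0x2000, 0x0100

ea_direct = 0x1000            # MOV AX,[1000]  : EA comes from the instruction
ea_reg_indirect = BX          # MOV [BX],AX    : EA = (BX)
ea_reg_relative = SI + 0x200  # MOV AX,[SI+200]: EA = (SI) + displacement

print(hex(ea_direct), hex(ea_reg_indirect), hex(ea_reg_relative))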
The DMA unit communicates with the CPU through the data bus and control lines. The
CPU selects a register within the DMA controller through the address bus by enabling the
DS (DMA select) and RS (register select) inputs. RD and WR are bidirectional lines. When
the BG (bus grant) input is 0, the CPU can communicate with the DMA registers. When
the BG (bus grant) input is 1, the CPU has relinquished the buses and the DMA can
communicate directly with the memory.
All registers in the DMA appear to the CPU as I/O interface registers. Therefore, the CPU
can both read and write into the DMA registers under program control via the data bus.
The CPU initializes the DMA by sending the given information through the data bus.
The starting address of the memory block where the data is available (to read) or
where data are to be stored (to write).
It also sends the word count, which is the number of words in the memory block to be
read or written.
Control to define the mode of transfer such as read or write.
A control to begin the DMA transfer.
DMA transfer is also referred to as cycle stealing, because the DMA module in effect
steals a bus cycle.
When the processor wishes to read or write a block of data, it issues a command to the DMA
module, by sending to the DMA module the following information:
i. Whether a read or write is requested, using the read or write control line between
the processor and the DMA module
ii. The address of the I/O device involved, communicated on the data lines
iii. The starting location in memory to read from or write to, communicated on the
data lines and stored by the DMA module in its address register
iv. The number of words to be read or written, again communicated via the data lines
and stored in the data count register
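The information handed to the DMA module can be pictured as a small Python sketch (the field names are illustrative, not actual hardware register names):

from dataclasses import dataclass

@dataclass
class DMACommand:
    # Field names are illustrative; they mirror items i-iv above
    write: bool          # i.  read or write control line
    device_address: int  # ii. the I/O device involved
    start_address: int   # iii. loaded into the DMA address register
    word_count: int      # iv. loaded into the DMA data count register

cmd = DMACommand(write=False, device_address=0x3F,
                 start_address=0x8000, word_count=512)
print(cmd)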
The Control Logic in the DMA module is responsible for the generation of control
signals.
The processor then continues with other work. It has delegated this I/O operation to
the DMA module. The DMA module transfers the entire block of data, one word at a
time, directly to or from memory, without going through the processor. When the
transfer is complete, the DMA module sends an interrupt signal to the processor.
Thus, the processor is involved only at the beginning and end of the transfer. The
processor may be suspended at different points in its instruction cycle; in each case, it is
suspended just before it needs to use the bus. The DMA module then transfers one
word and returns control to the processor.
The IEEE Standard for Floating-Point Arithmetic (IEEE 754) is a technical standard for
floating-point computation which was established in 1985 by the Institute of Electrical
and Electronics Engineers (IEEE).
IEEE 754 has 3 basic components:
1. The Sign of Mantissa –
This is as simple as the name. 0 represents a positive number while 1 represents a
negative number.
2. The Biased exponent –
The exponent field needs to represent both positive and negative exponents. A bias is
added to the actual exponent in order to get the stored exponent.
3. The Normalised Mantissa –
The mantissa is the part of a number in scientific notation or a floating-point number
consisting of its significant digits. Here we have only 2 digits, i.e. 0 and 1. So a
normalised mantissa is one with only one 1 to the left of the binary point.
These floating point numbers can be represented using IEEE 754 format as given below.
1. Single precision format:
It is a 32-bit format and uses a bias value of (127)10 to calculate the biased exponent.
2. Double precision format:
It is a 64-bit format and uses a bias value of (1023)10 = 3FFH to calculate the biased
exponent.
Example –
85.125
85 = 1010101
0.125 = 001
85.125 = 1010101.001
=1.010101001 x 2^6
sign = 0
1. Single precision:
biased exponent 127+6=133
133 = 10000101
Normalised mantissa = 010101001
we will add 0's to complete the 23 bits
The representation is: 0 10000101 01010100100000000000000 = (42AA4000)H
2. Double precision:
biased exponent = 1023 + 6 = 1029
1029 = 10000000101
Normalised mantissa = 010101001
we will add 0's to complete the 52 bits
The representation is: 0 10000000101 0101010010000…0 = (4055480000000000)H
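These hand computations can be cross-checked in Python with the standard struct module, which packs values in IEEE 754 format:

import struct

# Pack 85.125 in IEEE 754 format and inspect the bytes
print(struct.pack('>f', 85.125).hex())   # 42aa4000         (single precision)
print(struct.pack('>d', 85.125).hex())   # 4055480000000000 (double precision)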
Modes of DMA Transfer:
Mode-1:
Burst Mode –
In this mode, a burst of data (the entire block of data) is transferred before the CPU
takes control of the buses back from the DMAC.
This is the quickest mode of DMA transfer, since a huge amount of data is transferred
at once, and a correspondingly large amount of time is saved.
Percentage of Time CPU remains blocked:
Let time taken to prepare the data be Tx and time taken to transfer the data be Ty. Then
percentage of time CPU remains blocked due to DMA is as follows.
Percentage of time CPU remains in blocked state = (Ty * 100%) / (Tx + Ty)
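A quick Python sketch of this formula (the Tx and Ty values are assumed for illustration):

def blocked_percentage(tx, ty):
    # Fraction of time the CPU is blocked while the DMAC holds the buses
    return ty * 100 / (tx + ty)

print(blocked_percentage(tx=40, ty=10))   # 20.0 (percent)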
Mode-2:
Cycle Stealing Mode –
A slow I/O device will take some time to prepare the data (word), and during that
time the CPU keeps control of the buses.
Once the data or word is ready, the CPU gives control of the system buses back to
the DMAC for one cycle, in which the prepared word is transferred to memory.
Compared to burst mode, this mode is a little slower, since it includes the time
consumed by the I/O device while preparing the data.
Percentage of Time CPU remains blocked:
Let time taken to prepare data be Tx and time taken to transfer the data be Ty. Then
percentage of time CPU remains blocked due to DMA is as follows.
Percentage of time CPU remains in blocked state = (Ty * 100%) / (Tx + Ty)
Mode-3:
Interleaving Mode –
Whenever CPU does not require the system buses then only control of buses will be
given to DMAC.
In this mode, CPU will not be blocked due to DMA at all.
This is the slowest mode of DMA transfer, since the DMAC may have to wait a long
time just to get access to the system buses from the CPU.
Hence, less data is transferred.
There are three types of techniques under Direct Memory Access data transfer: burst
transfer, cycle stealing, and transparent (hidden) transfer.
In cycle stealing, only one DMA cycle takes place between every two machine cycles of
the processor, so the execution speed of the instructions in the microprocessor falls back a
bit.
The DMA Controller obtains the access for the system buses by repeatedly issuing the
requests using the Bus Request (BR) and Bus Grant (BG) signals until the entire data block
has been transferred.
Though the information bytes are not transferred as fast as in the Burst mode, the only
advantage of the cycle-steal mode is that the CPU does not remain idle for long durations of
time, as in the burst DMA mode. This method is mainly used in controllers which are used in
monitoring data in real-time.
Transparent or Hidden DMA transfer
The Hidden DMA Transfer method is considered the slowest among all the other
DMA transfers.
Here, the microprocessor executes some of its states during which it floats the data
bus and the address bus.
In these states, the microprocessor gets isolated from the main system bus.
In this isolation, the DMA controller transfers the information byte between the
external peripherals and the memory. This, thus, becomes transparent to the
microprocessor.
The instruction execution speed of the microprocessor does not get reduced. However,
this DMA mode requires extra logic to sense the states in which the processor is
floating the buses.
This mode is favored at the time when we do not want the CPU to stop executing its
main program, as the data transfer is performed only when the CPU does not need the
system bus.
Nevertheless, the hardware needed to check such states can be pretty complex and a
little too expensive as well.
Cycle Steal:
A read or write signal is generated by the DMAC, and the I/O device either generates
or latches the data. The DMAC effectively steals cycles from the processor in order to
transfer the byte, so single-byte transfer is also known as cycle stealing.
i. Requests by DMA devices for using the bus are always given higher priority than
processor requests.
ii. Among different DMA devices, top priority is given to high-speed peripherals such
as disks, high-speed network interfaces, and graphics display devices.
iii. Since the processor initiates most memory access cycles, it is often stated that
DMA steals memory cycles from the processor (cycle stealing) for its purpose.
iv. If DMA controller is given exclusive access to the main memory to transfer a block
of data without interruption, this is called block or burst mode.
Burst Transfer:
i. To achieve block transfers, some DMACs incorporate an automatic sequencing of
the value presented on the address bus. A register is used as a byte count, being
decremented for each byte transfer, and upon the byte count reaching zero, the DMAC
releases the bus. When the DMAC operates in burst mode, the CPU is halted for
the duration of the data transfer.
ii. In burst mode, the processor is stopped completely until the DMA transfer is
completed. Although the processor has no control over its system during such a delay,
this mode appears to be more appropriate when predictability is the main goal. The
main disadvantage being that the CPU is halted for the time when the DMA is in
control of the bus.
Hidden mode/ Transparent mode:
i. It is possible to perform hidden DMA, which is transparent to the normal operation
of the CPU. In other words, the bus is grabbed by the DMAC when the processor is
not using it. The DMAC monitors the execution of the processor, and when it
recognises the processor executing an instruction which has sufficient empty clock
cycles to perform a byte transfer; it waits till the processor is decoding the op code
and then grabs the bus during this time.
ii. The processor is not slowed down, but continues processing normally. Naturally,
the data transfer by the DMAC must be completed before the processor starts to use
the bus again.
Amdahl’s argument is a formula which gives the theoretical speedup in latency of the
execution of a task at a fixed workload that can be expected of a system whose resources
are improved. In other words, it is a formula used to find the maximum improvement
possible by just improving a particular part of a system. It is often used in parallel
computing to predict the theoretical speedup when using multiple processors.
Amdahl’s law uses two factors to find speedup from some enhancement –
Fraction enhanced – The fraction of the computation time in the original computer that
can be converted to take advantage of the enhancement. For example- if 10 seconds of the
execution time of a program that takes 40 seconds in total can use an enhancement , the
fraction is 10/40. This obtained value is Fraction Enhanced.
Fraction enhanced is always less than 1.
Speedup enhanced – The improvement gained by the enhanced execution mode; that is,
how much faster the task would run if the enhanced mode were used for the entire program.
For example – If the enhanced mode takes, say 3 seconds for a portion of the program, while
it is 6 seconds in the original mode, the improvement is 6/3. This value is Speedup enhanced.
Speedup Enhanced is always greater than 1.
The overall speedup is given by:
S_latency = 1 / ((1 - p) + p / s)
where
S_latency is the theoretical speedup of the execution of the whole task;
s is the speedup of the part of the task that benefits from improved system resources;
p is the proportion of execution time that the part benefiting from improved resources
originally occupied.
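A small Python sketch of this formula, using the 10-seconds-out-of-40 example above:

def amdahl_speedup(p, s):
    # p: fraction of execution time that benefits; s: speedup of that fraction
    return 1 / ((1 - p) + p / s)

# 10 s of a 40 s program (p = 0.25) is made twice as fast (s = 2)
print(round(amdahl_speedup(0.25, 2), 3))   # 1.143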
Nano instruction addresses are generated by a nano program counter and nano
instructions are placed in a register nIR. The next address of nIR is directly obtained.
The next address is generated by either incrementing the nano program counter or
loading it from external source(branch field or address from micro instruction opcode)
Program – Add two 8-bit packed BCD numbers stored at memory offsets 500 and 501;
store the BCD sum at offset 600 and the carry at offset 601.
Algorithm – Load the first number into AL and the second into BL, add them, use DAA
to adjust the binary sum to valid BCD, store the result, then collect the carry with ADC
and store it.
Explanation –
1. MOV AL, [500]: load data from offset 500 to register AL
2. MOV BL, [501]: load data from offset 501 to register BL
3. ADD AL, BL: add the contents of registers AL and BL
4. DAA: decimal adjust AL
5. MOV [600], AL: store data from register AL to offset 600
6. MOV AL, 00: set value of register AL to 00
7. ADC AL, AL: add contents of register AL to AL with carry
8. MOV [601], AL: store data from register AL to offset 601
9. HLT: stop
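The combined effect of ADD and DAA on packed BCD bytes can be modelled with a small Python sketch (a simplified model of the DAA rule; the real 8086 also uses the AF and CF flags):

def bcd_add(a, b):
    # a and b are packed BCD bytes, e.g. 0x38 represents decimal 38
    result = a + b                                  # ADD AL, BL
    if (result & 0x0F) > 9 or ((a & 0x0F) + (b & 0x0F)) > 0x0F:
        result += 0x06                              # DAA: adjust low nibble
    if (result >> 4) > 9:
        result += 0x60                              # DAA: adjust high nibble
    return result & 0xFF, result >> 8               # (sum byte, carry)

print([hex(v) for v in bcd_add(0x38, 0x45)])        # ['0x83', '0x0'] : 38 + 45 = 83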