Computer Organization and Design Final After Corrections


COMPUTER ORGANIZATION AND DESIGN

SYLLABUS

UNIT-I
Digital Fundamentals: Digital Systems, Binary Numbers, Octal, Hexadecimal Conversions,
Signed Binary Numbers, Complements, Logic Gates, Boolean Algebra, K-maps, Standard
Forms, NAND-NOR Implementation
UNIT-II
Combinational and Sequential Circuits: Combinational Circuits, Adder, Subtractor, ALU
Design, Decoder, Encoder, Multiplexers, Sequential Circuits: Latches, Flip-flops, Registers,
Memories, Up-down Counters
UNIT-III
Processor Fundamentals: Von Neumann Architecture, Processor: Definition, Structure,
Category, Technology, ALU Concept, Stored Programs, Fetch Execute Cycle, Instruction
Formats, Clock Rate Instruction Rate, Pipeline, Current Processors, Multi Core Processors
UNIT-IV
Memory: Physical Memory, Addressing, Virtual Memory, Address Translation, Paging,
Cache, L1, L2, L3 Cache Memories, Cache Mapping, LRU Replacement
UNIT-V
I/O Data Transfer: Data Transfer, Serial and Parallel Data Transfer, Full Duplex–Half Duplex
Interaction, Bus Interface, Programmed I/O, Polling, Interrupt Driven I/O, Hardware
Interrupt Mechanism, Interrupt Vectors, Multi Level of Interrupts, DMA, Buffer Chaining,
Operation Chaining
ANNA UNIVERSITY


COMPUTER ORGANIZATION & DESIGN


SCHEME
                                                         Page No.
UNIT I - DIGITAL FUNDAMENTALS                                  5
    Chapter 1   Digital Systems
    Chapter 2   Logic Gates                                   23
    Chapter 3   Boolean Algebra                               32
UNIT II - COMBINATIONAL AND SEQUENTIAL CIRCUITS               50
    Chapter 4   Combinational Circuits
    Chapter 5   Sequential Circuits                           70
UNIT III - PROCESSOR FUNDAMENTALS                             98
    Chapter 6   Processor Basics
    Chapter 7   Multicore Processors                         115
UNIT IV - MEMORY                                             125
    Chapter 8   Memory
    Chapter 9   Cache Memory                                 134
UNIT V - I/O DATA TRANSFER                                   141
    Chapter 10  Data Transfer
    Chapter 11  Modes of Data Transfer                       150
Model Question Paper


UNIT I-DIGITAL FUNDAMENTALS


Chapter 1 - DIGITAL SYSTEMS

CONTENTS
Learning Objectives

Learning Outcomes

Overview
1.1 Concept of Digital System
1.1.1 Non-positional Number System
1.1.2 Positional Number System
1.2 Different Types of Number Systems
1.2.1 Decimal Number System
1.2.2 Binary Number System
1.2.3 Octal Number System
1.2.4 Hexadecimal Number System
1.3 Conversions
1.3.1 Decimal to Other Number Systems
1.3.2 Binary System to Other Number Systems
1.3.3 Octal to Other Number Systems
1.3.4 Hexadecimal to Other Number Systems
1.4 Complement
1.4.1. 1’s Complement of Binary Number
1.4.2. 2’s Complement of Binary Number
1.5 Signed Binary Numbers
1.5.1 1’s Complement of Signed Binary Number
1.5.2 2’s Complement of Signed Binary Number
Summary
Keywords
Self-Assessment Questions
Further Readings


LEARNING OBJECTIVES
After studying this chapter, you will be able to:
• Understand the concept of a digital system
• Analyse and identify different number systems
• Describe the concept of number system conversion
• Understand signed binary numbers and complements

LEARNING OUTCOMES
Upon completion of the chapter, students will be able to:
• Explain the basics of digital systems in computer organization
• Distinguish positional and non-positional number systems
• Describe the different conversion techniques used in number systems
• Explain the basics of signed binary numbers
• Compute 1's complements and 2's complements

OVERVIEW
We are all familiar with the impact of digital watches, present-day communication systems, calculators, and PCs on everyday life. Digital systems play such an indispensable part in daily life that we call the present technological era the Digital Age. Digital systems are used in medical treatment, communication, weather monitoring, space guidance, traffic control, the Internet, and many other scientific, commercial and industrial ventures.
In this chapter, you will study the concept of digital systems. You will learn about various number systems and their conversion procedures. Finally, you will study the concept of signed binary numbers and complements.
1.1 CONCEPT OF DIGITAL SYSTEM
The term digital system refers to elements such as hardware, software and networks and their
use. There may be many different components that make up one system; for example, a
computer has a central processing unit, a hard disk, keyboard, mouse, screen etc. A peripheral
device is a digital component that can be connected to a digital system, for example, a digital
camera or printer.
A digital system is the interconnection of digital hardware modules that accomplish a particular data-processing task. The development of digital systems came with the advent of integrated circuits (ICs), an offspring of semiconductor technology. The low-cost fabrication of ICs has made the subject of digital electronics easy to study. One small IC can perform the task of thousands of transistors, diodes and resistors. Numerous ICs are used to build digital circuits. This is an interesting and rapidly developing field which applies a few common principles to the working of computers, communication systems, digital appliances and so on.


The fundamental idea is to allow beginners to comprehend the operation of the digital system and many other systems based on the principles of digital techniques. Devices working under digital methods are considered digital systems, and the electronic networks used to make them functional are called digital circuits.
Electronic circuits use two sorts of signals: analogue signals (a continuous supply of voltages and currents) and digital signals (discrete voltages and currents).
Circuits (electronic networks) using analogue signals are known as linear or analogue circuits. Likewise, the electronic network of an electronic calculator or digital watch that uses digital signals is called a digital circuit.
An analogue device is one that has a signal which varies continuously in time with the input, whereas a digital device operates with a digital signal that varies discontinuously.
Figure 1.1 shows the behaviour of analogue and digital signals and the possibility of
conversion from analogue to digital.

Figure 1.1: (a). Block Diagram of Electronic Circuit


Figure 1.1: (b). Analogue Signal, Figure 1.1: (c). Digital Signal

Example 1.1: Figure 1.2 represents a symbol of some devices using analogue and digital
world. This example is set to explain how a real-life problem like movement of a pivot on a
water tank that indicates the level of water can be translated to analogue/digital form.

Figure 1.2: Digital System used to Interpret Float Level in Water Tank


Figure 1.3: Digital Signal Representation

Digital systems work under logic and hence they are called logic circuits, whose building blocks are known as gates. These circuits employ two different representations of the digital signal, known as positive logic and negative logic, as shown in Figure 1.3. The two discrete signal levels HIGH and LOW are generally represented by the binary digits 1 and 0 respectively; each such digit is referred to as a bit, and a binary number with 8 bits is known as a byte. Since a digital signal can have only one of the two possible levels, 1 and 0, the binary number system can be used for the analysis and design of digital systems. The corresponding algebra, introduced by George Boole in 1854, is known as Boolean algebra. These logic concepts have been adopted for the design of digital circuits.
The number system that we use in day-to-day life is called the Decimal Number System. In this system, one works with 10 different digits (0 through 9), and it is known as the base-ten system. But digital electronic devices use a 'strange' number system called binary. Digital computers and microprocessor-based systems use other such number systems called hexadecimal and octal. Anyone who works in electronics must know how to convert numbers from the everyday decimal system to binary, to hexadecimal, and to octal.
The hexadecimal number system uses 16 symbols: 0 through 9 and A, B, C, D, E, F, and is referred to as the base-sixteen system. The letter 'A' stands for decimal 10, 'B' for decimal 11, and so on. The octal number system uses 8 symbols: 0 through 7, and is referred to as the base-eight system.
Any decimal number of any magnitude can be expressed using positional weighting, in which the rightmost digit of the number, called the 'least significant digit', is multiplied by 10⁰, and the digit to the left of the least significant digit is multiplied by 10¹. Similarly, as we move towards the left of the number, the power increases in steps of 1.
The number system can be classified into two types:
• Positional number system
• Non-Positional number system
1.1.1 Non-positional Number System
In ancient times people used stones and pebbles to count things. This method of counting


uses an additive approach and is called the Non-positional Number System.


Due to the complexities of scientific calculation, this system proved inadequate; hence, the positional number system was developed.
A few examples of non-positional (non-weighted) codes are the Gray code, Roman numerals, the excess-3 code, etc.
1.1.2 Positional Number System
A positional number system contains only a few digits; for example, the decimal number system contains 10 digits (0, 1, 2, …, 8, 9) and the binary number system contains only two digits (0 and 1).
The value of a particular number in a number system is calculated on the following basis:
1. The digit itself.
2. The base of the number system (Like base of Decimal Number System is 10, and
that of Binary Number System is 2).
3. The “position” of digit in the number.
The most common number system that we make use of in our daily life is the decimal number system, whose base is 10 and which contains the digits (0, 1, 2, 3, …, 8, 9).
Another popular number system, which a computer uses, is the binary number system, whose base is 2 and which contains only two digits, 0 and 1.
Example 1.2:
1358 = 1 × 10³ + 3 × 10² + 5 × 10¹ + 8 × 10⁰
13.58 = 1 × 10¹ + 3 × 10⁰ + 5 × 10⁻¹ + 8 × 10⁻²
Few examples of positional number system are decimal number system, Binary number system,
octal number system, hexadecimal number system, BCD, etc.

1.2 DIFFERENT TYPES OF NUMBER SYSTEM


In this section, we will discuss the different number systems.
1.2.1 Decimal Number System
The decimal number system comprises digits from 0-9 that are 0,1,2,3,4,5,6,7,8,9. The base
or radix of the decimal system is 10 because the total number of digits available in the
decimal number system is 10. This number system is most easily understandable and is most
popular in day-to-day life.
Example 1.3: In the decimal number (236)₁₀ the weightage of digit 6 is 6 × 10⁰ = 6, the weightage of digit 3 is 3 × 10¹ = 30, and that of digit 2 is 2 × 10² = 200.
Summing all three values 6, 30, and 200 we get
236 = 2 × hundred + 3 × ten + 6 × unity


= 2 × 10² + 3 × 10¹ + 6 × 10⁰
= 2 × 100 + 3 × 10 + 6 × 1
= 200 + 30 + 6
= (236)₁₀

1.2.2 Binary Number System


The computer can understand only two conditions, "On" and "Off", i.e. 1 and 0. The binary
number system deals with the study of 0s and 1s.
A binary number system represents a number with base 2, using the digits 1 and 0. As it
uses only two digits, 0 and 1, and has a base of 2, it is called binary. All electronic devices use
a binary system in their electronic circuits. The input 0 indicates the "OFF" state and the
input 1 indicates the "ON" state. Because of this implementation the binary number system
is used in modern computer technology. Each digit is referred to as a bit. Examples of
binary numbers are 101₂, 10011₂, etc.
The binary number system is exactly like the decimal system except that the base is 2 instead of
10. Again, each position in a binary number represents a power of the base 2. In this system, the
rightmost position is the units (2⁰) position, the second position from the right is the 2's (2¹)
position, and proceeding in this way, we have the 4's (2²) and 8's (2³) positions, and so on.
Example 1.4: The decimal equivalent of the binary number 10110 (written as (10110)₂) is
1 × 2⁴ + 0 × 2³ + 1 × 2² + 1 × 2¹ + 0 × 2⁰
or 16 + 0 + 4 + 2 + 0, or 22
Thus, we write (10110)₂ = (22)₁₀
1.2.3 Octal Number System
A number system with a base or radix of 8 is known as the octal number system. In the octal
system the largest single digit is 7 (one less than the base). Again, each position in an octal
number represents a power of the base 8. The octal number system is also called the base-8
number system. The octal number system was commonly used in minicomputers. Examples of
octal numbers are 432₈, 61₈, etc.
Example 1.5: The decimal equivalent of the octal number 523 (written as (523)₈) is
5 × 8² + 2 × 8¹ + 3 × 8⁰
or 5 × 64 + 2 × 8 + 3 × 1
or 320 + 16 + 3
or 339
so we have (523)₈ = (339)₁₀


1.2.4 Hexadecimal Number System


Hexadecimal is a number system with a base of 16. In the hexadecimal system, the largest single
digit is F, or 15 (one less than the base). Again, each position in a hexadecimal number
represents a power of the base 16. The digits used in this number system are 0 to 9 and A, B,
C, D, E, F, where the decimal values of A to F are 10 to 15 respectively: A represents 10, B
represents 11, and so on. In binary representation, 4 bits (1 nibble) are used to represent
each digit of a hexadecimal number. It is more compact than any other number system.
Example 1.6: The decimal equivalent of the hexadecimal number 2CF, written as (2CF)₁₆, is
2 × 16² + C × 16¹ + F × 16⁰
or 2 × 256 + 12 × 16 + 15 × 1
or 512 + 192 + 15
or 719
Thus (2CF)₁₆ = (719)₁₀

1.3 CONVERSIONS
Representations of a number in different radices are said to be equivalent if they have the same
decimal value. For example, (0011)₈ and (1001)₂ are equivalent: both have decimal
value 9. The conversion of a number in base r to decimal is done by expanding the number in a
power series and adding all the terms as shown previously. We now present a general procedure
for the reverse operation of converting a decimal number to a number in base r. If the number
includes a decimal point, it is necessary to separate the number into an integer part and a fraction
part, since each part must be converted differently. The conversion of a decimal integer to a
number in base r is done by dividing the number and all successive quotients by r and
accumulating the remainders. This procedure is best illustrated by example. Now, we will
discuss the set of rules to convert them. These are discussed below:
1.3.1 Decimal to Other Number Systems
You can convert the decimal number systems to other number systems by using methods
given below:
1.3.1.1 Decimal to Binary Number Systems
A decimal number has base 10 and a binary number has base 2. In decimal to binary conversion,
the base of the number also changes, i.e. from base 10 to base 2. Every decimal number has
an equivalent binary number. These binary numbers are widely used in computer applications,
where they are used for programming or coding purposes. This is because computers understand the
language of binary digits, 0 and 1.
Hence, once we give the input to the computer system in the decimal form, it converts them into


binary digits, performs the required operations and provides the output in decimal form
again. Now, let us learn how a decimal number can be represented in binary form.
Conversion steps:
1. Divide the number by 2.
2. Get the integer quotient for the next iteration.
3. Get the remainder for the binary digit.
4. Repeat the steps until the quotient is equal to 0.
Example 1.7: Convert 13₁₀ to binary
13 ÷ 2 = 6 remainder 1
6 ÷ 2 = 3 remainder 0
3 ÷ 2 = 1 remainder 1
1 ÷ 2 = 0 remainder 1
Reading the remainders from bottom to top: 13₁₀ = 1101₂
Example 1.8: Convert 174₁₀ to binary
174 ÷ 2 = 87 r 0, 87 ÷ 2 = 43 r 1, 43 ÷ 2 = 21 r 1, 21 ÷ 2 = 10 r 1,
10 ÷ 2 = 5 r 0, 5 ÷ 2 = 2 r 1, 2 ÷ 2 = 1 r 0, 1 ÷ 2 = 0 r 1
Reading the remainders from bottom to top: 174₁₀ = 10101110₂
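The repeated-division steps above can be sketched in Python; the function name `dec_to_binary` is illustrative, not from the text:

```python
def dec_to_binary(n):
    """Convert a non-negative decimal integer to a binary string
    by repeated division by 2, collecting the remainders."""
    if n == 0:
        return "0"
    digits = []
    while n > 0:
        digits.append(str(n % 2))  # step 3: the remainder is the next binary digit
        n //= 2                    # step 2: the integer quotient feeds the next iteration
    return "".join(reversed(digits))  # step: remainders are read in reverse order

print(dec_to_binary(13))   # 1101
print(dec_to_binary(174))  # 10101110
```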
1.3.1.2 Decimal to Octal Number Systems
To convert decimal to octal, we first need to know both number systems. A number with
base 8 is an octal number and a number with base 10 is a decimal number. Here we will
convert a decimal number to its equivalent octal number.


Conversion steps:
1. Divide the number by 8.
2. Get the integer quotient for the next iteration.
3. Get the remainder for the octal digit.
4. Repeat the steps until the quotient is equal to 0.
Example 1.9: Convert 7562₁₀ to octal

Division by 8   Quotient (integer)   Remainder (decimal)   Remainder (octal)   Digit #
7562/8          945                  2                     2                   0
945/8           118                  1                     1                   1
118/8           14                   6                     6                   2
14/8            1                    6                     6                   3
1/8             0                    1                     1                   4

So 7562₁₀ = 16612₈
Example 1.10: Convert 35631₁₀ to octal

Division by 8   Quotient (integer)   Remainder (decimal)   Remainder (octal)   Digit #
35631/8         4453                 7                     7                   0
4453/8          556                  5                     5                   1
556/8           69                   4                     4                   2
69/8            8                    5                     5                   3
8/8             1                    0                     0                   4
1/8             0                    1                     1                   5

So 35631₁₀ = 105457₈


1.3.1.3 Decimal to Hexadecimal Number Systems


To convert decimal to hex, we first check whether the given number is less than 16. If the
given decimal number is less than 16, its hexadecimal representation is a single digit (0 to 9,
or A to F for the values 10 to 15).
Conversion steps:
1. Divide the number by 16.
2. Get the integer quotient for the next iteration.
3. Get the remainder for the hex digit.
4. Repeat the steps until the quotient is equal to 0.
Example 1.11: Convert 7562₁₀ to hex

Division by 16   Quotient (integer)   Remainder (decimal)   Remainder (hex)   Digit #
7562/16          472                  10                    A                 0
472/16           29                   8                     8                 1
29/16            1                    13                    D                 2
1/16             0                    1                     1                 3

So 7562₁₀ = 1D8A₁₆
Example 1.12: Convert 35631₁₀ to hex

Division by 16   Quotient (integer)   Remainder (decimal)   Remainder (hex)   Digit #
35631/16         2226                 15                    F                 0
2226/16          139                  2                     2                 1
139/16           8                    11                    B                 2
8/16             0                    8                     8                 3

So 35631₁₀ = 8B2F₁₆
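The same repeated-division procedure works for any base; a minimal Python sketch (the helper name `dec_to_base` is illustrative) reproduces the octal and hexadecimal tables above:

```python
def dec_to_base(n, base):
    """Convert a non-negative decimal integer to base 2..16 by repeated
    division, mapping remainders 10..15 to the letters A..F."""
    symbols = "0123456789ABCDEF"
    if n == 0:
        return "0"
    digits = []
    while n > 0:
        digits.append(symbols[n % base])  # the remainder becomes the next digit
        n //= base
    return "".join(reversed(digits))      # remainders are read in reverse order

print(dec_to_base(7562, 8))    # 16612
print(dec_to_base(35631, 8))   # 105457
print(dec_to_base(7562, 16))   # 1D8A
print(dec_to_base(35631, 16))  # 8B2F
```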


1.3.2 Binary to Other Number Systems


You can convert the binary number systems to other number systems by using methods
given below:
1.3.2.1 Binary to Decimal Number Systems
For a binary number with n digits: dₙ₋₁ … d₃ d₂ d₁ d₀
The decimal number is equal to the sum of the binary digits (dᵢ) times their powers of 2 (2ⁱ):
decimal = d₀ × 2⁰ + d₁ × 2¹ + d₂ × 2² + …
Example 1.13: Find the decimal value of 111001₂

binary number:  1    1    1    0    0    1
power of 2:     2⁵   2⁴   2³   2²   2¹   2⁰

111001₂ = 1·2⁵ + 1·2⁴ + 1·2³ + 0·2² + 0·2¹ + 1·2⁰ = 32 + 16 + 8 + 0 + 0 + 1 = 57₁₀
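The power-series expansion can be checked with a short Python sketch (the function name is illustrative):

```python
def binary_to_dec(bits):
    """Sum each bit times its power of 2, counting positions from the right."""
    return sum(int(b) * 2 ** i for i, b in enumerate(reversed(bits)))

print(binary_to_dec("111001"))  # 57
print(binary_to_dec("10110"))   # 22
```

Python's built-in `int(bits, 2)` performs the same conversion.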


1.3.2.2 Binary to Octal Number Systems
Conversion steps:
To convert binary numbers to octal numbers, follow the steps below:
• Take the given binary number
• Multiply each digit by 2ⁿ⁻¹, where n is the position of the digit counted from the right
• The result is the equivalent decimal number for the given binary number
• Divide the decimal number by 8
• Note the remainder
• Continue the above two steps with the quotient until the quotient is zero
• Write the remainders in reverse order
• The result is the required octal number for the given binary number
Example 1.14: Convert 1010101₂ to octal
Given binary number is 1010101₂
First, we convert the given binary number to decimal:
1010101₂ = (1 × 2⁶) + (0 × 2⁵) + (1 × 2⁴) + (0 × 2³) + (1 × 2²) + (0 × 2¹) + (1 × 2⁰)
= 64 + 0 + 16 + 0 + 4 + 0 + 1
= 85 (decimal form)
Now we convert this decimal number to octal:
85 ÷ 8 = 10 remainder 5
10 ÷ 8 = 1 remainder 2
1 ÷ 8 = 0 remainder 1


Therefore, the equivalent octal number is 125₈.

Example 1.15: Convert 01101₂ to octal

Given binary number is 01101₂
First we convert the given binary number to decimal:
01101₂ = (0 × 2⁴) + (1 × 2³) + (1 × 2²) + (0 × 2¹) + (1 × 2⁰)
= 0 + 8 + 4 + 0 + 1
= 13 (decimal form)
Now we convert this decimal number to octal:
13 ÷ 8 = 1 remainder 5
1 ÷ 8 = 0 remainder 1
Therefore, the equivalent octal number is 15₈.
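The two-step route used in Examples 1.14 and 1.15 (binary to decimal, then decimal to octal) can be sketched as follows; the function name is illustrative:

```python
def binary_to_octal(bits):
    """Binary -> decimal (positional sum), then decimal -> octal
    (repeated division by 8, remainders read in reverse)."""
    value = int(bits, 2)  # positional sum of powers of 2
    digits = []
    while value > 0:
        digits.append(str(value % 8))
        value //= 8
    return "".join(reversed(digits)) or "0"

print(binary_to_octal("1010101"))  # 125
print(binary_to_octal("01101"))    # 15
```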


1.3.2.3 Binary to Hexadecimal Number Systems
Converting from binary to hex is easy, since hexadecimal numbers are simplified versions of
binary strings. You just need to remember that each hex digit represents four binary digits;
it follows that four binary digits are equal to one hex digit. The method is easier than it
sounds, but it is always useful to use a binary-to-hex conversion chart to save time.
Example 1.16: Convert binary 1101100₂ to hex
Convert every 4 binary bits (starting from bit 0) to a hex digit:
1101100₂ = 110 1100 = 6 C = 6C₁₆
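The nibble-grouping method can be sketched in Python (names are illustrative):

```python
def binary_to_hex(bits):
    """Left-pad to a multiple of 4 bits, then map each 4-bit group
    (nibble) to one hexadecimal digit."""
    bits = bits.zfill((len(bits) + 3) // 4 * 4)  # pad with leading zeros
    return "".join(format(int(bits[i:i + 4], 2), "X")
                   for i in range(0, len(bits), 4))

print(binary_to_hex("1101100"))  # 6C
```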
1.3.3 Octal to Other Number Systems
You can convert the octal number systems to other number systems by using methods
given below:
1.3.3.1 Octal to Decimal
Conversion Steps:
• Count the number of digits present in the given number. Let the number of digits be ‘n’.


• Now multiply each digit of the number by 8ⁿ⁻¹ when the digit is in the nth position from
the right end of the number. If the number has a fractional part, multiply each digit in the
fractional part by 8⁻ᵐ when the digit is in the mth position after the octal point.
• Add all the terms after multiplication.
• The obtained value is the equivalent decimal number.
Example 1.17: Convert 41₈ to a decimal number
41₈ = (4 × 8¹) + (1 × 8⁰)
= 4 × 8 + 1 × 1
= 32 + 1
= (33)₁₀
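The weighted-sum steps above can be sketched as (function name illustrative):

```python
def octal_to_dec(octal_str):
    """Multiply each octal digit by its power of 8 and add the terms."""
    return sum(int(d) * 8 ** i for i, d in enumerate(reversed(octal_str)))

print(octal_to_dec("41"))   # 33
print(octal_to_dec("523"))  # 339
```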
1.3.3.2 Octal to Binary
In order to convert an octal number to binary, we have to follow a few steps. Octal numbers have
base 8 and binary numbers have base 2. Replace each octal digit by its equivalent 3-bit binary
number.

Example 1.18: Convert 12₈ into a binary number

Using the 3-bit binary equivalent of each octal digit, we can write:
12₈ = (001 010)₂
Since zeros to the left of the most significant 1 have no significance,
12₈ = (1010)₂
1.3.3.3 Octal to Hexadecimal
Hexadecimal numbers consist of digits and letters and are represented with base 16. The
numbers from 0 to 9 are represented in the usual form, but 10 to 15 are denoted as A, B, C,
D, E, F. Conversion of an octal number to hexadecimal requires two steps.


• First, convert the octal number to a binary number.
• Then, convert the binary number to a hexadecimal number.
Example 1.19: Convert (55)₈ to hexadecimal
First, convert (55)₈ to a binary number: (101 101)₂
Then regroup the bits into 4-bit groups for hexadecimal conversion:
(0010 | 1101)₂ = 2 D
Therefore, we can write (55)₈ = (2D)₁₆
1.3.4 Hexadecimal to Other Number Systems
You can convert the hexadecimal number systems to other number systems by using
methods given below:
1.3.4.1 Hexadecimal to Decimal
A regular decimal number is the sum of its digits multiplied by powers of 10.
137 in base 10 is equal to each digit multiplied by its corresponding power of 10:
137₁₀ = 1×10² + 3×10¹ + 7×10⁰ = 100 + 30 + 7
Hex numbers are read the same way, but each digit counts a power of 16 instead of a power of 10.
For a hex number with n digits:
dₙ₋₁ … d₃ d₂ d₁ d₀
Multiply each digit of the hex number by its corresponding power of 16 and sum:
decimal = dₙ₋₁×16ⁿ⁻¹ + … + d₃×16³ + d₂×16² + d₁×16¹ + d₀×16⁰
Example 1.20: 3B in base 16 is equal to each digit multiplied by its corresponding power of 16:
3B₁₆ = 3×16¹ + 11×16⁰ = 48 + 11 = 59₁₀
Example 1.21: E7A9 in base 16 is equal to each digit multiplied by its corresponding power of 16:
E7A9₁₆ = 14×16³ + 7×16² + 10×16¹ + 9×16⁰ = 57344 + 1792 + 160 + 9 = 59305₁₀
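The same weighted sum can be sketched in Python (function name illustrative); `int(d, 16)` maps the letters A to F to 10 to 15:

```python
def hex_to_dec(hex_str):
    """Multiply each hex digit (A..F = 10..15) by its power of 16 and sum."""
    return sum(int(d, 16) * 16 ** i for i, d in enumerate(reversed(hex_str)))

print(hex_to_dec("3B"))    # 59
print(hex_to_dec("E7A9"))  # 59305
```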
1.3.4.2 Hexadecimal to Binary
Conversion Steps:
To convert a hexadecimal number into its equivalent binary number, separate each digit and
write its equivalent 4-bit binary number.
Example 1.22: Convert A2B₁₆ to an equivalent binary number.
Convert each digit of the given hexadecimal number to its binary equivalent:
A₁₆ | 2₁₆ | B₁₆ = 1010₂ | 0010₂ | 1011₂


Hence, A2B₁₆ = 101000101011₂

Example 1.23: Convert E₁₆ to an equivalent binary number.
Write the 4-bit binary equivalent of the given hexadecimal digit:
E₁₆ = 1110₂
Hence, E₁₆ = 1110₂
1.3.4.3 Hexadecimal to Octal
As we know, each hexadecimal digit corresponds to a group of binary digits; therefore, we can
regroup those binary digits so that we can relate them to octal numbers. Let us check the method
with steps and an example.
For each given hexadecimal number digit, write the equivalent binary number. If any of the
binary equivalents are less than 4 digits, add 0’s to the left side.
Combine and make the groups of binary digits from right to left, each containing 3 digits. Add
0’s to the left if there are less than 3 digits in the last group. Find the octal equivalent of each
binary group.
Example 1.24: Convert 1BC₁₆ into an octal number.
1 → 0001, B → 1011, C → 1100
Now group the bits from right to left, each group having 3 digits:
000 110 111 100
000 → 0, 110 → 6, 111 → 7, 100 → 4
Hence, 1BC₁₆ = 0674₈ = 674₈ (the leading zero is dropped)
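The hex-to-binary-to-octal regrouping can be sketched as (function name illustrative):

```python
def hex_to_octal(hex_str):
    """Hex -> 4-bit binary groups -> regroup into 3-bit groups -> octal digits."""
    bits = "".join(format(int(d, 16), "04b") for d in hex_str)  # 4 bits per hex digit
    bits = "0" * ((-len(bits)) % 3) + bits                      # left-pad to a multiple of 3
    octal = "".join(str(int(bits[i:i + 3], 2)) for i in range(0, len(bits), 3))
    return octal.lstrip("0") or "0"                             # drop leading zeros

print(hex_to_octal("1BC"))  # 674
print(hex_to_octal("2D"))   # 55, reversing Example 1.19
```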
1.4 COMPLEMENT
Complements are used in digital computers to simplify the subtraction operation and for logical
manipulation. Simplifying operations leads to simpler, less expensive circuits to implement the
operations. There are two types of complements for each base‐r system: the radix complement
and the diminished radix complement. The first is referred to as the r’s complement and the
second as the (r - 1)’s complement. When the value of the base r is substituted in the name, the
two types are referred to as the 2’s complement and 1’s complement for binary numbers and the
10’s complement and 9’s complement for decimal numbers.
Generally, there are two types of complement of a binary number: the 1's complement and the 2's
complement. To get the 1's complement of a binary number, simply invert the given number. For
example, the 1's complement of the binary number 110010 is 001101. To get the 2's complement
of a binary number, take the 1's complement of the given number and add 1 at the least
significant bit (LSB). For example, the 2's complement of the binary number 10010 is
(01101) + 1 = 01110.


1.4.1 1’s Complement of a Binary Number


There is a simple algorithm to convert a binary number into its 1's complement: simply invert
every bit of the given number. You can implement this as a logic circuit using one NOT gate per
bit of the binary input. The logic circuit of a 4-bit 1's complementer is given below.

Figure 1.4: 4-bit 1’s Complement

Example 1.25: Find 1’s complement of binary number 10101110.


Simply invert each bit of given binary number, so 1’s complement of given number will be
01010001.
Example 1.26: Find 1’s complement of binary number 10001001.
Simply invert each bit of given binary number, so 1’s complement of given number will be
01110110.
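The bit-inversion in Examples 1.25 and 1.26 can be sketched as (function name illustrative):

```python
def ones_complement(bits):
    """Invert every bit, like the row of NOT gates in Figure 1.4."""
    return "".join("1" if b == "0" else "0" for b in bits)

print(ones_complement("10101110"))  # 01010001
print(ones_complement("10001001"))  # 01110110
```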
Uses of 1’s Complement Binary Numbers
There are various uses of 1’s complement of Binary numbers, mainly in signed Binary number
representation and various arithmetic operations for Binary numbers, e.g., additions,
subtractions, etc.
1.4.2 2’s Complement of a Binary Number
It is an extension of the 1's complement, because it can be obtained by operating on a binary
number's 1's complement. The 2's complement denotes any negative binary number. The concept
comes from subtracting the number from 2ᴺ, where N is the number of bits of that number.
Therefore, if we want to denote a negative binary number, we first consider its positive
magnitude and then subtract that value from 2ᴺ. The result is the 2's-complement
representation of that negative number.
After reading the example given below, you will be able to understand the method of finding 2’s
complement of a number.

Example 1.27: To find out the 2’s complement of a binary number, first you have to find out the
1’s complement of that number then add 1. Now read the example given below to understand the
method in a better way.


Imagine we have to represent (−5)₁₀ as a binary number.

Step 1: We take the binary equivalent of (+5)₁₀, which is (0101)₂.
Step 2: We find the 1's complement of 0101, which is 1010.
Step 3: Finally, we add 1 to the result: 1010 + 1 = 1011.
Thus 1011 is the negative binary representation of (−5) in 2's-complement form. Note that if
we read the same bit pattern as an unsigned number, we get 1011₂ = 11₁₀.
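The three steps above can be sketched in Python (function name illustrative):

```python
def twos_complement(bits):
    """1's complement plus one, kept to the original bit width."""
    inverted = "".join("1" if b == "0" else "0" for b in bits)          # step 2
    width = len(bits)
    return format((int(inverted, 2) + 1) % (1 << width), f"0{width}b")  # step 3

print(twos_complement("0101"))   # 1011, the 4-bit 2's-complement form of -5
print(twos_complement("10010"))  # 01110, matching the earlier example
```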
1.5 SIGNED BINARY NUMBERS
The binary numbers which can be identified by their MSB (Most Significant Bit), whether
they are positive or negative are called “Signed binary numbers”.
Example 1.28:1001— >+9(positive) 1001— >-1(negative)
This is the simplest way of representing the both positive and neagtive numbers in binary
system.
An 8 bit number system can be used to produce 256 combinations (from 0 to 255). The
first 128 combinations (0 to 127) represent positive numbers and next 128 combinations (128
to 255) represent negative numbers.
In mathematics, positive numbers (including zero) are represented as unsigned numbers. That is
we do not put the +ve sign in front of them to show that they are positive numbers.
However, when dealing with negative numbers we do use a -ve sign in front of the number to
show that the number is negative in value and different from a positive unsigned value, and the
same is true with signed binary numbers.
However, in digital circuits there is no provision made to put a plus or even a minus sign to a
number, since digital systems operate with binary numbers that are represented in terms of “0’s”
and “1’s”. When used together in microelectronics, these “1’s” and “0’s”, called a bit (being a
contraction of Binary digit), fall into several range sizes of numbers which are referred to by
common names, such as a byte or a word.
We have also seen previously that an 8-bit binary number (a byte) can have a value ranging from
0 (00000000₂) to 255 (11111111₂), that is, 2⁸ = 256 different combinations of bits forming a
single 8-bit byte. So, for example, an unsigned binary number such as
01001101₂ = 64 + 8 + 4 + 1 = 77₁₀ in decimal. But digital systems and computers must also be
able to use and manipulate negative numbers as well as positive numbers.
Mathematical numbers are generally made up of a sign and a value (magnitude), in which the
sign indicates whether the number is positive (+) or negative (−), with the value indicating the
size of the number, for example 23, +156 or -274. Presenting numbers in this fashion is called
"sign-magnitude" representation, since the leftmost digit can be used to indicate the sign and the
remaining digits the magnitude or value of the number.


Sign-magnitude notation is the simplest and one of the most common methods of representing
positive and negative numbers either side of zero, (0). Thus negative numbers are obtained
simply by changing the sign of the corresponding positive number as each positive or unsigned
number will have a signed opposite, for example, +2 and -2, +10 and -10, etc.
But how do we represent signed binary numbers if all we have is a bunch of ones and zeros?
We know that binary digits, or bits, have only two values, either a "1" or a "0", and
conveniently for us, a sign also has only two values, "+" or "-".
Then we can use a single bit to identify the sign of a signed binary number as being positive or
negative in value. So to represent a positive binary number (+n) and a negative binary
number (-n), we can write them with the addition of a sign bit.
For signed binary numbers the most significant bit (MSB) is used as the sign bit. If the sign bit is
"0", the number is positive in value. If the sign bit is "1", then the number is negative
in value. The remaining bits in the number are used to represent the magnitude of the binary
number in the usual unsigned binary format.
Then we can see that the Sign-and-Magnitude (SM) notation stores positive and negative values
by dividing the “n” total bits into two parts: 1 bit for the sign and n–1 bits for the value which is
a pure binary number. For example, the decimal number 53 can be expressed as an 8-bit signed
binary number as follows.
Positive Signed Binary Numbers

Figure 1.5: Positive Signed Binary Numbers

Negative Signed Binary Numbers

Figure 1.6: Negative Signed Binary Numbers

The disadvantage here is that whereas before we had a full-range n-bit unsigned binary number,
we now have an (n-1)-bit signed binary number, giving a reduced range of digits from:
-(2ⁿ⁻¹ - 1) to +(2ⁿ⁻¹ - 1)
So for example: if we have 4 bits to represent a signed binary number, (1-bit for the Sign bit and


3 bits for the magnitude), then the actual range of numbers we can represent in sign-magnitude
notation would be:
-(2⁴⁻¹ - 1) to +(2⁴⁻¹ - 1)
-(2³ - 1) to +(2³ - 1)
-7 to +7
Whereas before, the range of an unsigned 4-bit binary number would have been from 0 to 15,
or 0 to F in hexadecimal, we now have a reduced range of -7 to +7. Thus an unsigned binary
number does not have a single sign-bit, and therefore can have a larger binary range as the most
significant bit (MSB) is just an extra bit or digit rather than a used sign bit.
Another disadvantage of the sign-magnitude form is that we can have a positive result for
zero, +0 or 0000₂, and a negative result for zero, -0 or 1000₂. Both are valid, but which one is
correct?
Signed Binary Numbers Example No. 1
Convert the following decimal values into signed binary numbers using the sign-magnitude
format:

-15₁₀ as a 6-bit number ⇒ 101111₂
+23₁₀ as a 6-bit number ⇒ 010111₂
-56₁₀ as an 8-bit number ⇒ 10111000₂
+85₁₀ as an 8-bit number ⇒ 01010101₂
-127₁₀ as an 8-bit number ⇒ 11111111₂

The leftmost bit of each binary number above is the sign bit.


Note that for a 4-bit, 6-bit, 8-bit, 16-bit or 32-bit signed binary number all the bits MUST have a
value, therefore “0’s” are used to fill the spaces between the leftmost sign bit and the first or
highest value “1”.
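As a sketch, the sign-magnitude encoding described above can be expressed in Python (the function name and the range check are illustrative, not from the text):

```python
def to_sign_magnitude(value, bits):
    """Encode a decimal integer as a bits-wide sign-magnitude string.

    The MSB is the sign bit (0 = positive, 1 = negative); the remaining
    bits - 1 positions hold the magnitude, padded with leading zeros.
    """
    magnitude = abs(value)
    if magnitude > 2 ** (bits - 1) - 1:
        raise ValueError(f"{value} does not fit in {bits}-bit sign-magnitude")
    sign = "1" if value < 0 else "0"
    return sign + format(magnitude, f"0{bits - 1}b")

print(to_sign_magnitude(-15, 6))   # 101111
print(to_sign_magnitude(+23, 6))   # 010111
print(to_sign_magnitude(-127, 8))  # 11111111
```

Running it on the values from Example No. 1 above reproduces the same bit patterns.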
The sign-magnitude representation of a binary number is a simple method to use and understand
for representing signed binary numbers, as we use much the same system with ordinary decimal
(base-10) numbers in mathematics: a “1” is placed in front of the number if it is negative and a
“0” if it is positive.
However, using this sign-magnitude method can result in the possibility of two different bit
patterns having the same binary value.
For example, +0 and -0 would be 0000 and 1000 respectively as a signed 4-bit binary number.


So we can see that using this method there can be two representations for zero, a positive zero
( 0000₂ ) and also a negative zero ( 1000₂ ), which can cause big complications for computers and
digital systems.
1.5.1 1’s Complement in Signed Binary Number Representation
1’s complement binary numbers are very useful in signed number representation. Positive
numbers are simply represented as plain binary numbers; there is nothing extra to do for a
positive binary number. A negative binary number, however, is represented in 1’s complement
form: first write the number with a positive sign, then take the 1’s complement of that number.
Example 1.29: Let’s assume a 5-bit register. The representations of +5 and -5 will be as follows:

+5 is represented as in the sign-magnitude method. -5 is represented using the
following steps:
(i) +5 = 0 0101
(ii) Take the 1’s complement of 0 0101, which is 1 1010. The MSB is 1, which indicates that the
number is negative.
The MSB is always 1 in the case of negative numbers.
Range of Numbers: For a k-bit register, the largest positive number that can be stored is
+(2^(k-1) - 1) and the lowest negative number that can be stored is -(2^(k-1) - 1).
Note that the drawback of this system is that 0 has two different representations: -0 (e.g., 1 1111
in a five-bit register) and +0 (e.g., 0 0000 in a five-bit register).

Figure 1.8: Range of Numbers
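The 1’s complement encoding and its k-bit range can be sketched in Python (a hypothetical helper working on bit-pattern strings):

```python
def to_ones_complement(value, bits):
    """Encode a decimal integer in a bits-wide 1's complement pattern.

    Positive numbers are plain binary; a negative number is the bitwise
    complement of the corresponding positive pattern.
    """
    limit = 2 ** (bits - 1) - 1          # largest magnitude that fits
    if abs(value) > limit:
        raise ValueError(f"{value} is outside the range -{limit}..+{limit}")
    if value >= 0:
        return format(value, f"0{bits}b")
    positive = format(-value, f"0{bits}b")
    # flip every bit of the positive pattern
    return "".join("1" if b == "0" else "0" for b in positive)

print(to_ones_complement(+5, 5))   # 00101
print(to_ones_complement(-5, 5))   # 11010
```

Note that the two zero patterns (0 0000 and 1 1111) both decode to zero, which is exactly the double-zero drawback mentioned above.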


Let’s now look at the arithmetic operations of subtraction and addition on 1’s complement binary numbers.
Subtractions by 1’s Complement
The algorithm to subtract two binary numbers using 1’s complement is as follows:
• Take the 1’s complement of the subtrahend.
• Add it to the minuend.


• If the result of the above addition has a carry bit of 1, then add it to the least significant bit
(LSB) of the result.
• If there is no carry bit, then take the 1’s complement of the result; the answer is negative.
Note that the subtrahend is the number to be subtracted from the other number, i.e., the minuend.
Example (Case-1: When the Carry bit is 1): Evaluate 10101 - 00101.
Following the algorithm, take the 1’s complement of the subtrahend 00101, which is 11010,
then add the two: 10101 + 11010 = 1 01111. Since there is a carry bit 1, add it to the LSB of the
five-bit result: 01111 + 1 = 10000, which is the answer.
Example (Case-2: When there is no Carry bit): Subtract 11110 from 1110.
Following the algorithm, take the 1’s complement of the subtrahend 11110, which is 00001.
Then add the two: 01110 + 00001 = 01111. Since there is no carry bit, take the 1’s
complement of this result, which is 10000; the answer is negative, i.e., -10000.
Similarly, you can subtract two mixed (fractional) binary numbers. Note that you always add
the carry bit to the least significant bit (LSB) of the result whenever you get a carry bit of 1.
The LSB of a fractional binary number is its last (rightmost) bit.
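The four bullet steps above, including the end-around carry, can be sketched in Python (the function name is illustrative; inputs are unsigned bit-pattern strings):

```python
def subtract_ones_complement(minuend, subtrahend, bits):
    """Subtract two unsigned binary strings using the 1's complement
    algorithm: complement the subtrahend, add, then either add the
    end-around carry to the LSB or complement and mark the result negative.
    """
    comp = "".join("1" if b == "0" else "0" for b in subtrahend.zfill(bits))
    total = int(minuend, 2) + int(comp, 2)
    if total >= 2 ** bits:                      # carry out of the word
        total = (total - 2 ** bits) + 1         # end-around carry to the LSB
        return format(total, f"0{bits}b")
    # no carry: complement the sum and report a negative result
    pattern = format(total, f"0{bits}b")
    return "-" + "".join("1" if b == "0" else "0" for b in pattern)

print(subtract_ones_complement("10101", "00101", 5))  # 10000
print(subtract_ones_complement("01110", "11110", 5))  # -10000
```

The two printed lines reproduce the Case-1 and Case-2 worked examples above.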
Additions by 1’s Complement
There are different scenarios for the addition of two binary numbers using 1’s complement.
These are explained below.
Case-1: Addition of positive and negative number when positive number has greater
magnitude:
When the positive number has the greater magnitude, simply take the 1’s complement of the
negative number, add, and then add the end-around carry of the sum to the least significant bit (LSB).
Example 1.30: Add 1110 and -1101.
Take the 1’s complement of 1101, which is 0010, and add it to the given number:
1110 + 0010 = 1 0000. Then add the end-around carry bit to the LSB: 0000 + 1 = 0001, which is the answer.
Note that if the register is wider, sign-extend the MSB to preserve the sign of the inputs
and the output.
Case-2: Addition of positive and negative number when negative number has greater
magnitude:
When the negative number has the greater magnitude, take the 1’s complement of the negative
number and add it to the given positive number. Since there will be no end-around carry, take the
1’s complement of the result; the answer is negative.
Example 1.31: Add 1010 and -1100 in five-bit registers.
With five-bit registers, the numbers become 01010 and -01100.
Now take the 1’s complement of 01100, which is 10011, and add: 01010 + 10011 = 11101. Then
take the 1’s complement of this result, which is 00010; the answer is negative, i.e., -00010.


Case-3: Addition of two negative numbers:
You need to take the 1’s complement of both numbers, then add these complements. Since there
will always be an end-around carry bit, add it to the LSB of the result. Then take the 1’s
complement of that result; the final number is negative.
Alternatively, you can add the two magnitudes directly and simply make the result negative.
Example 1.32: Add -1010 and -0101 in five-bit registers.
The five-bit numbers are -01010 and -00101. Add the complements of these numbers:
10101 + 11010 = 1 01111. Since there is a carry bit 1, add it to the LSB of the result:
01111 + 1 = 10000. Now take the 1’s complement of this result, which is 01111; the answer
is negative, i.e., -01111.
1.5.2 2’s Complement in Signed Binary Number Representation
2’s complement is another method, like the sign-magnitude and 1’s complement forms, that can
be used to represent negative binary numbers in a signed binary number system. In two’s
complement, positive numbers are exactly the same as unsigned binary numbers. A negative
number, however, is represented by the binary pattern which, when added to its corresponding
positive equivalent, results in zero.
In two’s complement form, a negative number is the 2’s complement of its positive counterpart,
with the subtraction of two numbers being A - B = A + ( 2’s complement of B ), using much the
same process as before since, basically, two’s complement is one’s complement + 1.
The main advantage of two’s complement over the previous one’s complement is that there is no
double-zero problem plus it is a lot easier to generate the two’s complement of a signed binary
number. Therefore, arithmetic operations are relatively easier to perform when the numbers are
represented in the two’s complement format.
Example 1.33:
Let’s look at the subtraction of the two 8-bit numbers 115 and 27 from above using two’s
complement. We remember from above that the binary equivalents are:
115₁₀ in binary is: 01110011₂
27₁₀ in binary is: 00011011₂
Since our numbers are 8 bits long, there are 2^8 values available, which in binary
equals 100000000₂ or 256₁₀. Then the two’s complement of 27₁₀ will be:
2^8 - 00011011 = 100000000 - 00011011 = 11100101₂
The complementation of the second number means that the subtraction becomes a much
easier addition of the two numbers, so the sum is 115 + ( 2’s complement of 27 ), which
is:
01110011 + 11100101 = 101011000₂
As before, the ninth overflow bit is disregarded, as we are only interested in the first 8 bits, so
the result is 01011000₂, or (64 + 16 + 8) = 88₁₀ in decimal, the same as before.
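The same 8-bit worked example can be checked with a short Python sketch, where modular arithmetic stands in for discarding the ninth overflow bit (the function name is illustrative):

```python
def subtract_twos_complement(a, b, bits=8):
    """Compute a - b by adding a to the 2's complement of b and
    discarding any carry beyond the word size."""
    twos_b = (2 ** bits - b) % (2 ** bits)   # 2's complement of b
    return (a + twos_b) % (2 ** bits)        # drop the overflow bit

print(format(2 ** 8 - 27, "08b"))         # 11100101  (2's complement of 27)
print(subtract_twos_complement(115, 27))  # 88
```

The absence of a double zero follows from the same arithmetic: the 2’s complement of 0 is (256 - 0) mod 256 = 0, so there is only one zero pattern.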

Learning Activity

Find out the 2’s complement of the 11100101

The Base Ten Game


The game involves students using a place value board and concrete materials to develop an
understanding of the structure of the number system and to learn to operate on numbers using
this structure. To play the most basic version of the game, the student rolls two dice, adds the
numbers shown, and collects that quantity of pop-sticks to add to their game-board, which is
ruled up into place value columns (see Figure 1). The only rule of the game is that there can
be no more than nine items in any one column.
Once there are more than nine sticks in the units column, ten sticks are combined to make a
bundle of ten sticks which is then placed in the tens column. The remaining sticks are left in
the units column. When there are more than nine bundles of ten in the tens column, these are
combined to make a bundle of ten tens. Rubber bands can be used to hold the bundles of
sticks together.
Teachers have embraced this game because it is versatile enough to be used to develop
relational understanding of the number system for all learners, from those just beginning to
count and write numerals, to those exploring very large numbers and decimal numbers.
Through participation in the Base Ten Game, students have the opportunity to develop their
ideas about place value, and to learn to operate on numbers using this structure: instead of
using materials to demonstrate the facts, skills and procedures of arithmetic, the game
frames the learning by giving children experiences with the structure as a whole. By giving
children access to models of the place value system they gradually build an understanding of
it.
The only rule of the game is that there can be no more than nine in any one column, that is, in
the units column there can be no more than nine sticks, in the tens column there can be no
more than nine bundles of ten sticks, in the hundreds column there can be no more than nine
bundles of 100 sticks, and so on. The project was undertaken using a model of action
research, involving nine teachers. The teachers selected to participate in this study had all
previously undertaken extended professional development to incorporate constructivist


learning theories in their teaching, and the teachers were keen to explore the use of concrete
materials, especially the base ten game in relation to student learning.
The five project schools represented the diverse range of communities within the independent
sector, including an isolated rural school with high numbers of Indigenous students, small and
large urban schools, and two schools with high numbers of students from a low socio-
economic background. The research took place between March and October of 2001, and
involved approximately 280 primary students.
The research methodology enabled the teacher researchers to focus on improved student
learning through changes in teacher practice and allowed concentration on the practical, day-
to-day realities of the classroom. Under the broad question, ’What are the most effective
teaching methods and management structures that will maximize the learning of the base ten
number system in a whole class setting?’ each teacher developed their own specific research
question. Within the parameters of each individual research question were common elements
of investigation that included:
• Range of teaching and learning strategies that were employed to enhance base ten
learning
• Monitoring and evaluation of these strategies
• Teacher knowledge that was required to effectively help students learn
• Use of concrete materials to assist learning

Game Board Showing 25 Pop-sticks


Throughout the project the teacher researchers were guided by a Project Officer, Mrs Andrea
Broadbent. The teacher researchers came together for professional development and sharing
of experiences, received visits to their schools, and kept reflective journals and student work
samples. Each teacher reflected upon their learning and wrote a report of their learning
journey. A final research report, drawing together common elements was constructed.
Questions:
1. What is the objective of the case?
2. Discuss the use of place value board and concrete materials in developing an understanding
of the number system.


1. A number system with base 10 is called the decimal number system. Other common
number systems include base-16 (hexadecimal), base-8 (octal), and base-2 (binary).
2. Digital systems are designed to store, process, and communicate information in digital
form.
3. The binary number system is the most natural number system for hardware; it requires a
small amount of circuitry and is more efficient.

SUMMARY

• A digital system is an interconnection of digital hardware modules that accomplish a specific
information-processing task.
• Digital systems operate on logic and are therefore called logic circuits, whose
building blocks are known as gates.
• Calculating using only 1s and 0s is known as the binary system.
• A number system with a base or radix of 10 is known as the decimal number system.
• A number system with a base or radix of 8 is known as the octal number system. A number
system with a base of 16 is known as the hexadecimal number system.
• An n-bit signed binary number consists of two parts, that is, the sign bit and the magnitude.
• Complements are used to simplify the subtraction operation.
KEYWORDS
Digital Systems: Digital systems are designed to store, process, and communicate information
in digital form.
Binary Number System: Binary number is a number expressed in the base-2 numeral system or
binary numeral system, a method of mathematical expression which uses only two symbols:
typically "0" and "1".
Decimal Number System: A number system with base or radix of 10 is known as decimal
number system.
Octal Number System: A number system with base or radix of 8 is known as octal number
system.
Hexadecimal Number System: A number system with base of 16 is known as hexadecimal
number system.
Complements: Complements are used in digital computers to simplify the subtraction
operation.
SELF-ASSESSMENT QUESTIONS


Short Answer Questions


1. Define the concept of digital system.
2. Differentiate between positional and non-positional number system.
3. Define 1’s complement with example.
4. What are signed binary numbers?
5. Differentiate 1’s complement and 2’s complement of signed binary number.
6. Write the steps included in power method for converting decimal number systems to
other number systems.
7. Define the hexadecimal number system with an example.
Long Answer Questions
1. Discuss about the concept of different number systems with examples.
2. Explain decimal number system to other number system conversion process with
example.
3. Describe the concept of representing signed numbers using the binary number
system.
4. Explain in detail the concept of 1’s complement and 2’s complement with example.
FURTHER READING
1. Comer, Douglas E. 2012. “Essentials of Computer Architecture”, Pearson, sixth edition.
2. Godse, A. D. 2008. “Computer Organisation and Architecture”, Technical Publications.
3. Harris, David. 2012. “Digital Design and Computer Architecture”, Elsevier.
4. Mano, Morris. 2012. “Logic and Computer Design Fundamentals”, Cram101.


UNIT I-DIGITAL FUNDAMENTALS


Chapter 2 – LOGIC GATES

CONTENTS
Learning Objectives
Learning Outcomes
Overview
2.1 Types of Logic Gates
2.1.1 The NOT (Inverter) Gate
2.1.2 The AND Gate
2.1.3 The OR Gate
2.1.4 NAND Gate
2.1.5 The NOR Gate
2.1.6 X-OR Gate (Exclusive-OR)
2.1.7 X-NOR Gate
2.1.8 Non-Inverting Buffer
2.1.9 Open Collector and Open Drain
2.2 Tri-state Gates
2.2.1 Tri-state Buffer Switch Equivalent
2.2.2 Active “HIGH” Tri-state Gates
2.2.3 Active “LOW” Tri-state Gate
2.2.4 Multi-bit Tri-state Buffers
2.2.5 Tri-state Gate Control
Summary
Self-Assessment Questions
Keywords
Further Reading


LEARNING OBJECTIVES
After studying this chapter, you should be able to:
• Gain knowledge about concept of logic gates
• Understand about the different types of logic gates
• Analyze the logic symbol and truth table for each gate
• Gain knowledge about tri-state gate
LEARNING OUTCOMES
Upon completion of the chapter, students are able to understand:
• The implementation of logic gates
• Analyze the different types of logic gates and their symbol
• Identify the logic symbols and truth table for each gates
• The concept of tri-state gates
OVERVIEW
In the previous chapter, you studied the concept of digital systems, different number
systems and their conversions. You also learned about signed binary numbers and complements.
In this chapter, you will study the concept of logic gates. Logic gates are the basic
building blocks of digital hardware. As the name suggests, they work by “opening” or
“closing” to admit or reject the flow of digital information. A logic gate is an idealized or
physical device implementing a Boolean function; that is, it performs a logical operation on one
or more logical inputs and produces a single logical output. Gates electronically implement
simple logical operations on Boolean variables, that is, variables that can have
only one of two states (0/1, low/high, false/true). This chapter explains the various types of
logic gates. You will also examine the tri-state logic gate, which has a special kind of output.
You can build digital systems by using logic gates, and the different logic gates
that you will learn in this chapter are AND, OR, NOT, NAND, NOR, X-OR and X-NOR.
2.1 TYPES OF LOGIC GATES
Logic gates are the basic building blocks of any digital system. It is an electronic circuit having
one or more than one input and only one output. The relationship between the input and the
output is based on certain logic. Based on this, logic gates are named as AND gate, OR gate,
NOT gate etc. The logic state of a terminal can, and usually does, change often, as the circuit
processes data. In most logic gates, the low state is approximately zero volts (0 V), while the
high state is approximately five volts positive (+5 V).
There are mainly seven types of logic gates that are used in expressions. By combining them in
different ways, you will be able to implement all types of digital components. Now, each basic
logic gate and their operation are discussed.


2.1.1 The NOT (Inverter) Gate


The NOT gate is also known as an inverter, as it changes the input to its opposite. The NOT
gate has only one input and one corresponding output. It is a device whose output is
always the complement of the given input. That is, the NOT gate produces an output of
logic 1 state when the input is in logic 0 state, and produces an output of logic 0 state
when the input is in logic 1 state. The NOT operation is denoted by ‘–’ (bar). When the
input variable to the NOT gate is represented by ‘X’ and the output is represented by ‘Z’, then
the output is read as Z equals X̄. The logic symbol and truth table are given below:

Figure 2.1: NOT Gate

2.1.2 THE AND GATE


An AND gate requires two or more inputs and produces only one output. The AND gate
produces an output of logic 1 state when all of its inputs are at logic 1 state, and produces
an output of logic 0 state if any of its inputs is at logic 0 state. The symbol for the AND
operation is ‘.’, or no symbol at all. If the inputs are X and Y, then the
output can be expressed as Z = XY. The AND gate is so named because, if 0 is called “false” and
1 is called “true”, the gate performs in the same way as the logical “and” operator. The AND
gate is also called the all-or-nothing gate. The logic symbols and truth tables of two-input and
three-input AND gates are given below.

Figure 2.2: AND Gate

2.1.3 THE OR GATE


Similar to the AND gate, an OR gate may also have two or more inputs but produces only one output.
The OR gate produces an output of logic 1 state if any of its inputs is in logic 1 state, and
produces an output of logic 0 state only when all of its inputs are in logic 0 state. The symbol for the OR
operation is ‘+’. If the inputs are X and Y, then the output can be represented as Z = X + Y. An
OR gate may also be defined as a device whose output is 1 if at least one of its inputs is 1. The OR gate
is also called the any-or-all gate. It is also called an inclusive-OR gate because it includes the
condition of ‘both the inputs can be present’. The logic symbols and truth table for two-input
OR gates are given below.

Figure 2.3: OR Gate

2.1.4 THE NAND GATE


The NAND and NOR gates are the universal gates. Each of these gates can realize any logic
circuit single-handedly, so NAND and NOR are also called universal building blocks. Both
NAND and NOR can perform the three basic logic functions AND, OR, and
NOT. A NAND gate is a combination of an AND gate and a NOT gate; the NAND operation is
denoted by a bar over the whole AND expression. The output of the NAND gate is at logic 0 level only
when all of its inputs assume a logic 1 level. The truth table of the two-input NAND gate is given below:

Figure 2.4: NAND Gate


2.1.5 THE NOR GATE
NOR means NOT OR. That means, NOR gate is a combination of an OR gate and a NOT gate. The
output is logic 1 level, only when each of its inputs assumes a logic 0 level. For any other combination of
inputs, the output is a logic 0 level. The truth table of two-input NOR gate is given below:


Figure 2.5: NOR Gate

2.1.6 X-OR Gate (Exclusive-OR)


An X-OR gate is a two-input, one-output logic circuit. The X-OR gate output assumes a logic 1
state when exactly one of its two inputs assumes a logic 1 state. When both the inputs assume the
logic 0 state, or when both the inputs assume the logic 1 state, the output assumes a logic 0 state.
The output of the X-OR gate is the modulo-2 sum of its inputs. The X-OR gate is also termed an
anti-coincidence gate or inequality detector. An X-OR gate can also be used as an inverter by
connecting one of its two input terminals to logic 1 and feeding the sequence to be
inverted to the other terminal.

Figure 2.6: X-OR Gate


2.1.7 X-NOR GATE
An X-NOR gate is a combination of an X-OR gate and a NOT gate. The X-NOR gate is also a two-
input, one-output device. The output of the X-NOR gate will be logic 1 state when both the inputs
assume a 0 state or when both the inputs assume a 1 state. The output of the X-NOR gate will be logic 0
state when one of the inputs assumes a 0 state and the other a 1 state. It is also named the coincidence
gate, because its output will be 1 only when the inputs coincide. The X-NOR gate can also be used as an
inverter by connecting one of its two input terminals to logic 0 and feeding the sequence to be
inverted to the other terminal.


Figure 2.7: X-NOR Gate
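As a quick sketch, all of the gates above can be modelled as Boolean functions on 0/1 values, and the claimed universality of NAND checked exhaustively (the function names are illustrative):

```python
# The seven gates as Boolean functions on 0/1 values.
NOT  = lambda a: 1 - a
AND  = lambda a, b: a & b
OR   = lambda a, b: a | b
NAND = lambda a, b: 1 - (a & b)
NOR  = lambda a, b: 1 - (a | b)
XOR  = lambda a, b: a ^ b
XNOR = lambda a, b: 1 - (a ^ b)

# Combined truth table for the two-input gates.
print("A B | AND OR NAND NOR XOR XNOR")
for a in (0, 1):
    for b in (0, 1):
        print(a, b, "|", *[g(a, b) for g in (AND, OR, NAND, NOR, XOR, XNOR)])

# NAND is universal: NOT, AND and OR can all be built from NAND alone.
assert all(NOT(a) == NAND(a, a) for a in (0, 1))
assert all(AND(a, b) == NAND(NAND(a, b), NAND(a, b))
           for a in (0, 1) for b in (0, 1))
assert all(OR(a, b) == NAND(NAND(a, a), NAND(b, b))
           for a in (0, 1) for b in (0, 1))
```

The three assertions at the end are the NAND constructions of NOT, AND and OR; the symmetric constructions from NOR work the same way.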

2.1.8 Non-Inverting Buffer


The buffer is a single-input device that has a gain of 1, mirroring the input at the output. It
is useful for impedance matching and for isolating the input from the output. A single-input
digital buffer differs from an inverter: it does not invert or alter its input signal in any way,
but simply reproduces the input value at the output.

Figure 2.8: Buffer

2.1.9 Open Collector and Open Drain


When driving external devices, it is sometimes necessary to use open-collector or open-
drain outputs. Open-collector or open-drain outputs do not source current, they can only
sink current. Essentially, they act as simple switches to ground. In their high state, they are
open circuits, and in their low state they pull the output pin to ground, sinking as much
current as needed, up to a limit.
An open-collector/open-drain signal wire can also be bi-directional, meaning that a
device can both output and input a signal on the wire at the same time. In addition to
controlling the state of its pin that is connected to the signal wire (active or non-active), a
device can also sense the voltage level of the signal wire. Although the output of an open-
collector/open-drain device may be in the non-active (high) state, the wire attached to the
device may be in the active (low) state, due to the activity of another device attached to the wire.
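A minimal sketch of such a shared open-drain line, assuming `None` stands for a released (non-active) output and 0 for a device sinking current to ground:

```python
def bus_level(device_outputs):
    """Resolve the level of a shared open-drain wire.

    Each device either pulls the line low (0) or releases it (None);
    an external pull-up resistor supplies the high level when nobody
    is sinking current.
    """
    return 0 if 0 in device_outputs else 1

print(bus_level([None, None, None]))  # 1 : nobody pulling, the pull-up wins
print(bus_level([None, 0, None]))     # 0 : one device pulls the line low
```

This wired-AND behaviour is why open-drain outputs can share a wire safely: no device ever drives the line high against another driving it low.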
Example: Prove the following rules and draw their respective circuits:
(a) AC + ABC = AC
(b) A + AB = A


Solutions:
(a) AC + ABC = AC (1 + B) = AC, since 1 + B = 1

(b) A + AB = A (1 + B) = A
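Both identities can also be verified exhaustively over all 0/1 assignments, complementing the algebraic proofs above:

```python
# Exhaustive check of the two identities over every 0/1 assignment.
for A in (0, 1):
    for B in (0, 1):
        for C in (0, 1):
            # (a) AC + ABC = AC
            assert ((A & C) | (A & B & C)) == (A & C)
        # (b) A + AB = A
        assert (A | (A & B)) == A
print("both identities hold")
```

For two or three variables an exhaustive truth-table check is always a complete proof, since a Boolean identity holds exactly when it holds for every input combination.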

2.2 TRI-STATE GATES

A tri-state buffer (a special gate output) is a useful device that allows us to control when
current passes through the device, and when it does not.

Figure 2.9: Tri-state Buffer

A tri-state buffer has two inputs: a data input x and a control input c. The control input acts
like a valve. When the control input is active, the output is the input; that is, it behaves
just like a normal buffer, and the "valve" is open. When the control input is not active, the
output is "Z": the "valve" is closed, and no electrical current flows through. Thus, even if x is
0 or 1, that value does not flow through.

Figure 2.10: Tri-state buffer


2.2.1 Tri-state Buffer Switch Equivalent

Figure 2.11 Switch Equivalent

When activated into its third state it disables or turns “OFF” its output producing an open circuit
condition that is neither at a logic “HIGH” or “LOW”, but instead gives an output state of very
high impedance, High-Z, or more commonly Hi-Z. Then this type of device has two logic state
inputs, “0” or a “1” but can produce three different output states, “0”, “1” or ” Hi-Z ” which is
why it is called a “Tri” or “3-state” device.
Note that this third state is NOT equal to a logic level “0” or “1”; it is a high-impedance state
in which the buffer’s output is electrically disconnected from the rest of the circuit. As a result, no
current is drawn from the supply.
There are four different types of Tri-state Buffer, one set whose output is enabled or disabled by
an “Active-HIGH” control signal producing an inverted or non-inverted output, and another set
whose buffer output is controlled by an “Active-LOW” control signal producing an inverted or
non-inverted output as shown below.
2.2.2 Active “HIGH” Tri-state Gates
In this case, when the output is Z, that means it is high impedance, neither 0, nor 1, i.e., no
current.

Figure.2.12: High Tri-state Gate

When c = 1 the valve is open, and z = x. When c = 0 the valve is closed, and z= Z (e.g., high
impedance/no current).
2.2.3 Active “LOW” Tri-state Gate
Some tri-state buffers are active-low. In an active-low tri-state buffer, c = 0 opens the
valve, while c = 1 closes it. Here is the condensed truth table for an active-low tri-state
buffer.


Figure 2.13: Active Low Tri-state Gate

As you can see, when c = 0 the valve is open, and z = x. When c = 1 the valve is closed,
and z = Z (e.g., high impedance/no current).
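Both the active-high and active-low behaviour can be modelled in a few lines of Python (the string "Z" stands in for the Hi-Z state; the names are illustrative):

```python
HI_Z = "Z"  # high-impedance: electrically disconnected, neither 0 nor 1

def tristate(x, c, active_high=True):
    """Model a tri-state buffer.

    Active-high: the output follows x when c = 1, otherwise Hi-Z.
    Active-low:  the output follows x when c = 0, otherwise Hi-Z.
    """
    enabled = (c == 1) if active_high else (c == 0)
    return x if enabled else HI_Z

print(tristate(1, 1))                     # 1 : valve open, output follows x
print(tristate(1, 0))                     # Z : valve closed, line released
print(tristate(0, 0, active_high=False))  # 0 : active-low buffer enabled
```

A 32-bit tri-state buffer, as described in the next section, is just this function applied to each of the 32 wires with one shared control bit.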
2.2.4 Multi-bit Tri-state Buffers
So far, we've talked about a tri-state buffer controlling the output of a single wire.
However, it's more common to deal with many wires.
Example:

Figure 2.14: Tri-state Buffer with a Bus of 32 Bits

In this case, we have 32 wires coming into the tri-state buffer. We have 32 wires coming
out of the tri-state buffer. There's still only 1 control bit.
This can easily be implemented using 32 tri-state buffers taking one bit as input and one bit as
output.
2.2.5 Tri-state Gate Control
The third state (Hi-Z) removes the device's influence from the rest of the circuit. If more
than one device is electrically connected, putting an output into the Hi-Z state is often used to
prevent short circuits, or one device driving high (logical 1) against another device driving
low (logical 0). Three-state buffers can also be used to implement efficient multiplexers,
especially those with large numbers of inputs. Three-state logic can reduce the number of
wires needed to drive a set of LEDs.

Figure 2.15 Tri-state Gate Control Connected to a Single Wire or Bus

The tri-state buffer is used in many electronic and microprocessor circuits as it allows
multiple logic devices to be connected to the same wire or bus without damage or loss of data.
For example, suppose we have a data line or data bus with some memory, peripherals, I/O or a
CPU connected to it. Each of these devices is capable of sending or receiving data over this
single data bus at the same time, creating what is called contention.
Contention occurs when multiple devices connected together try to drive the bus at once, some
driving their outputs high and some low. If these devices start to send or receive data at the same
time, a short circuit may occur when one device outputs a logic “1” (the supply voltage) to the
bus while another is set at logic level “0” (ground), resulting in a short-circuit condition and
possibly damage to the devices as well as loss of data. Digital information is sent over these data
buses or data highways either serially, one bit at a time, or over eight (or more) wires together in
parallel form, as in a microprocessor data bus, allowing multiple tri-state buffers to be
connected to the same data highway without damage or loss of data, as shown.
Learning Activity

Illustrate about Tri-state Buffer switch equivalent.

Buffer in Computer Network


Buffer is a region of memory used to temporarily hold data while it is being moved from one
place to another. A buffer is used when moving data between processes within a computer.
Majority of buffers are implemented in software. Buffers are generally used when there is a
difference between the rate at which data is received and the rate at which it can be processed.
If we remove buffers, then either we will have data loss, or we will have lower bandwidth
utilization.
What is Buffering?
In buffering, whether the communication is direct or indirect, messages exchanged by
communicating processes reside in a temporary queue.
Types of Buffering:
Zero Capacity – This queue cannot keep any message waiting in it; its maximum length is 0.
For this, a sending process must be blocked until the receiving process receives the
message. It is also known as no buffering.
Bounded Capacity – This queue has finite length n, so it can hold n waiting messages.
If the queue is not full, a new message can be placed in the queue, and the sending process is not
blocked. It is also known as automatic buffering.
Unbounded Capacity – This queue has infinite length, so any number of messages can
wait in it. In such a system, a sending process is never blocked.
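The bounded-capacity case can be sketched as follows (a simplified, non-blocking model; the class name is illustrative, and a real system would block the sender instead of refusing):

```python
from collections import deque

class BoundedBuffer:
    """Bounded-capacity message queue: a sender is refused (rather than
    blocked, in this simplified sketch) when the queue is full."""

    def __init__(self, capacity):
        self.queue = deque()
        self.capacity = capacity

    def send(self, msg):
        if len(self.queue) >= self.capacity:
            return False              # full: a real system blocks the sender
        self.queue.append(msg)
        return True

    def receive(self):
        # deliver the oldest waiting message, or None if the queue is empty
        return self.queue.popleft() if self.queue else None

buf = BoundedBuffer(2)
print(buf.send("m1"), buf.send("m2"), buf.send("m3"))  # True True False
print(buf.receive())                                   # m1
```

Setting the capacity to 0 models the zero-capacity case, where every send is refused until a receiver takes the message directly.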
The Purpose of Buffering
We live in a connected world. The first thing that may come to mind when you hear the word


buffer is probably related to watching videos on YouTube or live streams from Netflix or
Hulu. In a video stream, a buffer represents the amount of data required to be downloaded
before the video can be played back to the viewer in real time. The buffer stores transmitted
data temporarily as it is going between devices or between a device and an app.
A buffer in a computer environment means that a set amount of data is going to be stored in
order to preload the required data right before it gets used by the CPU. Computers have many
different devices that all operate at varying speeds, and a buffer is needed to act as a sort of
temporary placeholder for everything that is interacting. This is done to keep everything
running efficiently and without issues between all the devices, programs, and processes
running at that time.
Questions:
1. Discuss about need for buffer in computer network.
2. List out types of buffer.

1. NAND is a universal gate that has the ability to perform the three basic
logic functions AND, OR and NOT.
2. The buffer is a single-input device that has a gain of 1, mirroring the
input at the output. It has value for impedance matching and for isolation of
the input and output.

SUMMARY

• Logic gates perform essential logical functions and are the fundamental building blocks of
digital integrated circuits.
• The NOT gate is also called an inverter, as it changes its input to the opposite value.
• The AND gate takes in two or more inputs and produces just one output.
• The OR gate produces an output of logic 1 state if any of its inputs is in logic 1 state, and
produces an output of logic 0 state only when all of its inputs are in logic 0 state.
• NAND can perform the three fundamental logic functions AND, OR and NOT.
• NOR means NOT OR. It is a combination of an OR gate and a NOT gate.
• An X-OR gate is a two-input, one-output logic circuit. The X-OR output assumes logic 1 state
when exactly one of its two inputs assumes logic 1 state.
• An X-NOR gate is a combination of an X-OR gate and a NOT gate. The X-NOR gate is also
a two-input, one-output circuit.


• The buffer is a single-input device, which has a gain of 1, mirroring the input at the output.
• Open-collector or open-drain outputs do not source current, they can only sink current.
• A tri-state buffer is a useful device that allows us to control when current passes through the
device, and when it doesn't.
KEYWORDS
Logic Gates: They are the basic building blocks of digital circuitry.
NOT Gate: It is also called as an inverter as it changes the input to its opposite.
AND Gate: It takes in two or more inputs and produces only one output
OR Gate: It produces an output of logic 1 state if at least one of its inputs is in logic 1 state,
and an output of logic 0 state only when all of its inputs are in logic 0 state.
NAND Gate: It is a universal gate that has the ability to perform three basic logic functions
such as AND, OR and NOT.
NOR Gate: NOR gate is a combination of an OR gate and a NOT gate.
X-OR Gate: It is a two input, one output logic circuit that assumes logic 1 state when exactly
one of its two inputs assumes logic 1 state.
X-NOR Gate: It is a combination of an X-OR gate and a NOT gate.
Buffer: It is a single-input device that has a gain of 1, mirroring the input at the output.
Tri-state Buffer: It is a useful device that allows us to control when current passes through
the device, and when it doesn't.
SELF-ASSESSMENT QUESTIONS
Short Answer Questions
1. What are logic gates? Discuss.
2. Differentiate between basic and universal gates.
3. What is a buffer?
4. Discuss about X-OR Gate.
5. Define X-NOR Gate.
6. Discuss open drain concept.
7. What is the use of tri-state gate?
8. Discuss the gate fan out example.


Long Answer Questions


1. Explain the different types of logic gates with suitable examples.
2. Represent each of the AND, OR, NOT, and XOR gates using only NOR gates.
3. Discuss and brief about the X-OR Gate with examples.
4. Illustrate the difference between EXOR and EXNOR gates with examples.
5. Explain how we use the NOT gate as a double inverter. Illustrate with an example.
6. Illustrate the difference between active “HIGH” Tri-state Gates and active “LOW” Tri-
state Gates.
7. Open-collector or open-drain outputs do not source current, they can only sink current.
Comment.
FURTHER READING
1. Mano, M. Morris. 2007. “Digital Design”. Englewood Cliffs, N.J.: Prentice-Hall.
2. Mohammad A. Karim, Xinghao Chen. 2007. “Digital Design: Basic Concepts and Principles”.
3. Gill, Naseeb S. 2011. “Digital Design and Computer Organisation”, Firewall Media.
4. Harris, David. 2012. “Digital Design and Computer Architecture”, Elsevier.
5. Mano, Morris. 2012. “Logic and Computer Design Fundamentals”, Cram101.


UNIT I-DIGITAL FUNDAMENTALS


CHAPTER 3 – BOOLEAN ALGEBRA

CONTENTS
Learning Objectives
Learning Outcomes
Overview
3.1 Concept of Boolean Algebra
3.1.1 Postulates of Boolean Algebra
3.1.2 Laws of Boolean Algebra and their Truth Table
3.1.3 Boolean Arithmetic
3.1.4 Boolean Algebraic Identities
3.1.5 Translating Truth Tables into Boolean Expression
3.1.6 Boolean Rules for Simplification
3.1.7 Boolean Theorems
3.2 DeMorgan’s Theorems
3.2.1 DeMorgan’s First Theorem
3.2.2 DeMorgan’s Second Theorem
3.3 K-maps
3.3.1 Karnaugh Map Format
3.3.2 Looping
3.4 Standard Forms
3.5 NAND NOR Implementation
3.5.1 NAND Implementation
3.5.2 NOR Implementation
Summary
Keywords
Self-Assessment Questions
Further Reading


LEARNING OBJECTIVES
After Studying this chapter, you should be able to:
• Explain the concept of Boolean Algebra
• Define DeMorgan’s Theorems
• Describe the concept of K-maps
• Explain standard forms
• Illustrate NAND and NOR implementation
LEARNING OUTCOMES
Upon completion of the chapter, students are able to understand:
• Gains knowledge about Boolean Arithmetic
• Laws of Boolean Algebra and their truth table
• How to apply DeMorgan’s theorems
• Gains knowledge about functions of K-Maps
• Concept of minterms and maxterms
• Applications of K-map digital circuits and standard forms
OVERVIEW
In the previous chapter, you examined the concept of logic gates and studied the various
kinds of logic gates, for example OR, AND, NOR, NAND, X-OR, X-NOR and the
tri-state buffer, which is a special kind of gate output.
In this chapter, you will study the concept of Boolean algebra. Boolean algebra can be
characterized by a set of elements, a set of operators and a number of postulates. In certain
respects, Boolean algebra resembles ordinary algebra. You will study the representation of
Boolean algebra and DeMorgan's theorems. We will also cover the concept of the K-map
and the application of K-maps to digital circuits.
The study of Boolean algebra will help you in understanding circuits and their working. The
knowledge of Boolean algebra will also prepare you for the concepts of combinational and
sequential circuits, which you will study in the next chapter.
3.1 CONCEPT OF BOOLEAN ALGEBRA
In 1854 George Boole introduced a systematic treatment of logic and developed for this purpose
an algebraic system known as symbolic logic, or Boolean algebra. Boolean algebra is a branch of
mathematics, and it can be used to describe the manipulation and processing of binary
information. The two-valued Boolean algebra has important applications in the design of modern
computing systems. This chapter contains a brief introduction to the basics of logic design. It
provides minimal coverage of Boolean algebra and this algebra's relationship to logic gates and
basic digital circuits.


Boolean algebra is algebra for the manipulation of objects that can take on only two values,
typically true and false. It is common to interpret the digital value 0 as false and the digital value
1 as true.
Example 3.1: A + A = A, not 2A.
Also 1 + 1 = 1, not 2, as it is a logical expression.
One can visualize this as,
TRUE+TRUE=TRUE
FALSE+FALSE=FALSE
A Boolean algebraic expression is composed of variables, constants and operators. The
variables are commonly denoted by letters of the alphabet (say A), which can have two
possible values, 1 or 0. The interpretation of 1 may be that the variable is present, the input
signal is ON, is TRUE, and is a high voltage. If A is 0, then it means that the variable is absent,
the input signal is OFF, is FALSE, and is a low voltage. Similarly, a Boolean constant can
have either of the two values, 1 or 0.
Boolean operators are used in Boolean algebra where a mathematical function called Boolean
function is constructed. These operators are the symbols:
• PLUS (+) meaning an OR operation
• DOT (.) meaning an AND operation
• BAR read as COMPLEMENT meaning a NOT operation.
3.1.1 Postulates of Boolean Algebra
In this section, let us discuss the Boolean postulates and basic laws that are used in
Boolean algebra. These are useful in minimizing Boolean functions.
Boolean Postulates
Consider the binary numbers 0 and 1, a Boolean variable x and its complement x'. Either the
Boolean variable or its complement is known as a literal. The four possible logical
AND operations among these literals and binary numbers are shown below.
x.1 = x
x.0 = 0
x.x = x
x.x' = 0
Similarly, the four possible logical OR operations among those literals and binary numbers are:
x + 0 = x
x + 1 = 1
x + x = x
x + x' = 1
These are the simple Boolean postulates. We can verify these postulates easily, by
substituting the Boolean variable with ‘0’ or ‘1’.
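The substitution check can also be carried out mechanically. The short sketch below (our own
illustration, not part of the original text) verifies all eight postulates using Python's bitwise
AND (`&`) and OR (`|`) operators on the values 0 and 1, with `1 - x` standing for the
complement x':

```python
# Verify the eight Boolean postulates by substituting x = 0 and x = 1.
for x in (0, 1):
    nx = 1 - x               # the complement x'
    # The four AND postulates
    assert (x & 1) == x      # x . 1 = x
    assert (x & 0) == 0      # x . 0 = 0
    assert (x & x) == x      # x . x = x
    assert (x & nx) == 0     # x . x' = 0
    # The four OR postulates
    assert (x | 0) == x      # x + 0 = x
    assert (x | 1) == 1      # x + 1 = 1
    assert (x | x) == x      # x + x = x
    assert (x | nx) == 1     # x + x' = 1
verified = True
print(verified)  # → True
```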


3.1.2 Laws of Boolean Algebra and their Truth Table


(a) Commutative Law: Any binary operation which satisfies the following expressions is
referred to as a commutative operation. The commutative law states that changing the sequence
of the variables does not have any effect on the output of a logic circuit.
• A.B = B.A
• A+B = B+A
A B A+B A.B B+A B.A
0 0 0 0 0 0
0 1 1 0 1 0
1 0 1 0 1 0
1 1 1 1 1 1
Figure 3.1: Truth Table for Commutative Law
(b) Associative Law: It states that the order in which the logic operations are performed is irrelevant as
their effect is the same.
• ( A. B ). C = A . ( B . C )
• ( A + B ) + C = A + ( B + C)
A B C  B+C  A+B  A+(B+C)  (A+B)+C  B.C  A.B  A.(B.C)  (A.B).C
0 0 0   0    0      0        0      0    0      0        0
0 0 1   1    0      1        1      0    0      0        0
0 1 0   1    1      1        1      0    0      0        0
0 1 1   1    1      1        1      1    0      0        0
1 0 0   0    1      1        1      0    0      0        0
1 0 1   1    1      1        1      0    0      0        0
1 1 0   1    1      1        1      0    1      0        0
1 1 1   1    1      1        1      1    1      1        1

Figure 3.2: Truth Table for Associative Law

(c) Distributive Law: Distributive law states the following conditions:

• A.( B + C) = (A. B) + (A. C)


• A + (B. C) = (A + B) . ( A + C)
A B C  B+C  A.B  A.C  A.(B+C)  (A.B+A.C)  B.C  A+B  A+C  A+(B.C)  (A+B).(A+C)
0 0 0   0    0    0      0         0       0    0    0      0          0
0 0 1   1    0    0      0         0       0    0    1      0          0
0 1 0   1    0    0      0         0       0    1    0      0          0
0 1 1   1    0    0      0         0       1    1    1      1          1
1 0 0   0    0    0      0         0       0    1    1      1          1
1 0 1   1    0    1      1         1       0    1    1      1          1
1 1 0   1    1    0      1         1       0    1    1      1          1
1 1 1   1    1    1      1         1       1    1    1      1          1

Figure 3.3: Truth Table for Distributive Law
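Because each variable takes only the values 0 and 1, all three laws can also be confirmed
exhaustively by machine. The following sketch (our own illustration, not part of the original
text) evaluates every row of the truth tables above:

```python
from itertools import product

# Evaluate every input combination for the commutative, associative
# and distributive laws of Boolean algebra.
for A, B, C in product((0, 1), repeat=3):
    # Commutative law
    assert (A & B) == (B & A)
    assert (A | B) == (B | A)
    # Associative law
    assert ((A & B) & C) == (A & (B & C))
    assert ((A | B) | C) == (A | (B | C))
    # Distributive law
    assert (A & (B | C)) == ((A & B) | (A & C))
    assert (A | (B & C)) == ((A | B) & (A | C))
laws_hold = True
print(laws_hold)  # → True
```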


3.1.3 Boolean Arithmetic


Binary arithmetic is an essential part of various digital systems. You can add, subtract,
multiply, and divide binary numbers using various methods. These operations are much easier
than decimal number arithmetic operations because the binary system has only two digits: 0
and 1.
Let us begin our exploration of Boolean algebra by adding numbers together:
0 + 0 = 0
0 + 1 = 1
1 + 0 = 1
1 + 1 = 1
The first three sums make perfect sense to anyone familiar with elementary addition. The last
sum, though, is quite possibly responsible for more confusion than any other single statement in
digital electronics, because it seems to run contrary to the basic principles of mathematics. Well,
it does contradict the principles of addition for real numbers, but not for Boolean numbers.
Remember that in the world of Boolean algebra, there are only two possible values for any
quantity and for any arithmetic operation: 1 or 0. There is no such thing as “2” within the scope
of Boolean values. Since the sum “1 + 1” certainly isn’t 0, it must be 1 by process of elimination.
It does not matter how many or few terms we add together, either. Consider the following sums:
0 + 1 + 1 = 1
1 + 1 + 1 = 1
1 + 0 + 1 + 1 = 1
Binary Addition
Binary addition is much like your normal everyday addition (decimal addition), except that
only the digits 0 and 1 are used instead of 0 to 9.
Example3.2: In decimal addition, if you add 8 + 2 you get ten, which you write as 10; in the sum
this gives a digit 0 and a carry of 1. Something similar happens in binary addition when you add
1 and 1; the result is two (as always), but since two is written as 10 in binary, we get, after
summing 1 + 1 in binary, a digit 0 and a carry of 1.
Therefore in binary:
0+0=0
0+1=1
1+0=1
1 + 1 = 10 (which is 0 carry 1)


Example 3.3: 101101 + 11001 = 1000110 and 10001 + 11101 = 101110

    1 0 1 1 0 1          1 0 0 0 1
  +   1 1 0 0 1        + 1 1 1 0 1
  -------------        -----------
  1 0 0 0 1 1 0        1 0 1 1 1 0
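The same column-by-column procedure can be written out in code. The helper name
`binary_add` below is our own; it applies the four addition rules with a running carry,
working from the least significant column to the most significant:

```python
def binary_add(a, b):
    """Add two binary strings column by column, carrying into the next column."""
    width = max(len(a), len(b))
    a, b = a.zfill(width), b.zfill(width)    # pad to equal length
    digits, carry = [], 0
    for x, y in zip(reversed(a), reversed(b)):
        s = int(x) + int(y) + carry
        digits.append(str(s % 2))            # digit written in this column
        carry = s // 2                       # carry into the next column
    if carry:
        digits.append("1")
    return "".join(reversed(digits))

print(binary_add("101101", "11001"))  # → 1000110
print(binary_add("10001", "11101"))   # → 101110
```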

Binary Subtraction
Subtraction and borrow are the words used very frequently for the binary subtraction. There are
four rules of binary subtraction.
A B Difference Borrow
0 0 0 0
0 1 1 1
1 0 1 0
1 1 0 0
Work the columns right to left subtracting in each column. If you must subtract a one from a
zero, you need to “borrow” from the left, just as in decimal subtraction.
Example 3.4: 1010110 − 101010 = 101100 and 1011011 − 10010 = 1001001

    1 0 1 0 1 1 0        1 0 1 1 0 1 1
  −   1 0 1 0 1 0      −     1 0 0 1 0
  ---------------      ---------------
      1 0 1 1 0 0        1 0 0 1 0 0 1
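The borrow rules in the table above can likewise be sketched in code (the helper name
`binary_sub` is our own). Whenever a column would go negative, the routine borrows a 2
from the next column to the left, just as in decimal subtraction:

```python
def binary_sub(a, b):
    """Subtract binary string b from a (assumes a >= b), using borrows."""
    width = max(len(a), len(b))
    a, b = a.zfill(width), b.zfill(width)
    digits, borrow = [], 0
    for x, y in zip(reversed(a), reversed(b)):
        d = int(x) - int(y) - borrow
        if d < 0:
            d += 2          # borrow a 2 from the next column to the left
            borrow = 1
        else:
            borrow = 0
        digits.append(str(d))
    return "".join(reversed(digits)).lstrip("0") or "0"

print(binary_sub("1010110", "101010"))  # → 101100
print(binary_sub("1011011", "10010"))   # → 1001001
```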
Binary Multiplication
Binary multiplication is the process of multiplying binary numbers. The process of multiplying
binary numbers is the same as that of arithmetic multiplication with decimal numbers. The only
difference is that binary multiplication involves numbers that consist of 0s and 1s, whereas
decimal multiplication involves numbers that comprise the digits 0 to 9.


Example 3.5: Using the binary multiplication rules, multiply (110)2 and (11)2.
The rules for binary multiplication are:
0×0=0
0×1=0
1×0=0
1×1=1
Let us use the above rules to multiply the binary numbers.
Given, multiplicand = (110)2, multiplier = (11)2. We multiply the two numbers as shown below.

      1 1 0
    ×   1 1
    -------
      1 1 0
    1 1 0
  ---------
  1 0 0 1 0

So (110)2 × (11)2 = (10010)2.
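The shift-and-add procedure behind this worked example can be sketched as follows (the
helper name `binary_mul` is our own): for each 1 bit of the multiplier, the multiplicand is
shifted left by that bit's position and added to a running total.

```python
def binary_mul(a, b):
    """Multiply two binary strings by shift-and-add."""
    total = 0
    for i, bit in enumerate(reversed(b)):
        if bit == "1":
            # add the multiplicand, shifted left by the bit's position
            total += int(a, 2) << i
    return bin(total)[2:]

print(binary_mul("110", "11"))  # → 10010 (6 × 3 = 18 in decimal)
```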
Binary Division
The binary division operation is similar to the base 10 decimal system, except the base 2. The
division is probably one of the most challenging operations of the basic arithmetic operations.
There are different ways to solve division problems using binary operations.
1÷1 = 1
1÷0 = Meaningless
0÷1 = 0
0÷0 = Meaningless
Example 3.6: Solve 01111100 ÷ 0010
Here the dividend is 01111100, and the divisor is 0010.
Remove the zeros in the most significant bit positions of both the dividend and the divisor;
this does not change the value of either number.
So the dividend becomes 1111100, and the divisor becomes 10. Dividing 1111100 by 10 then
gives the quotient 111110 (124 ÷ 2 = 62 in decimal).
Example 3.7: Solve using the long division method: 101101 ÷ 101
Working from the most significant end, 101 divides into 101101 exactly 1001 times, so
101101 ÷ 101 = 1001 (45 ÷ 5 = 9 in decimal).
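Rather than reproducing the pencil-and-paper long-division steps, the sketch below (the
helper name `binary_div` is our own) checks both examples by converting the binary strings
to integers, dividing, and converting the quotient and remainder back to binary:

```python
def binary_div(dividend, divisor):
    """Divide two binary strings; return (quotient, remainder) as binary strings."""
    q, r = divmod(int(dividend, 2), int(divisor, 2))
    return bin(q)[2:], bin(r)[2:]

print(binary_div("01111100", "0010"))  # Example 3.6 → ('111110', '0')
print(binary_div("101101", "101"))     # Example 3.7 → ('1001', '0')
```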
Complement
Boolean algebra also supports the complement of a variable, which is the opposite of its value. It
uses capital alphabetical letters to denote variables.


Figure 3.4: Complement


Complement of A is read as A-not or A-bar. We can also use a prime symbol to represent
complementation.
3.1.4 Boolean Algebraic Identities
An identity is a statement true for all possible values of its variable or variables. Boolean algebra
has its own unique identities based on the bivalent states of Boolean variables.
Boolean Addition
X+0=X
X+1=1
X+X=X
X + X̅ = 1
Boolean Multiplication
X.0=0
X.1=X
X.X=X
X.X̅=0
3.1.5 Translating Truth Tables into Boolean Expressions
A truth table is a chart of 1s and 0s arranged to indicate the results (or outputs) of all possible
inputs. All possible inputs are arranged in columns on the left and the resulting outputs are listed
in columns on the right. For n inputs there are 2^n possible states (or combinations of inputs).

Figure 3.5:Truth Table


There are two basic ways to convert truth tables into Boolean expressions:


Sum of Products (SOP)


Sum of Product is the abbreviated form of SOP. Sum of product form is a form of expression in
Boolean algebra in which different product terms of inputs are being summed together. This
product is not arithmetical multiply but it is Boolean logical AND and the Sum is Boolean
logical OR.
Example 3.8: 1010 → A B̅ C D̅
Substituting A=1, B=0, C=1, D=0, the product term evaluates to 1:
A . B̅ . C . D̅ = 1 . 0̅ . 1 . 0̅ = 1 . 1 . 1 . 1 = 1
Product of Sum
The product of Sum form is a form in which products of different sum terms of inputs are taken.
These are not arithmetic product and sum but they are logical Boolean AND and OR
respectively.
Example 3.9: 1010 → A̅ + B + C̅ + D
Substituting A=1, B=0, C=1, D=0, the sum term evaluates to 0:
A̅ + B + C̅ + D = 1̅ + 0 + 1̅ + 0 = 0 + 0 + 0 + 0 = 0
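Translating truth table rows into terms can be automated. In the sketch below (the function
name `sop_from_truth_table` is our own, and a trailing apostrophe stands in for the
complement bar), one product term is generated for each input row where the function
output is 1, complementing each variable whose bit is 0:

```python
from itertools import product

def sop_from_truth_table(f, names=("A", "B", "C", "D")):
    """Build a sum-of-products expression from a Boolean function:
    one product term per input row where the output is 1."""
    terms = []
    for bits in product((0, 1), repeat=len(names)):
        if f(*bits):
            # complement (') each variable whose input bit is 0
            term = "".join(n if b else n + "'" for n, b in zip(names, bits))
            terms.append(term)
    return " + ".join(terms)

# A function that is 1 only for the input row A,B,C,D = 1,0,1,0 (Example 3.8)
f = lambda A, B, C, D: int((A, B, C, D) == (1, 0, 1, 0))
print(sop_from_truth_table(f))  # → AB'CD'
```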
3.1.6 Boolean Rules for Simplification
Boolean algebra is also used in simplifying logic circuits. By applying specific algebraic rules to
a Boolean equation resulting from a truth table, you will get a much simpler equation.

Figure 3.6: Boolean Identities

Boolean Theorems
The Boolean theorems are grouped into three categories below. Table 1 lists the single-variable
theorems. Table 2 lists the theorems relating expressions with multiple variables. Table 3 lists
two special multivariable theorems, DeMorgan’s theorems, which express how to change an
AND expression to an OR expression and vice versa.
Note that AB and A . B are equivalent representations of the AND function. The dot notation
is used below to avoid confusion when combining numbers and variables, i.e., to differentiate
the expression x . 0 from the variable x0.

1. x . 0 = 0        5. x + 0 = x
2. x . 1 = x        6. x + 1 = 1
3. x . x = x        7. x + x = x
4. x . x̅ = 0        8. x + x̅ = 1

Table 1: Single-variable Theorems

9.  x + y = y + x                              13a. x(y + z) = xy + xz
10. x . y = y . x                              13b. (w + x)(y + z) = wy + xy + wz + xz
11. x + (y + z) = (x + y) + z = x + y + z      14.  x + xy = x
12. x(yz) = (xy)z = xyz                        15a. x + x̅y = x + y
                                               15b. x̅ + xy = x̅ + y
Table 2: Multivariable Theorems

16. (x + y)' = x̅ . y̅        17. (x . y)' = x̅ + y̅
Table 3: DeMorgan’s Theorems
These theorems can be used to simplify expressions as shown in the examples below. In all of
the following examples, we refer to the theorems above by number for simplicity.
Example 3.10:
Simplify the expression z = ABD + ABD̅ to a sum of products (SOP) form.
z = ABD + ABD̅
z = AB(D + D̅)    by theorem 13
z = AB(1)         by theorem 8
z = AB            by theorem 2
∴ z = AB


Example 3.11:

Simplify the expression z = (A̅ + B)(A + B) to a sum of products form.

z = (A̅ + B)(A + B)
z = A̅A + A̅B + BA + BB    by theorem 13
z = 0 + A̅B + BA + B       by theorems 3 and 4
z = B(A̅ + A + 1)           by theorem 13
z = B(1)                    by theorem 6
z = B                       by theorem 2
∴ z = B
Example 3.12:
Simplify the expression z = ((A + C)(B + D))' to a sum of products form.
z = ((A + C)(B + D))'
z = (A + C)' + (B + D)'    by theorem 17
z = A̅C̅ + B̅D̅             by theorem 16
∴ z = A̅C̅ + B̅D̅
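Because every variable is either 0 or 1, simplifications like those above can be checked by
brute force over all input combinations. The sketch below (our own illustration, with `1 - x`
standing for the complement) confirms, for instance, that ABD + ABD' reduces to AB and
that ((A+C)(B+D))' reduces to A'C' + B'D':

```python
from itertools import product

def equivalent(f, g, nvars):
    """True when f and g agree on every combination of 0/1 inputs."""
    return all(f(*v) == g(*v) for v in product((0, 1), repeat=nvars))

# ABD + ABD' simplifies to AB
assert equivalent(lambda A, B, D: (A & B & D) | (A & B & (1 - D)),
                  lambda A, B, D: A & B, 3)

# ((A+C)(B+D))' simplifies to A'C' + B'D'
assert equivalent(lambda A, B, C, D: 1 - ((A | C) & (B | D)),
                  lambda A, B, C, D: ((1 - A) & (1 - C)) | ((1 - B) & (1 - D)), 4)

simplifications_hold = True
print(simplifications_hold)  # → True
```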

Example 3.13:
Determine the output expression of the circuit in Figure 1 (a three-input NAND gate with
inputs A, B and C) and simplify it to a sum of products.

z = (ABC)'         by analysis of the circuit
z = A̅ + B̅ + C̅    by theorem 17

Figure 1: Logic Circuit

∴ z = A̅ + B̅ + C̅
Note that in this example we have changed an AND expression (since a NAND gate is an AND
gate with an inverter after it) to an OR expression. This conversion leads directly to our ability
to implement any logic expression using a universal gate, e.g., NAND or NOR.
3.2 DEMORGAN’S THEOREMS
DeMorgan’s theorems and laws can be used to find the equivalency of the NAND and NOR
gates.
As we have seen previously, Boolean Algebra uses a set of laws and rules to define the operation
of a digital logic circuit with “0’s” and “1’s” being used to represent a digital input or output
condition. Boolean Algebra uses these zeros and ones to create truth tables and mathematical


expressions to define the digital operation of a logic AND, OR and NOT (or inversion)
operations as well as ways of expressing other logical operations such as the XOR (Exclusive-
OR) function.
While George Boole’s set of laws and rules allows us to analyse and simplify a digital circuit,
there are two laws within his set that are attributed to Augustus DeMorgan (a nineteenth-century
English mathematician) which view the logical NAND and NOR operations as separate NOT
AND and NOT OR functions respectively.
But before we look at DeMorgan’s Theory in more detail, let’s remind ourselves of the basic
logical operations where A and B are logic (or Boolean) input binary variables, and whose values
can only be either “0” or “1” producing four possible input combinations, 00, 01, 10, and 11.
Input Variables       Output Conditions
A  B      AND   NAND   OR   NOR
0  0       0     1     0     1
0  1       0     1     1     0
1  0       0     1     1     0
1  1       1     0     1     0
Figure 3.7: Truth Table
The following table gives a list of the common logic functions and their equivalent Boolean
notation where a “.” (a dot) means an AND (product) operation, a “+” (plus sign) means
an OR (sum) operation, and the complement or inverse of a variable is indicated by a bar over
the variable.

Figure 3.8:Logic Functions and Notations


DeMorgan’s Theorem
DeMorgan’s Theorems are basically two sets of rules or laws developed from the Boolean
expressions for AND, OR and NOT using two input variables, A and B. These two rules or
theorems allow the input variables to be negated and converted from one form of a Boolean
function into an opposite form.
DeMorgan’s first theorem states that two (or more) variables NOR´ed together is the same as the
two variables inverted (Complement) and AND´ed, while the second theorem states that two (or
more) variables NAND´ed together is the same as the two terms inverted (Complement) and


OR´ed. That is, replace all the OR operators with AND operators, or all the AND operators with
OR operators.
3.2.1 DeMorgan’s First Theorem
DeMorgan’s first theorem proves that when two (or more) input variables are AND’ed and
negated, the result is equivalent to the OR of the complements of the individual variables. Thus
the equivalent of the NAND function is a negative-OR function, proving that (A . B)' = A̅ + B̅.
We can show this operation using the following table.
Verifying DeMorgan’s First Theorem using Truth Table

Figure 3.9 : Truth Table

DeMorgan’s First Law Implementation using Logic Gates

Figure 3.10: Logic Gate

3.2.2 DeMorgan’s Second Theorem


DeMorgan’s second theorem proves that when two (or more) input variables are OR’ed and
negated, they are equivalent to the AND of the complements of the individual variables. Thus the
equivalent of the NOR function is a negative-AND function, proving that (A + B)' = A̅ . B̅, and again


we can show this operation using the following truth table.


Verifying DeMorgan’s Second Theorem using Truth Table
Inputs      Truth Table Outputs For Each Term
B  A    A+B   (A+B)'   A̅   B̅   A̅ . B̅
0  0     0      1      1    1      1
0  1     1      0      0    1      0
1  0     1      0      1    0      0
1  1     1      0      0    0      0
Figure 3. 11: Truth Table

DeMorgan’s Second Law Implementation using Logic Gates

Figure 3.12 :Logic Gates

The top logic gate arrangement, of (A + B)', can be implemented using a standard NOR gate
function using inputs A and B. The lower logic gate arrangement first inverts the two inputs,
producing A̅ and B̅. These then become the inputs to the AND gate, so the output from
the AND gate becomes A̅ . B̅.
Then we can see that a standard AND gate function with inverters (NOT gates) on each of its
inputs produces an equivalent output condition to a standard NOR gate function, and an
individual NOR gate can be represented in this way as the equivalency of a NOR gate is a
negative-AND.
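Both theorems are easy to confirm exhaustively, since each input takes only the values 0
and 1. The short sketch below (our own illustration, with `1 - x` standing for the complement)
checks every combination of A and B:

```python
from itertools import product

# Check both of DeMorgan's theorems for every combination of A and B.
for A, B in product((0, 1), repeat=2):
    nA, nB = 1 - A, 1 - B                  # the complements A' and B'
    assert (1 - (A & B)) == (nA | nB)      # first theorem:  (A.B)' = A' + B'
    assert (1 - (A | B)) == (nA & nB)      # second theorem: (A+B)' = A'.B'
demorgan_holds = True
print(demorgan_holds)  # → True
```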
3.2.3 DeMorgan’s Equivalent Gates
We have seen here that by using DeMorgan’s Theorems we can replace all of the AND (.)
operators with an OR (+) and vice versa, and then complements each of the terms or variables in
the expression by inverting it, that is 0’s to 1’s and 1’s to 0’s before inverting the entire function.
Thus to obtain the DeMorgan equivalent for an AND, NAND, OR or NOR gate, we simply add
inverters (NOT-gates) to all inputs and outputs and change an AND symbol to an OR symbol or
change an OR symbol to an AND symbol as shown in the following table.


Standard Logic Gate DeMorgan’s Equivalent Gate

Figure 3.13: Equivalent Gate Logic

Then we have seen in this tutorial about DeMorgan’s theorems that the complement of two (or
more) AND’ed input variables is equivalent to the OR of the complements of these variables,
and that the complement of two (or more) OR’ed variables is equivalent to the AND of the
complements of the variables, as defined by DeMorgan.
3.3 K-MAPS
A Karnaugh map (K-map) is a graphical way of minimizing a Boolean expression based
on the rule of complementation. It works well for expressions with no more than 4
variables. Hardware components of a computer are built of logic circuits to perform various logic
functions. A logic function can be represented by a truth table, a circuit diagram, or a Boolean
expression. The truth table of a function is unique. A function, however, can be represented by
many equivalent Boolean expressions. A Boolean algebraic manipulation or a K-map may
simplify a Boolean expression. The algebraic manipulation method is often difficult because it
lacks specific rules for predicting each succeeding step in the manipulative process. The K- map
method provides a simple, straightforward procedure for minimizing Boolean expressions.
The idea behind a K-map (Karnaugh, 1953) is to draw an expression's truth table as a matrix in
such a way that each row and each column of the matrix puts minterms that differ in the value of
a single variable adjacent to each other. Then, by grouping adjacent cells of the matrix, you can
identify product terms that eliminate all complemented literals, resulting in a minimized version


of the expression.
3.3.1 K-map Format
A four-variable K-map is drawn as a 4 × 4 grid. The rows are labelled with the values of ab and
the columns with the values of cd, both in Gray-code order (00, 01, 11, 10), so that adjacent
cells differ in the value of only one variable:
cd
ab    00  01  11  10
00
01
11
10
F(a,b,c,d) =
Each cell of the grid corresponds to one minterm of the function. To minimize a function, fill in
the 1s and loop adjacent cells into groups; each group yields one product term. Below are some
grouping examples.
cd
ab    00  01  11  10
00     1           1
01
11     1   1   1   1
10     1           1
F(a,b,c,d) = ab + b'd'
Here the row ab = 11 forms a group of four (the term ab), and the four corner cells, which are
adjacent because the map wraps around at its edges, form a second group of four (the term
b'd'). These two groups do not overlap.
Groups may also overlap, with a 1 cell shared between groups:
cd
ab    00  01  11  10
00     1           1
01
11     1   1   1   1
10     1   1   1   1
F(a,b,c,d) = a + b'd'
Here the two bottom rows form a group of eight (the term a), and the four corner cells again
form the group b'd'. The two groups share the corner cells of row ab = 10; a cell containing a 1
may be used in any number of groups.
3.3.2 Looping
Looping out logical adjacencies is a graphical alternative to algebraic calculation.
The map labels use a unit distance code (Gray code). For two bits, the Gray code is:
00 01 11 10

Only one bit changes as you go from left to right. This code preserves logical adjacencies.
The K-map method is to loop out groups of 2^n logically adjacent minterms. Each looped-out
group corresponds to a product term in a minimal SOP expression.
1. Loop out single 1s (n=0) which have no logical adjacencies.
2. Loop out all pairs of 1s (n=1) which cannot be included in a larger group.
3. Loop out all quads of 1s (n=2) which cannot be included in a larger group.
4. Etc.
Example 3.14:


Example 3.15: f(A,B,C)= ∑ 𝑚(0,2,3,6)

Figure: K-map for f(A,B,C)= ∑ 𝑚(0,2,3,6).

Moving left to right or top to bottom in the K-map changes only one digit in the minterm code.
Note the wrap-around at the ends: because of logical adjacency, the top and bottom are joined,
and the left and right are joined.
f = A̅B̅C̅ + A̅BC̅ + A̅BC + ABC̅
  = A̅C̅(B + B̅) + A̅B(C + C̅) + BC̅(A + A̅)    (the minterm A̅BC̅ is shared by the groups, since x + x = x)
  = A̅C̅ + A̅B + BC̅
Therefore the minimal SOP representation is f = A̅C̅ + A̅B + BC̅
Example 3.16: Minimize the following boolean function-
F(A, B, C, D) = Σm(0, 1, 2, 5, 7, 8, 9, 10, 13, 15)
Solution:
• Since the given boolean expression has 4 variables, so we draw a 4 x 4 K Map.
• We fill the cells of K Map in accordance with the given boolean function.
• Then, we form the groups in accordance with the above rules.
Then, we have-

Now,
F(A,B,C,D)= (A̅B+AB)(C̅D+CD)+(A̅B̅+A̅B+AB+AB̅)C̅D+(A̅B̅+AB̅)(C̅D̅+CD̅)
= BD+C̅D+B̅D̅
Thus, minimized boolean expression is-
F(A, B, C, D) = BD + C̅D + B̅D̅
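A minimized expression can be checked against the original minterm list by evaluating it for
all sixteen input combinations. The sketch below (our own illustration, writing `1 - x` for a
complemented variable) does this for the result of Example 3.16:

```python
from itertools import product

# The minimized expression B.D + C'.D + B'.D' must be 1 exactly on the
# minterms {0, 1, 2, 5, 7, 8, 9, 10, 13, 15} of Example 3.16.
minterms = {0, 1, 2, 5, 7, 8, 9, 10, 13, 15}
ones = set()
for A, B, C, D in product((0, 1), repeat=4):
    f = (B & D) | ((1 - C) & D) | ((1 - B) & (1 - D))
    if f:
        ones.add(A * 8 + B * 4 + C * 2 + D)   # minterm number of this row
print(sorted(ones))  # → [0, 1, 2, 5, 7, 8, 9, 10, 13, 15]
assert ones == minterms
```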


Example 3.17:Minimize the following boolean function-


F(A, B, C, D) = Σm(0, 1, 3, 5, 7, 8, 9, 11, 13, 15)
Solution:
• Since the given boolean expression has 4 variables, so we draw a 4 x 4 K Map.
• We fill the cells of K Map in accordance with the given boolean function.
• Then, we form the groups in accordance with the above rules.
Then, we have-

Now,
F(A, B, C, D)= (A̅B̅ + A̅B + AB + AB̅)(C̅D + CD) + (A̅B̅ + AB̅)(C̅D̅ + C̅D)
= D + B̅C̅
Thus, minimized boolean expression is-
F(A, B, C, D) = B̅C̅ + D
Example 3.18:Minimize the following boolean function-
F(A, B, C, D) = Σm(1, 3, 4, 6, 8, 9, 11, 13, 15) + Σd(0, 2, 14)
Solution-
• Since the given boolean expression has 4 variables, so we draw a 4 x 4 K Map.
• We fill the cells of K Map in accordance with the given boolean function.
• Then, we form the groups in accordance with the above rules.
Then, we have

Now,
F(A, B, C, D)= (AB + AB̅)(C̅D + CD) + (A̅B̅ + AB̅)(C̅D + CD) + (A̅B̅ + AB̅)(C̅D̅ + C̅D)
+(A̅B̅ + A̅B)(C̅D̅ + CD̅)


F(A, B, C, D)= AD + B̅D + B̅C̅ + A̅D̅


Thus, minimized boolean expression is-
F(A, B, C, D) = AD + B̅D + B̅C̅ + A̅D̅
3.4 STANDARD FORMS
Another important form of Boolean expression is the standard form. In the configuration of
standard form, the terms that form the function may contain one, two, or any number of literals.
The standard form of a Boolean expression has the same basic sum-of-products and product-of-
sums structure, but with standard form the terms are not required to be minterms or maxterms.
The sum-of-products is a Boolean expression containing AND terms, called product terms, of
one or more literals each.
Example 3.19: The following expression is in sum-of-products standard form:
F = AB̅ + ABC + A̅C
A product-of-sums is a Boolean expression containing OR terms called sum terms. Each term
may contain any number of literals. For example, F = A(B̅ + C)(A̅ + B + C̅) is in
product-of-sums standard form.
3.5 NAND AND NOR IMPLEMENTATION
Any logic function can be implemented using NAND gates. To achieve this, first the logic
function has to be written in Sum of Products (SOP) form. Once the logic function is converted
to SOP, it is very easy to implement using NAND gates. In other words, any logic circuit with
AND gates in the first level and OR gates in the second level can be converted into a
NAND-NAND gate circuit.
Consider the following SOP expression
F = W.X.Y + X.Y.Z + Y.Z.W
The above expression can be implemented with three AND gates in first stage and one OR gate
in second stage as shown in figure.

Figure 3.14: Second Stage

If bubbles are introduced at the AND gates' outputs and the OR gate's inputs (a pair of bubbles
on the same wire cancels out), the above circuit becomes as shown in figure.


Now replace the OR gate with input bubbles with a NAND gate. We now have a circuit which is
fully implemented with just NAND gates.
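A minimal sketch of this NAND-NAND conversion, using a small Python model of a NAND
gate (the helper names are our own), confirms that the two-level NAND circuit computes the
same function as the original AND-OR circuit for every input:

```python
from itertools import product

def nand(*inputs):
    """A k-input NAND gate: output 0 only when every input is 1."""
    return 0 if all(inputs) else 1

# F = W.X.Y + X.Y.Z + Y.Z.W realized as a two-level NAND-NAND circuit:
# first-level NANDs replace the AND gates, the final NAND replaces the OR.
def F_nand(W, X, Y, Z):
    t1 = nand(W, X, Y)   # (W.X.Y)'
    t2 = nand(X, Y, Z)   # (X.Y.Z)'
    t3 = nand(Y, Z, W)   # (Y.Z.W)'
    return nand(t1, t2, t3)

# Compare against the original AND-OR expression for every input combination.
for W, X, Y, Z in product((0, 1), repeat=4):
    assert F_nand(W, X, Y, Z) == ((W & X & Y) | (X & Y & Z) | (Y & Z & W))
nand_matches = True
print(nand_matches)  # → True
```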

Implementing an inverter using NAND gate


Any logic function can be implemented using NOR gates. To achieve this, first the logic function
has to be written in Product of Sum (POS) form. Once it is converted to POS, then it's very easy
to implement using NOR gate. In other words any logic circuit with OR gates in first level and
AND gates in second level can be converted into a NOR-NOR gate circuit.
Consider the following POS expression
F = (X+Y) . (Y+Z)
The above expression can be implemented with two OR gates in the first stage and one AND
gate in the second stage as shown in figure.

If bubbles are introduced at the outputs of the OR gates and the inputs of the AND gate, the
above circuit becomes as shown in figure.
Now replace the AND gate with input bubbles with a NOR gate. We now have a circuit which is
fully implemented with just NOR gates.
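The NOR-NOR conversion can be modelled the same way. The sketch below (helper names
are our own) checks that the two-level NOR circuit computes F = (X+Y).(Y+Z) for every
input combination:

```python
from itertools import product

def nor(*inputs):
    """A k-input NOR gate: output 1 only when every input is 0."""
    return 1 if not any(inputs) else 0

# F = (X+Y).(Y+Z) realized as a two-level NOR-NOR circuit: first-level NORs
# produce the complemented sum terms, the final NOR restores their AND.
def F_nor(X, Y, Z):
    s1 = nor(X, Y)       # (X+Y)'
    s2 = nor(Y, Z)       # (Y+Z)'
    return nor(s1, s2)   # ((X+Y)' + (Y+Z)')' = (X+Y).(Y+Z)

# Compare against the original OR-AND expression for every input combination.
for X, Y, Z in product((0, 1), repeat=3):
    assert F_nor(X, Y, Z) == ((X | Y) & (Y | Z))
nor_matches = True
print(nor_matches)  # → True
```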


Learning Activity

Minimize the following boolean function-


F(A, B, C, D) = Σm(0, 1, 3, 5, 7, 8, 9, 10, 12, 15)

History of Boolean Algebra


Boolean algebra is often referred to as the algebra of logic. The English mathematician
George Boole (1815–1864), who is largely responsible for its beginnings, was the first to
apply algebraic techniques to logical methodology. He showed that logical propositions and
their connectives could be expressed in the language of set theory. Thus, Boolean algebra is
also the algebra of sets. Algebra is that branch of mathematics which is concerned with the
relations of quantities.
Beginning with the members of a specific set (called the universal set), together with one or
more binary operations defined on that set, procedures are derived for manipulating the
members of the set using the defined operations, and combinations of those operations. Both
the language and the rules of manipulation vary, depending on the properties of elements in
the universal set. For instance, the algebra of real numbers differs from the algebra of
complex numbers, because real numbers and complex numbers are defined differently,
leading to differing definitions for the binary operations of addition and multiplication, and
resulting in different rules for manipulating the two types of numbers. Boolean algebra
consists of the rules for manipulating the subsets of any universal set, independent of the
particular properties associated with individual members of that set. It depends, instead, on
the properties of sets. The universal set may be any set, including the set of real numbers or
the set of complex numbers, because the elements of interest, in Boolean algebra, are not the


individual members of the universal set, but all possible subsets of the universal set.
Properties of sets
A set is a collection of objects, called members or elements. The members of a set can be
physical objects, such as people, stars, or roses, or they can be abstractions such as numbers
or even other sets. A set is referred to as the universal set (usually called I) if it contains all
the elements under consideration. A set S not equal to I is called a proper subset of I if every
element of S is contained in I. This is written S ⊂ I and read “S is contained in I.”
If S equals I, then S is called an improper subset of I, that is, I is an improper subset of itself
(note that two sets are equal if and only if they both contain exactly the same elements). The
special symbol Ø is given to the set with no elements, called the empty set or null set. The null
set is a subset of every set.
When dealing with sets there are three important operations. Two of these operations are
binary (that is, they involve combining sets two at a time), and the third involves only one set
at a time. The two binary operations are union and intersection. The third operation is
complementation. The union of two sets S and T is the collection of those members that
belong to either S or T or both.
The intersection of the sets S and T is the collection of those members that belong to both S
and T.
The complement of a subset, S, is that part of I not contained in S, and is written S’.
Properties of Boolean algebra
The properties of Boolean algebra can be summarized in four basic rules.
(1) Both binary operations have the property of commutativity, that is, order doesn’t matter.
S∩T= T∩S, and S∪T = T∪S.
(2) Each binary operation has an identity element associated with it. The universal set is the
identity element for the operation of intersection, and the null set is the identity element for
the operation of union.
S∩I = S, and S ∪Ø = S.
(3) Each operation is distributive over the other.
S ∪ (T ∩ V) = (S ∪ T) ∩ (S ∪ V), and S ∩ (T ∪ V) = (S ∩ T) ∪ (S ∩ V).
This differs from the algebra of real numbers, for which multiplication is distributive over
addition, a(b + c) = ab + ac, but addition is not distributive over multiplication, a + (bc) not
equal (a + b)(a + c).
(4) Each element has associated with it a second element, its complement, such that the union or
intersection of the two results in the identity element of the other operation.
A ∪ A′ = I, and A ∩ A′ = Ø.
This also differs from the algebra of real numbers. Each real number has two others
associated with it, such that its sum with one of them is the identity element for addition, and
its product with the other is the identity element for multiplication. That is, a + (-a) = 0, and
a(1/a) = 1.
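The four rules above can be checked on concrete sets with a short Python sketch (the particular sets chosen are arbitrary examples, not from the text), using Python's built-in set operators for union, intersection, and difference:

```python
# Checking the four Boolean-algebra rules on subsets of a small universal set I.
I = set(range(8))            # universal set
S, T, V = {0, 1, 2}, {2, 3, 4}, {4, 5, 6}
comp = lambda A: I - A       # complement relative to I

# (1) Commutativity of union and intersection
assert S & T == T & S and S | T == T | S
# (2) Identity elements: I for intersection, the null set for union
assert S & I == S and S | set() == S
# (3) Each operation distributes over the other
assert S | (T & V) == (S | T) & (S | V)
assert S & (T | V) == (S & T) | (S & V)
# (4) Complements: union gives I, intersection gives the null set
assert S | comp(S) == I and S & comp(S) == set()
```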


Questions:
1. Discuss the history of Boolean Algebra.

1. De Morgan's laws are a pair of transformation rules that are both valid
rules of inference.
2. A logic gate is an idealized model of computation or physical electronic
device implementing a Boolean function, a logical operation performed on
one or more binary inputs that produces a single binary output.

SUMMARY
•Boolean algebra is a branch of mathematics that deals with operations on logical values
with binary variables.
• A Boolean expression is an expression used in programming languages that produces a
Boolean value when evaluated. A Boolean value is either true or false.
• A mathematician named DeMorgan developed a pair of important rules regarding group
complementation in Boolean algebra.
• The K-map is a table-like representation, but it gives more information than a truth table. We
fill the grid of the K-map with 0’s and 1’s and then solve it by making groups.
• We get four Boolean product terms by combining two variables x and y with the logical
AND operation. These Boolean product terms are called min terms or standard product
terms. The min terms are x’y’, x’y, xy’ and xy.
• Any logic circuit with AND gates in first level and OR gates in second level can be
converted into a NAND-NAND gate circuit.
KEYWORDS
Boolean Algebra- Boolean algebra is a division of mathematics that deals with operations on
logical values and incorporates binary variables.
Identity-A statement true for all possible values of its variable or variables.
Truth Table- A truth table is a mathematical table used in logic specifically in connection with
Boolean algebra, boolean functions, and propositional calculus.
SOP - A way of representing boolean expressions as sum of product terms.
POS- A way of representing boolean expressions as product of sum terms.
Minterm- A minterm is a Boolean expression resulting in 1 for the output of a single cell,
and 0s for all other cells in a Karnaugh map, or truth table. If a minterm has a single 1 and the
remaining cells as 0s, it would appear to cover a minimum area of 1s.
Maxterm- A maxterm is a Boolean expression resulting in a 0 for the output of a single cell
expression, and 1s for all other cells in the Karnaugh map, or truth table. For example, the
maxterm (A+B+C), a single sum term, appears as a single 0 in a map that is otherwise 1s.
K-Map- The K-map is a systematic way of simplifying Boolean expressions. With the help of
the K-map method, we can find the simplest POS and SOP expression, which is known as the
minimum expression. The K-map provides a cookbook for simplification.


SELF ASSESSMENT QUESTIONS


Short Answer Questions
1. Define Boolean algebra.
2. List out the rules for binary addition.
3. How can we simplify a Boolean expression?
4. Define SOP and POS.
5. Recall K-maps.
Long Answer Questions
1. Write the steps to convert a truth table in to Boolean expression.
2. Describe DeMorgan’s theorem.
3. Illustrate in detail about the K-maps.
4. Analyze NAND and NOR gate implementation.

FURTHER READING
1. Mano, M. Morris, and M. Morris Mano. 2007. “Digital design”. Englewood Cliffs, N.J.:
Prentice-Hall.
2. Mohammad A. Karim, Xinghao Chen. 2007. “Digital Design Basic Concepts and Principles”.
3. Harris David. 2012. “Digital Design and Computer Architecture”, Elsevier.
4. Mano Morris. 2012. “Logic and Computer Design Fundamentals”, Cram101.


UNIT II- COMBINATIONAL AND SEQUENTIAL CIRCUITS


Chapter 4- Combinational Circuits

CONTENTS
Learning Objectives
Learning Outcomes
Overview
4.1 Concept of Combinational Circuits
4.2 Adder
4.2.1 Half Adder
4.2.2 Full Adder
4.2.3 n-Bit Array Adder
4.3 Subtractor
4.3.1 Half Subtractor
4.3.2 Full Subtractor
4.3.3 n-Bit Array Subtractor
4.4 ALU Design
4.4.1 Designing a Bit-Slice ALU
4.4.2 Full 8-bit ALU Circuit
4.5 Decoder
4.5.1 ENABLE Inputs
4.6 Encoder
4.6.1 Binary Encoder
4.6.2 Priority Encoder
4.7 Multiplexer
4.7.1 2-to-1 MUX
4.7.2 4-to-1 MUX
4.7.3 4-to-2 MUX
4.8 Demultiplexer
4.8.1 1-to-4 DEMUX
Summary
Keywords
Self-Assessment Questions
Further Reading


LEARNING OBJECTIVE
After studying this chapter, you should be able to:
• Illustrate the concept of combinational circuit
• Gain knowledge about the Adder and Subtractor circuits
• Design and implementation of various types of decoder and encoder circuits
• Explain ALU Design
• Understand the operation of multiplexers
LEARNING OUTCOMES
Upon completion of the chapter, students will be able to:
• Gain knowledge about combinational circuits
• Analyze the block diagram of the adder and subtractor
• Describe the operation of decoder and encoder logic and use them in different circuit applications
• Explain the concept of the encoder and multiplexer
• Explain how a priority encoder works.
OVERVIEW
In the previous chapter, we examined the fundamentals of Boolean algebra. We explained its
various postulates, laws, and theorems, and learned about the K-map, which is used to
minimize Boolean functions. Standard forms and NAND-NOR implementation were also discussed.
In this chapter, you will learn about combinational (combinatorial) circuits, which
realize Boolean functions and operate on digitized signals, usually denoted by 0s and 1s.
The behavior of a combinational circuit is memoryless. This means that given a stimulus
at the input of a combinational circuit, a response appears at the output after some
propagation delay, but the response is not stored or fed back. In other words,
the output depends solely on the most recent inputs and is independent of the circuit's
history. You will examine various combinational circuits such as the adder, decoder, and
encoder, and consider the idea of ALU design.
In this chapter, you will use the knowledge gained from the previous chapter to study various
combinational circuits, which are important for understanding digital computers and
systems.
4.1 CONCEPT OF COMBINATIONAL CIRCUITS
Combinational logic circuits are memoryless digital logic circuits whose output at any
instant in time depends only on the combination of their inputs. A combinational circuit consists
of logic gates whose outputs are determined directly from the present combination of
inputs, without regard to previous inputs.
In a combinational circuit, different gates are combined to build blocks such as encoders,
decoders, and multiplexers. Combinational logic circuits are made of basic gates (AND, OR, NOT) or universal
gates (NAND, NOR), which are connected or combined together to make more complicated
switching circuits. These logic gates are the elementary units of combinational logic circuits.
Unlike sequential logic circuits, whose outputs depend on both their present inputs
and their previous output state (giving them a form of memory), the outputs of
combinational logic circuits are determined only by the logical function of their present
input state, logic "0" or logic "1", at any given instant in time. The
result is that combinational logic circuits have no feedback, and any change in the
signals applied to their inputs immediately takes effect at the output. In other
words, in a combinational logic circuit the output always depends on the combination of its
inputs; the circuit is memoryless. So if one of its inputs changes state,
from 0 to 1 or from 1 to 0, so too will the resulting output, because combinational
logic circuits have no "memory", "timing" or "feedback loops" within their design.
Example 4.1: An example of a combinational circuit is a decoder, which converts
the binary code present at its input into a number of different output lines, one at a
time, producing an equivalent decimal code at its output. In these circuits the outputs at any
instant of time depend only on the inputs present at that instant.
Some of the characteristics of combinational circuits are the following:
• They can have m outputs and n inputs.
• The output of a combinational circuit at any time depends only on the levels present at the
input terminals.
• These circuits do not use any memory. The previous state of the inputs has no
effect on the current state of the circuit.
• Combinational digital circuits are designed using universal gates or basic
gates.
Example 4.2: Examples of combinational digital circuits are the half adder, full adder,
half subtractor, full subtractor, code converter, decoder, multiplexer, encoder, ROM, etc.

Figure 4.1: Combinational Circuits

You can specify combinational circuits through Boolean logic expressions, structural
descriptions, or truth tables. Combinational circuits built with fixed logic tend to be more
expensive in terms of hardware cost and design effort. However, they are often
denser and faster and use less power. This is why they are appropriate for high-volume
production or high-speed circuits. Implementations that use programmable logic devices or
memory devices are economical for rapid prototyping and low-volume production. They may,
however, not deliver the best power consumption, density, or performance.
The process of choosing the least expensive implementation among the feasible and equivalent
implementations in the targeted technology is called optimization. For small combinational
circuits, it may be practical to perform manual optimization based on rewriting or
manipulating logic expressions into one of several equivalent forms. In most practical cases,
however, automatic hardware synthesis tools with built-in optimization capabilities are used.
These automated optimizations are performed using a mix of heuristic
and algorithmic transformations. Verification is the process of establishing that the implemented
circuit behaves as originally envisioned or specified.
4.2 ADDER
An adder is a digital logic circuit in electronics that is extensively used for the addition of
numbers. In many computers and other types of processors, adders are even used to calculate
addresses and related activities and calculate table indices in the ALU and even utilized in other
parts of the processors. These can be built for many numerical representations like excess-3 or
binary coded decimal. Adders are basically classified into two types: Half Adder and Full Adder.
4.2.1 Half Adder
The half adder circuit has two inputs: A and B, which add two input digits and generates a carry
and a sum. A half adder is a logical circuit that performs an addition operation on two binary
digits. The half adder produces a sum and a carry value which are both binary digits.
Half Adder Truth Table with Carry-Out
Symbol Truth Table
B A Sum Carry
0 0 0 0
0 1 1 0
1 0 1 0
1 1 0 1
Figure 4.2: Symbol and Truth Table

From the truth table of the half adder we can see that the SUM (S) output is the result of
the Exclusive-OR gate and the Carry-out (Cout) is the result of the AND gate. Then the Boolean
expression for a half adder is as follows.


For the SUM bit:

SUM = A XOR B = AB̅+BA̅ = A ⊕ B


For the CARRY bit:

CARRY = A AND B = A.B
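The two expressions map directly onto code. A minimal Python model of the half adder (a sketch, not from the text; names are illustrative), checked against the truth table of Figure 4.2:

```python
def half_adder(a, b):
    """Half adder: SUM = A XOR B, CARRY = A AND B."""
    return a ^ b, a & b

# Reproduce the truth table: (B, A, Sum, Carry) rows from Figure 4.2.
table = [(0, 0, 0, 0), (0, 1, 1, 0), (1, 0, 1, 0), (1, 1, 0, 1)]
for b, a, s, c in table:
    assert half_adder(a, b) == (s, c)
```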


One major disadvantage of the Half Adder circuit when used as a binary adder, is that there is no
provision for a “Carry-in” from the previous circuit when adding together multiple data bits.
For example, suppose we want to add together two 8-bit bytes of data, any resulting carry bit
would need to be able to “ripple” or move across the bit patterns starting from the least
significant bit (LSB). The most complicated operation the half adder can do is “1 + 1” but as the
half adder has no carry input the resultant added value would be incorrect. One simple way to
overcome this problem is to use a Full Adder type binary adder circuit.
4.2.2 Full Adder
The main difference between the Full Adder and the previous Half Adder is that a full adder has
three inputs. The same two single bit data inputs A and B as before plus an additional Carry-
in (C-in) input to receive the carry from a previous stage as shown below.

Figure 4.3: Full Adder Block Diagram


Then the full adder is a logical circuit that performs an addition operation on three binary digits
and just like the half adder, it also generates a carry out to the next addition column. Then


a Carry-in is a possible carry from a less significant digit, while a Carry-out represents a carry to
a more significant digit.
In many ways, the full adder can be thought of as two half adders connected together, with the
first half adder passing its carry to the second half adder as shown.

Figure 4.4: Full Adder Logic Diagram

As the full adder circuit above is basically two half adders connected together, the truth table for
the full adder includes an additional column to take into account the Carry-in, CIN input as well
as the summed output, S and the Carry-out, COUT bit.
Symbol Truth Table
C-in B A Sum C-out
0 0 0 0 0
0 0 1 1 0
0 1 0 1 0
0 1 1 0 1
1 0 0 1 0
1 0 1 0 1
1 1 0 0 1
1 1 1 1 1
Figure 4.5 :Full Adder Symbol and Truth Table with Carry

Then the Boolean expression for a full adder is as follows.


For the SUM (S) bit:
SUM = (A XOR B) XOR Cin = (A ⊕ B) ⊕ Cin
For the CARRY-OUT (Cout) bit:
CARRY-OUT = A AND B OR Cin(A XOR B) = A.B + Cin(A ⊕ B)
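These two expressions can be modelled in a short Python sketch (illustrative names, not from the text) and checked against ordinary integer addition: for each input row, the (sum, carry-out) pair must equal the 2-bit binary value of A + B + Cin:

```python
from itertools import product

def full_adder(a, b, cin):
    """Full adder: SUM = (A XOR B) XOR Cin, COUT = A.B + Cin.(A XOR B)."""
    s = (a ^ b) ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout

for a, b, cin in product((0, 1), repeat=3):
    total = a + b + cin
    # Low bit of the total is the sum; high bit is the carry-out.
    assert full_adder(a, b, cin) == (total & 1, total >> 1)
```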
4.2.3 An n-bit Binary Adder
We have seen above that single 1-bit binary adders can be constructed from basic logic gates.
But to add together two n-bit numbers, n 1-bit full adders need to be connected or
"cascaded" together to produce what is known as a Ripple Carry Adder.


A “ripple carry adder” is simply “n“, 1-bit full adders cascaded together with each full adder
representing a single weighted column in a long binary addition. It is called a ripple carry adder
because the carry signals produce a “ripple” effect through the binary adder from right to left,
(LSB to MSB).
For example, suppose we want to “add” together two 4-bit numbers, the two outputs of the first
full adder will provide the first place digit sum (S) of the addition plus a carry-out bit that acts as
the carry-in digit of the next binary adder.
The second binary adder in the chain also produces a summed output (the 2nd bit) plus another
carry-out bit and we can keep adding more full adders to the combination to add larger numbers,
linking the carry bit output from the first full binary adder to the next full adder, and so forth.
Example 4.3: An example of a 4-bit adder is given below.
A 4-bit Ripple Carry Adder

Figure 4.6:4-bit Ripple Carry Adder
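The cascading shown above can be sketched in Python (a behavioural model, not from the text; helper names are illustrative). Each loop iteration plays the role of one full-adder stage, with the carry passed along the chain from LSB to MSB:

```python
def full_adder(a, b, cin):
    s = (a ^ b) ^ cin
    return s, (a & b) | (cin & (a ^ b))

def ripple_carry_add(a_bits, b_bits, cin=0):
    """Add two equal-length bit lists (LSB first) by cascading full adders."""
    carry, sum_bits = cin, []
    for a, b in zip(a_bits, b_bits):
        s, carry = full_adder(a, b, carry)
        sum_bits.append(s)
    return sum_bits, carry   # final carry is the carry-out of the MSB stage

def to_bits(n, width):       # LSB-first helper
    return [(n >> i) & 1 for i in range(width)]

# 4-bit example: 9 + 5 = 14, no carry-out.
s, cout = ripple_carry_add(to_bits(9, 4), to_bits(5, 4))
assert s == to_bits(14, 4) and cout == 0
```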

One main disadvantage of “cascading” together 1-bit binary adders to add large binary numbers
is that if inputs A and B change, the sum at its output will not be valid until any carry-input has
“rippled” through every full adder in the chain, because the MSB (most significant bit) of the sum
has to wait for any changes from the carry input of the LSB (least significant bit). Consequently,
there will be a finite delay before the output of the adder responds to any change in its inputs,
resulting in an accumulated delay.
When the size of the bits being added is not too large for example, 4 or 8 bits, or the summing
speed of the adder is not important, this delay may not be important. However, when the size of
the bits is larger for example 32 or 64 bits used in multi-bit adders, or summation is required at a
very high clock speed, this delay may become prohibitively large with the addition processes not
being completed correctly within one clock cycle.
This unwanted delay time is called propagation delay. Also, another problem called “overflow”
occurs when an n-bit adder adds two parallel numbers together whose sum is greater than or
equal to 2^n.
One solution is to generate the carry-input signals directly from the A and B inputs rather than


using the ripple arrangement above. This then produces another type of binary adder circuit
called a Carry Look Ahead Binary Adder where the speed of the parallel adder can be greatly
improved using carry-look ahead logic.
The advantage of carry look ahead adders is that the length of time a carry look ahead adder
needs in order to produce the correct SUM is independent of the number of data bits used in the
operation, unlike the cycle time a parallel ripple adder needs to complete the SUM which is a
function of the total number of bits in the addend.
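The idea behind carry-look-ahead can be sketched in Python (names are assumptions, not from the text): each stage computes a generate term g_i = a_i.b_i and a propagate term p_i = a_i XOR b_i, and every carry obeys c_{i+1} = g_i + p_i.c_i. Real look-ahead hardware flattens this recurrence into two-level logic so no carry ripples; the loop below only models the resulting function:

```python
def cla_add(a_bits, b_bits, cin=0):
    """Adder using generate/propagate terms (LSB-first bit lists)."""
    g = [a & b for a, b in zip(a_bits, b_bits)]   # generate terms
    p = [a ^ b for a, b in zip(a_bits, b_bits)]   # propagate terms
    c = [cin]
    for i in range(len(a_bits)):
        c.append(g[i] | (p[i] & c[i]))            # c_{i+1} = g_i + p_i.c_i
    sum_bits = [p[i] ^ c[i] for i in range(len(a_bits))]
    return sum_bits, c[-1]                        # (sum, carry-out)

def to_bits(n, width):                            # LSB-first helper
    return [(n >> i) & 1 for i in range(width)]

assert cla_add(to_bits(7, 4), to_bits(9, 4)) == (to_bits(0, 4), 1)
```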
Example 4.4: 4-bit full adder circuits with carry look ahead features are available as standard IC
packages in the form of the TTL 4-bit binary adder 74LS83 or the 74LS283 and the CMOS 4008
which can add together two 4-bit binary numbers and generate a SUM and a CARRY output as
shown.

Figure 4.7:74LS83 Logic Symbol

4.3 SUBTRACTOR
The Subtractor is another type of combinational arithmetic circuit that produces an output which
is the subtraction of two binary numbers.
As their name implies, a Binary Subtractor is a decision making circuit that subtracts two binary
numbers from each other, for example, X – Y to find the resulting difference between the two
numbers.
Unlike the Binary Adder which produces a SUM and a CARRY bit when two binary numbers
are added together, the binary subtractor produces a DIFFERENCE, D by using a BORROW
bit, B from the previous column. Then obviously, the operation of subtraction is the opposite to
that of addition.
We learnt from our maths lessons at school that the minus sign, “–” is used for a subtraction
calculation, and when one number is subtracted from another, a borrow is required if the
subtrahend is greater than the minuend. Consider the simple subtraction of the two denary (base
10) numbers below.

We cannot directly subtract 8 from 3 in the first column as 8 is greater than 3, so we have to


borrow a 10, the base number, from the next column and add it to the minuend to produce 13
minus 8. This “borrowed” 10 is then returned to the subtrahend of the next column once the
difference is found. Simple school maths: borrow a 10 if needed, find the difference and return
the borrow.
The subtraction of one binary number from another is exactly the same idea as that for
subtracting two decimal numbers but as the binary number system is a Base-2 numbering system
which uses “0” and “1” as its two independent digits, large binary numbers which are to be
subtracted from each other are therefore represented in terms of “0’s” and “1’s”.
4.3.1 Half Subtractor
Binary subtraction can take many forms but the rules for subtraction are the same whichever
process you use. As binary notation only has two digits, subtracting a “0” from a “0” or a “1”
leaves the result unchanged as 0-0 = 0 and 1-0 = 1. Subtracting a “1” from a “1” results in a “0”,
but subtracting a “1” from a “0” requires a borrow. In other words 0 – 1 requires a borrow.

Figure 4.8: Binary Subtraction of Two Bits

For the simple 1-bit subtraction problem above, if the borrow bit is ignored the result of their
binary subtraction resembles that of an Exclusive-OR Gate. To prevent any confusion in this
tutorial between a binary subtractor input labelled, B and the resulting borrow bit output from the
binary subtractor also being labelled, B, we will label the two input bits as X for the minuend
and Y for the subtrahend. The resulting truth table for the difference between the two input
bits of a single binary subtractor is given as:
2-input Exclusive-OR Gate
Symbol Truth Table
Y X Q
0 0 0
0 1 1
1 0 1
1 1 0
2-input Ex-OR Gate
Figure 4.9:Exclusive-OR Gate

As with the Binary Adder, the difference between the two digits is only a “1” when these two
inputs are not equal as given by the Ex-OR expression. However, we need an additional output to
produce the borrow bit when input X = 0 and Y = 1. Unfortunately there are no standard logic
gates that will produce an output for this particular combination of X and Y inputs.
But we know that an AND Gate produces an output “1” when both of its inputs X and Y are “1”
(HIGH) so if we use an inverter or NOT Gate to complement the input X before it is fed to


the AND gate, we can produce the required borrow output when X = 0 and Y = 1 as shown
below.

Figure 4.10: Logic Diagram

Then by combining the Exclusive-OR gate with the NOT-AND combination results in a simple
digital binary subtractor circuit known commonly as the Half Subtractor as shown.
A Half Subtractor Circuit
A half subtractor is a logical circuit that performs a subtraction operation on two binary digits.
The half subtractor produces a difference and a borrow bit for the next stage.

Figure 4.11: Block Diagram

Half Subtractor with Borrow-out


Symbol Truth Table
X Y Difference Borrow
0 0 0 0
0 1 1 1
1 0 1 0
1 1 0 0
Figure 4.12:Truth Table

From the truth table of the half subtractor we can see that the DIFFERENCE (D) output is the
result of the Exclusive-OR gate and the Borrow-out (Bout) is the result of the NOT-
AND combination. Then the Boolean expression for a half subtractor is as follows.
For the DIFFERENCE bit:
D = X XOR Y = X ⊕ Y
For the BORROW bit
B = X̅ AND Y = X̅.Y
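These two expressions can be modelled directly in Python (a sketch with illustrative names, not from the text). Note that the borrow goes HIGH only for X = 0, Y = 1, the one case where 0 − 1 requires a borrow:

```python
def half_subtractor(x, y):
    """Half subtractor: D = X XOR Y, BORROW = (NOT X) AND Y."""
    return x ^ y, (1 - x) & y

assert half_subtractor(0, 0) == (0, 0)
assert half_subtractor(0, 1) == (1, 1)   # 0 - 1 needs a borrow
assert half_subtractor(1, 0) == (1, 0)
assert half_subtractor(1, 1) == (0, 0)
```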
If we compare the Boolean expressions of the half subtractor with a half adder, we can see that
the two expressions for the SUM (adder) and DIFFERENCE (subtractor) are exactly the same
and so they should be because of the Exclusive-OR gate function. The two Boolean expressions
for the binary subtractor BORROW is also very similar to that for the adders CARRY. Then all
that is needed to convert a half adder to a half subtractor is the inversion of the minuend input X.
One major disadvantage of the Half Subtractor circuit when used as a binary subtractor, is that
there is no provision for a “Borrow-in” from the previous circuit when subtracting multiple data


bits from each other. Then we need to produce what is called a “full binary subtractor” circuit to
take into account this borrow-in input from a previous circuit.
4.3.2 Full Subtractor
The main difference between the Full Subtractor and the previous Half Subtractor circuit is that a
full subtractor has three inputs. The two single bit data inputs X (minuend) and Y (subtrahend)
the same as before plus an additional Borrow-in (B-in) input to receive the borrow generated by
the subtraction process from a previous stage as shown below.

Figure 4.13: Block Diagram

Then the combinational circuit of a “full subtractor” performs the operation of subtraction on
three binary bits producing outputs for the difference D and borrow B-out. Just like the binary
adder circuit, the full subtractor can also be thought of as two half subtractors connected
together, with the first half subtractor passing its borrow to the second half subtractor as follows.
Full Subtractor Logic Diagram

Figure 4.14: Logic Diagram

As the full subtractor circuit above represents two half subtractors cascaded together, the truth
table for the full subtractor will have eight different input combinations as there are three input
variables: the two data bits and the Borrow-in, BIN input. The table also includes the difference
output, D, and the Borrow-out, BOUT bit.
Symbol Truth Table
B-in Y X Diff. B-out
0 0 0 0 0
0 0 1 1 0
0 1 0 1 1
0 1 1 0 0
1 0 0 1 1
1 0 1 0 0
1 1 0 0 1
1 1 1 1 1
Figure 4.15:Full Subtractor Truth Table


Then the Boolean expression for a full subtractor is as follows.


For the DIFFERENCE (D) bit:
D = (X.Y′.BIN′) + (X′.Y.BIN′) + (X′.Y′.BIN) + (X.Y.BIN)
which can be simplified to:
D = (X XOR Y) XOR BIN = (X ⊕ Y) ⊕ BIN
For the BORROW OUT (BOUT) bit:
BOUT = (X′.Y.BIN′) + (X′.Y′.BIN) + (X′.Y.BIN) + (X.Y.BIN)
which will also simplify to:

BOUT = (NOT X) AND Y OR (X XNOR Y) AND BIN = X′.Y + (X ⊕ Y)′.BIN
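A Python sketch of the full subtractor (illustrative names, not from the text). A useful cross-check is the arithmetic identity X − Y − Bin = D − 2·Bout, which must hold for every row of the truth table:

```python
from itertools import product

def full_subtractor(x, y, bin_):
    """D = (X xor Y) xor Bin; Bout = X'.Y + (X xnor Y).Bin."""
    d = (x ^ y) ^ bin_
    bout = ((1 - x) & y) | ((1 - (x ^ y)) & bin_)
    return d, bout

# (D, Bout) must satisfy X - Y - Bin = D - 2*Bout for all eight rows.
for x, y, b in product((0, 1), repeat=3):
    d, bout = full_subtractor(x, y, b)
    assert x - y - b == d - 2 * bout
```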


4.3.3 n-bit Binary Subtractor
As with the binary adder, we can also have n number of 1-bit full binary subtractor connected or
“cascaded” together to subtract two parallel n-bit numbers from each other.
Example 4.5: For example, two 4-bit binary numbers. We said before that the only difference
between a full adder and a full subtractor is the inversion of one of the inputs.
So by using an n-bit adder and n inverters (NOT gates), the process of subtraction
becomes an addition, using two's complement notation: invert all the bits of the subtrahend
and set the carry input of the least significant bit to a logic “1” (HIGH).
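This invert-and-add-one trick can be sketched in Python (a behavioural model with illustrative names, not from the text): the ripple adder computes X + not(Y) + 1, which is X − Y modulo 2^n:

```python
def full_adder(a, b, cin):
    return (a ^ b) ^ cin, (a & b) | (cin & (a ^ b))

def subtract(a_bits, b_bits):
    """n-bit X - Y: invert every subtrahend bit and set the LSB carry-in
    to 1, so the ripple adder computes X + not(Y) + 1 (two's complement)."""
    carry, diff = 1, []
    for a, b in zip(a_bits, b_bits):
        s, carry = full_adder(a, 1 - b, carry)
        diff.append(s)
    return diff              # result modulo 2^n; final carry-out discarded

def to_bits(n, width):       # LSB-first helper
    return [(n >> i) & 1 for i in range(width)]

assert subtract(to_bits(13, 4), to_bits(6, 4)) == to_bits(7, 4)
```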
4.4 ALU DESIGN
The addition circuit and several other basic circuits are merged into a special
combinational circuit called the Arithmetic Logic Unit or ALU. You can consider this
circuit the core of the processor, and as it is involved in almost every operation, it must be
quite fast. There are several different fundamental operations that the processor does; some
of them are given below:
• Multiply or divide two binary numbers
• Determine if a number is zero, is positive, or is negative
• Determine after an arithmetic operation if there has been an error (e.g., if the result is
too big)
• Take the NOT of a number (i.e. invert each of the bits)
• Take the bit-wise OR of two numbers
• Add or subtract two binary numbers
• Take the bit-wise AND of two numbers (i.e. apply AND to each pair of bits, from left to
right)
• Determine if two numbers are the same
• Take the bit-wise X-OR of two numbers
• Increment or decrement a binary number by 1


However, not all these operations are required to be applied in the ALU. We would like to
implement the ALU with a set of “minimal” operations – those that we cannot do without,
which are extremely common and can be performed very fast, say, in a small number of gate
delays. All other operations will be performed by combining these basic operations.
Therefore, some operations, like multiplication and division, are too complicated to be done
here, and are usually relegated to a separate part of the processor. Others like X-OR, are not
common enough to justify implementing in the most important combinational circuit in the
computer. What we would like is a minimal set that can be combined in several ways to
implement all the operations we require.
Now, if you want to build an ALU that is as simple as possible, you might choose only
the following four operations on two binary numbers A and B:
• (A and B)
• (A or B)
• (A + B) – This is just the sum of A and B
• (A + (not B)) – This is the sum of A and the bit-wise inverse of B
As a result, all the other operations you would like to perform on two numbers can be
implemented by combinations of these four fundamental operations. You will return to this
point after you design the circuit to do these four operations.
4.4.1 Designing a Bit-Slice ALU
Normally, binary information in modern computer systems is denoted in 16, 32, or even
64-bit quantities. However, to make the presentation simpler, we will present most of the
concepts using 8-bit quantities. To build an 8-bit ALU, you are required to design a
combinational circuit that can take two 8-bit numbers, A and B, and return another 8-bit
number C, which is the result of one of the four basic operations.
In addition, it needs two control wires to choose which of the four operations are to be
performed. In the case of addition, you will input the Cin value and output the Cout value
from the Ripple-Carry Adder.

Figure 4.16: One-bit ALU Slice


You will again use the divide and conquer method for developing the design. First,
consider the problem of designing, for example, ALU for 1-bit quantities, and then combine
these solutions into an 8-bit ALU (or a 32-bit or 64-bit one). The ALU design is called a “bit-
slice,” because you can imagine taking an 8-bit ALU and slicing it into eight equal parts,
one for each bit position.
The 1-bit ALU design, built from basic logic circuits, is presented in Figure 4.16.
The control lines F0 and F1 select which function of the inputs appears at the output:

F0 F1 | Out
0 0 | (A and B)
0 1 | (A or B)
1 0 | (A + B)
1 1 | (A + (not B))
This is a standard use of a multiplexer circuit. A number of input lines carrying different
types of information come into the multiplexer, and the address lines choose which input to
route to the output.
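The multiplexer-based slice can be sketched as a short behavioural model in Python (illustrative only; the function name `alu_slice` and its argument ordering are our own assumptions, not part of the original design):

```python
def alu_slice(a, b, cin, f1, f0):
    """Behavioural model of a one-bit ALU slice.

    (f1, f0) = (0,0) -> A and B, (0,1) -> A or B,
               (1,0) -> A + B,   (1,1) -> A + (not B).
    """
    and_out = a & b
    or_out = a | b
    b_in = b ^ f0                 # invert B only for the A + (not B) operation
    s = a ^ b_in ^ cin            # full-adder sum bit
    cout = (a & b_in) | (a & cin) | (b_in & cin)   # full-adder carry out
    # 4-to-1 multiplexer: the select (address) lines route one result to the output
    out = [and_out, or_out, s, s][f1 * 2 + f0]
    return out, (cout if f1 else 0)   # carry is only meaningful for the additions
```

An 8-bit ALU would chain eight such slices, feeding each slice's carry output into the next slice's carry input, exactly as described for the Ripple-Carry Adder.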
4.4.2 Full 8-bit ALU Circuit
As with the Ripple-Carry Adder circuit, to make a wider ALU we simply join the suitable
carry lines to propagate the carry between the stages of the adder circuits. The first stage
will still have a Carry-In, and the last stage will have a Carry-Out.
You can see the eight-bit ALU in Figure 4.17 and the standard ALU icon in Figure 4.18.

Figure 4.17: Eight-bit ALU

Figure 4.18: Standard ALU Icon


4.5 DECODER
A decoder is a logic circuit that accepts a set of inputs that represents a binary number and
activates only the output that corresponds to that input number. The diagram for a general
decoder is shown in Figure 4.19 with N inputs and M outputs. Since each of the N inputs can be
0 or 1, there are 2^N possible input combinations or codes. For each of these input combinations,
only one of the M outputs will be active (HIGH); all the other outputs are LOW. Many decoders
are designed to produce active-LOW outputs, where only the selected output is LOW while all
the others are HIGH. This would be indicated by the presence of small circles on the output lines in
the decoder diagram.

Figure 4.19: General Decoder Diagram

Some decoders do not utilize all of the 2^N possible input codes but only certain ones. For
example, a BCD-to-decimal decoder has a 4-bit input code and ten output lines that correspond to
the ten BCD code groups 0000 through 1001. If any of the unused codes is applied to the input,
none of the outputs will be activated.
Figure 4.20 shows the circuitry for a decoder with three inputs and 2^3 = 8 outputs. It uses all
AND gates, and so the outputs are active-HIGH. Note that for a given input code, the only output
which is active (HIGH) is the one corresponding to the decimal equivalent of the binary input
code (e.g., output O6 goes HIGH only when CBA = 110₂ = 6₁₀).

Figure 4.20: 3-line-to-8-line (or 1-of-8) decoder


Figure 4.21: Truth Table

This decoder can be referred to in several ways. It can be called a 3-line to 8-line decoder,
because it has three input lines and eight output lines. It could also be called a binary-to-octal
decoder or converter because it takes a 3-bit binary input code and activates the one of the eight
(octal) outputs corresponding to that code. It is also referred to as a 1-of-8 decoder, because only
1 of the 8 outputs is activated at one time.
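The behaviour just described can be mirrored in a short Python sketch (illustrative only; the function name `decoder_3to8` is our own). Each list element models one of the eight AND gates of Figure 4.20:

```python
def decoder_3to8(c, b, a):
    """3-line-to-8-line decoder with active-HIGH outputs (C is the MSB)."""
    nc, nb, na = 1 - c, 1 - b, 1 - a      # the three inverters
    # One AND gate per output line O0..O7
    return [nc & nb & na, nc & nb & a, nc & b & na, nc & b & a,
            c & nb & na, c & nb & a, c & b & na, c & b & a]
```

For the input code CBA = 110 only output O6 is HIGH, matching the example in the text.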
4.5.1 ENABLE Inputs
Some decoders have one or more ENABLE inputs that are used to control the operation of the
decoder. For example, refer to the decoder in Figure 4.22 and visualize having a common
ENABLE line connected to a fourth input of each gate.

Figure 4.22 : Logic diagram for 74LS138 decoder


With this ENABLE line held HIGH, the decoder will function normally and the A, B, C input
code will determine which output is HIGH. With ENABLE held LOW, however, all the outputs
will be forced to the LOW state regardless of the levels at the A, B, C inputs. Thus, the decoder
is enabled only if ENABLE is HIGH.
Figure 4.23 shows the logic diagram for the 74LS138 decoder. It has NAND-gate outputs, so
that its outputs are active-LOW. Another indication is the labeling of the outputs with an
overbar; the overbar indicates active-LOW outputs.

Figure 4.23: Logic diagram for the 74LS138 decoder

Figure 4.24: Truth Table

The input code is applied at A2, A1, and A0, where A2 is the MSB. With three inputs and eight
outputs, this is a 3-to-8 decoder or, equivalently, a 1-of-8 decoder.

Inputs E1, E2, and E3 are separate enable inputs that are combined in the AND gate. In order for
the output NAND gates to respond to the input code at A2, A1, A0, this AND gate output
has to be HIGH. This will occur only when E1 = E2 = 0 and E3 = 1; E1 and E2 are active-LOW, E3
is active-HIGH. If one or more of the enable inputs is in its inactive state, the AND output will be
LOW, which will force all NAND outputs to their inactive HIGH state regardless of the input
code. This operation is summarized in the truth table of Figure 4.24. Recall that x represents the
“don’t care” condition.
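The enable behaviour can be summarised as a behavioural sketch (a toy model of a 74LS138-style part, not the datasheet logic; the function name `ls138` and the argument names are our own):

```python
def ls138(a2, a1, a0, e1_n, e2_n, e3):
    """74LS138-style 3-to-8 decoder: active-LOW outputs, three enable inputs."""
    enabled = (e1_n == 0) and (e2_n == 0) and (e3 == 1)   # the internal AND gate
    n = a2 * 4 + a1 * 2 + a0                              # decimal value of the input code
    # NAND-gate outputs: only the selected output is driven LOW, and only when enabled
    return [0 if (enabled and i == n) else 1 for i in range(8)]
```

With the enables inactive, every output sits at its inactive HIGH level regardless of the input code, matching the truth table of Figure 4.24.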
4.6 ENCODER
An encoder takes its data inputs one at a time and converts them into an equivalent binary
code at its output.
Unlike a multiplexer, which selects one individual data input line and then sends that data to a
single output line or switch, a digital encoder, more commonly called a binary encoder, takes ALL
of its data inputs one at a time and converts them into a single encoded output. So we can say
that a binary encoder is a multi-input combinational logic circuit that converts the logic level “1”
data at its inputs into an equivalent binary code at its output.
Generally, digital encoders produce outputs of 2-bit, 3-bit or 4-bit codes depending upon the
number of data input lines. An “n-bit” binary encoder has 2^n input lines and n output lines,
with common types that include 4-to-2, 8-to-3 and 16-to-4 line configurations.


The output lines of a digital encoder generate the binary equivalent of the input line whose value
is equal to “1” and are available to encode either a decimal or hexadecimal input pattern to
typically a binary or “B.C.D” (binary coded decimal) output code.
4.6.1 4-to-2 Bit Binary Encoder

Figure 4.25: 4-to-2 Bit Binary Encoder

One of the main disadvantages of standard digital encoders is that they can generate the wrong
output code when there is more than one input present at logic level “1”. For example, if we
make inputs D1 and D2 HIGH at logic “1” at the same time, the resulting output is neither at
“01” nor at “10” but at “11”, which is an output binary number different from the actual
input present. Also, an output code of all logic “0”s can be generated either when all of its inputs
are at “0” or when input D0 is equal to one.
One simple way to overcome this problem is to “Prioritise” the level of each input pin. So if
there is more than one input at logic level “1” at the same time, the actual output code would
only correspond to the input with the highest designated priority. Then this type of digital
encoder is known commonly as a Priority Encoder or P-encoder for short.
4.6.2 Priority Encoder
The Priority Encoder solves the problems mentioned above by allocating a priority level to each
input. The priority encoder’s output corresponds to the currently active input which has the
highest priority. So when an input with a higher priority is present, all other inputs with a lower
priority will be ignored.
The priority encoder comes in many different forms with an example of an 8-input priority
encoder along with its truth table shown below.

Figure 4.26: 8-to-3 Bit Priority Encoder


Priority encoders are available in standard IC form and the TTL 74LS148 is an 8-to-3 bit priority
encoder which has eight active LOW (logic “0”) inputs and provides a 3-bit code of the highest
ranked input at its output.
Priority encoders output the code of the highest-order input first. For example, if input lines “D2”,
“D3” and “D5” are applied simultaneously, the output code would be that for input “D5” (“101”),
as this has the highest order of the three inputs. Once input “D5” has been removed, the next
highest output code would be that for input “D3” (“011”), and so on.
The truth table for an 8-to-3 bit priority encoder is given as:
Digital Inputs Binary Output
D7 D6 D5 D4 D3 D2 D1 D0 Q2 Q1 Q0
0 0 0 0 0 0 0 1 0 0 0
0 0 0 0 0 0 1 X 0 0 1
0 0 0 0 0 1 X X 0 1 0
0 0 0 0 1 X X X 0 1 1
0 0 0 1 X X X X 1 0 0
0 0 1 X X X X X 1 0 1
0 1 X X X X X X 1 1 0
1 X X X X X X X 1 1 1
Figure 4.27: Truth Table
Where X equals “don’t care”; that is, it can be at either a logic “0” level or a logic “1” level.
From this truth table, the Boolean expression for the encoder above with data inputs D0 to D7 and
outputs Q0, Q1, Q2 is given as:
Output Q0 (read directly from the truth table, with D′ denoting the complement of D):
Q0 = D7 + D6′·D5 + D6′·D4′·D3 + D6′·D4′·D2′·D1
Output Q1:
Q1 = D7 + D6 + D5′·D4′·D3 + D5′·D4′·D2
Output Q2:
Q2 = D7 + D6 + D5 + D4


Then the final Boolean expression for the priority encoder including the zero inputs is defined as:
Priority Encoder Output Expression

In practice these zero inputs would be ignored, allowing the implementation of the final Boolean
expression for the outputs of the 8-to-3 priority encoder. We can construct a simple encoder
from the expressions above using individual OR gates as follows.
Digital Encoder using Logic Gates

Figure 4.28: Logic Gates
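The priority rule shown in the truth table can also be modelled as a short sketch (an illustrative model, not the 74LS148 itself; the name `priority_encoder_8to3` is our own): scan from D7 downward and report the first active input.

```python
def priority_encoder_8to3(d):
    """8-to-3 priority encoder; d is a list [D0, D1, ..., D7] of active-HIGH inputs."""
    for i in range(7, -1, -1):                 # check highest-priority input first
        if d[i] == 1:
            return (i >> 2) & 1, (i >> 1) & 1, i & 1    # (Q2, Q1, Q0)
    return 0, 0, 0                             # no input active
```

With D2, D3 and D5 active simultaneously, the model returns the code for D5, as in the worked example above.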


4.7 MULTIPLEXER
The multiplexer is a combinational logic circuit designed to switch one of several input lines to a
single common output line.
Multiplexing is the generic term used to describe the operation of sending one or more analogue
or digital signals over a common transmission line at different times or speeds and as such, the
device we use to do just that is called a Multiplexer.
The multiplexer, shortened to “MUX” or “MPX”, is a combinational logic circuit designed to
switch one of several input lines through to a single common output line by the application of a
control signal. Multiplexers operate like very fast acting multiple position rotary switches
connecting or controlling multiple input lines called “channels” one at a time to the output.
Multiplexers, or MUX’s, can be either digital circuits made from high speed logic gates used to
switch digital or binary data or they can be analogue types using transistors, MOSFET’s or relays
to switch one of the voltage or current inputs through to a single output.
The most basic type of multiplexer device is that of a one-way rotary switch as shown.
Basic Multiplexing Switch

Figure 4.29: Multiplexing Switch


The rotary switch, also called a wafer switch as each layer of the switch is known as a wafer, is a
mechanical device whose input is selected by rotating a shaft. In other words, the rotary switch is
a manual switch that you can use to select individual data or signal lines simply by turning its
inputs “ON” or “OFF”. So how can we select each data input automatically using a digital
device.
In digital electronics, multiplexers, also known as data selectors because they can “select”
each input line, are constructed from individual analogue switches encased in a single IC
package, as opposed to “mechanical” type selectors such as conventional switches and
relays.
They are used as one method of reducing the number of logic gates required in a circuit design,
or when a single data line or data bus is required to carry two or more different digital signals:
for example, a single 8-channel multiplexer.
Generally, the selection of each input line in a multiplexer is controlled by an additional set of
inputs called control lines, and according to the binary condition of these control inputs, either
“HIGH” or “LOW”, the appropriate data input is connected directly to the output. Normally, a
multiplexer has 2^n data input lines and a number n of “control” inputs that corresponds to the
number of data inputs.
Note that multiplexers are different in operation from encoders. An encoder converts the single
active one of its multiple input lines into the equivalent binary coded (BCD) output for that
input.
We can build a simple 2-line to 1-line (2-to-1) multiplexer from basic logic NAND gates as
shown.
4.7.1 2-to-1 Channel Multiplexer

Figure 4.30: 2-input Multiplexer Design

The input A of this simple 2-1 line multiplexer circuit constructed from standard NAND gates
acts to control which input ( I0 or I1 ) gets passed to the output at Q.
From the truth table above, we can see that when the data select input, A is LOW at logic 0,


input I1 passes its data through the NAND gate multiplexer circuit to the output, while input I0 is
blocked. When the data select A is HIGH at logic 1, the reverse happens and now input I0 passes
data to the output Q while input I1 is blocked.
So by the application of either a logic “0” or a logic “1” at A we can select the appropriate input,
I0 or I1 with the circuit acting a bit like a single pole double throw (SPDT) switch.
As we only have one control line, A, we can only switch 2^1 = 2 inputs. In this simple
example, the 2-input multiplexer connects one of two 1-bit sources to a common output,
producing a 2-to-1-line multiplexer. We can confirm this in the Boolean expression for the
output, which for our 2-input multiplexer circuit above simplifies to:

Q = A·I0 + A′·I1
We can increase the number of data inputs to be selected further simply by following the same
procedure, and larger multiplexer circuits can be implemented using smaller 2-to-1 multiplexers
as their basic building blocks. A 4-input multiplexer therefore requires two data select lines,
since 2^2 = 4, giving a circuit with four inputs,
I0, I1, I2, I3 and two data select lines A and B as shown.
4.7.2 4-to-1 Channel Multiplexer

Figure 4.31: 4-to-1 Channel Multiplexer

The Boolean expression for this 4-to-1 multiplexer above, with inputs A to D and data select
lines a, b, is given as:

Q = a′·b′·A + a·b′·B + a′·b·C + a·b·D
In this example at any one instant in time only ONE of the four analogue switches is closed,
connecting only one of the input lines A to D to the single output at Q. As to which switch is
closed depends upon the addressing input code on lines “a” and “b“.
So for this example to select input B to the output at Q, the binary input address would need to
be “a” = logic “1” and “b” = logic “0”. Thus we can show the selection of the data through the
multiplexer as a function of the data select bits as shown.
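The sum-of-products routing above can be sketched directly in Python (illustrative only; `mux_4to1` is our own name, and the select convention follows the text, where a = 1, b = 0 selects input B):

```python
def mux_4to1(in_a, in_b, in_c, in_d, a, b):
    """4-to-1 multiplexer in sum-of-products form; (b, a) = (0, 1) selects input B."""
    na, nb = 1 - a, 1 - b                       # complements of the select lines
    # One AND term per channel, OR-ed together
    return (nb & na & in_a) | (nb & a & in_b) | (b & na & in_c) | (b & a & in_d)
```

Exactly one AND term can be non-zero for any setting of a and b, so only one channel ever reaches the output Q.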


Multiplexer Input Line Selection

Figure 4.32: Input Line Selection


Adding more control address lines (n) will allow the multiplexer to control more inputs, as it can
switch 2^n inputs, but each control line configuration will connect only ONE input to the output.
The implementation of the Boolean expression above using individual logic gates would
require the use of seven individual gates consisting of AND, OR and NOT gates as shown.
4 Channel Multiplexer using Logic Gates

Figure 4.34: Logic Gates


The symbol used in logic diagrams to identify a multiplexer is as follows:
Multiplexer Symbol

Figure 4.35: MUX Symbol

Multiplexers are not limited to just switching a number of different input lines or channels to one
common single output. There are also types that can switch their inputs to multiple outputs in
arrangements of 4-to-2, 8-to-3 or even 16-to-4, etc., configurations, and an example of a
simple dual-channel 4-input (4-to-2) multiplexer is given below:
4.7.3 4-to-2 Channel Multiplexer

Figure 4.36: 4-to-2 Channel Multiplexer

Here in this example the 4 input channels are switched to 2 individual output lines but larger


arrangements are also possible. This simple 4-to-2 configuration could be used for example, to
switch audio signals for stereo pre-amplifiers or mixers.
4.8 DEMULTIPLEXER
The demultiplexer is a combinational logic circuit designed to switch one common input line to
one of several separate output lines.
The data distributor, known more commonly as a demultiplexer or “demux” for short, is the
exact opposite of the multiplexer we saw in the previous section.
The demultiplexer takes one single input data line and then switches it to any one of a number of
individual output lines one at a time. The demultiplexer converts a serial data signal at the input
to parallel data at its output lines as shown below.
4.8.1 1-to-4 Channel De-multiplexer

Figure: 1-to-4 Channel De-Multiplexer


Data Select | Output
b a | Selected
0 0 | A
0 1 | B
1 0 | C
1 1 | D
Figure 4.37: Truth Table
The Boolean expressions for this 1-to-4 demultiplexer above, with outputs A to D, data select
lines a, b, and the common data input denoted F (our notation for the single input line), are given as:

A = F·a′·b′  B = F·a·b′  C = F·a′·b  D = F·a·b

The function of the Demultiplexer is to switch one common data input line to any one of the 4
output data lines A to D in our example above. As with the multiplexer the individual solid state
switches are selected by the binary input address code on the output select pins “a” and “b” as
shown.
Demultiplexer Output Line Selection

Figure 4.38: Output Line Selection


As with the previous multiplexer circuit, by adding more address line inputs it is possible to
switch more outputs, giving 1-to-2^n data line outputs.
Some standard demultiplexer ICs also have an additional “enable output” pin which disables or
prevents the input from being passed to the selected output. Also, some have latches built into
their outputs to maintain the output logic level after the address inputs have been changed.
However, in standard decoder type circuits the address input will determine which single data
output will have the same value as the data input with all other data outputs having the value of
logic “0”.
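This decoder-type routing can be sketched as follows (illustrative only; `demux_1to4` is our own name, with the common data input as the first argument):

```python
def demux_1to4(data, a, b):
    """1-to-4 demultiplexer: route `data` to one of outputs [A, B, C, D]; the rest stay 0."""
    na, nb = 1 - a, 1 - b
    return [data & nb & na,    # A selected when b = 0, a = 0
            data & nb & a,     # B selected when b = 0, a = 1
            data & b & na,     # C selected when b = 1, a = 0
            data & b & a]      # D selected when b = 1, a = 1
```

Only the addressed output can ever equal the data input; all other outputs remain at logic “0”.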
The implementation of the Boolean expression above using individual logic gates would require
the use of six individual gates consisting of AND and NOT gates as shown.
4 Channel Demultiplexer using Logic Gates

Figure 4.39: Logic Gates

The symbol used in logic diagrams to identify a demultiplexer is as follows.


The Demultiplexer Symbol

Figure 4.40: Demultiplexer Symbol

Again, as with the previous multiplexer example, we can also use the demultiplexer to digitally
control the gain of an operational amplifier.

Learning Activity

Explain in detail why a multiplexer is required.

Types of Multiplexing Techniques


These techniques are mainly used in communication and are classified into three types.
The three types of multiplexing techniques include the following.


• Frequency Division Multiplexing (FDM)


• Wavelength Division Multiplexing (WDM)
• Time Division Multiplexing (TDM)
Frequency Division Multiplexing (FDM)
FDM was used by telephone companies in the 20th century on long-distance connections for
multiplexing a number of voice signals over a system such as a coaxial cable. For short
distances, low-cost cables were used for systems such as the Bell System K- and N-carrier,
but these did not allow large bandwidths. FDM is an analog multiplexing technique used to
combine analog signals. This type of technique is useful when the bandwidth of the link is
greater than the combined bandwidth of the transmitted signals.

Frequency Division Multiplexing


In FDM, signals are produced by several device-modulated carrier frequencies, and these are
then combined into a single composite signal that can be carried by the link. To accommodate
the modulated signals, the carrier frequencies are separated by sufficient bandwidth, and these
bandwidth ranges are the channels through which the different signals travel. The channels are
separated by strips of unused bandwidth. The best examples of FDM include signal transmission
in TV and radio.
Wavelength Division Multiplexing (WDM)
In fiber-optic communications, WDM is one type of technology. It is a most useful concept
in high-capacity communication systems. At the transmitter end, the multiplexer is used to
combine the signals, and at the receiver end, the demultiplexer divides the signals again. The
main function of WDM at the multiplexer is to combine various light sources into a single
light signal, and this light signal is split back into multiple light sources at the demultiplexer.
The main intention of WDM is to utilize the high data-rate capacity of the FOC (fiber optic
cable). The data rate of FOC is superior to the data rate of metallic transmission cable.
Conceptually, WDM is similar to FDM, except that the data transmission occurs through the
FOC, in which the multiplexing and demultiplexing involve optical signals.


WDM
Time Division Multiplexing (TDM)
TDM is a method for transmitting several signals over a single communication channel by
dividing the time frame into slots, with one slot used for each message signal.

Time Division Multiplexing


TDM is useful for both analog and digital signals, in which several low-speed channels are
multiplexed into a high-speed channel for transmission. Depending on the time, every low-
speed channel is assigned to an exact position, where it works in synchronization. The MUX
and DEMUX ends are time-synchronized and switch to the next channel at the same moment.
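The one-slot-per-channel idea can be illustrated with a tiny list-based sketch (a toy model, not a real transmission system; the function names are our own):

```python
def tdm_mux(channels):
    """Interleave equal-length channel sample lists into frames, one slot per channel."""
    return [list(frame) for frame in zip(*channels)]

def tdm_demux(frames, n):
    """Receiver side: recover the n original channels from the frame stream."""
    return [[frame[i] for frame in frames] for i in range(n)]
```

With three channels [[1, 2], [3, 4], [5, 6]], `tdm_mux` produces the frames [[1, 3, 5], [2, 4, 6]]; because both ends agree on the slot order, `tdm_demux` recovers the original channels exactly.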
Questions:
1. Explain Frequency Division Multiplexing (FDM).
2. What do you mean by Multiplexing?

1. A half adder includes two logic gates: an AND gate and an EX-OR gate. A full
adder includes two EX-OR gates, two AND gates, and one OR gate. The
half adder has two input bits, A and B.
2. A combinational circuit comprises logic gates whose outputs at any
time are determined directly from the present combination of inputs, without
any regard to previous inputs.


SUMMARY
• A combinational circuit is a circuit in which we combine the various gates.
• A half adder is a logic circuit that performs an addition operation on two binary
digits.
• A subtractor accepts two binary numbers as input and subtracts one binary number
input from the other binary number input.
• An encoder is a device used to convert a signal (for example, a bitstream) or data into
a code.
• Active-HIGH output decoders are constructed with AND gates, and active-LOW output
decoders are built with NAND gates.
• Active-LOW output decoders drive the output low for the selected input combination, and all
remaining outputs are high.
• A multiplexer or MUX is a device that selects one of several analog or digital input
signals and forwards the selected input to a single line.
KEYWORDS
Adder: It is a digital circuit that performs addition of numbers.
Half Adder: It is a logical circuit that performs an addition operation on two binary digits.
Full Adder: It adds three one-bit binary numbers (X, Y, Z) and outputs two one-bit binary
values, Sum and Carry.
Subtractor: It takes two binary numbers as input and subtracts one binary number input
from the other binary number input.
Half Subtractor: It is a combinational circuit which is used to perform subtraction of two
bits.
Full Subtractor: It subtracts one bit from the other by taking the previous borrow into account
and generates difference and borrow.
Encoder: It is a device used to change a signal (such as a bitstream) or data into a code.
Priority Encoder: It has a priority given to each input.
Decoder: It is a logic circuit that converts an n-bit binary input code into M (= 2^n) output
lines such that each output line will be activated for only one of the possible combinations of
inputs.
MUX: It is a device that selects one of several analog or digital input signals and forwards
the selected input into a single line.
SELF ASSESSMENT QUESTIONS
Short Answer Questions
1. What are the characteristics of combinational circuits?

2. Write the difference between a priority encoder and a binary encoder.


3. What is the function of the enable inputs of a decoder?
4. What is the difference between a multiplexer and a demultiplexer?
5. Write down the uses of demultiplexer.
Long Answer Questions
1. Illustrate the implementation of a full adder with an example.
2. Explain in detail the different types of subtractors.
3. Describe the concept of designing a Bit-Slice ALU.
4. Explain the operation of a demultiplexer.
5. Explain how to draw the truth table and circuit diagram for an 8-to-3 priority
encoder.
FURTHER READING
1. Mano, M. Morris. 2007. “Digital Design”. Englewood Cliffs, N.J.: Prentice-Hall.
2. Karim, Mohammad A., and Xinghao Chen. 2007. “Digital Design: Basic Concepts and Principles”.
3. Gill, Naseeb S. 2011. “Digital Design and Computer Organisation”. Firewall Media.
4. Harris, David. 2012. “Digital Design and Computer Architecture”. Elsevier.
5. Mano, Morris. 2012. “Logic and Computer Design Fundamentals”. Cram101.


UNIT II- COMBINATIONAL AND SEQUENTIAL CIRCUITS


Chapter 5 - Sequential Circuits

CONTENTS
Learning Objectives
Learning Outcomes
Overview
5.1 Concept of Sequential Circuits
5.1.1 Synchronous Sequential Circuits
5.1.2 Asynchronous Sequential Circuits
5.2 Latches
5.2.1 SR Latch
5.2.2 D Latch
5.3 Flip-flops
5.3.1 SR Flip-Flop
5.3.2 JK Flip-Flop
5.3.3 D Flip-Flop
5.3.4 T Flip-Flop
5.3.5 Master-Slave Flip-Flop
5.4 Registers
5.4.1 Shift Registers
5.5 Memories
5.5.1 Semiconductor Memories
5.6 Counters
5.6.1 Asynchronous Counter
5.6.2 Synchronous Counters

Summary
Keywords
Self-Assessment Questions
Further Reading


LEARNING OBJECTIVES
After studying this chapter, you will be able to:
• Gain knowledge about the concept of sequential circuits
• Describe the concept of a latch circuit
• Illustrate the types of flip-flops
• Identify the different types of registers
• Explain the concept of the memory unit.
LEARNING OUTCOMES
Upon completion of the chapter, students will be able to:
• Design sequential circuits
• Understand the concept of latches and flip-flops
• Analyze the different types of flip-flops
• Gain knowledge about registers and memories
• Design counters.
OVERVIEW
In the previous chapter, you studied various combinational circuits. You learned about the
concepts of adders, subtractors, decoders, encoders and multiplexers. We also covered the
process of designing an ALU circuit.
In this chapter, you will study the concept of sequential circuits. In digital circuit theory,
sequential logic is a type of logic circuit whose output depends not only on the current value of its
input signals but also on the past history of its inputs. Sequential logic can be viewed as
combinational logic with memory. In this chapter, you will learn about the latch, a logic
circuit having two inputs and one output; flip-flops, circuits having two stable
states; and registers, which perform the function of storing and moving data.
You will also study the memory unit, which is implemented on a semiconductor-based
integrated circuit, and counters, which cycle through a fixed sequence of states.
Virtually all circuits in practical digital devices are a combination of combinational and
sequential logic. The ideas covered in this chapter will help you use sequential logic to
build finite state machines, a basic building block in all digital circuitry, as well as memory
circuits and other devices.
5.1 CONCEPT OF SEQUENTIAL CIRCUITS
A sequential circuit consists of a combinational circuit to which memory elements are
connected to form a feedback path. A sequential circuit receives binary information from external
inputs.
Digital sequential logic circuits are divided into synchronous and asynchronous types. A
synchronous sequential circuit is defined as a system whose behaviour can be
determined from the values of its signals at discrete instants of time.


In asynchronous circuits, the state of the device can change at any time in response to changing
inputs. The behaviour of an asynchronous sequential circuit depends on the order in which its
input signals change and can be affected at any instant of time.
5.1.1 Synchronous Sequential Logic
Essentially all sequential logic today is clocked or synchronous logic. In a synchronous circuit, an
electronic oscillator called a clock produces a sequence of repetitive pulses called the clock
signal, which is distributed to all the memory elements in the circuit. The basic memory
element in sequential logic is the flip-flop.
The output of each flip-flop only changes when triggered by the clock pulse, so changes to the
logic signals throughout the circuit all begin at the same time, at regular intervals, synchronized by
the clock. The output of all the storage elements (flip-flops) in the circuit at any given
time, that is, the binary data they contain, is known as the state of the circuit. The state
of a synchronous circuit changes only on clock pulses. At each cycle, the next state is determined
by the present state and the value of the input signals when the clock pulse occurs.
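This state-update discipline (next state computed from the present state and the inputs, applied once per clock pulse) can be sketched as a toy simulation; the function names and the example counter are our own illustration:

```python
def run_synchronous(next_state, initial, inputs):
    """Simulate a synchronous circuit: the state changes only once per clock cycle."""
    state, trace = initial, [initial]
    for x in inputs:                      # each loop iteration models one clock pulse
        state = next_state(state, x)      # next state = f(present state, input)
        trace.append(state)
    return trace

# Example next-state function: a 2-bit up-counter that counts while enable = 1
def count2(state, enable):
    return (state + enable) % 4
```

Feeding the enable sequence [1, 1, 0, 1] from state 0 gives the state trace [0, 1, 2, 2, 3]: the counter holds its state in the cycle where enable is 0, exactly as a clocked circuit would.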
The main advantage of synchronous logic is its simplicity. The logic gates which
perform the operations on the data require a finite amount of time to respond to changes in
their inputs. This is called propagation delay. The interval between clock pulses must be long
enough so that all the logic gates have time to respond to the
changes and their outputs "settle" to stable logic values before the next clock pulse
occurs. As long as this condition is met (ignoring certain other details), the circuit
is guaranteed to be stable and reliable. This determines the maximum operating speed of a
synchronous circuit.
Let us discuss some of the disadvantages of synchronous logic:
• The maximum possible clock rate is determined by the slowest logic path in the
circuit, otherwise known as the critical path.
• Logic paths that complete their calculations quickly are idle much of the time,
waiting for the next clock pulse. Therefore synchronous logic can be slower than
asynchronous logic. One way to speed up synchronous circuits is to split complex
operations into several simple operations which can be performed in successive clock
cycles. This technique is extensively used in microprocessor design, and helps to
improve the performance of modern processors.
• The clock signal must be distributed to every flip-flop in the circuit. As the clock is
usually a high-frequency signal, this distribution consumes a relatively large
amount of power and dissipates much heat. Even the flip- flops that are doing
nothing consume a small amount of power, thereby generating waste heat in the
chip. In portable devices which have limited battery power, the clock signal goes on
even when the device is not being used, consuming power.


Figure 5.1: Synchronous Sequential Logic

5.1.2 Asynchronous Sequential Logic


Asynchronous sequential logic is not synchronized by a clock signal; the outputs of the circuit
change directly in response to changes in inputs. The advantage of asynchronous logic is
that it can be faster than synchronous logic, because the circuit
does not have to wait for a clock signal to process inputs. The speed of the device is
potentially limited only by the propagation delays of the logic gates used. However,
asynchronous logic is more difficult to design and is subject to problems not
encountered in synchronous designs.
The main problem is that digital memory elements are sensitive to the order in which their input
signals arrive; if two signals arrive at a logic gate at almost the same time, which
state the circuit goes into can depend on which signal reaches the gate first. Therefore the circuit
can go into the wrong state, depending on small differences in the propagation
delays of the logic gates. This is known as a race condition. This problem is not as
severe in synchronous circuits, because the outputs of the memory elements only
change at each clock pulse. The interval between clock signals is designed to be long enough
to allow the outputs of the memory elements to settle so that they are not changing when the
next clock pulse arrives.
Thus, the only timing problems are due to "asynchronous inputs": inputs to
the circuit from other systems which are not synchronized to the clock signal.

Figure 5.2: Asynchronous Sequential Logic


5.2 LATCHES
• Latches and Flip-Flops are the basic building blocks of most sequential circuits.
• A latch is a sequential device that monitors all of its inputs continuously and changes its outputs accordingly at any time, independent of a clocking signal.


• Some latches are provided with an enable signal. When the enable signal is active, output changes occur as the input changes; when the enable signal is inactive, input changes do not affect the output.
• A flip-flop is a sequential device that samples its inputs and changes its outputs only at times determined by a clocking signal.
5.2.1 SR Latch
The simplest type of latch is the Set-Reset (SR) latch. It can be constructed from either two NOR gates or two NAND gates.
i) SR latch using NOR gates

Figure.5.3: a SR latch using NOR gates Figure.5.3: b. Logic Symbol


The two NOR gates are cross-coupled so that the output of NOR gate 1 is connected to one of the inputs
of NOR gate 2 and vice versa. The latch has two outputs, Q and Q’, and two inputs, set and reset. Before analyzing the SR latch, recall that a logic 1 at any input of a NOR gate forces its output to logic 0. Let us examine the operation of this circuit for the various input/output possibilities.
Case 1: S= 0 and R= 0

Figure.5.4.Q= 1 and Q’= 0

Let us assume that initially Q= 1 and Q’= 0.

• With Q’= 0 and R= 0, both inputs to NOR gate 1 are at logic 0, so its output Q remains at logic 1.
• With Q= 1, one input of NOR gate 2 is at logic 1, so its output Q’ remains at logic 0. There is no change in the output state.

Now assume that initially Q= 0 and Q’= 1.
• With Q’= 1, one input of NOR gate 1 is at logic 1, hence its output Q is at logic 0.
• With Q= 0 and S= 0, both inputs to NOR gate 2 are at logic 0, so its output Q’ is at logic 1. In this case also there is no change in the output state.


Figure 5.5: Q= 0 and Q’=1


Case 2: S= 0 and R= 1

Figure 5.6: S= 0 and R= 1


In this case, the R input of NOR gate 1 is at logic 1, hence its output, Q, is at logic 0. Both inputs to NOR gate 2 are now at logic 0, so its output, Q’, is at logic 1.
Case 3: S= 1 and R= 0

Figure 5.7: S= 1 and R= 0


In this case, the S input of NOR gate 2 is at logic 1, hence its output, Q’, is at logic 0. Both inputs to NOR gate 1 are now at logic 0, so its output, Q, is at logic 1.
Case 4: S= 1 and R= 1
When R and S both are at logic 1, they force the outputs of both NOR gates to the low state, i.e.,
(Q=0 and Q’=0). So, we call this an indeterminate or prohibited state, and represent this
condition in the truth table as an asterisk (*). This condition also violates the basic definition of a
latch that requires Q to be complement of Q’. Thus in normal operation this condition must be
avoided by making sure that 1’s are not applied to both inputs simultaneously. The operation of the SR latch is summarized as follows:
• When S= 0 and R= 0, the output, Qn+1 remains in its present state, Qn.
• When S= 0 and R= 1, the latch is reset to 0.
• When S= 1 and R= 0, the latch is set to 1.
• When S= 1 and R= 1, the output of both gates will produce 0. i.e., Qn+1= Qn+1’= 0.


The truth table of the NOR-based SR latch is shown below.

S R Qn Qn+1 State
0 0 0  0    No Change (NC)
0 0 1  1    No Change (NC)
0 1 0  0    Reset
0 1 1  0    Reset
1 0 0  1    Set
1 0 1  1    Set
1 1 0  x    Indeterminate (*)
1 1 1  x    Indeterminate (*)
Figure 5.8: Truth Table
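The settling behaviour summarized in this truth table can be checked with a short simulation of the two cross-coupled NOR gates. The following Python sketch (function names such as `sr_latch_nor` are illustrative, not from the text) iterates the gates a few times until the outputs stabilise:

```python
def nor(a, b):
    # A logic 1 at any input of a NOR gate forces its output to logic 0
    return 0 if (a or b) else 1

def sr_latch_nor(s, r, q, qbar):
    """Settle the latch for constant S, R, starting from state (Q, Q')."""
    for _ in range(4):            # a few passes are enough to converge
        q = nor(r, qbar)          # NOR gate 1: inputs R and Q'
        qbar = nor(s, q)          # NOR gate 2: inputs S and Q
    return q, qbar

print(sr_latch_nor(1, 0, 0, 1))   # Set   -> (1, 0)
print(sr_latch_nor(0, 1, 1, 0))   # Reset -> (0, 1)
print(sr_latch_nor(0, 0, 1, 0))   # Hold  -> (1, 0), no change
```

Running the hold case from either stable state shows why S = R = 0 simply preserves the present output.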
ii. SR latch using NAND gates:

Figure.5.9: a. SR latch using NAND gates Figure. 5.9: b. Logic Symbol

The SR latch can also be implemented using NAND gates. The inputs of this Latch are S and R.
To understand how this circuit functions, recall that a low on any input to a NAND gate forces
its output high.
The operation of the NAND-based SR latch is summarized as follows:
• When S= 0 and R= 0, both gate outputs are forced to 1, i.e., Qn+1= Qn+1’= 1 (indeterminate).
• When S= 0 and R= 1, the latch is set to 1.
• When S= 1 and R= 0, the latch is reset to 0.
• When S= 1 and R= 1, the output, Qn+1, remains in its present state, Qn.
The truth table of the NAND-based SR latch is shown below.

S R Qn Qn+1 State
0 0 0  x    Indeterminate (*)
0 0 1  x    Indeterminate (*)
0 1 0  1    Set
0 1 1  1    Set
1 0 0  0    Reset
1 0 1  0    Reset
1 1 0  0    No Change (NC)
1 1 1  1    No Change (NC)
Figure 5.10: Truth Table
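The active-LOW behaviour of the NAND latch can be checked with the same kind of iterative simulation of the two cross-coupled gates (a sketch; function names are illustrative):

```python
def nand(a, b):
    # A logic 0 at any input of a NAND gate forces its output to logic 1
    return 0 if (a and b) else 1

def sr_latch_nand(s, r, q, qbar):
    """Settle the NAND latch for constant S, R from state (Q, Q')."""
    for _ in range(4):            # a few passes are enough to converge
        q = nand(s, qbar)         # gate 1: inputs S and Q'
        qbar = nand(r, q)         # gate 2: inputs R and Q
    return q, qbar

print(sr_latch_nand(0, 1, 0, 1))  # S = 0 sets the latch   -> (1, 0)
print(sr_latch_nand(1, 0, 1, 0))  # R = 0 resets the latch -> (0, 1)
print(sr_latch_nand(1, 1, 1, 0))  # S = R = 1: no change   -> (1, 0)
```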


iii. Gated SR Latch


In the SR latch, output changes occur immediately after the input changes, i.e., the latch is sensitive to its S and R inputs at all times. A latch can instead be made sensitive to its inputs only when an enable input is active; such a latch is known as a gated SR latch. The circuit behaves like an SR latch when EN= 1 and retains its previous state when EN= 0.

Figure 5.11: a. SR Latch with enable input using NAND gates


Figure. 5.11: b. Logic Symbol
EN S R Qn Qn+1 State
1  0 0 0  0    No Change (NC)
1  0 0 1  1    No Change (NC)
1  0 1 0  0    Reset
1  0 1 1  0    Reset
1  1 0 0  1    Set
1  1 0 1  1    Set
1  1 1 0  x    Indeterminate (*)
1  1 1 1  x    Indeterminate (*)
0  x x 0  0    No Change (NC)
0  x x 1  1    No Change (NC)
Figure 5.12: Truth Table

When S is HIGH and R is LOW, a HIGH on the EN input sets the latch. When S is LOW and R
is HIGH, a HIGH on the EN input resets the latch.

Figure 5.13: Waveform of Gate SR Latch
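Behaviourally, the enable input simply controls whether S and R are examined at all. A minimal next-state sketch (the function name is illustrative, not from the text):

```python
def gated_sr(en, s, r, q):
    """Next state Q of a gated SR latch; EN = 0 means hold."""
    if not en:
        return q              # enable inactive: inputs are ignored
    if s and r:
        raise ValueError("S = R = 1 with EN = 1 is indeterminate")
    if s:
        return 1              # set
    if r:
        return 0              # reset
    return q                  # S = R = 0: no change

q = 0
q = gated_sr(1, 1, 0, q)      # EN active, S HIGH: latch sets, Q = 1
q = gated_sr(0, 0, 1, q)      # EN inactive: R is ignored, Q stays 1
q = gated_sr(1, 0, 1, q)      # EN active, R HIGH: latch resets
print(q)                      # 0
```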


iv. S’-R’ SR Latch


Figure 5.14: Logic Diagram and Truth Table of S’-R’ SR Latch

In the above logic circuit, if S = 0 and also R = 0, the condition of Q is totally unpredictable.
• First suppose Q is previously 0.
• Now both inputs of G2 are 0, as R = 0 and Q = 0. So the output of G2, i.e. Q’, is at logic 1.
• Now the inputs of G1 are 0 and 1, as S = 0 and Q’ = 1. So the output of G1, i.e. Q, is at logic 1. That means Q has changed.
• Now Q is 1. So the inputs of G2 are 0 and 1, as R = 0 and Q = 1. So the output of G2, i.e. Q’, is at logic 1. That means Q’ is unchanged.
• Now the inputs of G1 are 0 and 1, as S = 0 and Q’ = 1. So the output of G1, i.e. Q, is at logic 1. That means Q is unchanged.
5.2.2 D Latch
The SR latch has one drawback: the next-state value cannot be predicted when both inputs S and R are 1. This difficulty is overcome by the D latch, also called the Data latch. The circuit diagram of the D latch is shown in the following figure.
The D latch as shown below has an enable input. When the E input is 1, the Q output follows the
D input. In this situation, the latch is said to be "open" and the path from the input D to the
output Q is "transparent". Thus the circuit is also known as a transparent latch. When E is 0, the
latch is disabled or "closed", and the Q output retains its last value independent of the D input.

Figure 5.15: Logic Diagram


E D Q Q’
0 X Previous state (no change)
1 0 0 1
1 1 1 0
Figure 5.16: Truth Table
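The transparent/opaque behaviour of the D latch reduces to a single line of logic. A behavioural sketch (the function name is illustrative):

```python
def d_latch(e, d, q):
    # E = 1: latch is open, Q follows D; E = 0: latch is closed, Q holds
    return d if e else q

q = 0
q = d_latch(1, 1, q)   # latch open: Q follows D, so Q = 1
q = d_latch(0, 0, q)   # latch closed: D is ignored, Q keeps its last value
print(q)               # 1
```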


5.3 FLIP-FLOPS
In electronics, a flip-flop or latch is a circuit that has two stable states and can be used to store
state information – a bistable multivibrator. The circuit can be made to change state
by signals applied to one or more control inputs and will have one or two outputs. It is the basic
storage element in sequential logic. Flip-flops and latches are fundamental building blocks
of digital electronics systems used in computers, communications, and many other types of
systems.
5.3.1 SR Flip-Flop
The SR flip-flop, also known as a SR Latch, can be considered as one of the most basic
sequential logic circuit possible. This simple flip-flop is basically a one-bit memory bistable
device that has two inputs, one which will “SET” the device (meaning the output = “1”), and is
labelled S and one which will “RESET” the device (meaning the output = “0”), labelled R.
Then the SR description stands for “Set-Reset”. The reset input resets the flip-flop back to its
original state with an output Q that will be either at a logic level “1” or logic “0” depending upon
this set/reset condition.
A basic NAND gate SR flip-flop circuit provides feedback from both of its outputs back to its
opposing inputs and is commonly used in memory circuits to store a single data bit. Then the SR
flip-flop actually has three inputs, Set, Reset and its current output Q relating to its current state
or history. The term “Flip-flop” relates to the actual operation of the device, as it can be
“flipped” into one logic Set state or “flopped” back into the opposing logic Reset state.
The NAND Gate SR Flip-Flop
The simplest way to make any basic single bit set-reset SR flip-flop is to connect together a pair
of cross-coupled 2-input NAND gates as shown, to form a Set-Reset Bistable also known as an
active LOW SR NAND Gate Latch, so that there is feedback from each output to one of the
other NAND gate inputs. This device consists of two inputs, one called the Set, S and the other
called the Reset, R with two corresponding outputs Q and its inverse or complement Q (not-Q)
as shown below.
The Basic SR Flip-flop

Figure 5.17: Symbol & Circuit


The Set State


Consider the circuit shown above. If the input R is at logic level “0” (R = 0) and input S is at
logic level “1” (S = 1), the NAND gate Y has at least one of its inputs at logic “0” therefore, its
output Q must be at a logic level “1” (NAND Gate principles). Output Q is also fed back to input
“A” and so both inputs to NAND gate X are at logic level “1”, and therefore its output Q must be
at logic level “0”.
Again, NAND gate principles apply. If the reset input R changes state and goes HIGH to logic “1”
with S remaining HIGH also at logic level “1”, NAND gate Y inputs are now R = “1” and B =
“0”. Since one of its inputs is still at logic level “0” the output at Q still remains HIGH at logic
level “1” and there is no change of state. Therefore, the flip-flop circuit is said to be “Latched” or
“Set” with Q = “1” and Q = “0”.
Reset State
In this second stable state, Q is at logic level “0”, (not Q = “0”) its inverse output at Q is at logic
level “1”, (Q = “1”), and is given by R = “1” and S = “0”. As gate X has one of its inputs at logic
“0” its output Q must equal logic level “1” (again NAND gate principles). Output Q is fed back
to input “B”, so both inputs to NAND gate Y are at logic “1”, therefore, Q = “0”.
If the set input, S now changes state to logic “1” with input R remaining at logic “1”,
output Q still remains LOW at logic level “0” and there is no change of state. Therefore, the flip-
flop circuits “Reset” state has also been latched and we can define this “set/reset” action in the
following truth table.
Truth Table for this Set-Reset Function
S R Qn Qn+1 Description
0 0 0  X    Invalid Condition
0 0 1  X    Invalid Condition
0 1 0  0    Reset
0 1 1  0    Reset
1 0 0  1    Set
1 0 1  1    Set
1 1 0  0    No Change
1 1 1  1    No Change
Figure 5.18: Truth Table

It can be seen that when both inputs S = “1” and R = “1” the outputs Q and Q can be at either
logic level “1” or “0”, depending upon the state of the inputs S or R BEFORE this input
condition existed. Therefore the condition of S = R = “1” does not change the state of the
outputs Q and Q.
However, the input state of S = “0” and R = “0” is an undesirable or invalid condition and must
be avoided. The condition of S = R = “0” causes both outputs Q and Q to be HIGH together at


logic level “1” when we would normally want Q to be the inverse of Q. The result is that the flip-flop loses control of Q and Q, and if the two inputs are now switched “HIGH” again after this
condition to logic “1”, the flip-flop becomes unstable and switches to an unknown data state
based upon the unbalance as shown in the following switching diagram.
S-R Flip-flop Switching Diagram

Figure 5.19: Switching Diagram

This unbalance can cause one of the outputs to switch faster than the other resulting in the flip-
flop switching to one state or the other which may not be the required state and data corruption
will exist. This unstable condition is generally known as its Meta-stable state.
Then, a simple NAND gate SR flip-flop or NAND gate SR latch can be set by applying a logic
“0”, (LOW) condition to its Set input and reset again by then applying a logic “0” to
its Reset input. The SR flip-flop is said to be in an “invalid” condition (Meta-stable) if both the
set and reset inputs are activated simultaneously.
As we have seen above, the basic NAND gate SR flip-flop requires logic “0” inputs to flip or
change state from Q to Q and vice versa. We can however, change this basic flip-flop circuit to
one that changes state by the application of positive going input signals with the addition of two
extra NAND gates connected as inverters to the S and R inputs as shown.
Positive NAND Gate SR Flip-flop

Figure 5.20: NAND Circuit

As well as using NAND gates, it is also possible to construct simple one-bit SR Flip-flops using


two cross-coupled NOR gates connected in the same configuration. The circuit will work in a
similar way to the NAND gate circuit above, except that the inputs are active HIGH and the
invalid condition exists when both its inputs are at logic level “1”, and this is shown below.
The NOR Gate SR Flip-flop

Figure 5.21: NOR Circuit

5.3.2 JK Flip-Flop
The JK Flip-flop is similar to the SR Flip-flop but there is no change in state when the J and K
inputs are both LOW.
The basic S-R NAND flip-flop circuit has many advantages and uses in sequential logic circuits
but it suffers from two basic switching problems.
1. the Set = 0 and Reset = 0 condition (S = R = 0) must always be avoided
2. if Set or Reset change state while the enable (EN) input is high the correct latching action may
not occur
Then to overcome these two fundamental design problems with the SR flip-flop design, the JK
flip Flop was developed.
This simple JK flip Flop is the most widely used of all the flip-flop designs and is considered to
be a universal flip-flop circuit. The two inputs labelled “J” and “K” are not shortened abbreviated
letters of other words, such as “S” for Set and “R” for Reset, but are themselves autonomous
letters chosen by its inventor Jack Kilby to distinguish the flip-flop design from other types.
The sequential operation of the JK flip flop is exactly the same as for the previous SR flip-flop
with the same “Set” and “Reset” inputs. The difference this time is that the “JK flip flop” has no
invalid or forbidden input states of the SR Latch even when S and R are both at logic “1”.
The JK flip flop is basically a gated SR flip-flop with the addition of a clock input circuitry that
prevents the illegal or invalid output condition that can occur when both inputs S and R are equal
to logic level “1”. Due to this additional clocked input, a JK flip-flop has four possible input
combinations, “logic 1”, “logic 0”, “no change” and “toggle”. The symbol for a JK flip flop is
similar to that of an SR Bistable Latch as seen earlier, except for the addition of a clock input.


The Basic JK Flip-flop


Both the S and the R inputs of the previous SR bistable have now been replaced by two inputs
called the J and K inputs, respectively after its inventor Jack Kilby. Then this equates
to: J = S and K = R.
The two 2-input AND gates of the gated SR bistable have now been replaced by two 3-
input NAND gates with the third input of each gate connected to the outputs at Q and Q. This
cross coupling of the SR flip-flop allows the previously invalid condition of S = “1” and R = “1”
state to be used to produce a “toggle action” as the two inputs are now interlocked.

Figure 5.22: Symbol & Circuit

If the circuit is now “SET” the J input is inhibited by the “0” status of Q through the
lower NAND gate. If the circuit is “RESET” the K input is inhibited by the “0” status
of Q through the upper NAND gate. As Q and Q are always different we can use them to control
the input. When both inputs J and K are equal to logic “1”, the JK flip flop toggles as shown in
the following truth table.
The Truth Table for the JK Function
J K Qn Qn+1 Description
0 0 0  0    No Change
0 0 1  1    No Change
0 1 0  0    Reset
0 1 1  0    Reset
1 0 0  1    Set
1 0 1  1    Set
1 1 0  1    Toggle
1 1 1  0    Toggle
Figure 5.23: Truth Table
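The four input combinations of the JK truth table — hold, reset, set, and toggle — can be captured in a short next-state function (a behavioural sketch in which the clock edge is implicit; the name `jk_next` is illustrative):

```python
def jk_next(j, k, q):
    """Next state of a clocked JK flip-flop on the active clock edge."""
    if j == 0 and k == 0:
        return q          # no change
    if j == 0 and k == 1:
        return 0          # reset
    if j == 1 and k == 0:
        return 1          # set
    return 1 - q          # J = K = 1: toggle

q = 0
q = jk_next(1, 0, q)      # set    -> Q = 1
q = jk_next(1, 1, q)      # toggle -> Q = 0
q = jk_next(1, 1, q)      # toggle -> Q = 1
q = jk_next(0, 0, q)      # hold   -> Q = 1
print(q)                  # 1
```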

Then the JK flip-flop is basically an SR flip flop with feedback which enables only one of its two
input terminals, either SET or RESET to be active at any one time under normal switching
thereby eliminating the invalid condition seen previously in the SR flip flop circuit.
However, if both the J and K inputs are HIGH at logic “1” (J = K = 1), when the clock input goes
HIGH, the circuit will “toggle” as its outputs switch and change state complementing each other.
This results in the JK flip-flop acting more like a T-type toggle flip-flop when both terminals are


“HIGH”. However, as the outputs are fed back to the inputs, this can cause the output at Q to
oscillate between SET and RESET continuously after being complemented once.
While this JK flip-flop circuit is an improvement on the clocked SR flip-flop it also suffers from
timing problems called “race” if the output Q changes state before the timing pulse of the clock
input has time to go “OFF”. To avoid this the timing pulse period ( T ) must be kept as short as
possible (high frequency). As this is sometimes not possible with basic JK flip-flops built using
basic NAND or NOR gates, far more advanced master-slave (edge-triggered) flip-flops were
developed which are more stable.
5.3.3 D Flip-Flops
The D-type flip-flop is a modified Set-Reset flip-flop with the addition of an inverter to prevent
the S and R inputs from being at the same logic level.
One of the main disadvantages of the basic SR NAND Gate Bistable circuit is that the
indeterminate input condition of SET = “0” and RESET = “0” is forbidden.
This state will force both outputs to be at logic “1”, over-riding the feedback latching action and
whichever input goes to logic level “1” first will lose control, while the other input still at logic
“0” controls the resulting state of the latch.
But in order to prevent this from happening an inverter can be connected between the “SET” and
the “RESET” inputs to produce another type of flip flop circuit known as a Data Latch, Delay
flip flop, D-type Bistable, D-type Flip Flop or just simply a D Flip Flop as it is more generally
called.
The D Flip Flop is by far the most important of all the clocked flip-flops. By adding an inverter
(NOT gate) between the Set and Reset inputs, the S and R inputs become complements of each
other ensuring that the two inputs S and R are never equal (0 or 1) to each other at the same time
allowing us to control the toggle action of the flip-flop using one single D (Data) input.
Then this Data input, labelled “D” and is used in place of the “Set” signal, and the inverter is
used to generate the complementary “Reset” input thereby making a level-sensitive D-type flip-
flop from a level-sensitive SR-latch as now S = D and R = not D as shown.
D-type Flip-Flop Circuit

Figure 5.24: Symbol and Circuit

We remember that a simple SR flip-flop requires two inputs, one to “SET” the output and one to
“RESET” the output. By connecting an inverter (NOT gate) to the SR flip-flop we can “SET”


and “RESET” the flip-flop using just one input as now the two input signals are complements of
each other. This complement avoids the ambiguity inherent in the SR latch when both inputs are
LOW, since that state is no longer possible.
Thus this single input is called the “DATA” input. If this data input is held HIGH the flip flop
would be “SET” and when it is LOW the flip flop would change and become “RESET”.
However, this would be rather pointless since the output of the flip flop would always change on
every pulse applied to this data input.
To avoid this an additional input called the “CLOCK” or “ENABLE” input is used to isolate the
data input from the flip flop’s latching circuitry after the desired data has been stored. The effect
is that D input condition is only copied to the output Q when the clock input is active. This then
forms the basis of another sequential device called a D Flip Flop.
The “D flip flop” will store and output whatever logic level is applied to its data terminal so long
as the clock input is HIGH. Once the clock input goes LOW the “set” and “reset” inputs of the
flip-flop are both held at logic level “1” so it will not change state and store whatever data was
present on its output before the clock transition occurred. In other words the output is “latched”
at either logic “0” or logic “1”.
Truth Table for the D-type Flip Flop
Clk D Qn Qn+1 Description
0   X Qn Qn   Memory (no change)
↑   0 X  0    Reset, Q » 0
↑   1 X  1    Set, Q » 1
Figure 5.25: Truth Table

Note: ↑ and ↓ indicate the direction of the clock edge, as D-type flip-flops are assumed to be edge-triggered.
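Because the D-type flip-flop is edge-triggered, it can be modelled by sampling D only on the rising edge of the clock. A behavioural sketch (the class name is illustrative, not from the text):

```python
class DFlipFlop:
    """Behavioural model of a positive-edge-triggered D flip-flop."""
    def __init__(self):
        self.q = 0
        self.prev_clk = 0

    def tick(self, clk, d):
        if clk == 1 and self.prev_clk == 0:   # rising edge detected:
            self.q = d                        # D is copied to Q
        self.prev_clk = clk                   # otherwise Q is held
        return self.q

ff = DFlipFlop()
print(ff.tick(1, 1))   # rising edge, D = 1 -> Q = 1
print(ff.tick(0, 0))   # falling edge: D ignored, Q still 1
print(ff.tick(0, 1))   # no edge: D ignored, Q still 1
```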
The Master-Slave D Flip Flop
The basic D-type flip flop can be improved further by adding a second SR flip-flop to its output
that is activated on the complementary clock signal to produce a “Master-Slave D-type flip flop”.
On the leading edge of the clock signal (LOW-to-HIGH) the first stage, the “master” latches the
input condition at D, while the output stage is deactivated.
On the trailing edge of the clock signal (HIGH-to-LOW) the second “slave” stage is now
activated, latching on to the output from the first master circuit. Then the output stage appears to
be triggered on the negative edge of the clock pulse. “Master-Slave D-type flip flops” can be
constructed by the cascading together of two latches with opposite clock phases as shown.
5.3.4 T Flip-Flop
Toggle flip-flops are sequential logic circuits frequently used as single-bit bistable storage elements in counters, in memory devices, or as frequency dividers in response to a clock pulse.


The Toggle Flip-flop is another type of bistable sequential logic circuit based around the
previous clocked JK flip-flop circuit. The toggle flip-flop can be used as a basic digital element
for storing one bit of information, as a divide-by-two divider or as a counter. Toggle flip-flops
have a single input and one or two complementary outputs of Q and Q which change state on the
positive edge (rising edge) or negative edge (falling edge) of an input clock signal or pulse.
Toggle flip-flops, TFFs, or simply “T-type flip-flops”, are not available commercially as a dedicated TTL or CMOS logic chip, but they can be easily constructed by connecting together
the J and K inputs of a basic JK flip-flop where the J input behaves like a Set (S) command, and
the K input behaves like a Reset (R) command. We recall from the previous section that the JK flip-flop is a clocked flip-flop whose input condition (HIGH or LOW) and present steady state together determine its next switching state.

Figure 5.26: Flip-Flops

Figure 5.27: Clock Cycle

Suppose that initially CLK and input T are both LOW (CLK = T = 0), and that output Q is HIGH
(Q = 1). At the rising edge or falling edge of a CLK pulse, the logic “0” condition present
at T prevents the output at Q from changing state. Thus the output remains unchanged when T =
0.
Now let’s suppose that input T is HIGH (T = 1) and CLK is LOW (CLK = 0). At the rising edge (assuming positive transition) of a CLK pulse at time t1, the output at Q changes state and becomes LOW, making Q HIGH. The negative transition of the clock pulse from HIGH to LOW at time t2 has no effect on the output at Q as the flip-flop is reset into one stable state.
At the next rising edge of the clock signal at time t3, the logic “1” at T passes to Q, changing its state making output Q HIGH again. The negative transition of the CLK pulse at time t4 from HIGH to LOW once again has no effect on the output. Thus the Q output of the flip-flop “toggles” at each positive going edge (for this example) of the CLK pulse.


Characteristics Table for the Toggle Function


CLK T Q Q+1
↑   0 0 0
↑   0 1 1
↑   1 0 1
↑   1 1 0
Figure 5.28: Truth Table

Then we can define the switching action of the toggle flip-flop in Boolean form as:
Q+1 = T.Q’ + T’.Q
where Q represents the present steady state of the flip-flop and Q+1 is the next switching state.
You may have noticed that this characteristic equation produces a HIGH output for the next state (Q+1) if the two inputs T and Q are different, and a LOW output if they are the same.
This behaviour — Q+1 is HIGH when either input is HIGH but not when both are — is exactly the Boolean Algebra expression of an Exclusive-OR function:
Q+1 = T.Q’ + T’.Q = T XOR Q = T ⊕ Q
Then we can represent the switching action of a toggle flip-flop using a 2-input Exclusive-OR
(Ex-OR) gate.
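The Exclusive-OR form of the characteristic equation makes the divide-by-two action easy to verify. A sketch with T held HIGH (names are illustrative):

```python
def t_next(t, q):
    # Characteristic equation: Q+1 = T XOR Q
    return t ^ q

q = 0
trace = []
for _ in range(4):        # four active clock edges with T = 1
    q = t_next(1, q)
    trace.append(q)
print(trace)              # [1, 0, 1, 0] -- Q toggles on every edge,
                          # i.e. Q runs at half the clock frequency
```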
5.3.5 Master-Slave Flip-flop
The master-slave flip-flop eliminates all the timing problems by using two SR flip-flops
connected together in a series configuration. One flip-flop acts as the “Master” circuit, which
triggers on the leading edge of the clock pulse while the other acts as the “Slave” circuit, which
triggers on the falling edge of the clock pulse. This results in the two sections, the master section
and the slave section being enabled during opposite half-cycles of the clock signal.
Master-Slave JK Flip-flop
The Master-Slave Flip-Flop is basically two gated SR flip-flops connected together in a series
configuration with the slave having an inverted clock pulse. The outputs from Q and Q from the
“Slave” flip-flop are fed back to the inputs of the “Master” with the outputs of the “Master” flip
flop being connected to the two inputs of the “Slave” flip flop. This feedback configuration from
the slave’s output to the master’s input gives the characteristic toggle of the JK flip flop as
shown below.
The input signals J and K are connected to the gated “master” SR flip flop which “locks” the input
condition while the clock (Clk) input is “HIGH” at logic level “1”. As the clock input of the “slave” flip
flop is the inverse (complement) of the “master” clock input, the “slave” SR flip flop does not toggle. The
outputs from the “master” flip flop are only “seen” by the gated “slave” flip flop when the clock input
goes “LOW” to logic level “0”.
When the clock is “LOW”, the outputs from the “master” flip flop are latched and any additional changes


to its inputs are ignored. The gated “slave” flip flop now responds to the state of its inputs passed over by
the “master” section.

Figure 5.29: Logic Diagram

Then on the “Low-to-High” transition of the clock pulse the inputs of the “master” flip flop are
fed through to the gated inputs of the “slave” flip flop and on the “High-to-Low” transition the
same inputs are reflected on the output of the “slave” making this type of flip flop edge or pulse-
triggered.
Then, the circuit accepts input data when the clock signal is “HIGH”, and passes the data to the
output on the falling-edge of the clock signal. In other words, the Master-Slave JK Flip flop is a
“Synchronous” device as it only passes data with the timing of the clock signal.
The Master-Slave D Flip Flop Circuit

Figure 5.30: Master Slave D Flip Flop Circuit

We can see from above that on the leading edge of the clock pulse the master flip-flop will be
loading data from the data D input, therefore the master is “ON”. With the trailing edge of the
clock pulse the slave flip-flop is loading data, i.e. the slave is “ON”. Then there will always be
one flip-flop “ON” and the other “OFF” but never both the master and slave “ON” at the same
time. Therefore, the output Q acquires the value of D only when one complete pulse, i.e., 0-1-0, is applied to the clock input.
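This ON/OFF alternation of the two stages can be sketched as a small behavioural model (assuming the master is transparent while CLK is HIGH and the slave copies it when CLK goes LOW; class and method names are illustrative):

```python
class MasterSlaveD:
    """Master follows D while CLK = 1; slave (output Q) updates on CLK = 0."""
    def __init__(self):
        self.master = 0   # output of the master latch
        self.q = 0        # output of the slave latch (the visible Q)

    def drive(self, clk, d):
        if clk:
            self.master = d       # master ON, slave holds
        else:
            self.q = self.master  # slave ON, master holds
        return self.q

ff = MasterSlaveD()
print(ff.drive(1, 1))   # CLK HIGH: master latches D = 1, but Q is still 0
print(ff.drive(0, 0))   # CLK LOW: slave copies the master, Q becomes 1
```

Only the complete 1-0 pulse moves the data from D to Q, matching the description above.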
5.4 Registers
A Register is a collection of flip flops. A flip flop is used to store single bit digital data. For
storing a large number of bits, the storage capacity is increased by grouping more than one flip


flops. If we want to store an n-bit word, we have to use an n-bit register containing n number of
flip flops.
Registers are used to perform different types of operations, and the CPU uses them while performing those operations. The inputs fed to the system are stored in registers, and the results returned by the system are stored in registers as well.
5.4.1 Shift Register
The Shift Register is another type of sequential logic circuit that can be used for the storage or
the transfer of binary data.
This sequential device loads the data present on its inputs and then moves or “shifts” it to its
output once every clock cycle, hence the name Shift Register.
A shift register basically consists of several single bit “D-Type Data Latches”, one for each data
bit, either a logic “0” or a “1”, connected together in a serial type daisy-chain arrangement so
that the output from one data latch becomes the input of the next latch and so on.
Data bits may be fed in or out of a shift register serially, that is one after the other from either the
left or the right direction, or all together at the same time in a parallel configuration.
The number of individual data latches required to make up a single Shift Register device is
usually determined by the number of bits to be stored with the most common being 8-bits (one
byte) wide constructed from eight individual data latches.
Shift Registers are used for data storage or for the movement of data and are therefore commonly
used inside calculators or computers to store data such as two binary numbers before they are
added together, or to convert the data from either a serial to parallel or parallel to serial format.
The individual data latches that make up a single shift register are all driven by a common clock
( Clk ) signal making them synchronous devices.
Shift register IC’s are generally provided with a clear or reset connection so that they can be
“SET” or “RESET” as required. Generally, shift registers operate in one of four different modes
with the basic movement of data through a shift register being:
Serial-in to Parallel-out (SIPO) - the register is loaded with serial data, one bit at a time, with
the stored data being available at the output in parallel form.
Serial-in to Serial-out (SISO) - the data is shifted serially “IN” and “OUT” of the register, one
bit at a time in either a left or right direction under clock control.
Parallel-in to Serial-out (PISO) - the parallel data is loaded into the register simultaneously and
is shifted out of the register serially one bit at a time under clock control.
Parallel-in to Parallel-out (PIPO) - the parallel data is loaded simultaneously into the register,
and transferred together to their respective outputs by the same clock pulse.
The effect of data movement from left to right through a shift register can be presented


graphically as:

Figure 5.31: Shift Register

Also, the directional movement of the data through a shift register can be either to the left, (left
shifting) to the right, (right shifting) left-in but right-out, (rotation) or both left and right shifting
within the same register, thereby making it bidirectional. Here it is assumed that all the data shifts to the right (right shifting).
Serial-in to Parallel-out (SIPO) Shift Register

Figure 5.32 4-bit Serial-in to Parallel-out Shift Register

Let’s assume that all the flip-flops (FFA to FFD) have just been RESET (CLEAR input) and that all the outputs QA to QD are at logic level “0”, i.e., no parallel data output.
If a logic “1” is connected to the DATA input pin of FFA then on the first clock pulse the output
of FFA and therefore the resulting QA will be set HIGH to logic “1” with all the other outputs
still remaining LOW at logic “0”. Assume now that the DATA input pin of FFA has returned
LOW again to logic “0” giving us one data pulse or 0-1-0.
The second clock pulse will change the output of FFA to logic “0” and the output of FFB, QB, HIGH to logic “1”, as its input D has the logic “1” level on it from QA. The logic “1” has now moved, or been “shifted”, one place along the register to the right and is now at QB.
When the third clock pulse arrives this logic “1” value moves to the output of FFC ( QC ) and so
on until the arrival of the fifth clock pulse which sets all the outputs QA to QD back again to logic
level “0” because the input to FFA has remained constant at logic level “0”.
The effect of each clock pulse is to shift the data contents of each stage one place to the right,
and this is shown in the following table until the complete data value of 0-0-0-1 is stored in the
register. This data value can now be read directly from the outputs of QA to QD.
Then the data has been converted from a serial data input signal to a parallel data output. The
truth table and following waveforms show the propagation of the logic “1” through the register
from left to right as follows.


Basic Data Movement Through A Shift Register
Clock Pulse No QA QB QC QD
0 0 0 0 0
1 1 0 0 0
2 0 1 0 0
3 0 0 1 0
4 0 0 0 1
5 0 0 0 0
Figure 5.33:Truth Table

Figure 5.34: Waveform

Note that after the fourth clock pulse has ended the 4-bits of data ( 0-0-0-1 ) are stored in the
register and will remain there provided clocking of the register has stopped. In practice the input
data to the register may consist of various combinations of logic “1” and “0”. Commonly
available SIPO ICs include the standard 8-bit 74LS164 or the 74LS594.
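The shift action in the truth table above can be sketched in a few lines of Python (a minimal behavioural model of Figure 5.32, not a model of any particular IC):

```python
# Simulation sketch of the 4-bit SIPO register of Figure 5.32.
# On each clock pulse the serial DATA input enters FFA (QA) and every
# stored bit moves one stage to the right (QA -> QB -> QC -> QD).

def clock_sipo(state, data_in):
    """Return (QA, QB, QC, QD) after one clock pulse."""
    qa, qb, qc, qd = state
    return (data_in, qa, qb, qc)

state = (0, 0, 0, 0)                       # all flip-flops just RESET
for pulse, bit in enumerate([1, 0, 0, 0, 0], start=1):
    state = clock_sipo(state, bit)
    print(f"Clock {pulse}: QA QB QC QD = {state}")
```

Running this reproduces the table row by row: (1,0,0,0), (0,1,0,0), (0,0,1,0), (0,0,0,1) and finally (0,0,0,0) on the fifth pulse.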
Serial-in to Serial-out (SISO) Shift Register
This shift register is very similar to the SIPO above, except that where before the data was read
directly in a parallel form from the outputs QA to QD, this time the data is allowed to flow
straight through the register and out of the other end. Since there is only one output,
the DATA leaves the shift register one bit at a time in a serial pattern, hence the name Serial-in
to Serial-Out Shift Register or SISO.
The SISO shift register is one of the simplest of the four configurations as it has only three
connections, the serial input (SI) which determines what enters the left hand flip-flop, the serial
output (SO) which is taken from the output of the right hand flip-flop and the sequencing clock
signal (Clk). The logic circuit diagram below shows a generalized serial-in serial-out shift
register.


Figure 5.35: 4-bit Serial-in to Serial-out Shift Register

You may wonder what the point of a SISO shift register is if the output data is exactly the same as
the input data. Well, this type of shift register also acts as a temporary storage device or it can
act as a time delay device for the data, with the amount of time delay being controlled by the
number of stages in the register, 4, 8, 16 etc or by varying the application of the clock pulses.
Commonly available ICs include the 74HC595 8-bit Serial-in to Serial-out Shift Register, which
has 3-state outputs.
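The time-delay behaviour can be sketched as follows (a minimal Python model assuming a register of `stages` D-type flip-flops; the function name is illustrative only):

```python
# A serial bit entering an n-stage SISO register reappears at the
# serial output n clock pulses later, so the register acts as a
# clock-controlled time delay.

def siso_stream(bits, stages=4):
    reg = [0] * stages          # the flip-flop chain, all cleared
    out = []
    for b in bits:              # one clock pulse per input bit
        reg = [b] + reg[:-1]    # shift the whole chain one place right
        out.append(reg[-1])     # serial output is the last stage
    return out

# the word 1,0,1,1 emerges four clock pulses after it starts entering
print(siso_stream([1, 0, 1, 1, 0, 0, 0, 0]))   # -> [0, 0, 0, 1, 0, 1, 1, 0]
```

Increasing `stages` lengthens the delay, exactly as varying the number of register stages does in hardware.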
Parallel-in to Serial-out (PISO) Shift Register
The Parallel-in to Serial-out shift register acts in the opposite way to the serial-in to parallel-out
one above. The data is loaded into the register in a parallel format in which all the data bits enter
their inputs simultaneously, to the parallel input pins PA to PD of the register. The data is then
read out sequentially in the normal shift-right mode from the register at Q representing the data
present at PA to PD.
This data is outputted one bit at a time on each clock cycle in a serial format. It is important to
note that with this type of data register a clock pulse is not required to parallel load the register as
it is already present, but four clock pulses are required to unload the data.

Figure 5.36: 4-bit Parallel-in to Serial-out Shift Register

As this type of shift register converts parallel data, such as an 8-bit data word into serial format,
it can be used to multiplex many different input lines into a single serial DATA stream which
can be sent directly to a computer or transmitted over a communications line. Commonly
available IC’s include the 74HC166 8-bit Parallel-in/Serial-out Shift Registers.
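The load-then-shift behaviour can be sketched behaviourally (a toy Python model of the PISO idea, not of the 74HC166 itself):

```python
# Sketch of the PISO register: the 4-bit word is loaded in parallel
# without needing a clock pulse, then four clock pulses shift it out
# one bit at a time from the serial output Q.

class Piso:
    def __init__(self):
        self.reg = [0, 0, 0, 0]           # the PA..PD stages

    def load(self, word):
        self.reg = list(word)             # parallel load, no clock needed

    def clock(self):
        bit = self.reg.pop()              # rightmost bit leaves at Q
        self.reg.insert(0, 0)             # vacated stage fills with 0
        return bit

p = Piso()
p.load([1, 0, 1, 1])                      # PA=1, PB=0, PC=1, PD=1
print([p.clock() for _ in range(4)])      # -> [1, 1, 0, 1]
```

Note how the load is a single assignment but unloading takes one `clock()` call per bit, mirroring the one-load/four-unload pulse count described above.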
Parallel-in to Parallel-out (PIPO) Shift Register
The final mode of operation is the Parallel-in to Parallel-out Shift Register. This type of shift
register also acts as a temporary storage device or as a time delay device similar to the SISO
configuration above. The data is presented in a parallel format to the parallel input
pins PA to PD and then transferred together directly to their respective output pins QA to QD by
the same clock pulse. Then one clock pulse loads and unloads the register. This arrangement for
parallel loading and unloading is shown below.

Figure 5.37: 4-bit Parallel-in to Parallel-out Shift Register

The PIPO shift register is the simplest of the four configurations as it has only three connections,
the parallel input (PI) which determines what enters the flip-flop, the parallel output (PO) and the
sequencing clock signal (Clk).
Similar to the Serial-in to Serial-out shift register, this type of register also acts as a temporary
storage device or as a time delay device, with the amount of time delay being varied by the
frequency of the clock pulses. Also, in this type of register there are no interconnections between
the individual flip-flops since no serial shifting of the data is required.
5.5 MEMORIES
A memory unit is a collection of storage cells together with associated circuits needed to
transform information in and out of the device.
Sequential logic differs from combinational logic in that the output of the logic device is
dependent not only on the present inputs to the device, but also on past inputs; i.e., the output of
a sequential logic device depends on its present internal state and the present inputs. This
implies that a sequential logic device has some kind of memory of at least part of its “history”
(i.e. its previous inputs).
A simple memory device can be constructed from combinational devices with which we are
already familiar. By a memory device, we mean a device that can remember if a signal of logic
level 0 or 1 has been connected to one of its inputs, and can make this fact available at an
output. A very simple, but still useful, memory device can be constructed from a simple OR
gate, as shown in Figure 5.38.

Figure 5.38 Sequential Digital Circuits

In this memory device, if A and Q are initially at logic 0, then Q remains at logic 0.
However if the single input A ever becomes a logic 1, then the output Q will be logic 1 ever
after, regardless of any further changes in the input at A. In this simple memory, the output is
a function of the state of the memory element only; after the memory is “written” then it
cannot be changed back. However, it can be “read”. Such a device could be used as a simple
read only memory, which could be “programmed” only once. Often a state table or timing
diagram is used to describe the behaviour of a sequential device.
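The write-once behaviour of Figure 5.38 can be sketched directly (a toy Python model of the OR gate with its output fed back to one input):

```python
# Once A has been 1, the fed-back output holds Q at 1 forever:
# Q(next) = A OR Q, so the device "remembers" that a 1 occurred.

def or_memory():
    q = 0                       # initial state: Q = 0
    def step(a):
        nonlocal q
        q = q | a               # OR gate with feedback from Q
        return q
    return step

mem = or_memory()
print([mem(a) for a in [0, 0, 1, 0, 0]])   # -> [0, 0, 1, 1, 1]
```

The output trace shows exactly the state-table behaviour: once written, Q stays at 1 regardless of later inputs.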

Figure 5.39: Sequential Circuit Timing Diagram

Figure 5.39 shows both a state table and a timing diagram for this simple memory. The state
table shows the state which the device enters after an input (the “next state”), for all possible
states and inputs. For this device, the output is the value stored in the memory.
In particular, it is possible to inadvertently construct devices for which the output is not
determined by the inputs, and for which it is not possible to predict the output.
Semiconductor Memory
Semiconductor memory is used in any electronics assembly that uses computer processing
technology. Semiconductor memory is the essential electronics component needed for any
computer based PCB assembly.
In addition to this, memory cards have become commonplace items for temporarily storing data -
everything from the portable flash memory cards used for transferring files, to semiconductor
memory cards used in cameras, mobile phones and the like.
The use of semiconductor memory has grown, and the size of these memory cards has increased
as larger and larger amounts of storage are needed.
To meet the growing needs for semiconductor memory, there are many types and technologies
that are used. As the demand grows new memory technologies are being introduced and the
existing types and technologies are being further developed.
A variety of different memory technologies are available - each one suited to different
applications. Names such as ROM, RAM, EPROM, EEPROM, Flash memory, DRAM, SRAM,
SDRAM, as well as F-RAM and MRAM are available, and new types are being developed to
enable improved performance.
Terms like DDR3, DDR4, DDR5 and many more are seen and these refer to different types of
SDRAM semiconductor memory.
In addition to this the semiconductor devices are available in many forms - ICs for printed board
assembly, USB memory cards, Compact Flash cards, SD memory cards and even solid state hard
drives. Semiconductor memory is even incorporated into many microprocessor chips as on-board
memory.
Types of Semiconductor Memory
Electronic semiconductor memory technology can be split into two main types or categories,
according to the way in which the memory operates:
Random Access Memory (RAM): It is a type of memory in which all addresses are accessible
in an equal amount of time and can be selected in any order for a read or write operation. RAM
holds the data and program files that the processor needs immediate access to, and it is also
known as “Primary Memory”, “Main Memory” or “Internal Memory”. RAM is a hardware part
of the computer, mounted on the motherboard. All RAMs have both read and write capability.
Because RAMs lose stored data when the power is turned off, they are volatile memories.
There are two important memory devices in the RAM family: SRAM and DRAM. The main
difference between them is the duration of the data stored. Static RAM (SRAM) retains its
contents as long as electrical power is applied to the chip. However, if the power is turned off or
lost, even temporarily, its contents will be lost. Dynamic RAM (DRAM), on the other hand,
has an extremely short data lifetime, usually less than a quarter of a second. This is true even
when power is applied continuously.
SRAM: Full form of SRAM is “Static Random Access Memory”
Static RAM retains stored information as long as the power supply is on; because the data is lost
when power is removed, it is a volatile memory. The main use of static RAM is to build cache
memory. It is more expensive and consumes more power than dynamic RAM, but it is also
faster. In SRAM, each bit of data is stored in a flip-flop. A flip-flop uses 4-6 transistors to make
a memory cell, and its circuit does not need to be refreshed continuously.
As long as dc power is applied to a static memory cell, it can retain a 1 or 0 state indefinitely. If
power is removed, the stored data bit is lost. The cell is selected by an active level on the Select
line and a data bit (1 or 0) is written into the cell by placing it on the Data in line. A data bit is
read by taking it off the Data out line.
a).Types of SRAM: There are different types of SRAM, such as –
Non-Volatile SRAM: It retains stored information even when the power supply is turned off.
Pseudo SRAM: It uses a self-refresh circuit, but it is slower than true static RAM.
b). Basic SRAM Organization: Basic Static Memory Cell Array
The memory cells in a SRAM are organized in rows and columns. All the cells in a row share
the same Row Select line. Each set of Data in and Data out lines go to each cell in a given
column and are connected to a single data line that serves as both an input and output (Data I/O)
through the data input and data output buffers.


SRAM chips can be organized in single bits, nibbles (4 bits), bytes (8 bits), or multiple bytes
(16, 24, 32 bits, etc.). The memory cell array is arranged in 256 rows and 128 columns, each
with 8 bits as shown below. There are actually 2^15 = 32,768 addresses and each address
contains 8 bits. The capacity of this example memory is 32,768 bytes (typically expressed as 32
Kbytes).
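The 32 Kbyte figure follows directly from the array geometry, as this small check shows:

```python
# 256 rows x 128 columns of 8-bit cells gives 2**15 = 32,768 addresses,
# each holding one byte, i.e. a 32 Kbyte memory.

rows, columns = 256, 128
addresses = rows * columns                 # one address per 8-bit cell
assert addresses == 2**15 == 32768

capacity_kbytes = addresses // 1024        # each address holds 1 byte
print(addresses, capacity_kbytes)          # -> 32768 32
```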

Figure.5.40:Memory array configuration

c). Operation: An SRAM cell is designed with two inverters, which are cross-coupled to form a
latch. The latch is connected to two bit lines through two transistors T1 and T2. Both transistors
open or close under the control of the word line, and this entire process is directed by the
address decoder. When the word line goes to ground level, both transistors turn off and the
latch retains its own state.

Figure.5.41: SRAM cell design using inverters

i). Read Operation: Both transistors T1 and T2 are closed when the word line is activated. When
the cell is in state 1, the signal on bit line b is high and the signal on bit line b′ is low; the
opposite is true when the cell is in state 0, so b and b′ are always complements of each other.
The Sense/Write circuit connected at the end of the two bit lines monitors their states and
produces the corresponding output.
ii). Write Operation: In the write operation, the Sense/Write circuit drives bit line b and its
complement b′ to the appropriate values, and then activates the word line to store the value in
the cell.
DRAM: Full form of DRAM is “Dynamic Random Access Memory”
DRAM is another type of semiconductor memory, designed to store the data and program files
that computer processors need to perform their functions.
In DRAM, a capacitor is used to store each bit of data. This is a very simple way to save data,
because it needs a smaller area to store the same data than SRAM does, so DRAM can store
much more data than SRAM in the same space. However, its circuit must be refreshed
frequently to maintain the charge, so it consumes more power than SRAM.
a).Types of DRAM:
SDRAM: It stands for “Synchronous Dynamic Random Access Memory”, and it can access any
element of data within about 10 to 25 nanoseconds. SDRAM is used in DIMMs (dual in-line
memory modules) with 168 contacts, in which all data is stored in capacitors within integrated
circuits (ICs); the module is inserted into its specific slot on the motherboard.
ADRAM: Asynchronous DRAM is the basic form of DRAM, with separate connections for
power, address inputs, and bidirectional data lines. The timing of the memory device is
controlled asynchronously, with a memory controller circuit generating the necessary control
signals.
b). Operation: DRAM was invented by Robert Dennard in 1966 at IBM. A DRAM storage cell
uses two elements: a transistor and a capacitor. The transistor is used to charge or discharge the
capacitor. A fully charged capacitor represents logic high or “1”; a discharged capacitor
represents logic low or “0”. All charging and discharging operations are performed through the
word line and bit line, as shown in the figure.

Figure.5.42: Operation of DRAM

i). Write Operation: In this operation, a voltage is applied to the bit line, and a signal applied to
the address line closes the transistor so that the capacitor is charged or discharged to the new
value.
ii). Read Operation: To read the information stored in a cell, the transistor is turned on and the
charge on the capacitor is transferred to the bit line, where it is sensed. Because reading
discharges the capacitor, the information must be written back (refreshed) after each read.
Read Only Memory (ROM): A ROM contains permanently or semi-permanently stored data,
which can be read from the memory but either cannot be changed at all or cannot be changed
without specialized equipment. A ROM stores data that are used repeatedly in system
applications, such as tables, conversions, or programmed instructions for system initialization
and operation. ROMs retain stored data when the power is OFF and are therefore nonvolatile
memories.

The ROMs are classified as follows:


i. Masked ROM (ROM)
ii. Programmed ROM (PROM)
iii. Erasable PROM (EPROM)
iv. Electrically Erasable PROM (EEPROM)
i. Masked ROM
The mask ROM is usually referred to simply as a ROM. It is permanently programmed during
the manufacturing process to provide widely used standard functions, such as popular
conversions, or to provide user-specified functions. Once the memory is programmed, it cannot
be changed.
Most IC ROMs utilize the presence or absence of a transistor connection at a row/column
junction to represent a 1 or a 0. The presence of a connection from a row line to the gate of a
transistor represents a 1 at that location because, when the row line is taken HIGH, all
transistors with a gate connection to that row line turn on and connect the HIGH (1) to the
associated column lines. At row/column junctions where there are no gate connections, the
column lines remain LOW (0) when the row is addressed.
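The presence-or-absence scheme can be modelled as a fixed table (a hypothetical 4×4 example for illustration, not a real device):

```python
# A 1 marks a row/column junction where a transistor gate is connected;
# taking a row line HIGH drives exactly those columns HIGH, so the
# stored word is simply the pattern of connections at that row.

ROM_CELLS = [
    [1, 0, 1, 0],   # word permanently stored at row 0
    [0, 1, 1, 1],   # word permanently stored at row 1
    [0, 0, 0, 1],   # word permanently stored at row 2
    [1, 1, 0, 0],   # word permanently stored at row 3
]

def read_row(row):
    # only columns with a gate connection on this row read as 1
    return ROM_CELLS[row]

print(read_row(1))   # -> [0, 1, 1, 1]
```

There is no write operation at all, which is the defining property of a mask ROM.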

Figure.5.43.ROM Cells

ii. PROM (Programmable Read-Only Memory)


PROMs (Programmable Read-Only Memories) come from the manufacturer unprogrammed
and are custom programmed in the field to meet the user's needs. A PROM uses some type of
fusing process to store bits, in which a memory link is burned open or left intact to represent a
0 or a 1. The fusing process is irreversible; once a PROM is programmed, it cannot be
changed.
The fusible links are manufactured into the PROM between the source of each cell's transistor
and its column line. In the programming process, a sufficient current is injected through the
fusible link to burn it open to create a stored 0. The link is left intact for a stored 1. All drains are
commonly connected to VDD.
iii. EPROM (Erasable Programmable ROM)
An EPROM is an erasable PROM. Unlike an ordinary PROM, an EPROM can be reprogrammed
if an existing program in the memory array is erased first. An EPROM uses an NMOSFET array
with an isolated-gate structure. The isolated transistor gate has no electrical connections and can
store an electrical charge for indefinite periods of time. The data bits in this type of array are
represented by the presence or absence of a stored gate charge. Erasure of a data bit is a process
that removes the gate charge.
Two basic types of erasable PROMs are the ultraviolet erasable PROM (UV EPROM) and the
electrically erasable PROM (EEPROM).
iv. EEPROM (Electrically Erasable Programmable ROM)
EEPROM is programmed and erased electrically. It can be erased and reprogrammed about ten
thousand times. Both erasing and programming take about 4 to 10 ms (millisecond). In
EEPROM, any location can be selectively erased and programmed. EEPROMs can be erased
one byte at a time, rather than erasing the entire chip. Hence, the process of reprogramming is
flexible but slow.
5.6 COUNTERS
A counter is a sequential circuit that cycles through a prescribed sequence of states upon the
application of input pulses. The state of the counter is stored in Flip-Flops. An n-bit counter
has n Flip-Flops and can cycle through at most 2^n states.
5.6.1 Asynchronous Counter
Asynchronous Counters use flip-flops which are serially connected together so that the input
clock pulse appears to ripple through the counter.
An Asynchronous counter can have 2^n possible counting states, e.g. MOD-16 for a 4-bit
counter (0-15), making it ideal for use in Frequency Division applications. But it is also possible
to use the basic asynchronous counter configuration to construct special counters with counting
states less than their maximum output number. For example, modulo or MOD counters.
This is achieved by forcing the counter to reset itself to zero at a pre-determined value producing
a type of asynchronous counter that has truncated sequences. Then an n-bit counter that counts
up to its maximum modulus (2^n) is called a full sequence counter and an n-bit counter whose
modulus is less than the maximum possible is called a truncated counter.
But why would we want to create an asynchronous truncated counter that is not a MOD-4,
MOD-8, or some other modulus equal to a power of two? The answer is that we can, by
using combinational logic to take advantage of the asynchronous inputs on the flip-flop.
If we take the modulo-16 asynchronous counter and modify it with additional logic gates, it can
be made to give a decade (divide-by-10) counter output for use in standard decimal counting and
arithmetic circuits.
Such counters are generally referred to as Decade Counters. A decade counter requires resetting
to zero when the output count reaches the decimal value of 10, i.e. when DCBA = 1010, and to do
this we need to feed this condition back to the reset input. A counter with a count sequence from
binary “0000” (BCD = “0”) through to “1001” (BCD = “9”) is generally referred to as a BCD
binary-coded-decimal counter because its ten state sequence is that of a BCD code but binary
decade counters are more common.
Asynchronous Decade Counter

Figure 5.44: Decade Counter

This type of asynchronous counter counts upwards on each trailing edge of the input clock signal
starting from 0000 until it reaches an output 1001 (decimal 9). Both outputs QA and QD are now
equal to logic “1”. On the application of the next clock pulse, the output from the 74LS10
NAND gate changes state from logic “1” to a logic “0” level.
As the output of the NAND gate is connected to the CLEAR ( CLR ) inputs of all the 74LS73 J-
K Flip-flops, this signal causes all of the Q outputs to be reset back to binary 0000 on the count
of 10. As outputs QA and QD are now both equal to logic “0” as the flip-flop’s have just been
reset, the output of the NAND gate returns back to a logic level “1” and the counter restarts again
from 0000. We now have a decade or Modulo-10 up-counter.
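The reset-at-1010 behaviour can be sketched as follows (a behavioural Python model of the NAND feedback, not a gate-level one; the function name is illustrative):

```python
# The counter counts normally until its outputs momentarily reach
# DCBA = 1010 (decimal 10); the NAND gate, fed from the two outputs
# that are HIGH in 1010, then drives CLEAR and the count snaps back
# to 0000, giving a MOD-10 sequence.

def decade_count(pulses):
    count, seq = 0, []
    for _ in range(pulses):
        count += 1                          # ripple count on clock edge
        qd = (count >> 3) & 1               # MSB output
        qb = (count >> 1) & 1
        if qd and qb:                       # detects the 1010 pattern
            count = 0                       # CLEAR all flip-flops
        seq.append(count)
    return seq

print(decade_count(12))   # -> [1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2]
```

The state 1010 never appears in the output sequence because it is cleared within the same pulse, exactly as in the truncated hardware counter.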
Decade Counter Truth Table
Clock Count   QD QC QB QA   Decimal Value
1             0  0  0  0    0
2             0  0  0  1    1
3             0  0  1  0    2
4             0  0  1  1    3
5             0  1  0  0    4
6             0  1  0  1    5
7             0  1  1  0    6
8             0  1  1  1    7
9             1  0  0  0    8
10            1  0  0  1    9
11            Counter Resets its Outputs back to Zero
Figure 5.45: Truth Table


Decade Counter Timing Diagram

Figure 5.46: Timing Diagram

By using the same idea of truncating counter output sequences, the above circuit could easily be
adapted to other counting cycles by simply changing the connections to the inputs of
the NAND gate or by using other logic gate combinations.
5.6.2 Synchronous Counter
Synchronous Counters are so called because the clock inputs of all the individual flip-flops
within the counter are clocked together at the same time by the same clock signal.
In the previous Asynchronous binary counter tutorial, we saw that the output of one counter
stage is connected directly to the clock input of the next counter stage and so on along the chain.
The result of this is that the Asynchronous counter suffers from what is known as “Propagation
Delay” in which the timing signal is delayed a fraction through each flip-flop.
However, with the Synchronous Counter, the external clock signal is connected to the clock
input of EVERY individual flip-flop within the counter so that all of the flip-flops are clocked
together simultaneously (in parallel) at the same time giving a fixed time relationship. In other
words, changes in the output occur in “synchronisation” with the clock signal.
The result of this synchronisation is that all the individual output bits change state at exactly
the same time in response to the common clock signal, with no ripple effect and therefore no
propagation delay.

Figure 5.47: Binary 4-bit Synchronous Up Counter

It can be seen above that the external clock pulses (pulses to be counted) are fed directly to each
of the JK Flip-flops in the counter chain and that both the J and K inputs are all tied together in
toggle mode, but only in the first flip-flop, flip-flop FFA (LSB) are they connected HIGH, logic
“1” allowing the flip-flop to toggle on every clock pulse. Then the synchronous counter follows a
predetermined sequence of states in response to the common clock signal, advancing one state
for each pulse.
The J and K inputs of flip-flop FFB are connected directly to the output QA of flip-flop FFA, but
the J and K inputs of flip-flops FFC and FFD are driven from separate AND gates which are also
supplied with signals from the input and output of the previous stage. These
additional AND gates generate the required logic for the JK inputs of the next stage.
If we enable each JK flip-flop to toggle based on whether or not all preceding flip-flop outputs
(Q) are “HIGH” we can obtain the same counting sequence as with the asynchronous circuit but
without the ripple effect, since each flip-flop in this circuit will be clocked at exactly the same
time.
Then as there is no inherent propagation delay in synchronous counters, because all the counter
stages are triggered in parallel at the same time, the maximum operating frequency of this type of
frequency counter is much higher than that for a similar asynchronous counter circuit.
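The AND-gate enable scheme described above can be sketched behaviourally (a minimal Python model with outputs listed LSB first; the function name is illustrative):

```python
# Every flip-flop sees the same clock; stage n toggles only when all
# lower-order outputs are 1, which is exactly the condition the AND
# gates feed to each stage's J,K input pair.

def sync_up_step(q):                # q = [QA, QB, QC, QD], LSB first
    enable = 1                      # FFA's J and K are tied HIGH
    nxt = []
    for bit in q:
        nxt.append(bit ^ enable)    # JK flip-flop in toggle mode
        enable &= bit               # AND of all preceding outputs
    return nxt

q = [0, 0, 0, 0]
for _ in range(5):                  # apply five clock pulses
    q = sync_up_step(q)
print(q)                            # -> [1, 0, 1, 0]  (0101 = decimal 5)
```

Because all stages are computed from the state before the clock edge, every output changes simultaneously: there is no ripple.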

Figure 5.48: 4-bit Synchronous Counter Waveform Timing Diagram

Because this 4-bit synchronous counter counts sequentially on every clock pulse the resulting
outputs count upwards from 0 ( 0000 ) to 15 ( 1111 ). Therefore, this type of counter is also
known as a 4-bit Synchronous Up Counter.
However, we can easily construct a 4-bit Synchronous Down Counter by connecting
the AND gates to the Q output of the flip-flops as shown to produce a waveform timing diagram
the reverse of the above. Here the counter starts with all of its outputs HIGH ( 1111 ) and it
counts down on the application of each clock pulse to zero, ( 0000 ) before repeating again.
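Under the same modelling assumptions, moving the enables to the complemented outputs gives the down count:

```python
# Taking each AND-gate input from the inverted output Q' instead of Q
# makes a stage toggle when all lower-order outputs are 0, so the
# count decreases by one on every clock pulse.

def sync_down_step(q):              # q = [QA, QB, QC, QD], LSB first
    enable = 1
    nxt = []
    for bit in q:
        nxt.append(bit ^ enable)    # toggle while the borrow propagates
        enable &= bit ^ 1           # AND of the complemented outputs
    return nxt

q = [1, 1, 1, 1]                    # counter starts at 1111
for _ in range(3):
    q = sync_down_step(q)
print(q)                            # -> [0, 0, 1, 1]  (1100 = decimal 12)
```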
As synchronous counters are formed by connecting flip-flops together and any number of flip-
flops can be connected or “cascaded” together to form a “divide-by-n” binary counter, the
modulus or “MOD” number still applies as it does for asynchronous counters, so a decade or
BCD counter counting from 0 to 9 can be built, along with other truncated sequences.


All we need to increase the MOD count of an up or down synchronous counter is an additional
flip-flop and AND gate across it.

Figure 5.49: Binary 4-bit Synchronous Down Counter

Decade 4-bit Synchronous Counter


A 4-bit decade synchronous counter can also be built using synchronous binary counters to
produce a count sequence from 0 to 9. A standard binary counter can be converted to a decade
(decimal 10) counter with the aid of some additional logic to implement the desired state
sequence. After reaching the count of “1001”, the counter recycles back to “0000”. We now have
a decade or Modulo-10 counter.

Figure 5.50: Decade 4-bit Synchronous Counter

The additional AND gates detect when the counting sequence reaches “1001” (decimal 9) and
cause flip-flop FF3 to toggle on the next clock pulse. Flip-flop FF0 toggles on every clock
pulse. Thus, the count is reset and starts over again at “0000”, producing a synchronous decade
counter.
We could quite easily re-arrange the additional AND gates in the above counter circuit to
produce other count numbers, such as a MOD-12 counter which counts 12 states from “0000” to
“1011” (0 to 11) and then repeats, making them suitable for clocks, etc.
Triggering A Synchronous Counter
Synchronous Counters use edge-triggered flip-flops that change states on either the “positive-
edge” (rising edge) or the “negative-edge” (falling edge) of the clock pulse on the control input
resulting in one single count when the clock input changes state.
Generally, synchronous counters count on the rising-edge which is the low to high transition of
the clock signal and asynchronous ripple counters count on the falling-edge which is the high to
low transition of the clock signal.

Figure 5.51: Edge Triggered Flip flop

It may seem unusual that ripple counters use the falling-edge of the clock cycle to change state,
but this makes it easier to link counters together because the most significant bit (MSB) of one
counter can drive the clock input of the next.
This works because the next bit must change state when the previous bit changes from high to
low – the point at which a carry must occur to the next bit. Synchronous counters usually have a
carry-out and a carry-in pin for linking counters together without introducing any propagation
delays.
Learning Activity

Differentiate types of memory.

Electronic Flip-Flop


In electronics, a flip-flop or latch is a circuit that has two stable states and can be used to store
state information – a bistable multivibrator. The circuit can be made to change state by signals
applied to one or more control inputs and will have one or two outputs. It is the basic storage
element in sequential logic. Flip-flops and latches are fundamental building blocks of digital
electronics systems used in computers, communications, and many other types of systems.
Flip-flops and latches are used as data storage elements. A flip-flop is a device which stores a
single bit (binary digit) of data; one of its two states represents a "one" and the other
represents a "zero". Such data storage can be used for storage of state, and such a circuit is
described as sequential logic in electronics. When used in a finite-state machine, the output
and next state depend not only on its current input, but also on its current state (and hence,
previous inputs). It can also be used for counting of pulses, and for synchronizing variably-
timed input signals to some reference timing signal.
Flip-flops can be either level-triggered (asynchronous, transparent or opaque) or edge-
triggered (synchronous, or clocked). The term flip-flop has historically referred generically to
both level-triggered and edge-triggered circuits that store a single bit of data using gates.
Recently, some authors reserve the term flip-flop exclusively for discussing clocked circuits;
the simple ones are commonly called transparent latches. Using this terminology, a level-
sensitive flip-flop is called a transparent latch, whereas an edge-triggered flip-flop is simply
called a flip-flop. Using either terminology, the term "flip-flop" refers to a device that stores a
single bit of data, but the term "latch" may also refer to a device that stores any number of bits
of data using a single trigger. The terms "edge-triggered", and "level-triggered" may be used
to avoid ambiguity.
When a level-triggered latch is enabled it becomes transparent, but an edge-triggered flip-
flop's output only changes on a single type (positive going or negative going) of clock edge.



Questions:
1. Discuss the use of electronic flip-flop.
2. What is edge triggered flip-flop?

1. A combinational circuit produces an output based on input variables only,
but a sequential circuit produces an output based on current input and
previous input variables.
2. Counters are used in digital electronics for counting purposes; they can
count specific events happening in the circuit.
SUMMARY
• A sequential circuit consists of combinational circuit to which memory elements
are connected to form a feedback path.
• Asynchronous sequential logic is not synchronised by a clock signal; the outputs of
the circuit change directly in response to changes in inputs.
• A latch is an electronic logic circuit that has two inputs and one output.
• Flip-flop is a circuit that has two stable states and can be used to store state
information.
• A register stores data and moves or shifts data.


• A shift register is a series of flip-flops, sharing the same clock, in which the output of
each flip-flop is connected to the data input of the next flip-flop in the chain.
• A memory device is a device which can remember if a signal of logic level 0 or 1 has
been connected to one of its inputs, and can make this fact available at an output.
• Counter is a sequential circuit (Finite state machine) that cycles through a fixed
sequence of states.
• Asynchronous Counters can easily be made from Toggle or D-type flip-flops.
• Synchronous counters are easier to design than asynchronous counters.
• Synchronous counters are sometimes called parallel counters as the clock is fed in
parallel to all flip-flops.
KEYWORDS
Sequential Logic: A type of logic circuit whose output depends not only on the present
value of its input signals but on the past history of its inputs.
Synchronous Circuit: A circuit that has an electronic oscillator, which is called a clock. It
generates a sequence of repetitive pulses called the clock signal that is distributed to all the
memory elements in the circuit.
Asynchronous Sequential Logic: Logic that is not synchronised by a clock signal; the outputs
of the circuit change directly in response to changes in inputs.
Latch: An electronic logic circuit that has two inputs and one output.
Flip-flop: A circuit that has two stable states and can be used to store state information.
Setup Time: The minimum amount of time the data signal should be held steady before
the clock event.
Hold Time: The minimum amount of time the data signal should be held steady after the
clock event.
Recovery Time: The time between active clock edge and asynchronous signal going
inactive.
Counter: A sequential circuit that cycles through a fixed sequence of states.
Register: Stores and moves or shifts data.
SELF ASSESSMENT QUESTIONS
Short Answer Questions
1. What is a sequential circuit?
2. What is the difference between a latch and a flip-flop?
3. Does a sequential circuit contain memory elements?
4. What is the difference between an asynchronous counter and a synchronous counter?
5. How many types of shift register counters are there? Write their names.
6. Write the characteristic equation of a JK flip-flop.
Long Answer Questions
1. Explain the different types of shift register.
2. Explain the operation of D-FF.


3. Describe the working of master slave flip-flop.


4. Explain the different ways to construct a basic flip-flop.
5. Explain in detail about SR Latch.
FURTHER READING
1. Mano, M. Morris. 2007. “Digital Design”. Englewood Cliffs, N.J.: Prentice-Hall.
2. Karim, Mohammad A., and Xinghao Chen. 2007. “Digital Design: Basic Concepts and Principles”.
3. Gothmann, William H. 1977. “Digital Electronics: An Introduction to Theory and Practice”.
4. Harris, David. 2012. “Digital Design and Computer Architecture”. Elsevier.
5. Mano, Morris. 2012. “Logic and Computer Design Fundamentals”. Cram101.


UNIT III- Processor Fundamentals


Chapter 6- Processor Basics

CONTENTS
Learning Objectives
Learning Outcome
Overview
6.1 Von Neumann Architecture
6.2 Processor
6.2.1 Definition
6.2.2 Structure
6.2.3 Category
6.2.4 Technology
6.3 ALU Concept
6.3.1 1-Bit ALU
6.4 Stored Program
6.5 Fetch Execution Cycle
6.6 Instruction Formats
6.6.1 The Opcode Size
6.6.2 Addressing Modes
6.7 Clock Rate and Instruction Rate
6.7.1 Clock Rate
6.7.2 Instruction Rate
6.7.3 Pipeline
6.8 Current Processors
Summary
Keywords
Self-Assessment Questions
Further Reading


LEARNING OBJECTIVES
After studying this chapter, you will be able to:
• Understand the structure of Von Neumann Architecture
• Gain knowledge about concept of processor
• Analyze Arithmetic Logic Unit (ALU)
• Understand the fetch execute cycle
• Know the concept of clock rate and pipelining
LEARNING OUTCOMES
Upon completion of the chapter, students will be able to understand:
• The structure of the Von Neumann architecture
• The definition, structure, and categories of processors
• How to design an ALU and explain floating-point representation
• The concept of the fetch-execute cycle and instruction formats
OVERVIEW
In the previous chapter, we examined various sequential circuits, such as latches, flip-flops,
registers, memories and counters, along with their different types.
In this chapter, you will learn the basics of the processor. First, you will study the
Von Neumann architecture, an early and influential computing design in which a single memory
holds both data and instructions, and the effect of each instruction can depend on the
operations executed before it. This chapter will also cover the definition and structure of
the processor along with the ALU concept, the stored program, the fetch-execute cycle, and so
on.
Fundamentally, the processor is the part of a computer system that executes the instructions
of a computer program. The processor thus lets you perform the essential logical, arithmetic,
and input/output tasks of the system.
6.1 VON NEUMANN ARCHITECTURE
In 1945, the mathematician John von Neumann showed that a computer could have a fixed,
simple structure and still be able to execute any kind of computation without hardware
modification, provided it is properly programmed with the proper instructions.
Von Neumann’s primary advance, referred to as “conditional control transfer”, allowed a
program sequence to be interrupted and then reinitiated at any point; it also allowed data
to be stored with instructions in the same memory unit. This was beneficial because
instructions, when desired, could then be arithmetically modified in the same way as data.
The von Neumann architecture describes a design model for a stored program digital computer


that incorporates a single processing unit and a single, separate storage structure, which
holds both instructions and data.
The von Neumann architecture keeps the data as well as the program instructions in
read-write RAM (Random Access Memory).
Characteristics of von Neumann Architecture
As mentioned above, the von Neumann Architecture is based on the fact that the program data
and the instruction data are stored in the same memory unit. This can also be referred to as the
“stored program concept”.
This design is still used in computers produced today:

Figure 6.1: General Structure of Von Neumann Architecture

Central Processing Unit (CPU):


The CPU is an electronic circuit that executes the instructions of a computer program. The CPU
can also be referred to as a microprocessor or a processor. Within the CPU are the ALU, the
CU, and the registers, which are described in more detail below:
Control Unit:
Controls the operation of the ALU, memory, and input/output, instructing them how to respond
to the instructions from the program it had just read and interpreted from the memory unit. The
control unit directs the operations of the CPU by executing the following jobs:
• Coordinating and controlling activities of the CPU
• Managing data flow between other components and the CPU
• Acknowledging and accepting the next instruction
• Decoding instructions
• Storing the resulting data back into a memory unit
Arithmetic and Logical Unit (ALU):
The ALU allows arithmetic (add, subtract etc) and logic (AND, OR, NOT etc) operations to be


carried out.
Memory Unit:
Consists of RAM, which is partitioned into addresses and their contents, stored in binary
form. RAM (Random Access Memory) is a fast type of memory, unlike hard drives, and is
directly accessible by the CPU. This direct access allows the CPU to function much more
quickly and hence more efficiently.
Registers:
A small block in the CPU consisting of high-speed storage cells that hold data before it is
processed; all logical, arithmetic, and shift operations occur here.
The register set consists of five components:
• Program Counter (PC): Holds the address of the next instruction to be executed
• Accumulator (AC): Storage area where logic and arithmetic results are stored
• Memory Address Register (MAR): Holds the address of a location of the data that is to
be read from or written to
• Memory Data Register (MDR): Temporary storage location that stores data that has
been read, or data that still needs to be written
• Current Instruction Register (CIR): Holds the current instruction while it is being carried
out. The instruction is divided into an operand and an opcode.
Operand: Contains data or the address of the data (where the operation will be performed)
Opcode: Specifies type of instruction to be executed
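The way these registers cooperate during a single instruction fetch can be sketched in a few lines of Python. The memory contents and the text form of the instructions below are illustrative assumptions, not a real instruction set:

```python
# One fetch step through the register set described above.
memory = {0: "LOAD 4", 1: "ADD 5", 2: "STORE 6"}   # address -> instruction (invented)

pc = 0              # Program Counter: address of the next instruction
mar = pc            # address copied into the Memory Address Register
mdr = memory[mar]   # memory responds through the Memory Data Register
cir = mdr           # instruction lands in the Current Instruction Register
pc += 1             # PC now points at the next instruction

opcode, operand = cir.split()   # CIR split into opcode and operand
assert (opcode, operand, pc) == ("LOAD", "4", 1)
```

The same MAR/MDR pair is reused when data (rather than an instruction) is read from or written to memory.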
Buses:
These are sets of parallel wires that connect two or more components inside the CPU.
Within the CPU there are three types of buses, collectively referred to as the system bus:
the data bus, the control bus, and the address bus.
Data bus: This bus is referred to as a bi-directional bus, which means “bits” can be carried in
both ways. This bus is used to transport data and instructions between the processor, memory
unit, and the input/output.
Address bus: Transmits the memory address specifying where the relevant data needs to be sent
or retrieved from. (The address bus does not carry the data, only the address)
Control bus: This is also a bi-directional bus used to transmit “control signals”/commands from
the CPU (and status signals from other components) in order to control and coordinate all the
activities within the computer.
Input/Outputs:
Information passed from the user/information received by the user.
Advantages
• The control unit retrieves instructions and data in the same way from one memory


unit. This simplifies the development and design of the control unit.
• The above advantage also means that data from memory and from devices are
accessed the same way, thereby increasing efficiency.
• Another advantage is that programmers have control over memory
organization.
6.2 PROCESSOR
In this section, we will discuss the definition, structure, categories, and technologies related
to processor.
6.2.1 Definition
In computing, a processor or processing unit is a digital circuit which performs operations on
some external data source, usually memory or some other data stream. It typically takes the form
of a microprocessor, which can be implemented on a single metal–oxide–
semiconductor integrated circuit chip. The processor is a chip or a logical circuit that responds
and processes the basic instructions that drive a particular computer. The main functions of
the processor are fetching, decoding, and executing instructions, and writing back their
results. The processor is also called the brain of any system that incorporates it, such as
computers, laptops, smartphones, and embedded systems. The ALU (Arithmetic Logic Unit) and
the CU (Control Unit) are the two parts of the processor. The Arithmetic Logic Unit performs
mathematical operations such as addition, multiplication, subtraction, and division, while
the control unit works like a traffic policeman, managing the flow of instructions. The
processor also communicates with other components, namely input/output devices and
memory/storage devices.
6.2.2 Structure
The processor alone is incapable of successfully performing any tasks. It requires memory (for
program and data storage), support logic, and at least one I/O device (“input/output device”) used
to transfer data between the computer and the outside world. The basic computer system is
shown in Figure 6.2.

Figure 6.2: Basic computer system


A microprocessor is a processor implemented (usually) on a single, integrated circuit. With the


exception of those found in some large supercomputers, nearly all modern processors are
microprocessors, and the two terms are often used interchangeably. Common microprocessors in
use today are the Intel Pentium series, Freescale/IBM PowerPC, MIPS, ARM, and the Sun
SPARC, among others. A microprocessor is sometimes also known as a CPU (Central
Processing Unit).
A microcontroller is a processor, memory, and some I/O devices contained within a single,
integrated circuit, and intended for use in embedded systems. The buses that interconnect the
processor with its I/O exist within the same integrated circuit. The range of available
microcontrollers is very broad. They range from the tiny PICs and AVRs (to be covered in this
book) to PowerPC processors with built-in I/O, intended for embedded applications. In this book,
we will look at both microprocessors and microcontrollers.
Microcontrollers are very similar to System-on-Chip (SoC) processors, intended for use in
conventional computers such as PCs and workstations. SoC processors have a different suite of
I/O, reflecting their intended application, and are designed to be interfaced to large banks of
external memory. Microcontrollers usually have all their memory on-chip and may provide only
limited support for external memory devices.
The memory of the computer system contains both the instructions that the processor will
execute and the data it will manipulate. The memory of a computer system is never empty. It
always contains something, whether it be instructions, meaningful data, or just the random
garbage that appeared in the memory when the system powered up.
Instructions are read (fetched) from memory, while data is both read from and written to
memory, as shown in Figure 6.3.

Figure 6.3: Data flow


This form of computer architecture is known as a Von Neumann machine, named after John Von
Neumann, one of the originators of the concept. With very few exceptions, nearly all modern
computers follow this form. Von Neumann computers are what can be termed control-flow
computers. The steps taken by the computer are governed by the sequential control of a program.
In other words, the computer follows a step-by-step program that governs its operation.
6.2.3 Category
Based on the overall performance and the type of work designed for, computer processors are


grouped into three main categories. There could be other applicable methods to classify
computer processors.
However, this grouping will help simplify the choice when selecting a processor for both
desktop and mobile computers. Computer processors fall into the following three divisions;
this grouping does not include processors for server and workstation PCs.
1. High-End Processors
Processors in this group are designed for intensive applications, since such programs require
high processing power. What makes them high-end is the advanced microprocessor technology
incorporated in these processors.
Beyond normal use of computer applications, if you are into statistical analysis,
intensive graphics, creating and editing professional videos, extreme 3D gaming, and
multitasking and multi-threaded applications, then you should choose a computer with a
processor from the high-end group.
Both Intel and AMD have introduced processors categorized as high-end. The latest Intel
Core i series processors (i3, i5, i7, i9) and the AMD Phenom family are among the high-end
CPUs. The prices of high-end processors are somewhat higher than those of other CPUs.
2. Mid-End computer processor types
Mid-range processors are meant for moderately intensive tasks. Beyond the standard work you
do with mid-range processors, you can do tasks such as basic 3D gaming, casual photo editing,
home video creation, and multimedia applications.
Some of the common mid-range Intel processors include Intel Core 2 Quad, Intel Core 2
Extreme, Intel Core 2 Duo, Intel Pentium Dual Core and Intel Core Duo/Solo. AMD Phenom 1
X3, AMD Turion family and AMD Athlon family processors are categorized into mid-range
CPUs.
3. Basic or economy computer processor types
As the name implies, processors in this group are low-performance CPUs at a low price. If you
are into non-intensive tasks such as simple gaming, office applications, internet browsing,
email and common graphics, then a processor from the budget group is a suitable choice.
AMD Sempron, AMD Athlon Neo and Intel Atom, Intel Centrino, Centrino Duo and Celeron are
grouped into economy processors.
6.2.4 Technology
In this section, we will discuss technologies related to the processor.
RISC Architecture
Reduced Instruction Set Computer (RISC), is a type of computer architecture which operates on
small, highly optimised set of instructions, instead of a more specialised set of instructions,


which can be found in other types of architectures. This architecture means that the computer
microprocessor will have fewer cycles per instruction.
The term “Reduced Instruction Set” may be incorrectly interpreted to mean a “reduced number
of instructions”. This is not the case; the term actually means that the amount of work
done by each instruction is reduced in terms of the number of cycles.
Alan Turing’s 1946 design for the Automatic Computing Engine had many characteristics that
resembled RISC architecture, and many traits of RISC architectures were seen in 1960s
machines embodying the load/store approach.
That said, the term RISC was first used by David Patterson of the Berkeley RISC project,
who is considered a pioneer in RISC processor design. Patterson is currently the Vice Chair
of the Board of Directors of the RISC-V Foundation.
A RISC chip doesn’t require many transistors, which makes it less costly to design and
produce. One of RISC’s main characteristics is that the instruction set contains relatively
simple and basic instructions from which more complex instructions can be composed.
RISC processors/architectures are used across a wide range of platforms nowadays, ranging from
tablet computers to smartphones, as well as supercomputers.
CISC Architecture
CISC, which stands for “Complex Instruction Set Computer”, is a computer architecture in
which single instructions can execute several low-level operations (for instance, a load
from memory, an arithmetic operation, and a memory store). CISC processors are also capable
of executing multi-step operations or addressing modes within single instructions.
As with RISC, CISC is a type of microprocessor design, one that contains specialised simple
and complex instructions.
Until recent times, all major manufacturers of microprocessors used CISC-based designs to
develop their products. CISC was introduced around the early 1970s, when it was used for
simple electronic platforms such as stereos, calculators, and video games rather than
personal computers, applications for which the technology was well suited.
Eventually, however, CISC microprocessors found their way into personal computers to meet
the increasing needs of PC users. CISC manufacturers then shifted their efforts from
general-purpose designs to a high-performance computing orientation.
6.3 ALU CONCEPT
Inside a computer, there is an Arithmetic Logic Unit (ALU), which is capable of performing
logical operations (e.g. AND, OR, Ex-OR, Invert etc.) in addition to the arithmetic operations
(e.g. Addition, Subtraction etc.). The control unit supplies the data required by the ALU from
memory, or from input devices, and directs the ALU to perform a specific operation based on the


instruction fetched from the memory. ALU is the “calculator” portion of the computer.

Figure 6.4: ALU

An arithmetic logic unit (ALU) is a major component of the central processing unit of a
computer system. It performs all processes related to arithmetic and logic operations on
instruction words. In some microprocessor architectures, the ALU is divided into the
arithmetic unit (AU) and the logic unit (LU).
An ALU can be designed by engineers to calculate many different operations. As the
operations become more complex, the ALU becomes more expensive, takes up more space in the
CPU, and dissipates more heat. That is why engineers make the ALU powerful enough to ensure
that the CPU is also powerful and fast, but not so complex as to become prohibitive in
terms of cost and other disadvantages.
ALU is also known as an Integer Unit (IU). The arithmetic logic unit is that part of the CPU that
handles all the calculations the CPU may need. Most of these operations are logical in nature.
Depending on how the ALU is designed, it can make the CPU more powerful, but it also
consumes more energy and creates more heat. Therefore, there must be a balance between how
powerful and complex the ALU is and how expensive the whole unit becomes. This is why faster
CPUs are more expensive, consume more power and dissipate more heat.
The different operations carried out by an ALU can be categorized as follows:
Logical operations − These include operations like AND, OR, NOT, XOR, NOR, NAND, etc.
Bit-shifting operations − These shift the positions of the bits by a certain number of
places to the left or right, which corresponds to multiplication or division.
Arithmetic operations − This refers to bit addition and subtraction. Although multiplication
and division circuits are sometimes built, these operations are more expensive to implement.
Multiplication and division can also be performed by repeated additions and subtractions,
respectively.
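These three categories can be illustrated with Python's bitwise operators; the 4-bit operand values and the repeated-addition helper below are chosen purely for illustration:

```python
# Sketch of the three ALU operation categories using Python's bit operators.
a, b = 0b1100, 0b1010

# Logical operations map directly onto bitwise operators.
assert a & b == 0b1000   # AND
assert a | b == 0b1110   # OR
assert a ^ b == 0b0110   # XOR

# Bit-shifting: shifting left by n multiplies by 2**n, shifting right divides.
assert a << 1 == a * 2
assert a >> 2 == a // 4

# Arithmetic: multiplication can be carried out by repeated addition.
def mul_by_repeated_add(x, n):
    total = 0
    for _ in range(n):
        total += x
    return total

assert mul_by_repeated_add(6, 7) == 6 * 7
```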
6.3.1 1-Bit ALU
The logical operations are easiest, because they map directly onto the hardware components. The
1-bit logical unit for AND and OR looks like Figure 6.5. The multiplexor on the right then selects
a AND b or a OR b, depending on whether the value of Operation is 0 or 1. The line that controls
the multiplexor is shown in color to distinguish it from the lines containing data. Notice that we


have renamed the control and output lines of the multiplexor to give them names that reflect the
function of the ALU. The next function to include is addition. An adder must have two inputs for
the operands and a single-bit output for the sum. There must be a second output to pass on the
carry, called CarryOut. Since the CarryOut from the neighbor adder must be included as an
input, we need a third input. This input is called CarryIn. Figure 6.6 shows the inputs and the
outputs of a 1-bit adder. Since we know what addition is supposed to do, we can specify the
outputs of this “black box” based on its inputs, as Figure 6.8 demonstrates. We can express the
output functions CarryOut and Sum as logical equations, and these equations can in turn be
implemented with logic gates. Let’s do CarryOut. Figure 6.9 shows the values of the inputs when
CarryOut is a 1.
We can turn this truth table into a logical equation:
CarryOut = (b · CarryIn) + (a · CarryIn) + (a · b) + (a · b · CarryIn)

Figure 6.5: The 1-bit logical unit for AND and OR

Figure 6.6: A 1-bit adder


This adder is called a full adder; it is also called a (3,2) adder because it has 3 inputs and 2
outputs. An adder with only the a and b inputs is called a (2,2) adder or half-adder.

Figure 6.7: Truth Table


If a . b . CarryIn is true, then all of the other three terms must also be true, so we can leave out
this last term corresponding to the fourth line of the table. We can thus simplify the equation to
CarryOut = (b · CarryIn) + (a · CarryIn) + (a · b)
Figure 6.9 shows that the hardware within the adder black box for CarryOut consists of three
AND gates and one OR gate. The three AND gates correspond exactly to the three parenthesized
terms of the formula above for CarryOut, and the OR gate sums the three terms.
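The simplification can be checked by exhaustively enumerating the truth table. This short sketch confirms that dropping the fourth term changes nothing:

```python
# Verify that the simplified CarryOut equation agrees with the original
# four-term equation for every combination of the three inputs.
def carry_full(a, b, cin):
    return (b & cin) | (a & cin) | (a & b) | (a & b & cin)

def carry_simplified(a, b, cin):
    return (b & cin) | (a & cin) | (a & b)

for a in (0, 1):
    for b in (0, 1):
        for cin in (0, 1):
            # CarryOut is 1 exactly when two or more of the inputs are 1.
            expected = 1 if a + b + cin >= 2 else 0
            assert carry_full(a, b, cin) == carry_simplified(a, b, cin) == expected
```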

Figure 6.8: Values of Inputs when CarryOut is a 1

Figure 6.9: Adder hardware for the CarryOut signal

The Sum bit is set when exactly one input is 1 or when all three inputs are 1. The Sum results in
a complex Boolean equation (writing ¬a for NOT a):

Sum = (a · ¬b · ¬CarryIn) + (¬a · b · ¬CarryIn) + (¬a · ¬b · CarryIn) + (a · b · CarryIn)

The drawing of the logic for the Sum bit in the adder black box is left as an exercise for the
reader.
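Although the sum-of-products form looks complex, it is equivalent to XOR-ing the three inputs together. The sketch below checks that equivalence over all eight input combinations:

```python
# Sum is 1 when exactly one input is 1 or when all three are 1,
# which is the same as a three-input XOR.
def sum_sop(a, b, cin):
    na, nb, ncin = 1 - a, 1 - b, 1 - cin   # complemented inputs (NOT)
    return (a & nb & ncin) | (na & b & ncin) | (na & nb & cin) | (a & b & cin)

for a in (0, 1):
    for b in (0, 1):
        for cin in (0, 1):
            assert sum_sop(a, b, cin) == a ^ b ^ cin
```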
Figure 6.10 shows a 1-bit ALU derived by combining the adder with the earlier components.
Sometimes designers also want the ALU to perform a few more simple operations, such as
generating 0. The easiest way to add an operation is to expand the multiplexor controlled by the
Operation line and, for this example, to connect 0 directly to the new input of that expanded
multiplexor.

Figure 6.10: A 1-bit ALU that performs AND, OR, and addition
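The 1-bit ALU of Figure 6.10 can be sketched in software, with a dictionary standing in for the multiplexor. The encoding of the Operation line (0 = AND, 1 = OR, 2 = ADD) is an assumption made for illustration:

```python
def alu_1bit(a, b, carry_in, operation):
    """1-bit ALU sketch; operation: 0 = AND, 1 = OR, 2 = ADD (assumed encoding)."""
    # Full-adder outputs, using the equations derived in the text.
    sum_bit = a ^ b ^ carry_in
    carry_out = (b & carry_in) | (a & carry_in) | (a & b)
    # The multiplexor selects which result reaches the ALU output.
    result = {0: a & b, 1: a | b, 2: sum_bit}[operation]
    return result, carry_out

assert alu_1bit(1, 1, 0, 0) == (1, 1)   # AND: output 1 & 1 = 1
assert alu_1bit(1, 0, 0, 1) == (1, 0)   # OR:  output 1 | 0 = 1
assert alu_1bit(1, 1, 1, 2) == (1, 1)   # ADD: 1+1+1 -> Sum 1, CarryOut 1
```

Chaining 32 such units, with each stage's CarryOut wired to the next stage's CarryIn, would give a 32-bit ripple-carry ALU.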


6.4 STORED PROGRAM


A stored-program computer is a computer that stores instructions in its memory, enabling it to
perform a variety of tasks in sequence or intermittently. The idea was introduced in the late
1940s by John von Neumann, who proposed that a program be electronically stored in binary-
number format in a memory device so that instructions could be modified by the computer as
determined by intermediate computational results. (Previous computers had their program
instructions determined by the wiring of the machine or by holes in punched tape.) Other
engineers, notably John W. Mauchly and J. Presper Eckert, contributed to this idea, which
enabled digital computers to become much more flexible and powerful. Engineers
in England built the first stored-program computer, the Manchester Mark I, shortly before the
Americans built EDVAC, both operational in 1949.
You can see the von Neumann’s stored program computer architecture in Figure 6.11.

Figure 6.11: Von Neumann’s Stored Program

Let us discuss some vital aspects of the stored-program computer architecture. The first
significant aspect is given below:
• It allows programs to be loaded into the processor from main memory, and stored back,
with ease.
• Once a program has been written and debugged, the numbers representing its instructions
can be written out to a storage device, allowing the program to be loaded back into (main)
memory in the future.
The second aspect of the stored-program abstraction is given below:
• Programs that treat themselves as data can function as self-modifying programs.
• This aspect is perhaps even more important than the first.
• It allows programs to treat themselves or other programs as data.
• It facilitates the design of compilers, debuggers, and other programming tools.
6.5 FETCH EXECUTION CYCLE
The basic operation of a computer is called the ‘fetch-execute’ cycle. The CPU is designed to
understand a set of instructions - the instruction set. It fetches the instructions from the
main memory and executes them. This is done repeatedly from when the computer is booted up
to when it is shut down.


The CPU fetches the instructions one at a time from the main memory into the registers. One
register is the program counter (PC). The PC holds the memory address of the next instruction
to be fetched from main memory.
• The CPU decodes the instruction.
• The CPU executes the instruction.
• Repeat until there are no more instructions.
A single piece of program code might require several instructions. Look at this Python (3.x)
code:
area = length * width
First, the computer needs to load in the value of the variable length into the immediate access
store (registers). Next it needs to load in the value of the variable width. Then it needs to
multiply the two numbers together, and finally it needs to store the result in the variable area.
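The four steps above can be sketched as a tiny fetch-decode-execute loop. The instruction names (LOAD, MUL, STORE) and the single-accumulator model are invented for illustration and do not correspond to any real instruction set:

```python
# Minimal fetch-decode-execute sketch for: area = length * width
variables = {"length": 6, "width": 7}
program = [("LOAD", "length"), ("MUL", "width"), ("STORE", "area")]

pc = 0    # program counter: index of the next instruction
acc = 0   # accumulator: holds intermediate results
while pc < len(program):
    opcode, operand = program[pc]   # fetch and decode
    pc += 1                         # advance the PC before executing
    if opcode == "LOAD":            # execute
        acc = variables[operand]
    elif opcode == "MUL":
        acc = acc * variables[operand]
    elif opcode == "STORE":
        variables[operand] = acc

assert variables["area"] == 42
```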
6.6 INSTRUCTION FORMATS
One of the first lessons a beginner learns about computers is that in order to use a computer, a
program must be written, translated into machine language, and loaded into memory; that such a
program consists of machine instructions, and that a machine instruction is a number. In this
chapter we look at the details of machine instructions, and in particular the fact that a machine
instruction is rarely a single number. It normally consists of several numbers that are assembled
(placed together) by the assembler, to become the fields of the complete instruction. As a result,
any machine instruction has a certain format. The instruction is divided into fields, each a
number. A machine instruction must contain at least one field, namely the operation code
(opcode), that tells the control unit what the instruction is supposed to do. Most instructions
contain other fields specifying registers, memory addresses, immediate quantities, addressing
modes, and other parameters.

Figure 6.12: Instruction Formats

Figure 6.12 illustrates typical instruction formats. They range from the simplest to the very
complex and they demonstrate the following properties of machine instructions:
• Instructions can have different sizes. The size depends on the number and nature of the
individual fields.


• The opcode can have different sizes. The opcode can also be broken up into a number of
separate fields.
• A field containing an address is much larger than a register field. This is because a
register number is small, typically 3–4 bits, while an address is large (typically 25–35
bits).
• Fields containing immediate operands (numbers used as data) can have different sizes.
Experience shows that most data items used by programs are small. Thus, a well-
designed instruction set should allow for both small and large data items, resulting in
short instructions whenever possible.
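The idea that an instruction is several fields packed into one number can be sketched with shift-and-mask arithmetic. The layout below (a 6-bit opcode, a 5-bit register number, and a 21-bit address in a 32-bit word) is an assumed format chosen for illustration, not any real machine's:

```python
OPCODE_BITS, REG_BITS, ADDR_BITS = 6, 5, 21   # assumed widths; 6 + 5 + 21 = 32

def encode(opcode, reg, addr):
    """Pack the three fields into a single 32-bit instruction word."""
    return (opcode << (REG_BITS + ADDR_BITS)) | (reg << ADDR_BITS) | addr

def decode(word):
    """Recover the fields by shifting and masking."""
    opcode = word >> (REG_BITS + ADDR_BITS)
    reg = (word >> ADDR_BITS) & ((1 << REG_BITS) - 1)
    addr = word & ((1 << ADDR_BITS) - 1)
    return opcode, reg, addr

word = encode(0b000101, 0b00011, 0x0FFFF)
assert decode(word) == (0b000101, 0b00011, 0x0FFFF)
assert word < 2**32   # the whole instruction fits in one 32-bit word
```

Note how the address field (21 bits) dwarfs the register field (5 bits), matching the observation above about their relative sizes.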
The last point, about short instructions, is important and leads us to the next topic of discussion,
namely the properties of a good instruction set. When a new computer is designed, one of the
first features that has to be determined and fully specified is the instruction set. First, the
instruction set has to be designed as a whole, then the designers decide what instructions should
be included in the set. The general form of the instruction set depends on features such as the
intended applications of the computer, the word size, address size, register set, memory
organization, and bus organization. Instruction sets vary greatly, but a good design for an
instruction set is based on a small number of principles that are independent of the particular
computer in question. These principles demand that an instruction set should satisfy the
following requirements:
1. Short instructions.
2. An instruction size that is both compatible with the word size, and variable.
1. Short instructions: Why should machine instructions be short? Not because short instructions
occupy less memory. Memory isn’t expensive and modern computers support large memories.
Also, not because short instructions execute faster. The execution time of an instruction has
nothing to do with its size. A “register divide” instruction, to divide the contents of two floating-
point registers, is short, since it only has to specify the registers involved. It takes, however, a
long time to execute.
Exercise 6.1: Find an example of a long instruction with fast execution.

The reason why short instructions are preferable is that it takes less time to fetch them from
memory. An instruction
that is longer than one memory word has to be stored in at least two words. It therefore takes two
memory cycles to fetch it from memory. Cutting the size of an instruction such that it fits in one
word, cuts down the fetch time. Even though the fetch time is short (it is measured in
nanoseconds), many instructions are located inside loops and have to be executed (and therefore
fetched) many times. Also, since memory is slower than the processor, speeding up the
instruction fetch can speed up the entire computer.
2. Instruction size: The instruction size should also be compatible with the computer’s word
size. The best design results in instruction sizes of N, 2N and 3N where N is the word size.


Instruction sizes such as 1.2N or 0.7N do not make any sense, since each memory read brings in
exactly one word. In a computer with long words, several instructions can be packed in one word
and, as a result, instruction sizes of N/2, N/3, and N/4 also make sense. The two requirements
above are satisfied by the use of variable-size opcodes and addressing modes. These two
important concepts are discussed next.
6.6.1 The Opcode Size
If the instruction is to be short, individual fields in the instruction should be short. In this section
we look at the opcode field. Obviously, the opcode cannot be too short or there would not be
enough codes for all the instructions. An opcode size of 6–8 bits, allowing for 64–256
instructions, is common. Most modern computers, however, use variable size opcodes, for two
good reasons. One reason has to do with the instruction size in relation to the word size, and the
other has to do with future extensions of the instruction set.
The first reason is easy to understand. We want our instructions to be short, but some instructions
have to contain more information than others and are naturally longer. If the opcode size varies,
longer instructions can be assigned short opcodes. Other instructions, with short operands, can be
assigned longer opcodes. This way the instruction size can be fine-tuned to fit in precisely N or
2N bits.
The second advantage of variable-size opcodes has to do with extensions to an original
instruction set. When a computer becomes successful and sells well, the manufacturer may
decide to come up with a new, extended, and upward compatible version of the original
computer. The 68000, 80x86, and Pentium families are familiar examples.
Upward compatibility means that any program running on the original computer should also run
on the new one. The instruction set of the new computer must therefore be an extension of the
original instruction set. With variable-size opcodes, such an extension is easy.
When an instruction is fetched, the hardware does not know what its opcode is. It has to decode
the instruction first. In a computer with variable-size opcodes, when an instruction is fetched, the
control unit does not even know how long the opcode is. It has to start by identifying the opcode
size, following which it can decode the instruction. Thus, with variable-size opcodes, the control
unit has to work harder. Three methods to implement variable-size opcodes are described here.
Prefix codes. We start by simply choosing several opcodes with the sizes that we need. For
example, if we need a 3-bit opcode in order to adjust the size of instruction A to one word, a 6-
bit opcode to adjust the size of instruction B to one word, and other opcodes with sizes 2, 4, and
7 bits, we can select the set of opcodes 000, 101010, 11, 0001, and 1110001. A little thinking,
however, shows that our opcodes cannot be decoded uniquely. Imagine an instruction with
opcode 0001 being fetched in Step 1 of the fetch-execute cycle and decoded in Step 3. The
control unit examines the first few bits of the instruction and finds them to be 0001... . It does not
know whether the opcode is the four bits 0001 or the three bits 000 (with the 1 being the start of
the next field). The problem exists because we selected the bit pattern 000 as both a code and the
start of another code. The solution is to avoid such a situation. We have to design our opcodes
following the simple rule:
Once a bit pattern has been assigned as an opcode, no other opcode should start with that pattern.
The five opcodes above can now be redesigned as, for example, 000, 101010, 11, 100, and
0100100. We say that these opcodes satisfy the prefix property, or that they are prefix codes.
These codes are discussed in more detail in Section 3.11. It is easy to see, by checking each of
the five codes, that they can be uniquely decoded. When the control unit finds, for example, the
pattern 000 it knows that this must be the first code, since no other codes start with 000. When it
finds 10... , it knows that this must be either code 2 or code 4. It checks the next bit to find out
which code it is.
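The prefix property can be checked mechanically, and prefix-free opcodes can be decoded by a simple left-to-right scan. The following Python sketch uses the five redesigned opcodes above; the single-letter instruction names are invented for illustration:

```python
# The five redesigned opcodes from the text, mapped to invented instruction names.
OPCODES = {"000": "A", "101010": "B", "11": "C", "100": "D", "0100100": "E"}

def is_prefix_free(codes):
    """True if no code is a prefix of another (sorting makes any
    offending pair lexicographically adjacent)."""
    codes = sorted(codes)
    return all(not codes[i + 1].startswith(codes[i]) for i in range(len(codes) - 1))

def decode(bits):
    """Scan left to right, emitting an instruction as soon as the accumulated
    bits match an opcode -- unambiguous only for prefix codes."""
    out, cur = [], ""
    for b in bits:
        cur += b
        if cur in OPCODES:
            out.append(OPCODES[cur])
            cur = ""
    if cur:
        raise ValueError("dangling bits: " + cur)
    return out

assert is_prefix_free(OPCODES)              # the redesigned set is prefix-free
assert not is_prefix_free(["000", "0001"])  # the original flawed pair is not
print(decode("00011100"))                   # ['A', 'C', 'D']
```

Note that the decoder never has to look ahead: the prefix property guarantees that the first matching pattern is the only possible opcode.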
Fixed prefix. The first 2–4 bits of the instruction tell what the opcode size is (and even what the instruction format is). Figure 6.2 shows four instruction formats, denoted by 0–3 and identified
by the first two opcode bits. When the control unit fetches an instruction that starts with 00, it
knows that the next three bits are the opcode, so there can be eight instructions with a 5-bit
opcode (including the identification bits). Similarly, when the control unit fetches an instruction
that starts with 01, it knows that the opcode consists of the next two bits plus three more bits
elsewhere in the instruction, so there can be 32 instructions with a 7-bit opcode (including the
two identification bits).
Exercise 6.2: How many instructions can there be with the other two opcode sizes?

Figure 6.2: Opcodes with two identification bits

The average size of an opcode in this instruction set is therefore:

With fixed-size opcodes, the opcode size for a set of 120 instructions would be seven bits. The
advantage of this method is that several opcode sizes are available and the opcode size can easily
be determined. The disadvantages are: (1) the instruction set cannot be extended; there is no
room for new opcodes and (2) because of the prefix, the average opcode is longer than in the
case of fixed-size opcodes.
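The seven-bit figure for 120 instructions follows from the base-2 logarithm: a fixed opcode of b bits can distinguish 2^b instructions. A quick check in Python:

```python
import math

def fixed_opcode_bits(num_instructions):
    """Smallest fixed opcode width that can encode the given instruction count."""
    return math.ceil(math.log2(num_instructions))

print(fixed_opcode_bits(120))  # 7, as stated in the text
print(fixed_opcode_bits(64))   # 6 (the low end of the common 6-8 bit range)
print(fixed_opcode_bits(256))  # 8 (the high end of that range)
```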
6.6.2 Addressing Modes
This is an important feature of computers. We start with the known fact that many instructions
have to include addresses; the instructions should be short, but addresses tend to be long.
Addressing modes are a solution to this problem. Using addressing modes, a short instruction
may specify a long address.
The idea is that the instruction no longer includes the full address (also called the effective
address, or EA), but instead contains a number, normally called a displacement, that’s closely
related to the address. Another field, an addressing mode, is also added to the instruction. The
addressing mode is a code that tells the control unit how to obtain the effective address from the
displacement. It is more accurate to say that the addressing mode is a rule of calculation, or a
function, that uses the displacement as its main argument, and other hardware components—such
as the PC, the registers, and memory locations—as secondary arguments, and produces the EA
as a result.
The notation EA = f_M(Disp, PC, Regs, Mem), where the subscript M indicates the mode, is convenient. Different modes use different functions, but the idea is the same.
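As a concrete, purely illustrative sketch of this function view, the snippet below computes the EA for a few common mode shapes; the mode names, the choice of R1 as the base register, and the memory contents are assumptions, not taken from any particular machine:

```python
# Hypothetical addressing-mode functions: each mode is a different rule
# for turning a short displacement into a full effective address.
def effective_address(mode, disp, pc=0, regs=None, mem=None):
    regs = regs or {}
    mem = mem or {}
    if mode == "direct":       # EA = Disp
        return disp
    if mode == "pc_relative":  # EA = PC + Disp
        return pc + disp
    if mode == "indexed":      # EA = R1 + Disp (R1 assumed to hold the base)
        return regs.get("R1", 0) + disp
    if mode == "indirect":     # EA = Mem[Disp]
        return mem[disp]
    raise ValueError("unknown addressing mode: " + mode)

print(effective_address("pc_relative", 8, pc=1000))       # 1008
print(effective_address("indexed", 4, regs={"R1": 200}))  # 204
print(effective_address("indirect", 40, mem={40: 5000}))  # 5000
```

In each case a short displacement (a few bits in the instruction) stands in for a full-width address, which is exactly the point of addressing modes.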
Before any individual modes are discussed, it may be useful to look at some of the numbers
involved when computers, memories, and instructions are designed.
Up until the mid 1970s, memories were expensive, and computers supported small memories. A
typical memory size in a second generation computer was 16K–32K words (24–48 bits/word),
and in an early third generation computer, 32K–64K words (48–60 bits/word). Today, with much
lower hardware prices, modern computers can access much larger memories. Most of the early
microprocessors could address 64K bytes, and most of today’s microprocessors can address
between 32M words and a few tera words (usually 8-bit words, bytes). As a result, those
computers must handle long addresses. Since 1M (1 mega) is defined as 1024K = 1024 × 2^10 = 2^20, an address in a 1M memory is 20 bits long. In a 32M memory, an address is 25 bits long.
The Motorola 68000 microprocessor [Kane 81] and the Intel 80386 [Intel 85] generate 32-bit addresses, and can thus physically address 4G (giga) bytes. (Their virtual address space, though, is 64 tera bytes = 2^46.) Computers on the drawing boards right now could require much longer addresses.
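The address widths quoted above follow from taking the base-2 logarithm of the memory size, as in this small check:

```python
import math

def address_bits(locations):
    """Number of address bits needed to select one of the given locations."""
    return math.ceil(math.log2(locations))

print(address_bits(1 << 20))   # 1M locations  -> 20 bits
print(address_bits(32 << 20))  # 32M locations -> 25 bits
print(address_bits(4 << 30))   # 4G locations  -> 32 bits
```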
6.7 CLOCK RATE AND INSTRUCTION RATE
Processor circuits are controlled by a timing signal called the clock. The clock defines regular time intervals called clock cycles. To execute a machine instruction, the processor divides the action to be performed into a sequence of basic steps such that each step can be completed in one clock cycle. The length P of one clock cycle is an important parameter that affects processor performance. Processors used in today's personal computers and workstations have clock rates that range from a few hundred million to over a billion cycles per second.
We now focus our attention on the processor time component of the total elapsed time. Let T be the processor time required to execute a program that has been prepared in some high-level language. The compiler generates a machine language object program that corresponds to the source program. Assume that complete execution of the program requires the execution of N machine language instructions. The number N is the actual number of instruction executions and is not necessarily equal to the number of machine language instructions in the object program. Some instructions may be executed more than once, as is the case for instructions inside a program loop; others may not be executed at all, depending on the input data used.
Suppose that the average number of basic steps needed to execute one machine language instruction is S, where each basic step is completed in one clock cycle. If the clock rate is R cycles per second, the program execution time is given by
T = (N × S) / R
This is often referred to as the basic performance equation.
We must emphasize that N, S and R are not independent parameters; changing one may affect another. Introducing a new feature in the design of a processor will lead to improved performance only if the overall result is to reduce the value of T.
With pipelining, individual instructions still require several clock cycles to complete, but for the purpose of computing T the effective value of S is 1.
A higher degree of concurrency can be achieved if multiple instruction pipelines are implemented in the processor. This means that multiple functional units are used, creating parallel paths through which different instructions can be executed in parallel. With such an arrangement, it becomes possible to start the execution of several instructions in every clock cycle. This mode of operation is called superscalar execution.
If it can be sustained for a long time during program execution, the effective value of S can be reduced to less than one. But the parallel execution must preserve the logical correctness of programs; that is, the results produced must be the same as those produced by serial execution of the program instructions. Nowadays many processors are designed in this manner.
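To make the equation concrete, here is a small worked example; the instruction count, step count and clock rate are made-up numbers, chosen only to show how pipelining (effective S close to 1) shows up in T:

```python
# Basic performance equation: T = (N * S) / R.
def execution_time(n, s, r):
    """n executed instructions, s basic steps per instruction, r cycles/second."""
    return n * s / r

N, R = 100e6, 500e6                    # 100 million instructions, 500 MHz clock
t_plain = execution_time(N, 4, R)      # 4 basic steps per instruction
t_pipelined = execution_time(N, 1, R)  # effective S of 1 with pipelining
print(t_plain, "s vs", t_pipelined, "s")  # 0.8 s vs 0.2 s
```

The same program, on the same clock, runs four times faster simply because the effective number of cycles per instruction dropped from 4 to 1.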
6.7.1 Clock Rate
There are two possibilities for increasing the clock rate R:
1. Improving the IC technology makes logic circuits faster, which reduces the time needed to execute basic steps. This allows the clock period P to be reduced and the clock rate R to be increased.
2. Reducing the amount of processing done in one basic step also makes it possible to reduce the clock period P. However, if the actions that have to be performed by an instruction remain the same, the number of basic steps needed may increase.
Increases in the value of R that are entirely caused by improvements in IC technology affect all aspects of the processor's operation equally, with the exception of the time it takes to access the main memory. In the presence of a cache, the percentage of accesses to the main memory is small. Hence much of the performance gain expected from the use of faster technology can be realized.
6.7.2 Instruction Rate
Simple instructions require a small number of basic steps to execute. Complex instructions involve a large number of steps. For a processor that has only simple instructions, a large number of instructions may be needed to perform a given programming task. This could lead to a large value of N and a small value of S. On the other hand, if individual instructions perform more complex operations, fewer instructions will be needed, leading to a lower value of N and a larger value of S. It is not obvious whether one choice is better than the other.
But complex instructions combined with pipelining (effective value of S ≈ 1) would achieve the best performance. However, it is much easier to implement efficient pipelining in processors with simple instruction sets.
RISC and CISC are two classes of instruction set architectures developed for computers. An instruction set, or instruction set architecture, is the part of the computer's structure that provides the commands which guide the computer in processing and manipulating data. An instruction set consists of instructions, addressing modes, native data types, registers, interrupt and exception handling, and memory architecture. An instruction set can be emulated in software by using an interpreter or built into the hardware of the processor. The instruction set architecture can be considered a boundary between software and hardware. Microcontrollers and microprocessors can be classified based on the RISC and CISC instruction set architectures.
6.7.3 Pipeline
So far, we have assumed that instructions are executed one after the other. Hence the value of S is the total number of basic steps, or clock cycles, required to execute one instruction. A substantial improvement in performance can be achieved by overlapping the execution of successive instructions, using a technique called pipelining.
Consider the instruction Add R1, R2, R3, which adds the contents of R1 and R2 and places the sum into R3.
The contents of R1 and R2 are first transferred to the inputs of the ALU. After the addition operation is performed, the sum is transferred to R3. The processor can read the next instruction from memory while the addition operation is being performed. If that instruction also uses the ALU, its operands can be transferred to the ALU inputs at the same time that the result of the Add instruction is being transferred to R3. In the ideal case, if all instructions are overlapped to the maximum degree possible, execution proceeds at the rate of one instruction completed in each clock cycle.
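The speed-up from overlapping can be counted directly. Assuming an ideal s-stage pipeline with no stalls, the first instruction takes s cycles and every later instruction completes one cycle after its predecessor:

```python
def cycles_sequential(k, s):
    """k instructions of s basic steps each, executed one after the other."""
    return k * s

def cycles_pipelined(k, s):
    """Same instructions on an ideal s-stage pipeline: s cycles to fill,
    then one completion per cycle."""
    return s + (k - 1)

print(cycles_sequential(100, 5))  # 500 cycles
print(cycles_pipelined(100, 5))   # 104 cycles -- close to one per cycle
```

As k grows, the pipelined count approaches k cycles, which is the "effective S of 1" referred to earlier.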
6.8 CURRENT PROCESSORS
Different kinds of processors have flooded the market. As you know, the two major manufacturers of computer microprocessors are Intel and Advanced Micro Devices (AMD). When it comes to speed, quality and efficiency, their products lead the market. Both companies make single-core and multi-core processors. Intel's desktop CPUs include Core, Pentium and Celeron, whereas AMD's desktop processors include Athlon, Phenom and Sempron. Intel also makes Pentium M, Celeron M and Core mobile processors for notebooks, whereas AMD makes mobile versions of its Athlon, Turion and Sempron processors.
Advanced Micro Devices, Inc. (AMD) is an American multinational semiconductor
company based in Santa Clara, California, that develops computer processors and related
technologies for business and consumer markets. While it initially manufactured its own
processors, the company later outsourced its manufacturing, a practice known as going fabless, after GlobalFoundries was spun off in 2009. AMD's main products include microprocessors, motherboard chipsets, embedded processors and graphics processors for servers, workstations, personal computers and embedded system applications.
Both manufacturers have recently launched new processors; some of them are discussed below:
Sandy Bridge-E:
Sandy Bridge is the codename for Intel's 32 nm microarchitecture used in the second generation of the Intel Core processors (Core i7, i5, i3). The Sandy Bridge microarchitecture is the successor to the Nehalem and Westmere microarchitectures. Intel demonstrated a Sandy Bridge processor in 2009, and released the first products based on the architecture in January 2011 under the Core brand.
Sandy Bridge is manufactured in the 32 nm process and has a soldered contact between the die and the IHS (Integrated Heat Spreader), while Intel's subsequent generation Ivy Bridge uses a 22 nm die shrink and a TIM (Thermal Interface Material) between the die and the IHS.
The main specifications of the i7-3970X processor are:

Haswell:
Haswell processors will have 4 CPU cores with Hyper-Threading support, 8 MB of L3 cache on the i7-4900 series, and 6 MB of cache on the i7-4800 series. The i7 processors will be clocked from 2.7 GHz to 3 GHz, reaching as high as 3.7–3.9 GHz when the Turbo Boost feature is active. The i7-4930MX is supposed to be unlocked, and could be overclocked to achieve higher performance. All processors will support vPro technology, VT-d (I/O virtualization) and Trusted Execution. The CPUs will integrate HD 4600 graphics with a 400 MHz base frequency. The graphics unit will have Turbo technology enabled, which will increase the GPU frequency up to 1.35 GHz on the unlocked i7-4930MX, and up to 1.3 GHz on the i7-4800MQ and i7-4900MQ parts.
Feature                Value
Number of Cores        4
Number of Threads      8
Clock Speed            3.5 GHz
Process Technology     22 nm
TDP                    84 W
Memory Bandwidth       25.6 GB/s
Piledriver:
The CPU indeed runs at 4.0GHz frequency and turbo core speed of 4.2GHz. The CPU consists
of 16MB Cache (8MB L3/8MB L2), Supports DDR3-1866MHz+ and is compatible with the
AM3+ Socket. TDP of this part would be set at 125W.
With the architecture improvements in Piledriver, we could witness an 10-15% Multitasking
performance improvement with the FX-8350 over FX-8150.
Feature                Value
Number of Cores        8
Number of Threads      8
Clock Speed            4.0 GHz
Process Technology     32 nm
TDP                    125 W
Memory Bandwidth       19 GB/s
Richland:
Richland core is manufactured on 32 nm technology, and it combines 1 or 2 Piledriver modules (2 or 4 CPU cores) and Radeon HD 8000 series graphics. Compared to its predecessor, the Richland core doesn't have any new CPU features; however, the integrated GPU is upgraded to Radeon HD 8000 series graphics. Both the CPU and GPU portions of the chip offer better performance, mainly due to higher clock speeds. AMD also added support for DDR3-2133 memory to A10-Series APUs.
Desktop Richland APUs are available in a socket FM2-compatible package. Mainstream mobile processors work in existing socket FS1 (FS1r2) motherboards, whereas mobile ULV processors are produced in a BGA package.
Feature                Value
Number of Cores        4
Number of Threads      4
Clock Speed            4.1 GHz
Process Technology     32 nm
TDP                    100 W
Memory Bandwidth
Steamroller:
AMD Steamroller Family 15h is a microarchitecture developed by AMD for AMD APUs, which
succeeded Piledriver in the beginning of 2014 as the third-generation Bulldozer-based
microarchitecture. Steamroller APUs continue to use two-core modules as their predecessors,
while aiming at achieving greater levels of parallelism.

Learning Activity

How can opcode size be allocated to an instruction format?
AMD Processor
Advanced Micro Devices, Inc. (AMD) is a global company that specializes in manufacturing semiconductor devices used in computer processing. The company also produces flash memories, graphics processors, motherboard chip sets, and a variety of components used in consumer electronics goods. The company is a major supplier of microprocessors (computer chips). AMD is based in Sunnyvale, California.
AMD was founded in 1969 by Walter Jeremiah (Jerry) Sanders, a former executive at
Fairchild Semiconductor Corporation, and seven others. The company released its first
product in 1970 and went public two years later. In the mid-1970s the company began
producing computer chips. Starting out as a second-source manufacturer of computer chips,
the company placed a great emphasis on quality and grew steadily. In 1982 the company
began supplying second-source chips for the Intel Corporation, which made the
microprocessor used in IBM personal computers (PCs). The agreement with Intel ended in
1986. In 1991 AMD released the Am386 microprocessor family, a reverse-engineered chip
that was compatible with Intel’s next-generation 32-bit 386 microprocessor. There ensued a
long legal battle that was finally decided in a 1994 U.S. Supreme Court ruling in AMD’s
favour. That same year, Compaq Computer Corporation contracted with AMD to produce
Intel-compatible chips for their computers. In 1996 AMD acquired a microprocessor
company known as NexGen and began branching out from the Intel-compatible chip market.
In 2000 AMD introduced the Athlon processor, which was designed to run the Microsoft
Corporation’s Windows operating system. With the release of the Athlon processor, AMD
became the first company to produce a 1-GHz (gigahertz) microprocessor, which marked
AMD as a serious competitor in the chip market. In 2003 the company released the Opteron
chip, another product that showcased the company’s ability to produce high-end chips. In
2006 AMD absorbed ATI Technologies, a manufacturer of video graphics cards for use in
PCs. In 2008 AMD announced plans to split the company in two—with one part designing
microprocessors and the other manufacturing them. This announcement followed news that
the Advanced Technology Investment Company and the Mubadala Development Company,
both based in Abu Dhabi, would acquire a controlling interest in AMD, pending approval by
shareholders and the U.S. and German governments. In 2009, following a series of complaints
lodged by AMD, the European Commission fined rival Intel a record €1.06 billion (£948
million; $1.45 billion) for engaging in anticompetitive practices that violated the European
Union’s antitrust laws. These practices allegedly involved financially compensating and
providing rebates to manufacturers and retailers who favoured its computer chips over those
of AMD, as well as paying manufacturers to cancel or postpone the launching of products
utilizing AMD's chips. In 2014 the company was restructured into two parts: computing and graphics, which made processors for personal computers, and enterprise, embedded, and semi-custom, which made more-specialized products.
On a scale of 1–10, AMD processors come in at 5–10. They are cheaper than Intel processors in a similar range, and they are efficient compared to the current generation Core series. AMD APUs are also a good option for their good iGPU performance and CPU performance comparable to the Core i series. Laptops powered with Ryzen processors often clock lower and less aggressively compared to Intel-powered laptops, but they often run cooler and longer on battery; thus, for laptops where higher iGPU performance and longer battery life are preferred, Ryzen-powered laptops can be used. However, when building a new desktop PC, older FX-series CPUs and A-series APUs should be avoided because of their higher power consumption and heat output.
For desktop or mobile use, if you only want to do normal gaming and everyday tasks, then a Ryzen APU is the way to go. For heavier tasks like video editing, 3D modelling, etc., Ryzen 7 or 9 CPUs or Threadripper should be preferred.
For Ryzen desktop CPUs and APUs on the AM4 platform, the motherboard chipset should be checked for support, otherwise the PC may not boot; this can be easily solved with motherboards that support USB BIOS flashing for newer processors.
Example – AMD Ryzen, AMD Threadripper, AMD FX-Series, AMD EPYC, AMD Opteron,
AMD Athlon 64
Question:
1. Describe briefly the AMD processor family.

1. Processors can be found in PCs, smartphones, tablets and other computers. The two main competitors in the processor market are Intel and AMD.
2. The instruction cycle is the cycle that the central processing unit follows from boot-up until the computer has shut down in order to process instructions.

SUMMARY
• The von Neumann architecture—also known as the von Neumann model or Princeton
architecture—is a computer architecture based on a 1945 description by John von
Neumann and others in the First Draft of a Report on the EDVAC.
• A central processing unit, also called a central processor, main processor or just
processor, is the electronic circuitry that executes instructions comprising a computer
program.
• An arithmetic-logic unit (ALU) is the part of a computer processor (CPU) that carries out
arithmetic and logic operations on the operands in computer instruction words.
• Stored-program computer, a computer that stores instructions in its memory to enable it
to perform a variety of tasks in sequence or intermittently.
• An instruction format defines the layout of the bits of an instruction, in terms of its constituent parts.
• Clocks per instruction (CPI) is an effective average. It is averaged over all of the
instruction executions in a program.
• CPI is affected by instruction-level parallelism and by instruction complexity.
KEYWORDS
Von Neumann Architecture- Von Neumann architecture is based on the stored-program
computer concept, where instruction data and program data are stored in the same memory.
Processor- A processor (CPU) is the logic circuitry that responds to and processes the basic
instructions that drive a computer.
Fetch Cycle- A cycle that begins with retrieving the address stored in the Program Counter.
Execute Cycle-A cycle that executes the instruction that was fetched and decoded.
Clock Rate- The clock rate or clock speed typically refers to the frequency at which the clock
generator of a processor can generate pulses, which are used to synchronize the operations of its
components, and is used as an indicator of the processor's speed.
Instruction- An instruction is an order given to a computer processor by a computer program.
Pipelining- Pipelining is the process of accumulating instruction from the processor through a
pipeline.
SELF-ASSESSMENT QUESTIONS
Short Answer Questions
1. List the components of von Neumann architecture.
2. Define processor.
3. Draw the basic structure of processor.
4. Write about fetch execution cycle?
5. Recall opcode size.
Long Answer Questions
1. Explain the concept of processor.
2. Discuss in detail about the ALU design.
3. Write brief about clock rate and instruction rate of CPU.
4. Draw and explain the instruction format in detail.
5. Write short note about current processors.
FURTHER READING
1. M. Morris Mano. 2007. Computer System Architecture, 3rd edition.
2. P. K. Sinha. 2004. Computer Fundamentals.
3. Douglas E. Comer. 2012. Essentials of Computer Architecture, Pearson, 6th edition.
4. A. D. Godse. 2008. Computer Organisation and Architecture, Technical Publications.
UNIT III- Processor Fundamentals


Chapter 7- Multicore Processor

CONTENTS
Learning Objectives
Learning Outcomes
Overview
7.1 Microprocessors
7.1.1 Moore’s Law
7.2 The Need for Multicore
7.2.1 Multicore Basics
7.3 Multicore Implementations
7.3.1 Intel and AMD Dual-Core Processors
7.3.2 Tilera TILE64
7.4 Multicore Challenges
7.4.1 Power and Temperature
7.4.2 Cache Coherence
7.5 Open Issues
7.5.1 Improved Memory System
7.5.2 System Bus and Interconnection Networks
7.5.3 Parallel Programming
7.5.4 Homogeneous vs Heterogeneous Cores
7.5.5 False Sharing
Summary
Keywords
Self-Assessment Questions
Further Reading
LEARNING OBJECTIVES
After studying this chapter, you will be able to:
• Understand the concept of microprocessors
• Identify the need for multicore
• Analyze the multicore implementations
• Describe some challenges faced in multicore processing
• Identify the issues related to multicore processors.
LEARNING OUTCOMES
Upon completion of the chapter, students will be able to:
• Recall Moore's law
• Illustrate the concept of multicore
• Describe the design of Intel and AMD dual-core processors
• Discuss the open issues related to the design of system buses and interconnection networks
OVERVIEW
In the previous chapter, we considered the von Neumann design. We learned about the fundamentals of the processor, such as the structure of the processor, the ALU concept, stored programs and the fetch-execute cycle. We also examined instruction formats, clock rate and instruction rate, along with the idea of pipelining.
In this chapter, you will study the concept of multicore processors. Multi-core is typically the term used to describe two or more CPUs working at the same time on the same chip. Also known as multicore technology, it is a type of architecture in which a single physical processor contains the core logic of two or more processors. A dual-core set-up is somewhat comparable to having multiple, separate processors installed in the same computer; however, because the two processors are actually plugged into the same socket, the connection between them is faster. This chapter will cover the concept of microprocessors, multicore implementations and challenges, along with the issues related to multicore processors. Multicore processing is a growing industry trend, as single-core processors rapidly reach the physical limits of possible complexity and speed. The majority of new systems are multicore. Systems with a great number of processor cores (tens or hundreds) are sometimes described as many-core or massively multicore systems.
7.1 MICROPROCESSORS
A microprocessor is the controlling unit of a micro-computer, fabricated on a small chip, capable of performing ALU (Arithmetic Logical Unit) operations and communicating with the other devices connected to it.
A microprocessor consists of an ALU, a register array, and a control unit. The ALU performs arithmetic and logical operations on the data received from the memory or an input device. The register array consists of registers identified by letters like B, C, D, E, H, L and the accumulator. The control unit controls the flow of data and instructions within the computer.

Figure 7.1: Block Diagram

The microprocessor follows a sequence: Fetch, Decode, and then Execute.


Initially, the instructions are stored in the memory in sequential order. The microprocessor fetches those instructions from the memory, then decodes and executes them until a STOP instruction is reached. Later, it sends the result in binary to the output port. Between these processes, the registers store temporary data and the ALU performs the computing functions.
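The fetch-decode-execute sequence can be sketched as a loop. The toy accumulator machine below, including its instruction names and memory layout, is invented purely to illustrate the cycle:

```python
# Toy machine: instructions occupy cells 0-3, data occupies cells 10-12.
memory = [("LOAD", 10), ("ADD", 11), ("STORE", 12), ("STOP", 0),
          None, None, None, None, None, None, 7, 5, 0]

def run():
    pc, acc = 0, 0
    while True:
        opcode, operand = memory[pc]  # fetch the instruction at PC
        pc += 1
        if opcode == "LOAD":          # decode, then execute
            acc = memory[operand]
        elif opcode == "ADD":
            acc += memory[operand]
        elif opcode == "STORE":
            memory[operand] = acc
        elif opcode == "STOP":
            return acc                # execution halts at STOP

print(run())       # 12 (7 + 5)
print(memory[12])  # 12, written back by the STORE instruction
```

Note how the program counter advances after each fetch, and how execution continues until the STOP instruction is reached, mirroring the description above.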
List of Terms Used in a Microprocessor
Here is a list of some of the frequently used terms in a microprocessor −
Instruction Set − It is the set of instructions that the microprocessor can understand.
Bandwidth − It is the number of bits processed in a single instruction.
Clock Speed − It determines the number of operations per second the processor can perform. It is expressed in megahertz (MHz) or gigahertz (GHz). It is also known as the Clock Rate.
Word Length − It depends upon the width of internal data bus, registers, ALU, etc. An 8-bit
microprocessor can process 8-bit data at a time. The word length ranges from 4 bits to 64 bits
depending upon the type of the microcomputer.
Data Types − The microprocessor has multiple data type formats like binary, BCD, ASCII,
signed and unsigned numbers.
Features of a Microprocessor
Here is a list of some of the most prominent features of any microprocessor −
Cost-effective − Microprocessor chips are available at low prices, resulting in low system cost.
Size − The microprocessor is of small size chip, hence is portable.
Low Power Consumption − Microprocessors are manufactured using metal-oxide semiconductor technology, which has low power consumption.
Versatility − The microprocessors are versatile as we can use the same chip in a number of
applications by configuring the software program.
Reliability − The failure rate of an IC in microprocessors is very low, hence it is reliable.
7.1.1 Moore’s Law
Moore's Law refers to Moore's observation that the number of transistors on a microchip doubles about every two years, while the cost of computers is halved. Moore's Law states that we can expect the speed and capability of our computers to increase every couple of years, and that we will pay less for them. Another tenet of Moore's Law asserts that this growth is exponential.
Moore’s law, prediction made by American engineer Gordon Moore in 1965 that the number
of transistors per silicon chip doubles every year.
For a special issue of the journal Electronics, Moore was asked to predict developments over the
next decade. Observing that the total number of components in these circuits had roughly
doubled each year, he blithely extrapolated this annual doubling to the next decade, estimating
that microcircuits of 1975 would contain an astounding 65,000 components per chip. In 1975, as
the rate of growth began to slow, Moore revised his time frame to two years. His revised law was
a bit pessimistic; over roughly 50 years from 1961, the number of transistors doubled
approximately every 18 months. Subsequently, magazines regularly referred to Moore’s law as
though it were inexorable—a technological law with the assurance of Newton’s laws of motion.
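The exponential growth described above is easy to quantify. The short sketch below projects a transistor count under the 18-month doubling period quoted in the text; the 2,300-transistor starting point (Intel's 4004 of 1971) is used only as an illustrative baseline:

```python
def transistor_count(initial, years, doubling_period=1.5):
    """Project a transistor count after `years`, doubling every `doubling_period` years."""
    return initial * 2 ** (years / doubling_period)

# Starting from ~2,300 transistors (Intel's 4004, 1971), project 30 years ahead:
# 30 years / 1.5 = 20 doublings.
print(f"{transistor_count(2300, 30):,.0f} transistors")  # → 2,411,724,800 transistors
```

Twenty doublings multiply the count by about a million, which is why the growth is called exponential.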
Figure 7.2: Depiction of Moore’s Law

7.2 THE NEED FOR MULTICORE
Multicore processors are replacing traditional, single-core processors so that fewer single-core
processors are being produced and supported. Consequently, single-core processors are
becoming technologically obsolete. Heterogeneous multicore processors, such as computer-on-a-
chip processors, are becoming more common.
Although multicore processors have largely saturated some application domains (e.g., cloud
computing, data warehousing, and on-line shopping), they are just starting to be used in real-
time, safety- and security-critical, cyber-physical systems. One area in which multicore
processing is becoming popular is in environments constrained by size, weight, power, and
cooling (SWaP-C), in which significantly increased performance is required.
7.2.1 Multicore Basics
In this section, we will give a basic overview of multicore architecture.
Multicore programming helps you create concurrent systems for deployment on multicore
processor and multiprocessor systems. A multicore processor system is a single processor with
multiple execution cores in one chip. Though manufacturer designs vary from one another,
multicore architectures share certain fundamental aspects. The basic configuration of
a single-core microprocessor is shown in Figure 7.3.
Figure 7.3: Single-core processor
A multi-core processor is an integrated circuit (IC) to which two or more processors have been
attached for enhanced performance, reduced power consumption, and more efficient
simultaneous processing of multiple tasks. It is a growing industry trend, as single-core
processors rapidly reach the physical limits of possible complexity and speed. A basic block
diagram of a generic multi-core processor is shown in Figure 7.4.
Figure 7.4: Block Diagram of General Multi-core Processor
The first issue is the communication between the core processors and main memory in a
multicore environment. This is done either through a single communication bus (the
“shared memory model”) or an interconnection network (the “distributed memory model”), as shown in
Figure 7.5. The single-bus approach has an upper limit of about 32 cores; beyond that, the bus becomes
overloaded with transactions, which lowers the performance of the system.
Figure 7.5: (a) Shared Memory Model, (b) Distributed Memory Model
Multicore processing is now commonplace because it offers advantages in the following
seven areas:
1. Energy Efficiency. By using multicore processors, architects can decrease the number of
embedded computers. They overcome increased heat generation due to Moore's Law (i.e.,
smaller circuits increase electrical resistance, which creates more heat), which in turn
decreases the need for cooling. The use of multicore processing reduces power
consumption (less energy wasted as heat), which increases battery life.
2. True Concurrency. By allocating applications to different cores, multicore processing
increases the intrinsic support for actual (as opposed to virtual) parallel processing within
individual software applications and across multiple applications.
3. Performance. Multicore processing can increase performance by running multiple
applications concurrently. The decreased distance between cores on an integrated chip
enables shorter resource access latency and higher cache speeds when compared to using
separate processors or computers. However, the size of the performance increase depends
on the number of cores, the level of real concurrency in the actual software, and the use
of shared resources.
4. Isolation. Multicore processors may improve (but do not guarantee) spatial and temporal
isolation (segregation) compared to single-core architectures. Software running on one
core is less likely to affect software on another core than if both are executing on the
same single core. This decoupling is due to both spatial isolation (of data in core-specific
caches) and temporal isolation, because threads on one core are not delayed by threads on
another core. Multicore processing may also improve robustness by localizing the impact
of defects to a single core. This increased isolation is particularly important in the
independent execution of mixed-criticality applications (mission-critical, safety critical,
and security-critical).
5. Reliability and Robustness. Allocating software to multiple cores increases reliability
and robustness (i.e., fault and failure tolerance) by limiting fault and/or failure
propagation from software on one core to software on another. The allocation of software
to multiple cores also supports failure tolerance by supporting failover from one core to
another (and subsequent recovery).
6. Obsolescence Avoidance. The use of multicore processors enables architects to avoid
technological obsolescence and improve maintainability. Chip manufacturers are
applying the latest technical advances to their multicore chips. As the number of cores
continues to increase, it becomes increasingly hard to obtain single-core chips.
7. Hardware Costs. By using multicore processors, architects can produce systems with
fewer computers and processors.
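The performance caveat in point 3 — that the gain depends on the number of cores and the level of real concurrency in the software — is commonly estimated with Amdahl's law. The chapter does not present the law by name, so the sketch below is supplementary:

```python
def amdahl_speedup(parallel_fraction, cores):
    """Upper bound on speedup when only a fraction of the work can use extra cores."""
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / cores)

# A program that is only 50% parallel gains little even on 8 cores.
print(round(amdahl_speedup(0.50, 8), 2))  # → 1.78
print(round(amdahl_speedup(0.95, 8), 2))  # → 5.93
```

The serial fraction dominates: with half the work serial, no number of cores can push the speedup past 2x.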
7.3 MULTICORE IMPLEMENTATIONS
As with any technology, multicore architectures from different manufacturers vary
considerably. Besides differences in communication and memory configuration,
architectures also differ in how many cores the microprocessor has. In some
multicore architectures, distinct cores have distinct functions, and are therefore
heterogeneous.
7.3.1 Intel and AMD Dual-Core Processors
Intel and AMD are the mainstream manufacturers of microprocessors, and each produces
many different flavours of multicore processors.
Example: The Pentium D is used in desktops, Core 2 Duo is used in both laptop and
desktop environments, and the Xeon processor is used in servers. AMD also generates
several different flavours.
Figure 7.6: (a) Intel Core 2 Duo, (b) AMD Athlon 64 X2
In Figure 7.6, you can see the block diagrams for the Core 2 Duo and Athlon 64 X2,
respectively. Both architectures are homogeneous dual-core processors. The Core 2 Duo
adheres to a shared memory model with private L1 caches as well as a shared L2 cache which
“provides a peak transfer rate of 96 GB/sec.”
If an L1 cache miss happens, both the L2 cache and the second core’s L1 cache are searched in
parallel before a request is sent to main memory. In contrast, the Athlon follows a
distributed memory model with discrete L2 caches. These L2 caches share a system request
interface, removing the need for a bus.
The system request interface also links the cores with an on-chip memory controller and
an interconnect called HyperTransport. HyperTransport effectively decreases the
number of buses needed in a system, reducing bottlenecks and enhancing bandwidth. The
Core 2 Duo instead uses a bus interface, and it also has explicit thermal and
power control units on-chip. There is no definitive performance advantage of a bus versus an
interconnect; the Core 2 Duo and Athlon 64 X2 achieve comparable performance,
each using a different communication protocol.
7.3.2 Tilera TILE64
Tilera has created a multicore chip with 64 homogeneous cores
arranged in a grid, depicted in Figure 7.7. An application that is written to take advantage of
these extra cores will run far quicker than if it were run on a single core. Imagine having
a project to finish, but rather than having to work on it alone you have 64 people to do it
for you. Each core has its own L1 and L2 cache, for a total of 5 MB on-chip, and a
switch that connects the core into the mesh network instead of a bus or interconnect.
Figure 7.7: Tilera TILE64
The TILE64 also includes on-chip memory and I/O controllers. Similar to the CELL processor,
unused tiles (cores) can be put into a sleep mode to further reduce power consumption. The
TILE64 uses a 3-way VLIW (very long instruction word) pipeline to deliver 12 times the
instructions of a single-issue, single-core processor.
7.4 MULTICORE CHALLENGES
With multiple cores on a single chip, several issues and challenges arise. Power and
temperature management are two concerns that can grow dramatically as more cores are added.
Memory/cache coherence is another challenge, since all the designs discussed above have
distributed L1 and, in some cases, L2 caches which must be coordinated. Finally,
using a multicore processor to its full potential is another issue.
If programmers don't write applications that take advantage of
multiple cores there is no benefit, and in some cases there is even a loss of performance.
An application needs to be written so that different parts can run simultaneously (with no
links to other parts of the application being run at the same time).
7.4.1 Power and Temperature
If two cores were placed on a single chip with no modification, the chip would,
in theory, consume twice as much power and generate a large amount of heat. In the extreme
case, if a processor overheats, your computer might even combust. To
account for this, each of the designs above runs its multiple cores at a lower frequency to reduce
power consumption.
To combat unnecessary power consumption, several designs also incorporate a power control unit
that can shut down idle cores or limit the amount of power they use. By powering
off unused cores and using clock gating, the amount of leakage in the chip is reduced. To
reduce the heat produced by multiple cores on a single chip, the chip is architected so that
hot spots do not grow too large and the heat is spread across the chip.
Figure 7.8: CELL Thermal Diagram
As shown in Figure 7.8, most of the heat in the CELL processor is dissipated in the
Power Processing Element and the rest is spread across the Synergistic Processing
Elements. The CELL processor follows a common trend of building temperature monitoring
into the system, with one linear sensor and ten internal digital sensors.
7.4.2 Cache Coherence
Cache coherence is the uniformity of shared resource data that ends up stored in multiple local
caches. When clients in a system maintain caches of a common memory resource, problems may
arise with incoherent data, which is particularly the case with CPUs in a multiprocessing system.
For illustration, consider two clients that each have a cached copy of a particular
memory block from a previous read. Suppose one client updates/changes that
memory block; the other client could be left with an invalid cached copy without any
notification of the change. Cache coherence is intended to manage such conflicts by maintaining
a coherent view of the data values in multiple caches.
Example: Imagine a dual-core processor where each core brings a block of memory into its
private cache. One core writes a value to a particular location; when the second core
attempts to read that value from its cache, it will not have the updated copy unless its
cache entry is invalidated and a cache miss occurs. This cache miss forces the second core's
cache entry to be updated. If this coherence policy were not in place, garbage
data would be read and invalid results would be produced, possibly crashing the
program or the entire computer.
Generally, there are two schemes for cache coherence: a snooping protocol and a
directory-based protocol. The snooping protocol works only with a bus-based system, and
uses a number of states to decide whether it needs to update cache entries and
whether it has control over writing to the block. The directory-based protocol can be utilized
on an arbitrary network and is, hence, scalable to several processors or cores, as opposed to
snooping which isn’t scalable.
In such a scheme, a directory holds information about which memory
locations are being shared in multiple caches and which are used exclusively by one
core’s cache. The directory knows when a block needs to be updated or invalidated.
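The write-invalidate behaviour these protocols enforce can be sketched in a few lines. This is a toy model for illustration only — real protocols such as MESI track per-line states in hardware:

```python
class CoreCache:
    """Toy per-core cache: a map from addresses to locally cached values."""
    def __init__(self):
        self.lines = {}

    def read(self, addr, memory):
        if addr not in self.lines:        # miss: fetch the block from memory
            self.lines[addr] = memory[addr]
        return self.lines[addr]

def coherent_write(addr, value, memory, writer, others):
    """Write-invalidate: update memory and the writer, invalidate all other copies."""
    memory[addr] = value
    writer.lines[addr] = value
    for cache in others:
        cache.lines.pop(addr, None)       # forces a miss (and a re-fetch) on next read

memory = {0x40: 1}
core0, core1 = CoreCache(), CoreCache()
core0.read(0x40, memory); core1.read(0x40, memory)  # both cores cache the block
coherent_write(0x40, 2, memory, core0, [core1])     # core 0 writes the block
print(core1.read(0x40, memory))                     # → 2 (stale copy was invalidated)
```

Without the invalidation step, core 1's read would return the stale value 1 — exactly the garbage-data scenario described in the example above.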
7.5 OPEN ISSUES
In this section, we will discuss some issues related to multicore processors.
7.5.1 Improved Memory System
With several cores on a single chip, there is a tremendous need for an improved memory
system. 32-bit processors, such as the Pentium 4, can address up to 4 GB of main memory.
With cores now using 64-bit addresses, the amount of addressable memory is almost limitless.
An improved memory system is imperative; more main memory and larger caches are needed
for multithreaded multiprocessors.
7.5.2 System Bus and Interconnection Networks
Additional memory will be useless if the amount of time required for memory requests does not
improve as well. Redesigning the interconnection network between cores is a major focus of
chip manufacturers. A faster network means lower latency in inter-core
communication as well as memory transactions.
Intel is developing its QuickPath interconnect, a 20-bit wide bus operating
between 4.8 and 6.4 GHz; AMD's new HyperTransport 3.0 is a 32-bit wide
bus operating at 5.2 GHz. A different kind of interconnect is seen in the TILE64's iMesh,
which consists of five networks used to handle I/O and off-chip memory communication.
7.5.3 Parallel Programming
To make use of multicore, you really need to use multiple threads. If you know how
to do it, it's not bad. However, the first time you do it there are plenty of ways
to trip yourself up. The bugs you introduce with multithreading are much harder to find.
Since the number of cores in a processor is set to double at regular intervals, it
follows that the software running on those cores must take this into account. Finally, software
engineers need to learn how to write parallel programs that can be split up and run
simultaneously on multiple cores, instead of trying to exploit single-core
hardware to increase the parallelism of sequential programs.
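The decomposition described above — independent pieces of work with no links between them — can be sketched with Python's standard thread pool. This illustrates the pattern only; in CPython, CPU-bound code would use processes rather than threads to actually occupy multiple cores:

```python
from concurrent.futures import ThreadPoolExecutor

def partial_sum(chunk):
    """One independent unit of work: no shared state with the other chunks."""
    return sum(x * x for x in chunk)

def parallel_sum_of_squares(data, workers=4):
    # Split the data into one independent chunk per worker.
    size = (len(data) + workers - 1) // workers
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(partial_sum, chunks))

print(parallel_sum_of_squares(list(range(1000))))  # → 332833500, same as the serial sum
```

Because the chunks share nothing, no locking is needed and the result is identical regardless of how the work is scheduled across cores.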
7.5.4 Homogeneous vs Heterogeneous Cores
Architects have debated whether the cores in a multicore environment should be
heterogeneous or homogeneous, and there is no definitive answer yet. Homogeneous cores are
all exactly the same: equal frequencies, cache sizes, functions, etc. In contrast, each core in a
heterogeneous system may have a different memory model, function, frequency, etc.
There is an obvious trade-off between processor complexity and customization. Every design
discussed above uses homogeneous cores except the CELL processor, which has one Power
Processing Element and eight Synergistic Processing Elements. Homogeneous cores are
simpler to produce, since the same instruction set is used across all cores and every core
contains identical hardware. But are they the most effective use of multicore
technology?
Each core in a heterogeneous environment could have a specific function and run its own
instruction set. Building on the CELL example, a heterogeneous model could have
a large centralized core built for generic processing and running an OS, a graphics core,
a cryptographic core, a communications core, an enhanced arithmetic
core, an audio core, and so on. This model is more complicated, but may
have power, efficiency, and thermal benefits that outweigh its complexity.
With major manufacturers on both sides of this issue, the debate will continue for
years to come; it will be interesting to see which side emerges on top.
7.5.5 False Sharing
The smallest unit of memory that two processors interchange is a cache line or cache sector. Two
separate caches can share a cache line when they both need to read it, but if the line is written in
one cache, and read in another, it must be shipped between caches, even if the locations of
interest are disjoint.
Like two people writing in different parts of a log book, the writes are independent, but unless
the book can be ripped apart, the writers must pass the book back and forth. In the same way,
two hardware threads writing to different locations contend for a cache sector to the point where
it becomes a ping-pong game.
Figure 7.9: Cache Line Ping Ponging Caused by False Sharing
Figure 7.9 above illustrates such a ping-pong game. There are two threads, each running on a
different core. Each thread increments a different location belonging to the same cache line. But
because the locations belong to the same cache line, the cores must pass the sector back and forth
across the memory bus.
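Whether two addresses contend for the same cache sector is simple address arithmetic. The sketch below assumes a 64-byte cache line (a common size, though real hardware varies) and shows why padding a counter out to a full line avoids the ping-pong:

```python
LINE_SIZE = 64  # bytes per cache line -- a common size; actual hardware varies

def line_index(address):
    """Which cache line a byte address falls in."""
    return address // LINE_SIZE

# Two 8-byte counters packed side by side land in the same line: false sharing.
counter_a, counter_b = 0, 8
print(line_index(counter_a) == line_index(counter_b))  # → True

# Padding each counter out to a full cache line puts them in separate lines.
padded_a, padded_b = 0, LINE_SIZE
print(line_index(padded_a) == line_index(padded_b))    # → False
```

The padding wastes a little memory, but each thread's counter now lives in its own line, so writes by one core no longer invalidate the other core's copy.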
Learning Activity

Discuss the features of a microprocessor.
Shrek’s Law Meets Moore’s Law
With the stakes higher for each new computer-generated (CG) animated 3D feature
film, DreamWorks Animation has an ever-increasing need for computing
performance. Using Intel® Xeon® processor 5600 series-based platforms, the
studio is achieving more than a 60 percent performance increase over previous-
generation systems. DreamWorks Animation is using that boost in performance to
help deliver two stereoscopic 3D animated films in 2011: Kung Fu Panda 2* and
Puss in Boots*.
Challenge
High-performance, stereoscopic, 3D CG animation. DreamWorks Animation artistic teams
are ambitious and continually need more compute power, including performance, throughput,
and capacity, to produce increasingly visually rich characters, environments, and effects in
high quality, stereoscopic 3D.
Solutions
• Intel Xeon processors: DreamWorks Animation optimizes its data centers and
maximizes rendering throughput with render farms of the newest HP ProLiant* G7,
G6, and G5 server blades powered by Intel Xeon processors. Animators and
filmmakers use Intel Xeon processor based workstations to work more interactively.
Servers and workstations run Red Hat Enterprise Linux* V5, and servers are
configured with 48 GB of RAM.
• Collaboration: DreamWorks Animation uses Intel® visual computing software tools
and works with Intel software experts to optimize its applications for Intel
technologies.
Impact
• Twice the throughput per processor: The Intel Xeon processor 5600 series more
than doubles the rendering throughput per processor of previous generations while
improving density and energy efficiency.
• Unleashed creativity: Animators and filmmakers shorten the cycle time between
ideas and images. They’re better able to bring their creativity to life and elicit the
“wow” factor from audiences at the theatre and on home entertainment systems.
• Improved efficiency: Despite the rising complexity and sophistication of its films,
DreamWorks Animation continually improves efficiency and maximizes its
production dollars.
Kung Fu Panda 2
In 1965, Intel pioneer Gordon Moore predicted the steady increases in processing
performance that we continue to see today. That performance has been a boon to
DreamWorks Animation, which has applied the increasing compute power to tell more
visually rich stories. This is most apparent from sequel to sequel. In fact, DreamWorks
Animation, in deference to Moore’s Law, coined “Shrek’s Law”: Every sequel will need
double the render power of the film before it. The original Shrek movie
required five million rendering hours; Shrek 2* required 10 million; and Shrek the Third
required 20 million. In 2010, the final Shrek feature film, Shrek Forever After*, consumed
over 45 million rendering hours.
Continuing that trend, DreamWorks Animation’s latest feature, Kung Fu Panda 2, used more
than 55 million rendering hours—more than double the original Kung Fu Panda in 2008.
DreamWorks Animation produces all its CG animated feature films in stereoscopic 3D. Each
shot in the film is conceived in 3D from the beginning of the filmmaking process, allowing
the use of depth to further the storytelling and immerse audiences into the characters’
environment.
Add in rising audience expectations and higher-definition devices, and it’s no wonder each
new film places increasingly heavier demands on server, desktop, and storage infrastructure
throughout its development.
“We use technology to enable the creative ambition of our artists and filmmakers,” says Ed
Leonard, DreamWorks Animation chief technology officer.
Intel Xeon Processors in the Data Centre and on the Desktop
To meet its rigorous infrastructure requirements, DreamWorks Animation provisions its data
centres with servers and storage systems powered by the latest Intel Xeon technologies. The
company was quick to adopt the Intel Xeon processor 5500 series and its energy-efficient
performance, as well as the Intel Xeon processor 5600 series, which uses Intel’s industry-
leading 32nm process technology to provide additional improvements in performance,
memory bandwidth, and energy consumption. Along with server processors, DreamWorks
Animation uses the latest Intel Xeon processors in the desktop workstations its artists use to
create their visions and bring them to virtual life. The company uses Intel® compilers and
tools to performance-tune its rendering and simulation applications, and Intel Xeon processor-
based storage solutions to keep pace with swelling data volumes.
“With Kung Fu Panda 2 we hit computational load peaks nearly 50 per cent higher than any
previous production, and our upcoming films will be higher still,” says Leonard. “Intel’s
contributions to performance and throughput are critical for us in both the software and
hardware arenas.”
Teamwork Drives Advances
In addition to relying on Intel technologies, DreamWorks Animation has forged a strategic
alliance with Intel to achieve DreamWorks Animation’s goals and accelerate the evolution of
high-quality 3D entertainment. Intel engineers work closely with the studio to support its 3D
leadership and advance the company toward its vision of highly interactive toolsets.
The collaboration, along with the end-to-end Intel technologies, pays off for DreamWorks
Animation as it continues its drive to lead the digital entertainment industry and delight
audiences with riveting, high-quality entertainment experiences. “The work we’re doing on
Intel’s scalable multicore platforms allows us to deliver the highest-end computer graphics
software to our artists,” says Lincoln Wallen, head of research and development at
DreamWorks Animation. “Our partnership with Intel provides a unique and incredibly
valuable collaboration between their technology experts and ours.”
Questions:
1. What is the purpose of DreamWorks Animation?
2. Discuss the role of DreamWorks Animation in delivering 3D animated film in 2011: Kung
Fu Panda 2*.

1. Most systems have between 1 GB and 4 GB of main memory, compared with
roughly 32 KB of L1 and 2 MB of L2 cache.
2. Using five mesh networks gives the Tile architecture a per-tile (or
core) bandwidth of up to 1.28 Tbps (terabits per second).
SUMMARY
• A multicore processor is an integrated circuit (IC) to which two or more processors have
been attached for enhanced performance, reduced power consumption, and more efficient
simultaneous processing of multiple tasks.
• Recently, multicore technology has become mainstream, with Intel and Advanced
Micro Devices (AMD) introducing many commercially available multicore chips.
• According to Moore's Law, the number of transistors on a chip roughly
doubles every two years.
• Intel and AMD are the mainstream manufacturers of microprocessors.
• Multicore processors face various challenges, such as power and temperature management,
cache coherence, and so on.
• Power and temperature management are two concerns that can increase dramatically with
the addition of multiple cores.
• Cache coherence is a concern in a multicore environment because of distributed L1 and
L2 caches.
• There are several open issues related to multicore processors, such as improved
memory systems, parallel programming, and homogeneous versus heterogeneous cores.
KEYWORDS
Multicore Processor: A multi-core processor is a computer processor on a single integrated
circuit with two or more separate processing units, called cores, each of which reads and
executes program instructions.
Moore’s Law: The principle that the speed and capability of computers can be expected to
double every two years, as a result of increases in the number of transistors a microchip can
contain.
Microprocessor: A microprocessor is a computer processor where the data processing logic and
control is included on a single integrated circuit, or a small number of integrated circuits.
Level 1 Cache: It is a memory which stores the data frequently used by the processor.
Level 2 Cache: It is a memory which is used to store recently accessed information.
Level 3 Cache: It is a specialized memory that works hand-in-hand with L1 and L2 cache
to improve computer performance.
SELF-ASSESSMENT QUESTIONS
Short Answer Questions
1. What is Moore's Law?
2. Discuss the significance of multicore processors.
3. Compare and contrast homogeneous and heterogeneous cores.
4. List the schemes used for cache coherence.
5. How do multicore processors overcome the shortcomings of single-core processors? Discuss
briefly.
Long Answer Questions
1. Explain the basic architecture of a microprocessor in detail.
2. Describe the concept of homogeneous dual-core processors.
3. Describe the various challenges faced by multicore processors.
4. Explain why cache coherence is considered a primary concern in a multicore
environment.
5. Explain the open issues related to multicore processors.
FURTHER READING
1. Hamacher Carl. 2002. “Computer Organization”, 5th Edition, Tata McGraw Hill.
2. Hennessy L. John. 2012. “Computer Architecture: A Quantitative Approach”, Elsevier.
3. Patterson A. David. 2011. “Computer Organization and Design”, Elsevier.
4. Tanenbaum S. Andrew. 2012. “Structured Computer Organization”, Prentice Hall PTR.
UNIT IV- MEMORY

Chapter 8 - Memory
CONTENTS
Learning Objectives
Learning Outcomes
Overview
8.1 Physical Memory
8.1.1 Types of RAM
8.2 Addressing
8.2.1 Types of Addressing
8.3 Virtual Memory
8.3.1 Need for Virtual Memory
8.3.2 Virtual Memory Working
8.4 Paging
8.4.1 Page Tables
8.5 Address Translation
8.5.1 Huge Page Tables
8.5.2 Inverted Page Tables
Summary
Keywords
Self-Assessment Questions
Further Reading

LEARNING OBJECTIVES
After studying this chapter, you will be able to:
• Gain knowledge about physical memory
• Illustrate the various types of addressing modes
• Understand the concept of virtual memory
• Analyze the concept of paging
• Understand the translation of addresses from virtual to physical
LEARNING OUTCOMES
Upon completion of the chapter, students are able to understand:
• The basics of physical memory
• Addressing and its types
• How virtual memory works
• The basics of paging and address translation
OVERVIEW
In the previous chapter, we examined the concept of multicore processors. We discussed the
need for multicore, multicore implementations, and the challenges and open issues related to
multicore processors.
In this chapter, you will examine the concept of the memory system. The memory system is at
the heart of a computer system. Memory is defined as the storage area in which programs are kept
while they are running and that contains the data needed by the running programs. This
chapter will cover the concepts of physical memory, addressing modes, and virtual memory. You will
also understand the concepts of paging and address translation.
The fundamental feature of memory is that it allows very fast access to
instructions as well as data, no matter where the items are within it. Accordingly, memory
is where the computer stores instructions and data.
8.1 PHYSICAL MEMORY
Memory latency is traditionally quoted using two measures: access time and cycle time. Access
time is the time between when a read is requested and when the desired word arrives; cycle time
is the minimum time between requests to memory. One reason that cycle time is greater than
access time is that the memory needs the address lines to be stable between accesses.
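The two measures can be made concrete with a short calculation: access time bounds how soon one word arrives, while cycle time (the larger of the two) bounds how often requests can be issued. The timing figures below are hypothetical, not taken from any specific part:

```python
def peak_requests_per_second(cycle_time_ns):
    """Back-to-back requests can be issued at most once per cycle time."""
    return 1e9 / cycle_time_ns

access_ns, cycle_ns = 50, 70   # hypothetical DRAM-like timings
print(cycle_ns >= access_ns)                               # → True
print(round(peak_requests_per_second(cycle_ns) / 1e6, 1))  # → 14.3 (million requests/s)
```

Note that it is the cycle time, not the access time, that limits sustained request throughput.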
Physical memory refers to the actual RAM of the system, which usually takes the form of cards
(DIMMs) attached to the motherboard. Also called primary memory, it is the only storage type
directly accessible to the CPU and holds the instructions of programs to execute. Physical
memory is linearly addressable; memory addresses increase in a linear fashion and each byte is
directly addressable.
8.1.1 Types of RAM
The two widely used forms of modern RAM are static RAM (SRAM) and dynamic
RAM (DRAM). In SRAM, a bit of data is stored using the state of a six-transistor memory cell,
typically using six MOSFETs (metal-oxide-semiconductor field-effect transistors). This form of
RAM is more expensive to produce, but is generally faster and requires less dynamic power than
DRAM. In modern computers, SRAM is often used as cache memory for the CPU. DRAM
stores a bit of data using a transistor and capacitor pair (typically a MOSFET and MOS
capacitor, respectively), which together comprise a DRAM cell. The capacitor holds a high or
low charge (1 or 0, respectively), and the transistor acts as a switch that lets the control circuitry
on the chip read the capacitor's state of charge or change it. As this form of memory is less
expensive to produce than static RAM, it is the predominant form of computer memory used in
modern computers.
SRAM
Static RAM retains information in static form as long as power is supplied; because the stored
data is lost when the power supply is turned off, this memory is known as volatile memory. The
main use of static RAM is to build cache memory. It is more expensive than dynamic RAM, and
static RAM is faster than dynamic RAM. In SRAM, all data is stored in flip-flops; a flip-flop
holds each bit of this RAM. A flip-flop uses 4-6 transistors to make a memory cell, and its
circuit does not need continuous refreshing.
As long as dc power is applied to a static memory cell, it can retain a 1 or 0 state indefinitely. If
power is removed, the stored data bit is lost. The cell is selected by an active level on the Select
line, and a data bit (1 or 0) is written into the cell by placing it on the Data in line.
DRAM
DRAM is another type of semiconductor memory, designed specifically to store the data or
program files needed by computer processors to perform their functions.
In DRAM, a capacitor is used to store each bit of data. This is a very simple way to
store data, because it takes less area to store the same data than SRAM, so DRAM can hold
more data than SRAM in the same space; however, it requires frequent refreshing of its circuit
to maintain the charge, so it consumes more power than SRAM.
8.2 ADDRESSING
To perform any operation, the corresponding instruction must be given to the microprocessor. In each instruction, the programmer has to specify three things:
• Operation to be performed.
• Address of source of data.
• Address of destination of result.
In computing, a memory address is a reference to a specific memory location used at various
levels by software and hardware. Memory addresses are fixed-length sequences of digits, conventionally displayed and manipulated as unsigned integers. This numerical interpretation rests on features of the CPU (such as the instruction pointer and incremental address registers), as well as on the use of memory as an array, a view supported by various programming languages.
Types of Memory Addressing
• Physical Addressing
• Logical Addressing
Physical Addressing
A digital computer's main memory consists of many memory locations. Each memory location
has a physical address which is a code. The CPU (or other device) can use the code to access the
corresponding memory location. Generally only system software, i.e. the BIOS, operating
systems, and some specialized utility programs (e.g., memory testers), address physical memory
using machine code operands or processor registers, instructing the CPU to direct a hardware
device, called the memory controller, to use the memory bus or system bus, or
separate control, address and data busses, to execute the program's commands. The memory
controllers' bus consists of a number of parallel lines, each represented by a binary digit (bit).
The width of the bus, and thus the number of addressable storage units, and the number of bits in
each unit, varies among computers.


Logical Addressing
A computer program uses memory addresses to execute machine code, and to store and
retrieve data. In early computers logical and physical addresses corresponded, but since the
introduction of virtual memory most application programs do not have knowledge of physical
addresses. Rather, they address logical addresses, or virtual addresses, using the
computer's memory management unit and operating system memory mapping.
8.2.1 Types of Addressing modes
Implied addressing mode
In this mode the operands are specified implicitly in the definition of the instruction. For
example the ‘complement accumulator’ instruction is an implied mode instruction because the
operand in the accumulator register is implied in the definition of the instruction itself. All
register reference instructions that use an accumulator are implied mode instructions. Zero
address instructions in a stack organized computer are implied mode instructions since the
operands are implied to be on the top of the stack.
Example : CMA

Figure 8.1: Implied addressing mode

Immediate addressing mode

Figure 8.2: Immediate addressing mode

In this mode the operand is specified in the instruction itself. In other words, an immediate mode
instruction has an operand field rather than an address field. The operand field contains the actual
operand to be used in conjunction with the operation specified in the instruction. Immediate
mode instructions are useful for initializing registers to a constant value.
Example: ADD 5
Direct addressing mode
In this mode the effective address is equal to the address part of the instruction. The operand
resides in memory and its address is given directly by the address field of instruction. In a branch
type instruction the address field specifies the actual branch address.


Figure 8.3: Direct addressing mode


Indirect addressing mode
In this mode the address field of the instruction gives the address where the effective address is
stored in memory/register. Control fetches the instruction from memory and uses its address part
to access memory again to read the effective address.
EA = address contained in register/memory location.

Figure 8.4:Indirect addressing mode

Register addressing mode


In this mode the operands are in registers that reside within the CPU; the effective address is the register itself (EA = R).
Example : ADD R1,R2
Register indirect addressing mode
In this mode the instruction specifies a register in the CPU whose contents give the effective
address of the operand in the memory. In other words, the selected register contains the address
of the operand rather than the operand itself. Before using a register indirect mode instruction,
the programmer must ensure that the memory address of the operand is placed in the processor
register with a previous instruction. The advantage of a register indirect mode instruction is that
the address field of the instruction uses fewer bits to select a register than would have been
required to specify a memory address directly.

Figure 8.5: Register indirect addressing mode
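As a rough illustration, the effective-address rules described above can be sketched in Python. The memory contents, register names, and the `operand` helper are invented for the example:

```python
# Sketch of operand resolution for the addressing modes above.
# Memory and register contents are made-up illustrative values.

memory = {100: 7, 200: 100, 7: 42}
registers = {"R1": 100}

def operand(mode, field):
    """Return the operand value for an instruction field under a given mode."""
    if mode == "immediate":          # the field IS the operand
        return field
    if mode == "direct":             # the field is the operand's address
        return memory[field]
    if mode == "indirect":           # the field holds the address of the address
        return memory[memory[field]]
    if mode == "register":           # the field names a register holding the operand
        return registers[field]
    if mode == "register_indirect":  # the register holds the operand's address
        return memory[registers[field]]
    raise ValueError(mode)

print(operand("immediate", 5))            # 5
print(operand("direct", 100))             # 7
print(operand("indirect", 200))           # memory[memory[200]] = memory[100] = 7
print(operand("register", "R1"))          # 100
print(operand("register_indirect", "R1")) # memory[100] = 7
```

Note that indirect mode costs an extra memory access compared with direct mode, which is exactly the trade-off the text describes.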


8.3 VIRTUAL MEMORY


The physical main memory is not as large as the address space spanned by an address issued by
the processor. When a program does not completely fit into the main memory, the parts of it not
currently being executed are stored on secondary storage devices, such as magnetic disks. Of
course, all parts of a program that are eventually executed are first brought into the main
memory. When a new segment of a program is to be moved into a full memory, it must replace
another segment already in the memory. The operating system moves programs and data
automatically between the main memory and secondary storage. This process is known as
swapping.
8.3.1 Need for Virtual Memory
Virtual memory was developed when physical RAM was very expensive, and RAM is still more
expensive per Gigabyte than storage media such as hard disks and solid state drives. For that
reason it is much less costly to use a combination of physical RAM and virtual memory than to
equip a computer system with more RAM. Since using virtual memory (or increasing virtual
memory) has no extra financial cost (because it uses existing storage space) it offers a way for a
computer to use more memory than is physically available on the system.
8.3.2 Virtual Memory Working
With virtual memory, the application programmer does not need to be aware of limitations imposed by the
available main memory. Techniques that automatically move program and data blocks into the
physical main memory when they are required for execution are called virtual-memory
techniques. Programs, and hence the processor, reference an instruction and data space that is
independent of the available physical main memory space. The binary addresses that the
processor issues for either instructions or data are called virtual or logical addresses. These
addresses are translated into physical addresses by a combination of hardware and software
components. If a virtual address refers to a part of the program or data space that is currently in
the physical memory, then the contents of the appropriate location in the main memory are
accessed immediately. On the other hand, if the referenced address is not in the main memory, its
contents must be brought into a suitable location in the memory before they can be used.

Figure 8.6: Architecture


Figure shows a typical organization that implements virtual memory. A special hardware unit,
called the Memory Management Unit (MMU), translates virtual addresses into physical
addresses. When the desired data (or instructions) are in the main memory, these data are fetched
as described in our presentation of the cache mechanism. If the data are not in the main memory,
the MMU causes the operating system to bring the data into the memory from the disk. The
DMA scheme is used to perform the data transfer between the disk and the main memory.
The process of translating a virtual address into physical address is known as address translation.
It can be done with the help of MMU. A simple method for translating virtual addresses into
physical addresses is to assume that all programs and data are composed of fixed length units
called pages, each of which consists of a block of words that occupy contiguous locations in the
main memory. Pages commonly range from 2K to 16K bytes in length. They constitute the basic
unit of information that is moved between the main memory and the disk whenever the
translation mechanism determines that a move is required.
Pages should not be too small, because the access time of a magnetic disk is much longer
(several milliseconds) than the access time of the main memory. The reason for this is that it
takes a considerable amount of time to locate the data on the disk, but once located, the data can
be transferred at a rate of several megabytes per second. On the other hand, if pages are too large
it is possible that a substantial portion of a page may not be used, yet this unnecessary data will
occupy valuable space in the main memory.
The cache bridges the speed gap between the processor and the main memory and is
implemented in hardware. The virtual-memory mechanism bridges the size and speed gaps
between the main memory and secondary storage and is usually implemented in part by software
techniques. Conceptually, cache techniques and virtual- memory techniques are very similar.
They differ mainly in the details of their implementation.
Consider a virtual-memory address translation method based on the concept of fixed-length pages. Each
virtual address generated by the processor, whether it is for an instruction fetch or an operand
fetch/store operation, is interpreted as a virtual page number (high-order bits) followed by an
offset (low-order bits) that specifies the location of a particular byte (or word) within a page.
Information about the main memory location of each page is kept in a page table. This
information includes the main memory address where the page is stored and the current status of
the page.
An area in the main memory that can hold one page is called a page frame. The starting address
of the page table is kept in a page table base register. By adding the virtual page number to the
contents of this register, the address of the corresponding entry in the page table is obtained. The
contents of this location give the starting address of the page if that page currently resides in the
main memory. Each entry in the page table also includes some control bits that describe the
status of the page while it is in the main memory. One bit indicates the validity of the page, that
is, whether the page is actually loaded in the main memory.


This bit allows the operating system to invalidate the page without actually removing it. Another
bit indicates whether the page has been modified during its residency in the memory. As in cache
memories, this information is needed to determine whether the page should be written back to
the disk before it is removed from the main memory to make room for another page. Other
control bits indicate various restrictions that may be imposed on accessing the page. For
example, a program may be given full read and write permission, or it may be restricted to read
accesses only.
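The page-number/offset split described above can be sketched as follows, assuming 4 KB pages (one size within the 2K–16K range mentioned) and an invented three-entry page table with a valid bit per entry:

```python
PAGE_SIZE = 4096          # assumed 4 KB pages, within the 2K-16K range above
OFFSET_BITS = 12          # log2(PAGE_SIZE)

# page table: virtual page number -> (valid bit, page frame number)
page_table = {0: (1, 5), 1: (0, None), 2: (1, 9)}

def translate(vaddr):
    """Translate a virtual address to a physical address via the page table."""
    vpn = vaddr >> OFFSET_BITS        # high-order bits: virtual page number
    offset = vaddr & (PAGE_SIZE - 1)  # low-order bits: offset within the page
    valid, frame = page_table.get(vpn, (0, None))
    if not valid:                     # valid bit clear: page not in main memory
        raise RuntimeError(f"page fault on virtual page {vpn}")
    return (frame << OFFSET_BITS) | offset

print(hex(translate(0x2ABC)))  # VPN 2 maps to frame 9 -> 0x9abc
```

A reference to virtual page 1 would raise the simulated page fault, which in a real system causes the operating system to bring the page in from disk.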
8.4 PAGING
In a multiprogramming system memory is divided into a number of fixed-size or variable-sized
partitions or regions that are allocated to running processes. For example, a process that needs m
words of memory may run in a partition of n words, where n ≥ m. The variable size partition
scheme may result in a situation where available memory is not contiguous, but fragmented into
many scattered blocks. We distinguish between internal fragmentation and external
fragmentation. The difference (n – m) is called internal fragmentation, memory that is internal to
a partition but is not being used. If a partition is unused and available, but too small to be used by
any waiting process, then it is accounted for external fragmentation. These memory fragments
cannot be used. In order to solve this problem, we can either compact the memory making large
free memory blocks, or implement a paging scheme, which allows a program's memory to be non-contiguous, thus permitting the program to be allocated physical memory wherever it is available. Physical memory is
divided into fixed size blocks called frames. Logical memory is also divided into blocks of the
same, fixed size called pages. When a program is to be executed, its pages are loaded into any
available memory frames from the disk. The disk is also divided into fixed sized blocks that are
the same size as the memory frames. A very important aspect of paging is the clear separation
between the user’s view of memory and the actual physical memory.
Normally, a user believes that memory is one contiguous space containing only his/her program.
In fact, the logical memory is scattered through the physical memory that also contains other
programs. Thus, the user can work correctly with his/her own view of memory because of the
address translation or address mapping. The address mapping, which is controlled by the
operating system and transparent to users, translates logical memory addresses into physical
addresses. Because the operating system is managing the memory, it must be sure about the
nature of physical memory, for example: which frames are available, which are allocated; how
many total frames there are, and so on. All these parameters are kept in a data structure called
frame table that has one entry for each physical frame of memory indicating whether it is free or
allocated, and if allocated, to which page of which process. As there is much less physical
memory than virtual memory the operating system must be careful that it does not use the
physical memory inefficiently. One way to save physical memory is to load only virtual pages
that are currently being used by the executing program. For example, a database program may be
run to query a database. In this case not the entire database needs to be loaded into memory, just
those data records that are being examined.


Also, if the database query is a search query then it is not necessary to load the code from the
database that deals with adding new records. This technique of only loading virtual pages into
memory as they are accessed is known as demand paging. When a process attempts to access a
virtual address that is not currently in memory the CPU cannot find a page table entry for the
virtual page referenced. For example, in Figure 1, there is no entry in Process X’s page table for
virtual PFN 2 and so if Process X attempts to read from an address within virtual PFN 2 the CPU
cannot translate the address into a physical one. At this point the CPU cannot cope and needs the
operating system to fix things up. It notifies the operating system that a page fault has occurred
and the operating system makes the process wait whilst it fixes things up. The CPU must bring
the appropriate page into memory from the image on disk. Disk access takes a long time, and so
the process must wait quite a while until the page has been fetched. If there are other processes
that could run then the operating system will select one of them to run. The fetched page is
written into a free physical page frame and an entry for the virtual PFN is added to the processes
page table. The process is then restarted at the point where the memory fault occurred. This time
the virtual memory access is made, the CPU can make the address translation and so the process
continues to run. This is known as demand paging and occurs when the system is busy but also
when an image is first loaded into memory. This mechanism means that a process can execute an
image that only partially resides in physical memory at any one time.
The valid/invalid bit of the page table entry for a page, which is swapped in, is set as valid.
Otherwise it is set as invalid, which will have no effect as long as the program never attempts to
access this page. If all and only those pages actually needed are swapped in, the process will
execute exactly as if all pages were brought in.
If the process tries to access a page that was not swapped in, i.e., the valid/invalid bit of its page table entry is set to invalid, then a page fault trap will occur. Rather than signalling the usual "invalid address" error, this trap indicates the operating system's failure to bring a valid part of the program into memory at the right time, a consequence of trying to minimize swapping overhead.
In order to continue the execution of process, the operating system schedules a disk read
operation to bring the desired page into a newly allocated frame. After that, the corresponding
page table entry will be modified to indicate that the page is now in memory. Because the state
(program counter, registers etc.) of the interrupted process was saved when the page fault trap
occurred, the interrupted process can be restarted at the same place and state. As shown, it is
possible to execute programs even though parts of it are not (yet) in memory.
In the extreme case, a process with no pages in memory could begin execution. A page fault trap would occur on the first instruction; after that page was brought into memory, the process would continue to execute, faulting further until every page it needed was in memory. This kind of paging is called pure demand paging. Pure demand paging says: "never bring a page into memory until it is required".


8.4.1 Page Replacement Policies


There are many approaches to the problem of deciding which page to replace, but the objective is the same for all: select the page that will not be referenced again for the longest time. A few page replacement policies are described below.
First In First Out (FIFO)
The First In First Out (FIFO) replacement policy chooses the page that has been in the memory
the longest to be the one replaced.
Second Chance (SC)
The Second Chance (SC) policy is a slight modification of FIFO in order to avoid the problem of
replacing a heavily used page. In this policy, a reference bit R is used to keep track of pages that
have been recently referenced. This bit is set to 1 each time the page is referenced. Periodically,
all the reference bits are set to 0 by the operating system to distinguish pages that have not been
referenced recently from those that have been. Using this bit, the operating system can determine
whether old pages are still being used (i.e., R = 1). If so, the page is moved to the end of the list
of pages, and its load time is updated as though it had just arrived in memory. Then the search
continues. Thus heavily accessed pages are given a “second chance.”
Least Recently Used (LRU)
The Least Recently Used (LRU) replacement policy chooses to replace the page which has not
been referenced for the longest time. This policy assumes the recent past will approximate the
immediate future. The operating system keeps track of when each page was referenced by
recording the time of reference or by maintaining a stack of references.
Optimal Algorithm (OPT)
The optimal algorithm replaces the page that will not be used for the longest period of time. It is optimal in performance, but not feasible to implement because future references cannot be predicted. Example: consider the reference string 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5 with 4 frames. The frame contents after each page fault are:

Fault on:  1   2   3   4   5   4
Frame 1:   1   1   1   1   1   1
Frame 2:   -   2   2   2   2   2
Frame 3:   -   -   3   3   3   4
Frame 4:   -   -   -   4   5   5

The reference to 5 replaces 4, which is needed furthest in the future; the final reference to 4 replaces 3, which is never needed again (1 or 2 would do equally well). The total is 6 page faults.
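As a sketch, FIFO, LRU, and OPT can be simulated on the reference string above with 4 frames; the `count_faults` helper and the victim-selection functions are invented names, not a standard API:

```python
def count_faults(refs, nframes, victim):
    """Count page faults; `victim` picks which resident page to evict."""
    frames, faults = [], 0
    for i, page in enumerate(refs):
        if page in frames:
            continue                         # hit
        faults += 1
        if len(frames) == nframes:
            frames.remove(victim(frames, refs, i))
        frames.append(page)
    return faults

def fifo(frames, refs, i):   # oldest arrival (list order tracks arrival)
    return frames[0]

def lru(frames, refs, i):    # least recently referenced in the past
    return min(frames, key=lambda p: max(j for j in range(i) if refs[j] == p))

def opt(frames, refs, i):    # used furthest in the future, or never again
    def next_use(p):
        later = [j for j in range(i + 1, len(refs)) if refs[j] == p]
        return later[0] if later else float("inf")
    return max(frames, key=next_use)

refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
print(count_faults(refs, 4, fifo))  # 10
print(count_faults(refs, 4, lru))   # 8
print(count_faults(refs, 4, opt))   # 6
```

The simulation confirms that OPT gives the lower bound of 6 faults for this string, with LRU at 8 and FIFO at 10.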
Least Frequently Used (LFU)
The Least Frequently Used (LFU) replacement policy selects a page for replacement if the page
had not been used often in the past. This policy keeps count of the number of times that a page is
accessed. Pages with the lowest counts are replaced while pages with higher counts remain in
primary memory.


8.5 ADDRESS TRANSLATION


The process of translating a virtual address into physical address is known as address translation.
It can be done with the help of MMU. A simple method for translating virtual addresses into
physical addresses is to assume that all programs and data are composed of fixed-length units
called pages, each of which consists of a block of words that occupy contiguous locations in the
main memory. Pages commonly range from 2K to 16K bytes in length. They constitute the basic
unit of information that is moved between the main memory and the disk whenever the
translation mechanism determines that a move is required.
Pages should not be too small, because the access time of a magnetic disk is much longer
(several milliseconds) than the access time of the main memory. The reason for this is that it
takes a considerable amount of time to locate the data on the disk, but once located, the data can
be transferred at a rate of several megabytes per second. On the other hand, if pages are too large
it is possible that a substantial portion of a page may not be used, yet this unnecessary data will
occupy valuable space in the main memory.
8.5.1 Huge Page Tables
Page tables can be very large. If every virtual page was valid, our page table would be 2^20 ×
21 bits. This is about 3 Megs just for one program's page table. If there are many programs,
there are many tables, each occupying a lot of memory.
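The "about 3 Megs" figure can be checked with quick arithmetic, assuming as above a table of 2^20 entries of 21 bits each:

```python
entries = 2 ** 20        # one PTE per virtual page (20-bit virtual page number)
bits_per_pte = 21        # frame number plus control bits, as assumed above
total_bytes = entries * bits_per_pte // 8
print(total_bytes / 2 ** 20)  # 2.625 -- about 3 MiB per page table
```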
What’s worse, the page tables we’ve been talking about are incomplete. If we have a page
fault, we need to find the page on disk. Where is it located? That information is kept in
another page table, which is indexed by the virtual page, and tells you where on disk to find
it. Then, we have to copy that page to RAM, and update the first page table. Thus, we need
two page tables!
These page tables are basically just data. Thus, they occupy memory as any data occupies
memory. When you switch from one process to another, you are required to load its page
table in RAM for easy access. It’s useful to keep it located in certain parts of RAM for just
such a purpose.
A page table register can hold the physical address of the page table that’s currently active
to get quick access. Still, these are large, and we may want to find ways to speed things up.
Inverted Page Tables
There are many schemes to reduce the size of a page table. One way is to use a hierarchy.
Another idea is to use a kind of closed hash table. The hash table’s size is based on the
number of physical pages. The number of physical pages is usually a lot smaller than the
number of all virtual pages put together.
A hash function takes a virtual page number as input, and produces an index into the hash
table as the result. Each entry of the hash table consists of a virtual page number and a
physical page number. You check to see if the virtual page number matched, and if so, then


you use the physical page.


If it missed, then you must resolve the collision based on the hash table. In practice, you
may need the number of entries of the hash table to be a few times larger than the number of
physical pages, to avoid excessive collisions.
An inverted page table takes longer to access because you may have collisions, but it takes up
a lot less memory. It helps with page hits. However, if you have a page fault, you still need a
page table that maps virtual pages to disk pages, and that will be large.
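A minimal sketch of such a hashed (inverted) page table, with invented sizes, a toy modulo hash, and per-bucket collision chains:

```python
NUM_PHYS_PAGES = 8
TABLE_SIZE = 2 * NUM_PHYS_PAGES   # a few times the physical page count

# each bucket holds (virtual page, physical page) pairs: a collision chain
table = [[] for _ in range(TABLE_SIZE)]

def hash_vpn(vpn):
    return vpn % TABLE_SIZE       # toy hash function for illustration

def insert(vpn, ppn):
    table[hash_vpn(vpn)].append((vpn, ppn))

def lookup(vpn):
    """Return the physical page for vpn, or None on a miss (page fault)."""
    for v, p in table[hash_vpn(vpn)]:
        if v == vpn:              # check the virtual page number matches
            return p
    return None

insert(0x12345, 3)
insert(0x12345 + TABLE_SIZE, 5)   # collides with the entry above
print(lookup(0x12345))            # 3
print(lookup(0x99999))            # None -> page fault; consult the disk map
```

The table size is proportional to physical memory, not to the full virtual address space, which is what makes the inverted organization compact.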

Learning Activity

Discuss in detail about virtual memory working.

Virtual Memory Management in VMware: Transparent Memory sharing


The case study is based on a benchmark using a simulated workload that generates contention
for machine memory. VMware ESX Server software was installed on a Dell Optiplex 990
with an Intel i7 quadcore processor and 16GB of RAM. (Hyper-Threading was disabled on
the processor through the BIOS.) Then four identical Windows Server 2012 guest machines
were configured, each configured to run with 8 GB of physical memory. Each Windows guest
is running a 64-bit application benchmark program designated as the
ThreadContentionGenerator.exe, which the author developed.
The benchmark program was written using the .NET Framework. The program allocates a
very large block of private memory and accesses that memory randomly. The benchmark
program is multi-threaded, and updates the allocated array using explicit locking to maintain
the integrity of its internal data structures. Executing threads also simulate IO waits
periodically by going to sleep, instead of executing reads or writes against the file system to
avoid exercising the machine’s physical disks.
Performance data from both VMware and the Windows guest machines was gathered at one
minute intervals for the duration of the benchmark testing, approximately 2 hours. For
comparison purposes, a single guest machine was activated to execute the same benchmark in
a standalone environment where there was no contention for machine memory. Running
standalone, with no memory contention, the benchmark executes in about 30 minutes.
Transparent memory sharing
Transparent memory sharing is one of the key memory management mechanisms that
supports aggressive server consolidation. Dynamically, VMware detects memory pages that
are identical within or across guest machine images. When identical pages are detected,
VMware maps them to a single page in machine memory. When guest machines are largely


idle, transparent memory sharing enables VM to pack guest machine images efficiently into a
single hardware platform, especially for machines running the same OS, the same OS version,
and the same applications. However, when guest machines are active, the benefits of
transparent memory sharing are evidently greatly reduced, as will soon be apparent.
VMware uses a background thread that scans guest machine pages continuously, looking for
duplicates. This process is illustrated in Figure 5. Candidates for memory sharing are found
by calculating a hash value from the contents of the page and looking for a collision in a Hash
Table that is built from the known hash values from other current pages. If a collision is
found, then the candidate for sharing is compared to the base page byte by byte. If the
contents of the candidate and the base pages match, then VMware points the PTE of the copy
to the same page of machine memory backing the base page.
Memory sharing is provisional. VMware uses a Copy on Write mechanism whenever a
shared page is modified and can no longer be shared. This is accomplished by flagging the
shared page PTE as Read Only. Then, when an instruction executes that attempts to store data
in the page, the hardware generates an addressing exception. VMware handles the exception
by creating a duplicate, and re-executing the Store instruction that failed against the duplicate.
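The scan-and-share mechanism described, hashing each page's contents and, on a hash match, comparing byte by byte before mapping to a shared backing page, might be sketched as follows (the function and variable names are invented, not VMware's):

```python
import hashlib

def share_pages(pages):
    """Map each guest page to a backing page index, sharing identical contents.

    Returns a list where entry i is the index of the machine page backing
    guest page i; pages with identical bytes share one backing page."""
    seen = {}        # content hash -> index of the base page
    backing = []
    for i, page in enumerate(pages):
        h = hashlib.sha256(page).hexdigest()
        base = seen.get(h)
        # on a hash hit, confirm byte for byte before sharing
        if base is not None and pages[base] == page:
            backing.append(base)
        else:
            seen[h] = i
            backing.append(i)
    return backing

pages = [b"\x00" * 4096, b"code", b"\x00" * 4096, b"code", b"data"]
print(share_pages(pages))  # [0, 1, 0, 1, 4]
```

A real implementation would also mark shared mappings read-only so that a later store triggers the copy-on-write duplication described above.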
Transparent memory sharing has great potential benefits, but there is some overhead
necessary to support the feature. One source of overhead is the processing by the background
thread. There are tuning parameters to control the rate at which these background memory
scans run, but, unfortunately, there are no associated performance counters reported that
would help the system administrator to adjust these parameters. The other source of overhead
results from the Copy on Write mechanism, which entails the handling of additional hardware
interrupts associated with the soft page faults. There is no metric that provides the rate these
additional soft page faults occur either.

Figure. Transparent memory sharing uses a background thread to scan memory pages,
compute a hash code value from their contents, and compare to other Hash codes that
have already been computed. In the case of a collision, the contents of the page that is a
candidate for sharing are compared byte by byte to the collided page. If the pages
contain identical content, VMware points both pages to same physical memory location.


In the case study, transparent memory sharing is initially extremely effective – when the guest
machines are largely idle. Figure 6 renders the Memory Shared performance counter from
each of the guest machines as a stacked area chart. At 9 AM, when the guest machines are
still idle, almost all the 8 GB granted to three of the machines (ESXAS12C, ESXAS12D, and
ESXAS12E) is being shared by pointing those pages to the machine memory pages that are
assigned to the 4th guest machine (ESXAS12B). Together, these three guest machines have
about 22 GB of shared memory, which allows VMware to pack 4 x 8-GB OS images into a
machine footprint of about 10-12 GB.
However, once the benchmark programs start to execute, the amount of shared memory
dwindles to near zero. This is an interesting result. With this workload of identically
configured virtual machines, even when the benchmark programs are active, there should still
be significant opportunities to share identical code pages. But VMware is apparently unable
to capitalize much on this opportunity once the guest machines become active. A likely
explanation for the diminished returns from memory sharing is simply that the virtual
memory management performed by each of the active guest Windows machines leads to the
contents of too many virtual memory pages changing too frequently, something which simply
overwhelms the copy detection sharing mechanism.
Questions:
1. Discuss the concept of benchmark program.
2. Discuss the transparent memory sharing and compare the performance with other codes.

1. The index register is a special CPU register that contains an index value.
2. If RAM is large, we can have several processes' page tables in RAM
simultaneously.
3. Paging is a significant part of virtual memory implementations in modern
operating systems, using secondary storage to allow programs to
exceed the size of available physical memory.

SUMMARY
• Physical memory is the main memory that the CPU can access directly or indirectly.
• Physical memory basically comprises RAM (Random Access Memory), which is the most widely used form of main memory.
• Static Random Access Memory (SRAM) is a kind of RAM used in various electronic applications including toys, vehicles, digital devices and PCs.
• Dynamic Random Access Memory (DRAM) is a kind of random access memory that stores each bit of information in a separate capacitor within an integrated circuit.
• An addressing mode specifies a rule for interpreting or modifying the address field of the instruction before the operand is actually referenced.
• The idea of extending memory is called virtual memory.
• Paging is a method of writing data to, and reading it from, secondary storage for use in primary storage, also known as main memory.
• A page table is a data structure which consists of 2^20 page table entries (PTEs).
• For each memory access, whether to fetch an instruction or data, the virtual address must be translated into a real physical address.
KEYWORDS
Physical Memory: Physical memory refers to the actual RAM of the system, which usually
takes the form of cards (DIMMs) attached onto the motherboard.
SRAM: Static random-access memory is a type of random-access memory that uses latching
circuitry to store each bit.
DRAM: Dynamic random-access memory is a type of random-access semiconductor memory
that stores each bit of data in a memory cell consisting of a tiny capacitor and a transistor, both
typically based on metal-oxide-semiconductor technology.
Addressing Mode: Addressing mode specifies a rule for interpreting or modifying the
address field of the instruction before the operand is actually referred.
Virtual Memory: The idea of extending memory is called virtual memory.
Paging: Paging is a method of writing data to, and reading it from, secondary storage for
use in primary storage, also known as main memory.
Page Table: A page table is a data structure which consists of 2^20 page table entries
(PTEs).
SELF-ASSESSMENT QUESTIONS
Short Answer Questions
1. What is physical memory?
2. What are the advantages of SRAM and DRAM?
3. Compare and contrast relative addressing mode and indexed addressing mode.
4. Define the concept of page tables.
5. List the uses of virtual memory.
Long Answer Questions
1. Describe the various kinds of RAM and differentiate between them.
2. Illustrate the various kinds of addressing modes with examples.
3. Describe the working of virtual memory.
4. When paging is used, a frame does not need to occupy a single physically contiguous region in secondary storage. Elaborate.
5. Illustrate the process of converting a virtual address to a physical address.
FURTHER READING
1. Ghoshal, Subrata. 2011. “Computer Architecture and Organization”, Pearson Education.
2. Gill, Naseeb S. 2011. “Digital Design and Computer Organisation”, Firewall Media.
3. Wang, Paul Shuangbao. 2012. “Computer Architecture and Security”, John Wiley & Sons.
4. Hamacher, Carl. 2002. “Computer Organization”, 5th Edition, Tata McGraw Hill.


UNIT IV- Memory


Chapter 9- Cache Memory

CONTENTS
Learning Objectives

Learning Outcomes

Overview

9.1 Cache
9.1.1 L1, L2 and L3 Cache Memories
9.2 Cache Mapping
9.2.1 Direct Mapping
9.2.2 Fully Associative Mapping
9.2.3 Set-associative Mapping
9.3 LRU Replacement
9.3.1 LRU Caching
9.3.2 Block/Line Replacement in Associative Caches
9.3.3 Page Replacement in Virtual Memory

Summary
Keywords
Self-Assessment Questions
Further Reading

LEARNING OBJECTIVES
After studying this chapter, you will be able to:
• Define the term cache
• Recognize the significance of cache memory
• Describe L1, L2, and L3 cache memories
• Illustrate the concept of cache mapping
• Describe the concept of LRU replacement
LEARNING OUTCOMES
Upon completion of the chapter, students will be able to:
• Identify L1, L2, and L3 cache memories


• Gain basic knowledge of cache mapping and direct mapping


• Analyze how LRU replacement works
• Illustrate the concept of cache mapping
OVERVIEW
In the previous chapter, we studied the concepts of physical memory, addressing modes, and virtual memory. We also explained paging and the process of translating a virtual address to a physical address. In this chapter, you will learn about cache memory. The literal meaning of a cache is a place for hiding or storing anything. This suggests that a cache in computer terms is memory that is transparent to the user. A cached memory system is one that combines a small quantity of fast, expensive memory with a larger amount of slower, less expensive memory, to achieve a memory system that approximates the speed of the fast memory while costing little more than the slower memory. This chapter will cover the concept of L1, L2, and L3 caches. You will also study cache mapping and LRU replacement.
The purpose of a cache is to speed up memory accesses by storing recently used pieces of memory. That is, cache memory improves effective memory transfer rates and raises processor speeds. Current computers incorporate both L1 and L2 caches, and many now also have an L3 cache.
9.1 CACHE
Cache memory is a special very high-speed memory. It is used to speed up operation and synchronize with a high-speed CPU. Cache memory is costlier than main memory or disk memory, but more economical than CPU registers. Cache memory is an extremely fast memory type that acts as a buffer between RAM and the CPU. It holds frequently requested data and instructions so that they are immediately available to the CPU when needed.
Cache memory is used to reduce the average time to access data from the main memory. The cache is a smaller and faster memory which stores copies of the data from frequently used main memory locations. There are several independent caches in a CPU, which store instructions and data.

Figure 9.1: Block Diagram

Levels of memory:
Level 1 or Registers
This is the type of memory in which data is stored and accepted immediately by the CPU. The most commonly used registers are the accumulator, program counter, address register, etc.
Level 2 or Cache Memory
This is a fast memory with short access time, where frequently used data is temporarily stored for faster access.
Level 3 or Main Memory
This is the memory on which the computer currently works. It is small in size and, once power is off, data no longer stays in this memory.
Level 4 or Secondary Memory
This is external memory which is not as fast as main memory, but data stays permanently in this memory.
Cache Performance:
When the processor needs to read or write a location in main memory, it first checks for a
corresponding entry in the cache.
• If the processor finds that the memory location is in the cache, a cache hit has occurred and data is read from the cache.
• If the processor does not find the memory location in the cache, a cache miss has occurred.
For a cache miss, the cache allocates a new entry and copies in data from main memory,
then the request is fulfilled from the contents of the cache.
The performance of cache memory is frequently measured in terms of a quantity called Hit
ratio.
Hit ratio = hit / (hit + miss) = no. of hits/total accesses
We can improve cache performance by using a larger cache block size, using higher associativity, reducing the miss rate, reducing the miss penalty, and reducing the time to hit in the cache.
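The hit-ratio formula can be checked with a tiny simulation. The access trace below is made up purely for illustration, and the cache is modeled as a plain set with no eviction (only four distinct blocks are referenced):

```python
def hit_ratio(hits, misses):
    """Hit ratio = number of hits / total accesses."""
    return hits / (hits + misses)

# Made-up trace of main-memory block numbers; the cache is modeled as a
# simple set with no eviction, since only four distinct blocks appear.
trace = [1, 2, 3, 1, 2, 4, 1, 2, 3, 4]
cache = set()
hits = misses = 0
for block in trace:
    if block in cache:
        hits += 1        # cache hit: data served from the cache
    else:
        misses += 1      # cache miss: block copied in from main memory
        cache.add(block)

print(hits, misses, hit_ratio(hits, misses))   # 6 4 0.6
```

Here 6 of the 10 accesses hit, so the hit ratio is 6 / (6 + 4) = 0.6.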
9.1.1 L1, L2 and L3 Cache Memories
L1 Cache:
Level 1 caching is otherwise described as L1 cache, internal cache, primary cache, or system cache. A primary cache is always located on the processor chip. This cache is small and its access time is comparable to that of processor registers.
Example: Internal caches like those of the Intel 80486 microprocessor (which contains an 8K memory cache) and the Pentium (which has a 16K cache) are often known as Level 1 (L1) caches.
L2 Cache:
The secondary cache is placed between the primary cache and the rest of the memory. It is referred to as the Level 2 (L2) cache. Often, the Level 2 cache is also housed on the processor chip.
L2 cache was first shipped with the Intel Pentium and Pentium Pro computers and has been included with nearly every processor since, with the exception of early Celeron processors. L2 cache is short for Level 2 caching; it is also commonly described as secondary cache or external cache. Most contemporary PCs come with such cache memory. Unlike the L1 cache, the L2 cache was located on the motherboard in earlier computers, though with newer processors it is found on the processor chip.
When the L2 cache is found on the processor, and a further cache is also present on the motherboard, the latter is more properly called the L3 cache. An L2 cache on the same processor chip shares the same die as the CPU, but it is still not part of the core of the CPU.
This cache is not as fast as the L1 cache, but it is only somewhat slower since it is still placed on the same processor chip, and it is still faster than the computer's main memory. The L2 cache is the second place the computer checks when performing instructions.
L3 Cache:
The L3 cache is a memory bank built onto the motherboard or within the CPU module. Its memory is typically slower than the L2 memory, but faster than main memory. The L3 cache feeds the L2 cache, which feeds the L1 cache, which feeds the processor.
9.2 CACHE MAPPING
There are three different types of mapping used for cache memory: direct mapping, associative mapping, and set-associative mapping. These are explained below.
9.2.1 Direct Mapping
The simplest technique, known as direct mapping, maps each block of main memory into only one possible cache line. In direct mapping, each memory block is assigned to a specific line in the cache. If a line is already occupied by a memory block when a new block needs to be loaded, the old block is discarded. An address is split into two parts: an index field and a tag field. The cache stores the tag field along with the data, while the index field selects the cache line. Direct mapping's performance is directly proportional to the hit ratio.
i = j modulo m
where
i = cache line number
j = main memory block number
m = number of lines in the cache
For purposes of cache access, each main memory address can be viewed as consisting of three fields. The least significant w bits identify a unique word or byte within a block of main memory. In most contemporary machines, the address is at the byte level. The remaining s bits specify one of the 2^s blocks of main memory. The cache logic interprets these s bits as a tag of s - r bits (the most significant portion) and a line field of r bits. This latter field identifies one of the m = 2^r lines of the cache.
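The tag/line/word split described above can be sketched in a few lines of Python. The cache geometry (4-byte blocks, so w = 2, and 8 cache lines, so r = 3) is chosen arbitrarily for illustration:

```python
def split_address(addr, word_bits, line_bits):
    """Split a byte address into (tag, line, word) fields for a
    direct-mapped cache: the low w bits select the byte within a block,
    the next r bits select the cache line, and the rest form the tag."""
    word = addr & ((1 << word_bits) - 1)
    line = (addr >> word_bits) & ((1 << line_bits) - 1)
    tag = addr >> (word_bits + line_bits)
    return tag, line, word

# Illustrative geometry: 4-byte blocks (w = 2) and 8 cache lines (r = 3).
tag, line, word = split_address(0b1101_101_10, word_bits=2, line_bits=3)
print(tag, line, word)        # 13 5 2

# The line field realizes i = j modulo m: block j maps to line j % 8.
j = 0b1101_101_10 >> 2        # main-memory block number (address / block size)
assert j % 8 == line
```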

Figure 9.2: Direct Mapped Cache


Figure 9.3: Block Frame Arrangement

9.2.2 Associative Mapping


In this type of mapping, the associative memory is used to store content and addresses of the
memory word. Any block can go into any line of the cache. This means that the word id bits are
used to identify which word in the block is needed, but the tag becomes all of the remaining bits.
This enables the placement of any word at any place in the cache memory. It is considered to be
the fastest and the most flexible mapping form.

Figure 9.4: Associative Mapped Block Frames


9.2.3 Set-associative Mapping
This form of mapping is an enhanced form of direct mapping in which the drawbacks of direct mapping are removed. Set-associative mapping addresses the problem of possible thrashing in the direct mapping method. It does this by saying that instead of having exactly one line that a block can map to in the cache, we group a few lines together, creating a set. A block in memory can then map to any one of the lines of a specific set. Set-associative mapping allows each word that is present in the cache to have two or more words in the main memory for the same index address. Set-associative cache mapping combines the best of the direct and associative cache mapping techniques.
In this case, the cache consists of a number of sets, each of which consists of a number of lines.
The relationships are
m = v * k
i = j mod v
where
i = cache set number
j = main memory block number
v = number of sets
m = number of lines in the cache
k = number of lines in each set
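These relationships can be sketched directly; the values of v and k below are illustrative:

```python
# Set-associative cache: m = v * k lines in total,
# and main-memory block j maps to set i = j mod v.
v = 4            # number of sets (illustrative)
k = 2            # lines per set -> a 2-way set-associative cache
m = v * k        # total number of cache lines

def set_for_block(j, v):
    """Set index for main-memory block number j."""
    return j % v

# Block 13 maps to set 13 mod 4 = 1 and may occupy either of that set's k lines.
print(m, set_for_block(13, v))   # 8 1
```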

Figure 9.5: Set-associative Mapped Cache

Figure 9.6: Block Frames


Application of Cache Memory
• Usually, the cache memory can store a reasonable number of blocks at any given time,
but this number is small compared to the total number of blocks in the main memory.
• The correspondence between the main memory blocks and those in the cache is specified
by a mapping function.
9.3 LRU REPLACEMENT
In this section, you will learn about LRU replacement. In the context of memory systems, LRU is short for Least Recently Used: a replacement algorithm that evicts the least recently used data to make space for new data when memory is full. The Least Recently Used (LRU) replacement policy is used to replace the cache line or page that was least recently used. (The abbreviation LRU is also used in hardware maintenance for "line replaceable unit", an unrelated concept.)


LRU Page Replacement Policy
In the Least Recently Used (LRU) page replacement policy, the page that is used least
recently will be replaced.
Implementation:
• Add a register to every page frame, containing the last time that the page in that frame was accessed.
• Use a "logical clock" that advances by 1 tick each time a memory reference is made.
• Each time a page is referenced, update its register.
The following figure shows the behavior of the program in paging using the LRU page
replacement policy:


Figure 9.7: LRU Page Request


• We can see notably that the bad replacement decisions made by FIFO are not present in LRU.
• There are a total of 9 page read operations to satisfy the total of 18 page requests - that is almost a 20% improvement over FIFO in such a short experiment.
• In fact, it has been shown empirically that LRU is the preferred page replacement policy.
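The register-and-logical-clock implementation outlined above can be sketched as follows. A dictionary stands in for the per-frame registers and the loop index serves as the logical clock; the reference string and frame count are illustrative, not taken from the figure:

```python
def lru_simulate(references, num_frames):
    """Simulate LRU page replacement and return the number of page faults.

    Each resident page maps to the logical-clock tick of its last access.
    On a fault with no free frame, the page with the smallest tick
    (least recently used) is evicted.
    """
    last_used = {}                       # page -> tick of most recent reference
    faults = 0
    for clock, page in enumerate(references):
        if page not in last_used:
            faults += 1
            if len(last_used) == num_frames:
                victim = min(last_used, key=last_used.get)
                del last_used[victim]
        last_used[page] = clock          # update the page's register on every access
    return faults

print(lru_simulate([1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5], 3))   # 10 faults
```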

Learning Activity

Differentiate between the cache mapping types.

Building Blocks
In the previous chapters, we have broadly seen the kind of components required for
designing a computer. In this chapter, we address how to design central processing unit of a
simple computer that interacts with a small memory module. Before we proceed further let
us define a problem that our computer is supposed to solve. This is not the only problem it
can handle. Depending on how we program it, we will be able to solve different arithmetic
and logic problems and that is shown towards the end of this chapter through examples. The
purpose of defining a problem is to choose specific hardware components that will serve as
building block of our simple computer.
The problem: Add 10 numbers stored in consecutive locations of memory. Subtract from this total a number stored in the 11th location of memory. Multiply this result by 2 and store it in the 12th location of memory. All the numbers brought from memory lie between 0 and 9.
Since the problem says the numbers or data are to be fetched from memory, and we also know that programs, i.e. binary coded instructions, are also stored in memory, let us divide the memory used in our computer into two parts. One part stores the program, or series of instructions, the computer executes sequentially; this is known as program memory. The other part houses the data that the program uses and is also used for storing the result. This is called data memory. From the given problem we find we need 12 memory locations for data storage. We expect our computer won't need more than 20 instructions to complete the given task; hence, a memory with 32 locations (an integer power of 2) can be selected for our computer.
Now we try to decide how many bits of information we store in each address location. Usually, bits in memory locations are stored in multiples of 8, called bytes. Let's see if our job can be done with 8 bits. Each memory location stores data between 0 and 9 on which the program operates, and thus requires only 4 bits. The final result at most can be 10 × 9 × 2 = 180, which requires 8 bits for representation. So the data memory can be of 8 bits, with which we can represent decimal numbers up to 2^8 - 1 = 255.
Let's now see the requirement of program memory. There, in each location, a certain number of bits is allocated that defines the instruction to be executed. This is called the operation code or, in short, opcode. The rest of the bits can be used for referring to the memory location from which data is to be brought or stored, if required by the instruction. Since 32 memory locations require log2(32) = 5 bits for memory referencing, we'll have 8 - 5 = 3 bits for opcode specification, giving 2^3 = 8 different opcodes (Figure).

Figure: Three Most Significant Bits (MSB) are opcode and five Least Significant Bits (LSB) are address

We’ll see that 8 instructions are sufficient for the given kind of task in our limited ability
computer. Hence, one important hardware component of our computer gets decided. The
memory to be used is of size 32×8.
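The 3-bit opcode / 5-bit address layout can be sketched as below. The opcode value and its "ADD" mnemonic are hypothetical, chosen only to show the packing:

```python
def encode(opcode, address):
    """Pack a 3-bit opcode and a 5-bit address into one 8-bit instruction."""
    assert 0 <= opcode < 8 and 0 <= address < 32
    return (opcode << 5) | address

def decode(instruction):
    """Recover (opcode, address) from an 8-bit instruction word."""
    return instruction >> 5, instruction & 0b11111

# Hypothetical opcode 0b010 (say, "ADD") referencing data-memory location 12:
word = encode(0b010, 12)
print(format(word, '08b'))        # '01001100' -> opcode 010, address 01100
assert decode(word) == (0b010, 12)
```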
The above mode of addressing memory for data is called direct addressing. If the address mentioned in the instruction contains the address from which the actual data is to be brought, it is called indirect addressing. If, after the opcode, actual data is made available in place of an address, it is called immediate addressing. If an instruction requires 2 bytes to be fetched from program memory, it is called a 2-byte instruction. Obviously, in a 2-byte instruction the number of opcodes or the memory addressing capability can be more than in a single-byte instruction.
Questions:
1. For a more complex computer design, 75 different instructions are required. What size of memory is required?


1. Caches are divided into blocks, which may be of various sizes. The
number of blocks in a cache is usually a power of 2. For now we’ll say
that each block contains one byte.
2. The associative mapping scheme attempts to improve cache utilization, but at the expense of speed.

SUMMARY
• Cache is a high-speed access area that can be either a reserved section of main memory or a storage device.
• Cache memory, placed between main memory and the CPU, acts as an intermediate storage between the CPU and main memory.
• Level 1 caching is alternatively referred to as L1 cache, primary cache, internal cache or
system cache.
• L2 Cache is short for Level 2 caching. L2 is also commonly referred to as secondary
cache or external cache.
• L3 Cache is Cache found on the motherboard instead of the processor on earlier
computers.
• Cache mapping is the method by which the contents of main memory are brought into the cache and referenced by the CPU.
KEYWORD
Cache: A cache is a hardware or software component that stores data so that future requests for
that data can be served faster; the data stored in a cache might be the result of an earlier
computation or a copy of data stored elsewhere.
Cache Memory: Cache memory is a special very high-speed memory, used to speed up operation and synchronize with a high-speed CPU.
LRU: The LRU caching scheme is to remove the least recently used frame when the cache is full
and a new page is referenced which is not there in cache.
Direct Mapped Cache: A cache where the cache location for a given address is determined from
the middle address bits.
Fully Associative Mapping: Fully Associative Mapping is a cache mapping technique that
allows to map a block of main memory to any freely available cache line.
Set Associative Cache: Set-associative cache is a specific type of cache memory that occurs in
RAM and processors.
SELF ASSESSMENT QUESTIONS
Short Answer Questions
1. Define cache.
2. List the types of caches.
3. Illustrate the levels of memory in cache.
4. Write a few performance measures of cache.


5. Recall about LRU replacement.


Long Answer Questions
1. With a block diagram, explain the direct and set associative mapping between cache and
main memory
2. Discuss any six ways of improving the cache performance.
3. Briefly explain any two cache mapping functions
4. Describe the different mapping functions in cache.
5. Discuss about LRU Replacement.
FURTHER READING
1. Godse, A.D. 2008. “Computer Organisation and Architecture”, Technical Publications.
2. Harris, David. 2012. “Digital Design and Computer Architecture”, Elsevier.
3. Mano, Morris. 2012. “Logic and Computer Design Fundamentals”, Cram101.
4. Hamacher, Carl. 2002. “Computer Organization”, 5th Edition, Tata McGraw Hill.


UNIT V- I/O Data Transfer


Chapter 10- Data Transfer

CONTENTS
Learning Objectives
Learning Outcomes
Overview
10.1 Concept of Data Transfer
10.1.1 Serial Data Transfer
10.1.2 Parallel Transfer
10.1.3 Serial-to-Parallel or Parallel-to-Serial Conversion
10.2 Full-Duplex–Half Duplex Interaction
10.2.1 Full-duplex
10.2.2 Half-duplex
10.3 Bus Interface
10.3.1 Types of Bus
10.3.2 Interrupt
Summary
Keywords
Self-Assessment Questions
Further Reading

LEARNING OBJECTIVES
After studying this chapter, you will be able to:
• Gain knowledge of data transfer
• Illustrate serial data transfer and parallel data transfer
• Describe in detail synchronous and asynchronous transmission
• Illustrate the concept of full-duplex and half-duplex interaction
• Describe the concept of the bus interface
LEARNING OUTCOMES
Upon completion of the chapter, students will be able to:
• Gain an idea about data transfer methods in computers


• Differentiate serial and parallel data transfer
• Convert serial-to-parallel and parallel-to-serial data
• Analyze the concept of full-duplex and half-duplex interaction
• Gain basic knowledge of the bus interface
OVERVIEW
In the previous chapter, we examined the concept of cache memory. We discussed L1, L2, and L3 cache memories and the basic techniques of cache mapping. We also learned about the concept of LRU replacement.
In this chapter, you will study the process of data transmission. Data transmission describes the movement of data, in the shape of bits, between two or more digital devices. Transfer of information among the processor-level components of a computer system involves data transmission by means of groups of electrical conductors known as buses interconnecting those components. This chapter will cover data transmission techniques and duplex communication systems. You will also learn about the bus interface, which connects all the computer devices and chips together.
Data transmission enables computer-mediated communication among system users, as well as with other systems. By means of data transmission techniques you can transport information of any kind using a variety of methods.
10.1 CONCEPT OF DATA TRANSFER
When we enter data into the computer via keyboard, each keyed element is encoded by the
electronics within the keyboard into an equivalent binary coded pattern, using one of the
standard coding schemes that are used for the interchange of information. To represent all
characters of the keyboard, a unique pattern of 7 or 8 bits in size is used. The use of 7 bits means
that 128 different elements can be represented, while 8 bits can represent 256 elements. A similar
procedure is followed at the receiver that decodes every received binary pattern into the
corresponding character.
The most widely used codes that have been adopted for this function are the Extended Binary Coded Decimal Interchange Code (EBCDIC) and the American Standard Code for Information Interchange (ASCII). Both coding schemes cater to all the normal alphabetic, numeric, and punctuation characters, collectively referred to as printable characters, and a range of additional control characters, known as non-printable characters.
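As a quick illustration, the element counts mentioned above and the 7-bit ASCII pattern of a keyed character can be inspected directly in Python (`ord` returns a character's code point):

```python
# With 7 bits, 2**7 = 128 distinct characters can be represented;
# with 8 bits, 2**8 = 256.
print(2**7, 2**8)                  # 128 256

ch = 'A'
code = ord(ch)                     # ASCII code of the keyed character
print(code, format(code, '07b'))   # 65 1000001
```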
Data transmission refers to the movement of data in form of bits between two or more digital
devices.
Example: An I/O device may be slower than the CPU due to which it cannot send data to
the CPU at the expected time.
There are two types of data transmission:
• Serial Data transfer


• Parallel transfer

Figure 10.1: Data Transmission


Serial and parallel are terms used to describe the method by which data is moved from one place to another. SERIAL TRANSFER means that the data is moved along a single line one bit at a time; a control pulse is required to move each bit. PARALLEL TRANSFER means that each bit of data is moved on its own line and that all bits transfer simultaneously, as they do in a parallel register; a single control pulse is required to move all bits.
10.1.1 Types of Serial Transmission
There are two types of serial transmission: synchronous and asynchronous. Both of these transmissions use 'bit synchronization'. Bit synchronization is a function that is required to determine when the beginning and end of the data transmission occurs. Bit synchronization helps the receiving computer to know when data begins and ends during a transmission. Therefore bit synchronization provides timing control.
Asynchronous Transmission
Asynchronous transmission sends only one character at a time, where a character is either a letter of the alphabet, a number, or a control character; i.e. it sends one byte of data at a time. Bit synchronization between the two devices is made possible using a start bit and stop bit. The start bit indicates the beginning of data, i.e. it alerts the receiver to the arrival of a new group of bits. A start bit, usually 0, is added to the beginning of each byte. The stop bit indicates the end of data; i.e. to let the receiver know that the byte is finished, one or more additional bits are appended to the end of the byte. These bits, usually 1s, are called stop bits.

Figure 10.2: Start and Stop bit

Addition of start and stop increase the number of data bits. Hence more bandwidth is consumed
in asynchronous transmission. There is idle time between the transmissions of different data
bytes. This idle time is also known as Gap. The gap or idle time can be of varying intervals. This
mechanism is called Asynchronous, because at byte level sender and receiver need not to be
synchronized. But within each byte, receiver must be synchronized with the incoming bit stream.
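A minimal sketch of this start/stop framing, assuming LSB-first bit order and a single stop bit (the text notes that one or more stop bits may be used):

```python
def frame_byte(byte):
    """Frame one data byte for asynchronous transmission:
    start bit (0), then the 8 data bits LSB first, then one stop bit (1)."""
    data_bits = [(byte >> i) & 1 for i in range(8)]   # LSB first
    return [0] + data_bits + [1]

bits = frame_byte(0b01000001)     # ASCII 'A'
print(bits)                       # [0, 1, 0, 0, 0, 0, 0, 1, 0, 1]
```

Note that 10 line bits are spent per 8 data bits, which is the framing overhead the next paragraphs discuss.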


Application of Asynchronous Transmission


1. Asynchronous transmission is well suited for keyboard type-terminals and paper tape
devices. The advantage of this method is that it does not require any local storage at the
terminal or the computer as transmission takes place character by character.

Figure 10.3: Keyboard Type Terminal

2. Asynchronous transmission is best suited to Internet traffic in which information is


transmitted in short bursts. This type of transmission is used by modems.
Advantages of Asynchronous transmission
1. This method of data transmission is cheaper in cost as compared to synchronous transmission. For example, if lines are short, asynchronous transmission is better, because the line cost would be low and idle time will not be expensive.
2. In this approach each individual character is complete in itself, therefore if character is
corrupted during transmission, its successor and predecessor character will not be
affected.
3. It is possible to transmit signals from sources having different bit rates.
4. The transmission can start as soon as data byte to be transmitted becomes available.
5. Moreover, this mode of data transmission in easy to implement.
Disadvantages of asynchronous transmission
1. This method is less efficient and slower than synchronous transmission due to the
overhead of extra bits and insertion of gaps into bit stream.
2. Successful transmission inevitably depends on the recognition of the start bits. These bits
can be missed or corrupted.
Synchronous Transmission
Synchronous transmission does not use start and stop bits. In this method bit stream is combined
into longer frames that may contain multiple bytes. There is no gap between the various bytes in
the data stream.

Figure 10.4: Synchronous Transmission


In the absence of start & stop bits, bit synchronization is established between sender & receiver
by ‘timing’ the transmission of each bit.
Since the various bytes are placed on the link without any gap, it is the responsibility of the receiver to separate the bit stream into bytes so as to reconstruct the original information. In order to receive the data error-free, the receiver and sender operate at the same clock frequency.
Application of Synchronous transmission
• Synchronous transmission is used for high speed communication between computers.
Advantage of Synchronous transmission
• This method is faster as compared to asynchronous as there are no extra bits (start bit &
stop bit) and also there is no gap between the individual data bytes.
Disadvantages of Synchronous transmission
• It is costly as compared to the asynchronous method. It requires local buffer storage at the two ends of the line to assemble blocks, and it also requires accurately synchronized clocks at both ends. This leads to an increase in the cost.
• The sender and receiver have to operate at the same clock frequency. This requires proper synchronization, which makes the system complicated.
10.1.2 Parallel Transmission
Within a computing or communication device, the distances between different subunits are short. Thus, it is normal practice to transfer data between subunits using a separate wire to carry each bit of data. There are multiple wires connecting each subunit, and data is exchanged using a parallel transfer mode. This mode of operation results in minimal delays in transferring each word.
In parallel transmission, all the bits of data are transmitted simultaneously on separate
communication lines. In order to transmit n bits, n wires or lines are used. Thus each bit has its
own line.
All n bits of one group are transmitted with each clock pulse from one device to
another i.e. multiple bits are sent with each clock pulse. Parallel transmission is used for short
distance communication.
As shown in Figure 10.5, eight separate wires are used to transmit 8-bit data from sender to receiver.

Figure 10.5: Parallel Data Transmission


Advantage of parallel transmission


It is speedy way of transmitting data as multiple bits are transmitted simultaneously with a single
clock pulse.
Disadvantage of parallel transmission
It is costly method of data transmission as it requires n lines to transmit n bits at the same time.
10.1.3 Serial Transmission and Serial-to-Parallel Conversion
When transferring data between two physically separate devices, especially if the separation is
more than a few kilometers, for reasons of cost, it is more economical to use a single pair of
lines. Data is transmitted as a single bit at a time using a fixed time interval for each bit. This
mode of transmission is known as bit-serial transmission. In serial transmission, the various bits
of data are transmitted serially one after the other. It requires only one communication line rather
than n lines to transmit data from sender to receiver. Thus all the bits of data are transmitted on
single line in serial fashion. In serial transmission, only single bit is sent with each clock pulse.
As shown in Figure 10.6, suppose an 8-bit data value 11001010 is to be sent from source to destination. Then the least significant bit (LSB), i.e. 0, will be transmitted first, followed by the other bits. The most significant bit (MSB), i.e. 1, will be transmitted at the end via the single communication line. The internal circuitry of the computer transmits data in parallel fashion, so in order to change this parallel data into serial data, conversion devices are used. These conversion devices convert the parallel data into serial data at the sender side so that it can be transmitted over a single line. On the receiver side, the serial data received is again converted to parallel form so that the internal circuitry of the computer can accept it.

Figure 10.6: Serial Transmission Is Used for Long-Distance Communication
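The LSB-first serial transmission and the parallel/serial conversions described above can be sketched in a few lines of Python (a minimal illustration, not real device code):

```python
def parallel_to_serial(byte):
    # Shift an 8-bit value out LSB first, one bit per "clock pulse".
    return [(byte >> i) & 1 for i in range(8)]

def serial_to_parallel(bits):
    # Reassemble the LSB-first bit stream into the original 8-bit value.
    value = 0
    for i, bit in enumerate(bits):
        value |= bit << i
    return value

data = 0b11001010                        # the example byte from the text
stream = parallel_to_serial(data)
assert stream[0] == 0 and stream[-1] == 1   # LSB (0) first, MSB (1) last
assert serial_to_parallel(stream) == data   # receiver recovers the byte
```

In real hardware this job is done by shift registers (or a UART), but the bit ordering and round-trip property are exactly those shown here.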

Advantage of Serial transmission


Use of a single communication line reduces the transmission-line cost by a factor of n compared to parallel transmission.
Disadvantages of Serial transmission
1. Use of conversion devices at the source and destination ends may increase the overall transmission cost.


2. This method is slower as compared to parallel transmission as bits are transmitted serially one
after the other.
10.2 TRANSMISSION MODES
Transmission mode refers to the direction in which data flows between two connected devices. It is also known as communication mode. Buses and networks are designed to allow communication to occur between the individual devices that are interconnected. There are three types of transmission mode:

Figure 10.7: Types of Transmission modes

1. Simplex Mode
In Simplex mode, the communication is unidirectional, as on a one-way street. Only one of the
two devices on a link can transmit, the other can only receive. The simplex mode can use the
entire capacity of the channel to send data in one direction.
Example: keyboard and traditional monitor. The keyboard can only provide input; the monitor can only display output.

Figure 10.8: Simplex Transmission Mode

2. Half-Duplex Mode
In half-duplex mode, each station can both transmit and receive, but not at the same time: when one device is sending, the other can only receive, and vice versa. The half-duplex mode is used where there is no need for communication in both directions at the same time. The entire capacity of the channel can be utilized for whichever direction is active.
Example: a walkie-talkie, in which messages are sent one at a time, in both directions.
Channel capacity = Bandwidth × Propagation delay

Figure 10.9: Half Duplex Transmission Mode


3. Full-Duplex Mode
In full-duplex mode, both stations can transmit and receive simultaneously. Signals going in one direction share the capacity of the link with signals going in the other direction. This sharing can occur in two ways:
• Either the link contains two physically separate transmission paths, one for sending and one for receiving,
• Or the capacity of the link is divided between signals travelling in the two directions.
Full-duplex mode is used when communication in both directions is required all the time; the capacity of the channel, however, must be divided between the two directions.
Example: Telephone Network in which there is communication between two persons by a
telephone line, through which both can talk and listen at the same time.
Channel capacity = 2 × Bandwidth × Propagation delay

Figure 10.10: Full Duplex Transmission Mode
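The two capacity formulas above can be checked with a small, illustrative calculation (the link figures are made-up numbers, not from the text):

```python
def channel_capacity(bandwidth_bps, prop_delay_s, full_duplex=False):
    # Bandwidth-delay product, following the text's two formulas:
    #   half-duplex: capacity = bandwidth * propagation delay
    #   full-duplex: capacity = 2 * bandwidth * propagation delay
    factor = 2 if full_duplex else 1
    return factor * bandwidth_bps * prop_delay_s

# Illustrative link: 1 Mbps bandwidth, 250 ms propagation delay
half = channel_capacity(1_000_000, 0.25)
full = channel_capacity(1_000_000, 0.25, full_duplex=True)
assert full == 2 * half   # full duplex doubles the usable capacity
```

The factor of 2 simply reflects that in full-duplex mode both directions of the link carry data at the same time.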

10.3 BUS INTERFACE


A bus is a subsystem used to transfer data and other information between devices. The various devices in a computer (memory, CPU, I/O and others) communicate with each other through buses. In general, a bus is a communication pathway connecting two or more devices.
A key characteristic of a bus is that it is a shared transmission medium, as multiple devices are attached to it. Typically, a bus consists of multiple communication pathways or lines, which are either wires or metal lines etched on a card or board (a printed circuit board). Each line is capable of transmitting binary 1 and binary 0. A computer system contains a number of different buses that provide pathways between components at various levels of the system hierarchy.

Figure 10.11: Bus Interface


Before discussing the types of buses, we first describe one of their most important aspects. A bus typically consists of about 50 to 100 separate lines, and on any bus the lines can generally be classified into three functional groups, as depicted in the figure below:
Data Lines:
Data lines provide a path for moving data between system modules. They are bidirectional, which means data lines are used to transfer data in both directions. For example, the CPU can read data on these lines from memory as well as send data out on these lines to a memory location or to a port. The number of data lines is 8, 16, 32 or more, depending on the size of the bus. These lines are collectively called the data bus.
Address Lines:
Address lines are collectively called the address bus. The number of address lines is usually 16, 20, 24 or more, depending on the type and architecture of the bus. On these lines the CPU sends out the address of the memory location or I/O port that is to be written to or read from. In short, the address bus is a channel from the CPU to memory and I/O across which the address of the data (not the data itself) is transmitted. Communication here is one way: the address is sent from the CPU to memory and I/O ports, but memory and I/O ports do not send addresses to the CPU, so these lines are unidirectional.
Control Lines:
Control lines are collectively called the control bus. Control lines are used to transmit and receive control signals between the microprocessor and the various devices attached to it. In other words, the control lines are used by the CPU to communicate with other devices within the computer. For example, the CPU sends signals on the control bus to enable the outputs of addressed memory devices and port devices. Typical control signals are:
-Memory Read
-Memory Write
-I/O Read
-I/O Write
-Bus Request
-Bus Grant, etc.
Operation of Bus
The operation of the bus is as follows. If one module wishes to send data to another, it must do two things:
1. Obtain the use of the bus.
2. Transfer the data over the bus.
If one module wishes to request data from another module, it must:
1. Obtain the use of the bus.
2. Transfer a request to the other module over the appropriate control and


address lines. It must then wait for that second module to send the data.
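The two-step protocol above (obtain the bus, then transfer) can be sketched as a toy single-master bus model (class and method names are illustrative, not a real hardware interface):

```python
class Bus:
    # Toy model of a shared bus with a single master at a time.
    def __init__(self):
        self.owner = None   # module currently driving the bus
        self.data = None    # value on the data lines

    def acquire(self, module):
        # Step 1: obtain the use of the bus (fails if another master holds it).
        if self.owner is not None:
            return False
        self.owner = module
        return True

    def transfer(self, module, payload):
        # Step 2: transfer the data over the bus, then release it.
        assert self.owner == module, "must own the bus before transferring"
        self.data = payload
        self.owner = None
        return payload

bus = Bus()
assert bus.acquire("CPU")          # CPU becomes bus master
assert not bus.acquire("I/O")      # a second module must wait its turn
bus.transfer("CPU", 0xAB)          # data placed on the lines, bus released
assert bus.acquire("I/O")          # now the I/O module can acquire the bus
```

The key property the model captures is that the bus is a shared medium: only one module can drive it at a time, so arbitration (acquiring the bus) must happen before any transfer.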
10.3.1 Types of Bus
There are a variety of buses; here we describe only those that are widely used.
System Bus:
A bus that connects the major computer components (processor, memory, I/O) is called a system bus. It is the single bus that connects all of these components of a computer system, and it is the only bus in which data, address and control lines are all present. It is also known as the "front-side" bus. It is faster than a peripheral bus (PCI, ISA, etc.) but slower than the backside bus.
Peripheral Bus (I/O Bus /External Bus):
The peripheral bus is also known as the "I/O bus". It is the data pathway that connects peripheral devices to the CPU. In other words, a peripheral bus is a computer bus designed to support computer peripherals such as printers and hard drives. The PCI and USB buses are commonly used peripheral buses found in many PCs today. We discuss both briefly:
PCI (Peripheral Component Interconnect): The PCI bus connects the CPU and expansion boards such as modem cards, network cards and sound cards. These expansion boards are normally plugged into expansion slots on the motherboard, which is why the PCI bus is also known as the expansion bus or external bus.
USB (Universal Serial Bus): The Universal Serial Bus is used to attach USB devices, such as pen drives, to the CPU.
Local Bus:
Local buses are the traditional I/O (peripheral) buses such as the ISA, MCA and EISA buses. We discuss each briefly:
ISA (Industry Standard Architecture): The ISA bus permits bus mastering, i.e. it enables a peripheral connected directly to the bus to communicate with other peripherals without going through the processor. One consequence of bus mastering is Direct Memory Access. Until the end of the 1990s almost all PCs were equipped with an ISA bus, but it was progressively replaced by the PCI bus, which offers better performance.
MCA(Micro Channel Architecture): It is an improved proprietary bus designed by IBM in
1987 to be used in their PS/2 line of computers.
EISA (Extended Industry Standard Architecture): The EISA bus uses connectors that are the same size as ISA connectors but with 4 rows of contacts instead of 2, supporting 32-bit addressing.
High Speed Bus:
High-speed buses are specifically designed to support high-capacity I/O devices. A high-speed bus


brings high-demand devices into closer integration with the processor. This bus supports connections to high-speed LANs such as Fast Ethernet at 100 Mbps, video and graphics workstations, FireWire, etc.
10.3.2 Interrupts
An interrupt is a signal emitted by hardware or software when a process or an event needs immediate attention. It alerts the processor to a high-priority task requiring interruption of the currently running process. For I/O devices, one of the bus control lines is dedicated to this purpose and is called the Interrupt Request (IRQ) line; the routine the processor executes in response is called the Interrupt Service Routine (ISR).
When a device raises an interrupt while the processor is executing instruction i, the processor first completes the execution of instruction i. It then loads the Program Counter (PC) with the address of the first instruction of the ISR. Before loading the PC, the return address of the interrupted program is saved in a temporary location, so that after handling the interrupt the processor can continue with instruction i+1.
While handling the interrupt, the processor must inform the device that its request has been recognized so that the device stops asserting the interrupt request signal. In addition, saving the registers so that the interrupted process can be restored later increases the delay between the time an interrupt is received and the start of execution of the ISR. This delay is called interrupt latency.
Hardware Interrupts:
In a hardware interrupt, all the devices are connected to the Interrupt Request Line. A single
request line is used for all the n devices. To request an interrupt, a device closes its associated
switch. When a device requests an interrupt, the value of INTR is the logical OR of the requests
from individual devices.
The sequence of events involved in handling an IRQ:
• Devices raise an IRQ.
• The processor interrupts the program currently being executed.
• The device is informed that its request has been recognized and the device deactivates the
request signal.
• The requested action is performed.
• Interrupts are enabled and the interrupted program is resumed.
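The event sequence above relies on the shared request line: the INTR value seen by the processor is simply the logical OR of the individual device requests, as described. A minimal sketch:

```python
def intr(requests):
    # INTR line: logical OR of the individual device request bits.
    level = 0
    for r in requests:
        level |= r
    return level

assert intr([0, 0, 0]) == 0   # no device requesting -> line stays low
assert intr([0, 1, 0]) == 1   # any single request asserts INTR
assert intr([1, 1, 1]) == 1   # multiple requests still read as one INTR
```

Because several devices can assert INTR at once, the processor sees only that "some device" is requesting; identifying which one requires polling, vectored interrupts, or interrupt nesting, discussed next.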
Handling Multiple Devices:
When more than one device raises an interrupt request signal, then additional information is
needed to decide which device to be considered first. The following methods are used to decide
which device to select: Polling, Vectored Interrupts, and Interrupt Nesting. These are explained
as following below.
Polling: In polling, the first device encountered with the IRQ bit set is the device that is to be
serviced first. Appropriate ISR is called to service the same. It is easy to implement but a lot of


time is wasted by interrogating the IRQ bit of all devices.


Vectored Interrupts: In vectored interrupts, a device requesting an interrupt identifies itself
directly by sending a special code to the processor over the bus. This enables the processor to
identify the device that generated the interrupt. The special code can be the starting address of
the ISR or where the ISR is located in memory and is called the interrupt vector.
Interrupt Nesting: In this method, the I/O devices are organized in a priority structure, so an interrupt request from a higher-priority device is recognized while a request from a lower-priority device is not. To implement this, each device (and the processor itself) is assigned a priority level; the processor accepts interrupts only from devices whose priority is higher than its own.

Learning Activity

Demonstrate types of transmission modes with example.

1. Data transmission enables devices, or components within devices, to communicate with one another. Serial transmission is used for long-distance communication.
2. Most PCs have both a parallel port and at least one serial port. Using a single wire reduces cost but slows down the speed of transmission.

SUMMARY
• A data transfer operation communicates signals or data bits between two machines.
• Serial transfer means that the data is moved along a single line one bit at a time.
• Parallel transfer means that each bit of data is carried on its own line and that all bits move simultaneously, as they do in a parallel register.
• Asynchronous transmission sends one character at a time, where a character is a letter of the alphabet, a number, or a control character.
• Synchronous transmission does not use start and stop bits.
• Serial-to-parallel or parallel-to-serial conversion describes the way in which data is stored in a storage device and the way in which that data is taken out of the storage device.
KEYWORD
Data transfer: Data transfer refers to the secure exchange of large files between systems or
organizations.
Serial Transfer: Serial Transfer means that the data is moved along a single line one bit at a


time.
Parallel Transfer: Parallel Transfer means that each bit of data is moved on its own line and
that all bits transfer simultaneously as they did in the parallel register.
Asynchronous transmission: It sends only one character at a time where a character is either
a letter of the alphabet or number or control character.
Synchronous transmission: A method in which the bit stream is combined into longer frames that may contain multiple bytes.
Full-duplex: In full duplex communication, data can travel in both directions
simultaneously.
Half-duplex: A half-duplex channel can send and receive, but not at the same time.
Bus: A bus is a high-speed internal connection. Buses are used to send control signals and data
between the processor and other components.
Bit Synchronization: Bit-synchronous operation is a type of digital communication in which
the data circuit-terminating equipment (DCE), data terminal equipment (DTE), and transmitting
circuits are all operated in bit synchronism with a clock signal.
SELF ASSESSMENT QUESTIONS
Short Answer Questions
1. Write the concept of data transmission.
2. Compare the different types of transmission modes
3. Define bus interface
4. List out the types of bus interface
5. Recall interrupt.
6. What is meant by hardware interrupt?
Long Answer Questions
1. Describe in brief about data transmission.
2. Explain bus interface and its types.
3. Discuss in detail about the transmission mode with example.
4. Write concept of interrupt in brief.
FURTHER READING
1. Mano, M. Morris, and M. Morris Mano. 2007. “Digital design”. Englewood Cliffs, N.J.:
Prentice-Hall.
2. Godse A.D. 2008. “Computer Organisation and Architecture”, Technical Publications.
3. Harris David. 2012. “Digital Design and Computer Architecture”, Elsevier.
4. Ghoshal Subrata . 2011. “Computer Architecture and Organization”, Pearson
Education.
5. Gill s. Naseeb. 2011. “ Digital Design and Computer Organisation”, Firewall Media.


UNIT V- I/O Data Transfer


Chapter 11- Modes of Data Transfer

CONTENTS
Learning Objectives
Learning Outcomes
Overview
11.1 Programmed Input/Output
11.1.1 I/O Instructions
11.1.2 Polling
11.2 Interrupt Driven I/O
11.3 Hardware Interrupt Mechanism
11.3.1 Hardware Interrupt Common Terms
11.4 Multi Level of Interrupts
11.5 Direct Memory Access (DMA)
11.5.1 DMA and Interrupt Breakpoints
11.5.2 DMA Controlled Operation
11.5.3 Buffer Chaining
11.5.4 Operation Chaining
Summary
Keywords
Self-Assessment Questions
Further Reading

LEARNING OBJECTIVES
After studying this chapter, you will be able to:
• Understand the concept of programmed Input/Output
• Identify interrupt-driven I/O
• Explain how the hardware interrupt mechanism works
• Understand the concepts of multiple levels of interrupts and direct memory access
LEARNING OUTCOMES
Upon completion of the chapter, students will be able to:
• Recall the programmed input/output strategy
• Explain the idea of interrupt-driven I/O


• Illustrate the hardware interrupt mechanism
• Explain multiple levels of interrupts
• Describe the idea of Direct Memory Access (DMA)
OVERVIEW
In this chapter, you will examine the organization of Input/Output. Input/Output operations can be performed by three fundamental techniques: programmed I/O, interrupt-driven I/O, and Direct Memory Access (DMA). In both programmed I/O and interrupt-driven I/O, the CPU is responsible for extracting data from memory for output and storing data in memory for input; with DMA, the I/O module itself can store or extract data in/from memory. This chapter covers these techniques of Input/Output operation.
The fundamental benefit of interrupt-driven I/O and Direct Memory Access over programmed I/O is that they free the CPU from the burden of Input/Output; during this time the CPU can do other work, effectively increasing CPU utilization.
11.1 PROGRAMMED INPUT/OUTPUT
Programmed input–output (also programmed input/output, programmed I/O, PIO) is a method
of data transmission, via input/output (I/O), between a central processing unit (CPU) and
a peripheral device, such as a network adapter or a Parallel ATA (PATA, formerly AT
Attachment (ATA)) storage device. Each data item transfer is initiated by an instruction in the
program, involving the CPU for every transaction. In contrast, in direct memory access (DMA)
operations, the CPU is uninvolved in the data transfer.
The term can refer to either memory-mapped I/O (MMIO) or port-mapped I/O (PMIO). PMIO
refers to transfers using a special address space outside of normal memory, usually accessed with
dedicated instructions, such as IN and OUT in x86 architectures. MMIO refers to transfers to I/O
devices that are mapped into the normal address space available to the program. PMIO was very
useful for early microprocessors with small address spaces, since the valuable resource was not
consumed by the I/O devices.
Programmed I/O is a useful method for computers where hardware costs need to be minimized. The input or output operation in such cases may involve:
• Transfer of data from I/O device to the CPU registers.
• Transfer of data from CPU registers to memory.
In addition, in a programmed I/O method the responsibility of CPU is to constantly check
the status of the I/O device to check whether it has become free (in case output is desired) or
it has finished inputting the current series of data (in case input is going on). Thus, programmed I/O is a very time-consuming method in which the CPU wastes a lot of time checking and verifying the status of the I/O device.


The processor issues I/O commands to the I/O module; these commands can be of four types:
1. Control: This I/O command activates the I/O module addressed by the processor and
directs it to the task it has to perform. This command can be customized depending upon
the type of peripherals.
2. Test: This I/O command tests the status of the I/O module and its peripherals to ensure
that the addressed peripheral is powered on and available for the task. This command also
tests whether the most recent I/O operation has completed successfully or if any error has
occurred.
3. Read: This I/O command lets the I/O module extract the data from the corresponding
peripheral and store it in its internal buffer. Further, the I/O module can place this data
over the data bus on the processor’s demand.
4. Write: This I/O command lets the I/O module accept the data over the data bus and
transmit it to the corresponding peripheral.
11.1.1 I/O Instructions
• The I/O instructions executed by the processor are fetched from main memory. To execute an I/O instruction, the processor issues the corresponding I/O command to the addressed I/O device; the I/O instruction can thus be mapped directly onto an I/O command. Usually there is a simple one-to-one relationship between I/O instructions and I/O commands.
• The I/O instruction can also be customized depending on the peripheral addressed. As discussed above, the external device or peripheral must recognize that the processor is addressing it and has issued it an I/O command.
• When the processor, main memory, and I/O module share a common bus, addressing can be achieved in two ways: memory-mapped I/O and isolated I/O.
• With memory-mapped I/O, the processor accesses memory and I/O using a single address space. Here the processor uses the same address, data, and control buses, so the same set of machine instructions addresses both memory and I/O.
• With isolated I/O the address space for memory is isolated from the address space of I/O.
Though the processor uses the same data and address line for memory and I/O devices it
uses a separate control line for memory and I/O devices.
• Memory-mapped I/O offers a larger set of I/O instructions compared with isolated I/O.
• So, with the programmed I/O for every I/O transfer or I/O operation, a program is written
to implement the task. The other two methods of I/O transfer i.e. interrupted I/O and
DMA involve the use of interrupt.


11.1.2 Polling
Polling is the process where the computer or controlling device waits for an external device to
check for its readiness or state, often with low-level hardware.
Example: When a printer is connected via a parallel port, the computer waits until the printer has
received the next character. These processes can be as minute as only reading one bit. This is
sometimes used synonymously with 'busy-wait' polling. In this situation, when an I/O operation
is required, the computer does nothing other than check the status of the I/O device until it is
ready, at which point the device is accessed. In other words, the computer waits until the device
is ready. Polling also refers to the situation where a device is repeatedly checked for readiness,
and if it is not, the computer returns to a different task. Although not as wasteful of CPU cycles
as busy waiting, this is generally not as efficient as the alternative to polling, interrupt-
driven I/O.
Algorithm
Polling can be described in the following steps:
Host actions:
• The host repeatedly reads the busy bit of the controller until it becomes clear (with a
value of 0).
• When clear, the host writes the command into the command register. If the host is
sending output, it sets the write bit and writes a byte into the data-out register. If the host
is receiving input, it reads the controller-written data from the data-in register, and sets
the read bit to 0 as the next command.
• The host sets the command-ready bit to 1.
Controller actions:
• When the controller notices that the command-ready bit is set, it sets the busy bit to 1.
• The controller reads the command register. If the write bit inside is set, it reads from the
data-out register and performs the necessary I/O operations on the device. If the read bit
is set, data from the device is loaded into the data-in register for the host to read.
• Once the operations are over, the controller clears the command-ready bit, clears the error
bit to show the operation was successful, and clears the busy bit.
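The host and controller steps above can be sketched as a toy simulation. The class, field and function names here are illustrative, not a real device register map:

```python
class Controller:
    # Toy device controller with the status bits described above.
    def __init__(self):
        self.busy = 0
        self.command_ready = 0
        self.error = 0
        self.data_out = None

    def service(self):
        # Controller side: notice command-ready, go busy, do the I/O,
        # then clear command-ready, error and busy.
        if self.command_ready:
            self.busy = 1
            # ... the actual device operation would happen here ...
            self.command_ready = 0
            self.error = 0
            self.busy = 0

def host_write(ctrl, byte):
    # Host side of the handshake, step by step.
    while ctrl.busy:            # 1. spin until the busy bit is clear
        ctrl.service()          #    (toy controller progresses when serviced)
    ctrl.data_out = byte        # 2. write the byte into the data-out register
    ctrl.command_ready = 1      # 3. set the command-ready bit
    ctrl.service()
    return ctrl.busy == 0 and ctrl.error == 0

ctrl = Controller()
assert host_write(ctrl, 0x41)
assert ctrl.data_out == 0x41
```

The busy-wait loop in `host_write` is exactly the wasted CPU time the text warns about: the host does nothing useful while polling, which is why interrupt-driven I/O is usually preferred.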
11.2 INTERRUPT DRIVEN I/O
Interrupt-driven I/O is an alternative scheme for dealing with I/O. Interrupt I/O is a way of controlling input/output activity whereby a peripheral or terminal that needs to make or receive a data transfer sends a signal, causing a program interrupt to be set. At a time appropriate to the priority level of the I/O interrupt, relative to the total interrupt system, the processor enters an interrupt service routine. The function of the routine depends on the system of interrupt levels and priorities implemented in the processor. The interrupt technique requires more complex hardware and software, but makes far more efficient use of the computer's time and capacity. Figure 11.1 shows simple interrupt processing.


For input, the device interrupts the CPU when new data has arrived and is ready to be retrieved
by the system processor. The actual actions to perform depend on whether the device uses I/O
ports or memory mapping.
For output, the device delivers an interrupt either when it is ready to accept new data or to
acknowledge a successful data transfer. Memory-mapped and DMA-capable devices usually
generate interrupts to tell the system they are done with the buffer.

Figure 11.1: Simple Interrupt Processor


Here the CPU works on its given tasks continuously. When an input is available, such as when
someone types a key on the keyboard, then the CPU is interrupted from its work to take care of
the input data. The CPU can work continuously on a task without checking the input devices,
allowing the devices themselves to interrupt it as necessary.
Basic Operations of Interrupt
1. CPU issues read command.
2. I/O module gets data from peripheral whilst CPU does other work.
3. I/O module interrupts CPU.
4. CPU requests data.
5. I/O module transfers data.
Design Issues
There are 2 main problems for interrupt I/O, which are:
• There are multiple I/O modules, how should the processor determine the device that
issued the interrupt signal?


• How does the processor decide which module to process when multiple interrupts have
occurred?
There are 4 main ways to counter these problems, which are:
• Multiple Interrupt Lines
• Software Poll
• Daisy Chain (Hardware Poll, Vectored)
• Bus Arbitration (Vectored)
These are known as the 4 general categories of techniques that are commonly used in I/O
interrupt.
Multiple Interrupt Lines
As the name suggests, we provide multiple interrupt lines between the processor and the I/O
modules. This allows multiple modules to be handled at the same time. However, it is not
practical to assign many bus lines and processor pins to interrupt lines. One of the reasons is that
there might be more than one I/O module attached to a single line. This defeats the purpose of
this technique. The latter three techniques are usually used together with this technique to rectify its problems.
Software Poll
Whenever an interrupt is detected by the processor, it branches to an interrupt service routine which polls each I/O module to determine the exact interrupting module. The processor raises a poll, which could be in the form of a command line; the address of the I/O module being interrogated is placed on the address lines, and the module responds positively if it is the one that set the interrupt. Alternatively, each I/O module can have an addressable status register, which the processor reads to determine the interrupting module. Once the module is identified, the processor branches to the specific device-service routine. The downside of this technique is that it is time consuming.
Daisy Chain (Hardware Poll, Vectored)
This is effectively a hardware poll. The interrupt acknowledge line is daisy-chained through all the modules. Whenever there is an interrupt, the processor sends out an interrupt acknowledge which propagates through the series of I/O modules until it reaches a requesting module. That module responds by placing a word, known as the vector, on the data lines. The vector can be either the address of the module or some other unique identifier. The processor then directs execution to the specific device-service routine based on the vector. This technique is known as a vectored interrupt; it removes the need to poll every module to find the source of the interrupt.
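The daisy-chain acknowledge can be sketched as a walk down the module chain. The `(is_requesting, vector)` pairs are a hypothetical representation chosen for illustration:

```python
def daisy_chain_ack(modules):
    # Propagate the interrupt acknowledge down the chain; the first
    # requesting module answers with its vector.
    for requesting, vector in modules:
        if requesting:
            return vector      # module places its vector on the data lines
    return None                # acknowledge fell off the end of the chain

# Modules listed in chain order, nearest the CPU first.
chain = [(False, 0x10), (True, 0x20), (True, 0x30)]
assert daisy_chain_ack(chain) == 0x20   # the requester nearest the CPU wins
```

Note the implicit priority scheme: position in the chain decides who answers first, so modules closer to the processor always win ties.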
Bus Arbitration (Vectored)
The last method which also utilizes vectored interrupts is bus arbitration. This method involves
the I/O module gaining control over the bus before requesting for the interrupt. This is limited to


only one module at a time. The processor sends an acknowledge signal whenever it detects an interrupt, and the requesting module then places its vector on the data lines.
However, when there are multiple interrupts at a single time, there will be a need to assign
priorities.
Examples of Interrupt Structures
Intel 82C59A

Figure 11.2: Use of the 82C59A Interrupt Controller


• This is a type of interrupt controller/external interrupt arbiter.
• Main purpose is for managing interrupts, especially priorities.
• Contains an Interrupt Request (INTR) and Interrupt Acknowledge (INTA) line.
• Connects to external devices and then to the processor (e.g. the Intel 80386), allowing a variety of devices and priority structures to be handled.
• A single 82C59A can connect up to eight I/O modules.
• Can be arranged in cascaded arrangement for up to 64 modules.
It has 3 modes:
1. Fully nested: Interrupt request ordered based on priority.
2. Rotating: Some applications have equal priority. Thus, a particular device has lowest


priority in a group after being serviced.


3. Special mask: Processor can inhibit interrupts from some devices.
Interrupt Processing
1. A device driver initiates an I/O request on behalf of a process.
2. The device driver signals the I/O controller for the proper device, which initiates the
requested I/O.
3. The device signals the I/O controller that input is ready to be retrieved, that the output is complete, or that an error has occurred.
4. The CPU receives the interrupt signal on the interrupt-request line and transfers control to the interrupt handler routine.
5. The interrupt handler determines the cause of the interrupt, performs the necessary processing and executes a "return from interrupt" instruction.
6. The CPU returns to the execution state prior to the interrupt being signaled.
7. The CPU continues processing until the cycle begins again.
11.3 HARDWARE INTERRUPT MECHANISM
A hardware interrupt is an electronic alerting signal sent to the processor from an external
device, like a disk controller or an external peripheral. For example, when we press a key on the
keyboard or move the mouse, they trigger hardware interrupts which cause the processor to read
the keystroke or mouse position.
A hardware interrupt is a signal that stops the current program, forcing it to execute another program immediately. The interrupt does this without waiting for the current program to finish. It is unconditional and immediate, which is why it is called an interrupt: it interrupts the current action of the processor.
You will often find software interrupts used in x86 BIOS routines; they make it easier to update the software, since the interrupt routine will always be located at the same address. For example, to plot a pixel a software interrupt is executed, and the actual code that plots the pixel is defined by the manufacturer.
Hardware Interrupt Example:
Reading a keypad is made far easier using interrupts (especially on PIC devices) as PORTB has
an interrupt-on-change feature for Port B pins. Attaching each pin to a push-to-make button and
enabling the internal pullups on these pins gives you an easy way to read button presses.
You can arrange that whenever the push button is pressed an interrupt is generated that forces the
interrupt routine to go and read (and store) the state of that input.
An interrupt occurs independently of main code operation. So the processor can be doing other
things before and after the interrupt. During the interrupt, main code operation is suspended and
all registers are saved so that after the interrupt main code starts up exactly where it was.
It is as if the main code operates as usual but the interrupt operates without the main code


noticing!
Hardware interrupt Common terms
Terms you might hear associated with hardware interrupts are ISR, interrupt mask, non-maskable interrupt (NMI), asynchronous event, interrupt vector and context switching.

Figure 11.3: Hardware interrupt Common terms

ISR
An ISR is simply another program function. It is no different from any other function except that
it does context switching (saving/restoring processor registers) and, at the end of the function, it
re-enables interrupts.
After the ISR finishes, program execution returns to the original program, and it continues
exactly from where it was interrupted. The original program will have no idea that this has
happened.
Hardware Interrupt Vector
This is a fixed address that contains the location of your ISR – for a PIC micro it is usually
address 4. Other micros may have an interrupt vector for each interrupt source – you have to
write an interrupt routine for each one that you want to use. PIC micros have just one interrupt
vector, and you have to detect which interrupt was triggered by examining the interrupt flag
register(s).
You program the interrupt address with the address of your interrupt routine. Whenever the
interrupt is triggered (and if the interrupt is unmasked) program operation jumps to the location
of your interrupt routine.
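The jump-through-a-vector idea can be modelled in ordinary C as a table of function pointers; the table size and handler names below are illustrative, not any particular micro's layout:

```c
#include <stdint.h>

/* A tiny model of an interrupt vector table: each slot holds the
 * address of an ISR, and dispatch jumps through the chosen slot. */
#define NUM_VECTORS 8

typedef void (*isr_t)(void);

static isr_t vector_table[NUM_VECTORS];
static int last_handled = -1;

static void timer_isr(void)  { last_handled = 0; }
static void serial_isr(void) { last_handled = 1; }

/* "Program the interrupt address with the address of your routine". */
void install_isr(int vec, isr_t handler) {
    if (vec >= 0 && vec < NUM_VECTORS) vector_table[vec] = handler;
}

/* On an unmasked interrupt, execution jumps through the vector. */
void raise_interrupt(int vec) {
    if (vec >= 0 && vec < NUM_VECTORS && vector_table[vec])
        vector_table[vec]();
}
```

Single-vector parts like the PIC effectively have a one-entry table; multi-vector parts have one slot per source.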
NMI
The NMI is exactly the same as a normal interrupt except that you can not control whether or not
it is active - it's always active. It is more commonly found on older/larger processors as a
dedicated input pin. In this case it is more than likely fed by a brown-out-detector (BOD)
circuit, i.e. a PSU voltage dip detector circuit - for detecting a condition that should never be
ignored!
You don't need it in a microcontroller as you can achieve exactly the same functionality using a
programmable interrupt and many microcontrollers have built-in BODs.


Context switching
This means that the register state is preserved at the start of an ISR and restored at the end of an
ISR. It ensures that the function that was interrupted is not affected by the ISR.
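In miniature, context switching is a save on entry and a restore on exit; the register set in this sketch is illustrative, not a real processor's:

```c
/* Context switching in miniature: registers are copied to a save area
 * when the ISR begins and copied back when it ends, so the interrupted
 * code resumes unaffected. The register set here is illustrative. */
struct regs { int acc; int status; int pc_like; };

static struct regs cpu = { 0, 0, 0 };
static struct regs saved;

void isr_enter(void) { saved = cpu; }          /* save context  */
void isr_exit(void)  { cpu = saved; }          /* restore it    */

/* Demo: the ISR clobbers every register, yet the context survives. */
int context_demo(void) {
    cpu.acc = 42; cpu.status = 1; cpu.pc_like = 7;
    isr_enter();
    cpu.acc = 0; cpu.status = 0; cpu.pc_like = 999;  /* ISR body */
    isr_exit();
    return cpu.acc == 42 && cpu.status == 1 && cpu.pc_like == 7;
}
```

Real processors save onto a stack rather than a single save area, which is what allows one interrupt to interrupt another.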
A Basic Interrupt System
This is a general description of an interrupt system (biased slightly to PIC micros) which is true
for any interrupt module and it is useful for understanding how to control and use interrupts.
Hardware Interrupt Block Diagram
Input Sources
These are signals that start an interrupt. They can be external pins (where a rising or falling edge
of an input triggers the interrupt) or internal peripheral interrupts e.g. ADC completed, Serial Rx
data received, timer overflow etc. (The 16F877 has 15 interrupt sources).
Interrupt Flags
Each hardware interrupt source has an associated interrupt flag - whenever the interrupt is
triggered the corresponding interrupt flag is set. These flags are usually stored as bits within an
interrupt register.
The processor can read from and write to the interrupt register, reading from it to find out which
interrupts occurred and writing to it to clear the interrupt flags.

Figure 11.4: Block Diagram

Interrupt Mask
The interrupt mask has a set of bits identical to those in the interrupt register. Setting any bit (or
unmasking) lets the corresponding signal source generate an interrupt - causing the processor to
execute the ISR.
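A minimal sketch of the flag/mask interaction, assuming an 8-bit register pair with illustrative bit positions:

```c
#include <stdint.h>

/* Model of an interrupt flag register and an interrupt mask register:
 * a source may raise an interrupt only when its flag AND its mask bit
 * are both set. Bit positions are illustrative. */
static uint8_t int_flags; /* set by hardware when a source triggers */
static uint8_t int_mask;  /* set by software to unmask a source     */

void trigger_source(int bit) { int_flags |= (uint8_t)(1u << bit); }
void unmask_source(int bit)  { int_mask  |= (uint8_t)(1u << bit); }

/* Pending interrupts are those both flagged and unmasked. */
uint8_t pending(void) { return int_flags & int_mask; }

/* The processor clears a flag by writing to the register. */
void clear_flag(int bit) { int_flags &= (uint8_t)~(1u << bit); }
```

Note that a masked source still sets its flag; it simply cannot cause the ISR to run until it is unmasked.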


Hardware Interrupt Vector
The interrupt vector is a location in memory that you program with the address of your interrupt
service routine (ISR). Whenever an unmasked interrupt occurs program execution starts from the
address contained in the interrupt vector.
For PIC micros the hardware interrupt vector address is usually 0004.
Asynchronous Event
This is simply an event that is not synchronised to the processor clock - an event that the
processor cannot predict, e.g. a button press.
11.5 MULTI LEVEL OF INTERRUPTS
Multiple levels of interrupts are normally used to increase the number of addressable interrupts
in a system. For example, if two levels are used, a second level interrupt aggregator would
combine all interrupts routed to it into one IRQ line in the first level interrupt controller. If three
levels are used, a third level aggregator combines its input interrupts into one IRQ line at the
second level. The number of interrupt levels is usually determined by the hardware. (The term
"aggregator" here means "interrupt controller".)
It is intriguing to note that interrupts may be raised in hardware, by sending a signal down a
dedicated wire, or in software, by executing a special instruction. In either case, the processor
normally pushes its current register contents onto a stack to save them, and begins to execute
another program known as an interrupt handler. When that is finished, the processor may restore
its registers from the stack and proceed as before. One interrupt may interrupt another, and so
on, to many levels.
To deal with situations where one interrupt may interrupt another, priority levels are used to
distinguish the importance of each interrupt. These priority levels are limited, however:
regardless of the priority level of an interrupt, it always interrupts the current processing. In
certain applications, however, such as switches, it can be vital to preserve kernel processing,
while application processing may be readily interrupted.
A need exists, therefore, for a solution that allows interrupts to be executed by an application
without affecting kernel processing.
Multiple levels of interrupts may be used in a computer system, which allows, for instance, an
interrupt with an interrupt level associated with an application to be distinguishable from an
interrupt with an interrupt level associated with a kernel. The kernel-level interrupt may be dealt
with rapidly through its own handler, while the application-level interrupt may be handled more
leisurely.
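As a rough sketch of this idea (the level numbering is an assumption, not from any particular system), a two-level acceptance rule might look like:

```c
/* A minimal sketch of two interrupt levels, kernel and user: while a
 * kernel-level interrupt is being handled, a user-level interrupt must
 * wait, but a kernel-level interrupt may preempt a user-level handler.
 * The level numbering is illustrative. */
enum { LEVEL_NONE = 0, LEVEL_USER = 1, LEVEL_KERNEL = 2 };

static int current_level = LEVEL_NONE;

/* Returns 1 if the request is accepted (preempts), 0 if it must wait. */
int request_interrupt(int level) {
    if (level > current_level) {
        current_level = level;   /* begin servicing at this level */
        return 1;
    }
    return 0;                    /* equal/lower levels are held off */
}

void finish_interrupt(int resumed_level) {
    current_level = resumed_level;  /* restore the preempted level */
}
```

A kernel-level request always gets through, while an application-level request never disturbs kernel processing, which is exactly the separation argued for above.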
Figure 11.5 is a flow diagram illustrating a method for handling an interrupt in accordance
with an embodiment of the present invention. This figure illustrates an embodiment having 2
levels of interrupts, described as kernel-level and user-level, however, one of ordinary skill
in the art will recognise that embodiments can be created having any number of different
levels. Each act in this process may be performed by hardware or software. In this
embodiment of the present invention, no other interrupts are allowed while handling the
current interrupt using this process, at least until the process is passed to a user-level interrupt
handling procedure, as will be described.

Figure 11.5: Flow Diagram for an Embodiment having 2 Levels of Interrupts

11.6 DIRECT MEMORY ACCESS (DMA)


A DMA controller is a hardware device that allows I/O devices to access memory directly, with
less participation of the processor. The DMA controller needs the usual circuits of an interface
to communicate with the CPU and Input/Output devices.
Figure 11.6 below shows the block diagram of the DMA controller. The unit communicates with
the CPU through the data bus and control lines. The CPU selects a register within the DMA
controller through the address bus by enabling the DS (DMA select) and RS (register select)
inputs. RD (read) and WR (write) are bidirectional lines. When the BG (bus grant) input is 0,
the CPU can communicate with the DMA registers. When the BG input is 1, the CPU has
relinquished the buses and the DMA can communicate directly with the memory.
DMA Controller Registers :
The DMA controller has three registers as follows.
Address register – It contains the address to specify the desired location in memory.
Word count register – It contains the number of words to be transferred.
Control register – It specifies the transfer mode.
All registers in the DMA appear to the CPU as I/O interface registers. Therefore, the CPU can
both read and write into the DMA registers under program control via the data bus.


Figure 11.6: Block Diagram

Explanation:
The CPU initializes the DMA by sending the following information through the data bus:
• The starting address of the memory block where the data is available (to read) or where
data are to be stored (to write).
• The word count, which is the number of words in the memory block to be read or
written.
• Control to define the mode of transfer, such as read or write.
• A control to begin the DMA transfer.
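This programming sequence can be modelled in software; the register layout and control codes below are illustrative, not those of a real controller:

```c
#include <stdint.h>

/* A software model of the three DMA controller registers described
 * above: the CPU programs address, word count and control, then the
 * controller moves words without further CPU help. The register
 * layout and control codes here are illustrative. */
#define DMA_READ  0  /* memory -> device */
#define DMA_WRITE 1  /* device -> memory */

struct dma_ctrl {
    uint16_t *address;    /* address register: where in "memory"  */
    int       word_count; /* word count register: words remaining */
    int       control;    /* control register: transfer mode      */
};

/* CPU side: initialise the DMA through (modelled) register writes. */
void dma_program(struct dma_ctrl *d, uint16_t *addr, int count, int mode) {
    d->address    = addr;
    d->word_count = count;
    d->control    = mode;
}

/* Controller side: run the whole block transfer to/from a device buffer. */
void dma_run(struct dma_ctrl *d, uint16_t *device_buf) {
    while (d->word_count > 0) {
        if (d->control == DMA_WRITE)
            *d->address = *device_buf;   /* device -> memory */
        else
            *device_buf = *d->address;   /* memory -> device */
        d->address++;
        device_buf++;
        d->word_count--;
    }
}

/* Demo: move four words from a device buffer into memory. */
int dma_demo(void) {
    uint16_t mem[4] = { 0, 0, 0, 0 };
    uint16_t dev[4] = { 7, 8, 9, 10 };
    struct dma_ctrl d;
    dma_program(&d, mem, 4, DMA_WRITE);
    dma_run(&d, dev);
    return mem[0] == 7 && mem[3] == 10 && d.word_count == 0;
}
```

The CPU's involvement is exactly the `dma_program` call; everything inside `dma_run` happens without it, which is the point of DMA.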
11.6.1 DMA and Interrupt Breakpoints
A DMA interface transfers a complete block of data, one word at a time, directly to or from
memory without going through the processor. When the transfer is complete, the DMA interface
transmits an interrupt signal to the processor. So in DMA, processor involvement can be
restricted to the beginning and end of the transfer, as displayed in the figure above. However,
the question is: when should the DMA take control of the bus? To answer this, recall how the
processor executes an instruction. Figure 11.7 below displays the five cycles of an instruction
execution. The figure also displays the five points where a DMA request can be responded to,
and the single point where an interrupt request can be responded to. Note that an interrupt
request is acknowledged at only one point of an instruction cycle, namely the interrupt cycle.


Figure 11.7: DMA and Interrupt Breakpoints

11.6.2 DMA Controlled Operation


For each channel, the DMA controller saves the programmed address and count in the base
registers and maintains copies of the information in the current address and current count
registers, as shown in Figure 11.5. Each DMA channel is enabled and disabled via a DMA mask
register. When DMA is started by writing to the base registers and enabling the DMA channel,
the current registers are loaded from the base registers. With each DMA transfer, the value in the
current address register is driven onto the address bus, and the current address register is
automatically incremented or decremented. The current count register determines the number of
transfers remaining and is automatically decremented after each transfer. When the value in the
current count register goes from 0 to -1, a terminal count (TC) signal is generated, which
signifies the completion of the DMA transfer sequence. This termination event is referred to as
reaching terminal count.
DMA controllers often generate a hardware TC pulse during the last cycle of a DMA transfer
sequence. This signal can be monitored by the I/O devices participating in the DMA transfers.
DMA controllers require reprogramming when a DMA channel reaches TC. Thus, DMA
controllers require some CPU time, but far less than is required for the CPU to service device I/O
interrupts. When a DMA channel reaches TC, the processor may need to reprogram the
controller for additional DMA transfers. Some DMA controllers interrupt the processor
whenever a channel terminates. DMA controllers also have mechanisms for automatically
reprogramming a DMA channel when the DMA transfer sequence completes. These mechanisms
include auto initialization and buffer chaining.
The auto initialization feature repeats the DMA transfer sequence by reloading the DMA
channel's current registers from the base registers at the end of a DMA sequence and re-enabling
the channel. Buffer chaining is useful for transferring blocks of data into noncontiguous buffer
areas or for handling double-buffered data acquisition. With buffer chaining, a channel interrupts
the CPU and is programmed with the next address and count parameters while DMA transfers
are being performed on the current buffer. Some DMA controllers minimize CPU intervention
further by having a chain address register that points to a chain control table in memory. The
DMA controller then loads its own channel parameters from memory. Generally, the more
sophisticated the DMA controller, the less servicing the CPU has to perform. A DMA controller
has one or more status registers that are read by the CPU to determine the state of each DMA
channel. The status register typically indicates whether a DMA request is asserted on a channel
and whether a channel has reached TC. Reading the status register often clears the terminal count
information in the register, which leads to problems when multiple programs are trying to use
different DMA channels.
11.6.3 Buffer Chaining
With interrupts, the processor is occasionally interrupted from executing the main program
to store incoming data in a buffer for later retrieval and processing. With DMA, a dedicated
data transfer device reads incoming data from a device and stores that data in a system
memory buffer for later retrieval by the processor. DMA controllers need reprogramming
when a DMA channel reaches TC. Hence, DMA controllers need some CPU time, but far
less than is needed for the CPU to service device I/O interrupts. When a DMA channel
reaches TC, the processor may need to reprogram the controller for further DMA transfers.
Some DMA controllers interrupt the processor whenever a channel terminates. DMA
controllers also possess mechanisms for automatically reprogramming a DMA channel when
the DMA transfer sequence finishes.
These mechanisms include auto-initialisation and buffer chaining. Buffer chaining is
valuable for transferring blocks of data into non-contiguous buffer regions or for handling
double-buffered data acquisition. With buffer chaining, a channel interrupts the CPU and is
programmed with the next address and count parameters while DMA transfers are being
performed on the current buffer. Some DMA controllers minimise CPU intervention
further by possessing a chain address register that points to a chain control table in
memory. The DMA controller then loads its own channel parameters from memory.
With buffer chaining mode, multiple buffers can be filled or emptied with minimum CPU
intervention and no delay between buffers. Ring buffer mode is an improvement on
auto-initialisation mode. A stop register can be programmed with an address that is
compared against the current address register. If a match is detected, the DMA
controller stops transfers until the stop register is reprogrammed. Ring buffer mode can be
used with auto-initialisation to manage a circular buffer and stop data from being
overwritten unintentionally.
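A minimal model of ring-buffer mode with a stop register, under assumed field names:

```c
/* Sketch of ring-buffer mode: the channel auto-initialises (wraps to
 * the base address at terminal count) and halts if the current address
 * matches the programmed stop register, so unread data cannot be
 * overwritten. Field names are illustrative. */
struct ring_chan {
    int base_addr;     /* start of the circular buffer   */
    int buf_len;       /* words in the buffer            */
    int current_addr;  /* next location to fill          */
    int stop_addr;     /* channel halts when matched     */
    int stopped;
};

/* Advance one transfer; returns 1 if the word was accepted. */
int ring_step(struct ring_chan *c) {
    if (c->stopped) return 0;
    c->current_addr++;
    if (c->current_addr == c->base_addr + c->buf_len)
        c->current_addr = c->base_addr;        /* auto-initialise   */
    if (c->current_addr == c->stop_addr)
        c->stopped = 1;                        /* stop register hit */
    return 1;
}

/* Demo: 8-word ring with the stop register two words past the start. */
int ring_demo(void) {
    struct ring_chan c = { 100, 8, 100, 102, 0 };
    int accepted = 0;
    for (int i = 0; i < 16; i++) accepted += ring_step(&c);
    return accepted;   /* transfers accepted before the stop */
}
```

The consumer would advance the stop register as it drains the buffer, letting the channel continue; here the register is left untouched so the halt is visible.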
11.6.4 Operation Chaining
DMA operations enable the transfer of data from a source to a destination without needing
a host processor to control the data transfer itself.
Example: Examples of sources include a memory, an input/output (I/O) device, etc.
Examples of destinations include another memory, an I/O device, etc. Transfers may be
between devices of like or different kinds.


Example: memory-memory, memory-I/O device, etc.
In a DMA operation, the host processor first provides information to a DMA controller
(also described as a “Bus Master”) stating the parameters of a data-transfer operation.
Example: the address of a block of data to be transferred, the amount of such data, the
destination address, etc.; referred to collectively herein as “data-transfer parameters”.
The DMA controller then handles the actual data transfer.
When conducting DMA operations, the DMA controller gains control of an
interconnecting bus (or busses) between the source and destination devices. The controller
acquires data-transfer parameters from a host processor and informs the processor when
the operation has finished. One well-known method for conveying data-transfer
parameters between a processor and a DMA controller is via “channel control blocks” or
“descriptors” (referred to herein as “control blocks”); that is, data structures created by
the processor, stored in a memory and accessed by the DMA controller to carry out a
specific DMA operation.
To enable the execution of multiple DMA operations in succession, it is known to
“chain” these operations together by building a linked list of corresponding control
blocks. The linked list may be formed through memory addresses; that is, each
block contains the address of the block disposed immediately next in the chain.
In conventional designs, a “chain bit” may be held in every control block to indicate
whether the block currently being processed is connected to a subsequent block. Under
such conditions, the host processor may need to poll this bit to decide whether a
chaining operation is to be conducted. Such polling may add to bus traffic in the system.
Frequently, while a DMA controller is conducting a data transfer operation specified by a
particular control block, the host processor specifies additional data-transfer operations by
creating additional control blocks. When additional control blocks are created, it is
desirable to add the new control blocks to the existing linked list of control blocks to
permit the DMA controller to process all the control blocks in one continuous sequence of
data-transfer operations.
Without the capability to append these control blocks, the DMA controller will deactivate
upon completion of the data-transfer operation specified by the initial linked list. Thereafter,
the host processor will need to restart the DMA controller to handle the data transfer
operation started by the new group of control blocks. This operation of deactivating and
restarting the DMA controller may lead to significant time delays and should be avoided
when possible.
The appending of control block(s) to an existing block (or linked list of blocks) before a
corresponding DMA operation is complete is referred to herein as dynamic chaining of DMA
operations (“dynamic DMA chaining”). Linked lists of control blocks (also referred to
herein as “chains” or “control-block chains”) that permit such appending of extra control
blocks are regarded as “dynamic”. In contrast, chains that are defined before the start of
a corresponding DMA operation and cannot be appended to before such an operation is
completed are regarded as “static”.
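The control-block chain can be sketched as a linked list in C; the descriptor fields are simplified and the transfer itself is elided:

```c
#include <stddef.h>

/* A sketch of chained DMA control blocks: each descriptor carries its
 * transfer parameters plus a link to the next block, and the controller
 * walks the list in one continuous sequence. Appending to the tail
 * before the walk begins models "dynamic chaining". */
struct control_block {
    int source;                 /* data-transfer parameters (modelled */
    int destination;            /* here as plain numbers)             */
    int count;
    struct control_block *next; /* NULL terminates a static chain     */
};

/* Append a new block at the tail of an existing chain. */
void chain_append(struct control_block *head, struct control_block *nb) {
    while (head->next) head = head->next;
    head->next = nb;
}

/* "Controller": process every block in the chain; returns blocks done. */
int chain_run(const struct control_block *cb) {
    int done = 0;
    for (; cb; cb = cb->next) done++;   /* the transfer itself is elided */
    return done;
}

/* Demo: two blocks programmed up front, a third appended dynamically. */
int chain_demo(void) {
    struct control_block b3 = { 3, 30, 8, NULL };
    struct control_block b2 = { 2, 20, 8, NULL };
    struct control_block b1 = { 1, 10, 8, &b2 };
    chain_append(&b1, &b3);             /* dynamic append before run */
    return chain_run(&b1);
}
```

In real hardware the "addresses" in `next` are memory addresses read by the controller; the pointer here plays the same role.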

Learning Activity

Illustrate the hardware interrupt mechanism with an example.

Interrupt in 8051
Interrupts are the events that temporarily suspend the main program, pass the control to the
external sources and execute their task. It then passes the control to the main program where
it had left off.
8051 has 5 interrupt signals, i.e. INT0, TF0, INT1, TF1, RI/TI. Each interrupt can be
enabled or disabled by setting bits of the IE register and the whole interrupt system can be
disabled by clearing the EA bit of the same register.
IE (Interrupt Enable) Register
This register is responsible for enabling and disabling interrupts. The EA bit is set to 1 to
enable interrupts and cleared to 0 to disable them. The bit sequence and bit meanings are
shown in the following figure.

EA IE.7 It disables all interrupts. When EA = 0 no interrupt will be acknowledged, and
EA = 1 enables each interrupt individually.

- IE.6 Reserved for future use.

- IE.5 Reserved for future use.

ES IE.4 Enables/disables serial port interrupt.

ET1 IE.3 Enables/disables timer1 overflow interrupt.

EX1 IE.2 Enables/disables external interrupt1.

ET0 IE.1 Enables/disables timer0 overflow interrupt.

EX0 IE.0 Enables/disables external interrupt0.
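The table above maps directly onto bit masks; in the following sketch the bit names match the table, while the helper function is hypothetical:

```c
#include <stdint.h>

/* Bit positions of the 8051 IE (Interrupt Enable) register, matching
 * the table above. A source interrupts only when its own enable bit
 * and the global EA bit are both set. */
#define EA  (1u << 7)  /* global enable          */
#define ES  (1u << 4)  /* serial port interrupt  */
#define ET1 (1u << 3)  /* timer1 overflow        */
#define EX1 (1u << 2)  /* external interrupt1    */
#define ET0 (1u << 1)  /* timer0 overflow        */
#define EX0 (1u << 0)  /* external interrupt0    */

/* Is a given source enabled under this IE value? */
int source_enabled(uint8_t ie, uint8_t source_bit) {
    return (ie & EA) && (ie & source_bit) ? 1 : 0;
}
```

For example, writing `EA | ES | ET0` to IE enables only the serial and timer0 interrupts; clearing EA silences everything regardless of the other bits.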


IP (Interrupt Priority) Register
We can change the priority levels of the interrupts by changing the corresponding bit in the
Interrupt Priority (IP) register, as shown in the following figure.


A low priority interrupt can be interrupted by a high priority interrupt, but not by another
low priority interrupt.
If two interrupts of different priority levels are received simultaneously, the request of higher
priority level is served.
If the requests of the same priority levels are received simultaneously, then the internal
polling sequence determines which request is to be serviced.

- IP.6 Reserved for future use.

- IP.5 Reserved for future use.

PS IP.4 It defines the serial port interrupt priority level.

PT1 IP.3 It defines the timer1 interrupt priority level.

PX1 IP.2 It defines the external interrupt1 priority level.

PT0 IP.1 It defines the timer0 interrupt priority level.

PX0 IP.0 It defines the external interrupt0 priority level.
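The resolution rules above can be sketched as follows; the source numbering follows the 8051 internal polling order, while the function itself is illustrative:

```c
/* Sketch of the 8051 priority rules described above: a high-priority
 * (IP bit set) request beats a low-priority one; between equal
 * priorities the fixed internal polling order decides. Sources are
 * numbered in polling order: 0=INT0, 1=TF0, 2=INT1, 3=TF1, 4=RI/TI. */
static const int polling_order[5] = { 0, 1, 2, 3, 4 };

/* high[i] is 1 if source i has its IP bit set.
 * Returns the source that wins between two simultaneous requests. */
int resolve(int a, int b, const int high[5]) {
    if (high[a] != high[b])
        return high[a] ? a : b;          /* different levels: IP wins */
    /* same level: earlier in the polling sequence is served first */
    for (int i = 0; i < 5; i++) {
        int s = polling_order[i];
        if (s == a || s == b) return s;
    }
    return a;                            /* unreachable */
}

/* Demo: TF1 is made high priority; everything else stays low. */
int resolve_demo(void) {
    int high[5] = { 0, 0, 0, 1, 0 };
    /* TF1 (3) vs RI/TI (4), different levels: TF1 wins.
       INT0 (0) vs INT1 (2), equal level: polling order picks INT0. */
    return resolve(3, 4, high) == 3 && resolve(0, 2, high) == 0;
}
```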

Questions
• Discuss the interrupt mechanism of the 8051.
• Discuss the significance of the bits in the IP Register.

1. Data transfers between the CPU and memory occur as a result of the instruction fetch-
execute cycle. Polled I/O is inherently inefficient.
2. Generally, the more sophisticated the DMA controller, the less servicing
the CPU has to perform.

SUMMARY
• In Programmed Input/Output method, CPU controls entire IO transaction by
executing a sequence of IO instructions included in a program.
• Polling is a form of foreground data acquisition in which the processor is dedicated to
acquiring the incoming data, often by waiting in a loop.
• The term interrupt loosely is used for any exceptional event that causes temporary
transfer of control of CPU from one program to the other which is causing the
interrupt.
• Interrupts can be generated by various sources internal or external to the CPU.


• DMA is the transfer of data from one storage device to memory or to another device
without using the CPU.
• The three primary data transfer mechanisms for computer-based data acquisition are
polling (programmed I/O), interrupts, and direct memory access (DMA).
KEYWORDS
PIO: Programmed input/output (PIO) is a method of transferring data between the CPU and
a peripheral, such as a network adapter or an ATA storage device.
Polling: The CPU, while waiting must repeatedly check the status of the I/O module, and
this process is known as Polling.
Interrupt: The term interrupt is a response by the processor to an event that needs attention
from the software.
Daisy Chaining: A daisy chain is a wiring scheme in which multiple devices are wired together
in sequence or in a ring, similar to a garland of daisy flowers.
Interrupt vector: An interrupt vector is the memory location of an interrupt handler, which
prioritizes interrupts and saves them in a queue if more than one interrupt is waiting to be
handled.
Direct Memory Access: Direct memory access is a feature of computer systems that allows
certain hardware subsystems to access main system memory independently of the central
processing unit.
SELF ASSESSMENT QUESTIONS
Short Answer Questions
1. List the different types of I/O commands.
2. Define polling.
3. Write about hardware interrupt.
4. What do you mean by direct memory access?
5. Discuss the benefits of using interrupts
Long Answer Questions
1. Describe the modes of data transfers with examples.
2. Explain the key issues in interrupt-driven Input/Output and also discuss the solutions
to these problems.
3. Each hardware interrupt source has an associated interrupt flag. Elaborate.
4. Illustrate a method for handling an interrupt in accordance with an embodiment of
the present invention.
5. Discuss in detail about the operation of multilevel of interrupt.
6. Describe the data transfer modes of Direct Memory Access (DMA).
FURTHER READING
1. Hamacher Carl. 2002, “Computer Organization”, 5th Edition, Tata McGraw Hill.

2. Hennessy L. John. 2012, “Computer Architecture: A Quantitative Approach”, Elsevier.
3. Patterson A. David. 2011, “Computer Organization and Design”, Elsevier.
4. Tanenbaum S.Andrew. 2012, “Structured Computer Organization”, Prentice
Hall PTR.
5. Wang Paul Shuangbao. 2012, “ Computer Architecture and Security”, John Wiley
& Sons.


MODEL QUESTION PAPER


COMPUTER ORGANISATION AND DESIGN
PART-A (10*2=20)
Answer All Questions
(Each answer should have minimum 7 lines)
1. Differentiate 1’s complement and 2’s complement of signed binary number.
2. List out the rules for binary addition.
3. What is the function of the enable inputs of a decoder?
4. How many types of shift register counters are there and write their names also?
5. Write about the fetch-execute cycle.
6. What is Moore's Law?
7. Compare and contrast relative addressing mode and indexed addressing mode.
8. Write a few performance measures of cache.
9. Discuss the benefits of using interrupts.
10. List the different types of I/O commands.
PART-B (5*13=65)
Answer All Questions
(Each answer should be written for minimum 6 pages with minimum 25 lines per page)
11. a. Describe the concept of representing signed numbers using the binary 13
number system.
(OR)
11. b. Describe DeMorgan’s theorem 13
12. a. (i) Explain how to draw the truth table and circuit diagram for an 8-to-3 7
priority encoder.
12. a. (ii) Explain in detail the different types of subtractors. 6
(OR)
12. b. (i) Explain the different types of shift register. 9
12. b. (ii) Explain the different ways to construct a basic flip-flop. 4
13. a. (i) Discuss in detail about the ALU design. 8
13. a. (ii) Write brief about clock rate and instruction rate of CPU 5
(OR)
13. b. Explain the essential design of a microprocessor exhaustively. 13
14. a. (i) Describe various kinds of RAM. Additionally, differentiate between them. 6
14. a. (ii) Discuss about LRU Replacement 7
(OR)
14. b. With a block diagram, explain the direct and set associative mapping between 13
cache and main memory.
15. a. Illustrate a method for handling an interrupt in accordance with an embodiment of 13
the present invention.
(OR)
15. b. Discuss in detail about the transmission mode with example. 13


PART-C (1*15=15)
(Each answer should be written for minimum 6 pages with minimum 25 lines per page)
16. a. Discuss the role of DreamWorks Animation in delivering the 3D animated film 15
Kung Fu Panda 2 in 2011.
(OR)
16. b. Represent each of the AND, OR, NOT, and XOR gates using only NOR 15
gates.


REFERENCES:
1. Mano, M. Morris. 2007. “Digital Design”. Englewood Cliffs, N.J.: Prentice-Hall.
2. Mohammad A. Karim, Xinghao Chen. 2007. “Digital Design: Basic Concepts and Principles”.
3. William H. Gothmann. 1977. “Digital Electronics: An Introduction to Theory and Practice”.
4. Mano, M. Morris. 2007. “Computer System Architecture”, 3rd Edition.
5. P.K. Sinha. 2004. “Computer Fundamentals”.
6. Comer, E. Douglas. 2012. “Essentials of Computer Architecture”, Pearson, sixth edition.
7. Godse, A.D. 2008. “Computer Organisation and Architecture”, Technical Publications.
8. Harris, David. 2012. “Digital Design and Computer Architecture”, Elsevier.
9. Mano, Morris. 2012. “Logic and Computer Design Fundamentals”, Cram101.
10. Hamacher, Carl. 2002. “Computer Organization”, 5th Edition, Tata McGraw Hill.
11. Hennessy, L. John. 2012. “Computer Architecture: A Quantitative Approach”, Elsevier.
12. Patterson, A. David. 2011. “Computer Organization and Design”, Elsevier.
13. Tanenbaum, S. Andrew. 2012. “Structured Computer Organization”, Prentice Hall PTR.
14. Wang, Paul Shuangbao. 2012. “Computer Architecture and Security”, John Wiley & Sons.
15. Ghoshal, Subrata. 2011. “Computer Architecture and Organization”, Pearson Education.
16. Gill, Naseeb S. 2011. “Digital Design and Computer Organisation”, Firewall Media.
