Computer Architecture 5
Memory is one of the most important subsystems in a computer. It is a volatile storage system that
stores instructions and data. Unless a program is loaded into memory in executable form,
the CPU cannot execute it; the CPU interacts closely with memory throughout execution.
There are many other storage systems in a computer that share characteristics with
memory. So why have so many storage systems? Everyone desires storage that is very large, super
fast, and cheap, but storage cost varies depending on the type of storage. Memory
devices are therefore connected hierarchically to produce a cost-effective memory system. When we say
memory, we refer to the main memory, commonly known as RAM.
Access Time - The access time depends on the physical nature of the storage medium
and the access mechanisms used. Refer to figure 17.1: at the bottom of the triangle the access
time is in milliseconds, while at the top it is less than 10 ns.
For memory, the access time is the time difference between a request to the memory
and its service by the memory.
Access Mode - Access mode is a function of both the memory organization and the
inherent characteristics of the storage technology of the device, and it directly affects
the access time. There are three types of access methods.
o Random Access: Storage locations can be accessed in any order, so the access time
is independent of the storage location being accessed. Ex: semiconductor memory.
o Serial Access: Storage locations can be accessed only in a certain
predetermined sequence. Ex: magnetic tape.
o Semi-Random Access: The access is partly random and partly serial. Ex: hard disks and
CD drives, where locating a track is random but access within the track is serial.
Retention - This is the characteristic of memory relating to the availability of written
data for reading at a later time. Retention is a very important characteristic in the
design of a system.
Cycle Time - Defined as the minimum time between two consecutive access
operations; it is greater than the access time. Generally, once an access is over,
a small time gap is required before the next access can start. Cycle time =
access time + a defined recovery delay. Ex: when you ask a shopkeeper for the speed of
a memory module, the quoted figure relates to its cycle time.
Capacity - Measured in units of bytes, kilobytes, megabytes, gigabytes, terabytes, and
petabytes. In figure 17.1, the levels at the bottom of the triangle have larger capacities and the ones
at the top have far smaller capacities. Ex: memory modules of 2 GB or 4 GB, hard disks of
1 TB, and GPRs of 128 words.
Cost Per Bit – The factors behind cost per bit are access time, cycle time, storage capacity, the
purchase cost of the device, and the hardware needed to use the device (the controller). End users
have little choice here; it is the designers who care about this metric.
Reliability – Related to the lifetime of the device, and measured as Mean Time Between
Failures (MTBF) in units of days or years. Ex: think of how frequently you replace your
hard disk while the CPU is still usable.
What is RAM?
RAM, which stands for Random Access Memory, is a hardware device generally
located on the motherboard of a computer that acts as the internal memory of the
CPU. It allows the CPU to store data, programs, and program results while the
computer is switched on. It is the read-write memory of a computer, which means
information can be written to it as well as read from it.
Function of RAM
RAM has no potential for storing permanent data due to its volatility. A hard drive
can be compared to a person’s long-term memory and RAM to their short-term
memory. Short-term memory can only hold a limited number of facts in memory
at any given time; however, it concentrates on immediate tasks. Facts kept in the
brain’s long-term memory can be used to replenish short-term memory when it
becomes full.
This is also how computers operate. When RAM is full, the computer's CPU
must repeatedly access the hard drive to overwrite the old data in RAM with
fresh data, and this process slows the computer's performance.
Historically, the term "offline memory" referred to magnetic tape, from
which a particular piece of data could only be accessed by searching for its
address sequentially, beginning at the start of the tape. The organization
and control of RAM, by contrast, allow data to be saved to and retrieved
directly from specific locations.
Even though other storage media, such as the hard drive and CD-ROM,
are also accessed directly and randomly, the term "random access" is
not used to describe them.
RAM is much like a collection of boxes, where each box can store either a 0 or
a 1. Each box has a unique address, found by counting across the rows
and down the columns. A collection of RAM boxes is called an array, and a
single RAM box in an array is called a cell.
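As a rough illustration of this row-and-column addressing, the sketch below converts a flat cell address into row and column coordinates (a simplified Python model with an assumed 8x8 geometry, not how any particular DRAM chip numbers its cells):

    # Minimal sketch: treat RAM as a grid of 1-bit cells and map a flat
    # address to (row, column) coordinates. The 8x8 geometry is an
    # arbitrary assumption for illustration.
    ROWS, COLS = 8, 8

    def cell_coordinates(address):
        """Return the (row, column) of a cell given its flat address."""
        if not 0 <= address < ROWS * COLS:
            raise ValueError("address out of range")
        return divmod(address, COLS)  # row = address // COLS, col = address % COLS

    print(cell_coordinates(42))  # -> (5, 2): row 5, column 2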
Types of DRAM:
i) Asynchronous DRAM: This type of DRAM is not synchronized with
the CPU clock. The drawback of this RAM is that the CPU cannot
know the exact time at which the data will be available
from the RAM on the input-output bus. This limitation was
overcome by the next generation of RAM, known as
synchronous DRAM (SDRAM).
ii) DDR1 SDRAM:
DDR1 SDRAM is the first advanced version of SDRAM. In this RAM, the operating voltage
was reduced from 3.3 V to 2.5 V. Data is transferred on both the rising
and the falling edge of the clock. So, in each clock cycle, 2 bits are pre-fetched
instead of 1, which is commonly known as the 2-bit pre-fetch. It
mostly operates in the range of 133 MHz to 200 MHz.
Furthermore, the data rate at the input-output bus is double the clock frequency
because data is transferred on both the rising and falling edges. So, if
a DDR1 RAM operates at 133 MHz, the data rate is double that: 266 mega-
transfers per second.
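The relationship between clock frequency, transfers per cycle, and peak bandwidth can be made concrete with a short sketch (the 64-bit bus width is an assumption for illustration; actual module widths vary):

    # Sketch: effective transfer rate and peak bandwidth of a DDR module.
    # transfers_per_sec doubles the clock because DDR moves data on both edges.
    def ddr_bandwidth(clock_mhz, bus_width_bits=64):
        transfers_per_sec = clock_mhz * 2                        # megatransfers/s (MT/s)
        bandwidth_mb_s = transfers_per_sec * bus_width_bits / 8  # MB/s
        return transfers_per_sec, bandwidth_mb_s

    print(ddr_bandwidth(133))  # -> (266, 2128.0): 266 MT/s, about 2128 MB/s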
iii) DDR2 SDRAM:
It is an advanced version of DDR1. It operates at 1.8 V instead of 2.5 V. Its data
rate is double the data rate of the previous generation, because the
number of bits pre-fetched during each cycle has increased from 2 to 4.
The internal bus width of this RAM has also been doubled: for
example, if the input-output bus is 64 bits wide, the internal bus width will be
128 bits, so a single cycle can handle double the amount of data.
iv) DDR3 SDRAM:
In this version, the voltage was further reduced from 1.8 V to 1.5 V. The data
rate is double that of the previous generation, as the number of bits
pre-fetched has increased from 4 to 8. Equivalently, the
internal data bus width of the RAM is twice that of the
last generation.
v) DDR4 SDRAM:
In this version, the operating voltage is further reduced from 1.5 V to 1.2 V, but
the number of bits pre-fetched per cycle is the same as in the previous generation: 8
bits per cycle. Instead, the internal clock frequency of the RAM is double that of the
previous version.
What is ROM?
ROM, which stands for Read Only Memory, is a memory device or storage medium
that stores information permanently. Along with random access memory (RAM), it is a
primary memory unit of a computer. It is called read only
memory because we can only read the programs and data stored on it but cannot write
to it; it is restricted to reading words that are permanently stored within the unit.
The manufacturer fills the programs into the ROM at the time of
manufacturing. After this, the content of the ROM cannot be altered,
which means you cannot reprogram, rewrite, or erase its content later. However,
there are some types of ROM whose data can be modified.
For example, when you start your computer, the screen does not appear
instantly. It takes time to appear because startup instructions stored in ROM
are required to start the computer during the booting process. The job of
the booting process is to start the computer: it loads the operating system into
the main memory (RAM) installed on your computer. The BIOS program, which is
also present in ROM, is used by the computer's microprocessor
during booting. It starts the machine up and connects the
computer with the operating system.
Types of ROM:
1) Masked Read Only Memory (MROM):
It is the oldest type of read only memory (ROM). It has become obsolete, so it is not
used anywhere in today's world. It is a hardware memory device in which programs
and instructions are stored at the time of manufacturing by the manufacturer. It
is programmed during the manufacturing process and can't be modified,
reprogrammed, or erased later.
The MROM chips are made of integrated circuits. The chips send a current through a
particular input-output pathway determined by the location of fuses among the rows
and columns on the chip. The current must pass along a fuse-enabled path, so it
can return only via the output the manufacturer chose. This is the reason
rewriting or any other modification is impossible in this memory.
4) Electrically Erasable Programmable Read Only Memory (EEPROM):
The data in this memory is written and erased one byte at a time,
whereas in flash memory data is written and erased in blocks; this is why flash is
faster than EEPROM. EEPROM is used for storing small amounts of data in computer
and electronic systems and devices such as circuit boards.
5) FLASH ROM:
It is an advanced version of EEPROM. It stores information in an arrangement or
array of memory cells made from floating-gate transistors. The advantage of this
memory is that you can erase or write blocks of data (around 512 bytes) at a
time, whereas in EEPROM you can erase or write only 1 byte of data at
a time. So, this memory is faster than EEPROM.
It can be reprogrammed without removing it from the computer. Its access time is
very short, around 45 to 90 nanoseconds. It is also highly durable, as it can bear high
temperatures and intense pressure.
What is Main Memory?
The main memory is central to the operation of a modern computer. Main
memory is a large array of words or bytes, ranging in size from hundreds of
thousands to billions. It is a repository of rapidly available
information shared by the CPU and I/O devices, and it is the place
where programs and data are kept while the processor is actively
using them. Main memory is closely associated with the processor, so moving
instructions and data into and out of the processor is extremely
fast. Main memory is also known as RAM (Random Access Memory). This
memory is volatile: RAM loses its data when a power interruption
occurs.
What is Memory Management?
In a multiprogramming computer, the Operating System resides in a part of
memory and the rest is used by multiple processes. The task of subdividing the
memory among different processes is called Memory Management. Memory
management is a method in the operating system to manage operations
between main memory and disk during process execution. The main aim of
memory management is to achieve efficient utilization of memory.
Static Linking: In static linking, the linker combines all necessary program
modules into a single executable program, so there is no runtime
dependency. Some operating systems support only static linking, in
which system language libraries are treated like any other object
module.
Dynamic Linking: The basic concept of dynamic linking is similar to
dynamic loading. In dynamic linking, a "stub" is included for each
appropriate library-routine reference. A stub is a small piece of code.
When the stub is executed, it checks whether the needed routine is
already in memory; if not, the program loads the
routine into memory.
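A toy sketch of the stub idea in Python (illustrative only; a real linker patches machine code rather than using a dictionary):

    # Toy model of a dynamic-linking stub: the first call loads the real
    # routine; later calls find it already resident in memory.
    _loaded = {}

    def _load_routine(name):
        # Stand-in for mapping a shared-library routine into memory.
        print(f"loading {name} into memory")
        return lambda x: x * 2  # the "real" routine, assumed for illustration

    def stub(name, arg):
        if name not in _loaded:            # needed routine not in memory?
            _loaded[name] = _load_routine(name)
        return _loaded[name](arg)          # call the resident routine

    print(stub("double", 21))  # loads on first call, then returns 42
    print(stub("double", 5))   # already resident, returns 10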
Swapping
When a process is executed, it must reside in memory. Swapping is the act of
temporarily moving a process out of main memory into secondary memory,
main memory being fast compared to secondary memory. Swapping
allows more processes to be run and fit into memory at one time. The
main cost of swapping is transfer time, and the total time is directly
proportional to the amount of memory swapped. Swapping is also known as
roll-out, roll-in, because if a higher-priority process arrives and wants service,
the memory manager can swap out a lower-priority process and then load
and execute the higher-priority process. After the higher-priority work finishes, the
lower-priority process is swapped back into memory and continues its
execution.
[Figure: swapping in memory management]
In this approach, the operating system keeps track of the first and last
locations available for the allocation of the user program.
The operating system is loaded either at the bottom or at the top of memory.
Interrupt vectors are often loaded in low memory; therefore it makes sense
to load the operating system in low memory.
Sharing of data and code does not make much sense in a single-process
environment.
The operating system can be protected from user programs with the help
of a fence register.
Advantages of Memory Management
It is a simple management approach
Disadvantages of Memory Management
It does not support multiprogramming
Memory is wasted
L1 Cache
This refers to the first level of cache memory, usually known as the
L1 cache or Level 1 cache. The L1 cache is a very small memory that exists
inside the CPU itself. If the CPU has four cores (a quad-core
CPU), each core has its own L1 cache.
Since this memory exists inside the CPU, it can operate at the same speed
as the CPU. The size of such a memory typically ranges from 2
kilobytes to 64 kilobytes. The Level 1 cache is further divided into two
caches: the instruction cache, which stores the instructions required
by the CPU, and the data cache, which stores the data required by
the CPU.
L2 Cache
This cache refers to the L2 cache or Level 2 cache. The Level 2 cache may
reside either inside or outside a given CPU. Each core of the
CPU can have its own separate Level 2 cache, or all cores can
share a single L2 cache among themselves. If it is outside the
CPU, it connects to the CPU through a very high-speed bus. The
memory size of this cache typically ranges from 256 kilobytes to 512 kilobytes.
In terms of speed, it is slower than the Level 1 cache.
L3 Cache
This cache is known as the L3 cache or Level 3 cache. This type of cache
is not present in all processors; only some high-end
processors contain it. The L3 cache is used to enhance
the overall performance of the L1 and L2 caches. This type of memory is
located outside the CPU and is shared by all CPU cores. Its memory size
ranges from 1 to 8 megabytes. Though the L3 cache is
slower than the L1 and L2 caches, it is much faster than
RAM (Random Access Memory).
Level 1 (L1) or Registers
They store and accept the data that is immediately used by the CPU, for example
the instruction register, program counter, accumulator, and address register.
Level 2 (L2) or Cache Memory
It is the fastest memory that stores data temporarily for fast access by the CPU;
it has the fastest access time.
Level 3 (L3) or Main Memory
It is the main memory, where the computer stores all the current data. It is a volatile
memory, which means that it loses its data on power OFF.
Level 4 (L4) or Secondary Memory
It is slow in terms of access time, but data stays permanently in this memory.
The CPU first checks for any required data in the cache, and it does
not access the main memory if that data is present in the cache.
On the other hand, if the data is not present in the cache, then it accesses
the main memory.
The block of words that the CPU is currently accessing is transferred from the
main memory to the cache for quick access in the future.
The CPU searches for the data in the cache whenever it needs to read or write any data
from the main memory. Two cases may occur:
If the CPU finds that data in the cache, a cache hit occurs and it reads the
data from the cache.
On the other hand, if it does not find that data in the cache, a cache
miss occurs. During a cache miss, the cache brings the data in from the main
memory, and the read is then satisfied from the cache.
Therefore, we can define the hit ratio as the number of hits divided by the
sum of hits and misses.
hit ratio = hit / (hit + miss)
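A small sketch of this bookkeeping (a toy cache with FIFO replacement in Python, purely illustrative):

    # Toy illustration: count hits and misses over a sequence of accesses
    # and compute the hit ratio defined above.
    def hit_ratio(accesses, cache_size=4):
        cache, hits, misses = [], 0, 0
        for addr in accesses:
            if addr in cache:
                hits += 1
            else:
                misses += 1
                cache.append(addr)
                if len(cache) > cache_size:  # evict the oldest entry (FIFO)
                    cache.pop(0)
        return hits / (hits + misses)

    print(hit_ratio([1, 2, 1, 3, 1, 2, 4, 1]))  # 4 hits, 4 misses -> 0.5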
Advantages of Cache Memory
The advantages are as follows:
Recent data is stored in the cache, and therefore the outputs are faster.
Disadvantages of Cache Memory
The disadvantages are as follows:
It is quite expensive.
Cache Mapping
There are three different types of mapping used for cache
memory, which are as follows:
Direct Mapping
Associative Mapping
Set-Associative Mapping
1. Direct Mapping
The simplest technique, known as direct mapping, maps each block of main
memory into only one possible cache line. In other words, direct mapping assigns each
memory block to a specific line in the cache. If a line is already occupied by
a memory block when a new block needs to be loaded, the old block is discarded.
The address space is split into two parts, an index field and a tag field; the
tag field is stored in the cache, while the index selects the cache line. Direct
mapping's performance is directly proportional to the hit ratio.
i = j modulo m
where
i = cache line number
j = main memory block number
m = number of lines in the cache
For purposes of cache access, each main memory address can be viewed as
consisting of three fields. The least significant w bits identify a unique word or
byte within a block of main memory; in most contemporary machines, the
address is at the byte level. The remaining s bits specify one of the 2^s blocks of
main memory. The cache logic interprets these s bits as a tag of s-r bits (the
most significant portion) and a line field of r bits. This latter field identifies one
of the m = 2^r lines of the cache. The line offset serves as the index bits in direct mapping.
[Figure: Direct Mapping – Structure]
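To make the three address fields concrete, the sketch below splits a byte address into tag, line, and word fields (the 16-byte blocks with w = 4 and 256 lines with r = 8 are assumptions for illustration):

    # Sketch: decompose a byte address for a direct-mapped cache.
    # Assumed geometry: 16-byte blocks (w = 4) and 256 cache lines (r = 8).
    W_BITS, R_BITS = 4, 8

    def split_address(addr):
        word = addr & ((1 << W_BITS) - 1)              # byte within the block
        line = (addr >> W_BITS) & ((1 << R_BITS) - 1)  # cache line index
        tag = addr >> (W_BITS + R_BITS)                # remaining s - r tag bits
        return tag, line, word

    tag, line, word = split_address(0x1234ABCD)
    print(hex(tag), line, word)  # -> 0x1234a 188 13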
2. Associative Mapping
In this type of mapping, associative memory is used to store both the content and
the addresses of memory words. Any block can go into any line of the cache.
This means that the word-id bits are used to identify which word in the block is
needed, while the tag becomes all of the remaining bits. This enables the
placement of any block at any place in the cache memory, making it the
fastest and most flexible mapping form. In associative mapping, there are
no index bits (the index bits are zero).
3. Set-Associative Mapping
This form of mapping is an enhanced form of direct mapping in which the
drawbacks of direct mapping are removed. Set-associative mapping addresses the
problem of possible thrashing in the direct-mapping method. It does this by
saying that, instead of having exactly one line that a block can map to in the
cache, we group a few lines together, creating a set; a block in memory
can then map to any one of the lines of a specific set. Set-associative mapping
thus allows two or more blocks of main memory that share the same index
address to be present in the cache at once, combining
the best of the direct and associative cache mapping techniques. In set-associative
mapping, the index bits are given by the set-offset bits. In this case, the cache
consists of a number of sets, each of which consists of a number of lines.
[Figure: Set-Associative Mapping]
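Continuing the direct-mapping sketch above, the lines below show how the set index is derived in a set-associative cache (a 2-way arrangement with 128 sets and 16-byte blocks, assumed for illustration):

    # Sketch: in a k-way set-associative cache, a block maps to one set
    # and may then occupy any of the k lines within that set.
    K_WAYS, NUM_SETS, BLOCK_BYTES = 2, 128, 16

    def set_index(addr):
        block_number = addr // BLOCK_BYTES  # which main-memory block
        return block_number % NUM_SETS      # which set the block maps to

    print(set_index(0x1234ABCD))  # the block may go in either line of this set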
Associative Memory
Associative memory is also known as content-addressable memory
(CAM), associative storage, or an associative array. It is a special type of
memory that is optimized for performing searches through data, as
opposed to providing simple direct access to the data based on its
address.
It can store a set of patterns as memories; when the associative memory
is presented with a key pattern, it responds by producing whichever
stored pattern most closely resembles or relates to the key pattern.
This can be viewed as data correlation: the input data is correlated with
the data stored in the CAM.
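A toy model of a content-addressable search in Python (only to illustrate retrieving by content rather than by address; the Hamming-distance measure of "closeness" is an assumption for illustration):

    # Toy CAM: look up stored patterns by content, returning the stored
    # pattern closest to the key (Hamming distance over bit strings).
    def cam_search(stored_patterns, key):
        def distance(a, b):
            return sum(x != y for x, y in zip(a, b))
        return min(stored_patterns, key=lambda p: distance(p, key))

    patterns = ["10110", "01001", "11100"]
    print(cam_search(patterns, "10111"))  # -> "10110", the closest match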
If these characteristics are present, then it is not necessary for all the
pages or segments of a program to be present in main memory during execution.
This means that the required pages need to be loaded into memory
only when they are required. Virtual memory is implemented using demand
paging or demand segmentation.
Demand Paging
The process of loading a page into memory on demand (whenever a page
fault occurs) is known as demand paging. The process includes steps such as the
following:
3. The OS will search for the required page in the logical address
space.
Hence, whenever a page fault occurs, these steps are followed by the
operating system and the required page is brought into memory.
Advantages of Paging
The advantages of paging are as follows:
In paging, there is no requirement for external fragmentation.
In paging, swapping between equal-sized pages and page frames is straightforward.
Paging is a simple approach that can be used for memory management.
Disadvantages of Paging
The disadvantages of paging are as follows:
In paging, there can be a chance of internal fragmentation.
In paging, the page table consumes additional memory.
Because of multi-level paging, there can be memory-reference
overhead.
Segmentation
The partition of memory into logical units called segments, according to the user's
perspective, is called segmentation. Segmentation allows each segment to grow
independently and to be shared. In other words, segmentation is a technique that partitions memory
into logically related units called segments, and a program can be seen as a collection of
segments.
Types of virtual memory
A computer's MMU manages virtual memory operations. In most computers, the MMU
hardware is integrated into the central processing unit (CPU). The CPU also generates
the virtual address space. In general, virtual memory is either paged or segmented.
Paging divides memory into sections or paging files. When a computer uses up its
available RAM, pages not in use are transferred to the hard drive using a swap file: a
space set aside on the hard drive to serve as the virtual memory
extension of the computer's RAM. When a page in the swap file is needed, it is sent back to
RAM using a process called page swapping. This system ensures the computer's OS
and applications do not run out of real memory. The maximum size of the page file is
typically 1.5 to 4 times the physical memory of the computer.
The virtual memory paging process uses page tables, which translate the virtual
addresses that the OS and applications use into the physical addresses that the MMU
uses. Entries in the page table indicate whether the page is in RAM. If the OS or a
program does not find what it needs in RAM, the MMU responds to the missing
memory reference with a page-fault exception, prompting the OS to move the page back
into memory. Once the page is in RAM, its virtual address appears in the
page table.
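A minimal sketch of that translation path (a single-level page table in Python, with assumed 4 KB pages and made-up table contents):

    # Sketch: translate a virtual address using a single-level page table.
    # page_table maps virtual page number -> physical frame number, or None
    # if the page is not resident (modelling a page fault).
    PAGE_SIZE = 4096
    page_table = {0: 7, 1: None, 2: 3}  # toy contents

    def translate(vaddr):
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        frame = page_table.get(vpn)
        if frame is None:
            raise RuntimeError(f"page fault on page {vpn}: OS must load it")
        return frame * PAGE_SIZE + offset

    print(hex(translate(2 * PAGE_SIZE + 0x10)))  # page 2 -> frame 3 -> 0x3010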
Segmentation is also used to manage virtual memory. This approach divides virtual
memory into segments of different lengths. Segments not in use in memory can be
moved to virtual memory space on the hard drive. Segmented information or processes
are tracked in a segment table, which shows if a segment is present in memory,
whether it has been modified and what its physical address is. In addition, file systems
in segmentation are only made up of segments that are mapped into a process's
potential address space.
3. The user will have less hard disk space available for their own use.
Single-Ended DMA
Dual-Ended DMA
Arbitrated-Ended DMA
Interleaved DMA
Dual-Ended DMA: Dual-Ended DMA controllers can read and write from
two memory addresses. Dual-ended DMA is more advanced than single-
ended DMA.
Interleaved DMA: Interleaved DMA controllers read from one
memory address and write to another memory address.
Working of DMA Controller
The DMA controller has three registers, as follows: an address register, a word-count
register, and a control register.
Note: All registers in the DMA appear to the CPU as I/O interface registers.
Therefore, the CPU can both read and write the DMA registers under
program control via the data bus.
The figure below shows the block diagram of the DMA controller. The unit
communicates with the CPU through the data bus and control lines.
The CPU selects a register within the DMA through the address bus by
enabling the DS (DMA select) and RS (register select) inputs. RD (read) and
WR (write) are bidirectional inputs. When the BG (bus grant) input is 0, the CPU can
communicate with the DMA registers. When the BG (bus grant) input is 1, the CPU
has relinquished the buses and the DMA can communicate directly with the
memory.
Burst Mode: In burst mode, the buses are handed back to the CPU by
the DMA only after the whole data transfer is complete, not before.
Cycle Stealing Mode: In cycle stealing mode, the buses are handed
back to the CPU by the DMA after the transfer of each byte.
A continuous request for bus control is generated in this data
transfer mode. It works well for higher-priority tasks.
Transparent Mode: In transparent mode, the DMA does not contend for
the buses; it transfers data only while the CPU is
executing instructions that do not require the system buses.
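The difference between burst mode and cycle stealing can be pictured as a toy timeline of who owns the buses for each byte transferred (purely illustrative; real controllers arbitrate in hardware):

    # Toy timeline: who owns the buses per byte of an n-byte DMA transfer.
    def bus_timeline(n_bytes, mode):
        if mode == "burst":
            # DMA holds the buses for the whole transfer, then releases them.
            return ["DMA"] * n_bytes + ["CPU"]
        if mode == "cycle_stealing":
            # DMA returns the buses to the CPU after every byte.
            timeline = []
            for _ in range(n_bytes):
                timeline += ["DMA", "CPU"]
            return timeline
        raise ValueError("unknown mode")

    print(bus_timeline(3, "burst"))           # ['DMA', 'DMA', 'DMA', 'CPU']
    print(bus_timeline(3, "cycle_stealing"))  # ['DMA', 'CPU', 'DMA', 'CPU', 'DMA', 'CPU']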
The PCI bus is designed primarily to support bursts of data rather than just one word.
A read or write operation involving a single word is treated as a burst of length one.
The PCI bus supports 3 independent address spaces:
-> Memory
-> I/O
-> Configuration
The I/O address space is intended for use with processors, such as the Pentium, that
have a separate I/O address space.
The configuration space is intended to give PCI its plug-and-play capability.
The PCI bridge provides a separate physical connection for the main memory.
A 4-bit command identifies which of the three spaces is being used in a given data
transfer operation.
The master maintains the address information on the bus only until the slave is
selected; it does not have to keep it until the execution is complete.
The address is needed on the bus for one clock cycle only, thus freeing the address
lines in subsequent clock cycles. As a result, there is a significant reduction in cost,
because the number of wires on the bus is an important cost factor.
At any given time one device is the bus master. It has the right to initiate data
transfer by issuing read and write commands.
Device Configuration:-
The PCI bus has been defined for operation with either a 5 V or a 3.3 V power supply.
Connectors on expansion boards are designed to ensure that they can be plugged
only into a compatible motherboard.
UNIVERSAL SERIAL BUS (USB):-
It was developed jointly by several companies, including Compaq, Hewlett-
Packard, Intel, and Microsoft.
The USB supports 2 speeds of connection:-
-> Low speed (1.5 megabits/second)
-> Full speed (12 megabits/second)
The parallel and serial ports provide a general-purpose point of connection through
which a variety of low- to medium-speed devices can be connected to a computer.
Only a few such ports are available in a computer. To add new ports, the user must open
the computer box to gain access to the internal expansion slots and install new interface
cards, and the user should also know how to configure the device and its software.
An objective of USB is to make it possible to add many devices to the computer system
without opening the computer box.
Device Characteristics:-
-> The plug-and-play feature means that a new device, such as an additional speaker,
can be connected at any time while the system is operating. The system will detect
the existence of the new device automatically, identify the appropriate device-driver
software, and set up any other facilities required to enable them to communicate.
-> It can be implemented at all levels of the system, from the hardware to the
OS (operating system) and the application software.
USB Architecture:-
When a USB is connected to a host computer, its root hub is attached to the
processor bus. The host software communicates with individual devices attached to
the USB by sending packets of information, which the root hub forwards to the
appropriate device in the USB tree.
Each device on the USB, whether it is a hub or an I/O device, is assigned a 7-bit
address. This address is local to the USB tree and is not related in any way to the
addresses used on the processor bus.
SMALL COMPUTER SYSTEM INTERFACE BUS (SCSI BUS):-
It refers to a standard bus defined by ANSI under the designation X3.131 [2].
A controller connected to the SCSI bus is one of two types: an initiator or a target.
An initiator has the ability to select a particular target and to send commands
specifying the operations to be performed.
A disk controller operates as a target; it carries out the commands it receives
from the initiator. The initiator establishes a logical connection with the
intended target. Once the connection has been established, it can be suspended and
restored as needed to transfer commands and bursts of data.
While a particular connection is suspended, other devices can use the bus to
transfer information. This ability to overlap data transfer requests is one of the key
features of the SCSI bus that leads to its high performance.
Data transfer on the SCSI bus is controlled by the target controller. To send a command
to a target, an initiator controller requests control of the bus and, after
winning arbitration, selects the controller it wants to communicate with and
hands control of the bus over to it. The target controller then starts a data transfer
operation to receive the command from the initiator.
BUS Arbitration:-
The bus is free when the -BSY signal is in the inactive state. Any controller can
request the use of the bus while it is in this state. Since two or more controllers
can generate such a request at the same time, an arbitration scheme must be
implemented.
A controller requests the bus by asserting the -BSY signal and by asserting its
associated data line to identify itself.
The SCSI bus uses a simple distributed arbitration scheme, in which the contending
controllers place their requests simultaneously and the controller with the highest
ID number wins control of the bus.
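A toy sketch of this distributed arbitration (assuming the standard rule that the contender with the highest ID number wins; the IDs used are arbitrary):

    # Toy SCSI-style distributed arbitration: each requesting controller
    # asserts its own data line; all contenders observe the asserted lines
    # and the controller with the highest ID wins the bus.
    def arbitrate(requesting_ids):
        asserted = set(requesting_ids)  # data lines currently asserted
        return max(asserted)            # highest ID wins

    print(arbitrate([2, 5, 7]))  # -> 7 wins; controllers 2 and 5 retry later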
Selection:-
The selected target controller responds by asserting -BSY. This informs the initiator
that the connection it is requesting has been established, so that it may remove the
address information from the data lines. The selection process is now complete, and
the target controller is asserting -BSY. From this point onwards, the target controller
has control of the bus, as required for the information transfer phase.
Information Transfer Phase:-
The information transferred between two controllers may consist of commands
from the initiator to the target, status responses from the target to the initiator, or
data being transferred to or from the I/O devices. Handshake signals are used to
control the information transfer in the same manner.