Design Principles of SRAM Memory in Nano-CMOS Technologies
Apollos Chinonso Ezeogu
International Journal of Computer Applications (0975 – 8887), Volume 178 – No. 11, May 2019
General Terms
SRAM, CMOS, Design Principles, Nano Technology
Keywords
Memory cell, Embedded System, Read, Write, Process variation, Leakage current, Power consumption.
1. INTRODUCTION
SRAM memory is still the main memory block in the cache and register designs of today's embedded systems and computing devices. It is used in portable devices and in embedded systems due to the high demand for data storage, computing speed, data stability, and low power consumption [1], and it plays an essential role in all Intel products in achieving power-performance goals and in process defect sensitivity and detection [2]. In high performance computing systems, the use of SRAM for cache design helps speed up data communication between the Central Processing Unit (CPU) and the memory block; this ensures that frequently accessed data are retrieved from the cache rather than from main memory, in line with the widely used John von Neumann stored-program computation concept. However, the computational efficiency of such multi-core systems is becoming saturated under heavy data computation because of decreased efficiency, power consumption, and stability as CMOS scaling continues below the 90nm technology node.

The SRAM cell is classified into different configurations, which are named according to the number of transistors used in designing the memory cell: 4T, 5T, 6T, 7T, 8T, 9T, 10T and higher-order SRAM configurations [1]. SRAM is used as main memory in cache-less embedded processors and hence must be optimized in terms of power, density, area and delay [4]. The design factors that are paramount when designing SRAM are power consumption, leakage current and stability under process variation. As CMOS scaling approaches its physical limit, it poses further challenges in leakage power, reliability, test complexity, mask and design cost, yield and fabrication processes [7]. Figure 1 shows the Intel 45nm 6T SRAM logic technology.

Figure 1a: Intel 45nm SRAM chip [2]
Figure 1b: Intel 45nm SRAM in CMOS technology [11]

From figure 1a, Intel's SRAM bits are arranged in subarrays with a rectangular, matrix-like structure of rows and columns, such that in a read or write operation a specific row and column of the memory subarray are activated depending on the address, and a group of bits called a word is read or written [2]. The subarray is designed to be very compact since it is tiled many times to form an on-die cache in a real product. After completion of the subarray design, large portions of the chip area can then be tiled with these subarrays with minimal additional effort [2].
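To make the row/column selection concrete, the short Python sketch below models how a flat word address might be split into a word-line (row) index and a column-mux group inside one subarray. The organization, function name and parameters are illustrative assumptions, not details taken from the Intel design.

```python
# Illustrative sketch (assumed organization): decoding a flat word address into the
# row (word line) and column group that select one word inside an SRAM subarray.

def decode_address(addr: int, num_rows: int, words_per_row: int) -> tuple[int, int]:
    """Return (row, column_group) for a word-aligned address in one subarray."""
    row = addr // words_per_row      # selects the word line to assert
    col = addr % words_per_row       # selects the column group via the column mux
    if row >= num_rows:
        raise ValueError("address outside this subarray")
    return row, col

# Example: a 256-row subarray with 4 words interleaved per row.
print(decode_address(1023, num_rows=256, words_per_row=4))   # -> (255, 3)
```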
2. SRAM DESIGN ARCHITECTURE
2.1 SRAM Block Diagram
The complete SRAM block structure is shown in figure 2, together with the peripheral circuitry such as the sense amplifier, row and column decoders, read and write drivers, and the timing control logic required for the complete implementation and simulation of the SRAM cell in the read, write and hold states. For a larger system the SRAM architecture is arranged in cores, then in blocks and arrays, depending on the design specifications. The memory arrays are arranged in rows (word lines) and columns (bit lines) of memory cells, and each cell has a unique location defined by the intersection of a row and a column of the array. Each address or memory cell has its own pre-charge circuit, read buffer and write driver, sense amplifier, and activating word lines and bit lines for active cell selection.

Figure 2: SRAM block diagram showing the pre-charge circuitry, row decoder (address inputs A0, A1 and word lines WL[0] to WL[N-1]), SRAM cell array (bit lines BL and BLB), column decoder, sense amplifier and write driver with their enable signals (RD_Enable, CD_Enable, SA_Enable, Write_Enable)

The SRAM cell configurations classified by the number of transistors include 4T, 5T, 6T, 7T, 8T, 9T, 10T, 11T, and 12T. The cell name is given based on the number of transistors it contains, where "T" stands for "transistor". The fundamental building block of a Static Random Access Memory (SRAM) is the SRAM memory cell. The cell is activated by raising the word line and is read or written through the bit lines. There are three different states of an SRAM cell, namely the standby, reading and writing states, which are discussed later in section 3.

Figure 3a: 4T SRAM cell with resistive loads (R1, R2)
Figure 3b: 6T SRAM Cell

The cross-coupled inverters of the cell hold the stored bit in either of two stable states, 1 or 0, and the access transistors grant access to the stored data for read and write. Thus, the term static means that the cell holds its data as long as power is supplied.
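The bistability described above can be illustrated with a minimal, purely logical Python model (an assumption-level sketch, not a circuit simulation): two ideal cross-coupled inverters settle only into the complementary states (0, 1) or (1, 0), which is why the cell retains its value for as long as it is powered.

```python
# Illustrative sketch: the cross-coupled inverter pair of an SRAM cell has two
# stable states. Each inverter is modeled as an ideal logic inverter; iterating
# the feedback loop shows that (Q, QB) = (0, 1) and (1, 0) are self-consistent.

def inverter(x: int) -> int:
    """Ideal CMOS inverter: output is the logical complement of the input."""
    return 1 - x

def settle(q: int, qb: int, steps: int = 4) -> tuple[int, int]:
    """Let the cross-coupled loop settle: each node is driven by the other's inverter."""
    for _ in range(steps):
        q, qb = inverter(qb), inverter(q)
    return q, qb

for state in [(0, 1), (1, 0)]:
    print(state, "->", settle(*state))   # both states map to themselves: stable
```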
The pre-charge circuit also includes an equalization transistor, which ensures that asymmetric defects do not corrupt the read operation as they could if it were not included; in other words, it helps to minimize the voltage difference between the bit lines and to reduce the pre-charge time by making sure that the bit lines (BL and BLB) are at nearly equal potential. A good pre-charge circuit should not leave a bit line voltage difference greater than 80mV if the SRAM cell is to be read correctly [8]. M2 and M3 are the load transistors that connect the bit lines to Vdd for the pull-up. It is also possible to use NMOS transistors, which will pre-charge the bit lines to Vdd - Vth; this gives faster single-ended bit line sensing because the bit lines do not swing as far toward Vdd, but the disadvantage is that it reduces the noise margins and requires more pre-charge time.

Figure: SRAM cell schematic with separate Write_wordline and Read_wordline (transistors M1 to M8)
Figures: further SRAM cell schematics with dedicated read word lines (transistors up to M9 and load R10)
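As a rough numerical illustration of the 80mV equalization criterion mentioned above, the Python sketch below pre-charges two bit lines through slightly mismatched pull-up paths using a simple RC model. All component values and the RC abstraction are assumptions chosen for illustration, not values from the paper.

```python
# Illustrative numerical sketch (assumed values): pre-charging BL and BLB through
# slightly mismatched pull-up resistances and checking the 80 mV criterion.
import math

VDD = 1.0                     # supply voltage (V), assumed
C_BIT = 100e-15               # bit line capacitance (F), assumed
R_BL, R_BLB = 1.0e3, 1.1e3    # pre-charge path resistances (ohm), assumed mismatch

def precharge_voltage(t: float, r: float) -> float:
    """RC charging of a bit line from 0 V toward VDD."""
    return VDD * (1.0 - math.exp(-t / (r * C_BIT)))

t = 1e-9   # pre-charge time of 1 ns, assumed
dv = abs(precharge_voltage(t, R_BL) - precharge_voltage(t, R_BLB))
print(f"|V_BL - V_BLB| = {dv*1e3:.2f} mV", "OK" if dv <= 0.080 else "needs equalization")
```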
In deep-sub-micron technology, leakage current and data retention at low operating voltages become an issue. The 6T SRAM consists of two PMOS transistors (MP1 and MP2) known as the load transistors, two NMOS transistors (MN1 and MN2) known as the driver transistors, and two further NMOS transistors (MN3 and MN4) known as the access transistors. There are three major operations of the SRAM: the retention/standby, read and write operations. These operations are explained below.

Figure 8: 6T SRAM Architecture [1]

3.1.1 Retention/Standby Operation
This is the state in which the SRAM cell is idle (the data is held in the latch) and the bit line and bit line bar (the data path) are kept at gnd, since the access transistors are disconnected because the word line is not asserted. The PMOS transistors will thus continue to reinforce each other for as long as they are connected to the power supply, keeping the data stored in the latch as shown in Figure 9a. During this idle/retention mode, when "1" is stored in the cell, MP1 and MN1 are ON and there exists a positive feedback between the Q and QB nodes, pulling Q to Vdd. Similarly, when "0" is stored in the cell, MP1 and MN1 are OFF while QB is pulled to Vdd.

3.1.2 Read Operation
During a read, the word line is asserted and, as shown in figure 9b, the access transistors MN3 and MN4 are switched ON, thereby connecting the storage nodes to the bit lines; see figure 9(b) for the simplified schematic during a read of 0. For instance, if data Q = 0 is read through the bit line (BL), then the gate of MN2 is turned off, and MP1, initially held high due to the pre-charging, will then go "off" (PMOS transistors turn ON when the gate input is low, while NMOS transistors turn ON when the gate input is high). Note that when Q = 0, QB = 1, and QB is fed back to MP1 and MN1, switching MN1 ON while MP1 is OFF. The cell current (Icell) then flows from the bit line (BL) through MN3 to the storage node Q, tending to charge node Q while discharging the bit line (BL). Since MN1 is ON, the current arriving at node Q is further discharged to gnd; this is made possible by making the width of MN1 larger than that of MN3 (the cell ratio). Once the bit line voltage, VBL, has been discharged to (Vdd - Vth), the sense amplifier detects the voltage difference between the bit lines; it is triggered and rapidly amplifies the small differential voltage between the bit lines to a full swing close to Vdd, identifying the bit line with the higher voltage and raising it to Vdd while the line at the lower voltage is slowly discharged to gnd. The data is then kept in a stable state by the sense amplifier. Conversely, if the stored data is "1" (see figure 9c), the potential at node Q and the bit line potential are equal, so no discharge takes place there; however, since node QB = 0 and the bit line bar (BLB) potential is higher, the discharge current flows through transistor MN4 into MN2, thus discharging BLB toward Vdd - Vth. The sense amplifier then pulls BLB to gnd while BL remains at Vdd.

Figure 9(b): Read Operation for Data = 0 [1]
For a non-destructive read, the cell ratio (CR) of the driver transistor to the access transistor must be large enough:

CR = (W_MN1 / L_MN1) / (W_MN3 / L_MN3)   (1)

Since L_MN1 = L_MN3 = 45nm, W_MN1 must be greater than W_MN3 by a factor of at least 1.2 in order to ensure an adequate noise margin and a non-destructive read. As CR increases, the speed of the SRAM cell increases [9]. Similarly, W_MN3 = W_MN4 and W_MN1 = W_MN2.
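The sizing check implied by equation (1) can be written as a short Python snippet; the transistor widths used here are illustrative assumptions, with only the 45nm channel length and the factor of 1.2 taken from the text above.

```python
# Minimal sizing check based on equation (1) and the CR >= 1.2 condition above;
# the widths are illustrative assumptions, not values from the paper.

L_MN1 = L_MN3 = 45e-9        # channel lengths (m), both 45 nm as in the text
W_MN3 = 90e-9                # assumed access-transistor width (m)
W_MN1 = 120e-9               # assumed driver-transistor width (m)

CR = (W_MN1 / L_MN1) / (W_MN3 / L_MN3)   # cell ratio from equation (1)
print(f"CR = {CR:.2f}", "non-destructive read" if CR >= 1.2 else "resize MN1")
```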
The differential voltage developed on the bit lines depends on the cell current, Icell, the bit line capacitance, Cbit, and the length of time, t, that the word line is activated. The current should be large enough to discharge the bit line capacitance:

dV_BL = (Icell x t) / Cbit   (2)
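A quick numerical illustration of equation (2) follows; the current, capacitance and word line time are assumed values chosen only to show the order of magnitude of the developed bit line differential.

```python
# Numerical illustration of equation (2) with assumed (not paper-specified) values.

I_CELL = 50e-6    # cell read current (A), assumed
C_BIT = 100e-15   # bit line capacitance (F), assumed
T_WL = 0.5e-9     # word line activation time (s), assumed

delta_v = I_CELL * T_WL / C_BIT   # equation (2): dV = Icell * t / Cbit
print(f"bit line differential = {delta_v*1e3:.0f} mV")   # 250 mV for these values
```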
Read Operation Summary:
1. Pre-charge the bit lines to Vdd while the word line and sense amplifiers are disabled;
2. Select, through the column decoder, the column of the memory cell to be read;
3. Enable the word line and sense amplifier after the pre-charge cycle;
4. The sense amplifier reads the data from the bit lines;
5. Read the output from the bit lines: a drop on the bit line (BL) indicates data = 0, otherwise data = 1.
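The summary steps can be captured in a small behavioral Python sketch; it is a toy model with assumed voltage numbers, not a circuit-accurate simulation of the cell or sense amplifier.

```python
# Behavioral sketch of the read summary above (toy model, assumed values): both
# bit lines are pre-charged high, the accessed cell discharges one of them, and
# the sense amplifier resolves the small difference to a full-swing output.

VDD = 1.0
SENSE_THRESHOLD = 0.05   # assumed minimum differential (V) the sense amp needs

def read_cycle(stored_bit: int, droop: float = 0.2) -> int:
    bl = blb = VDD                      # step 1: pre-charge both bit lines to VDD
    if stored_bit == 0:                 # steps 3-4: word line on, cell pulls one line down
        bl -= droop                     # Q = 0 discharges BL
    else:
        blb -= droop                    # Q = 1 discharges BLB
    assert abs(bl - blb) >= SENSE_THRESHOLD, "not enough differential for the sense amp"
    return 0 if bl < blb else 1         # step 5: a drop on BL means data = 0

print(read_cycle(0), read_cycle(1))     # -> 0 1
```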
3.1.3 Write Operation
This is the state in which data is being written or updated in the cell (see Figure 9d). To write data into a cell, the sense amplifier and pre-charge circuits are deactivated while the write enable and the word line are first activated; the input data is then driven through the write driver input pin, and the bit line is pulled to the value of the given data while the bit line bar (BLB) takes the complementary value. For instance, if data = 0 then BL = 0 while BLB = 1 (Vdd); conversely, if data = 1 then BL = 1 (Vdd) while BLB = 0 (gnd). Hence, given that transistors MP1 and MN3 are correctly sized, the cell will flip and the data is effectively written.
Figure 9(d): Simplified Schematic During Write Operation (switching data 0 to 1)

Thus, consider data = 1 being written to a cell initially storing a "0". The transistors MP2 and MN4 then function as a pseudo-NMOS inverter (MN3 is ON), so current flows from the storage node to the bit line bar (BLB) and also through MP2 into the storage node (QB) as soon as the potential at this node starts decreasing. This results in a voltage drop at node Q; MP2 then pulls down Q and causes the SRAM to switch values:

(W_MN3 / L_MN3) >= (W_MP1 / L_MP1)   (3)

In equation 3, because the NMOS has a higher carrier mobility than the PMOS, the NMOS can, for a correct write operation, be sized equal to or larger than the PMOS at minimum size.

Write Operation Summary:
1. Drive one of the bit lines high and the other low;
2. Load the data into the write driver input pin;
3. Enable the "write line" and the "word line" simultaneously;
4. The data is overwritten because the SRAM cell transistors are weak compared to the write driver.
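The write sequence above can also be sketched behaviorally in Python; the dictionary-based cell and the function name are assumptions used purely to illustrate the four summary steps.

```python
# Behavioral sketch of the write summary above (toy model with assumed names):
# the write driver forces the bit lines, and with the word line asserted the
# weaker cell transistors are overpowered so the latch takes the driven value.

def write_cycle(cell: dict, data: int) -> None:
    bl, blb = (1, 0) if data == 1 else (0, 1)   # steps 1-2: drive BL/BLB from the write driver
    word_line = True                            # step 3: assert the word line
    if word_line:                               # step 4: driver overpowers the cell, latch flips
        cell["Q"], cell["QB"] = bl, blb

cell = {"Q": 0, "QB": 1}     # cell initially storing 0
write_cycle(cell, 1)
print(cell)                  # -> {'Q': 1, 'QB': 0}
```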
3.2 Transistor Scaling and Challenges
Moore's law states that transistor density will double roughly every 1.5 years as transistor sizes shrink. This predictive law for CMOS has so far held true; technology-node scaling is driven by the need for the high integration density and performance required in cache designs and microprocessors. It has, however, led to an increase in the statistical variation of the process parameters, which can increase the total leakage current. Hence, the reduction in threshold voltage, channel length, drain/source junction depth, gate oxide thickness and Vdd has become a major contributor to the increase in leakage current. The sub-threshold leakage is the drain-source current of the transistor when the gate-source voltage is less than the threshold voltage; it is large for short-channel devices. The gate leakage current is due to the low oxide thickness and the high electric field, which cause current to flow through the gate of the transistor even in the off state, because the classical infinite-gate-impedance assumption of the MOS transistor no longer holds. Due to this increase in leakage current, the static power consumption now exceeds the switching component of the power consumption.
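The sensitivity of sub-threshold leakage to threshold-voltage reduction can be illustrated with the standard textbook exponential relation between off-state current and Vth; the sub-threshold swing and the 100 mV reduction below are assumed numbers, not figures from the paper.

```python
# Rough quantitative illustration (standard textbook relation, assumed numbers):
# sub-threshold leakage grows exponentially as the threshold voltage is reduced.

SS = 0.090        # sub-threshold swing, 90 mV/decade, assumed
delta_vth = 0.10  # threshold-voltage reduction of 100 mV, assumed

leakage_increase = 10 ** (delta_vth / SS)   # extra off-state current, in decades
print(f"~{leakage_increase:.0f}x more sub-threshold leakage")   # about 13x
```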
3.3 Effect of Process Variation in SRAM
Process variations are the critical deviations of design parameters – die-to-die and intra-die variations – that arise from equipment processing in semiconductor technology, owing to the inability to precisely control the fabrication process at nano-scale feature sizes; this in turn results in large variations in the operation and functionality of the design. The effect is very severe in memory components because minimum-sized transistors are used in their design [11]. These variations include film thickness, lateral dimensions, doping concentration and threshold voltage variation, and all of them affect the circuit optimization for performance and power consumption. Doping concentration affects the threshold voltage: the spread of Vth increases steadily as a result of random dopant fluctuations in the channel, source and drain, which widens the delay distribution and delay spread. Consequently, these random and systematic fluctuations affect the stability of the SRAM [1]. In the 6T SRAM design, therefore, the read stability of the cell is determined by the ratio of the current produced by the access transistors MN3 and MN4. Furthermore, the impact of variation increases as the supply voltage, Vdd, scales down toward Vth, because the sensitivity of the circuit delay is amplified. Temperature and voltage variations are environmental variations which are primarily a function of intra-die (within-die) variation, and they contribute to the failure rate (write ability and read stability) of SRAM cells.
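A minimal Monte Carlo sketch in Python can illustrate how random threshold-voltage variation translates into a cell failure rate; the Gaussian distribution, its parameters and the acceptance window are all assumptions for illustration, not data from the paper.

```python
# Illustrative Monte Carlo sketch (assumed distribution and limits): random
# threshold-voltage variation across many minimum-sized cells and the fraction
# falling outside an assumed window where read/write margins still hold.
import random

random.seed(0)
VTH_NOMINAL = 0.40     # nominal threshold voltage (V), assumed
SIGMA_VTH = 0.03       # Vth sigma from random dopant fluctuation (V), assumed
LIMITS = (0.32, 0.48)  # assumed window for acceptable read/write margins

samples = [random.gauss(VTH_NOMINAL, SIGMA_VTH) for _ in range(100_000)]
fails = sum(1 for v in samples if not (LIMITS[0] <= v <= LIMITS[1]))
print(f"estimated failing cells: {fails / len(samples):.2%}")
```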