A (Not So) Short Introduction To MEMS

Franck CHOLLET, Haobing LIU

memscyclopedia.org
The original source for the document you are viewing has been written with LaTeX. The diagrams in the book were mostly created using Inkscape and CorelDraw, sometimes incorporating graphs produced with MATLAB. The photographs were processed (mostly contrast enhancement) when required with GIMP.
The title of the book is an homage to the famous introduction to LaTeX by Tobias Oetiker.
All diagrams and most photographs are original to this book. Additional photographs are licensed from a variety of sources as indicated in the caption below the figure. These additional photographs cannot be reused outside of this book without asking the original copyright owners.
For the full copyright of the original material (photographs, diagrams, graphs,
code and text) see the following page.
ISBN: 978-2-9542015-0-4
Attribution-NonCommercial 3.0
For any reuse or distribution, you must make clear to others the license terms
of this work.
Any of these conditions can be waived if you get permission from the copyright holders, who can be contacted at http://memscyclopedia.org/contact.html.
Your fair use and other rights are in no way affected by the above.
This is a human-readable summary of the Legal Code
(http://creativecommons.org/licenses/by-nc/3.0/legalcode).
Contents
1 Why MEMS?
1.1 What is MEMS and comparison with microelectronics
1.2 Why MEMS technology
1.2.1 Advantages offered
1.2.2 Diverse products and markets
1.2.3 Economy of MEMS manufacturing and applications
1.3 Major drivers for MEMS technology
1.4 Mutual benefits between MEMS and microelectronics
3.2.1 Crystalline, polycrystalline and amorphous materials
3.2.2 Materials properties
3.3 Bulk micromachining, wet and dry etching
3.3.1 Isotropic and anisotropic wet etching
3.3.2 Dry etching
3.3.3 Wafer bonding
3.4 Surface micromachining and thin-films
3.4.1 Thin-film fabrication
3.4.2 Design limitation
3.4.3 Microstructure release
3.5 DRIE micromachining
3.6 Other microfabrication techniques
3.6.1 Micro-molding and LIGA
3.6.2 Polymer MEMS
3.7 Characterization
3.7.1 Light Microscope
3.7.2 SEM (Scanning Electron Microscope)
3.7.3 Contact probe profilometry
Problems
4 MEMS technology
4.1 MEMS system partitioning
4.2 Passive structures
4.2.1 Mechanical structures
4.2.2 Distributed structure
4.2.3 Fluidic structures
4.3 Sensor technology
4.3.1 Piezoresistive sensing
4.3.2 Capacitive sensing
4.3.3 Other sensing mechanism
4.4 Actuator technology
4.4.1 Magnetic actuator
4.4.2 Electrostatic actuator
4.4.3 Piezoelectric actuator
4.4.4 Thermal actuator
Problems
5.3.1 Testing
5.3.2 Calibration
5.3.3 Compensation
Problems
D Laplace's transform
E Complex numbers
F Fraunhofer diffraction
F.1 Far-field diffraction
F.2 Bessel function
G MATLAB code
G.1 Bode diagram
Bibliography
Index
Chapter 1
Why MEMS?
1.1 What is MEMS and comparison with microelectronics
Micro Electro Mechanical Systems or MEMS is a term coined around 1989 by Prof.
R. Howe [1] and others to describe an emerging research field, where mechanical
elements, like cantilevers or membranes, had been manufactured at a scale more
akin to microelectronics circuits than to lathe machining. But MEMS is not the only
term used to describe this field and from its multicultural origin it is also known
as Micromachines, a term often used in Japan, or more broadly as Microsystems
Technology (MST), in Europe.
However, if the etymology of the word is more or less well known, the dictionaries are still mum about an exact definition. Actually, what could link an inkjet printer head, a video projector DLP system, a disposable bio-analysis chip and an airbag crash sensor? Yes, they are all MEMS, but what is MEMS?
It appears that these devices share the presence of features below 100 µm that are not machined using standard machining but using other techniques globally called micro-fabrication technology.
Micro-fabrication makes the MEMS.
Of course, this simple definition would also include microelectronics, and some would include it under the broader microsystem term, but there is a characteristic that electronic circuits do not share with MEMS. While electronic circuits are inherently solid and compact structures, MEMS have holes, cavities, channels, cantilevers, membranes, etc., and, in some way, imitate mechanical parts.
This has a direct impact on their manufacturing process. Actually, even when MEMS are based on silicon, the microelectronics process needs to be adapted to cater for thicker layer deposition, deeper etching and to introduce special steps to free the mechanical structures. Then, many more MEMS are not based on silicon and can be manufactured in polymer, in glass, in quartz or even in metals...
Thus, if similarities between MEMS and microelectronics exist, they now clearly are two distinct fields. Actually, MEMS requires a completely different mindset.
1.2 Why MEMS technology
1.2.1 Advantages offered
The development of a MEMS component has a cost that should not be underestimated, but the technology has the possibility to bring unique benefits. The
reasons that prompt the use of MEMS technology can be classified broadly in three
classes:
miniaturization of existing devices For example the production of silicon based gyroscopes, which reduced existing devices weighing several kg and with a volume of 1000 cm³ to a chip of a few grams contained in a 0.5 cm³ package.
using physical principles that do not work at larger scale A typical example is given by the biochips where electric fields are used to pump the reactant around the chip. This so-called electro-osmotic effect, based on the existence of a drag force in the fluid, works only in channels with dimension of a fraction of one mm, that is, at micro-scale.
developing tools for operation in the micro-world In 1986 H. Rohrer and G. Binnig at IBM were awarded the Nobel prize in physics for their work on the scanning tunneling microscope. This work heralded the development of a new class of microscopes (atomic force microscope, scanning near-field optical microscope...) that share the presence of micromachined sharp micro-tips with radius below 50 nm. This micro-tool was used to position atoms in complex arrangements, writing Chinese characters or helping verify some predictions of quantum mechanics. Another example of this class of MEMS devices at a slightly larger scale would be the development of micro-grippers to handle cells for analysis.
By far, miniaturization is often the most important driver behind MEMS development. The common perception is that miniaturization reduces cost, by decreasing material consumption and allowing batch fabrication, but an important collateral benefit is also the increase in applicability. Actually, reduced mass and size allow placing the MEMS in places where a traditional system would not have been able to fit. Finally, these two effects concur to increase the total market of the miniaturized device compared to its costlier and bulkier ancestor. A typical example is the case of the accelerometer shown in Figure 1.1. From its humble debut as a crash sensor, it was used in high added value products for image stabilization, then in game controllers integrated inside the latest handphones, until the high volume and low price allowed it to enter the toys market.
[Figure 1.1: growing range of accelerometer applications over time, from the airbag crash sensor (1987), through camera and binocular stabilization, active sub-woofers (1997), pedometers, the Segway, the Wiimote, smartphones, washing machines and NXT robots, to helicopter toys.]
1.2.2 Diverse products and markets
The previous difficulty we had to define MEMS stems from the vast number of products that fall under the MEMS umbrella since the first commercial products appeared in 1973. The timescale in Figure 1.2 reveals the emergence of the main successful MEMS products since that early time - and if it seems to slow lately, it is only because we chose to show only those products that are here to stay and have generally changed the market landscape, which takes a few years to prove.
[Figure 1.2: timeline (1973-2009) of the emergence of successful commercial MEMS products in mechanical sensors, microfluidics, optical MEMS and RF MEMS, including the airbag accelerometer (Sensonor), the 2D optical switch (OMM), the bolometer (Ulis), the 3D optical switch (Lucent), the RF switch (Teravicta), the silicon oscillator (Discera) and the microphone (Knowles).]
RF MEMS is also emerging as a viable MEMS market. Next to passive components, like high-Q inductors produced on the IC surface to replace the hybridized components as proposed by MEMSCAP, we find RF switches, silicon oscillators (SiTime) and soon micromechanical filters.
But the list does not end here and we can find micromachined relays (MMR) produced for example by Omron, HDD read/write heads and actuators, or even toys, like the autonomous micro-robot EMRoS produced by Epson.
[Table: examples of MEMS products by type - pressure sensor, inertia sensor, microfluidics, bioMEMS...]
In 2010 these products represented a market of a bit less than $9B, increasing by almost 200% since 2002, with roughly 16% each for the traditional inkjet printer nozzle and pressure sensor markets, 33% in inertial sensors (accelerometers and gyroscopes), 12% each in projection display and microfluidics and the rest split between RF MEMS, microphones, oscillators [2]... Of course the MEMS market overall value is still small compared to the $300B IC industry - but there are two aspects that make it very interesting for investors:
it is expected to grow annually at a 2 digit rate for the foreseeable future, much higher than any projection for the mature IC industry market;
MEMS chips have a large leveraging effect, and on average a MEMS based system will have 8 times more value than the MEMS chip price (e.g., a DLP projector is about 10 times the price of a MEMS DLP chip).
It should be noted that this last point has created large discrepancies between market studies, whether they report the market for components alone or for devices. The numbers we cited above are in the average of other studies and represent the market for the MEMS part alone (actually a somewhat fairer comparison with
the electronics industry, where device price is considered, would put the value of the MEMS market at more than $70B).
1.2.3 Economy of MEMS manufacturing and applications
However large the number of opportunities is, it should not make companies believe that they can invest in any of these fields randomly. For example, although the RF MEMS market is growing, fueled by the appetite for smaller wireless communication devices, it seems to grow mostly through internal growth. Actually the IC foundries are developing their own technology for producing, for example, high-Q inductors, and it seems that an external provider will have a very limited chance to penetrate the market.
Thus, market opportunities should be analyzed in detail to eliminate the false perception of a large market, taking into consideration the targeted customer's inertia to change and the possibility that the targeted customer develops by himself a MEMS based solution! In that respect, sensors seem more accessible, being simple enough to allow full development within a small business unit and having a large base of customers - on the other hand, an optical switch matrix is riskier because its value is null without the system, which is built by a limited number of companies, which, most probably, also have the capabilities to develop in-house the MEMS component...
Some MEMS products already achieve high volume and benefit enormously from batch fabrication techniques. For example more than 100 million MEMS accelerometers are sold every year in the world - and with newer uses coming, this number is still growing fast. But large numbers in an open market invariably mean also fierce competition and ultimately reduced prices. Long gone are the days when a MEMS accelerometer could be sold $10 a piece - it is now less than $2 and still dropping. Currently, the next target is a 3-axis accelerometer in a single package for about $4, so that it can really enter the toys industry. And don't expect that you will be able to ramp up production and decrease prices easily: many of the initial producers of MEMS accelerometers (e.g., Novasensor) have not been able to survive when the price went south of $5 as their design could not be adapted to lower production cost. New companies overtook them (e.g., ST Microelectronics), with designs aimed from the start at reaching the $1 mark...
Of course, there are a few exceptions to this cost rule. Actually, if the number of units sold is also very large, the situation with the inkjet printer nozzle is very different. Canon and Hewlett Packard developed a completely new product, the inkjet printer, which was better than earlier dot matrix printers, and created a new captive market for its MEMS based system. This has allowed HP to repeatedly top the list of MEMS manufacturers with sales in excess of $600M. This enviable success will unfortunately be hard to emulate, but it will be done again!
But these cases should not hide the fact that MEMS markets are often niche markets. Few products will reach the million unit/year mark, and in 2006, among the more than 300 companies producing MEMS, only 18 had sales above $100M/year.
Thus great care should be taken in balancing the research and development effort, because the difficulty of developing new MEMS from scratch can be daunting and the return low. Actually current customers expect very high reliability, with <10 ppm failed parts for consumer products, and even <1 ppm for automotive applications. As such, it is not surprising that a normal component development time for automotive applications, as acknowledged by Sensonor, would be 2-3 years, and 2-3 years more if substantial process development is required... And it may be worse. Although Texas Instruments is now reaping the fruit of its Digital Light Processor, selling between 1996 and 2004 more than 4 million chips for a value now exceeding $200M/year, the development of the technology by L. Hornbeck took more than 10 years [3]. Few startup companies will ever have this opportunity, and don't be over-optimistic: when a prototype is done, only 10-20% of the job is done.
Actually it is not clear for a company what is the best approach for entering the MEMS business, and we observe a large variety of business models with no clear winner. For many years in the microelectronics industry the abundance of independent foundries and packaging companies has made the fabless approach a viable business model. However it is an approach favored by only a handful of MEMS companies and, as it seems now, for good reasons.
A good insight into the polymorphic MEMS business can be gained by studying the company MemsTech, now a holding listed on the Kuala Lumpur Mesdaq (Malaysia) and having offices in Detroit, Kuala Lumpur and Singapore.
Singapore is actually where everything started in the mid-90s for MemsTech, with the desire of an international company (EG&G) to enter the MEMS sensor market. They found a suitable partner in Singapore at the Institute of Microelectronics (IME), a research institute with vast experience in IC technology. This type of cooperation has been a frequent business model for MNCs willing to enter the MEMS market, by starting with ex-house R&D contract development of a component. EG&G and IME designed an accelerometer, patenting along the way new fabrication processes and developing a cheap plastic packaging process. Finally the R&D went well enough and the complete clean room used for the development was spun off and used for the production of the accelerometer.
Here, we have another typical startup model, where IP developed in research institutes and universities ends up building a company. This approach is very typical of MEMS development, with a majority of the existing MEMS companies having been spun off from a public research institute or a university.
A few years down the road the fab continuously produced accelerometers and changed hands to another MNC before being bought back in 2001 by its management. During that period MemsTech was nothing else but a component manufacturer providing off-the-shelf accelerometers, just like what Motorola, Texas Instruments and others are doing.
But after the buyout, MemsTech needed to diversify its business and started proposing fabrication services. It then split in two entities: the fab, now called
Sensfab, and the packaging and testing unit, Senzpak. Three years later, the company had increased its off-the-shelf product offering, proposing accelerometers, pressure sensors, microphones and an IR camera developed in cooperation with local and overseas universities.
This is again a typical behavior of small MEMS companies, where growth is fueled by cooperation with external research institutions. Still, at the same time MemsTech proposes wafer fabrication, packaging and testing services to external companies. This model where products and services are mixed is another typical MEMS business model, also followed by Silicon Microstructures in the USA, Colibrys in Switzerland, MEMSCAP in France and some others. Finally, in June 2004 MemsTech went public on the Mesdaq market in Kuala Lumpur.
The main reason why the company could survive its entire series of avatars is most probably because it had never overgrown its market and had the wisdom to remain a small company, with staff around 100 persons. Now, with a good product portfolio and a solid base of investors, it is probably time for expansion.
1.3 Major drivers for MEMS technology
From the heyday of MEMS research at the end of the 1960s, started by the discovery of silicon's large piezoresistive effect by C. Smith [4] and the demonstration of anisotropic etching of silicon by J. Price [5] that paved the way to the first pressure sensor, one main driver for MEMS development has been the automotive industry. It is really amazing to see how many MEMS sensors a modern car can use! From the first oil pressure sensors, car manufacturers quickly added manifold and tire pressure sensors, then crash sensors, one, then two and now up to five accelerometers. Recently the gyroscopes made their appearance for anti-skidding systems and also for navigation units - the list seems without end.
Miniaturized pressure sensors were also quick to find their way into medical equipment for blood pressure tests. Since then biomedical applications have drained a lot of attention from MEMS developers, and the DNA chip or micro-analysis systems are the latest successes in the list. Because you usually sell medical equipment to doctors and not to patients, the biomedical market has many features making it perfect for MEMS: a niche market with large added value.
Actually cheap and small MEMS sensors have many applications. Digital cameras have started using accelerometers to stabilize images, or to automatically find the image orientation. Accelerometers are now being used in contactless game controllers or mice.
These two latter products are just a small part of the MEMS-based systems that the computer industry is using to interface the arid beauty of digits with our human senses. The inkjet printer, DLP based projector, head-up display with MEMS scanner are all MEMS based computer output interfaces. Additionally, computer mass storage uses a copious amount of MEMS, for example, the hard-disk drive nowadays based on micromachined GMR heads and dual stage MEMS actuators.
1.4 Mutual benefits between MEMS and microelectronics
The synergies between MEMS development and microelectronics are many. Actually MEMS clearly has its roots in microelectronics, as H. Nathanson at Westinghouse reported in 1967 the resonant gate transistor [6], which is now considered to be the first MEMS. This device used the resonant properties of a cantilevered beam acting as the gate of a field-effect transistor to provide electronic filtering with high-Q (see Example 4.1). But even long after this pioneering work, the emphasis on MEMS based on silicon was clearly a result of the vast knowledge of silicon material and of silicon based microfabrication gained by decades of research in microelectronics. Even quite recently the SOI technology developed for ICs has found a new life with MEMS.
But the benefit is not unilateral and the MEMS technology has indirectly paid back this help by nurturing new electronic products. MEMS brought muscle and sight to the electronic brain, enabling a brand new class of embedded systems that could sense, think and act while remaining small enough to be placed everywhere.
As a more direct benefit, MEMS can also help keep older microelectronics fabs running. Actually MEMS most of the time have minimum feature sizes of 1-5 µm, allowing the use of older generation IC fabrication equipment that otherwise would just have been dumped. It is even possible to convert a complete plant, and Analog Devices has redeveloped an older BiCMOS fabrication unit to successfully produce their renowned smart MEMS accelerometer. Moreover, as we have seen, MEMS components often have small markets and, although batch fabrication is a must, a large part of the MEMS production is still done using 100 mm (4") and 150 mm (6") wafers - and could use 5-6 year old IC production equipment.
But this does not mean that equipment manufacturers cannot benefit from MEMS. Actually MEMS fabrication has specific needs (deeper etch, double side alignment, wafer bonding, thicker layers...) with a market large enough to support new product lines. For example, firms like STS and Alcatel-Adixen producing MEMS deep RIE, or EVGroup and Suss with their wafer bonders and double side mask aligners, have clearly understood how to adapt their know-how to the MEMS fabrication market.
Chapter 2
Introduction to MEMS design
2.1
The large decrease in size during miniaturization, that in some cases can reach 1 or 2 orders of magnitude, has a tremendous impact on the behavior of micro-objects when compared to their larger size cousins. We are already aware of some of the most visible implications of miniaturization. Actually nobody will be surprised to see a crumb stick to the rubbed surface of a plastic rod, whereas the whole bread loaf does not. Everybody will say that it works with the crumb and not with the whole loaf because the crumb is lighter. Actually it is a bit more complicated than that.
The force that is attracting the crumb is the electrostatic force, which is proportional to the amount of charge on the surface of the crumb, which in turn is proportional to its surface. Thus when we shrink the size and go from the loaf to the crumb, we not only decrease the volume and thus the mass, but we also decrease the surface and thus the electrostatic force. However, because the surface varies as the square of the dimension and the volume as the cube, this decrease in the force is relatively much smaller than the drop experienced by the mass. Thus finally not only is the crumb mass smaller, but, what is more important, the force acting on it becomes proportionally larger - making the crumb really fly!
[Figure 2.1: a cube of side 10 (s = 600, v = 1000, v/s = 1.6) scaled down to a cube of side 1 (s = 6, v = 1, v/s = 0.16).]
Consider a cube whose side goes from a length of 10 to a length of 1. The surface of the bigger cube is 6 × 10 × 10 = 600 whereas its volume is 10 × 10 × 10 = 1000. But now what happens to the scaled down cube? Its surface is 6 × 1 × 1 = 6 and has been divided by 100, but its volume is 1 × 1 × 1 = 1 and has been divided by 1000. Thus the volume/surface ratio has also shrunk by a factor of 10, making the surface effect proportionally 10 times larger with the smaller cube than with the bigger one.
This decrease of the volume/surface ratio has profound implications for the design of MEMS. Actually it means that at a certain level of miniaturization, the surface effects will start to be dominant over the volume effects. For example, friction force (proportional to surface) will become larger than inertia (proportional to mass hence to volume), heat dissipation will become quicker and heat storage reduced: energy storage will become less attractive than energy coupling... This last example is well illustrated by one of the few micromachines ever built, the EMRoS micro-robot from Epson. The EMRoS (Epson Micro Robot System) is not powered with a battery (which stores energy proportionally to its volume and becomes less interesting at small scale) but with solar cells whose output is clearly proportional to surface.
Then of course we can delve into a more elaborate analysis of the laws of nature and try to see, apart from the geometrical factor, what happens when we shrink the scale. Following an analysis pioneered by W. Trimmer [7], we may describe the way physical quantities vary with scale as a power of an arbitrary scale variable, s. We have just seen that volume scales as s³, surface as s² and the volume/surface ratio as s¹. In the same vein we may have a look at different forces and see how they scale down (Table 2.1).
Table 2.1: Scaling laws for some common forces.

Force            Scaling law
Surface tension  s¹
Electrostatic    s²
Magnetic         s³
Gravitational    s⁴
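To make these scaling laws concrete, here is a minimal MATLAB sketch (the absolute force values are arbitrary, only the exponents come from Table 2.1) comparing how the different forces shrink when the scale s is reduced by a factor of 10 and 100:

% Scaling of forces relative to an object of scale s = 1.
% Exponents from Table 2.1; the absolute values are arbitrary.
s = [1 0.1 0.01];                 % scale reduced by 10x and 100x
exponents = 1:4;                  % surface tension, electrostatic, magnetic, gravity
labels = {'surface tension (s^1)', 'electrostatic (s^2)', ...
          'magnetic (s^3)', 'gravitational (s^4)'};
for k = 1:numel(exponents)
    fprintf('%-22s scales by %g and %g\n', labels{k}, ...
            s(2)^exponents(k), s(3)^exponents(k));
end
% Ratio of a surface-like force (s^2) to the weight (s^3) grows as 1/s:
fprintf('force(s^2)/weight(s^3) gain at s = 0.01: %g\n', 1/0.01);

The last line makes the crumb example quantitative: a force scaling as s² gains a factor 1/s over the weight (s³), so at a 100 times smaller scale it is relatively 100 times stronger.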
For example, the stiffness of a cantilever beam bending along its width is given by

k = E h w³ / (4 L³),    (2.1)
where E is the elasticity modulus, h is the beam thickness, w its width and L its length. For a nominal beam width of 2 µm with an absolute fabrication tolerance of ±0.2 µm, the relative accuracy is ±10%. The stiffness for bending along the width direction varies as the power of 3 of the width and will thus have a relative accuracy of ±30%. For a stiffness nominal value of 1 N/m, it means that the expected value can be anywhere between 0.7 N/m and 1.3 N/m - the range indicates a potential variation by almost a factor of two! If our design does not support this full range, the yield of the device will be very low. In this particular case, we could improve the relative accuracy figure by taking advantage of the mostly constant absolute fabrication tolerance (here ±0.2 µm) and increase the beam width. For example, if the beam width grows to 4 µm we reduce the stiffness variation to 3 × 0.2/4 = ±15% - of course it also means doubling the beam length if we want to keep the same spring constant.
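The same tolerance calculation can be scripted; this is a minimal MATLAB sketch where the elasticity modulus E and the thickness h are assumed placeholder values (they cancel out of the relative spread anyway):

% Effect of a fixed absolute width tolerance on the stiffness k = E*h*w^3/(4*L^3).
E  = 160e9;          % Pa, elasticity modulus (assumed, polysilicon-like value)
h  = 2e-6;           % m, beam thickness (assumed)
dw = 0.2e-6;         % m, absolute width tolerance
k_nom = 1;           % N/m, targeted nominal stiffness
for w = [2e-6 4e-6]
    L  = (E*h*w^3/(4*k_nom))^(1/3);   % length chosen to give k_nom exactly
    kp = E*h*(w+dw)^3/(4*L^3);        % stiffness at w + dw
    km = E*h*(w-dw)^3/(4*L^3);        % stiffness at w - dw
    fprintf('w = %g um: k between %.2f and %.2f N/m (%+.0f%% / %+.0f%%)\n', ...
            w*1e6, km, kp, 100*(km-k_nom)/k_nom, 100*(kp-k_nom)/k_nom);
end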
2.2
Since the first days of pressure sensor development, MEMS designers have had to face the complexity of designing MEMS. Actually, whereas IC design relies on an almost complete separation between fabrication process, circuit layout design and packaging, the most successful MEMS have been obtained by developing these three aspects simultaneously (Figure 2.2).
[Figure 2.2: in IC design, process, device and system/packaging are handled as separate tasks, whereas in MEMS design the process/material, device and system/packaging aspects are developed together.]
This may imply the use of alignment marks, on the MEMS and in the package. In other cases the chip may need to be aligned with an external access port. Actually MEMS sensors often need an access hole in the package to bring air or a liquid in contact with the sensing chip, complicating substantially the packaging. One of the innovative approaches to this problem has been to use a first level packaging during the fabrication process to shield the sensitive parts, finally linking the back-end with the front-end. Even for MEMS that do not need access to the environment, packaging can be a complex issue because of stress.
MEMS often use stress sensitive structures to measure the deformation of a member, and the additional stress introduced during packaging could affect this function. Motorola solved this problem with its line of pressure sensors by providing calibration of the device after packaging - then any packaging induced drift will be automatically zeroed.
This kind of solution highlights the need to practice design for testing. In the case of Motorola this resulted in adding a few more pins in the package, linked to test points, to independently tweak the variable gain amplifiers. This cannot be an afterthought, but needs to be taken into consideration early. How will you test your device? At wafer level, chip level or after packaging? MEMS require here again very different answers than ICs.
Understandably it will be difficult to find all the competences needed to solve these problems in one single designer, and good MEMS design will be teamwork, with brainstorming sessions trying to find the best overall solution. MEMS design cannot simply reduce to a sequence of optimized answers for each of the individual process, device and packaging tasks - success will only come from a global answer to the complete system problem.
An early misconception about MEMS accelerometers was that these small parts, with suspensions that were only a few µm wide, would be incredibly fragile and break with the first shock. Of course it wasn't the case, first because silicon is a wonderful mechanical material tougher than steel, and then because the shrinking dimensions implied a really insignificant mass, and thus very little inertia forces.
But sometimes people can be stubborn and seldom really understand the predictive nature of the laws of physics, preferring to trust their (too) common sense. Analog Devices was facing the hard task of convincing the army that their MEMS based accelerometer could be used in military systems, but it quickly appeared that it had to be a more direct proof than some equations on a white board. They decided to equip a mortar shell with an accelerometer and a telemetry system, and then fired the shell. During flight, the accelerometer measured a periodic signal, that was later traced back to the natural wobbling of the shell. Then the shell hit its target and exploded. Of course the telemetry system went mum and the sensor was destroyed. However, the fragile sensing part was still found in the debris... and it wasn't broken.
In another example, the DLP chip from Texas Instruments has mirrors supported by torsion hinges 1 µm wide and 60 nm thick that clearly seem very prone to failure. TI engineers knew it wasn't a problem, because at this size the slippage between material grains occurring during cyclic deformation is quickly relieved at the hinge surface and never builds up, avoiding catastrophic failure. But, again, they had to prove their design right in a more direct way. TI submitted the mirrors of many chips to 3 trillion (10¹²) cycles, far more than what is expected from normal operation... and again not a single one of the 100 million tested hinges failed.
Of course, some designs will be intrinsically more reliable than others and, following a taxonomy introduced by P. McWhorter at Sandia National Laboratory [11], MEMS can be divided in four classes, with potentially increasing reliability problems.
Class I   No moving parts
Class II  Moving parts, no rubbing and impacting parts
Class III Moving parts, impacting surfaces (e.g., TI DLP, relay, valve, pump...)
Class IV  Moving parts, impacting and rubbing surfaces (e.g., optical switch, scanner, locking system)
2.3
As we have seen, miniaturization science is not always intuitive. What may be true at large scale may become wrong at smaller scale. This translates into an immediate difficulty to design new MEMS structures following gut feeling. Our intuition may be completely wrong and will need to be backed up by models. However simulation of MEMS can become incredibly complex, and S. Senturia describes a multi-tiered approach that is more manageable [14], as shown in Figure 2.3.
[Figure 2.3: MEMS multi-tiered simulation (adapted from [14] and expanded): process level (TCAD, material DB), physical level (PDEs solved by FEM/BEM), device level (lumped elements, reduced-order behavioral models) and system level (block diagrams).]
Some simulation tools, like Intellisuite by Intellisense or Coventorware by Coventor, have been specifically devised for MEMS. They allow accurate modeling using meshing methods (FEM, BEM) to solve the partial differential equations that describe a device in different physical domains. Moreover, they try to give a complete view of the MEMS design, which, as we said before, is material and process dependent, and thus they give access to material and process libraries. In this way it is possible to quickly build a 3D model of the MEMS from the mask layout using a simulated process. However MEMS process simulation is still in its infancy, and the process simulator is used as a simple tool to quickly build the simulation model from purely geometrical considerations, but cannot yet be used to optimize the fabrication process. One exception will be the simulation of anisotropic etching of silicon and some processes modeled for IC development (oxidation, resist development...) where the existing TCAD tools (SUPREM, etc.) can be used.
Complete MEMS devices are generally far too complex to be modeled entirely ab initio, and they are first divided into sub-systems that are more manageable. First they can be partitioned between a pure MEMS part and electronics, but they also need to be considered as a mosaic of diverse elements: mechanical structures, actuators, sensors, etc. For example, the MEMS optical switch shown in Figure 2.4, without its control electronics, can be divided into a set of optical waveguides, a pair of electrostatic actuators, multiple springs and hinges, a lock and many linkage bars. Then each of the sub-systems can be modeled using reduced order models. For example, behavioral simulation is used with MEMSPro from MemsCap, using numerical simulation (ANSYS) to obtain the element characteristics and generate the reduced model, which then is solved in a circuit-analysis software like Spice. Sugar, from C. Pister's group at UC Berkeley, is also based on behavioral models with discrete elements, but the decomposition of the structure into simpler elements is left to the designer. Still, although the actual tendency is to use numerical modeling extensively, it is our opinion that no good device modeling can be devised without a first analytic model based on algebraic equations. Developing a reduced order
Figure 2.4: MEMS optical switch composed of (red) optical waveguides, (yellow) hinges and lock, (purple) springs and suspension, (green) electrostatic actuator, (blue) alignment structure.
model based on some analytic expressions helps our intuition regain some of its power. For example, seeing that the stiffness varies as the beam width to the cube makes it clearer how we should shrink this beam: if the width is divided by a bit more than two, the stiffness is already ten times smaller. This kind of insight is invaluable. The analytical model devised can also be useful for validating the numerical simulation, which in turn can be used to get detailed insight into the device.
The system level simulation is often not in the hands of the MEMS designer alone, but here block diagrams can be used with only a limited set of key state variables. The model may then include the package, the electronics, and the MEMS part will be represented by one or more blocks, reusing the equations derived in the behavioral model.
2.4
The exact internal (read: physical) description of a complete microsystem is generally rather complex and time-consuming to fully handle for the designer. Except in the most simple cases where an analytical solution exists for the complete structure, modeling the continuous physical system is obtained by discretization. Actually the description of the physical system requires the resolution of the partial differential equations (Maxwell, Newton...) of physics by dividing the system in many sub-domains where the equation can be solved analytically or numerically with ease. In this way, the complete continuous problem is replaced by a set of discrete problems, and we intuitively understand that the larger the size of this set is, the closer the resulting discrete model will be to the physical model. As such, in the FEM method, this approach is systematically applied to divide the continuous system in small sub-blocks, yielding a large number of linear equations that can only be solved numerically with the help of a computer.
However the discretization approach can be used with another strategy: instead of blindly discretizing the complete structure, we can consider only its characteristics of interest and its behaviour. For example, if we have a cantilevered beam submitted to a vertical force at its end, as shown in Figure 2.5, we can choose to model its behaviour in a number of ways:

[Figure 2.5: a cantilevered beam loaded by a vertical force F at its tip, modeled with different levels of detail, down to a simple spring.]
The last approach loses some of the detail of the device (what happens if the force is sideways? The beam behaves as a spring but it also has a mass, how can we take care of that?); it is a reduced order model, but it often has the advantage of being amenable to an analytic solution, or at least to a simple enough model to be useful during design, even if it requires numerical simulations for refining it.
The block and the circuit representations are two techniques that can be used with lumped elements for complete system simulation, particularly for studying their evolution with time, that is, their dynamics. If these two representations are somewhat equivalent and can be used interchangeably to describe most systems, the circuit element modeling is usually preferred for sub-systems as it has the advantage of being energetically correct and of intrinsically representing the reciprocal
[Figure 2.6: lumped model of a mass-spring system driven by an actuator force F, and its equivalent block-diagram representation built from gains (1/m, -k, -c) and integrators (1/s).]
This hybrid method tries to use the best of the analytical and numerical approaches and can be very efficient to model complex systems, particularly when the underlying physics is not fully understood by the designer.
2.4.1
Block modeling
If artistic drawings, like the beam in Figure 2.5, can be nice for a textbook introduction, it is obvious that the engineer needs another technique to represent the elements inside such a system. One method is to use block diagrams, where each component is represented by a box with an input and an output. Inside the box we write the transfer function, that is, the function that relates the input to the output (Figure 2.7). In general, the choice of the input and output variables is dictated by the system considered.
[Figure 2.7: block representation of an element: the input x enters a box containing the transfer function y = h(x) and the output y exits; for instance an amplifier block V = A·VT producing the voltage V (V).]
2.4.2
In system theory, the measurement system presented in Example 2.1 is called an open-loop system. This kind of architecture is typical for measurement systems, but control systems will use a closed-loop architecture. The function of such a system is to maintain a variable (e.g., temperature, speed, direction...) at a desired value (e.g., 37 °C, 1 m/s, 175°...). Sensors are used there to measure the controlled value or a related quantity, and actuators to regulate it. Obviously, control systems are everywhere, from the biologic system that adjusts the diameter of the pupil in the eye for controlling the light intensity on the retina, to the lever system allowing automatic leveling of water in the toilet flush!
In control systems the output of the system is fed back to the input, where an error detector will sense any difference between the desired output and the actual output and act accordingly to correct the error (Figure 2.8).

[Figure 2.8: closed-loop system; the error detector produces the error e = x - f from the input x and the feedback signal f, the forward path G = y/e gives the output y = Ge, and the feedback path H = f/y returns the feedback signal f = Hy.]

Combining these relations, the closed-loop response is

y = G/(1 + GH) · x
Thus we can redraw the previous block diagram as shown in Figure 2.9.

[Figure 2.9: equivalent open-loop block; the input i passes through a single transfer function G/(1 + GH) to give the output o = i · G/(1 + GH).]
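A minimal numerical sketch (arbitrary gain values) of this closed-loop relation shows the key property of feedback: when the loop gain GH is large, the overall gain tends to 1/H and hardly depends on G any more.

% Closed-loop gain G/(1+GH) for a fixed feedback H:
% when the loop gain GH is large the result approaches 1/H,
% almost independently of the exact value of G.
H = 0.1;                         % feedback path gain (arbitrary)
G = [1 10 100 1000 10000];       % forward path gains (arbitrary)
closed_loop = G ./ (1 + G*H);
disp([G' closed_loop'])          % second column tends to 1/H = 10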
[Example 2.2 (figure): closed-loop control of the luminous flux on the retina. The iris, driven by the control voltage VI, transmits a luminous flux cI0 out of the ambient light intensity I0; a photovoltaic cell of sensitivity k converts the flux falling on the retina (PD) into the sensor voltage VP, and the error amplifier of gain A compares VP with the set-point V0.]

Applying the closed-loop relation G/(1 + GH), the radiant flux on the retina (PD) tends to V0/k and is independent of the value of the light intensity I0 if the gain is high, kAc >> 1.
2.4.3
Generally the relationship between the output and the input (i.e., the transfer function) is a complex non-linear function of both input and output, changing with time. If we restrict ourselves to the case of a single input-single output (SISO) system, this relationship is expressed as

y = f(x, t).    (2.2)

The system is said to be linear if its response verifies the scaling property

f(λx, t) = λ f(x, t)    (2.3)

or equivalently the superposition property¹

f(x1 + x2, t) = f(x1, t) + f(x2, t).    (2.4)
Moreover, if the system is linear and time-independent (i.e., its characteristics don't depend noticeably on time, at least during the time of observation²), the relationship between its input and output can be represented by a differential equation,

d^n y/dt^n + a_(n-1) d^(n-1)y/dt^(n-1) + ... + a_1 dy/dt + a_0 y = b_m d^m x/dt^m + b_(m-1) d^(m-1)x/dt^(m-1) + ... + b_1 dx/dt + b_0 x    (2.5)

with theoretically n > m³. The sum m + n is called the order of the system.
The Laplace's transform gives a way to represent the differential equation 2.5 governing a physical system (e.g., the equation of the voltage in an electric circuit).
¹ Note that the eye is definitely not a linear detector (it follows more or less a logarithmic function) and thus visually two lamps of equal intensity lit together won't create twice as much luminosity (i.e., visual intensity).
² It means that we neglect any drift effect, or treat it as a set of quasi-stationary systems.
³ Actually we have used in the previous examples n = m. It is acceptable in punctual systems (i.e., systems without spatial extension) as is the case in lumped models (cf. Appendix B).
[Example 2.3 (figure): first-order RC low-pass filter; the input voltage vi drives a resistor R in series with a capacitor C, the output voltage vo is taken across the capacitor; iR and iC are the currents in the resistor and the capacitor.]
We have iR = vR/R = (vi - vo)/R in the resistor, and iC = C dvo/dt for the capacitor. Thus, as iR = iC, the relationship between vo and vi becomes simply:

RC dvo/dt + vo = vi
Using the table of Laplace's transform properties (Appendix D) we see that the differentiation corresponds to a product by the Laplace variable s, and that the sum remains a sum. Thus, in the Laplace's domain the previous equation becomes:

RCsVo + Vo = Vi

Vo/Vi = 1/(1 + RCs) = 1/(1 + τs)

where τ = RC is called the time constant of the filter (with the dimension of seconds). Consequently the block diagram in the Laplace's domain (note the capital letters) for this element can be represented by:

[block diagram: Vi -> 1/(1 + RCs) -> Vo]
[Example 2.4 (figure): a mass m suspended by a spring of stiffness k; the displacement x is imposed at the free end of the spring and y is the resulting displacement of the mass.]

Neglecting the gravity effect, we write down the fundamental equation of dynamics as:

F = k(x - y) = m d²y/dt²

Using again the table of Laplace's transform properties (Appendix D) we see that the second order derivative corresponds to a product by s². Thus, in the Laplace's domain we have:

k(X - Y) = ms²Y

Y/X = 1/(1 + (m/k)s²) = 1/(1 + s²/ωn²)

where ωn = √(k/m) is the natural frequency of the mass-spring system. Consequently the block diagram in the Laplace's domain for this element is represented by:

[block diagram: X -> 1/(1 + (m/k)s²) -> Y]
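As a quick order-of-magnitude check of this result, here is a short MATLAB sketch with assumed, MEMS-like values of k and m (not taken from any specific device):

% Natural frequency of the lumped mass-spring model, f_n = sqrt(k/m)/(2*pi).
% k and m are assumed, MEMS-like values.
k = 1;              % N/m, suspension stiffness (cf. the 1 N/m example above)
m = 1e-9;           % kg, proof mass of about one microgram (assumed)
wn = sqrt(k/m);     % rad/s
fn = wn/(2*pi);     % Hz
fprintf('omega_n = %.3g rad/s, f_n = %.3g kHz\n', wn, fn/1e3);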
The Laplace's transform of a function f(t) of time is defined as

F(s) = L{f(t)} = ∫₀^∞ f(t) e^(-st) dt,    (2.6)

but for the most common functions and operations (like differentiation, integration, time delay...) we may use tables of transforms (Appendix D) that avoid repeating the integration. We should note that usually the name of a function in the Laplace's domain is written in upper case or block capitals (F(s)) and it uses the variable s.
2.4.4
As we have seen, for block analysis (and the Laplace's transform), it doesn't matter what physical effect is behind the system we are considering, because the behavior of electrical, mechanical, thermal or even fluidic systems may be described by similar differential equations. We have already seen in Examples 2.3 and 2.4 that electrical and mechanical systems give transfer functions (and differential equations) of similar form. Then by inspecting the equations, we can see that the lumped electrical elements behave in the same way as the lumped mechanical elements. They are said to be analogue systems.
Accordingly we give simple analogies for systems made of lumped mechanical (mass m, spring k, damper c), electrical (resistor R, capacitor C, inductor L), thermal (thermal capacitance and resistance) and fluidic (fluidic inductance and resistance) elements in Table 2.3.
Table 2.3: Analogies between lumped elements.

Lumped element   Electrical   Mechanical   Thermal              Fluidic
Effort           Voltage      Force        Temperature          Pressure
Flow             Current      Speed        Heat flow            Flow rate
Capacitance      C            1/k          thermal capacitance  -
Inductance       L            m            -                    fluidic inductance
Resistance       R            c            thermal resistance   fluidic resistance
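The table can be checked numerically; this is a minimal MATLAB sketch (arbitrary element values) showing that the mass-spring block of Example 2.4 and its electrical analogue built from Table 2.3 (mass for inductance, compliance 1/k for capacitance) have exactly the same transfer function.

% Analogy check: mechanical mass-spring vs electrical L-C divider.
% Using Table 2.3: m <-> L and 1/k <-> C (element values are arbitrary).
m = 1e-9;  k = 1;                 % mechanical elements (kg, N/m)
L = m;     C = 1/k;               % analogous electrical elements (H, F)
w = logspace(2, 6, 5);            % a few angular frequencies (rad/s)
s = 1j*w;
H_mech = 1 ./ (1 + (m/k)*s.^2);   % Y/X from Example 2.4
H_elec = 1 ./ (1 + L*C*s.^2);     % Vout/Vin of a series L, shunt C divider
disp(max(abs(H_mech - H_elec)))   % identical up to round-off: 0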
Figure 2.10: One and two port(s) lumped elements in circuit analysis.
The analogy would be different if we chose another set of state variables (for example we could take the voltage instead of the charge, which will invert the role of the capacitor and inductor) or if the elements were connected differently. The right set of state variables could be dictated by the input and output of the block used, but there is a rationale behind the different analogies that makes some more meaningful than others.
Actually, physical variables may be classified as effort and flow variables. The effort variables follow the Kirchhoff's voltage law (KVL), which states that their sum is zero along a closed path, while the flow variables follow the Kirchhoff's current law (KCL) and sum to zero at a node. The analogy of Table 2.3 associates the voltage of the circuit to all the other effort variables and the current to all the flow variables. Notice that in this case the product of the flow and the effort variables has indeed the dimension of a power (and accordingly, the time integral of the product, the dimension of an energy), and thus could allow some sanity check during modeling. The main problem encountered with circuit description
Example 2.5 Circuit modeling using analogies.
Taking again the simple mass-spring mechanical system analyzed previously, but this time with an additional external force F(t) applied on the mass.

[Figure: the mass-spring system with the external force F(t), and its equivalent circuit built from Table 2.3 with a source F(t), the compliance 1/k and the mass m.]
is: how to connect the lumped elements together? This is solved by understanding
that elements that share the same effort are connected in parallel, while elements
sharing the same flow are connected in series.
We note that when we have the circuit-element model of our sub-system, instead of deriving the differential equation we may use impedances and the Kirchhoff's laws to directly establish the transfer function. Actually, in the Laplace domain, for each passive circuit element the relationship between the voltage drop across one port U (the effort) and the current I (the flow) can be written as:

U = Zi I

where Zi is the impedance of the element i. For the resistance we have

ZR = R,
and for the capacitance,

ZC = 1/(Cs).

For the low-pass filter of Example 2.3, the same current flows through R and C, thus

Vo = VC = IC/(Cs)  with  IC = (Vi - Vo)/R,

that is, Vo = (Vi - Vo)/(RCs), giving the expected final relationship in the Laplace domain without any differential equation:

Vo = Vi/(1 + RCs)
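A minimal numerical sketch of this impedance shortcut (arbitrary component values), comparing the voltage divider expression with the formula 1/(1 + RCs) on the imaginary axis s = jω:

% RC low-pass transfer function from impedances, Vo/Vi = Zc/(Zr + Zc).
R = 10e3;  C = 100e-9;            % arbitrary component values (10 kOhm, 100 nF)
f = [10 100 1e3 1e4];             % test frequencies in Hz
s = 1j*2*pi*f;                    % Laplace variable on the imaginary axis
Zr = R;  Zc = 1./(C*s);
H_divider = Zc ./ (Zr + Zc);      % impedance (voltage divider) expression
H_formula = 1 ./ (1 + R*C*s);     % expected result 1/(1+RCs)
disp(max(abs(H_divider - H_formula)))   % 0 up to round-off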
2.5 Dynamic analysis
In principle, to obtain the dynamic response of a system (that is, its evolution with time) we need to solve the differential equation that describes the system (eq. 2.5) with the appropriate initial conditions.
However, with the lumped element analysis using either block or circuit elements we have simpler ways to deal with this problem⁵. Actually, we will be using again
⁵ It could be noted that the lumped elements we use in our modeling are actually punctual elements: we neglect their spatial extension and accordingly the time for the signal to travel through one element is completely ignored. The dependence of the response of a system with time is of another nature, linked with the existence of energy storage elements that require time to be charged or discharged.
the Laplace's transform for studying the system dynamics in the time domain, or a very close formalism, the Fourier's transform, to study it in the frequency domain. Actually, the dynamic study of a system can be accurately performed both in the time domain (where the variable is t) and in the frequency domain (where the variable is f or ω).
2.5.1
With the Laplace's transform the tedious task of solving the differential equation⁶ is replaced by a lookup in a table of transforms (cf. Appendix D.2) and algebraic manipulations facilitated by simple properties (cf. Appendix D.1).
The procedure can be summarized as follows:
First step Using the properties of the Laplace's transform (Table D.1), transpose the integro-differential equation of the system (t variable) to an equation in the Laplace's domain (s variable), to obtain the transfer function of the system;
Second step Transpose the input signal in the Laplace's domain using the table of functional transforms (Table D.2);
Third step Compute the output signal in the Laplace's domain by multiplying the transfer function with the input signal;
Fourth step Transform back this subsidiary solution to the time domain using a table of inverse Laplace's transforms (Table D.2). Before looking up in the table of inverse transforms, it is better to perform some algebraic manipulation on the equation in s to bring it to a form consisting of a sum of simple polynomials, whose inverse transforms are easily found in the table.
Although this procedure works for any input signal, we usually limit ourselves to a few classes of typical signals. The most useful one is the step signal, which represents a sudden change in the input, and which is represented by a step function:

u(t) = 0 if t < 0,  1 if t > 0
⁶ The Laplace's transform method works both for ordinary differential equations (ODE) and partial differential equations (PDE), but we will see here only the simpler problems, described by the former.
As we can see from Table D.2, the Laplace's transform of this function is very simple,

U(s) = L{u(t)} = 1/s.

The response of a system to the step signal is simply called the step response, as in Example 2.7, and it gives a good insight on how quickly the system will react when there is a sudden change in the input (e.g., for an accelerometer it would mean when a car crashes and the acceleration increases quickly).
2.5.2
However, the time domain analysis is generally not sufficient to get a good understanding of the properties of a system, and often the dynamic analysis is completed by an analysis in the frequency domain. In the frequency domain, the plot of a function versus frequency is called the spectrum of the function, and we talk about the spectrum of a signal or the spectrum of a transfer function.
To obtain the spectrum of the response, we need to compute the Fourier's transforms of the transfer function and of the input signal. These computations are again obtained easily with the Laplace's transform. The trick is even simpler than for the time domain analysis⁷: we need to replace in the function in the Laplace's domain the s variable with jω, where j is the complex imaginary unit (i.e., j² = -1) and ω is the angular frequency of the signal (i.e., ω = 2πf where f is the frequency).
For any value of the frequency, the transfer function becomes a complex number (H(ω) = A(ω)e^(jφ(ω))) whose amplitude A(ω) is the gain (or amplitude ratio) and whose phase φ(ω) is the phase-shift induced by the transfer function.
Then, as we did in the Laplace's domain, we can obtain the spectrum of the response to any input signal by multiplying the transfer function in the frequency domain by the spectrum of the input signal. The spectrum of the input signal is again obtained by replacing the s variable with jω in the Laplace's transform of the signal⁸. The resulting spectrum of the input signal is a complex quantity with an amplitude and a phase, and it can be represented by its amplitude spectrum (the amplitude vs frequency) and its phase spectrum (the phase vs frequency).
Again, in the frequency domain there is a particular response that is more often studied, which is called the sinusoidal steady state response. It is the response to an input signal whose spectrum has a constant amplitude X(ω) = 1 and a phase spectrum of 0 (see the amplitude spectrum in the inset⁹).
⁷ Actually what we compute here is the Fourier's transform, that could be obtained directly with integrals or tables too - but as we already have the Laplace's transform this is a much simpler way. Note that the transform is a complex quantity and a refresher in complex numbers is provided in Appendix E.
⁸ Actually we are again performing a Fourier's transform of the input signal.
Example 2.7 Step response of the first order RC low-pass filter to a 10 V step applied at its input. In the Laplace's domain the output is

Vo(s) = 1/(1 + τs) · 10/s = 10/(s(1 + τs))

Expanding this expression in simpler fractions gives

10/(s(1 + τs)) = (10/τ) · 1/(s(1/τ + s)) = 10 (1/s - 1/(1/τ + s))

so that, using the table of inverse transforms,

vo(t) = L⁻¹{Vo(s)} = 10 L⁻¹{1/s} - 10 L⁻¹{1/(1/τ + s)} = 10(1 - e^(-t/τ)) = 10 u(t)(1 - e^(-t/τ))

The output of the circuit will never reach the final value but rises slowly towards it asymptotically (we have 63% of the max value when t = τ).

[Figure: the 10 V input step vi(t) and the output vo(t), which has reached only 6.3 V at t = τ.]
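The result of Example 2.7 is easy to evaluate numerically; a minimal MATLAB sketch (assuming, say, τ = 1 ms) plots the step response and checks the 63% value at t = τ:

% Step response of the RC low-pass filter to a 10 V step, vo = 10*(1 - exp(-t/tau)).
tau = 1e-3;                         % s, time constant (assumed value)
t   = linspace(0, 7*tau, 1000);     % observe up to 7*tau
vo  = 10*(1 - exp(-t/tau));
plot(t*1e3, vo), xlabel('time (ms)'), ylabel('v_o (V)')
fprintf('v_o(tau)   = %.2f V (63%% of 10 V)\n', 10*(1 - exp(-1)));
fprintf('v_o(7 tau) = %.3f V\n', 10*(1 - exp(-7)));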
To obtain the corresponding response we need, as we just said, to multiply the transfer function in the frequency domain by the input signal spectrum - it is of course trivial because the spectrum of the input signal is constant and equal to 1. Thus the sinusoidal steady state response is actually simply H(jω), that is, the Fourier's transform of the transfer function. We note here that, as the analysis is actually performed in the frequency domain, we don't need to transform back the result as we did for the time domain analysis with the Laplace's transform.
The sinusoidal steady state response (H(ω) = A(ω)e^(jφ(ω))) is a complex quantity, with an amplitude and a phase, and it is customary to plot it as a function of frequency in a pair of plots called the Bode diagram. On one plot, the amplitude is plotted in decibels (20 log A(ω)) and on the other the phase (φ(ω)) is plotted in degrees. In both plots the horizontal axis is the frequency on a logarithmic scale, allowing display of a wide range of frequencies (e.g., from 0.1 Hz to 1 MHz)¹⁰.
At the frequencies where the amplitude (or gain) spectrum is larger than 1, the input signal is said to be amplified by the system, while when it is smaller than 1 it is attenuated. The phase of the sinusoidal steady state response is also a very important parameter of a system. For example, it is used to determine if a closed-loop system is stable or not, that is, if it will start oscillating by itself or saturate¹¹. Even with the Laplace's transform method, a general linear system of order m + n is quite tedious to handle. It is thus advisable to try to simplify the problem by lowering the order of the system using valid approximations.
If we remember Example 2.2, we have already used a crude approximation for the photodetector transfer function by using a constant (VP = k·PD), corresponding actually to a model of zero order. It is valid only for slowly varying signals, where the output can be considered to match the input instantaneously (i.e., at a much larger speed than the time it takes for the signal to change). Actually, strictly speaking, a zero order transfer function is not physical, and a less crude model for the photodiode will be using a first order model.
Actually the transfer function of many microsystems can be quite accurately described by an equation of first (i.e., m+n = 1) or second (m+n = 2) order. This happens even in complex systems because there are often only one or two elements that dominate the dynamic behaviour.
⁹ How does such a signal with a constant frequency spectrum and phase look like? In the time domain it is a pulse, that is, a signal of very short duration and finite energy. Mathematically, it is represented by the Dirac function δ(t) whose Fourier's transform is of course... 1, constant over the whole frequency spectrum, as confirmed by the Laplace's transform table.
¹⁰ Another often used representation of the sinusoidal steady state response is the Nyquist diagram, where instead of plotting the amplitude and phase on two separate diagrams, we use a parametric plot (f is the parameter) of the complex frequency response with the real part on the X-axis and the imaginary part on the Y-axis.
¹¹ Control theory tells that a closed-loop system is not stable if the signal fed back to the input has been amplified (i.e., A ≥ 1) and is in opposition of phase (i.e., φ = 180°) - such a condition can be observed directly on the Bode diagram. The very important problem of stability in closed-loop systems is beyond the scope of this course, but the interested reader may refer to control theory books.
44
H(s) =
1
Vo
=
Vi
1 + s
To get the transfer function in the frequency domain we replace s by j and get :
H(j) =
1
1
1
=
j
=
ej arctan
2
2
2
2
1 + j
1+
1+
1 + 22
-10
-20
-20
Phase (o)
Amplitude (dB)
We have here extracted the amplitude (modulus) of the complex transfer function
and its phase using the properties of complex numbers (cf. Appendix E).
-30
-40
-50
0.01
1
0.1
10
Normalized pulsation wt
100
-40
-45
-60
-80
-90
-100
0.01
1
0.1
10
Normalized pulsation wt
100
Looking at the amplitude transfer function, it becomes obvious why such a circuit
is called a low-pass filter: it let the signal with low-frequency pass ( < 1 or
f < 1/(2RC)) and attenuates substantially the signal with higher frequency.
45
2.5.3
First-order model
Typical example of microsystems that may be modeled using a first-order approximation are thermal sensors, certain chemical sensors, a photodiode, a seismic
sensor... and we have already seen a first-order transfer function in the example 2.7. In all this cases the inertia of the system (i.e., its ability to resist change)
is much larger than any other characteristics. Then, the relationship between the
input (x) and the output (y) is given in the time and the Laplaces domain by :
dy
dt
G
Y (s) =
X(s)
1 + s
y = Gx
(2.7)
(2.8)
where G is the static gain of the system (e.g., the sensitivity of a sensor or 1 in the
low-pass filter example) and is its time constant. The static gain is given by the
ratio between the input and the output after an infinite time has elapsed after
the input has changed - that is when the transient part of the signal at the output
has died out and dy/dt = 0. It is also simply the value of the transfer function
when the frequency of the input is 0 (G = H(0)).
Step-response
The step response of the first order system is given by:
ystep (t) = Gu(t)(1 et/ )
(2.9)
By observing the step response of the first order system in Figure 2.11 it appears
that waiting about > 7 is sufficient to obtain about 99.9% of the final response
(i.e., 1 e7 ).
This behavior will fix the speed of the system and thus will ultimately limit
the capability to follow the input at a high rate.
From the measured step-response of a system, we can retrieve the value of the
two parameters of the first-order model, the static gain and the time constant. The
static gain (G) is obtained as the value of the gain (the ration between the output
and the input) when the transient response has vanished, that is, after a long
enough time. The time constant can be obtained as the time for the output
to reach 63.5% of its maximum value (i.e., 1 e1 ). A more practical definition
for this quantity (avoiding difficulty to define signal level near the start and end
points exactly) uses the rise time of the system, which is defined as the time to go
from the 10% level to the 90% level. For first order system we have = trise /2.35.
46
y/G
1
0.9
0.5
0.1
Rise time
1
t/t
10
15
20
-20
0,6
[]
A/G [1]
0,8
-40
0,4
-60
0,2
-80
0
-90
0
10
15
20
Figure 2.12: Plot of a first order transfer function (linear frequency scale)
Actually we have already seen this response in Example 2.8, but instead of
reproducing the Bodes plot here (with its logarithmic scales for frequency and
amplitude), we show in Figure 2.12 a linear plot, without using dB nor logarithmic
frequency scale. Still it should be understood that the two figures show exactly
the same data there is only a scale difference! The transfer function amplitude
remains almost constant when < 1/ . When = 1/ , the gain has been
47
divided by 2 (or decrease by about 20 log( 2) 3dB on the log plot). This
frequency fc = 1/2 , is called the cut-off frequency and at fc the phase shift is
exactly 45 (/4), which provide another mean to obtain the time constant of the
circuit.
For a frequency much larger than the cut-off frequency (i.e., f
fc or
1),
the transfer function amplitude becomes simply A() G/( ). Thus, when the
frequency is multiplied by a certain factor, the signal is divided by the same factor.
For example the signal is divided by 10 when the frequency is multiplied by 10
(i.e., on the log plot, the amplitude decreases by 20 dB per decade) or it is divided
by 2 when the frequency is multiplied by 2 (that is 6 dB per octave on the log
plot). In this frequency range the transmittivity (corresponding to the amplitude
of the transfer function) changes too rapidly usually preventing to use the system
predictably.
2.5.4
Second-order model
The next simplest model after the first-order model is the second-order model
where we have m + n = 2. Such model describes systems where there are two
inertial components of similar magnitude coupled together, or a transfer between
kinetic energy and potential energy. Typical example are (R)LC circuits, some
thermal sensor, most of the mechanical sensors (accelerometer, pressure sensor,
etc) or linear actuators... The general equation governing such system is given in
time and Laplaces domain by :
2
y = Gx 2 n dy 12 d 2y
dt
n dt
2
Gn
Y (s)
= s2 +2n s+2 X(s)
(2.11)
(2.12)
where G is the static gain, n the natural frequency and (pronounced zeta) the
damping ratio.
Using analogies, Table 2.4 shows the parameters of first and second order systems expressed using typical lumped elements existing in the different physical
domains.
Characteristic
Mechanical
c
2 km
1
LC
R C
2 L
1/k
c
4
k
RC
k
m
Electrical
Thermal
Rt Ct
48
Step-response
The step response of a second-order system is more complicated than for a firstorder one, and depends on the value of .
ystep (t) = Gu(t) 1 en t cosh
2 1n t +
sinh
2 1n t
(2.13)
The complicated expression does not give much insight, but we can study limiting
cases depending on the value of . When the damping ratio is very small (low
damping) we have,
<< 1
This last case gives the fastest possible settling time without any overshoot or
oscillation. These three cases are schematically depicted in Figure 2.13. We show
in the Figure the typical terms used to describe the response: overshoot, settling
time and rise time. We have figured also the case where the oscillations and the
overshoot just disappear which is called a critically damped system. This happen
when the term 2 1 cease to be complex and becomes real, that is when 1.
Note that if it is possible to tolerate overshoot, the rise time may be shortened
by using a slightly under-damped system ( < 1). Usually over-damped systems
( > 1) are avoided, because they dont bring advantage over the two other cases
(still, a slightly over-damped system 1 may be useful to improve the robustness
of the system).
It can be noted that in an under-damped ( < 1) second order system, we
can easily retrieve different parameters of the model with the step response. The
static gain (G) is again obtained as the value of the gain (output/input) when
the transient response has vanished, that is after a long enough time. Then, the
natural frequency (n ) is roughly obtained from the period n of the oscillations
(cos n t term) as
2
n =
.
n
12
The simplest at this stage to obtain the function is to see that limx0 sinh(ax)/x = a using
limited development of the function close to 0
49
2% settling time
tn
Overshoot
y/G
1
0.9
Underdamped (z<1)
D
C
A
2% band
Critically damped (z=1)
Overdamped (z>1)
0.1
t
Rise time
Figure 2.13: Step response for a 2nd order transfer function with different damping
ratio < 1 (under-damped), = 1 (critically damped), and > 1 (over-damped).
Finally, the damping ratio is obtained by taking the ln of the ratio of the amplitude
of two consecutive oscillations. Actually the amplitude of the oscillation decreases
as en t , thus we have ln(A/B) = ln(B/C) = ln(A/C)/2 = /
yielding :
<0.3
1 2 ,
(2.14)
Using these formulas we may plot the Bode diagram using the MATLAB code
given in Annex G. In the plot of Figure 2.14 we have used the normalized pulsation
50
0.05 z
-40
-20
Phase (o)
Amplitude (dB)
5
-40
-80
-90
-120
-60
-80
0.01
0.05 z
1
0.1
10
Normalized pulsation w/wn
100
-160
-180
-200
0.01
1
0.1
10
Normalized pulsation w/wn
100
Figure 2.14: Bode diagram for a 2nd order transfer function with different damping
ratio = [0.05, 0.1, 0.3, 0.5, 1, 3, 5].
/n , a gain of 1 and we have varied the damping ratio between 0.05 and 5. The
main feature of the amplitude plot is the resonance phenomena that appears for
certain values of the damping ratio. We observe a marked increase of the transfer
function amplitude that appears around the natural frequency, and where the
output becomes larger than the input. The frequency where the transfer function
has its maximum is called the resonance frequency and can be found by looking
for the maximum of the amplitude term in eq. 2.14. This maximum is attained
when the denominator of the amplitude reaches a minimum, thus by equating the
derivative of (1 2 /n2 )2 + (2/n )2 with 0 and we find that the resonance
frequency is given by :
0 = n
1 2 2 for < 1/ 2.
(2.15)
At the same time we find that this maximum amplitude reached by the output
when = 0 is ymax = Gxmax /(2 1 2 ) (or simply ymax = Gxmax /(2) when
0.3) : the amplitude at resonance is multiplied by a factor 1/2. This may be
used to amplify the response of a system - but it is rarely used as it only works
for a limited band of frequency
and is hard to control.
The condition that < 1/ 2 in eq. 2.15 implies that for values larger than 1/ 2
there is no resonance : the amplitude does not increase above the value at = 0.
This can be easily proved, by using the value of 0 in the amplitude factor of
eq. 2.14, and by searching for which value of the resonance (i.e., the amplification)
51
disappears. We have:
G
(1
1
02 /n2 )2
1 2 2
+ (20 /n )2
2 2
>G
2
+ 2
1 2 2
<1
4 4 + 4 2 < 1
< 1/ 2
We can note that for 1/ 2 < < 1 there is no resonance in the frequency
response whereas the step response clearly shows an overshoot - although these
two phenomena are interrelated, there is no direct relationship. This shows that
we need to take care of too quick conclusions when we try to deduce the frequency
response from the step response (or the reverse), and that their careful individual
study is interesting!
Quality factor
There is another way to look at the resonance by using the concept of quality factor .
We have seen that high damping factor are associated with small resonance peak
or even no peak at all, and conversely low damping means a high resonance peak.
The damping of a system represents its loss to the environment, usually as heat,
and a system that has a large loss is understandably a low quality system. Thus we
also quantify the sharpness of the resonance using the quality factor, defined as the
maximum of the normalized frequency response amplitude, that is the frequency
response when = 0 :
1
1
Q=
2
2 1 2
However this definition of the quality factor does not give much more information
than the damping ratio, and is not very easy to measure practically as far from
the resonance the amplitude may be too small and lost in the system noise. A
more practical definition, valid for Q
1, is given by:
Q
n
0
2 1
2 1
G
where 1 and 2 are the frequency where the transfer function is 12 2
, that is,
the maximum of the transfer function divided by 2. As signal energy is proportional to the square of its amplitude, these two points correspond to frequencies
where the energy in the system has been reduced to half of the maximum energy
stored at resonance.
This definition is quite easy to apply and is used to define a quality factor and a
damping ratio, even when the system is not purely a second-order system.
52
f0
50
40
df
Amplitude (1)
50/ 2
30
20
10
G
0
0
10
Frequency (MHz)
15
20
2.5.5
53
A good insight in the frequency limitation of a system is obtained with the Bode
plot. Actually the change of the amplitude and phase of the transfer function with
the frequency of the input signal means that the characteristics of the system will
vary with the frequency of the input signal! Meaning, for example, that a microsensor measuring two measurands with the same magnitude could return different
informations because the two measurands have different frequencies. This effect
is often difficult to compensate, thus usually systems are rated for a maximum
frequency above which the manufacturer does not warranty their characteristics
anymore.
Lets have a look at what it means for our typical systems, looking separately at
the limitations imposed by the amplitude and the phase change.
Gain distortion
The change in the amplitude of the transfer function place a limit to the range
of frequency where a system will keep its accuracy. To understand the problem
we will imagine that we have a photodiode, modeled here as a first order system
with cut-off frequency fc (fc = 2/ ), and that we need to compare two luminous
signals modulated at different frequencies. When one of the modulation frequency
is much smaller than fc the sensitivity of the sensor is about G, but if the frequency
is much larger than fc the sensitivity of the sensor approaches quickly 0, and the
sensor wont see the modulation at all! In this case a comparison between the
two measurements is barely possible, because the sensor will have different output
signal for signal that originally have the same amplitude, simply because they are
at different frequencies! Figure 2.15 illustrates this effect by using an input whose
spectrum has two frequency components, one at 0.5fc the other at 7fc . The two
frequency component have the same amplitude and the system transfer function
spectrum is representative of a first order system. In the output signal, the two
components dont have the same amplitude anymore, because the system did not
show the same gain at all frequencies. This results in wrong measurement, making
signal comparison almost impossible.
For a first order system, if we want to keep the change in gain smaller than
1% for all frequencies of operation, we need to have A() > 0.99A(0), where A()
is the amplitude of the transfer function H(j). Using the expression for A()
given in eq. 2.10 we find that this condition is fulfilled if f < 0.14fc : we thus need
to keep the operating frequency below 15% of the cut-off frequency to maintain
accuracy within 1%.
If the system is represented by a second order model, we may also try to find
an estimation of the maximum operating frequency to keep the gain error (or
distortion) within 1%. In Figure 2.16 we zoom in the previous Bode plot for
frequency smaller that the resonance frequency to obtain the information. We see
that if we want to keep the error within 1%, it is necessary to keep the frequency of
1
0,8
0,8
0,6
0,6
A/G
Amplitude [a.u.]
Output
Transfer function
1
Input
0,4
1
Amplitude [a.u.]
54
0,4
0,2
0,2
0,8
0,6
0,4
0,2
0
0
0
10
10
f [fc]
f [fc]
10
f [fc]
Amplitude
1.05
1.04
1.03
1.02
1.01
1.00
0.99
0.98
0.97
0.96
5
0.95
0
z 0 0.3 0.5
0.6
0.656
0.707
1
0.2
0.4
0.6
0.8
Normalized frequency w/wn
Figure 2.16: Amplitude plot for a 2ndorder transfer function with different damping ratio = [0, 0.3, 0.5, 0.6, 0.656, 1/ 2, 1, 5].
the input below 0.1fn if the damping is very small (i.e., 0). When the damping
is large ( 1) the limitations becomes quickly even more drastic, allowing, for
example, a maximum operating frequency of only 2% of fn for = 5. In the
other hand, we see the interest of having a damping ratio around 0.7: it increases
significantly the bandwidth of the system. In this case the bandwidth (assuming
again that an error smaller than 1% can be tolerated) even exceed 50% of fn if
= 0.656.
System response will always benefit from a damping ratio around 0.7, even
if this may be difficult to implement in practice. It should be noted, that active structures may be used to circumvent this problem : for example using a
feedback loop and an actuator inside a micro-accelerometer may help to increase
the operating frequency up to the resonance frequency, even for systems naturally
55
badly damped. Of course, in that case the complexity of the system increases
substantially.
Phase distortion
In the previous section we have only considered the effect of the frequency on the
amplitude of the transfer function (i.e. the gain or sensitivity), however the phase
is also varying with the frequency and we may ask ourselves what is the effect
of this variation on the signal. This issue will affect first-order and second-order
systems but it will be more pronounced with second-order system because their
bandwidth (determined on amplitude consideration as above) may be larger that
for first order system, and because the maximum phase change is two times larger.
Actually if the input signal is a purely sinusoidal function of time, the change
in phase is not important because usually the phase of the output signal is not a
relevant factor. Thus even if the output phase varies with the frequency, the amplitude wont be affected causing no problem with systems where the information
is contained in the amplitude of the output signal.
However a problem arises when the signal has a complex time dependence. To
describe the reasoning we will consider a periodic input signal and a second-order
sensor, but the result holds for all kind of signal and systems as well. As the signal
is periodic, we may use the Fouriers decomposition to represent the input signal
as a sum of sinusoids, whose amplitude is obtained from the signal spectrum. For
example, as shown in Figure 2.17, the signal is decomposed as a fundamental and
its first harmonic at a frequency twice the fundamental frequency. For the sensor
we take 0.05 (small damping) and we consider that the signal fundamental
frequency is about f = 0.5n .
No phase distortion
Input
Fundamental
Linear
System
=0
Output
1st harmonic
56
z 0
0.1
-10
Phase (o)
-20
-30
0.3
-40
0.5
-50
0.707
1
-60
-70
-80
-90
0.2
0.4
0.6
0.8
Normalized frequency w/wn
n .
The linear-shift condition is approximately fulfilled when the damping ratio is
57
2.6
2.6.1
MEMS are very often multi-physics system working in more than one energy domain. Actually, many MEMS are used to transform energy in one energy domain
to another one and are called transducers. A typical example will be a motor, that
is a system that convert electrical energy to mechanical energy. But a microphone
would also be such a device, as it converts acoustic/mechanical energy to electrical
energy.
If this is true for a system it can also be found for simple elements in subsystems, that are used to convert energy between domains. The one port circuit
elements we have encountered earlier are intrinsically single domain, however the
two-ports elements (Fig. 2.10) can be used for describing elements that work in
two domains. One port will be used for connecting to other elements in a certain
energy domain (e.g., mechanical domain) while the other will be used to connect
to another domain (e.g. electrical domain).
Actually in its simplest form, a linear reciprocating13 multi-domain element is
a linear quadrupole. As such they can be represented with different impedances
in conjunction with a transformer, where the transforming ratio 1 : n actually
has the dimension suitable to link the effort and flow variables in both domains
(Fig. 2.19). When we work with transformer, we repeatedly make use of the
13
that is the transfer between energy domain is equal both way, which happens in any energyconserving (lossless) transducer
58
i1 1:n i2
Z1 v1
e1
Z2
+
v2
-
f2
+
e2
-
1
v2
1 v2
1
v1
= n = 2 = 2 Z2
i1
ni2
n i2
n
For example a capacitance C of impedance ZC = 1/(jC) connected to the secondary, will be equivalent to an impedance ZCeq = 1/(jn2 C) seen from the primary, that is it could be represented by a capacitor of value n2 C.
2.6.2
The methods we have described in the previous sections are mostly designed for linear systems, and although we have suggested that the equations governing physics
are linear, they often yield non-linear solutions. For example, if we consider the
Coulombs force between two electric charges, q1 and q2 , we have:
F1/2 =
1 q1 q2
r12
2
4 0 r12
If we consider the charge q1 (thus we are in the electrical domain), the force
is linear, and the effect of additional charges would be easily obtained with the
superposition theorem... however if we are in a system where the distance r12
can vary (we place ourselves in the mechanical domain), as would happen in an
electrostatic actuator, then the law becomes non-linear!
In that case, block or circuit model (with non-linear elements) can still be used
for modeling - but Laplaces transform is no more useful for studying the system
dynamics. In that case one would have to use state equations, that is write
59
Actuator
RT
+
VT -
1/k
1:n z
C0
Electrical
domain
F
-
Mechanical
domain
60
and solve numerically a set of first order differential equations describing the
system14 .
Actually, one should write one equation per independent state variable ui , that
is, per independent energy storage elements (i.e., per capacitor and inductor in the
analogue circuit), and the equations will give the derivative of each state variables
u i as a function of the state variable (but not their derivatives) and the inputs of
the system xi .
u 1 = f1 (u1 , u2 , . . . , x1 , x2 , . . .)
u = f (u , u , . . . , x , x , . . .)
2
2 1
2
1
2
(2.16)
u 3 = f3 (u1 , u2 , . . . , x1 , x2 , . . .)
...
As the system is non-linear at least one of the fi will be non-linear and the solution
will require in most cases numerical solution to find the system dynamic. These
equation can be directly integrated numerically or alternatively can be placed into
the Simulink environment. One may note that the steady state of the system is
obtained when the right hand side of each equation is equal to 0 (i.e. all u i = 0).
The output y of the system can be used as one of the state variables but it is
not always possible, and in this case it will be derived from the state variables
y = g(u1 , u2 , ).
14
Clearly state equations can be used to represent linear system too, but the Laplace transform
or the complex impedance are simpler to use
61
A 2
V k(x g) cx = m
x
0
2x2
The electrical circuit in the other hand is rather
simple and we write:
V + RI +
1
C
Idt = 0
we get the integral disappear by using the charge Q = Idt instead of the current,
and get:
Q
V + RQ + = 0
C
A
where the capacitance C = 0 x in the parallel plate approximation, thus finally
giving:
xQ
V + RQ +
=0
0A
In this circuit there are 3 energy storage elements: two in the mechanical domain
(the inductor of the mass and the capacitor of the spring), and one in the electrical
domain, the capacitor formed by the electrodes. We will need 3 independent state
variables that should yield only first order differential equations, that is, we need
u1 = x and u2 = x to be able to bring the mechanical equation to a proper format.
The last variable will be u3 = Q as it appears as Q in the electrical equation and
we can finally write the state equations :
A
2
u 1 = m1
u 2 = u1
u = 1 V u2 u3
3
0A
62
Problems
1. The Mad Hatter wants to serve Alice tea, but the tea is waaaaay too hot.
From basic principles, will it be faster (White Rabbit is waiting !) to let
the tea cool in the teapot before pouring it in the cups or to empty the
teapot in the cups first and let the tea cool down there? How much faster ?
(The Hatters teapot makes 2pi cups - but we would count it as 6 cups and
consider every container as spherical)
2. Starting from the expression of the frequency response of a second order
system and using the definition of the Q-factor in the frequency domain (the
difference between the frequencies where the energy in the system has been
reduced to half of the maximum energy stored at resonance divided by the
resonance frequency.), establish the exact expression for Q and simplifies it
in the case where
1.
3. Find the circuit representation of the lumped mechanical system in the Figure.
k1
k2
m1
m2
F (t)
Chapter 3
How MEMS are made
3.1
64
Modifying process
Subtractive process
Evaporation
Oxydation
Wet etching
Sputtering
Doping
Dry etching
CVD
Annealing
Sacrificial etching
Spin-coating
UV exposure
Development
...
...
...
Photoresist
coating
Mask aligning
UV-exposure
Positive
photoresist
Development
Negative
photoresist
65
chemical solution, the developer. Actually the exposure changes the solubility of
the photoresist in the developer and the exact change of solubility depends on the
type of photoresist used originally: for so-called positive photoresist the exposed
region becomes more soluble in the developer, while for negative photoresist the
reverse happens and the exposed region becomes insoluble. After development,
the surrogate layer patterned over the whole surface of the wafer can be used for
pattern transfer.
They are actually two main techniques that can be used to transfer the pattern:
Etching
Lift-off
Material etching
Material deposition
66
Layer
deposition
Layer
modification
Patterning
Pattern
transfer
Front-end Process
Back-end Process
Final test
Packaging
Dicing /
Release
Wafer
bonding
Wafer level
testing
3.2
3.2.1
MEMS materials
Crystalline, polycrystalline and amorphous materials
Broadly speaking, the MEMS materials can be split in three classes depending
on how orderly the atoms are arranged in space: at one extreme, the crystalline
materials, where order prevails; at the other end, the amorphous materials, where
orientation varies wildly between neighbouring atoms; and in between, the poly-
67
crystalline materials, where order is conserved only on a short scale, called a grain,
while on a larger scale it is made of arrangement of differently oriented grains.
A single crystal presents the highest order, as the
atoms are periodically arranged in space in a precise manSingle Crystal
ner following one of the lattice allowed by thermodynamic
and geometry, that is, one of the 14 Bravais lattice. The
crystal is then built by the repetition of this elementary
Polycrystalline
lattice in all three directions. In this case the material
properties of the crystal are highly reproducible but they
Amorphous
will generally depend on the direction within the crystal,
and the material is said to be anisotropic.
In the case of polycrystalline films, the material does not crystallize in a continuous
film, but in small clusters of crystal (called grains), each grain having a different
orientation than its neighbour. In general the grain size range from about 10 nm
to a few m. The grains may not be completely randomly oriented and some
direction may be favored depending on the material elaboration process, resulting
in highly varying material properties for different process condition. If the distribution of grain orientation is known, a good approximation of the properties of
the material can be obtained by using the weighted average of the single crystal
properties along different directions.
Finally, in amorphous films, the material grows in a disordered manner, with clusters of crystal being of a few atoms only. In this case, the material properties
are not the same as those present in single crystal or in polycrystalline films, and
usually present inferior characteristics: lower strength, lower conductivity... Interestingly, the properties of amorphous material are normally more stable with their
elaboration process parameters and they are also independent of the direction :
amorphous materials are intrinsically isotropic.
Order
68
tetrahedron configuration that shows that each Si atom (grey) has 4 neighbours
(black), sharing one electron with each as silicon is a tetravalent material. Many
semiconductor materials - usually tetravalent - will share this fcc arrangement,
also called the diamond lattice.
z
z
001
(001)
z
(110)
(111)
111
y
y
100
x
110
y
x
[100]
(212)
(012)
y
y
[110]
x
[100]
x
[021]
[100]
[101]
[120]
x
Figure 3.4: Lattice points coordinate, planes and directions in the cubic lattice of
silicon.
A plane in the crystal is in turn identified by 3 indices (hkl) placed between
parentheses which are obtained by considering the coordinate of the intersections
between the plane and the 3 crystal axes. Actually the number h, k and l are always
integers that are obtained by using the reciprocal of the intersection coordinates
for each axes and reducing them to the smallest possible integers by clearing the
common factors. If the integer is negative, it is represented by placing a bar on its
top.
Three important crystal planes, the (100) plane, (110) plane and (111) plane have
been illustrated in Figure 3.4. For example the (100) plane intercept the x axis in
1 (the reciprocal is 1/1 = 1!), and along y and z the 0 arises because in these cases
the plane is parallel to the axis and thus will intercept it... at infinity - because
we take the reciprocal of the intercept coordinate we get 1/ = 0. Note that if
the plane intercepts the axes at the origin, we need to use a parallel plane passing
through neighbour cell to be able to compute the indices.
One of the cause of the anisotropy observed in crystal can be understood by
considering the density of atoms at the surface of a particular plane. Actually, let
us use a stack of closely packed hard spheres as a simple 3D model for the atoms
arrangement in a fcc lattice1 . We observe different planes by cutting through this
model2 , and figured the atoms closest to the surface in black (and further away
1
2
This is not a Si crystal here the unit cell has 1 atom only whereas in Si it has two.
To perform this task the simplest is to use a computer system like the Surface Explorer
69
in lighter grey). We see that the (111) plane presents here the highest density of
atoms possible in a plane with a closely packed hexagonal structure. The (100)
plane present a square-packed arrangement, with more voids and thus a lower
density of atoms, and other planes will be of different density.
Example 3.1 Cleavage planes in <100>-cut Si wafer.
n <100>-cut wafer of silicon, cleavage happens parallel to the main flat of
the wafer which is located along the [110] direction which correspond for this
cut to a (110) plane. Why doesnt it happen in the (100) or the (111) planes?
In bulk crystal cleavage preferably happens parallel to high density plane. The
density of atoms in a plane is found by counting the number of atoms belonging
to the plane in one cell of the lattice and dividing by the surface of the plane in
the cell.
For the (100) plane, by looking at the Si crystal structure we
see that the atom at the center of the plane belong to the cell,
z
while the four atoms at the corner belong to 4 other cells, the
total number of atoms is thus: 1 + 4/4 = 2, that is a density
2
(100)
= a22 .
of aa
y
x
For the (111) plane, we see that the 3 atom at the corner of the
triangle belong to 6 cells, while the 3 atoms at the edge belong
z
to 2 cells, the total number of atoms is thus: 3/6 + 3/2 = 2,
4
that is a density of 2a/223a/2 = 3a
2.
For the (110) plane we have the 4 atoms at the corners that (111)
y
x
belong each to 4 other cells, the 2 atoms in the top and botz
tom side diagonal that belong to 2 cells and the 2 atoms from
the lighter fcc lattice that belong to the cell. The total number of atom is thus: 4/4 + 2/2 + 2 = 4, that is a density of
(110)
4 = 8 2 .
3a
3/2aa/ 2
We have 83
> 43 > 2, thus among these 3 planes the (110)plane is indeed the preferred cleavage plane.
Note that the high density rules is not the only rule to decide for cleavage. Actually
the cleavage plane is generally the plane with highest density... that is perpendicular to the surface for minimizing the cut cross-section and thus the energy required
for the cut. In this way in a <110>-cut wafer the cleavage plane generally will be
the (111) plane, as the (110) plane is not normal to the surface.
The same indices are used to represent crystallographic direction as well. In
this case the indices are obtained by considering the vector coordinate between
two points of the lattice placed along the chosen direction. The coordinate are
then reduced to the smallest set of integer and placed between brackets [hkl] to
differentiate them from the plane indices (hkl).
Interestingly, the direction normal to the (hkl) plane is the [hkl] direction. Moreover the angle existing between two directions is given by taking the dot product
proposed by K. Herman that was used to help produce this views http://surfexp.fhi-berlin.
mpg.de/
70
h1 h2 + k1 k2 + l1 l2
h21 + k12 + l12
atoms density
In general, crystal symmetries result in different directions having the same physical properties and there is no
(111)
need to distinguish them. In this case we use the <hkl>
(001)
notation to represent any of these equivalent directions, whereas
for plane with equivalent orientation we use the {hkl} no(110)
tation. For example, it is customary to give the crystallographic orientation of a silicon wafer by indicating the
(221)
equivalent direction of the normal to the top surface. A
<100>wafer means that the normal direction is equivalent to the [100] direction,
which could be [100] or even [010]. In these different cases, the top surface of
the wafer would be (100), (100) and (010), series of plane that present the same
properties because of the silicon face-centered cubic lattice. For other lattices with
less symmetries, care should be taken to use exact direction [hkl] to indicate the
precise crystallographic direction of the top wafer surface.
3.2.2
Materials properties
The choice of a good material for MEMS application depends on its properties, but
not so much on carrier mobility as in microelectronics. Actually we select materials on more mechanical aspect: small or controllable internal stress, low processing
temperature, compatibility with other materials, possibility to obtain thick layer,
patterning possibilities... In addition, depending on the field of application, the
material often needs to have extra properties. RF MEMS will want to be based
on material with small loss tangent (for example high resistivity silicon), optical
MEMS may need a transparent substrate, BioMEMS will need bio-compatibility,
if not for the substrate, at least for a coating adhering well to the substrate, sensor application will need a material showing piezoresistance or piezoelectricity,
etc. Actually, because the issue of material contamination is much less important in MEMS than in IC fabrication, the MEMS designer often tries to use the
material presenting the best properties for his unique application. Still, from its
microelectronics root MEMS has retained the predominant use of silicon and its
compounds, silicon (di)oxide (SiO2 ) and silicon nitride (Six Ny ). But actually, it
was not purely coincidental because silicon, as K. Petersen explained in a famous
paper [21], is an excellent mechanical material. Silicon is almost as strong but
lighter than steel, has large critical stress and no elasticity limit at room temperature as it is a perfect crystal ensuring that it will recover from large strain.
Unfortunately it is brittle and this may pose problem in handling wafer, but it is
rarely a source of failure for MEMS components. For sensing application silicon
has a large piezoresistive coefficient, and for optical MEMS it is transparent at the
71
Youngs modulus
Poisson ratio
Density
kg/m3
GPa
Stainless Steel
200
0.3
7900
Silicon (Si)
<100>130
0.25
2300
<111>187
0.36
PolySilicon (PolySi)
120-175
0.15-0.36
2300?
73
0.17
2500
340
0.29
3100
Glass
(BK7) 82
0.206
2500
(SF11) 66
0.235
4700
Gold (Au)
78
0.42
19300
Aluminum (Al)
70
0.33
2700
SU8
4.1
0.22
1200
PDMS
0.0004-0.0009
0.5
970
E [GPa]
The value of 169 GPa in the graph may seem to contradict the Table where E=187.5 GPa
but this last value correspond to the modulus in the <111>-direction, whereas in the graph we
show the direction perpendicular to that direction ((111)-cut wafer)
72
1/E /E /E
0
0
0
X
Y /E 1/E /E
0
0
0
Z /E /E 1/E
0
0
0
XY 0
0
0
1/G 0
0
Y Z 0
0
0
0 1/G 0
ZX
0
0
0
0
0 1/G
X
Y
Z
XY
Y Z
ZX
(3.1)
where I and I represent the longitudinal strain and stress along direction I,
and IJ and IJ the shear strain and stress in the IJ plane4 . In the case of
unidimensional stress (e.g., along X), the strain is simply described as X = E X ,
where X is the stress along the X direction, X = x/x the strain along X and
E the Youngs modulus. We also observe the effect of the Poissons ratio , where
the positive stress along the X direction does not only cause elongation along X as
seen above but also contraction in the other direction as Y = E X (and similar
along Z).
For an anisotropic material, the relationship will not be as simple and the stiffness
matrix (the inverse of the compliance matrix) will have a larger number of non-zero
terms, coupling stress and strain components in a much more complex pattern:
(3.2)
73
c1 c2 c2 0 0 0
c2 c1 c2 0 0 0
c2 c2 c1 0 0 0
C=
0 0 0 c3 0 0
0 0 0 0 c3 0
0 0 0 0 0 c3
with c1 = c11 = = 166 GPa, c2 = c12 = = 64 GPa and c3 = c44 = =
80 GPa for Silicon. If we compare this matrix to the isotropic case in Eq. (3.1),
they seem similar, however, this form of the matrix in the case of Silicon is only
valid for one particular X, Y and Z Cartesian coordinate system parallel to the
crystallographic (A,B,C) axes for isotropic materials it is true for all Cartesian
coordinate system with any orientation. Alternatively instead of C, we could use
the compliance matrix S and obtain:
s s s 0 0 0
1 2 2
s2 s1 s2 0 0 0
s2 s2 s1 0 0 0
S=
0 0 0 s3 0 0
0 0 0 0 s3 0
0 0 0 0 0 s3
with s1 = s11 = = (c1 + c2 )/(c21 + c2 c1 2c22 ) = 7.66 1012 Pa1 ,s2 = s12 =
= c2 /(c21 + c2 c1 2c22 ) = 2.13 1012 Pa1 and s3 = s44 = = 1/c3 =
12.5 1012 Pa1 for Silicon. Instead of using the cumbersome matrix notation,
it can be shown that for arbitrary direction in a cubic crystal we can define an
equivalent Youngs modulus in the direction (l1 , l2 , l3 ) using:
1
= s1 2(s1 s2 0.5s3 )(l12 l22 + l22 l32 + l12 l32 )
E
and a Poissons ratio for any pair of orthogonal directions (l1 , l2 , l3 ) and (m1 , m2 , m3 )
using:
= E s2 + (s1 s2 0.5s3 )(l12 m21 + l22 m22 + l12 m23 )
74
1
= s1 2(s1 s2 0.5s3 )l22 l32
E
that is, by inserting the values of the material properties,
E=
1
1
=
2 2
12
s1 2(s1 s2 0.5s3 )l2 l3
7.669 10
7.107 1012 l22 l32
thus for the [010] and [001] direction we have either l2 or l3 that is zero and we
obtain
E = 130.3 GPa. At 45 from these direction, we have l2 = l3 = cos(/4) =
3/2 and E = 169.6 GPa. These results corroborates the curves shown in the
inset p. 71.
75
varying quantity to the nitride yields oxynitride compounds (Six Oy Nz ), giving the
possibility to tune the refractive index between stoichiometric nitride (n=2.1 @
542 nm) and oxide (n=1.5) - an interesting property for optical MEMS applications. Closing the list of silicon compound we can add a newcomer, silicon carbide
SiC. SiC has unique thermal properties (albeit not yet on par with diamond) and
has been used in high temperature sensor.
But silicon and its derivative are not the only choice for MEMS, many other
materials are also used because they posses some unique properties. For example,
other semiconductors like InP have also been micromachined mainly to take advantage of their photonics capabilities and serve as tunable laser source. Quartz
crystal has strong piezoelectric effect that has been put into use to build resonant
sensors like gyroscope or mass sensors. Biocompatibility will actually force the
use of a limited list of already tested and approved material, or suggest the use of
durable coating.
Glass is only second to silicon in its use in MEMS fabrication because it can
easily form tight bond with silicon and also because it can be used to obtain biocompatible channels for BioMEMS. Moreover, the transparency of glass is what
makes it often popular in optical MEMS application.
Polymers are also often used for BioMEMS fabrication where they can be tailored to provide biodegradability or bioabsorbability. The versatility of polymers
makes them interesting for other MEMS application, and for example the reflow
appearing at moderate temperature has been used to obtain arrays of spherical
microlenses for optical MEMS. This thermoplastic property also allows molding,
making polymer MEMS a cheap alternative to silicon based system, particularly
for micro-fluidic application. Recently the availability of photosensitive polymers
like SU8 [24] than can be spun to thickness exceeding 100 m and patterned with
vertical sides has further increased the possibility to build polymer structure.
This quick introduction to MEMS materials needs to mention metals. If their
conductivity is of course a must when they are used as electrical connection like
in IC, metals can also be used to build structures. Actually, their ability to be
grown in thin-films of good quality at a moderate temperature is what decided
Texas Instruments to base the complete DLP micro-mirror device on a multilayer aluminum process. In other applications, electroplated nickel will produce
excellent micro-molds, whereas gold reflective properties are used in optical MEMS
and nitinol (NiTi), presenting a strong shape memory effect, easily becomes a
compact actuator.
3.3
76
3.3.1
Wet etching is obtained by immersing the material in a chemical bath that dissolves the surfaces not covered by a protective layer. The main advantages of this
subtractive technique are that it can be quick, uniform, very selective and cheap.
The etching rate and the resulting profile depend on the material, the chemical,
the temperature of the bath, the presence of agitation, and the etch stop technique
used if any. Wet etching is usually divided between isotropic and anisotropic etch-
77
The top-left part of Figure 3.5 shows isotropic etching of silicon when the
bath is agitated ensuring that fresh chemical constantly reaches the bottom of the
trench and resulting in a truly isotropic etch. Isotropic wet etching is used for thin
layer or when the rounded profile is interesting, to obtain channels for fluids for
example. For silicon, the etchant can be HNA, which is a mixture of hydrofluoric
acid (HF), nitric acid (HNO3 ), and acetic acid (CH3 COOH). In HNA the nitric
acid acts as an oxidant and HF dissolves the oxide by forming the water soluble
H2 SiF6 . The two steps of the simplified reaction are:
Si + HNO3 + H2 O SiO2 + HNO2 + H2
SiO2 + 6HF H2 SiF6 + 2H2 O
The etching rate for silicon can reach 80 m/min, and oxide can be used as mask
material as its etch rate is only 30 to 80 nm/min. Etching under the mask edge or
underetch is unavoidable with isotropic wet etching. Moreover, the etch rate and
profile are sensitive to solution agitation and temperature, making it difficult to
control the geometry of the deep etch usually needed for MEMS.
Anisotropic etching developed in the late 60s can overcome these problems.
The lower part of Figure 3.5 shows features obtained by etching a (100) wafer
with a KOH solution. The etched profile is clearly anisotropic, reveling planes
without rounded shape and little underetch. Potassium hydroxide (KOH), tetramethyl ammonium hydroxide (TMAH) and ethylene diamine pyrocatechol (EDP)
are common chemicals used for anisotropic etching of silicon.
For KOH and TMAH the simplified chemical reaction is written as :
4H2 O + 4e 4OH + 2H2
Si + 4OH Si(OH)4 + 4e
We thus have a generation of hydrogen (bubbles escape during etching), and we
notice that electrons are important elements in the reaction. Actually, the etching
anisotropy has its roots in the different etch rates existing for different crystal
planes that is generally thought to arise because of their different density of atoms
and hence, of electrons. In fact, scavenging electrons will generally be a mean of
stopping the reaction.
<100>
<110>
[111]
90
[111]
54.7
54.7
125.2
[110]
90
[110]
Figure 3.6: Orientation of the pattern edge for benefiting of the <111> lateral
etch stop plane in <100> and <110> wafers.
78
The anisotropy can be very large and for example, for silicon and KOH, the
etch rate ratio can reach 400 between (100) and (111) planes and even 600 between
(110) and (111) planes - meaning that when the etch rate for the (100) plane is
about 1 m/min then the (111) plane will etch at only 2.5 nm/min effectively
allowing to consider it as an etch-stop plane. With different combinations of wafer
orientations and mask patterns, very sophisticated structures such as cavities,
grooves, cantilevers, through holes and bridges can be fabricated. For example,
if the (100) wafers in Figure 3.5 shows an angle of 54.7 between the (111) plane
and the surface, typically producing V-grooves, (110) oriented wafer will present
an angle of 90 between these planes resulting in U-grooves with vertical walls. To
obtain these grooves, as shown in Figure 3.6, the mask pattern edges need to be
aligned with the edge of the (111) planes. For a (100) wafer it is simple because
the groove edge are along the <110> direction, that is parallel to the main wafer
flat. Moreover the four (111) planes intersect on the (100) surface at 90 and a
rectangular pattern will immediately expose four sloping (111) planes and provide
a simple way to obtain precisely defined pits or square membranes. (110) wafers
are more difficult to handle, and to obtain a U-groove the side should be tilted by
an angle of 125.2 with respect to the <110> wafer flat. In addition, to obtain a
four-sided pit, the two other sides should make a 55 angle with the flat direction
- defining a non-rectangular pit that is seldom used for membranes.
If the control of the lateral etching by using the (111) planes is usually excellent,
controlling the etching depth is more complicated. Monitoring the etching time
is the simplest technique. However this is limited by the etching uniformity in
the bath, and by the variation of the etching rate. Actually, if the etching rate is
known with 5% accuracy, after etching through a wafer of 300 m, the uncertainty
on the etched depth is 15 m. We see that producing flat thin membranes of
precise thickness (often t < 30 m) needed for pressure sensors will require a
better approach that what can be achieved by this method. We have seen that we
could use the self limiting effect appearing when two sloping (111) planes finally
contact each other, providing the typical V-grooves of Figure 3.5. However, if
this technique is interesting because
it provides an etch stop and a structure of
precise depth (we have d = w/ 2), it is unable to provide membrane with flat
bottom. MEMS technologists have tackled this problem by developing other etch
stop techniques that reduce by one or two order of magnitude the etch speed when
the solution reach a particular depth.
w
d
V-groove
p
n
Electrochemical
p+
Boron
Figure 3.7: Comparison between timed etch and etch-stop techniques for controlling membrane thickness.
79
The electrochemical etch stop works by first creating a diode junction by using
epitaxial growth or doping of a n-layer over a p-substrate. The junction is reverse
polarized by contacting the substrate and the chemical bath, preventing current to
flow between anode and cathode. As soon as the p-substrate is completely etched,
a current can flow from the anode causing the apparition of a passivation layer
by anodization, effectively stopping the chemical reaction and the etching. This
process yields an excellent control over the final membrane thickness that is only
determined by the thickness of the epitaxial layer, and thus can be better than 1%
over a whole wafer.
Another popular method that does not require epitaxial growth, consists in heavily
doping (> 1019 cm3 ) the surface of silicon with boron by diffusion or implantation. As soon as the p+ doped zone is exposed the electron density is lowered,
slowing down the etching reaction by at least one order of magnitude. However,
if diffusion is used to obtain the boron layer, the resulting high boron concentration at the surface will decrease substantially the piezoresistive coefficient value
making piezoresistors less sensitive. Ion implantation can overcome this problem
by burying the doped layer a few m under the surface, leaving a thin top layer
untouched for the fabrication of the piezoresistors.
Actually, the seemingly simple membrane process often requires two tools specially designed for MEMS fabrication. Firstly, to properly align the aperture of
the backside mask with the piezoresistor or other features on the front side (Figure
2.5) a double-side mask aligner is required. Different approaches have been used
(infrared camera, image storage, folded optical path...) by the various manufacturers (Suss Microtec, OAI, EVGroup...) to tackle this problem, resulting in a
very satisfying registration accuracy that can reach 1 m for the best systems.
Secondly, etching the cavity below the membrane needs a special protection tool,
that in the case of electrochemical etch stop is also used for ensuring the substrate
polarization. Actually the presence of the cavity inevitably weakens the wafer and
to avoid wafer breakage, the membrane is usually etched in the last step of the
process. At that time, the front side will have already received metalization which
generally cannot survive the prolonged etch and needs to be protected. This protection can be obtained by using a thick protective wax, but more often a cleaner
process is preferred based on a mechanical chuck. The chuck is designed to allow
quick loading and unloading operation, using O-ring to seal the front-side of the
wafer and often includes spring loaded contacts to provide bias for electrochemical
etch-stop.
The chemical used during anisotropic etching are usually strong alkaline bases
and requires a hard masking material that can withstand the solution without
decomposing or peeling. In general polymer (like photoresist) can not be used to
protect the substrate, and if some metals (like tungsten) can be used effectively,
in general a non-organic thin-film is used. For example, silicon oxide mask is
commonly used with TMAH, while silicon nitride is generally used with KOH.
Table 3.3 summarizes the characteristics of some anisotropic etching solution.
80
Solution
KOH / H2 O
44g / 100ml
(30 wt.%) @
85C1
TMAH / H2 O
28g / 100ml
(22 wt.%) @
90C2
EDP (Ethylene diamine /
pyrocatechol /
H2 O) 750ml /
120g / 240ml
@ 115C3
1
2
3
(100) Si
Etch rate
etch rate
ratio
(m/min)
400
for
(100)/(111)
1.4
600
for
(110)/(111)
30
for
(100)/(111)
1
50
for
(110)/(111)
1.25
Mask
Boron
etch
rate etch stop
(nm/min)
(cm3 )
3.5
(SiO2 )
<0.01 (Si3 N4 )
> 1020
rate/20
0.2
(SiO2 )
<0.01 (Si3 N4 )
4 1020
rate/40
0.5 (SiO2 )
35
for 0.1 (Si3 N4 )
(100)/(111) 0 (Au, Cr,
Ag, Cu, Ta)
7 1019
rate/50
+largest etch rate ratio; K ions degrade CMOS; etch SiO2 fast
+SiO2 mask; +CMOS compatible ; large overtech
+SiO2 mask; +no metal etch; +CMOS compatible; large overtech;
toxic
Table 3.3: Characteristics of some anisotropic etchants for silicon.
Of course anisotropic wet etching has its limitation. The most serious one lies
with the need to align the sides of the pattern with the crystal axes to benefit from
the (111) plane etch-stop, severely constraining the freedom of layout. A typical
example is when we want to design a structure with convex corners - that is instead
of designing a pit, we now want an island. The island convex corners will inevitably
expose planes which are not the (111) planes and will be etched away slowly, finally
resulting in the complete disappearance of the island. Although techniques have
been developed to slow down the etch rate of the corner by adding protruding
prongs, these structures take space on the wafer and they finally cannot give the
same patterning freedom as dry etching techniques.
3.3.2
Dry etching
81
the aspect ratio. Actually we can define an aspect ratio for features (h/wr ) and for
holes (h/wh ) with most technologies giving better results with features than with
holes - but generally with only a small difference. Typical values for this parameter
would range between 1 (isotropic etch) and 50, for very anisotropic etching like
the DRIE process.
As in many other processes, dry etching makes often use of a
plasma. The plasma is an equal mixture of positive ions and high
G
e- + e- +
energy (=high speed) electrons with some neutral atoms that reA G
A
mains mostly electrically neutral. The plasma will help form a high
quantity of reacting ions and radical, increasing the etching rate. In a plasma, new
pairs of ion and electron are continuously formed by ionization and destroyed by
recombination.
Cathode
Cathode glow
Crookes dark space
G
G
G
Anode
82
The dominant mode of operation will depend on the energy of the ions and the
reactivity of the radicals. Usually the etching is more anisotropic (vertical) and
less selective when it is more physical (corresponding to high energy plasma), while
it is more isotropic and selective when it is more chemical (for low energy plasma).
In the RIE mode, obtained for mildly energetic ions, the bombardment of the
surface by ions allows to increase tremendously the rate of the chemical reaction,
while keeping selectivity and providing anisotropy. For example, using an argon
plasma (a non-reactive gas) in a chamber with XeF2 has been shown to increase
the etch rate of Si by a factor of more than 10.
The RIE is the most versatile technique and is often used in MEMS. In its
original configuration shown in Figure 3.9, if it is based on the glow discharge
principle to generate the plasma, the excitation is obtained through a capacitively
coupled RF source. Actually the high frequency RF source generates an alternating
field at a frequency high enough (13.56 MHz) for affecting only the low inertia
electrons. The electrons that are set into motion ionize the gas atoms, and end
up on the chamber wall or on the electrodes. This loss of electrons in the plasma
places it at a slightly positive potential. The electrons that come to the upper
electrodes or the chamber wall are evacuated (ground), but those falling on the
lower electrode accumulate, polarizing the plate with a negative voltage (a few
83
gas
wafer
3.3.3
Wafer bonding
A review of MEMS fabrication technique cannot be complete without mentioning wafer bonding. Wafer bonding is an assembly technique where two or more
precisely aligned wafers are bonded together. This method is often used simultaneously for device fabrication and also for its packaging - belong both to front-end
and back-end process, another peculiarity of MEMS, but at this stage it is not
surprising anymore!
84
Wafer bonding has the potential to simplify fabrication method because structures
can be patterned on both wafers and after bonding they will be part of the same
device, without the need for complex multi-layer fabrication process. The main
issues that need to be considered to evaluate a wafer-bonding technique are : the
bonding temperature (high temperature may damage the materials or structure
on the processed wafer), the difference in coefficient of thermal expansion between
the bonded materials (in the worst case causing debonding or affecting stress sensitive systems during use) and the permeability to gas and humidity of bond and
bonded wafer (affecting long term reliability).
The bonding techniques are usually split between intermediate layer bonding
technique, where an intermediate layer is used to form the bond between the two
wafers, and direct bonding methods where there is no such layer.
Type
Intermediate
layer
Direct
Bonding
Temp.
Stress
Hermeticity
epoxy
low
average
poor
eutectic
average
average
very good
glass frit
low
very good
anodic
average
very good
excellent
fusion
high
excellent
excellent
85
Cathode
Glass
Na+
Na+
Na+
Na+
Na+
Na+
Na+
Na+
Na+ Na+
Na+
-
Na+
-
O2 +O2 +O2 +O2 + O2 + O2 + O2 + O2+ O2+ O2+ O2 +O2 +O2 +O2 +O2 + O2 + O2 + O2 + O2+ O2+ O2+ O2 + O2 + O2+ O2+
Si Si
Si Si Si Si
Si
Si Si Si Si Si Si Si Si Si
Si Si Si Si Si Si Si Si Si
Silicon
Na+
-
O2
Hot plate
Si
Na+
-
O2
+
Si
Na+
-
O2
+
Si
O2
+
Si
Anode
-
Si
Note on anodic bonding: Pyrex glass has a CTE of 3.2x10^-6 /K, roughly constant up to 400°C, while silicon has a CTE lower than that of Pyrex below about 140°C and higher above - the bonding temperature is chosen such that the integral of the difference of thermal expansion over the temperature range is close to 0.

Figure 3.11: Silicon pressure sensor SP15 bonded with glass cover (Courtesy Sensonor AS - An Infineon Technologies Company).

3.4 Surface micromachining and thin-films

Surface micromachining builds the microstructure by depositing and patterning thin-film materials, layer by layer, on the surface of the substrate. The thin-film layers are typically 1-5 µm thick, some acting as structural layers and others as sacrificial layers. Dry etching is usually used to define the shape of the structural layers, and a final wet etching step releases them from the substrate by removing the supporting sacrificial layer.
A typical surface micromachining process sequence to build a micro bridge is
shown in Figure 3.12. Phosphosilicate glass (PSG) is first deposited by LPCVD
to form the sacrificial layer. After the PSG layer has been patterned, a structural
layer of low-stress polysilicon is added. Then the polysilicon thin-film is patterned
with another mask in CF4 + O2 plasma. Finally, the PSG sacrificial layer is etched
away by an HF solution and the polysilicon bridge is released.
As a large variety of materials, such as polysilicon, oxide, nitride, PSG, metals, diamond, SiC and GaAs, can be deposited as thin films and many layers can be stacked, surface micromachining can build very complicated microstructures. For example, Sandia National Laboratories proposes a process with four polysilicon structural layers and four oxide sacrificial layers, which has been used for fabricating complex locking mechanisms for defense applications. Figure 3.13 demonstrates surface micromachined micro-mirrors fabricated using two polysilicon structural layers and an additional final gold layer to increase reflectivity. They have been assembled in 3D like a pop-up structure, using a micromanipulator on a probe station.
3.4.1 Thin-film fabrication
The choice of the thin-film and its fabrication method is dictated by many different
considerations: the temperature budget (limited by the maximum temperature
that the substrate can withstand and the allowable thermal stress), the magnitude
of the residual stress in the thin-film (too much stress causes layer cracking), the conformality of the thin-film (how the thin-film follows the profile of the substrate as shown in Fig. 3.14), the roughness of the thin-film, the existence of pinholes, the uniformity of the thin-film, and the rate of fabrication (to obtain cost-effective thick films).
(Figure 3.14: conformality of a thin-film over a step, characterized by the ratio of the minimum (h_min) to the maximum (h_max) film thickness.)
Technique       Temperature   Conformality   Rate
Spin-coating    room temp.    --             ++
Oxidation       very high     ++
Evaporation     low
Sputtering      low
LPCVD           high
Table 3.6: Typical combinations of structural material, sacrificial material and release etchant.

Structural material   Sacrificial material   Etchant
Polysilicon           PSG (SiO2)             Buffered HF
Si3N4                 Poly-Si                KOH
SiO2                  Poly-Si                EDP/TMAH
Aluminum              Photoresist            Acetone/O2 plasma
Polyimide             Cu                     Ferric chloride
Ti                    Au                     Ammonium iodide
SiO2                  Poly-Si                XeF2
Oxidation
Oxidation belongs to the modifying processes, sharing with them a generally excellent conformality. Oxidation is a reactive growth technique, used mostly on silicon, where silicon dioxide is obtained by a chemical reaction with a gaseous flow of dry or wet dioxygen. Using dry dioxygen results in a slower growth rate than when water vapour is added, but it also results in higher quality films. The rate of growth is given by the well-known Deal and Grove model as
d_o = \frac{A}{2}\left(\sqrt{1 + \frac{t+\tau}{A^2/4B}} - 1\right),

where B is called the parabolic rate constant and B/A the linear rate constant, obtained for the long and the short growth time limits respectively. Actually, for the short growth time limit, we notice that (neglecting the correcting factor \tau) \lim_{t\to 0} d_o = \frac{A}{2}\left(1 + \frac{1}{2}\,\frac{t}{A^2/4B}\right) - \frac{A}{2} = \frac{B}{A}\,t: we indeed have a linear growth rate with the slope B/A. Accordingly, we would find a parabolic approximation for long growth duration (cf. Problem 4).
Typical values for these constants at 1000°C are A = 0.165 µm, B = 0.0117 µm²/h and τ = 0.37 h in dry O2, and A = 0.226 µm, B = 0.287 µm²/h and τ = 0 in wet O2. It should be noted that the model breaks down for thin dioxide (< 300 Å) in dry atmosphere because of an excessive initial growth rate, which is modeled through the use of τ. The passage from a linear growth rate to a parabolic rate is dictated by the need for the oxygen atoms to diffuse through the growing layer of silicon dioxide - the thicker the layer, the slower its growth rate. The parabolic form of the growth rate is a typical signature of a diffusion limited process. The diffusion rate is increased for wet oxidation as the presence of hydrogen facilitates oxygen diffusion through the dioxide, resulting in much faster growth.
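As a quick numerical check, the small Python sketch below evaluates the Deal and Grove expression with the constants quoted above; the function name and the chosen oxidation times are illustrative assumptions, not values from the text.

    import math

    def oxide_thickness_um(t_h, A_um, B_um2_h, tau_h=0.0):
        # Deal-Grove: d_o = (A/2) * (sqrt(1 + (t + tau)/(A^2/(4B))) - 1)
        return (A_um / 2.0) * (math.sqrt(1.0 + (t_h + tau_h) / (A_um**2 / (4.0 * B_um2_h))) - 1.0)

    # constants quoted above for oxidation at 1000 degrees C
    dry = dict(A_um=0.165, B_um2_h=0.0117, tau_h=0.37)
    wet = dict(A_um=0.226, B_um2_h=0.287, tau_h=0.0)

    print(oxide_thickness_um(1.0, **dry))   # about 0.07 um after 1 h in dry O2
    print(oxide_thickness_um(10.0, **wet))  # about 1.6 um after 10 h in wet O2

The 10 h wet result is of the same order as the thick layers (up to 2 µm) mentioned below.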
Actually, it is possible to control the oxidation locally by blocking the diffusion of oxygen to the surface in selected places, performing what is called a LOCOS process (local oxidation). In this process, Si3N4 is deposited and patterned on the silicon surface (atop a thin oxide layer used for stress relieving) to act as a barrier against oxygen diffusion. During the oxidation process in the furnace, oxide grows only in the bare regions while the Si3N4 prevents oxidation in the covered regions, effectively resulting in a patterned oxide layer. We note that the oxide growth happens in part below the original surface and that, at the edge of the region, the lateral oxide growth lifts the nitride film, resulting in a bird's beak profile of the oxide - both characteristics resolutely different from what would happen with etched oxide films.
(Inset: LOCOS - patterned silicon nitride on silicon, with oxide grown in the uncovered regions.)
For MEMS applications, an interesting feature of oxidation is that it results in a net volume change, as the density of the oxide is lower than that of silicon: the net expansion amounts to slightly more than half the grown oxide thickness. That is, for an infinite plane, the growth of a thickness d_o of dioxide results in the consumption of a thickness d_Si ≈ 0.46 d_o of silicon, but a net expansion of d_o - d_Si ≈ 0.54 d_o during growth. While this phenomenon may produce unwanted stress, it is also used to close holes in silicon or poly-silicon layers.
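As a small worked example of these proportions (a sketch with an arbitrary oxide thickness; the function name is only illustrative):

    def oxidation_geometry_um(d_oxide_um):
        # silicon consumed ~ 0.46 * d_o, net step above the original surface ~ 0.54 * d_o
        d_si = 0.46 * d_oxide_um
        step = d_oxide_um - d_si
        return d_si, step

    print(oxidation_geometry_um(1.0))  # (0.46, 0.54): 1 um of oxide consumes 0.46 um of Si and raises the surface by 0.54 um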
The relatively large range of variation of the oxidation growth rate allows thick layers (up to 2 µm for 10 h wet oxidation) or very thin layers (a few nm) to be obtained with good control. The thicker films can be used as masks for wet etching or as sacrificial layers, and the thinner ones serve to produce nano-structures with high accuracy. This versatility and the high quality of the film produced give oxidation an important role in MEMS manufacturing.
Doping by diffusion and ion implantation
Doping is a process where impurities are introduced into a material to modify it, and as such it belongs to the modifying processes. The impurities can be directly introduced during the fabrication of the substrate (for example, As can be introduced into the melt during the growth of the silicon ingot to obtain n-doped silicon wafers), but we are here interested in techniques that can be used to dope a thin layer of material locally and in-situ. The two main techniques used are diffusion and ion implantation.
Diffusion is performed by placing the substrate in a high temperature furnace in the presence of the doping species. In the main processes currently used, the doping species are present in gaseous form in the furnace or have been deposited as a thin film directly onto the substrate. The high temperature agitates the atoms strongly and allows the impurity atoms to move slowly inside the substrate, until the temperature is lowered or their concentration is uniform.
Actually, the diffusion is governed by Fick's laws, which are derived from the statistical study of the random motion of particles due to thermal energy. The first law relates the flux of the diffusing species (j) to the gradient of the concentration C. The proportionality constant D is the diffusion constant, a material constant depending on the substrate, the diffusing atoms and the temperature:

j = -D\,\nabla C,

which in a one-dimensional system gives j = -D\,\partial C/\partial x. This equation translates the fact that the average flow of impurities will last until the concentration is equal everywhere and the gradient is null. The second law relates the evolution of the concentration with time:

\frac{\partial C}{\partial t} = D\,\nabla^2 C,

which in one dimension is \partial C/\partial t = D\,\partial^2 C/\partial x^2. These partial differential equations can be solved for different sets of initial and boundary conditions, depending on the diffusion configuration.
Of particular interest is the case of the infinite source, where the concentration at the surface remains constant during the complete diffusion, as happens during diffusion from a gaseous source in a furnace. In this case the concentration is given as:

C(x, t) = C_S\,\mathrm{erfc}\!\left(\frac{x}{2\sqrt{Dt}}\right),

where \mathrm{erfc}(x) = 1 - \mathrm{erf}(x) = 1 - \frac{2}{\sqrt{\pi}}\int_0^x e^{-\eta^2}\,d\eta is the complementary error function. From this equation we can obtain the diffusion depth d as:

d = 2\sqrt{Dt}\;\mathrm{erfc}^{-1}\!\left(\frac{C_d}{C_S}\right),
where C_d is the concentration at the depth d. We verify here that the depth varies as the square root of time; that is, to go twice as deep, the diffusion needs to be 4 times longer. The surface concentration C_S remains constant during the diffusion and is generally given by the solubility limit of the impurities in the substrate (e.g. 2x10^20 at/cm³ for boron in silicon at 1100°C). Actually, Fick's equations describe only the evolution of the concentration inside the material and not what happens at the interface between the substrate and the upper medium. The solubility limit gives the maximum concentration that is reached before the impurities form clusters and small crystals, which would require a lot more energy and generally does not happen. In this way it also gives the maximum impurity concentration that can be obtained by diffusion in a material.
Another common case is the case of the finite source, where, after some time, the source is completely consumed by the diffusion, as happens when a thin-film on the substrate surface acts as the diffusion source. In this case, the relevant quantity is the total amount of dopant present, Q, that will finally fully diffuse in the substrate. In this case the solution to Fick's second law is:

C(x, t) = C_S \exp\!\left(-\frac{x^2}{4Dt}\right), \qquad d = 2\sqrt{Dt\,\ln(C_S/C_d)},

varying again as a function of the square root of the time.
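Both depth formulas can be evaluated directly; the sketch below uses SciPy's inverse complementary error function, with an assumed diffusion coefficient given only for illustration.

    from math import sqrt, log
    from scipy.special import erfcinv

    def depth_infinite_source_cm(D_cm2_s, t_s, Cd_over_Cs):
        # d = 2*sqrt(D*t) * erfc^-1(Cd/Cs)  (constant surface concentration)
        return 2.0 * sqrt(D_cm2_s * t_s) * erfcinv(Cd_over_Cs)

    def depth_finite_source_cm(D_cm2_s, t_s, Cd_over_Cs):
        # d = 2*sqrt(D*t*ln(Cs/Cd))  (source fully consumed)
        return 2.0 * sqrt(D_cm2_s * t_s * log(1.0 / Cd_over_Cs))

    # Assumed D ~ 1e-13 cm^2/s (order of magnitude for boron in silicon near 1100 degrees C),
    # 2 h diffusion, depth where C falls to 1/1000 of the surface concentration
    D, t = 1e-13, 2 * 3600.0
    print(depth_infinite_source_cm(D, t, 1e-3) * 1e4, "um")  # ~1.2 um
    print(depth_finite_source_cm(D, t, 1e-3) * 1e4, "um")    # ~1.4 um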
Ion implantation is comparatively a more versatile technique, but it requires a much more complex set-up (Figure 3.15). In this case the impurities are introduced into the substrate by bombarding it with ions of the impurity traveling at high velocity. The ions will penetrate and be stopped after a short distance under the surface (at most a few µm) by interaction with the electrons and atoms of the substrate material. The ion implanter is thus composed of a collimated source of ions, a section using high voltage to accelerate the ions, a mass spectrometer to sort the ions and select only the desired species, and finally an electrostatic scanning system allowing the ion beam to be directed toward any place on the wafer surface.
(Figure 3.15: schematic of an ion implanter - ion source, acceleration section, mass spectrometer, scanner and wafer.)
The implantation being a rather violent process, the collision of the impurity ions with the stationary atoms of the substrate causes them to recoil and generally results in amorphization of the doped portion of the substrate. In general recrystallization is needed, and thus the implantation process needs to be followed by an annealing step at high temperature (800°C to 1200°C) under an inert atmosphere.
If ion implantation allows the doping profile to be tailored much more precisely than diffusion by varying dose and energy, it is not exempt from drawbacks. The implantation is a rather directional process and it is affected by shadowing behind tall structures and reflection on the side walls, making uniformity more problematic with high aspect ratio structures. Additionally, inside crystals there are often directions with less chance of nuclear collision where ions will be channeled much deeper. For example in Si, such a phenomenon appears along the <111> orientation, and when ion implantation replaced diffusion for microelectronics at the turn of the 1980s, the preferred Si wafer orientation changed from <111> to <100>. Finally, ion implantation is unable to provide deep doping profiles (more than a few µm) and, contrary to microelectronics, diffusion remains widely used in MEMS fabrication.
Figure 3.16: Concentration profiles for (left) implanted ions of different energies (50 to 400 keV) and (right) diffused atoms for different diffusion times (2√(Dt) from 0.1 to 1 µm), with (dots) infinite and (solid) finite source.
In general, it is possible to control the doping locally by placing a protecting layer over the zone that should not be doped before the doping process is performed. Actually, the protecting layer will be doped at the same time as the exposed substrate, but it will be removed in a later step, leaving only the exposed substrate doped. If the doping is obtained by diffusion, lateral (isotropic) diffusion will occur at the edge of the pattern, enlarging the original pattern, but an implanted layer will have more precisely defined edges.
Spin-coating
Spin-coating is a simple and fast technique that allows thin-films to be deposited from a liquid. In general it is used to deposit polymers dissolved in a solvent, and particularly photoresist, but it can also be used for glass (spin-on glass) or other materials (PZT...) with the sol-gel technique.
(Figure: photoresist thickness obtained by spin-coating.)

Spin-coating over an already patterned surface results in a film of uneven thickness, notably thinner at corners, and several methods have been developed to solve this problem in the case of photoresist, where a good control of thickness is very important. One such method is spray coating, where the photoresist is sprayed over the surface, avoiding the presence of a thinner layer at corners. Photoresist has also been deposited by electroplating. In that case a conductive (i.e. metal coated) substrate is placed in a special electrolytic bath and, after application of a current, a layer of photoresist is grown uniformly on all the surfaces of the substrate, providing excellent conformality.
The main limitation of spin-coating is that the material needs to be in liquid form, restricting the range of available materials to a relatively small list. In general, as is the case for photoresist, the material is dissolved in a solvent which is then evaporated after spin-coating, leaving a hard film on the substrate. Alternatively, a monomer can be spin-coated and then polymerized by heating or UV exposure to form a film. Finally, in sol-gel methods like spin-on-glass (SOG), a suspension can be spin-coated and will form a network upon thermal treatment.
Evaporation and sputtering
Evaporation is strongly directional: a surface tilted with respect to the source receives less material because its projected surface is smaller⁷. The vertical sides can even be left in the shadow and receive no material at all. This last problem can generally be avoided (except in narrow trenches) by rotating the substrate during the deposition. Of course the shadowing effect is not always a problem, and the lift-off process can actually benefit from it: if nothing deposits on the side of the sacrificial material, its dissolution will be facilitated by a better access for the etching liquid.
The basic operation of the DC sputter relies on a plasma and is close to the glow discharge principle, but operated at a higher voltage. In that case, when the positive gas ions are accelerated in the Crookes dark space and hit the target surface, they have enough momentum to eject one or more atoms of the target in addition to the secondary electrons. The neutral atoms will then fly through the chamber and land on the wafer.
Figure 3.18: Typical DC sputter schematic - target on the cathode with magnet and coolant pipe, heated wafer holder, Ar gas inlet, pressure control valve and vacuum pump.
The most commonly used gas for sputtering is argon, an inert gas, avoiding any chance for the gas to chemically react with the substrate or the target. The target is located at the cathode, which is biased at a few kV, where the sputtering efficiency is highest for most materials. The substrate bias is also often set to a negative value (around 100 V, that is, of course, much lower than the cathode voltage), to attract ions that will help compact the deposited film and remove the loose atoms. Alternatively, a positive bias could be used to attract the electrons and repel the ions, resulting in different thin-film properties. Clearly, the substrate bias should not be made too negative, otherwise sputtering of the substrate will occur. In fact, this possibility is often used for a short time by biasing the substrate to 1 kV before the deposition is started, in order to clean the substrate surface. Additionally, the wafer is normally heated to a temperature set halfway to the melting temperature (T ≈ 0.5 T_m) to increase the atom mobility at the surface. This results in films having less porosity and lower stress. However, the main factor affecting stress is the gas pressure, and lower stress will be obtained at low pressure.
⁷ This is the same phenomenon that makes the temperature depend on the latitude on Earth: at higher latitudes the curvature of the globe lets the rays of the sun strike the surface obliquely, making them illuminate a larger surface and thus bringing less heat there.
The magnet is used to increase the ionization yield of the gas atoms. Actually the magnetic field traps the moving electrons (Lorentz force) in a helical path, increasing the length of their trajectory and thus increasing the chance that they hit a gas atom. The yield increases dramatically (by more than a factor of 10), resulting in a similar increase of the deposition rate. Most modern sputters use this principle and are called magnetron sputters.
One issue with the DC sputter is that it cannot be used efficiently for depositing insulating materials. Actually, the charge building up on the insulating target surface would finally repel the incoming ions and require a dramatic increase of the voltage to sputter the target material. However, instead of a DC potential, the sputter can also be operated with a pulsed field in the radio frequency range (RF sputter, in general at 13.56 MHz). In this case the potential of the target is maintained negative for most of the period and then briefly turned positive. The short duration of the positive field does not modify much the momentum of the heavy ions, which behave mostly as seen before in the DC field. On the other hand, the much lighter electrons are affected by the positive field and they move back toward the target. This neutralizes the build-up of positive charge which would happen in a DC sputter with a non-conductive target, and allows the deposition of insulating or semi-conducting materials.
The atoms coming from the target follow mostly a line-of-sight path, but the conformality is still better than with evaporation. Actually, as can be seen in Figure 3.18, owing to the large dimension of the target and its proximity to the wafer, target atoms arrive on the wafer within a relatively broad solid angle, decreasing the shadow effect usually observed with evaporation. Additionally, the atoms from the sputter have a higher velocity than atoms obtained by evaporation, resulting in layers with better adhesion to the substrate. Finally, the deposition speed is higher with sputtering, making it the tool of choice for metal deposition in the industry.
Chemical Vapor Deposition (CVD) techniques
Chemical vapor deposition techniques are split according to the operating pressure, from atmospheric pressure CVD (APCVD) and low-pressure CVD (LPCVD) to ultra-high-vacuum CVD (UHVCVD). However they all work on the same principle: the decomposition of a gas on the heated surface of the wafer in the absence of any reagent, a phenomenon called pyrolysis. CVD is performed in a simple furnace with a gas inlet and connected to a vacuum pump to maintain a controlled pressure, as shown in Figure 3.19.
Depending on the gas, or gas mixture, used it is possible to deposit a wide variety of thin-films. The most commonly deposited films are polysilicon using the decomposition of silane (SiH4) at 620°C, silicon nitride using a mixture of dichlorosilane (SiH2Cl2) and ammonia (NH3) at 800°C, and low-temperature oxide (LTO) using silane and oxygen at 425°C.

(Figure 3.19: CVD furnace, showing the furnace door, the gas inlet and the connection to the vacuum pump.)

In plasma-enhanced CVD (PECVD) a plasma assists the decomposition of the gas, allowing deposition at a much lower temperature (around 300°C); additionally, the quality of the thin-films is usually lower - but adjusting the plasma parameters allows a better control of the mechanical properties of the film.
The most common CVD thin-films and their deposition conditions are summarized below:

Material                  Technique      Gas             T [°C]   Remark
Oxide (SiO2)              LPCVD LTO      SiH4 + O2       425      low density
                          LPCVD TEOS     Si(OC2H5)4      700      good film
                          PECVD          SiH4 + N2O      300      contains H
Nitride (SiNx, Si3N4...)  LPCVD Si3N4    SiH2Cl2 + NH3   800      high stress
                          LPCVD SiNx     -               800      low stress
                          PECVD SiNx     SiH4 + NH3      300      contains H
Silicon (PolySi, a-Si...) LPCVD PolySi   SiH4            620      small grain
                          LPCVD a-Si     SiH4            570      amorphous
                          PECVD a-Si     SiH4            280      contains H
Tungsten (W)              LPCVD W        WF6 + SiH4      440      good conf.
We may also remark that the expected gain in yield achieved by shrinking the chip dimensions (a smaller area has less chance to catch a particle) can easily be offset by the dramatic increase in the number of smaller particles noted above: to keep the same yield, the environment should actually be cleaner, and the number of smaller particles should decrease proportionally with the scaling factor.
Epitaxy
Epitaxy is a CVD technique, as it generally relies on a furnace very similar to the one used for LPCVD, but it presents features that make it unique. The main difference between epitaxy and other CVD techniques is that in the case of epitaxy the structure of the thin-film depends on the substrate and, in particular, epitaxial growth allows single crystal layers to be obtained. Actually, if in the case of CVD the deposition is relatively random and independent of the substrate, generally resulting in amorphous or polycrystalline films, with epitaxy the thin-film will grow in an ordered manner determined by the lattice of the substrate. If the material of the epitaxial layer is the same as the substrate, the process is called homoepitaxy, and heteroepitaxy otherwise. Moreover, depending on the match between the lattice period of the substrate and of the film, we can distinguish three types of epitaxial growth:
commensurate growth, when the substrate and the layer have the same crystal structure and lattice constant,
incommensurate growth, when they don't have the same lattice constant, resulting in point defects at the interface,
pseudomorphic growth, when they don't have the same lattice constant but the epitaxial layer strains to match the lattice of the substrate.
The growth of high quality crystals, like silicon, is generally obtained by the Czochralski method, which consists in slowly pulling a very large single crystal from a melt, starting from a small seed crystal. Actually, this method could be described as an extreme case of homoepitaxy with commensurate growth.
The epitaxy process needs a furnace similar to an LPCVD furnace, but in practice the process is relatively more complicated to control. On silicon, the main process is based on the reduction of SiCl4 in a H2 atmosphere at 1200°C, with HCl as a by-product. However the high temperature makes it hardly usable, except as a first process step, and lower temperature processes using dichlorosilane above 950°C have been developed, but they are harder to control, often resulting in polycrystalline layers.
The main interest of the technique lies in the high quality of the grown layer, which results in good electronic properties, important for optoelectronics (solar cells, laser diodes...) and some specific electronic circuits, and in good mechanical properties (low stress), more interesting for MEMS applications. The relative difficulty of the technique makes it rarely used in MEMS fabrication, with the notable exception of the process used by Bosch for their multi-user foundry process. In this MPW process the structural layer is a 10.5 µm polycrystalline layer grown by epitaxy (called epipoly). In this case the interest is the growth speed (which can exceed 0.3 µm/min) that can be obtained without sacrificing the low stress in the layer.
3.4.2 Design limitation
The flexibility of surface micromachining is not free of unique problems that need addressing to obtain working devices.
During layer deposition, a strict control of the stress in the structural layer has to be exerted. Compressive stress in a constrained member will cause it to buckle, while a gradient of stress across a cantilevered structure will cause it to warp, resulting in both cases in probable device failure.
The possibility to stack several layers brings freedom but also adds complexity. Actually there is a large chance that the topography created by the pattern of an underlying layer will create havoc with the upper layer, as illustrated in Figure 3.20.
(Figure 3.20: topography-induced problems in surface micromachining - stringer formation when etching a layer deposited over a step, and layer interference between stacked structural layers.)
In particular, the topography of an underlying layer can leave a protrusion below the top structural layer that will forbid it to move freely sideways - probably dooming the whole device. This problem can be tackled during layout, particularly when the layout editor has a cross-section view, like L-Edit from Tanner Research. However, even a clever layout won't be able to suppress this problem completely and it will need to be addressed during fabrication. Actually, it is possible to polish the intermediate sacrificial layer using Chemical-Mechanical Polishing (CMP) to make it completely flat, which will avoid all interference problems. For example, Sandia National Laboratories uses oxide CMP of the second sacrificial layer in their SUMMiT V process.
However, sometimes the interference may be a desired effect: for example the so-called scissors hinge [25] design shown in Figure 3.21 benefits greatly from it. The scissors hinge is designed to provide a hinge functionality with a surface micromachining process, and as we see here the protrusions below the upper layer help to hold the hinge axis tightly. If we had to rely on lithography only, the gap between the axis and the fixed part in the first structural layer would be at best 2 µm, as limited by the design rules, and the axis would have too much play. However, the protrusions below the staple reduce the gap to 0.75 µm, the thickness of the second sacrificial layer, and the quality of the hinge is greatly increased.
(Figure 3.21: scissors hinge - the axis, patterned in the first structural layer, is held by a staple in the second structural layer anchored through connecting vias, with protrusions below the staple reducing the play.)
The final step in the surface micromachining process is the release - and this critical step also has a fair number of issues that need to be considered.

3.4.3 Microstructure release
The release step is the source of many of the technologist's woes. Release is usually a wet process that is used to dissolve the sacrificial material under the structure to be freed. However, the removal rate is usually relatively slow because the sacrificial layer is only a few µm thick and the reaction quickly becomes diffusion limited. The depth of sacrificial layer dissolved under the structure will then increase slowly with the etching time as

d_{release} \propto \sqrt{t_{etch}}.
Simply said, releasing a structure twice as wide will take 4 times more time. However, if the etching lasts too long, the chemical may start attacking the structural material of the device too. A first measure to avoid problems is to use compatible materials and chemicals, where the sacrificial layer is etched quickly but the other materials not at all. A typical example is given by the DLP (Digital Light Processing) chip from Texas Instruments, where the structural layer is aluminum and the sacrificial layer is a polymer. The polymer is removed with an oxygen plasma, and a prolonged release time will only slightly affect the metal.
This ideal case is often difficult to reach: for example, metals often have a finite etch rate in HF, which is used to remove PSG sacrificial layers. Thus, to decrease the release time, we have to facilitate the etching of the sacrificial layer by providing access holes for the chemical through the structural layer. In the case of Figure 3.13 for example, the mirror metal starts to peel off after about 10 minutes in HF, but in about 5 minutes HF could only reach 40 µm under a plain plate, so the designer introduced release holes. These holes in the structural layer are spaced by roughly 30 µm in the middle of the mirror plate (the white dots in the figure), allowing the HF to etch all the oxide beneath in less than 5 minutes.
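Since the released depth grows as the square root of time, the required etch time scales with the square of the distance the etchant has to travel under the structure. The sketch below simply applies this scaling to the numbers of the example above (the 200 µm plate width and the function itself are illustrative assumptions):

    def release_time_min(distance_um, ref_distance_um=40.0, ref_time_min=5.0):
        # d ~ sqrt(t)  =>  t scales as (d/d_ref)^2 * t_ref (reference: 40 um reached in ~5 min)
        return ref_time_min * (distance_um / ref_distance_um) ** 2

    # plain 200 um wide plate: the etchant must travel ~100 um from each edge
    print(release_time_min(100.0))  # ~31 min, long enough for HF to damage the metal
    # with release holes ~30 um apart the etchant travels at most ~15 um
    print(release_time_min(15.0))   # ~0.7 min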
During the final drying, the surface tension of the receding liquid trapped under the structure creates a capillary force that pulls the compliant structure against the substrate. This intimate contact gives rise to other surface forces, like the van der Waals force, which will irremediably pin the structure to the substrate when the drying is complete, effectively destroying the device. This phenomenon is referred to as stiction (Figure 3.22). Strategies that have been used to overcome this
problem have tackled it at both the design and fabrication levels. In surface micromachining the idea has been to reduce the contact surface by introducing dimples under the structure. On the fabrication side, super-critical drying, where the liquid changes to gas without creating a receding meniscus, has also been applied successfully. Coating the structure with a non-sticking layer (fluorocarbon, hydrophobic SAM...) has also proved successful, and this method, albeit more complex, has the added advantage of providing long-lasting protection against sticking that could arise during use.
Finally, a completely different approach is to avoid wet release altogether and instead perform a dry release with a gas or a vapour, suppressing the stiction concerns entirely. For example, the Multi-Project-Wafer (MPW) process run by Bosch uses HF vapour to remove the oxide layer below the polycrystalline structures. The reaction is roughly as follows:

SiO2 + 2 H2O → Si(OH)4
Si(OH)4 + 4 HF → SiF4 + 4 H2O

where all the final by-products are, of course, gaseous. The main issue with the technique is the high toxicity of the HF vapour, and in Table 3.6 we describe two other popular methods which present less risk: dissolving a polymer sacrificial layer with O2 plasma, and using xenon difluoride (XeF2) to etch sacrificial silicon. Xenon difluoride is a gas showing an excellent selectivity, with etch rate ratios close to 1000 with metals and up to 10000 with oxide. The gas has thus been used successfully to release very compliant or nano-sized oxide structures where silicon was used as the sacrificial material. The process does not use a plasma, making the chamber rather simple, and several manufacturers, like XactiX (in cooperation with STS) in the USA or PentaVacuum in Singapore, propose tools exploiting the technology.
3.5 DRIE micromachining
Deep reactive ion etching (DRIE) micromachining shares features with both surface and bulk micromachining. As in bulk micromachining, the structure is etched in the bulk of the substrate, but as in surface micromachining a release step is used to free the microstructure. Figure 3.23 shows a simplified process of bulk micromachining on a silicon-on-insulator (SOI) wafer using deep reactive ion etching (DRIE), a special MEMS dry etch technique allowing large etch depths with very vertical side walls. The SOI wafers used in MEMS usually have a device layer thickness between 10 and 200 µm, in which the structure is defined. After photolithography, the pattern is etched by DRIE through the device layer down to the buried oxide, which is then removed locally to release the movable structures.
The DRIE etch itself, known as the Bosch process (Figure 3.25), is based on the repetition of two alternating steps: passivation and etching. In the passivation step, C4F8 gas
flows into the ICP chamber, forming a protective polymer layer ((-CF2-)n) on all the surfaces. In the following etch step, the SF6 gas in the plasma chamber is dissociated into F-radicals and ions. The vertical ion bombardment sputters away the polymer at the trench bottom, while keeping the sidewall untouched and still protected by the polymer. Then the radicals chemically etch the silicon at the bottom, making the trench deeper. By carefully controlling the duration of the etching and passivation steps, trenches with an aspect ratio of 25:1 are routinely fabricated - and aspect ratios as high as 100:1 have been reached. Figure 3.25 (right) shows an SEM picture of a movable mirror fabricated by DRIE on an SOI wafer.
The DRIE is a very versatile tool and allows a good control of the etched profile slope, mostly by varying the etching/passivation durations and also by varying the substrate biasing voltage. However, it is affected by common RIE problems like microloading, where the etching rate is slower for high density patterns than for low density ones. This effect is linked to the transport speed of reactants and products to and from the surface, which can be improved somewhat by increasing the flow rate of the etching gas. But this is not the only issue, and other common issues are shown in Figure 3.26.
One issue with DRIE is the presence of regular ripples with an amplitude over 100 nm on the vertical edges of trenches, a phenomenon referred to as scalloping. The ripples come from the repetition of isotropic etching and passivating steps, and can be annoying for etching nano-pillars a few hundred nm in diameter or for obtaining vertical walls with a smooth mirror finish. Actually, they can be mostly removed by shortening the etching step to 1 s, instead of a standard 7 s, and by reducing the passivation step duration accordingly. This of course results in
(Figure: inductively coupled plasma (ICP) etching chamber, showing the coil, the gas inlet and the wafer.)
Figure 3.25: Principle of the Bosch process for DRIE etching (alternating SF6 etching and C4F8 passivation steps, with the fluoropolymer protecting the photoresist-patterned silicon sidewalls) and 50 µm thick movable mirror fabricated on an SOI wafer.
Figure 3.26: Some issues affecting the DRIE process: scalloping, notching and lag (ARDE).
a much slower etching rate, but trading etching speed for an improvement of another etching parameter is a usual practice.
The existence of DRIE lag is also a nuisance that needs to be considered. Actually, in narrow trenches, ion charging at the sidewall lowers the electric field and the energy of the ions, decreasing the etching rate as compared to what happens in wide trenches. This is described as the aspect-ratio dependent etching (ARDE) effect. This effect again can be controlled by properly tweaking the recipe and trading a bit of etching speed.
Another major issue existing in DRIE is the fast silicon undercut that happens in SOI trenches when the etch reaches the buried oxide layer. Actually, after the silicon has been completely etched, the oxide gets exposed, and positive charges build up in the insulating layer. This local space charge deviates the incoming ions laterally, causing an increased etch at the lower portion of the trench sidewall, an effect called notching. The most recent DRIE tools have managed to tackle this problem satisfactorily by using a low frequency biasing scheme. Actually, the normal RIE plasma frequency (13.56 MHz) is too high to have any effect on the ions, but by lowering the frequency to 380 kHz the ion bombardment will follow the field. In a way similar to what happens in an RF sputter, but this time for the ions, the ions during the positive bias pulse won't be directed toward the substrate anymore. The plasma electrons will then be attracted there and recombine within the charged insulator, suppressing the space charge. By varying the duty cycle of the low frequency bias pulse it is thus possible to control the etching/discharging timing, and obtain an optimal etching rate while avoiding notching.
It should be noted that the notching effect can be put to good use and help produce an etch-and-release process. Actually, it has been found [23] that the notching effect is self-limiting and that the depth of the notch is roughly equal to the width of the trench as soon as the trench has an aspect ratio larger than 2 (for smaller aspect ratios there is no notching effect). In this way, by carefully designing the geometry of the layout, it is possible to etch the structure and finally obtain anchored or free structures within the same etching step. This simplifies the DRIE fabrication process further, and the device can now be operated right after emerging from the DRIE - without the need for a separate release etch!
The SOI wafers often used in DRIE micromachining are still expensive, and it is possible to obtain the thick silicon structural layer by growing it by epitaxy on an oxidized wafer. Even more simply, DRIE has been used to etch through the Si wafer for a dry-etched version of bulk micromachining, allowing complete freedom over the layout as there are no more crystallographic orientation concerns. In this case, wafer bonding can be used to provide movable parts.
3.6 Other microfabrication techniques

3.6.1 Micro-molding and LIGA
Other methods exist for patterning where no material is removed but where it is
simply molded. LIGA, a German acronym for lithography (LIthographie), electroforming (Galvanoformung), and molding (Abformung) is the mother of these
methods. LIGA makes very high aspect ratio 3-D microstructures with non-silicon materials, typically metals and polymers.

(Figure: the LIGA process - X-ray lithography of a thick X-ray photoresist on a substrate, electroforming of metal in the resist mold, and molding of polymer parts with the metal insert.)

The process starts with X-ray lithography using a synchrotron source (e.g. an energy of 2.4 GeV and a wavelength of 2 Å) to expose a thick layer of X-ray photoresist (e.g. PMMA). Because
of the incredibly small wavelength, diffraction effects are minimized and a thick layer of photoresist can be patterned with sub-micron accuracy. The resist mold is subsequently used for electroforming, and metal (e.g. nickel using a NiCl2 solution) is electroplated in the resist mold. After the resist is dissolved, the metal structure remains. This structure may be the final product, but to lower the costs it usually serves as a mold insert for injection molding or hot embossing. The possibility to replicate hundreds of parts with the same insert opens the door to cheap mass production.
When sub-micrometer resolution is not much of a concern, pseudo-LIGA processes can be advantageously used. These techniques avoid the high-cost X-ray source for the mold fabrication by replacing it with the thick photoresist SU8 and a standard UV exposure, or even by fabricating a silicon mold using DRIE.
3.6.2 Polymer MEMS
Bulk and surface micromachining can be classified as direct etch methods, where the device pattern is obtained by removing material from the substrate or from deposited layers. However, etching necessitates the use of lithography, which already includes patterning the photoresist - so why would we want to etch the lower layer when the pattern is already there? Actually, lithography for MEMS has seen the emergence of ultra-thick photoresists that can be spun up to several hundred µm and exposed with a standard mask aligner, providing a quick way to produce micro-parts. SU8, a high-density negative photoresist, can be spun in excess of 200 µm and allows the fabrication of mechanical parts [24] of good quality. It is used in many applications ranging from bioMEMS, with micro-parts for tissue scaffolds or channels for example, to packaging, where it is used as a buffer layer.
Another application of thick photo-patternable polymers is the fabrication of micro-lenses: pillars of photoresist are first patterned by lithography and then heated until the resist melts and reflows into a smooth cap.

(Figure: polymer micro-lenses obtained by resist reflow, with pillar diameters D1 and D2 giving lens radii of curvature r1 and r2; scale bar 500 µm.)

We note that the continuous profile of the lens, which would have been hard to obtain using etching methods, is obtained here through a fundamental principle of nature, the minimization of energy in a system, which translates itself at this scale into the minimization of surface energy. Varying the diameter of the pillar before this so-called reflow process allows different radii of curvature, that is, different focal lengths, to be obtained. Another option would be to change the thickness of the photoresist layer, as the final shape is mostly determined by the volume of photoresist in the original pillar. One of the interests of this technology is that polymers usually have better optical properties than silicon in the visible, and there is a lot of opportunity for polymer micro-optical elements and systems, a domain that is sometimes called Polymer Optical MEMS or POEMS.
Next to these major techniques, other microfabrication processes exist and keep emerging. They all have their purpose and advantages and are often used for a specific application. For example, quartz micromachining is based on anisotropic wet etching of quartz wafers to take advantage of its stable piezoelectric properties and build sensors like gyroscopes.
3.7 Characterization
The small dimensions of MEMS make it hard to properly measure their geometry or observe their operation by simply using rulers or our naked eyes. Accordingly, a large range of specialized tools, including some specifically developed for the MEMS industry, is used for interfacing with the micro-world and measuring geometry, layer thickness, beam motion, material properties, etc.
We list in Table 3.8 some of the most common measurement tools for the measurands encountered in MEMS. We note that most measurands can usually be obtained with different tools, but the tools will usually differ in other characteristics, such as whether they perform an area or a point measurement, or whether they are a contact or a non-contact method, etc. For example, surface roughness may be measured with a stylus profilometer, which is a contact method working point by point, or with an optical interferometer, which is a non-contact method and records a complete surface simultaneously. For a proper choice of the right instrument, additional properties will often need to be considered: for example, the optical interferometer will have difficulty working with transparent samples and will usually have a smaller range than a stylus profilometer. Clearly, the ability to work with multiple tools is important for answering all the challenges of MEMS measurement.
In the following sections we will describe a few of these tools in more detail, but be aware that good characterization skills will only be acquired with a knowledge of the capabilities of more tools than those cited in the table.
Table 3.8: Common measurement tools for the measurands encountered in MEMS.

Type       Measurand             Tool                          Remark
Geometry   in-plane              -                             contact
                                 -                             non-contact
           depth                 SEM (SE)                      destructive
           thickness             ellipsometer                  dielectric
                                 IR reflectometry              metals
                                 -                             contact
           roughness             stylus profilometer           contact
                                 interferometer                non-contact
Physical   refractive index      ellipsometer                  multi-layer
           resistivity           four-probe measurement
           surface energy        sessile drop, pendant drop
           interfacial tension   du Noüy ring, Pt plate
           modulus               nano-indenter, Instron
           composition           SEM (EDS), SEM (BSE),
                                 XPS, SIMS, AES, NMR
Chemical   structure             XRD, TEM
Dynamics   vibration             LDV                           point meas.
                                 strobed interferometer        area meas.
3.7.1 Light Microscope
The light microscope is ubiquitous in MEMS characterization, letting the micro-world come to our sight. It is used repeatedly in the cleanroom at the end of each process step for quality control, or after fabrication for observing the operation of the completed MEMS. From the light microscope we have been using for biological sample viewing in 3rd grade to the fluorescent confocal microscope, there is a complete palette of microscopes available for different usages, and for MEMS characterization the most used microscope is called a reflected light, infinity corrected compound microscope (Figure 3.29). The compound part just means it uses two
Figure 3.29: Compound microscope for reflected and transmitted light observation, showing the camera and relay lens, the ocular, the tube lens, the objective turret with infinity corrected objectives, the sample X/Y stage with its knobs, the focusing knob and the transmissive illumination.
sets of lenses for magnifying the sample, the infinity corrected part is a nice feature we will explain later, while the reflected part (as opposed to the transmitted light used for biological sample microscopy) means that the optical paths for observation and illumination both come from the top, allowing the observation of opaque samples, which are the norm in MEMS. Such a microscope is a very precise optical instrument, requiring high-tolerance manufacturing and careful design to obtain all the desired features for precise characterization.
Although its design details are complex, the principle of the compound microscope is simple: the objective forms an enlarged and inverted image of the object at 160 mm (normalized length) from the objective end, and it is this real image that is observed with the ocular (also called the eyepiece), allowing further enlargement. The imaging light path is shown in Figure 3.30, where we have used the principal planes to represent the equivalent optical system for each set of thick lenses.
We note that the objective inverts the image while the eyepiece doesn't change its orientation (its magnification is positive), thus the final image should be inverted. However, the prisms placed before the eyepiece, which can be seen in Figure 3.29, ensure by using folded reflections that the actual image is erect.
The magnification is changed by rotating the turret and selecting an objective with a higher or lower magnification. Interestingly, although we change the focal length (hence the magnification), the objectives are designed so that, even for objectives corrected for a finite tube length, the position of the intermediate image does not change, a fact known as parfocality. In practice, there is little reference to focal length on objectives and oculars: the lateral magnification is directly labeled on the objective (common values are 2x, 5x, 10x, 20x, 40x, 100x) while the angular magnification of the eyepiece is commonly written as 10x or 20x.
When we use a camera we need to form a real image on its photosensitive sensor, thus we cannot use the ocular (it produces a virtual image) but should place the sensor directly after the objective. The issue with this simple configuration is that the camera cannot be placed in the intermediate image plane inside the microscope (Figure 3.29) but can only be placed further away, outside of the microscope. Accordingly, the object would have to be moved using the focusing knob to change the position of the intermediate image and focus the image through the objective directly on the sensor. However, this is not very practical, because if we switch back to visual observation through the eyepiece the intermediate image won't be in focus and the focusing knob would have to be used again. The modern microscope solves this problem by using an additional relay lens (Figure 3.29) forming a real image of the intermediate image on the camera sensor. This lens is normally placed so that the magnified intermediate image fills the camera sensor and, depending on its format, the magnification is normally between 0.5x and 1.25x. The magnification of the microscope is then simply equal to the lateral magnification of the objective multiplied by the relay lens magnification. Note that because we use two lenses with negative magnification,
the resulting image will be erect. Actually, here we only considered the optical magnification, but when we use a camera we may also want to consider the electronic magnification: on the displayed image, a pixel is a magnified view of one sensitive element of the camera. This factor can be obtained by simply taking the ratio of the image dimension on the screen to the camera sensor dimension. For example, for an image that comes from a camera with a 1" sensor (with an actual diagonal of 16 mm) completely filling a 17" (or 43 cm) screen, the electronic magnification is 430/16 ≈ 27. Note that the sensor formats labeled as 1/3", 1/2", 2/3" or 1" do not directly translate to actual dimensions and we would need to refer to Table 3.9. The total visual magnification is then obtained by combining these different factors.
Table 3.9: Camera sensor formats (dimensions in mm).

Format   Diagonal   Length   Height
1/3"     6          4.8      3.6
1/2"     8          6.4      4.8
2/3"     11         8.8      6.6
1"       16         12.8     9.6
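The sketch below simply chains the magnification factors discussed above (objective, relay lens and electronic magnification); the numerical values reproduce the 1" sensor / 17" screen example and are otherwise arbitrary assumptions.

    def electronic_magnification(screen_diag_mm, sensor_diag_mm):
        # ratio of the displayed image size to the camera sensor size
        return screen_diag_mm / sensor_diag_mm

    def total_camera_magnification(objective_mag, relay_mag, screen_diag_mm, sensor_diag_mm):
        # optical magnification (objective x relay lens) times the electronic magnification
        return objective_mag * relay_mag * electronic_magnification(screen_diag_mm, sensor_diag_mm)

    print(electronic_magnification(430.0, 16.0))             # ~27, as in the text
    print(total_camera_magnification(20, 1.0, 430.0, 16.0))  # ~537x on screen with a 20x objective and a 1x relay lens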
Resolution, also called the resolving power of the microscope, gives the shortest distance between two points on a specimen that can still be resolved. The resolution ultimately depends on physics, and more specifically on the laws of diffraction.
We note that the resolving power does not describe the ability to observe small structures, but the capability to resolve two neighboring points or observe detailed features. The difference is of importance: for example, nanoparticles below 50 nm that scatter enough light can still be observed with a light microscope (that is, we see there is a halo, but we cannot really tell its details and can only guess that the particle sits at the center of the halo).
The possibility to correct the optical aberrations of objectives has tremendously improved since the advent of simulation software. Nowadays, most optical systems are mostly limited by diffraction and no longer by spherical or other aberrations. Diffraction by the microscope objective, acting as the entrance pupil of the microscope, can simply be considered as a problem of diffraction by a circular aperture in the far field - with an infinity corrected objective we observe at infinity - and we can use the simplified model of Fraunhofer diffraction (cf. Appendix F)12. This means that the image of a point object observed by the objective and formed by the tube lens in an infinity corrected microscope is not a point, but an Airy disk.
We may then define criteria for resolution, based on whether the Airy disks produced by two neighbouring points can be distinguished or not. The meaning of distinguished is subject to debate, and one of the earliest criteria was postulated by Lord Rayleigh: two points of the same size can be distinguished if the centers of the two resulting Airy disks are separated by a distance equal to the radius of the Airy disk (Figure 3.31-c). Another way to look at this criterion, and what makes its simplicity, is to say that the maximum of one Airy disk should be located at the first zero of the other Airy pattern. According to Rayleigh's criterion we may obtain the resolving power of the objective using the formulas derived in Appendix F as:

x_{min,R} = 1.22\,\frac{\lambda\,f_O}{D},

where D = 2a is the diameter of the objective entrance pupil, f_O its focal length and \lambda the wavelength of the light.
12
Actually it is an approximation as the far-field hypothesis is not entirely verified here: the
observation plane is indeed far but the wave from the very close object striking the objective
aperture is not plane.
Figure 3.31: Resolving two neighbouring point objects having the same Airy disk: (a) not resolved, (b) barely resolved according to Sparrow's rule, (c) barely resolved according to Rayleigh's rule, (d) resolved.
Rayleigh's criterion is not the only one possible and other, less stringent ones have been proposed, like Sparrow's rule, which is particularly representative of what is actually observed with a telescope. Here the idea is to say that two points cannot be resolved anymore when the dip between the two maxima disappears and we get a flat-topped signal (Figure 3.31-b). In this case the limit is slightly improved and we get:

x_{min,S} = 0.96\,\frac{\lambda\,f_O}{D}.
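The two criteria are easy to compare numerically; the sketch below evaluates them for an assumed wavelength and objective geometry, and the relation NA ≈ D/(2 f_O) used in the comment is an approximation introduced here only for illustration.

    def rayleigh_resolution(wavelength_um, f_obj_mm, pupil_diam_mm):
        # x_min,R = 1.22 * lambda * f_O / D
        return 1.22 * wavelength_um * f_obj_mm / pupil_diam_mm

    def sparrow_resolution(wavelength_um, f_obj_mm, pupil_diam_mm):
        # x_min,S = 0.96 * lambda * f_O / D
        return 0.96 * wavelength_um * f_obj_mm / pupil_diam_mm

    # assumed green light (0.55 um) and an objective with f_O = 20 mm, D = 18 mm (NA ~ D/(2 f_O) ~ 0.45)
    lam, f_o, D = 0.55, 20.0, 18.0
    print(rayleigh_resolution(lam, f_o, D))  # ~0.75 um
    print(sparrow_resolution(lam, f_o, D))   # ~0.59 um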
Depth of field is the number describing the longitudinal resolving power (axial resolution along the focus direction), as opposed to the lateral resolving power just described. Actually, for an infinity corrected microscope, a flat object placed in the objective focal plane will be perfectly imaged in the intermediate image plane. However, if the object is offset and placed slightly above or below this focal plane, its image won't be right at the intermediate image plane but slightly before or after it. Thus, in the intermediate image plane, which is observed through the ocular (or by the relay lens), this offset object appears slightly blurred. The depth of field gives an estimate of how much offset can be tolerated before the image is too blurred.
This effect has two main contributors (neglecting again aberrations): a geometrical effect and diffraction due to wave optics. The geometrical effect can be obtained by first observing that, for an object on the optical axis, there is a region symmetrical around the image plane where the image of a point object is a disk with a diameter smaller than some acceptable blur ε. If ε is small (on the order of the diffraction effect), this geometrical blur can be ignored and for practical purposes we consider that the object is in focus. Then, the depth of field (DOF) is the range where the object can be placed in front of the objective and still yield an image in this zone of acceptable blur (Figure 3.32). We note that this region is not symmetrical around the nominal focal plane in front of the objective but is longer before it than after.

(Figure 3.32: geometrical construction of the depth of field (DOF), showing the nominal object position in front of the objective and the intermediate image plane.)

The diagram in the figure allows us, after some calculation and simplification, to find:
\mathrm{DOF} = \frac{n\,\varepsilon}{M_O\,\mathrm{NA}} = \frac{2 f_O\,n\,\varepsilon}{M_O\,D},

where M_O is the magnification of the objective and NA ≈ D/(2 f_O) its numerical aperture.
The dependence of the main microscope characteristics on the numerical aperture NA can be summarized as:

                               Expression           Improves with
Light gathering                ∝ NA²                NA²
Resolving power                x_min = λ/(2 NA)     NA
Depth of field (diffraction)   n λ/NA²              1/NA²
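A minimal sketch of these NA dependencies, assuming the λ/(2 NA) and n λ/NA² expressions of the table and green light, purely for illustration:

    def abbe_resolution_um(wavelength_um, NA):
        # resolving power ~ lambda / (2 NA)
        return wavelength_um / (2.0 * NA)

    def diffraction_dof_um(wavelength_um, NA, n=1.0):
        # diffraction-limited depth of field ~ n * lambda / NA^2
        return n * wavelength_um / NA**2

    for NA in (0.1, 0.5, 0.9):
        print(NA, abbe_resolution_um(0.55, NA), diffraction_dof_um(0.55, NA))
    # doubling NA halves the resolvable distance but divides the depth of field by four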
A long working distance, however, normally implies a small aperture angle, hence a low NA and a poor resolution, and this conundrum can only be solved by using large lenses for the objective input lens, upping their price by a factor of 10 when compared to classical objectives. Some brands like Mitutoyo propose 10x objectives with a working distance of 33 mm, and 100x objectives exceeding 6 mm while keeping NA = 0.7.
As we can see in the inset, a lot of information is written on the objective barrel: the type of objective, usually reflecting the correction of chromatic aberration (best is Apochromat, corrected for 4 wavelengths; less good is Achromat, corrected for 2 wavelengths) and of field-curvature aberration (Plan means that the focal plane is a plane - the best for a camera sensor - instead of a sphere section); the magnification (for example 60x) followed by the numerical aperture (for example 0.5); the tube length in mm (∞ means an infinity corrected objective, otherwise the value is 160 for a standard 160 mm tube length) followed by the thickness in mm of the glass slide placed between the lens and the sample for which the objective is corrected (usually 0.17 for the 0.17 mm thick coverslip used for biological samples, and a - when there is no such correction, as is usually the case for MEMS observation); and sometimes, at the bottom, the working distance in mm. Unfortunately, different manufacturers use different arrangements and sometimes the written figures become ambiguous (e.g., is 0.17 the NA or the glass slide thickness?).
The microscope subject is much richer than this short section exposes: for example we did not discuss the problem of illumination, including the possibility to use a dark-field microscope for revealing small details on a flat surface. However, we shall finish the topic by noticing that we may beat the diffraction limit in an original way by imaging with a very small aperture scanned very close to the sample instead of a lens. The short distance prevents diffraction effects from appearing and the resolution becomes roughly equal to the aperture diameter - which can be as small as 50 nm. With this so-called Scanning Near-field Optical Microscope (SNOM) we need to scan the object line by line and record each point of the surface independently and sequentially, instead of the parallel way used for imaging with a traditional light microscope. Even if it is slower and if its extremely short depth of focus is not easy to manage, the SNOM remains an optical microscope and as such renders important services for high resolution measurement of material optical properties.
3.7.2 SEM (Scanning Electron Microscope)
The scanning electron microscope (SEM) is the workhorse of 3D micro-characterization. Not only does it give hard-to-match resolution for geometry measurement (down to 1 nm), it also allows material composition or morphology to be determined. The SEM magic starts at the top of the evacuated instrument column, where electrons are produced (the so-called primary electrons, PE), accelerated and shaped into a beam before being focused and scanned on the surface of the sample. Actually, the SEM belongs to the scanning microscopies (like the SNOM just described), where imaging is obtained point by point by progressively exploring the surface of the object using raster (succession of lines) scanning. Here, at the impinging point, the interaction between the primary electrons and the sample results in the generation of different signals that are measured with the multiple detectors schematically shown in Figure 3.34.
(Figure 3.34: schematic of an SEM - electron source, condenser coils, aperture, scanning coils and focusing electrodes shaping the primary electron beam (PE), with X-ray, backscattered electron (BSE) and secondary electron (SE) detectors around the sample.)
Figure 3.35: Interaction of electrons with matter, showing the primary electron beam (PE) and the resulting backscattered electrons (BSE), secondary electrons (SE), transmitted electrons (TE) and X-rays.
Actually, we also show in Figure 3.35 that if the sample is thin enough (< 100 nm) and if we use a higher energy beam (100 keV), transmitted electrons (TE) pass through the sample. By adding a detector below the sample we get a measurement linked to the absorption inside the material, and we then talk of scanning transmission electron microscopy (STEM). However, the special configuration required for these measurements is best provided by a transmission electron microscope (TEM), which does not use scanning but projection and imaging for characterizing the sample. In this case, by using ultra-thin samples (a few nm), it is even possible to reach an ultimate resolution well below the nm and see individual atoms, revealing the atomic structure of the sample13.
The resolution of the SEM is directly linked to the size of the spot where the
electrons can be focused and also on the interaction volume (cf. Figure 3.35).
Contrary to the optical microscope, the diameter of the focused spot is not limited
by diffraction effects. Actually the De Broglies wavelength of an electron at 30 keV
is given by =
h/p where the momentum p is actually obtained from the kinetic
energy as p = 2Em, resulting in a value of... e = 7 pm, that is 0.007 nm.
Even if the beam energy is 100 times smaller, the influence of diffraction (aperture
would be 10s of m at least) can clearly be neglected. Thus in the SEM case,
the spot size is still linked to the aberration of the electron optics (the coils, the
electrostatic lens, etc), and is far from being diffraction limited. Currently the
highest resolution claimed by manufacturer is below 1 nm and in some extreme
case close to 1
A.
The main techniques for obtaining a better resolution would be to decrease the
diameter of the aperture in the column (Figure 3.34) and to bring the sample closer
to the column end. Both techniques reduces the number of stray electrons blurring
the image and are also an effective means to lessen electron lens aberration. Of
course, in the first case, that means having a lower current reaching the sample
and thus darker images, but with modern detector it is usually not so much of an
issue. These techniques aim at decreasing the SEM spot size, but it is also possible
to decrease the interaction volume by lowering the velocity of the electrons and
working with lower beam energy. Ultimately using in the focusing section of the
column additional elements for performing beam deceleration can decrease the
energy of the electrons hitting the sample to about 50 eV, giving interaction depth
as small as a few nm for secondary emission.
Another feature that differentiates the SEM is that it presents a very large
depth of field, 10 to 100 times larger than what is typical with an optical microscope. Actually, for practical reason (focusing electrons is more complicated than
photons), the NA of the focusing column is very small for example, compare the
length of a SEM column to the frontal distance of a microscope and thus the
illuminating beam has a very small divergence, keeping its diameter constant over
a rather large distance, and hence the resolution of the SEM14 .
An annoying feature of the conventional SEM, which operates in high vacuum,
is that the specimen has to be electrically conductive or has to be coated with
a thin conductive layer (e.g., carbon, Au), which is usually done inside a small
sputter placed close to the tool. This avoids negative charge build-up on dielectric
sample (electron get trapped in defect) that would finally prevent primary elec13
There are more signals coming from the electron excited sample, like the Auger electron
studied with Auger electron spectroscopy (AES) or the photons from cathodaluminescence, that
can be used for sample analysis, but fall beyond the scope of this short introduction.
14
We may note that the term depth of field is inappropriate here as we dont use the electron
for imaging (except in the TEM), but only for very local illumination of the sample and then
look in many direction what comes out of it.
3.7. CHARACTERIZATION
125
3.7.3
The contact probe profilometry is a family of tools that are used for measuring
surface profile by simply measuring the vertical displacement of a sharp tip that
is scanned on the sample surface (Figure 3.36). The method is basically a line
measurement method, but it can be used for surface measurement by repeating
line scan in a raster like manner.
The stylus profilometer is the oldest of these tools, where a sharp metal tip is
pressed against the surface and scanned while its vertical movement is recorded
using magnetic sensing (LVDT). The user has only to set the sampling length
126
3.7. CHARACTERIZATION
127
1
l
|z(x) z|dx
0
128
where l is the sampling length and z = 1l 0 z(x)dx the average profile height. We
need to understand that there is no unique value of this parameter, as a surface is
mostly fractal, and using sharper and sharper tip will reveal more peak and valley.
Thus the roughness is measured for a certain application dictating the useful range
of spatial frequencies, and choosing the appropriate tool and data processing for
achieving this goal. The high frequency cut-off (or low wavelength cut-off) is given
by the tip radius and by the sampling period. Actually we can consider that the
radius of curvature of the tip gives a mechanical cut-off wavelength of about
the same value as such a tip of 2 m radius will allow to measure down to a
wavelength of 2 m (or a spatial frequency of 0.5 m1 ). The sampling period
obtained by dividing the sampling length by the number of sampling points gives
rise to a cut-off wavelength which according to Nyquist theorem is at about twice
its length. In practice, for recording roughness profile we would oversample the
surface and use a sampling length about 1/6th of the desired cut-off wavelength.
For example, for obtaining a cut-off only limited by the mechanical cut-off with
a tip radius of 2 m thus resulting in a low wavelength cut-off of about 2 m, we
should sample the surface every 0.35 m. At the other side of the spectrum, the
high wavelength cut-off (or low frequency cut-off) will be given by the sampling
length, and usually we would use 0.5 mm or so (i.e., spatial frequency of 2 mm1 ).
PROBLEMS
129
Problems
1. Plot using polar coordinate the Youngs modulus variation w.r.t. to direction
in the top surface plane for a < 110 > Silicon wafer in a way similar to what
was done for < 100 > and < 111 >-cut wafers in the inset p. 71.
2. An optical telecommunication devices manufacturer wants to use microfabrication to produce V-groove for holding optical fiber. The process that he
most likely use is:
silicon substrate, photoresist mask and HF etchant
glass substrate, chromium mask and RIE etching with SF6
silicon substrate, silicon nitride (Si3 N4 ) mask and KOH etchant
silicon substrate, silicon dioxide (SiO2 ) mask and HF etchant
Draw a cut-out view of all the different processes proposed before making
your choice
3. A mask has the pattern of Figure 3.38(a). Which photoresist and pattern
transfer technique could be used to pattern a thin film (in black) as shown
in Figure 3.38(b)?
4. A circular hole in silicon is closed by growing oxide. The hole has a diameter
of 2 m.
(a) Approximate the Deal and Groves equation when t is much larger than
and A2 /4B (long duration approximation).
(b) How long will it take to close the hole if the long duration oxidation is
performed at 1100 in wet O2 ? (Note: we have B = 0.51 m2 /h at
1100)
130
Shutter
Material
V1
Crucible
V2
Vent
V3
Roughing
pump
Diff.
pump
Backing
pump
Poly-Si
Channel
Glass
PROBLEMS
131
Membrane
Silicon
Doped silicon
SiO2
Si3N4
132
Chapter 4
MEMS technology
4.1
134
chips can use the best process without compromise and may achieve a better overall
yield. However compactness and reliability suffers from the additional elements
and the packaging becomes slightly more complicated. Moreover, the electronic is
somewhat further from the sensing element and the tiny wires used for connection
may introduce additional noise if the signal is small. It is this last argument that
has pushed AD to develop its fully integrated accelerometer range, the iMEMS.
IC chip
MEMS chip
Passive structures are used to support, guide, channel, etc providing indispensable building blocks for the realization of complete systems. Their main
role is to transport energy within the system.
Active structures (or transducers) are at the core of actuator and sensor
operation. Fundamentally there role is to allow a certain form of energy
transfer between the environment and the system.
4.2
4.2.1
135
Passive structures
Mechanical structures
d2 y
M
=
2
dz
EI
(4.1)
with E the Youngs modulus for the material of the beam and I
the second moment of inertia for the beam cross-section, which is
w F
given by I =
x2 dA = wh3 /12 for a beam with a rectangular h
cross-section as shown in the inset.
x
y
For the cantilever of the inset submitted to a point load normal
to the surface at its end, the moment is simply M (z) = F (L z) and the beam
equation becomes :
F
d2 y
= (L z)
2
dz
EI
we integrate twice w.r.t. z the equation and obtain:
y=
F 1 2 1 3
( Lz z + Az + B)
EI 2
6
dy
dz z=0
= 0, thus A = 0
giving finally
y=
F
EI
L 2 1 3
z z
2
6
F z2
(3L z)
6EI
The deflection at the end of the beam where the force is applied is thus
y(L) =
F L3
3EI
F
3EI
= 3
y(L)
L
136
L
y
L
F
Deflection
Cantilever
Clamped-guided beam
y=
y=
Clamped-clamped beam y =
F z2
(3L
6EI
Max Defl.
Spring
constant
z)
y(L) =
F L3
3EI
3EI
L3
2z 3 )
y(L) =
F L3
12EI
12EI
L3
F
(3Lz 2
12EI
F
(12Lz 2
192EI
16z 3 )
F L3
192EI
y(L/2) =
192EI
L3
Compliance
Buckling
Linearity
clamped-clamped
++
crab-leg
controllable
folded-beam
The internal moment is exactly balanced by an externally applied moment of opposite sign.
137
shuttle
anchor
shin
clamped-clamped
thigh
crab-leg
folded-beam
truss
138
L
y
L
q
Deflection
Cantilever
y=
Clampedclamped beam
y=
qz 2
(z 2
24EI
qz 2
(z 2
24EI
Max deflection
qL4
8EI
4Lz + 6L2 )
y(L) =
2Lz + L2 )
y(L/2) =
qL4
384EI
a
a
Type
Deflection
Round (Force)
Round (Pressure)
Max Defl.
F a2
16D
yC =
y=
3(1 2 )q
(a2
16Et3
r 2 )2
yC =
Spring
constant
k=
16D
a2
k=
Et3
F a2
qa4
64D
Square (Force)
yC =
F F a2
Et3
Square (Pressure)
yC =
P qa4
Et3
139
f
cantilever hinge
fork hinge
Figure 4.7: Flexure hinge and equivalent free hinge for (left) standard cantilever
hinge (right) improved fork hinge
neglected. For example for a round diaphragm with large deflection (assuming the
material remains in its elastic limit) we obtain the following non-linear characteristic equation to obtain the center deflection yC :
q = 64
D
D
yC + 4(7 ) 2 4 yC3
4
a
ta
Besides suspension, other mechanical function, like hinge or joint, are often
needed in MEMS. However the fundamental inability to miniaturize hinge because
of the low relative manufacturing accuracy evoked previously, forces designers to
often use flexible micro-joint instead. This joints will have excellent wear characteristic but they will usually restrict rotation. These flexure hinge may use a
simple cantilever or more complex beam arrangement. As we can see schematically
in (Figure 4.7), the fork hinge[26] has the advantage to present a larger rotation
angle for the same horizontal displacement than a standard cantilever beam with
the same stability (resistance to buckling). Of course, if the angle of rotation need
to be really large (> 20 ), the alternative is to use a free hinge as shown in
Figure 3.21, but the manufacturing complexity will increase substantially - and
the reliability will drop.
4.2.2
Distributed structure
Using lumped elements to represent continuous structure is clearly an approximation, and it requires good judgment to decide how a structure should be represented. For example, as we see in Figure 4.8, a beam can be represented by a
single lumped element, a spring, or we can also take into account the weight of
the beam itself and represent it by a lumped spring and a lumped mass. We could
even go further, and consider the material loss inside the beam (linked to atomic
rearrangement during deformation) and add to the model a dash pot to represent it as a viscous loss. The choice between the three representation illustrated
140
F
m
k
F
m
F
c
Figure 4.8: Modeling a continuous beam using different approximation with
lumped elements.
in Figure 4.8 will directly depend on an estimation of the dominant effect in the
complete structure. For example if a mass is connected at the tip of the bending
beam, and the mass is 20 times the weight of the beam itself, the inertia of the
beam can probably be ignored, and its mass forgotten in the model. Likewise, if
we place the beam in vacuum and damping due to air becomes negligibly small,
then material damping could become important and a dash pot could be added to
the beam model.
In the case of an elastic structure with a distributed mass, a method pioneered
by Lord Rayleigh can be used to estimate an equivalent mass for the moving
structure while keeping the spring constant obtained from beam or shell theory.
The Rayleigh method is based on the hypothesis that at resonance (where there
are no loss in the system) the maximum kinetic energy and the maximum potential
elastic energy are equal.
The method is best explained using an example and we will consider a shuttle mass with a clamped-clamped suspension, composed, by symmetry, of two
clamped-guided beams, as shown in Figure 4.9.
L
w
clamped-guided beam
shuttle
clamped-guided beam
Figure 4.9: A shuttle mass oscillating between two positions with a pair of clampedguided beams of length L, width w and thickness h as suspension.
We need first to estimate the kinetic energy in the structure when it is excited
sinusoidally at resonance with = 0 .
We consider one beam of the suspension and approximate the amplitude of its
141
resonant mode by using the expression of the static deflection with a force at the
end as given in Table 4.1.
F
(3Lz 2 2z 3 ) sin(t)
12EI
We take the derivative to obtain the velocity distribution along the beam as:
y(z, t) =
v(z, t) = y(z,
t) =
F
(3Lz 2 2z 3 ) cos(t)
12EI
The kinetic energy ( 21 mv 2 ) in the beam is then obtained by integrating over the
volume:
2
L
13
F
hwv 2 (z, t)dz = hwL7
Kb =
cos(t)
70
12EI
0
We will now turn our attention to the determination of the potential elastic
energy inside the beam.
The elastic energy density in pure bending (that is, we have = M y/I) for
isotropic materials is given by:
1 2
1 (M y/I)2
1
=
=
2
2E
2
E
Noting that he bending moment along the beam is given by
d2 y(z, t)
F
M (z, t) = EI
=
(6L 12z) sin(t),
dz 2
12
we can obtain the potential energy in the complete beam by integrating the energy
density over the beam volume:
1
1 (M y/I)2
dV =
dV
E
V 2
V 2
L
1
1
2 2
y 2 dA
=
M y dV =
M 2 dz
2
2
2EI V
2EI 0
A
L
L
1
1
=
M 2 dzI =
M 2 dz
2
2EI 0
2EI 0
2
L
1
F
=
(6L 12z) sin(t) dz
2EI 0
12
F 2 L3
=
sin2 (t)
24EI
At resonance = 0 we have a periodic transfer between kinetic and potential
energy (for non dissipative systems) thus the maximum of kinetic energy is equal
to the maximum of potential energy:
Ub =
max Kb = max Ub
13
hwL7
70
F
0
12EI
F 2 L3
=
24EI
142
and we get
70 12EI
26 hwL4
If the continuous beam can be represented by a massless spring and a punctual
mass at its end (lumped model), we have:
02 =
02 =
keff
meff
where keff the effective spring constant is taken as the spring constant of a single
clamped-guided beam as defined by beam theory and meff is obtained by identification with the previous formula.
Thus we have for the equivalent model of the beam a spring of stiffness
keff = kb =
12EI
L3
keff
26
26
= hwL = mb
2
0
70
70
where mb is the mass of the beam.
Thus the complete suspension we are considering composed of two equal springs and a central
kb
mshuttle
kb
mass can be represented by this equivalent model.
It means, for example, that its resonant frequency
meff
meff
could be estimated using 0 = k/m with k = 2kb
and m = mshuttle + 2 26
m
.
70 b
meff =
4.2.3
Fluidic structures
In microfluidic devices the ubiquitous passive element is the channel which is used
to transport fluids. In general the problem of fluid flow, even using simpler incompressible fluid, is rather complex as it is governed by the Navier-Stokes equation.
However, in some practical cases at micro-scale this equation can be simplified and
has even a few analytic solutions. The distinction between these cases is based on
a series of dimensionless coefficients, principally the Knudsen number and Mach
number (M a = u/cs ) for gas and the Reynolds number (Re = uL/) for gas and
liquid. The fluid properties are noted as u for the velocity of flow, cs the speed of
sound in the fluid, the density of the fluid, the dynamic viscosity ( = with
the kinematic viscosity) and L a characteristic dimension of the channel (usually
its width). Interestingly, for value of Mach number smaller than 0.3, a gas can be
considered as an incompressible fluid and use the same simplified equations as for
liquids for example in air the velocity of sound is approximately cs = 340m/s,
that is air can be considered an incompressible fluid for speed u < 100m/s.
In cases of microchannel at the low flow velocity usually observed in microdevices, the Reynolds number is small (< 1500) and the flow is laminar (that is, there
143
u(r) =
dp r02 r2
dx
4
u(y, z) =
dp 16a2
dx 3
Rectangular
and Q =
and Q =
dp 4ba3
dx 3
dp r04
dx 8
(1)
i1
2
i=1,3...
192a
5 b
i=1,3...
cosh(iz/2a)
cosh(ib/2a)
cos(iy/2a)
i3
tanh(ib/2a)
i5
Table 4.5: Fluid velocity and flow in channels with circular and rectangular crosssection (the channel is placed along x, p is the pressure, r0 the channel radius, a
the channel width and b its height).
is no turbulence). This complicates the tasks for producing mixers, as turbulence
is the most efficient way to mix two fluids, but in that case, the flow induced by a
difference of pressure (Poiseuilles flow) can be obtained analytically as shown in
Table 4.5 [18].
With the equations in the table it is possible to obtain the flow and the velocity
across the channel section according to the pressure drop and to the geometry.
However it is not always necessary to know them precisely
at any point across the channel and often it is enough to obtain
the average velocity u. The flow Q is related to the average
u
fluid velocity by:
A
Q = uA
(4.2)
where A is the channel cross-section area. In the case of pressure driven flow, the
average fluid velocity can simply be expressed as:
upf =
2Dh2
p
Ref L
(4.3)
where Ref is the product of the Reynolds number (Re) and f the friction factor.
This product can be computed from first principle for circular channel and we have
Ref = 64. Experience shows that this value needs correction for micro-channels,
where it is actually more in the 50 60 range.
If the channel is not circular it is still possible to use the same equation, but then
we need to use an effective diameter, which is called the hydraulic diameter. For
a channel of any cross-section we have:
Dh =
4A
Pwet
(4.4)
where Pwet is the wetted perimeter, that is, the length of the channel perimeter in
direct contact with the fluid. We note that for a circular channel of diameter D
we have Dh = 4A/Pwet = 4(D/2)2 /2(D/2) = D, as we would expect.
These equations will give the pressure drop along a channel, or the flow rate if the
pressure drop is known.
144
(4.5)
where sl , lg and sg are the surface tensions at the solid-liquid, the liquid-gas and
the solid-gas interfaces respectively, and is the contact angle. The contact angle
is used in particular to distinguish between hydrophobic and hydrophilic surfaces,
with the exact criterion being that the contact angle should be larger than 90 for
the former and lower for the later. However in general this concept is often used in
a less rigorous manner and describe relative value of different concepts as shown
in Table 4.6
Parameters
Hydrophobic surface
Hydrophilic surface
>90
Drop behavior
<90
Contact angle
high
low
Adhesiveness
poor
good
Wettability
poor
good
low
high
1
1
+
R1 R2
(4.6)
where R1 and R2 are the two radii of curvature of the interface (this surface is
three dimensional and for a circular channel we have R1 = R2 ).
Using these two equations we can find the force exerted on a liquid in a circular
micro-channel - also called a capillary. Actually in that case the pressure drop at
the liquid-gas interface simplifies to:
p =
2lg
r/cos
(4.7)
where r is the radius of the capillary. This resulting pressure difference can draw
liquid in narrow channel (where r is small, thus p large), even balancing the
145
gravitational force induced hydrostatic pressure (p = gh) and thus allowing the
liquid to rise in the capillary to the height h. This property of fluid to rise in
narrow channel made of material that they wet (actually the contact angle need to
be smaller than 90 for the force to pull the fluid) is called capillarity. Capillarity
makes it difficult to fill hydrophobic channels, as it tends to push the liquid back,
or to empty hydrophilic ones, where it pulls the liquid inside. This effect is more
pronounced for liquid with larger surface tension (Table 4.7), like water, which has
one of the largest surface tension of common material due to the hydrogen bond
between water molecules.
Liquid
Mercury
0.425
Sodium Chloride 6M
0.082
Water
0.072
0.070
Ethylene glycol
0.048
Isopropanol
0.023
Ethanol
0.022
Perfluorohexane
0.012
Table 4.7: Surface tension for selected fluids at 20 (the surface tension will
decrease with the temperature).
4.3
Sensor technology
Sensing is certainly a quality that we associate with living being. A stone does
not sense, but can a silicon circuit do it? Of course, the answer is yes, and MEMS
have increased tremendously the number of physical parameters that are sensed
by silicon.
Sensing can be formally defined by the ability to transform any form of energy present in the environment into energy inside a system. An example will
be to convert the air temperature (heat energy) to an electrical signal by using a
thermo-couple. At the heart of the sensor is the ability to perform the energy transformation, a process usually called transduction. MEMS sensors ability to measure
different parameters as pressure, acceleration, magnetic field, force, chemical concentration, etc is actually based on a limited number of transduction mechanisms
compatible with miniaturization : piezoresistive, capacitive, piezoelectric, and in
146
Measurand
Primary signal
Conditioning circuit
Piezoresistive
Stress
Resistance
Potentiometric, Bridge
Capacitive
Deformation
Capacitance
Permitivity
Bridge
Frequency converter
Piezoelectric
Stress
Charge
Inductive
Deformation
Reluctance
C-V converter
Bridge
Frequency converter
4.3.1
Piezoresistive sensing
The oldest MEMS sensor that gained huge popularity was the pressure sensor and
it was based on the piezoresistive effect. Piezoresistivity can be described by the
change of resistance of a material when it is submitted to stress. This effect is
known since the 19th century in metals, but it was only in the mid 1950s that it
was recognized that semiconductor and particularly silicon had huge piezoresistive
coefficient compared to metal[4]. The MEMS designer will then create piezoresitors
by doping locally silicon and place them where the stress variation is maximal,
for example, at the edge of a membrane in pressure sensor. Then by measuring
their resistance change he will be able to infer the stress which is related to the
deformation.
For converting the resistance change R in a voltage,
the potentiometric (voltage divider) configuration is the
R
simplest, and the voltage at the output is given by :
+
+
Vin
-
R+ D R
Vout
-
Vout =
R + R
Vin R
Vin
+
Vin
2R + R
2
2R
However, as we see, this configuration suffers from a strong offset ( V2in ), complicating the signal processing operation, and a relatively low sensitivity (s =
in
dVout /dR = V2R
). A better configuration is based on the Wheatstone bridge
circuit (Figure 4.10) that completely suppresses the offset and reach a better sensitivity in some configurations. The suppression of offset is obtained by balancing
147
the bridge, that is obtaining a null output voltage when the resistor has its nominal
value. This condition is obtained when the resistors in each branch of the bridge
have values verifying R1 R3 = R2 R4 which is automatically reached if the four
resistors are the same and R1 = R3 = R2 = R4 = R. From there, it is simple
to show that if there is one single variable resistor in the balanced bridge and if
R << R then
Vin
R.
Vout
4R
We see here that we have suppressed the offset completely allowing easy amplification further down the chain. Moreover, we can increase the sensitivity to make
it surpass the earlier voltage divider configuration. By positioning the variable
resistors with the configuration shown on the right (where there are four variable
resistors and where the change of resistance induced by the measurand on two of
the resistors is opposite to the change induced on the two other but with the same
magnitude), then we exactly have
Vout =
Vin
R
R
R
+
Vin
-
R
-Vout +
R+ D R
Vin
.
R
R- D R
+
Vin
-
R+ D R
-Vout +
R+ D R
R- D R
Figure 4.10: Resistors in a Wheastone bridge with (left) one variable resistor, or
(right) four variable resistors.
For resistors much longer than wide it is possible to write the relative change
of resistance as :
R
= l l + t t
R
where i is the piezoresistive coefficient and i the stress component respectively,
along the direction parallel to the current flow (l longitudinal) or perpendicular
to it (t transverse). However the anisotropy in silicon, and actually in most crystals, makes it difficult to obtain the piezoresistive coefficients. Actually, all the
148
s
l
st
4.3.2
Capacitive sensing
A
149
er
Capacitive sensing is a versatile sensing technique independent of the material used and it relies on the variation of capacitance appearing when the geometry of a
capacitor is changing. Capacitance is proportional to
C
0 r
A
g
where A is the area of the electrodes, g the distance between them and r the
permittivity of the material separating them (actually, for a plane capacitor as
shown above, the proportionality factor is about 1). A change in any of these
parameters will be measured as a change of capacitance and variation of each of
the three variables has been used in MEMS sensing.
For example, accelerometers have been based on a change in g or in A, whereas
chemical or humidity sensor may be based on a change of r .
If the dielectric in the capacitor is air, capacitive sensing is essentially independent
of temperature but contrary to piezoresistivity, capacitive sensing requires complex
readout electronics. Still the sensitivity of the method can be very large and, for
example, Analog Devices used for his range of accelerometer a comb capacitor
having a suspended electrode with varying gap. Measurement showed that the
integrated electronics circuit could resolve a change of the gap distance of only
20 pm, a mere 1/5th of the silicon inter-atomic distance.
The conversion from capacitance to voltage can be obtained by using a bridge
configuration but using AC voltage excitation. Then, instead of resistance as in
the Wheatstone bridge we would then consider complex impedance of capacitor
(ZC = jC) and do the same math. A remaining issue would be to detect the
amplitude of the signal (e.g., with a diode) but that principle would be sufficient
for may applications. A more evolved principle has been used in some capacitive accelerometer from Analog Devices that highlights the interest of differential
sensing and will be described in more details.
The designer have chosen to use the variation of the
gap g between the electrodes as it may provide large seng0-x
sitivity if the initial gap g0 is small enough. However,
+V0
Ct
because the electrodes thickness was limited as they used x
Cb
surface micromachining, the capacitance was small and 0
-V0
they connected many such capacitors in parallel using a
Vx
g0+x
fin-like structure with interpenetrated fingers for the mobile (rotor) and fixed (stator) parts.
If we zoom on a single rotor finger, we see that it is surrounded by two stator
fingers that are polarized with AC voltage V0 and each constituting the fixed
electrode of a variable capacitor where the rotor finger acts as the second movable
electrode. We notice that when the rotor moves (e.g. for an accelerometer, because
it is subjected to acceleration) one capacitance increases (decreasing gap) while the
150
Vp
Source
Cantilevered
gate
Drain
Fel
Vin
p silicon
Metal
n silicon
Oxide
151
other decrease (increasing gap). Individually each of this capacitance varies very
non-linearly with the position x and we have, for example for the top capacitor :
Ct =
0 r
A
=
g0 x
0 r
A
1
1
= C0
g0 1 x/g0
1 x/g0
1, and we get :
x
g0
Let now consider the difference between the top and the bottom capacitance :
C = Ct Cb =
= 2C0
0 r
g0 x
0 r
A
=
g0 + x
0 rA 2
g0
2x
x2
x/g0
1 (x/g0 )2
x
g0
1 as :
1+
x
g0
2C0
x
g0
The interest ? Well, first the sensitivity has increased by a factor of 2 between
the two cases, and besides the approximation is much less stringent, meaning
that the non-linearity will be decreased. To convince yourself of this last point,
imagine that the displacement x is 20% of the initial distance g0 it is not sure
that this case still qualify for x/g0 = 0.2
1... but clearly it would respect the
2
(x/g0 ) = 0.04
1 condition. Actually what we have done here is that we have
removed the first order error in the approximation and now the error between the
approximate (linear) formula and the exact formula is only of the second order.
However the last equation still depends on an approximation to be linear and
on many geometrical parameters (buried in C0 expression) that could make it
harder to go for industrialization. To overcome this issue the engineers came up
with a more complex demodulation scheme (Figure 4.12) than the simple bridge
configuration, that is based on using two AC excitation voltages of opposite sign
(frequency 1 MHz) and performing the required amplitude detection with a multiplier.
The voltage Vx of the rotor electrode is floating and its value is found by
considering the node there. Because of the buffer amplifier, no current goes out
of this node and thus the total charge on the connected capacitors has to be 0,
that is, Qt + Qb = 0, where t and b subscripts are used for the top and bottom
capacitors as previously. Thus, using the capacitor relationship Q = CV , we can
write
Qt + Qb = Ct (+V0 Vx ) + Cb (V0 Vx ) = 0
152
+
-
Cb
-V0
Ct Cb
Ct + Cb
Using the previous expression for the capacitance the voltage Vx is given by :
Vx = V0
Vx = V0
x/g0
2C0 1(x/g
2
0)
1
2C0 1(x/g
2
0)
= V0
x
g0
Although there is still dependence on one geometrical parameter g0 the C0 dependence is gone, and, more importantly, the expression is now fully linear and we
may use displacement x as large as we want without any approximation ! The
trick here has been to normalize the value of the difference by the average value
of the capacitance, removing the factor in front of the expression. It should be
noted that it is a general property and such principle could be used whenever a
non-linear element response needs to be linearized.
The remaining issue is that the signal amplitude still need to be retrieved, which
is performed by the multiplier after the buffer amplifier. Actually, by multiplying
the signal Vx with GV0 (G is the gain of the multiplier) we actually get
V = GV02
x
,
g0
thus the signal becomes continuous (V0 is a symmetric bipolar signal of amplitude
Va thus V02 is simply a continuous signal of amplitude Va2 ) with an amplitude
directly proportional to the displacement.
4.3.3
153
4.4
Actuator technology
Since the industrial revolution we have understood that machines can perform
tasks with more force and endurance than humans. Bulldozers moving around with
their huge engine and pushing big rocks with their powerful pneumatic actuators
are probably a good example of what big machines can do. But what will be the
function of a micro-sized actuator?
Force
Stroke
Efficiency
Manuf.
Gap-closing
Electrostatic Comb-drive
Bimorph
Heatuator
Shape memory
alloy (SMA)
Thermo-fluidic
Type
Electromagnetic
SDA
Piezoelectric
Thermal
154
are currently used to act on micro-object, typically one part of a MEMS device,
and generate forces in the micro to milli Newton range with a stroke from a few
m to several hundreds m. It would be interesting to have enough force and
stroke to allow actuator to help interface human and machine by providing force
feedback for example, but micro-actuators are still unable to do that properly.
Still a wide range of principles exists that would transform internal energy of a
system (usually electrical energy) to energy in the environment (in the case of
MEMS, generally mechanical energy). Sometimes the conversion from electric
energy to mechanical energy is direct but often another intermediate energy form
is used. For example, the heatuator, a form of thermal actuator, uses current to
generate heat which in turn becomes strain and displacement.
The MEMS actuators can be conveniently classified according to the origin of their
main energy form. In Table 4.9 we compare the most common MEMS actuators,
where Efficiency refers to the loss existing in the actuator conversion of electrical
energy to mechanical energy and Manuf. is the manufacturability or the easiness
of mass micro-fabrication.
4.4.1
Magnetic actuator
I
n
g /2
x
w
155
(nI)2
2w
0 A
g + 0 L/
From this equation it is clear that the force is non linear with the current, and
assuming a constant resistance for the coil, the force will also depend on the square
of the coil voltage.
Although this force does not scale very favorably, the possibility to increase the
current at small scale, because the heat can be dissipated more quickly, still allows producing relatively strong force. However the main difficulty preventing the
widespread use of this type of actuator in a MEMS component is the fabrication of the coil. In that case the most convincing approach proposed so far are
most probably those using a hybrid architecture, where the magnetic circuit is
fabricated using micro-fabrication but the coil is obtained with more conventional
techniques and later assembled with the MEMS part. Actually some design have
shown that the coil does not need to be microfabricated at all and can be placed
in the package, taking benefit of the long range action of the magnetic field.
Finally it should be noted that magnetic actuation can used in conjunction with
ferro-magnetic material to provide bistable actuator where two positions can be
maintained without power consumption. A permanent magnet placed in the package is used to maintain the magnetized ferro-magnetic material in place. Then,
when we send a current pulse of the right polarity in a coil wound around the
ferro-magnetic material we invert its magnetization and the actuator switch to its
second state. NTT has been producing since at least 1995 a fiber optic switch
based on a moving fiber with a ferro-nickel sleeve that has two stable positions in
front of two output fibers [15]. The device will consume power only during the
brief time where the current pulse is sent and can maintain its position for years.
4.4.2
Electrostatic actuator
A physical principle that leads itself well to integration with MEMS technology is
electrostatics. Actually by applying a potential difference between two electrodes,
they develop charges of opposite sign and start attracting each other through the
Coulombs force. This principle has known several applications among which,
the comb-drive actuator, the gap-closing actuator and the scratch drive actuator
are the most commonly used (Figure 4.14). To derive an expression of the force
developed by such actuator we will use the principle of virtual work. This principle
states that for energy conserving systems (no dissipation) the potential energy of
the system changes as the work of internal forces. This is another way of saying
that a system tends towards a state of minimal energy in mathematical terms it
says something like:
F = gradU
156
t
with grad = ( x
, y
, z
) a vector composed of the partial derivative along each
axis. We note the minus sign in front of the gradient, saying that the direction of
the force is opposite to the direction of increase (gradient is positive if the function
rises) of energy: that is, the force is leading the system toward a minimum of stored
energy.
In this case, as simple electrostatic systems are capacitors, the energy3 stored
in the system upon application of an external voltage is
1
UC = CV 2 .
2
Thus any variation of the internal (potential) energy of the capacitor due to some
change in its geometry can be attributed to the (virtual) work of a force. Thus the
force developed between the two electrodes becomes proportional to the gradient
(spatial derivative) of the energy :
F elec V 2 grad C.
This fundamentally shows that electrostatic actuators develop force non-linear
with voltage and proportional to the gradient of the capacitance. The V 2 depeng
stator
x
rotor
+
V
-
h
rotor
x
stator
comb-drive
gap-closing
vidt =
vdq =
vCdv = 0.5CV 2
157
This can be obtained in two ways: either the mean free path is increased by reducing the pressure (that is, at constant temperature, the density of gas molecules),
or alternatively the gap between the electrodes is reduced. At micro-scale it is
this second phenomena that occurs and allow for much higher electric field than
at large scale. This is the main contributor to the interest of electrostatic force
at microscale, making it able to pack almost as much energy in a finite volume
(energy density) than electromagnetic energy.
However avalanche effect is not the only contributor to conductivity and with
high enough field and even in vacuum electron emission will occur through other
mechanism (e.g., field emission at surface asperities) and finally result in arcing.
Experimental investigation of this effect has lead to the elaboration of modified
Paschens curves (Figure 4.15) showing an estimate of the breakdown voltage as
a function of the gap that take into consideration the Paschens curve and the
vacuum emission. These critical curve will vary with the gas used, its pressure,
600
Vacuum breakdown
Paschen's curve
500
400
Modified Paschen's curve
300
200
100
0
0
10
Gap [m]
15
20
Figure 4.15: Maximum admissible voltage in electrostatic actuators following modified Paschens law at 1 atm.
the electrodes material and the presence of asperities on the electrodes (including
at its edge). Still, we see that for obtaining maximum force there is little incentive
to get a gap below 2-3 m, and that a typical voltage of several 100 V may be used
for micro-electrodes. In practice test will be required to ascertain the real critical
voltage, but robust development will try to use gap of 4 m to avoid random effect
due to field emission while keeping the maximum voltage below 200 V.
The comb-drive actuator (Figure 4.14-left) was invented by W. Tang [16] at UC
Berkeley and it generally allows motion in the direction parallel to the finger length.
The capacitance can be obtained by considering each side of a finger behaves as
a parallel plate capacitor, giving for each finger a capacitance of C 2 0 hx/g.
Taking the gradient, the force produced by n fingers in the rotor is approximately
158
given by
Fcd n
h 2
V
g
where we see the expected dependence with the square of the voltage and notice
that it is independent of the displacement x. The proportionality factor is 0 , a
small quantity indeed, hinting to a small force generated per finger, in the order
of a few 10 nN. Of course the number of fingers can reach 100 or more and the
actuator aspect-ratio can be made larger (i.e., increase h/g) to increase its force
proportionally. This actuator has been used repeatedly in MEMS component, for
example in the original Analog Devices accelerometer or in the fiber optic switch
from Sercalo.
The origin of the force parallel to the electrodes
a
rotor
b
results from the unbalance of the Coulombs force
along some part of the finger. Actually the rotor
stator
charges located in (a) will experience a symmetrical
attraction resulting in an absence of net force, whereas, charges located in (b)
will, as a result of the unbalanced attraction, create a net force pulling the rotor
parallel to the stator. We note that the Coulombs force also results in a force
perpendicular to the surface, however the motion toward the stator is prevented
by the rotor suspension and by the balancing force of the stator finger placed on
the other side of the rotor finger.
The gap-closing actuator (Figure 4.14 center) actually makes use of this force
perpendicular to the electrodes surface and usually delivers a larger force (proportional to A). Actually the capacitance is now expressed as C 0 A/x, resulting
in a force again non linear with the applied voltage, but additionally depending
on the displacement x:
Fgc
A 2
V .
2x2
As this force is unidirectional (changing the voltage sign does not change the force
direction), a reversible actuator will need a restoring force to bring it back to its
original position. This is usually obtained with a spring (usually a bending beam)
that will be used to polarize and retain the rotor electrode as seen in Figure 4.16.
To find the rest position, we write the force equilibrium,
Fgc + k(g x) = 0
where k(g x) is the magnitude of the upward directed spring force. Thus we get
a third order equation relating the position x with the applied voltage V .
x3 gx +
A 2
V =0
2k
159
1
2V 2
arccos 1 2
3
Vpullin
1
2V 2
arccos 1 2
3
Vpullin
4
3
g
3
(4.8)
g
3
(4.9)
plus another root yielding x < 0 that is unphysical. The two physical roots of this
equation have been plotted in Figure 4.16. We note that instead of solving the third
order polynomial equation, we may plot V as a function of x for 0 < x < g. The
solution shows that, the rotor position can only be controlled over a limited range,
and actually one root corresponds to a completely unstable position. Actually,
when the voltage is increased the rotor slowly moved toward the stator electrode
but as soon as the rotor has moved by one third of the original gap width (g), snapin suddenly occurs and the rotor comes into contact with the stator (in the figure
we show two blocks in black used to prevent the contact between the electrodes
and a short-circuit, the position of these blocks will determine the pull-out voltage
when the voltage is decreased). The pull-in voltage is given by:
g
2g
3
R
Vpull-in
Vpullin =
8 kg 3
27 0 A
where g is the original gap width and k the rotor suspension spring constant.
This behavior can be advantageous if the actuator is used for bi-stable operation,
but as here, preventive measures should be taken to avoid electrodes short-circuit.
Actually, the actuator behind the Texas Instruments DLP is a gap-closing electrostatic actuator working in torsion with the two stable states position fixed by
resting posts. By biasing the actuator at a voltage in the middle of the hysteresis
curve, it needs only a small swing of voltage to allow a robust bi-stable actuation.
160
The scratch drive actuator (Figure 4.14 right) has been invented by T. Akiyama
[17] and although it is actuated by a varying electrostatic field, the friction force is
the real driving force. As we can see in the diagram, as the voltage is applied, the
electrostatic energy is stored in the SDA strain while its front part, the bushing,
bends. When the voltage is released, the strain energy tends to decrease and the
elasticity of the bushing returns it to its rest orientation producing displacement.
The main advantage of this actuator is that it is able to produce a rather large force
(100 N), which can be even increased by connecting multiple actuators together.
Actually the SDA has been used as an actuator in the 2D optical switch matrix
that was developed by Optical Micro Machines (OMM) and which received the
stringent Telcordia certification.
Electrostatics can also be used to move liquids. It is based on two phenomena:
electro-hydrodynamic which works with non-conductive fluids and electro-osmosis
which works with conductive fluids. Electro-osmosis pumps are of larger significance because biological fluids are actually solute with different ions (salts) and
are thus conductive.
In an electro-osmosis pump a stationary electric field is applied along a channel
and result in an overall motion of the conductive fluid. Although the liquid is
globally neutral, this motion can happen because of a so-called double layer of
charged particles near the walls of the channel. In general with conducting fluids
DV
- - - - - - +- - +- - +- - charged layer { + + + + + + + +
+ + + +- u +- ++ - + + neutral layer
+ - - + - - + - + + - - + +
+
charged layer { -+ - - +- -+ - +- +- +- -+ +- -+-+
u
u eof
(4.10)
161
where is the dielectric constant of the fluid and its viscosity. Actually the
proportionality constant between the velocity and the field is usually rather small
(eo = 0 r /, eo electro-osmotic mobility) and in practice the voltage used for
electro-osmosis flow across a 10 mm-long channel will be in the order of 1000 V.
4.4.3
Piezoelectric actuator
At the end of the 19th century Pierre and Jacques Curie discovered that certain
materials produce electrical charges at their surface when they are submitted to
an external force the so called direct piezoelectric effect. Reciprocally, when
these materials are subjected to an electric field they contract or expand the
so called converse piezoelectric effect. The materials themselves are called piezoelectric materials and are natural transducers between electrical and mechanical
domains. They have thus been used inside different mechanical sensors and actuators and are particularly interesting for micro-scale actuation because they have
a high power density, producing rather large force for small volume. However, the
deformation being induced in the rigid bulk of the material, the magnitude of the
deformation remains small requiring clever designs to obtain actuators with large
stroke. Fundamentally the origin of piezoelectricity is linked with the absence of
l
-
162
The piezoelectric effect is linear and for small deformation can be considered
to be independent of the compliance and the permittivity of the material. Thus
we can write the combination of piezoelectricity and elasticity as:
= S E + dt E
(4.11)
D = d + E
(4.12)
d
d
d
11 21 31
d =
d
0
1
d1
0
0
0
t
d =
d2
0
0
d2
0 2d1
with d1 = d11 = 2.3 1012 C/N and d2 = d14 = 0.67 1012 C/N. For zinc
oxide (ZnO), a material from the hexagonal class different from quartz that can
0
t
d =
d3
163
(A,B,C) system of coordinate:
0 d1
0 d1
0 d2
d3 0
0 0
0 0
with d1 = d11 = 5.4 1012 C/N, d2 = d33 = 11.6 1012 C/N and d3 = d42 =
11.3 1012 C/N.
In practice the most commonly used materials are artificial ceramic (e.g., Lead
Zirconate Titanate or PZT) which, although they present at nano-scale the right
lack of center of symmetry, are not naturally piezoelectric because they are polycrystalline. Actually, in each grain the random orientation of the polarization
cancels each other resulting in a lack of observable piezoelectric effect. For circumventing this problem, the materials is heated (close to the Curie temperature
where reorganization can easily occur) and submitted to a large electrical field, an
operation called poling, that orients the polarization of all the nano-crystallite in
the same direction. When the material is cooled down, most of this orientation is
preserved and the resulting material shows macroscopic piezoelectric effect. The
PZT crystal symmetry is the same than ZnO but the piezoelectric coefficient are
much larger making this crystal very interesting for lower voltage operation. We
have for poled PZT d1 = d11 = 123 1012 C/N, d2 = d33 = 289 1012 C/N and
d3 = d42 = 496 1012 C/N.
Besides the PZT it is possible to deposit by sputtering thin-film crystalline
material like AlN and ZnO that present interesting piezoelectric properties without
poling. Alternatively it is also possible in certain cases to work on piezoelectric
substrate like quartz, gallium arsenide (AsGa) or lithium niobate (LiNbO3 ). Most
of the time, the purpose of the actuator is to generate acoustic waves (vibration)
into the materials but they are rarely used to produce mechanical work.
Z
X
Y
V
V
V
t
l
w
Simple
Bimorph
164
XY
Y Z
ZX
0
0
5.4
0
0
11.6 12
10
=
0
11.3
0
V/t
11.3
0
0
0
0
0
5.4
Thus
XY
Y Z
ZX
1012 Vt
5.4
5.4 1012 Vt
11.6 1012 Vt
=
0
0
0
4.4.4
Thermal actuator
The thermal energy used by this class of MEMS actuator comes almost invariably
from the Joule effect when a current flows through a resistive element. These actuators are generally relatively strong and their main drawback is most probably
their speed, although at micro-scale the heat is quickly radiated away and operating frequency up to 1 kHz can be achieved.
Bimorph actuators are the most common type of thermal actuator. The bi-material
5
It will also happen along the X-axis but with much less bending as the beam is narrow.
165
0 0 d1
s s s 0 0
0
0
X
1 2 3
Y 0 0 d1
0 s2 s1 s3 0 0
0
0 0 0 d2
Z s3 s3 s4 0 0
0
+
0
=
XY 0 d3 0
XY 0 0 0 s5 0
0
V /t
Y Z d3 0 0
Y Z 0 0 0 0 s5
0
0 0 0
ZX
0 0 0 0 0 2(s1 s2 )
ZX
with s1 = 7.79 1012 Pa-1 , s2 = 3.63 1012 Pa-1 , s3 = 2.12 1012 Pa-1 , s4 =
6.28 1012 Pa-1 and s5 = 24.7 1012 Pa-1 . We use the two equations along X and
Y and write
0=s +s +d V
1 X
2 Y
1 t
0=s +s +d V
2 X
1 Y
1 t
166
d1 HW
H
=
V
2
s1 + s2 2
d1 s 1 3 2
LV
s1 + s2 H 2
n
V
k
167
DX
DY
DZ
= 0 0 0 d3
d1 d1 d2 0
d3
0
0
0
0
0
XY
Y Z
ZX
X 0
+ 0
Y
0 0
0 0
Z
V /t
The Gauss law expressed with the electric field displacement D states that
D ndA = Q the charge in the system. The electrodes are in (X,Y) parallel
planes, thus the normal to the electrodes is along Z and the Gauss integral simplifies as DZ W L = Q. The two other components of the electric displacement(DX
and DY ) do not contribute to the electrodes polarization. From the direct piezoelectric effect equation, we find that DZ = d1 X + d1 Y + Z Vt , thus using the
previous expression for the biaxial stress in the piezoelectric layer we obtain:
Q = DZ W L =
WL
d21 W L
+ Z
c1 + c2 t
t
There are two contributions to the capacitance of the electrodes: a first capacitance
term due to the charge appearing through the direct piezoelectric effect and the
biaxial stress in the piezoelectric layer, and a second capacitance term arising
from the charge displacement in the dielectric. This second term Z WtL is the
capacitance from the electric domain and is the C0 of the model. The first term
d21 W L
2 c1 +c
is actually the capacitance of the mechanical domain 1/k seen from the
2 t
electrical side. As an impedance connected on a transformer secondary is divided
by n2 when it is seen from the primary, the equality between these two capacitances
is written as:
d21 W L
n2
=2
k
s1 + s2 t
Thus taking the ration between this expression, and the expression found previously in the mechanical domain for n/k, we get an expression for n:
n=
2d1 W H 2
3s1 tL
2 s1 + s2 W H 2
3 s21
tL
Now, we turn our attention to the value of the effective mass m appearing in the
model that will be obtained by using Rayleighs method.
168
1
z 2 dV =
dA
z 2 dx = HW
2 V
2 A
2
0
1
= HW a2 L5 2 sin2 (t)
10
a2 x4 2 sin2 (t)dx
Kb =
d z
I
The bending moment in the beam is given by M (x, t) = EI dx
2 = 2 s a sin(t)
1
where the Youngs modulus along Z is taken as E = 1/s1 . We evaluate the elastic
energy as shown in Section 4.2.2 for a beam in pure bending:
1
s1 L 2
2
dV =
s1 (M z/I)2 dV =
M dx
2 V
2I 0
V E
s1 L I 2 2 2
W H3 2 2
(2 ) a sin (t)dx =
La sin (t)
2I 0
s1
6s1
Ub =
1
2
where we have used the fact that I = W H 3 /12. At resonance the amplitude of
kinetic and elastic energy are equal:
W H3 2
1
HW a2 L5 02 =
a L sin2 (t)
10
6s1
5H 2
02 =
3s1 L4
The effective mass is then given by:
k
m
2(s1 + s2 ) L3 W
m =
5s1
t
02 =
The last parameter of the model would be the loss c which is harder to evaluate
analytically without further hypothesis. In practice, we would estimate from past
experience
or measure the quality factor Q of the structure and set the value
of c = Qkm accordingly.
169
actuator, well known from the bimetallic version used in cheap temperature controller, and the heatuator (Figure 4.20) are both bending actuator where bending
is induced by a difference of strain in two members connected together.
a
ah
al
T0
T0
T1
Th
I>0
Tl
T1 >T0
bimorph
heactuator
E12 h41
4E1 E2 h31 h2
(4.13)
where Ei is the Youngs modulus for the two materials and hi their thickness.
Actually the difference in thermal expansion existing for materials deposited at
different temperatures (e.g., polysilicon and metal) makes any bilayer curl when
it is released at room temperature. This effect is often annoying, and if it can be
controlled to some extent, it is the main issue behind the use of bilayer actuator.
Actually, for such actuator, upon release the bilayer will curl (up if it has been well
designed!) and as its temperature changes (e.g. using Joules heating by flowing a
current) its radius of curvature evolves but it will be difficult to make it flat. Still,
the initial stress induced curvature has been put to good use to fabricate curling
beam that naturally protrude high above the surface of wafer in a permanent
way. They have been used to lift other MEMS structure (e.g. micro-assembly in
Figure 5.1) or micro-parts (e.g. conveying system).
The heatuator [19] does not have this problem as it uses a single material. This
simplifies the fabrication, and the difference in strain is obtained by maintaining
different temperature in the two arms. Actually as the current flow through the
actuator the wider cold arm will have a lower resistance and thus generate less
heat than the other narrow hot arm.
It should be noted that the force produced by these two actuators decreases with
the stroke: at maximum stroke all the force is used to bend the actuator and no
external force is produced. One heatuator can produce force in the 10 N range
and they can be connected together or made thicker to produce larger force.
170
PROBLEMS
171
Problems
1. Establish the expression of the spring constant of the folded beam suspension. You may want to consider the symmetry existing in the structure and
decompose it as a set of cantilever beams connected in series and in parallel.
2. We consider a micro-cantilever of length L, width w and thickness h bending
under its own weight.
What is the expression of the weight per unit of length of the cantilever
assuming the material has a density of ?
What is the general expression of the deflection at the tip of the cantilever?
What is the length of a 2 m thick silicon cantilever whose tip deflects
by 2 m? (Note: the density and Youngs modulus of silicon are =
2.33 103 kg/m3 and E = 106 GPa, the acceleration of gravity is g=9.81
m/s2 )
What are the practical implication of this deflection for the cantilever?
Force
(a)
(b)
Piezoresistor
(c)
(d)
172
2R
Chapter 5
MEMS packaging, assembly and
test
MEMS packaging, assembly and test collectively called back-end process problems are the aspects of the MEMS technology that even now remain the less
mature. Actually, although the bookshelves appear to be replete with books discussing all the aspects of MEMS technology, we had to wait until 2004 to finally
have a reference book really discussing these three issues with real MEMS examples [28]. However it is hard to stress enough how MEMS packaging and test are
important for obtaining a successful product at a low cost. Figure 5.1 shows two
real life examples of MEMS based micro-sensor where packaging, assembly and
test are a dominant part of the total cost.
Package
5%
Substrate
17%
Die
14%
Package
14%
Assembly
and test
64%
Assembly
and test
32%
Die
30%
ASIC
24%
(a)
(b)
Figure 5.1: Cost break-up in (a) a pressure sensor in plastic package (b) an accelerometer in surface-mount package
As a matter of fact, the choice taken for packaging and test may dictate how
to design the MEMS chip itself! This is in sharp contrast with micro-electronics
packaging where packaging is a somewhat independent activities than chip processing or design. On the contrary, in the MEMS case (no pun intended!), the
influence of the package on the microsystem behavior may be very significant.
173
174
Tf
Rs
Tf
Ts
Cs
Ts
Thus its transfer function is given by: TTfs = 1+1 s s with s = Rs Cs
However, when a sheath is placed around the sensing element it brings an additional thermal resistance (Rp ) and thermal capacitance (Cp ), and the sensor
becomes a second order system.
Tf
Ts
Tp
Tf
Rp
Cp
Tp
Rs
Cs
Ts
5.1. ASSEMBLY
175
The Example 5.1 shows clearly that early consideration of the packaging solution could lead to a successful product... while ignoring it could lead to a dramatic
failure.
The different operations that could appear in a somewhat complete MEMS
back-end process are presented in Figure 5.2, and we find the assembly, packaging
and testing steps. However as we stressed earlier, nothing is less typical than
a standard MEMS back-end process - and in practice, some steps may not be
present or their order be different.
Electrical
Test
In-Pack.
Assembly
Release
Wire
bonding
WL
Assembly
Sealing
Mounting
Dicing
Functional
Test
Calibration
Compens.
5.1 Assembly
5.2 Packaging
MEMS packaging, unlike the well-established and standardized IC packaging technology, is still largely an ad-hoc development. The main packaging efforts have
been conducted within MEMS manufacturing companies, and they have jealously
kept their secrets, packaging being considered, with reason, as the most difficult step to bring MEMS
to market.
Still, the purpose of packaging in MEMS is in many ways similar to IC, and
we can list a series of functionalities that should be brought by the package:
Support: provide a standard mechanical support for handling during assembly of the MEMS part into a system.
Protection: protect the chip from the environment (dust, stress, shock, moisture...). For MEMS the most important parameter to be controlled is often
stress.
Interfacing: bring signals in and out of the chip. For MEMS, signals will not
only be electrical but may be fluids, radiation, fields...
Heat removal: ensure the heat generated inside the chip is properly evacuated
to the environment (it is actually only a special kind of interfacing, but not
for a signal). Actually, this point is much less severe with MEMS than with
IC and is usually relatively easy to fulfill.
For MEMS, the need for protection from the environment is actually not only
for reliability (e.g. preventing corrosion of metal contacts) but it often serves to
[Figure: pop-up micro-mirror held in position by a blocking support (steps 1 and 2 of the self-assembly).]
The mirror is actually pushed above the surface of the wafer using beams curling
up permanently under stress gradient (see Section 4.4.4) and held in operating
position by an additional blocking structure. The figure shows the principle of the
device, which gets automatically assembled during the release etch: at first (1) the
narrow bilayer beams are freed, positioning the locking structure above the gimbal
mount ears; then (2) the wider bilayer beams are freed, curling up and pushing
the mirror against the blocking structure, which self-aligns using the V-shaped frames.
              Elec.  Fluid  Transp.  Hermetic  Stress  Heat   Thermal  Calib./
              port   port   wind.    encaps.   isol.   sink   isol.    Comp.
Pressure      yes    yes    no       maybe     yes     no     no       yes
Flow          yes    yes    no       no        no      no     yes      yes
Accel.        yes    no     no       yes       maybe   no     no       yes
Yaw-rate      yes    no     no       yes       maybe   no     no       yes
Sound         yes    yes    no       no        no      no     no       yes
Light         yes    no     yes      no        no      no     maybe    no
Temp.         yes    maybe  no       maybe     no      maybe  maybe    yes
IC            yes    no     no       moisture  no      maybe  no       no
Table 5.1: Packaging and testing requirements for some micro-sensors (adapted
from [27]).
It is obvious that the challenges brought by micro-sensor packaging are completely different from those encountered in electronics packaging. The necessity
for the measurand to reach the sensing element brings in transparent windows,
fluid ports, gas hermetic sealing, stress isolation... unheard of in the IC industry,
and unfortunately, in the IC packaging industry.
In fact, besides protection, the major hurdle often rests in interfacing to the
external environment. Actually this problem is diametrically opposed to the preceding point: interfacing requires us to open a way in (and out) through the protection to come close to the MEMS die. For inertial sensors, such as accelerometers
and gyroscopes, the packaging problem is not too severe because they can be fully
sealed and still sense the measurand they are to probe, provided they are rigidly
attached to the package. In that case, the use of a stress-relieving submount and a
bonded cap is all that's needed to be able to use a modified IC packaging procedure.
But this is for the simplest cases; for chemical and biological sensors, which must
be exposed to fluids, the task is much more complex and the package can represent
as much as 90% of the final cost. Actually, the diversity of issues encountered for
interfacing has for the moment received no standard solution and the packages
are then designed on a case-by-case basis. In many cases the package will condition the response of the MEMS, particularly in the case of micro-sensors, and the
package must be considered during design at the earliest possible stage. See for
example the gas sensor developed by Microsens in Figure 5.3. The package uses
a charcoal filter placed inside the cap for a very important function: decreasing
cross-sensitivity by letting only small gas molecules go through and reach
the sensing element. The time for the gas to diffuse through the filter mostly
determines the response of the sensor, which behaves as a first-order system with a time
constant in the order of 10 s, whereas the response of the sensing element (the
MEMS die) is shorter than 1 s. The package definitely has a dramatic effect on
the system response!
[Figure 5.3: Microsens gas sensor package (16 mm): nylon cap with nylon mesh, charcoal filter, metal mesh and metal cap above the sensor die.]
5.2.1 Encapsulation
Encapsulation is used to protect the MEMS from mechanical contact, dust, water
vapour or other gases that could affect the reliability of the MEMS. Encapsulation
is performed using plastic, ceramic or metal, with cost increasing in the same
order. For example, a pressure sensor from NovaSensor packaged in plastic may
be sold for less than US$5, while a steel-housed sensor with a metallic membrane
for harsh environments may exceed the US$100 mark! NovaSensor proposes
the three types of packaging: a plastic package for low-cost, low-performance
sensors, a ceramic package for compensated structures for medical applications, and
metal cases (either a standard TO-8 cap from the early IC industry modified for the purpose, or a
full metal package where the interface to the environment is a metallic membrane) for
corrosive or harmful media. But we should note that the MEMS sensing element
used in these three sensors is exactly the same - the value of the sensor is mostly
in the package!
[Figure: TO-package (TO-cap, MEMS die, TO-base, lead pins) and CERDIP-package (ceramic cap, low-temperature seal glass, ceramic base, IC die, MEMS die, lead frame).]
Figure 5.4: Box and lid package in metal (TO-header) and ceramic (CERDIP).
Table 5.2: Properties of some packaging materials.

Material                 CTE        Modulus (T/S)   Yield strength
                         ppm/K      GPa             MPa
Si                       4.1        150             N/A
Al (pure/alloy 1100)     23         69              35/60
Al (Si alloy)            6.5-13.5   100-130         -
Cu                       17         117             70
Ni                       13         207             148/359
In                       29         11              1.9/6.1
Alumina (Al2O3)          6.7        380             -
Kovar                    5.9        131             340
Borosilicate glass       4.5        65/26           29
Epoxy (pure)             60         2.4/0.9         54
As MEMS generally have mobile parts, the main difficulty of MEMS encapsulation is to avoid blocking this motion. This originally forbade the simple use of
injection molding after mounting on a lead frame, as in standard IC packaging.
The earliest ideas were to use an existing box-and-lid type of casing, like the older TO
series of packages used for transistors or the ceramic CERDIP package, as we see
in Figure 5.4. These packages have been adapted to the specific MEMS application;
for example we show an original Transistor Outline (TO) case modified
by welding a fluid entry port to the cap for use in a pressure sensor. Later
on, to reduce cost, injection-molded plastic box-and-lid cases were used, most notably in Motorola's consumer-grade range of pressure sensors. The CERDIP case
solves the second main issue generally encountered in MEMS packaging: thermal
stress. Many MEMS devices are actually stress sensitive, and besides the obvious
piezoresistive type of sensors, even DLP chips or fluidic micro-valves would have
issues if too much stress is induced during the encapsulation step or during device use. As such, the choice of materials is often tied to the coefficient of thermal
expansion (CTE) with respect to silicon. The alumina used in the CERDIP package
has a CTE close to the CTE of silicon and thus won't introduce too much stress
change as the device operates in its environment.
Another example of matched CTE in MEMS packaging is given by the DLP.
Actually the glass window above the chip is made of borosilicate glass that is
bonded to a Kovar lid. A rapid look at Table 5.2 shows that this choice of material
is well thought out: both materials have a matching CTE which is close to the CTE of
silicon!
However, matching the CTE is not the only way a material may be suitable for
MEMS packaging - particularly for a material that is in contact with the MEMS
die. Actually some polymers, although they exhibit a very large CTE and will
experience large thermal expansion, can still be used for packaging. To understand
that, we note that the stress induced by thermal expansion is proportional to the
temperature change $\Delta T$ and the CTE $\alpha$, but also to the Young's modulus E:

$\sigma_T \propto \alpha E \Delta T$

As polymers usually have a small Young's modulus, they won't exert much force on
the silicon die, as they will deform readily and absorb much of the induced strain.
In fact soft polymers are often used as a stress-relieving buffer to attach the die to
the casing, for example.
More surprisingly, some materials that possess a high CTE and a relatively high
Young's modulus can still be used in packaging. The best example is given by
indium. This metal can be used for solder or as a paste for die attachment because,
although it has a relatively large CTE and Young's modulus, its yield strength
is very low. Thus, as the thermal strain changes, the alloy will quickly deform
plastically and again won't induce any excessive stress on the MEMS die.
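The interplay between CTE mismatch, modulus and yield strength can be illustrated with a small numerical sketch. The example below is not from the book: it uses values loosely taken from Table 5.2 and an assumed temperature excursion, estimating the thermally induced stress as $\sigma \approx E\,(\alpha - \alpha_{Si})\,\Delta T$ and capping it at the yield strength to mimic plastic flow.

```python
# Rough estimate of the stress transmitted to a silicon die by different
# attachment/packaging materials (values approximate, for illustration only).
ALPHA_SI = 4.1e-6            # CTE of silicon [1/K], value used in Table 5.2
materials = {
    #  name          E [GPa]   CTE [1/K]   yield strength [MPa]
    "Kovar":        (131.0,    5.9e-6,     340.0),
    "Indium":       (11.0,     29e-6,      1.9),
    "Epoxy (soft)": (2.4,      60e-6,      54.0),
}
delta_T = 80.0               # assumed temperature excursion [K]

for name, (E_gpa, alpha, yield_mpa) in materials.items():
    sigma = E_gpa * 1e3 * (alpha - ALPHA_SI) * delta_T   # elastic estimate [MPa]
    capped = min(abs(sigma), yield_mpa)                  # plastic flow caps the stress
    print(f"{name:12s}: elastic estimate {sigma:7.1f} MPa, transmitted at most ~{capped:6.1f} MPa")
```

Under these assumptions indium, despite its large CTE, transmits less than 2 MPa because it yields, while the soft epoxy stays low simply because of its small modulus - the two mechanisms described above.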
If MEMS encapsulation is still too often an ad-hoc development, some strategies
are maturing to keep as much as possible of the cost advantage brought by batch
fabrication. As such, more and more MEMS devices use first-level encapsulation,
where a glass or silicon wafer is bonded to the chip, helping to maintain the MEMS
integrity during dicing and further mounting in the package. In the packaging
with wafer bonding technique (cf. Sec. 3.3.3) the cap wafer is first patterned
with a simple cavity, or even a hole if an access to the environment is needed.
Then, alignment of the MEMS wafer and the cap wafer brings the cavity in front
of the MEMS part before the bonding is finally performed (Figure 5.5). This results in a wafer-level capped
component. Actually, after dicing the wafer into chips, the capped MEMS is rather
sturdy and can be processed using standard IC packaging procedures.
The first step, shown in Figure 5.6, consists in placing each die on a lead frame,
a long strip of identical metal structures punched from a thin foil. The lead frame
is used as a support for the die and to obtain electrical connections that can be
soldered on a printed circuit board (PCB). The dies are first glued on each of
the die bonding sites using polymer or indium. Then electrical wiring is done to
connect the pads on the MEMS chip to the lead frame contacts, which reroute them
to the package leads. This part of the process is serial in nature, but can benefit
heavily from automation (pick-and-place and wire-bonding machines) as it is a
simple task, making it surprisingly cost effective.
[Figure 5.6: lead frame strip, chip mounting and chip wiring.]
5.2.2 Hermetic encapsulation
Unit          0 °C (mol/s)   20 °C (mol/s)   25 °C (mol/s)   300 K (mol/s)
1 atm cc/sec  4.46 × 10⁻⁵    4.16 × 10⁻⁵     4.09 × 10⁻⁵     4.06 × 10⁻⁵
1 Pa m³/s     4.40 × 10⁻⁴    4.10 × 10⁻⁴     4.03 × 10⁻⁴     4.01 × 10⁻⁴
1 mbar l/s    4.40 × 10⁻⁵    4.10 × 10⁻⁵     4.03 × 10⁻⁵     4.01 × 10⁻⁵
Table 5.3: Conversion of flow units between customary units and mol/s for different
temperatures.
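These conversions follow directly from the ideal gas law, $n = PV/RT$. The following minimal sketch (not from the book) reproduces the orders of magnitude of Table 5.3:

```python
# Convert customary leak-rate (throughput) units to mol/s using n = PV/RT.
R = 8.314  # gas constant [J/(mol K)]

def flow_to_mol_per_s(p_pa, v_m3, temp_k):
    """Amount of gas [mol] contained in a throughput of p_pa * v_m3 per second."""
    return p_pa * v_m3 / (R * temp_k)

units = (("1 atm cc/sec", 101325.0, 1e-6),
         ("1 Pa m3/s",    1.0,      1.0),
         ("1 mbar l/s",   100.0,    1e-3))
for label, p, v in units:
    for temp in (273.15, 293.15, 298.15, 300.0):
        print(f"{label:12s} at {temp:6.2f} K = {flow_to_mol_per_s(p, v, temp):.2e} mol/s")
```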
A fully hermetic package, that is, a package without any exchange with the environment for any period of time, hardly exists at all. In practice, given enough
time, some gases will creep through defects by a diffusion-like process and
reach the inside of the package. Of course the existence of leaks, as can be found
at seal interfaces, makes things worse, but plain materials without cracks will
anyhow let fluids seep in through a process called permeation. Then the choice of
the material is crucial to obtain a good hermetic package, as the permeation of gas
and moisture through the material itself will be the ultimate limit to the leak rate
in any package.
Actually, the flow of gas $\dot{Q}_M$ through a barrier made of a certain material can
be linearized and, using the unit of mole per unit of time ($\dot{Q}_M = \frac{\partial N_M}{\partial t}$), given the
form:

$\dot{Q}_M = P_0 \frac{A\,\Delta P}{d}$

where $P_0$ is the intrinsic permeability of the material, A the exposed surface, $\Delta P$
the pressure difference between both sides, and d the barrier thickness. There is
no standard unit for $P_0$ and it mostly changes with the pressure unit used¹;
we use mol s⁻¹ m⁻¹ atm⁻¹. The expression of the moisture evolution with time
in a package of volume $V_{in}$ is obtained by first recognizing, in the case of a closed
volume, the relationship between the flow and the pressure inside the volume using
the ideal gas law ($PV = N_M RT$),

$\dot{Q}_M = \frac{dN_M}{dt} = \frac{V_{in}}{RT}\frac{dP_{in}}{dt}$

then we use the definition of the permeability (and the fact that $P_{out}$ is constant)
to obtain,

$\frac{dP_{in}}{dt} = \frac{RT}{V_{in}}\, P_0 \frac{A}{d}\,(P_{out} - P_{in}).$
¹ We express the flow of matter in mol/s but the literature often reports a mass flow in kg/s
instead. Divide the kg/s value by the gas molar mass (e.g. 0.018 kg/mol for water) to obtain
mol/s.
Integrating with a constant outside pressure and $P_{in}(0) = 0$ gives
$P_{in}(t) = P_{out}\left(1 - e^{-\frac{RT}{V_{in}} P_0 \frac{A}{d} t}\right)$.
To get some general information out of this equation with a large number
of parameters, we need to make a few assumptions. We will consider the time
($t = t_{50}$) it takes in a cubic (spherical) box of side (diameter) a (i.e., in both cubic
and spherical cases $V_{in}/A = a/6$) for the water vapour pressure to reach 50% of
the outside pressure. Thus using $P_{in} = 0.5\,P_{out}$ we get an expression relating the
permeability with other environmental and package parameters:

$\log d - \log t_{50} + \log\frac{\ln 2\; a}{6RT} = \log P_0$

Here we have used the ideal gas law written as

$P = \frac{N_M}{V}RT = n_M RT \qquad (5.1)$

where $N_M$ is the number of moles of gas molecules, $n_M$ the molar density (in mol/m³)
and R = 8.31 J mol⁻¹ K⁻¹.
[Figure 5.8: log-log chart of barrier thickness (from about 10⁻⁶ m to 10⁻¹ m) versus protection duration (from 1 min to 100 yr) for different families of barrier materials (silicones, epoxies, fluorocarbons, parylene, glasses, metals...).]
Figure 5.8: Required barrier permeability and thickness for reaching a specific duration of protection (defined as the time it takes for the moisture pressure inside
the package to reach 50% of the outside pressure at T = 25 °C in a cavity with side
a = 1 mm).
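The estimate behind Figure 5.8 can be evaluated directly. The sketch below (not from the book) computes $t_{50} = \ln 2\,\frac{a\,d}{6RT P_0}$ for a 1 mm cavity; the permeability values are assumed orders of magnitude, only meant to show the enormous spread between polymers and glasses.

```python
# Protection time of a 1 mm cavity behind a permeable barrier (moisture ingress).
import math

R = 8.314          # J/(mol K)
T = 298.0          # K
a = 1e-3           # cavity side [m]

def t50(d_m, p0_atm):
    """Time [s] for the inside vapour pressure to reach 50% of the outside pressure."""
    p0_si = p0_atm / 101325.0              # convert mol s^-1 m^-1 atm^-1 to per-Pa
    tau = a * d_m / (6 * R * T * p0_si)    # time constant of the exponential rise
    return math.log(2) * tau

barriers = {                                # thickness [m], permeability [mol s^-1 m^-1 atm^-1] (assumed)
    "epoxy, 1 mm":   (1e-3, 1e-11),
    "glass, 0.1 mm": (1e-4, 1e-16),
}
for name, (d, p0) in barriers.items():
    print(f"{name:14s}: t50 = {t50(d, p0)/86400:.2g} days")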
But what happens now if there is an aperture in the barrier? In that case,
on average, the number of molecules entering the aperture on the high-density
(i.e. pressure) side will be larger than on the lower-density (i.e. pressure) side,
resulting in a net flow of molecules from high pressure to low pressure. This flow
is then simply governed by the gas molar density difference $n_{M2} - n_{M1}$, and can
be written in the form

$\dot{Q}_{M1} = \frac{\partial N_{M1}}{\partial t} = C\,(n_{M2} - n_{M1})$

where C is expressed in m³/s and is the conductance of the channel, which takes
into account the probability that a molecule goes through the aperture and does not
come back.
The conductance depends on the flow regime, which in turn depends on the
dominant collision mode for the gas molecules flowing through the aperture: are
the wall collisions or the inter-molecule collisions dominating? To estimate this
effect we compare the width of the channel d with the gas molecules' mean free
path λ. At one end of the regime (λ/d < 0.01: large channel), as the wall
contact is infrequent and the inter-molecule collisions dominant, we have viscous
Poiseuille flow (cf. Sec. 4.2.3). Then, as the interaction with the wall becomes
dominant (λ/d > 1: narrow channel) we observe molecular and diffusion flow.
The equations governing these different types of flow show that for a long channel
of diameter d, Poiseuille flow varies as d⁴ (cf. Eq. 4.3), molecular flow as d³
and diffusion flow as d², which could allow one to find experimentally which regime is
dominant in a particular case.
But for small hermetic packages only the study of fine leaks is of interest (larger
leaks will change the pressure inside the package very rapidly) and we can reasonably
suppose that the molecular flow regime is the dominant flow regime². The channel
conductance then takes a simple form, as shown by Knudsen, and the equation
governing the flow becomes³:

$\dot{Q}_{M1} = \frac{F_m}{\sqrt{M}}\left(\sqrt{T_2}\,n_{M2} - \sqrt{T_1}\,n_{M1}\right) \qquad (5.2)$

where we have left the opportunity for the temperature to be different on both
sides of the leak. In the case where the temperature is equal, the equation can be
simplified further as:

$\dot{Q}_{M1} = F_m\sqrt{\frac{T}{M}}\,(n_{M2} - n_{M1}) \qquad (5.3)$

where $\dot{Q}_{M1}$ is the molar flow in region 1, counted positive if it enters the region, $F_m$
is the molecular conductance of the leak (which for a single ideal circular channel
of diameter d and length L is given by $F_m = \sqrt{\pi R/18}\;d^3/L$), T the absolute
temperature equal on both sides, M the molecular mass of the gas and $n_{Mi}$ the
molar density in region i. The conductance of the conduit (also called the standard
or true leak rate), for molecular flow, is then defined as:

$C = F_m\sqrt{\frac{T}{M}}$
We note that the conductance depends on the temperature (and that this definition assumes equal temperature on both sides of the leak), but if it is known
for one gas, it can be obtained for any other gas, provided its molecular mass
is known. Helium, having the smallest molecular mass after hydrogen, will
leak faster than most other gases and, at the same temperature, the ratio of the
conductances is given by $C/C_{He} = \sqrt{M_{He}/M}$. Table 5.4 gives this ratio for some
commonly encountered gases. The air value is for rough computation, as the molar
density (or partial pressure) of each component of air should be used with its
corresponding leak rate to estimate the effect of the leak (78% N₂, 21% O₂...).
Actually, because of the difference in gas conductance, the leaked air will have
² Diffusion flow will be dominant over molecular flow in the case where L ≪ d, that is, a short
conduit between the two regions, which is not a common case in encapsulation.
³ We use here molar density instead of pressure as it makes the theory more sound and removes
some ambiguities in existing derivations.
Gas                            H₂      He     N₂      O₂      H₂O     Air
Conductance ratio Cgas/CHe     1.414   1      0.3779  0.3536  0.4714  0.3804
Table 5.4: Conductance ratio for different gases with respect to He.
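The ratios follow from the $1/\sqrt{M}$ dependence of the molecular-flow conductance. A minimal sketch (not from the book) computing them is given below; small differences with Table 5.4 simply reflect the molar masses used (for example for air).

```python
# Conductance of a molecular-flow leak for different gases, relative to helium.
from math import sqrt

M = {"H2": 2.016, "He": 4.003, "N2": 28.01, "O2": 32.00, "H2O": 18.02, "Air": 28.96}

def conductance_ratio(gas):
    """C_gas / C_He = sqrt(M_He / M_gas) at equal temperature."""
    return sqrt(M["He"] / M[gas])

for gas in ("H2", "He", "N2", "O2", "H2O", "Air"):
    print(f"{gas:3s}: C/C_He = {conductance_ratio(gas):.4f}")
```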
a different composition than normal air (N₂ will leak faster than O₂) until the
molar densities equilibrate⁴. The molar leak rate $\dot{Q}_M$ is easily computed if the
molar density difference is constant, but actually for a closed package a gas leaking inside will gradually increase the molar density (or lower it if it leaks outside),
continuously changing the leak rate, until the molar density on both sides becomes
equal (or, more exactly, the product $n_M\sqrt{T}$, when the temperature is different on
both sides).
[Margin figure: a package of volume V with inside molar density $n_{M1}$, outside molar density $n_{M2}$ and leak flow $\dot{Q}_{M1}$.]
If we consider the standard situation for a package where the molar density (and
pressure) of the gas in the environment is unaffected by what can leak from the package
(i.e., $n_{M2} = n_{M20}$ is constant), and by using the volume V of the package to relate
the mole number to the molar density, $n_{M1} = N_{M1}/V$, we can solve the flow equation
Eq. 5.3 and obtain the evolution with time of the difference in molar density inside and
outside the package:

$n_{M2} - n_{M1} = n_{M20} - n_{M1} = (n_{M20} - n_{M10})\,e^{-\frac{C}{V}t}$

or, equivalently, $n_{M1} = n_{M20} + (n_{M10} - n_{M20})\,e^{-\frac{C}{V}t}$. Then the molar leak rate is
obtained as:

$\dot{Q}_{M1} = C\,(n_{M2} - n_{M1}) = C\,(n_{M20} - n_{M10})\,e^{-\frac{C}{V}t}$
The molar density evolution equation can be converted to use the partial pressure in the package by applying Eq. 5.1 ($n_i = p_i/RT_i$), giving:

$p_1 = p_2 + (p_{10} - p_2)\,e^{-\frac{C}{V}t}$
This last equation is the one generally obtained by following the standard derivation of the molecular flow theory; however, it is only valid if the temperature inside
and outside the package is the same (which is not necessarily the case, as heat is
usually generated inside the package). Actually, we see above in Eq. 5.2 that if the
temperature is different, the variable of interest becomes $n_M\sqrt{T}$ and the pressure
cannot simply be used anymore.
Typically, the equation is used with two different initial conditions:
- the package is in vacuum at t = 0 and air slowly leaks inside; then we have: $p_{pack} = p_{air}\left(1 - e^{-\frac{C}{V}t}\right)$
- the package is pressurized with air at pressure $p_0$ at t = 0 and leaks outside; we then have: $p_{pack} = p_{air} + (p_0 - p_{air})\,e^{-\frac{C}{V}t}$
Of course, other situations will involve more complex behavior, such as, for example,
when He is used to pressurize the chip: as He escapes through the leak, air will
try to enter the package (the partial pressure of nitrogen or oxygen is 0 inside) but will be impeded in its inward flow by the out-flowing He, modifying the results
seen above.
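For a feel of the numbers, the sketch below (not from the book) evaluates the two standard cases for a small package; the cavity volume and, above all, the leak conductance are assumptions chosen for illustration.

```python
# Pressure evolution inside a package of volume V with a fine leak of conductance C.
import math

V = 1e-9            # cavity volume: 1 mm^3 [m^3] (assumed)
C = 1e-16           # true leak conductance [m^3/s] (assumed, roughly 1e-10 mbar l/s)
p_air = 101325.0    # ambient pressure [Pa]

def p_vacuum_package(t):
    """Package sealed under vacuum at t = 0, air leaking in."""
    return p_air * (1.0 - math.exp(-C * t / V))

def p_pressurized_package(t, p0=2 * 101325.0):
    """Package sealed with overpressure p0 at t = 0, leaking out."""
    return p_air + (p0 - p_air) * math.exp(-C * t / V)

year = 365.25 * 24 * 3600
for years in (0.1, 1.0, 10.0):
    t = years * year
    print(f"after {years:4.1f} yr: vacuum package {p_vacuum_package(t)/1e3:7.2f} kPa, "
          f"pressurized package {p_pressurized_package(t)/1e3:7.2f} kPa")
```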
To maintain tight hermeticity the best method is probably to use wafer bonding
technologies with limited permeability to gas, like glass to silicon anodic bonding
or metal to silicon eutectic bonding (cf. 3.3.3).
However, not all MEMS can be treated in this way; for example the Texas
Instruments DLP's packaging is more complex because the tiny mirrors would not
survive a harsh elevated-temperature treatment such as glass bonding. Thus, a full
chip-by-chip hermetic package in metal with a transparent glass window had to be
designed. The package is sealed using a brazed metal can in a clean room under a
dry nitrogen atmosphere with some helium to help check leaks, and incorporates
strips of getter material, a special material for removing the last traces of humidity
(Fig. 5.9). The getter is a material with a high porosity that will react with the
[Figure: metal cap, glass window, DLP die, getter strip and seal on a ceramic base with an Al heat sink.]
Figure 5.9: Schematic of TI's DLP package with hermetic encapsulation and getter.
target gas (usually water vapour, but oxygen or other gases can also be targeted)
forming a solid compound at the getter surface. In general a special trick is
used to activate the getter after the package is closed (heating it above a certain
$n^{N_2}_{M1} = n^{N_2}_{M20} + \left(n^{N_2}_{M10} - n^{N_2}_{M20}\right)e^{-\frac{C_{N_2}}{V}t}$

$n^{O_2}_{M1} = n^{O_2}_{M20}\left(1 - e^{-\frac{C_{O_2}}{V}t}\right)$

Finally, the total pressure inside the package is the sum of the partial pressures,
$p_1 = n^{N_2}_{M1}RT + n^{O_2}_{M1}RT$, and is shown in the Figure. Clearly, after 1 year (i.e.
31.536 × 10⁶ s) the pressure, and the gas composition, inside the package will be
the same as ambient air: the leak is way too big.
temperature, for example) so that the getter does not get quickly saturated in
the open air during the packaging operation. Getters can be deposited by PVD or
CVD in thin films or pasted in the package, but they have a finite size and thus
will work best for a finite amount of gas molecules: in general the trace of gas
adsorbed on the inner surface of the package. In the longer run, they will retard
the degradation due to the permeation of water vapour, but even very small leaks
will saturate the getter rapidly.
In view of the complexity - and cost - of fully hermetic packaging, it is fortunate
that not all MEMS need such packages, and at the end of the 1990s Ken Gileo and
others introduced a new concept: near-hermetic packaging. If this concept resists a
formal definition (what leak rate defines quasi-hermeticity?), a heuristic definition
would say that the package should be good enough for the MEMS operation.
Accordingly, relaxing the constraint on the hermeticity (possibly by supplementing
it with a getter) opens up the range of techniques that can be used; for example
polymer encapsulation, or wafer bonding using solder bonding or even polymer
bonding, would generally be good enough while allowing a much simpler bonding
procedure (as, for example, the flatness requirement with a relatively thick solder
paste is heavily relaxed compared to what is needed for anodic or, even worse, for
fusion bonding).
Finally, for hermetic or quasi-hermetic packages the remaining question is
how to test the hermeticity. Gross leaks can easily be detected by a simple bubble
test where the package is heated and immersed in a liquid: if there is a gross leak
the heated gas will escape through the leak and form bubbles. But this test
rarely makes sense in MEMS packaging, where gross leaks could be spotted during
visual inspection. For fine leaks, the standard test is the He-leak test using what is
known as the bombing test. Here, the package is first placed in a high-pressure
chamber with helium for some time to force the gas into the package. Then the
package is taken out of the high-pressure chamber and the He leak rate is measured
with a calibrated mass spectrometer. This procedure is able to measure fine leak
rates in packages with a volume of a fraction of a cm³ down to a leak rate of about
5 × 10⁻¹⁷ mol/s (or 10⁻¹² mbar·l/s) for the best leak-rate detection systems. Still, in general MEMS packages are too small (and the
amount of He too little) to use the test directly on real packages, and special test beds
(using the same material and bonding technique but having a larger cavity) need to
be built to perform this test. More advanced techniques will use the measurement
of the Q factor of a mechanical resonator microfabricated on the chip itself. As
the pressure inside the package increases (when it is left in an ambient at normal
pressure, or at higher pressure for accelerated tests), the Q factor of the resonator
decreases, allowing the leak rate to be estimated. The advantage of this technique is
that it is sensitive enough to measure the leak directly on the MEMS packages
with their actual dimensions.
5.2.3 Electrical feedthrough
The main technique used for connecting the die to the contacts on the lead frame
(cf. Figure 5.6), or to other dies in the case of in-package assembly, is wire bonding.
Originally developed as thermocompression gold-to-gold bonding (still used for
wafer-to-wafer bonding, cf. Sec. 3.3.3), it evolved to take benefit from ultrasonic
force. We can now distinguish 3 different types of techniques (Table 5.5), with the
dominant one in IC manufacturing being thermosonic bonding, while ultrasonic
bonding is more often used for MEMS because of its low process temperature,
although the high ultrasonic energy may pose problems to mobile mechanical parts.
Table 5.5: Comparison of wire-bonding techniques.

Technique           Pressure   Temp.        US    Mat.                  Type
Thermocompression   High       300-500 °C   No    Au/Au, Au/Al          B-W, W-W
Ultrasonic          Low        25 °C        Yes   Au/Al, Au/Au, Al/Al   B-W, W-W
Thermosonic         Low        100-150 °C   Yes   Au/Au, Au/Al          -
to connect the chip to a printed circuit board (PCB) so that it can be assembled
in a complex system. This feature can be obtained by using the ball bonding
technology used in flip-chip integration, which we have seen in Figure 4.2. The
solder balls can be used to interconnect the IC and the MEMS chip, but also the
resulting stack of chips to the PCB, a step forward in 3D packaging techniques.
The advantage of solder ball bonding is that it is performed in batch, as the balls
will solder the two wafers after they have simply been aligned in contact with a
pad and heated in an oven. Besides, as we can see in Figure 5.12, the deposition
of solder balls, also called bumping, can be performed at wafer level in batch. In
this case an under-bump metalization (UBM) (for example a TiW/Au bi-layer,
150 nm/300 nm) is first sputtered, then the solder is electroplated in a resist mold
before it is reflowed to form spheres. The combination of ball-bonding and
[Figure 5.12: wafer-level bumping process: incoming wafer with pad and wafer passivation, UBM patterning, solder plating and mask removal, reflow.]
wafer bonding (including the use of vias to bring contacts from one side of the
wafer to the other) may result in packaging that is the same size as the die, and
we then speak of chip-size packaging (CSP), the ultimate goal for extreme system
miniaturization.
5.3 Testing and calibration
Testing is required to increase the reliability of the packaged MEMS. Different types
of tests are performed during the complete process: we distinguish qualification
tests, used to detect failed dies, from burn-in tests, which screen out low-reliability dies. Burn-in tests are performed at chip level, often after packaging, while qualification tests
are performed both at wafer and chip level to screen out the chips that should
not be packaged.
In addition to the qualification tests performed at wafer level, the testing phase
allows calibration to be performed, which is normally conducted after packaging.
The purpose of calibration is to compensate for some of the defects in the MEMS
characteristic to make it behave more ideally. Actually the linearity of a MEMS sensor
or actuator may not be perfect or, more often, the cross-sensitivity to some
environmental parameters (usually temperature) needs to be accounted
for - another operation called compensation.
Example 5.4 Using a calibrated and compensated sensor or not?
A good insight into the importance of calibration and compensation may be
gained by comparing two products from the range of pressure sensors from
Motorola. The MPX10 is an uncalibrated and uncompensated pressure sensor,
while the MPX2010 is passively calibrated and compensated. We report in the
following table some of their characteristics, extracted from the manufacturer's
datasheets (MPX10/D and MPX2010/D).
Characteristic                               MPX10                   MPX2010
                                      Min     Typ     Max      Min     Typ     Max    Unit
Full scale span                        20      35      50       24      25      26    mV
Offset                                  -      20      35     -1.0       -     1.0    mV
Temperature effect on full scale span -18       -     -13     -1.0       -     1.0    % FS
Temperature effect on offset            -     1.3       -     -1.0       -     1.0    mV
Temperature coefficient of span      -0.22      -   -0.16        -       -       -    %/°C
Temperature coefficient of offset       -      15       -        -       -       -    µV/°C
Sensitivity                             -     3.5       -        -     2.5       -    mV/kPa
We see the effect of the calibration of the sensor using laser-trimmed resistors on the full
scale span, which is properly normalized, and on the offset, which is almost suppressed.
The compensation technique uses additional resistors that are also laser trimmed to
reduce tremendously the influence of temperature on the full scale span. However
we may note that the effect of temperature on the offset is not really compensated
this way.
What is not shown here is the large difference in price between these two sensors,
which could be an important element of choice!
Calibration and compensation will help to decrease the influence of unavoidable defects of the micro-sensor but, most of the time, the calibration and
the compensation will be decided at the factory. In the future it is expected that
all systems will include a capability to perform self-calibration. If this is not
always possible to implement (for example, applying a reference pressure to a
pressure sensor is not an easy task), some systems have so much drift that they
won't operate in any other manner. A good example is provided by some chemical micro-sensors that can only compare concentrations in two fluids (thus needing a
calibration for each measurement) but can hardly give absolute readings.
5.3.1 Testing
However, if the overall reliability is mostly considered to be governed by the fabrication process, the reality is different. Actually all the process steps added during
packaging and test may adversely affect the final reliability of the device in sometimes unsuspected ways. For example, it has been shown that the qualification
tests performed at wafer level may affect the reliability of wire bonding. Actually,
these tests are performed in a probe station using sharp needles to contact the
pads and apply different test signals to the device. The contact of the test probe
on the gold pads leaves a small scratch on the pad surface, which has been
shown to affect the bonding strength, ultimately possibly decreasing the reliability
of the packaged system.
The main problem faced by MEMS testing is that we now have to handle signals
that are not purely electrical, but optical, fluidic, mechanical, chemical... Then,
verifying the absence of defects needs the development of specialized systems and
new strategies.
A Texas Instruments DLP chip may have as many as 2 million mirrors, and simple
math shows that testing them one by one for 1 s each would take approximately
three weeks at 24 h/day - clearly not a manageable solution. TI has thus developed a series of tests using specific mirror activation patterns that allow testing
mirrors by groups while still detecting defects of individual mirrors, like sticking or missing mirrors. After testing at wafer level the chips are diced, put into packages and
then go through a burn-in procedure. They are then tested again before being
finally approved. TI noticed that the encapsulation step decreased the yield if the
environment wasn't clean enough, and they have to use a class 10 clean-room for
the packaging of their DLP chips.
Testing is also a major hurdle for micro-sensors and, to facilitate it, additional
test features may have to be included in the MEMS design. A good example
is given by the integrated accelerometer range from Analog Devices. The system
uses a micromachined suspended mass whose displacement is monitored by using the
induced change in capacitance. However, one part of these electrodes has been
configured as an actuator. By applying a voltage on these electrodes it is possible
to induce a movement of the mass without external acceleration. This is used
before packaging to verify the mechanical integrity of the accelerometer... and
allows a lot of money to be saved, compared to a set-up that would need to apply a real
acceleration. Moreover, that function may be used during operation in a smart
system to verify the integrity of the accelerometer.
Testing is conducted at different stages during the fabrication of the MEMS, but
the final tests are normally conducted after the packaging is done. The reason is
simple: the packaging process always introduces stress (or damping) that changes the
characteristics of the sensing elements and needs to be accounted for. This final test
can be used for burn-in and usually allows the final calibration and compensation
of the sensor.
5.3.2 Calibration
The calibration of the MEMS is the adjustment needed to deliver the most linear
possible transfer function, where the output goes from 0% to 100% when the input
varies in the same range. It generally means two different things:
1. linearization of the transfer function (i.e., having a constant sensitivity)
2. suppression of the offset
It is a particularly important step for microsensors, although it is understood
that all the other environmental variables (e.g., temperature, humidity, pressure,
stress...) are kept constant during the calibration and will be specified. As such,
the response of the system will generally be best at the calibration point. We note
that the cancellation of the changes caused by these other environmental variables
on the sensitivity or on the offset is left to the compensation procedure. Thus it
is possible to have a calibrated but uncompensated system, while the reverse is
often much less interesting.
To perform these tasks it is possible to trim additional integrated analog circuitry
or to use a digital signal and a CPU. The choice between the two is a mix of speed,
complexity and cost analysis, but obviously the integrated analog approach needs
more wit! It is of course possible to mix the two methods, which are not mutually
exclusive; for example offset removal is often performed analogically while
sensitivity trimming is easier to do with digital techniques. In both cases the
calibration is performed after obtaining the result of the calibration test, which tells
to what extent the sensor needs to be adjusted.
With analog calibration techniques, the basic tuning elements are trimmable
resistors. The technique uses a laser or electric fuses to trim the values of the resistors
controlling the MEMS sensitivity and offset. This method has the advantage of being
relatively cheap and of providing sensors with very high speed. Actually the laser
trimming method is less useful, as it allows neither recalibration nor calibration
after packaging, and electrical trimming is preferred. Trimmable
resistors do not allow compensation for complex non-linearity, and if previously
a lot of effort was devoted to developing linearizing schemes with complex
analog circuits (see for example Section 4.3.2), the trend is now toward digital
calibration for the more complex cases.
Digital calibration techniques are quite straightforward to implement if sufficient
computation power is available, but will always be slower because they need analog
to digital conversion and data processing. The principle is to perform a calibration
test, and to use the data recorded to compute the output of the sensor. The CPU
may use a model of the sensor, usually implemented using a high-order polynomial,
or use the actual values of the calibration test, stored in a look-up table (LUT).
Between two points of the look-up table, the correction is estimated by using a
linear approximation. In principle this approach allows any
non-linearity to be compensated, and should deliver perfect characteristics. However, apart from
the speed problem, the analog to digital conversion of the signal introduces new
parameters that do not allow all the defects in the transfer function
characteristic to be corrected. For example, we show in Figure 5.13 the case of a MEMS sensing
element presenting a flatter part in its transfer function. The marked non-linearity
[Figure 5.13: 4-bit A/D converter output codes (0000 to 1111) versus measurand (arbitrary units, 0 to 15) for a non-linear sensing element transfer function with a flat region.]
Figure 5.13: Effect of sensing element characteristic on accuracy after digital conversion
of this transfer function introduces measurement errors that cannot be calibrated
out. Actually, any input lying in the range between about 5 and 12.5 will give
the same A/D converter output: [0110]! We have here an important loss in the
accuracy of the sensor that cannot be suppressed. Thus, if digital calibration
allows many defects to be corrected, it is not possible to eliminate all of them. From
this example, we may remember that the maximum error presented by a non-linear
element will be governed by the region of the transfer function presenting the
smallest slope.
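The LUT approach described above is easy to prototype. The following minimal sketch (not from the book) uses a fictitious sensing element with a flat region, a 4-bit converter and linear interpolation between calibration points; it shows that the calibration recovers the input well where the slope is large, but loses accuracy in the flat region, exactly as discussed for Figure 5.13.

```python
# Look-up-table (LUT) digital calibration with linear interpolation.
import numpy as np

def sensor(measurand):
    """Fictitious non-linear sensing element (arbitrary units in, volts out)."""
    return 1.0 - np.exp(-np.asarray(measurand, dtype=float) / 4.0)   # flattens at large inputs

def adc(volts, bits=4, vref=1.0):
    """Ideal 4-bit A/D converter."""
    return np.clip(np.round(volts / vref * (2**bits - 1)), 0, 2**bits - 1)

# Calibration test: apply known inputs and record the corresponding codes
cal_in = np.linspace(0.0, 15.0, 16)
cal_code = adc(sensor(cal_in))
codes, first = np.unique(cal_code, return_index=True)   # keep one calibration input per code
lut_in = cal_in[first]

def calibrated(code):
    """Invert the calibration LUT with linear interpolation between points."""
    return np.interp(code, codes, lut_in)

for x in (2.0, 6.0, 13.0):
    c = adc(sensor(x))
    print(f"true input {x:4.1f} -> code {int(c):2d} -> calibrated output {calibrated(c):4.1f}")
```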
Additionally, the digital converter works best by using its full range (otherwise,
we lose some bits of resolution)... which is not necessarily the same range as the
5.3.3 Compensation
Compensation refers to the techniques used to separate the MEMS output from
an interfering environmental parameter, like the temperature, and we distinguish:
structural compensation, where preventive measures are taken at the design
stage to decrease the magnitude of cross-sensitivities;
monitored compensation, where implicit or explicit measurement of the interfering parameter allows the output of the sensor to be modified to compensate for
its effect. The implicit measurement approach uses additional integrated circuitry
with the sensor, while the explicit measurement approach uses a sensor for the interfering parameter (e.g., a temperature sensor) and a CPU to adjust the raw
measurements according to the value of the interfering parameter.
In the first class of compensation, the layout of the microsystem is important;
for example it is possible to insulate it thermally to decrease the influence of
temperature. Another very simple structural compensation technique that should
not be underestimated is the use of symmetry in the design. The principle is
to use a difference signal, while all the other interfering variables produce a
common-mode signal, which will thus not appear in the MEMS output. The
compensation for residual stress in many MEMS sensors is often based on this
principle, and observing the design of the sensitive elements will invariably show
a marked 2-fold or 4-fold symmetry.
The monitored compensation may be performed at the system level without
explicit measurement of the perturbing parameter, using completely analog signals,
and we speak of implicit compensation, or with explicit compensation. In the
latter case the compensation uses a CPU and an amplifier with programmable
gain and offset. The choice between these two approaches is often dictated by
the complexity of implementation. An implicit compensation is often used to
compensate for temperature in pressure sensors, but as soon as the dependence on
the external factor is complex, or when a very precise compensation is needed, the
compensation is performed at the system level using a CPU. Generally speaking,
an implicit compensation needs more cleverness than an explicit approach that
will work all the time. For example, if it is relatively easy to compensate for
an offset induced by the temperature with an analog circuit, a change in the gain
of the transfer function of the sensor will be much more difficult to compensate,
and explicit compensation with analog or digital techniques will be required. Still,
the implicit approach will produce a smaller control circuit, which could be a decisive
advantage.
For example, implicit monitored compensation can be applied to shelter from
fluctuations in the supply voltage. In this case the idea is simply to use working principles that are ratiometric to this voltage.
For example, a Wheatstone bridge (see Sec. 4.3.1) delivers a voltage that is proportional to the supply voltage $V_{in}$:

$V_{out} = \frac{V_{in}}{4R}\,\Delta R.$
This may seem to be a serious problem, as any fluctuation in the supply voltage
will result in a loss of accuracy, but when we consider the complete measurement
chain, this may turn into an advantage. If we consider a pressure sensor based on
piezoresistive elements in a Wheatstone bridge, the signal from the sensor needs
digitization further down in the measurement chain to be transmitted or recorded.
The digitization based on an A/D converter generally requires a reference voltage from
which the quantization step (quantum) is derived. If for this reference voltage we
use the bridge supply voltage $V_{in}$, then any fluctuation in this voltage is automatically compensated by an equivalent change in the A/D converter quantum. In
this way the recorded digital information of the pressure will be accurate even if
the supply voltage changes due to dwindling battery charge or other environmental
factors.
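This ratiometric principle is easy to verify numerically. The sketch below (not from the book, values and amplifier gain assumed) digitizes the bridge output once with a fixed reference and once with the supply voltage as reference: only the ratiometric code stays constant when the supply fluctuates.

```python
# Implicit (ratiometric) compensation of supply-voltage fluctuations.
def bridge_output(v_in, dR_over_R=0.01, gain=100.0):
    """Amplified Wheatstone bridge output: G * V_in/4 * dR/R."""
    return gain * v_in / 4.0 * dR_over_R

def adc_code(v, v_ref, bits=12):
    """Ideal A/D converter with reference voltage v_ref."""
    code = int(round(v / v_ref * (2**bits - 1)))
    return max(0, min(code, 2**bits - 1))

for v_supply in (5.0, 4.5, 5.5):                     # fluctuating supply voltage
    v_out = bridge_output(v_supply)
    fixed_ref = adc_code(v_out, v_ref=5.0)           # fixed 5 V reference
    ratiometric = adc_code(v_out, v_ref=v_supply)    # reference tied to the bridge supply
    print(f"V_in = {v_supply:3.1f} V : fixed-ref code = {fixed_ref:4d}, ratiometric code = {ratiometric:4d}")
```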
A typical example of a circuit using explicit compensation (also called digital compensation) is shown in Figure 5.14. Here the CPU constantly changes the
output according to the value of the input, using a model or a calibration curve.
[Figure 5.14 block diagram: the micro-sensor and a temperature sensor are each digitized by an ADC, processed by a DSP with memory, and the result is output through a DAC as Vout.]
Figure 5.14: Explicit temperature compensation in a smart sensor.
[Figure 5.15: sensor output versus measurand range for temperatures T1 > T, T and T2 < T, compared to the A/D converter range and the noise floor.]
However, if this implementation looks simple, it has some marked drawbacks when,
for example, the sensitivity of the system decreases with the interfering parameter,
as shown for a sensor in Figure 5.15. If the sensitivity increases (T1 > T) in this
Problems
1. Redo the problem in Example 5.3, considering now that the electronic circuit
inside the package raises the temperature to 50 °C during operation, while
the external temperature is considered to be at 25 °C. You will need to forget
about the pressure variable and derive solutions from the molar densities (cf. Eq. 5.2).
Chapter 6
Challenges, trends, and conclusions
6.1 Challenges
Although some products like pressure sensors have been produced since the 1980s,
the MEMS industry is, in many ways, still a young industry. The heavily segmented
market is probably the main reason why a consortium like SEMI has yet to appear
for MEMS. However everybody agrees that better cooperation and planning has
to happen if the cost of assembly, test and packaging is to come down. MEMS
can currently only look with envy as the IC industry seriously considers producing
RFID chips for cents - including packaging.
Again the path shown by the IC industry can serve as a model, and standardization to ensure packaging compatibility between different MEMS chip manufacturers
seems the way to go. Considering the smaller market size of most MEMS components, standards are the only way to bring the numbers to where the unit packaging price
is reduced substantially. This implies, of course, automating assembly by defining
standard chip handling procedures, and probably standard testing procedures.
Of course, the diversity of the MEMS market makes it impracticable to develop a
one-size-fits-all packaging solution, and a division into a few classes (inertial, gas, fluidic)
is to be expected. For example, several generic solutions to fluidic
interfacing have been proposed and could become a recommendation in the future.
On the other hand it is not clear if standardization of the MEMS fabrication process à
la CMOS will ever happen - or is even possible. But currently most of the cost for
a MEMS component arises during the back-end process; thus it is by standardizing
interfaces that most savings can be expected.
The relatively long development cycle for a MEMS component is also a hurdle that
needs to be lowered if we want more companies to embrace the technology.
One answer lies with the MEMS design tool providers. The possibility of doing
software verification up to the component level would certainly be a breakthrough;
it is now only possible for a limited set of cases.
But it is also true that the answer to proper design is not solely in the hands of better
computer software but also in better training of the design engineer. In particular
we hope that this short introduction has shown that specific training is needed for
MEMS engineers, where knowledge of mechanical and material engineering supplements electronic engineering. Actually, experience has often revealed that an
electronic engineer with no understanding of the physical aspects of MEMS is a poor
MEMS designer.
6.2 Trends
Looking into the crystal ball for the MEMS market has proven to be a deceptive exercise,
but current emerging tendencies may help foresee what will happen in the medium
term.
From the manufacturer's point of view, a quest for lowering manufacturing cost
will hopefully result in standardization of MEMS interfacing, as we discussed
earlier, but will finally lead to pursuing less expensive micro-fabrication methods than
photolithography. Different flavors of soft lithography are solid contenders here,
and micro-fluidics and BioMEMS are already starting to experience this change.
Another possibility for reducing cost will be integration with electronics - but,
as we already discussed, the system-on-a-chip approach may not be optimal in
many cases. Still, one likely good candidate for integration will be the fabrication
of a single-chip wireless communication system, using MEMS switches and surface
high-Q components.
From the market side, MEMS will undoubtedly invade more and more consumer products. The recent use of accelerometers in cameras, mobile phones or in the
Segway is a clear demonstration of the larger applicability of MEMS solutions
- and as the prices drop, this trend should increase in the future. Of course medical applications can be expected to be a major driver too, but here the stringent
requirements make the progress slow. In the mid-term, before micromachines can
wade in the human body to repair or measure, biomedical sensors to be used by
doctors or, more interestingly, by patients are expected to become an important
market.
A more distant opportunity for MEMS probably lies in nanotechnology. Actually, nanotechnology is bringing a lot of hope - and some hype - but current fabrication
techniques are definitely not ready for production. MEMS will play a role by interfacing the nano-scale with meso-scale systems, and by providing tools to produce
nano-patterns at an affordable price.
6.3 Conclusion
The MEMS industry thought it had found the killer application when, at the turn
of the millennium, tens of startups rushed to join the fiber telecommunication
bandwagon. Alas, the burst of the telecommunication bubble has reminded people
that in business it is not enough to have a product to be successful - you need
customers.
Now the industry has set more modest goals, and if the pace of development is no
longer exponential, it remains solid at two digits, with MEMS constantly invading more
and more markets. Although the MEMS business, with an intrinsically segmented
structure, will most probably never see the emergence of an Intel, we can be sure
that the future of MEMS is bright. At least, as R. Feynman [29] stated boldly in
his famous 1959 talk which inspired some of the MEMS pioneers, because, indeed,
there's plenty of room at the bottom!
Appendix A
Readings and References
A.1 Conferences
MEMS conference THE annual MEMS conference, single session and top research, happening usually in January with deadline in August. Hard to get
a paper accepted, but worth it.
Transducers conference The biennial conference, huge, multisession, happening every two years around June with deadline in winter. Shows a really
nice panorama of all the research in MEMS and in related fields.
MicroTAS conference The annual BioMEMS/Microfluidics conference with top
results and researchers.
Optical MEMS & Nanophotonics conference The annual IEEE MOEMS conference including nanophotonics session.
PowerMEMS conference This annual conference is all about power generation,
dissipation, harvesting, and thermal management.
Micro-Mechanics Europe (MME) conference The European conference on
micro-mechanics has a unique format for fostering interaction between participants:
there are no formal oral presentations but only short introductions and discussions around posters.
HARMST conference The biennial High Aspect Ratio Microstructure Technology conference is rightly focused on process.
Eurosensors conference The annual European microsensors conference.
Micro-Nano-Engineering (MNE) conference A European conference with a
good mix of topics related to MEMS.
A.2 Journals
Microfluidics and Nanofluidics One of the top journals for microfluidics and related technologies (http://www.springerlink.com/content/1613-4982/).
Experimental Thermal and Fluid Science A good journal geared toward more fundamental issues in microfluidics (http://www.journals.elsevier.com/experimental-thermal-and-fluid-science/).
Biomedical Microdevices A Bio-MEMS journal (http://www.wkap.nl/journalhome.htm/1387-2176).
Biosensors and Bioelectronics Another Bio-MEMS journal with more emphasis on sensors (http://www.elsevier.com/wps/product/cws_home/405913).
IEEE Transactions on Biomedical Engineering Bio-MEMS and biomedical applications can be found here (http://www.ieee.org/organizations/pubs/transactions/tbe.htm).
IEEE Photonics Technology Letters Highly cited photonics journal publishing short papers, including optical MEMS or MOEMS (http://www.ieee.org/organizations/pubs/transactions/ptl.htm).
IEEE/OSA Journal of Lightwave Technology A good quality photonics journal regularly featuring some optical MEMS work (http://www.ieee.org/organizations/pubs/transactions/jlt.htm).
A.3
Appendix B
Causality in linear systems
In a linear system the relationship between the input x and the output y can be
represented by a differential equation:
represented by a differential equation:
dm x
dn y
dn1 y
dy
dm1 x
dx
+ b0 x
an n + an1 n1 + + a1 + a0 y = bm m + bm1 m1 + + b1
dt
dt
dt
dt
dt
dt
(B.1)
where the sum m+n is called the order of the system with the important condition
that normally n > m. This condition describes the assumption of causality in the
system: the output is created by the input, not the reverse!
This may be better understood if instead of looking at the derivative, we invert
the problem and use integration. Let's take for example n = 3 > m = 2:
$$a_3\frac{d^3y}{dt^3} + a_2\frac{d^2y}{dt^2} + a_1\frac{dy}{dt} + a_0 y = b_2\frac{d^2x}{dt^2} + b_1\frac{dx}{dt} + b_0 x$$
We integrate three times and reorganize the equation:

$$a_3 y + a_2\int y\,dt + a_1\iint y\,dt + a_0\iiint y\,dt = b_2\int x\,dt + b_1\iint x\,dt + b_0\iiint x\,dt$$

$$y = \frac{b_2}{a_3}\int x\,dt + \frac{b_1}{a_3}\iint x\,dt + \frac{b_0}{a_3}\iiint x\,dt - \frac{a_2}{a_3}\int y\,dt - \frac{a_1}{a_3}\iint y\,dt - \frac{a_0}{a_3}\iiint y\,dt$$
Thus the output y becomes a linear combination (i.e., a weighted sum) of
integrals of the input and of the output itself. Integration is a causal operator (i.e., a
sum over time starting at t = 0) and the function can be physically implemented.
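This integrator-only realisation can be checked numerically. The following is a minimal sketch (not from the book) for a first-order case, $a_1\,dy/dt + a_0 y = b_0 x$, rewritten as $y = \frac{1}{a_1}\int (b_0 x - a_0 y)\,dt$ and integrated step by step with arbitrary coefficients:

```python
# Causal realisation of a first-order linear system using only an integrator.
import math

a1, a0, b0 = 1.0, 2.0, 2.0        # arbitrary coefficients (n = 1 > m = 0)
dt, t_end = 1e-3, 3.0

acc = 0.0                          # running integral (state of the integrator)
t, y = 0.0, 0.0
while t < t_end:
    acc += (b0 * 1.0 - a0 * y) * dt   # x(t) = unit step applied at t = 0
    y = acc / a1
    t += dt

print("simulated y(3 s) =", round(y, 4))
print("analytic  y(3 s) =", round((b0 / a0) * (1 - math.exp(-a0 / a1 * t_end)), 4))
```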
Now if we exchange the values of m and n, that is n = 2 < m = 3, we have:

$$a_2\frac{d^2y}{dt^2} + a_1\frac{dy}{dt} + a_0 y = b_3\frac{d^3x}{dt^3} + b_2\frac{d^2x}{dt^2} + b_1\frac{dx}{dt} + b_0 x$$

After three integrations this gives:

$$a_2\int y\,dt + a_1\iint y\,dt + a_0\iiint y\,dt = b_3 x + b_2\int x\,dt + b_1\iint x\,dt + b_0\iiint x\,dt$$

$$x = \frac{a_2}{b_3}\int y\,dt + \frac{a_1}{b_3}\iint y\,dt + \frac{a_0}{b_3}\iiint y\,dt - \frac{b_2}{b_3}\int x\,dt - \frac{b_1}{b_3}\iint x\,dt - \frac{b_0}{b_3}\iiint x\,dt$$
thus after 3 integrations we find that the input x becomes a linear combination including integrals of the output, and thus the input would depend on what happened
previously at the output! Certainly not a causal behavior...
If n = m we will find that the output is directly proportional to the input,
implying instantaneous transmission of information through the system, which
would violate Einstein's postulate. However, the systems we study using block or
circuit analysis are punctual systems: they have no physical size, thus the signal can
travel instantaneously from the input to the output. We can more easily see that
we use this simplification all the time by considering a simple voltage divider. In
this circuit the relationship between input and output is given by $y = \frac{R_1}{R_1+R_2}x$,
meaning that as soon as the input voltage changes the output will change, which
is clearly not physical.
In practice, we will often use systems where m = n, as in the case of Example 2.2, but if we are rigorous, such systems cannot exist, and a delay should
be introduced to model the relationship between the output and the input of a
system.
Finally it should be noted that this physical delay is of a different nature (i.e.,
speed of propagation of the information) than what we observe in an RC circuit (i.e.,
time to fill and empty energy-storing elements). In telecommunications the physical
delay sometimes needs to be modeled and in that case, we may use a ladder of...
R and C elements, definitely adding to the confusion!
Appendix C
Resonator and quality factor
The quality factor was originally introduced to describe oscillators. Actually,
the larger the loss you have in an oscillator, the less pure its frequency will be, and
thus the poorer its quality.
Simply stated, if a vibrating system (called a resonator, that is, an oscillator
without a circuit to sustain the oscillation) has loss, then after the oscillations have started, the
larger the loss, the quicker their amplitude will decrease until they stop. For
example, you can think about plucking the string of a guitar: the more loss you
have (e.g., if you press the string with a finger on the frets), the shorter the time
the vibration will last.
Formally this can be seen by looking at the spectrum of the generated signal. If we
had an ideally pure signal it should have only one frequency, thus in the time domain
it could be represented by $f(t) = A\cos(\omega t)$. For the resonator it means that it
started vibrating at time $t = -\infty$ and would never end... a quite unphysical signal
- the universe, after all, is only about 14 billion years old! Thus no oscillator can possibly
generate a purely sinusoidal signal; all signals will always have some linewidth $\delta\omega$.
To understand this we consider that the resonator vibrates sinusoidally between
t = 0 and t = T₀ and is stopped before and after these two moments (a slightly
more physical signal). The linewidth can be found from the Fourier transform
of the temporal signal, which gives its frequency spectrum:
[Figure: a sinusoidal burst of duration T₁ has a spectral linewidth δω₁ ∝ 1/T₁, while a longer burst of duration T₂ > T₁ has a narrower linewidth δω₂ ∝ 1/T₂.]
Thus, as we see here, the shorter the time the resonator sustains the oscillation,
the wider its linewidth... and thus, in some way, the lower its quality. We directly
observe the relationship between oscillator quality (a narrow linewidth) and
the... quality factor, expressed, as we know, as the ratio of the resonant frequency over the
linewidth.
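A small numerical sketch (not from the book) makes the link between ring-down and linewidth concrete, using the standard relations $\tau = Q/(\pi f_0)$ for the amplitude decay time and $\Delta f = f_0/Q$ for the linewidth; the frequency and Q values below are illustrative assumptions.

```python
# Ring-down time and linewidth of a resonator for two quality factors.
import math

f0 = 10e3            # resonance frequency [Hz] (assumed)
for Q in (100, 10_000):
    tau = Q / (math.pi * f0)      # amplitude decay time constant [s]
    delta_f = f0 / Q              # linewidth [Hz]
    print(f"Q = {Q:6d}: ring-down tau = {tau*1e3:7.2f} ms, linewidth = {delta_f:6.2f} Hz")
```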
Appendix D
Laplace's transform
Table D.1 gives the common properties of the Laplace transformation.

Property            Formula                                                                   Comments
Definition          $F(s) = \mathcal{L}\{f(t)\} = \int_0^\infty e^{-st}f(t)\,dt$
Inverse             $f(t) = \mathcal{L}^{-1}\{F(s)\}$                                         Inverse transform
Linearity           $\mathcal{L}\{af(t)+bg(t)\} = aF(s)+bG(s)$
s-shifting          $\mathcal{L}\{e^{at}f(t)\} = F(s-a)$
t-shifting          $\mathcal{L}\{f(t-a)u(t-a)\} = e^{-as}F(s)$
t-differentiation   $\mathcal{L}\{f'(t)\} = s\mathcal{L}\{f\} - f(0)$
s-differentiation   $\mathcal{L}\{t\,f(t)\} = -F'(s)$
t-integration       $\mathcal{L}\{\int_0^t f(\tau)\,d\tau\} = \frac{1}{s}\mathcal{L}\{f\}$
s-integration       $\mathcal{L}\{\frac{1}{t}f(t)\} = \int_s^\infty F(\sigma)\,d\sigma$
Convolution         $\mathcal{L}\{f*g\} = \mathcal{L}\{f(t)\}\,\mathcal{L}\{g(t)\}$           $(f*g)(t) = \int_0^t f(\tau)g(t-\tau)\,d\tau = \int_0^t f(t-\tau)g(\tau)\,d\tau$
0 when t < 0). In this case the bilateral Laplace transform (where the integral
extends from $-\infty$ to $+\infty$) is the same as the unilateral Laplace transform. In
engineering, where signals are causal (that is, have an origin in time), the unilateral
transform is of course the preferred form, and we often drop the u(t) function
product on the time function, but we still actually imply that the function in the time
domain is 0 when t < 0.
F(s) = L{f(t)}               f(t)                        Comments
1/s                          1
1/s²                         t
1/sⁿ (n = 1, 2, ...)         tⁿ⁻¹/(n-1)!
1/√s                         1/√(πt)
1/s^(3/2)                    2√(t/π)
1/s^k (k > 0)                t^(k-1)/Γ(k)
1/(s + a)                    e^(-at)                     Signal decay
a/[s(s + a)]                 1 - e^(-at)                 Signal rise
1/(s + a)²                   t e^(-at)
1/(s + a)ⁿ (n = 1, 2, ...)   tⁿ⁻¹ e^(-at)/(n-1)!
1/(s + a)^k (k > 0)          t^(k-1) e^(-at)/Γ(k)
ω/(s² + ω²)                  sin ωt
s/(s² + ω²)                  cos ωt
a/(s² - a²)                  sinh at
s/(s² - a²)                  cosh at
1                            δ(t)
e^(-as)                      δ(t - a)
1/s                          u(t)                        Unit step at t = 0
e^(-as)/s                    u(t - a)                    Unit step at t = a
Appendix E
Complex numbers
The use of complex numbers in Physics is not compulsory, but it simplifies many
problems significantly, for example to solve problems in two dimensions, or to represent
waves or periodic signals...
A complex number z is defined as an ordered pair of real numbers (x, y), where
x is called the real part and y the imaginary part of the complex number. This
complex number can be written in its so-called cartesian form as:

$z = x + iy$

where i is called the imaginary unit and has the property that $i^2 = -1$. In Physics,
j is often used instead of i, supposedly to avoid confusion with the symbol used for
the current (this is definitely not a good reason, thus, better use i). With this
simple rule it is possible to use the customary rules of real algebra to compute
with complex numbers.
The possibility for a complex number to represent any point in the plane of Cartesian coordinates (x, y) is of extreme importance for solving problems in two dimensions elegantly using algebraic operations, without the need for matrices¹. Moreover, instead of using the direct orthonormal axes, we may think of using the polar system of coordinates (r, θ)... which gives the polar form of the complex number:
z = x + iy = r(cos θ + i sin θ)
where r is called the modulus (or amplitude, in Physics) of the complex number and θ its argument (or phase, in Physics). It is possible to relate the polar form nicely to the complex exponential function by using Euler's formula e^(ix) = cos x + i sin x, giving another representation of the polar form, very useful in computation:
z = x + iy = r(cos θ + i sin θ) = r e^(iθ)
¹ There is an interesting mathematical construct that allows doing the same in 3D. These numbers, originally introduced by the mathematician W. Hamilton, are called quaternions and have 4 components (and not 3).
The properties of the complex exponential regarding computation (e^(z1+z2) = e^(z1) e^(z2), ...) and calculus ((e^z)' = e^z, ...) are the same as for the real exponential function.
Going from one coordinate system to the other may seem fairly simple, and it is for the modulus, where we always have r = |z| = √(x² + y²). However, great care should be taken when the principal value of the argument is larger than π/2 or smaller than −π/2. Let's have a look at Figure E.1 to see what happens.
[Figure E.1: positions in the complex plane of z₁ = x₁ + iy₁ and z₂ = x₂ + iy₂ and their arguments.]
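Numerically, the problem is the classic ambiguity of arctan(y/x); a small MATLAB sketch (added here as an illustration, using the built-in atan2 and angle functions) shows the difference for a point in the second quadrant:

z = -1 + 1i;                            % point in the second quadrant, argument 3*pi/4
theta_naive = atan(imag(z)/real(z))     % gives -pi/4: wrong quadrant
theta_atan2 = atan2(imag(z), real(z))   % gives 3*pi/4
theta_angle = angle(z)                  % same result with the built-in function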
exponentials are simpler than for trigonometric functions), but it can only be used with linear problems. Actually most of the fundamental equations of Physics are linear (Newton's second law, Maxwell's equations, the wave equation, the diffusion equation, Schrödinger's equation...) and this is often not a serious limitation.
However, for a non-linear problem the cos(ωt + φ₀) (or sin) function has to be used to represent a periodic signal instead of e^(i(ωt+φ₀)). A simple example will make the reason clear. Imagine a non-linear system that simply squares the input signal, i.e., y = x². Note that such a system cannot be described by the theory developed in Chapter 2.4.3.
Now if we use for the periodic input signal the complex signal x(t) = e^(iωt), the complex output becomes y(t) = e^(i2ωt), and to obtain the real signal we take the real part of this expression: y(t) = Re(e^(i2ωt)) = cos(2ωt).
If we now use the cos directly, we have x(t) = cos(ωt), thus y(t) = cos²(ωt) = (1 + cos(2ωt))/2, which is the right answer... but which is rather different from the first answer!
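The difference is easy to visualize in MATLAB; the short sketch below (an illustration added here, with an arbitrary frequency) plots both results and shows that the complex-exponential shortcut misses the constant term 1/2 produced by the squaring.

% Squaring system y = x^2: complex-exponential shortcut vs real computation
omega = 2*pi;  t = linspace(0, 2, 1000);      % arbitrary frequency and time base
y_cplx = real(exp(1i*omega*t).^2);            % real part of (e^{i w t})^2 = cos(2wt)
y_real = cos(omega*t).^2;                     % (1 + cos(2wt))/2
plot(t, y_cplx, t, y_real)
xlabel('time (s)'); legend('from complex signal', 'from real signal')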
Appendix F
Fraunhofer diffraction
Diffraction is a physical theory of light explaining the deviation from geometrical optics that can be observed behind an aperture in an opaque screen. Actually light behaves as if it bends and can reach regions where geometrical optics would only find darkness. The effect is more pronounced when the aperture diameter is smaller, closer to the wavelength of light. The general description of this effect is somewhat complex, but different approximate theories can be derived if we only observe (and illuminate) at a distance much larger than the aperture diameter, a condition known as far-field.
F.1
Far-field diffraction
In the general case of light diffraction shown in Figure F.1, we want to compute
the optical field at a point P located at a distance R from an opaque screen with
an aperture of diameter 2a.
[Figure F.1: geometry of far-field diffraction through a circular aperture of diameter 2a in an opaque screen; the axes (y, z) lie in the aperture plane and (Y, Z) in the observation plane.]
U_P = (u_A e^(i(ωt−kR))/R) ∫₀^(2π) ∫₀^a e^(ikqρ cos(φ)/R) ρ dρ dφ                       (F.1)
    = (u_A e^(i(ωt−kR))/R) ∫₀^a 2π J₀(kqρ/R) ρ dρ
    = (u_A e^(i(ωt−kR))/R) 2π (R²/k²q²) ∫₀^(kqa/R) J₀(kqρ/R) (kqρ/R) d(kqρ/R)           (F.2)
    = (u_A e^(i(ωt−kR))/R) 2π (R²/k²q²) (kqa/R) J₁(kqa/R)
    = (u_A e^(i(ωt−kR))/R) 2π a² J₁(kqa/R)/(kqa/R)                                      (F.3)
where J₁ is the Bessel function of order 1. Thus the expression of the field amplitude in P is given by:

U_P = (u_A e^(i(ωt−kR))/R) 2π a² J₁(kqa/R)/(kqa/R)

and the irradiance on the screen, proportional to the square of the field amplitude, is:

I_P = (2 u_A² π² a⁴ / R²) [J₁(kqa/R)/(kqa/R)]²
Plotting this function with kqa/R as the argument (Figure F.2) we obtain a central peak with a smooth profile surrounded by a series of faint rings. The irradiance becomes zero for kqa/R = 3.83 (first zero of J₁(z)), and this point may serve to define the radius Q of the illuminated zone on the screen. We get:
Q = 3.83 R/(ak) ≈ 0.61 λR/a = 1.22 λR/(2a)

This central peak is called the Airy disk and, as suggested earlier, we see that the diffracted spot size (or the diameter of the Airy disk) increases as a becomes smaller.
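As an order-of-magnitude illustration (the numbers here are chosen for the example and are not from the original text): for a He-Ne laser wavelength λ = 0.633 µm, an aperture of radius a = 5 µm and a screen at R = 10 mm, the Airy disk radius is Q ≈ 1.22 × 0.633 µm × 10 mm / 10 µm ≈ 0.77 mm, i.e. the diffracted spot is more than a hundred times larger than the aperture itself.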
Figure F.2: Intensity profile on a screen located far behind a uniformly illuminated circular aperture (scale is normalized to kqa/R).
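Figure F.2 can be reproduced, at least qualitatively, with a few lines of MATLAB using the built-in Bessel function besselj; the sketch below (added for illustration) plots the irradiance normalized to 1 at the centre, [2 J1(x)/x]^2 with x = kqa/R.

% Normalized Airy pattern on a square region of the screen
[X, Y] = meshgrid(linspace(-10, 10, 201));   % screen coordinates in units of kqa/R
x = hypot(X, Y);  x(x == 0) = eps;           % radial coordinate, avoiding 0/0 at the centre
I = (2*besselj(1, x)./x).^2;                 % irradiance normalized to 1 at the centre
surf(X, Y, I, 'EdgeColor', 'none')
xlabel('kqa/R'); ylabel('kqa/R'); zlabel('normalized irradiance')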
F.2
Bessel function
Bessel functions appear often in physics and are not much more complicated to handle than the usual cos or sin functions. The Bessel function of order n, Jn(z), is the solution of the differential equation
n, Jn (z), is the solution of the differential equation
z2
d2 Jn (z)
dJn (z)
+z
+ (z 2 n2 )Jn (z) = 0
2
dz
dz
and the functions of order 0 and 1 admit the integral representations

J₀(z) = (1/2π) ∫₀^(2π) e^(iz cos φ) dφ    and    J₁(z) = (1/2πi) ∫₀^(2π) e^(iz cos φ) cos φ dφ
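The integral representation of J₀ can be checked numerically against MATLAB's built-in besselj function (a short verification sketch added here):

% Numerical check of the integral representation of J0
z = 2.5;                                                        % arbitrary test value
J0_int = integral(@(phi) real(exp(1i*z*cos(phi))), 0, 2*pi)/(2*pi);
fprintf('integral: %.6f   besselj: %.6f\n', J0_int, besselj(0, z))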
[Figure: the Bessel function J₁(z) plotted for −10 ≤ z ≤ 10.]
Appendix G
MATLAB code
G.1
Bode diagram
The Bode diagrams in Chapter 2.5 have been obtained using the following code, which may be used to plot the Bode diagram of any first or second order transfer function by changing the relevant values.
The first listing is for a first order transfer function.
% Bode-plot for first order transfer function
clear all
omega = logspace(-2, 2);              % angular frequency range [rad/s]
f = omega/(2*pi);                     % frequency [Hz]
%%%%%%%%%%%%%%%%%%%%%
% First order model %
%%%%%%%%%%%%%%%%%%%%%
G = 1;                                % static gain
tau = 1;                              % time constant
H = G ./ (1 + 1i*omega*tau);          % transfer function evaluated at s = i*omega
A = 20*log10(abs(H));                 % amplitude in dB
phi = angle(H)*180/pi;                % phase [deg]
subplot(2,1,1); semilogx(f, A); ylabel('amplitude (dB)')
subplot(2,1,2); semilogx(f, phi); ylabel('phase (deg)'); xlabel('frequency (Hz)')
%%%%%%%%%%%%%%%%%%%%%%%
% Second order system %
%%%%%%%%%%%%%%%%%%%%%%%
G = 1;                                % Static gain
omega0 = 1;                           % Natural frequency [rad/s]
zeta = 0.1;                           % Damping ratio
H = G ./ (1 - (omega/omega0).^2 + 2i*zeta*(omega/omega0));   % transfer function
A = 20*log10(abs(H));                 % amplitude in dB
phi = angle(H)*180/pi;                % phase [deg]
subplot(2,1,1); semilogx(f, A); ylabel('amplitude (dB)')
subplot(2,1,2); semilogx(f, phi); ylabel('phase (deg)'); xlabel('frequency (Hz)')
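If the Control System Toolbox is available, the same diagrams can also be obtained directly with the tf and bode functions; this alternative (not part of the original listings) is shown below.

% Same Bode diagrams using the Control System Toolbox
G = 1; tau = 1;
bode(tf(G, [tau 1])); grid on                              % first order: G/(tau*s + 1)
figure
G = 1; omega0 = 1; zeta = 0.1;
bode(tf(G*omega0^2, [1 2*zeta*omega0 omega0^2])); grid on  % second order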
Bibliography
[1] As recounted by A. Pisano in the foreword of An introduction to microelectromechanical systems engineering, 1st edition, N. Maluf, Artech House,
Boston (1999)
[2] J.-C. Eloy, MEMS market outlook, Yole Développement (2011)
[3] L.J. Hornbeck and W.E. Nelson, Bistable Deformable Mirror Device, OSA
Technical Digest Series, Vol. 8, Spatial Light Modulators and Applications,
p. 107 (1988)
[4] C. Smith, Piezoresistive effect in germanium and silicon, Physical Review, vol. 94, pp. 42-49 (1954)
[5] J. Price, Anisotropic etching of silicon with KOH-H2 O isopropyl alcohol,
ECS semiconductor silicon, pp. 339-353 (1973)
[6] H. Nathanson, W. Newell, R. Wickstrom, J. Davis, The resonant gate transistor, IEEE Transactions on Electron Devices, vol. ED-14, no. 3, pp. 117-133 (1967)
[7] W. Trimmer, Microrobot and micromechanical systems, Sensors and Actuators, vol. 19, no. 3, pp. 267-287 (1989)
[8] Micromechanics and MEMS - classic and seminal papers to 1990, Ed. W. Trimmer, Section 2 - side drive actuators, IEEE Press, New-York (1997)
[9] Roark's Formulas for Stress & Strain, W. C. Young, 6th ed., McGraw-Hill, New-York (1989)
[10] Fundamentals of Microfabrication, M. Madou, 2nd ed., CRC Press, Boca Raton (2002): the original (and a bit messy) reference book of the field - what you are looking for is certainly there... but where?
[11] MEMS Performance & Reliability, P. McWhorter, S. Miller, W. Miller, T.
Rost. Video, IEEE (2001)
[12] Microsensors - Principles and Applications, J. W. Gardner. Wiley, Chichester, England (1994): Short book that describes all the aspects of microsensors without too many details. A good introduction.
[26] H. Liu, F. Chollet, Micro Fork Hinge for MEMS Devices, Journal of Experimental Mechanics (special issue: Advance in Experimental Mechanics in
Asia), vol. 21, no. 1, pp. 61-70 (2006)
[27] Introduction to Microelectromechanical Systems Engineering, Nadim Maluf
and Kirt Williams, Artech House, Boston (1999) : A good introductory book
on MEMS
[28] MEMS packaging, T. Hsu, Inspec IEE, London (2004) : The first real book
on MEMS packaging with examples - and not the usual IC packaging volume
repackaged for MEMS with generalities...
[29] A reprint of the transcript of the original talk given in 1959 at CalTech appeared in R. Feynman, There's plenty of room at the bottom, J. of MEMS, vol. 1, no. 1, pp. 60-66 (1992) (the paper is available online at http://www.zyvex.com/nanotech/feynman.html)
Index
actuation
  electromagnetic, 154
  electrostatic, 155
  thermal, 164
actuator, see MEMS actuator
AFM, see atomic force microscopy
Agilent, 12
Airy disk, 224
Alcatel-Adixen, 18, 105
AlN, 163
amorphous, 67, 100
Analog Devices, 12, 23, 134, 149, 158, 196
Analogies, 36
anisotropic etching, 16, 77
anisotropy, 67
annealing, 93
anodization, 79
ANSYS, 25
aperture
  numerical (NA), 119
  relative, 119
AsGa, 163
aspect ratio, 81
Assembly, 175
atomic force microscope, 126
AZ9260, 109
beam, 135
bi-material actuator, 169
bimetallic actuator, see bi-material actuator
block diagram, 29
Bode diagram, 43
  plot, 43
boron doping, 79
Bosch, 22, 101, 104
  process, 83, 105
bulk micromachining, 75
bumping, 194
calibration, 195
capacitive sensing, 149
capillarity, 145
case
  CERDIP, 180
  TO, 180
causality, 213-214
chemical vapor deposition, 97
  APCVD, 97
  LPCVD, 97
  PECVD, 98
  UHCVD, 97
Chemical-Mechanical Polishing, 102
chip size packaging (CSP), 194
CMP, see Chemical-Mechanical Polishing
coefficient of thermal expansion, 180
comb-drive actuator, 157
compensation, 148, 195, 199
  actuator, 201
  explicit, 199
  implicit, 199
  monitored, 199
compliance matrix, 72
conductance, 187
conformality, 87
controller, 31
Coventor, 25
crystallographic planes, 68
Knudsen number, 142
Laplace transform, 33
lattice
diamond, 68
fcc, 67
lead frame, 182
leak rate
standard, 187
true, 187
lens
relay, 115
LIGA, 108
LiNbO3 , 163
lithography
soft, see imprinting
LOCOS, 90
look-up table, 198, 201
low temperature oxide, 98
LTO, see low temperature oxide
Lucent, 12, 17
LUT, see look-up table
Mach number, 142
magnetoresistive effect, 153
magnification, 113
angular, 114
electronic, 115
lateral, 113
manufacturing accuracy, 21
market, 13, 14
mask, see photolithography, mask
material, 70
membrane, 137
MEMS, 9
actuator, 153
bioMEMS, 12, 16
micro-fluidic, 12, 142
optical, 12, 17
polymer, 109
RF, 13, 14, 17
sensor, 12, 14, 16, 145
MEMSCAP, 13
MemsCap, 22, 25, 101
MemsTech, 15
micro-world, 10
microchannel, 142
microelectronics, 9, 18
integration, 133
microloading, 106
Microsens, 178
miniaturization, 10
Mitutoyo, 121
model, see simulation
block representation, 27
circuit representation, 27
MOEMS, 12
molecular conductance, 187
Motorola, 12, 180
MUMPS, 101
NA, see aperture
natural frequency, 47
NTT, 155
numerical aperture, see aperture
ocular, 112
OMM, see Optical MicroMachines
Optical MicroMachines, 12, 160
overetch, 101
Oxford System, 105
oxidation, 89
parfocality, 114
pattern generator, 65
pattern transfer, 64
patterning, 64
PentaVacuum, 104
permeability, 184
permeation, 184
photolithography, 65
mask, 64
photoresist, 64
negative, 65
positive, 65
physical vapor deposition, 95
piezoelectricity, 152, 161
actuator, 161
converse effect, 59, 161
direct effect, 59, 152, 161
piezoresistive effect, 16, 146
piezoresistor, 148
plasma, 81, 96
ICP, 105
POEMS, see MEMS polymer
poling, 163
polycrystalline, 67, 100
polymer, 75
process, 63
additive, 64, 88
back-end, 66, 173
front-end, 66
modifying, 64, 8890
subtractive, 64, 75, 76, 80
PVD, see physical vapor deposition
pyrolysis, 97
Q, see quality factor
quality factor, 51
quartz, 110, 163
rapid thermal processing, 99
reactive ion etching, 82
reflow process, 110
relative aperture, see aperture
release etch, 103
reliability, 23, 24
resolution, 116
resolving power, see resolution
resonance frequency, 50
resonator, 215
response
sinusoidal steady state, 41
step response, 41
Reynolds number, 142
RIE, see reactive ion etching
RTP, see rapid thermal processing
sacrificial etching, 88
sacrificial layer, 65, 86, 103
Sandia National Laboratory, 74, 102
scaling laws, 19
scanning electron microscope, 121
scanning near-field optical microscope,
121
scanning transmission electron microscope,
123
scratch-drive actuator, 160
SDA, see scratch drive actuator
SEM, see scanning electron microscope
Sensonor, 12, 15, 86, 133, 193
sensor, see MEMS sensor
Sercalo, 12, 74, 158
shape-memory alloy, 170
shape-memory effect, 170
silicon, 70
Silicon Light Machines, 12
simulation, 25
dynamic, 27
single crystal, 67
SIP, see system in the package
SiTime, 13
SMA, see shape-memory alloy
SNOM, see scanning near-field optical
microscope
SOC, see system on chip
SOI, 22, 74
sol-gel, 95
spectrum, 41, 55
Spice, 25
spin-coating, 94
spin-on-glass, 94
spring, 135
sputter, 96
DC, 96
magnetron, 97
RF, 97
ST Microelectronics, 14
STEM, see scanning transmission electron microscope
stiction, 104
stiffness matrix, 71
structural layer, 86
structure
active, 134
passive, 134
STS, 18
stylus profilometer, 125
SU8, 109
superposition theorem, 56
surface micromachining, 85
Surface Technology Systems, 105
suspension, see spring
folded-beam, 136
Suss Microtec, 18, 79
system
closed-loop, 30
control system, 30
first order, 45
linear system, 33
measurement system, 29
non-linear system, 221
open-loop, 30
second order, 47
system order, 33, 213
system in the package, 133, 175
system on chip, 175
Tanner Research, 102
TEM, see transmission electron microscope
tensor, 71
Texas Instruments, 12, 15, 23, 24, 75,
103, 159, 189, 196
TI, see Texas Instruments
time-independent, 33
transducer, 57
transducers, 134
transfer function, 29, 33
Transistor Outline, see case
transmission electron microscope, 123
Tronics, 193
UBM, see under-bump-metallization
under-bump-metallization, 194
underetch, 76
Van der Waals force, 104
variable
effort, 37
flow, 37
wafer bonding, 83
thermocompression, 84
Wafer level packaging (WLP), 193
wetting, 144
Wheatstone bridge, 146, 200
working distance, 120
XactiX, 104
XeF2 , see xenon difluoride
xenon difluoride, 104
Young's modulus, 71
Young-Laplace equation, 144
zeta potential, 160
ZnO, 163
memscyclopedia.org
ISBN: 978-2-9542015-0-4