Nanotechnology - Günter Schmid (Wiley-VCH, 2008) PDF
Edited by Günter Schmid
Wiley-VCH
This book was liberated by MYRIAD WAREZ, a release team dedicated to freeing mathematical and scientific knowledge. It has been released in retribution for John Wiley & Sons Inc.'s participation in the closing of ifile.it and library.nu. See http://torrentfreak.com/bookpublishers-shut-down-library-nu-and-ifile-it-120215/ and http://www.publishers.org/press/59/.
Full details of this book are on the publisher's website: http://onlinelibrary.wiley.com/book/10.1002/9783527628155/
The PDF was compiled from vector-graphics PDFs taken from the original work. Unfortunately, volumes 7 and 9 are missing. We hope that there are no other errors; please let us know if we screwed anything up: myriadwarez@gmail.com
Contents
Volume 1: Principles and Fundamentals

Introduction
Günter Schmid

Philosophy of Nanotechnoscience
Alfred Nordmann

Phase-Coherent Transport
Thomas Schäpers

Charged-Particle Lithography
Lothar Berger, Johannes Kretz, Dirk Beyer and Anatol Schwersenz

Extreme Ultraviolet Lithography
Klaus Bergmann, Larissa Juschkin and Reinhart Poprawe

Non-Optical Lithography
Clivia M. Sotomayor Torres and Jouni Ahopelto

Phase-Change Memories
Andrea L. Lacaita and Dirk J. Wouters

Organic Transistors
Hagen Klauk

Intermolecular- and Intramolecular-Level Logic Devices
Françoise Remacle and Raphael D. Levine

Volume 5: Nanomedicine

Introduction
Viola Vogel

From In Vivo Ultrasound and MRI Imaging to Therapy: Contrast Agents Based on Target-Specific Nanoparticles
Kirk D. Wallace, Michael S. Hughes, Jon N. Marsh, Shelton D. Caruthers, Gregory M. Lanza and Samuel A. Wickline

Nanoparticles for Cancer Detection and Therapy
Biana Godin, Rita E. Serda, Jason Sakamoto, Paolo Decuzzi and Mauro Ferrari

Electron Cryomicroscopy of Molecular Nanomachines and Cells
Matthew L. Baker, Michael P. Marsh and Wah Chiu

Pushing Optical Microscopy to the Limit: From Single-Molecule Fluorescence Microscopy to Label-Free Detection and Tracking of Biological Nano-Objects
Philipp Kukura, Alois Renn and Vahid Sandoghdar

Nanostructured Probes for In Vivo Gene Detection
Gang Bao, Phillip Santangelo, Nitin Nitin and Won Jong Rhee

Volume 6: Nanoprobes

Spin-Polarized Scanning Tunneling Microscopy
Mathias Getzlaff

Colloidal Lithography
Gang Zhang and Dayang Wang
1
Introduction
Günter Schmid
manifold materials and so stand for one of the fundamental principles of nanotechnology, in agreement with the definition. Spherical or one-dimensional matter of appropriate size can no longer be described by classical physical laws, but by quantum mechanical rules, indicating the decisive change from the macroscopic or microscopic world to the nanoworld.
An extremely important field of nanoscience, and also of nanotechnology, deals with the intelligent combination of artificial nanoscopic building blocks with biomolecular systems, which can themselves be considered the most powerful nanotechnological inventions we know. Most building blocks of living cells represent perfect nanosystems, the interplay of which results in the microscopic and macroscopic world of cells. We have learned to learn from Nature and consequently try to develop technologically applicable devices, ranging from novel sensor systems to diagnostic and therapeutic innovations. Chapter 6 gives an insight into this fascinating part of the nanoworld.
Philosophical and ethical questions are discussed in Chapters 7 and 8. What kind of knowledge is produced and communicated by nanotechnology? What is its place in relation to other sciences? These and related problems are discussed in Chapter 7. Studying current and future developments in nanotechnology from the viewpoint of ethics is an essential requirement for elaborating rules and concerted actions on how society should deal with them. Such reflections should accompany any novel technological development, especially nanotechnology, the power of which has already been compared with the beginning of a new genesis.
This is the first of a series of books dealing with the various fields of nanotechnology. In addition to the principles and fundamentals treated in this volume, information technology, medicine, energy, tools and analytics as well as toxicity will be the subjects of subsequent books. In all cases, established fields of nanotechnology and future areas of nanotechnological application will be described and discussed.
2
The Nature of Nanotechnology
Günter Schmid

2.1
Definition
This is the only existing definition that does not name a particular lateral scale. It excludes any kind of simple scaling effect (see later). The decisive aspect is the appearance of novel properties. These can in principle be observed below 1 or above 1000 nm (1 µm). Therefore, a strict limitation to a distinct length scale does not seem appropriate. Scientifically, it would even be absurd to set a limit to size-dependent properties or to novel properties of a construct of functionalized subunits. In spite of this deliberate omission of a particular lateral scale, the expressions nanoscience and nanotechnology are meaningful for practical use, since most of the known nano-effects happen on the nanoscale. Limiting the definition to the nanoscale, however, would degrade nanotechnology to a simple continuation of microtechnology. Microtechnology was and still is overwhelmingly successful through the continuous reduction of materials or tools, aiming not at the creation of novel abilities but at other advantages. Some examples in the context of the above definition will help us to understand better what is meant.
Typical size-dependent nano-effects that occur spontaneously when a critical dimension is reached are observed when metal particles are downsized. Depending on the kind of property change, the critical size may vary for the same element.
A very typical and well-known nano-effect is observed when gold is downsized. In the bulk state, the beautiful color of gold results from a very fundamental phenomenon, the relativistic effect, rooted in Einstein's Special Theory of Relativity. One of the basic messages of this theory is that the speed of light in vacuum is an absolute constant everywhere in the universe; it can never be surpassed. If an object were theoretically accelerated close to or even exactly to the speed of light, its mass would increase continuously with increasing speed, finally even ad infinitum.
Electrons in most atoms move around the atomic nucleus with speeds usually far below that of light. However, in very heavy atoms, the high charge of the nuclei accelerates electrons to such an extent that relativistic effects become apparent [3-5]. For instance, such effects are known for the elements lead, tungsten, mercury, platinum and gold. The interesting question is whether relativistic effects influence the physical and chemical properties of such a heavy element. Indeed they do! The acceleration primarily affects the s-electrons, which have their density in the nuclear region; the associated increase in their mass reduces their average distance from the nucleus. Consequently, the s-orbitals shrink and their energy is lowered. As a secondary effect, the d-electrons, being farther from the nucleus, become electronically better shielded. Their orbitals are therefore extended and raised in energy. For the p-electrons the two effects approximately cancel. In bulk metals, the individual orbitals of the atoms broaden into electronic bands. The valence band, resulting from the d-orbitals, is raised in energy, whereas the conduction band, formed from the s-orbitals, is lowered. The reduced energy difference between the valence band and the conduction band finally allows the low-energy photons of blue light to lift electrons from the valence band to the conduction band. Consequently, gold absorbs blue light and shows the complementary color yellow. Relativistic effects are revealed in various other ways as well: tungsten exhibits an unusually high melting point and mercury is the only metal that is liquid at normal temperatures. Figure 2.1 illustrates the relativistic effect in gold as a shrinking of the energy difference between the s- and d-orbitals.
Figure 2.1 Influence of relativistic effects on the energy level of d-, s- and p-orbitals.
Since the relativistic effect is a property of each individual gold atom, it must also be present in nanosized particles, although their color is no longer golden. This metal changes its appearance at about 50 nm, where the particles become blue. Further size reduction results in purple and finally, at about 15-20 nm, in bright red. This well-understood and long-known effect can be traced back to the existence of a so-called plasmon resonance. The phenomenon is described quantitatively by Mie theory [6]. Qualitatively, the formation of a plasmon resonance can be explained by a collective oscillation of the electrons with respect to the positive metal core of the particle, caused by the interaction of external electromagnetic radiation (visible light) with the confined electron gas of the nanoparticles. The process is illustrated in Figure 2.2.
The energy taken up from the light determines the resulting color. This energy depends on, among other factors, the particle's size and shape and the surrounding medium. With decreasing size the color is shifted to shorter wavelengths. If the particles are not spherical but elongated, two plasmon bands may occur, one for the transverse and one for the longitudinal resonance. Figure 2.3 shows the UV-visible spectrum of spherical 18-nm gold nanoparticles (colloids) with an absorption maximum at 525 nm [7].
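The plasmon band can be made concrete with a small numerical sketch. The following Python script is a minimal illustration of the quasi-static (dipole) limit of Mie theory, with the gold dielectric function approximated by a Drude model; the plasma energy, damping and background permittivity are rough literature values and should be read as assumptions, not data from this chapter. In this dipole limit the peak position is independent of particle size; reproducing the measured size-dependent shifts requires the full Mie expansion and tabulated optical constants.

```python
import numpy as np

# Approximate Drude parameters for gold (illustrative assumptions)
HBAR_WP = 9.0      # plasma energy (eV)
HBAR_GAMMA = 0.07  # damping (eV)
EPS_INF = 9.5      # background permittivity mimicking interband transitions
EPS_MEDIUM = 1.77  # surrounding water (n = 1.33)

def eps_drude(energy_ev):
    """Drude dielectric function of the metal at a given photon energy."""
    return EPS_INF - HBAR_WP**2 / (energy_ev**2 + 1j * HBAR_GAMMA * energy_ev)

def extinction(energy_ev, radius_nm):
    """Quasi-static extinction cross-section (nm^2) of a small metal sphere."""
    eps = eps_drude(energy_ev)
    wavelength_nm = 1239.84 / energy_ev                 # photon energy -> wavelength
    k = 2 * np.pi * np.sqrt(EPS_MEDIUM) / wavelength_nm
    # dipole polarizability; resonance where Re(eps) = -2 * EPS_MEDIUM
    alpha = 4 * np.pi * radius_nm**3 * (eps - EPS_MEDIUM) / (eps + 2 * EPS_MEDIUM)
    return k * np.imag(alpha)

energies = np.linspace(1.5, 3.5, 400)                   # visible range (eV)
sigma = extinction(energies, radius_nm=9.0)             # ~18-nm particle
peak_ev = energies[np.argmax(sigma)]
print(f"plasmon peak near {1239.84 / peak_ev:.0f} nm")  # ~500 nm, near the 525 nm of [7]
```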
Whereas the plasmon resonances of the three metals copper, silver and gold lie in the visible region, for most other metals they lie in the UV region and so cannot be observed with the naked eye.
The disappearance of the typical color of gold when a critical size is reached, and the appearance of blue or red colors, simply means that the plasmon resonance superimposes itself on the relativistic effect and so masks the typical color of bulk gold.
The second part of the above definition names the use of individual or combined functionalized subunits. As an individual functionalized subunit or building block, a molecular switch may serve as an example. A molecular switch is a molecule that exists in two different states which can be set by external stimuli. The two states must display different physical properties, each being stable with a long lifetime. If addressable by electrical or other stimuli, such molecules could in principle serve as building blocks in future storage systems. For instance, catenanes consist of two interlocked rings equipped with electrochemically active parts (see Figure 2.4). An applied electric potential causes Coulomb repulsion and makes one of the rings move relative to the other, ending in another stable configuration. The sketch in Figure 2.4 elucidates the process [8].
Finally, an example of combined functionalized subunits, the last of the three conditions in the definition, can be given. Nature is a perfect nanoarchitect. Many parts of living cells can be considered as combinations of functionalized building blocks, although cells themselves have dimensions in the micro regime. Probably the most exciting molecule in Nature is deoxyribonucleic acid (DNA) with its unique double-helical structure. It consists of fairly simple subunits: four different heterocyclic bases, phosphate anions and pentose fragments. It is the shape complementarity of the bases that enables an almost infinite number of combinations of base pairs, which finally encode the genome of any living system, using hydrogen bonds to link complementary bases: thymine (T) combines only with adenine (A), and cytosine (C) exclusively with guanine (G). The sugar fragments and the phosphates form a backbone holding the base pairs together. Figure 2.5 illustrates the decisive interaction between the four bases. The sequence of bases in one strand determines the sequence in the other.
In practice, the number of different base combinations in a human genome is infinite, considering the realistic length of a DNA double helix of about 3.2 billion base pairs.
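To make the combinatorial argument concrete, the short sketch below encodes the Watson-Crick pairing rules quoted above and estimates the size of the sequence space for a genome-length strand; the length of 3.2 billion base pairs is taken from the text, and the rest is elementary arithmetic.

```python
import math

PAIRING = {"A": "T", "T": "A", "C": "G", "G": "C"}  # Watson-Crick rules

def complement(strand: str) -> str:
    """Base sequence of the complementary strand (reading direction ignored
    for simplicity; biologically one would also reverse the string)."""
    return "".join(PAIRING[base] for base in strand)

print(complement("ATTGCAC"))            # -> TAACGTG

# The number of distinct sequences of length n over 4 bases is 4**n.
n = 3_200_000_000
digits = int(n * math.log10(4)) + 1
print(f"4^n has about {digits:,} decimal digits")  # roughly 1.9 billion digits
```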
The above definition and the following elucidating examples clearly demonstrate that nanoscience and nanotechnology, in a strictly scientific sense, do not cover simple scaling effects. One speaks of scaling effects if a material is downsized from the macro/microscale even to the nanoscale and its properties, if at all, change continuously, not spontaneously. In other words, characteristic properties are already present in the micro regime and only change gradually on reaching the nanoscale. A typical example will help to make clear what a scaling effect is. The well-known moth-eye effect is a well-developed natural system to avoid light reflection from the eyes of night-active insects [2]. The moth eye is built up from hemispheres 200-300 nm in diameter. Since these are smaller than the wavelength of visible light, the refractive index increases continuously, avoiding the strong reflection that occurs when light hits a flat and optically denser medium. The principle of the moth-eye effect can be used to structure surfaces artificially, for instance those of transparent materials, in order to avoid unintended reflection of light: windows, solar cells, spectacles, and so on. The techniques for nanostructuring surfaces are manifold and will not be considered here. Antireflection does not set in at a distinct size. The only condition is that the structural units must be smaller than the wavelength of light. It works with 300-nm units and also with 50-nm building blocks, of course with varying results, but it works. Numerous such scaling effects have been developed into very important techniques. However, they are wrongly called nanotechnology, since they are based only on scaling effects and not on real nano-effects as the definition demands.
2.2
From Nanoscience to Nanotechnology
Most of the currently known nano-effects are still deeply rooted in nanoscience; that is, we cannot yet speak of a technology at all. In spite of the obvious contradiction, one usually speaks of nanotechnology even if a technology has not yet been realized. In the following, a careful differentiation will be made, not just between science and technology, but also including the usual intermediate step, called (nano)engineering. The development of a technique from a scientific finding never happens in a single step.

2.2.1
Molecular Motors
Nature is a perfect nanotechnologist, and we are well advised to learn from it. Distinct proteins and protein assemblies are known to perform special motions in response to biological stimuli [9-13]. Such systems are called molecular motors or molecular machines. Numerous attempts have been made during the last two decades to transfer the increasing knowledge of biological systems at the molecular level to devices consisting of completely artificial components or of hybrid systems in which biomolecules and technical building blocks interact.
Myosins, kinesins and dyneins are frequently studied natural molecular motors [10-13]. Energetically fuelled by adenosine triphosphate (ATP), these proteins can move back and forth on actin filaments or microtubules, transporting substrates. It is not the intention of this chapter to describe these natural molecular machines; rather, it discusses man-made architectures in the sense of nanotechnology. Just one example illustrating Nature's principles will be briefly presented: the transport of actin filaments by myosins.
Kinesins are part of the transport system for organelles, proteins and mRNAs. Conventional kinesin is composed of two 80-nm long, 120-kDa heavy chains, connected to two 64-kDa light chains. The heavy chains are rod-like structures with two globular heads, a stalk and a fan-like end [9, 22, 23]. One-headed kinesins are also known [24]. The mechanism of motion has been studied intensively using one-headed kinesins [25, 26], but will not be described here. Of course, it is also driven by the energy of ATP hydrolysis. Microtubules are built up of 8-nm periodic building blocks of heterodimers of the subunits α- and β-tubulin, forming hollow tubes 24 nm in diameter.
Instead of a cargo of natural material such as vesicles or organelles, a recent example impressively demonstrates that artificial nanomaterial can also be transported by kinesin-microtubule systems: 7.6-nm core/shell CdSe/ZnS nanoparticles were functionalized via biotin-avidin coupling. The as-modified quantum dot complexes were then bound to immobilized and fluorescently labeled microtubules. The movement of the loaded kinesin along the microtubules was observed by means of epifluorescence and total internal reflection fluorescence (TIRF) microscopy [27]. Figure 2.11 shows a sketch of the microtubule-kinesin-quantum dot hybrid system. The particle transport could be visualized over 1200 s, considerably longer than in any comparable previous experiment.
Numerous totally artificial molecular machines with non-biological components have also become known. For instance, the photoisomerization of substituted alkenes is another general method to create molecular motors, provided that the process is reversible and the two isomers are kinetically stable. In the example shown in Scheme 2.1, the four bulky substituents are responsible for an energy barrier at 55 °C between the trans and cis configurations [28].
Another light-driven system is the so-called Irie switch, a molecule with a light-sensitive C-C bond that can be opened or closed by light of two different wavelengths (Scheme 2.2) [29]. Ring closure occurs under UV light and ring opening with light of wavelength >600 nm. Switching processes induced by light are in practice much more easily managed than those requiring extensive chemistry. Nevertheless, this and comparable objects are still part of basic research.
Rotaxanes consist of two parts: a stiff, bar-like part and a ring-shaped part arranged around the bar. Owing to electrostatic interactions, the ring prefers a distinct position. Chemical oxidation of the interaction site on the bar produces repulsion, and the ring is shifted to another position. Reduction brings the ring back to its former position. Figure 2.12 shows a rotaxane molecule with the ring in two different positions [30]. Both configurations are stable without any voltage applied.
Figure 2.13 A redox-driven molecular muscle consisting of two combined rotaxane molecules.
part A makes the platform B move up and down depending on the NH/NH2
situation in A [32, 33].
It is obvious that the above rotaxane examples are so far not really suited to working in devices, since complex chemistry is necessary to oxidize and reduce the specific positions. However, the study of such and similar systems is of enormous importance for gathering experience and continuously improving the conditions for making such systems applicable.
2.2.2
Molecular Switches
The reasons for the worldwide and intensive search for novel generations of switches and transistors lie in the unavoidable fact that the limits of present silicon technologies will eventually be reached. Moore's famous law predicts that between 2010 and 2020 the two-year rhythm of doubling the capacity of computers will find a natural end due to a typical nano-effect [34]: below a not precisely known size barrier, silicon will lose its semiconductor properties and instead behave as an insulator. Other technologies based on very different nano-effects will have to follow. For instance, magnetic data storage systems involving so-called spintronics [35], magnetic recording systems using nanosized magnetic particles [36], magnetic domain walls in nanowires [37], and so on, are developing tremendously. All are still far from being of technological relevance in the near future, and even the nanoengineering step has not really been reached so far. Hence they are still objects of intense basic research in nanoscience.
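The often-quoted doubling rhythm is easy to put into numbers. The sketch below extrapolates a transistor count that doubles every two years; the 1971 baseline (Intel 4004, about 2300 transistors) is an illustrative assumption, not a figure from this chapter.

```python
def transistors(year, base_year=1971, base_count=2300, doubling_years=2.0):
    """Moore's-law extrapolation: the count doubles every doubling_years."""
    return base_count * 2 ** ((year - base_year) / doubling_years)

for year in (1971, 1990, 2010, 2020):
    print(f"{year}: ~{transistors(year):.1e} transistors per chip")
# the 2010 value (~1.7e9) is of the right order for real processors of that time
```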
The term molecular switch is used for molecular systems that are stable in two different states. One state represents "1" and the other "0". The different states may consist either of two different geometric conformations or of two different electronic states. The two states must be interconvertible by external stimuli. An example of a molecular system existing in two different geometric states has already been introduced in Section 2.1 with the catenane molecule, which can in principle be switched by means of electrical pulses. Furthermore, all examples of artificial molecular motors described in Section 2.2.1 are at the same time molecular switches, since they exist in two different but interconvertible configurations. However, systems whose switching is based only on more or less complex chemistry look poorly suited for application in future nanoelectronic devices; what is needed are systems that can be switched by light or by electric pulses.
Molecular switches based only on a change of the electronic spin situation in a molecule are also promising candidates in this respect. Transition metal complexes with d4-d7 configurations can exist in either a high-spin (HS) or a low-spin (LS) version. High-spin complexes are characterized by the maximum number of unpaired electrons, following Hund's rules. Low-spin complexes have zero or, in the case of an odd number of electrons, one unpaired electron. Whether an octahedral complex exists in the HS or the LS configuration depends on the energy gap between the t2g and the eg orbitals. The separation of the originally equivalent five d-orbitals into three t2g and two eg orbitals (ligand-field splitting) is due to the different influence of the ligands coordinating the atom or ion along the x, y and z axes (eg) or between the axes (t2g). Small gaps Δ result in HS configurations and large Δ values in LS complexes. Figure 2.15 shows qualitatively the situation in both cases for a d6 complex.
The Δ values are determined predominantly by the nature of the ligand molecules coordinating the transition metal atom or ion. So-called weak ligands (H2O, halides X⁻) cause small ligand-field splitting, and strong ligands (CN⁻, CO, olefins) cause large Δ values. However, situations exist where the energy difference between the HS and LS configurations is small enough to be influenced by external stimuli; consequently, switching between the two electronic configurations becomes possible, provided that the transition between the two states is abrupt. Numerous such spin-transition complexes have been identified and are of increasing interest with respect to molecular switching systems. It is of special relevance that HS-LS transitions can be induced by different stimuli such as temperature, pressure or light.
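The counting rule behind Figure 2.15 can be written down directly. The following sketch fills the t2g and eg orbitals of an octahedral d4-d7 complex in both limits and reports the total spin S; it is pure Hund's-rule bookkeeping, not a ligand-field calculation.

```python
from fractions import Fraction

def fill_octahedral(n_d, high_spin):
    """Distribute n_d electrons over three t2g and two eg orbitals.

    High spin (small gap): single occupation of all five orbitals first.
    Low spin (large gap):  fill the lower t2g set completely before eg.
    """
    occ = [0] * 5                                 # indices 0-2 = t2g, 3-4 = eg
    if high_spin:
        passes = [range(5), range(5)]             # singles everywhere, then pairing
    else:
        passes = [range(3), range(3), range(3, 5), range(3, 5)]
    remaining = n_d
    for orbital_set in passes:
        for i in orbital_set:
            if remaining and occ[i] < 2:
                occ[i] += 1
                remaining -= 1
    return occ, Fraction(occ.count(1), 2)         # S = (number of unpaired e-) / 2

for n in range(4, 8):
    for hs in (True, False):
        occ, S = fill_octahedral(n, hs)
        print(f"d{n} {'HS' if hs else 'LS'}: t2g={occ[:3]} eg={occ[3:]} S={S}")
# d7 gives S = 3/2 (HS) and S = 1/2 (LS), matching the Co(II) example below.
```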
An example of a temperature-switchable complex is given in Figure 2.16. The two configurations of the d7 Co(II) complex can be followed through the magnetic susceptibility. The HS version has a total spin of S = 3/2 and the LS form of S = 1/2 [38]. The energetically higher lying states of the HS configuration are reached by increasing the temperature from about 130 to 150 K, and vice versa.
The tetranuclear d6 iron complex shown in Figure 2.17 can be switched by temperature, pressure or light [39]. The four iron centers allow switching over three magnetically different configurations: 3HS/1LS, 2HS/2LS and 1HS/3LS. This special situation allows manifold storage and switching varieties. The chance to switch this complex by light makes it a remarkable candidate for future applications.
2.2.3
Single-Electron Memories
Figure 2.18 The electronic transition from the bulk state to a quantum dot.
Single-electron tunneling (SET) from one electrode into the nanoparticle leads to an increase in charge by e (1.6 × 10⁻¹⁹ C), linked with an increase in the electrostatic energy EC by EC = e²/2C, where C is the capacitance of the nanoparticle. As can be seen from Figure 2.19, the metal nanoparticle does not directly touch the electrodes; instead, an insulating envelope (or a corresponding gap) separates it from the contacts in order to obtain an appropriate capacitance of the system. In order to avoid uncontrolled thermal tunneling of electrons, EC must be much larger than the thermal energy ET = kBT (kB = Boltzmann's constant = 1.38 × 10⁻²³ J K⁻¹): e²/2C ≫ kBT. The observation of an SET process will therefore only be possible either at very low temperatures or with very small C values. Since C = εε₀A/d (ε = dielectric constant of the shell, ε₀ = vacuum permittivity, d = distance between electrode and metal core, A = surface area of the particle), small C values can be realized with very small particles having a sufficiently thick insulating shell. The charge generated on the particle causes a voltage U = e/C, linked with a current I = U/RT (RT = tunneling resistance).
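The orders of magnitude in this argument are easy to check numerically. The sketch below evaluates EC = e²/2C with the capacitance estimate C = εε₀A/d from the text and compares it with kBT; the shell thickness of 1 nm and the dielectric constant ε = 3 are illustrative assumptions in the spirit of ligand-stabilized particles, not measured values.

```python
import math

E_CHARGE = 1.602e-19   # elementary charge (C)
K_B = 1.381e-23        # Boltzmann constant (J/K)
EPS_0 = 8.854e-12      # vacuum permittivity (F/m)

def charging_energy(diameter_nm, shell_nm=1.0, eps_r=3.0):
    """Return (E_C in J, C in F) for a metal sphere with an insulating shell."""
    radius_m = diameter_nm * 1e-9 / 2
    area = 4 * math.pi * radius_m**2                          # surface area A
    capacitance = eps_r * EPS_0 * area / (shell_nm * 1e-9)    # C = eps*eps0*A/d
    return E_CHARGE**2 / (2 * capacitance), capacitance

for d_nm in (17.0, 2.1):  # the 17-nm Pd particle below vs. a ~2-nm cluster
    ec, c = charging_energy(d_nm)
    print(f"d = {d_nm:4.1f} nm: C = {c:.2e} F, "
          f"E_C = {1000 * ec / E_CHARGE:.0f} meV, "
          f"E_C/kT(300 K) = {ec / (K_B * 300):.1f}")
# ~17 nm: E_C << kT at room temperature, so blockade appears only on cooling;
# ~2 nm:  E_C >> kT, consistent with room-temperature single-electron behavior.
```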
The temperature dependence of SET has been convincingly demonstrated by the study of a 17-nm Pd nanoparticle covered by a shell of H2NC6H4SO3Na molecules. As can be seen from Figure 2.20, the I-U characteristic at 295 K is a straight line, following Ohm's law. At 4.2 K, however, a well-expressed Coulomb blockade is observed, indicating that between about -55 and +55 mV the current is interrupted because an additional electron on the particle blocks the transport of a second one [45].
Coulomb blockade is enlarged; however, the most informative picture is obtained from a diagram using dI/dU values instead of I (Figure 2.22). Generally, the blockade then turns into a minimum on the U axis. Due to the low temperature, discrete energy levels become visible inside the minimum in the form of conductivity oscillations with an average level spacing of 135 meV. The dashed and solid lines result from two measurements on the same particle but at different positions, namely above a phenyl ring of PPh3 and above bare gold atoms. They agree fairly well and so indicate that the result does not depend on the matter between the tip and the Au55 nucleus.
This result demonstrates the existence of a perfect quantum dot, working at room temperature and representing exactly position (c) in the sketch in Figure 2.18. Figure 2.22 also makes it clear why such working units are sometimes called "artificial atoms".
The Coulomb blockade, based on the transfer of single electrons, represents a perfect single-electron switch and can in principle also be used in a single-electron transistor, as indicated in Figure 2.23.
These fundamental findings make Au55 and metal particles of similar size excellent candidates for future storage systems. Intensive research and development are still necessary to reach this ultimate goal. The very first steps from the science level to the engineering level are in progress; nevertheless, quantum dot memories are still deeply rooted in basic research.
2.2.4
Drug Delivery
Figure 2.24 Concentration (c)-time (t) profile of a conventional and a controlled drug release.
host to such an extent that slow release is enabled. A host-guest system based on dendrimers as host molecules will be described briefly as a current example under development. Dendrimers are highly branched molecules in the nanometer size regime. They consist of a core unit from which branching arms extend in different directions, forming a three-dimensional architecture bearing end groups of various functionality. Dendrimer structures unavoidably contain cavities inside the skeleton. These are able to take up guest molecules of appropriate size and to release them slowly, depending on the surrounding conditions. With an increasing number of branches, the number and geometry of the cavities become variable and also increase [57]. Figure 2.25 shows a formal sketch of a dendrimer molecule taking up and releasing guest particles.
Another interesting nano-based drug delivery system uses superparamagnetic iron oxide nanoparticles, usually embedded in a polymer matrix and attached to a drug. External high-gradient magnetic fields are applied to transport the drug-loaded beads to the corresponding site in the body [58, 59]. Once the system has concentrated in the tumor, the drug is released using different techniques such as an increase in temperature, a change of pH value or enzymatic activity.
Another method uses superparamagnetic iron oxide nanoparticles whose surface is modified with DNA sequences. These particles easily enter cells via receptor-mediated endocytosis mechanisms in combination with a magnetic field gradient [60]. Having entered the cell, the DNA is liberated and can enter the nucleus. This so-called non-viral transfection is of special interest for gene therapy.
A rapidly growing drug delivery development is based on the use of multifunctional nanoengineered capsules containing various kinds of active compounds. Attempts have been made to solve the general problem of treating only pathological cells and not healthy ones by using, for instance, functionalized polymer capsules having distinct release, permeability and adhesion properties. The inner volume can be filled with magnetic nanoparticles, which allow targeted transport by external magnetic fields; the capsule surface can be modified with specific receptors to target diseased cells; or capsules can act as nanoreactors producing products that are toxic only for diseased cells and cause selective apoptosis [61, 62].
Self-rupturing microcapsules consist of polyelectrolyte membranes that are permeable to water but not to the drug-containing degradable microgels filling the interior of the capsules. The hydrolytic degradation of the microgels causes a swelling pressure, rupturing the outer membrane [63].
Polyelectrolyte capsules that degrade at physiological pH values open up novel prospects for drug delivery [64]. Intracellular targets such as nucleic acids or proteins can cause opening of the capsules. Using CaCO3 particles as the carrier material for fluorescein isothiocyanate-dextran (FITC-dextran), assemblies of CaCO3/FITC-dextran are formed by coprecipitation, followed by layer-by-layer polyelectrolyte membrane formation, for instance with poly-L-arginine as the polycation and dextran sulfate as the polyanion. Finally, the CaCO3 particles are removed using buffered EDTA solution.
Finally, laser-induced opening of polyelectrolyte membranes inside living cells may be mentioned [65].
A completely different drug delivery system has been developed using nanoporous alumina membranes for the controlled long-term release of drugs. Nanoporous alumina membranes with pore diameters adjustable between 10 and 200 nm are easy to prepare and are used to control the speed of release via the pore size [66]. Figure 2.26 shows a sketch of the implantable device and Figure 2.27 illustrates the influence of the pore diameter on the speed of release of the same molecule. Instead of a real drug, the system was developed using crystal violet, allowing easy determination of concentrations by means of UV-visible spectroscopy. Of course, this system requires individual development for each drug to optimize the pore size and solubility conditions, for instance with the help of surfactants.
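A hedged back-of-envelope model shows why smaller pores slow the release: if transport through the membrane is diffusion limited, the steady-state rate follows Fick's first law and scales with the total open pore area, that is, with the square of the pore diameter. The diffusion coefficient, membrane thickness and pore density below are illustrative assumptions, not parameters from the cited study.

```python
import math

def release_rate(pore_diameter_nm, pores_per_um2=100, D=4e-10,
                 thickness_um=60.0, delta_c=1.0):
    """Diffusive release rate per um^2 of membrane (mol/s), Fick's first law.

    Per pore: J = D * A_pore * delta_c / L, with D in m^2/s, delta_c in mol/m^3
    and L the membrane thickness. All geometry values are assumptions.
    """
    r_m = pore_diameter_nm * 1e-9 / 2
    a_pore = math.pi * r_m**2                  # open cross-section of one pore
    return pores_per_um2 * D * a_pore * delta_c / (thickness_um * 1e-6)

base = release_rate(10)
for d in (10, 50, 200):                        # pore diameters from the text (nm)
    print(f"{d:3d} nm pores -> relative rate {release_rate(d) / base:5.0f}x")
# 50-nm pores release ~25x faster, 200-nm pores ~400x faster than 10-nm pores
```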
Figure 2.27 Dependence of the release of crystal violet on the pore size.
None of these nano-based techniques has yet reached the technology level. Rather, they are under development and have partially reached the engineering state.
2.2.5
Gene Chips
DNA microarrays are also used to analyze particular genome sequences. Gene chips of this type have, in spite of their enormous contribution to diagnostics, inherent drawbacks due to the fluorophore labeling (in some cases even radioactive labeling is used). Recent advances in nanoscience are opening the door to increasing the sensitivity of DNA detection to unprecedented levels. DNA-nanoparticle conjugates are powerful tools in this direction.
New developments in microarrays are based on the use of DNA functionalized with quantum dots, whose excitation and emission properties make them powerful candidates for replacing fluorophore labeling techniques. Gold and silver nanoparticles, but also CdSe and ZnS quantum dots, have been tested successfully.
The surface plasmon resonance of gold nanoparticles and the resulting intense color allow the very sensitive and selective colorimetric detection of corresponding DNA sequences. Au nanoparticles in the size range from about 10 up to 100 nm, typically 10-20 nm, are linked with DNA probe strands by using 3′- or 5′-end mercapto-functionalized species. Due to the preferred formation of strong Au-S bonds, Au-DNA hybrid systems are easily accessible. Mirkin's group first reported the use of mercaptoalkyloligonucleotide-modified gold nanoparticles [67-71]. They used two samples, each complementary to one half of a target oligonucleotide, so that mixing the three species induced the formation of a polymeric network, as indicated in Figure 2.28. The observable outcome of this experiment was a color change from red to blue due to the aggregation of the nanoparticles.
Based on the colorimetric nanoparticle approach to DNA detection, microarrays using DNA-gold nanoparticle hybrid systems are being developed increasingly. A three-component system was first described by Mirkin and coworkers [69, 72, 73]. It consists of a glass chip whose surface is modified with capture DNA strands to recognize the DNA under investigation. The oligonucleotide-functionalized gold probe and the target DNA complete the system. Intense washing after the combination process results in a selectivity of 10 : 1 for single base-pair mutations. In a final step, the gold nanoparticles are covered with a silver shell, which is simply generated by the catalytic reduction of silver ions on the gold nanoparticles. The capture-strand-target-nanoparticle combination can then be visualized using a flatbed scanner. Due to the presence of silver shells on the gold particles, strong surface-enhanced Raman scattering (SERS) is observed and can also be used for target DNA detection. Figure 2.29 elucidates the various steps.
The improvement of this technique over conventional fluorophore-labeling techniques is about 100-fold, with detection limits as low as 50 fM. These remarkable nanotechnology-based developments will initiate great progress in diagnostics. As an example, first investigations of Alzheimer's disease (AD) can be mentioned [74].
Alivisatos and coworkers detected single base-pair mutations in DNA by the use of CdSe/ZnS quantum dots in chip-based assays [75]. The detection method was fluorescence microscopy and the detection limit was about 2 nM. This is not yet in the region of the detection limit described above, but it is likely that this technique can be improved considerably, since it is known that even individual quantum dots can be detected under ideal conditions [76].
Gene chips based on the use of quantum dots are among the most promising developments in nanotechnology. In an unusually short time, beginning with the very first experience with biomolecule-quantum dot interactions, a development started that has already led to commercially available devices. Efforts are still being made simultaneously on all three levels: further improvements of detection limits are still part of nanoscience, while at the same time improvements of routine detection processes are continuing in order to facilitate everyday clinical handling.
2.2.6
Hyperthermia
Hyperthermia, known for several decades, has more or less developed to the level of clinical application on the basis of nanotechnological approaches and can now be located in the fields of nanoengineering and even nanotechnology. It uses the fact that superparamagnetic nanoparticles can be warmed up by external alternating magnetic fields. As has long been known, tumor cells respond sensitively with apoptosis to temperature increases of only a few degrees (40-44 °C) [77-81].
The superparamagnetic state of a material at room temperature is reached when the thermal energy kT (k = Boltzmann's constant) overcomes the magnetostatic energy of a domain or particle. If the particle or domain is small enough, hysteresis no longer exists; in other words, the magnetic unit loses the ability of larger particles to store a magnetization orientation; rather, the magnetic moments rotate freely, giving superparamagnetic behavior. Typical particle sizes for the transition from ferro- to superparamagnetism are in the range 10-20 nm for oxides; metal particles have to be downsized to 1-3 nm. A great advantage of superparamagnetic particles is that they can be dispersed in various liquids without any tendency to agglomerate, an important condition for applications in medicine.
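The size window quoted here can be estimated from the Néel relaxation criterion: a particle behaves superparamagnetically once its anisotropy energy KV falls below roughly 25 kT (the usual rule of thumb for a measurement time of about 100 s). The sketch below applies this criterion; the anisotropy constants are rough literature values and, like the criterion itself, are assumptions rather than numbers from this chapter, so only the trend should be taken seriously.

```python
import math

K_B = 1.381e-23  # Boltzmann constant (J/K)

def critical_diameter_nm(K, T=300.0, ratio=25.0):
    """Largest sphere diameter (nm) that is superparamagnetic: K*V = ratio*kB*T."""
    volume = ratio * K_B * T / K               # critical particle volume (m^3)
    return (6 * volume / math.pi) ** (1 / 3) * 1e9

# K ~ 1.1e4 J/m^3 for magnetite (approximate literature value; assumption)
print(f"magnetite-like oxide: d_crit ~ {critical_diameter_nm(1.1e4):.0f} nm")
# a metal with much larger anisotropy must be correspondingly smaller
print(f"K = 5e5 J/m^3 metal:  d_crit ~ {critical_diameter_nm(5e5):.0f} nm")
```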
Several types of superparamagnetic oxide nanoparticles have been investigated for application in hyperthermia. The most promising candidates are magnetite and maghemite, since their biocompatibility has already been demonstrated. The amount of magnetic material needed to reach the necessary temperature depends on the concentration of the particles in the cells. Direct injection allows larger quantities than intravascular administration or antibody targeting. On the other hand, direct injection into the tumor involves a certain danger of promoting the formation of metastases. In any case, the amount of magnetic nanoparticles necessary for a sufficient temperature increase depends on the magnetic properties of the particles and on the external radio-frequency field. Under optimum conditions, only 0.1 mg per mL of tissue is necessary to induce cell death.
2.2.7
Gas Sensors
Gas sensors have been known and applied since the early 1960s [82, 83]. The fields of application range from industrial and automotive needs (NOx, NH3, SO2, hydrocarbons, etc.) via domestic gas determination (CO2, humidity) up to the security sector, where traces of explosives have to be detected. In a working sensor system, the information resulting from the chemical or physical interaction between a gas molecule and the sensor has to be transformed into a measurable signal. Numerous possibilities such as electrochemical, calorimetric, acoustic, chemoresistive and other effects are well established. Chemoresistors typically use metal oxides, which change their electrical resistance when they oxidize or reduce the gases to be detected [82-85]. Continuous improvements have been made concerning sensitivity, selectivity, stability, response and recovery time [86]. In connection with nanotechnological developments, sensors based on metal oxides and metal nanoparticles have been studied intensively and have reached the state of engineering and even technology (see Figure 2.6) [86].
Sensors based on nanosized metal oxides provide both receptor and transducer functions [87]. The receptor must ensure a specific interaction of the sensor's surface with the target analyte. The transducer's task is to transform the molecular information into a measurable change of the electrical resistance. For instance, the conductivity of n-type semiconducting metal oxides increases on contact with reducing gases, whereas that of p-type oxides decreases. Oxidizing gases cause the opposite effects. A frequently used wide-bandgap n-type semiconductor is SnO2. A qualitative explanation of the working principle (for details see [86]) is that, in the presence of dry air, oxygen is ionosorbed on the oxide surface, depending on temperature as O₂⁻ (<420 K), as O⁻ (420-670 K) or as O²⁻ (>870 K). The electrons required for the reduction of O2 come from the conduction band, generating an electron-deficient region, the so-called space-charge layer Lair [88-91]. Lair depends on the Debye length LD, a material- and temperature-dependent value; for SnO2 at 523 K it is about 3 nm [92]. In real systems water is present to some extent, forming hydroxyl groups on the surface and affecting the sensor's properties. The influence of water has been discussed in detail [93].
Reducing gases interact with the ionosorbed oxygen species and are oxidized (for instance CO → CO2, which is desorbed). Even traces of reducing gases decrease the number of oxygen species to such an extent that, due to the release of surface-trapped electrons, the increase in conductance becomes measurable. In the case of oxidizing target gases, the process is inverse: additional electrons are removed from the semiconductor, resulting in an increase in Lair. Hence adsorption of oxidizing gases, for example NO2 or O3, causes a decrease in conductance.
The efficiency of a gas sensor depends not only on the material of which it is made but also, decisively, on the size of the particles and their arrangement, since the relevant reactions occur on the particle surfaces. In an ideal case all existing percolation paths are used, contributing to a maximum change in conductance. The response time depends on how fast the participating gases diffuse to equilibrium. Film thickness and porosity are therefore of special relevance for the quality of a sensor [94, 95].
A vital role in this connection is played by the size of the particles forming the macroscopic film. Since the analyte molecule-sensor interaction occurs on the particle surface, the surface-to-volume ratio plays a dominant role. Since the relative proportion of surface increases with decreasing particle size, smaller particles should be more efficient than larger ones. This can clearly be seen from Figure 2.30 [87].
SnO2 particles with diameters below about 10 nm increase the sensor's response exponentially. In addition to the increase in surface area, particle radii in the range of the space-charge layer Lair decrease the Schottky barriers between depleted zones or even lead to their overlap, with the consequence that surface states dominate the electrical properties and so have a decisive influence on the sensor performance. Very small differences in particle size can have crucial consequences for the sensor's capability. As has been shown for WO3 nanoparticles, a reduction from 33 to 25 nm increases the sensitivity towards 10 ppm NO2 at 573 K by a factor of 3-4 [96]. Several similar examples for other metal oxides are known [86].
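The steep size dependence can be rationalized with a simple geometric sketch: if a surface shell of thickness Lair is electron depleted, the depleted volume fraction of a spherical grain rises sharply as the diameter approaches 2·Lair, at which point the whole grain, and hence the conductance, is controlled by surface states. The value Lair = 3 nm is the SnO2 figure quoted above; treating the space-charge layer as a sharp shell is of course a simplification.

```python
def depleted_fraction(diameter_nm, l_air_nm=3.0):
    """Volume fraction of a spherical grain inside the space-charge layer.

    Assumes a sharp depletion shell of thickness l_air_nm; grains with
    d <= 2 * l_air_nm are fully depleted.
    """
    if diameter_nm <= 2 * l_air_nm:
        return 1.0
    core = (1 - 2 * l_air_nm / diameter_nm) ** 3   # undepleted core fraction
    return 1.0 - core

for d in (50, 20, 10, 8, 6):
    print(f"d = {d:2d} nm: {100 * depleted_fraction(d):5.1f} % depleted")
# the jump below ~10 nm mirrors the steep rise of the response in Figure 2.30
```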
Metal nanoparticles as building blocks for sensor systems have been under investigation since the late 1990s. Thin films of ligand-protected metal nanoparticles change their conductance when gas molecules are absorbed in the regions between the nanoparticles [97, 98]. Films of 2-nm gold nanoparticles covered with octanethiol molecules turned out to change their conductance reversibly when gas molecules such as toluene, 1-propanol or water became part of the interparticle sphere. This principle
2.3
Technologies on the Nanoscale
2.3.1
Introduction
This section deals with facets of nanotechnology that do not fulfill the strict definition of nanoscience and nanotechnology explained in Section 2.1. Why is this necessary? The term nanotechnology has by now acquired such a broad and thereby diffuse meaning that it seems helpful to discuss some examples of already established techniques that are falsely subsumed under nanotechnology in a broader sense. Indeed, in some cases it is not trivial to decide whether an effect, and a technique resulting from it, follows the precise definition or not. From experience it can be seen that "wrong" nanotechnology means techniques that are settled on the nanoscale, but without the decisive size-dependent or functionality-determined nano-effect. What is usually meant by this common understanding is technology(ies) on the nanoscale. The exclusion of those techniques from the scientifically exactly defined ones is not a discrimination; rather, some of them have become very important and others will follow. To conclude this introduction, one should try to differentiate clearly between nanotechnology and technologies on the nanoscale. The latter can also be considered as exploiting scaling effects without nano-specific effects.
2.3.2
Structured Surfaces
It has long been known that structuring a surface changes the physical and chemical properties of the corresponding material. Two property changes dominate the interest in structured surfaces: (i) changes in wettability and (ii) changes in optical properties. Both are of enormous importance in nature and in technology. What kind of structures are we talking about? Let us first consider a natural example that has been copied in many respects: wettability. Barthlott et al. have investigated since about 1990 the surfaces of lotus leaves with regard to their special property of staying permanently clean [120-122]. Like all primary parts of plants, lotus leaves are covered by a layer of hydrophobic material; in the case of lotus leaves, this layer consists of epicuticular wax crystals. Scanning electron microscopy (SEM) investigations additionally revealed a structural design that is responsible for the superhydrophobicity of lotus leaves. This special behavior has subsequently become known as the lotus effect. The SEM images showed that the surface of the leaves consists of a double structure of microsized cells decorated with nanosized wax crystals.
The physical background of this phenomenon can be seen in the behavior of water droplets on a micro/nanostructured surface. It is important to state that the effect does not depend on a distinct size of the structural units. As it turned out, the lotus combination of micro- and nanosized units is advantageous, but it is not a condition. Also, the absolute size of the nanostructured units is not decisive for observing the effect; it can only improve or worsen the hydrophobic character.
Wettability generally describes the interaction of a liquid with a solid surface. It is described by the Young equation, σsg = σsl + σlg·cos θ, where σsg = solid-gas interfacial tension, σsl = solid-liquid interfacial tension, σlg = liquid-gas interfacial tension and θ = solid-liquid contact angle [123]. The contact angle θ is the angle between the solid surface and the tangent applied at the surface of the droplet. Figure 2.32 illustrates the situation of a water droplet on a smooth surface, a nanostructured surface and a micro/nanostructured surface.
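For a quick numerical feel, the snippet below solves the Young equation for the contact angle and then applies the Wenzel roughness correction, cos θ* = r·cos θ, which is the standard way structured surfaces of the kind shown in Figure 2.32 amplify the intrinsic (de)wetting; the Wenzel model and all interfacial tensions used here are illustrative assumptions, not data from this chapter.

```python
import math

def contact_angle_deg(sigma_sg, sigma_sl, sigma_lg):
    """Young equation: sigma_sg = sigma_sl + sigma_lg * cos(theta)."""
    cos_theta = (sigma_sg - sigma_sl) / sigma_lg
    if not -1.0 <= cos_theta <= 1.0:
        raise ValueError("no equilibrium contact angle for these tensions")
    return math.degrees(math.acos(cos_theta))

# illustrative tensions in mN/m: water (sigma_lg = 72.8) on a PTFE-like solid
theta = contact_angle_deg(sigma_sg=20.0, sigma_sl=30.0, sigma_lg=72.8)
print(f"flat surface:              {theta:.0f} deg")  # ~98 deg, hydrophobic

r = 1.8  # Wenzel roughness factor (true area / projected area); assumption
theta_rough = math.degrees(math.acos(r * math.cos(math.radians(theta))))
print(f"roughened (Wenzel, r={r}): {theta_rough:.0f} deg")  # larger angle
```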
It is obvious that structured surfaces in any case cause larger contact angles than flat surfaces and thus show increased hydrophobicity. The reason is that the energy needed to spread a water droplet over a structured surface exceeds the energy gained by additional interactions of water molecules with the surface.
Soon after its recognition, the lotus effect led to the development of artificially micro/nanostructured surfaces. Lithographic techniques, self-assembly processes, controlled deposition, size reduction and replication by physical contact are applicable routes [124]. An elegant replication procedure will be considered briefly. It uses masks consisting of nanoporous alumina films. Their advantages are the rather simple fabrication by anodization of aluminum surfaces, the easy adjustability of the pore diameters, and the hardness and temperature stability of alumina [125-129]. Using appropriate imprinting devices, various polymers and metals can be nanostructured [130]. The successful 1 : 1 polymer transfer from the mask to the surface is shown in Figure 2.33 by means of a poly(methyl methacrylate) (PMMA) surface.
Polycarbonate (PC) and polytetrafluoroethylene (PTFE) could also be nanostructured with different pore widths. Aluminum, iron, nickel, palladium, platinum, copper, silver and brass are examples of successfully nanostructured metals. Figure 2.34 shows an AFM image of a nanostructured silver surface produced using a mask with 50-nm pores.
By means of PTFE surfaces imprinted with masks of 50, 120, 170 and 200 nm pore width, it has been demonstrated that each of the structured surfaces increases the hydrophobicity compared with an untreated surface. However, there is no spontaneous effect to be observed; rather, it is a scaling phenomenon. Figure 2.35 shows light microscopic images of water droplets on variously nanostructured PTFE surfaces. An increasing contact angle is registered as the pore width, and thereby the pillars on the surfaces, increase. This continuous development of the contact angles can also be followed in Figure 2.36.
In addition to the wettability, a second physical property changes with structure: the light transmission of transparent materials. Figure 2.37 demonstrates the increasing transmission of visible light through PMMA windows with decreasing structure size. Improvements in the transparency of glasses, linked with a reduction in reflection, have important practical consequences for optical devices.
2.4
Final Remarks
Referring to Figure 2.6, only seven of a huge number of possible examples have been selected here to demonstrate the enormous diversity of nanoscience and nanotechnology. Nano-effects can occur everywhere, both in simple materials and in complex biological structures. This makes nanoscience and nanotechnology a unique field of research and development. The examples discussed illustrate the universality of this future-determining technology, which in many of its most attractive fields is still at the very beginning. However, it can be predicted that distinct fields that are still part of basic research will develop into techniques that will influence daily life dramatically. Others, usually those allowing easier and faster research and development, have already become routine techniques.
The selected examples also indicate that basic research is fundamental to developing nanotechnology further. Only through basic research will we discover nano-effects in the different fields named in the definition. However, even in basic science the research assignment should not just be to look for novel nano-effects: physics, chemistry, biology and materials science will also discover relevant effects when working in other fields.
References
1 Nanoscience and Nanotechnologies: Opportunities and Uncertainties (2004), The Royal Society and The Royal Academy of Engineering, Science Policy Section, The Royal Society, London, p. 5.
2 Brune, H., Ernst, H., Grunwald, A., Grünwald, W., Hofmann, H., Krug, H., Janich, P., Mayor, M., Rathgeber, W., Schmid, G., Simon, U., Vogel, V. and Wyrwa, D. (2006) Wissenschaftsethik und Technikfolgenabschätzung, Vol. 27: Nanotechnology. Assessment and Perspectives, Springer, Berlin.
3 Schwerdtfeger, P. (ed.) (2002) Relativistic
Electronic Structure Theory. Part 1:
Fundamentals, Elsevier, Amsterdam.
4 Schwerdtfeger, P. (ed.) (2005) Relativistic
Electronic Structure Theory. Part 2:
Applications, Elsevier, Amsterdam.
5 Hess, B.A. (ed.) (2002) Relativistic Effects in
Heavy-element Chemistry and Physics,
Wiley, New York.
6 Mie, G. (1908) Annalen der Physik, 25, 377.
7 Schmid, G. and Giebel, U., unpublished work.
8 Collier, C.P., Mattersteig, G., Wong, E.W.,
Luo, Y., Beverly, K., Sampaio, J., Raymo,
F.M., Stoddart, J.F. and Heath, J.R. (2000)
Science, 289, 1172.
9 Schliwa, M. (ed.) (2003) Molecular Motors,
Wiley-VCH, Weinheim.
10 Tyreman, M.J.A. and Molloy, J.E. (2003)
IEE Proceedings Nanobiotechnology, 150,
95.
11 Vallee, R.B. and Hook, P. (2003) Nature,
421, 701.
12 Schliwa, M. and Woehlke, G. (2003)
Nature, 422, 759.
13 Ball, P. (2002) Nanotechnology, 13, R15.
14 Braig, K., Otwinowski, Z., Hegde, R., Boisvert, D.C., Joachimiak, A., Horwich, A.L. and Sigler, P.B. (1994) Nature, 371, 578.
15 Ditzel, L., Lowe, J., Stock, D., Stetter,
K.-O., Huber, H., Huber, R. and
Steinbacher, S. (1998) Cell, 93, 125.
16 Wang, J. and Boisvert, D.C. (2003) Journal
of Molecular Biology, 327, 843.
32 Badjic, J.D., Balzani, V., Credi, A., Silvi, S. and Stoddart, J.F. (2004) Science, 303, 1845.
33 Balzani, V. (2005) Small, 1, 278.
34 Moore, G. (1965) Electronics, 38, 114.
35 Wolf, S.A., Treger, D. and Chtchelkanova,
A. (2006) MRS Bulletin, 31, 400.
36 Richter, H.J. and Harkness, S.D. IV,
(2006) MRS Bulletin, 31, 384.
37 Allenspach, R. and Jubert, P.-O. (2006)
MRS Bulletin, 31, 395.
38 Zarembowitch, J. and Kahn, O. (1991)
New Journal of Chemistry, 15, 181.
39 Breuning, E., Ruben, M., Lehn, J.-M., Renz, F., Garcia, Y., Ksenofontov, V., Gütlich, P., Wegelius, E. and Rissanen, K. (2000) Angewandte Chemie-International Edition, 39, 2504.
40 Feldheim, D.L. and Keating, C.D. (1998)
Chemical Society Reviews, 27, 1.
41 Simon, U. (1998) Advanced Materials, 10,
1487.
42 Simon, U. and Schön, G. (2000) in
Handbook of Nanostructured Materials and
Nanotechnology (ed. H.S. Nalwa),
Academic Press, New York, Vol. 3, p. 131.
43 Simon, U., Schön, G. and Schmid, G. (1993) Angewandte Chemie-International Edition in English, 32, 250.
44 Simon, U. (2004) in Nanoparticles. From
Theory to Application (ed. G. Schmid),
Wiley-VCH, Weinheim, p. 328.
45 Bezryadin, A., Dekker, C. and Schmid, G. (1997) Applied Physics Letters, 71, 1273.
46 Schmid, G., Boese, R., Pfeil, R.,
Bandermann, F., Meyer, S., Calis, G.H.M.
and van der Velden, J.W.A. (1981)
Chemische Berichte, 114, 3634.
47 Schmid, G. (1990) Inorganic Syntheses, 27, 214.
48 Chi, L.F., Hartig, M., Drechsler, T., Schaak, Th., Seidel, C., Fuchs, H. and Schmid, G. (1998) Applied Physics A, 66, 187.
49 Zhang, H., Schmid, G. and Hartmann, U.
(2003) Nano Letters, 3, 305.
50 Santini, J.T., Jr. Richards, A.C., Scheidt,
R., Cima, M.J. and Langer, R. (2000)
Angewandte Chemie-International Edition,
39, 2396.
100 Han, L., Daniel, D.R., Maye, M.M. and Zhong, C.-J. (2001) Analytical Chemistry, 73, 4441.
101 Krasteva, N., Besnard, I., Guse, B., Bauer,
R.E., Muellen, K., Yasuda, A. and
Vossmeyer, T. (2002) Nano Letters, 2, 551.
102 Zhang, H.-L., Evans, S.D., Henderson, J.R.,
Miles, R.E. and Shen, T.-H. (2002)
Nanotechnology, 13, 439.
103 Vossmeyer, T., Guse, B., Besnard, I.,
Bauer, R.E., Muellen, K. and Yasuda, A.
(2002) Advanced Materials, 14, 238.
104 Joseph, Y., Guse, B., Yasuda, A. and
Vossmeyer, T. (2004) Sensors and Actuators
B, 98, 188.
105 Simon, U., Flesch, U., Maunz, W., Müller, R. and Plog, C. (1999) Microporous and Mesoporous Materials, 21, 111.
106 Moos, R., Müller, R., Plog, C., Knezevic, A., Leye, H., Irion, E., Braun, T., Marquardt, K.-J. and Binder, K. (2002) Sensors and Actuators B, 83, 181.
107 Franke, M.E., Simon, U., Moos, R., Knezevic, A., Müller, R. and Plog, C. (2003) Physical Chemistry Chemical Physics, 5, 5195.
108 Kreibig, U., Fauth, K., Granquist, C.-G. and Schmid, G. (1990) Zeitschrift für Physikalische Chemie - International Journal of Research in Physical Chemistry & Chemical Physics, 169, 11.
109 van Staveren, M.P.J., Brom, H.B. and De
Jongh, L.J. (1991) Physics Reports-Review
Section of Physics Letters, 208, 1.
110 Simon, U., Schon, G. and Schmid, G.
(1993) Angewandte Chemie-International
Edition in English, 32, 250.
111 Schön, G. and Simon, U. (1995) Colloid and Polymer Science, 273, 101.
112 Schön, G. and Simon, U. (1995) Colloid and Polymer Science, 273, 202.
113 Brust, M., Bethell, D., Schiffrin, D.J. and Kiely, C.J. (1995) Advanced Materials, 7, 795.
114 Andres, R.P., Bielefeld, J.D., Henderson,
J.I., Janes, D.B., Kolagunta, V.R., Kubiak,
C.P., Mahoney, W.J. and Osifchin, R.G.
(1996) Science, 273, 1690.
3
Top-Down Versus Bottom-Up
Wolfgang J. Parak, Friedrich C. Simmel, and Alexander W. Holleitner
3.1
Introduction
up in the nanometer range as the smallest relevant scale for functional subunits. Then again, if materials are built up synthetically from their basic chemical building blocks, one also arrives first at this length scale. In this sense, "nano" is the size scale where physics, chemistry and biology meet in a natural way (Figure 3.1).
The first scientist to point out that there is "plenty of room at the bottom" was Richard Feynman in 1959 [oral presentation given on 29 December 1959 at the Annual Meeting of the American Physical Society at the California Institute of Technology (Caltech)]. He envisioned scientific discoveries and new applications of miniature objects as soon as material systems could be assembled at the atomic scale. To this end, machines and imaging techniques would be necessary that can be controlled at the nanometer or subnanometer scale. Fifty years later, the scanning tunneling microscope and the atomic force microscope are ubiquitous in scientific laboratories, allowing structures to be imaged with atomic resolution [4-12]. As a result, various disciplines within nanotechnology aim towards manufacturing materials for diverse products with new functionalities at the nanoscale.
As we have seen, we can approach the nanoscale from two sides: by making things
smaller, that is, by downscaling, and by constructing things from small building
blocks, that is, by upscaling. The first method is referred to as the top-down and the
second as the bottom-up approach. The top-down approach follows the general
trend of the microelectronic industry towards miniaturization of integrated semiconductor
circuits. Modern lithographic techniques allow the patterning of nanoscale structures
such as transistor circuits with a precision of only a few nanometers
(see http://www.icknowledge.com and publications therein for more information).
As we will see in this chapter, the industrial demand for ever smaller electronic
circuits has provided several physical tools by which materials can be probed and
manipulated at the nanometer scale. In contrast, the bottom-up approach is based on
molecular recognition and chemical self-assembly of molecules [1-3]. In combination
with chemical synthesis techniques, the bottom-up approach allows for the assembly
of macromolecular complexes with new functionalities.
How can we get to smaller and smaller sculptures and objects? The idea of top-down
strategies is to take processes known from the macroscopic world and to adapt them
in such a way that they can be used for doing the same thing on a smaller scale. Since
ancient times, humans have created artwork and tools by structuring materials. Let us
take the artist as an example, who carves and sculpts figures out of blocks of wood or
stone. For this purpose, the artist needs tools, usually a carving knife or
chisel, with which he can locally ablate parts from the original piece of material and
thus give it its desired shape (Figure 3.2). If we continue with the idea of a sculptor, it
is evident that in order to sculpt smaller objects, smaller tools are needed, such as
miniature rasps and knives. Chinese artisans have used such tools to carve little
pieces of wood and other materials into sculptures in the submillimeter regime,
which can only be seen with magnifying glasses [some of these highly impressive
sculptures can be seen in the National Taiwan Museum (http://www.ntm.gov.tw)].
If we want to fabricate even smaller structures or objects and study their physical
properties, classical mechanical tools such as rasps will no longer work. For that
purpose, scanning probe microscopes are powerful instruments for probing and
manipulating materials at the nanometer scale and can thus be seen as one of the key
inventions for nanotechnology. On the one hand, they allow imaging and probing of
the characteristics of nanoscale objects with the highest resolution. Examples include
topology, material configuration and electrical, chemical and magnetic properties of
the studied objects. On the other hand, scanning probe microscopes allow local
manipulation and even shaping of the nanostructures. In a seminal work in 1982,
Binnig and Rohrer invented the scanning tunneling microscope (STM) [4], for which
they were awarded the Nobel Prize in 1986. In such a microscope, piezoelectric
crystals move a scanning tip across the surface of a sample, while the electric current
is recorded between the tip and the sample. If the tip is located very close to the
surface of the sample, the electric current is composed of tunneling electrons. In the
nanoscale quantum world, the wave character of electrons plays an important role. As
soon as the distance between the tip and the surface is on the order of the electron
wavelength, electrons can tunnel from the tip to the surface. The size of the tunneling
current has an exponential dependence on the distance between the tip and surface.
Therefore, the point of the tip which is closest to the surface predominantly
contributes to the current. In principle, this point can be made up of only one atom,
which allows for atomic resolution. The scanning tunneling microscope is sensitive
to the electronic density at the surface of the sample, which allows for, for example,
imaging of the electronic orbitals of atoms and molecules. The scanning tunneling
microscope can operate not only under ultrahigh vacuum conditions, but also at
atmospheric pressure and at various temperatures, which makes it a unique imaging
and patterning tool for the nanosciences. For instance, scanning tunneling microscopes
are utilized to pattern nanostructures by moving single atoms across surfaces,
while the corresponding change of the quantum mechanical configuration can be
recorded in situ [14].
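To get a feeling for why a single atom can dominate the signal, the following minimal sketch (all values are generic assumptions, not taken from this chapter) evaluates the standard one-dimensional barrier result: the current falls off as exp(-2*kappa*d) with kappa = sqrt(2*m*phi)/hbar, so a change of the gap by a single angstrom already changes the current by roughly an order of magnitude.

```python
import math

HBAR = 1.0545718e-34  # J s
M_E = 9.1093837e-31   # electron mass, kg
EV = 1.602176634e-19  # J per eV

def relative_tunneling_current(d_nm, work_function_eV=5.0):
    """Relative tunneling current through a vacuum gap of width d_nm.

    Standard 1D barrier estimate I ~ exp(-2*kappa*d) with
    kappa = sqrt(2*m*phi)/hbar; phi = 5 eV is a typical metal work function
    (an assumed, illustrative value).
    """
    kappa = math.sqrt(2 * M_E * work_function_eV * EV) / HBAR  # 1/m
    return math.exp(-2 * kappa * d_nm * 1e-9)

# A 0.1 nm change of the gap changes the current by about a factor of 10 --
# the reason why the tip atom closest to the surface carries the current.
for d in (0.4, 0.5, 0.6):  # nm
    print(f"d = {d} nm  ->  I/I0 = {relative_tunneling_current(d):.3e}")
```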
The atomic force microscope (AFM) operates similarly to the scanning tunneling
microscope [5]. Here, the force between the scanning tip and the sample surface is
extracted by measuring the deflection of the tip towards the sample. Again, the atomic
force microscope can be utilized as an instrument to image and to shape materials on
the nanometer scale [15].
In experiments in which only a few nanostructures need to be patterned and
probed, scanning probe microscopes can be exploited to structure and shape
materials at the nanometer scale. To define an array of millions of nanoscale systems
in parallel, such as in integrated electronic circuits, the top-down approach of
microlithography is the technique of choice (Figure 3.3). Microlithography has been
the technological backbone of the semiconductor industry for the last 45 years (see
http://www.icknowledge.com and publications therein for more information). The
minimum feature size of optically defined patterns depends on the wavelength of the
utilized light and also factors that are due to, for example, the shape of the lenses and
the quality of the photoresist. In 2003, the typical linewidth of semiconductor circuits
fell below 100 nm, that is, the semiconductor industry can be literally seen as being
part of the nanotechnologies. For this achievement, argon fluoride excimer lasers are
applied with a laser wavelength of 193 nm in the deep ultraviolet region. For
lithography in this optical range, tricks such as optical proximity correction and
phase shifting were invented and successfully implemented. On the one hand,
the miniaturization of semiconductor circuits is limited by economic costs for
the semiconductor industry, since the implementation of new techniques for the
realization of ever smaller feature sizes results in ever increasing costs. On the other
hand, physical material properties, such as the high absorption level of refractive
mirrors at short optical wavelengths, set a natural limit of a few tens of nanometers
for the miniaturization process. Since the 1980s, the decline of optical lithography
has been predicted as being only a few years away. However, each time optical
lithography has reached its supposed limits, new techniques have extended the
economically sustainable lifetime of top-down microlithography.
At present, there are several possible successor top-down nanotechnologies for
industry, for example, extreme ultraviolet light lithography (EUV), electron beam
lithography with multicolumn processing facilities [see also Figure 3.3(c2)], the
focused ion beam (FIB) technique and the ultraviolet nano-imprinting technique [16].
The implementation of each of these techniques requires enormous technical
challenges to be overcome. One of the most promising techniques is that using EUV
light with a wavelength of only 13 nm. For this technique, fabrication errors of the
optical components need to be in the nanometer or subnanometer range. For
comparison, state-of-the-art X-ray telescopes, such as the Zeiss XMM Newton telescope, exhibit a granularity of the mirror surfaces of 0.4 nm (see W. Egle, Mission
Impossible: XMM-Newton Proves the Opposite, Innovation, Carl Zeiss, 2000, 8, 12-14,
ISSN 1431-8059; this file can be downloaded from http://www.zeiss.com). In addition
to state-of-the-art optical requirements, all metrological components of the EUV
technique need to exhibit subnanometer resolution. As a result, the cost of a stepper
machine for EUV exposure of photoresists is US$50 million per system, providing a
linewidth of 35 nm (see http://www.icknowledge.com for more information).
For medium-sized businesses, the ultraviolet nano-imprinting technique seems
to be the most promising method to fabricate nanoscale circuits [17]. Here, nanostructures are mechanically imprinted into a photoresist. The stamp with the
The antipode of the top-down approach is the so-called bottom-up technique. Here a
complex structure is assembled from small building blocks. These building blocks
possess specific binding capabilities, often termed molecular recognition properties,
which allow them to arrange automatically in the correct way. Self-assembly is
an essential component of bottom-up approaches [18]. The ultimate examples of
molecular recognition are biological receptor-ligand pairs: molecules that recognize
and bind to each other with very high specificity. Prominent examples of such pairs
are antibodies and their corresponding antigens and complementary strands of
deoxyribonucleic acid (DNA) [19].
We can visualize the bottom-up assembly of materials with the example of LEGO
building blocks, a common toy for children. Again we take the example of a small
sculpture, as also used to describe top-down approaches. LEGO building blocks can
have different functions, as symbolized by their color (Figure 3.4). Furthermore,
there are building blocks of different size and each building block has a defined
number of binding sites, realized here as knobs. In this way the blocks can only be
attached to each other in a defined way. This can be seen as molecular recognition.
However, we should point out that the example of the LEGO blocks fails to describe
self-assembly. There is still a helping hand needed to assemble the individual bricks
to form the complete structure; the assembly process is not spontaneous (cf. the
self-assembly models of Whitesides [20]). The example of the LEGO blocks illustrates
the basic concept of a bottom-up strategy: the construction of a new material by
assembling basic small building blocks. If instead of LEGO blocks nanoscopic building blocks are
3.2
First Example: Nanotweezers
Among the fundamental tools for mechanical work are tweezers or fingers. The basic
function of tweezers is to hold and release things. In principle, tweezers comprise two
fingers, which can close to hold something and open to release it again. So far only a
few functional nanofingers are available. Although a variety of tools exist which can
hold nanometer-sized objects, the problem is to release them again. This problem is
of a fundamental rather than of a technological nature. The interested reader is referred
to the excellent articles by Nobel Prize winner Richard Smalley, who describes the
problem of nonspecific adhesion as the problem of sticky fingers [31, 32]. The basic
module of a nanofinger is a nanometer-sized hinge, that is, a structure which can
repeatedly open and close. In this section, we will show examples of such nanofingers
or hinges based on top-down microlithography and on bottom-up self-assembly of
biological molecules.
3.2.1
Top-Down Nanotweezers
Lithographically defined tweezers have been realized as so-called micro-electromechanical
systems (MEMS). Similar micron-sized mechanical resonators have already
been applied in inkjet printers and in airbags of cars. MEMS resonators have also
triggered major scientific interest, since the reduction in size of an electromechanical
resonator down to the nanometer regime would allow researchers to study, for
example, mechanical effects in the quantum realm [33-36] or the coupling of the
eigenmode of such a nanoscale resonator to the electromagnetic field of a
photon [37, 38]. Furthermore, the small mass of nanoelectromechanical devices makes
them extremely sensitive adsorptive sensors, since the mass of adsorbed molecules
can significantly change the eigenfrequency of a nanoscale mechanical device [39, 40].
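As a rough illustration of this sensing principle (a sketch with invented example numbers, not data from refs [39, 40]): for a harmonic resonator with f = (1/2*pi)*sqrt(k/m), a small adsorbed mass dm shifts the eigenfrequency by df/f = -dm/(2m).

```python
import math

def eigenfrequency(k, m):
    """Eigenfrequency f = (1/2*pi) * sqrt(k/m) of a harmonic resonator."""
    return math.sqrt(k / m) / (2 * math.pi)

# Illustrative numbers only: a nanoscale beam with an effective mass of
# 1 fg (1e-18 kg) and a spring constant chosen to give f0 = 100 MHz.
m = 1e-18                           # kg
k = (2 * math.pi * 100e6) ** 2 * m  # N/m, fixes f0 = 100 MHz
f0 = eigenfrequency(k, m)

# Adsorption of 1 attogram (roughly a 600 kDa protein complex) shifts f:
dm = 1e-21  # kg
f1 = eigenfrequency(k, m + dm)
print(f"f0 = {f0/1e6:.3f} MHz, shift = {(f0 - f1):.1f} Hz")
# Small-mass limit: df/f0 = -dm/(2m), i.e. about -0.05% here.
```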
For lifting and manipulation of objects on the nanoscale, it is necessary to use
opposing forces to seize and hold a floating or a freely suspended nanostructure, such
as a single protein in liquids or an individual nanowire within a three-dimensional
electrical circuit. Kim and Lieber reported the first nanoscale tweezers in
operation [42]. They attached two carbon nanotubes to metal electrodes on opposite sides of
a micron-thick glass needle, in order to manipulate nanoscale clusters and nanowires
with a size of about 500 nm. The tweezers were closed electrostatically using a voltage
In recent years, the unique biochemical and mechanical properties of DNA have been
increasingly utilized for non-biological applications, for example, to realize tiny
nanomechanical devices. Many of these devices exploit the different rigidity of single-
and double-stranded DNA and can be switched back and forth between several
structures by the addition or removal of DNA fuel strands. One of the prototype
devices in this field is the DNA nanotweezers [50]. Their operational principle is shown
in Figure 3.7. In the open state, the DNA tweezers consist of three strands of DNA.
One central DNA strand is hybridized to two other strands in such a way that together
they form two roughly 6-nm long, rigid double-stranded arms connected by a short,
single-stranded flexible hinge. In the open state, each of the arms still has single-stranded
extensions available for hybridization. The addition of a long fuel strand
which is complementary to these extensions can then be used to close the tweezer
structure, that is, the two arms are forced together by the hybridization with the fuel
strand. The device can be switched back to its original configuration with a
biochemical trick. In the closed state, a short single-stranded section of the device
is deliberately left unhybridized. These nucleotides serve as an attachment point for
an anti-fuel strand, which is exactly complementary to the fuel strand. A biochemical
process known as branch migration unzips the structure when fuel and anti-fuel
try to bind with each other. After completion of this process, a waste duplex is
ejected and the DNA tweezers have returned to their open configuration again. The
device can be operated cyclically by the alternate addition of set and reset strands. The
transition between the different states of the device can be characterized in fluorescence
measurements, utilizing the distance-dependent quenching of fluorophores
due to fluorescence resonance energy transfer (FRET).
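The distance dependence that makes this readout work is the Foerster relation; the short sketch below (with an assumed, generic Foerster radius, not a value from ref. [50]) shows how strongly the efficiency contrasts between the open and closed states.

```python
def fret_efficiency(r_nm, r0_nm=6.0):
    """FRET efficiency E = 1 / (1 + (r/R0)^6) for donor-acceptor distance r.

    R0 (the Foerster radius, typically 2-8 nm) is the distance of 50%
    transfer; 6 nm is an illustrative value, not taken from the chapter.
    """
    return 1.0 / (1.0 + (r_nm / r0_nm) ** 6)

# Open tweezers (fluorophores far apart) vs. closed tweezers (fluorophore
# and quencher brought within a few nanometers):
print(f"open,   r = 12 nm: E = {fret_efficiency(12):.2f}")  # weak quenching
print(f"closed, r = 3 nm:  E = {fret_efficiency(3):.2f}")   # strong quenching
```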
The same operation principle has since been used in many other DNA devices [51].
A simple variation of the DNA tweezers is the DNA actuator device [52]. Here, instead
of two single-stranded extensions, the arms of the tweezers are connected by a
single-stranded loop. Depending on the sequence of the fuel strands, the device can either
be closed similarly to the tweezers or stretched into an elongated conformation.
Whereas the intermediate configuration is a rather floppy structure (like the open
tweezers), the closed and the stretched configurations are much more rigid. While
these devices resemble macroscopic tweezers in shape, they cannot actually be used
to grab nano-objects. However, such a function may be achieved by the incorporation
of so-called aptamers into DNA devices [53]. These are special DNA or RNA
sequences with a high binding affinity for other molecules, for example, proteins.
3.3
Second Example: Nanomotors
Another important mechanical element is the motor. A motor is a device that can
generate periodic movements and carry a load with it. Similarly to the previous
section, we will show two examples of nanomotors. The version using the top-down
approach is again based on microlithography and follows the ideas of micromechanical engineering. The version using the bottom-up approach is based on functional
organic molecules.
3.3.1
Top-Down Nanomotors
We will only give two brief examples in this section. As with macroscopic motors,
nanomotors also periodically convert input energy into mechanical work. The energy
to drive the motor can originate from different sources, such as light, electric fields or
chemical gradients.
A simple light-driven propeller has been demonstrated by Ormos's group. The
propeller itself is created by selectively illuminating the parts of a light-curing resin
that resemble the shape of the propeller. The resin is cured at the illuminated regions
due to photo-polymerization. Dissolution of the noncured parts of the resin finally
results in a freestanding propeller (Figure 3.8) [54-57]. The propeller can be driven by
incident light into rotation. When light is reflected at the arms of the propeller,
momentum is transferred which causes the rotation. For a detailed description of the
underlying physics we refer to the original article by Galajda and Ormos [55]. In order
to achieve controlled rotation of the propeller, either a freestanding propeller is
trapped in the focus of laser tweezers [55, 56] or a propeller bound to an axis on top of a
substrate is driven by light originating from an integrated waveguide [54]. So far
the smallest propellers created in this way still have a size of a few micrometers.
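The momentum-transfer argument can be made quantitative with the textbook radiation-pressure formula; the sketch below uses an assumed laser power purely to set the scale of the forces involved (our example, not numbers from refs [54-57]).

```python
C = 299792458.0  # speed of light, m/s

def photon_force(power_W, reflectivity=1.0):
    """Force from light hitting a surface at normal incidence.

    Each reflected photon transfers twice its momentum p = E/c, so a
    perfectly reflecting surface feels F = 2P/c for incident power P.
    """
    return (1.0 + reflectivity) * power_W / C

# A 10 mW beam, typical for optical tweezers, exerts only ~67 pN --
# tiny in everyday terms, yet enough to spin a micrometer-sized propeller.
print(f"F = {photon_force(10e-3) * 1e12:.0f} pN")
```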
An even smaller synthetic motor has been created by Zettl's group by scaling down
MEMS technology to the nanometer scale [58]. The axis of this motor is formed by a
carbon nanotube which is fixed between two anchor electrodes (Figure 3.9). A metal
plate fixed to the carbon nanotube acts as a rotor. The outer shell of the nanotube
beneath the rotor has been detached from the inner shells of the nanotube by shear
forces so that a rotational bearing has been formed. The rotor is driven by an
oscillating electric field between three additional electrodes. In practice, the rotor was
Motors play an important role not only in our technical macroscopic life, but also on a
molecular level in any biological organism. Molecular motors, for example, help
bacteria to move and to transport cargo inside cells, and they are also the basis for
contracting and expanding our muscles [59]. Since the field of nanotechnology
emerged, scientists have thought about how to harness molecular motors for the
construction of artificial machines. For example, the molecular motor kinesin has
been used to transport colloidal quantum dots along the tracks of microtubules [60].
Such a concept might eventually lead to a scenario where building blocks could be
transported along rails to their designated positions. Another example is the
membrane-bound protein ATP synthase [61]. This rotary motor is driven by a proton
gradient and is used in cells for the synthesis of ATP [61]. In the reverse direction the
motor uses the energy of ATP hydrolysis to create a proton gradient. The rotation of
the motor could be used to propel an actin filament that had been attached to the
upper subunit of the motor, with ATP as fuel [62]. Such a concept might eventually be
used as a nanopropeller for moving small vesicles. While these concepts are based on
using naturally existing molecular motors, synthetic molecular motors have also
been chemically synthesized [63]. In this section we will briefly describe the concept
and realization of such a motor that has been demonstrated by Gaub's group [64].
Molecules can often assume different structural arrangements, called
conformations. Azobenzene, for example, can be reversibly switched upon illumination.
3.4
Third Example: Patterning
One basic requirement for many technologies and applications is the ability to form
controlled patterns on a surface. This can be exemplified with the components and
techniques needed to fabricate an electronic circuit [67]. In order to make circuits, one
essentially has to master three different steps. First, active elements such as
transistors must be realized which are able to process information [68]. Second,
the active elements have to be arranged into a functional geometry [28]. Third, the
active elements have to be connected by wires [69]. Arranging elements on a surface is
basically the same as forming a pattern. Another example is controlled cell
attachment. If a surface is patterned partly with molecules that promote cell adhesion and
partly with molecules that repel cells, cells will only grow on the desired parts of the
surface. This is very important, for example, for bioelectronic interfaces, where cells
have to be guided in such a way that they adhere on top of the active electronic
elements, but not to other parts of the surface [70].
In this section we will briefly describe two examples of making small surface
patterns. Soft lithography is a top-down approach, whereby surfaces can be structured
with lithographically generated stamps. Following a self-assembly strategy,
two-dimensional lattices of biological molecules, in particular DNA, can be formed in a
bottom-up approach.
3.4.1
Soft Lithography
3.4.2
Two-Dimensional DNA Lattices
[28, 86-89] and the impressively powerful DNA origami scheme [82] will soon allow
the arrangement of these nanoparticles into arbitrary geometries and patterns.
3.5
Fourth Example: Quantum Dots1)
In Chapter 4, the physical principles of quantum dots are described. Quantum dots
are arguably the ultimate examples of nanometer structures. In this section, we will
show that we can manufacture quantum dots both with top-down (Section 3.5.2) and
with bottom-up techniques (Section 3.5.4). Depending on the manufacturing process
used, the properties of the quantum dots vary and we will explain this in particular
with respect to their optical properties. Also the possible applications vary among the
different types of quantum dots.
3.5.1
Different Methods for Making Quantum Dots
Here we will give a brief overview of the most popular methods used to fabricate
quantum dots in practice. Lithographically defined quantum dots are the classical
example of the application of a top-down method, whereby the quantum dot is created
by locally etching away parts of the raw material. Colloidal quantum dots, on the other
hand, are an example of a bottom-up approach, in which they are assembled from
small building blocks (in this case surfactant-stabilized atoms).
The ultimate technique for the fabrication of quantum dots should be able to
produce significant amounts of sample, with such high control of quantum dot size,
shape and monodispersity that single-particle properties are not averaged out by
sample inhomogeneity. So far, ensembles of quantum dots produced by the best
1) This paragraph has been adapted from a
previous edition and the authors acknowledge
their former coauthors Dr. Liberato Manna, Dr.
Daniele Gerion and Prof. Dr. Paul Alivisatos [90].
available techniques still show a behavior deriving from a distribution of sizes, but
this field is evolving very rapidly. In this section we give a short survey of the most
popular fabrication approaches. Different techniques lead to different types of
quantum dots. The confinement can be obtained in several different ways and in
addition the quantum dot itself can have a peculiar geometry: it can be embedded into
a matrix or grown on a substrate or it can be a free nanoparticle. Each of these cases
is strictly related to the preparative approach chosen.
3.5.2
Lithographically Defined Quantum Dots
vertical arrangement, structures with very few electrons can be realized [100].
Recently, several research efforts have been focused on the investigation of many-body
phenomena in these quantum dot systems. Relevant examples are, for instance,
the study of the Kondo effect [101-104] and the design and control of coherent
quantum states with the ultimate goal of quantum information processing [105, 106].
A remarkable advantage of lithographically defined quantum dots is that their
electrical connection to the macro-world is straightforward. The manufacturing
processes are similar to those used in chip fabrication and in principle such structures
could be embedded within conventional electronic circuits. However, as the geometry
related to different exciton states in the dots, and is reminiscent of the emission from
atoms. As mentioned already for lithographically defined quantum dots, many
parallels can be drawn between atoms and quantum dots [115, 117-119]. For these
reasons, quantum dots have gained the nickname artificial atoms. Current research
efforts are devoted to quantum dot ordering and positioning and also to the reduction
of the quantum dot size distribution. In contrast to the case of lithographically
defined quantum dots, it is challenging to make electrical contact to self-assembled
quantum dots and therefore most of the possible applications can be found in optics.
One of the major goals of research on self-assembled quantum dots is the fabrication
of non-classical light sources from single dots. Another is to use them as
light-addressable storage devices.
3.5.4
Colloidal Quantum Dots
Colloidal quantum dots are remarkably different from the quantum dot systems
mentioned above as they are chemically synthesized using wet chemistry and are
free-standing nanoparticles or nanocrystals grown in solution (Figure 3.13) [120]. In
this case, colloidal quantum dots are just a subgroup of a broader class of materials
that can be synthesized at the nanoscale using wet chemical methods. In the
fabrication of colloidal nanocrystals, the reaction chamber is a reactor containing
a liquid mixture of compounds that control the nucleation and the growth. In a
general synthesis of quantum dots in solution, each of the atomic species that will
form the nanocrystal building blocks is introduced in the reactor as a precursor. A
precursor is a molecule or a complex containing one or more atomic species required
for growing the nanocrystal. Once the precursors have been introduced into the
reaction flask, they decompose, forming new reactive species (the monomers) that
will cause the nucleation and the growth of the nanocrystals. The energy required to
decompose the precursors is provided by the liquid in the reactor, either by thermal
collisions or by a chemical reaction between the liquid medium and the precursors or
by a combination of the two mechanisms [121].
The key parameter in the controlled growth of colloidal nanocrystals is the
presence of one or more molecular species in the reactor, broadly termed here
surfactants. A surfactant is a molecule that is dynamically adsorbed on the surface
of the growing quantum dot under the reaction conditions. It must be mobile enough
to provide access for the addition of monomer units, but stable enough to prevent the
aggregation of nanocrystals. The choice of surfactants varies from case to case: a
molecule that binds too strongly to the surface of the quantum dot is not suitable, as it
would not allow the nanocrystal to grow. On the other hand, a weakly coordinating
molecule would yield large particles or aggregates [122]. Some examples of suitable
surfactants include alkanethiols, phosphines, phosphine oxides, phosphates, phosphonates, amides, amines, carboxylic acids and nitrogen-containing aromatics. If the
growth of nanocrystals is carried out at high temperatures (e.g., at 200-400 °C) then
the surfactant molecules must be stable under such conditions in order to be a
suitable candidate for controlling the growth.
At low temperatures, or more generally when the growth is stopped, the surfactants are more strongly bound to the surface of the nanocrystals and provide their
solubility in a wide range of solvents. This coating allows for great synthetic flexibility
in that it can be exchanged with another coating of organic molecules having different
functional groups or polarity. In addition, the surfactants can be temporarily removed
and an epitaxial layer of another material with different electronic, optical or
magnetic properties can be grown on the initial nanocrystal [123, 124].
By controlling the mixture of surfactant molecules that are present during the
generation and the time of growth of the quantum dots, excellent control of their size
and shape is possible [121, 125-127] (Figure 3.14). As described in Chapter 4, the
wavelength of emission of quantum dots depends on their size. This can be nicely
seen by observing the fluorescence light of solutions of colloidal quantum dots with
different sizes [128, 129] (Figures 3.14 and 3.15).
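The connection between gap energy and emission color is simply lambda = h*c/E; the sketch below uses assumed, CdSe-like gap energies (illustrative values, not data from refs [128, 129]) to show the trend.

```python
H_EV = 4.135667696e-15  # Planck's constant, eV s
C = 299792458.0         # speed of light, m/s

def emission_wavelength_nm(bandgap_eV):
    """Convert an emission (gap) energy to a wavelength: lambda = h*c/E."""
    return H_EV * C / bandgap_eV * 1e9

# Illustrative CdSe-like values: as the dot shrinks, the effective gap
# widens and the fluorescence shifts from red towards green/blue.
for gap in (1.9, 2.2, 2.5):  # eV, assumed gaps from large to small dots
    print(f"E = {gap} eV  ->  lambda = {emission_wavelength_nm(gap):.0f} nm")
```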
In contrast to organic fluorophores, colloidal quantum dots have a continuous
absorption spectrum for energies higher than their bandgap, a symmetric emission
spectrum without a red tail and, most importantly, reduced photobleaching [130, 131]
(Figure 3.15).
Since colloidal nanocrystals are dispersed in solution, they are not bound to any solid
support, as is the case for the other two quantum dot systems described above.
Therefore, they can be produced in large quantities in a reaction flask and later they can
be transferred to any desired substrate or object. It is possible, for example, to coat their
surface with biological molecules such as proteins or oligonucleotides. Many biological
molecules perform tasks of molecular recognition with extremely high efficiency. This
means that ligand molecules bind with very high specificity to certain receptor
molecules, similarly to a key-and-lock system. If a colloidal quantum dot is tagged
with ligand molecules, it specifically binds to all the positions where a receptor molecule
is present. In this way it has been possible, for example, to make small groupings of
colloidal quantum dots mediated by molecular recognition [22, 23, 132] and to label
specific compartments of cells with different types of quantum dots [133-135].
Although colloidal quantum dots are rather difficult to connect electrically, a few
electron transport experiments have been reported. In these experiments, nanocrystals were used as the active material in devices that behave as single electron
transistors [68, 136].
3.6
Perspectives and Limits of Top-Down and Bottom-Up Approaches
In 1965, Gordon Moore, co-founder of Intel Corporation, predicted that the number
of transistors on a computer chip would double about every 18 months [137]. This
exponential law, also known as Moore's first law, has described the development of
integrated circuits surprisingly well for decades. As the market for information
technology continues to grow, the demand for computer hardware instigates more
and more sophisticated top-down techniques to build more densely packed transistor
circuits. Moore's second law states that the implementation of a next generation of
integrated circuits at minimum cost will be exponentially more expensive. Until all
these constraints finally limit the growth of the semiconductor top-down industry,
scientists and engineers assume that nanotechnology will provide answers to most of the
technological challenges. For instance, as soon as the feature size of semiconductor
transistors reaches the level at which quantum phenomena are important,
different concepts for the assembly need to be considered. One possibility is the
bottom-up approach, which is based on molecular recognition and chemical
self-assembly of molecules [1-3]. In combination with chemical synthesis techniques, the
bottom-up approach allows the assembly of macromolecular complexes with new
functionalities. Assuming Moore's laws apply [137], the final limit for optical top-down
lithography is likely to be reached in less than a decade.
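Stated as a formula, Moore's first law is exponential growth with a fixed doubling time; the short sketch below (calibrated, as an assumption, to the 1971 Intel 4004) only illustrates the law as quoted above, not actual transistor counts.

```python
def transistors(year, t0=1971, n0=2300, doubling_years=1.5):
    """Transistor count predicted by Moore's first law.

    Assumed calibration: the Intel 4004 (1971, ~2300 transistors) and the
    18-month doubling time quoted in the text.
    """
    return n0 * 2 ** ((year - t0) / doubling_years)

for year in (1971, 1990, 2008):
    print(f"{year}: ~{transistors(year):.2e} transistors per chip")
```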
There are limitations also for the bottom-up assembly of complex nanostructures.
We can illustrate this by the example of assembling nanoparticles with DNA into
groupings of particles. Although building blocks exist in which each nanoparticle is
modified with an exactly defined number of binding sites, still no controlled
assemblies of more than around five particles exist. There are two fundamental
technological problems: nonspecific adsorption and floppiness of biological
molecules. Nonspecific adsorption causes particles to stick together, although they
are not supposed to be connected. Many biological molecules that can be used as
glue for the assembly of particle groupings, such as proteins and DNA, tend to
adsorb nonspecifically on the surface of nanoparticles. Although covalent attachment
of these molecules dominates over nonspecific adsorption and thus connection of the
particles via their designated binding sites, there is a non-negligible amount of
nonspecific interaction between the particles. For a larger particle grouping, just a
single nonspecific connection can destroy the build-up of the whole grouping. One
major task in this direction is to improve the surface chemistry of particles in order to
obtain inert surfaces to which biological molecules do not adsorb nonspecifically.
Biological molecules are intrinsically soft compared with inorganic materials. This
implies that the connection between particles that are connected via biological
molecules will always retain a certain degree of flexibility. In particular, the attachment
of the glue, that is, the biological molecule, to the particle surface is not rigid.
Therefore, it will be almost impossible to form large, nonperiodic three-dimensional
stiff structures with static geometry from particle groupings connected via biological
molecules. Further, particle assemblies involving biological molecules as glue will
always be limited in stability. Biological molecules are bound to their natural
environment and cannot withstand many artificial conditions, such as high temperatures.
Although both top-down and bottom-up strategies have clear limitations, we
are still far away from having reached them. In fact, these intrinsic limits may finally be
overcome by combining both approaches. Still, Feynman's statement is true: there is
plenty of room at the bottom!
References
1 Whitesides, G.M. (2005) Nanoscience, nanotechnology and chemistry. Small, 1, 172-179.
2 Whitesides, G.M. (1998) Nanotechnology: art of the possible. Technology Review, 101, 84-87.
3 Lehn, J.M. (2004) Supramolecular chemistry: from molecular information towards self-organization and complex matter. Reports on Progress in Physics, 67, 249-265.
4 Binnig, G., et al. (1982) Surface studies by scanning tunneling microscopy. Physical Review Letters, 49, 57-61.
5 Binnig, G., Quate, C.F. and Gerber, C. (1986) Atomic force microscope. Physical Review Letters, 56, 930-933.
6 Gimzewski, J.K. and Joachim, C. (1999) Nanoscale science of single molecules using local probes. Science, 283, 1683-1688.
7 Lieber, C.M., Liu, J. and Sheehan, P.E. (1996) Understanding and manipulating inorganic materials with scanning probe microscopes. Angewandte Chemie International Edition in English, 35, 687-704.
8 Poggi, M.A., et al. (2004) Scanning probe microscopy. Analytical Chemistry, 76, 3429-3443.
9 Wiesendanger, R. (1997) Scanning-probe-based science and technology. Proceedings of the National Academy of Sciences of the United States of America, 94, 12749-12750.
10 Wiesendanger, R. (1994) Contributions of scanning probe microscopy and spectroscopy to the investigation and fabrication of nanometer-scale structures. Journal of Vacuum Science & Technology B, 12, 515-529.
11 Friedbacher, G. and Fuchs, H. (1999) Classification of scanning probe microscopies (technical report). Pure and Applied Chemistry, 71, 1337-1357.
12 Quate, C.F. (1997) Scanning probes as a lithography tool for nanostructures. Surface Science, 386, 259-264.
125 Peng, X., et al. (2000) Shape control of CdSe nanocrystals. Nature, 404, 59-61.
126 Puntes, V.F., Krishnan, K. and Alivisatos, A.P. (2002) Synthesis of colloidal cobalt nanoparticles with controlled size and shapes. Topics in Catalysis, 19, 145-148.
127 Kudera, S., et al. (2006) Synthesis and perspectives of complex crystalline nanostructures. Physica Status Solidi A, 203, 1329-1336.
128 Alivisatos, A.P. (August 1995) Semiconductor nanocrystals. MRS Bulletin, 23-32.
129 Alivisatos, A.P. (1996) Perspectives on the physical chemistry of semiconductor nanocrystals. Journal of Physical Chemistry, 100, 13226-13239.
130 Wu, M.X., et al. (2003) Immunofluorescent labeling of cancer marker Her2 and other cellular targets with semiconductor quantum dots (corrigenda). Nature Biotechnology, 21, 452.
131 Bruchez, M.P. (2005) Turning all the lights on: quantum dots in cellular assays.
4
Fundamental Principles of Quantum Dots1)
Wolfgang J. Parak, Liberato Manna, and Thomas Nann
4.1
Introduction and Outline
4.1.1
Nanoscale Science and Technology
In the last decade new directions of modern research, broadly defined as nanoscale
science and technology, have emerged [2, 3]. These new trends involve the ability to
fabricate, characterize and manipulate artificial structures whose features are
controlled at the lower nanometer scale. They embrace areas of research as diverse as
engineering, physics, chemistry, materials science and molecular biology. Research in
this direction has been triggered by the recent availability of revolutionary instruments
and approaches that allow the investigation of material properties with a resolution
close to the atomic level. Strongly connected to such technological advances are
pioneering studies that have revealed new physical properties of matter at a level which
is intermediate between the atomic and molecular level and bulk.
Materials science and technology is a rapidly evolving field and is currently making
the most significant contributions to research in nanoscale science. It is driven by the
desire to fabricate materials with novel or improved properties. Such properties can
be, for instance, strength, electrical and thermal conductivity, optical response,
elasticity and wear resistance. Research is also evolving towards materials that are
designed to perform more complex and efficient tasks. Examples include materials
with a higher rate of decomposition of pollutants, a selective and sensitive response
towards a given biomolecule, an improved conversion of light into current and more
efficient energy storage. For such and more complex tasks to be realized, novel
materials have to be based on several components whose spatial organization
is engineered at the molecular level. This class of materials can be defined as
1) This chapter has been partly adapted from a
previous version which included contributions also from Dr. Daniele Gerion, Dr.
Friedrich Simmel and Professor Dr. Paul
Alivisatos [1].
4.2
Nanoscale Materials and Quantum Mechanics
4.2.1
Nanoscale Materials are Intermediates Between Atomic and Bulk Matter
crystal might have higher chemical reactivity than the corresponding bulk solid and
that it will probably melt at lower temperatures. Consider now the example of a
carbon nanotube, which can be thought of as a sheet of graphite wrapped in such a
way that the carbon atoms on one edge of the sheet are covalently bound to the atoms
on the opposite edge of the sheet. Unlike its individual components, a carbon
nanotube is chemically extremely stable because the valences of all its carbon atoms
are saturated. Moreover, we may expect carbon nanotubes to be good conductors
because electrons can move freely along these tiny, wire-like structures. Once again,
we see that such nanoscopic objects can have properties which do not belong to the
realm of their larger (bulk) or smaller (atoms) counterparts. However, there are many
additional properties specic to such systems which cannot be understood by such a
simple reasoning. These properties are related to the sometimes counterintuitive
behavior that charge carriers (electrons and holes) can exhibit when they are forced to
dwell in such structures. These properties can only be explained by the laws of
quantum mechanics.
4.2.2
Quantum Mechanics
discrete energy level spectrum. Transitions between any two levels are seen, for
instance, as discrete peaks in the optical spectra. The system is then also referred to as
quantum confined. If all the dimensions of a semiconductor crystal shrink down to
a few nanometers, the resulting system is called a quantum dot and will be the
subject of our discussion throughout this chapter. The main point here is that in order
to rationalize (or predict) the physical properties of nanoscale materials, such as their
electrical and thermal conductivity or their absorption and emission spectra, we need
first to determine their energy level structure.
For quantum confined systems such as quantum dots, the calculation of the energy
structure is traditionally carried out using two alternative approaches. One approach
was just outlined above. We take a bulk solid and we study the evolution of its band
structure as its dimensions shrink down to a few nanometers. This method will be
described in more detail later (Section 4.4). Alternatively, we can start from the
individual electronic states of single isolated atoms as shown in Section 4.3, and then
study how the energy levels evolve as atoms come closer and start to interact with each
other.
4.3
From Atoms to Molecules and Quantum Dots
The chemical approach towards quantum dots resembles the bottom-up strategy by
which molecules are formed by chemical reactions between individual atoms or
smaller molecules. Molecules are stable assemblies of an exactly defined finite number
of atoms, whereas solids are stable assemblies of atoms or molecules with
quasi-infinite dimensions. If the assembly of units within a solid does not follow translational
symmetry, the solid is called amorphous. If, on the other hand, the units are repeated
regularly, the solid is called crystalline. Since quantum dots have finite dimensions and
are regular assemblies of atoms, such nano-objects are regarded as large molecules
from a chemist's point of view, whereas physicists usually see them as small crystals.
The electronic properties of quantum dots can now be described and calculated by
linear combinations of atomic orbitals (the LCAO method or Hückel theory) and other
approximations [10]. Here, the starting point is an atom, whereas the physical
approach in Section 4.4 starts with an infinitely extended wavefunction. It will be
shown that the results of both approaches are basically the same.
The fundamental idea of quantum theory was developed at the beginning of the
last century and afrms that particles have wave-like properties and vice versa. In
about 1923, de Broglie suggested his famous momentum-wavelength
relationship (4.3)2) by combining Einstein's relativistic energy (4.1) with the energy of a
photon (4.2):
2) (4.1) + (4.2) $\Rightarrow mc^2 = h\nu$ ($h$ = Planck's constant, $c$ = speed of light). Momentum of a photon: $p = mc \Rightarrow pc = h\nu$. Wavelength of a photon: $\lambda = c/\nu \Rightarrow p = h/\lambda \Rightarrow \lambda = h/p$.
$$E = mc^2 \qquad (4.1)$$
$$E = h\nu \qquad (4.2)$$
$$\lambda = h/p \qquad (4.3)$$
The left-hand term (wavelength) in Equation 4.3 represents the wave nature of a
particle, whereas the right-hand term (momentum) represents the particle nature of
a wave. Equation 4.3 can be written as3)
$$\lambda = h/p = h/\sqrt{2m(E - V)} \qquad (4.4)$$
where m is the mass of the particle and E and V are its total and potential energy,
respectively. Combining Equation 4.4 with the classical three-dimensional wave
equation
$$\nabla^2\psi(x, y, z) = -\left(\frac{2\pi}{\lambda}\right)^{2}\psi(x, y, z) \qquad (4.5)$$
where $\psi$ is the wavefunction, results in
$$\nabla^2\psi = -\left(\frac{2\pi}{h}\right)^{2} 2m(E - V)\,\psi = -\frac{8\pi^2 m}{h^2}(E - V)\,\psi \qquad (4.6)$$
$$\nabla^2\psi + \frac{2m}{\hbar^2}(E - V)\,\psi = 0 \qquad (4.7)$$
or
$$\hat{H}\psi = E\psi \qquad (4.8)$$
using the Hamiltonian operator $\hat{H} = -(\hbar^2/2m)\nabla^2 + V$; $\psi$ is called the eigenfunction of
the operator and E is the eigenvalue, which represents the energy.
3) $E = p^2/2m + V$ (total energy E = kinetic energy $p^2/2m$ + potential energy V), hence $p = \sqrt{2m(E - V)}$ and, with Equation 4.3, $\lambda = h/[2m(E - V)]^{1/2}$.
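As a quick sanity check of Equation 4.4 (a worked example added here, not from the original text): a free electron ($V = 0$) with a kinetic energy of 1 eV has
$$\lambda = \frac{h}{\sqrt{2m_e E}} = \frac{6.626\times10^{-34}\,\mathrm{J\,s}}{\sqrt{2\,(9.109\times10^{-31}\,\mathrm{kg})(1.602\times10^{-19}\,\mathrm{J})}} \approx 1.2\,\mathrm{nm}$$
which is exactly the length scale at which quantum confinement effects set in.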
The first step to calculate electronic properties of matter is to apply Schrödinger's
equation to the hydrogen atom. Therefore, we look at the wavefunction of the electron
in the potential field of the nucleus, which is basically the Coulomb attraction between
electron and nucleus:
$$V = -\frac{e^2}{4\pi\varepsilon_0 r} \qquad (4.9)$$
together with the momentum operator
$$\hat{p}_x = \frac{\hbar}{i}\,\frac{\partial}{\partial x}, \;\ldots \qquad (4.10)$$
The solution of Schrödinger's equation with the separated variables leads to three
quantum numbers associated with the three functions and the hydrogen atom's
energy levels. Furthermore, each one-electron wavefunction can exist in two forms
called spin states.4) The function $\psi^2$ ($\psi\psi^*$, respectively) gives the probability density
of finding an electron at a given point. The different eigenfunctions $\psi$ for different
sets of quantum numbers are called atomic orbitals (AOs) with corresponding
energies (eigenvalues) E [10, 11].
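For the Coulomb potential of Equation 4.9, the resulting eigenvalues are the familiar hydrogen energy levels (the standard result, quoted here as a reminder):
$$E_n = -\frac{m_e e^4}{8\varepsilon_0^2 h^2}\,\frac{1}{n^2} \approx -\frac{13.6\ \mathrm{eV}}{n^2}, \qquad n = 1, 2, 3, \ldots$$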
Schrödinger's equation for multi-electron atoms and molecules becomes increasingly
complicated, since all of the interactions between different electrons and nuclei
contribute to the potential energy. In the Born-Oppenheimer approximation, the
terms that describe the movement of the nuclei are decoupled from those that describe
the movement of the electrons, which move much faster compared with the nuclei.
Thus the Hamiltonian for a molecule with N atoms and K electrons reads
$$\hat{H} = -\frac{\hbar^2}{2m}\sum_{i=1}^{K}\nabla_i^2 \;-\; \sum_{j=1}^{N}\sum_{i=1}^{K}\frac{Z_j}{r_{ij}} \;+\; \sum_{i=1}^{K-1}\sum_{l=i+1}^{K}\frac{1}{r_{il}} \;+\; \sum_{j=1}^{N-1}\sum_{m=j+1}^{N}\frac{Z_j Z_m}{r_{jm}} \qquad (4.11)$$
where $Z_j$ are the charges of the nuclei and $r_{ij}$ the distances between two charges.
Schrödinger equations for molecules cannot be solved analytically. Therefore, approximation
methods have to be used. These methods can be categorized as either ab initio or
semiempirical. Ab initio methods use only natural constants to solve Schrödinger's
equation. The most prominent ab initio method is the Hartree-Fock method.
Semiempirical methods use measured values (usually spectroscopic data) for the
same purpose.
The numerical solution of the Hartree-Fock equation is only valid for atoms and
linear molecules. Therefore, another approximation is needed: orbitals are written as
the product of so-called basis functions (usually exponential functions) and the
wavefunctions of hydrogen. These functions are called basis sets and vary for
different approaches. Ab initio methods have high computing requirements, so that
they are restricted to the calculation of some dozens of atoms.
In order to calculate electronic states of larger molecules, one has to introduce
further simplifications. Semiempirical methods make use of experimental or fitted
data to replace variables or functions in the Hartree-Fock matrices. The general
procedure for calculating solutions for Schrödinger's equation remains the same as
described above, but due to the simplifications, semiempirical methods allow for the
calculation of molecules with several hundred atoms, including nanocrystals.
Figure 4.1 displays schematically the transition from atomic orbitals (s, p or sp3)
over molecular orbitals ($\sigma$, $\sigma^*$) to quantum dots and semiconductor energy bands.
Since electrons populate the orbitals with lowest energy first, there is a highest
occupied (molecular) or binding orbital (the valence band in semiconductors)
and a lowest unoccupied (molecular) or antibinding orbital (the conduction
band). The energy gap between the highest occupied (molecular) orbital (HOMO)
4) Readers who are interested in the details of this
solution can find it in every common physical
chemistry textbook [10, 11].
and lowest unoccupied (molecular) orbital (LUMO) is characteristic for luminophores,
quantum dots and semiconductors. The energy gap decreases with increasing
number of atoms in the transition region between molecules and bulk solids, as
indicated in Figure 4.1. The exact calculation of this (most interesting) transition
region follows the mathematical scheme described above. Alternatively, one can view
the problem similar to a particle-in-a-box approach, which is outlined in Section 4.4.
First calculations of the energy of the first excited state in semiconductor quantum
dots were carried out in the early 1980s by Brus [13, 14]. Brus did not solve
Schrödinger's equation for the quantum dot, but for the exciton within a semiconductor
nanocrystal by means of a variational method [effective-mass approximation
(EMA)]. This approach thus resembles the particle-in-a-box method (see Section 4.4).
The first semiempirical calculation was published in 1989 by Lippens and Lannoo [15].
They used the tight-binding approach to model CdS and ZnS quantum dots. As
depicted in Figure 4.1 and calculated first by Brus, they found an increasing energy
gap between HOMO and LUMO with decreasing nanocrystal size. Moreover, their
results fit much better with experimental data than those obtained with the effective-mass
approximation (EMA).
Further refinements include the linear combination of atomic orbitals (LCAO) [16],
the semiempirical pseudopotential calculation [17] and the k·p method [18]. All of
these methods provide estimates for the size-dependent bandgap of quantum dots.
Even though the agreement of the calculations with the experimental data differs
slightly, the general result is clear: the bandgap of the quantum dots increases with
decreasing size of the nanocrystals. These results were as expected and thus not very
useful for the experimental scientist so far. However, they paved the way for more
sophisticated calculations, such as the inclusion of defects in nanocrystals and the
presence of surface ligands. Very few examples of such calculations have been
published so far. One example is the calculation of the size-dependent behavior of the
quantum dot-ligand bond [19].
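A compact way to see this size dependence is the Brus formula mentioned above; the sketch below implements it with rough CdSe-like parameters (our assumed values, not numbers from refs [13, 14]) and reproduces the qualitative trend, including the EMA's overestimate at small radii.

```python
import math

HBAR = 1.0545718e-34     # J s
M0 = 9.1093837e-31       # free electron mass, kg
Q_E = 1.602176634e-19    # elementary charge, C
EPS0 = 8.8541878128e-12  # vacuum permittivity, F/m

def brus_gap_eV(radius_nm, eg_bulk=1.74, me=0.13, mh=0.45, eps_r=10.0):
    """Size-dependent gap from the Brus effective-mass approximation:

        E(R) = Eg + (hbar^2 pi^2 / 2R^2)(1/me + 1/mh) - 1.8 e^2/(4 pi eps R)

    Default parameters are rough literature values for CdSe (assumptions);
    treat the output as a trend, not a fit.
    """
    r = radius_nm * 1e-9
    confinement = (HBAR * math.pi) ** 2 / (2 * r ** 2) \
        * (1 / (me * M0) + 1 / (mh * M0))
    coulomb = 1.8 * Q_E ** 2 / (4 * math.pi * eps_r * EPS0 * r)
    return eg_bulk + (confinement - coulomb) / Q_E

for r in (1.0, 2.0, 4.0):  # radius in nm
    print(f"R = {r} nm  ->  E_gap ~ {brus_gap_eV(r):.2f} eV")
```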
4.4
Shrinking Bulk Material to a Quantum Dot
We now consider the case of a three-dimensional solid with size dx, dy, dz containing N
free electrons. Free means that those electrons are delocalized and thus not bound
to individual atoms. Furthermore, we will make the assumption that the interactions
between the electrons, and also between the electrons and the crystal potential, can
be neglected as a first approximation. Such a model system is called a free
electron gas [20, 21]. Astonishingly, this oversimplified model still captures many
of the physical aspects of real systems. From more complicated theories, it has been
learnt that many of the expressions and conclusions from the free electron model
remain valid as a first approximation even when one takes electron-crystal and
electron-electron interactions into account. In many cases it is sufficient to replace
the free electron mass m by an effective mass m* which implicitly contains the
corrections for the interactions. To keep the story simple, we proceed with the free
electron picture. In the free electron model, each electron in the solid moves with a
velocity $\vec{v} = (v_x, v_y, v_z)$. The energy of an individual electron is then just its kinetic
energy:5)
$$E = \frac{1}{2}m\vec{v}^{\,2} = \frac{1}{2}m\,(v_x^2 + v_y^2 + v_z^2) \qquad (4.12)$$
($m_s = \pm 1/2$), only two electrons with opposite spins can have the same velocity $\vec{v}$. This
case is analogous to the Bohr model of atoms, in which each orbital can be occupied
by two electrons at maximum. In solid-state physics, the wavevector $\vec{k} = (k_x, k_y, k_z)$
of a particle is more frequently used instead of its velocity to describe the particle's
state. Its absolute value $k = |\vec{k}|$ is the wavenumber. The wavevector $\vec{k}$ is directly
proportional to the linear momentum $\vec{p}$ and thus also to the velocity $\vec{v}$ of the
electron:
$$\vec{p} = m\vec{v} = \frac{h}{2\pi}\,\vec{k} \qquad (4.13)$$
The scaling constant is Planck's constant h and the wavenumber is related to the
wavelength $\lambda$ associated with the electron through the de Broglie relation [20, 21]
(Figure 4.2):
$$k = |\vec{k}| = \frac{2\pi}{\lambda} \qquad (4.14)$$
The calculation of the energy states for a bulk crystal is based on the assumption
of periodic boundary conditions (Figure 4.2). Periodic boundary conditions are a
mathematical trick to simulate an infinite ($d \to \infty$) solid. This assumption
implies that the conditions at opposite borders of the solid are identical. In this way,
an electron that is close to the border does not really feel the border. In other
words, the electrons at the borders behave exactly as if they were in the bulk. This
condition can be realized mathematically by imposing the following condition on
the electron wavefunctions: $\psi(x, y, z) = \psi(x + d_x, y, z)$, $\psi(x, y, z) = \psi(x, y + d_y, z)$ and
$\psi(x, y, z) = \psi(x, y, z + d_z)$. In other words, the wavefunctions must be periodic with a
period equal to the whole extension of the solid [21, 22]. The solution of the
stationary Schrödinger equation under such boundary conditions can be factorized
into the product of three independent functions $\psi(x, y, z) = \psi(x)\psi(y)\psi(z) =
A\exp(ik_x x)\exp(ik_y y)\exp(ik_z z)$. Each function describes a free electron moving along
one Cartesian coordinate. In the argument of the functions, $k_{x,y,z}$ is equal to
$n\Delta k = n \cdot 2\pi/d_{x,y,z}$ and n is an integer [20-22]. These solutions are waves that
propagate along the positive and negative directions for $k_{x,y,z} > 0$ and $k_{x,y,z} < 0$,
respectively. An important consequence of the periodic boundary conditions is that
all the possible electronic states in $\vec{k}$-space are equally distributed. There is an
easy way of visualizing this distribution in the ideal case of a one-dimensional free
electron gas: there are two electrons ($m_s = \pm 1/2$) in the state $k_x = 0$ ($v_x = 0$), two
electrons in the state $k_x = +\Delta k$ ($v_x = +\Delta v$), two electrons in the state $k_x = -\Delta k$
($v_x = -\Delta v$), two electrons in the state $k_x = +2\Delta k$ ($v_x = +2\Delta v$) and so on.
For a three-dimensional bulk material we can follow an analogous scheme. Two
electrons ($m_s = \pm 1/2$) can occupy each of the states $(k_x, k_y, k_z) = (n_x\Delta k, n_y\Delta k, n_z\Delta k)$,
again with $n_{x,y,z}$ being integers. A sketch of this distribution is shown in
Figure 4.3. We can easily visualize the occupied states in $\vec{k}$-space because all these
states are included in a sphere whose radius is the wavenumber associated with the
highest energy electrons. At the ground state, at 0 K, the radius of the sphere is the
Fermi wavenumber $k_F$ (Fermi velocity $v_F$). The Fermi energy $E_F \propto k_F^2$ is the energy of
the last occupied electronic state. All electronic states with an energy $E \leq E_F$ are
occupied, whereas all electronic states with higher energy $E > E_F$ are empty. In a solid,
the allowed wavenumbers are separated by $\Delta k = 2\pi/d_{x,y,z}$. In a bulk material $d_{x,y,z}$
is large and so $\Delta k$ is very small. Then the sphere of states is filled
quasi-continuously [21].
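Filling the Fermi sphere with two electrons per state leads directly to the familiar expression for the Fermi energy; the short sketch below evaluates it for a copper-like electron density (an assumed textbook value) as a consistency check.

```python
import math

HBAR = 1.0545718e-34    # J s
M_E = 9.1093837e-31     # electron mass, kg
EV = 1.602176634e-19    # J per eV

def fermi_energy_eV(n):
    """Fermi energy of a 3D free electron gas with electron density n (1/m^3):

        E_F = (hbar^2 / 2m) * (3 pi^2 n)^(2/3)

    which follows from filling the k-space sphere up to k_F with two
    electrons per state.
    """
    kf = (3 * math.pi ** 2 * n) ** (1 / 3)  # Fermi wavenumber, 1/m
    return HBAR ** 2 * kf ** 2 / (2 * M_E) / EV

# Copper has roughly 8.5e28 conduction electrons per m^3 (textbook value):
print(f"E_F(Cu) ~ {fermi_energy_eV(8.5e28):.1f} eV")  # ~7 eV
```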
We need now to introduce the useful concept of the density of states $D_{3d}(k)$,
which is the number of states per unit interval of wavenumbers. From this
definition, $D_{3d}(k)\Delta k$ is the number of electrons in the solid with a wavenumber
between k and $k + \Delta k$. If we know the density of states in a solid we can calculate,
for instance, the total number of electrons having wavenumbers less than a given
$k_{max}$, which we will call $N(k_{max})$. Obviously, $N(k_{max})$ is equal to $\int_0^{k_{max}} D_{3d}(k)\,dk$. In
the ground state of the solid all electrons have wavenumbers $k \leq k_F$, where $k_F$ is
the Fermi wavenumber. Since in a bulk solid the states are homogeneously
distributed in $\vec{k}$-space, we know that the number of states between k and $k + \Delta k$ is
proportional to $k^2\Delta k$ (Figure 4.3). This can be visualized in the following way.
The volume in three-dimensional $\vec{k}$-space scales with $k^3$. If we only want to count
the number of states with a wavenumber between k and $k + \Delta k$, we need to
determine the volume of a spherical shell with radius k and thickness $\Delta k$. This
volume is proportional to the product of the surface of the sphere (which scales as $k^2$)
with the thickness of the shell (which is $\Delta k$). $D_{3d}(k)\Delta k$ is thus proportional to $k^2\Delta k$
and, in the limit when $\Delta k$ approaches zero, we can write
$$D_{3d}(k) = \frac{dN(k)}{dk} \propto k^2 \qquad (4.15)$$
$$D_{3d}(E) = \frac{dN(E)}{dE} = \frac{dN(k)}{dk}\,\frac{dk}{dE} \propto E \cdot \frac{1}{\sqrt{E}} \propto \sqrt{E} \qquad (4.16)$$
This can be seen schematically in Figure 4.3. With Equation 4.16 we conclude our
simple description of a bulk material. The possible states in which an electron can be
found are quasi-continuous. The density of states scales with the square root of the
energy. More details about the free electron gas model and more refined descriptions
of electrons in solids can be found in any solid-state physics textbook [20].
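The $k^2$ scaling can also be checked numerically by brute force: counting grid points inside spheres of growing radius, as in the sketch below, shows $N(k)/k^3$ approaching a constant, which is exactly the statement $D_{3d}(k) \propto k^2$ after differentiation.

```python
def count_states(k_max, n_side=40):
    """Count grid points (nx, ny, nz) inside a sphere of radius k_max.

    With Delta_k = 1 (arbitrary units) this mimics the filling of k-space
    described above; every grid point holds two electrons (spin up/down).
    """
    total = 0
    for nx in range(-n_side, n_side + 1):
        for ny in range(-n_side, n_side + 1):
            for nz in range(-n_side, n_side + 1):
                if nx * nx + ny * ny + nz * nz <= k_max ** 2:
                    total += 2
    return total

# N(k) grows ~ k^3, hence N(E) ~ E^(3/2) and D(E) = dN/dE ~ sqrt(E):
for k in (10, 20, 40):
    n = count_states(k)
    print(f"k_max = {k:2d}: N = {n:7d},  N / k^3 = {n / k**3:.2f}")
```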
4.4.2
Two-Dimensional Systems
We now consider a solid that is fully extended along the x- and y-directions, but whose
thickness along the z-direction ($d_z$) is only a few nanometers (Figure 4.5). Free
electrons can still move freely in the x-y plane. However, movement in the z-direction
is now restricted. Such a system is called a two-dimensional electron gas (2DEG) [23].
As mentioned in Section 4.2, when one or more dimensions of a solid become
smaller than the de Broglie wavelength associated with the free charge carriers, an
additional contribution of energy is required to confine the component of the motion
of the carriers along this dimension. In addition, the movement of electrons along
such a direction becomes quantized. This situation is shown in Figure 4.4. No
electron can leave the solid and electrons that move in the z-direction are trapped in a
box. Mathematically this is described by infinitely high potential wells at the borders
$z = \pm\frac{1}{2}d_z$.
The solutions for this particle-in-a-box situation can be obtained by solving the one-dimensional Schrödinger equation for an electron in a potential V(z), which is zero within the box but infinite at the borders. As can be seen in Figure 4.4, the solutions are stationary waves with energies

E_{n_z} = \frac{\hbar^2 k_z^2}{2m} = \frac{h^2 k_z^2}{8\pi^2 m} = \frac{h^2 n_z^2}{8 m d_z^2}, \quad n_z = 1, 2, \ldots

[10, 22]. This corresponds to states k_z = n_zΔk_z with Δk_z = π/d_z. Again, each of these states can be occupied at maximum by two electrons.
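To get a feeling for the magnitudes involved, the confinement energies E_{n_z} = h²n_z²/8md_z² can be evaluated numerically. The following short Python sketch (illustrative only; the free-electron mass and the well widths are assumptions chosen for this example) computes the first few levels for wells a few nanometers wide:

```python
# Energy levels of an electron in an infinitely deep 1D potential well,
# E_n = h^2 n^2 / (8 m d^2), as derived above.
h = 6.626e-34      # Planck constant (J s)
m = 9.109e-31      # free-electron mass (kg); in a real solid the
                   # effective mass of the material should be used
eV = 1.602e-19     # J per eV

for d_nm in (2.0, 5.0, 10.0):          # assumed well widths in nm
    d = d_nm * 1e-9
    levels = [h**2 * n**2 / (8 * m * d**2) / eV for n in (1, 2, 3)]
    print(f"d_z = {d_nm:4.1f} nm: " +
          ", ".join(f"E_{n} = {E*1e3:7.1f} meV"
                    for n, E in zip((1, 2, 3), levels)))
```

For d_z = 5 nm the ground-state confinement energy is on the order of 15 meV, i.e. comparable to k_BT at room temperature (about 25 meV), which illustrates why confinement effects become noticeable only for layers of a few nanometers thickness.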
Let us compare the states in k-space for three- and two-dimensional materials (Figures 4.3 and 4.5). For a two-dimensional solid that is extended in the x–y plane, only discrete values are allowed for k_z. The thinner the solid in the z-direction, the larger is the spacing Δk_z between those allowed states. On the other hand, the distribution of states in the k_x–k_y plane remains quasi-continuous. Therefore, one can describe the possible states in k-space as planes parallel to the k_x- and k_y-axes, with a separation Δk_z between the planes in the k_z-direction. We can number the individual planes with n_z. Since within one plane the number of states is quasi-continuous, the number of states is proportional to the area of the plane. This means that the number of states is proportional to k² = k_x² + k_y². The number of states in a ring with radius k and thickness Δk is therefore proportional to kΔk. Integration over all rings yields the total area of the plane in k-space. Here,
D_{2d}(k) = \frac{dN(k)}{dk} \propto k    (4.17)
In the ground state, all states with k ≤ k_F are occupied with two electrons. We now want to know how many states exist for electrons that have energies between E and E + ΔE. From Equations 4.12 and 4.13 we know the relation between k and E for free electrons: E(k) ∝ k², and thus k ∝ √E and dk/dE ∝ 1/√E. By using Equation 4.17 we obtain the density of states for a two-dimensional electron gas; see also Figure 4.5 [22]:

D_{2d}(E) = \frac{dN(E)}{dE} = \frac{dN(k)}{dk}\,\frac{dk}{dE} \propto \sqrt{E} \cdot \frac{1}{\sqrt{E}} \propto 1    (4.18)
Let us now consider the case in which the solid also shrinks along a second (y) dimension. Now electrons can only move freely in the x-direction and their motion along the y- and z-axes is restricted by the borders of the solid (Figure 4.6). Such a system is called a quantum wire and, when electrons are the charge carriers, a one-dimensional electron system (1DES). The charge carriers and excitations can now move only in one dimension and occupy quantized states in the other two dimensions.
The states of a one-dimensional solid can now be obtained by methods analogous to those described for the three- and two-dimensional materials. In the x-direction electrons can move freely and again we can apply the concept of periodic boundary conditions. This gives a quasi-continuous distribution of states parallel to the k_x-axis and for the corresponding energy levels. Electrons are confined along the remaining directions and their states can be derived from the Schrödinger equation for a particle in a box potential. Again, this yields discrete k_y and k_z states. We can now visualize all possible states as lines parallel to the k_x-axis. The lines are separated by discrete intervals along k_y and k_z, but within one line the distribution of k_x states is quasi-continuous (Figure 4.6). We can count the number of states along one line by measuring the length of the line. The number of states is therefore proportional to k = k_x. Hence the number of states with wavenumbers in the interval between k and k + Δk is proportional to Δk:
D_{1d}(k) = \frac{dN(k)}{dk} \propto 1    (4.19)
In the ground state, all states with k ≤ k_F are occupied with two electrons. From Equations 4.12 and 4.13, we know the relation between k and E for free electrons: E(k) ∝ k², and thus k ∝ √E and dk/dE ∝ 1/√E. By using Equation 4.19, we obtain the density of states for a one-dimensional electron gas:

D_{1d}(E) = \frac{dN(E)}{dE} = \frac{dN(k)}{dk}\,\frac{dk}{dE} \propto 1 \cdot \frac{1}{\sqrt{E}} \propto \frac{1}{\sqrt{E}}    (4.20)
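The dimensionality dependence derived in Equations 4.16, 4.18 and 4.20 can be summarized in a few lines of code. The following Python sketch (a qualitative illustration only; prefactors are set to unity, since only the scaling with energy matters here) evaluates the three scaling laws:

```python
import math

# Scaling of the density of states with energy (prefactors omitted):
# 3D: D(E) ~ sqrt(E), 2D: D(E) ~ const, 1D: D(E) ~ 1/sqrt(E)
def dos(E, dim):
    if dim == 3:
        return math.sqrt(E)
    if dim == 2:
        return 1.0
    if dim == 1:
        return 1.0 / math.sqrt(E)
    raise ValueError("dimension must be 1, 2 or 3")

for E in (0.1, 0.5, 1.0, 2.0):   # energy in arbitrary units
    print(f"E = {E:4.1f}:  D_3d ~ {dos(E, 3):5.2f}   "
          f"D_2d ~ {dos(E, 2):5.2f}   D_1d ~ {dos(E, 1):5.2f}")
```

The opposite trends for 3D and 1D are apparent: while a bulk solid has few states at low energy, a quantum wire concentrates its states near the bottom of each subband, which is why the density of states of a quantum wire diverges there.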
When charge carriers and excitations are confined in all three dimensions, the system is called a quantum dot. The division is somewhat arbitrary since, for instance, clusters made of very few atoms are not necessarily considered as quantum dots. Although such clusters are smaller than the de Broglie wavelength, their properties depend critically on their exact number of atoms. Larger clusters have a well-defined lattice and their properties no longer depend critically on their exact number of atoms. We shall then refer to such systems with the term quantum dots [54–64] (Figure 4.7).
4.5
Energy Levels of a (Semiconductor) Quantum Dot
model. As described in the previous section (Figure 4.4), the lowest energy for an electron in a one-dimensional potential well is

E_{well,1d} = \frac{1}{8}\,\frac{h^2}{m d^2}    (4.21)

where d is the width of the well. In a quantum dot, the charge carriers are confined in all three dimensions and this system can be described as an infinite three-dimensional potential well. The potential energy is zero everywhere inside the well but is infinite on its walls. We can also call this well a box. The simplest shapes for a three-dimensional box can be, for instance, a sphere or a cube. If the shape is cubic, the Schrödinger equation can be solved independently for each of the three translational degrees of freedom and the overall zero-point energy is simply the sum of the individual zero-point energies for each degree of freedom [10, 65]:

E_{well,3d\,cube} = 3E_{well,1d} = \frac{3}{8}\,\frac{h^2}{m d^2}    (4.22)
For a spherical box of diameter d, solving the Schrödinger equation yields a zero-point energy of

E_{well,3d\,sphere} = \frac{h^2}{2 m d^2}    (4.23)

In a semiconductor quantum dot both the electron and the hole are confined, so that the total confinement energy is

E_{well}(d) = \frac{h^2}{2 m^{*} d^2}    (4.24)

with the reduced effective mass m* given by

\frac{1}{m^{*}} = \frac{1}{m_e} + \frac{1}{m_h}    (4.25)
where m_e and m_h are the effective masses for electrons and holes, respectively. In order to calculate the energy required to create an electron–hole pair, another term (E_Coul) has to be considered. The Coulomb interaction E_Coul takes into account the mutual attraction between the electron and the hole, multiplied by a coefficient that describes the screening by the crystal. In contrast to E_well, the physical content of this term can be understood within the framework of classical electrodynamics. However, an estimate of such a term is only possible if the wavefunctions for the electron and the hole are known. The strength of the screening depends on the dielectric constant ε of the semiconductor. An estimate of the Coulomb term yields

E_{Coul} = -1.8\,\frac{e^2}{2\pi \varepsilon \varepsilon_0 d}    (4.26)
This term can be fairly significant because the average distance between an electron and a hole in a quantum dot can be small [13, 14, 55, 69, 70]. We can now estimate the size-dependent energy gap of a spherical semiconductor quantum dot, which is given by the following expression [13, 14, 55, 68–70]:

E_g(dot) = E_g(bulk) + E_{well} + E_{Coul}    (4.27)

Then, by inserting Equations 4.24 and 4.26 into Equation 4.27, we obtain

E_g(d) = E_g(bulk) + \frac{h^2}{2 m^{*} d^2} - 1.8\,\frac{e^2}{2\pi \varepsilon \varepsilon_0 d}    (4.28)
Here we have emphasized the size dependence in each term. Equation 4.28 is only a first approximation. Many effects, such as crystal anisotropy and spin–orbit coupling, have to be considered in a more sophisticated calculation. The basic approximation for the bandgap of a quantum dot comprises two size-dependent terms: the confinement energy, which scales as 1/d², and the Coulomb attraction, which scales as 1/d. The confinement energy is always a positive term and thus the energy of the lowest possible state is always raised with respect to the bulk situation. On the other hand, the Coulomb interaction is always attractive for an electron–hole pair system and therefore lowers the energy. Because of the 1/d² dependence, the quantum confinement effect becomes the predominant term for very small quantum dot sizes (Figure 4.9).
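Equation 4.28 is easy to evaluate numerically. The short Python sketch below does this for CdSe as an example; the material parameters (bulk gap of about 1.74 eV, effective masses m_e ≈ 0.13 m₀ and m_h ≈ 0.45 m₀, dielectric constant ε ≈ 10) are approximate literature values chosen here for illustration, not taken from this chapter:

```python
import math

# Size-dependent bandgap of a spherical quantum dot, Equation 4.28:
# E_g(d) = E_g(bulk) + h^2/(2 m* d^2) - 1.8 e^2/(2 pi eps eps0 d)
h, e, eps0, m0 = 6.626e-34, 1.602e-19, 8.854e-12, 9.109e-31
Eg_bulk = 1.74                             # eV, approximate bulk gap of CdSe
me, mh, eps = 0.13 * m0, 0.45 * m0, 10.0   # approximate CdSe values (assumed)
mstar = 1.0 / (1.0 / me + 1.0 / mh)        # reduced mass, Equation 4.25

def gap_eV(d_nm):
    d = d_nm * 1e-9
    well = h**2 / (2 * mstar * d**2) / e                     # ~1/d^2 term
    coul = 1.8 * e**2 / (2 * math.pi * eps * eps0 * d) / e   # ~1/d term
    return Eg_bulk + well - coul

for d_nm in (2, 3, 4, 6, 8):
    print(f"d = {d_nm} nm  ->  E_g ~ {gap_eV(d_nm):.2f} eV")
```

As expected from the discussion above, the 1/d² confinement term dominates for the smallest dots and the predicted gap rises steeply below about 3 nm; for such small sizes the effective mass approximation itself becomes increasingly crude, so the numbers are indicative only.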
The size-dependent energy gap can be a useful tool for designing materials with well-controlled optical properties. A much more detailed analysis of this topic can be found in, for example, a paper by Efros and Rosen [61].
In this chapter, we have shown how the dependence of the energy gap of semiconductors on the size of the material can be explained either by shrinking the material down from bulk to nanometer dimensions or by assembling the material atom by atom. Both views, one a top-down and the other a bottom-up approach, ultimately lead to the same physics. In Chapter 3, a more general description of these two types of approaches is given.
References
1 Parak, W.J., Manna, L., Simmel, F.C.,
Gerion, D. and Alivisatos, P. (2004)
Nanoparticles from Theory to Application
(ed. G. Schmid), 1st edn., Wiley-VCH,
Weinheim, 4.
2 Lane, N. (2001) Journal of Nanoparticle
Research, 3, 95.
3 Service, R.F. (2000) Science, 290,
1526.
4 Kingon, A.I., Maria, J.-P. and Streiffer, S.K.
(2000) Nature, 406, 1032.
5 Lloyd, S. (2000) Nature, 406, 1047.
6 Ito, T. and Okazaki, S. (2000) Nature, 406,
1027.
7 Peercy, P.S. (2000) Nature, 406, 1023.
8 Cohen-Tannoudji, C., Diu, B. and Laloe, F.
(1997) Quantum Mechanics, 1st edn.,
Wiley, New York.
9 Yoffe, A.D. (2001) Advances in Physics,
50, 1.
10 Atkins, P.W. (1986) Physical Chemistry, 4th
edn., W. H. Freeman, New York.
11 Karplus, M. and Porter, R.N. (1970) Atoms
and Molecules, 1st edn., W. A. Benjamin,
New York.
12 Alivisatos, A.P. (1997) Endeavour, 21, 56.
13 Brus, L.E. (1983) Journal of Chemical
Physics, 79, 5566.
14 Brus, L.E. (1984) Journal of Chemical
Physics, 80, 4403.
15 Lippens, P.E. and Lannoo, M. (1989)
Physical Review B-Condensed Matter, 39,
10935.
16 Delerue, C., Allan, G. and Lannoo, M.
(1993) Physical Review B-Condensed Matter,
48, 11024.
17 Wang, L.-W. and Zunger, A. (1996) Physical
Review B-Condensed Matter, 53, 9579.
18 Fu, H., Wang, L.-W. and Zunger, A. (1998)
Physical Review B-Condensed Matter, 57,
9971.
19 Schrier, J. and Wang, L.-W. (2006) Journal
of Physical Chemistry B, 110, 11982.
20 Kittel, C. (1989) Einführung in die Festkörperphysik, 8th edn., R. Oldenbourg Verlag, Munich.
5
Fundamentals and Functionality of Inorganic Wires, Rods and Tubes
Jörg J. Schneider, Alexander Popp, and Jörg Engstler
5.1
Introduction
Nanostructured one-dimensional inorganic tubes, wires and rods are known for a variety of single elements and combinations thereof. The number of studies of their synthesis and properties is now vast. For nanowires (anisotropic nanocrystals with a large aspect ratio, i.e. length to diameter), around 5000 papers have been published during the last 2 years. An excellent comprehensive monograph and timely reviews presenting the state of the art up to 2005 exist [1].
In this chapter, the physical properties of 1D inorganic structures will be discussed first, followed by a section devoted to general techniques for the synthesis of inorganic wires, rods and tubes. The chapter then highlights some of the material developments made over the last few years in the very active area of nanostructured inorganic rods, wires and tubes with respect to their specific materials functionality. [The field of carbon nanotubes (CNTs) is probably still the fastest growing of all 1D materials. CNTs will only be touched upon in this chapter as far as sensing and nano/micro integration in functional devices are concerned. For further reading on this topic, the reader is referred to the numerous excellent monographs in the field.] Due to the breadth of the field, the selection of materials and applications is somewhat subjective and reflects what the authors personally feel are hot topics. Where could the materials discussed impact on future technological developments? The fields of sensing and micro/nanoelectronics integration will be selectively addressed here. It is the intention of this chapter to introduce the reader to these fields and to the currently ongoing rapid developments in these promising future fields of functional 1D nanomaterials.
A drastic change in materials properties is often connected with the nanoscale range (1–100 nm). In addition to an understanding of fundamental size-related electronic effects [quantum size effects (QSE)], which are connected with this miniaturization of matter (for an intriguing description of how quantum phenomena arise in 0D and 1D nanostructured matter, see Chapter 4), interest in nanostructured materials often arises from the fact that the small size connected with nanoscaled matter creates new chemistry. For example, the extremely high number of interfaces connected with
5.2
Physical Properties of 1D Structures
Starting from a three-dimensional solid, the confinement of electrons into a one-dimensional structure leads to the quantization of electronic states in two directions (let us say x and y). In the z-direction electrons can move freely and give rise to a quasi-continuous distribution of energy states along this dimension. Along x and y they are confined and only discrete states are possible. Once the diameter of the 1D system is comparable to the de Broglie wavelength of the electrons, the 1D structure becomes a quantum wire. Probably the most important application-related aspect of this dimensionality reduction in 1D materials is the restricted flow of charge carriers in only a single direction, the conductivity channel (see Section 5.4.3.1). An intriguing example can be found in single-crystalline silicon nanowires smaller than
5 nm in diameter (5 nm is the exciton size for Si). Via scanning tunneling microscopy (STM), the electronic states and the bandgaps of such Si wires have been probed for wires with different diameters <10 nm. Gaps ranging from 1.1 eV (7 nm Si wire diameter) up to 3.5 eV (1.3 nm Si wire diameter) have been determined, demonstrating the extreme quantum confinement effect in such structures (Figure 5.2) [3].
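These two measured values can be compared with the 1/d² confinement scaling derived in Chapter 4. The following Python sketch (a rough consistency check only; the single fit constant is derived here from the two quoted data points, not taken from Ref. [3]) anchors the confinement term at the 1.3 nm measurement and evaluates the resulting curve:

```python
# Confinement scaling E_g(d) = E_g(bulk) + C/d^2 anchored to the STM data
# quoted in the text: E_g = 3.5 eV at d = 1.3 nm; bulk Si gap 1.1 eV.
Eg_bulk = 1.1                       # eV, bulk silicon
C = (3.5 - Eg_bulk) * 1.3**2        # eV nm^2, single-point fit (assumption)

for d in (1.3, 2.0, 3.0, 5.0, 7.0):  # wire diameter in nm
    print(f"d = {d:3.1f} nm  ->  E_g ~ {Eg_bulk + C / d**2:.2f} eV")
```

At 7 nm the confinement term has dropped below 0.1 eV, consistent with the near-bulk gap measured for the thickest wires.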
A variety of direction-dependent electronic effects, for example the polarization of absorbed and emitted light, are also different for anisotropic 1D materials compared with 0D materials [4]. Although electron conduction in 1D wires and rods is favored along the preferred direction, phonon transport is greatly impeded in thin 1D nanostructures due to boundary scattering in the confined directions. Electrons may also suffer elastic scattering events during their journey along the wire. This has important implications for the heat conductivity of 1D nanowire structures and for the application of such structures in the wiring or circuiting of next-generation semiconductor devices. On the other hand, poor heat transport in confined nanowires may be exploited for the development of thermoelectric materials. The underlying thermoelectric (Seebeck) effect converts a temperature gradient into an electrical voltage; the efficiency of this conversion improves when phonon transport through the structure worsens (due to the 1D confinement effect) while the electronic conductivity is retained. Theory has predicted a significant increase of the thermoelectric performance over bulk values, depending on the diameter, composition and charge carrier concentration of the 1D material of choice [5]. Nevertheless, research in this technologically important area is still in its infancy. An example is discussed in Section 5.3.1. For a detailed discussion of mesoscopic transport phenomena, for example boundary scattering in 1D confined structures, the reader is referred to Ref. [6].
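The figure of merit usually quoted in this context is not given explicitly in this chapter, but the standard dimensionless form makes the argument concrete (S is the Seebeck coefficient, σ the electrical conductivity, T the absolute temperature, and κ_el and κ_ph the electronic and phononic contributions to the thermal conductivity):

ZT = \frac{S^2 \sigma T}{\kappa_{el} + \kappa_{ph}}

Boundary scattering in a thin wire suppresses κ_ph while, ideally, leaving S and σ largely intact, which is why 1D confinement is predicted to raise ZT above the bulk value.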
5.3
Synthetic Methods for 1D Structures
Synthetic methods for obtaining 1D nanostructures are numerous and have been reviewed by various authors recently [7]. They can be divided into top-down and bottom-up techniques; in the former, a desired nanostructure is formed from a bulk material by sophisticated physical techniques, for example electron beam or laser structuring. Such techniques, however, are always restricted by the wavelength of the structuring beam, and thus nanomaterials with dimensions well below 10 nm are still beyond the scope of these methods.
Bottom-up techniques, in contrast, show a rich diversity with respect to the accessible chemical and materials compositions and also structure (amorphous, crystalline), morphology (0D, 1D and 2D) and size. Several bottom-up synthetic techniques leading to the formation of 1D structures which seem to have a more general impact can be identified: the template method, self-assembly techniques, vapor–liquid–solid synthesis of 1D nanostructures and electrochemical techniques. These techniques show an enormous breadth, since a variety of elemental compositions for different structures and morphologies and also sizes are accessible. Often combinations of individual techniques are employed, broadening further the scope for the experimentalist.
5.3.1
The Template Approach
This is probably the most versatile method for the synthesis of 1D structures. A host structure with a pore morphology (the template) is filled either with a compact, more or less dense material or with a film. The former produces compact wires or rods, whereas the latter results in the formation of tubes.
Filling of the pores is possible via solution (simply by capillary filling), by electrochemical deposition techniques or via the gas phase.
Oxidation of valve metals such as aluminum, titanium, tantalum and hafnium leads to porous metal oxide films on the metal surfaces [8]. For aluminum this is a useful and outstanding technique to prepare both surface-attached and free-standing porous 2D alumina films (after detachment from the metal surface) with varying pore diameters. The pore size of these films is strongly dependent on the experimental conditions and can be varied between 10 and several hundred nm [9]. Especially in the case of aluminum, these films can be fairly thick (up to several tens of μm) or thin (down to several hundred nm), but still self-supporting, free-standing and easy to handle. The diameter of the pores (D_p) and of the cells (D_c) of a porous alumina membrane depends on the anodization potential applied in the electrolytic process [9]. Additional experimental parameters governing the pore size are temperature and current density. The latter is influenced by the concentration and type of electrolyte used in the electrolysis process. Nearly perfectly ordered pores are accessible by prestructuring the metallic surface [9a] (Figure 5.3).
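As a rough rule of thumb (not stated in this chapter), the anodization literature often quotes a linear dependence of roughly 1.3 nm per volt for the pore diameter and about 2.5 nm per volt for the cell diameter of self-ordered porous alumina; the exact proportionality constants depend on the electrolyte and are assumptions here. A minimal sketch in Python:

```python
# Rule-of-thumb estimate of pore and cell diameter of porous alumina
# versus anodization potential (proportionality constants are rough
# literature values and depend strongly on the electrolyte).
K_PORE = 1.3   # nm per volt (assumed)
K_CELL = 2.5   # nm per volt (assumed)

for U in (25, 40, 195):   # typical potentials for the sulfuric, oxalic
                          # and phosphoric acid self-ordering regimes
    print(f"U = {U:3d} V  ->  D_p ~ {K_PORE * U:5.0f} nm, "
          f"D_c ~ {K_CELL * U:5.0f} nm")
```

The well-known self-ordering regimes (about 25 V in sulfuric, 40 V in oxalic and 195 V in phosphoric acid) then map onto cell sizes of roughly 60, 100 and 500 nm, in line with the 10 to several hundred nm pore-size range quoted above.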
Consequently, porous alumina membranes have been widely used to prepare various types of mesoscale materials within their pores. Their enormous synthetic impact in the area of mesostructured 1D materials comes from the ability to combine well-established wet chemical synthesis techniques (e.g., sol–gel chemistry) with the straightforward synthesis and subsequent filling of porous alumina templates.
Preparation of active sol precursors, filling and aging within the mesopores of alumina, followed by final calcination steps, have led to a huge variety of different mesostructured 1D materials, for example in the ceramics field [1e, 11]. The pores can even be used to arrange silica in a columnar or circular arrangement with a defined internal mesostructure. Furthermore, the entrapment of a variety of 1D nanostructures such as metallic rods (Pt, Au, Pd), semiconductor rods and carbon nanostructures into the mesopores of alumina has been reported recently [1e, 11].
Using sol–gel processing on the surface of (1 0 0)-oriented silicon templates, a porous structure with an ordered arrangement of pores is formed. These studies show that a topographic structure (here Si) can be used to engineer pores on a solid surface, allowing a high degree of freedom. The filling of such pores with cobalt has been demonstrated [12].
Another template technique giving porous, free-standing membranes, but of polymeric materials, uses heavy-ion track etching of polymer foils [13]; the statistically distributed ion tracks within the polymer are then filled with nanoscale materials in order to form rod-like structures [14, 15] (Figure 5.4). This led to bunches of randomly oriented composite 1D structures. Careful dissolution of the template gives free-standing aligned metallic rods with well-defined crystal lattices (Figures 5.5 and 5.6).
Figure 5.4 Ion track etching process of a polymer film. Heavy ions hit the polymer foil and generate tracks in the membrane (a). The tracks are stochastically arranged (b); solid rods are synthesized in the tracks, for example via electrochemical deposition (c); free rods are obtained after complete dissolution of the polymer membrane (d).
Using the template approach in connection with electrodeposition allows the pores of such templates to be filled with single- or multicomponent metallic rods. After deposition of a metal electrode on the back side of such a porous template, one or even more metals can be densely deposited inside the pores in a controlled manner. This has been used, for example, for the deposition of Au/Pt segments. The deposition of gradient materials via electrodeposition is also possible. This technique has also been used for the deposition of magnetic structures. Their magnetic behavior depends on the aspect (length-to-diameter) ratio of the individual rods. The easy axis of magnetization is parallel to the nanorod or nanowire axis if the electrodeposited structure is longer than it is wide [16]. Otherwise, it is perpendicular to the deposited structure. This difference depending on particle shape (morphology, platelet vs. rod structure) can be used to align bimetallic nanorods side-by-side (Figure 5.7) [17a] or one after the other when the magnetic field is applied parallel to the substrate and to the easy axis of the 1D magnetic material [17b].
Hybrid Co/Au nanorod structures are accessible by using organometallic chemistry. The appropriate choice of the molecular Co and Au precursors and of the stabilizing ligand allows control of the growth process and of the overall morphology of the rod (tip or whole-body growth) [18]. Combining this technique with the template method should lead to 2D-arranged hybrid metallic rods.
Figure 5.6 High-resolution TEM micrograph of a 70-nm single-crystalline Au nanowire. Low magnification (a); enlarged area (b). Reproduced with permission from Ref. [14].
Devising methods for aligning 1D materials with a uniform growth front is of general importance for optimizing device performance, for example in thermoelectric materials such as bismuth telluride. For instance, overgrowth of rods once the template pores are already filled has to be avoided, in order not to cover the pores of the template with active material, which may lead to short-circuiting. This has been achieved by applying a pulsed-potential deposition technique, which yielded a uniform growth front of Bi₂Te₃ nanowire arrays in porous alumina [19]. Using nanorods as sacrificial templates to generate polymeric or ceramic nanorod structures is another method that uses porous alumina as the initial template and shows the high versatility of this porous template structure. Electrodeposition of, for example, nickel rods into porous alumina, followed by dissolution of the oxide template and coating of the metallic wires with either organic polymers or inorganic polyelectrolyte ceramics, finally gives the corresponding polymeric or ceramic (after subsequent calcination) 1D structures. The complete method is a hybrid technique of templating followed by layer-by-layer deposition [20].
Although the nano-templating approach using porous alumina (Figure 5.8) works well for template diameters down to about 20 nm, reports on 1D nanomaterials with smaller diameters prepared with this so-called hard-template technique [1e] are still scarce [21].
5.3.2
Electrochemical Techniques
5.3.2.1 Electrospinning
This technique can be used to create nano- to microscale fibers of mainly polymeric materials [22]. Recent advances in the technique of electrospinning have brought this
dense mesostructured ceramic. Since the as-prepared structure is already built from nanosized particles, this step can often be performed under milder conditions than those typically used in conventional ceramic processing [32].
5.3.3
Vapor–Liquid–Solid (VLS) and Related Synthesis Techniques
For the synthesis of 1D structures from the gas phase, the vapor–liquid–solid (VLS) and vapor–solid (VS) processes are the typical growth mechanisms accepted to explain the 1D growth of mesoscopic structures [33]. In this growth process, a catalyst particle first melts, becomes saturated with a gaseous precursor and, when supersaturated, extrudes either an elemental or a compound wire (depending on the precursor) to form a single-crystal nanowire. This is essentially the method proposed for whisker growth from the vapor [33]. Nanowires grow as long as active catalyst is supplied and the growth temperature is maintained (Figure 5.11).
Recent model studies on the influence of various metal catalysts on the growth, structure, morphology and size of inorganic nanowires have shown that different catalyst metals with different crystal morphologies are able to generate different wire morphologies based on the VLS formation process [34, 35]. For the growth of Si nanowires with Au nanocluster catalysts, it was found that the lowest-energy surface, which is a {1 1 1} plane for Si, controls nucleation and growth (Figure 5.12) [34]. The results point towards the importance of an additional effect of oxygen in the gold-catalyzed growth kinetics of Si wires. The presence of oxygen can suppress Au catalyst
migration and influences the wire diameter and the overall wire morphology [36]. The presence of oxygen, be it in the form of gas-phase oxygen or surface-bound oxygen, could be an important experimental parameter for modulating nanowire morphology in a more general way during VLS growth [36].
For ZnO nanowires, it has been shown recently that control of the partial O₂ pressure under growth conditions is crucial for reproducible large-scale wire formation (Figure 5.13). It is very likely that the gas-phase conditions play a major role in the VLS growth of this and, speculatively, also of other 1D materials made by this technique. Studies towards understanding the influence of reactive gas-phase species are central to deducing how the individual morphologies of 1D materials depend on catalyst composition, shape and crystallinity in addition to the overall reaction conditions [37].
The current understanding of growth control of nanowires via the VLS technique is that (a) the interplay given by the temperature–pressure phase diagram and (b) the composition of the precursor elements and of the catalyst particle under consideration are crucial. Careful control of these conditions can give rise to a variety of wire morphologies which become accessible at will once the conditions are thoroughly adjusted.
As shown, catalyst tips at the faceted ends of the nanowires are indicative of the VLS mechanism. Silicides are interesting semiconductors with promising thermoelectric properties. Even though silicide wires (MSi₂, M = Fe, Co, Cr) were grown in the presence of nickel or iron [38], no catalyst metal particles are detected on the faceted nanowire ends, as usually observed for the VLS growth mode. Obviously, in the chemical vapor transport (CVT) technique, which is used for the synthesis of silicide wires, a gas-phase mechanism different from the widely found catalyst-driven VLS growth may operate.
A promising technique related to the Au-seeded VLS nanowire growth method has been reported. It allows oriented low-temperature gas-phase growth of pure Si nanowires [39]. Plasma growth at temperatures as low as 300 °C enables an additional degree of synthetic control over nanowire orientation (Figure 5.14). The crucial step in the formation process seems to be Si incorporation at the vapor–liquid interface.
Additional techniques also related to the traditional VLS-type growth mechanism are catalyst-driven growth from the solution/liquid phase into the solid state (SLS) and also from a supercritical fluid solution (SFLS) into the solid state. The SFLS technique is synthetically complementary to the long-established technique of seed-mediated growth from solution, which yields one-dimensional colloidal metallic nanostructures [40]. In the former, a molecular precursor and, in the latter, a nanoparticle is decomposed under controlled conditions in solution or under supercritical conditions and then serves as a source for the metal catalyst particles from which the crystalline 1D structure grows (Figure 5.15) [41]. For this growth process from solution, low-melting metals as catalyst particles are essential (e.g., In, Bi, Sn). For higher melting points (e.g., Au, Ge), supercritical solvent conditions have been successfully employed [41–43]. This synthetic technique allowed, for the first time, access to colloidal Au quantum wires which show a spectroscopic QSE.
Figure 5.15 Solution–Liquid–Solid mechanism for nanowire growth from solution. Adapted from Ref. [41].
5.4
Contacting the Outer World: Nanowires and Nanotubes as Building Blocks in Nano/Micro/Macro-Integration
Sensors based on a one-dimensional morphology (wire, tube) offer selective recognition of biological and chemical species of interest [48, 49]. This is based on their unique electronic and optical properties, which have been studied in detail either for isolated objects or, in a disordered fashion, for bundles of them. However, the control and use of well-arranged, aligned 1D structures have also made enormous progress within the last few years and have an impact on the applicability of such structures in sensor devices. The basis for this is the field effect transistor (FET) geometry, which has been extensively explored as a device bridging the micro- and the nanoworld. In a FET device the contact between the source and drain electrodes consists of a semiconductor. Its conductivity can be modulated by a third electrode, the gate, which is capacitively coupled to the semiconductor through a dielectric layer and thereby controls the charge carrier density in the channel. Since the binding of charged or polar molecules to the 1D semiconductor structure alters the channel characteristics in the same way, this may lead to an accumulation or depletion of charge carriers and thus to an increase or decrease in device conductance (Figure 5.16).
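The gating effect of adsorbed surface charge can be made quantitative with a very simple cylinder model. In the sketch below (a minimal estimate, not taken from this chapter; the wire radius, carrier density and adsorbed charge density are all assumed example values), an adsorbed surface charge density σ adds or removes 2σ/(e·r) carriers per unit volume of a wire of radius r; since mobility and wire length cancel in the relative change, the conductance response follows directly:

```python
# Minimal cylinder model of a nanowire FET sensor. The wire conductance
# G = n e mu pi r^2 / L changes when an adsorbed surface charge density
# sigma induces (or depletes) Delta n = 2 sigma / (e r) carriers per
# unit volume (charge sigma*2*pi*r*L spread over the volume pi*r^2*L).
e = 1.602e-19     # elementary charge (C)
n = 1e24          # carrier density (m^-3), assumed doping of ~1e18 cm^-3
sigma = 1e-3      # adsorbed surface charge density (C m^-2), assumed

for r in (5e-9, 10e-9, 50e-9):          # wire radius (m)
    delta_n = 2 * sigma / (e * r)       # induced carrier density change
    print(f"r = {r*1e9:5.1f} nm  ->  |Delta n| / n = {delta_n / n:6.1%}")
```

For the thinnest wires the induced change exceeds the total carrier density, i.e. the wire can be fully depleted by the adsorbed molecules, whereas in a planar film only a thin surface layer responds; this is precisely the argument made in the following paragraph.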
The idea was already put forth in the 1980s, but only for planar devices, which, however, show only limited applicability for this sensor principle [50–52]. In a planar 2D film, only the near-surface region is altered by this effect, whereas in a single-crystalline rod the surface binding of an analyte has a strong impact on the depletion or accumulation of charge carriers in the overall volume of the nanoscale 1D object. The size of this effect depends, of course, on the size of the wire (which needs to be in the region of 25 nm) and on its hybridization with a biomolecule (e.g., DNA). This can create a high charge density on the nanowire surface, which produces an electrostatic gating effect. This so-called field effect reduces the charge carrier concentration and results in an increase in resistance (V_gate > 0) (Figure 5.16).
With respect to CNT structures, the separation of metallic and semiconducting single-walled CNTs, albeit still in small quantities, has recently been achieved. Dielectrophoresis is the key to the nanoscale manipulation and separation of CNTs. An AC field induces a polarization in a nanoscale object and this results in a force which can be used to manipulate and assemble nano-objects [53].
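For a small spherical particle, the magnitude and, importantly, the sign of this force are captured by the standard dielectrophoresis expression (quoted here for orientation; for elongated objects such as CNTs the geometry factor differs, but the physics is the same):

F_{DEP} = 2\pi \varepsilon_m r^3 \,\mathrm{Re}[K(\omega)]\, \nabla |E_{rms}|^2, \qquad K(\omega) = \frac{\varepsilon_p^{*} - \varepsilon_m^{*}}{\varepsilon_p^{*} + 2\varepsilon_m^{*}}

Here ε_p* and ε_m* are the complex permittivities of the particle and the medium. Metallic tubes, with their very large effective permittivity, always experience positive dielectrophoresis (attraction to field maxima), whereas semiconducting tubes can be repelled at suitable frequencies; this sign difference is what makes the dielectrophoretic separation of the two CNT species possible.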
Dielectrophoresis in physiologically relevant saline solution has been performed
and it has been shown that this technique is useful in manipulating nanowires across
and semiconducting CNTs. The technique uses a field change occurring during nanotube deposition between the individual electrodes of a preformed microstructured device (Figure 5.22) [55, 65]. Although not yet fully understood, a redistribution of the electric field around the CNT in the gap seems to be important for the organization, rather than a short-circuiting of the gap electrodes by the entrapped object [65].
In this approach, the overall dimensions of an individually contacted CNT device are limited only by the dimensions of the contacting electrodes rather than by the
dimensions of the CNT itself. However, this seems to be the case for other approaches also [66]. So far, the electrical device characteristics of over 100 CNTs have thus been measured individually. Only up to 10% of the electrodes were bridged by multiple CNTs [55].
Taking further into account the established routes for functionalizing CNTs [67], this approach seems viable for applying CNTs in a way comparable to that described above for Si wire arrays, for example as sensor devices for the multiple detection of analytes. The immobilization of biomolecules is a sector of intense research activity [68]. Due to the electronic properties of individual CNTs (e.g., as semiconducting or metallic tubes), their sensing of biomolecules could be very selective. Therefore, combining nanotubes with biosystems may provide access to nanosized biosensors. The first step towards biomolecular interaction is the attachment of the biomolecule to the tube. This can be achieved either via a covalent bonding interaction or via a wrapping mechanism which uses non-covalent interactions, for example on the basis of van der Waals forces. The former needs functionalization of the CNTs followed by covalent bonding of the target biomolecule and is already well established. The latter relies on physisorption mechanisms and works typically for proteins.
For the development of covalent CNT functionalization techniques, a toolbox of methods which have shown success in fullerene functionalization has been advantageously employed so far [67]. For example, after functionalization of CNTs with bromomalonates containing thiol groups, attachment to gold surfaces is possible, allowing the synthesis of electrode arrays for sensor applications [69]. Chemical functionalization of CNTs has also paved the way to DNA–CNT adducts and subsequently to studies of their properties as biosensors. However, it seems that DNA attachment occurs mainly towards the ends of the nanotubes [70]. This site-specific interaction points towards sequence-specific poly(nucleic acid)–DNA base pairing rather than an unspecific interaction. This might be helpful for the differentiation between two DNA sequences [68]. Pyrene functionalization of DNA itself can lead to a very specific DNA–CNT adduct, whose stability is mediated via hydrophobic interactions between the graphite-type sidewalls of the nanotubes and the pyrene anchors of the DNA [71].
DNA–CNT adducts are highly water soluble and therefore DNA functionalization of nanotubes avoids the use of surfactants, which are often needed to solubilize CNTs alone. As with native DNA, the adducts with CNTs are still charged species. Specific DNA sequence detection with electrochemical CNT–DNA biosensors has been reported [72]. So far still a theoretical promise, the specific size-selective major-groove binding of B-DNA to single-walled CNTs could lead to a composite structure with unique sensor properties for ultrafast DNA sequencing or electronic switching, based on the combination of the individual electronic properties of the single components CNT and DNA joined together in this composite structure (Figure 5.23) [73].
In this respect, it is intriguing that a strong binding capability of the major groove of DNA towards 0D nanoparticles has already been proven for gold nanoclusters, both theoretically and experimentally [74].
5.4.2
Piezoelectrics Based on Nanowire Arrays
Converting mechanical energy into electric energy or signals is the area of piezoelectronics and relies on specific structures and morphologies of materials. Recently, nanomaterials have come into the focus of this application-driven area. ZnO is the material which probably exhibits the most diverse morphological configurations of any nanomaterial studied in detail so far. In addition to particles and wires, nanobelts, nanosprings, nanorings, nanobows and nanohelices are known and characterized for ZnO [75]. Apart from its use as a catalyst and sensor material, which can be considered the more traditional areas for 1D nanoscale ZnO [76], the semiconducting and piezoelectric properties of ZnO nanowire arrays have recently been probed as piezoelectrics with unique properties. Such a ZnO-based nanogenerator converts mechanical energy into electric power and vice versa using massively aligned ZnO nanowires (Figure 5.24) [77, 78]. An electric field is created by deformation of a ZnO nanowire within the array via the piezoelectric effect. Across the top of the nanowire the potential distribution varies from negative at the compressed side of the surface (V_s^-) to positive at the stretched surface (V_s^+). This potential difference, measured in millivolts, is due to the piezoelectric effect. On an atomic scale, the displacement of Zn²⁺ ions in the wurtzite lattice with respect to the O²⁻ counterions is responsible for it. The charges cannot move or recombine without releasing the strain via mechanical movement. As long as the potential difference is maintained (via constant deformation), the system is stable. When external free charges are introduced (e.g., via a metal tip), the wire is discharged. The current which then flows is the result of the flow of electrons, driven by the potential difference ΔV = V_s^+ - V_s^-, from the semiconducting ZnO wire to the metal tip. This flow of electrons neutralizes the ionic charges in the volume of the ZnO wire and reduces the potentials V_s^+ and V_s^- [77, 78].
limited due to the weak van der Waals bonds between the electronically active molecular building blocks. This situation is general when crystalline organic semiconductors are compared with inorganic crystalline ones.
For the exploitation of inorganic semiconductor nanowires it is important to integrate them in massive arrangements in semiconductor devices, for example thin-film transistors (TFTs), in which they might offer increased electronic performance and future device integration. The former is due to the fact that drastically higher charge carrier mobilities μ can be obtained when using multiple single-crystal nanowires in FET devices instead of polycrystalline thin films (Figure 5.25) [82].
In a poly-Si TFT channel material, the electrical carriers have to travel across multiple grain boundaries (a curved pathway), whereas in a massively parallel-arranged single-crystal nanowire array, charge carriers travel from source to drain within a single-crystalline structure, which ensures a high carrier mobility μ. The same effect has been observed, for example, for compound semiconductors such as ZnO, where sintering of isolated 0D ZnO nanoparticles first has to be employed to obtain coarse-grained polycrystalline thin films (Figure 5.26) [83]. When the electronic performance of polycrystalline ZnO thin films is compared with that of single-crystalline ZnO nanowires deposited between the source and drain of a FET device, the performance of the ZnO nanowires is intriguing (Table 5.1).
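A rough feeling for why grain boundaries are so harmful can be obtained from the classic thermionic-emission picture of polycrystalline transport, in which each boundary presents an energy barrier E_B and the effective mobility is suppressed by a Boltzmann factor. The sketch below is a simplified illustration with assumed barrier heights, not a model taken from this chapter:

```python
import math

# Grain-boundary-limited effective mobility (Seto-type picture):
# mu_eff ~ mu_0 * exp(-E_B / kT). A single-crystal nanowire channel
# corresponds to E_B = 0 (no boundaries between source and drain).
kT = 0.0259                 # eV at room temperature
mu0 = 100.0                 # intra-grain mobility, cm^2/(V s), assumed

for E_B in (0.0, 0.05, 0.1, 0.2):    # assumed barrier heights in eV
    mu_eff = mu0 * math.exp(-E_B / kT)
    print(f"E_B = {E_B:4.2f} eV  ->  mu_eff ~ {mu_eff:8.2f} cm^2/(V s)")
```

Even a modest 0.1 eV barrier suppresses the effective mobility by a factor of about 50 at room temperature, which is consistent with the large mobility gap between polycrystalline films and single-crystal nanowire channels discussed above.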
Calculations have shown that for a channel length (the geometric source-to-drain distance) of 20 μm in a FET device, the number of grain boundary contacts is diminished by a factor of 6 for an electrode composed of a 1D wire morphology compared with a polycrystalline ZnO semiconductor electrode. In addition, the reduced hopping frequency of the charge carriers across the grain boundaries in the 1D ZnO structures, compared with the polycrystalline ZnO thin-film morphology, contributes to their increased mobility (Figure 5.27) [85]. One may imagine how this can
Table 5.1 Electronic performance (on/off ratios, ranging from about 10 to 1.6 × 10⁵) of Si- and ZnO-based nanowire FET devices compared with their thin-film counterparts [84, 85, 89, 92].
mobilities within the array comparable to those already measured in pristine single tubes. These results clearly indicate that alignment of 1D nanostructures is crucial for the most effective device integration of CNTs [87].
The channel conductivity of nanowire and nanotube transistor devices has been treated theoretically and a universal analytical description has been developed. It has been found that the transconductance of the channel differs from classical device theory because of the specific nanowire charge distribution. Mainly the different electrostatics of the 1D channel structure are responsible for the different device characteristics [88].
Due to the microscale length dimensions in which many semiconductor nanowire materials are obtainable, large-area substrates such as standard FET device structures can already be applied in device fabrication using 1D nanomaterials. An intriguing example is the arrangement of crystalline Si nanowires, several tens of micrometers long, between the source and drain of a FET device by a solution-derived process (Figure 5.29) [84]. Therein, the growth and integration of the particular nanowire material are separated from the device fabrication. Wire synthesis is done via a CVD approach, which of course is not compatible with the device fabrication process. However, the wires can be solution processed and, in a second follow-up step, deposited in a highly aligned fashion. The two-step route is applicable to sensitive FET substrates such as plastics and points towards a general route for the future integration of nanowires into micro/macro-integrated device architectures. In each case a single-crystalline Si wire connects the source and drain; the distance between the individual wires is 500–1000 nm, which ensures that the wires have no contact. Extraordinarily high charge carrier mobilities have been realized [82, 89].
The idea of separating the two individual processes, (i) material synthesis and (ii) materials processing into device architectures, has been used to arrange semiconductor nanowires on flexible polymer-based substrates. Seminal for the development of this field of printed inorganic devices was probably the report of the first printed all-inorganic thin-film field effect transistor based on 0D CdSe nanoparticles [90]. Synthesis of semiconductor rods or wires was performed, for example, via a CVD process on a silicon wafer. First, the wafer-based source material was structured via a lithographic etching procedure. This technique works for a variety of 1D materials such as single-walled CNTs, GaN, GaAs and Si wires [84a]. An independent dry transfer process of the previously synthesized nanowires, using an elastomeric stamp-based printing technique, involves transfer of the 1D structures to a flexible substrate, for example polyimide. It was thus possible to fabricate transistor devices with the highest electronic performance and also very high mechanical flexibility (Figure 5.30).
A generally applicable condensed-phase approach for the alignment of nanowires uses the Langmuir–Blodgett technique [91]. It allows a massively parallel arrangement on planar substrates. The technique can be used in a layer-by-layer process, allowing crossed nanowire structures with a defined pitch to be formed (Figures 5.31 and 5.32). Moreover, this approach may allow the use of nanowires of different composition within the layering process. This gives rise to organized nanowire heterostructures.
Due to the hierarchical scaling of the structures from the nanometer up to the micrometer regime, reliable electrical contacting is possible.
Molecular precursors for the synthesis of polycrystalline silicon wires represent a promising alternative to the widely employed CVD route to inorganic semiconductor materials. Known since the early days of organosilicon chemistry, hydrogenated silicon compounds exist as chain (SiₙH₂ₙ₊₂) or ring molecules (SiₙH₂ₙ). For n ≥ 3, these molecules are liquids and decompose at around
300 °C to give Si. Using cyclopentasilane, Si₅H₁₀, thin films of this precursor have been solution processed and converted to polycrystalline Si (Figure 5.33) [92]. This route is compatible with inkjet printing and has been used to set up a complete TFT device architecture via printing. Charge carrier mobilities of 6.5 cm² V⁻¹ s⁻¹ have been reported [92]. Other FET device architectures have also been studied using nanowires as active materials. FETs with surrounding gates have been synthesized and their electrical performance studied for Si [93]. Similar structures have been fabricated with InAs as the active semiconductor. InAs exhibits a high electron mobility and tends to form ideal ohmic contacts to many metals (Figure 5.34) [94]. Such structures are expected to show enhanced transconductance along the wire [95].
A complete flow process for the fabrication of a silicon nanowire array with vertical surround-gate FETs has been devised (Figure 5.35) [96]. Such an architecture was proposed earlier for CNTs, but inorganic nanowires have the advantage that they can be grown as mechanically stiff vertical objects, whereas this is more complicated with single-walled CNTs [97].
Surround-gate-type FETs have also been reported for Ge nanowire arrays. First, the nanowire array with the surround-gate shell structure is synthesized via a multistep CVD process, followed by patterning of the nanosized core–shell Ge/Al₂O₃/Al structure from solution onto a Si substrate. Finally, the structure is electrically contacted. This leads to a macroscopic device with multiple surround-gate nanowires arranged in a quasi-parallel fashion (Figure 5.36) [98].
repeated on the initially formed nanowire trunks to finally yield the branch structure in a follow-up process step. In this second process step, the individual synthesis conditions, for example the molecular wire precursor, can be varied, giving rise to branched nanowire heterostructures. The individual nanowire diameter of the branches is determined by the Au catalyst particle size; however, additional deposition of material on the side facets of the wires leads to a thickening of the wires, and a tapered structure is therefore often observed [108].
The hierarchical growth of already branched nanowire structures into higher organized ensembles can also be realized in a solution-based growth process [105]. The precursor (e.g., an elemental chalcogenide) and a capping agent [e.g., an alkyl-
(aryl)phosphine] first initiate the nanorod growth, followed by branching, due to the faceted growth mechanism, into a nanotetrapod structure. An end-selective extension and further branching of such nanotetrapods are then possible by adding a new precursor to the same mixture. However, removal of the grown nano-objects from the initial synthesis mixture, redispersion and regrowth, for example with a different precursor, are also possible and allow further hierarchical growth at the ends of the original tetrapod structures. This growth gives rise to the formation of branched heterostructures. Thus, interfaces with different compositions can be designed into these structures, allowing control over the charge carriers (electrons or holes) due to tailoring of the interface.
A detailed study of the catalyst-driven (via Au aerosol particles) growth of branched heterostructures combining the Group III–V materials GaAs, GaP, InP and AlAs has led to the conclusion that the growth mechanisms of such heterostructures depend on the relationship between the interface energies of the growing materials and the catalyst particle. It turned out that the growth of straight heterostructures in a particular crystal direction seems favored over growth in other directions.
Finally, comparing gas-phase techniques and solution-based routes to branched nanowire structures, both methods show great potential for the synthesis of higher organized morphologies based on 1D wires. It is remarkable that in both synthetic approaches the growth can be initiated on already formed branched structures simply by adding fresh catalyst particles. However, the solution-based methods are surely favored, due to the relative ease of formation of such structures and their further assembly potential towards higher organized multibranched wire structures. One further step towards this end has been demonstrated by tipping the ends of individual CdSe rods or tetrapods on both sides with Au nanoparticles, in solution (Figure 5.39) [109].
Although already a true composite structure on its own (e.g., the excitonic spectra of the CdSe nanorod structure and the plasmonic Au spectra are not a superposition of the individual features of the individual semiconductor material and the metal nanoparticle), functionalization of the nanoparticle ends with alkanedithiols has led to assembled lines of individual composite nanowires (Figure 5.39) [109].
AuCl₃ concentration during tip growth (b); self-assembled chain of nano-CdSe dumbbells (real size 29 × 4 nm) formed by adding the bifunctional linker molecule hexanedithiol, which connects the Au-tipped ends of the CdSe dumbbells. Adapted from Ref. [110].
5.5
Outlook
Studies towards functionality in inorganic 1D materials form an interesting and interdisciplinary field, bringing together fundamental science and opening up opportunities towards possible applications of these materials in various areas. Directed, organized growth of such 1D objects is surely a key in most of the envisioned areas of application of nanoscience.
In recent years, it has become clear that in addition to a bottom-up synthetic approach to 1D nanomaterials, a top-down approach which allows structuring, assembly and integration of 1D nanomaterials on the next higher length scale is necessary to organize and use 1D materials as building blocks in functional devices. The question no longer seems to be which of the two techniques, bottom-up or top-down, is the more powerful approach to new functional devices, but rather how we can make use of both technologies in the most efficient way to bridge the dimension gap: nano-micro-macro. In areas where current microtechnologies and materials are well compatible with bottom-up techniques for the synthesis and handling of 1D nanomaterials, they already work hand in hand, and the unique properties of, for example, 1D materials can be exploited successfully. This has already been demonstrated in current electronic and optoelectronic devices such as field effect transistors, light-emitting diodes, gas sensors and nanoresonators [110]. A key to the development of this field is certainly strong interdisciplinary research efforts between chemists, physicists and engineers.
References
1 (a) Rao, C.N.R. and Govindaraj, A. (2005) Nanotubes and Nanowires, Royal Society of Chemistry, Cambridge; (b) Yang, P. and Poeppelmeier, K.R. (2006) Inorganic Chemistry Forum: Special Issue on Nanowires. Inorganic Chemistry, 45, 7509–7510; (c) Tenne, R. (2006) Nature Nanotechnology, 1, 103–111; (d) Murphy, C.J., Gole, A.M., Hunyadi, S.E. and Orendorff, C.J. (2006) Inorganic Chemistry, 45, 7544–7554; (e) Kline, T.R., Tan, M., Wang, J., Sen, A., Chan, M.W.H. and Mallouk, T.E. (2006) Inorganic Chemistry, 45, 7555–7565; (f) Xiang, X., Yang, P., Sun, Y., Wu, Y., Mayers, B., Gates, E., Yin, Y., Kim, F. and Yan, H. (2003) Advanced Materials, 15, 353–389; (g) Goldberger, J., Fan, R. and Yang, P. (2006) Accounts of Chemical Research, 39,
38 (a) Schmitt, A.L., Bierman, M.J., Schmeisser, D., Himpsel, F.J. and Jin, S. (2006) Nano Letters, 6, 1617–1621; (b) Szczech, J.P., Schmitt, A.L., Bierman, M.J. and Jin, S. (2007) Chemistry of Materials, 19, 3238–3243.
39 Aella, P., Ingole, S., Petuskey, W.T. and Picraux, S.T. (2007) Advanced Materials, 19, 2603–2607.
40 Murphy, C.J., Gole, A.M., Hunyadi, S.E. and Orendorff, C.J. (2006) Inorganic Chemistry, 45, 7544–7554.
41 Wang, F., Dong, A., Sun, J., Tang, R., Yu, H. and Buhro, W.E. (2006) Inorganic Chemistry, 45, 7511–7521.
42 Hanrath, T. and Korgel, B.A. (2002) Journal of the American Chemical Society, 124, 1424–1429.
43 Hanrath, T. and Korgel, B.A. (2003) Advanced Materials, 15, 437–440.
44 Smith, P.A., Nordquist, C.D., Jackson, T.N., Mayer, T.S., Martin, B.R., Mbindyo, J. and Mallouk, T.E. (2000) Applied Physics Letters, 77, 1399–1401.
45 Whang, D., Jin, S., Wu, Y. and Lieber, C.M. (2003) Nano Letters, 3, 1255–1259.
46 Huang, Y., Duan, X., Wei, Q. and Lieber, C.M. (2001) Science, 291, 630–633.
47 Snow, E.S., Novak, J.P., Campbell, P.M. and Park, D. (2003) Applied Physics Letters, 82, 2145–2147.
48 Patolsky, F. and Lieber, C.M. (2005) Materials Today, 8, 20–28.
49 Patolsky, F., Timko, B.P., Zheng, G. and Lieber, C.M. (2007) MRS Bulletin, 32, 142–149.
50 Bergveld, P. (1972) IEEE Transactions on Bio-Medical Engineering, BME-19, 342–351.
51 Blackburn, G.F. (1987) in Biosensors: Fundamentals and Applications (ed. A.P.F. Turner), Oxford University Press, Oxford, p. 481.
52 Hafeman, D.G., Parce, J.W. and McConnell, H.M. (1988) Science, 240, 1182–1185.
53 (a) Smith, P.A., Nordquist, C.D., Jackson, T.N., Mayer, T.S., Martin, B.R., Mbindyo, J. and Mallouk, T.E. (2000) Applied Physics Letters, 77, 1399–1401.
84 (a) Duan, X., Niu, Ch., Sahi, V., Chen, J., Parce, J.W., Empedocles, St. and Goldman, J.L. (2003) Nature, 425, 274–278; (b) Ong, B.S., Li, Ch., Li, Y., Wu, Y. and Loutfy, R. (2007) Journal of the American Chemical Society, 129, 2750–2751.
85 Sun, B. and Sirringhaus, H. (2005) Nano Letters, 5, 2408–2413.
86 (a) Dürkop, T., Getty, S.A., Cobas, E. and Fuhrer, M.S. (2004) Nano Letters, 4, 35–39; (b) Snow, E.S., Novak, J.P., Campbell, P.M. and Park, D. (2003) Applied Physics Letters, 82, 2145–2147; (c) Xiao, K., Liu, Y., Hu, P., Yu, G., Wang, X. and Zhu, D. (2003) Applied Physics Letters, 82, 2145–2147; (d) Bradley, K., Gabriel, J.C.P. and Grüner, G. (2003) Nano Letters, 3, 1353–1355; (e) Seidel, R., Graham, A.P., Unger, E., Duesberg, G.S., Liebau, M., Steinhoegl, W., Kreupl, F. and Hoehnlein, W. (2004) Nano Letters, 4, 831–834; (f) Zhou, Y., Gaur, A., Hur, S.-H., Kocabas, C., Meitl, M.A., Shim, M. and Rogers, J.A. (2004) Nano Letters, 4, 2031–2035.
87 Kocabas, C., Hur, S.H., Gaur, A., Meitl, M.A., Shim, M. and Rogers, J.A. (2005) Small, 1, 1110–1116.
88 Rotkin, S.V., Ruda, H.E. and Shik, A. (2003) Applied Physics Letters, 83, 1623–1626.
89 (a) Ridley, B.A., Nivi, B. and Jacobson, J.M. (1999) Science, 286, 746–748; (b) Ahn, J.-H., Kim, H.-S., Lee, K.J., Jeon, S., Kang, S.J., Sun, Y., Nuzzo, R.G. and Rogers, J.A. (2006) Science, 314, 1754–1757; (c) Menard, E., Lee, K.J., Khang, D.-Y., Nuzzo, R.G. and Rogers, J.A. (2004) Applied Physics Letters, 84, 5398–5400.
90 Ridley, B.A., Nivi, B. and Jacobson, J.M. (1999) Science, 286, 746–749.
91 Whang, D., Jin, S., Wu, Y. and Lieber, C.M. (2003) Nano Letters, 3, 1255–1259.
92 Shimoda, T., Matsuki, Y., Furusawa, M., Aoki, T., Yudasaka, I., Tanaka, H., Iwasawa, H., Wang, D., Miyasaka, M. and Takeuchi, Y. (2006) Nature, 440, 783–786.
93 Becker, J.S., Suh, S. and Gordon, R.G. (2003) Chemistry of Materials, 15, 2969–2976.
6
Biomolecule–Nanoparticle Hybrid Systems
Maya Zayats and Itamar Willner
6.1
Introduction
Metal and semiconductor nanoparticles (NPs) or quantum dots (QDs) exhibit unique electronic, optical and catalytic properties. The comparable dimensions of NPs or QDs and biomolecules such as enzymes, antigens/antibodies and DNA suggest that the integration of biomolecules with NPs (or QDs) into combined hybrid systems, uniting the recognition and catalytic properties of the biomolecules with the electronic, optical and catalytic features of the NPs, might yield new materials with predesigned properties and functions. For example, metallic NPs, such as Au or Ag NPs, exhibit size-controlled plasmon excitons. These plasmon absorbance bands are sensitive to the dielectric properties of the stabilizing capping layers of the NPs [1, 2] and to the degree of aggregation of the NPs, which leads to interparticle-coupled plasmon excitons [3, 4]. Thus, spectral changes occurring in metal NP assemblies as a result of biomolecule-induced recognition events that occur on NP surfaces and alter the surface dielectric properties of the NPs or stimulate aggregation might be used for optical biosensing. Similarly, semiconductor QDs reveal size-controlled absorption and fluorescence features [5–7]. The high fluorescence quantum yields of QDs and the stability of QDs against photobleaching can be used to develop new fluorescent labels for optical biosensors [8, 9]. Alternatively, metallic NPs exhibit catalytic functions reflected by their ability to catalyze the reduction and growth of NP seeds by the same metal or a different metal to form core–shell NPs. These properties may be applied to form NP-functionalized proteins or nucleic acids that provide hybrid systems acting as electroactive labels for amplified biosensing [10, 11] or as templates for growing nanostructures [12]. Furthermore, the coupling of biomolecules to metallic or semiconductor NPs might allow the use of the electron-conducting properties of metal NPs or the photoelectrochemical functions of semiconductor NPs to develop new electrical or photoelectrochemical biosensors. Indeed, tremendous scientific advances have been achieved in the last few years by conjugating biomolecules and NPs into functional hybrid systems. Numerous new
140
6.2
Metal Nanoparticles for Electrical Contacting of Redox Proteins
The electrical contacting of redox proteins with electrodes is a key issue in bioelectronics. Numerous redox enzymes exchange electrons with other biological components such as other redox proteins, cofactors or molecular substrates. The exchange of electrons between the redox centers of proteins and electrodes could activate the bioelectrocatalytic functions of these proteins and thus provide a route to the design of different amperometric biosensors. Most redox proteins lack, however, direct electron transfer communication with electrodes, and hence the bioelectrocatalytic activation of the redox enzymes is prohibited. The lack of electrical contact between the redox centers and the electrode surfaces is attributed to the spatial separation of the redox sites from the electrode by the protein shell [19]. Different methods to electrically communicate redox enzymes with electrodes were developed, including the application of diffusional electron mediators [20], the tethering of redox relays to the proteins [21–23] and the immobilization of redox proteins in electroactive polymers [24–26]. A recently developed procedure for the electrical contacting of redox proteins with electrodes involved the extraction of the native redox cofactor from the protein and the reconstitution of the resulting apo-enzyme on a surface modified with a monolayer consisting of a relay tethered to the respective cofactor units [27–30]. The reconstitution process aligns the protein on the electrode surface in an optimal orientation, while the relay units electrically contact the cofactor sites with the conductive support by shortening the electron transfer distances [31]. All of these methods permitted the bioelectrocatalytic activation of the respective enzymes and the development of amperometric biosensors and biofuel cells [32, 33].
The availability of conductive metal nanoparticles allowed the generation of nanoparticle–enzyme hybrid systems for controlled electron transfer [34, 35]. Recently, highly efficient electrical contacting of the redox enzyme glucose oxidase (GOx) through a single Au nanoparticle (Au NP) was demonstrated [36]. The GOx–Au NP conjugate was constructed by the reconstitution of an apo-flavoenzyme, apo-glucose oxidase (apo-GOx), on a 1.4-nm Au55 nanoparticle functionalized with N6-(2-aminoethyl)-flavin adenine dinucleotide (FAD cofactor, amino derivative, 1). The resulting enzyme–NP conjugate was assembled on a thiolated monolayer by using different dithiols (2–4) as linkers [Figure 6.1(A), route a]. Alternatively, the FAD-functionalized Au nanoparticle was assembled on a thiolated monolayer associated with an electrode, and apo-GOx was subsequently reconstituted on the functional nanoparticles [Figure 6.1(A), route b]. The enzyme electrodes prepared by these two routes revealed similar protein surface coverages of about 1 × 10⁻¹² mol cm⁻². The Au NP was found to act as a nanoelectrode relay that transports electrons from the FAD cofactor embedded in the protein to the electrode with no additional mediators, thus activating the bioelectrocatalytic functions of the enzyme. Figure 6.1(B) shows the cyclic voltammograms generated by the enzyme-modified electrode in the presence of different concentrations of glucose. The electrocatalytic currents increase as the concentration of glucose is elevated, and the appropriate calibration curve was extracted [Figure 6.1(B), inset]. The resulting nanoparticle-reconstituted enzyme electrodes revealed unprecedentedly efficient electrical communication with the electrode (electron transfer turnover rate about 5000 s⁻¹). This effective electrical contacting, far higher than the turnover rate of the enzyme with its native electron acceptor, oxygen (about 700 s⁻¹), made the enzyme electrode insensitive to oxygen and to ascorbic acid or uric acid, which are common interferents in glucose biosensing. The rate-limiting step in the electron transfer communication between the enzyme redox center and the electrode was found to be the charge transport across the dithiol molecular linker that bridges the particle to the electrode. The conjugated benzenedithiol (4) was found to be the most efficient electron-transporting unit among the linkers (2–4).
A similar concept was applied to electrically contact pyrroloquinoline quinone (PQQ)-dependent glucose dehydrogenase [37]. Apo-glucose dehydrogenase (apo-GDH) was reconstituted on PQQ-cofactor units (5) covalently linked to amino-functionalized Au NPs [Figure 6.2(A)]. The electrocatalytic anodic currents developed by the enzyme-modified electrode in the presence of variable concentrations of glucose are depicted in Figure 6.2(B). The resulting electrocatalytic currents imply that the system is electrically contacted and that the Au NPs mediate the electron transfer from the PQQ-cofactor center embedded in the protein to the electrode. Using the saturation current value generated by the system [Figure 6.2(B), inset] and knowing the surface coverage of the reconstituted enzyme, 1.4 × 10⁻¹⁰ mol cm⁻², the electron transfer turnover rate between the biocatalyst and the electrode was estimated to be 1180 s⁻¹, a value that implies effective electrical communication between the enzyme and the electrode and that leads to the efficient bioelectrocatalytic oxidation of glucose.
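Since coverages, currents and turnover rates are repeatedly interconverted in this discussion, a minimal Python sketch of the underlying amperometric relation, i_sat = nFAΓk_ET, may be useful. The two-electron count for glucose oxidation is standard; the 0.2 cm² electrode area is a hypothetical value introduced here for illustration only.

F = 96485.0       # Faraday constant, C mol^-1
n = 2             # electrons transferred per glucose oxidized
A = 0.2           # electrode area, cm^2 (assumed, not from the text)
Gamma = 1.4e-10   # surface coverage of reconstituted GDH, mol cm^-2 (from the text)
k_ET = 1180.0     # electron transfer turnover rate, s^-1 (from the text)

# Saturation current implied by the quoted coverage and turnover rate
i_sat = n * F * A * Gamma * k_ET
print(f"implied saturation current: {i_sat * 1e6:.0f} uA")

# Inverting the same relation recovers the turnover rate from a measured current
print(f"recovered turnover rate: {i_sat / (n * F * A * Gamma):.0f} s^-1")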
6.3
Metal Nanoparticles as Electrochemical and Catalytic Labels
Figure 6.1 (A) The assembly of an electrically contacted glucose oxidase (GOx) monolayer-functionalized electrode by the reconstitution of the apo-enzyme on an Au NP modified with the flavin adenine dinucleotide (FAD) cofactor (1). (B) Cyclic voltammograms corresponding to the bioelectrocatalyzed oxidation of different glucose concentrations.
Biomolecule–NP conjugates for such labeling applications may be prepared by the direct adsorption of the biomolecules on the NPs [3, 38, 39], the covalent tethering of the biomolecules to chemically functionalized NPs [40, 41] or the supramolecular binding of biomolecules to functionalized NPs, for example, the association of avidin-tagged nucleic acids with biotinylated NPs [42]. The biomolecule-modified NPs may be employed as electrochemical tracers for the amplified detection of biorecognition events [43, 44]. The chemical dissolution of the NP labels associated with the biorecognition events, followed by electrochemical preconcentration of the released ions on the electrode and the subsequent stripping off of the collected metal, provides a general means for the use of the particles as amplifying units for the biorecognition events [45]. Alternatively, the direct electrochemical stripping off of the NPs bound to the biorecognition complex was employed to transduce the biorecognition events [46–48]. These methods of stripping the electrochemically preconcentrated metals or of direct electrochemical dissolution of the metals led to detection limits improved by 3–4 orders of magnitude compared with the normal pulse voltammetric techniques used to monitor DNA hybridization.
The method for the detection of DNA by capturing gold [49, 50] or silver [51] nanoparticles on the hybridized target, followed by the anodic stripping off of the metal tracer, is depicted in Figure 6.3. Picomolar and sub-nanomolar levels of the DNA target have thus been detected. For example, the electrochemical method was employed for the Au NP-based quantitative detection of the 406-base human cytomegalovirus DNA sequence (HCMV DNA) [50]. The HCMV DNA was immobilized on a microwell surface and hybridized with complementary oligonucleotide-modified Au nanoparticles as labels. The resulting surface-immobilized Au nanoparticle double-stranded assembly was treated with HBr–Br2, resulting in the oxidative dissolution of the gold particles. The solubilized Au³⁺ ions were then electrochemically reduced and accumulated on the electrode and subsequently analyzed by anodic stripping voltammetry. The same approach was applied for analyzing an antigen [52] using Au nanoparticle labels and stripping voltammetry measurements. Further sensitivity enhancement can be obtained by catalytic enlargement of the gold tracer through nanoparticle-promoted precipitation of gold [49] or silver [53–55]. Combining such enlargement of the metal particle tags with the effective built-in amplification of electrochemical stripping analysis paved the way to sub-picomolar detection limits. The silver-enhanced Au NP stripping method was used for the detection of DNA sequences related to the BRCA1 breast cancer gene [53]. The method showed substantial signal amplification as a result of the catalytic deposition of silver on the gold tags; the silver signal was 125 times greater than that of the gold tag. A detection limit of 32 pM was achieved.
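The reported 125-fold signal gain is consistent with simple geometric arithmetic: the stripping signal scales with the number of deposited metal atoms, which grows with the cube of the particle radius. The following Python sketch uses illustrative, assumed particle sizes (a fivefold radius increase) and treats the enlarged particle as pure silver for simplicity.

import math

def metal_atoms(diameter_nm, density_g_cm3, molar_mass_g_mol):
    """Number of atoms in a spherical metal particle of the given diameter."""
    N_A = 6.022e23
    volume_cm3 = (math.pi / 6.0) * (diameter_nm * 1e-7) ** 3
    return volume_cm3 * density_g_cm3 / molar_mass_g_mol * N_A

au_seed = metal_atoms(13.0, 19.3, 197.0)    # assumed 13-nm Au seed
enlarged = metal_atoms(65.0, 10.5, 107.9)   # assumed silver-enlarged particle (5x radius)

print(f"Au atoms in seed:      {au_seed:.2e}")
print(f"Ag atoms after growth: {enlarged:.2e}")
print(f"stripping-signal gain: ~{enlarged / au_seed:.0f}-fold")   # ~5**3 = 125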
A further modification of the metal NP-induced amplified detection of DNA involved the combination of magnetic particles and biomolecule-functionalized metallic NPs as two inorganic units that operate cooperatively in biosensing events. The formation of a biorecognition complex on the magnetic particles, followed by labeling of the complex with the metallic NPs, allowed the magnetic separation of the metal-labeled recognition complexes and their subsequent electrochemical detection by stripping voltammetry [49, 53]. Whereas the magnetic separation of the labeled recognition complexes enhanced the specificity of the analytical procedures, the stripping off of the metallic labels contributed to the sensitivity of the analysis [56, 57]. Furthermore, the use of the metallic NP labels accumulated on the magnetic NPs as catalytic seeds for the electroless enlargement of the particles by metals provided a further amplification path for the electrochemical stripping of the labels [57].
Figure 6.4 depicts the amplified detection of DNA by the application of nucleic acid-functionalized magnetic beads and Au NPs as catalytic seeds for the deposition of silver [57]. A biotin-labeled nucleic acid (6) was immobilized on the avidin-functionalized magnetic particles and hybridized with the complementary biotinylated nucleic acid (7). The hybridized assembly was then reacted with the Au nanoparticle–avidin conjugate (8). Treatment of the magnetic particle–DNA–Au nanoparticle conjugate with silver ions (Ag⁺) in the presence of hydroquinone results in the electroless catalytic deposition of silver on the Au nanoparticles, which act as catalytic labels. The latter process provides the amplification path, since the catalytic accumulation of silver on the Au nanoparticle originates from a single DNA recognition event. The magnetic separation of the particles by an external magnet concentrated the hybridized assembly from the analyzed sample. The current originating from the voltammetric stripping off of the accumulated silver then provided the electronic signal that transduced the analysis of the target DNA. Also, Au nanoparticle-based detection of DNA hybridization based on the magnetically induced direct electrochemical detection of a 1.4-nm Au67 NP tag linked to the target DNA was reported [58]. The Au67 NP tag was directly detected after the hybridization process, without the need for acid dissolution.
An additional method to enhance the sensitivity of electrochemical DNA detection involved the use of polymeric microparticles (carriers) loaded with numerous Au nanoparticle tags [59]. The Au NP-loaded microparticles were prepared by binding biotinylated Au NPs to streptavidin-modified polystyrene spheres. The hybridization of the target DNA immobilized on magnetic beads with the nucleic acid-functionalized, Au NP-carrying polystyrene spheres, followed by the catalytic enlargement of the gold labels, magnetic separation and then detection of the hybridization event by stripping voltammetry (Figure 6.5), allowed the determination of DNA targets at a sensitivity corresponding to 300 amol. A further method for the amplified detection of biorecognition complexes included the use of Au NPs as carriers of electroactive tags [60]. That is, the redox-active units capping the Au NPs linked to the biorecognition complexes allowed the amperometric transduction of the biosensing process. A detection limit of 10 amol for analyzing DNA with the functionalized NPs was reported.
The metal NP labels might also be generated along the double-stranded DNA, rather than tethered by hybridization to the analyzed DNA, for the amplified electrical analysis of DNA. The generation of the metal nanoclusters along the DNA and their use for the amplified electrical analysis of DNA are depicted in Figure 6.6; they follow the concepts used for the fabrication of metallic nanowires (see Section 6.10) [61]. The method involves the immobilization of a short DNA primer on the electrode that hybridizes with the target DNA (step A). The negative charges associated with the phosphate units of the long target DNA collect Ag⁺ ions from the solution to form phosphate–Ag⁺ complexes (step B). The bound Ag⁺ ions are then reduced by hydroquinone, resulting in the formation of metallic silver aggregates along the DNA (step C). The subsequent dissolution and electrochemical stripping of the dissolved silver clusters (step D) then provide the route to detect the hybridized DNA.
The catalytic features of metal nanoparticles permit the subsequent electroless deposition of metals on the nanoparticle clusters associated along the DNA and the formation of enlarged, electrically interconnected, nanostructured wires. The formation of conductive domains as a result of biorecognition events then provides an alternative path for the electrical transduction of biorecognition events. This was exemplified by the design of a DNA detection scheme using microelectrodes fabricated on a silicon chip [62] (Figure 6.7). The method relied on the generation of a DNA–Au NP sandwich assay within the gap separating two microelectrodes. A probe nucleic acid (9) was immobilized in the gap separating the microelectrodes. The target DNA (10) was then hybridized with the probe interface and, subsequently, the nucleic acid (11)-functionalized Au nanoparticles were hybridized with the free 3′-end of the target DNA, followed by silver enhancement of the Au NP labels. Catalytic deposition of silver on the gold nanoparticles resulted in electrically interconnected particles, exhibiting low resistance between the electrodes. The resistance between the microelectrodes was controlled by the concentration of the target DNA, and the detection limit of the analysis was estimated to be about 5 × 10⁻¹³ M. A difference of 10⁶ in the gap resistance was observed upon analyzing the target DNA and its mutant by this method. A related conductivity immunoassay of proteins, based on Au NPs and silver enhancement, was also developed [63].
6.4
Metal Nanoparticles as Microgravimetric Labels
Nanoparticles provide a weight label that may be utilized for the development of microgravimetric sensing methods [quartz crystal microbalance (QCM)] that are ideal for the detection of biorecognition events. Also, the catalytic properties of metallic NPs may be employed to deposit metals on the NP-functionalized biorecognition complexes, thus allowing enhanced mass changes on the transducers. The change in the resonance frequency of the quartz crystal, Δf, relates to the mass change, Δm, according to the Sauerbrey equation:

Δf = −2f0²Δm/(A√(ρqμq))   (6.1)

where f0 is the fundamental frequency of the quartz crystal, Δm is the mass change, A is the piezoelectrically active area, ρq is the density of quartz (2.648 g cm⁻³) and μq is the shear modulus (2.947 × 10¹¹ dyn cm⁻² for AT-cut quartz). Thus, any mass change of the piezoelectric crystal is accompanied by a change in the resonance frequency of the crystal.
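As a numerical illustration of Equation 6.1, the following minimal Python sketch evaluates the frequency change for a given areal mass loading; the 9 MHz fundamental frequency is an assumed, typical value for such crystals, not one quoted in the text.

import math

def sauerbrey_df(delta_m_g, area_cm2, f0_hz=9e6):
    """Frequency change (Hz) from Equation 6.1:
    df = -2 * f0**2 * dm / (A * sqrt(rho_q * mu_q))."""
    rho_q = 2.648        # density of quartz, g cm^-3
    mu_q = 2.947e11      # shear modulus of AT-cut quartz, dyn cm^-2 (= g cm^-1 s^-2)
    return -2.0 * f0_hz ** 2 * delta_m_g / (area_cm2 * math.sqrt(rho_q * mu_q))

# 1 ng of bound material per cm^2 on the assumed 9 MHz crystal:
print(f"{sauerbrey_df(1e-9, 1.0):.3f} Hz")   # about -0.18 Hz

On such an assumed 9 MHz crystal, the 30 and 900 Hz changes quoted below would correspond to areal loadings of roughly 0.16 and 4.9 µg cm⁻², which conveys the scale of the dendritic amplification.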
The microgravimetric QCM method was applied to the amplified detection of DNA using nucleic acid-functionalized Au NPs as weight labels [66–68]. A target DNA molecule (13) was hybridized to an Au–quartz crystal that was modified with a probe oligonucleotide (12), and the (14)-functionalized Au NPs were hybridized to the 3′-end of the duplex DNA associated with the crystal (Figure 6.9). Subsequent secondary dendritic amplification was achieved by the interaction of the resulting interface with the target DNA (13) that had been pretreated with the (12)-functionalized Au NPs [67, 69]. Concentrations of DNA (13) as low as 1 × 10⁻¹⁰ M could be detected by the amplification of the target DNA with the nucleic acid-functionalized Au NP labels.
Also, the detection of DNA using nucleic acid-functionalized Au NPs and catalytic metal deposition on the NP labels was reported [70, 71]. The Au nanoparticles act as catalytic seeds and catalyze the reduction of AuCl4⁻ and the deposition of gold on the Au NPs. Thus, the catalytic enlargement of the nanoparticles increased the mass associated with the piezoelectric crystal and provided an active amplification route for the microgravimetric detection of the DNA. For example, Figure 6.10(A) depicts the amplified detection of the 7249-base M13mp18 DNA by using the catalytic enlargement of the Au NP labels. A related scheme employed secondary dendritic amplification for the sensing of thrombin. Upon analyzing thrombin at a concentration of 2 × 10⁻⁹ M, the primary amplification step resulted in a frequency change of 30 Hz, whereas the secondary amplification step altered the crystal frequency by 900 Hz.
6.5
Semiconductor Nanoparticles as Electrochemical Labels for Biorecognition Events
Semiconductor NPs, such as CdS, were employed as electrochemical labels for the amplified detection of the hybridization events of DNA [76]. Dissolution of the CdS (in the presence of 1 M HNO3), followed by the electrochemical reduction of the Cd²⁺ to Cd⁰, accumulated the metal on the electrode. The subsequent stripping off of the generated Cd⁰ (to Cd²⁺) provided the electrical signal for the DNA analysis. This method was further developed by using magnetic particles functionalized with probe nucleic acids as sensor units that hybridize with the analyte DNA, together with nucleic acid-functionalized CdS NP labels that hybridize with the single-strand domain of the analyte DNA and trace the primary formation of the probe–analyte double-stranded complex. The magnetic separation of the magnetic particle–CdS NP aggregates crosslinked by the analyte DNA, followed by dissolution of the CdS and electrochemical collection and stripping off of the Cd metal, provides the amplified electrochemical readout of the analyte DNA. In fact, this system combined the advantages of the magnetic separation of the tracer CdS NPs associated with the DNA recognition events with the amplification features of the electrochemical stripping method. Highly sensitive detection of DNA is accomplished by this method (detection limit 100 fmol, reproducibility RSD 6%) [76].
By using different semiconductor NPs as labels, the simultaneous and parallel analysis of different antibodies or different DNAs was accomplished. A model system for the multiplexed analysis of different nucleic acids with semiconductor NPs was developed [77]. Three different kinds of magnetic particles were modified with three different nucleic acids (21a–c) and subsequently hybridized with the complementary target nucleic acids (22a–c). The particles were then hybridized with three different kinds of semiconductor nanoparticles, ZnS, CdS and PbS, that were functionalized with nucleic acids (23a–c) complementary to the target nucleic acids associated with the magnetic particles [Figure 6.12(A)]. The magnetic particles allowed the easy separation and purification of the analyte samples, whereas the semiconductor particles provided nonoverlapping electrochemical readout signals that transduced the specific kind of hybridized DNA. Stripping voltammetry of the respective semiconductor nanoparticles yielded well-defined and resolved stripping waves, thus allowing the simultaneous electrochemical analysis of several DNA analytes. The same strategy was also applied to the multiplexed immunoassay of proteins [78], with the simultaneous analysis of four antigens. The arsenal of inorganic labels for the parallel multiplexed analysis of biomolecules, and the level of amplification, were further extended by using other metal sulfide composite nanostructures. For example, InS nanorods provided an additional resolvable voltammetric wave, while the nanorod configuration of the label increased the amplification efficiency owing to the higher content of stripped-off metal from the nanorod configuration as compared with a spherical NP structure [79].
This method of encoding biomolecular identity with semiconductor NPs was extended to the parallel analysis of different proteins by their specific aptamers [80]. An Au electrode was functionalized with aptamers specific for thrombin and lysozyme [Figure 6.12(B)]. Thrombin and lysozyme were labeled with CdS and PbS NPs, respectively, and the NP-functionalized proteins acted as tracer labels for the analysis of the proteins. The NP-functionalized proteins were linked to the respective aptamers and subsequently interacted with the nonfunctionalized thrombin or lysozyme. The competitive displacement of the respective labeled proteins associated with the surface by the analytes, followed by dissolution of the metal sulfides associated with the surface and detection of the released ions by electrochemical stripping, then allowed the quantitative detection of the two proteins [Figure 6.12(C)]. This method was further extended to the coding of single-nucleotide polymorphisms (SNPs) using different encoding QDs [81]. The protocol relied on ZnS, CdS, PbS and CuS NPs modified with four different mononucleotides and on the application of the NPs to construct different combinations for specific SNPs, which yielded distinct electronic fingerprints for the mutation sites.
6.6
Metal Nanoparticles as Optical Labels for Biorecognition Events
The unique size-controlled optical properties of metallic NPs, reflected by intense localized plasmon excitons [1, 2, 82], turn the NPs into powerful optical tags for biorecognition processes. Furthermore, the electronic interactions of the localized plasmon with other plasmonic waves allow one not only to develop new optical amplification paths for biosensing, but also to use these electronic coupling phenomena to follow dynamic processes associated with biorecognition events. For example, surface plasmon resonance (SPR) is a common technique for following biorecognition events at metallic surfaces [83–85]. The changes in the dielectric properties of the metallic surfaces, and the changes in the thickness of the dielectric films associated with the metallic surfaces, alter the resonance features of the surface plasmon wave and provide the basis for SPR biosensors. The electronic coupling between an Au NP conjugated to the biorecognition complex and the plasmon wave may lead to amplification of the detection processes [86].
A colorimetric detection method for nucleic acids is based on the distance-dependent optical properties of DNA-functionalized Au NPs. The aggregation of Au NPs leads to a red shift in the surface plasmon resonance of the Au NPs as a result of an interparticle-coupled plasmon exciton. Thus, the hybridization-induced aggregation of DNA-functionalized Au NPs changes the color of the solution from red to blue [3]. The changes in the optical properties of the Au NPs upon their aggregation provide a method for the sensitive detection of DNA and a way to design optical DNA biosensors [87–90]. Specifically, two batches of 13-nm diameter Au NPs were separately functionalized with two thiolated nucleic acids that acted as labels for the detection of the analyte DNA [Figure 6.13(A)]. Each of the NP labels is modified with a nucleic acid complementary to one of the two ends of the analyte DNA. Since each of the nucleic acid-functionalized Au NPs includes many modifying oligonucleotides, the addition of the target DNA to a solution of the two DNA-functionalized Au NPs resulted in the crosslinking and aggregation of the nanoparticles through hybridization. Aggregation changed the color of the solution from red to purple as a result of the interparticle-coupled plasmon absorbance [Figure 6.13(B)]. The aggregation process was found to be temperature dependent: the aggregated Au NPs reversibly dissociate upon elevation of the temperature through the melting of the double strands, and reassociate upon a decrease in the temperature through rehybridization, resulting in reversible changes of the spectrum [91]. These melting transitions, aggregation and deaggregation, occur in a narrow temperature range [Figure 6.13(C)] and allow the design of selective assays for DNA targets with high discrimination of mismatched targets. The color change within this narrow temperature range underlies the simplest test to follow the aggregation of Au NPs, the Northwestern spot test [87]. This is an extremely sensitive method to discriminate between aggregated and nonaggregated gold NPs in aqueous solutions, and it relies on a detectable color change from red to blue upon aggregation. The test consists of spotting a droplet of an aqueous solution of the particles on a reversed-phase thin-layer chromatographic plate. A blue spot indicates aggregation in the presence of the target DNA, whereas a red spot indicates the presence of freely dispersed particles [Figure 6.13(C)]. The sharp melting transitions of DNA-functionalized gold nanoparticles were applied to discriminate the target DNA from DNA with single-base-pair mismatches simply by following the changes in the nanoparticle absorption as a function of temperature [87, 88]. The melting properties of DNA-linked nanoparticle aggregates are affected by a number of factors, which include the surface density of the modifying nucleic acids, the nanoparticle size, the interparticle distances and the salt concentration [91].
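The sharp, cooperative melting of NP-linked aggregates, in contrast to the broad transition of an isolated duplex, can be caricatured with a two-state van 't Hoff model in which the aggregate is assigned a much larger effective transition enthalpy. All numbers in this Python sketch are illustrative assumptions, not fitted values.

import numpy as np

def melted_fraction(T_celsius, Tm_celsius, dH_kcal_mol):
    """Fraction dissociated for a two-state van 't Hoff transition;
    a larger effective dH gives a sharper melting curve."""
    R = 1.987e-3  # gas constant, kcal mol^-1 K^-1
    T = T_celsius + 273.15
    Tm = Tm_celsius + 273.15
    K = np.exp((dH_kcal_mol / R) * (1.0 / Tm - 1.0 / T))
    return K / (1.0 + K)

T = np.linspace(45.0, 65.0, 5)
duplex = melted_fraction(T, 55.0, 60.0)       # ordinary molecular duplex (assumed dH)
aggregate = melted_fraction(T, 55.0, 600.0)   # cooperative NP aggregate (assumed dH)

for t, a, b in zip(T, duplex, aggregate):
    print(f"{t:5.1f} C   duplex {a:4.2f}   NP aggregate {b:4.2f}")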
Recently, the use of DNA-based machines for the amplified detection of DNA was developed [92, 93]. The aggregation of Au NPs was used as a readout signal that follows the operation of the machine and the detection of the respective DNA [94] [Figure 6.14(A)]. The machine consists of a nucleic acid track (24) that includes three domains, I, II and III. Domain I acts as the recognition site, and hybridization with the target DNA (24a) triggers, in the presence of polymerase and the dNTP mixture, the replication of the DNA track. Formation of the duplex, and specifically the formation of the duplex region II, generates the scission site for the nicking enzyme Nb.BbvC I. The enzyme-induced scission of the duplex activates the autonomous operation of the machine, whereby the replication and strand displacement of the complementary nucleic acid of region III proceed continuously. The displaced nucleic acid (25) may be considered as the waste product. In the presence of the two kinds of (26)- and (27)-functionalized Au NPs, which are complementary to the two ends of the waste product, aggregation of the Au NPs proceeds. The color changes providing the readout of the aggregation of the Au NPs are depicted in Figure 6.14(B). The method allowed the optical detection of the target DNA with a sensitivity that corresponded to 1 × 10⁻¹² M.
A different metal NP-based analysis of DNA through the aggregation of the NPs employed the salt effect on the double-layer potential of the NPs [95, 96]. Au NPs (15 nm) were functionalized with a probe nucleic acid, and the effect of the addition of salt (up to 2.5 M) on the stability of the modified particles, as compared with unmodified (bare) NPs, was examined. While the nucleic acid-functionalized Au NPs revealed stability in the presence of added salt, the nonfunctionalized Au NPs precipitated at a salt concentration of 0.1 M. The probe-functionalized Au NPs, when hybridized with the complementary target DNA, revealed a rapid red-to-purple color transition (<3 min) upon addition of 0.5 M NaCl. This color change was attributed to the lowering of the Au NP surface potential upon addition of the salt, which resulted in a decrease in the electrostatic repulsive interactions between the particles and consequently in shorter interparticle distances and a coupled plasmon absorbance.
The effect of salt on the stability of unmodified Au NPs upon interaction with a probe nucleic acid, before and after hybridization, was used for the colorimetric detection of specific sequences in amplified genomic DNA [97, 98]. The method relied on the different effects of single- and double-stranded DNA on unmodified citrate-coated Au NPs. The adsorption of short ss-DNA probes on Au NPs stabilizes the citrate-coated NPs against salt-induced aggregation. The exposure of unmodified gold nanoparticles to a saline mixture containing amplified genomic DNA and short ss-DNA complementary to regions in the genomic DNA resulted in the aggregation of the Au NPs and a color change from red to blue. If the short oligomers were not complementary to regions in the genomic DNA, no color change occurred, owing to the stabilization of the Au NPs by these oligomers. The method permitted the sequence-specific detection of label-free oligonucleotides at the level of 100 fmol and was adapted to detect single-base mismatches.
The hybridization-induced aggregation of metallic NPs was extended to the analysis of ions and small molecules using aptamers and DNAzymes. Aptamers are nucleic acids with specific recognition properties towards small molecules or proteins. They are prepared by the Systematic Evolution of Ligands by Exponential Enrichment (SELEX) procedure, which involves the selection and amplification of a sequence-specific nucleic acid against a target molecule from a library of 10¹⁵–10¹⁶ nucleic acids [99–101]. Similarly, catalytic nucleic acids (DNAzymes or ribozymes) are elicited by related in vitro selection and amplification protocols.
Figure 6.14 (A) Analysis of a DNA target (24a) by a nucleic acid track (24) consisting of regions I, II and III. The replication, followed by nicking, of the primary duplex (24a)/(24) yields displacement of the waste product (25). The displaced product (25) stimulates the aggregation of the (26)- and (27)-functionalized Au NPs that are complementary to the two ends of (25).
In one example, Au NPs were functionalized with a thiolated lactose layer, and the cholera toxin (B-subunit), which binds the lactose derivative, induced aggregation of the Au NPs. Upon aggregation, the color of the Au NP solution changed from red to deep purple. The selectivity of the bioassay arises from the fact that the thiolated lactose mimics the GM1 ganglioside, the native receptor of the cholera toxin. The detection limit of the assay was 54 nM. Similarly, colorimetric sensing of platelet-derived growth factors (PDGFs) and their receptors (PDGFRs), based on the aggregation of aptamer-functionalized Au NPs, was also developed [117].
In addition to the absorbance features of metallic NPs that were used to follow
biorecognition events in solution and on surfaces, other optical methods have been
employed to detect the association of biomolecule-functionalized Au NPs on biochips.
These methods included scanometric detection by light scattering, surface plasmon
resonance spectroscopy, resonance-enhanced absorption by NPs and enhanced
Raman scattering.
A scanometric DNA detection method based on a sandwich assay format involving a DNA-functionalized glass slide, the target DNA and Au NP probes was developed [118]. In a typical setup for scanometric detection, the modified glass slide is illuminated in the plane of the slide with white light. The slide serves in such a configuration as a planar waveguide that prevents any light from reaching the microscope objective by total internal reflectance. Wherever NP probes are attached to the surface, evanescently coupled light is scattered from the slide, and the NP labels are imaged as bright, colored spots. This approach was used for the detection of target DNA molecules that were specifically bound to a DNA-functionalized surface. The resulting DNA hybrid was labeled with gold nanoparticles that allowed the scanometric detection of the DNA (Figure 6.17). At high target concentrations (1 nM), the Au NPs on the surface could be visualized with the naked eye. At low target concentrations (100 pM), the coverage of the surface-bound Au NPs was too low, and an enhancement process was needed. Enlargement of the Au NPs by the catalytic reduction of silver ions and the deposition of silver metal on the Au NPs
Figure 6.17 Scanometric detection of DNA on a surface using gold–silver core–shell NPs.
resulted in a 100-fold increase in the light-scattering signal and thus enhanced the detection sensitivity for the target DNA (50 fM) [118]. This method was used to detect single-base mismatches in oligonucleotides that were hybridized to DNA probes immobilized at different domains of a glass support. High sensitivities were provided by the deposition of silver, whereas the selectivity was achieved by examination of the melting properties of the spots: the mismatched spot reveals a lower melting temperature owing to its lower association constant. The scattering of light is size dependent, and hence, by using different-sized nanoparticles, the simultaneous detection of different DNA sequences is feasible. Accordingly, the light scattered by DNA-functionalized 50- and 100-nm Au NP probes was used to identify two different target DNAs in solution [119]. The scanometric method was successfully applied to detect the MTHFR gene from genomic DNA at concentrations as low as 200 fM without PCR amplification of the target, by the application of improved optical imaging instruments [120]. A similar approach was used to identify single-nucleotide polymorphisms (SNPs) in unamplified human genomic DNA samples representing all possible genotypes for three genes involved in thrombotic disorders [121].
The scanometric method was applied as an optical detection means in a series of systems that employed nucleic acid-functionalized Au NPs as barcodes for biorecognition events such as antigen–antibody complex formation or DNA hybridization. In one system (Figure 6.18), prostate-specific antigen (PSA) was detected by nucleic acid-functionalized Au NPs that acted as signaling barcodes [122]. The Au NPs were modified with the polyclonal anti-PSA antibody and further functionalized with a thiolated nucleic acid (31), which was hybridized with the complementary nucleic acid (31′), whose sequence acted as a barcode for the sensing process. In the presence of PSA and magnetic particles functionalized with the monoclonal anti-PSA antibody, an aggregate consisting of a sandwich structure of the Au NPs and the magnetic particles was formed. The magnetic separation of the aggregate was followed by the thermal displacement of the barcode DNA. The released barcode nucleic acid was then amplified by the polymerase chain reaction (PCR), and the product was used to bridge the Au NPs with the complementary nucleic acid on a glass surface. The Ag-enhanced Au NPs were then analyzed by the scanometric method. Although this analytical protocol involves many steps, the PCR amplification step leads to an ultrasensitive detection method, and PSA at a level of 30 aM was analyzed.
An analogous process was used to analyze DNA (Figure 6.19) [123]. In this method, the functionalized Au NPs include the duplex DNA barcode (32/32′) and the nucleic acid units (33) that recognize the target DNA. In the presence of the target DNA (35), the magnetic particles modified with the nucleic acid (34) and the 32/33-functionalized Au NPs, a magnetic particle–Au NP aggregate is formed through the crosslinking of the particles by (35). The subsequent magnetic separation of the aggregate and the thermal separation of the DNA barcode were followed by scanometric detection of the released DNA code on surfaces. A PCR-like sensitivity corresponding to 500 zM was claimed for the analysis of DNA. The DNA barcode-based sensing protocols were used to analyze protein cancer markers [124], the amyloid biomarkers for Alzheimer's disease [125] and the genes of various pathogens, such as hepatitis B virus, Ebola virus, variola virus (smallpox, VV) and human immunodeficiency virus (HIV) [126]. Furthermore, modifications of the method that use fluorescence detection [127] and a colorimetric assay [128] have been reported.
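To put the claimed 500 zM sensitivity in perspective, a one-line calculation converts it into absolute copy numbers; the 50 µL sample volume is an assumed, illustrative value.

N_A = 6.022e23      # Avogadro's number, mol^-1
conc_M = 500e-21    # 500 zM, the sensitivity claimed in the text
volume_L = 50e-6    # assumed 50 uL sample volume

print(f"target copies in sample: {conc_M * volume_L * N_A:.0f}")   # about 15 molecules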
The colorimetric scattering assay of gold nanoparticles was applied for the rapid detection of the mecA gene in unamplified genomic DNA sequences [129]. The method was based on the hybridization of DNA-functionalized Au NP probes with the complementary sequences of the target DNA in solution. The resulting solution was then spotted on a glass waveguide, which was illuminated with white light in the plane of the slide. The color of the light scattered by the oligonucleotide-functionalized Au NP probes is different from that of the Au NP aggregates generated by the hybridization process between the probes and the target DNA. In the absence of the target DNA, individual 40–50-nm Au NPs scatter green light, whereas in the presence of the target DNA the aggregated nanoparticles scatter yellow to orange light. The method showed enhanced detection sensitivity (about four orders of magnitude) compared with the colorimetric absorbance assay, thus allowing the detection of zeptomole quantities of target DNA: 333 fM for synthetic DNA and 33 fM for genomic DNA. An improved light-scattering strategy using gold nanoparticles as labels for the detection of specific target DNA in a homogeneous solution was reported [130]. The method is based on the aggregation of DNA-functionalized Au NPs through the hybridization of the target DNA with the probe-labeled Au NPs and on the enhanced light scattering that originates from the aggregates in solution. The light-scattering assay demonstrated high sensitivity, and the human p53 gene, exon 8 DNA, was detected at a concentration as low as 0.1 pM. Moreover, the assay showed a high degree of specificity and was used for discrimination between perfectly matched targets and targets with single base-pair mismatches.
The hyper-Rayleigh scattering (HRS) technique (nonlinear light scattering) was used to monitor DNA hybridization on unmodified gold nanoparticles in a saline solution, and the target DNA was analyzed at a concentration of 10 nM [131]. The HRS assay for DNA detection was based on the differences in the electrostatic interactions of ssDNA and dsDNA with the particles. The method permitted the analysis of single-base mismatches in DNA by monitoring the HRS intensity from the different DNAs interacting with the gold nanoparticles.
Surface-enhanced Raman scattering (SERS) of nanoparticle-bound substrates allows the amplification of molecular vibrational spectra by up to 10⁶-fold [132–134]. The modification of metal NPs with different Raman dyes was used to generate multiply coded NPs [135–138] and for the preparation of thousands of codes that can be written and read by means of surface-enhanced Raman resonance (SERR) scattering without the need for spatial resolution of the components of the code [135, 137]. The use of SERS for the analysis of biorecognition events was demonstrated with the application of Au NPs that were functionalized with Raman dyes and recognition elements [139, 140]. Formation of the complementary recognition complex on surfaces, followed by the electroless deposition of Ag on the Au NPs, allowed the enhanced readout of the biorecognition events by SERS. The concept was applied to the parallel detection of various analytes on surfaces in an array configuration [Figure 6.20(A)]. For example, six different thiol-functionalized Raman-active dyes, Cy3, Cy3.5, Cy5, TAMRA, Texas Red and Rhodamine 6G, were linked through an oligonucleotide spacer (10 adenosine units) to six different oligonucleotides and coupled to Au NPs (13 nm) to yield six different Raman dye-labeled Au NP probes [139] [Figure 6.20(B)]. These Au NP probes were then employed as labels for hybridization with the complementary targets. The capture DNA strands were spotted on a surface, and the specific binding of the Au NPs by hybridization with the target DNAs was followed by the silver enhancement of the Au NPs and analysis by SERS [Figure 6.20(C) and (D)]. The detection limit of this method was 20 fM. The assay demonstrated the ability to discriminate SNPs in DNA and RNA targets. A similar concept was also used to identify protein–protein and protein–small molecule interactions by using Au NPs that were functionalized with specific antibodies and encoded with specific Raman dyes [140]. Compared with colorimetric and scanometric detection methods, this method offers enhanced multiplex sensing capabilities afforded by the narrow spectral band fingerprints of the Raman dyes. Further developments of the SERS technique allowed highly sensitive immunoassay procedures [141–143].
Au NPs have been widely employed for the signal amplification of biorecognition events based on nanoparticle-enhanced SPR spectroscopy [82, 86]. The changes in the dielectric properties at thin films of metals, such as gold films, as a result of biomolecular recognition processes are the basis for the SPR technique. Labeling of the biorecognition complexes with Au or Ag NPs, besides changing the dielectric properties, results in electronic coupling between the localized NP plasmon and the surface plasmon wave of the metal film. This electronic coupling significantly affects the resonance frequency of the surface wave, thus leading to enhanced, amplified optical transduction of the biorecognition events. Accordingly, Au NPs were used as labels in immunosensing [144–146] and DNA sensing [147–149] applications. The binding of Au NPs to the immunosensing interface led to a large shift in the plasmon angle, a broadening of the plasmon resonance region and an increase in the minimum reflectance, and these effects allowed the detection of the antigen with picomolar sensitivity [144]. Similarly, the sensitivity of DNA analysis was enhanced 1000-fold (10 pM) when Au NP-functionalized DNA molecules were used as labels [147]. Also, the sensitive detection of DNA hybridization by applying the catalytic growth of Au NPs as a means to enhance the SPR shifts was demonstrated [150].
The charging of Au NPs that are coupled to thin gold films affects the electronic coupling between the localized NP plasmon and the surface plasmon wave, resulting in a shift in the SPR spectra. Thus, biocatalytic electron transfer reactions that involve NP–enzyme conjugates may be monitored by SPR spectroscopy [151]. Au NPs (1.4 nm) were functionalized with N6-(2-aminoethyl)-flavin adenine dinucleotide (FAD cofactor, amine derivative, 1), and apo-glucose oxidase (apo-GOx) was reconstituted onto the cofactor sites. The nanoparticle–GOx conjugates were then assembled on a gold thin film (SPR electrode) by using a long-chain dithiol, HS(CH2)9SH, monolayer as a bridging linker. This yielded the biocatalytically active glucose oxidase (GOx) bound to the Au NPs in an aligned configuration [Figure 6.21(A)]. The biocatalyzed oxidation of glucose resulted in the formation of the reduced form of the cofactor, FADH2. In the absence of O2, which acts as the natural electron acceptor for GOx, electron transfer proceeded from the reduced cofactor to the Au NPs, and the resulting charging of the particles was transduced by the shift in the SPR spectrum of the underlying gold film.
6.7
Semiconductor Nanoparticles as Optical Labels
Figure 6.23 Fluorescence analysis of (A) an antigen by antibody-functionalized QDs and (B) DNA by a nucleic acid-functionalized QD.
Methods for the synthesis of luminescent semiconductor QDs and for their functionalization with biomolecules were developed recently [165, 166]. Unlike molecular fluorophores, which typically have very narrow excitation spectra, semiconductor QDs absorb light over a very broad spectral range. This makes it possible to optically excite a broad spectrum of quantum dot colors using a single excitation laser wavelength, which may enable one to probe several markers simultaneously in biosensing and assay applications. Indeed, functionalized semiconductor QDs have been used as fluorescence labels for numerous biorecognition events [9, 167–170], including their use in immunoassays for protein detection [Figure 6.23(A)] and in nucleic acid detection [Figure 6.23(B)]. For example, CdSe/ZnS QDs were functionalized with avidin and used as fluorescent labels for biotinylated antibodies. Fluoroimmunoassays utilizing these antibody-conjugated NPs were successfully used in the detection of protein toxins (staphylococcal enterotoxin B and cholera toxin) [171, 172].
Similarly, CdSe/ZnS QDs conjugated to the appropriate antibodies were applied to the multiplexed fluoroimmunoassay of toxins [Figure 6.24(A)]. Sandwich immunoassays for the simultaneous detection of four toxins [cholera toxin (CT), ricin, shiga-like toxin 1 (SLT) and staphylococcal enterotoxin B (SEB)] by using different-sized QDs were performed in single wells of a microtiter plate in the presence of a mixture of all four QD–antibody conjugates [Figure 6.24(B)] [173], the fluorescence thus encoding for the respective toxin. In another example, multiplexed immunoassay formats based on antibody-functionalized QDs were used for the simultaneous detection of Escherichia coli O157:H7 and Salmonella typhimurium bacteria [174] and for the discrimination between diphtheria toxin and tetanus toxin proteins [175]. Fluorescent QDs were used for the detection of single-nucleotide polymorphisms in the human oncogene p53 and for the multiallele detection of the hepatitis B and hepatitis C viruses in microarray configurations [176]. DNA-functionalized CdSe/ZnS QDs of different sizes were used to probe hepatitis B and C genotypes in the presence of a background of human genes. The discrimination of a perfectly matched sequence of the p53 gene in a background of oligonucleotides including different single-nucleotide polymorphism sequences was achieved with true-to-false signal ratios higher than 10 (under stringent buffer conditions) at room temperature within minutes. Also, DNA–QD conjugates were used as fluorescence probes for fluorescence in situ hybridization (FISH) assays. For example, the QD-based FISH labeling method was used for the detection of the Y chromosome in human sperm cells [177]. A QD-based FISH method to analyze human metaphase chromosomes was reported [178], using QD-conjugated total genomic DNA as a probe for the detection of the ERBB2/HER2/neu gene. Also, the FISH technique was used for the multiplexed cellular detection of different mRNA targets [179].
Nonetheless, the superior photophysical features of semiconductor QDs are demonstrated in organic solvents, and their introduction into aqueous media, where biorecognition and biocatalytic reactions proceed, is accompanied by a severe or even complete loss of their fluorescence properties. Different methods to stabilize the QDs in aqueous media and to retain their fluorescence were therefore developed. One application of such fluorescent QDs probes protease activity: the QDs were capped with dye-labeled peptides, and their fluorescence was quenched in the presence of the quencher–peptide capping layer. The hydrolytic cleavage of the peptide resulted in the removal of the quencher units, and this restored the fluorescence of the QDs. For example, collagenase was used to cleave the Rhodamine Red-X dye-labeled peptide (38) linked to CdSe/ZnS QDs [Figure 6.27(A)]. While the tethered dye quenched the fluorescence of the QD, the hydrolytic scission of the dye and its corresponding removal restored the fluorescence [Figure 6.27(B)].
In a related study, the activity of tyrosinase (TR) was analyzed by CdSe/ZnS QDs [192]. The QDs were capped with a tyrosine methyl ester monolayer. The tyrosinase-induced oxidation of the tyrosine groups to the respective dopaquinone units generated active quencher units that suppressed the fluorescence of the QDs [Figure 6.28(A)]. The depletion of the fluorescence of the QDs upon their interaction with different concentrations of tyrosinase is displayed in Figure 6.28(B). The tyrosinase-stimulated oxidation of phenol residues was further employed to monitor the activity of thrombin with the QDs [192]. The CdSe/ZnS QDs were functionalized with the peptide (39) that included the specific sequence for cleavage by thrombin and the tyrosine site. The tyrosinase-induced oxidation of tyrosine yields the dopaquinone units that quench the fluorescence of the QDs [Figure 6.29(A)]. The hydrolytic scission of the peptide by thrombin cleaved off the quinone quencher units and restored the fluorescence of the QDs [Figure 6.29(B)].
The FRET process occurring within a duplex DNA structure consisting of tethered CdSe/ZnS QDs and a dye was applied to probe DNA hybridization and the DNase I cleavage of the DNA [193]. Nucleic acid (40)-functionalized CdSe/ZnS QDs were hybridized with the complementary Texas Red-functionalized nucleic acid (41) [Figure 6.30(A)]. The time-dependent resonance energy transfer from the QDs to the dye units was used to monitor the hybridization process. Treatment of the DNA duplex with DNase I resulted in the cleavage of the DNA and the recovery of the fluorescence properties of the CdSe/ZnS QDs. After cleavage of the double-stranded DNA with DNase I, the intensity of the FRET band of the dye decreased and the fluorescence of the CdSe/ZnS QDs increased [Figure 6.30(B)]. The luminescence properties of the QDs were only partially recovered, owing to the nonspecific adsorption of the dye on the QDs.
Figure 6.30 (A) Assembly of the CdSe/ZnS and Texas Red-tethered duplex DNA. (B) Fluorescence spectra of: (a) the (40)-functionalized CdSe/ZnS QDs; (b) the (40)/(41) duplex DNA tethered to the QDs and the Texas Red chromophore; (c) after treatment of the duplex DNA tethered to CdSe/ZnS and the dye with DNase I. (Reprinted with permission from [193]. Copyright 2005 American Chemical Society.)
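The quenching and recovery in such QD–dye constructs follow the usual Förster distance dependence, E = R0⁶/(R0⁶ + r⁶). The sketch below simply evaluates this expression; the Förster radius and the donor–acceptor separations are illustrative assumptions, not values reported for the (40)/(41) system.

def fret_efficiency(r_nm, R0_nm):
    """Foerster energy transfer efficiency at donor-acceptor distance r."""
    return R0_nm ** 6 / (R0_nm ** 6 + r_nm ** 6)

R0 = 5.0  # assumed Foerster radius for a QD/Texas Red pair, nm
for r in (3.0, 5.0, 7.0, 10.0):  # shorter r: intact duplex; longer r: dye released
    print(f"r = {r:4.1f} nm -> E = {fret_efficiency(r, R0):.2f}")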
The QDs were also used to probe the formation of aptamer–protein complexes [194]. An anti-thrombin aptamer was coupled to the QDs, and the nucleic acid sequence was hybridized with a complementary oligonucleotide–quencher conjugate (Figure 6.31). The fluorescence of the QDs was quenched in the QD–quencher duplex. In the presence of thrombin, the duplex was separated and the aptamer underwent a conformational change to the quadruplex structure that binds thrombin. The displacement of the quencher units from the blocked aptamer activated the luminescence functions of the QDs, and an approximately 19-fold increase in their fluorescence was observed.
6.8
Semiconductor Nanoparticles for Photoelectrochemical Applications
The assembly of semiconductor QDs onto electrodes permits the utilization of the photogenerated electron–hole pair for inducing the formation of photocurrents (the photoelectrochemical effect). The photocurrent may be formed by the transfer of the conduction band electrons to the bulk electrode and the concomitant transfer of electrons from a solution-based electron donor to the valence band holes, to yield a steady-state anodic photocurrent. Alternatively, the photogenerated conduction band electrons may be transferred to a solution-solubilized electron acceptor, with the concomitant transfer of electrons from the electrode to the valence band holes and the formation of a cathodic photocurrent. The coupling of biomolecule–QD conjugates to electrode surfaces enables one not only to use the photoelectrochemical effect as a means to transduce biosensing processes, but also to tailor functional nanoarchitectures on surfaces that perform logic gate operations or act as switching systems. Different QD–DNA hybrid systems [195, 196] and QD–protein conjugates [197, 198] were assembled on electrodes, and the control of the photoelectrochemical properties of the QDs by the biomolecules was demonstrated. CdS nanoparticles were assembled on electrodes with double-stranded nucleic acids acting as bridging units, and the effect of a redox-active intercalator on the resulting photocurrent and its direction was demonstrated [196].
Dithiol-tethered single-stranded DNA (42) was assembled on an Au electrode and subsequently hybridized with a complementary dithiolated ssDNA (43) to yield a double-stranded DNA. The resulting surface was treated with CdS NPs to yield a semiconductor nanoparticle interface linked to the electrode surface (Figure 6.32). Irradiation of the dsDNA–CdS NP-modified electrode in the presence of triethanolamine (TEOA) as electron donor resulted in a low-intensity anodic photocurrent. The observed photocurrent was attributed to direct contact between some NPs and the electrode, owing to an imperfect structure of the CdS NP–DNA assemblies, rather than to charge transport through the DNA. The DNA duplex structure linking the CdS NPs to the electrode could be employed, however, as a medium to incorporate redox-active intercalators that facilitate electrical contact between the NPs and the electrode, resulting in enhanced photocurrents. Methylene blue (MB) (44) was intercalated into the (42)/(43)-dsDNA coupled to the CdS NPs [Figure 6.32(A) and (B)]. The cyclic voltammogram of the system implied that at potentials E > −0.28 V (vs. SCE) the intercalator exists in its oxidized form (44), whereas at potentials E < −0.28 V (vs. SCE) the intercalator exists in its reduced leuco form (45). Coulometric analysis of the MB redox wave at E = −0.28 V (vs. SCE), knowing the surface coverage of the dsDNA, indicated that about 2–3 intercalator units were associated with each double-stranded DNA. An anodic photocurrent was generated in the system in the presence of TEOA as electron donor, with MB intercalated into the dsDNA, while a potential of 0 V (vs. SCE) was applied to the electrode. At this potential MB exists in its oxidized state (44), which acts as an electron acceptor. The resulting photocurrent was about fourfold higher than that recorded in the absence of the intercalator.
The enhanced photocurrent was attributed to the trapping of the conduction band electrons by the intercalator units and their transfer to the electrode, which was biased at 0 V, thus retaining the intercalator units in their oxidized form. The oxidation of TEOA by the valence band holes then led to the formation of the steady-state anodic photocurrent. Biasing the electrode at a potential of −0.4 V (vs. SCE), a potential that retains the intercalator units in their reduced state (45), led to blocking of the photocurrent in the presence of TEOA under an inert argon atmosphere. This experiment revealed that the oxidized intercalator moieties within the DNA matrix play a central role in the charge transport of the conduction band electrons and the generation of the photocurrent.
Figure 6.33(A), curve b, shows the photocurrent generated by the (42)/(43)-dsDNA linked to the CdS NPs in the presence of the reduced intercalator (45) under conditions where the electrode was biased at −0.4 V (vs. SCE) and the system was exposed to air (oxygen). At a bias potential of −0.4 V (vs. SCE), the intercalator units exist in their reduced leuco form (45), which exhibits electron-donating properties [Figure 6.32(B)]. Photoexcitation of the CdS NPs yields electron–hole pairs in the conduction band and valence band, respectively. The transport of the conduction band electrons to oxygen, with the concomitant transport of electrons from the reduced intercalator units to the valence band holes, completes the cycle for the generation of the photocurrent. The fact that the electrode potential retains the intercalator units in their reduced state, together with the continuous availability of the electron acceptor (O2), yielded the steady-state cathodic photocurrent in the system.
The introduction of both TEOA and oxygen to the electrode modified with the (42)/(43)-DNA duplex and the associated CdS NPs allowed the control of the photocurrent direction by switching the bias potential applied to the electrode. Figure 6.33(B) depicts the potential-induced switching of the photocurrent direction upon switching the electrode potential between −0.4 V (cathodic photocurrents) and 0 V (anodic photocurrents), respectively.
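The bias-controlled switching amounts to a simple rule: the applied potential fixes the redox state of the intercalator, which in turn determines whether the conduction band electrons drain to the electrode (anodic photocurrent, with TEOA refilling the holes) or to O2 (cathodic photocurrent, with the leuco dye refilling the holes). A toy Python encoding of this logic, with the −0.28 V threshold taken from the text:

def photocurrent_direction(bias_v_vs_sce):
    """Steady-state photocurrent direction of the CdS NP/dsDNA/MB electrode
    with both TEOA (donor) and O2 (acceptor) present. Above about -0.28 V
    vs. SCE the intercalator stays oxidized (44) and relays conduction band
    electrons to the electrode; below it, the reduced leuco form (45) feeds
    the valence band holes while the electrons are transferred to O2."""
    E_MB = -0.28  # methylene blue redox potential vs. SCE, from the text
    return "anodic" if bias_v_vs_sce > E_MB else "cathodic"

for bias in (0.0, -0.4):
    print(f"bias {bias:+.1f} V vs. SCE -> {photocurrent_direction(bias)} photocurrent")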
photocurrents), respectively. A layer-by-layer deposition of nucleic acid-functionalized CdS QDs on the electrode was followed by the photoelectrochemical transduction of the assembly process [195]. Semiconductor CdS NPs (2.6 0.4 nm) were
functionalized with one of the two thiolated nucleic acids 46 and 47 that are complementary to the 5′- and 3′-ends of a target DNA molecule (48). An array of CdS NP layers was then constructed on an Au electrode by a layer-by-layer hybridization process with the use of the target DNA 48 as crosslinker of CdS QDs functionalized with nucleic acids (46 or 47) complementary to the two ends of the DNA target (Figure 6.34). Illumination of the array in the presence of a sacrificial electron donor resulted in the generation of a photocurrent. The photocurrents increased with the number of generations of CdS NPs associated with the electrode, and the photocurrent action spectra followed the absorbance features of the CdS NPs, which implies that the photocurrents originated from the photoexcitation of the CdS nanoparticles. The ejection of the conduction band electrons into the electrode occurred from the QDs that were in intimate contact with the electrode support. This was supported by the fact that Ru(NH3)6³⁺ units (E° = −0.16 V vs. SCE), which were electrostatically bound to the DNA, enhanced the photocurrent from the DNA–CdS array. The Ru(NH3)6³⁺ units acted as charge transfer mediators that facilitated the hopping of conduction band electrons from CdS particles, which lacked contact with the electrode, due to their separation by the DNA tethers.
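The reported trend, an increasing photocurrent with each QD generation and a further enhancement by the bound mediator, can be mimicked with a simple additive model. The Python sketch below is a toy model under assumed parameters (per-layer hopping efficiencies of 0.5 and 0.9); it is not the analysis used in [195].

def array_photocurrent(n_generations, i0=1.0, hop_eff=0.5, with_mediator=False):
    """Sum the contributions of successive CdS QD generations; each layer's
    contribution is attenuated once per DNA tether its electrons must cross.
    The mediator case mimics Ru(NH3)6(3+)-assisted electron hopping.
    All parameter values are assumptions for illustration only."""
    eff = 0.9 if with_mediator else hop_eff
    return sum(i0 * eff**k for k in range(n_generations))

for n in range(1, 5):
    print(n, f"{array_photocurrent(n):.2f}", f"{array_photocurrent(n, with_mediator=True):.2f}")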
Enzymes or redox proteins were also linked to semiconductor QDs, and the resulting photocurrents were employed to assay the enzyme activities and to develop different biosensors. Cytochrome c-mediated biocatalytic transformations were coupled to CdS NPs, and the direction of the resulting photocurrent was controlled by the oxidation state of the cytochrome c mediator [198]. The CdS NPs were immobilized on an Au electrode through a dithiol linker, and mercaptopyridine units, acting as promoter units that electrically communicate between the cytochrome c and the NPs, were linked to the semiconductor NPs (Figure 6.35). In the presence of reduced cytochrome c, the photoelectrocatalytic activation of the oxidation of lactate by lactate dehydrogenase (LDH) proceeded, while generating an anodic photocurrent [Figure 6.35(A)]. Photoexcitation of the NPs resulted in the ejection of the conduction band electrons into the electrode and the concomitant oxidation of the reduced cytochrome c by the valence band holes. The resulting oxidized cytochrome c subsequently mediated the LDH-biocatalyzed oxidation of lactate. Similarly, cytochrome c in its oxidized form was used to stimulate the bioelectrocatalytic reduction of NO₃⁻ to NO₂⁻ in the presence of nitrate reductase (NR), while generating a cathodic photocurrent.
Figure 6.34 Layer-by-layer deposition of CdS NPs using 46- and 47-functionalized NPs and 48 as crosslinker. The association of Ru(NH3)6³⁺ with the DNA array facilitates charge transport and enhances the resulting photocurrent. (Reproduced with permission from [195]. Copyright 2001 Wiley-VCH).
Figure 6.36 Assembly of the CdS NP–AChE hybrid system for the photoelectrochemical detection of the enzyme activity (h⁺ = hole). (Reprinted with permission from [197]. Copyright 2003 American Chemical Society).
6.9
Biomolecules as Catalysts for the Synthesis of Nanoparticles
It is well established that living organisms may synthesize NPs and even shaped metallic NPs [200–205]. For example, triangular gold nanoparticles were synthesized by using Aloe vera and lemongrass (Cymbopogon flexuosus) plant extracts as reducing agents [205, 206]. Although the mechanism of growth of the NPs is not clear, the resulting nanostructures originate from one or more products that are generated by the cell metabolism. This suggests that biomolecules might be active components for synthesizing NPs.
Indeed, a new emerging area in nanobiotechnology involves the use of biomaterials and, specifically, enzymes as active components for the synthesis and growth of particles [207]. As the enlargement of the NPs dominates their spectral properties (extinction coefficient, plasmon excitation wavelength), the biocatalytic reactions that yield the nanoparticles may be sensed by the optical properties of the generated NPs. Furthermore, enzyme–metal NP conjugates may provide biocatalytic hybrid systems that act as amplifying units for biosensing processes.
Different oxidases generate H2O2 upon the biocatalyzed oxidation of the corresponding substrates by molecular oxygen. The H2O2 generated was found to reduce AuCl₄⁻ in the presence of Au NP seeds that act as catalysts. This observation led to the development of an optical system for the detection of glucose oxidase activity and for the sensing of glucose [208] [Figure 6.37(A)]. A glass support was modified with an aminopropylsiloxane film to which negatively charged Au NPs were linked. Glucose oxidase (GOx) biocatalyzed the oxidation of glucose, and this led to the formation of H2O2 that acted as a reducing agent for the catalytic deposition of Au on the Au NPs associated with the glass support. The enlargement of the particles was then followed spectroscopically [Figure 6.37(B)]. Since the amount of H2O2 formed is controlled by the concentration of glucose, the absorbance intensities of the resulting NPs are dominated by the concentration of glucose, and the respective calibration curve [Figure 6.37(C)] was extracted. Other enzymes similarly demonstrated the biocatalytic generation of reducing products that grow NPs. For example, alkaline phosphatase hydrolyses p-aminophenol phosphate to yield p-aminophenol, and the latter product reduces Ag⁺ on Au NP seeds that act as catalytic sites [209].
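As a rough illustration of how such a calibration arises, the following Python sketch chains the mechanism described above (glucose → H2O2 → catalytic Au deposition → absorbance) into a toy model; the growth constant is an invented placeholder, not a value from [208].

def relative_absorbance(c_glucose, growth_per_M=5.0e3):
    """Toy calibration: H2O2 is taken as proportional to glucose (GOx, 1:1),
    the deposited Au volume as proportional to H2O2, and the plasmon-band
    absorbance of small spheres as roughly proportional to particle volume.
    growth_per_M is an assumed placeholder constant."""
    volume_ratio = 1.0 + growth_per_M * c_glucose  # V/V0 of the enlarged seed
    return volume_ratio                            # ~ A/A0 at the plasmon band

for c in (5e-5, 1.1e-4, 1.8e-4, 3.0e-4):           # glucose levels as in Figure 6.37
    print(f"{c:.1e} M glucose -> A/A0 ~ {relative_absorbance(c):.2f}")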
Different bioactive o-hydroquinone derivatives, such as the neurotransmitters dopamine (51), L-DOPA (52), adrenaline (53) and noradrenaline (54), were found to act as effective reducing agents of metal salts to the respective metal NPs, for example, the reduction of AuCl₄⁻ to Au NPs without catalytic seeds [210].
Figure 6.37 (continued caption): (d) 5 × 10⁻⁵, (e) 1.1 × 10⁻⁴, (f) 1.8 × 10⁻⁴ and (g) 3.0 × 10⁻⁴ M. For all experiments, the reaction time was 10 min and the temperature was 30 °C. (C) Calibration curve corresponding to the absorbance at λ = 542 nm of the Au NP-functionalized glass supports upon analyzing different concentrations of glucose. (Reprinted with permission from [208]. Copyright 2005 American Chemical Society).
For example, Figure 6.38 depicts the absorbance features of the Au NPs generated by
different concentrations of dopamine (51). As the concentration of dopamine increased, the plasmon absorbance of the Au NPs was intensified. Although the Au NPs were enlarged as the concentration of dopamine was elevated, the maximum absorbance of the plasmon band was blue-shifted. A detailed TEM analysis revealed that the increase in the concentration of dopamine indeed enhanced the growth of the
activity [Figure 6.40(A)]. The fact that the enzyme controls the absorbance of the Au NPs, and thus their degree of enlargement [Figure 6.40(B)], suggested that upon inhibition of AChE the growth of the particles would be blocked. Indeed, the addition of paraoxon (55), a well-established irreversible AChE inhibitor that mimics the functions of organophosphate nerve gases, to the biocatalytic system that synthesizes the nanoparticles resulted in the inhibition of the growth of the Au NPs [Figure 6.40(C)].
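The logic of this inhibition assay can be captured in a few lines. The Python sketch below is a hedged illustration, not the authors' model: enzyme kinetics are reduced to a Michaelis–Menten term, and the Michaelis constant, maximal rate and inhibited fraction are assumed numbers.

def np_growth_rate(c_substrate, inhibited_fraction=0.0, vmax=1.0, km=1.0e-4):
    """NP enlargement rate taken as proportional to the remaining active
    AChE times its Michaelis-Menten turnover; an irreversible inhibitor
    such as paraoxon removes a fraction of the active enzyme.
    vmax, km and the concentrations are illustrative assumptions."""
    return (1.0 - inhibited_fraction) * vmax * c_substrate / (km + c_substrate)

print(np_growth_rate(5.0e-4))                           # uninhibited growth
print(np_growth_rate(5.0e-4, inhibited_fraction=0.95))  # growth largely blocked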
Also, an electron-transfer mediator, Os(II) bispyridine-4-picolinic acid, was used for the enlargement of Au NP seeds through the biocatalytic oxidation of glucose by GOx [213]. A similar mediated redox mechanism was used to follow the inhibition of AChE through the blocking of the enlargement of the Au NPs [213].
surfaces. Citrate-stabilized Au NP seeds were immobilized on an aminopropylsiloxane film associated with a glass support, and the functionalized surface was treated with different concentrations of NADH in the presence of AuCl₄⁻. The growth of the Au NPs was followed spectroscopically [Figure 6.41(B)]. The plasmon absorbance of the Au NPs increased as the concentration of NADH was elevated, consistent with the growth of the particles. The SEM images of the particles confirmed the growth of the NPs on the glass support [Figure 6.41(C)]. The particles generated by 2.7 × 10⁻⁴, 5.4 × 10⁻⁴ and 1.36 × 10⁻³ M NADH are shown in images (I), (II) and (III), which reveal the generation of Au NPs with average diameters of 13 ± 1, 40 ± 8 and 20 ± 5 nm, respectively. The NPs generated by the high concentration of NADH (III) reveal a 2D array of enlarged particles that touch each other, consistent with the spectral features of the surface.
The quantitative growth of Au NPs by NADH allowed the assay of NAD-dependent
enzymes and their substrates. This was demonstrated by the application of the enzyme
lactate dehydrogenase (LDH) and lactate as substrate [75] [Figure 6.42(A)]. The
absorbance spectra of the NPs enlarged in the presence of different concentrations
of lactate are shown in Figure 6.42(B). From the derived calibration curve, lactate
the detailed analysis of the growth directions of the shaped NPs, and this enabled the growth mechanism to be proposed. For example, HRTEM images of tripod particles are given in Figure 6.43(D) and (E). The lattice planes exhibit an inter-planar distance of 0.235 nm that corresponds to the {1 1 1}-type planes of crystalline gold. The pods, separated by 120°, revealed a crystallite orientation of [0 1 1], with growth directions of the pods of type <2 1 1>; namely, the pods are extended in the directions [1 1 2], [2 1 1] and [1 2 1] [Figure 6.43(D), inset]. The shaped Au NPs revealed a red-shifted plasmon absorbance band at λ = 680 nm, consistent with a longitudinal plasmon excitation in the rod-like structures. The blue color of the shaped Au NPs distinctly differs from that of the red spherical Au NPs.
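The red shift can be rationalized by the elongation of the particles: for rod-like gold particles the longitudinal plasmon band is commonly described as shifting roughly linearly with aspect ratio. The Python sketch below encodes such a linear trend; the slope and intercept are assumed illustrative numbers, not constants from this chapter.

def longitudinal_plasmon_nm(aspect_ratio, slope=95.0, intercept=420.0):
    """Rough linear estimate of the longitudinal plasmon wavelength of an
    elongated Au particle; slope and intercept are assumed placeholders."""
    return slope * aspect_ratio + intercept

for ar in (1.5, 2.0, 2.7):
    print(f"aspect ratio {ar} -> ~{longitudinal_plasmon_nm(ar):.0f} nm")

With these placeholder numbers, an aspect ratio near 2.7 lands close to the observed band at λ = 680 nm, which illustrates why elongated pods appear blue while spherical particles appear red.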
The fact that the degree of shaping and the absorbance spectra of the resulting Au
NPs were controlled by the concentration of NADH allowed the development of an
ethanol biosensor based on the shape-controlled synthesis of NPs. The biocatalyzed
oxidation of ethanol by AlcDH yields NADH, and the biocatalytically generated
NADH acted as an active reducing agent for the formation of blue shaped Au NPs. As
the amount of NADH is controlled by the concentration of ethanol, the extent of the
structurally developed shaped nanoparticles exhibiting the red-shifted longitudinal
plasmon was then controlled by the substrate concentration. Figure 6.43(F) shows the
absorbance spectra resulting from the gradual development of the shaped Au NPs
formed in the presence of different concentrations of ethanol. The derived calibration
curve corresponding to the optical analysis of ethanol by the shaped Au NPs is
depicted in Figure 6.43(F), inset.
6.10
Biomolecule Growth of Metal Nanowires
The synthesis of nanowires is one of the challenging topics in nanobiotechnology [216, 217]. The construction of objects at the molecular or supramolecular level
could be used to generate templates or seeds for nanometer-sized wires. Nanowires
are considered as building blocks for the self-assembly of logic and memory circuits
in future nanoelectronic devices [218]. Thus, the development of methods to
assemble metal or semiconductor nanowires is a basic requirement for the construction of nanocircuits and nanodevices. Furthermore, the nanowires should
include functional sites that allow their incorporation into structures of higher
complexity and hierarchical functionality. The use of biomaterials as templates for the
generation of nanowires and nanocircuitry is particularly attractive. Biomolecules
exhibit dimensions that are comparable to those of the envisaged nano-objects. In
addition to the fascinating structures of biomaterials that may lead to new inorganic
or organic materials, templates of biological origin may act as factories for the
production of molds for nanotechnology. The replication of DNA, the synthesis of
proteins and the self-assembly of protein monomers into sheets, tubules and filaments all represent biological processes for the high-throughput synthesis of biomolecular templates for nanotechnology. Specifically, the coupling of NPs to
biomolecules might yield new materials that combine the self-assembly properties of
the biomolecules with the catalytic functions of the NPs. That is, the catalytic
enlargement of the NPs associated with the biomolecules might generate the metallic
nanowires.
Among the different biomaterials, DNA is of specific interest as a template for the construction of nanowires [219–221]. The ease of synthesis of DNA of controlled lengths and predesigned shapes, together with the information stored in the base sequence, introduces rich structural properties and addressable structural domains for the binding of the reactants that form nanowires. Also, numerous biocatalysts such as endonucleases, ligases, telomerase and polymerases can cut, paste, elongate or replicate DNA and may be considered as nanotools for shaping the desired DNA and, eventually, for generating nanocircuits. In addition, the intercalation of molecular components into DNA and the binding of cationic species such as metal ions to the phosphate units of nucleic acids allow the assembly of chemically active functional complexes, which may be used as precursors for the formation of the nanowires.
One of the early examples that demonstrated the synthesis of Ag nanowires [222] is depicted in Figure 6.44. Two microelectrodes, which were positioned opposite one another with a 12–16 μm separation gap, were functionalized with 12-base oligonucleotides that were then bridged with a 16 μm long λ-DNA [Figure 6.44(A)]. The phosphate units of the DNA bridge were loaded with Ag⁺ ions by ion exchange, and the bound Ag⁺ ions were reduced to Ag metal with hydroquinone. The resulting small Ag aggregates along the DNA backbone were further developed by the reduction of Ag⁺ under acidic conditions, and the catalytic deposition of Ag on the Ag aggregates formed metallic nanowires. The resulting Ag nanowires were micrometers long and about 100 nm wide. The conduction properties of the nanowires revealed non-ohmic behavior, and threshold potentials were needed to activate electron transport through the wires [Figure 6.44(B)]. The potential gap in which no current passes through the nanowires was attributed to the existence of structural defects in the nanowires. The hopping of electrons across these barriers requires an overpotential that is reflected by the break voltage. This study was extended with the synthesis of many other metallic nanowires on DNA templates, and Cu [223], Pt [224] and Pd [225] nanowires were generated on DNA backbones.
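The reported current–voltage behavior can be sketched with a piecewise function. The following Python toy model assumes a symmetric threshold (break voltage) and ohmic conduction beyond it; the 0.5 V threshold and the conductance are invented placeholders, not the measured values of [222].

def wire_current(v, v_threshold=0.5, conductance=1.0e-6):
    """Piecewise I-V curve: no current inside the potential gap attributed
    to structural defects; linear (ohmic) conduction once the overpotential
    exceeds the break voltage. Parameter values are assumptions."""
    if abs(v) <= v_threshold:
        return 0.0
    sign = 1.0 if v > 0 else -1.0
    return sign * conductance * (abs(v) - v_threshold)

for v in (-1.0, -0.5, 0.0, 0.5, 1.0):
    print(f"V = {v:+.1f} V -> I = {wire_current(v) * 1e6:+.2f} uA")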
The use of DNA as a template and of metallic NPs as building blocks of nanowires was further demonstrated by using telomers synthesized by HeLa cancer cell extracts [226]. The constant repeat units that exist in the telomers provided addressable domains for the self-assembly of the NPs and the subsequent synthesis of the nanowires. In one approach, the primer 56 was telomerized in the presence of the dNTP nucleotide mixture, which contained (aminoallyl)-dNTPs (57) [Figure 6.45(A)]. The resulting amine-containing telomers were treated with Au NPs (1.4 nm) (58), which were functionalized with single N-hydroxysuccinimidyl ester groups, to yield Au NP-modified telomers. The Au NP-decorated DNA wires were then enlarged by electroless gold deposition to generate metallic nanowires [Figure 6.45(B) and (C)]. A second approach involved the telomerase-induced generation of telomers that included constant repeat units. The hybridization of Au NPs functionalized with nucleic acids complementary to the telomer repeat domains, followed by electroless enlargement of the NPs, yielded Au nanowires.
The binding of proteins, such as RecA, to DNA has been used as a means for the patterning of nanoscale DNA-based metal wires with nonconductive or semiconductive gaps [227]. The metallization of the free, non-protein-coated DNA segments permitted the sequence-specific biomolecular lithography of DNA-based nanowires [228]. A single-stranded nucleic acid sequence was complexed with RecA and was carried to a double-stranded duplex DNA, a process that led to the nanolithographically patterned insulation of the DNA template by the protein [Figure 6.46(A)]. A carbon nanotube was then specifically attached to the protein patch using a series of antigen–antibody recognition processes: the anti-RecA antibody (Ab) was bound to the protein linked to the DNA duplex, and the biotinylated anti-antibody was then linked to the RecA Ab. The latter Ab was used to bind specifically the streptavidin-coated carbon nanotubes on the protein patch. Ag⁺ ions were bound to the free DNA domains, and these were reduced to Ag clusters that were associated with the DNA assembly. The dsDNA was modified with aldehyde groups prior to this process to allow the chemical reduction of the electrostatically bound Ag⁺ ions to Ag clusters.
Peptide nanotubes were used as templates for growing Ag nanowires [230] [Figure 6.47(A)]. The peptide nanotubes were loaded with Ag⁺ ions, which were reduced with citrate to yield metallic silver nanowires inside the peptide nanotubes [Figure 6.47(B)]. The peptide coating was then removed by enzymatic degradation in the presence of proteinase K to yield micrometer-long Ag nanowires with a diameter of 20 nm [Figure 6.47(C)]. Upon application of D-phenylalanine-based peptide fibrils, which are resistant to proteinase K, the peptide coating of the Ag nanowires was preserved. Peptide nanotubes were also used to generate coaxial metal–insulator–metal nanotubes [231] [Figure 6.48(A)]. The nanotube assembled from the D-phenylalanine peptide 59 was interacted with Ag⁺ ions, and the resulting intra-tube-associated ions were reduced to form the Ag⁰ nanowire. The resulting composite was further modified by tethering a thiol-functionalized peptide to the nanowire peptide coating. The association of Au NPs with the thiol groups, followed by the deposition of gold on the Au NP seeds, generated the resulting coaxial metal nanowires [Figure 6.48(B)].
The specific assembly of protein subunits into polymeric structures could provide a means for the patterning of the generated metal nanowires. The f-actin filament provides specific binding for the biomolecular motor protein myosin, which forms a complex with the filament, where the motility of myosin along the filament is driven by ATP [232, 233]. The f-actin filament is formed by the reversible association of g-actin subunits in the presence of ATP, Mg²⁺ and Na⁺ ions. Accordingly, the f-actin filament was used as a template for the formation of metallic nanowires [234]. The f-actin filament was covalently modified with Au NPs (1.4 nm) that were functionalized with single N-hydroxysuccinimidyl ester groups, and the Au NP-functionalized g-actin subunits were then separated and used as versatile building blocks for the formation of the nanostructure. The Mg²⁺/Na⁺/ATP-induced polymerization of the functionalized monomers yielded the Au NP-functionalized filaments [Figure 6.49(A)], and the electroless catalytic deposition of gold on the Au NP-functionalized f-actin filament yielded 1–3 μm long gold wires of height 80–150 nm. The gold wires revealed metallic conductivity with a resistance similar to that of bulk gold. By the sequential polymerization of naked actin filament units on the preformed Au NP–actin wire and the subsequent electroless catalytic deposition of gold on the Au NPs, patterned actin–Au wire–actin filaments were generated [Figure 6.49(C)]. A related approach was applied to yield the inverse Au wire–actin–Au wire patterned filaments [Figure 6.49(B)]. The nanostructure consisting of actin–Au nanowire–actin was used
enzyme unit), and the biocatalytic NP hybrid material was used as a template for the stepwise synthesis of metallic nanowires [209] [Figure 6.50(A)]. The enzyme–Au NP hybrid was used as a biocatalytic ink for the patterning of Si surfaces using dip-pen nanolithography (DPN). The subsequent glucose-mediated generation of H2O2 and the catalytic enlargement of the NPs resulted in the catalytic growth of the particles, and this yielded micrometer-long Au metallic wires exhibiting heights and widths in the region of 150–250 nm, depending on the biocatalytic development time interval [Figure 6.50(B)].
This process is not limited to redox enzymes, and NP-functionalized biocatalysts that transform a substrate into an appropriate reducing agent may be similarly used as biocatalytic inks for the generation of metallic nanowires. This was exemplified with the application of Au NP-functionalized alkaline phosphatase, AlkPh (average loading of 10 NPs per enzyme unit), as biocatalytic ink for the generation of Ag nanowires [209] [Figure 6.51(A)]. The alkaline phosphatase-mediated hydrolysis of p-aminophenol phosphate (60) yielded p-aminophenol (61), which reduced Ag⁺ to Ag⁰ on the Au NP seeds associated with the enzyme. This allowed the enlargement of the Au particles acting as cores for the formation of continuous silver wires exhibiting heights of 30–40 nm on the protein template [Figure 6.51(B)]. The method allowed the stepwise, orthogonal formation of metal nanowires composed of different metals and with controlled dimensions [Figure 6.51(C)]. Furthermore, the biocatalytic growth of the nanowires exhibited a self-inhibition mechanism, and upon coating of the protein with the metal no further enlargement occurred. This self-inhibition mechanism is specifically important since the dimensions of the resulting nanowires are controlled by the sizes of the biocatalytic templates.
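The self-inhibition mechanism implies saturating growth kinetics: deposition slows as the template becomes coated and stops when it is fully covered. The Python sketch below expresses this with an exponential-saturation law; the limiting height and rate constant are assumed placeholders, not measured values.

import math

def wire_height_nm(t_min, h_max=40.0, k_per_min=0.3):
    """Template-limited growth: the wire height approaches h_max (set by the
    biocatalytic template) and no further enlargement occurs once the protein
    is coated. h_max and k_per_min are illustrative assumptions."""
    return h_max * (1.0 - math.exp(-k_per_min * t_min))

for t in (1, 5, 10, 30):
    print(f"{t:2d} min -> {wire_height_nm(t):.1f} nm")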
6.11
Conclusions and Perspectives
References
1 Mulvaney, P. (1996) Langmuir, 12, 788–800.
2 Alvarez, M.M., Khoury, J.T., Schaaff, T.G., Shafigullin, M.N., Vezmar, I. and Whetten, R.L. (1997) The Journal of Physical Chemistry B, 101, 3706–3712.
3 Mirkin, C.A., Letsinger, R.L., Mucic, R.C. and Storhoff, J.J. (1996) Nature, 382, 607–609.
4 Aslan, K., Luhrs, C.C. and Perez-Luna, V.H. (2004) The Journal of Physical Chemistry B, 108, 15631–15639.
5 Brus, L.E. (1991) Applied Physics A, 53, 465–474.
6 Grieve, K., Mulvaney, P. and Grieser, F. (2000) Current Opinion in Colloid and Interface Science, 5, 168–172.
7 Chan, W.C.W., Maxwell, D.J., Gao, X., Bailey, R.E., Han, M. and Nie, S. (2002) Current Opinion in Biotechnology, 13, 40–46.
8 Sapsford, K.E., Pons, T., Medintz, I.L. and Mattoussi, H. (2006) Sensors, 6, 925–953.
9 Katz, E., Willner, I. and Wang, J. (2004) Electroanalysis, 16, 19–44.
10 Wang, J. (2003) Analytica Chimica Acta, 500, 247–257.
11 Patolsky, F., Weizmann, Y., Lioubashevski, O. and Willner, I. (2002) Angewandte Chemie-International Edition, 41, 2323–2327.
12 Daniel, M.-C. and Astruc, D. (2004) Chemical Reviews, 104, 293–346.
13 Katz, E. and Willner, I. (2004) Angewandte Chemie-International Edition, 43, 6042–6108.
14 Rosi, N.L. and Mirkin, C.A. (2005) Chemical Reviews, 105, 1547–1562.
15 Wang, J. (2005) Small, 1, 1036–1043.
16 Medintz, I.L., Uyeda, H.T., Goldman, E.R. and Mattoussi, H. (2005) Nature Materials, 4, 435–446.
34 Willner, B., Katz, E. and Willner, I. (2006) Current Opinion in Biotechnology, 17, 589–596.
35 Niemeyer, C.M. (2003) Angewandte Chemie-International Edition, 42, 5796–5800.
36 Xiao, Y., Patolsky, F., Katz, E., Hainfeld, J.F. and Willner, I. (2003) Science, 299, 1877–1881.
37 Zayats, M., Katz, E., Baron, R. and Willner, I. (2005) Journal of the American Chemical Society, 127, 12400–12406.
38 Park, S.-J., Lazarides, A.A., Mirkin, C.A., Brazis, P.W., Kannewurf, C.R. and Letsinger, R.L. (2000) Angewandte Chemie-International Edition, 39, 3845–3848.
39 Parak, W.J., Pellegrino, T., Micheel, C.M., Gerion, D., Williams, S.C. and Alivisatos, A.P. (2003) Nano Letters, 3, 33–36.
40 Ghosh, S.S., Kao, P.M., McCue, A.W. and Chappelle, H.L. (1990) Bioconjugate Chemistry, 1, 71–76.
41 Dubertret, B., Calame, M. and Libchaber, A.J. (2001) Nature Biotechnology, 19, 365–370.
42 Niemeyer, C.M. (2001) Chemistry – A European Journal, 7, 3188–3195.
43 Merkoçi, A., Aldavert, M., Marín, S. and Alegret, S. (2005) Trends in Analytical Chemistry, 24, 341–349.
44 Merkoçi, A. (2007) FEBS Journal, 274, 310–316.
45 Wang, J. (1985) Stripping Analysis, VCH, Weinheim.
46 González-García, M.B. and Costa-García, A. (1995) Bioelectrochemistry and Bioenergetics, 38, 389–395.
47 González-García, M.B., Fernández-Sánchez, C. and Costa-García, A. (2000) Biosensors & Bioelectronics, 15, 315.
48 Ozsoz, M., Erdem, A., Kerman, K., Ozkan, D., Tugrul, B., Topcuoglu, N., Ekren, H. and Taylan, M. (2003) Analytical Chemistry, 75, 2181–2187.
49 Wang, J., Xu, D., Kawde, A.-N. and Polsky, R. (2001) Analytical Chemistry, 73, 5576–5581.
102 Schlatterer, J.C., Stuhlmann, F. and Jäschke, A. (2003) ChemBioChem, 4, 1089–1092.
103 Breaker, R.R. and Joyce, G.F. (1994) Chemistry & Biology, 1, 223–229.
104 Cuenoud, B. and Szostak, J.W. (1995) Nature, 375, 611–614.
105 Santoro, S.W., Joyce, G.F., Sakthivel, K., Gramatikova, S. and Barbas, C.F. III (2000) Journal of the American Chemical Society, 122, 2433–2439.
106 Travascio, P., Witting, P.K., Mauk, A.G. and Sen, D. (2001) Journal of the American Chemical Society, 123, 1337–1348.
107 Travascio, P., Li, Y. and Sen, D. (1998) Chemistry & Biology, 5, 505–517.
108 Santoro, S.W. and Joyce, G.F. (1997) Proceedings of the National Academy of Sciences of the United States of America, 94, 4262–4266.
109 Faulhammer, D. and Famulok, M. (1996) Angewandte Chemie-International Edition in English, 35, 2837–2841.
110 Liu, J. and Lu, Y. (2003) Journal of the American Chemical Society, 125, 6642–6643.
111 Liu, J. and Lu, Y. (2004) Analytical Chemistry, 76, 1627–1632.
112 Liu, J. and Lu, Y. (2006) Angewandte Chemie-International Edition, 45, 90–94.
113 Guarise, C., Pasquato, L., De Filippis, V. and Scrimin, P. (2006) Proceedings of the National Academy of Sciences of the United States of America, 103, 3978–3982.
114 Laromaine, A., Koh, L., Murugesan, M., Ulijn, R.V. and Stevens, M.M. (2007) Journal of the American Chemical Society, 129, 4156–4157.
115 Choi, Y., Ho, N.-H. and Tung, C.-H. (2007) Angewandte Chemie-International Edition, 46, 707–709.
116 Schofield, C.L., Field, R.A. and Russell, D.A. (2007) Analytical Chemistry, 79, 1356–1361.
117 Huang, C.-C., Huang, Y.-F., Cao, Z., Tan, W. and Chang, H.-T. (2005) Analytical Chemistry, 77, 5735–5741.
118 Taton, T.A., Mirkin, C.A. and Letsinger, R.L. (2000) Science, 289, 1757–1760.
229 Yan, H., Park, S.H., Finkelstein, G., Reif, J.H. and LaBean, T.H. (2003) Science, 301, 1882–1884.
230 Reches, M. and Gazit, E. (2003) Science, 300, 625–627.
231 Carny, O., Shalev, D.E. and Gazit, E. (2006) Nano Letters, 6, 1594–1597.
232 Vale, R.D. (2003) Journal of Cell Biology, 163, 445–450.
233 dos Remedios, C.G. and Moens, P.D.J. (1995) Biochimica et Biophysica Acta, 1228, 99–124.
234 Patolsky, F., Weizmann, Y. and Willner, I. (2004) Nature Materials, 3, 692–695.
235 Murphy, C.J., San, T.K., Gole, A.M., Orendorff, C.J., Gao, J.X., Gou, L., Hunyadi, S.E. and Li, T. (2005) The Journal of Physical Chemistry B, 109, 13857–13870.
7
Philosophy of Nanotechnoscience
Alfred Nordmann
7.1
Introduction: Philosophy of Science and of Technoscience
One way or another, the philosophy of science always informs and reflects the development of science and technology. It appears in the midst of disputes over theories and methods, in the reflective thought of scientists and, since the late nineteenth century, also in the analyses of so-called philosophers of science. Four philosophical questions, in particular, are answered implicitly or contested explicitly by any scientific endeavor:
1. How is a particular science to be defined and what are the objects and problems in its domain of interest?
2. What is the methodologically proper or specifically scientific way of approaching these objects and problems?
3. What kind of knowledge is produced and communicated, how does it attain objectivity, if not certainty, and how does it balance the competing demands of universal generality and local specificity?
4. What is its place in relation to other sciences, where do its instruments and methods, its concepts and theories come from and should its findings be explained on a deeper level by more fundamental investigations?
When researchers publish their results, when they review and critique their peers, argue for research funds or train graduate students, they offer examples of what they consider good scientific practice and thereby adopt a stance on all four questions. When, for example, there is a call for more basic research on some scientific question, one can look at the argument that is advanced and discover how it is informed by a particular conception of science and of the relation of science and technology. Frequently it involves the idea that basic science identifies rather general laws of causal relations. These laws can then be applied in a variety of contexts, and the deliberate control of causes and effects can give rise to new technical devices. If one encounters such an argument for basic science, one might ask, of course, whether this picture of
basic versus applied science is accurate. While it may hold here and there, especially in theoretical physics, it is perhaps altogether inadequate for chemistry. And thus one may find that the implicit assumptions agree less with the practice and history of science and more with a particular self-understanding of science. According to this self-understanding, basic science disinterestedly studies the world as it is, whereas the engineering sciences apply this knowledge to change the world in accordance with human purposes.
Science and scientific practice are always changing as new instruments are invented, new problems arise, new disciplines emerge. Also, the somewhat idealized self-understandings of scientists can change. The relation of science and technology provides a case in point. Is molecular electronics a basic science? Is nanotechnology applied nanoscience? Are the optical properties of carbon nanotubes part of the world as it is, or do they appear only in the midst of a large-scale engineering pursuit that is changing the world according to human purposes? There are no easy or straightforward answers to these questions, and this is perhaps due to the fact that the traditional ways of distinguishing science and technology, and basic and applied research, do not work any longer. As many authors are suggesting, we should speak of "technoscience" [1, 2], which is defined primarily by the interdependence of theoretical observation and technical intervention [3].1) Accordingly, the designation "nanotechnoscience" is more than shorthand for "nanoscience and nanotechnologies" but signifies a mode of research other than traditional science and engineering. Peter Galison, for example, notes that "[n]anoscientists aim to build, not to demonstrate existence. They are after an engineering way of being in science" [5]. Others appeal to the idea of a "general purpose technology" and thus suggest that nanotechnoscience is fundamental research to enable a new technological development at large. Richard Jones sharpens this when he succinctly labels at least some nanotechnoscientific research as "basic gizmology".2)
Often, nanoscience is defined as an investigation of scale-dependently discontinuous properties or phenomena [6]. This definition of nanoscience produces in its wake an ill-defined conception of nanotechnologies: these encompass all possible technical uses of these properties and phenomena. In its 2004 report on nanoscience and nanotechnologies, the Royal Society and Royal Academy of Engineering defines these terms along these lines: nanoscience as the study of phenomena and the manipulation of materials at atomic, molecular and macromolecular scales, where properties differ significantly from those at a larger scale; and nanotechnologies as the design, characterization, production and application of structures, devices and systems by controlling shape and size at the nanometre scale.
7.2
From Closed Theories to Limits of Understanding and Control
7.2.1
Closed Relative to the Nanoscale
In the late 1940s, the physicist Werner Heisenberg introduced the notion of closed theories. In particular, he referred to four closed theories: Newtonian mechanics; Maxwell's theory with the special theory of relativity; thermodynamics and statistical mechanics; and non-relativistic quantum mechanics with atomic physics and chemistry. These theories he considered to be closed in four complementary respects:
1. Their historical development has come to an end; they are finished or have reached their final form.
2. They constitute a hermetically closed domain in that the theory defines conditions of applicability such that the theory will be true wherever its concepts can be applied.
3. They are immune to criticism; problems that arise in contexts of application are deflected to auxiliary theories and hypotheses or to the specifics of the set-up, the instrumentation, and so on.
4. They are forever valid: wherever and whenever experience can be described with the concepts of such a theory, the laws postulated by this theory will be proven correct [10].3)
All this holds for nanotechnoscience: It draws on an available repertoire of theories
that are closed or considered closed in respect to the nanoscale, but it is concerned
neither with the critique or further elaboration of these theories, nor with the
construction of theories of its own.4) This is not to say, however, that closed theories
are simply applied in nanotechnoscience.
When Heisenberg refers to the hermetically closed character of closed theories (in
condition 2 above), he states merely that the theory will be true where its concepts can
be applied and leaves quite open how big or small the domain of its actual applicability
is. Indeed, he suggests that this domain is so small that a closed theory does not
contain any absolutely certain statement about the world of experience [10]. Even for
a closed theory, then, it remains to be determined how and to what extent its concepts
can be applied to the world of experience.5) Thus, there is no pre-existing domain of
phenomena to which a closed theory is applied. Instead, it is a question of success,
This notion of application has been the topic of many recent discussions on modeling,6) but it does not capture the case of nanotechnoscience. For in this case, researchers are not trying to bring nanoscale phenomena into the domain of quantum chemistry or fluid dynamics or the like. They are not using models to extend the domain of application of a closed theory or general law. They are not engaged in fitting the theory to reality and vice versa. Instead, they take nanoscale phenomena as parts of a highly complex mesocosm between classical and quantum regimes. They have no theories that are especially suited to account for this complexity, no theories, for example, of structure–property relations at the nanoscale.7) Nanoscale researchers understand, in particular, that the various closed theories have been formulated for far better-behaved phenomena in far more easily controlled laboratory settings. Rather than claim that the complex phenomena of the nanoscale can be described such that the concepts of the closed theory now apply to them, they draw on closed theories eclectically, stretching them beyond their intended scope of application to do some partial explanatory work at the nanoscale.8) A certain measurement of a current through an organic–inorganic molecular complex, for example, might be reconstructed quantum-chemically or in the classical terms of electrical engineering, and yet the two accounts do not
compete against each other for offering a better or best explanation [20]. Armed with theories that are closed relative to the nanoscale, researchers are well equipped to handle phenomena in need of explanation, but they are also aware that they bring crude instruments that are not made specifically for the task and that these instruments therefore have to work in concert. Indeed, nanoscale research is characterized by a tacit consensus according to which the following three propositions hold true simultaneously:
1. There is a fundamental difference between quantum and classical regimes such
that classical theories cannot describe quantum phenomena and such that
quantum theories are inappropriate for describing classical phenomena.
2. The nanoscale holds intellectual and technical interest because it is an exotic
territory [14] where classical properties such as color and conductivity emerge
when one moves up from quantum levels and where phenomena such as
quantized conductance emerge as one moves down to the quantum regime.
3. Nanoscale researchers can eclectically draw upon a large toolkit of theories from
the quantum and classical regimes to construct explanations of novel properties,
behaviors or processes.
Taken together, these three statements express a characteristic tension concerning nanotechnology, namely that it is thought to be strange, novel and surprising on the one hand, familiar and manageable on the other. More significantly for the present purposes, however, they express an analogous tension regarding available theories: they are thought to be inadequate on the one hand but quite sufficient on the other. The profound difference between classical and quantum regimes highlights what makes the nanocosm special and interesting, but this difference melts down to a matter of expediency and taste when it comes to choosing tools from classical or quantum physics. Put yet another way: what makes nanoscale phenomena scientifically interesting is that they cannot be adequately described from either perspective, but what makes nanotechnologies possible is that the two perspectives make do when it comes to accounting for these phenomena.
Available theories need to be stretched in order to manage the tension between
these three propositions. How this stretching actually takes place in research practice
needs to be shown with the help of detailed case studies. One might look, for
example, at the way in which theory is occasionally stuck in to satisfy an extraneous
explanatory demand.9) A more prominent case is the construction of simulation
13) Since the publication of Richard Jones's book on Soft Machines, that concept has been the subject of an emerging discussion [25, 26]. It concerns the question of whether the term "machine" retains any meaning in the notion of a "soft machine" when this is thought of as a non-mechanical, biological machine (while it clearly does retain meaning if thought of as a concrete machine in the sense of Simondon).
14) This is especially true, perhaps, for the concept "self-assembly", which has been cautiously delimited, for example, by George Whitesides [27], but which keeps escaping the box and harks backwards and forwards to far more ambitious notions of order out of chaos, spontaneous configurations at higher levels, etc.
15) To be sure, it is a commonplace that laboratory practice is more complex than the stories told in scientific papers. Traditional scientific research often seeks to isolate particular causal relations by shielding them against interferences from the complex macroscopic world of the laboratory. Whether it is easy or difficult to isolate these relations, whether they are stable or evanescent, is of little importance for the scientific stories to be told. The situation changes in respect to nanotechnoscience: its mission is to ground future technologies under conditions of complexity. In this situation, it is more troubling that scientific publications tell stories only of success.
nanotoxicology. It is finding out the hard way that physico-chemical characterization does not go very far, and that even the best methods for evaluating chemical substances do not REACh all the way to the nanoscale [28].16) In other words, the methods of chemical toxicology go only so far and tell only a small part of the toxicological story, though regarding chemical composition, at least, there are general principles, even laws that can be drawn upon. With regard to the surface characteristics and shape of particles of a certain size, one has to rely mostly on anecdotes from very different contexts, such as the story of asbestos. For lack of better approaches, therefore, one begins from the vantage point of chemical toxicology and confidently stretches available theories and methods as far as they will go, while the complexities of hazard identification, let alone risk assessment (one partially characterized nanoparticle or nanosystem at a time?!), tend to be muted.17)
There is yet another, again more general, way to make this point. Theories that are closed relative to the nanoscale can only introduce nonspecific constraints. The prospects and aspirations of nanotechnologies are only negatively defined: everything is thought to be possible at the nanoscale that is not ruled out by those closed theories or the known laws of nature. This, however, forces upon us a notion of technical possibility that is hardly more substantial than that of logical possibility. Clearly, the mere fact that something does not contradict known laws is not sufficient to establish that it can be realized technically under the complex conditions of the nanoregime. Yet once again, there is no theoretical framework or language available to make a distinction here and to acknowledge the specificities and difficulties of the nanoworld, since all we have are theories that were developed elsewhere and that are now stretched to accommodate phenomena from the nanosphere.18) However, failure to develop an understanding also of the limits of understanding and control at the nanoscale has tremendous cost, as it misdirects expectations, public debate and possibly also research funding.
7.3
From Successful Methods to the Power of Images
7.3.1
(Techno)scientific Methodology: Quantitative Versus Qualitative
primarily in the absence, even deliberate suppression, of visual clues by which to hold calculated and experimental images apart. Indeed, the (nano)technoscientific researcher frequently compares two displays or computer screens. One display offers a visual interpretation of the data that were obtained through a series of measurements (e.g. by an electron or scanning probe microscope); the other presents a dynamic simulation of the process he might have been observing, and for this simulation to be readable as such, the simulation software produces a visual output that looks like the output of an electron or scanning probe microscope. Agreement and disagreement between the two images then allow the researchers to draw inferences about probable causal processes and about the extent to which they have understood them. Here, the likeness of the images appears to warrant the inference from the mechanism modeled in the simulation to the mechanism that is probably responsible for the data that were obtained experimentally. Accordingly (and this cannot be done here), one would need to show how nanoscale researchers construct mutually reinforcing likenesses, how they calibrate not only simulations to observations and visual representations to physical systems but also their own work to that of others, current findings to long-term visions. This kind of study would show that unifying theories play little role in this, unless the common availability of a large tool-kit of theories can be said to unify the research community. Instead of theories, it is instruments (STM, AFM, etc.), their associated software, techniques and exemplary artefacts (buckyballs, carbon nanotubes, gold nanoshells, molecular wires) that provide the relevant common referents [33–35].
7.3.2
Ontological Indifference: Representation Versus Substitution
This is also not the place to subject this qualitative methodology to a sustained
critique. Such a critique is easy, in fact, from the point of view of rigorous and
methodologically self-aware quantitative science [31]. Far more interesting is the
question of why, despite this critique, a qualitative approach appears to be good
enough for the purposes of nanoscale research. As Peter Galison has pointed out,
these purposes are not to represent the nanoscale accurately and, in particular, not to
decide what exists and what does not exist, what is more fundamental and what is
derivative. He refers to this as the "ontological indifference" of nanotechnoscience [5].
Why is it, then, that nanotechnological research can afford this indifference? For
example, molecular electronics researchers may invoke more or less simplistic
pictures of electron transport but they do not need to establish the existence of
electrons. Indeed, electrons are so familiar to them that they might think of them
as ordinary macroscopic things that pass through a molecule as if it were another
material thing with a tunnel going through it [20]. Some physicists and most
philosophers of physics strongly object to such blatant disregard for the strangely
immaterial and probabilistic character of the quantum world that is the home of
electrons, orbitals, standing electron waves [36, 37]. And indeed, to achieve a practical
understanding of electron transport, it may be necessary to entertain more subtle
accounts. However, it is the privilege of ontologically indifferent technoscience that it
j227
j 7 Philosophy of Nanotechnoscience
228
can always develop more complicated accounts as the need arises. For the time being,
it can see how far it gets with rather more simplistic pictures.20)
Ontological indifference amounts to a disinterest in questions of representation
and an interest, instead, in substitution.21) Instead of using sparse modeling tools
to represent only the salient causal features of real systems, nanoresearchers
produce in the laboratory and in their models a rich, indeed oversaturated
substitute reality such that they begin by applying alternative techniques of data
reduction not to nature out there but to some domesticated chunk of reality in the
laboratory. These data reduction and modeling techniques, in turn, are informed by algorithms which are concentrated forms of previously studied real systems; they are tried and true components of substitute realities that manage to emulate real physical systems [38].22) In other words, there is so much reality in the simulations or constructed experimental systems before them that nanotechnology researchers can take them for reality itself [39]. They study these substitute systems and, of course, have with these systems faint prototypes for technical devices or applications. While the public is still awaiting significant nanotechnological products to come out of the laboratories, the researchers in the laboratories are already using nanotechnological tools to detach and manipulate more or less self-sufficient nanotechnological systems which only require further development before they can exist as useful devices outside the laboratory, devices that not only substitute for but improve upon something in nature.
7.3.3
Images as the Beginning and End of Nanotechnologies
Again, it may have appeared like a cumbersome path that led from qualitative
methodology and its constructions of likeness to the notion that models of nanoscale
phenomena do not represent but substitute chunks of reality and that they thereby
involve the kind of constructive work that is required also for the development of
nanotechnological systems and devices. For a more immediate illustration of this
point, we need to consider only the role of visualization technologies in the history of
nanotechnological research.23) Many would maintain, after all, that it all began for
real when Don Eigler and Erhard Schweizer created an image with the help of 35
xenon atoms. By arranging the atoms to spell "IBM", they did not represent a given reality but created an image that replaces a random array of atoms by a technically ordered proto-nanosystem. Since then, the ability to create images and to spell words has served as a vanguard in attempts to assert technical control in the nano-regime: the progress of nanotechnological research cannot be dissociated from the development of imaging techniques that are often at the same time techniques for intervention. Indeed, Eigler and Schweizer's image has been considered proof of concept for moving atoms at will. It is on exhibit in the STM web gallery of IBM's Almaden laboratory and is there appropriately entitled "The Beginning", a beginning that anticipates the end or final purpose of nanotechnologies, namely to directly and arbitrarily inscribe human intentions on the atomic or molecular scale.
Images from the nanocosm are at this point (early 2008) still the most impressive
as well as popular nanotechnological products. By shifting from quantitative coordinations of numerical values to the construction of qualitative likeness, from the
conventional representation of reality to the symbolic substitution of one reality by
another, nanotechnoscience has become beholden to the power of images. Art
historians and theorists like William Mitchell and Hans Belting, in particular, have
emphasized the difference between conventional signs that serve the purpose of
representation and pictures or images that embody visions and desires, that cannot
be controlled in that they are not mere vehicles of information but produce an excess
of meaning that is not contained in a conventional message [40, 41].
The power of images poses some of the most serious problems of and for nanoscience and nanotechnologies. This is readily apparent already for "The Beginning". As mentioned above, it is taken to signify that for the first time in history humans have manipulated atoms at will, and thus as proof of concept for the most daring nanotechnological visions advanced by the most controversial nanotechnological visionaries such as Eric Drexler. This was not, of course, what Eigler and
23) It is no accident that this is perhaps the best-studied and most deeply explored aspect of nanotechnologies.
Schweizer wanted to say. Their image is testimony also to the difficulty, perhaps the limits, of control of individual atoms. But the power of their image overwhelms any such testimony.
Here arises a problem similar to the one encountered in Section 7.2.3. The specificity, complexity and difficulty of work at the nanoscale do not have a language and do not find expression. The theories imported from other size regimes can only
carve out an unbounded space of unlimited potential, novelty, possibility. And the
pictures from the nanocosm show us a world that has already been accommodated to
our visual expectations and technical practice.24) Ontologically indifferent, nanotechnoscience may work with simplistic conceptions of electron transport and it
produces simplistic pictures of atoms, molecules, standing electron waves which
contradict textbook knowledge of these things. For example, it is commonly maintained that nanosized things consist only of surface and have no bulk. This is what
makes them intellectually and technically interesting. But pictures of the nanocosm
invariably show objects with very familiar bulk-surface proportions, a world that looks
perfectly suited for conventional technical constructions. And thus, again, we might
be facing the predicament of not being told or shown what the limits of nanotechnical
constructions and control might be.
The power of images also holds another problem, however. In the opposition of conventional sign and embodied image, the totemistic, fetishistic, magical
character of pictures comes to the fore. To the extent that the image invokes a
presence and substitutes for an absence, its kinship to voodoo-dolls, for example,
becomes apparent. This is not the place to explore the analogy between simulations and voodoo-dolls [31], but it should be pointed out that nanotechnologies in
a variety of ways cultivate a magical relation to technology and their imagery
reinforces this. Indeed, in the history of humankind we might have begun with
an enchanted and uncanny nature that needed to be soothed with prayer to the
spirits that dwelled in rocks and trees. Science and technology began as we
wondered at nature, became aware of our limits of understanding and yet tamed
and rationalized nature in a piece-meal fashion. Technology represents the extent
to which we managed to defeat a spirited, enchanted world and subjected it to
our control. We technologized nature. Now, however, visitors to science museums are invited to marvel at nanotechnologies, to imagine technological
agency well beyond human thresholds of perception, experience and imagination
and to pin societal hopes for technological innovation not on intellectual
understanding but on a substitutive emulation that harnesses the self-organizing
powers of nature. We thus naturalize technology, replace rational control over
brute environments by a magical dependency on smart environments and we
may end up rendering technology just as uncanny as nature used to be with its
earthquakes, diseases and thunderstorms [42, 43].25)
24) Compare footnote 20 above.
25) This is a strong indictment not of particular nanotechnologies but of certain ways of propagating our nanotechnological future.
7.4
From Definitions to Visions
7.4.1
Wieldy and Unwieldy Conceptions
The first two sections gave rise to the same complaint. After surveying the role of theories and methodologies for the construction of technical systems that can substitute for reality, it was noted that this tells us nothing about the specificity, complexity and difficulty of control at the nanoscale. The nanocosm appears merely as that place from where nanotechnological innovations emanate, and so far it appears that it can be described only in vaguely promising terms: the domain of interest to nanoscience and nanotechnologies is an exotic territory that comprises all that lies in the borderland of quantum and classical regimes, all that is unpredictable (but explicable) by available theories and all that is scale-dependently discontinuous, complex, full of novelty and surprise.26)
However, as one attempts a positive definition of nanotechnoscience and its domain of phenomena or applications, one quickly learns how much is at stake. In particular, definitions of nanotechnology suggest the unity of a program so heterogeneous and diverse that we cannot intellectually handle or manage the concept any more. By systematically overtaxing the understanding, such definitions leave a credulous public and policy makers in awe and unable to engage with nanotechnology in a meaningful manner. The search for a conceptually manageable definition is thus guided by an interest in specificity but also by a political value: it is to facilitate informed engagement on clearly delimited issues. In purely public contexts, therefore, it is best not to speak of nanotechnology in the singular at all but only of specific nanotechnologies or nanotechnological research programs [44]. In the present context, however, an effort is made to circumscribe the scope or domain of nanotechnoscience, that is, to consider the range of phenomena that are encountered by nanoscience and nanotechnologies. This proves to be a formidable challenge.
7.4.2
Unlimited Potential
There is an easy way to turn the negative description of the domain into a positive one.
One might say that nanoscience and nanotechnologies are concerned with everything molecular or, slightly more precisely, with the investigation and manipulation
of molecular architecture (as well as the properties or functionalities that depend on
molecular architecture).
26) Tellingly, the most sophisticated definition of nanoscience is quite deliberate in saying nothing about the nanocosm at all. Indeed, this definition is not limited to nanoscale phenomena or effects but intends a more
Everything that consists of atoms is thus an object of study and a possible design target of nanoscale research. This posits a homogeneous and unbounded space of possibility, giving rise, for example, to the notion of an all-powerful nanotechnology as a combinatorial exercise that produces the little BANG [45]: since bits, atoms, neurons, genes all consist of atoms, since all of them are molecular, they all look alike to nanoscientists and engineers who can recombine them at will. And thus comes with the notion of an unlimited space of combinatorial possibilities the transgressive character of nanotechnoscience: categorial distinctions of living and inanimate, organic and inorganic, biological and technical things, of nature and culture appear to become meaningless. Although hardly any scientist believes literally in the infinite plasticity of everything molecular, the molecular point of view proves transgressive in many nanotechnological research programs. It is particularly apparent where biological cells are redescribed as factories with molecular nanomachinery. Aside from challenging cultural sensibilities and systematic attempts to capture the special character of living beings and processes, nanotechnoscience here appears naively reductionist. In particular, it appears to claim that context holds no sway or, in other words, that there is no top-down causation such that properties and functionalities of the physical environment partially determine the properties and behaviors of the component molecules.27)
This sparsely positive and therefore unbounded view of nanoscale objects and their combinatorial possibilities thus fuels also the notion of unlimited technical potential, along with visions of a nanotechnological transgression of traditional boundaries. Accordingly, this conception of the domain of nanoscience and nanotechnologies suffers from the problem of unwieldiness: it can play no role in political discourse other than to appeal to very general predispositions of technophobes and technophiles [46].
Three further problems, at least, come with the conception of the domain as "everything molecular out there". And as before, internally scientific problems are intertwined with matters of public concern. There is first the (by now familiar) scientific and societal problem that there is no cognizance of limits of understanding and control, as evidenced by a seemingly naive reductionism. There is second the (by now also familiar) problem that technoscientific achievements and conceptions have a surplus of meaning which far exceeds what the research community can take responsibility for: the power of images is dwarfed by the power of visions (positive or negative) that come with the notion of unlimited potential. And there is finally the problem of the relation of technology and nature.
It was not very difficult to identify at least four major problems with the commonly held view that the domain of nanotechnological research encompasses "everything molecular". It proves quite difficult, in contrast, to avoid those problems. In particular, it appears to defy common sense and the insights of the physical sciences to argue that molecules should have a history or that they should be characterized by the specific environments in which they appear. Is it not the very accomplishment of physical chemistry ever since Lavoisier that it divested substances of their local origins by considering them only in terms of their composition, in terms of analysis and synthesis [48]? And should one not view nanoscience and nanotechnologies as an extension of traditional physics, physical chemistry and molecular biology as they tackle new levels
of complexity? All this appears evident enough, and yet there are grounds on which to tackle the formidable challenge and to differentiate the domain of nanoscientific objects.28)
As noted above, bulk chemical substances are registered and assessed on the grounds of a physicochemical characterization. Once a substance has been approved, it can be used in a variety of contexts of production and consumption. On this traditional model, there appears no need to consider its variability of interactions in different biochemical environments (but see [29, 30]). Although the toolkit of nanotoxicology is still being developed, there is a movement afoot according to which a carbon nanotube is perhaps not a carbon nanotube. What it is depends on its specific context of use: dispersed in water or bound in a surface, coated or uncoated, functionalized or not; all this is toxicologically relevant. Moreover, a comprehensive physicochemical characterization that includes surface properties, size and shape would require a highly complex taxonomy with too many species of nanoparticles, creating absurdly unmanageable tasks of identification, perhaps one particle at a time. Instead, the characterization of nanoparticles might proceed by way of the level of standardization that is actually reached in production and that is required for integration in a particular product, with a smaller or larger degree of variability, error tolerance or sensitivity to environmental conditions, as the case may be for a specific product in its context of use. Nanotoxicology would thus be concerned with product safety rather than the safety of component substances. On this account, the particles would indeed be defined by their history and situation in the world and thus "thickly" by their place within, and their impact upon, nature as the specific evolved conditions of human life on Earth.29)
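The combinatorial force of this point can be made explicit. In a minimal sketch, with purely illustrative attribute counts that are not taken from the text, a taxonomy over independent characterization attributes multiplies out as follows:

```latex
% Illustrative sketch only: the attribute counts below are invented,
% not taken from the text.
% n_size sizes, n_shape shapes, n_coat coatings and n_func
% functionalization states give a taxonomy of
%   N = n_size * n_shape * n_coat * n_func
% nanoparticle "species", for example:
N = n_{\mathrm{size}} \cdot n_{\mathrm{shape}} \cdot n_{\mathrm{coat}} \cdot n_{\mathrm{func}}
  = 10 \cdot 10 \cdot 10 \cdot 2 = 2000.
```

Every further toxicologically relevant attribute multiplies this number again, which is why the text speaks of absurdly unmanageable tasks of identification.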
There is another, more principled, argument for a "thickly" differentiated account of the objects that make up the domain of nanoscience and nanotechnologies. The unbounded domain of "everything molecular" includes not only the objects and properties that we now have access to and that we can now measure and control. It also includes those objects and properties that one may gain access to in the future. This way of thinking is indifferent to the problem of actual technical access also in that it does not consider how observational instruments and techniques structure, shape and perhaps alter the objects in the domain. On this account, the domain appears open and unlimited because
7.5
From Epistemic Certainty to Systemic Robustness
7.5.1
What Do Nanoscientists Know?
This account of skill knowledge presses the question of where the science is in technoscience. The answer to this question can be found in the first section of this chapter: it is in the (closed) theories that are brought as tools to the achievement of partial control and partial understanding. Nanotechnoscience seeks not to improve theory or to change our understanding of the world but primarily to manage
The shift from hypotheses that take the form of sentences to actions within technocultural systems, from epistemic questions of certainty to systemic probes of robustness, has implications also for the risk society that looks to government mostly for protection from risk [54].31)
30) This diagnosis is not entirely novel or surprising. Technology, writes Heidegger, is always ahead of science and, in a deep sense, science is only applied technology [47]. By this he means not only that laboratory science requires instruments and experimental apparatus for stabilizing the phenomena. He means more generally that a technological attitude informs the scientific way of summoning phenomena to predictably appear once certain initial conditions are met.
31) The precautionary principle refers to the certainty and uncertainty of knowledge regarding risks. Where technology assessment shifts from the truth of sentences about risk to the robustness or resilience of emerging technical systems and their interaction with other technical systems, a different kind of prudential approach is required, for example, Dupuy and Grinbaum's ongoing normative assessment [55].
Expectations of certainty and assurances of safety will not be met by nanotechnologies. Other technologies already fail to meet them. Certainty about the safety of a new drug, for example, is produced by the traditional method of a clinical trial that establishes or refutes some proposition about the drug's efficacy and severity of side-effects. A far more complex and integrated mechanism is required where such certainty is unattainable and where robustness needs to be demonstrated. Here, several activities have to work in tandem, ranging from traditional toxicology, occupational health and epidemiology all the way to the deliberate adoption of an unknown risk for the sake of a significant desired benefit. If this integration works, social robustness will be built into the technical system along with the robustness of acquired skills, tried and true algorithms, and measuring and monitoring apparatus. The fact that nanoscale researchers demonstrate acquired capabilities, and that they thus produce mere skill knowledge, creates a demand for skill knowledge also in a social arena where nanotechnological innovations are challenged, justified and appropriated.
7.6
What Basic Science Does Nanotechnology Need?
The preceding sections provided a survey of nanotechnoscience in terms of disciplinary questions (a complex field partially disclosed by stretching closed theories), of methodology (constructions and qualitative judgments of likeness), of ontology (a thin conception of nature as unlimited potential) and of epistemology (acquisition and demonstration of capabilities). This does not exhaust a philosophical characterization of the field, which would have to include, for example, a sustained investigation of nanotechnology as a conquest of space or a kind of territorial expansion.32) Also, nothing has been said so far about nanotechnology as an enabling technology that might enable, in particular, a convergence with bio- and information technologies. Finally, it might be important to consider nanotechnoscience as an element or symptom of a larger cultural transition from scientific to technoscientific research.
This survey is limited in other ways. It glossed over the heterogeneity of research questions and research traditions. And it focused exclusively on the way in which nanotechnological research has developed thus far. There is nothing in the preceding account to preclude a profound reorientation of nanoscience and nanotechnologies. Indeed, one reorientation might consist in the whole enterprise breaking apart and continuing in rather more traditional disciplinary settings, with "nano" ceasing to be a funding umbrella but becoming a prefix that designates a certain approach. Thus, under the sectoral funding umbrellas of food and agriculture, energy, health, manufacturing or environment, researchers with the "nano" prefix would investigate how problems and solutions can be viewed at the molecular level. Their work
32) One implication of this is that nanotechnology should not be judged as the promise of a future but, instead, as a collective experiment in and with the present [56].
Finally, one might ask whether nanotechnoscience can and should be construed as a social science of nature.38) As an enabling, general-purpose or key technology, it leaves undetermined what kinds of applications will be enabled by it. This sets it apart from cancer research, the Manhattan project, the arms race, space exploration, artificial intelligence research, and so on. As long as nanotechnoscience has no societal mandate other than to promote innovation, broadly conceived, it remains essentially incomplete, requiring social imagination and public policy to create an intelligent demand for the capabilities it can supply. As research is organized to converge upon particular societal goals [61], nanoscience and nanotechnology might be completed by incorporating social scientists, anthropologists and philosophers in its ambitions to design or shape a world "atom by atom".
Nanotechnologies are frequently touted for their transformative potential, for bringing about the next scientific or industrial revolution. This chapter did not survey a revolutionary development, but pragmatic and problematic integrations of pre-existing scientific knowledge with the novel discoveries at the nanoscale. If one expects science to be critical of received theories and to produce a better understanding of the world, if one expects technology to enhance transparency and control by disenchanting and rationalizing nature, these pragmatic integrations appear regressive rather than revolutionary. If one abandons these expectations and makes the shift from epistemic certainty to systemic robustness, these pragmatic integrations hold the promise of producing socially robust technologies. In the meantime, there is no incentive for researchers and hardly any movement on the side of institutions to consider seriously the question of a disciplinary reorientation and consolidation of the nanosciences and nanotechnologies. A nanotechnological revolution has not happened yet: we may be waiting for it in vain and this is probably a good thing.39)
References
1 Latour, B. (1987) Science in Action, Harvard University Press, Cambridge, MA.
2 Haraway, D. (1997) Modest_Witness@Second_Millennium, Routledge, New York.
3 Nordmann, A. (2004) Was ist TechnoWissenschaft? Zum Wandel der Wissenschaftskultur am Beispiel von Nanoforschung und Bionik, in Bionik: Aktuelle Forschungsergebnisse in Natur-, Ingenieur- und Geisteswissenschaften (eds T. Rossmann and C. Tropea), Springer, Berlin, pp. 209–218.
4 Hacking, I. (1983) Representing and Intervening, Cambridge University Press, New York.
5 Galison, P. (2006) The pyramid and the ring, presented at the conference of the Gesellschaft für analytische Philosophie (GAP), Berlin.
6 Brune, H., Ernst, H., Grunwald, A., Grünwald, W., Hofmann, H., Krug, H., Janich, P., Mayor, M., Rathgeber, W., Schmid, G., Simon, U., Vogel, V. and Wyrwa, D. (2006) Nanotechnology: Assessments and Perspectives, Springer, Berlin.
7 Nanoscience and Nanotechnologies: Opportunities and Uncertainties, Royal Society and Royal Academy of Engineering, London, 2004.
8 Echeverría, J. (2003) La Revolución Tecnocientífica, Fondo de Cultura Económica de España, Madrid.
9 Johnson, A. (3 March 2005) Ethics and the epistemology of engineering: the case of nanotechnology, presented at the conference Nanotechnology: Ethical and Legal Issues, Columbia, SC.
10 Heisenberg, W. (1974) The notion of a closed theory in modern science, in Across the Frontiers (ed. W. Heisenberg), Harper and Row, New York, pp. 39–46.
11 Bokulich, A. (2006) Heisenberg meets Kuhn: closed theories and paradigms. Philosophy of Science, 73, 90–107.
8
Ethics of Nanotechnology. State of the Art and Challenges Ahead
Armin Grunwald
8.1
Introduction and Overview
In view of the revolutionary potentials attributed to nanosciences and nanotechnology with respect to nearly all fields of society and individual life [1, 2], it is not surprising that "nano" has attracted great interest in the media and in the public. Parallel to high expectations, for example in the fields of health, growth and sustainable development, there are concerns about risks and side-effects. Analyzing, deliberating and assessing the expectable impacts of nanotechnology on future society are regarded as necessary parts of present and further development. There have already been commissions and expert groups dealing with ethical, legal and social implications of nanotechnology (ELSI) [3, 4]. An ethical reflection on nanotechnology emerged and has already led to new terms such as "nano-ethics" and the recent foundation of a new journal, Nano-Ethics. The quest for ethics in and for nanotechnology currently belongs to public debate in addition to scientific self-reflection. The ethical aspects of nanotechnology discussed in the (so far few) available treatises give broad evidence of this relatively new field of science and technology ethics.
Nanotechnology has been attracting increasing awareness in practical philosophy and in professional ethics. However, there has been some time delay compared with the development of nanotechnology itself: "While the number of publications on NT [nanotechnology] per se has increased dramatically in recent years, there is very little concomitant increase in publications on the ethical and social implications to be found" [5, p. R10]. Certain terms, such as privacy, the man-machine interface, the relationship between technology and humankind, or equity, are often mentioned. In the last few years, ethical reflection on nanotechnology developed quickly and identified many ethically relevant issues [6, 7]. However, well-justified criteria for determining why certain topics, such as nanoparticles or crossing the border between technology and living systems, should be ethically relevant are mostly not given and there is no consensus yet. In particular, the novelty of the ethical questions touched by developments emerging from nanotechnology compared with ethical issues in
8.2
The Understanding of Ethics1)
In modern discussion, the distinction between factual morals on the one hand and ethics as the reflective discipline in cases of moral conflicts or ambiguities on the other has widely been accepted [10]. This distinction takes into account the plurality of morals in modern society. As long as established traditional moral convictions (e.g. religious ones) are uncontroversial and valid among all relevant actors, and as long as they are sufficient to deal with the respective situation and do not leave open relevant questions, ethical reflection is not in place. Morals are, in fact, the action-guiding maxims and rules of an individual, of a group or of society as a whole. Ethical analysis, on the other hand, takes these morals as its subjects to reflect on. Ethics is concerned with the justification of moral rules of action, which can lay claim to validity above and beyond the respective, merely particular morals [8]. In particular, ethics serves the resolution of conflict situations which result from the actions or plans of actors based on divergent moral conceptions, by argumentative deliberation alone, grounded in philosophical ideas such as the Categorical Imperative of Immanuel Kant, the Golden Rule or the Pursuit of Happiness (utilitarianism).
1) This chapter summarizes general work of the author in the field of ethics of technology [8, 9] in order to introduce, in a transparent way, the basic notions to be used in the following. See also [1, Section 6.2].
Normative aspects of science and technology lead, in a morally pluralistic society, unavoidably to societal debates at the least and often also to conflicts over technology. We can witness recent examples in the fields of nuclear power and radioactive waste disposal, stem cell research, genetically modified organisms and reproductive cloning. As a rule, what is held to be desirable, tolerable or acceptable is controversial in society. Open questions and conflicts of this type are the point of departure for the ethics of technology [10, 11]. Technology conflicts are, as a rule, not only conflicts over technological means (e.g. in questions of efficiency), but also include diverging ideas over visions of the future, of concepts of humanity and on views of society. Technology conflicts are often controversies about futures: present images of the future, which are considerably influenced by our illustrations of the scientific and technological advance, are highly contested [12]. The role of the ethics of technology consists of the analysis of the normative structure of technology conflicts and of the search for rational, argumentative and discursive methods of resolving them. In this continental understanding, ethics is part of the philosophical profession. In ethical reflection in the various areas of application, however, there are close interfaces to, and inevitable necessities for, interdisciplinary cooperation with the natural and engineering sciences involved as well as with the humanities. Even transdisciplinary work might be included in cases of requests for broad participation, for example in the framework of participatory technology assessment [13].
Technology is not nature and does not originate of itself, but is consciously produced to certain ends and purposes, namely, to bring something about which would not happen of itself. Technology is therefore always embedded in societal goals, problem diagnoses and action strategies. In this sense, there is no "pure" technology, that is, a technology completely independent of this societal dimension. Therefore, research on and development of new technologies always refer to normative criteria of decision-making, including expectations, goals to be reached and values involved [14].
But even if technology is basically beset with values, this does not imply that every decision in research and technology development must be scrutinized in ethical regard. Most of the technically relevant decisions can, instead, be classified as a standard case in moral respect in the following sense [6, 9]: they do not subject the normative aspects of the basis for the decision (criteria, rules or regulations, goals) to specific reflection, but assume them to be given for the respective situation and accept the frame of reference they create. In such cases, no explicit ethical reflection is, as a rule, necessary, even if normative elements self-evidently play a vital role in these decisions: the normative decision criteria are clear, acknowledged and unequivocal. It is then out of the question that this could be a case of conflict with moral convictions or a situation of normative ambiguity; the information on the normative framework can be integrated into the decision by those affected and by
8.3
Ethical Aspects of Nanotechnology: An Overview
• Which are the ethical aspects of nanotechnology and related innovations in the sense defined above?
• Which of the identified ethical aspects of nanotechnology are specific for nanotechnology and novel to ethics?
• Where are relations to recent or ongoing ethical debates in other technology fields, if any?
The resulting map of ethical aspects described in the following is organized with reference to existing ethical debates (e.g., debates on privacy, equity or human nature). This classification has the advantage that it automatically allows one to refer to normative frameworks and therefore enables us to investigate whether (a) existing frameworks are sufficient to deal with the value problems involved and (b) if not, whether there are new challenges to the frameworks which are specifically caused by nanotechnology. In this way, it is possible to arrive at a structured and well-founded picture of ethical aspects in nanotechnology.2)
8.3.1
Equity: Just Distribution of Opportunities and Risks
with regard to both of these types of a potential "nano-divide" (after the well-known "digital divide") are based on the assumption that nanotechnology can lead not only to new and greater options for individual self-determination (e.g. in the field of medicine), but also to considerable improvement of the competitiveness of national economies. Current discussions on distributive justice on both national and international levels (in the context of sustainability as well) are therefore likely to gain increased relevance with regard to nanotechnology.
A specific future field of debate with respect to equity will be the human enhancement issue ([17]; see Section 8.5 of this chapter). If technologies for improving human performance were to be available, then the question arises of who will have access to those technologies, especially who will be able to pay for them and what will happen to persons and groups excluded from the benefits. There could develop a separation of the population into "enhanced" and "normal" people, where a situation is imaginable in which "normal" is used as a pejorative attribute [18], and a coercion towards enhancement might occur: "Merely competing against enhanced co-workers exerts an incentive to use neuro-cognitive enhancement and it is harder to identify any existing legal framework for protecting people against such incentives to compete" [19, p. 423]. Special problems can be expected for disabled persons [20].
Equity aspects, however, are not really new ethical aspects caused by nanotechnology, but are rather intensifications of problems of distribution already existing and highly relevant. Problems of equity belong indispensably to modern technology in general and are leading to ongoing and persistent debates in many fields. The "digital divide" [21] is, perhaps, the best-known example. But also in military respects or with regard to access to medical high-tech solutions, these inequalities exist and are debated in many areas. There is no new ethical question behind them, but there might be new and dramatic cases emerging, driven by nanotechnological R&D. This point has already arrived at the international level of the nanotech debate [22].
8.3.2
Environmental Issues3)
The limited availability of many natural resources such as clean water, fossil fuels and specific minerals highlights the importance of the efficiency of their use, of recycling and of substituting non-renewable resources by renewable ones. Many scientists and engineers claim that nanotechnology promises less material and energy consumption and less waste and pollution from production. Nanotechnology is also expected to enable new technological approaches that reduce the environmental footprints of existing technologies in industrialized countries or to allow developing countries to harness nanotechnology to address some of their most pressing needs [22]. Nanoscience and nanotechnology may be a critical enabling component of sustainable development when they are used wisely and when the social context of their application is considered [26]. There are a lot of high expectations concerning positive contributions of nanotechnology to sustainable development.
However, all the potential positive contributions to sustainable development may come at a price. The ambivalence of technology with respect to sustainable development also applies to nanotechnology [27]. The production, use and disposal of products containing nanomaterials may lead to their appearance in air, water, soil or even organisms [1, Chapter 5; 28]. Nanoparticles could eventually be transported as aerosols over great distances and be distributed diffusely. Despite many research initiatives throughout the world, only little is known about the potential environmental and health impacts of nanomaterials. This applies also and above all to substances which do not occur in the natural environment, such as fullerenes or nanotubes. The challenge of acting under circumstances of high uncertainty, but with nanoproducts already at the marketplace, is at the heart of the ethical challenges posed by nanoparticles. Because of its high relevance, and because nanoparticles and their possible risks are under intensive public observation today [29, 30], this topic will be dealt with in depth in Section 8.4.
Questions of eco- or human toxicity of nanoparticles, of nanomaterial flows, of the behavior of nanoparticles in spreading throughout various environmental media, of their rate of degradation or agglomeration and of their consequences for the various conceivable targets are, however, not ethical questions (see Section 8.2). In these cases, empirical scientific disciplines, such as human toxicology, eco-toxicology or environmental chemistry, are competent. They are to provide the knowledge basis for practical consequences for working with nanoparticles and for disseminating products based on them. However, as the debate on environmental standards for chemicals or radiation has shown [31, 32], the results of empirical research do not determine how society should react. Safety and environmental standards, in our case for dealing with nanoparticles, are to be based on sound knowledge but cannot logically be derived from that knowledge. In addition, normative standards, for example concerning the intended level of protection, the level of public risk acceptance and other societal and value-laden issues, enter the field. Because of this situation, it is not surprising that conflicts about the acceptability of risks frequently occur [33, 34], and this is obviously a non-standard situation in moral respect (see Section 8.2). Therefore, the field of determining the acceptability and the tolerability of risks of nanoparticles is an ethically relevant issue.
In particular, there are a lot of sub-questions where ethical investigation and debate are called for in the field of nanoparticles. Such questions are [1, Section 6.2]:
• What follows from our present lack of knowledge about the possible side-effects of nanoparticles? This is a challenge to acting rationally under the condition of high uncertainty, a common problem in practical ethics.
• Is the precautionary principle [35] relevant in view of a lack of knowledge and what would follow from applying this principle [28, 31, 32] (see Section 8.4)?
application is developing here for the ethics of science and technology. The type of questions posed, however, is well known from established discussions on risks (e.g. risks from exposure to radiation or to new chemicals). Really novel ethical questions are not to be expected in spite of the high practical relevance of the field. From an ethical point of view, this situation is well known: there are, on the one hand, positive expectations with regard to sustainable development, which legitimate a moral postulate to explore further and to exhaust those potentials. On the other hand, there are risks and uncertainties. This situation is the basic motivation of technology assessment (TA) as an operationalization of ethical reflections on technology [36]. The basic challenge with strong ethical support is shaping the further development of nanotechnology in the direction of sustainable development [25, 37].
8.3.3
Privacy and Control
logistics. These objects have at present a size of several tenths of a millimeter in each dimension, so that they are practically unnoticeable to the naked eye. Further miniaturization will permit further reductions in size and the addition of more functions without nanotechnology being needed, but nanotechnology will promote and accelerate these developments.
The ethically relevant questions on a right to know or not to know, on a personal right to certain data, on a right to privacy, as well as the discussions on data protection and on possible undesirable inherent social dynamisms and, in consequence, of a drastic proliferation of genetic and other tests, have been a central point in bio- and medical-ethical discussions for some time. Nanotechnological innovations can accelerate or facilitate the realization of certain technical possibilities and therefore increase the urgency of the problematics of the consequences; in this area, however, they do not give rise to qualitatively new ethical questions.
8.3.4
Military Use of Nanotechnology
Nanotechnology can improve not only multiple peaceful uses but also military and future arms systems. The foundation of the so-called Institute for Soldier Nanotechnologies (http://web.mit.edu/isn/) to enhance soldier survivability makes it clear that nanotechnology has applications in the military field. It is predicted that nanotechnology will bring revolutionary changes in this area as well [40–42]. Nevertheless, progress in a military technology will not only improve survival and healing; it always implies its use to enhance the efficacy of weapons, surveillance systems and other military equipment. As nanotechnology will provide materials and products that are stronger, lighter, smaller and more sensitive, there may be projectiles with greater velocity and smaller precision-guidance systems. Moreover, nanotechnology will influence the processing in energy generation and storage, displays and sensors, logistics and information systems, all being important elements of warfare. A particular point of interest is the use of BCI (brain-computer interaction) for navigation support for jet pilots, which is being researched by many projects funded by the US defense agency DARPA [43]. In several countries, the Departments or Ministries of Defense have arranged nanotechnological programs. No one wants to be left behind [42].
The ethical concerns related to these developments mentioned in the available literature may be classified into the following points [41, 42]:
As long as the military in general is regarded as ethically allowed, and this is the case in most concepts of ethics as far as the military is used for legitimate purposes,
these technologies will not be as long as it seems today. "Nanotechnology will soon allow many diseases to be monitored, diagnosed and treated in a minimally invasive way and it thus holds great promise of improving health and prolonging life. Whereas molecular or personalized medicine will bring better diagnosis and prevention of disease, nanomedicine might very well be the next breakthrough in the treatment of disease" [46, p. 1012].
Addressing symptoms more efficiently or detecting early onsets of diseases is without doubt recommended by ethics. These potentials are so remarkable that ethical reflection almost seems to be superfluous if one looks solely at the potentials. A comprehensive analysis, however, has to include, as noted above, also possible side-effects, especially risks [48–50]. New types of responsibility and new tasks for weighing up pros and cons might occur. For example, new forms of drug delivery based on nanotechnology (using fullerenes, nanostructured membranes, gold nanoshells, dendrimers [1, Section 3.3]) could also have consequences which are not expected and might not be welcome. Careful observations of the advance of knowledge and early investigations of possible side-effects have to be conducted. HTA (Health Technology Assessment) offers several established approaches for early warning. The ethically relevant issues are [7]:
• the gulf between diagnostic and therapeutic possibilities and the problem of undesirable information;
• data protection and the protection of privacy (see Section 8.3.3), especially the danger of genetic discrimination;
• preventive medicine and screening programs;
• increases in costs through nanomedicine and problems of access and equity (see Section 8.3.1);
• deferment of illness during the lifetime of humans;
• changes in the understanding of illness and health [52].
However, there is probably no field of science in which dealing with risks is so well established as in medicine and pharmaceutics. Advances in medicine (diagnosis and therapy) are evidently related to risks and there are a lot of established mechanisms, such as approval procedures, for dealing with them. There is nothing new about this situation. Therefore, using nanotechnology for medical purposes is a standard situation in moral respects (following the notion introduced in Section 8.2). Against this background, it seems improbable that direct applications of nanotechnology for medical purposes might lead to completely new ethical questions [47, 51]. The ethical topics to be aware of are not specific to the use of nanotechnology but are also valid for a lot of other advances in medical science and practice.
The boundaries of such a standard situation in moral respects would, however, be transgressed in some more visionary scenarios, as in the vision of longevity or the abolition of aging. Nanotechnology could, in connection with biotechnology and, perhaps, neurophysiology, build the technological basis for realizing such visions. In this respect, the idea has been proposed that nanomachines in the human body could permanently monitor all biological functions and could, in case of dysfunction, damage or violation, intervene and re-establish the correct status. In this way, an optimal health status could be sustained permanently [53], which could considerably enlarge the human lifespan. Such methods, however, would require dramatic technological progress [2, Section 7.2.3]. According to the state of present knowledge, neither a prediction of the time needed for such developments nor an assessment of their feasibility can seriously be given.
A new area that is both practically and ethically interesting, and much closer to realization, consists of creating direct connections between technical systems and the human nervous system [39, 54, 55]. There is intensive current work on connecting the world of molecular biology with that of technology. An interesting field of development is nanoelectronic neuro-implants (neurobionics), which compensate for damage to sensory organs or to the nervous system [43]. Micro-implants could restore the functions of hearing and eyesight. Even today, simple cochlear or retina implants, for example, can be realized. With progress in nano-informatics, these implants could approach the smallness and capabilities of natural systems. Because of the undoubtedly positive goals of healing and restoring damaged capabilities, ethical reflection could, in this case, concentrate above all on the definition and prevention of misuse. Technical access to the nervous system, because of the possibilities for manipulation and control which it opens up, is a particularly sensitive issue. A more complex ethical issue would be the neuro-cognitive enhancement of functions of the brain [19] (see Section 8.5).
Summing up, in the field of medical applications of nanotechnology there are, considered from the standpoint of today's knowledge, no ethical concerns which are specifically related to the use of nanotechnology. There are a lot of positive potentials which will probably also bear risks; these risks, however, might be dealt with by standard measures of risk analysis and management established in medical practice [48]. However, things might change in the more distant future (e.g., if there were to be a shift from the classical medical viewpoint to the perspective of enhancing human performance; see Section 8.5).
8.3.6
Artificial Life
Basic life processes take place on a nanoscale, because life's essential building blocks (such as proteins) have precisely this size. By means of nanobiotechnology, biological processes will, according to frequently expressed expectations, be made technologically controllable. Molecular factories (mitochondria) and transport systems, which play an essential role in cellular metabolism, can be models for controllable bionanomachines. Nanotechnology on this level could permit the engineering of cells and allow a synthetic biology constructing living systems from the bottom up or modifying existing living systems (such as viruses) by technical means. An intermeshing of natural biological processes with technical processes seems to be conceivable. The classical barrier between technology and life is increasingly being breached and crossed.
The technical design of life processes on the cellular level, direct links and new interfaces between organisms and technical systems portend a new and highly dynamic scientific and technological field. Diverse opportunities, above all in the
field of medicine, but also in biotechnology, stimulate research and research funding [1, Section 3.3]. New ethical aspects are certainly to be expected in this field. They will possibly address questions of artificial life and of rearranging existing forms of living systems by technical means, for example the reprogramming of viruses. Their concrete specification, however, will only be possible when research and development can give more precise information on fields of application and products.
Without knowing much about the products and systems emerging from the mentioned developments, there is one thing which seems clear already today. We can surely expect discussions about risks, because technically modifying or even creating life is, in moral respects, a very sensitive field [56]. The corresponding discussions of risks will have structural similarities to the discussion on genetically modified organisms (GMOs), because in both cases the "source code" of life is attacked by technical means. It could come to discussions about safety standards for the research concerned, about containment strategies, about field trials and release problems. The danger of misuse will be made a topic of debate, such as technically modifying viruses in order to produce new biological weapons that could possibly be used by terrorists. In this area of nanotechnology, opposition, rejection and resistance in society could be feared, comparable to the GMO case. There will be a demand for early dealing with possible ethical and risk problems and for public dialogue and involvement.
In spite of the partly still speculative nature of the subject, ethical reflection on the scientific advance in crossing the border between technology and life does not seem to be premature [49, 50]. There are clear indications that scientific and technical progress will intensify the (at present non-existent) urgency of these questions in the coming years. However, with regard to the questions to be answered in this section, it has to be stated that these ethical questions are not really specific to nanotechnology. The slogan of "shaping the world atom by atom" [57] does not distinguish between technology and life and is, therefore, the background to crossing the border between technology and life. But the ethical debates to be expected can rely on preceding investigations. Since the 1980s, these subjects have been repeatedly discussed in the debates on GMOs, on artificial intelligence and on artificial life. There is a long tradition of ethical thought which has to be taken into account when facing the new challenges in the field.
8.3.7
Human Enhancement4)
Within the tradition of technical progress, which has at all times transformed conditions and developments that until then had been taken as given, as unalterable fate, into influenceable, manipulable and formable conditions and developments, the human body and its psyche are rapidly moving into the dimension of the formable. The vision of enhancing human performance has been conjured
4) Because of the high relevance in current debates and the challenge to traditional thinking involved, this topic has been selected for an in-depth investigation (see Section 8.5).
8.4
Nanoparticles and the Precautionary Principle5)
times, often a wait-and-see approach had been taken. New substances have been introduced assuming that either probably no negative developments and impacts at all would occur or that, in case of adverse effects, ex post repair and compensation strategies would be appropriate. The asbestos case is one of the best-known experiences where this approach failed and where the failure had dramatic consequences [67].
Such experiences with hazards caused by new materials, by radiation or by new technologies (see [68] for impressive case studies) led to risk regulations in different fields, in order to prevent further negative impacts on health and the environment. Important areas are [1, Section 5.1]:
• regulations for workplaces with specific risk exposures (nuclear power plants, chemical industry, aircraft, etc.) to protect staff and personnel;
• procedural and substantial regulations for nutrition and food (concerning conservation procedures, maximum allowed concentrations of undesirable chemicals such as hormones, etc.) to protect consumers;
• environmental standards in many areas to sustain environmental quality (concerning ground water quality, maximum allowed rates of specific emissions from fabrication plants, power plants, heating in households, etc.);
• safety standards and liability issues to protect users and consumers (in the field of automobile transport, for power plants, engines, technical products used in households, etc.).
There are established mechanisms of risk analysis, risk assessment and risk management in many areas of science, medicine and technology, for example in dealing with new chemicals or pharmaceuticals. Laws such as the Toxic Substances Control Act in the USA [69] constitute an essential part of the normative framework governing such situations. This classical risk regulation is adequate if the level of protection is defined and the risk can be quantified as the product of the probability of the occurrence of the adverse effects multiplied by the assumed extent of the possible damage (see the illustrative formalization below). In situations of this type, thresholds can be set by law, by self-commitments or following participatory procedures; risks can be either minimized or kept below a certain level; and precautionary measures can be taken to keep particular effects well below particular thresholds by employing the ALARA (as low as reasonably achievable) principle [63]. Insofar as such mechanisms are able to cover the challenges at hand to a sufficient extent, there is a standard situation in moral respect (see Section 8.2) regarding the risk issue. As the ongoing debate shows (and as will be explained in more detail below), the field of risks of nanoparticles will not be a standard situation in this sense.
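In a minimal formalization, with symbols that are assumptions of this sketch rather than notation used in the chapter, the classical quantification just described reads:

```latex
% Sketch of the classical risk quantification described above; the notation
% is assumed for illustration, not taken from the chapter.
% p_i: probability of occurrence of adverse effect i
% D_i: assumed extent of the possible damage of effect i
% R_max: maximum tolerable risk fixed by the defined level of protection
R = \sum_i p_i \, D_i , \qquad \text{regulation requires } R \le R_{\mathrm{max}} .
% The ALARA principle additionally demands that R be kept as low as
% reasonably achievable below this threshold.
```

The non-classical situations discussed next are precisely those in which the probabilities or the extent of damage in such a formula cannot be meaningfully quantified.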
As soon as the conditions for the classical risk management approach are no longer fulfilled, uncertainties and ambivalent situations are the consequence. This is the case if, on the one hand, scientific knowledge concerning possible adverse effects is not available at all, is controversial or highly hypothetical, or if empirical evidence is still missing. On the other hand, classical risk management might not be applicable if adverse effects could have catastrophic dimensions with respect to the extent of possible damage, even in the case of (nearly) arbitrarily small probabilities of their occurrence. In the field of nuclear power plants, scenarios of this type have been used as counter-arguments against that technology. The catastrophic dimension of possible, at least thinkable, accidents should, in the eyes of opponents, be a decisive argument even in the case of an extremely low probability of such events. This type of situation motivated, for example, Hans Jonas [60] to postulate a "heuristics of fear" and the obligation to use the worst scenario as the orientation for action.
In the philosophical debate, however, it became clear that Jonas's approach, besides the inherent philosophical problems of its naturalistic and teleological underpinnings, might be very appropriate for raising awareness with regard to precautionary situations but is completely inadequate for operationalization by regulation. Jonas's approach completely missed a legitimate procedure for deciding about the applicability and adequacy of precautionary strategies. What can still be learned from Jonas's work is the high relevance of normative reflection in cases where classical risk management is no longer adequate [35]. Such situations are often welcome entry points for ideology and interest-driven statements in arbitrary directions. In fact, it is very difficult to identify what a rational approach to dealing with non-classical situations could be and in which way it could be proven to be rational [33].
The observation that, in many cases, severe adverse effects in the course of the introduction of new materials had not been detected at an early stage but rather led to immense damage to human health, the environment and also the economy [68] motivated debates about precautionary regulation measures which could be applied in advance of having certain and complete knowledge, because it might otherwise be too late to prevent damage. Wide international agreement on the precautionary principle was reached during the Earth Summit [United Nations Conference on Environment and Development (UNCED)] in Rio de Janeiro in 1992 and became part of Agenda 21: "In order to protect the environment, the precautionary approach should be widely applied by States according to their capabilities. Where there are threats of serious or irreversible damage, lack of full scientific certainty shall not be used as a reason for postponing cost-effective measures to prevent environmental degradation" [70]. The precautionary principle was incorporated in 1992 in the Treaty on the European Union: "Community policy on the environment shall aim at a high level of protection taking into account the diversity of situations in the various regions of the Community. It shall be based on the precautionary principle . . ." (Article 174).
The precautionary principle substantially lowers the (threshold) level for action by governments (see [35] for the following). It considerably alters the situation in comparison with the previous context, in which politicians could use (or abuse) a persistent dissent among scientists as a reason (or excuse) simply not to take action at all. In this way, political action could simply come much too late. It is, however, a difficult task to establish legitimate decisions about precautionary measures without either running into the possibly high risks of a wait-and-see strategy or overstressing precautionary argumentation, with the consequence of no longer being able to act at all or of causing other types of problems (e.g., for the economy) without need. The following characterization of the precautionary principle shows, in spite of the fact that it still does not cover all relevant aspects, the complex inherent structure of the precautionary principle:
Following the preceding analysis, the central questions in the current situation concerning the use of nanoparticles and the knowledge about possible impacts are as follows [66]:
1. Is there a precautionary situation at all, characterized by epistemic uncertainty or unquantifiable risk?
2. Is there reasonable concern for the possibility of adverse effects in the field of nanoparticles to legitimate the application of the precautionary principle?
3. If yes, what would follow from this judgment with respect to adequate precautionary measures?
The first question has to be answered positively (e.g. [1, Section 5.2; 27]). There are unknown or uncertain cause-effect relations and still unknown probabilities of risks resulting from the production, use and proliferation of nanoparticles. The scope of possible effects, their degrees and the nature of their seriousness (in relation to the chosen level of protection) can currently, even in the best cases, only be estimated in qualitative terms. Therefore, we are witnessing a typical situation of uncertainty where established risk management strategies cannot be applied [3, p. 4f; 27].
The second question, whether there is reasonable concern for the possibility of adverse effects in the field of synthetic nanoparticles, is also to be answered in the affirmative. There are first toxicological results from the exposure of rats to high concentrations of specific nanoparticles which showed severe, up to lethal, consequences. Because the exposure concentrations were extremely high, and because the transfer of knowledge gained from the exposure of rats to the situation of humans is difficult, these results do not allow the conclusion of evidence of harm to human health, but they do support assuming reasonable concern for the possibility of adverse effects caused by synthetic nanoparticles. This is strictly different from classical risk management, where evidence of adverse effects is required, not (only) evidence of their possibility.
The third question is the most difficult one. In the precautionary situation given by positive answers to the first two questions, the challenge is to identify a rational course of action. The suspicion of adverse effects of nanoparticles could serve as a legitimating reason for very strict measures such as a moratorium on nanoparticle use [29, 30]. However, a mere suspicion (which would be sufficient in the above-mentioned Jonas scenario) does not legitimate strict precautionary measures. Instead, this would depend on a scientific assessment of the state of the art and of the quality of the information available [35]. This scientific assessment of the knowledge available for the nanoparticle case has recently been performed [1, Section 5.2] in a comprehensive manner. The result was that there are indications of nanoparticle risks for health and the environment in some cases, but that there is, based on the state of the art, no reason for serious concern about adverse effects, though well-grounded serious concern about the possibility of such effects. In the same direction: "Taking into account our present-day knowledge, there is, with regard to nano-specific effects (excluding self-organization effects and cumulative effects of mass production), no reason for particularly great concern about global and irreversible effects of the specific technology per se, with it being on a par with the justifiable apprehension concerning nuclear technology and genetic engineering" [27, p. 16].
The mere possibility of serious harm implied by a wider use of nanoparticles, however, does not legitimize using the precautionary principle as an argument for a moratorium or other prohibitive measures. However, because the state of knowledge changes permanently, continuous monitoring and assessment of the knowledge production concerning impacts of nanoparticles on human health and the environment are urgently required. One of the most dramatic lessons which can be learned from the asbestos story is exactly this lesson about the crucial necessity of such a systematic assessment [67].
The next question is what other (and weaker) types of measures are required in the present (precautionary) situation. These could, for example, be self-organization measures in science, such as the application of established codes of conduct or the adaptation of existing regulation schemes to the special features of nanoparticles. It seems helpful to reconsider, as an example, the implementation of the precautionary principle in the case of genetically modified organisms (GMOs): "The European Directive 2001/18 (superseding Directive 90/220), concerning the deliberate release of GMOs into the environment, is the first piece of international legislation in which the precautionary principle is translated into a substantial precautionary framework. . . . In the framework of this Directive the precautionary principle is translated into a regulatory framework which is based on a so-called case-by-case and a step-by-step procedure. The case-by-case procedure facilitates a mandatory scientific evaluation of every single release of a GMO. The step-by-step procedure facilitates a progressive line of development of GMOs by evaluating the
8.5
Human Enhancement by Converging Technologies
8.5.1
Human Enhancement: Visions and Expectations
Human enhancement is a very old theme. Mankind's dissatisfaction with itself has been known from ancient times: discontent with mankind's physical endowments, its physical and intellectual capabilities, with its vulnerability to exogenic eventualities such as disease, with the inevitability of aging and, finally, of death, and dissatisfaction with its moral capacities or, and this will probably be particularly frequent, with one's physical appearance. Various methods have been developed and established in order to give certain wishes for improvement a helping hand. Today's esthetic surgery, as a kind of business with considerable and further growing returns, is at present probably the most widespread method of human enhancement. Pharmaceuticals enhancing the performance of the mind or preventing tiredness are increasingly used [19]. Extending the physical limits of human capabilities through intensive training in competitive sports can also be understood as enhancement. Making use of technical means for improving performance in sport (doping), however, is being practiced, as we often can read in the newspapers, but is still held to be unsportsmanlike and unethical.
If the types of enhancement just listed apply to individuals (high athletic performance, individual beauty), collective human enhancement is, in its turn, also no new topic. Humankind's often deplored defects in terms of morals or civilization led, for example, in the European Enlightenment, which was progress-optimistic with regard to morality as well, to approaches towards trying to improve humankind as a whole through education. Beginning with the individual, above all in school education, far-reaching processes towards the advancement of human civilization were to be stimulated and supported.
In the twentieth century, the totalitarian regimes in the Soviet Union and in the German Nazi Reich also applied strategies for "improvement" in line with their respective ideologies, related either to ideas of socio-Darwinist racism and anti-Semitism or to the orthodox communist and anti-bourgeois ideology. These historical developments show that strategies of improvement must be scrutinized very carefully with respect to possibly underlying totalitarian ideologies.
In the current discussion of human enhancement, it is a question neither of improvement through education and culture nor of improvement by indoctrination or power, but of technical improvement. Initiated by new scientific visions and utopias under discussion, completely new possibilities of human development have been proposed. The title of the report of an American research group to the National Science Foundation conveys its program: Converging Technologies for Improving Human Performance [17]. Nanotechnology and the Converging (NBIC) Technologies offer, according to this report, far-reaching perspectives for perceiving even the human body and mind as formable, for improving them through precisely targeted technical measures and, in this manner, also for increasing their societal performance. Strategies of enhancement start at the individual level but aim, in the last consequence, at the societal level: "Rapid advances in convergent technologies have the potential to enhance both human performance and the nation's productivity. Examples of payoff will include improving work efficiency and learning, enhancing individual sensory and cognitive capacities, revolutionary changes in healthcare, improving both individual and group efficiency, highly effective communication techniques including brain-to-brain interaction, perfecting human–machine interfaces including neuromorphic engineering for industrial and personal use, enhancing human capabilities for defense purposes, reaching sustainable development using NBIC tools and ameliorating the physical and cognitive decline that is common to the aging mind" [17, p. 1]. Among these proposed strategies of enhancement are the following:
- The extension of human sensory faculties: the capabilities of the human eye can be augmented, for example, with respect to visual acuity ("Eagle Eye") or with regard to a night-vision capability by broadening the visible electromagnetic spectrum in the infrared direction; other sensory organs, such as the ear, could likewise be improved, or completely new sensory capabilities, such as the sonar sense of bats, could be made accessible to human beings.
moral respect (see Section 8.2) and that, therefore, this is a field where ethical reflection is required in spite of the partially speculative status (see Section 8.3.7).
8.5.2
Occasions of Choice and Need for Orientation
Scientific and technical progress leads to an increase in the options for human action and therefore has, at first sight, an emancipatory function: the possibilities for acting and deciding are augmented, while the conditions which have to be endured as unalterable diminish. Whatever had been inaccessible to human intervention, whatever had to be accepted as non-influenceable nature or as fate, becomes an object of technical manipulation or shaping. This is an increase of contingency in the conditio humana, a broadening of the spectrum of choices possible among various options [58].
Influencing the faculties of the healthy human body in the form of an improvement can be shown to be the logically consistent continuation of scientific and technical progress [76]. Up to now, the physical capabilities of healthy humans have had to be taken as given, as a heritage of the evolution of life. The sensory capabilities of the eye or the ear, for example, cannot be extended (except by technical means outside the human body, such as microscopes). It would be an act of emancipation from Nature to be able to influence these capabilities intentionally by technical means. New occasions of choice would appear: in which directions would an enhancement be sensible, which additional functions of the human body should be realized, and so on? Also, more individuality could be the result if decisions on specific enhancements could be made at the individual level. In this way, human enhancement seems to contribute further to realizing grand normative ideas of the Era of Enlightenment.
However, there is also another side of the coin. The first price to be paid is the dissolution of established self-evidence about human beings and their natural capabilities. The increase of contingency also means an increased need for orientation and decision-making [58]. Second, the outcomes of those decisions about technical improvements can then be attributed to human action and decision; increased accountability and responsibility are the consequence [77]. Third, not only are new options opened but existing ones might be closed. For example, it is rather probable that disabled persons in a society using enhancement technologies to a large extent would have greater problems in conducting their lives [20]. Furthermore, problems of access and equity would arise to an increasing extent [18]. This ambivalent situation is characteristic of many modern technologies and does not per se legitimize arguments against enhancement, but it calls for attention and awareness and also for rational debates about dealing with the dark side of enhancement technologies.
8.5.3
Human Enhancement: No Simple Answers from Ethics
Increasing humankind's occasions of choice and its emancipation from the givenness of its biological and intellectual constitution also bring about uncertainties, a loss of securities which had been unquestioned up to now, and the need for new orientation in answering the above-mentioned general questions, as well as other, more specific questions, for example concerning neuro-implants: "However, ethical dilemmas regarding the enhancement of the human brain and mind are potentially more complex than, for example, genetically enhancing one's growth rate, muscle mass or even appearance because we primarily define and distinguish ourselves as individuals through our behavior and personality" [55]. This diagnosis poses the question of how far humans may, should, or want to go in the (re-)construction of the human body and mind with the aim of enhancement [18, 74]. In advance of identifying the ethical questions involved, it seems appropriate to highlight an anthropological aspect affected: the difference between healing and enhancement.
Legitimate interventions into the human body and mind are at present carried out with the aims of healing or of preventing disease or deficiencies. Improving human beings is, as yet, not a legitimate aim of medicine. Although the borderline between healing and enhancing interventions can hardly be drawn unambiguously [74], and although the terms "health" and "illness" are not well clarified [20, 52], there is obviously a categorical difference between the intentional enhancement of human beings and the healing of disorders [20]. Healing orients itself conceptually on a condition of health held to be ideal. This can either be explicitly defined or merely implicitly understood; in both cases, healing means closing the gap between the actual condition and the assumed ideal condition. What is to be understood under the ideal condition has certainly been defined culturally in different manners in the course of history. In each individual case, however, it is, at least in context, obvious enough. The ophthalmologist who subjects his patient to an eye test has a conception of what the human eye should be able to perceive. He or she will propose technical improvements of the current state (e.g., a pair of spectacles) only for deviations from this conception and only from a certain degree of deviation onwards. The purpose of such measures is to restore the normal state, which may succeed more or less well.
Traditional medical practice is probably unimaginable without this manner of thinking, in which a normal or ideal state serves in the background as a normative criterion for defining deviation. Medical treatment does not extend beyond this normal or ideal state. Precisely this way of thinking, essential for medical practice, would probably become meaningless in view of the possible technical improvement of human beings. A situation would develop in which normative frameworks for technical interventions into the human body and mind would be needed but would not be available, because this is a completely new situation in moral respects. This poses the question of how far humans may, should, or want to go in the (re-)construction of the human body and mind with the aim of improving them.
Ethical issues raised by this situation have been characterized with regard to neurocognitive enhancement [19]. This classification can be extended to other forms of enhancement as well:
- Safety: risks of enhancement might consist of unintended side-effects for the brain of the person to be enhanced (e.g., by unsuccessful technical interventions or in the long term).
- Personhood and intangible values: the way in which we see ourselves as humans and our imagination of a good life could be changed.
First, we have to draw attention to the fact that the spontaneous rejection with which the concept of human enhancement is often confronted in the population is, in itself, no ethical argument. The fact that we are not accustomed to dealing with the enhancement issue and the cultural alienness of the idea of technically enhanced human beings are social facts and are quite understandable, but as such they have only limited argumentative force. Spontaneous rejection might occur only because thinking about technical enhancement feels uncomfortable and unfamiliar, and it could change as one becomes more familiar with the idea. Therefore, feelings and intuitions are factual, but they still have to be scrutinized as to whether there are ethical arguments hidden behind them at all and how strong these arguments would be.
The often-mentioned assertion that a human being's naturalness would be endangered or even eliminated by technical improvement is also no strong argument per se. Humankind's "naturalness" and "culturality" are competing and partially linked patterns of interpretation of the human condition. Using humankind's naturalness as an argument in the sense that we should not technically improve the evolutionarily acquired faculties of sight, hearing, thinking, and so on, just because they are naturally developed and evolutionarily adapted, would be a naive naturalistic fallacy: from the fact that we find ourselves to be human beings, for instance, with eyes which function only within a certain segment of the electromagnetic spectrum, nothing follows normatively at all. Limiting human capabilities to the naturally given properties would reduce humans to museum pieces and would ignore the cultural aspects of being human, to which also belongs transcending the status quo, that is, thinking beyond what is given, as has been the thesis of many writers on philosophical anthropology, such as Arnold Gehlen.
From these considerations it cannot be concluded that human enhancement is permitted or even imperative. It merely follows that one should not make an ethical repudiation too easy for oneself. Strong imperative arguments are, in fact, not in sight [18]. However, argumentatively, the repudiation front is not very strong either. It points to a great extent to the consequences of enhancement, consequences which, like the fears of an increasing separation of society [18], are largely hypothetical and which can therefore provide only very general and provisional orientation. In the final analysis, the ethical debate seems to narrow itself down to single-case argumentation: which concrete improvement is meant, which goals and purposes are connected with it, which side-effects and risks are to be apprehended, and the question of weighing up these aspects against the background of ethical theories, such as Kantianism or utilitarianism. Universally applicable verdicts, such as a strong imperative duty or a clear rejection of any improvement whatsoever, seem at present to be scarcely justifiable. What follows from this situation for the future is the responsibility to reflect on the criteria for the desirability or acceptability of concrete possibilities for enhancement. A great deal of work still lies ahead for ethics [75, 78].
Recently, it has been proposed to structure the field of possible ethical standpoints and perspectives with regard to enhancement technologies in an instructive way ([79, p. 9]; applied there to cognitive enhancement, but the scheme seems to be more generally usable).
The decision to take one of these perspectives and to work with it in conceptualizing future developments in this field depends on the assessment of the strength of ethical arguments pro and con, and also on images of human nature.
8.5.4
Enhancement Technologies: A Marketplace Scenario Ahead?
What would follow from the description of the current debate given above? In order to give an answer, we should take a brief look at the field of human cloning, which is not a technical enhancement of humans but would be a deep technical intervention into human life. Against the background of many ethical positions, for example Kantian ethics [74], human cloning is regarded as a form of instrumentalization of humans. The genetic disposition of an individual would be intentionally fixed by cloning. The persons affected would not be able to give their agreement to being cloned in advance. Cloning means an intentional and external determination of the genome of a later individual. Informed consent, that is, informing the affected person about the cloning process and its impacts and obtaining that person's agreement, would not be possible because the cloning has to be done at the early embryo phase. Therefore, human (reproductive) cloning and research to this end were banned in many countries soon after the cloned sheep Dolly had been presented.
The ethical debate about human enhancement is, so far, completely different from the debate about human cloning. A restrictive regulation of human enhancement technologies and of the research which could enable them, similar to the ban on human cloning technologies, has not been postulated yet. What makes the difference in the debate about regulation, while the intuitive and spontaneous public reactions are similar in both fields? The thesis is that the ethical situation itself is completely different.
In the debate on human cloning, there is a strong ethical argument which motivated severe concern. The postulate of human dignity implies, in many interpretations, the idea that humans must not be instrumentalized without their consent. But cloning means intentionally determining the genome of an individual without any chance of consent in advance. In contrast to usual fertilization, which includes a high degree of statistical influence, cloning would double or multiply a well-known genome. This would imply a determination of the developing persons without any chance of reversibility [74, 80, 81]. This point seems to be the ethical reason behind the strong regulatory measures which were taken extremely quickly.
In the field of human enhancement, however, things are completely different in this respect. Human enhancement in the fields mostly mentioned [1] would be applied to adults. Additional or improved sensory capabilities, for example, would be implemented in the same way as other medical procedures today: the candidates would be informed about the operational approach, about costs, about the process of adapting to the new features, as well as about possible risks and side-effects. After having received such information, the enhancement candidates would either leave the "enhancement station" (in order not to use the term hospital) or would sign a letter of agreement and thereby give their informed consent.
Ethical argumentation, therefore, will be performed using different types of arguments. The arguments given so far are mostly concerned with possible impacts of technical enhancement. Foreseeable problems of distributive justice and equity are often mentioned [18, 50, 75]. Such argumentations, however, work with highly uncertain knowledge about future impacts of enhancement technologies. As can be learned from the history of technology assessment, it is very difficult to assess the consequences of completely new technologies where no or only little experience is available as a basis for projections. In such cases, normative biases, pure imagination and ideology are difficult to separate from what could be derived from scientific knowledge. Often, an argument can be countered by a contradicting argument, for example, in the case of coercion mentioned above: "The straightforward legislative approach of outlawing or restricting the use of neurocognitive enhancement in the workplace or school is itself also coercive. It denies people the freedom to practice a safe means of self-improvement, just to eliminate any negative consequences of the (freely taken) choice not to enhance" [19, p. 423]. Another example would be the problem of equity: "Unequal access is generally not grounds for prohibiting neurocognitive enhancement, any more than it is grounds for prohibiting other types of enhancement, such as private tutoring or cosmetic surgery . . . . In comparison with other forms of enhancement that contribute to gaps in socioeconomic achievement, from good nutrition to high-quality schools, neurocognitive enhancement could prove easier to distribute equitably" [19, p. 423]. Or take arguments which make use of a possible change of the personhood of people by enhancement: "And if we are not the same person on Ritalin as off, neither are we the same person after a glass of wine as before, or on vacation as before an exam" [19, p. 424]. Therefore, ethical analysis building on such highly uncertain assumptions about possible impacts can only constitute weak ethical arguments, pro enhancement as well as contra.
In this situation, it is rather probable that the introduction of enhancement technologies would happen according to the economic market model. Enhancement technologies would be researched, developed and offered by science and transferred to the marketplace. History shows (see Section 8.5.1) that a demand for such enhancement technologies is imaginable, as is currently the case in the field of esthetic surgery. Public interest would be restricted to possible situations of market failure. Such situations would be of the types discussed above: problems of equity, prevention of risks, questions of liability or avoidance of misuse. This scenario could be perceived as defensive with respect to current ethical standards, as cynical or as poor in ethical respects. Many people might feel uneasy about it. However, at the moment, such a scenario would fit the state of ethical analysis of technical enhancement of the human body and mind.
Such a scenario, however, need not be equivalent to a laissez-faire model of the use of enhancement technologies [79; see the quotation at the end of Section 5.3]. A marketplace scenario, too, would have to be embedded in a societal environment consisting of normative frameworks [15], ethical standards and regulation. To clarify the ethical issues involved in enhancement technologies and their relations to social values and to regulation is, to a large extent, still a task ahead.
8.6
Conceptual and Methodical Challenges
It is a characteristic trait of modern societies that they draw the orientation needed for opinion formation and decision-making increasingly from debates about future developments and less and less from existing traditions and values [77, 82]. Modern secular and scientificized society generally orients itself less on the past and more on wishes and hopes, but also on fears, with regard to the future. The notion of a risk society [77] and the global movement towards sustainable development [25] are examples of this approach. Ethical reflection, therefore, has to be related to communication about those future prospects. The necessity of providing orientation in questions such as human enhancement leads to the methodical challenge of applying the moral point of view to non-existing, merely projected ideas, with their own uncertainties and ambiguities.
8.6.1
Ethical Assessments of Uncertain Futures
Ethical analysis of nanotech issues is to a large extent confronted with the necessity of dealing with, in part far-reaching, future projections [58, 76]. Ethical inquiry in fields such as artificial life or human enhancement takes elements of future communication, such as visions, as its subject. Providing orientation then means analyzing, assessing and judging those far-reaching future prospects on the basis of today's knowledge and from today's moral point of view.
Frequently, the futures used in the debate on nanotech and society differ to a large extent and sometimes waver between expectations of paradise and fears of apocalyptic catastrophes. Expectations which regard nanotech as the anticipated solution of all of humanity's problems, standing in Drexler's [53] technology-optimistic tradition, contrast radically with fears expressed in the technology-skeptical tradition of Joy's [83] line of argumentation, where self-reproducing nanorobots are no longer simply a vision which is supposed to contribute to the solution of humanity's gravest problems [53], but are communicated in public partially as a nightmare. The visionary pathos in many technical utopias is extremely vulnerable to the simple question of whether everything could not just be completely different, and it is as good as certain that this question will be asked in an open society. But as soon as it is posed, the hoped-for effect of futuristic visions evaporates and can even turn into its opposite [1, Section 5.3].
The general problem is that the spectrum of proposed future projections often seems to be rather arbitrary, ranging between expectations of paradise and apocalypse and warning against catastrophes in completely diverging directions. One example will be given: the uncertainty of our knowledge about future developments of nanotechnology and its consequences, in connection with the immense imagined potential for damage, of possibly catastrophic effects, is taken as an occasion for categorizing even the precautionary principle (see Section 8.4) as insufficient for handling these far-reaching future questions [61, 62]. Instead, the authors' view of society's future with nanotechnology leaves open solely the existential renunciation of nanotechnology, going beyond even Hans Jonas' Imperative of Responsibility [60], inasmuch as they formulate a duty to expect the catastrophe in order to prevent the catastrophe (an analysis of the inherently contradictory structure of this argument is included in [76]). It seems interesting that the argumentative opponents, the protagonists of human enhancement by converging technologies, also warn against catastrophes, but only in the opposite sense: "If we fail to chart the direction of change boldly, we may become the victims of unpredictable catastrophe" [17, p. 3]. If, however, the ultimate catastrophe is cited in both directions as a threat, this leads to an arbitrariness of the conclusions. Ethics, therefore, is confronted not only with problems of judging states or developments by applying well-justified moral criteria but also with the necessity of assessing the futures used in these debates with respect to their rationality.
This situation leads to new methodical challenges for ethical inquiry. In particular, the uncertainty of the knowledge standing behind the future prospects makes it difficult to assess whether a specific development at the human–machine interface, or concerning the technical improvement of the human body, should be regarded as science fiction (SF), as a technical potential in a far future, as a somewhat realistic scenario or as probably becoming part of reality in the near future. Ethical analysis has to take this uncertainty into account and has to be combined with an epistemology of future projections [76] and with new types of future knowledge assessment [84]. In the following, we will propose an Ethical
Far-reaching visions have been put forward in the nanotech debate since its very beginning. A new paradise has been announced since the days of Eric Drexler's book Engines of Creation [53]. This line of story-telling about nanotech has been continued in the debate on human enhancement by converging technologies: "People will possess entirely new capabilities for relations with each other, with machines and with the institutions of civilization. . . . Perhaps wholly new ethical principles will govern in areas of radical technological advance, such as the routine acceptance of brain implants, political rights for robots and the ambiguity of death in an era when people upload aspects of their personalities to the Solar System Wide Web" [17, p. 19]. It might seem that such speculations should not, or could not, be subject to ethical inquiry at all.
Because of the power of visions in societal debates (for example, for research funding, for motivating young scientists and for public acceptance), it is of great importance to scrutinize such visions carefully instead of dismissing them as obscure and fantastic. In spite of the futuristic and speculative nature of those visions (see [85] for the notion of futuristic visions and [76] and [1] for first steps toward an epistemological analysis of future knowledge), an early ethical engagement with nanotech visions would be an important and highly relevant task in order to allow a more transparent debate, especially about science's agenda [1, 71]. There is a need to make those visions more specific, to extract their epistemic and normative key assumptions and ideas and to relate them to other key issues in public debate. Visions in nanotechnology are very different in nature and often refer to more general ideas [49].
As can easily be seen, there are some similarities to the situation described above in the field of nanotech: far-reaching visions and expectations, uncertainty of knowledge, diverse positions of different actors and the aim of influencing agenda setting. Obviously, there are also differences: usually, ethical reflection is not part of the game in foresight processes, whereas regional cooperation, the creation of new networks, mobilization and contributing to economic welfare by exploiting the chances of new technologies and of regional resources are major issues. The reason for bringing the more philosophical issue of assessing and reflecting on nanotech visions with respect to epistemological and ethical questions together with established foresight methodologies lies in the common challenge of having to deal rationally with highly uncertain future knowledge and with the inseparably interwoven normative elements of expectations, desires and fears often involved. Giving advice to the scientific agenda, for example via contributing to the processes of defining issues and priorities of the public funding of science and technology, following an open and democratic debate, is therefore difficult to achieve.
In various foresight exercises, it has been a common experience that, in order to arrive at a workable view of the future in the respective field, it is crucial to focus on a concrete and tangible topic. Against this experience, it seems impossible to apply a foresight exercise to such a broad and grand topic as nanotechnology. Instead, more specific subtopics should be addressed, such as the use of nanoparticles in cosmetics, developments towards artificial life, or opportunities and threats concerning the equity of access to nanotech benefits. Then it is imaginable to set up a foresight process which would:
- include research and reflection parts, such as the vision assessment activities described above;
- involve other societal groups (stakeholders, customers, policymakers, regulators, business people, etc.);
- provide a balanced and ethically as well as epistemologically reflected view on the respective part of the nanotech field, which could then be used as a valuable input for debates about science's agenda in this field.
Since the very beginning of ethical reflection on science and technology, there has been an ongoing discussion about an adequate relation in time between scientific-technological advances and ethics. Ethics often seems to pant helplessly behind technical progress and to fall short of the occasionally great expectations [88]. The rapid pace of innovation has the effect that ethical deliberations often come too late: when all of the relevant decisions have already been made, when it is long since too late for shaping technology. Technological and scientific progress sets facts which, normatively, can no longer be revised [74]: "It is a familiar cliché that ethics does not keep pace with technology" [38]. This "ethics last" model means that first there have to be concrete technological developments, products and systems, which could then be reflected upon by ethics. Ethics, in this perspective, could at best act as a repair service for problems which are already on the table.
In contrast, the "ethics first" model postulates comprehensive ethical reflection on possible impacts already in advance of technological developments. Ethics actually can provide orientation in the early phases of innovation, for example because future projections and visions emerging on the grounds of scientific and technical advances may be subject to ethical inquiry. Because early ideas about the scientific and technical knowledge and capabilities, as well as about their societal impacts (risks as well as chances), are available long before market entry, it is possible to reflect on and discuss their normative implications. For example, Jonas worked on ethical aspects of human cloning long before cloning technology was available even in the field of animals. Obviously, ethical reflection in this model has to deal with the situation that the knowledge about technology and its consequences is uncertain and preliminary.
This does not necessarily mean that ethical deliberations have to be made for absolutely every scientific or technical idea. The problems of a timely occupation with new technologies appear most vividly in the diverse questions raised by the visions of salvation and horror regarding nanotechnology and human enhancement. What sense is there in concerning oneself hypothetically with the ethical aspects of an extreme lengthening of the human life span or with self-replicating nanorobots [38]? The "ethics first" perspective is exaggerated in these cases to such an extent that any relevance is lost. Most scientists are of the opinion that these are speculations which stem from the realm of science fiction rather than from problem analysis that is to be taken seriously: "If discussions about the ethics and dangers of nanotechnology become fixated on a worry that exists only in science fiction, we have a serious problem" [89]. We should not forget that ethical reflection binds resources, and there should therefore be certain evidence for the validity of these visions if resources, which could then be lacking elsewhere, are to be invested in them. Therefore, methods of assessing visions of human enhancement are required which allow for an epistemological investigation of the visions under consideration (see the sections above).
Ethical judgment in very early stages of development could provide orientation for shaping the process of scientific advance and technological development (e.g., with regard to questions of equity or risks of misuse). In the course of the continuing concretization of the possibilities for applying nanotechnologies, it is then possible to continuously concretize the at first abstract estimations and orientations on the basis of newly acquired knowledge, and finally to carry out an ethically reflected technology assessment. In this way, ethical analysis is an ongoing process accompanying scientific and technological advance.
Due to the early stage of development of nanotechnology and the converging technologies, we have here a rare case of an advantageous opportunity: there is the chance, and also the time, for concomitant reflection, as well as the opportunity to integrate the results of reflection into the scientific agenda and technology design, and thereby to contribute to the further development of science and technology [38]. In view of the visionary nature of many of the prospects in nanotechnology and of the long spans of time within which the realization of certain milestones can be expected, there is, in all probability, enough time to analyze the questions posed. In general, this reflective discussion should take place already in the early phases of development, because then the possibilities for influencing the process of scientific development are greatest. The chances are good that, in the field of nanotechnology, ethical reflection and the societal discussion will not come too late, but can accompany scientific-technical progress critically and can, in particular, help to influence science's agenda through ethically reflected advice.
References
1 Schmid, G., Brune, H., Ernst, H., Grünwald, W., Grunwald, A., Hofmann, H., Janich, P., Krug, H., Mayor, M., Rathgeber, W., Simon, B., Vogel, V. and Wyrwa, D. (2006) Nanotechnology. Assessment and Perspectives, Springer, Berlin.
2 Paschen, H., Coenen, C., Fleischer, T., Grünwald, R., Oertel, D. and Revermann, C. (2004) Nanotechnologie. Forschung und Anwendungen, Springer, Berlin.
3 Royal Society (2004) Nanoscience and Nanotechnologies: Opportunities and Uncertainties, Royal Academy, London.
4 Nanoforum (2004) Nanotechnology. Benefits, Risks, Ethical, Legal and Social Aspects of Nanotechnology, http://www.nanoforum.org, accessed 2 October 2006.
5 Mnyusiwalla, A., Daar, A.S. and Singer, P.A. (2003) Mind the gap. Science and ethics in nanotechnology. Nanotechnology, 14, R9–R13.
6 Grunwald, A. (2005) Nanotechnology: a new field of ethical inquiry? Science and Engineering Ethics, 11, 187–201.
7 Ach, J.S. and Jomann, N. (2006) Size Matters. Ethical and Social Challenges of Nanobiotechnology: an Overview, LIT, Münster.
14 van Gorp, A. (2005) Ethical Issues in Engineering Design: Safety and Sustainability, Simon Stevin Series in the Philosophy of Technology, Delft.
15 van Gorp, A. and Grunwald, A. (2008) Ethical responsibilities of engineers in design processes: risks, regulative frameworks and societal division of labour, in preparation.
16 Gethmann, C.F. (1994) Die Ethik technischen Handelns im Rahmen der Technikfolgenbeurteilung, in Technikbeurteilung in der Raumfahrt: Anforderungen, Methoden, Wirkungen (eds A. Grunwald and H. Sax), Edition Sigma, Berlin, pp. 146–159.
17 Roco, M.C. and Bainbridge, W.S. (eds) (2002) Converging Technologies for Improving Human Performance, National Science Foundation, Arlington, VA.
18 Siep, L. (2005) Die biotechnische Neuerfindung des Menschen, presented at the XX. Deutscher Kongress für Philosophie, Berlin.
19 Farah, M.J., Illes, J., Cook-Deegan, R., Gardner, H., Kandel, E., King, P., Parens, E., Sahakian, B. and Wolpe, P.R. (2004) Neurocognitive enhancement: what can we do and what should we do? Nature Reviews Neuroscience, 5, 421–425.
20 Wolbring, G. (2005) The Triangle of Enhancement Medicine, Disabled People and the Concept of Health: A New Challenge for HTA, Health Research and Health Policy, Research Paper, Alberta; http://www.cspo.org/ourlibrary/documents/HTA.pdf, accessed 27 July 2007.
21 Krings, B.-J. and Riehm, U. (2006) Die Nutzung und Nichtnutzung des Internets. Eine kritische Reflexion der Diskussion zum Digital Divide, in Netzbasierte Kommunikation, Identität und Gemeinschaft. Net-based Communication, Identity and Community (eds U. Nicanor and A. Metzner-Szigeth), Berlin, pp. 233–251.
22 Salamanca-Buentello, F., Persad, D.L., Court, E.B., Martin, D.K., Daar, A.S. and Singer, P.A. (2005) Nanotechnology and the developing world. PLoS Medicine, 2 (5), e97.
www.nanotechweb.org/articles/column/1/1/11/1, accessed 2 December 2006.
43 Stieglitz, T. (2006) Neuro-technical interfaces to the central nervous system. Poiesis & Praxis, 4, 95–110.
44 Freitas, R.A. (1999) Nanomedicine, Volume I: Basic Capabilities, Landes Bioscience, Georgetown, TX.
45 European Technology Platform (2006) Nanomedicine: Nanotechnology for Health, European Commission, Luxembourg.
46 Kralj, M. and Pavelic, K. (2003) Medicine on a small scale. How molecular medicine can benefit from self-assembled and nanostructured materials. EMBO Reports, 4, 1008–1012.
47 Farkas, R. and Monfeld, C. (2004) Ergebnisse der Technologievorschau Nanotechnologie pro Gesundheit 2003. Technikfolgenabschätzung: Theorie und Praxis, 13, 42–51.
48 Baumgartner, C. (2004) Ethische Aspekte nanotechnologischer Forschung und Entwicklung in der Medizin. Parlament, B23–24, 39–46.
49 Bruce, D. (2006) Ethical and social issues in nanobiotechnologies. EMBO Reports, 7, 754–758.
50 Ach, J. and Siep, L. (eds) (2006) Nanobio-ethics. Ethical Dimensions of Nanobiotechnology, Berlin.
51 MacDonald, C. (2004) Nanotech is novel; the ethical issues are not. The Scientist, 18, 3.
52 Gethmann, C.F. (2004) Zur Amphibolie des Krankheitsbegriffs, in Wissen und Verantwortung. Band 2: Studien zur medizinischen Ethik (eds A. Gethmann-Siefert and K. Gahl), Alber, Freiburg.
53 Drexler, K.E. (1986) Engines of Creation: The Coming Era of Nanotechnology, Anchor Books, Oxford.
54 Scott, S.H. (2006) Converting thoughts into action. Nature, 442, 141–142.
55 Turner, D.C. and Sahakian, B.J. (2006) Ethical questions in functional neuroimaging and cognitive enhancement. Poiesis & Praxis, 4, 81–94.
56 Grunwald, A. (2004) The case of nanobiotechnology. Towards a prospective risk assessment. EMBO Reports, 5, 32–36.
57 National Nanotechnology Initiative (1999) Shaping the World Atom by Atom, Washington, DC.
58 Grunwald, A. (2007) Converging technologies: visions, increased contingencies of the conditio humana and search for orientation. Futures, 39, 380–392.
59 Luther, W. (ed.) (2004) Industrial Application of Nanomaterials: Chances and Risks, http://www.techportal.de, accessed 2 October 2006.
60 Jonas, H. (1979) Das Prinzip Verantwortung, Suhrkamp, Frankfurt/Main; English edn: The Imperative of Responsibility, London, 1984.
61 Dupuy, J.-P. (2005) The Philosophical Foundations of Nanoethics. Arguments for a Method, lecture at the University of South Carolina, 3 March.
62 Dupuy, J.-P. and Grinbaum, A. (2004) Living with uncertainty: toward the ongoing normative assessment of nanotechnology. Techné, 8, 4–25.
63 Renn, O. and Roco, M. (2006) Nanotechnology and the need for risk governance. Journal of Nanoparticle Research, 8 (2), 153–191.
64 Münchener Rückversicherung (2002) Nanotechnology: What is in Store for Us?, Münchener Rückversicherungsgesellschaft, http://www.munichre.com/publications/30203534_en.pdf, accessed 12 November 2006.
65 Phoenix, C. and Treder, M. (2003) Applying the Precautionary Principle to Nanotechnology, http://www.crnano.org/Precautionary.pdf, accessed 2 October 2006.
66 Grunwald, A. (2008) Nanotechnology and the precautionary principle, in Nanotechnology and Nanoethics: Framing the Field (ed. F. Jotterand), Berlin, in press.
67 Gee, D. and Greenberg, M. (2002) Asbestos: from magic to malevolent
9
Outlook and Consequences
Günter Schmid
"Nanotechnology could become the most influential force to take hold of the technology industry since the rise of the Internet. Nanotechnology could increase the speed of memory chips, remove pollution particles in water and air and find cancer cells quicker. Nanotechnology could prove beyond our control and spell the end of our very existence as human beings. Nanotechnology could alleviate world hunger, clean the environment, cure cancer, guarantee biblical life spans or concoct super weapons of untold horror . . . . Nanotechnology could spur economic development through spin-offs of the research. Nanotechnology could harm the opportunities of the poor in developing countries . . . . Nanotechnology could change the world from the bottom up. Nanotechnology could become an instrument of terrorism. Nanotechnology could lead to the next industrial revolution . . . . Nanotechnology could change everything."
This incompletely cited collection of meaningful as well as senseless predictions is listed in a brochure on The Ethics and Politics of Nanotechnology published by UNESCO in 2006 [1]. It impressively demonstrates how the understanding of nanotechnology depends on the observer's individual opinion, on the medium spreading it and on different political trends. Depending on the respective author's personal attitude, hopes are raised or catastrophes are predicted.
In spite of the variations in the definition of nanotechnology (see Chapters 1 and 2), from a scientific point of view, strictly observed in this and the other books in this series, we must refrain from frivolous promises, but also from scenarios depicting the decline of mankind. Both belong to the field of science fiction and are not based on scientific findings. Indeed, there are enough scientifically substantiated facts making nanotechnology one of the most influential technologies we have ever had, in agreement with the very first of the UNESCO headlines cited above. This can be seen from the few examples given in Sections 2.2.1–2.2.7 in Chapter 2, but even more from the following volumes, which deal with information technology, medicine, energy and instrumentation on the nanoscale, making observations in the nanoworld possible. Without entering science-fiction territory, our present knowledge of nanoscience allows the prognosis that nano-based data storage systems will offer
capacities to store the whole of the world's literature on a single chip and to construct notebooks with the capacity of a present-day computer center. Low-priced solar cells, moldable accumulators of high capacity, highly efficient fuel cells and hydrogen storage systems will become available. Last but not least, the progress to be expected in medical diagnosis and therapy will revolutionize health care, from long-lasting drug delivery systems, through diagnostic capabilities improved by orders of magnitude, up to individual cancer therapies based on the personal genome of a patient.
These few examples indeed demonstrate the innovative power of nanoscience and nanotechnology. On the other hand, we should not close our eyes to possible dangers linked with the expansion of nanotechnology. Learning from experiences in the past, for instance in the cases of nuclear energy and genetically modified organisms, we should avoid similar mistakes in the case of nanotechnology. Scientists especially are asked to contribute to an objective public discussion, free of ideologies and preconceptions.
An indispensable condition for discussing the chances and risks of nanotechnology free from any prejudice is a minimum of public knowledge about nanotechnology. This is not easy to achieve, owing to the very complex nature of nanotechnology, which extends from physics to the life sciences. Therefore, education has to start as early as possible, in primary and secondary schools, in colleges and universities. This is especially necessary in order to protect people from false prophets, in a positive as well as in a negative sense. There is no doubt that nanotechnology will change our lives, but what this change will look like depends on us.
References
1 The Ethics and Politics of Nanotechnology, United Nations Educational, Scientific and Cultural Organization (UNESCO), Paris, 2006.
1
Pollution Prevention and Treatment Using Nanotechnology
Bernd Nowack
1.1
Introduction
Figure 1.1 Hits in the Web of Science for the search terms (nanotechnol* OR nanopart* OR nanotub*) AND risk* for the years 1990–2006.
The fact that many engineered nanoparticles are functionalized, and therefore have a different surface activity from pristine particles, is pivotal for many applications where a tailored property is needed, but such particles may behave in a completely different way from standard particles in the environment and may, for example, be much more mobile or show an increased (or decreased, as the case may be) toxicity. This short list of properties exemplifies the fact that engineered nanoparticles and nanotechnological applications make use of the same properties that are looked for by environmental scientists.
This chapter will give a general overview of potential environmental applications of
nanotechnology and nanoparticles and will also give a short overview of the current
knowledge about possible risks for the environment.
1.2
More Efficient Resource and Energy Consumption
1.3
Pollution Detection and Sensing
Various nanostructured materials have been explored for use in sensors for the detection of different compounds [23]. An example is silver nanoparticle array membranes that can be used as flow-through Raman scattering sensors for water quality monitoring [24]. The particular properties of carbon nanotubes (CNTs) make them very attractive for the fabrication of nanoscale chemical sensors, and especially electrochemical sensors [25–28]. A majority of the sensors described so far use CNTs as a building block. Upon exposure to gases such as NO2, NH3 or O3, the electrical resistance of CNTs changes dramatically, induced by charge transfer with the gas molecules or by physical adsorption [29, 30]. The possibility of a bottom-up approach makes the fabrication compatible with silicon microfabrication processes [31]. Connecting CNTs with enzymes establishes fast electron transfer from the active site of the enzyme through the CNT to an electrode, in many cases enhancing the electrochemical activity of the biomolecules [27]. In order to take advantage of the properties of CNTs, they need to be properly functionalized and immobilized. CNT sensors have been developed for glucose, ethanol, sulfide and sequence-specific DNA analysis [27]. Trace analysis of organic compounds, e.g. the drug fluphenazine, has also been reported [32]. Nano-immunomagnetic labeling, using magnetic nanoparticles coated with antibodies specific to a target bacterium, has been shown to be useful for the rapid detection of bacteria in complex matrices [33].
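To make the resistance-based sensing principle concrete, the sketch below converts a measured resistance change of a CNT film into an estimated gas concentration. It is a minimal illustration only, assuming a hypothetical power-law calibration; the function name and the coefficients A and b are illustrative stand-ins for the sensor-specific calibration curves reported in the literature, not values taken from this chapter.

```python
# Minimal sketch of chemiresistive gas sensing with a CNT film.
# The power-law calibration c = A * (|dR|/R0)**b and its coefficients
# are hypothetical; real sensors are calibrated against known gas doses.

def gas_concentration_ppm(r_baseline_ohm: float, r_exposed_ohm: float,
                          a: float = 50.0, b: float = 1.5) -> float:
    """Estimate a gas concentration from the relative resistance change
    |dR|/R0 of a CNT chemiresistor."""
    rel_change = abs(r_exposed_ohm - r_baseline_ohm) / r_baseline_ohm
    return a * rel_change ** b

# Example: NO2 exposure lowers the resistance of a p-type CNT film
# from 10.0 to 8.5 kOhm, i.e. |dR|/R0 = 0.15.
print(f"{gas_concentration_ppm(10_000.0, 8_500.0):.1f} ppm")  # ~2.9 ppm
```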
1.4
Water Treatment
Clean water is a requirement for all properly functioning societies worldwide, but it is often in short supply. New approaches are continually being examined to supplement traditional water treatment methods. These need to be lower in cost and more effective than current techniques for the removal of contaminants from water. In this context, nanotechnological approaches are also being considered. In this section, the following application areas will be covered: nanoparticles used as potent adsorbents, in some cases combined with magnetic particles to ease particle separation; nanoparticles used as catalysts for the chemical or photochemical destruction of contaminants; nanosized zerovalent iron used for the removal of metals and organic compounds from water; and nanofiltration membranes.
1.4.1
Adsorption of Pollutants
Sorbents are widely used in water treatment and purification to remove organic and inorganic contaminants; examples are activated carbon and ion-exchange resins. The use of nanoparticles may have advantages over conventional materials due to the much larger surface area of nanoparticles on a mass basis. In addition, the unique structure and electronic properties of some nanoparticles can make them especially powerful adsorbents. Many materials have properties that are dependent on size [34]. Hematite particles with a diameter of 7 nm, for example, adsorbed Cu ions at lower pH values than particles of 25 or 88 nm diameter, indicating the unique surface reactivity of iron oxide particles with decreasing diameter [35]. However, another study found that, normalized to the surface area, the nanoparticles had a lower adsorption capacity than bulk TiO2 [36]. Several types of nanoparticles have been investigated as adsorbents: metal-containing particles, mainly oxides; carbon nanotubes and fullerenes; organic nanomaterials; and zeolites.
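The mass-basis surface-area argument can be made quantitative with the standard geometric estimate for dense spheres, SSA = 6/(ρd). The short sketch below applies it to the three hematite diameters from the study cited above; the bulk density of hematite (about 5.26 g cm⁻³) is a common literature value, not a figure taken from this chapter, and real particles deviate from the ideal-sphere assumption.

```python
# Specific surface area of ideal dense spheres: SSA = 6 / (rho * d).
# The density of hematite (~5260 kg/m^3) is a common literature value.

RHO_HEMATITE_KG_M3 = 5260.0

def specific_surface_area_m2_g(diameter_nm: float,
                               density_kg_m3: float = RHO_HEMATITE_KG_M3) -> float:
    d_m = diameter_nm * 1e-9
    return 6.0 / (density_kg_m3 * d_m) / 1000.0  # convert m^2/kg -> m^2/g

for d_nm in (7.0, 25.0, 88.0):  # diameters from the hematite study above
    print(f"{d_nm:5.0f} nm -> {specific_surface_area_m2_g(d_nm):6.1f} m^2/g")
# ~163, ~46 and ~13 m^2/g: an order of magnitude gained on a mass basis
```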
For the removal of metals and other inorganic ions, mainly nanosized metal oxides [37, 38], but also natural nanosized clays [39], have been investigated. Oxidized and hydroxylated CNTs are also good adsorbents for metals; this has been found for various metals such as Cu [40], Ni [41, 42], Cd [43, 44] and Pb [45, 46]. Adsorption of organometallic compounds on pristine multi-walled CNTs was found to be stronger than on carbon black [47].
Chemically modified nanomaterials have also attracted a lot of attention, especially nanoporous materials, due to their exceptionally high surface area [48]. The particle size of such materials is, however, not in the nano range but normally 10–100 µm. Another option is to modify the nanoparticle itself chemically [49]. TiO2 functionalized with ethylenediamine was, for example, tested for its ability to remove anionic metals from groundwater [50].
CNTs have attracted a lot of attention as very powerful adsorbents for a wide variety of organic compounds from water. Examples include dioxin [51], polynuclear aromatic hydrocarbons (PAHs) [52–54], DDT and its metabolites [55], PBDEs [56], chlorobenzenes and chlorophenols [57, 58], trihalomethanes [59, 60], bisphenol A and nonylphenol [61], phthalate esters [62], dyes [63], pesticides (thiamethoxam, imidacloprid and acetamiprid) [64] and herbicides such as sulfuron derivatives [65, 66], atrazine [67] and dicamba [68]. Cross-linked nanoporous polymers that have been copolymerized with functionalized CNTs have been demonstrated to have a very high sorption capacity for a variety of organic compounds such as p-nitrophenol and trichloroethylene [69]. It was found that purification (removal of amorphous carbon) of the CNTs improved the adsorption [54]. The available adsorption space was found to be the cylindrical external surface; neither the inner cavity nor the inter-wall space of multi-walled CNTs contributed to adsorption [70]. Unlike the case with fullerenes, no adsorption–desorption hysteresis was observed, indicating reversible adsorption [70].
Fullerenes have also been tested for the adsorption of organic compounds. Adsorption depends to a great extent on the dispersion state of the C60 [71], which is virtually insoluble in water [72]. Because C60 forms clusters in water, there are closed interstitial spaces within the aggregates into which the compounds can diffuse, which leads to significant adsorption–desorption hysteresis [70, 73]. Fullerenes are only weak sorbents for a wide variety of organic compounds (e.g. phenols, PAHs, amines), whereas they are very efficient for the removal of organometallic compounds (e.g. organolead) [74].
The semiconductor TiO2 has been extensively studied for the oxidative or reductive removal of organic pollutants [49, 87]. Illumination promotes an electron to the conduction band, leaving a hole in the valence band. This process produces potent reducing and oxidizing agents; in water, photo-oxidation occurs primarily through hydroxyl radicals. Because TiO2 requires ultraviolet light for excitation, it has been sensitized to visible light by dyes, through the incorporation of transition metal ions [49] or by doping with nitrogen [88]. The degradation rate of several dyes by nanosized TiO2 was found to be 1.6–20 times higher than for bulk TiO2 particles [89]. Several types of compounds, such as dyes [88, 90] and organic acids [91], have been shown to be rapidly degraded. A special type of TiO2 photocatalyst is the titania nanotube material, which was shown to have superior activity [92, 93].
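The need for ultraviolet excitation follows directly from the band gap: only photons with an energy above the gap can create the electron–hole pair. A minimal sketch, using common literature band-gap values for TiO2 (about 3.2 eV for anatase and 3.0 eV for rutile; these figures are not taken from this chapter):

```python
# Longest wavelength able to excite a photocatalyst: lambda = h*c / E_g.

H_C_EV_NM = 1239.84  # Planck constant times speed of light, in eV*nm

def threshold_wavelength_nm(band_gap_ev: float) -> float:
    return H_C_EV_NM / band_gap_ev

for phase, e_gap in (("anatase", 3.2), ("rutile", 3.0)):
    print(f"{phase}: {threshold_wavelength_nm(e_gap):.0f} nm")
# anatase: ~387 nm (UV-A); rutile: ~413 nm (violet edge of the visible)
```

This is why dye sensitization or nitrogen doping, as mentioned above, is needed to shift the photoresponse into the visible range.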
1.4.5
Zerovalent Iron
Laboratory research has established that nanoscale metallic iron is very effective in
destroying a wide variety of common contaminants such as chlorinated methanes,
brominated methanes, trihalomethanes, chlorinated ethenes, chlorinated benzenes,
other polychlorinated hydrocarbons, pesticides and dyes [94]. The basis for the
reaction is the corrosion of zerovalent iron in the environment:
2Fe⁰ + 4H⁺ + O₂ → 2Fe²⁺ + 2H₂O
Fe⁰ + 2H₂O → Fe²⁺ + H₂ + 2OH⁻
Contaminants such as tetrachloroethene can readily accept the electrons from iron oxidation and be reduced to ethene:
C₂Cl₄ + 4Fe⁰ + 4H⁺ → C₂H₄ + 4Fe²⁺ + 4Cl⁻
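As a consistency check (ours, not part of the original text), the stoichiometry of the dechlorination step can be verified by electron bookkeeping: four Fe⁰ atoms supply exactly the eight electrons consumed in reducing one C₂Cl₄ molecule to ethene:

```latex
% Half-reactions behind the overall dechlorination equation above
\begin{align*}
  4\,\mathrm{Fe^{0}} &\rightarrow 4\,\mathrm{Fe^{2+}} + 8\,e^{-}\\
  \mathrm{C_2Cl_4} + 4\,\mathrm{H^{+}} + 8\,e^{-} &\rightarrow \mathrm{C_2H_4} + 4\,\mathrm{Cl^{-}}
\end{align*}
```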
However, nanoscale zerovalent iron (nZVI) can reduce not only organic contaminants but also inorganic anions: nitrate, which is reduced to ammonia [95, 96]; perchlorate (plus chlorate or chlorite), which is reduced to chloride [97]; selenate [98]; arsenate [99, 100]; arsenite [101]; and chromate [102, 103]. nZVI is also efficient in removing dissolved metals from solution, e.g. Pb and Ni [102, 104]. The reaction rates for nZVI are at least 25–30 times faster, and the sorption capacity is much higher, compared with granular iron [105]. The metals are either reduced to zerovalent metals or to lower oxidation states, e.g. Cr(III), or are surface-complexed with the iron oxides that are formed during the reaction. Some metals can increase the dechlorination rate of organics and also lead to more benign products, whereas other metals decrease the reactivity [106].
The reaction rates for nZVI can be several orders of magnitude faster on a mass basis than for granular ZVI [107]. Because the reactivity of ZVI towards lightly chlorinated and brominated compounds is low, and because the formation of a passivating layer reduces the reactivity with time, many approaches have been explored in which the surface is doped with a catalyst (e.g. Pd, Pt, Cu, Ni) to reduce the activation energy. The same approach has also been tested for nZVI. Surface-normalized reaction rates for such materials were found to be up to 100 times faster than for bulk ZVI [108–111].
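Degradation by ZVI is commonly described with pseudo-first-order kinetics using a surface-area-normalized rate constant, k_obs = k_SA · a_s · ρ_m, a formulation that is standard in the ZVI literature. The sketch below uses this model to show how the larger specific surface area of nZVI translates into faster mass-based degradation; the numerical values of k_SA, the surface areas and the iron loading are illustrative assumptions, not data from this chapter.

```python
# Pseudo-first-order decay C(t) = C0 * exp(-k_obs * t) with
# k_obs = k_SA * a_s * rho_m, where k_SA is the surface-area-normalized
# rate constant (L m^-2 h^-1), a_s the specific surface area (m^2 g^-1)
# and rho_m the iron loading (g L^-1). All numbers are illustrative.
import math

def fraction_degraded(k_sa: float, a_s: float, rho_m: float, hours: float) -> float:
    k_obs = k_sa * a_s * rho_m  # 1/h
    return 1.0 - math.exp(-k_obs * hours)

K_SA = 1e-3  # L m^-2 h^-1, assumed identical for both materials
for label, a_s in (("granular ZVI", 0.1), ("nZVI", 30.0)):  # m^2/g, assumed
    print(f"{label:>12}: {100 * fraction_degraded(K_SA, a_s, 1.0, 24.0):5.1f}% in 24 h")
# The ~300-fold larger specific surface area of nZVI yields a ~300-fold
# larger k_obs on a mass basis, consistent with the trend described above.
```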
The nanoscale iron particles can be produced either by a top-down approach (e.g. milling of iron filings) or by direct chemical synthesis [105]. A common method for the synthesis of iron nanoparticles is the reduction of an aqueous ferric solution by reducing agents such as sodium borohydride or sodium hypophosphite [49].
1.5
Soil and Groundwater Remediation
The use of nZVI for groundwater remediation is the most widely investigated environmental nanotechnological technique. Granular ZVI in the form of reactive barriers has been used for many years at numerous sites all over the world for the remediation of organic and inorganic contaminants in groundwater (see Figure 1.3a). With nZVI, two possible techniques are used: immobile nZVI is injected to form a zone of iron particles adsorbed on the aquifer solids (Figure 1.3b), or mobile nZVI is injected to form a plume of reactive Fe particles that destroy any organic contaminants that dissolve from a DNAPL (dense non-aqueous phase liquid) source in the aquifer (Figure 1.3c). With this technique, the formation of a pollutant plume is inhibited. Successful results of field demonstrations using nZVI have been published, with reported reductions in TCE of up to 96% after injection of 1.7 kg of nanoparticles into the groundwater [112]. In a larger test, 400 kg of nZVI was injected, and significant reductions in the TCE soil concentration (>80%) and in dissolved concentrations (57–100%) were observed [113]. To date, approximately 30 projects are under way in which nZVI is used for actual site remediation [105].
Whereas most research using nZVI has been devoted to groundwater, much less has been published about soil remediation. These studies have mostly been done in soil slurries, and efficient removal of PAHs by nZVI has been reported [114, 115]. For PCBs, a removal of only about 40% was attained, caused by the very strong adsorption of PCBs to the soil matrix and the limited transfer to the nZVI particles [116]. nZVI has also been used to immobilize Cr(VI) in chromium ore processing residue [117].
Because the iron particles have a strong tendency to aggregate and to adsorb on the surfaces of minerals, much effort has been directed towards methods to disperse the particles in water and render them mobile. In one approach, water-soluble starch was used as a stabilizer [118]; in another, hydrophilic carbon or poly(acrylic acid) delivery vehicles were used [119]. Modified cellulose, sodium carboxymethylcellulose (CMC), was found to form highly dispersed nZVI [120], and several polymers have also been tested and found to be very effective [121]. In this stabilized form, the nZVI was up to 17 times more reactive in degrading trichloroethene than non-stabilized material. However, for other stabilizing agents a decrease in reactivity of up to 9-fold or 2–10-fold was observed [121]. To deliver the nZVI to the oil/water interface in the case of DNAPL contamination, a copolymer was used to increase colloid stability and at the same time increase phase transfer into the organic phase [122].
1.6
Environmental Risks
1.6.1
Behavior in the Environment
1.6.2
Ecotoxicology
1.7
Conclusions
This chapter was intended to give an overview of the various aspects of nanotechnology and the environment, looking at it mainly from the side of applications rather than from the risk side. It should have become clear that nanotechnology in general
References
1 Environmental Protection Agency, US Environmental Protection Agency Report EPA 100/B-07/001, EPA, Washington, DC, 2007.
2 M. R. Wiesner, G. V. Lowry, P. Alvarez,
D. Dionysiou, P. Biswas, Environ. Sci.
Technol. 2006, 40, 4336.
3 V. L. Colvin, Nat. Biotechnol. 2003, 21,
1166.
4 M. Siegrist, A. Wiek, A. Helland,
H. Kastenholz, Nat. Nanotechnol. 2007,
2, 67.
5 R. Jones, Nat. Nanotechnol. 2007, 2, 71.
6 G. Oberdörster, E. Oberdörster, J. Oberdörster, Environ. Health Perspect. 2005, 113, 823.
7 A. Nel, T. Xia, L. Mädler, N. Li, Science 2006, 311, 622.
8 T. Masciangioli, W. X. Zhang, Environ. Sci.
Technol. 2003, 37, 102A.
9 T. Hillie, M. Munasinghe, M. Hlope,
Y. Deraniyagala, Nanotechnology, water
and development, Meridian Institute, 2006.
10 K. A. D. Guzman, M. R. Taylor, J. F. Banfield, Environ. Sci. Technol. 2006, 40, 1401.
11 M. C. Roco, Environ. Sci. Technol. 2005, 39,
106A.
12 M. A. Albrecht, C. W. Evans, C. L. Raston,
Green Chem. 2006, 8, 417.
13 Estimated Energy Savings and Financial Impacts of Nanomaterials by Design on Selected Applications in the Chemical Industry, Los Alamos National Laboratory, Los Alamos, NM, 2006.
14 B. Vogt, Ind. Diamond Rev. 2004, 3, 30.
15 T. Garcia, B. Solsona, S. H. Taylor, Catal.
Lett. 2005, 105, 183.
33 S. C. Chang, P. Adriaens, Environ. Eng.
Sci. 2007, 24, 58.
34 M. F. Hochella, Geochim. Cosmochim.
Acta 2002, 66, 735.
35 A. S. Madden, M. F. Hochella, T. P.
Luxton, Geochim. Cosmochim. Acta 2006,
70, 4095.
36 D. E. Giammar, C. J. Maus, L. Y. Xie,
Environ. Eng. Sci. 2007, 24, 85.
37 S. Pacheco, J. Tapia, M. Medina,
R. Rodriguez, J. Non-Cryst. Solids 2006,
352, 5475.
38 E. A. Deliyanni, E. N. Peleka, K. A. Matis,
J. Hazard. Mater. 2007, 141, 176.
39 G. D. Yuan, L. H. Wu, Sci. Technol. Adv.
Mater. 2007, 8, 60.
40 P. Liang, Q. Ding, F. Song, J. Sep. Sci.
2005, 28, 2339.
41 C. Lu, C. Liu, J. Chem. Technol. Biotechnol.
2006, 81, 1932.
42 C. L. Chen, X. K. Wang, Ind. Eng. Chem.
Res. 2006, 45, 9144.
43 Y. H. Li, S. G. Wang, Z. K. Luan, J. Ding,
C. L. Xu, D. H. Wu, Carbon 2003, 41, 1057.
44 P. Liang, Y. Liu, L. Guo, J. Zeng, H. B. Lu, J. Anal. At. Spectrom. 2004, 19, 1489.
45 Y. H. Li, Y. Q. Zhu, Y. M. Zhao, D. H. Wu,
Z. K. Luan, Diamond Relat. Mater. 2006,
15, 90.
46 Y. H. Li, S. Wang, J. Wei, X. Zhang, C. Xu,
Z. Luan, D. Wu, B. Wei, Chem. Phys. Lett.
2002, 357, 263.
47 J. Munoz, M. Gallego, M. Valcarcel, Anal.
Chem. 2005, 77, 5389.
48 X. Feng, G. E. Fryxell, L. Q. Wang, A. Y.
Kim, J. Liu, K. M. Kemner, Science 1997,
276, 923.
49 S. O. Obare, G. J. Meyer, J. Environ. Sci.
Health A 2004, 39, 2549.
50 S. V. Mattigod, G. E. Fryxell, K. Alford,
T. Gilmore, K. Parker, J. Serne,
M. Engelhard, Environ. Sci. Technol. 2005,
39, 7306.
51 R. Q. Long, R. T. Yang, J. Am. Chem. Soc.
2001, 123, 2058.
52 K. Yang, L. Zhu, B. Xing, Environ. Sci.
Technol. 2006, 40, 1855.
53 K. Yang, X. L. Wang, L. Z. Zhu, B. S. Xing,
Environ. Sci. Technol. 2006, 40, 5804.
2
Photocatalytic Surfaces: Antipollution and Antimicrobial Effects
Norman S. Allen, Michele Edge, Joanne Verran, John Stratton, Julie Maltby, and Claire Bygott
2.1
Introduction to Photocatalysis: Titanium Dioxide Chemistry and Structure–Activity
For many years, titanium dioxide pigments have been used successfully for conferring opacity and whiteness on a host of different materials. Their principal usage is in applications such as paints, plastics, inks and paper, but they are also incorporated into a diverse range of products, such as foods and pharmaceuticals. The fundamental properties of titanium dioxide have given rise to its supreme position in the field of white pigments. In particular, its high refractive index permits the efficient scattering of light. Its absorption of UV light has conferred durability on products. Its non-toxic nature has meant that it can be widely used in almost any application without risk to health and safety. However, the primary reason for its success is its ability to reflect and refract or scatter light more efficiently than any other pigment, due to its high refractive index in comparison with extenders, fillers and early pigments [1–5] (see Table 2.1).
Titanium dioxide exists in three crystalline modifications, rutile, brookite and anatase, all of which have been prepared synthetically. In each type, the titanium ion coordinates with six oxygen atoms, which in turn are linked to three titanium atoms, and so on. Anatase (Figure 2.1) and rutile (Figure 2.2) are tetragonal, whereas brookite is orthorhombic. Brookite and anatase are unstable forms. Brookite is not economically significant since there is no abundant supply in nature.
Examination (Table 2.2) of the basic properties of the two main crystal forms shows differences in specific gravity, hardness, refractive index and relative tint strength. The oil absorption of commercial anatase and rutile pigments also varies, in part due to the different types of surface treatments applied to them.
Titanium dioxide has the highest average refractive index known. For anatase it is 2.55 and for rutile 2.76. These high values account for the exceptional light-scattering ability of pigmentary titanium dioxide when dispersed in various media, which in turn yields the high reflectance and hiding power associated with this pigment. Although single-crystal titanium dioxide is transparent, as a finely divided
18
Table 2.1 Refractive index of TiO2 in comparison with extenders, filler and other pigments.
Material
Refractive index
Rutile TiO2
Anatase TiO2
Lithopone
Zinc oxide
White lead
Calcium carbonate
China clay
Talc
Silica
2.76
2.52
2.13
2.02
2.00
1.57
1.56
1.50
1.48
powder it has a very high reflectance, and it is intensely white because its high reflectance is substantially uniform throughout the visible spectrum. This white color differs in tone for the two crystal structures due to their different reflectance curves across the visible and near-visible spectrum (Figure 2.3).
From examination of Figure 2.3, it is evident that:
• Rutile TiO2 reflects the radiation slightly better than anatase and is therefore brighter.
• The higher absorption of rutile at the very blue end of the visible spectrum and in the UV region accounts for the yellower tone of rutile pigment. This higher UV absorption also underlies the greater UV durability of rutile-based pigments.
Table 2.2 Basic properties of the two main crystal forms of titanium dioxide.

Pigment form:
  Density (g cm⁻³):              3.9–4.2
  Refractive index:              2.55 (anatase), 2.76 (rutile)
  Oil absorption (lb/100 lb):    16–48
  Tinting strength (Reynolds):   1650–1900
Crystal form:
  Density (g cm⁻³):              3.87 (anatase), 4.24 (rutile)
  Hardness (Mohs):               5–6 (anatase), 6–7 (rutile)
Figure 2.3 Reflectance of anatase and rutile pigments through the near-UV, visible and IR regions.
Aggregates can only be broken into individual pigment particles by intensive milling. One of the last manufacturing steps performed by the TiO2 manufacturer is micronization and/or milling to dissociate as many aggregates as possible. Aggregates will not re-form unless the pigment is heated to over 500 °C. Agglomerates are also broken up in the milling step; however, agglomerates easily re-form during packing, storage and transportation. The disruption of these inter-particle bonds is generally understood to be the dispersion that needs to be performed by the TiO2 consumer.
It is possible to manipulate the TiO2 particle size to within a very narrow range around a predetermined optimum. Generally, in paint applications this optimum is approximately 0.2–0.3 µm, as it is within this range that the light-scattering ability of TiO2 is at its peak, which in turn maximizes the level of gloss finish.
TiO2 pigment particles are submicroscopic, with size distributions narrower than many so-called monodisperse particulates. Appropriately ground pigment dispersions contain less than 5 wt.% of particles smaller than 0.10 µm or larger than 1.0 µm. Optical effectiveness, that is, light scattering, is controlled by the mass/volume frequency of particles in the size range from 0.1 to 0.5 µm. Gloss is diminished by a relatively small mass/volume fraction of particles larger than about 0.5 µm. Dispersibility and film fineness are degraded by a very small mass/volume fraction of particles larger than about 5 µm. Important optical properties such as opacity, hiding power, brightness, tone, tinting strength and gloss are all dependent upon the particle size and particle size distribution.
Pure titanium dioxide possesses by nature an internal crystal structure that yields an innately high refractive index. When the particle size and particle size distribution are optimized so as to contribute, along with this high refractive index, to maximum light scattering, conventional or pigmentary titanium dioxide is obtained. It reflects all the wavelengths of visible light to the same degree, producing the effect of whiteness to the human eye. All these attributes, together with its opacity, are achieved for an optimal particle diameter of approximately 0.2–0.4 µm, that is, on the order of half the wavelength of visible light. This fact can also be demonstrated on the basis of Mie theory [6].
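To make the half-wavelength argument concrete, the short sketch below (an illustration added here, not a calculation from the chapter; the 550 nm wavelength and the binder refractive index of 1.5 are assumed values) compares the dimensionless size parameter x = πd/λ for a pigmentary and an ultrafine crystal. In the Rayleigh limit (x ≪ 1) the scattering cross-section falls off as d⁶/λ⁴, which is why ultrafine TiO2 is nearly transparent to visible light, whereas a ≈0.25 µm crystal lies in the regime where full Mie theory applies and scattering is near its maximum.

```python
# Illustrative sketch (assumed values, not data from the chapter): why
# pigmentary TiO2 (~0.25 um) scatters visible light strongly while
# ultrafine TiO2 (~20 nm) is nearly transparent.
import math

N_RUTILE = 2.76   # refractive index of rutile TiO2 (Table 2.1)
N_MEDIUM = 1.5    # assumed refractive index of the surrounding binder
M = N_RUTILE / N_MEDIUM  # relative refractive index

def size_parameter(d_nm: float, wavelength_nm: float) -> float:
    """Dimensionless size parameter x = pi * d / lambda."""
    return math.pi * d_nm / wavelength_nm

def rayleigh_qsca(d_nm: float, wavelength_nm: float) -> float:
    """Rayleigh scattering efficiency, valid only for x << 1."""
    x = size_parameter(d_nm, wavelength_nm)
    lorentz = (M**2 - 1.0) / (M**2 + 2.0)
    return (8.0 / 3.0) * x**4 * lorentz**2

WAVELENGTH = 550.0  # nm, mid-visible (assumption)

for d in (20.0, 250.0):  # ultrafine vs. pigmentary crystal size, nm
    x = size_parameter(d, WAVELENGTH)
    if x < 0.3:
        print(f"d = {d:5.0f} nm: x = {x:.2f} -> Rayleigh regime, "
              f"Qsca ~ {rayleigh_qsca(d, WAVELENGTH):.1e} (nearly transparent)")
    else:
        print(f"d = {d:5.0f} nm: x = {x:.2f} -> comparable to lambda/2; "
              f"full Mie theory needed, scattering near its maximum")
```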
There exists, however, another type of titanium dioxide whose median crystal size has been deliberately reduced to about 0.02 µm. This so-called nanoparticle or ultrafine TiO2 is the subject of this chapter.
The history of nanoparticle titanium dioxide dates back to the late 1970s, when the first patent on the preparation of these materials was issued in Japan. It is in principle possible to obtain nanoparticle TiO2 by simple milling of the pigmentary TiO2 to a finer particle size [4]. However, the properties of the resulting fine powders in terms of purity, particle size distribution and particle shape remain highly unsatisfactory.
Several wet-chemical processes were developed during the 1980s by TiO2 pigment manufacturers such as Ishihara, Tioxide and Kemira. The first part of the process, the production of the nanoparticle base material, uses after-wash titanium hydroxylate as the raw material. After subsequent process steps involving the decomposition of the hydroxylate crystal structure and the reprecipitation of the TiO2, the product is calcined to obtain oval-shaped particles with the desired primary crystal size and a narrow size distribution. The base crystals are coated in the after-treatment unit according to the requirements of the end-use. One of the primary tasks of the after-treatment is to ensure good dispersibility of the extremely fine particles in the final application.
TiO2 nanoparticles are also routinely produced by gas-to-particle conversion in flame reactors, because this method provides good control of particle size, particle crystal structure and purity [4].
Typically, the crystal size of these products is about one-tenth of that of the normal pigmentary grade. Figure 2.5 shows typical transmission electron micrographs of pigmentary and nanoparticulate titanium dioxide at the same magnification.
Table 2.3 shows a comparison of some typical values of the physical properties of nanoparticle and conventional titanium dioxide products.
The smaller crystal size influences various properties and leads to higher values for the surface area and oil absorption. Lower values for specific gravity and bulk density are also obtained. Otherwise, nanoparticle TiO2 has many of the properties of conventional TiO2 pigments: it is non-toxic, non-migratory, inert and stable at high temperatures.
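As a rough cross-check on these figures (an added illustration, not a calculation from the chapter), the specific surface area of dense, non-porous spheres follows SSA = 6/(ρd); with the anatase density from Table 2.2 this reproduces the magnitudes quoted in Tables 2.3 and 2.5, e.g. roughly 80 m² g⁻¹ for a 20 nm crystal.

```python
# Geometric estimate (illustration only): specific surface area of dense,
# non-porous spheres, SSA = 6 / (rho * d).
RHO_ANATASE = 3.87e6  # g m^-3, i.e. 3.87 g cm^-3 (Table 2.2)

def specific_surface_area(d_nm: float, rho: float = RHO_ANATASE) -> float:
    """SSA in m^2 g^-1 for a dense sphere of diameter d_nm (nanometres)."""
    return 6.0 / (rho * d_nm * 1e-9)

for d in (250, 70, 20, 7):  # pigmentary down to nano crystal sizes, nm
    print(f"d = {d:4d} nm -> SSA ~ {specific_surface_area(d):6.1f} m^2 g^-1")
```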
The optical behavior of ultrafine TiO2 differs dramatically from that of conventional TiO2 pigment. The optical properties of nanoparticle TiO2 are governed by its very small crystal size relative to the wavelength of visible light (Table 2.4).
Table 2.3 Typical properties of nanoparticle and conventional titanium dioxide.

Property                           Nanoparticle        Pigmentary
Appearance                         White powder        White powder
Crystal structure                  Anatase or rutile   Anatase or rutile
Crystal size (µm)                  0.005–0.05          0.15–0.3
Specific surface area (m² g⁻¹)     50 to >300          15
Bulk density (g mL⁻¹)              3.3                 4.0
Oil absorption (g per 100 g)       30                  16
Figure 2.6 Comparison of the optical behavior of ultrafine TiO2 and pigmentary TiO2.
Table 2.4 Optical behavior of pigmentary and nanoparticle TiO2 under visible and UV light.

Particle size         Wavelength <400 nm         Wavelength >400 nm
Pigmentary TiO2       Semiconductor absorption   Scattering and reflection (Mie theory)
Nanoparticle TiO2     Semiconductor absorption   Scattering and reflection (Rayleigh's theory)
Unlike typical organic UV absorbers, ultrafine TiO2 possesses effective UV filter properties over the entire UV spectrum (UVC, UVB and UVA). For example, it is gaining wide acceptance for use in sun creams. Nanoparticle TiO2, apart from its effective attenuating characteristics, is extremely inert and therefore safe to use next to the skin [5]. Nanoparticle TiO2 can also be used in clear plastic films to provide UV protection to foodstuffs. UV radiation from artificial lighting in a grocery store induces auto-oxidation in, e.g., meat and cheese, resulting in discoloration. In this regard it also exhibits antibacterial behavior, which will be discussed later.
It is also possible to use ultrafine TiO2 as a light stabilizer in plastics, to protect the material itself from yellowing and to retard the deterioration of its mechanical properties. A further example of the potential of nanoparticle TiO2 as a UV filter is found in clear wood finishes. The original color of wood panels can be retained by a clear lacquer made with 0.5–4% nanoparticle TiO2 [7]. In addition to preventing the wood from darkening, ultrafine TiO2 also enhances its lifetime.
An exciting and increasingly popular application of the optical properties of ultrafine TiO2 is found in automotive coatings, where the ultrafine powder is used as an effect pigment in combination with mica flakes to create the so-called titanium opalescent effect. For UV protection applications, and because of the intrinsic photoactivity of TiO2, mainly surface-treated nanoparticle rutile pigments are used. Nanoparticle anatase TiO2 finds applications in the field of photocatalysis.
2.2
Applications
The field of heterogeneous photocatalysis is very diverse and involves many research groups throughout the world. A number of research themes have emerged which offer real potential for commercial development and merit much greater research. Figure 2.8 displays and summarizes many of the major fields in which TiO2 is used as a photocatalyst, all related in some sense to solving environmental issues.
For the purposes of this chapter, we will focus on our application studies of TiO2 for antibacterial, self-cleaning and depollution purposes [the last concerning atmospheric contaminants such as volatile organic compounds (VOCs) and nitrogen oxides]. For example, in indoor environments, most surfaces, e.g. ceramic tiles, window glass or paper, are gradually covered with organic matter such as oils, dirt and smoke residue and become fouled [8]. Transparent TiO2 coatings can be completely unobtrusive, causing no readily discernible changes in the substrate color or transparency, but they can decompose organic matter as it deposits. Thus, various types of surfaces can be coated with TiO2 to make them self-cleaning under sunlight as well as room light (Figure 2.9). Surfaces based on paints, ceramics, glass and cementitious materials containing active photocatalytic titania nanoparticles therefore have widespread applications in creating environmentally clean areas within their proximity.
2.3
Photocatalytic Chemistry
The overall catalytic performance of titanium dioxide particles has been found to depend on a number of parameters, including preparation method, annealing temperature, particle/crystal size, specific surface area, ratio between the anatase and rutile crystal phases, light intensity and the substrate to be degraded [9]. Furthermore, the electrons confined in the nanomaterial exhibit a different behavior to that in the bulk material.
Titania doping is commonly utilized in polymeric and organic coating systems for various applications. In outdoor applications all polymers degrade. The degradation rate depends on the environment (especially sunlight intensity, temperature and humidity) and on the type of polymer. This so-called photo-oxidative degradation is due to the combined effects of photolysis and oxidative reactions. Sunlight photolytic degradation and/or photo-oxidation can only occur when the polymer contains chromophores which absorb wavelengths of the sunlight spectrum on Earth (>290 nm). These wavelengths have sufficient energy to cause dissociative (cleavage) processes resulting in degradation. Several classes of chromophores present in polymers can absorb sunlight in this way.
The hydroperoxides formed during photo-oxidation cleave homolytically into alkoxy and hydroxy radicals, which can initiate another propagation cycle [18, 19]. Termination reactions are bimolecular. In the presence of sufficient air, which is normally the case for the long-term degradation of polymers, only the reaction of two peroxy radicals has to be considered. Here the reaction depends on the type of peroxy radical present. Aside from these processes, polyaromatics and heterochain polymers exhibit further complex reactions, but for the purposes of this chapter the main processes of concern are those induced by the catalytic effect of the titanium dioxide.
The ability of pigments to catalyze the photo-oxidation of polymer systems has also attracted significant attention in terms of their mechanistic behavior. In this regard, much of the information originates from work carried out on TiO2 pigments in both polymers and model systems [20–27]. To date, there are three proposed mechanisms for the photosensitized oxidation of polymers by TiO2 and, for that matter, other white pigments such as ZnO:
1. The formation of an oxygen radical anion by electron transfer from photoexcited TiO2 to molecular oxygen [20]. A recent modification of this scheme involves a process of ion annihilation to form singlet oxygen, which then attacks any unsaturation in the polymer [28].
   TiO2 + O2 + hν → [TiO2⁺···O2⁻]  (I)
   I → TiO2 + ¹O2  (ion annihilation)
   I + H2O → TiO2 + HO• + HO2•
   2HO2• → H2O2 + O2
   RCH2CH=CHR′ + ¹O2 → RCH=CHCH(OOH)R′
2. Formation of reactive hydroxyl radicals by electron transfer from water, catalyzed by photoexcited TiO2 [29]. The Ti3+ ions are reoxidized back to Ti4+ ions to start the cycle over again.
   H2O + hν (TiO2) → H⁺ + e⁻(aq) + •OH
   Ti4+ + e⁻ → Ti3+
   Ti3+ + O2 → Ti4+ + O2⁻
3. Irradiation of TiO2 creates an exciton (p⁺) which reacts with the surface hydroxyl groups to form a hydroxyl radical [20]. Oxygen radical anions are also produced, which are adsorbed on the surface of the pigment particle; these give rise to active perhydroxyl radicals.
   TiO2 + hν → e⁻ + p⁺
   OH⁻ + p⁺ → HO•
   Ti4+ + e⁻ → Ti3+
   Ti3+ + O2 → Ti4+ + O2⁻ (adsorbed)
As mentioned previously, titanium dioxide particles are often coated with various media. For example, to improve pigment dispersion and reduce photoactivity, the surface of the pigment particles is coated with precipitated aluminosilicates. Zirconates are also used in some instances, whereas for other applications, such as in nylon polymers and fibers, the anatase is coated with manganese silicates or phosphates. Anatase will photosensitize the oxidation of a polymer, the effect being dependent on the nature and density of the coating and increasing with pigment concentration. Uncoated rutiles are also photosensitizers, but again the effect is reduced in proportion to the effectiveness of the coating. In this case, stabilization increases with increasing coated rutile concentration. Thus, the surface characteristics of the titania pigment are an important factor in controlling photoactivity. As discussed for Figure 2.11, the surface is covered with hydroxyl groups of an amphoteric character formed by the adsorption of water. These groups are more acidic in character on the surface of anatase and less effectively bound than those on rutile. The surface carriers (excitons) therefore react more slowly with the hydroxyl groups in the case of rutile. Infrared analysis has been used to characterize the different species on the particle surfaces. At 3000–3700 cm⁻¹ free and hydrogen-bonded OH groups can be detected, whereas in the region 1200–1700 cm⁻¹ H–O–H bending and carbonates can be seen.
Surface modification of the TiO2 particles with inorganic hydrates may reduce the photochemical reactivity of titanium pigments. This can reduce the generation of free radicals by physically inhibiting the diffusion of oxygen and preventing the release of free radicals. The often simultaneous chemical effects of surface modification can involve the provision of hole and electron recombination sites or hydroxyl radical recombination sites. In addition to these effects, the surface treatment or coating, as mentioned above, can improve other properties such as wetting and dispersion in different media (water, solvent or polymer), compatibility with the binder, dispersion stability and color stability. The photosensitivity of titanium dioxide is considered to arise from localized sites on the crystal surface; occupation of these sites by surface treatments inhibits photo-reduction of the pigment by UV radiation, and hence the destructive oxidation of the binder is inhibited. Coatings containing 2–5 wt.% of alumina, or alumina and silica, are satisfactory for general-purpose paints. If greater resistance to weathering is desired, the pigments are coated more heavily, to about 7–10 wt.%. The coating can consist of a combination of several materials, e.g. alumina, silica, zirconia, aluminum phosphates or other metal compounds. For example, the presence of hydrous alumina particles lowers the van der Waals forces between pigment particles by several orders of magnitude, decreasing particle–particle attractions. Hydrous aluminum oxide phases appear to improve dispersibility more effectively than most other hydroxides and oxides. Coated and surface-treated nanoparticles of titania also have commercial uses in, for example, the enhanced stabilization of polymers and coatings [7].
During the weathering of commercial polymers containing white pigments such as titania, oxidation occurs at the surface layers of the material, which eventually erode away, leaving the pigment particles exposed. This phenomenon is commonly referred to as chalking and has been confirmed by scanning electron microscopy. Methods of assessing pigment photoactivities have attracted much interest from both scientific and technological points of view. Artificial and natural weathering studies are tedious and very time consuming. Consequently, numerous model systems have been developed to assess photochemical activities rapidly. Most of these systems undergo photocatalytic reactions to give products which are easily determined, usually by UV absorption spectroscopy, HPLC, GC, microwave spectroscopy, etc.
2.4
Photoactivity Tests for 2-Propanol Oxidation and Hydroxyl Content
These are specific tests to ascertain titanium dioxide photoactivity. The various types and grades of titania discussed in this chapter are listed in Table 2.5. The oxidation of 2-propanol to yield acetone is a specific methodology, and this has been related to
Table 2.5 Types and grades of titania discussed in this chapter.

Sample              BET surface area (m² g⁻¹)   Particle size   Surface treatment   Surface treatment (%)
A anatase normal    10.1                        0.24 µm         None                –
B rutile normal     6.5                         0.28 µm         Al                  1
C rutile normal     12.5                        0.25 µm         Al                  2.8
D rutile normal     12.5                        0.29 µm         Al                  3.4
E nano anatase      44.4                        20–30 nm        None                –
F nano anatase      77.9                        15–25 nm        None                –
G nano anatase      329.1                       5–10 nm         None                –
H nano anatase      52.1                        70 nm           Hydroxyapatite      5
I nano rutile       140.9                       25 nm           None                –
J nano rutile       73.0                        40 nm           Al, Zr              13
K nano anatase      190.0                       6–10 nm         Al, Si, P           20
L nano rutile       73.0                        30–50 nm        Al, Zr              13
M nano anatase      239.0                       71 nm           Al, Si, P           12
N nano anatase      190.0                       92 nm           Al, Si, P           20
O rutile normal     12.5                        250 nm          Al, Si, P           3.5
oxygen consumption during irradiation of the medium in the presence of the titania particles [30]. The hydroxyl content relates to the concentration of hydroxyl functionalities present on the pigment particles and is often related to activity [31]. The data for both tests are compared in Figure 2.12. There are a number of correlations and trends within the data. First, all the nanoparticle grades exhibit higher photoactivities than the pigmentary grades. Thus, for oxygen consumption the anatase A is more active than the rutile types B and D, the last being the least active and most durable pigment. Second, of the nanoparticles, the rutile grade I is the most active in both tests. The three anatase grades E, F and G exhibit increasing activity with hydroxyl content, whereas for oxygen consumption F is the most active. It is nevertheless clear from the data that the nanoparticulates are significantly more active.
Environmental issues play an important role in the applications of titania fillers. These include the use of their photocatalytic behavior in the development of self-cleaning surfaces for buildings, i.e. antisoiling and antifungal activity, and VOC/NOx reduction (emissions). NOx can cause lung damage by lowering resistance to diseases such as influenza and pneumonia, whereas in combination with VOCs it produces smog and contributes to acid rain, causing damage to buildings. From a commercial point of view, such benefits have enormous implications. Japanese scientists [32] have been actively exploiting the development of a variety of materials, and a number of European ventures have followed suit, most notably in Italy, with Global Engineering and Millennium Chemicals as examples, and the European-funded PICADA Consortium [33]. Here, developments include self-cleaning and depolluting surfaces and facades based on nano-titania-activated coatings and cementitious materials. These applications include antisoiling, depollution of VOC and NOx contaminants and antifungal/antimicrobial activities. Numerous reports have appeared in newspapers and magazine articles highlighting such applications, e.g. self-cleaning paving, building blocks and facades that can also depollute the surrounding atmosphere, internal coatings and paints for sanitization and elimination of MRSA and, for example, clothes and textiles that supposedly never need cleaning (although in many cases this is undoubtedly an exaggeration, promoted for public awareness of the potential). The cements are normally loaded with up to 3% w/w of titania for optimum activation and cost efficiency.
2.5
Self-Cleaning Effects: Paints/Cementitious Materials
Figure 2.13 Weight loss (mg per 100 cm2) of PVC alkyd paint films during irradiation in an Atlas Weatherometer containing equivalent amounts of titania pigments.
The self-cleaning activity of cementitious materials can be followed via the fading of an impregnated dye such as Rhodamine B. This is illustrated in Figure 2.14, where over a given period of light exposure the cement with photocatalyst exhibits rapid dye fading compared with the undoped material. In practice, this is further illustrated by the photographs of real trials in Milan (shown in Figure 2.15), where the photocatalytic cement remains clean after a period of use compared with the undoped cement.
For paints and coatings, one idea is to limit the oxidation and chalking of the paint film to the very near-surface layers, such that over time, with weathering, rain water will wash away the top layer, leaving an underlying clean, fresh surface. The other approach is, as with the cementitious materials, for the surface deposits to be oxidized or burnt off, leaving the surface layer clean. In the former case it is important that the coating exhibits high durability, for a reasonably cost-effective, stable system. Also, the paints chosen must be stable towards flocculation and viscosity changes, must cure or dry at ambient temperature and should ideally be water based to avoid further environmental problems. Most polymers are carbon based and are unlikely to be photo-resistant, but water-based acrylic latex paints have been evaluated. In the first instance, four types of acrylic water-based paints were evaluated in terms of their relative stability towards photoactive nanoparticles. Here, a special sol–gel grade of anatase was prepared in the laboratory with no post-firing. Particles of varying sizes were also prepared via this route. The relative paint stabilities with and without the anatase sol particles (10–20 nm) at 5% w/w after 567 hours of weathering are shown in Table 2.6. Of these paint formulations, only the polysiloxane BS45 (Wacker) proved to be resistant to the photocatalytic effects of the titania particles. The styrene–acrylic, poly(vinyl acetate) and acrylic copolymers all showed high degrees of chalking (weight loss).
Table 2.6 Weight loss for paints after 567 h of Atlas exposure: various polymers plus 5% anatase sol particles, 10–20 nm.

Paint composition                   Weight loss
Styrene–acrylic                     12.2
Styrene–acrylic + anatase sol       97.3
PVA copolymer                       11.4
PVA copolymer + anatase sol         97.9
Acrylic copolymer                   7.4
Acrylic copolymer + anatase sol     101.0
Polysiloxane BS45                   23.3
Polysiloxane BS45 + anatase sol     13.6
From the point of view of surface cleaning, paint films can also be impregnated (like cement) with dyes and the fading rates measured. Aside from organic-based paints, a number of inorganic paints are commercially available, several made from complex alkali metal silicates. Because of their inorganic nature, they tend to be significantly more light stable. An example of the self-cleaning effect of a typical silicate-based paint is demonstrated by the fading data for methylene blue dye in Figure 2.16. It is seen that photobleaching of the dye occurs more rapidly in the film with photocatalyst than in an undoped film, and that the effect increases with increasing concentration of PC105 nanoparticles.
Another potential method to enhance the durability of a substrate while simultaneously controlling photocatalytic activity is to dope the paints with mixtures of durable and catalytically active grades of titanium dioxide. In this regard, mixtures of the pigmentary rutile O and the nanoparticle anatase F appear to provide one interesting option, with the former inducing some level of base stability while the presence of the latter gives rise to surface activity. Figures 2.17 and 2.18 illustrate this effect for a siliconized polyester coating exposed in a QUV weatherometer, in terms of gloss and mass loss, respectively. Gloss is seen to be gradually reduced with time, the effect increasing with increasing loading of the anatase nanoparticle F. Mass loss is also seen to increase gradually with increasing levels of the same nanoparticle. In this case, it is evident that only low levels of shedding/chalking occur with time, such that the paint film retains some level of durability except for the very near-surface layer.
Table 2.7 Weight loss for Lumiflon paint pigmented with RCL-696/nano-TiO2 after 546 h of Atlas exposure.

Nano-TiO2               Pigmentary TiO2    Weight loss
10 wt.% PC500           RCL-696            19.0
20 wt.% PC500           RCL-696            66.5
10 wt.% PC105           RCL-696            31.0
20 wt.% PC105           RCL-696            62.8
10 wt.% PC50            RCL-696            30.4
20 wt.% PC50            RCL-696            39.0
10 wt.% Showa Denko     RCL-696            77.0
20 wt.% Showa Denko     RCL-696            105.4
10 wt.% AT1             RCL-696            16.6
20 wt.% AT1             RCL-696            43.2
20 wt.% PC500           None               97.6
20 wt.% PC105           None               128.7
20 wt.% PC50            None               121.4
20 wt.% Showa Denko     None               146.8
20 wt.% AT1             None               138.7
None                    RCL-696            4.7
Clear resin blank       None               5.4
A similar but perhaps more extreme effect is shown in Table 2.7 for a Lumiflon fluorinated acrylic paint film. At 10 and 20% concentrations of the nanoparticles G, F, E and H, chalking is fairly high, whereas the pigmentary rutile O at 20% w/w gives a mass loss value of only 4.7. The pigmentary uncoated anatase A is also an option, giving high levels of chalking at 10 and 20% w/w. Thus, control of pigment type and particle size, in addition to their concentrations, is a critical area of development for effective self-cleanable paint surfaces, the effect varying also with the paint formulation. In this regard, coatings could effectively be developed to suit a particular type of environment.
2.5.1
Antibacterial Effect
The ability of the nanoparticles to destroy bacteria and fungi has also been actively pursued, and some data from our laboratories are presented here. The type of photocatalytic medium, nanoparticle and bacterium/fungus all play a key role in performance. There are a number of tests one can apply [34–36], the simplest evaluation being the typical zone of inhibition on agar plates, where the growth of bacteria is measured around a paint film. Staphylococcus aureus growth is shown on agar plates in Figure 2.19 for typical silicate paint films with and without PC105 nanoparticles. In the right-hand plate a clear zone of inhibition is seen to develop, in contrast to the undoped film.
Similar tests have also been undertaken in our laboratories on titanium dioxide powders. Here, E. coli was used, and its destruction (measured in terms of colony-forming units) after irradiation with UV light in the presence of the titania particles is plotted against irradiation time. A study on the range of titania powders showed (Figure 2.20) that there was an inverse relationship between antibacterial activity and particle size: for the pigment powders, E > H > F > G > A, with a
sol–gel colloid dispersion C and Degussa P25 having the greatest effects. However, the experimental conditions used introduced some confounding factors which required clarification in order to identify the best experimental method and the most effective pigments in terms of antibacterial activity, as mentioned previously. Here, the particles were dispersed in a Calgon medium, and this evidently reduced the activity on the plate.
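For orientation, the small sketch below shows the bookkeeping typically behind such plots (the counts and times are hypothetical, not data from this study): survivor colony-forming-unit (CFU) counts at each irradiation time are converted into log10 reductions relative to the initial count.

```python
# Hypothetical illustration (not data from this study): converting CFU
# survivor counts into log10 reductions versus irradiation time.
import math

def log_reduction(cfu_initial: float, cfu_survivors: float) -> float:
    """log10 reduction in viable count relative to the initial count."""
    return math.log10(cfu_initial / cfu_survivors)

times_min = [0, 20, 40, 60]          # irradiation times (assumed)
cfu = [1.0e6, 2.0e5, 3.5e4, 6.0e3]   # surviving CFU per mL (assumed)

for t, n in zip(times_min, cfu):
    print(f"t = {t:3d} min: {log_reduction(cfu[0], n):.2f} log10 reduction")
```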
In general, the antimicrobial effect increased with increasing concentration of nanoparticles (Figure 2.21) up to 0.04%. In the absence of any dispersion effect, the activity of the nanoparticle E was comparable to that of the Degussa P25, followed by F and G, with little difference between A, H and D and the mercury lamp control (Figure 2.22). Similar results were observed at 0.02% in our work. For the nanopigments E, F and G, a further enhanced effect was noted at 0.1%, but the effect of P25 was reduced.
The enhanced activity of C over its derivative pigment G was lost when C was dried and ground (Figure 2.23). This finding demonstrates that the drying process has a marked effect on activity, due to a decrease in surface area during aggregation and to a decrease in dispersion stability in water.
Pigment E has the most reactive surface because it has fewer defects, which increases the efficiency of the photogenerated radicals, according to our microwave analysis [33]. Thus, pigments calcined at higher temperatures (E > F > G) have better crystallinity and therefore higher antibacterial activities.
The UV light itself has little effect on the bacteria; the pigmentary grade of anatase A has a small effect, whereas the nanoparticle G has a somewhat greater effect. However, the most interesting feature of these data is the very high destructive effect of the mixed-phase nanoparticle grade made by Degussa (P25). This nanoparticle grade of titania is well established in the literature in terms of its high photoactivity [32]. In this work, a grade of nanoparticle anatase G was prepared in the laboratory whereby the particles were seeded from solution and then dried but not subsequently oven-fired. This so-called washed form of titania is seen in the data to be higher in activity than the Degussa material. This effect is currently being investigated further in terms of hydroxyl content and hydrogen peroxide generation.
The overall activity will depend on whether more TiO2 is activated as a consequence of increased surface area or whether less TiO2 is activated because less light passes through the suspension due to light scattering. Larger aggregates of particles sediment in a liquid system, and an increased concentration of pigment
shows less of an antimicrobial effect, since less light passes through the suspension if the cell–particle mixture is not stirred. Conversely, Calgon-milled pigments, which are nanometer sized, also scatter light significantly at high concentrations and decrease the activity (optimum loading 0.01%); hence the optimum activity in this work is presented by nanoparticle powder aggregates. The most important aspect to consider in terms of antibacterial inactivation is the relative sizes of the titania particles/aggregates and the bacterial cell. E. coli measures approximately 1 × 3 µm; a benzene molecule is 0.00043 µm. The porosity of the pigment has no bearing on the antimicrobial effect, whereas a chemical pollutant can diffuse into the porous particle structure. Thus, the higher surface area of pigment G did not enhance the antibacterial effect. It was verified using a disc centrifuge in our study that the three nanoparticulate powder pigments E, F and G were aggregated into 0.7 µm particles. In this case, they would all offer comparable active areas to bacteria. Only the inherent ability of the pigments to generate radicals will affect the antibacterial activity. Hence the process is more sensitive to structure (crystallinity) than to texture (surface area) and follows a clear inverse relationship with particle surface area.
2.5.2
Depollution: NOx/VOC Removal
Japanese scientists have been particularly prolific in this area for some time now. In this part of the research work it was important to be able to develop coatings and cementitious materials that remove NOx, SO2, VOCs and potentially ozone, especially in areas where such contamination is likely to be above recommended standards. Examples include motorway tunnels, underground car parks, busy highways, chemical factories and city buildings such as schools. A pictorial representation of the key mechanistic features of the depolluting paint coatings is given in Figure 2.24.
The materials should in this regard be durable and show little or no loss in activity with aging, in addition to having the ability to inactivate nitric acid reaction products. Also, as above, the coating should be self-cleaning. It may also, in some cases, need to be translucent, so that existing coatings or stonework can be over-coated without any change in appearance. To some extent the coating must be photo-resistant to the effects of the nano-TiO2, and it would probably need to be porous to allow contact between the TiO2 surface and the gaseous pollutants. Nano-TiO2 is an excellent scatterer of light, and if the coating is porous this further increases the light scatter. Some potential problems in the design of such coatings, such as poor adhesion and poor durability, have been circumvented in our laboratories. Also, the nitric acid formed in the reaction could damage the substrate or poison the catalytic reaction. A suitable test method was developed to measure the efficacy of the coatings studied, via a Signal detection system (Figure 2.25).
In the diagram shown, test films of paint are irradiated in a cell through which a standard flow rate of nitrogen is passed with a set concentration of NOx gases. NOx
Figure 2.25 Schematic diagram of NOx gas detection system for irradiated photocatalytic surfaces.
levels are measured via a chemiluminescence detector system before and after irradiation, to give a measure of depollution.
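The areal depollution figures quoted later (e.g. the mg m⁻² s⁻¹ values in Table 2.8) follow from this kind of flow-cell measurement by a simple mass balance; the sketch below illustrates the arithmetic with hypothetical numbers (the chapter does not give the actual flow rate, film area or raw concentrations, so these are assumptions).

```python
# Hypothetical illustration of the flow-cell mass balance behind an areal
# NOx removal rate: rate = (C_in - C_out) * Q / A.
def nox_removal_rate(c_in: float, c_out: float, flow: float, area: float) -> float:
    """Areal NOx removal rate in mg m^-2 s^-1.

    c_in, c_out : NOx concentration before/after the irradiated film, mg m^-3
    flow        : carrier gas flow rate, m^3 s^-1
    area        : irradiated film area, m^2
    """
    return (c_in - c_out) * flow / area

# Assumed numbers: 1 mg m^-3 removed at 1 L min^-1 over a 50 cm^2 film.
rate = nox_removal_rate(c_in=2.0, c_out=1.0, flow=1.0 / 60000.0, area=50e-4)
print(f"NOx removal rate ~ {rate:.4f} mg m^-2 s^-1")           # ~0.0033
print(f"Percentage NOx reduction ~ {100.0 * (2.0 - 1.0) / 2.0:.0f}%")
```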
Commercial dry nano-TiO2 products with a range of particle sizes were developed for evaluation, with surface areas ranging from 20 to 300 m² g⁻¹, as indicated in Table 2.5. In addition to these, colloidal sol–gel particle media were also developed for easy dispersion. Even with the smallest crystallite size it is difficult to eliminate light scattering at loadings above 5% at conventional coating thicknesses (25 µm), due to aggregation. With special non-dried sol–gel nano-TiO2 there is less light scattering because of the reduced particle aggregation. It appeared that the coatings had to be porous before there was significant activity towards gaseous pollutants such as NOx.
From the data in Figures 2.26 and 2.27, the efficacy of NOx removal increases significantly with both an increase in particle surface area and the concentration of nanoparticle titania (anatase) in a polysiloxane paint substrate. Porosity can also be introduced by using materials other than TiO2 itself. Nanoparticle calcium carbonate offered the possibility of high translucency and the ability to react with nitric acid. The results are confirmed in Figure 2.28, where it is seen that the NOx is reduced not only with increasing titania doping but also with increasing levels of calcium carbonate addition.
The most interesting feature of the results, however, is the influence of the titania and calcium carbonate loading on the extent of degradation of the polysiloxane paint films, as measured by percentage weight loss. The data in Figure 2.29 show that in the absence of calcium carbonate the extent of degradation is low, as indicated above, whereas in its presence the rate of degradation increases with concentration from 2.5 to 5.0% w/w. At 10% w/w of titania the extent of degradation is significant in the presence of the calcium carbonate. In this case the access of both moisture and oxygen through the film matrix will be enhanced. Film translucency also decreases with increasing loadings of titania and calcium carbonate particles, as shown by the data on contrast ratio in Figure 2.30.
Measurements of NOx reduction have also been obtained in terms of the NO and NO2 gases separately, where it is seen that the rate of NOx destruction is clearly greater for the nanoparticles alone, whereas the paint matrix gives rise to a barrier effect, as might be expected (Table 2.8, Figure 2.31). Nevertheless, the efficacy of the paint films in destroying the NOx gases is high. The durability of a paint film in terms of NOx reduction is also important, and this is illustrated by the plot in Figure 2.32 for a typical Eco-silicate paint system. Here, the NOx reduction capability falls by only about 10% and thereafter stabilizes after 12 months.
Table 2.8 NOx reduction for BS45 latex films and the sol alone.

Composition            NOx reduction (%)         NOx reduction (mg m⁻² s⁻¹)
                       NO        NO2             NO         NO2
BS45 latex             0         0               0          0
BS45 latex + 5% sol    84.9      9.3             0.060      0.055
Sol                    84.9      55.8            0.320      0.409
Figure 2.32 Reduction in NOx with UV radiation intensity for a typical Eco-silicate paint film.
2.6
Conclusions
Acknowledgments
The authors would like to thank the PICADA consortium and Global Engineering, Milan, Italy, for the use of some of the data in this chapter.
References
1 J. Murphy, Additives for Plastics Handbook,
2nd edn., Elsevier, Amsterdam, 2001.
2 T. C. Patton, Pigment Handbook, Vol. 1,
Wiley, New York, 1973.
3 D. R. Vesa, P. Judin, Verfkroniek, 1994, 11, 17.
4 A. Gurav, T. Pluym, T. Xiong, Aerosol Sci.
Technol. 1993, 19, 411.
5 J. G. Balfour, New Mater. 1992, 1, 21.
6 N. S. Allen, J. F. McKellar, Photochemistry
of Dyed and Pigmented Polymers, Applied
Science, London, 1980, p. 247.
3
Nanosized Photocatalysts in Environmental Remediation
Jess P. Wilcoxon and Billie L. Abrams
3.1
Introduction
3.1.1
Global Issues
Modern industrial economies have developed approaches to the manufacture, utilization and disposal of chemical and biochemical products which have inflicted considerable damage on our air and water environments. As such, advances in technology, medicine, mining, transportation, agricultural practices and military practices have not come without a price. Although the quality of human life has benefited in many ways from advances in these areas, the anthropogenic impact on the aquatic and terrestrial biosphere has been substantial, leading to pollution of the world's drinking water, soils and air. Unfortunately, the adverse anthropogenic effects on the environment are increasing [1]. This poses an undeniable threat to the ecosystem, biodiversity and ultimately human health and life.
Industry produces an estimated 300 million tons of synthetic compounds annually, a large percentage of which ends up as environmental pollutants [2]. As of 2001, approximately 100 000 metric tons of chemicals were released into surface waters and more than 720 000 metric tons were released into the atmosphere [3]. Table 3.1 shows the 2001 toxic release inventory (TRI) figures [3]. The numbers specifically for surface water and air pollution had not changed significantly by the 2004 TRI [4, 5]. However, as of the 2004 TRI, the total amount (including underground injection, landfills and wastewater) of toxic chemicals released into the environment as a result of industrial practices stands at over 1.9 million metric tons [4, 5].
Accidental oil and gas spills on the order of 0.4 million tons have also resulted in significant damage to the aquatic ecosystem [2]. In the Niger Delta alone, more than 6800 spills have been documented (approximately one spill per day for the past 25 years); however, the real number is thought to be much higher [6].
Table 3.1 Total water discharge and total air emissions for all chemicals produced by industry. (Reprinted with permission from T. Ohe, T. Watanabe, K. Wakabayashi, Mutagens in surface waters: a review, Mutation Research 2004, 567, 109.)

Industry type: chemical and allied products; food and related products; primary metal smelting and processing; petroleum refining and related industries; paper and allied products; electric, gas and sanitary services; electronic and other electrical equipment; fabricated metal products; photographic, medical and optical goods; coal mining and coal mine services; tobacco products; metal mining (e.g., Fe, Cu, Pb, Zn, Au, Ag); transportation equipment manufacture; textile mill products; stone, clay, glass and concrete products; leather and leather products; plastic and rubber products; solvent recovery operations (under RCRA); lumber and wood products; industrial and commercial machinery; petroleum bulk stations and terminals; chemical wholesalers; furniture and fixtures; printing, publishing and related industries; apparel; no reported SIC code; miscellaneous manufacturing.

Total water releases (10³ kg), in decreasing order: 26117.1; 25018.2; 20262.5; 7752.9; 7500.9; 1596.5; 1332.2; 790.8; 646.1; 344.8; 241.7; 193.8; 89.9; 79.6; 73.5; 56.6; 32.2; 10.7; 9.0; 8.2; 5.1; 0.8; 0.3; 0.1 (column total 103348.6).

Total air emissions (10³ kg): 25463.3; 26132.9; 21849.6; 71283.5; 325492.4; 5770.3; 18346.9; 3250.9; 348.7; 1130.3; 1294.8; 30251.4; 2603.9; 14181.8; 547.7; 34973.1; 442.0; 13825.1; 3755.7; 9600.4; 569.0; 3548.9; 8750.2; <0.1; 483.2; 16.6; 100153.0; 155.7; 1528.3; 3068.5 (column total 761763.6).
The cost of cleaning up each of these sites is on the order of $25 million [8]. These Superfund sites are only a small sampling exemplifying the extent of pollution in the USA. The Department of Energy (DOE), along with other US government agencies, is also responsible for billions of cubic meters of toxic contaminants affecting groundwater and soil [8]. Along these lines, the US military stockpiles contain approximately 3 × 10⁸ kg of munitions waste [9]. Generally, these munitions wastes, such as RDX and HMX (hexahydro-1,3,5-trinitro-1,3,5-triazine and octahydro-1,3,5,7-tetranitro-1,3,5,7-tetrazocine, respectively), are difficult to break down and thus linger in soil and groundwater, posing significant health threats [9].
Agricultural activities also contribute to environmental contamination, since they rely on nitrogen- and phosphorus-based compounds for crop fertilization. They also employ herbicides and pesticides to control weed and insect damage and to improve crop yields. The runoff from these activities contaminates aquifers used for human water supply, damages coral reefs and contributes to algal blooms in coastal areas and inland water bodies.
From the above discussion, it is evident that the impact of pollution by humans on the biosphere is extensive and perhaps overwhelming. However, as the scientific community assesses the problems at hand, new approaches to remediation will emerge. This chapter outlines one possible approach to dealing with water and air pollution: photocatalysis, specifically using nanosized semiconductors. It is by no means an all-encompassing solution, since there is no single approach that can address the problems noted above. However, it has considerable potential in several specialized areas, especially where visible light illumination, a free energy source, can be utilized.
3.1.2
Scope
Since the topic of environmental remediation has been a focus of the scientific community for many decades, there are numerous reviews in this field. For a broader scope on the general field of environmental remediation, we refer the reader to some of these publications [2, 3, 10–12]. This is just a small sampling of the literature in this field and should not be regarded as a complete list.
Heterogeneous photocatalysis of pollutants is just one approach to environmental remediation and has also been reviewed by numerous authors [13–17]. TiO2 has been the standard material for photocatalysis since its initial use in the photoelectrocatalytic generation of hydrogen reported by Fujishima and Honda in 1972 [18]. Since that time, the field of photocatalysis has grown significantly, and over 2000 papers have been published [13, 17].
In recent years, with the avid interest in nanomaterials, an extensive literature has also developed addressing the effects of size on photocatalytic behavior [13, 14, 16, 17, 19–35]. There is good evidence showing that size effects play a role in photocatalysis. Accompanying a decrease in size is an increase in surface area, with subsequent changes in surface chemistry, both of which are critical in photocatalysis. Most of the studies on nanosized photocatalysis for environmental remediation use TiO2.
3.2
General Field of Environmental Remediation
The best way to treat the pollution problem is to use conservation and preventive measures, such as recycling, to limit emissions into air and water sources. Steps are already being taken by industry to limit emissions through optimization of manufacturing practices, in addition to recycling and regeneration of chemicals, which also lead to economic savings in the long run [36]. Governments have also initiated legislation putting constraints on permissible emissions. However, the fact remains that there is an abundance of toxic pollutants present in our water systems, soil and air. As such, the challenge for environmental remediation is substantial.
Scientific and technical methods to mitigate environmental pollution rely on many approaches and vary for the cases of soil, water and air purification. The remediation approach chosen depends on the complexity and nature of the contaminated media and the economic costs of the treatment. Often the acceptable treatment approaches are limited, as is the case with DDT contamination of sediments off the coast of California. Mixed wastes consisting of both radiological and chemical toxins are especially difficult and costly to separate and treat. The lack of documentation of the types and amounts of chemicals in many waste sites makes the remediation process especially costly.
The most common environmental remediation techniques traditionally used to treat large volumes of water intended for municipal water supplies include carbon adsorption, air stripping, oxidation through ozonation or chlorination, incineration, ultrafiltration and sedimentation [13, 34, 35]. These approaches are generally used for large-scale pollutant removal. The main drawback of these techniques is that most of them are transfer methods, in which the pollutant is moved from one place to another or transformed from one phase to another. Activated carbon adsorption is a common and generally effective way to capture both airborne [such as volatile organic compounds (VOCs)] and waterborne contaminants. However, disposal of the saturated carbon is an issue [13].
On a smaller scale, water purifiers for home use utilize activated carbon adsorbents to remove organic contaminants at the point of use. This point-of-use application is especially useful for rural locations lacking central waste treatment facilities. However, disposal of the concentrated activated carbon in a safe manner is still problematic.
Air stripping involves the removal of VOCs from wastewater, converting the pollutant from waterborne to airborne and necessitating the treatment of the resulting gaseous products. Outside Western countries, air stripping is often used to dilute airborne contaminants from industrial waste effluents. However, this approach again simply transfers the pollution problem from the liquid phase to the gas phase without destroying the chemical, and it is therefore banned in Western countries [13]. If the stripping process is accompanied by adsorption of the stripped pollutants and photocatalytic oxidation using a high-surface-area mesh containing a photocatalyst and a light source, the process is still very useful. Such an approach is being implemented commercially in Japan, as discussed in a later section.
Some of the other techniques, such as incineration, do lead to partial mineralization of the pollutants; however, they require high temperatures and again can result in unwanted gaseous by-products. Oxidation and chlorination are usually not capable of completely mineralizing all types of organic wastes. These techniques often lead to the formation of secondary pollutants, which can be just as toxic as the original pollutant.
Biological degradation in the presence of natural sunlight is an inexpensive, common route to the degradation of certain noxious materials. The process is sometimes called the activated sludge process and relies primarily on bacteria [13]. It is somewhat slow and efficient only at low toxin levels. It also requires that pH and temperature be controlled for the health of the bacteria. Moreover, many toxic chemicals, especially herbicides and pesticides, have low solubility in water, and these pollutants are found primarily in sediments, where sunlight does not penetrate. Therefore, although pesticides such as DDT have been banned in Western countries for decades, they are still ubiquitous in riparian and coastal environments.
Chlorination is a good process for killing viruses and bacteria; however, side reactions of chlorine with organics present in the water may produce chlorinated species which are known carcinogens. In the presence of nitrates the chlorination efficiency decreases, and nitrates are fairly common in many water sources.
Complete mineralization of organic pollutants is necessary and can be achieved by advanced oxidation processes (AOPs). This terminology is used to describe a group of photochemical techniques that lead to rapid and complete mineralization of a variety of organic compounds [13, 27, 37]. These techniques include UV ozonation, UV peroxidation (using H2O2) and heterogeneous photocatalysis. The use of ozone as an oxidant and disinfecting chemical is effective but can also produce unwanted by-products such as bromate ions, which are suspected carcinogens. Thus, current research in advanced oxidation techniques has emphasized combining photocatalysis with ozone treatment [38]. Generally, heterogeneous photocatalysis is preferable to the other two processes. It uses air (O2) as opposed to O3 or H2O2, which are relatively expensive reactants. Also, depending on which photocatalyst is employed, excitation wavelengths are in the near-UV (TiO2 as photocatalyst) to visible range (sensitized or doped TiO2, or metal dichalcogenides as photocatalysts), whereas O3 and H2O2 require short UV excitation wavelengths. However, all three methods are really only applicable to small-scale waste treatment sites where there is a low concentration of contaminant present. Large-scale waste remediation has yet to be demonstrated for these AOP techniques [13, 37].
Photo-oxidation of organic and biological pollutants can be implemented alongside conventional methods using strong oxidants such as ozone and hydrogen peroxide. However, this is really only viable for high pollution levels. Typically, this treatment requires the concomitant use of short-wavelength UV lamps, which increases the cost and restricts its application to point-of-use treatment of small volumes of water and air. A strong motivation for the development of photocatalysts such as TiO2 is to decrease economic costs by using at least a small portion of the available solar light to photo-oxidize organic chemicals and convert them to harmless by-products in dilute volumes of water and air.
It has been shown that photocatalysts such as titania in combination with oxidants such as peroxide or ozone can also disinfect water by killing pathogens and oxidizing even stable contaminants. At present the costs of this combined advanced oxidation process are too great for it to be implemented on a large scale, but it may be viable in point-of-use applications for smaller volumes in rural settings. Since titanium is the ninth most abundant metal on Earth [39], the cost of producing titania photocatalysts is not likely to be the limiting factor for its application in environmental remediation. Heterogeneous photocatalysis has great potential in the field of environmental remediation, mainly due to the prospect of complete pollutant mineralization.
As such, it has been the focus of research for some time, with TiO2 as the most studied photocatalyst. For economic viability, however, new photocatalysts which can use visible light must be developed. Even the UV efficiency of the best TiO2 photocatalysts in water is only around 4%, which is too low to be economically competitive with existing approaches. The efficiency for gas-phase reactions, fortunately, is higher, and therefore the treatment of air contamination has been the first commercial application of TiO2 photocatalysts.
Economic considerations play a pivotal role in the decision to employ a particular environmental remediation technology. This point needs to be emphasized when comparing conventional treatment approaches with proposed methods such as photocatalytic oxidation using nanosized catalysts. However, nanosized catalysts offer the possibility of accessing a much larger portion of the solar spectrum (i.e. the visible portion). This in itself would be very cost-effective, leading to a great step forward in the field of photocatalysis for environmental remediation.
No single remediation technology can be expected to address the diverse global problems outlined above. However, to be widely implemented, any approach will have to possess economic advantages and be scalable to deal with large volumes of contaminated solids, liquids and gases. In the next 10–20 years it is possible that photocatalytic oxidation will play a significant role in environmental remediation if appropriate new materials such as nanosized photocatalysts can be developed and optimized for specific reactions. In the interim, as stated above, the best approach is resource conservation and recycling of materials so that waste generation is minimized. Societies and their governments should provide tax incentives to promote such approaches.
3.3
Photocatalysis
3.3.1
History and Background
A large surge in research and interest in the fields of photochemistry and photocatalysis was initiated by the oil crisis in the early 1970s, leading to a search for
alternative energy sources. Of special interest was the generation of hydrogen from
the photochemical splitting of water. Following the demonstration of photocatalytic
water splitting on a TiO2 electrode (in a photoelectrochemical cell) in 1972 by
Fujishima and Honda, the field of photocatalysis and photoelectrochemistry flourished [18, 40, 41].
It is worth highlighting some of the initial achievements that contributed to the development of photocatalysis as a field. The photocatalytic properties of wide-bandgap metal oxide semiconductors such as ZnO and TiO2 were first discussed in a tutorial article by Markham published in 1955 [42]. The first example of using light-driven catalysis in environmental remediation was that of Frank and Bard, who described the photo-oxidation of cyanide ions in solution [43, 44]. The Ollis group reported the
first photo-oxidation of several types of organic toxins using TiO2 in 1983 and this work initiated many other studies of this process [45, 46]. The first application of photocatalytic oxidation using TiO2 to kill bacteria such as Lactobacillus acidophilus, Saccharomyces cerevisiae and Escherichia coli was reported in 1985 [47]. A group led by Fujishima demonstrated the first use of TiO2 semiconductor powders to photo-oxidize HeLa tumor cells [48]. Eventually, extensive research on photocatalysis in the 1990s, particularly in Japan, led to the first report by Wang and co-workers of surface
coatings of TiO2 powders which had self-cleaning properties [49]. These surfaces
were also superhydrophilic, which prevents water droplets from forming. This work
would eventually lead to the commercial production of anti-fogging glass in Japan.
As the above historical timeline indicates, nearly all reviews of photocatalysis using
semiconductors are dedicated to the properties of titania [13, 17, 22, 28, 50]. TiO2 is the most widely investigated material due to its ready availability, lack of toxicity and photostability. However, due to its wide optical gap of 3–3.2 eV, which precludes full use of the solar spectrum, research to develop alternative materials continues actively. Nevertheless, in Japan, commercial use of titania films in self-cleaning tiles and air filters has been shown to be economically viable. Self-cleaning construction materials
such as tiles are sold by several Japanese companies and these developments are a
subject of a recent review which we discuss in more detail at the end of this
chapter [50].
The search for new materials or ways to improve visible light absorption of TiO2
through sensitization and bandgap manipulation via doping continues. Improvements in the synthesis, characterization and processing of semiconductor nanoclusters are a major focus of this chapter. Developments in the last decade allow photocatalysis using new nanomaterials such as WS2 and MoS2, which in the bulk have near-IR bandgaps that are too small to drive many photo-oxidation processes. Nanoclusters of these materials have been demonstrated to have bandgaps which blue shift into the visible region with decreasing size [51]. This permits the application of a
wider range of semiconductors as candidates for viable photocatalytic processes in
environmental remediation [52].
3.3.2
Definitions
is often called total mineralization. A scheme for these processes using TiO2
photocatalysts and the time scales associated with various steps is shown in
Figure 3.1 [13] and Figure 3.2 [37]. Note the key role of hydroxyl radicals in both
figures. Also, depending on the chemical, there are many possible rate-limiting steps
in the photocatalysis reaction.
Laser photolysis and transient absorption experiments have allowed the identification of the processes and time scales shown in Figure 3.1 [54]. Both trapped holes and electrons have distinct absorbance spectra in the visible region, and the decay rate of these absorbance features following photoexcitation allows the carrier transfer kinetics to be followed. The selectivity and efficiency of these processes depend on many factors that we review in more detail subsequently.
3.3.3
Well-Known Example: The Water-Splitting Reaction
A light-driven electrochemical photocell to split water into hydrogen and oxygen was
first described by Fujishima and Honda in 1972 and was a key result motivating interest in titania as a photocatalyst [18]. Although the oxidation and reduction potentials of TiO2 are sufficient to drive this reaction, the kinetics are difficult,
requiring four electrons to be transferred. Hence the efficiency of the process is low, estimated at less than 0.1%. This low UV light efficiency, combined with the fact that TiO2 absorbs only a small fraction of the total solar spectrum, means that most of the absorbed light simply generates heat. A lesson learned from these experiments is that although the thermodynamics for photocatalytic oxidation may be favored for a chosen chemical, competing pathways may limit the efficiency, and these pathways must be eliminated by careful design of the nanocatalyst (see Section 3.4). It is unlikely that a single type of photocatalyst will suffice for all oxidations of interest.
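For reference, the overall water-splitting reaction separates into two half-reactions (standard potentials vs. NHE); it is the four-electron oxygen evolution step that makes the kinetics so demanding:

\[
\mathrm{2\,H_2O \rightarrow O_2 + 4\,H^+ + 4\,e^-}, \qquad E^\circ = +1.23\ \mathrm{V}
\]
\[
\mathrm{4\,H^+ + 4\,e^- \rightarrow 2\,H_2}, \qquad E^\circ = 0.00\ \mathrm{V}
\]

Thermodynamically, photons of 1.23 eV would therefore suffice, but kinetic overpotentials raise the practical energy requirement well above this value.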
Many later reports of direct, fully catalytic water cleavage are doubtful since
chemicals present during the preparation of the particles (alcohols in the case of the TiO2 synthesis) can act as electron donors, allowing hydrogen evolution to occur.
However, the electrons are not furnished by the conduction band of the TiO2 in this
case. One must also be cautious regarding the contributions of reactor cell leaks to the
observation of oxygen in these reactions. Mills and Le Hunte provide a good critique
of these experiments [17].
3.4
Design Issues for Environmental Remediation Photocatalysts
3.4.1
Introduction
The factors affecting photocatalysis when using bulk semiconductors such as TiO2
will also need to be considered when working with nanosized photocatalysts.
The important processes outlined in Figure 3.1 can be viewed as design parameters
requiring optimization in order to have an efficient photocatalytic reaction.
3.4.2
Charge Separation
For the size range of nanoparticle photocatalysts discussed in this chapter, photogenerated charges rapidly diffuse to the particle surface and are trapped. Many TiO2
surfaces, for example, have oxygen vacancies that function as electron traps.
However, the electrons in these traps have lower energies than the initially photoexcited conduction band electrons and so are less likely to be transferred to oxidants. It
is possible to modify a nanocluster surface to improve the electron transfer process as
we discuss in the synthesis section of this chapter (Section 3.6). A longer electron
lifetime improves the generation of hydroxyl radicals from surface Ti–OH species.
Spatially separating the electrons from the holes by introduction of metal or
semiconductor islands on the TiO2 surface is a good approach for increasing the
electron lifetime and the photocatalytic efficiency, since the electron transfer is usually
the rate-limiting step. This is why most reactions are conducted in oxygen-saturated
water since oxygen is a good electron scavenger.
Several approaches to improving charge separation in titania photocatalysts have
been suggested and implemented. Deposition of metal islands such as Ag or Pt on
TiO2 clusters has been shown to facilitate the electron charge-transfer process, which
is kinetically the slowest redox event [55, 56]. The metal islands are thought to
function as sinks for electrons, reducing the recombination of the photogenerated
charges. The presence of Ag on the TiO2 was shown to enhance dye photo-oxidation
compared with the TiO2 particles alone in a batch slurry-type reactor. Orlov and co-workers [55] used a flow-through reactor design with immobilized Au-coated TiO2
particles to improve photoactivity for two common pollutants, methyl tert-butyl ether
(MTBE) and 4-chlorophenol.
In general, not all dopants will enhance charge separation and the subsequent
quantum yield of all reactions. For example, Fe3+, which is one of the most common dopants, is capable of enhancing the photodegradation of certain pollutants (CCl4, CHCl3 [57, 58]) while negatively impacting the photo-oxidation of other pollutants (e.g. 4-nitrophenols [58, 59]). It has also been reported that there is an optimum dopant concentration which depends on TiO2 particle size [58]. Particularly for Fe3+, the necessary doping level was found to increase with decreasing TiO2 particle size, where sizes of ≤11 nm showed enhanced behavior for photodegradation of CHCl3 [58]. However, it is not clear that the particle size and size distribution were controlled, so specific conclusions based on size may be difficult.
Alternatively, coupling two semiconductors can be used to improve charge
separation by transfer of one of the charges, say the electron, from the semiconductor
with the higher potential to the lower one. This leaves the hole on the original
semiconductor and can achieve better spatial charge separation. An example of this is
TiO2–CdS, where CdS can absorb visible light and transfer its electron to TiO2 [60].
3.4.3
pH of Solution
The initial step in photocatalysis is the photogeneration of electrons and holes, which occurs on the femtosecond time scale [24], as
shown in Figure 3.1. The carriers then diffuse to the cluster surface in less than 10 ns
and are trapped. The holes can be trapped at Ti(IV)–OH sites and the electrons at Ti(III)–OH sites in around 10 and 0.1 ns, respectively. Interfacial charge transfer of the Ti(IV)–OH holes to adsorbed organic molecules and of Ti(III)–OH electrons to molecular oxygen can then occur on time scales of around 100 ns (holes) to milliseconds. The
slow time for the latter means that methods to accelerate the transfer of electrons are
important to minimize undesired surface trapped carrier recombination.
3.4.5
Presence of Simple and Complex Salts
Metal salts, especially simple ones such as NaCl, are present in most wastewater and
many natural aquifers. As such, they affect both photocatalytic activity and selectivity.
Most common anions found in water systems such as chloride, nitrate, phosphate
and sulfate decrease the catalytic oxidation of organic compounds [13, 61–63]. The
presence of ions during water purification using photocatalysis is an important research topic, since some present and future sources of water will require desalination, and such salt water also contains organic pollutants from off-shore dumping, etc. If brackish water must first be completely desalinated to avoid poisoning of the photocatalyst, the cost of environmental remediation of organic contaminants will not be economically competitive with carbon adsorption approaches that do not have this requirement. Hence the ability to formulate a photocatalyst which is somewhat salt tolerant is important.
Certain anions, such as sulfate and phosphate ions, can form reactive species such as H2PO4 radicals, which are good oxidants, thus improving the photo-oxidation rate. However,
their other effect is to adsorb on the photocatalyst surface, which can block active
catalytic sites. This adsorption is strong enough that washing with an alkaline solution
is necessary to restore the photocatalytic activity. For example, phosphate adsorption at
low concentrations of only 1 mM has been reported to reduce the photo-oxidation of
organics such as ethanol and aniline by around 50% [61].
The effect of cations on the photoactivity is more varied and dependent on the
type of organic molecule considered in addition to the metal ion type and concentration [13, 31, 64, 65]. At high concentrations, their presence is mostly unfavorable
since they may undergo photo-reduction by the photocatalyst itself and deposit on
the surface, blocking active sites. By maintaining the reaction media at low pH,
the positively charged photocatalyst surface will repel cations, lessening their negative
impact.
Photo-reduction, however, can be useful in improving activity in the case of metals
such as Pt, Ag or Pd, provided that full coverage of the catalyst surface does not occur.
Instead, metal islands are formed, which function as electron storage sinks and
facilitate charge separation and transfer. This topic is discussed in more detail in the
section on synthesis and photocatalyst surface modication. Certain types of metal
ions such as Cu(II) and Fe(II) can enhance photocatalysis by limiting carrier
recombination through trapping either electrons or holes. The facile ability to change
oxidation state is critical to this function. However, many metals, including Fe, are
prone to hydroxide formation and deposition of these metal hydroxides on the
photocatalyst surface is almost always detrimental. This means that an optimal metal
ion concentration in the range 100–500 ppm (0.01–0.05%) should be determined and
used for a chosen photocatalytic reaction.
There is a close connection between the pH of the solution and whether an ion
such as Cu(II) accelerates photocatalysis or not. For example, in the photo-oxidation
of aliphatic acids by TiO2 in the presence of Cu(II), it was reported that for pH
values <4 Cu(II) forms complexes which are active intermediates, trapping holes,
whereas copper diacetate complexes form at higher pH values and poison the
photocatalyst [66, 67].
In our work on the photo-oxidation of pentachlorophenol (PCP), we demonstrated that photocatalysis by nanosized TiO2 and SnO2, synthesized by ambient-temperature hydrolysis followed by dialysis to remove unwanted by-products, was quenched by the addition of simple salts such as NaCl at concentrations of only 10 mM [68]. These low concentrations of NaCl have a poisoning effect on the catalyst, although it is mild in this case. Similar
observations have been made in field tests of TiO2 where deionization of the aqueous
waste stream has been found to be necessary [69]. This is an important consideration
when estimating costs of remediation using photocatalysts, since desalination is a
costly process. In other studies, by Barbeni et al., no inhibition of the PCP photocatalysis by TiO2 in the presence of NaCl at 1 mM occurred, so the presence of low-level salts may be acceptable [70].
3.4.6
Effect of Surfactants
In our work, we investigated not only the effect of simple salts on the rate of photo-oxidation of PCP, but also that of complex, surface-active salts (surfactants) such as dodecyltrimethylammonium chloride (DTAC), a cationic quaternary ammonium salt with an organic, hydrophobic tail group. Using this surfactant, we could isolate the effect of the common anion, chloride, from that of its counterion: sodium in the case of NaCl and dodecyltrimethylammonium in the case of DTAC. To our surprise, this complex cationic surfactant significantly accelerated the rate of photo-oxidation of PCP by Degussa P25 titania in addition to our nanosized photocatalysts [68]. Two
explanations of these experiments are reasonable:
1. Binding of PCP at photocatalytic surface sites, which may be hindered by the presence of a bulky surfactant such as DTAC, is not the rate-determining factor in the photo-oxidation of PCP by TiO2. Instead, this surfactant either aids hole or electron transfer to PCP or to electron-accepting species such as oxygen. Possibly free hydroxyl radicals are formed, which can diffuse to the PCP, so that direct hole transfer to bound PCP is not required.
2. The sodium cation in NaCl is mainly responsible for the strong quenching of the
photo-oxidation. This contradicts other studies cited above, however.
The significant enhancement (note the logarithmic scale on the vertical axis) due to
the presence of a surfactant is shown in Figure 3.3. Note that at similar concentrations
of NaCl the rate of PCP photo-oxidation is significantly slower. A common argument
against using surfactant-stabilized nanoclusters as catalysts is that the surfactant will
block access to the cluster surface and thus poison the catalyst. Our study shows this
is not the case for DTAC. We also discovered that substituting a bromide counterion for the chloride in DTAC does not increase the photo-oxidation of PCP as greatly as DTAC itself does. It is also worth noting that the surfactant peaks in the
3.4.7
Effects of Dissolved Oxygen and Solvent
Most photocatalysis studies have been performed in aqueous solution. An observation common to the photo-oxidation of most organics in water is that dissolved oxygen
plays a vital role in the oxidation process by forming first dioxygen radical anions by
abstraction of the electron formed upon photoexcitation of the semiconductor and
then peroxides by reaction with the water [72]. Many researchers deliberately aerate
their photocatalyst slurry suspensions in order to optimize the photo-oxidation
process and claims have been made numerous times that nearly total quenching
of the photo-oxidation process occurs when inert gas purging is employed [70]. Thus,
the use of aprotic organic solvents should cause a severe quenching of the photo-oxidation due to both the lack of OH radicals and reduced oxygen levels because of
decreased oxygen solubility.
Molecular oxygen has a strong affinity for electrons and its presence should reduce undesired carrier recombination, thus increasing the effectiveness of the photocatalyst [73]. However, there is evidence that at high concentrations of oxygen the
photocatalytic activity is reduced, possibly due to changes to the TiO2 surface such as
hydroxylation, which could interfere with the adsorption of the organic on catalytic
sites [74].
If water and dissolved oxygen play critical roles in the photocatalysis process using
TiO2, then carrying out these reactions in a polar, but aprotic solvent such as
acetonitrile (ACN) would be expected to quench the process and could alter the
photo-oxidation pathway. However, complete quenching is not observed in the anaerobic catalytic photo-oxidation of pentachlorophenol by Degussa TiO2 in the aprotic solvent ACN (filled triangles, Figure 3.4). Uncatalyzed photo-oxidation in ACN, which occurs by a different mechanism, is quenched, however (open diamonds, Figure 3.4) [68]. In Figure 3.4, it can be observed that although the rate of catalytic photo-oxidation is 2–5
times slower in nitrogen-purged ACN than in water, photocatalysis does occur. Also, the mechanism for PCP photo-oxidation in water and in ACN is similar, as verified by the presence of common elution peaks for the main photo-oxidation intermediate in the HPLC traces for both solvents.
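Rate comparisons of this kind are typically made by fitting a pseudo-first-order decay, ln(C0/C) = k_app t, to the measured concentrations. The following minimal sketch illustrates such a fit; the concentration data are synthetic placeholders, not the PCP measurements of Figure 3.4:

    import numpy as np

    # Synthetic normalized concentration-vs-time data (C/C0), illustration only.
    t = np.array([0.0, 10.0, 20.0, 30.0, 40.0])   # irradiation time, min
    c = np.array([1.00, 0.78, 0.61, 0.47, 0.37])  # normalized concentration

    # Pseudo-first-order model: ln(C0/C) = k_app * t; the fitted slope is k_app.
    k_app, intercept = np.polyfit(t, np.log(c[0] / c), 1)
    print(f"apparent rate constant k_app = {k_app:.4f} min^-1")

Comparing the k_app values obtained in different solvents or electrolyte backgrounds then gives relative rates of the kind quoted above.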
3.4.8
Light Intensity
confinement due to decreased particle size from surface chemistry changes which
occur in the same size range. We will give examples of both types of size-dependent
changes in this chapter.
3.5.2
Nanomaterials and Advantages in Photocatalysis
3.5.2.1 Semiconductor Nanoclusters
As in the case of their bulk counterparts, nanosized semiconductors have a range of forbidden energies (a bandgap) whose most important effect for photocatalysis is to allow the creation of valence band holes and conduction band electrons using light. Fast
recombination of these photoexcited holes and electrons is an undesirable effect for
photocatalysis. Hence nanosized semiconductors with strong photoluminescence
are poor candidates as photocatalysts. Fortunately, the diffusion time for electrons or
holes in nanoclusters is so fast that the normal bulk recombination process is not
important. The most important effect is trapping of the carriers at the surface. This
trapping can be influenced by cation or anion vacancies [e.g. Ti(IV) or O2−] at the
surface which are deliberately or accidentally introduced during the synthesis. It is
also possible to use surface-active molecules such as surfactants, certain ions and the
general solvent environment to change the carrier lifetime at the cluster surface and
improve the oxidation process for adsorbed organic chemicals. Deposition of metal
islands such as Pt that improve charge separation by serving as electron traps is
another avenue for enhancing charge separation. Each of these synthetic strategies
will be illustrated in more detail later with examples from the literature. Many of these
effects are simply due to the different geometry and chemical bonding in a cluster
surface at small dimensions (1–3 nm) and would occur even in the absence of quantum confinement of the electron–hole pair.
3.5.2.2 Quantum Confinement
The quantum size effect was first discussed by Fröhlich in 1937 [77, 78]. The concept of quantum confinement that he described evolved from a series of observations where light interactions with certain materials varied as a function of form. Fröhlich discussed quantum size effects in the context of observed differences in light-scattering behavior between small particles and corresponding thin films of the same material [77, 78]. The quantum confinement model is similar to the particle-in-a-box model, where an electron is confined in a finite volume of space and the number and energy levels of the possible states are determined by the confining potential. In an aromatic molecule, for example, this electron confinement or delocalization is determined by the degree of bond conjugation, which sets a length scale over which the electron is confined by the electrostatic potential. In such a system only a discrete number of energy states are possible and, as the physical size or delocalization of the electron increases, so does the number of states, eventually becoming so closely spaced as to be essentially continuous.
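The familiar one-dimensional particle-in-a-box result makes this scaling explicit: for a box of length L, the allowed energies are

\[
E_n = \frac{n^2 h^2}{8 m L^2}, \qquad n = 1, 2, 3, \ldots
\]

so the spacing between adjacent levels grows as L shrinks, and the discrete levels merge toward a continuum as L becomes large.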
In a semiconductor, the states which make up the valence and conduction bands
are also so closely spaced as to appear continuous. However, as the semiconductor
predictions of energy shifts with size are qualitative, rarely agreeing with experiments for sizes less than 4–5 nm.
Discussions of quantum confinement have been the subject of several reviews [13, 15, 17, 34, 75, 76, 79–81]. Many reviews have considered the effects of quantum confinement on electrical, optical and photocatalytic properties. In particular, Brus modeled the shift in redox potential with decreasing particle size for CdS and InSb [82]. Brus's model is based on a description of bulk state behavior in the limit of small crystallite size. It uses approximations from the band theory of perfect lattices and assumes a crystal structure matching that of the bulk material. As a result, it cannot account for surface states. In general, this model predicts fairly mild quantum effects as a function of size, especially when the crystallite sizes are larger than 5 nm.
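For a spherical crystallite of radius R, the size-dependent gap in this effective-mass picture is commonly written (in Gaussian units) as

\[
E(R) \approx E_g^{\mathrm{bulk}} + \frac{\hbar^2 \pi^2}{2 R^2}\left(\frac{1}{m_e^*} + \frac{1}{m_h^*}\right) - \frac{1.8\, e^2}{\varepsilon R}
\]

where the second term is the particle-in-a-sphere confinement energy, the third is the screened electron–hole Coulomb attraction, m_e* and m_h* are the carrier effective masses and ε is the dielectric constant; the 1/R² term dominates at small R, producing the blue shift discussed above.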
Some of the predictions are based on the relationship between the particle size and
the effective mass of the exciton [i.e. the effective mass model (EMM)]. There is some
controversy as to whether the effective mass model should be used to make
quantitative estimates of particle sizes based on absorption spectra shifts and the
use of the same exciton effective mass as that for the bulk counterpart [79]. The actual
value for the exciton effective mass is itself dependent on size and shape.
Both cluster size and shape influence the energy shifts observed in nanoparticles. A rod-like cluster will have different confinement potentials for the longitudinal and transverse directions, for example. Also, the confining potentials for electrons and holes are typically different, as reflected in the different effective masses of the electrons
and holes or curvatures of the potential surfaces for the valence and conduction
bands. In most materials, electrons are much more mobile than holes and the
conduction band will shift more strongly than the valence band with decreasing size.
As noted by Wise [83], the particle size at which confinement effects become prominent is greatly influenced by whether or not the charge carrier mobilities are
similar. Certain materials such as PbS and PbSe satisfy this requirement, but most
others such as CdS and TiO2 do not. Chemically, this is due to the unequal sharing of
electrons in polar semiconductors such as metal oxides and sulfides.
The blue shift of the bandgap absorption onset impacts the design of nanosized photocatalysts in several ways, and the length scale at which such effects are first observed is materials dependent. This increase in the bandgap as a function of decreasing size allows the possible use of a wider range of materials as photocatalysts. Many of these materials are not catalytically active in their bulk form. For example,
semiconductors such as MoS2 and WS2, which have near-IR absorption onsets, can
be shifted into the visible region by decreasing the cluster size. Both valence and
conduction bands shift in energy, the valence band becoming more positive and thus
the holes more strongly oxidizing. The photoexcited electrons in the conduction band
are shifted to more negative potentials, also improving their transfer to such species
as molecular oxygen. In general, the greater the effective mass of the holes compared
with the electrons, the larger is the shift in the conduction band energy with
decreasing size. The amount of the shift can be estimated by measuring the rate
of hole or electron transfer to fluorescent hole or electron acceptor molecules as a
function of cluster size [84].
The interplay between quantum size effects and surface chemistry changes with
decreasing particle size can be optimized in order to enhance the activity of a
particular catalytic reaction. However, in order to do this, there must be fairly good
synthetic control over particle size and monodispersity. This size control is not
available through all synthesis techniques and seems to be difficult in the case of
TiO2. As a result, many conclusions relating photocatalytic activity to size may seem
contradictory. More often than not, other factors such as photocatalyst environment
or synthetic protocol dominate the changes in the observed reactivity.
3.5.2.3 Surface Chemistry
In addition to increasing the strength of the confining potential and shifting the light
absorption onset, decreasing particle size results in a larger fraction of atoms which
are in surface sites with bonding differing from the interior atoms. For small clusters
(1.5–2 nm), 70–80% of all the atoms reside on the surface [85]. Sometimes cluster
surface structures are considered defective. However, these defects may be useful for
photocatalysis. For example, a metal oxide defect structure (e.g. TiO2) with oxygen
vacancies can enhance both the adsorption of water on the surface and the dissociation rate of water into hydroxyl groups and protons. This water dissociation process
requires the presence of paired acid–base sites that are situated in close proximity.
Surface sites with acid character such as titanium cations initially bind water
molecules, whereas neighboring sites with basic characteristics such as Ti–O–Ti
structures can accept a proton from the water molecule. In nanosized materials, the
structural arrangement of the interior atoms (that is, the phase structure as determined by diffraction methods) is less significant than the surface chemistry.
For example, in the case of titania, the surface hydroxyl groups play a critical role in
initiating the oxidation process by both influencing the adsorption of organic
chemicals and aiding the dissociation of adsorbed water into hydroxyl radicals and
hydrogen ions. Free hydroxyl radicals are very good at attacking and oxidizing a wide range of organic groups and, with a potential of around 2.8 V, are more effective oxidants than any common species except fluorine. Even surface-bound hydroxyl radicals, with a potential of 1.5 V, can be effective oxidants [86].
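The scaling of the surface-atom fraction with size can be estimated with a simple shell model: treating the cluster as a sphere of diameter D, the fraction of atoms lying within one atomic diameter a of the surface is f = 1 − (1 − 2a/D)³. A minimal sketch (the atomic diameter of 0.25 nm is an assumed, representative value):

    # Shell-model estimate of the fraction of atoms in the outermost atomic
    # layer of a spherical cluster: f = 1 - ((D - 2a)/D)**3, where D is the
    # cluster diameter and a is an assumed atomic diameter (0.25 nm here).
    def surface_fraction(diameter_nm: float, atom_diameter_nm: float = 0.25) -> float:
        core = max(diameter_nm - 2 * atom_diameter_nm, 0.0)
        return 1.0 - (core / diameter_nm) ** 3

    for d in (1.5, 2.0, 5.0, 25.0):
        print(f"D = {d:4.1f} nm -> surface fraction = {surface_fraction(d):.0%}")

For 1.5–2 nm clusters this crude estimate already places well over half of the atoms at the surface, consistent with the 70–80% figure quoted above.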
Synthetic changes of the surface properties of nanosized catalysts can also be used
to modify redox potentials and substrate binding, independent of quantum confinement effects. Kamat and Meisel have taken advantage of this effect through alterations of the interface of nanosized TiO2 particles deposited on Au [27]. The intimate
contact between the metal and the surface of the nanosized TiO2 was reported to be
crucial in improving charge transfer. Additional examples of the effect of ions and
metals on the surface of semiconductor photocatalysts will be given later.
3.5.2.4 Other Unique Materials Properties
Since most of the atoms in a nanocluster can reside at accessible surface positions,
very small changes in the chemistry of the cluster surface can significantly alter its photocatalytic properties. For example, in a 50–60 atom cluster with a diameter of around 1.6 nm, the addition of a single foreign atom can change the interatomic
spacing and thus the binding energy between the catalyst nanoparticle and an organic
3.6
Nanoparticle Synthesis and Characterization
3.6.1
Introduction
An important point about utilizing electron microscopy to study cluster size is that
only a small portion of the sample (e.g. a few hundred particles) is analyzed. Therefore,
if an unrepresentative portion of the TEM grid is chosen, conclusions regarding the
average size and size dispersion for the entire sample are not warranted. Instead,
SAXS is a more objective measurement, since about 10¹² clusters contribute to the
SAXS signal. The drawback of SAXS measurements for the smallest photocatalysts
(1–3 nm) is that the scattering is very weak and conclusions regarding size dispersion
are more problematic. Since this size regime corresponds to the size of photocatalysts
that are typically the most active, one must use analytical methods designed for the
study of large molecules and polymers such as size-exclusion chromatography (SEC)
to obtain precise size and size dispersion data for such nanoclusters.
For fully dispersed nanosized photocatalysts, it is possible to use HPLC to
separate both chemicals and photocatalysts in complex mixtures and study each
component by various analytical methods such as optical absorbance or photoluminescence. There are two mechanisms for separation. The first, SEC, depends
on the degree of permeation of clusters into a porous chromatographic medium
which is packed into a column of a given length. Clusters are injected into a flowing
mobile phase such as toluene in which they are soluble and then transported
through the porous medium. A good example of a hydrophobic porous medium
suitable for SEC is microgels of cross-linked polystyrene. Chromatographic columns packed with microgel particles are designed to separate chemicals by size.
The smallest chemicals can penetrate a larger fraction of these channels and
therefore take a longer time to elute. Using molecules of a known size, one can
calibrate the column so a given elution peak time can be used to obtain the
hydrodynamic size of a nanocluster [88]. Figure 3.7 gives an example of the effect
of chemical size on the elution time for three calibration standards consisting of
aliphatic hydrocarbons labeled C16 (hexadecane), C12 (dodecane) and C8 (octane).
Larger metal (Au) and semiconductor (CdSe) nanoclusters are included in this
chromatogram. The time axis is proportional to the hydrodynamic size, D ∝ log t, as shown. This hydrodynamic size includes surfactants which are on the surface of the nanoclusters.
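In practice, the calibration just described reduces to a linear fit of hydrodynamic size against the logarithm of elution time. The sketch below uses invented standard sizes and times, not the values of Figure 3.7; the negative slope reflects the fact that larger species elute earlier:

    import numpy as np

    # Hypothetical calibration standards: (hydrodynamic size in nm, elution time in min).
    # A real calibration would use molecules of known size, such as the C8/C12/C16
    # alkanes mentioned in the text.
    standards = [(1.5, 14.0), (1.2, 15.5), (0.9, 16.8)]

    sizes = np.array([s for s, _ in standards])
    log_t = np.log10([t for _, t in standards])

    # Fit D = a*log10(t) + b, following the D-versus-log(t) relation for this column.
    a, b = np.polyfit(log_t, sizes, 1)

    # Estimate the hydrodynamic size of an unknown cluster from its elution time.
    t_unknown = 12.5  # min (hypothetical)
    print(f"estimated hydrodynamic size: {a * np.log10(t_unknown) + b:.2f} nm")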
The simple relation between elution time and hydrodynamic size breaks down
when the chemical or nanocluster interacts chemically with the column material.
Examples of materials which have strong chemical affinities for metal oxide clusters are silica- and alumina-based columns, which may be modified with various types of
organic moieties to make them more or less hydrophilic. If a nanocluster is
somewhat hydrophilic, e.g. a metal oxide such as TiO2, and the column is also hydrophilic (e.g. also a metal oxide such as alumina), the cluster will repeatedly stick and release as it moves down the column. How strongly the molecule
sticks to the column will influence its elution time, with the most hydrophilic clusters
eluting at the longest time. This affinity chemistry can be used to separate and study
clusters based on cluster surface chemistry. Since surface chemistry is so vital to
photocatalytic activity and selectivity, this type of chemical affinity chromatography
provides a useful complement to SEC.
Various types of detectors may be used in-line with the separation column to
detect the elution time of chemicals. In Figure 3.7, the detector response has been
normalized by its peak, but the total area under the elution peak is proportional to
the amount of chemical and is a very useful piece of information. If the chemical or
cluster absorbs light, an absorbance spectrometer which can detect either the complete spectrum or just a single wavelength can be used [89]. Since semiconducting clusters such as MoS2 absorb in the visible region, monitoring the
absorbance provides a signature for the elution of a cluster. Most organic pollutants
absorb light only in the UV or near-UV range, so monitoring the light absorbance in
this region allows the detection of these species. For non-absorbing chemicals, the
change in the refraction of light when a chemical is present in the mobile phase can
be used to detect the elution. Fluorescent molecules or nanoclusters (most
semiconductor nanoclusters used as photocatalysts are at least weakly fluorescent and emit in the visible region) can be detected with an in-line fluorescence
spectrometer. The entire absorbance spectrum corresponding to an elution peak
can be used to distinguish different clusters or chemicals [90]. The width and
degree of shape homogeneity of the spectra of the elution peak can also be used to
gauge the size dispersion of the clusters.
Collection of an eluting peak allows further identication of the chemical or cluster
by other analytical methods such as gas chromatography–mass spectrometry (GC–MS). Elemental composition can be quantified by using X-ray fluorescence spectrometry (XFS). The latter technique is non-destructive and can be used for either solutions or
solid lms. The collected solution can be dried out and the resulting powder
subjected to gas adsorption measurements to determine the available area per unit
mass. Comparisons of such data with the geometric area based on the particle size are
useful for inferences regarding geometry.
Dynamic light scattering (DLS) measures the diffusion rate of particles in dilute
solution, where by dilute we mean that the particles are separated from each other
by many particle diameters. This is the case for most photocatalysis studies. DLS is
very useful for monitoring particle photocatalyst size changes and aggregation
which might occur as a result of the photocatalytic reaction. If aggregation of the
colloids occurs, for example, less catalyst surface area will be available to substrates, lowering the photoactivity of the nanomaterial. It is important to establish
that the nanosized photocatalyst average size remains unchanged as a result of the
reaction.
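DLS yields a translational diffusion coefficient, which the Stokes–Einstein relation converts into a hydrodynamic diameter, d_H = k_B T/(3πη D_diff). A minimal sketch, in which the measured diffusion coefficient is an illustrative placeholder:

    from math import pi

    K_B = 1.380649e-23  # Boltzmann constant, J/K

    def hydrodynamic_diameter(d_coeff: float, temp: float = 298.15,
                              viscosity: float = 8.9e-4) -> float:
        """Stokes-Einstein: d_H = k_B*T / (3*pi*eta*D), all quantities in SI units."""
        return K_B * temp / (3 * pi * viscosity * d_coeff)

    d_meas = 2.0e-11  # measured diffusion coefficient, m^2/s (hypothetical)
    print(f"hydrodynamic diameter = {hydrodynamic_diameter(d_meas) * 1e9:.1f} nm")

With the viscosity of water at room temperature, a diffusion coefficient of 2 × 10⁻¹¹ m² s⁻¹ corresponds to a diameter of roughly 25 nm, i.e. in the size range of commercial TiO2 powders.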
Studies of TiO2 synthesis in water–alcohol mixtures sometimes use DLS to
measure the particle size in solution and to follow changes with time, since these
clusters are not agglomerated and are fully dispersed. This is rarely done for most
experiments using commercial gas-synthesized TiO2 nanocluster photocatalysts
since slurries or suspensions of these clusters in water are turbid. This means that
light is multiply scattered, precluding DLS measurements of cluster diffusion. For
example, the strongly turbid solutions formed by commercial powders of TiO2 such
as Degussa P25 are unfortunately unsuitable for DLS studies.
3.6.3
Detailed Examples of Nanocluster Synthesis and Photocatalysis
3.6.3.1 Semiconductor Nanoclusters
Photocatalysts based on semiconductor nanoclusters must have certain chemical,
electronic and optical properties to be useful in environmental remediation. Other requirements include low synthesis cost, photostability in water over a wide range of pH
values and a light absorption range that includes a signicant portion of the solar
spectrum. In addition, the oxidation potential of the valence band hole must be
sufficiently positive to form free radicals from adsorbed organics and/or create
hydroxyl radicals from adsorbed water. Ideally, the reduction potential of the
conduction band electrons must be negative enough to transfer electrons to dissolved
oxygen or other electron acceptor molecules. No single material of a fixed size can satisfy all of these conditions but, as many reviews have noted, titania comes the closest [13, 17, 34]. It is photostable under near-UV illumination conditions, rendering it environmentally benign, and its redox potentials can drive photo-oxidation reactions for a variety of organic compounds and biotoxins [13, 17]. It suffers primarily from too wide a bandgap, which requires that UV light excitation be used, corresponding to only 2–5% of the solar spectrum.
Many complete reviews focusing only on titania have appeared addressing its
photophysical and photocatalytic properties [13, 14, 16, 17, 20, 22, 24, 26, 34, 40]. We
shall selectively and critically review some of its properties based on these studies. We
then discuss in more detail nanosized semiconductors which have received less
attention, such as MoS2 and WS2. In our discussions we will emphasize the
connection between synthesis, size, nanostructure and photocatalytic properties
and also the characterization methods best suited to study this relationship.
j77
78
3.6.3.2 TiO2
Synthesis, Nanostructure and Electronic Properties The growth of a solid, colloidal
material such as TiO2 or SnO2 from chemical precursors consisting of metal
alkoxides can be catalyzed by acid hydrolysis. The hydrolysis is a stepwise process
which initially produces TiO2 molecules during the induction period. Over time, an
increase in concentration of these molecules creates a condition of supersaturation,
at which point nucleation occurs to form small agglomerates (typical sizes are 1–2 nm). It is generally known that a relatively short nucleation period removes the
supersaturation condition and results in a narrow size distribution. Following the
nucleation events, growth continues until all the precursor is exhausted.
Over the years, protocols have been empirically developed to synthesize TiO2
colloids in the nanosize regime. Unfortunately, reviews and papers discussing the
use of nano-TiO2 as a photocatalyst usually give very little or no specific information
regarding the many details of the synthesis, particularly the important issue of colloid
size control. Research suggests that particle size is important for the photocatalytic
activity of TiO2 [58]. Hence it is worthwhile to give some synthesis examples from our
laboratory discussed in a previous paper [68].
Our method is based on slow injection of a solution of either titanium or tin
isopropoxide in 2-propanol into a rapidly stirred acidic solution (pH 1.5). The ratio of 2-propanol to water was fixed at 1:10. Under these conditions, we observed that
smaller sizes are favored by higher precursor concentrations and faster rates of
injection for colloid sizes between 10 and 100 nm. A plausible rationalization of this
observation is that higher titanium isopropoxide concentrations will produce larger
numbers of critically sized nuclei of TiO2, with less available precursor left to add to
these nuclei, so growth will both be faster and end more quickly.
A systematic increase in final colloid size occurs upon increasing the pH of the
acidic solution. However, this size increase is accompanied by a loss of long-term
stability against aggregation. A critique of this explanation of the observed smaller
sizes at higher concentrations is that it ignores the possible sintering and/or
exchange of atoms between clusters which may occur after all the precursor is
exhausted. This can result in changes in the colloid size distribution and even the
average size. Our conclusions regarding size control differ from those of other workers, who synthesized larger (400–500 nm) TiO2 colloids by a similar process and found no systematic dependence of final colloid size on initial precursor concentration [91]. However, these experiments did not use acid-catalyzed hydrolysis, so the nucleation process was slower for an equivalent precursor concentration and resulted in colloids 400–500 nm in size. The small specific surface area of such colloids makes them unsuitable for efficient photocatalysis.
Nanostructure, Surface Structure and Photocatalytic Activity Although the TiO2 anatase phase is favored thermodynamically, the solution growth process that we described can produce a mixed anatase–rutile phase and also amorphous material. Degussa P25 photocatalyst, for example, has a 70:30 anatase:rutile composition. Both the anatase and rutile phases have tetragonal lattices built from linked TiO6 octahedra.
Figure 3.8 Crystal structures of (a) anatase, (b) rutile and (c) brookite, showing the connections between the TiO6 octahedral subunits which distinguish the polymorphs of TiO2. (Reprinted with permission from O. Carp, C. L. Huisman, A. Reller, Progress in Solid State Chemistry 2004, 32, 33.)
active sites and thus reduce the photo-oxidation of organic chemicals. One might also expect the physical size and adsorption mechanism to be strongly affected by the type of organic, which explains several of the conflicting reports concerning the efficacy of Pt deposition [69, 98].
An additional example of the use of metal island deposits in commercial anatase
powders is the use of Ag/TiO2 powders prepared by the photo-reduction of Ag+ ions by
suspensions of TiO2 [56]. The deposition of the Ag on TiO2 was accompanied by
increased absorption of the Ag/TiO2 in the visible range (400–600 nm) where Ag
clusters have significant absorbance. However, the absorbance was much broader
than would be observed in isolated spherical clusters. This visible absorbance
increased as the fraction of the Ag deposited increased from 0.5 to 2.0 wt.% relative
to TiO2. The catalytic photo-oxidation of two dyes was studied using a batch slurry
reactor illuminated with UV light. Both dyes were photo-oxidized at substantially
faster rates compared with the naked TiO2 powder and, as in the Pt/TiO2 studies
described above, an optimal Ag loading of around 1.5% was found. The fast photo-oxidation of dyes by TiO2 prevents the use of the dyes themselves to enhance the
absorption of TiO2 into the visible region. However, as discussed below, this dye
sensitization has been proposed as a method of increasing the efficiency of TiO2
photocatalysts.
The use of TiO2 with deposited noble metals can be studied either with batch slurry reactors, as described above and shown in Figure 3.9 for studies of TiO2/Pt and TiO2/Ag, or with more practical flow reactors whose walls are coated with the catalyst material. A good example of the latter reactor design was given by Orlov et al. [55]. In
this work, the photo-oxidation of two significant types of environmental pollutants
was studied using both Degussa P25 TiO2 and the same material modied by
deposition of gold particles.
The method for deposition of the Au nanoparticles on the TiO2 was taken from that
described by Haruta [99]. This method produces hemispherical gold islands with a
large contact area along the perimeter between the TiO2 support and the Au. This
large contact area is considered critical in the activity for oxidation reactions since the
TiO2 will adsorb oxygen and water while the Au can serve as a reservoir of electrons.
The synthesis method described by Haruta can produce small Au islands on any form
increased physical separation of electron and hole due to this transfer process can
improve the efficiency of the photo-oxidation. Examples of dyes which are effective and
reasonably stable for this process are ruthenium–bipyridine complexes [101, 102]. The
cost of such complexes due to the expensive Ru may make such modications
impractical for large-scale systems, however.
An interesting method of extending the absorbance onset via electron transfer
while avoiding the problems with degradation of the organic part of a dye is to deposit
transition metal salts such as Pt, Rh or Au chloride on the surface of TiO2. It has been
demonstrated that photo-oxidation of chlorinated compounds such as 4-chlorophenol using only visible light at 455 nm is then possible [65, 103–105]. The metal chloride complex absorbs visible light and undergoes M–Cl bond breakage, leaving the metal in an oxidized state and the chlorine atom on the cluster surface. Then electron transfer from this chlorine atom to a chlorine atom in the organic compound also adsorbed on the TiO2 occurs and regenerates the metal, making the reaction catalytic. There are fewer potentially adverse reaction pathways in this approach compared with the use of organic dyes as sensitizers.
Research Needs for Future Improvements The key to further improvements in TiO2-based photocatalysts is to develop synthetic methods extending the light absorption into
the visible region so that UV lamps and their costs can be eliminated. These new
photocatalysts may have to be based on new nanosized materials as described in the next
section. Also, not enough is known concerning the true longevity of TiO2 when used
under real conditions of salts and other inorganic metal ions in the presence of organic
pollutants. Since any particulate photocatalyst must be immobilized to prevent contamination of the water and loss of the catalyst, high surface area flow reactors which
allow light to reach all the photocatalyst surface are critical. Methods of inexpensively
regenerating the active surfaces of such reactors will also have to be developed.
3.6.3.3 Alternative Photocatalytic Materials
Introduction Several alternatives to TiO2 as potential photocatalytic materials have also been investigated in an attempt to access more of the visible portion of the solar spectrum and potentially to enhance the photocatalytic efficiency for different reactions. Some of these alternative materials include ZnO, SnO2, CdS, MoS2, WS2 and, more recently, nitrides and oxynitrides such as Ta3N5 and TaON, respectively. Of the oxide materials (other than TiO2), ZnO has been the most studied. Along with ZnO, we also review the newer nitrides, specifically Ta3N5, as part of a small sampling of the materials that have been investigated. MoS2 is outlined in detail in the latter part of Section 3.6.
ZnO ZnO is another wide-bandgap material which has been explored as a photocatalyst by several groups. One concern with this material is its photostability for certain reactions in aqueous solution [106]. For other reactions, especially photo-oxidation of dyes, ZnO may have a better efficiency than TiO2 [107]. Since the cost and
absorbance characteristics of the two photocatalysts are similar, ZnO in nanosize
form may be a useful alternative photocatalyst.
photocatalysis and have been shown to be fairly active for the photocatalytic destruction of methylene blue (MB) [109]. Zhang and Gao synthesized nanosized Ta3N5 through high-temperature (600–1000 °C for 5–8 h) nitridation of Ta2O5 nanoparticles. The Ta3N5 crystallite size, which was determined by XRD, appeared to depend on the nitridation temperature, which was also critical in determining the extent of nitride formation (i.e. complete nitridation only occurred at higher temperatures, starting at 900 °C). Thus, smaller crystallites (18 nm) nitrided at 700 °C often also contained the tantalum oxynitride phase. It follows that, with the higher temperatures needed for complete nitridation, larger crystallites on the order of 75 nm formed at 900 °C. However, there seems to be some variation in the resulting nitride formation with temperature since, according to the authors, pure phases of Ta3N5 also occurred at 700 °C, but with resulting crystallite sizes of 26 nm. Control over the synthesis parameters for this materials system seems to be difficult. Thus, the formation of a pure nitride phase also proved difficult.
In studying the photocatalytic degradation of MB, Zhang and Gao compared the reactivity of Ta3N5 nanoparticles with standards such as Degussa P25 TiO2 and TiO2 doped with N (TiO2−xNx) for visible light reactions using batch slurry-type reactors [109]. Compared with the TiO2−xNx under UV/Vis illumination, the authors found a faster rate of photodegradation of MB by the 18 nm mixed-phase Ta3N5–TaON nanoparticles and also by the 26 nm nanoparticles consisting of pure Ta3N5. The comparison with the TiO2−xNx is perhaps slightly misleading since, as we discussed above, this form of TiO2 has been shown to have decreased photocatalytic efficiency compared with pure TiO2. The main role of the N doping is to extend the absorption into the visible region. A fairer comparison under the UV/Vis conditions would be with Degussa P25, which the authors did not show. The authors performed the comparison with Degussa P25 only under visible light conditions, where it has minimal absorption and consequently suppressed photoactivity. Accordingly, the Ta3N5 systems perform better than both the Degussa P25 and TiO2−xNx under visible light illumination for the photo-oxidation of MB. The
authors also claimed that a size dependence to this photocatalytic activity exists
where smaller Ta3N5 crystallites (18 nm) are more active than larger ones
(75 nm). It is important to distinguish, as the authors did, between crystallite
size and particle size. There may be some evidence of varying photoactivity as a
function of crystallite size, but this cannot be translated into particle size since, as
can be seen from their TEM images, there is a significant particle size distribution where large aggregates have formed. Also, as stated by the authors, the 18 nm sample was not a pure nitride phase, but consisted of a Ta3N5–TaON mixture, whereas the 75 nm samples were pure Ta3N5. It is therefore difficult to conclude
that the photocatalytic activity of these materials is size dependent. It could very
well be a case similar to TiO2, where phase plays a very important role depending
on the specific reactions of interest. Regardless of size dependence, these systems
do show potential for visible light photocatalysis, but it is not clear that they are
better than TiO2, especially Degussa P25.
3.6.3.4 MoS2 and Other Metal Dichalcogenides
High surface area TiO2 powders in nanosized form, such as Degussa P25 TiO2 with d ≈ 25 nm, are, as mentioned above, the most studied photocatalysts. However, TiO2 has a significant disadvantage due to its large bandgap of 3.2 eV. This means that it can only be excited by UV illumination of 390 nm or shorter wavelength, thus requiring the use of UV lamps and increasing the cost of detoxification. If sunlight is used as the light source, TiO2 absorbs only 3–7% of the solar spectrum, as illustrated
in Figure 3.11 [52]. Also shown in Figure 3.11 is the absorption edge of bulk TiO2
compared with three sizes of MoS2: bulk, 8 nm and 4.5 nm. Research and analysis by
Tributsch [110] have revealed that in order for photocatalysts to be useful in solar fuel
production and chemical waste detoxification, the semiconductor material must
meet the following requirements: (1) have a bandgap matched to the solar spectrum,
(2) have valence and conduction band energy edges compatible with the desired redox
potentials, (3) be resistant to photochemical degradation and (4) have a short carrier
diffusion time, leading to faster energy transfer to the surface compared with electron–hole recombination times [52]. All of these properties can be achieved
through tailoring of the optical and electronic properties of MoS2 nanoclusters based
on the size. Prior to discussing MoS2 nanoclusters, it is worthwhile reviewing the
properties and uses of bulk MoS2.
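The absorption-edge wavelengths quoted in this discussion follow directly from λ = hc/E_g ≈ 1240 eV nm/E_g:

\[
\lambda(\mathrm{TiO_2}) \approx \frac{1240\ \mathrm{eV\,nm}}{3.2\ \mathrm{eV}} \approx 390\ \mathrm{nm}, \qquad
\lambda(\mathrm{bulk\ MoS_2}) \approx \frac{1240\ \mathrm{eV\,nm}}{1.75\ \mathrm{eV}} \approx 710\ \mathrm{nm}
\]

so bulk MoS2 can harvest essentially the whole visible range, whereas TiO2 is restricted to the near-UV.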
MoS2 Bulk Properties and Historical Background MoS2 and its structural isomorphs such as WS2, MoSe2 and WSe2 have excellent corrosion resistance, which has resulted in a variety of high-temperature catalytic and lubrication applications. This resistance to oxidation comes from the two-dimensional structure of these materials and the resultant electronic properties shown in Figures 3.12 and 3.13. In Figure 3.12, it is observed that the valence band is composed of Mo dz2 states and the conduction band of Mo dxy and dx2−y2 states, so that excitation of an electron across the 1.75 eV gap does not weaken any chemical bonds.
Human use of Mo from MoS2 has a long history. The mineral form of MoS2, known as molybdenite, is a naturally occurring mineral and one of the most abundant molybdenum sources found in the Earth's crust. This contrasts with most other metal sources such as Fe, Al and Ti, which occur naturally as oxides [111]. Due to its long history and the fact that molybdenite occurs naturally as a mineral in the Earth's crust, it is thought to have been used in very early times, such as in 14th century Japanese swords [111]. Initially, MoS2 was thought to be the same as other ores such as lead or graphite and was named after molybdos, the Greek word for lead [111]. In 1778, Carl Scheele, a Swedish chemist, demonstrated that molybdenite was actually a sulfide mineral [112, 113] and that it contained molybdenum metal [111]. Extraction of the Mo was later performed by Peter Jacob Hjelm in 1782 by reducing molybdenite with carbon [111]. Throughout the 19th century, interest in Mo was primarily non-commercial.
Commercial and technical applications of Mo were enabled by mining and
extraction improvements, allowing the extraction of molybdenite in commercial
quantities. The use of Mo in metal alloys steadily increased with time with the
increased demands beginning around World War I. Molybdenite is mostly MoS2, and the rest consists of silicates and other rare metals such as Re [113]. To extract a purer
form of MoS2 from molybdenite, the mineral is subjected to a series of crushing,
cleaning and purification steps [113]. The use of MoS2 as a hydrotreating catalyst [114]
for removal of heteroatoms from crude oil in fuel refining and as a high-temperature
lubricant also increased rapidly during the 20th century. Research in Germany
around the time of World War II identified formulations of MoS2 on alumina
supports capable of removing heteroatoms such as S, N and O from crude oil
feedstocks, allowing subsequent refining operations using transition metal catalysts
to be conducted without poisoning of these materials by sulfur. MoS2 has been
studied extensively both in the purified form and in its natural state. It was found to be
easy to study using optical and electrochemical methods since it can be cleaved to very
thin layers, making it suitable for optical and electronic studies [115].
Roscoe Dickinson and Linus Pauling elucidated the anisotropic crystal structure
of MoS2 in 1923 [116]. In this work, they noted that by 1904 Hintze had discovered
that MoS2 occurred as hexagonal crystals with complete basal cleavage planes.
Dickinson and Pauling showed that Mo has six S atoms surrounding it in an
equidistant manner at the corners of a triangular prism. They determined this
through the use of Laue photographs and the theory of space groups [116].
Following this, they also showed that the S and Mo were stacked in S–Mo–S sandwiches along the c-axis and that there were two sandwiches per unit cell, where c = 12.3 Å [116]. Later, Hultgren [117] and Pauling [118] showed that the trigonal
bonding in MoS2 and WS2 was due to d4sp hybridization of atomic wavefunctions.
The bonding within the MoS2 sheet was determined to be covalent and the trilayer
sheets were held together by weak van der Waals forces. Figure 3.6 shows a
schematic of the MoS2 sandwich structure. The most stable form of MoS2 is the
2H-MoS2 structure. Two other metastable polytypes also exist: 1T-MoS2 and 3R-MoS2 [119]. The space group for the stable 2H-MoS2 is P63/mmc [120]. The quasi-2D, graphite-like structure, in which the trilayers consisting of S–Mo–S sandwiches are held together by weak van der Waals forces, allows the facile shearing reported by
Dickinson and Pauling. MoS2 has many applications in lubrication, especially in
space applications and at high temperatures where its solid state form is an
advantage [121, 122]. Other industrial applications include catalytic hydrodesulfurization (HDS) [114, 123]. It has also been proposed as a possible solar photoelectrochemical electrode for hydrogen generation from water [124].
MoS2 is a semiconductor with an n-type conduction mechanism. Its optical absorbance properties are due to both direct and indirect transitions, which have been investigated by a variety of measurement techniques, including optical absorption and transmission [115, 125–127], reflectivity measurements [128, 129], electron energy loss measurements [130] and electron transport measurements [131]. For example, Goldberg et al. [127] measured the optical absorption coefficient, emphasizing the features below the first exciton [115] located at 680 nm. The lowest energy absorption, at 1.24 eV (1000 nm) at room temperature, was forbidden (i.e. indirect) and thus weak [127]. Using photocurrent measurements, Kam and Parkinson obtained similar values of 1.23 eV for the indirect Eg of MoS2 and 1.69–1.74 eV for the direct gap [132]. Roxlo et al. used photothermal deflection and transmission spectroscopy [133] to obtain a highly precise value of 1.22 ± 0.01 eV for the indirect bandgap. Using the augmented spherical wave method, Coehoorn and co-workers calculated detailed band structures for MoS2, MoSe2 and WSe2 [120, 134].
It is possible to determine the valence and conduction band levels of bulk n-type MoS2 using cyclic voltammetry. For example, Schneemeyer and Wrighton studied the oxidation reactions of biferrocene and N,N,N′,N′-tetramethyl-p-phenylenediamine at an MoS2 electrode and found that the flat-band valence band potential was 1.9 V vs. SCE, demonstrating that the direct transition of around 1.7 eV controls the oxidizing power of the holes [135]. They noted that this potential should allow most of the oxidation reactions possible at TiO2 to take place at MoS2 as well.
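The arithmetic behind such band-edge assignments is simple; the sketch below (a minimal Python illustration of our own, using the numbers quoted above) converts the flat-band valence band potential and the optical gap into an estimated conduction band potential on the same electrochemical scale.

def conduction_band_edge(E_vb_vs_sce, E_gap_eV):
    """E_cb = E_vb - Eg on the electrochemical scale (1 eV shift = 1 V);
    more negative potentials correspond to higher electron energies."""
    return E_vb_vs_sce - E_gap_eV

# Values quoted above for bulk n-type MoS2 (Schneemeyer and Wrighton [135]):
E_vb = 1.9  # V vs. SCE, flat-band valence band potential
Eg   = 1.7  # eV, direct transition that controls the hole oxidizing power

print(f"estimated conduction band edge: {conduction_band_edge(E_vb, Eg):+.1f} V vs. SCE")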
To test this hypothesis without the competing reaction of water oxidation, they studied the oxidation of the Cl− ion in an oxygen-free solvent, acetonitrile. They found that sustained evolution of chlorine gas was indeed observed upon visible illumination of the MoS2 electrode, with no loss of photocurrent over 8 h of continuous operation. During this experiment, many moles of electrons were generated without any visible (i.e. microscopic) evidence of oxidation of the MoS2 electrode. Their results support the behavior predicted by Tributsch for the electrochemical photo-oxidation of water, described in the next section [110].
Photocatalytic Properties of MoS2 A detailed and systematic search for photocatalysts allowing the photo-oxidation of water using visible light was reported by Tributsch in 1977 [110, 124]. The principles he outlined for successful water photo-oxidation should also apply to the photo-oxidation of organic molecules. A suitable photocatalyst must be capable of absorbing visible light, have valence and conduction band potentials appropriate for the substrate molecule to be oxidized, have an efficient mechanism for electron–hole separation, be chemically stable (i.e. exciton creation must not weaken the chemical bonds in the structure) and be inexpensive and/or easy to synthesize [110].
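These criteria amount to a simple screen over candidate materials. A toy Python sketch of such a screen follows; the candidate entries and the numerical thresholds are illustrative assumptions of ours, not data from [110].

# Toy screen over candidate photocatalysts. The bandgap window and the
# candidate entries below are illustrative assumptions, not data from [110].
candidates = {
    # name: (bandgap_eV, stable_under_illumination, cheap_or_easy_to_make)
    "TiO2": (3.2, True,  True),   # absorbs mostly UV -> fails the visible test
    "CdS":  (2.4, False, True),   # photogenerated carriers weaken bonds
    "MoS2": (1.8, True,  True),   # d-d transitions leave Mo-S bonds intact
}

def passes_screen(gap_eV, stable, cheap, visible_max=3.0, reaction_min=1.23):
    # 1.23 eV: thermodynamic minimum for water splitting;
    # ~3.0 eV: rough upper edge of the visible range.
    return reaction_min < gap_eV < visible_max and stable and cheap

for name, (gap, stable, cheap) in candidates.items():
    print(name, "passes" if passes_screen(gap, stable, cheap) else "fails")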
Based on the first requirement, oxide compounds (e.g. TiO2, ZnO, SnO2) were ruled out since they absorb mostly in the UV region. Polar semiconductors such as CdS were also ruled out since they are unstable: the photogenerated electrons weaken the chemical bond. Transition metal dichalcogenides appeared to be the most suitable candidates [110]. Tributsch investigated a large number (70) of metal dichalcogenides and concluded that the most favorable materials were WS2, MoS2 and TcS2. Owing to its scarcity and high cost, TcS2 was deemed unsuitable. WS2 was also ruled out since it was difficult to obtain as large crystals at that time. MoS2 was readily available and easy to process, making it the obvious final choice.
A detailed analysis of the MoS2 band structure was also reported (Figure 3.12) [110]. Optical transitions occur between the d states, as shown in Figure 3.12. Since the formation of the exciton pair occurs via the metallic d states, light absorption has no adverse effect on the Mo–S chemical bonds. It is this phenomenon that accounts for the stability of MoS2 against photocorrosion, for the widespread use of MoS2 as a stable, high-temperature solid-state lubricant and for its applications as an electrode material with good corrosion resistance in water.
Tributsch demonstrated the oxidation of water with the formation of molecular
oxygen and hydrogen using MoS2 as an electrode [110]. However, the reaction needed
pH 7 versus NHE [52]. It should be noted that the pH of a solution of a metal sulfide has a different chemical effect on the cluster surface compared with a metal oxide such as TiO2, where low pH tends to make the surface positive and change its substrate binding properties. For example, the formation of Ti–OH2+ surface structures, which is important at the pH values of around 1 used in many photocatalytic studies, is not possible for MoS2.
An advantage of MoS2 nanoclusters over bulk MoS2 is the increased surface-to-volume ratio (S/V), which provides more surface traps per unit volume. In addition to lowering the cost of the photocatalyst by reducing the amount of material required, the high S/V ratio increases the likelihood of surface trapping of the photogenerated carriers and so increases the lifetime of the charge carriers prior to recombination. This increases the probability of successful hole transfer to an organic pollutant. Also, since nanocluster solutions scatter a negligible amount of light, simplified analysis of the photocatalytic behavior of the system is possible. As in the case of TiO2, the method of nanocluster synthesis is likely to have a significant effect on the cluster photocatalytic properties.
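For a spherical cluster this surface-to-volume argument is pure geometry, S/V = 3/R = 6/d, so halving the diameter doubles the surface available per unit volume; a minimal check in Python:

# Surface-to-volume ratio of a spherical cluster of diameter d: S/V = 6/d.
for d_nm in (4.5, 3.0, 2.5):
    print(f"d = {d_nm} nm  ->  S/V = {6.0 / d_nm:.2f} nm^-1")
# A micron-sized bulk grain, for comparison:
print(f"d = 1000 nm ->  S/V = {6.0 / 1000.0:.3f} nm^-1")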
MoS2 Nanocluster Synthesis Since photocatalytic applications of MoS2 often require solution processing methods to form either films or dispersed solutions, liquid-phase synthesis has some advantages compared with vacuum or gas-phase methods. Monodispersed solutions of MoS2, MoSe2, WS2 and WSe2 nanoclusters were first synthesized by Wilcoxon and co-workers using an inverse micelle approach [51, 52, 68, 141–144]. In this method, surfactant molecules are dissolved in a suitable non-polar organic solvent such as toluene or octane. The surfactants are chemically bipolar, with a water-loving or hydrophilic head group joined to a hydrophobic or water-hating tail group. Their bipolar nature causes the surfactants to associate and form droplet-like aggregates, as shown schematically in Figure 3.14. The non-polar organic tail groups prefer to form an interface with the non-polar solvent, which allows the hydrophilic head groups to be shielded from the solvent and lowers the free energy of the solution. These aggregates are called inverse micelles since the curvature of the interface is the opposite of that of the normal micelles that form in water, with the hydrophilic head groups interfacing with a continuous water phase. The reader may be familiar with normal micelle aggregates, as they are critical to the solubilization of hydrophobic entities such as oil and permit detergent action, which gives soaps the ability to remove oil from clothes and skin.
Inverse micelles have a water-like interior volume which can dissolve ionic metal salts just as water does. However, the absence of water means that undesired chemical reactions such as hydrolysis to form metal hydroxides cannot occur. The dissolved metal ion pairs interact with and are stabilized by the hydrophilic head groups instead of water, and these head groups are not chemically reactive. When this metal salt solution is then mixed with another inverse micelle solution containing an ionic sulfiding agent, such as a metal sulfide or H2S, controlled aggregation to form a nanocrystal of MoS2 ensues [52]. The cluster growth process is relatively slow compared with the same reaction in a continuous, homogeneous medium, since it requires diffusion and collisions between micelles to allow exchange of the growing nanocrystals. The kinetics of this process are controlled by the size of the inverse micelle cage that is used to dissolve the Mo salt and also by the initial salt concentration. The slow nanocrystal growth allows good ordering of the atoms even at room temperature, and this is confirmed by HRTEM, which shows atomic lattice planes and facets on the nanocrystals, and also by selected area electron diffraction (SAD).
Figure 3.15 shows a SAD pattern of a 4.5 nm MoS2 cluster, revealing the same
hexagonal crystal structure as the bulk [51]. Figure 3.16 shows an HRTEM image of
a MoS2 cluster 3 nm in size. This image reveals that the cluster is highly crystalline
with no defects. The lack of defects in nanoclusters of this size is perhaps not
surprising since any defect that may form has a very short distance to diffuse out to
the nanocluster surface.
MoS2 clusters synthesized in inverse micelles in non-polar oils such as decane can be extracted into immiscible polar organic solvents such as dry acetonitrile or methanol. In this process most of the ionic by-products and free surfactant are removed, and the resulting solutions can be dissolved in water. Figure 3.17 shows a photograph of these solutions, with each solution having a distinct color and absorbance onset determined by the average cluster size. Since acetonitrile can be obtained in high purity with no absorbance above 200 nm, detailed, complete absorbance spectra of purified MoS2 clusters are readily collected.
Optical absorption spectra of nanosized MoS2 (4.5, 3 and 2.5 nm) compared with
the bulk are shown in Figure 3.18 [51]. Curves 3, 4 and 5 correspond to solutions of
4.5 nm, 3 nm and 2.5 nm MoS2, respectively (note the logarithmic scale of the
absorbance axis). The absorption features in the bulk spectra can be traced to the
different transitions in the band structure, as shown in Figure 3.19 [51]. There is a weak absorption onset at 1040 nm (1.2 eV) (Figure 3.18), which is attributed to the indirect bandgap occurring between the Γ point and the middle of the Brillouin zone between Γ and K [51]. The direct bandgap occurring at the K point is represented by the next absorption onset at 700 nm (1.8 eV). There are actually two peaks that correspond to this direct transition, labeled A1 and B1. The energy separation of these two peaks is most likely due to spin–orbit splitting of the valence band at the K point [51, 120, 126, 134]. There is another direct transition, originating deep within the valence band, corresponding to the absorption onset at 500 nm (2.5 eV). These transitions are labeled C and D.
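The wavelength and energy values quoted above are related through E(eV) ≈ 1239.8/λ(nm); the short sketch below reproduces the assignments (our own illustration, not taken from [51]).

# Convert the absorption onsets quoted above between wavelength and energy.
HC_EV_NM = 1239.8  # eV nm
onsets = [("indirect gap", 1040), ("direct gap (A1/B1)", 700), ("deep valence (C/D)", 500)]
for label, wl_nm in onsets:
    print(f"{label}: {wl_nm} nm = {HC_EV_NM / wl_nm:.2f} eV")
# -> 1.19, 1.77 and 2.48 eV, matching the ~1.2, ~1.8 and ~2.5 eV values above.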
As semiconductor clusters become molecular in size, the continuous bands due to translational symmetry develop molecule-like structure and discrete bands. The valence band evolves into the highest occupied molecular orbital (HOMO) and the conduction band into the lowest unoccupied molecular orbital (LUMO). These bands are seen in the absorbance peaks of Figure 3.18. Photoexcitation leads to electron
promotion from the HOMO to the LUMO. Some weak luminescence is observed from these cluster solutions due to radiative recombination from the LUMO to the HOMO. This luminescence can be used to monitor the transfer rate of the electrons or holes to other adsorbed species and to roughly estimate the size-dependent conduction band energy levels [84].
It is worth noting that other synthetic methods for the formation of MoS2 nanoclusters result in clusters with metallic properties. For example, ultrahigh vacuum methods have been employed to deposit 2D Mo nanoclusters on gold surfaces, followed by sulfidization of the Mo metal islands [123, 145–147]. Analysis using scanning tunneling microscopy (STM) [123, 145] revealed that this synthetic route produces triangular clusters (Figure 3.20). The STM image in Figure 3.20 shows a single layer of MoS2. These triangular clusters were shown to have metallic conducting properties for the atoms at the edges. Since these clusters were grown on a support, it is not certain that free clusters would have the same geometry and electronic properties.
More recently, Lauritsen et al. showed that excess sulfur can be present at the edges, depending on the cluster size [148]. They also showed that there is a tendency for the preferential formation of particular sizes, suggesting the formation of "magic" sizes that are thermodynamically favored. This is an important observation, since the same phenomenon occurs in many other materials systems and under various synthetic routes. It also leads to better size control and, with this control, the authors were able to link atomic-scale structural analysis to cluster size using STM. The implications for enhanced catalytic activity, particularly for hydrodesulfurization (HDS), are thus promising.
Bertram et al. subsequently investigated both MoS2 and WS2 clusters formed in the gas phase, using mass and photoelectron spectroscopy to show that these clusters have planar platelet structures [149]. The platelet structures were shown to stack just as in the bulk to form larger clusters.
The MoS2 clusters in Bertram et al.'s work were grown by evaporation of bulk MoS2 using a pulsed electric arc. Both charged and neutral MonSm clusters grow within an inert seeding gas of helium. To study the effect of extra sulfur on the structure, they also vaporized Mo and W metal, allowing metal clusters of various sizes to form, which were then exposed to controlled amounts of H2S gas. The smallest platelets formed had a metallic character and were chemically inert. They began stacking just as in the bulk structure, since multiples of the fundamental platelet masses were observed. Mass analysis was consistent with extra sulfur atoms at edge sites. Simulations indicated that these extra sulfur atoms stabilize the structures; a magic-sized cluster of WS2 taken from [149] is shown in Figure 3.21. (The phrase "magic sized" derives from the larger abundance of certain cluster masses observed, due to the extra stability of special numbers of metal atoms.) Clusters were stable over a wide range of M:S ratios. This was explained by the hypothesis that polysulfide chains could grow near the edges of the cluster. Their observation of S:Mo > 2 is consistent with our observations of excess sulfur in clusters grown in inverse micelles, purified by chromatography and analyzed using X-ray fluorescence (XRF) spectroscopy. The nature of the bonding at the catalytically active Mo edge sites, which are protected by sulfur atoms, is likely critical to the photocatalytic activity, as is known in the case of HDS catalysis, the major catalytic application of MoS2 [148].
The small triangular cluster shown in Figure 3.21 has a nearly zero bandgap (i.e. it is metallic); here, not only the edge W atoms are metallic, as the states near the HOMO (the Fermi level) are delocalized over the entire plane of W atoms. Furthermore, this metallic character is due to the excess of sulfur at the edges and, as the clusters are grown larger and approach the 2:1 S:W ratio of the bulk, a gap develops. The optical properties, such as absorbance and luminescence, of our larger 2.5, 3 and 4.5 nm MoS2 and WS2 clusters grown in inverse micelles are consistent with an energy gap
between HOMO and LUMO states, indicating that they are semiconducting. Furthermore, the onset of the absorbance is consistent with an indirect optical transition even for the smallest 2.0 nm clusters studied by our group [144]. Indications are that clusters grown in free space differ in structure and electronic properties from those grown on substrates, possibly due to the significant interaction energies between the Au atoms in the substrate and the Mo atoms.
Quantum Size Effects in MoS2 The effects of decreased cluster size and stronger carrier confinement, both in-plane and in the out-of-plane, transverse or c-axis direction of platelet stacking, are important for the electronic properties of metal dichalcogenides. Studies of size-related phenomena were reported by Consadori and Frindt in 1970 through investigations of thin layers (13 Å, corresponding to a thickness of one unit cell) of WSe2 [150]. They found that the optical absorption onset depended on sample thickness and showed a very small shift to higher energies with decreasing thickness. These effects were attributed to quantum size effects [150]. This was not the first observation of the effect of carrier confinement in small semiconductors; however, it may have been the first observation of quantum size effects in layered dichalcogenides. These effects were actually due to two-dimensional (2D) confinement in an effectively infinite sheet, and the shift in the energy of the A1 exciton, E(A1), was only 0.15 eV compared with infinitely thick samples.
As described below, compared with the shifts observed for metal dichalcogenide clusters in solution due to in-plane carrier confinement, the transverse, out-of-plane confinement in thin layers is very weak. Lateral confinement of the carriers appears to be necessary in order to observe strong size effects. This result is consistent with the d–d band optical transitions which dominate the photoexcitation behavior of MoS2. The energy shifts reported for thin WSe2 by Consadori and Frindt [150] are compared with the confinement of MoS2 and WSe2 clusters demonstrated by Wilcoxon et al. [51] in Figure 3.22. In this figure, the first exciton energy E(A1) is plotted as a function of 1/t² for the 2D thickness study (right axis) and as a function of 1/d² (where d = diameter) for the 3D cluster study (left axis). The 1/t² dependence of ΔE(A1) is obeyed down to t ≈ 40 Å but then deviates significantly as the thickness decreases. For sizes approaching 40 Å, the shift in E(A1) due to in-plane carrier confinement is over an order of magnitude larger than that due to transverse confinement. This deviation is even stronger at smaller sizes, which demonstrates the large difference between transverse and longitudinal confinement.
The absorption spectra of Figure 3.18 have structured features, such as minima and maxima, whose positions shift to the blue with decreasing cluster size. Figure 3.23 provides an examination of these peaks for MoS2 clusters in dilute acetonitrile solution. The absorption edge shifts correspond to changes in the size-dependent bandgap and the density of energy states as the clusters become molecular in size. The effects of quantum confinement depend on the cluster dimension relative to the bulk excitonic Bohr radius (rB). This dimension defines a cross-over from a strong to a
weak confinement regime. In 1938, Mott proposed a model of the hydrogen-like exciton for the n = 1 Rydberg states of the A1 and B1 excitonic peaks [151]. From this model, rB for MoS2 is determined to be 2 nm [51, 115, 151]. The 4.5 nm diameter MoS2 clusters (R = 2.25 nm) are thus slightly larger than the bulk exciton, whereas 3.0 nm (R = 1.5 nm) and 2.5 nm (R = 1.25 nm) clusters should exhibit strong carrier confinement. Although D = 4.5 nm clusters are slightly larger than rB for MoS2, they still exhibit properties vastly different from the bulk, as verified by the large blue shift in the absorption spectra shown in Figures 3.18 and 3.23.
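Using Mott's rB ≈ 2 nm together with the R/rB criteria quoted two paragraphs below, a minimal sketch (ours, not from [51]) classifies the cluster sizes discussed above:

# Classify confinement regime by the ratio of cluster radius R to the
# bulk exciton Bohr radius rB (~2 nm for MoS2, as quoted above).
r_B = 2.0  # nm

def regime(R_nm):
    """Apply the R/rB criteria quoted in the text."""
    ratio = R_nm / r_B
    if ratio > 4:
        return "weak (quasi-particle exciton preserved)"
    if ratio < 2:
        return "strong (independent, confined carriers)"
    return "intermediate"

for d_nm in (4.5, 3.0, 2.5):
    R = d_nm / 2.0
    print(f"d = {d_nm} nm (R/rB = {R / r_B:.2f}): {regime(R)}")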
The extent of carrier confinement manifests itself through the effect of particle size on the spin–orbit splitting of the absorbance peaks: decreasing cluster size is accompanied by an increase in the spin–orbit splitting of the A1 and B1 excitonic peaks. The effective mass model (EMM) predicts the quantum confinement shift in the absorption edge or bandgap, Eg(R), for a cluster of radius R, to be proportional to 1/(2mR²), where m is the reduced exciton mass [79]. When Eg(R) is plotted against 1/R², a straight line should result, with a slope proportional to 1/(2m) [51]. However, deviations from linearity occur for each excitonic peak as the cluster size decreases (Figure 3.24). An increase in the spin–orbit splitting between the Wannier excitonic peaks, A1 and B1, with decreasing cluster size also occurs [51]. The difference between peaks A1 and B1 is 0.67 eV for 2.5 nm MoS2 clusters [51], compared with 0.20 eV for the bulk [120, 126, 134].
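A hedged sketch of the EMM estimate follows, using the particle-in-a-sphere form ΔE = ħ²π²/(2mR²); the reduced exciton mass used here is an illustrative placeholder of ours, not a value from [79] or [51].

import math

HBAR = 1.054571817e-34   # J s
M_E  = 9.1093837015e-31  # kg
EV   = 1.602176634e-19   # J

def emm_shift_eV(R_nm, mu_over_me=0.4):
    """Particle-in-a-sphere EMM shift dE = hbar^2 pi^2 / (2 mu R^2).
    mu_over_me = 0.4 is an illustrative reduced exciton mass only."""
    R = R_nm * 1e-9
    mu = mu_over_me * M_E
    return (HBAR * math.pi) ** 2 / (2.0 * mu * R * R) / EV

for d_nm in (4.5, 3.0, 2.5):
    print(f"d = {d_nm} nm -> EMM blue shift ~ {emm_shift_eV(d_nm / 2.0):.2f} eV")
# Real clusters deviate from this simple 1/R^2 line as R shrinks (Figure 3.24).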
When R/rB > 4, the quasi-particle nature of the excitons is predicted to be preserved [81]. The quasi-particle characteristics are lost, however, when R/rB < 2, and the charge carriers are confined, leading to independent carrier behavior.
Figure 3.24 E(R) vs. 1/R² showing deviations from linearity for each excitonic peak with decreasing MoS2 cluster size. The increase in spin–orbit splitting is represented on the right axis. (Reprinted with permission from J. P. Wilcoxon, P. P. Newcomer, G. A. Samara, Journal of Applied Physics 1997, 81, 7934.)
Thurston and Wilcoxon published some of the first reported photocatalysis using nanosized MoS2 in the late 1990s [52]. They investigated the effect of MoS2 cluster size on the efficiency of phenol photo-oxidation. A xenon lamp with appropriate bandpass filters allowed comparisons with the Degussa P25 TiO2 photocatalyst [52]. A short-pass filter cutting off IR radiation above 1000 nm was used to minimize heating of the solution. HPLC, using a reversed-phase octadecyl-terminated silica column and a mixture of water and acetonitrile as the mobile phase, was utilized to analyze the phenol concentration as a function of illumination time. To study the MoS2 cluster solutions, which were optically transparent, a long-pass filter limited the xenon arc lamp output to wavelengths greater than 455 nm. Figure 3.25 shows an absorption chromatogram for an experiment using 4.5 nm MoS2 obtained using 455 nm light, where the phenol elutes at 14.6 min. After 8 h, the phenol peak area had decreased by 25%. This was accompanied by the appearance of two phenol photo-oxidation products: catechol (13.4 min) and a possible isomer of catechol (12.2 min) [52]. Photo-oxidation using Degussa P25 TiO2 in slurry suspension under 365 nm radiation showed similar behavior (Figure 3.26). As expected, no photo-oxidation of phenol occurred using TiO2 slurries illuminated with visible light (Figure 3.27). The active sites on the MoS2 clusters are possibly the empty d-orbitals accessible at the Mo metal edge sites. Phenol adsorption and hole transfer may occur at these locations, and even the presence of a stabilizing cationic surfactant apparently does not prevent access to these sites. In fact, in later studies of pentachlorophenol oxidation using MoS2 and TiO2, certain cationic surfactants were shown to increase the rate of photo-oxidation.
in this case. Figure 3.29 shows a plot of the phenol concentration as a function of time under visible illumination for 4.5 nm MoS2 clusters, 8–10 nm MoS2 clusters and Degussa P25 TiO2. These data were obtained from the calculated areas under the phenol elution peaks of the chromatograms shown in Figures 3.25–3.27. The catalytic nature of the 4.5 nm MoS2 clusters for phenol oxidation was verified by adding additional phenol after most of the phenol had been destroyed, repeating the reaction and observing the same reaction kinetics within the error bars indicated in Figure 3.29.
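Concentration-versus-time data of the kind plotted in Figure 3.29 are commonly analyzed with a pseudo-first-order rate law, C(t) = C0·exp(−kt). A minimal fitting sketch follows; the data points are illustrative placeholders, not the measured values behind the figure.

import math

# Illustrative, placeholder peak-area data (fraction of initial phenol),
# NOT the measured values behind Figure 3.29.
t_min = [0.0, 60.0, 120.0, 240.0, 480.0]
c_rel = [1.00, 0.91, 0.83, 0.70, 0.48]

# Pseudo-first-order model C(t) = C0 exp(-k t)  =>  ln(C/C0) = -k t.
# The least-squares slope of ln(C) versus t gives -k.
ys = [math.log(c) for c in c_rel]
n = len(t_min)
sx, sy = sum(t_min), sum(ys)
sxx = sum(x * x for x in t_min)
sxy = sum(x * y for x, y in zip(t_min, ys))
k = -(n * sxy - sx * sy) / (n * sxx - sx * sx)
print(f"k ~ {k:.2e} min^-1, half-life ~ {math.log(2.0) / k:.0f} min")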
Using fully dispersed nanoclusters as photocatalysts is not practical in real environmental remediation, since it would be very difficult to remove the clusters from the purified water by filtration or other commonly used methods. Accordingly, experiments were undertaken to deposit the nanoclusters onto a TiO2 powder, which could be used as a slurry or eventually coated onto a high surface area support, as has been done previously in flow reactors based upon TiO2 catalysts. The deposited MoS2 clusters then serve as sensitizers, allowing visible light absorbance and transfer of the electron from the more negative MoS2 conduction band to the TiO2. Thus, both light absorbance and charge separation could be achieved. We deposited 8–10 nm MoS2 onto several support materials: Degussa P25 TiO2, SnO2, WO3 and ZnO [52]. P25 TiO2 was the only support material that showed enhanced performance during
must be excited by UV light, which is only 3% of the solar spectrum, probably
requiring the use of lamps for a practical TiO2 photoreactor.
It was necessary to use bulk CdS powder as a reference photocatalyst since TiO2 is
not active under visible illumination [68]. CdS has a similar absorbance onset
(525 nm) to MoS2 nanoclusters [68]. The elution peak area from absorbance
chromatograms was used to determine the PCP concentration as a function of
illumination time. The decrease in PCP concentration as a function of time for 4.5
and 3 nm MoS2 clusters, bulk CdS powder, bulk MoS2 powder and bulk TiO2
illuminated with visible light is shown in Figure 3.32 [68]. As might be expected,
there was no change in PCP concentration using either bulk powders of MoS2 or
Degussa P25 TiO2 under visible light radiation. Some decrease in PCP concentration
was observed with bulk CdS powder. However, both sizes of MoS2 clusters were more
active than the CdS slurry powder.
The most interesting result of these studies (Figure 3.32) is the dramatic increase in the PCP destruction rate for the 3 nm MoS2 compared with the 4.5 nm MoS2. In fact, by 120 min complete photo-oxidation of PCP had occurred and there was no detectable PCP [68] (the sensitivity was 20 ppb). Although the 4.5 nm clusters absorb significantly more of the incident visible light, the most important size effect appears to be the more energetic electrons and holes created in the smaller 3 nm MoS2 clusters. It is also possible that some of the activity increase is due to more favorable surface chemistry and binding properties of the 3 nm clusters compared with the 4.5 nm clusters. It is
worth noting that this high rate of PCP oxidation also occurs in the presence of stabilizing surfactants and that PCP photo-oxidation using only visible light excitation of 3 nm MoS2 clusters is higher than that observed for Degussa P25 TiO2 using full lamp irradiation (300 nm < λ < 700 nm) [68].
The stability of the 3 nm MoS2 clusters during photocatalysis reactions was also tested by Wilcoxon et al. [152]. The photo-oxidation of an alkyl chloride using visible light and 3 nm MoS2 is shown in Figure 3.33. In this figure, the HPLC absorption chromatogram shows a peak at 4.1 min (3 nm MoS2), a peak at 4.65 min (free surfactant) and a peak at 5.2 min (organic impurity) at four different stages of the reaction: t = 0, 15, 30 and 60 min [152]. There is no evidence of the organic impurity left after 60 min. The surfactant that does not participate in cluster stabilization (i.e. free surfactant) present in the solution also photo-oxidizes under continued irradiation and after 3 h is completely gone. However, the MoS2 peak remains unchanged, demonstrating, again, the photocatalytic stability of this material.
Electron transfer (ET) rates from MoS2 clusters to electron acceptor molecules such as bipyridine (bpy) can be estimated by monitoring the change in the fluorescence decay rates as a result of ET [84]. For example, time-resolved fluorescence measurements showed a dramatic decrease in ET rates to bpy as the size of the clusters increased. The ET rates were fastest for 3 nm MoS2 clusters, consistent with a more negative conduction band potential due to increased quantum confinement compared with 4.5 nm clusters. By using electron acceptor molecules with known redox potentials, one can also estimate from such measurements how much the conduction band shifts with cluster size. Similar experiments using hole acceptor molecules allow one to estimate the shift in valence band position with cluster size.
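Treating ET as an additional decay channel in parallel with radiative recombination gives k_ET = 1/τ(with acceptor) − 1/τ(without). A minimal sketch with illustrative lifetimes (not the measured values from [84]):

def electron_transfer_rate(tau_donor_ns, tau_quenched_ns):
    """k_ET = 1/tau(quenched) - 1/tau(unquenched); decay rates of parallel
    channels add, so the extra rate is attributed to ET."""
    return 1.0 / tau_quenched_ns - 1.0 / tau_donor_ns  # ns^-1

# Illustrative lifetimes only (not the measured values from [84]):
tau0 = 10.0  # ns, cluster luminescence without acceptor
tau  = 2.0   # ns, with bpy acceptor adsorbed
k_et = electron_transfer_rate(tau0, tau)
print(f"k_ET ~ {k_et:.2f} ns^-1 (~{k_et * 1e9:.1e} s^-1)")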
are shown in Figure 3.36. WS2/TiO2 and MoS2/TiO2 appear equally effective for these photo-oxidation reactions. They concluded that the effectiveness of the coupled system was indeed due to electron transfer from the MS2 to the TiO2, based on electron spin resonance (ESR) measurements in which they observed a signal corresponding to the Ti(III) radical.
3.7
Current and Future Technological Applications of Photocatalysts for Environmental Remediation
The photocatalytic reactions discussed thus far occur in water, but the same materials, especially TiO2, have proved useful for photo-oxidation of VOCs in air. In fact, this application represents the first commercial use of photocatalysts, which in Japan amounted to nearly US$300 million in revenue and about 2000 companies as of 2003 [50]. Air purification using TiO2 photocatalysts is very dependent on reactor design, the most common configurations being annular plug-flow reactors [156] and honeycomb reactors [157]. The oxidation process functions most efficiently at low concentrations and relatively low air flow rates; high flow rates may be limited by mass transport considerations. For flow rates of less than 20 000 ft³ min⁻¹, estimates suggest that photocatalysis is more cost-effective than carbon adsorption or incineration of VOCs.
Because light penetrates long distances through air, the illumination of the photocatalyst in gas reactors is simplified compared with aqueous-phase reactors, especially the most efficient opaque, slurry-type reactors. Also, higher reaction temperatures and pressures are possible, unlike in water purification, where the boiling point of water is a limitation. In fact, a combination of air stripping of VOCs with gas-phase photocatalytic destruction is likely a viable approach to water purification.
An advantage of photocatalytic gas purification of VOCs is that relatively low levels of light are required. For outdoor applications, for example, ambient levels of around 2–3 mW cm⁻² are available in the near-UV (UVA) region of the spectrum that can be absorbed by TiO2. These levels are sufficient for many applications, such as self-cleaning tiles and windows [50]. For indoor applications, fluorescent lighting of around 1 mW cm⁻² in the UVA region is available in most offices. The efficiency of the photocatalysis has actually been shown to be superior at these low levels. For example, the quantum efficiency for photo-oxidation of 2-propanol by TiO2 was 28% and for the common indoor pollutant acetaldehyde it was nearly 100%, due to an autocatalytic process producing a free radical chain in the air [50]. Since gas-phase diffusion of both reactants and products takes place continuously, gas-phase photocatalytic processes can be self-cleaning, preserving the integrity of the catalytic surface. It also appears that free radical scavengers such as chloride ions and electron scavengers such as oxygen are not as significant a problem in air as in water [158]. However, incomplete mineralization of some chemicals can lead to loss of efficiency with time due to build-up of intermediates at the catalyst surface. Water is not available in most indoor applications to wash these away, but the best outdoor designs take advantage of rain to preserve an active catalyst surface [50].
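To see why such low light levels suffice, it helps to convert irradiance into photon flux, N = P·λ/(hc); the sketch below does this for the two irradiance levels quoted above, taking 365 nm as a representative UVA wavelength (an assumption of ours).

H = 6.62607015e-34  # J s, Planck constant
C = 2.99792458e8    # m/s, speed of light

def photon_flux_cm2(irradiance_mW_cm2, wavelength_nm=365.0):
    """Photons per cm^2 per second delivered at the given irradiance."""
    e_photon = H * C / (wavelength_nm * 1e-9)  # J per photon
    return irradiance_mW_cm2 * 1e-3 / e_photon

# Irradiance levels quoted above: indoor fluorescent vs. outdoor ambient UVA.
for p_mW in (1.0, 3.0):
    print(f"{p_mW} mW/cm^2 -> {photon_flux_cm2(p_mW):.2e} photons cm^-2 s^-1")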
Gas-phase photoreactors may be incorporated into existing heating and air conditioning systems, where they are often utilized in combination with more traditional approaches such as HEPA filters and carbon adsorption. For example, Ao and Lee recently described experiments in which a commercial air cleaner was modified to incorporate TiO2 on an activated carbon filter, illuminated using a 6 W UV lamp (254 nm), to remove nitric oxide and toluene at ppb levels [159]. The high efficiency of the combined activated carbon–TiO2 photocatalyst was attributed to the ability of the carbon to adsorb and concentrate the NO pollutant, which then diffuses to the TiO2, where it is photo-oxidized.
A recent review of gas-phase photocatalysis by Fujishima and Zhang summarizes some of the most interesting results on gas-phase photo-oxidation using TiO2 photocatalysts [50]. We will now examine some results from that review for both indoor and outdoor air cleaning.
3.7.1
Indoor Air Purification
The labor costs associated with keeping indoor surfaces free from bacteria and other contamination can be significant in the case of hospitals and nursing homes. With this application in mind, Fujishima's group was able to demonstrate that low UV light levels of around 1 mW cm⁻² could destroy E. coli cells placed on a TiO2-coated glass plate in <1 h. For comparison, only 50% of the cells were dead following 4 h of illumination in the absence of the TiO2 coating. This laboratory work was extended to trials in several operating rooms in the form of antibacterial tiles with a TiO2 coating. Tests showed that the bacterial levels were negligible after 1 h of illumination under the ambient fluorescent lighting. The bacterial levels in the surrounding air also decreased significantly [50].
The latter result is consistent with observations by Choi that OH radicals generated on the TiO2 diffuse a significant distance through air and can oxidize soot many microns away from the active surface [22]. The evidence for this is shown in Figure 3.37, where soot was deposited on the surface of a glass substrate coated with a TiO2 photocatalyst. The SEM image shows that a gap develops at the interface between the TiO2 and the soot. The width, d, of this gap was reported to increase continuously with UV illumination time, indicating that the active oxidants formed at the TiO2 surface desorb and migrate across the glass to reach the soot.
Fujishima's group has investigated self-cleaning surfaces as possible interior wall materials for buildings and homes [160]. The build-up of indoor air pollution in modern, highly insulated homes and offices, combined with out-gassing of chemicals such as formaldehyde and urethane used in building materials, makes such self-cleaning surfaces very attractive. They were able to demonstrate that many volatile organic compounds could be completely mineralized to CO2 using weak (1 mW cm⁻²) UV illumination.
Indoor air cleaners based on HEPA filters and activated carbon adsorption are commonly used to remove foul-smelling VOCs, which exist in most indoor environments. They are readily adapted to use photocatalytic oxidation, which extends filter life through the self-cleaning property of TiO2. To maximize the surface area, a honeycomb-type filter, which minimizes back-pressure, is coated with TiO2. A picture of such a filter, taken from Fujishima and Zhang's review [50], is shown in Figure 3.38. TiO2 nanoparticles can also be embedded in activated carbon, as described by Ao and Lee [159], and used in modified air conditioners or air cleaners. This approach permits the adsorbing
shows alternating tiles with and without TiO2. Water from the roof runs over
both types of tiles but the photocatalytic surfaces are much cleaner, as shown in
this photograph.
Window glass (SiO2) can also be made self-cleaning by embedding nanoparticles of TiO2 in an SiO2 matrix. An added benefit of this composite material is that water droplet formation does not occur on rainy days, due to the superhydrophilic properties of TiO2. This property is also very useful for the windshields of cars and motorcycles. With just a little rainfall, a thin, uniform film forms which rapidly evaporates. Even in heavy rainfall, droplet formation that causes light scattering is avoided, since a uniform film of water forms on the glass. Since TiO2 has a much higher refractive index than glass, the composition of a continuous TiO2/SiO2 film on glass must be carefully controlled to avoid excess refraction of light, thereby preserving visual clarity [50].
Plastic tents made of materials such as PVC are also widely used in outdoor applications such as temporary buildings for exhibits or storage. Their flexible surfaces are difficult to clean, so Taiyo Kangyo Co. in Japan coated PVC tents with TiO2, leading to the formation of an inorganic/organic interfacial microstructure between the TiO2 layer and the PVC that prevents photo-oxidation of the PVC. This approach is fairly effective, as shown in Figure 3.40, taken from Fujishima and Zhang's review [50], where a tent made partly of ordinary PVC and partly of TiO2-coated PVC was photographed by Taiyo Kangyo Co. after 2 months of exposure to air contamination [50].
Common construction materials such as cement can also be modified by the addition of TiO2 photocatalysts [161]. Provided that the addition of the photocatalyst particles does not adversely affect the curing and final strength of the material, this modification is inexpensive and reduces maintenance costs through its self-cleaning ability. The preservation of ancient Greek statues against urban air pollutants is one novel example of the utility of these composite materials [13, 162].
3.8
Conclusion
into reactor design will be required before photocatalytic remediation can compete with current methods for water treatment. High surface area materials such as carbon aerogels are worthy of consideration as supports.
Even though remediation schemes using TiO2 to treat liquid-phase pollutants are not currently economically viable, we have presented recent technological applications of TiO2 porous films to the continuous treatment of indoor and outdoor air pollution. The key observation for these gas-phase reactions is their high efficiency at low light levels. This is fortunate, since the light flux available in the near-UV region of natural sunlight can be used with TiO2 films in a variety of outdoor applications, including self-cleaning tiles and glass. In indoor applications, there is sufficient near-UV light available from fluorescent lighting that air cleaning and purification using TiO2 impregnated on carbon is a useful method of removing noxious odors and killing bacteria. The combination of conventional carbon adsorption with the ability of TiO2 photocatalysts to oxidize the adsorbed chemicals extends the life of the filters and invigorates and extends an older technology. Such applications should dominate the short-term uses of photocatalysts in environmental remediation.
As a final note, environmental remediation is not only a technical and scientific problem, it is also a social problem. The general population needs to evaluate its habits and curb any contributions it may be making to the pollution problem. Environmental pollution is closely related to many other issues, such as habitat destruction, overpopulation, lack of education, excessive consumerism, extreme poverty and corruption within governments and corporations. None of these problems can be solved in a completely isolated manner. Scientific and technical solutions exist in many of these areas and particularly, as outlined in this chapter, for environmental remediation through photocatalysis. The use of nanomaterials as photocatalysts is certainly promising. However, we must be careful that we do not merely exchange one problem for another when using nanomaterials, if there is a possibility that they may also become future pollutants. Evaluating the actual impact of new discoveries is the responsibility of all researchers who wish to utilize their discoveries in applied systems.
Acknowledgment
This work was supported by the Division of Materials Sciences, Office of Basic Energy Research, U.S. Department of Energy under contract DE-AC04-AL8500. The Center
for Individual Nanoparticle Functionality (CINF) is supported by the Danish National
Research Council.
References
1 S. A. Ostroumov, Biological Effects of Surfactants, Taylor & Francis, Boca Raton, FL, 2006.
2 R. P. Schwarzenbach, B. I. Escher, K. Fenner, T. B. Hofstetter, C. A. Johnson, U. von Gunten, B. Wehrli, Science 2006, 313, 1072.
37 N. Serpone, Solar Energy Materials and Solar Cells 1995, 38, 369.
38 T. E. Agustina, H. M. Ang, V. K. Vareek, Journal of Photochemistry and Photobiology C: Photochemistry Reviews 2005, 6, 264.
39 Jefferson Laboratories, The 10 Most Abundant Elements in the Earth's Crust, http://education.jlab.org/glossary/abund_ele.html, 2007.
40 A. L. Linsebigler, G. Q. Lu, J. T. Yates, Chemical Reviews 1995, 95, 735.
41 N. Serpone, E. Pelizzetti, in Photocatalysis: Fundamentals and Applications, Wiley, New York, 1989, vii.
42 S. C. Markham, Journal of Chemical Education 1955, 32, 540.
43 S. N. Frank, A. J. Bard, Journal of Physical Chemistry 1977, 81, 1484.
44 S. N. Frank, A. J. Bard, Journal of the American Chemical Society 1977, 99, 4667.
45 C. Y. Hsiao, C. L. Lee, D. F. Ollis, Journal of Catalysis 1983, 82, 418.
46 A. L. Pruden, D. F. Ollis, Journal of Catalysis 1983, 82, 404.
47 T. Matsunaga, R. Tomoda, T. Nakajima, H. Wake, FEMS Microbiology Letters 1985, 29, 211.
48 A. Fujishima, R. X. Cai, J. Otsuki, K. Hashimoto, K. Itoh, T. Yamashita, Y. Kubota, Electrochimica Acta 1993, 38, 153.
49 R. Wang, K. Hashimoto, A. Fujishima, M. Chikuni, E. Kojima, A. Kitamura, M. Shimohigoshi, T. Watanabe, Nature 1997, 388, 431.
50 A. Fujishima, X. T. Zhang, Comptes Rendus Chimie 2006, 9, 750.
51 J. P. Wilcoxon, P. P. Newcomer, G. A. Samara, Journal of Applied Physics 1997, 81, 7934.
52 T. R. Thurston, J. P. Wilcoxon, Journal of Physical Chemistry B 1999, 103, 11.
53 I. Chorkendorff, J. W. Niemantsverdriet, Concepts of Modern Catalysis and Kinetics, Wiley-VCH, Weinheim, 2003.
54 S. T. Martin, H. Herrmann, W. Y. Choi, M. R. Hoffmann, Journal of the Chemical Society: Faraday Transactions 1994, 90, 3315.
112 (cont.) …commodity/molybdenum/470798.pdf, 1998.
113 E. Graber, A. Klingsborg, P. M. Siegal, Title?, Wiley, New York, 1985, 774.
114 H. Topsøe, B. S. Clausen, F. E. Massoth, Hydrotreating Catalysis: Science and Technology, Springer, Berlin, 1996.
115 R. F. Frindt, A. D. Yoffe, Proceedings of the Royal Society of London Series A 1963, 273, 69.
116 R. G. Dickinson, L. Pauling, Journal of the American Chemical Society 1923, 45, 1466.
117 R. Hultgren, Physical Review 1932, 40, 891.
118 L. Pauling, The Nature of the Chemical Bond and the Structure of Molecules and Crystals: an Introduction to Modern Structural Chemistry, 3rd edn., Cornell University Press, Ithaca, NY, 1960.
119 E. Benavente, M. A. Santa Ana, F. Mendizabal, G. Gonzalez, Coordination Chemistry Reviews 2002, 224, 87.
120 R. Coehoorn, C. Haas, J. Dijkstra, C. J. F. Flipse, R. A. de Groot, A. Wold, Physical Review B: Condensed Matter 1987, 35, 6195.
121 N. Ohmae, Wear 1993, 168, 99 (Proceedings of the 1st International Workshop on Microtribology (IWM), October 12–13, 1992, Morioka, Japan).
122 L. Scandella, A. Schumacher, N. Kruse, R. Prins, E. Meyer, R. Lüthi, L. Howald, H. J. Güntherodt, Thin Solid Films 1994, 240, 101.
123 M. V. Bollinger, K. W. Jacobsen, J. K. Nørskov, Physical Review B 2003, 67, 085410.
124 H. Tributsch, J. C. Bennett, Journal of Electroanalytical Chemistry 1977, 81, 97.
125 B. L. Evans, P. A. Young, Proceedings of the Physical Society of London 1967, 91, 475.
126 B. L. Evans, P. A. Young, Proceedings of the Royal Society of London Series A 1965, 284, 402.
127 A. M. Goldberg, A. R. Beal, F. A. Levy, E. A. Davis, Philosophical Magazine 1975, 32, 367.
4
Pollution Treatment, Remediation and Sensing
Abhilash Sugunan and Joydeep Dutta
4.1
Introduction
Table 4.1 Analytical techniques for contamination monitoring and testing (most mature): gas chromatography; X-ray fluorescence spectrometry; photoionization devices; flame ionization devices; catalytic surface oxidation; mass spectrometry; infrared spectroscopy; wet chemistry methods; kits based on immunoassays and chemical reactions.
… techniques have been identified. The time and expense involved in the detection of environmental pollutants (i.e. sample acquisition, sample preparation and laboratory
environmental pollutants (i.e. sample acquisition, sample preparation and laboratory
analysis) have led to renewed interest in nding newer solutions to analyze contamination inordertoprevent,to seekremedialactionforortodestroythecontaminants priorto
pollution of the environment. Fast and cost-effective eld-analytical technologies that
can increase the number of analyses and drastically reduce the time required to perform
them will help in the prevention of environmental catastrophes. Increasing the amount
of analytical data tends to improve the accuracy of hazardous waste site characterization,
leadingtobettermanagementoftheproblemsandtheriskassessmentscanbeimproved
by efcient clean-up procedures [6]. The different analytical techniques for contamination monitoring and testing that are widely used today are listed in Table 4.1 [7].
Environmental monitoring is a complex process involving hundreds of different substances that are deemed to pose a threat to the environment and can occur in the gaseous, liquid or solid phases, with concentrations varying from a few percent down to a few parts per trillion (ppt). Monitoring both in the external environment and at the point of discharge, and sometimes in real time, is necessary to prevent pollution, find remedial approaches or decide when to destroy environmentally dangerous substances. There is a critical and growing need for more cost-effective and rapid techniques for the identification and quantification of pollutants in complex environmental matrices and for the conversion of contaminants into benign forms or their complete elimination, and nanotechnology has the promise to fill this need. Nanotechnology is being applied to bridge the need for accurate, inexpensive, sensitive and real-time, in situ analyses using robust sensors based on the advantages delivered by the new techniques, which can be remotely operated through satellite signals.
Although a number of chemical sensors are commercially available for field measurements of chemical species (e.g. portable gas chromatographs, surface-wave acoustic sensors, optical instruments), few are really suitable for continuous environmental and pollution control applications (Table 4.1). Detection of low concentrations for the monitoring of volatile organic compounds (VOCs) such as aromatic hydrocarbons (e.g. benzene, toluene, xylenes), halogenated hydrocarbons [e.g. trichloroethylene (TCE), carbon tetrachloride] and aliphatic hydrocarbons (e.g. hexane, octane) in air, groundwater and other saturated environments is
Table 4.2 Chemical classes of hazardous substances of concern to human health, ranked by their percentage contribution to the US hazardous waste problem from a human exposure perspective (from 26.5% down to 2.0%). Compiled and published by the Agency for Toxic Substances and Disease Registry (PHS, Annual Report, 1990).
urgently needed for the proper monitoring of the environment and prevention of further pollution. Volatile organic compounds from cigarette smoke, building materials, furnishings, cleaning compounds, dry cleaning agents, paints, glues, cosmetics, textiles and combustion sources are also a major source of indoor air pollution [8]. Nanotechnology has already been applied to remove some of these VOCs; it has been reported that an ultraviolet (UV)-illuminated titanium dioxide (TiO2) catalytic surface can produce an overall reduction in air VOC levels [9]. The low-temperature activity of gold catalysts has been employed by Mintek in South Africa to construct a prototype air purification unit which removes carbon monoxide from the air at room temperature [10].
Over 700 chemical species have been identified at hazardous waste sites, and the still unidentified compounds may number in the thousands [6]. The 600 compounds regulated under the Toxics Release Inventory (TRI), along with numerous other agricultural and industrial compounds that are regulated under waste disposal and treatment regulations, pose similar risks to human health and ecosystems. The Agency for Toxic Substances and Disease Registry (ATSDR) in the USA has ranked 275 priority hazardous substances based on the frequency of occurrence at sites present on the National Priorities List, available toxicity data and the potential for direct or indirect human exposure [6]. The different chemical classes of hazardous substances of concern to human health are shown in Table 4.2.
4.2
Treatment Technologies to Remove Environmental Pollutants
Table 4.3 Chemical processes that are the largest users of heterogeneous catalysts at present (reactions and their catalysts). Catalysts listed include: Pt, Pd on alumina; Rh on alumina; V oxide; zeolites; Co–Mo, Ni–Mo, W–Mo; Pt, Pt–Re and other bimetallics on alumina; metals on zeolites or alumina; sulfuric acid, solid acids; Ni on support; Fe–Cr, CuO, ZnO, aluminate; Fe; Ag on support; Pt, Rh, Pd; Bi, Mo oxides; Cu chloride; Ni; Cr, Cr oxide on silica.
effective risk management strategies for the harmful effects of pollutants that are highly toxic, persistent and difficult to treat. Several new methodologies have been explored for waste treatment approaches that are more effective in reducing contaminant levels and more commercially viable than the currently available techniques. Applications of nanotechnology that result in improved waste treatment options might include removal of the finest contaminants from water (<300 nm) and air (<50 nm) and smart materials or reactive surface coatings with engineered specificity to a certain pollutant that destroy, transform or immobilize toxic compounds. Nanomaterials have been attracting increasing interest in the area of environmental remediation, mainly due to their enhanced surface area and other specific changes in their physical, chemical and biological properties that develop due to size effects. The development of novel materials with increased affinity, capacity and selectivity for heavy metals, which are a major pollutant source, has been actively studied because conventional technologies are often inadequate to reduce concentrations in wastewater to acceptable regulatory standards. Commercially available ion-exchange sorbents such as Duolite GT-73, Amberlite IRC-718, Dowex SBR-1 and Amberlite IRA 900X are limited in their ability to remove heavy metal contaminants and are often inadequate for most applications. Genetic and protein engineering have emerged as the latest tools for the construction of nanoscale materials that can be controlled precisely at the molecular level. With the advent of recombinant DNA techniques, it is now possible to create artificial protein polymers with fundamentally new molecular organizations, allowing targeted removal of toxic waste [11].
One of the major environmental pollution sources is automobile exhaust, which consists of harmful emission gases including NOx, carbon monoxide and unburned hydrocarbons (HCs) that cause smog and acid rain. Most biological reactions that build the human body are catalytic, but the application of catalysis in the manufacturing sector of our industrialized world started in the early 1800s and came into extensive use (see Table 4.3) following the discovery of the platinum surface-catalyzed reaction of H2 and O2 in 1835 [12].
Since 1975, automobile manufacturers have taken a variety of steps to reduce the emission of these harmful gases, which can be converted by catalytic reactions in the catalytic converter via the following chemical reactions:
CO + O2 → CO2 (CO oxidation)
HC + O2 → CO2 + H2O (HC oxidation)
NOx + HC → N2 + H2O + CO2 (NOx reduction by HC)
NOx + CO → N2 + CO2 (NOx reduction by CO)
The harmful pollutants are converted into relatively benign molecules such as CO2, N2 and H2O through reactions that occur inside the automobile catalytic converter in the presence of catalysts consisting of mixtures of platinum-group metals such as rhodium (Rh), platinum (Pt) and palladium (Pd). The future targets for the reduction of emission gases from automobile exhaust are very demanding; the requirement on NOx has been proposed to be 0.05 g per mile, which is about one-quarter of the value that can be achieved with current catalytic converter technology. Transition metal carbides and oxycarbides are being considered as a replacement for the expensive Pt-group metals (Ru, Rh, Ir, Pd and Pt), since recent results show strong similarities between the catalytic properties of transition metal carbides and those of the less abundant Pt-group metals. In addition to offering a very high surface-to-volume ratio, nanoparticles offer the flexibility of tailoring the structure and catalytic properties on the nanometer scale.
Nanocrystalline materials composed of crystallites in the 1–10 nm size range possess very high surface-to-volume ratios because of the fine grain size. These materials are characterized by a very high number of low-coordination-number atoms at edge and corner sites, which can provide a large number of catalytically active sites. For example, gold catalyst systems, consisting of gold nanoparticles on oxide supports, can be used for a wide variety of reactions [13, 14], and many of these have potential for applications in pollution control. Supported gold catalysts are active for the oxidation of methane and propane, and the removal of NOx has also been demonstrated. In exploratory work, a gold on transition metal oxide catalyst system has shown potential as a low-temperature three-way catalyst for automobile emission control [15, 16], with the light-off temperatures lowered for both hydrocarbons and carbon monoxide when fresh catalyst is used. A further automotive use for gold catalysts could be in the decomposition of ozone [17]. Consequently, the number of patents related to gold catalysis has shown an upward trend, with close to one-third of such granted patents involving pollution control (Table 4.4) for the period 1991–2001 [18].
Table 4.4 Comparative (%) number of patents granted in the field of catalytic gold nanoparticles (1991–2001):
Chemical processing: 46
Pollution control: 29
Catalyst manufacture: 15
Fuel cells: 10
In the area of pollution control, some patents [19, 20] have been filed claiming the use of gold catalysts for automotive emissions. Some promise has been demonstrated for applications in motor vehicle emission devices, most likely in the exhaust treatment of gasoline and diesel cars running at lower temperature ranges and for low light-off applications such as cold-start conditions in gasoline engines [16]. The use of gold on a clay mineral containing magnesium silicate hydrate has been patented by Toyota for use with ozone to destroy odors [21].
Another promising technology, utilizing the enhanced surface properties of inorganic nanoparticles, involves the photocatalytic degradation of organic pollutants in water. Figure 4.1 shows a schematic of this technique. It is based on irradiation of a semiconductor surface with light having energy greater than the semiconductor bandgap, exciting electrons from the valence band to the conduction band. This generates electron–hole pairs at the surface, which are consumed by either recombination or surface trapping. However, the presence of organic molecules on the surface of the semiconductor material results in a catalytic redox reaction through interfacial charge transfer [22]. Semiconductor materials of choice are II–VI materials such as TiO2 and ZnO. Noble metals such as gold and platinum on such semiconductor nanoparticles act as a sink for photogenerated charge carriers and promote an interfacial charge-transfer process that leads to an increase in the photocatalytic efficiency of metal oxide semiconductors. Under UV illumination, electrons accumulate on the metal surface (making electron–hole pair separation possible and decreasing the recombination of surface charges) and the hole oxidizes adsorbed species [23]. This technique has been demonstrated in degrading 4-chlorophenol [24] and chloroform [25] as models of harmful organic chemicals.
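The competition between recombination and interfacial transfer described above can be summarized as a branching ratio: the fraction of photogenerated pairs that do chemistry is k_transfer/(k_transfer + k_recombination). A toy Python illustration with placeholder rate constants of ours:

def photocatalytic_yield(k_transfer, k_recomb):
    """Fraction of electron-hole pairs consumed by interfacial charge
    transfer rather than recombination (two parallel first-order channels)."""
    return k_transfer / (k_transfer + k_recomb)

# Placeholder rate constants (s^-1); a metal 'sink' slows recombination.
print(f"bare semiconductor:     {photocatalytic_yield(1e7, 1e9):.1%}")
print(f"with metal co-catalyst: {photocatalytic_yield(1e7, 1e8):.1%}")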
4.3
Remediation Technologies to Clean Up Environmental Pollutants Effectively
insecticides and wood preservatives and are ubiquitous in the environment of both
industrialized and agrarian nations. A subgroup of these chemicals, referred to as
chlorinated aromatics, includes chlorinated benzenes, polychlorinated biphenyls
(PCBs), pentachlorophenol (PCP) and insecticides such as DDT. Microbial degradation and naturally occurring hydrolysis of these compounds are very slow processes
(e.g. for 4-chlorophenol at 9 C the halflife is nearly 500 days). Some direct
photodegradation also occurs, although the limited optical absorbance of chlorinated
aromatics at wavelengths above 350 nm slows the process drastically. Sometimes this
direct photolysis can actually lead to more toxic products; e.g. direct photolysis of PCP
has been reported to lead to octachlorodibenzo-p-dioxin, an even more toxic species
than its precursor [29]. It is clear that more effective methods of treatment of these
chlorinated aromatics must be sought [30, 31]. To this end, a few groups have been
investigating the photocatalytic oxidation of these compounds to form harmless CO2
and HCl, a process referred to as total mineralization [32]. Photocatalysis is needed to reduce toxic pollutants in the atmosphere and in water [33], including VOCs in the atmosphere, and also for the reduction of NOx [34] (largely from vehicle exhausts) into N2, N2O, NO2 and O2 over semiconductor and zeolite catalysts at ambient temperature. The photocatalytic reaction between ammonia and nitric oxide has been investigated on a TiO2 wafer under near-UV illumination [35]. Using a novel, integrated nanobiotechnological approach comprising catalytic dechlorination with FeS nanoparticles followed by microbial degradation, complete removal of 5 mg L⁻¹ of lindane (γ-hexachlorocyclohexane), an organochlorine pesticide and persistent organic pollutant (POP), from aqueous solution in less than 10 h has been reported [36].
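The quoted hydrolysis half-life can be translated into a first-order rate constant, k = ln 2 / t1/2, which makes clear how slow natural attenuation is. A minimal sketch using the roughly 500 day half-life of 4-chlorophenol mentioned above:

import math

t_half = 500.0                # days: 4-chlorophenol hydrolysis at 9 deg C (quoted above)
k = math.log(2.0) / t_half    # first-order rate constant, per day
t_99 = math.log(100.0) / k    # time for 99% removal, days

print(f"k = {k:.2e} per day")
print(f"99% removal takes {t_99:.0f} days (~{t_99 / 365:.1f} years)")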
A variety of photocatalyst nanoparticles have been synthesized, such as oxides
(TiO2, ZnO, Fe2O3, WO3, SnO2, Ag2O, V2O5, SrTiO3), sulfides (ZnS, CdS, MoS2, CuxS, Ag2S, PbS), selenides (CdSe, PbSe, HgSe), iodides (AgI) and modified systems such as coupled semiconductor systems (CdS/TiO2, CdSe/TiO2, SnO2/TiO2, ZnO/TiO2, ZnO/CdS). Among them, TiO2 nanoparticles and modified TiO2 nanoparticles are the most extensively studied and are considered the most efficient photocatalysts [37]. Other semiconductor nanoparticles generally have lower photocatalytic activity than TiO2 and some have problems associated with stability, reactivity, etc. [26]. Fe2O3 easily undergoes photocathodic corrosion [38] and its active form, α-Fe2O3, also has high selectivity for the reactant [39].
Many chlorinated aromatics and aliphatics, together with many pesticides and herbicides, are toxic to aquatic life even at low concentrations and exert a cumulative, deleterious effect on river basins and other receiving streams that they enter from manufacturing operations and user applications. Reductive dechlorination of organics by various bulk metals (particularly Fe) in the aqueous phase has been well documented. Although nanoparticles possess several advantages (e.g. high surface area and surface energy), sustainability requires particle immobilization on a base membrane to avoid particle loss and agglomeration. Nanostructured metals immobilized in a membrane phase give high reaction rates at room temperature and a significant reduction in metal usage, while minimizing the need for recovery of the non-chlorinated products (e.g. ethylene from TCE) and leading to a subsequent improvement in water quality. The use of non-toxic, polypeptide-based membrane assemblies to create nanosized metal domains therefore has significant environmental importance [40]. Nanoscale bimetallic (Fe/Pd, 99.9% Fe) particles are considered a new generation of remediation technology that could provide cost-effective remedial solutions at some of the most difficult waste dumping sites [41]. The complete reduction of aqueous perchlorate to chloride by nanoscale iron particles has been observed over a wide concentration range (1–200 mg L⁻¹). The reaction is temperature sensitive, as evidenced by progressively increasing rate constants of 0.013, 0.10 and 1.52 mg perchlorate per gram of iron per hour at temperatures of 25, 40 and 75 °C, respectively. The high activation energy of 79.02 ± 7.75 kJ mol⁻¹ partially explains the stability of perchlorate in water. Iron nanoparticles may represent a feasible remediation alternative for perchlorate-contaminated groundwaters.
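The quoted activation energy can be recovered from the three rate constants through the Arrhenius equation, ln k = ln A − Ea/(RT): a least-squares fit of ln k against 1/T has slope −Ea/R. A short sketch of that arithmetic, using only the values quoted above:

import math

# Rate constants for perchlorate reduction (mg perchlorate per g Fe per h)
# at 25, 40 and 75 deg C, as quoted above.
T = [298.15, 313.15, 348.15]          # absolute temperature, K
k = [0.013, 0.10, 1.52]

x = [1.0 / t for t in T]              # 1/T
y = [math.log(v) for v in k]          # ln k

# Least-squares slope of ln k versus 1/T equals -Ea/R.
n = len(x)
mx, my = sum(x) / n, sum(y) / n
num = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
den = sum((xi - mx) ** 2 for xi in x)
slope = num / den

Ea = -slope * 8.314 / 1000.0          # activation energy, kJ/mol
print(f"Ea = {Ea:.1f} kJ/mol")        # ~80 kJ/mol, matching the quoted 79.02 +/- 7.75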
4.4
Sensors
during interaction with chemicals. Optical sensors detect changes in visible light or
other electromagnetic waves during interactions with chemicals. Within each of
these categories, some sensors may exhibit characteristics that overlap with those of
other categories. For example, some mass sensors may rely on electrical excitation or
optical settings. Nevertheless, these four broad categories of sensors are sufficiently
distinct for the purposes of this chapter. The following sections provide a summary
and assessment of the sensors reviewed in each of the four categories.
4.4.1
Biosensors
Electrochemical sensors represent a key area where the use of nanotechnology (e.g.
nanopowders), innovative materials and nano- and micro-fabrication techniques can
give sensor products that offer significant enhancements with respect to sensitivity, selectivity, power consumption and reproducibility. The idea of using semiconductors as gas-sensitive devices dates back to the early 1950s, when Brattain first reported
Other strategies for sensing include nanomechanical sensors [3]. Cantilever sensors
have also been used for detecting chemicals, such as volatile compounds [58], warfare
pathogens [59], explosives [60] and glucose [61] and ionic species, such as calcium
ions [62]. The key to using microcantilevers for the selective detection of molecules is
the ability to functionalize one surface of the silicon microcantilever in such a way
that a given target molecule will be preferentially bound to that surface upon its
exposure. The bending and the changes in resonant frequency can be monitored by
several techniques, with optical beam deflection, piezoresistivity, piezoelectricity,
interferometry, capacitance and electron tunneling among the most important [63].
This strategy allows microcantilever sensors to measure extremely small changes due to molecular adsorption and, for that reason, they are extremely sensitive biosensors; with the cantilever technique, it is possible to detect surface stresses as small as about 10⁻⁴ N m⁻¹. Such measurements are also quantitative, being related to the concentration of the analyte detected [64]. Nonetheless, the factors and the phenomena responsible for the surface stress response during molecular recognition remain unclear. Electrostatic interactions between neighboring adsorbates, changes in surface hydrophobicity and conformational changes of the adsorbed molecules can all induce stresses, which may compete with each other, so the change in stress is not directly related to the receptor–ligand binding energy.
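The link between such a surface stress and an observable signal is usually made through Stoney's formula for the tip deflection of a rectangular cantilever, Δz ≈ 3(1 − ν)L²Δσ/(Et²). The sketch below evaluates it for a detection-limit stress of 10⁻⁴ N m⁻¹; the cantilever dimensions and silicon elastic constants are assumed typical values, not parameters from the cited work.

# Stoney's formula: cantilever tip deflection under a differential surface stress,
# dz = 3 * (1 - nu) * L**2 * dsigma / (E * t**2).
L = 500e-6       # cantilever length, m (assumed typical value)
t = 1e-6         # cantilever thickness, m (assumed typical value)
E = 170e9        # Young's modulus of Si, Pa (approximate)
nu = 0.27        # Poisson's ratio of Si (approximate)
dsigma = 1e-4    # differential surface stress, N/m (detection limit quoted above)

dz = 3.0 * (1.0 - nu) * L**2 * dsigma / (E * t**2)
print(f"tip deflection = {dz * 1e9:.2f} nm")   # ~0.3 nm, resolvable by beam deflection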
Figure 4.3 Mass sensors using microcantilevers: working mechanism. From Ref. [3].
Silicon, silicon nitride and silicon oxide cantilevers are available commercially with
different shapes and sizes, analogous to AFM cantilevers, with typical lengths of 10–500 µm and ultrathin cantilevers as thin as 12 nm. However, for specific applications (e.g. highly sensitive biosensors), cantilevers must be designed and fabricated to satisfy the corresponding requirements. Cantilever sensitivity depends critically on
the spring constant: the lower the constant, the higher is the sensitivity for
measurements in liquids based on the static method. Figure 4.3 shows a schematic
of the working mechanism of a microcantilever sensor.
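In the dynamic (resonant) mode underlying the mass sensors of Figure 4.3, an end-loaded mass shifts the resonant frequency f = (1/2π)√(k/m); inverting this relation converts a measured frequency shift into an adsorbed mass. A minimal sketch with assumed, typical cantilever parameters:

import math

k = 1.0          # spring constant, N/m (assumed typical value)
f0 = 100.0e3     # resonant frequency before adsorption, Hz (assumed)
f1 = 99.999e3    # resonant frequency after adsorption, Hz (assumed 1 Hz downshift)

# For an end-loaded harmonic oscillator, f = (1/2 pi) * sqrt(k/m), so m = k / (2 pi f)^2.
dm = k / (2.0 * math.pi * f1) ** 2 - k / (2.0 * math.pi * f0) ** 2
print(f"adsorbed mass = {dm * 1e18:.0f} fg")   # ~50 fg for a 1 Hz shift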
For accurate functioning of such sensors based on microcantilevers, the immobilization process should:
schemes have shown that cantilevers were able to detect, with great accuracy and selectivity, different ions, such as CrO4²⁻ [67], Ca²⁺ [62] and Pb²⁺ [68]. Biological applications of cantilever-based mass sensors include the detection of different pathogens, such as Salmonella enterica by Weeks et al. [59], Vaccinia virus by Gunter et al. [69] and fungal spores from Aspergillus niger by Nugaeva et al. [70].
Monitoring concentrations of specific pesticides plays an essential role in the environmental control field. An example of the application of a cantilever-based
biosensor in this area was reported by Alvarez et al. [71] for the detection of the
organochlorine insecticide dichlorodiphenyltrichloroethane (DDT). A synthetic hapten of the pesticide, conjugated with bovine serum albumin (BSA), was covalently
immobilized on the gold-coated side of the cantilever; specific detection was then achieved by exposing the cantilever to a solution containing the specific monoclonal antibody to the DDT–hapten derivative. The specific binding of the antibodies on the
sensitized side of the cantilever was measured with nanomolar sensitivity. Finally, a
competitive assay was performed, with the cantilever exposed to a mixed solution of
the monoclonal antibody and DDT and direct detection was achieved. With this
detection strategy, DDT concentrations as low as 10 nM were detected, with deflection signals in the 50 nm range. Many other applications have been described for the detection of pesticides and of avidin and streptavidin [72].
4.4.4
Optical Sensors
For realizing sensitive chemical sensors and biosensors, optical methods employing optical fibers or integrated optics (IO) and, in the case of remote sensing, connecting fiber pigtails, offer high sensitivity and fast responses. Contemporary methods for optical sensing of chemical and biological species are based mainly on interferometry, surface plasmon resonance (SPR) and luminescence. The relatively recent technique of luminescence quenching [73] is a new alternative. The photoluminescence (PL) properties of nanocrystalline (porous) silicon depend on the chemical nature of its surface; for example, metal ions can quench the PL from porous Si [74], as can inorganic molecules [75–77]. The nanoscale size permits in vivo monitoring of processes within individual cells. Measurements of concentrations of toxic chemicals within carcinoma cells [78] have already been achieved. PEBBLE (probe encapsulated by biologically localized embedding) nanosensors have been prepared for the analytes oxygen [79, 80], potassium [81], zinc [82] and magnesium [83]. The wide variation of the optical properties of gold nanoparticles with particle size, interparticle distance and the dielectric properties of the surrounding medium due to SPR [84–86] permits the construction of simple but sensitive colorimetric sensors for various analytes. Highly sensitive colorimetric sensors for biomolecules [87–89] and metal ions [4, 5], amongst others, have been devised using the SPR of gold nanoparticles. In the example of sensing heavy metal ions, well-known metal ion chelators, such as chitosan, can be coated on the nanoparticle surfaces, such that the ligand changes its dielectric properties upon chelating the metal ions, resulting in an optical (colorimetric) signal (Figure 4.4).
4.4.5
Gas Sensors
Gas sensors for detecting air pollutants must be able to operate stably under deleterious
conditions, including chemical and/or thermal attack. Therefore, solid-state gas
sensors appear to be the most appropriate in terms of their practical robustness. The
sensors used for detecting air pollutants are usually produced simply by coating a
sensing (metal oxide) layer on a substrate with two electrodes. Typical materials are tin(IV) oxide (SnO2), zinc oxide (ZnO), titanium dioxide (TiO2) and tungsten oxide (WO3), with typical operating temperatures of 200–400 °C [90]. When the active surface of a
metal oxide (e.g. zinc oxide) grain is exposed to the ambient oxygen, the oxygen atoms
are adsorbed on the surface as shown in Figure 4.5 and the adsorbed oxygen acts as an
acceptor state. Being ionized, a depletion layer is formed on the surface of the grains and also in the neck regions, which raises the height of the potential barrier [91]. In the presence of a reducing gas, the adsorbed oxygen can easily react with the gas molecules and leave the surface, thus reducing the concentration of acceptors, which in turn lowers the potential barrier height. When the metal oxide sensor adsorbs a reducing gas (CO, H2), the depletion region at the surface shrinks, leading to increased conductivity. On the other hand, if a metal oxide sensor adsorbs an oxidizing gas (NO2), the depletion zone at the surface grows, meaning decreased conductivity. The change in conductivity/resistance is related to the gas concentration. In the case of a ZnO sensor, the conductivity decreases, i.e. the resistance increases, when the sensor adsorbs NOx, to an extent that depends on the NOx concentration [90].
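The response of such metal oxide layers to a reducing gas is often summarized by the empirical power law S = Rair/Rgas ≈ A·Cⁿ, with the constants A and n obtained by calibration. The sketch below applies such a calibration in both directions; A and n here are hypothetical values, not data from Ref. [90].

# Empirical power-law response of a metal oxide sensor to a reducing gas:
# S = R_air / R_gas = A * C**n. A and n are hypothetical calibration constants.
A, n = 2.0, 0.5

def response(c_ppm):
    """Sensor response S at gas concentration c_ppm (ppm)."""
    return A * c_ppm ** n

def concentration(s):
    """Invert the calibration: gas concentration (ppm) for a measured response s."""
    return (s / A) ** (1.0 / n)

print(response(100.0))       # S = 20.0 at 100 ppm
print(concentration(20.0))   # recovers 100 ppm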
A host of sensing materials is available for use as the sensing layer in solid-state gas sensors, as shown in Table 4.5.
Table 4.5 Materials for solid-state gas sensors.

Sensing material    Target gases
SnO2                NO2, CO, NOx, CH4, O2, ethanol, phenylarsine, C6H6, diethyl ether, H2S, H2, ammonia, iso-C4H10
SnO2                O2
Despite their simplicity and low production cost, solid-state gas sensors (SGS) usually exhibit drift and variations in behavior. With the introduction of nanoparticles, the sensitivity of gas sensors has improved: the use of nanoscale materials exposes a higher surface area of the sensing element to the gas, and hence the physicochemical reaction that proceeds at the surface is enhanced [92].
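The surface-area argument can be quantified: for dense spherical grains the specific surface area is A = 6/(ρd), so reducing the grain diameter from 1 µm to 10 nm raises the area exposed to the gas by a factor of 100. A sketch for SnO2, taking its approximate bulk density:

rho = 6950.0    # bulk density of SnO2, kg/m^3 (approximate)

def specific_area(d):
    """Specific surface area (m^2/g) of dense spheres of diameter d (m): A = 6/(rho*d)."""
    return 6.0 / (rho * d) / 1000.0

for d in (1e-6, 100e-9, 10e-9):
    print(f"d = {d * 1e9:6.0f} nm -> {specific_area(d):6.1f} m^2/g")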
4.4.6
Novel Sensing Technologies and Devices for Pollutant and Microbial Detection
Table 4.6 Selected waterborne contaminants in developing countries [95].

Problem
Pathogens
Metals, e.g. arsenic
Pesticides
Algal toxins
Nitrates
Fluoride
Organic compounds
4.5
Conclusions
Acknowledgments
We are grateful for support from the Swedish International Development Agency
(SIDA) and the National Nanotechnology Center (NSTDA) of the Thai Ministry of
Science and Technology (MOST).
References
1 US Environmental Protection Agency, Report EPA/600/R-92/219, EPA, Washington, DC, 1992.
2 M.-I. Baraton, L. Merhari, J. Nanopart.
Res. 2004, 6, 107.
3 L. G. Carrascosa, M. Moreno, M. Alvarez,
L. M. Lechuga, Trends Anal. Chem. 2006,
25, 196.
4 S. O. Obare, R. E. Hollowell, C. J. Murphy,
Langmuir 2002, 18, 10407.
47 A. Lindgren, L. Stoica, T. Ruzgas, A.
Ciucu, L. O. Gorton, Analyst 1999, 124,
527; K. R. Rogers, J. N. Lin, Biosens.
Bioelectron. 1992, 7, 317; C. Nistor, J.
Emneus, L. Gorton, A. Ciucu, Anal.
Chim. Acta 1999, 387, 309; K. Riedel, in G. Ramsay (ed.), Commercial Biosensors: Applications to Clinical, Bioprocess and Environmental Samples, Wiley, New York, 1998; E. Dominguez, in O'Sullivan, G. G. Guilbault, S. Alcock,
A. P. F. Turner (eds.), Biosensors for
Environmental Monitoring: Technology
Evaluation, University College Cork,
Cork, 1998.
48 J. Liu, Y. Lu, Adv. Mater. 2006, 18, 1667.
49 W. H. Brattain, J. Bardeen, Bell Syst. Tech.
J. 1953, 32 (1), 1.
50 S. Jonda, M. Fleischer, H. Meixner, Sens.
Actuators B 1996, 34, 396.
51 G. Eranna, B. C. Joshi, D. P. Runthala, R.
P. Gupta, Crit. Rev. Solid State Mater. Sci.
2004, 29, 111.
52 H. Ogawa, M. Nishikawa, A. Abe, J. Appl.
Phys. 1982, 53, 4448.
53 H. Nanto, T. Minami, S. Takata, J. Appl.
Phys. 1986, 60, 482.
54 G. Faglia, P. Nelli, G. Sberveglieri, Sens.
Actuators B 1994, 19, 497.
55 A. Heilig, N. Barsan, U. Weimar, M. Schweizer-Berberich, J. W. Gardner, W. Göpel, Sens. Actuators B 1997, 43, 45.
56 Q. F. Pengfei, O. Vermesh, M. Grecu, et al.
Nano Lett. 2003, 3, 347.
57 O. Pummakarnchana, N. Tripathi, J. Dutta, Sci. Technol. Adv. Mater. 2005, 6, 251.
58 M. K. Baller, H. P. Lang, J. Fritz, C. Gerber,
J. K. Gimzewski, U. Drechsler, H.
Rothuizen, M. Despont, P. Vettiger, F. M.
Battiston, J. P. Ramseyer, P. Fornaro, E.
Meyer, H. J. Guntherodt, Ultramicroscopy
2000, 82, 1.
59 B. L. Weeks, J. Camarero, A. Noy, A. E.
Miller, L. Stanker, J. J. De Yoreo, Scanning
2003, 25, 297.
60 L. A. Pinnaduwage, A. Gehl, D. L.
Hedden, G. Muralidharan, T. Thundat, R.
T. Lareau, T. Sulchek, L. Manning, B.
5
Benefits in Energy Budget
Ian Ivar Suni
5.1
Introduction
Nanomaterials in the 1–100 nm size range have unusual potential for applications within a wide variety of existing and emerging technologies. Nanomaterials have several intriguing properties that may be exploited for technological applications. Due to quantum confinement effects, when their dimensions are comparable to the electron mean free path or the optical wavelength, the electronic and optical properties of nanomaterials become size dependent. This is of course the origin of the unique properties of the widely popularized quantum dots, which exhibit quantum confinement in all three dimensions. Another interesting property of nanomaterials, and
in particular nanoparticles, is their unusually high chemical reactivity. This has led to
the widespread use of metal and metal oxide nanoparticles as commercial catalysts in
the chemical and petrochemical industries. Metal nanoparticles are also currently
employed within catalytic converters in automobiles as three-way catalysts. Three-way
catalysts catalyze the following three reactions: oxidation of unburned hydrocarbons,
oxidation of CO, and reduction of nitrogen oxides.
Another interesting aspect of nanomaterials is their unusually high surface area per
unit mass. Many potential applications that exploit the high surface area of nanomaterials involve their use within compacted solids as what are termed nanostructured materials, which in many cases are composite materials. Nanostructured materials thus
have extremely high internal surface areas, although these may not be chemically accessible. Composite nanomaterials can be fabricated from nanowires or nanotubes of extremely high aspect ratio, allowing for low percolation thresholds. This means that high
aspect ratio nanomaterials can more easily form interacting networks within a composite material to form a conductive electrical pathway or to increase mechanical strength.
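The percolation argument can be made semi-quantitative with the excluded-volume estimate for randomly oriented slender rods, which gives a critical volume fraction of roughly φc ≈ 0.7·D/L, i.e. inversely proportional to the aspect ratio. The sketch below applies this order-of-magnitude rule to aspect ratios typical of nanotubes:

# Excluded-volume estimate for randomly oriented slender rods:
# critical volume fraction phi_c ~ 0.7 * D / L (order of magnitude only).
def phi_c(aspect_ratio):
    return 0.7 / aspect_ratio

for ar in (1e3, 1e4, 1e5):
    print(f"aspect ratio {ar:8.0f} -> phi_c ~ {phi_c(ar) * 100:.4f} vol.%")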
Although nanomaterials have several existing applications, their potential for the
development of new technologies is the main source of the excitement within academia,
government and industry. Among the most widely anticipated applications of
nanomaterials is the development of more environmentally friendly and more efficient
energy sources. Interest in sustainable energy is driven in part by long-term concerns
about the scarcity of hydrocarbon fuels, which are in increasingly great demand
with the rapid industrialization of China, Russia, Brazil and other emerging economies.
In addition, concerns about greenhouse gas emissions such as CO2 that arise from
combustion of hydrocarbons are generating interest in cleaner energy sources.
Applications of nanomaterials in the field of energy include fuel cell catalysts, fuel cell support materials, hydrogen storage, solar cells, lithium ion batteries and supercapacitors. The current discussion will focus on recent results, on clear demonstrations of the utility of nanomaterials and on the scientific basis for these applications. In low-temperature fuel cells, Pt and other noble metal nanoparticle catalysts have been widely studied for their ability to catalyze efficiently the electrochemical reduction of oxygen
and the electrochemical oxidation of both hydrogen and methanol. Because these
nanoparticle catalysts may be interspersed with less conductive materials, carbon
nanotubes have been widely investigated as catalyst support materials to improve
catalyst utilization in fuel cells. The development of fuel cells and other energy sources
powered by molecular hydrogen, with water as the only chemical product, is an
important goal for sustainable energy. One of the critical issues limiting hydrogen
energy is the need for an infrastructure and new technology for hydrogen storage and
distribution. Although early results have now been shown to be misleading, carbon
nanotubes have been widely investigated for their hydrogen storage properties.
Further applications for nanomaterials can be envisioned in the area of solar
energy cells. The classical example of nanotechnology is the variation in optical
absorption/emission of semiconductor nanostructures with dimension. These
size-dependent properties have been exploited to alter the wavelength of optical
absorption to match the terrestrial window. In addition, nanostructured TiO2 in dye-sensitized solar cells (DSSCs) has been widely investigated due to its high internal surface area, which increases the available dye for optical absorption and maximizes internal reflections within the DSSC.
Nanomaterials have also been employed within Li ion batteries, particularly as
materials for anode construction. Many materials have been demonstrated to have
higher Li capacities than the prototypical graphite anode material, but they have been
prone to mechanical failure due to repeated expansion (contraction) during Li
insertion (removal) as the battery is charged (discharged). Intensive research efforts
have been expended to use these alternative Li anode materials in the form of
nanoparticles, nanowires or nanotubes to minimize mechanical strain during Li
insertion and removal.
5.2
Nanomaterials in Fuel Cells
5.2.1
Low-Temperature Fuel Cell Technology
main product produced would be H2O, with effectively zero emissions. Even the economical use of hydrocarbon fuels beyond gasoline and natural gas may have global benefits, as this may lessen the demand for those fuels. Fuel cells
operate by converting chemical potential energy directly into a current or voltage by
coupling an electrochemical oxidation reaction with an electrochemical reduction
reaction. A wide variety of fuel cells have been investigated, including proton
exchange membrane, direct methanol, molten carbonate, solid oxide and phosphoric
acid fuel cells.
High-temperature fuel cells such as molten carbonate, solid oxide and phosphoric
acid fuel cells have recently been employed for several applications, particularly those
where waste heat can be employed to reach and maintain the operating temperature.
For example, waste heat is widely generated throughout industrial chemical plants,
sometimes making fuel cells an economical energy source. At the operating temperature of these fuel cells, the anode and cathode reactions are typically fairly facile,
making the use of electrocatalysts, which are often in the form of nanoparticles,
unnecessary. In addition, nanomaterials may be subject to grain growth, sintering,
dissolution and other unwanted chemical reactions at high temperature.
On the other hand, nanomaterials are much more compatible with low-temperature fuel cells, which are needed for many transportation and consumer
applications where intermittent operation is typical and power requirements are
relatively modest. The most common low-temperature fuel cells are the polymer
electrolyte membrane fuel cell (PEMFC) and the direct methanol fuel cell (DMFC),
where the following reactions occur:
Anode PEMFC :
Anode DMFC :
H2 ! 2H 2e
CH3 OH H2 O ! CO2 6H 6e
O2 4H 4e ! 2H2 O
5:1
5:2
5:3
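For the overall H2–O2 reaction obtained by combining Reactions (5.1) and (5.3), standard thermodynamic values fix the maximum (reversible) cell voltage at E° = −ΔG°/(nF) ≈ 1.23 V and the maximum thermal efficiency at ΔG°/ΔH° ≈ 83%. The arithmetic, as a short sketch:

F = 96485.0      # Faraday constant, C/mol
n = 2            # electrons per H2 (Reaction 5.1)
dG = -237.1e3    # standard Gibbs energy, H2 + 1/2 O2 -> H2O(l), J/mol
dH = -285.8e3    # standard enthalpy of the same reaction, J/mol

E_rev = -dG / (n * F)                       # reversible cell voltage
print(f"E = {E_rev:.3f} V")                 # ~1.229 V
print(f"max efficiency = {dG / dH:.1%}")    # ~83%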
The structure of a typical proton exchange membrane fuel cell (PEMFC) is illustrated
in Figure 5.1. The cathode in a DMFC has a similar structure, whereas the anode
structure depends on whether the methanol feed stream is in liquid or vapor form. In
both electrodes of a PEMFC, the metal nanoparticle catalyst is dispersed atop larger
carbon particles that are combined into a porous structure that allows mass transport
of both reactants and products. Sandwiched between the two electrodes is a Nafion-type polymer material that serves as a proton conductor between the anode, where protons are produced, and the cathode, where protons are consumed. Nafion, a perfluorosulfonated polymer, facilitates proton hopping along its sulfonate groups through a series of electrostatic interactions.
Several technological challenges remain for commercialization of PEMFCs and
DMFCs. For economic reasons, the use of precious metal catalysts should be reduced
or eliminated. The cost of Nafion polymer membranes may also need to be reduced and their durability improved. Nafion membranes are employed for proton transport in both PEMFCs and DMFCs and are only conductive within a narrow temperature
range, where the membrane is neither dried out nor flooded. In an H2-fueled PEMFC, the need to maintain an appropriate humidification level throughout the fuel cell creates a complex water management problem. The local humidification
depends on a complex balance between water production at the cathode, water
consumption at the anode, water diffusion from the cathode to the anode, and water
electro-osmosis from the anode to the cathode [1]. The durability of PEMFCs and
DMFCs must also be improved, since Nafion degradation, catalyst agglomeration
and dissolution and carbon corrosion can all occur upon prolonged operation at high
current density.
5.2.2
Nanoparticle Catalysts in Low-Temperature Fuel Cells
When considering the use of nanomaterials in fuel cells, many observers would first consider the use of nanoparticle catalysts in both the anode and cathode. However, nanoparticle fuel cell catalysts will be discussed only briefly, since these reactions have been widely studied and the interested reader can consult recent reviews [2–9]. Hydrogen oxidation by Reaction (5.1) at the anode of a PEMFC is the
most facile due to its simple reaction mechanism, and Pt nanoparticles are widely
used as electrocatalysts for this reaction. The main complication is that Pt catalysts
can be easily poisoned by trace CO in the H2 fuel, and so far the best performance has
been attained by PtRu bimetallic nanoparticle catalysts, preferably with a 1 : 1 ratio of
Pt : Ru, that facilitate CO desorption. Ternary and quaternary catalysts have also been
widely investigated.
The other two reactions above, methanol oxidation by Reaction (5.2) and O2 reduction by Reaction (5.3), involve more complex mechanisms and multi-step electron transfer, making electrocatalysis more difficult. O2 reduction is most facile
on Pt nanoparticle catalysts, and the use of Pt alloys with transition metals such as Co,
Cr, Ti and Zr has been thoroughly investigated. However, for catalysts tested to date,
the overpotential loss of 300–400 mV for O2 reduction still accounts for about 80% of
the voltage loss in a typical PEMFC [10]. Similarly, methanol oxidation has been
widely studied on Pt nanoparticle catalysts alloyed with a wide variety of different
transition metals, including Ru, Os and Sn. Given that the expensive Pt catalyst contributes significantly to the overall fuel cell cost, non-Pt catalyst materials are also under intensive investigation for both PEMFCs and DMFCs [11, 12]. However, Pt and its alloys in nanoparticle form remain the best catalysts for Reactions (5.1)–(5.3) in
low-temperature fuel cells.
5.2.3
Fuel Cell Catalyst Support Materials
Both PEMFCs and DMFCs employ porous catalyst support structures, typically
some form of carbon, that perform multiple functions. The catalyst support
material must be porous enough to provide a pathway for inlet and outlet of
gaseous reactants and products, but it must also maintain electrical conductivity so
that the voltage (current) created across the fuel cell can be captured for use or
storage. The requirement to maintain electrical conductivity is complicated by the
presence of Nafion polymer, which is typically far less conductive than the carbon
support material.
Carbon nanotubes and carbon nanofibers have been widely investigated for possible application in the catalyst supports shown in Figure 5.1 for the PEMFC [10]. The main improvement that is envisioned is increased utilization of
the Pt catalyst supported on carbon nanotubes. The high nanotube aspect ratio
increases the likelihood that Pt catalyst will have direct electrical contact to the desired
electrode, without electrical blockage by intervening Nafion particles. Another
potential advantage of carbon nanotubes as catalyst supports is their improved
resistance to oxidation. One of the primary barriers to commercialization of both
PEMFCs and DMFCs is their poor durability. During long-term usage, catalyst
agglomeration, catalyst dissolution and carbon corrosion all occur, resulting in a
gradual loss of performance.
Wang et al. recently reported that the corrosion current for carbon nanotube
catalyst support materials in a PEMFC cathode is 30% lower than that from Vulcan
XC-72 carbon catalyst support materials [13]. In addition, these authors noted that the
supported Pt catalyst better maintains its activity for the oxygen reduction reaction. Li
and Xing recently used cyclic voltammetry to compare corrosion currents for carbon
nanotube and Vulcan XC-72 carbon catalyst supports following prolonged oxidation [14]. They found that for the carbon nanotube-based support material, the
Carbon nanotubes, which are allotropes of carbon from the fullerene structural family, have been the most widely studied nanomaterial. They can be conceived as all-sp² carbon arranged in graphene sheets that have been rolled up into hollow tubes. The nanotubes can be capped at the ends by a fullerene-type hemisphere and can range in length from tens of nanometers to several micrometers. Carbon nanotubes can be subdivided into two categories, single-wall carbon nanotubes (SWNTs) and multi-wall carbon nanotubes (MWNTs). As the name suggests, SWNTs consist of a single hollow tube with a diameter of 0.4–3 nm, whereas MWNTs are composed of multiple concentric nanotubes spaced by 0.34 nm, with overall diameters of 2–200 nm.
Research interest in carbon nanotubes arises from several of their extraordinary
properties. For example, their mechanical strength per unit weight is 100 times
greater than that of steel, their electrical conductivity is similar to that of Cu and their
thermal conductivity is comparable to that of diamond [15]. MWNTs have electrical
conductivities greater than those of metals, but depending on the tube diameter and
chirality, SWNTs can behave electronically as either metals or semiconductors. In
addition, the aspect ratios of both SWNTs and MWNTs can be as high as 10³–10⁵,
allowing for low percolation thresholds when they are employed in composite
materials. Thus carbon nanotubes have been proposed for a diverse range of
applications, including nanoscale transistors, chemical sensors, high-strength composites, hydrogen storage, and fuel cell electrode supports.
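The metal-versus-semiconductor behavior of SWNTs mentioned above follows a simple zone-folding rule: an (n, m) tube is metallic when n − m is divisible by 3, so roughly one-third of tubes with random chirality are metallic. As a one-line check:

def is_metallic(n: int, m: int) -> bool:
    """Zone-folding rule: an (n, m) SWNT is metallic iff (n - m) is divisible by 3."""
    return (n - m) % 3 == 0

print(is_metallic(10, 10))   # True: armchair tubes are always metallic
print(is_metallic(10, 0))    # False: this zigzag tube is semiconducting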
Carbon nanotubes can be made by laser ablation, electric arc discharge and
chemical vapor deposition. The detailed synthesis conditions, such as temperature,
pressure, or the presence of an inert gas, strongly influence the properties of the resulting carbon nanotubes, as does the presence and type of metal catalyst employed. One of the primary difficulties with these synthetic methods is that all create a complex mixture of different carbon forms, including amorphous carbon, graphite particles and carbon nanotubes. Thus synthesis must typically be followed by a difficult separation process.
For applications as fuel cell catalyst support materials, one should also consider
the chemical reactivity of carbon nanotubes, since they must rst be functionalized
with metal nanoparticle catalysts and then formed into porous support materials.
Both of these processes involve solution-phase chemistry, preferably aqueous
chemistry. Dispersion of carbon nanotubes into aqueous solvents is difficult given
their hydrophobicity, so the use of organic solvents is often required. In addition,
carbon nanotubes are highly chemically inert, so chemical or electrochemical
methods must be employed to attach the catalyst, typically Pt or its alloys.
Among the methods that have been employed for catalyst deposition are electroless
deposition, otherwise known as chemical impregnation, electrodeposition, microwave
that the catalyst activity on the standard support material may be compromised by
trace amounts of organosulfur compounds, common catalyst poisons [24].
Not surprisingly, several reports have also appeared of carbon nanotube supports
for Pt cathode catalysts in H2-fueled PEMFCs, where the oxygen reduction reaction
is the same [2629]. Shaijumon et al. recently reported interesting results for an
operating PEMFC with composite carbon catalyst supports containing a mixture of
Pt-decorated MWNTs and commercial Pt/C samples from E-Tek [29]. MWNTs were
fabricated by catalytic decomposition of acetylene in a CVD reactor and decorated
with Pt nanoparticles by impregnation with H2PtCl6 followed by reduction with
NaBH4. This yields Pt particles of size 58 nm. Composite cathodes were fabricated
by using 0, 25, 40, 50, 60, 75 and 100 wt.% Pt/MWNT, with the remainder of the
catalyst as commercial E-Tek 20 wt.% Pt/C, with a total Pt loading of 0.5 mg cm2.
The anode of the PEMFC was constructed from commercial E-Tek 20 wt.% Pt/C,
with a loading of 0.25 mg cm2. Surprisingly, an optimum performance was
observed with a composite catalyst support composed of a 50:50 wt.% mixture of
the two carbon forms. This yielded a current density of 535 mA cm2 at a voltage of
540 mV [29]. This is considerably higher than the corresponding current densities
for pure E-Tek and pure MWNT catalyst supports of 258 and 362 mA cm2,
respectively [29].
Waje et al. recently reported the PEMFC performance of CVD-grown carbon nanotubes that are pretreated by electrochemical reduction in a diazonium acetonitrile electrolyte and then decorated with Pt nanoparticles by standard impregnation and reduction methods [27]. This treatment yields uniform Pt particles of about 2–2.5 nm diameter, with a mass loading of about 0.09 mg cm⁻² [27]. This Pt/carbon nanotube catalyst layer was then employed as the cathode in an H2-fueled PEMFC with a standard E-Tek/Vulcan XC-72 anode catalyst, and its performance was compared with that of a reference E-Tek/Vulcan XC-72 cathode catalyst with a slightly lower loading, about 0.075 mg cm⁻². The maximum power density obtained for the Pt/nanotube cathode catalyst was about 290 mW cm⁻², whereas that obtained from the reference cathode catalyst was about 160 mW cm⁻² [27]. The authors contend that the superior performance at high current densities arises from the more open structure of the Pt/nanotube cathode catalyst, which enhances mass transport of reactants and products [30].
Several research groups have investigated the use of carbon nanotube catalyst supports on the anode side of an H2-fueled PEMFC [31–34]. Li et al. recently described a filtration method for incorporating a Pt/carbon nanotube film into a PEMFC so that it is partially oriented [31]. These authors started with commercial MWNTs, oxidized them in a nitric acid–sulfuric acid mixture, and deposited Pt nanoparticles by ethylene glycol reduction of H2PtCl6, producing Pt nanoparticles of 2–5 nm diameter. The Pt/MWNT suspension was then filtered through a hydrophilic nylon filter-paper, which apparently forces the hydrophobic MWNTs to stand up and self-assemble on the filter-paper [31]. These can then be pressed onto a Nafion membrane to create a partially oriented but somewhat loosely packed Pt/MWNT film. This Pt/MWNT film was then used as the cathode catalyst layer in an operating PEMFC and compared with two reference cathode catalysts, one made from a
non-oriented Pt/MWNT film and one made from E-Tek Pt/C. All cathodes had approximately the same Pt loading, 0.20–0.25 mg cm⁻². The best performance was
observed for the cathode catalyst containing oriented Pt/MWNT, which the authors
argue arises partly from improved mass transport [31]. The results of Li et al. are
shown in Figure 5.3, which compares oriented carbon nanotube cathode catalyst
support materials with non-oriented carbon nanotube supports.
Carmo et al. also studied carbon nanotube catalyst supports for PEMFC anode
fabrication using nanotubes grown by chemical vapor deposition [32]. Both Pt and
PtRu nanoparticle catalysts were deposited by impregnation of H2PtCl6, with and
without RuCl3, followed by reduction at elevated temperature in an H2 atmosphere.
This produced an average Pt particle size of about 3.6 nm and an average PtRu particle
size of about 4.6 nm, whereas the corresponding particle sizes on Vulcan XC-72 carbon
were 6.8 and 6.4 nm, respectively. As is typically observed, the particle sizes measured
by different techniques varied somewhat. The metal loading in all cases was approximately 0.4 mg cm⁻². At the anode side, the Pt/carbon nanotube catalyst showed significantly better performance than the Pt/Vulcan XC-72 catalyst, suggesting that carbon nanotubes may have some intrinsic role in suppressing CO poisoning [32]. However, the authors noted that this may be due to traces of the metal catalysts used for carbon nanotube growth. The same group also investigated the same set of catalysts and supports for the anode reaction in a direct methanol fuel cell, finding that
the best performance was obtained for a PtRu catalyst supported on MWNTs [32].
Liang et al. reported a study of carbon nanotube anode catalyst supports that
compared different techniques for PtRu nanoparticle formation [33]. They found
that reduction of mixtures of H2PtCl6 and RuCl3 in ethylene glycol yielded a much
higher nucleation density of nanoparticles than reduction in aqueous solutions of
formaldehyde. This was attributed to the lower polarity of ethylene glycol, which
therefore does not interfere with ion exchange reactions that deposit Pt and Ru [33].
The ethylene glycol reduction formed 2–8 nm diameter PtRu catalyst particles on MWNTs that were purchased commercially and purified before use. Anode catalysts with a loading of 0.22–0.32 mg cm⁻² were then compared with commercial catalysts with a loading of 0.39 mg cm⁻² in an operating hydrogen fuel cell. On a per weight
basis, the MWNT-supported catalyst exhibited superior performance in the middle to
high current density regime [33].
Several groups have studied the use of carbon nanotubes as anode catalyst supports in DMFCs [35–39]. Understanding of these studies is complicated by the need to separate effects associated specifically with the catalyst support from those associated
with the exact catalyst composition.
5.3
Hydrogen Storage
The efforts to develop energy sources powered by molecular hydrogen, rather than by
hydrocarbons, are motivated primarily by the desire to minimize the production of
greenhouse gases. Carbon dioxide, which is one of the common greenhouse gases, is
the inevitable product of energy sources fueled by hydrocarbons. By comparison,
energy sources fueled by hydrogen would produce mainly water, dramatically
reducing greenhouse gas emissions. The development of hydrogen-powered energy
sources encompasses a number of technical challenges, including the development
of low-cost, efficient fuel cells powered by hydrogen, in addition to low-cost, efficient
methods for producing and storing hydrogen. One of the greatest challenges for
applications in transportation is the lack of an infrastructure to store and distribute
hydrogen. The infrastructure for storing and distributing hydrocarbons is well
developed and the development of an alternative infrastructure for hydrogen is a
daunting economic and technological obstacle. Hydrogen storage for vehicular
applications has challenging constraints of weight and space.
Proposed hydrogen storage methods include compression, liquefaction, hydride
formation and adsorption on carbon and other nanomaterials, although all currently
have significant shortcomings [40]. Although hydrogen compression is the simplest
method for hydrogen storage, the energy density is not high enough for most
applications that are envisioned. In addition, this approach is thought to be more
expensive than hydrogen liquefaction. On the other hand, hydrogen liquefaction is
limited by the large energetic requirement for the liquefaction process and by the
continuous loss of hydrogen due to boiling.
For systems based on hydrogen adsorption, the simplest way to compare their
hydrogen storage capabilities is the weight percent of hydrogen that they are capable
of adsorbing. In addition to the storage capacity, another important issue for
hydrogen storage is reversibility. Practical hydrogen storage systems need to be
reversible at moderate temperatures and pressures. Hence the mechanism of
hydrogen storage is extremely important. Hydrogen chemisorption can likely only
The primary motivation for the use of nanomaterials for hydrogen storage is their
extremely high surface area per unit weight or unit volume. Sound fundamental
reasons exist for investigating carbon nanomaterials relative to nanomaterials of other
compositions. Carbon is well known for its ability to adsorb gases, so carbon materials
are already widely employed as adsorbents. Carbon-based nanomaterials proposed for
hydrogen storage include carbon nanotubes and graphite nanofibers. As discussed
above, carbon nanotubes can be prepared either as SWNTs or MWNTs. SWNTs, which
have the strong adsorption capability of carbon materials coupled with an enormous
surface area per unit weight, are therefore promising as hydrogen storage materials.
Despite the widespread use of carbon as an adsorbent, the precise mechanism by which carbon nanomaterials adsorb hydrogen is not completely understood. For gases above the critical temperature, the expected adsorption mechanism is monolayer adsorption. Simple calculations based on the known surface area of different nanomaterials and the known dimensions of hydrogen molecules yield maximum hydrogen storage capacities in the range 2–4 wt.% [41]. More complex calculations are
not substantially different. Such values are considerably less than the benchmark
values provided by the US Department of Energy (DOE), which projects requirements
of 4.5 wt.% by 2007, 6 wt.% by 2010 and 9 wt.% by 2015 [42]. However, the existence of
more complex hydrogen storage mechanisms, such as those involving defects, cannot
be discounted.
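The monolayer estimate quoted above is easy to reproduce: divide the theoretical surface area of SWNTs by an assumed cross-sectional area per adsorbed H2 molecule. Both input numbers below are literature-typical assumptions rather than values from Ref. [41]:

ssa = 1315.0        # theoretical one-sided surface area of SWNTs, m^2/g (literature value)
sigma = 0.15e-18    # assumed area per adsorbed H2 molecule, m^2 (liquid-like packing)
m_h2 = 3.35e-24     # mass of one H2 molecule, g

n_sites = ssa / sigma                    # adsorbed H2 molecules per gram of carbon
m_ads = n_sites * m_h2                   # grams of H2 per gram of carbon
wt_pct = 100.0 * m_ads / (1.0 + m_ads)   # storage capacity on a total-mass basis
print(f"monolayer capacity ~ {wt_pct:.1f} wt.%")   # ~3 wt.%, inside the quoted 2-4 wt.%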
The highest hydrogen storage capacities reported to date for carbon nanotubes are in the 5–10 wt.% range. However, the upper end of this range is now treated with considerable skepticism, since other investigators have had difficulty reproducing these results [43–45]. Instead, it has become generally accepted that the room-temperature hydrogen storage capacity is limited to less than 1 wt.%, although up to 6 wt.%
hydrogen storage capacity can be attained at cryogenic temperatures [46]. Since
hydrogen monolayers are generally bound by physisorption, hydrogen storage
capacity typically decreases dramatically as the temperature is raised.
It should be noted that measurements of hydrogen adsorption are complicated by
the very small extent of hydrogen adsorption relative to other gases, so experimental
measurements are sensitive to the detailed procedures of how they are performed [45, 47, 48]. Agreement among reports of hydrogen storage capacity from
different laboratories is often hampered by the lack of reliable methods for producing, purifying and quantifying carbon nanotubes. Indeed, if defects are critically
involved in hydrogen storage, slight differences in preparation techniques may even
result in intrinsic differences in hydrogen storage capacity [49].
One focus of current research efforts is on methods for improving the hydrogen storage capacity of carbon nanomaterials by treatments to increase their surface reactivity. Such treatments include reactive ball-milling [50–52], oxidation [53], acid treatment [47] and doping with transition metals such as Pd [54–58]. These transition metals serve the catalytic purpose of breaking the chemical bond in molecular hydrogen so that it can be stored in greater quantities on a carbon surface.
Non-carbon-based nanomaterials, such as boron nitride nanotubes, have also been studied for hydrogen storage applications [59–61]. Boron nitride nanotubes are fullerene materials with similar properties to carbon nanotubes. One advantage of boron nitride nanotubes is their greater oxidation resistance with respect to carbon nanotubes.
5.4
Solar Cells
5.4.1
Solar Energy Basics, Including Quantum Confinement
Solar power is highly desirable as a sustainable energy source due to the expected
long lifespan of the Sun, on the order of 10 billion years. Solar energy has long
been considered an attractive alternative to hydrocarbon fossil fuels, but its
widespread adoption has been hindered by the coupled problems of low efficiency and high cost, in addition to the large area required for generation of significant power. Most commercial photovoltaic cells employ either crystalline, polycrystalline or amorphous Si [62]. Photovoltaic cells based on other materials, such as CdTe, CuInSe2, CuInGaSe2 and TiO2, have comparable or higher efficiencies,
Table 5.1 Reported maximum efficiencies and other data for solar cell materials (from Ref. [62]).

Material       Efficiency (%)   Area (cm²)   Voc (V)   Jsc (mA cm⁻²)
a-Si           12.7             1            0.887     19.4
CuInSe2        15.4                          0.515     41.2
CuInGaSe2      19.2             0.408        0.689     35.7
CuInAlSe2      16.9             0.470        0.621     36.0
CdTe           16.5             1.032        0.845     25.9
but none is yet cost-effective in comparison with Si photovoltaic cells [62]. The highest efficiency reported for several different solar cell materials is given in Table 5.1 [62].
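The columns of Table 5.1 are tied together by the standard relation η = VocJscFF/Pin with Pin = 100 mW cm⁻² for AM1.5 illumination; solving for the fill factor FF as a consistency check recovers values in the usual 0.72–0.78 range:

P_in = 100.0   # AM1.5 irradiance, mW/cm^2

cells = {      # material: (efficiency %, Voc in V, Jsc in mA/cm^2), from Table 5.1
    "a-Si":      (12.7, 0.887, 19.4),
    "CuInSe2":   (15.4, 0.515, 41.2),
    "CuInGaSe2": (19.2, 0.689, 35.7),
    "CuInAlSe2": (16.9, 0.621, 36.0),
    "CdTe":      (16.5, 0.845, 25.9),
}

for name, (eff, voc, jsc) in cells.items():
    ff = (eff / 100.0) * P_in / (voc * jsc)   # fill factor implied by the table
    print(f"{name:10s} FF = {ff:.2f}")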
Nanomaterials have been employed in several reported types of photovoltaic
cells, for a variety of different purposes. Probably the classic demonstration of
nanotechnology is the dependence of the optical bandgap of a semiconductor nanostructure on its dimensions. When the size is comparable to the exciton Bohr radius, quantum confinement effects shift the bandgap of a semiconductor nanostructure to higher energy. This has been widely popularized by the term quantum dot. The detailed dependence of the optical bandgap on nanostructure dimensions can be determined by solving the Schrödinger equation with the appropriate boundary conditions.
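A widely used closed-form result of this procedure is the Brus effective-mass expression, E(R) ≈ Eg + (ħ²π²/2R²)(1/me* + 1/mh*) − 1.786e²/(4πε0εR) for a dot of radius R. The sketch below evaluates it for CdSe using commonly cited material parameters (approximate literature values, not taken from this chapter); the approximation overestimates the shift for very small dots.

import math

hbar = 1.0546e-34   # J s
e = 1.602e-19       # C
m0 = 9.109e-31      # kg
eps0 = 8.854e-12    # F/m

# CdSe parameters (approximate literature values): bulk gap (eV),
# electron and hole effective masses (units of m0), dielectric constant.
Eg, me, mh, eps = 1.74, 0.13, 0.45, 10.6

def gap_eV(R):
    """Brus effective-mass estimate of the bandgap (eV) for a dot of radius R (m)."""
    conf = (hbar**2 * math.pi**2 / (2.0 * R**2)) * (1.0/(me*m0) + 1.0/(mh*m0)) / e
    coul = 1.786 * e / (4.0 * math.pi * eps0 * eps * R)
    return Eg + conf - coul

for R in (2e-9, 3e-9, 4e-9):
    print(f"R = {R * 1e9:.0f} nm -> Eg = {gap_eV(R):.2f} eV")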
Hence the most straightforward application of nanomaterials to photovoltaic
technology is to tune the optical bandgap by varying the size dimensions of a
nanostructure. When optical absorption occurs within a nanoparticle, nanowire,
nanotube or other nanostructure, the voltage created by exciton formation is captured
in an external electrical circuit. The main focus of these efforts has been the metal
chalcogenides, including CdS, ZnS, PbS and CdSe. CdTe is particularly appealing as
its bandgap (1.45 eV, 855 nm) is nearly ideal for solar terrestrial photoconversion.
Figure 5.4 illustrates the terrestrial solar irradiance determined by several different
measurements, using detectors both normal to the Earth's surface and tilted [63]. The report of a CdS/CdTe solar cell with 15.8% efficiency in 1993 stimulated an enormous
literature on this type of photovoltaic cell [64].
The blue shift of the optical bandgap with decreasing nanostructure dimension has
motivated the use of nanomaterials in such solar cells. For example, losses due to
optical absorption in the CdS n-type window can be minimized through the use of
nanocrystalline CdS, where the optical absorption is blue shifted out of the terrestrial
window for light collection [65]. This allows for use of a window material that does not
absorb photons, allowing subsequent collection by the absorber material.
More generally, nanostructured absorber layers have several potential advantages
over bulk absorber layers. The nanostructure dimension can be adjusted to tune the
optical bandgap to match the terrestrial window [66]. The variation in optical properties of chalcogenide nanostructures has been extensively characterized [67, 68].
In addition, the presence of multiple interfaces within the absorber material may
increase the effective pathlength by internal light scattering, allowing the use of much
thinner absorber layers [69]. For example, the optical pathlength has been reported to
increase by a factor of five in nanocrystalline TiO2 films [70]. The competing effects of maximizing the internal surface area, which favors nanoscale particles, and maximizing internal reflections, which favors microscale particles, make this a complex optimization problem [71].
5.4.2
Nanocrystalline Dye-Sensitized Solar Cells
Although TiO2 is an effective photocatalyst for many applications, for solar cells it has
the substantial drawback of an optical bandgap (3.2 eV, 387 nm) in the ultraviolet
region of the spectrum, so pure TiO2 will not absorb throughout most of the
terrestrial solar irradiance window shown in Figure 5.4. The addition of a sensitizing
dye, typically a transition metal complex that absorbs visible radiation, provides the
necessary coverage throughout the terrestrial solar window.
The most widespread application of nanomaterials in solar cells is the use of nanocrystalline TiO2, and to a lesser extent nanocrystalline ZnO and SnO2, as the absorber layer in DSSCs. First reported by O'Regan and Grätzel in 1991 [72],
TiO2/S + hν → TiO2/S*                              (5.4)
TiO2/S* → TiO2/S⁺ + e⁻(cb)                         (5.5)
TiO2/S⁺ + (3/2)I⁻ → TiO2/S + (1/2)I3⁻              (5.6)
TiO2/S⁺ + e⁻(cb) → TiO2/S                          (5.7)
(1/2)I3⁻ + e⁻(cb) → (3/2)I⁻                        (5.8)
(1/2)I3⁻ + e⁻(Pt) → (3/2)I⁻                        (5.9)
Photon absorption by the dye monolayer (S) adsorbed at the TiO2 surface in Reaction (5.4) creates a dye molecule in an electronically excited state (S*). Electron transfer from this excited state to the conduction band (e⁻cb) of TiO2 is shown in Reaction (5.5). The desired oxidation of I⁻ is shown in Reaction (5.6), while Reactions (5.7) and (5.8) represent two undesired reactions, back reaction of conduction band electrons with the dye and the reduction of I3⁻ by injected electrons. An important criterion for high-efficiency operation of the DSSC is that Reaction (5.6) must occur much more quickly than Reactions (5.7) and (5.8). Reaction (5.9) describes I3⁻ reduction at the counter electrode to regenerate the electron mediator, I⁻. The rapidity of this reaction at the counter electrode, usually Pt, can often limit the overall cell performance, as described further below.
The main drawback of the Grätzel-type DSSC has been the low electron diffusion coefficient. Laser flash-induced transient photocurrent measurements and intensity-modulated photocurrent spectroscopy show that the electron diffusion coefficient in nanocrystalline TiO2 films is about two orders of magnitude less than in bulk anatase TiO2 [74–78]. More recent photocurrent transient measurements that systematically varied the average TiO2 particle size suggest that this behavior arises from electron traps located predominantly at the TiO2 nanoparticle surface, not within the bulk particle or at interparticle grain boundaries [79]. Ultrafast measurements using THz spectroscopy suggest that low electron mobilities also arise from local electric field effects that are coupled to the TiO2 morphology [80].
In analogy with the approaches described above for using carbon nanotubes as fuel
cell catalyst support materials, several investigators have tried using one-dimensional
TiO2 nanostructures, such as nanotubes and nanowires, in order to obtain simultaneously both high surface area and the connectivity needed for rapid electron
transfer [81]. TiO2 nanostructures can be fabricated by a variety of different methods,
including template synthesis within nanoporous membranes, hydrothermal methods and colloidal methods. Several authors have mixed TiO2 nanowires or nanotubes
with more standard TiO2 materials, demonstrating improved energy conversion efficiency.
Yoon et al. reported the template synthesis of Ti nanowires and nanotubes
from TiCl4 within nanoporous alumina membranes [82]. Template synthesis of
nanomaterials using commercially available nanoporous alumina and polycarbonate
membranes has been widely employed to fabricate metal, semiconductor, ceramic
and polymer nanowires and nanotubes [83]. Solar cells containing 10 wt.% of 180–250 nm TiO2 nanowires and nanotubes with 90 wt.% Degussa P25 TiO2 exhibited an energy conversion efficiency reported to be 42% higher than that obtained for Degussa P25 TiO2 alone [82].
Jiu et al. reported the growth of TiO2 nanowires of controlled length by a
hydrothermal process in a solution containing a cetyltrimethylammonium bromide
(CTAB) surfactant through addition of varying amounts of a triblock copolymer
containing poly(ethylene oxide), poly(propylene oxide) and poly(ethylene oxide) [84].
The use of multiple surface-active species in such wet synthesis methods has been
shown to allow for the anisotropic growth needed for nanowire formation.
Surprisingly, the TiO2 nanowire morphology survives intact following calcination at 450 °C and sintering at 550 °C, but only in the presence of the triblock copolymer. Nanowires of 20–30 nm diameter and 100–300 nm length have been produced by this method. Earlier studies from the same research group used a slightly different technique and produced TiO2 nanotubes with diameters of 5–10 nm and lengths of 30–300 nm [85]. DSSCs with an absorber layer containing mixtures of these nanotubes and Degussa P25 TiO2 showed superior efficiency to those containing only Degussa P25 TiO2 [86].
Jiu et al. also incorporated their TiO2 nanowires into a DSSC and compared their performance with that of Degussa P25 TiO2 [84]. However, as these authors acknowledge, the comparison between these two materials is difficult due to their differing crystal structures, degree of crystallinity, packing orientation and porosity. For example, Degussa P25 TiO2 contains about 80 wt.% anatase and 20 wt.% rutile phases, whereas the nanowire TiO2 is purely in the anatase phase. Earlier studies have shown that DSSCs constructed using nanocrystalline TiO2 in the anatase phase are more efficient than those constructed using Degussa P25 TiO2 [87]. Despite these difficulties in comparison, the DSSC containing TiO2 nanowires exhibited superior performance to those containing Degussa P25 TiO2 for thick (>10 µm) absorber layers, where the advantages in terms of enhanced electron diffusion rates would be most evident.
This group has also grown TiO2 nanowires using an oriented attachment
method that ensures that the nanowires grow parallel to each other [88]. This process
involves a surfactant-aided anatase crystal-growth process near room temperature. This geometry is designed to maximize the rate of electron transfer in a DSSC and yielded an efficiency of 9.3%.
Several other groups have incorporated TiO2 nanowires and/or nanotubes into
DSSCs. The fabrication of TiO2 nanowires has been demonstrated using the following sol–gel alkyl halide elimination reaction [89]:

TiCl4 + Ti(OR)4 → 2TiO2 + 4RCl                     (5.10)
The size and the crystal phase (anatase vs. rutile) of the nanowires can be controlled
to some extent by the injection rate of the titanium precursors. As the injection rate
increases, the nanowire diameter decreases and the proportion of anatase phase
increases. A DSSC fabricated from 81 wt.% anatase and 19 wt.% rutile TiO2 nanowires, which is similar to the mass fraction of these phases in Degussa P25 TiO2, exhibited a higher efficiency (3.83%) than DSSCs fabricated from purely anatase or purely rutile nanowires.
Several other examples of TiO2 nanomaterials in DSSCs should also be noted. A
DSSC has also been constructed using an absorber that contains titanate (H2Ti3O7)
nanotubes fabricated by a hydrothermal process [90]. TiO2 nanowires have been
directionally grown by electrospinning and incorporated into a DSSC with a highly
viscous gel electrolyte, attaining an efficiency of 6.2% [91]. Rather than creating individual nanotubes, a Ti film can be anodized to create a continuous, oriented and highly ordered array of TiO2 nanotubes that has been demonstrated as part of a
DSSC [92].
Another intriguing example of nanomaterials in DSSCs is the fabrication of core–shell nanoparticles for use in the absorber layer. TiO2 nanoparticles have been coated
with insulating oxides or wide bandgap semiconductors. The coating is designed to
prevent interfacial recombination, but must be thin enough to allow electron tunneling
between the dye and TiO2. Although this has been studied mainly with Al2O3 coatings [93, 94], this effect can be seen using a wide variety of oxide coatings [95, 96]. This
idea has also been extended to oxide coating of ZnO nanowires [97].
5.4.3
Nanomaterials in Solar Cell Counter Electrodes
Nanomaterials are also important for other elements within a DSSC besides the absorber material. Reaction (5.9) occurs at the counter electrode and is important for regenerating the electron mediator, I⁻. If the rate of I3⁻ reduction is not sufficiently rapid, then this will limit the overall DSSC efficiency. Pt is typically the electrocatalytic cathode material of choice in a wide variety of electrochemical systems. However, owing to its high cost, minimizing the amount of Pt used is highly desirable. The rapidity of I3⁻ reduction can be quantified by its charge transfer resistance (Rct), which can be conveniently determined by electrochemical impedance spectroscopy (EIS). EIS studies of the reaction rate at the Pt counter electrode indicate that a 2–3 nm
5.5
Lithium Ion Battery Anode Materials
5.5.1
Lithium Ion Batteries
Lithium ion batteries have now become ubiquitous due to their high voltage (3.6 V), high energy density (125 W h kg⁻¹) and long cycle life (>1000 cycles) relative to other battery types, such as Ni–Cd, Ag–Zn, Ni–hydride and lead-acid batteries. Applications
are widespread in portable consumer electronics, including notebook computers,
cellular telephones, MP3 players and camcorders. Like other batteries, Li ion batteries
are composed of a cathode and an anode, separated by an electrolyte. This type of
battery has been called the swing battery because Li ions are exchanged alternately
between the cathode and anode during battery charging and discharging.
The cathode material, which provides a source of Li ions during battery charging, is
typically either layered LiMO2, where M can be Co, Ni, Al or Mn; Li manganese oxide
spinels (LiMn2O4); or other Li salts. Although LiCoO2 is widely used as cathode
material due to its long cycle life and reasonable energy density, the high cost and
high toxicity of Co have limited its use to relatively small batteries. Efforts are under
way to replace Co either partly or completely with Ni or Mn, which are less costly and
less toxic. The prototypical anode material is carbon/graphite, which accepts Li ions
during battery charging by intercalation between adjacent graphitic planes. Li metal
was used as the anode for early Li ion batteries, but this caused safety problems due to
the high activity of metallic Li. During battery discharge, the direction of Li ion
migration is reversed. The electrochemical reactions at the anode and cathode are
shown below for the prototypical cathode and anode materials during battery
charging:
LiCoO2 → Li1−xCoO2 + xLi+ + xe−    (5.11)

6C + xLi+ + xe− → LixC6    (5.12)
The intervening electrolyte is normally a polymer material, most commonly poly(ethylene oxide) (PEO). Significant research efforts have been undertaken to find new materials for the cathode, anode and electrolyte to improve the cycle life and increase the energy density of the Li ion battery. Many of these efforts have involved the development of nanomaterials, in large part due to their higher internal surface area [101].
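The capacity figures used in this section follow directly from Faraday's law: the theoretical gravimetric capacity of a host storing x Li per formula unit of molar mass M is Q = xF/(3.6 M) in mA h g⁻¹. A minimal sketch reproducing the commonly quoted values:

FARADAY = 96485.0  # C mol^-1

def theoretical_capacity(x_li, molar_mass):
    # x_li: Li atoms stored per formula unit; molar_mass in g mol^-1
    # 1 mA h = 3.6 C, hence the 3.6 in the denominator
    return x_li * FARADAY / (3.6 * molar_mass)

print(theoretical_capacity(1.0, 6 * 12.011))  # LiC6 (graphite): ~372 mA h g^-1
print(theoretical_capacity(4.4, 118.71))      # Li4.4Sn: ~994 mA h g^-1
print(theoretical_capacity(4.4, 28.086))      # Li4.4Si: ~4200 mA h g^-1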
Nanomaterials that have been proposed as anode materials in Li ion batteries are mainly metal nanoparticles, carbon nanotubes and nanocomposites that combine
been constructed that are nanocomposites of active and inactive materials, with the inactive material serving as a mechanical buffer to accommodate the volume change [106–108]. For example, composites between Sn2Fe (active) and SnFe3C (inactive) prepared by high-energy ball-milling have been reported to have reversible Li capacities twice that of graphite materials [106, 107]. A serious disadvantage of this technique is that the energy density per unit mass of the anode material is reduced, sometimes dramatically, by the presence of the inactive component.
To circumvent this problem, several groups have instead created composites of Li-active nanoparticles (usually Sn) with a buffer material that is also active, usually carbon in graphitic form. Since graphite is fairly soft, it is expected to form a mechanical buffer during repeated Li insertion/removal [109–111]. One complication of this approach is that Sn nanoparticles are relatively difficult to fabricate. As is well known in colloid chemistry, formation of metal and metal oxide nanostructures requires careful control of reaction conditions, typically requiring the use of surfactants, stabilizers and/or capping agents in order both to control nanostructure growth and simultaneously to prevent aggregation [112].
Wang et al. have reported the fabrication of Sn nanoparticles of 2–5 and 7–13 nm diameter by reduction of SnCl4 with NaBH4 [110]. The Sn nanoparticles were then dispersed in graphite to form a composite Li anode containing 10.3 wt.% Sn. One of the challenges of such techniques is obtaining a high fraction of Sn in the composite anode. During both the preparation and dispersion steps, the Sn nanoparticles were protected from agglomeration by the presence of 1,10-phenanthroline, which forms a coordination compound with Sn [110]. This anode exhibited a 415 mA h g⁻¹ Li storage capacity and was 91.3% reversible after 60 charge/discharge cycles.
Caballero et al. reported the formation of Sn nanoparticles by reduction of SnCl4 by KBH4 in the presence of cellulose fibers [111]. The Sn nanoparticles apparently nucleate and grow only on the surface of the cellulose fibers, preventing agglomeration. This cellulose–Sn nanocomposite exhibits a storage capacity of 600 mA h g⁻¹ as an Li anode and was reported to exhibit reversible charge/discharge cycling, although specific numbers for capacity retention were not provided [111].
A particularly ingenious method for forming a nanocomposite anode from an Li-active metal (usually Sn) and an active support (usually carbon) is to encapsulate the metal nanoparticles within carbon spheres [113–117]. This protects the metal nanoparticles both from agglomeration and from mechanical degradation arising
from volume expansion. So far, this technique has been demonstrated only for particles larger than the nanoscale (>100 nm), but could in principle be extended to smaller particles. Unless the carbon coating can be made graphitic, the Li storage capacity will not be very high. In some cases, the addition of a conductive material such as Cu silicide within the carbon coating of Si particles may improve electrical contact during repeated lithium charge/discharge cycles [113].
Some of the studies that involve active metals within carbon shells provide interesting, direct comparisons between the cycling stability of these materials with and without their carbon shells [114, 117]. Jung et al. created carbon encapsulation around commercial Sn particles by hydrophobization in 1-octanethiol, dispersion in a resorcinol–formaldehyde microemulsion, polymerization and carbonization of the coating by high-temperature annealing [114]. Without first being made hydrophobic, the Sn particle surface is not wetted during this synthesis. The performance of this material (CSP) as an Li anode was then compared with two control materials: Sn nanoparticles that are not encapsulated (SN) and a random mixture of spherical carbon powder with Sn particles (MIX) with the same nominal composition, 20 wt.% Sn. The results for 40 charge/discharge cycles are shown in Figure 5.5 [114]. Clearly, the cyclability of the carbon-encapsulated Sn particles is far greater than that of either control anode, demonstrating the stabilizing effect of the carbon encapsulation.
Another study used a similar fabrication technique to encapsulate Sn2Sb particles with carbon through polymerization of a resorcinol–formaldehyde microemulsion followed by high-temperature annealing [117]. The Li anode performance of the carbon microsphere (CM)-encapsulated Sn2Sb particles was compared with that of Sn2Sb powder, as shown in Figure 5.6. While the initial Li storage capacity of the Sn2Sb powder is 689 mA h g⁻¹, only 20.3% of this capacity is retained after 60 charge/discharge cycles [117]. On the other hand, the CM-encapsulated Sn2Sb particles retained 87.7% of their original storage capacity of 649 mA h g⁻¹ after 60 charge/discharge cycles [117].
Figure 5.5 Li discharge capacity with cycle time for Sn particles in Li battery anode. (From Ref. [114]).
Figure 5.6 Li discharge capacity with cycle time for Sn2Sb powder, with and without a carbon microsphere (CM) coating. (From Ref. [117]).
5.5.3
Nanomaterials for Lithium Ion Storage: Si Nanocomposites
Most of the examples given above describe ingenious techniques to improve the cycle stability of Sn-based lithium storage materials through the use of nanoparticles and nanocomposites. After Sn, the most widely studied alternative lithium storage material is Si. Si has an even higher theoretical lithium storage capacity (4200 mA h g⁻¹) than does Sn (994 mA h g⁻¹), although the low conductivity of Si requires the use of some type of conductive filler. Not surprisingly, the most widely studied composite Si anode materials are Si–carbon composites [118–122]. One difficulty with Si–C composite anodes is that high-temperature processing may form the compound SiC, which is inactive for lithium storage [118].
Holzapfel et al. deposited nanocomposite Si–graphite films by chemical vapor deposition (CVD) of Si from SiH4 on a graphitized fine-particle carbon film, Timrex KS6 [121]. This active material was then mixed with one of two different binders, dissolved in a petroleum ether solution and dried under vacuum. The resulting electrode film contains about 7 wt.% Si in the form of Si nanoparticles ranging from 10 to 20 nm in diameter [121]. The Li storage capacity of this Si–graphite electrode declined gradually from 2500 to 1900 mA h g⁻¹ during 100 charge/discharge cycles.
Wang et al. formed an Si–C nanocomposite by high-energy ball-milling of commercially available powders, 80 nm Si nanoparticles and 10 μm spherical mesocarbon microbeads (MCMB) [120]. MCMB represent an industry standard for lithium storage and exhibit capacities ranging from 300 to 340 mA h g⁻¹ with excellent stability during cycling. After 20 h of ball-milling, the spherical MCMB lose their original structural integrity and the Si nanoparticles become dispersed within the MCMB. These composite anodes were tested and compared with Li anodes constructed from 80 nm Si powder, 10 μm MCMB and composite anodes containing much larger Si particles (20 μm). The best performance was obtained for
composite anodes constructed from 20 wt.% Si and 80 wt.% MCMB, which showed good reversibility over 25 charge/discharge cycles with a capacity of 1066 mA h g⁻¹, as illustrated in Figure 5.7 [120].
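A simple mass-weighted (rule-of-mixtures) estimate shows that this composite capacity is roughly what the component capacities predict. The sketch below assumes ~320 mA h g⁻¹ for the MCMB and back-calculates the implied Si utilization; both numbers are illustrative, not taken from Ref. [120]:

def composite_capacity(frac_si, q_si, q_carbon=320.0):
    # Mass-weighted capacity (mA h g^-1) of a Si-carbon composite
    return frac_si * q_si + (1.0 - frac_si) * q_carbon

print(composite_capacity(0.20, 4200.0))          # full Si utilization: ~1096
print(composite_capacity(0.20, 0.964 * 4200.0))  # ~1066, i.e. ~96% Si utilization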
5.5.4
Nanomaterials for Lithium Ion Storage: Carbon Nanotubes and Carbon Nanotube-Based
Composites
ferrocene catalyst. Following removal of catalyst and amorphous carbon, the carbon nanotube ends were oxidatively opened using concentrated HNO3. Aqueous SnCl2 was introduced into the carbon nanotubes by capillary action, then reduced by either hydrothermal treatment or NaBH4. The Li insertion capacity of the Sn nanoparticle–carbon nanotube composite material is shown in Figure 5.8, which illustrates that an approximately stable Li capacity on the order of 800 mA h g⁻¹ is attained after 10 cycles [130].
Nanocomposites fabricated by ball-milling of carbon nanotubes with Si have also
been prepared and tested for their Li insertion capacity [135, 136]. High and stable
reversible Li capacity has not yet been obtained by such techniques.
5.5.5
Lithium Ion Storage: Further Considerations
An often overlooked aspect of lithium ion battery anodes is the inherent difference in
electrocatalytic behavior between graphite and active metals [137]. Graphite surfaces
exhibit well-known differences between the electrochemistry of basal planes and
edge surfaces, whereas the surfaces of active metals appear to be more electrochemically homogeneous. The solid electrolyte interface (SEI) on basal planes of graphite is
known to be relatively thin and contain many organic decomposition products, while
the edge sites exhibit a thicker SEI that contains mainly inorganic species [137]. The
mechanism of SEI formation is also found to differ in ethylene carbonate and
propylene carbonate electrolytes, which are the two most common electrolytes in
lithium ion batteries.
The studies discussed here report many impressive results for lithium battery anodes constructed from composite materials containing a variety of identifiable nanoparticle and nanotube ingredients. However, Dahn's group suggested that an
References
1 P. Costamagna, S. Srinivasan, J. Power Sources 2001, 102, 253–69.
2 T.R. Ralph, M.P. Hogarth, Platinum Met. Rev. 2002, 46, 3–14.
3 T.R. Ralph, M.P. Hogarth, Platinum Met. Rev. 2002, 46, 117–35.
4 M.P. Hogarth, T.R. Ralph, Platinum Met. Rev. 2002, 46, 146–64.
5 H.A. Gasteiger, S.S. Kocha, B. Sompalli, F.T. Wagner, Appl. Catal. B 2005, 56, 9–35.
6 E. Antolini, J.R.C. Salgado, E.R. Gonzalez, J. Power Sources 2006, 160, 957–68.
7 E. Antolini, J.R.C. Salgado, E.R. Gonzalez, Appl. Catal. B 2006, 63, 137–49.
8 H. Liu, C. Song, L. Zhang, J. Zhang, H. Wang, D.P. Wilkinson, J. Power Sources 2006, 155, 95–110.
18 D.J. Guo, H.L. Li, J. Electroanal. Chem. 2004, 573, 197–202.
19 Z. He, J. Chen, D. Liu, H. Zhou, Y. Kuang, Diamond Relat. Mater. 2004, 13, 1764–70.
20 C. Wang, M. Waje, X. Wang, J.M. Tang, R.C. Haddon, Y. Yan, Nano Lett. 2004, 4, 345–48.
21 P.J. Britto, K.S.V. Santhanam, A. Rubio, J.A. Alonso, P.M. Ajayan, Adv. Mater. 1999, 11, 154–57.
22 J.M. Nugent, K.S.V. Santhanam, A. Rubio, P.M. Ajayan, Nano Lett. 2001, 1, 87–91.
23 J.J. Gooding, Electrochim. Acta 2005, 50, 3049–60.
24 W. Li, C. Liang, W. Zhou, H. Han, Z. Wei, G. Sun, Q. Xin, Carbon 2002, 40, 791–94.
25 W. Li, C. Liang, W. Zhou, J. Qiu, Z. Zhou, G. Sun, Q. Xin, J. Phys. Chem. B 2003, 107, 6292–99.
26 Z. Liu, L.M. Gan, L. Hong, W. Chen, J.Y. Lee, J. Power Sources 2005, 139, 73–78.
27 M.M. Waje, X. Wang, W. Li, Y. Yan, Nanotechnology 2005, 16, S395–400.
28 X. Li, M. Hsing, Electrochim. Acta 2006, 51, 5250–58.
29 M.M. Shaijumon, S. Ramaprabhu, N. Rajalakshmi, Appl. Phys. Lett. 2006, 88, 253105.
30 X. Wang, M. Waje, Y. Yan, Electrochem. Solid-State Lett. 2005, 8, A42–44.
31 W. Li, X. Wang, Z. Chen, M. Waje, Y. Yan, Langmuir 2005, 21, 9386–89.
32 M. Carmo, V.A. Paganin, J.M. Rosolen, E.R. Gonzalez, J. Power Sources 2005, 142, 169–76.
33 Y. Liang, H. Zhang, B. Yi, Z. Zhang, Z. Tan, Carbon 2005, 43, 3144–52.
34 G. Girishkumar, M. Rettker, R. Underhile, D. Binz, K. Vinodgopal, P. McGinn, P. Kamat, Langmuir 2005, 21, 8487–94.
35 E.S. Steigerwalt, G.A. Deluga, C.M. Lukehart, J. Phys. Chem. B 2002, 106, 760–66.
36 G. Girishkumar, T.D. Hall, K. Vinodgopal, P. Kamat, J. Phys. Chem. B 2006, 110, 107–14.
87 S. Kambe, K. Murakoshi, T. Kitamura, Y. Wada, S. Yanagida, H. Kominami, Y. Kera, Solar Energy Mater. Solar Cells 2000, 61, 427–41.
88 M. Adachi, Y. Murata, J. Takao, J. Jiu, M. Sakamoto, F. Wang, J. Am. Chem. Soc. 2004, 126, 14943–49.
89 B. Koo, J. Park, Y. Kim, S.H. Choi, Y.E. Sung, T. Hyeon, J. Phys. Chem. B 2006, 110, 24318–23.
90 M. Wei, Y. Konishi, H. Zhou, H. Sugihara, H. Arakawa, J. Electrochem. Soc. 2006, 153, A1232–36.
91 M.Y. Song, Y.R. Ahn, S.M. Jo, D.Y. Kim, J.Y. Ahn, Appl. Phys. Lett. 2005, 87, 113113.
92 G.K. Mor, K. Shankar, M. Paulose, O.K. Varghese, C.A. Grimes, Nano Lett. 2006, 6, 215–18.
93 A. Zaban, S.G. Chen, S. Chappel, B.A. Gregg, Chem. Commun. 2000, 2231–32.
94 G.R.R.A. Kumara, K. Tennakone, V.P.S. Perera, A. Konno, S. Kaneko, M. Okuya, J. Phys. D 2001, 34, 868–873.
95 A. Kay, M. Grätzel, Chem. Mater. 2002, 14, 2930–35.
96 D. Menzies, Q. Dai, Y.B. Cheng, G.P. Simon, L. Spiccia, Mater. Lett. 2005, 59, 1893–96.
97 M. Law, L.E. Greene, A. Radenovic, T. Kuykendall, J. Liphardt, P. Yang, J. Phys. Chem. B 2006, 110, 22652–63.
98 A. Hauch, A. Georg, Electrochim. Acta 2001, 46, 3457–66.
99 X. Fang, T. Ma, G. Guan, M. Akiyama, T. Kida, E. Abe, J. Electroanal. Chem. 2004, 570, 257–63.
100 S. Katusic, P. Albers, R. Kern, F.M. Petrat, R. Sastrawan, S. Hore, A. Hinsch, A. Gutsch, Solar Energy Mater. Solar Cells 2006, 90, 1983–99.
101 H.K. Liu, G.X. Wang, Z. Guo, J. Wang, K. Konstantinov, J. Nanosci. Nanotechnol. 2006, 6, 1–15.
102 J.O. Besenhard, J. Yang, M. Winter, J. Power Sources 1997, 68, 87–90.
103 Y. Idota, T. Kubota, A. Matsufuji, Y. Maekawa, T. Miyasaka, Science 1997, 276, 1395–97.
6
An Industrial Ecology Perspective
Shannon M. Lloyd, Deanna N. Lekas, and Ketra A. Schmitt
6.1
Introduction
6.1.1
Industrial Ecology
Industrial ecology (IE) is a framework for analyzing the impacts and interactions of industrial, social and ecological systems. White [1] defined IE as "the study of the flows of materials and energy in industrial and consumer activities, of the effects of these flows on the environment and of the influences of economic, political, regulatory and social factors on the flow, use and transformation of resources."
IE as a field is fairly young. The intellectual underpinnings of IE can be found in systems analysis research conducted by Forrester [2], research on the flows of materials in economies by Ayres and Kneese [3] and the use of systems analysis to evaluate environmental degradation trends [4, 5]. Two major developments in 1989 are generally cited as founding IE as a discipline. First, Ayres [6] developed the concept of industrial metabolism, which compares industrial processes for converting materials, energy and labor into finished products and waste to the metabolism of living organisms. Second, Frosch and Gallopoulos [7] published "Strategies for Manufacturing", in which they developed the biological analogy for industrial systems [8].
The primary objectives of IE are to understand how industrial and economic systems behave and interact with ecological and social systems; to transition from open systems to closed-loop systems, where waste from one industry can be used as an input for another industry; and to develop industrial and regulatory strategies that function effectively with natural systems, allowing resources to be replenished and avoiding damage to biological and natural systems. The field of IE encompasses several related areas of research, practice and tools. The following list was identified by the International Society for Industrial Ecology [9].
The conceptual framework provided by IE and the tools listed above can be used retrospectively to evaluate the impacts of current industrial processes, consumer activities and government regulations. However, when retrospective studies identify specific changes for reducing negative ecological impact, it is often difficult to make changes to existing products or practices. For example, decisions made during product development determine what a product will be made of, how it will be produced, where it will be produced, how it will be used and how it will be disposed of. Consequently, most of the costs and material, energy and environmental loadings that will be experienced during a product's life cycle are likely to be committed during product development [10–12]. Actual costs and environmental impacts are not realized until later in the product life cycle. Changing a product to reduce its environmental impact after the product has been developed can cost orders of magnitude more than making the change during product development [11]. If infrastructures have been built around a commercialized product, it may be difficult to make any changes at all.
Rather than wait until a product is developed and commercialized, IE concepts and tools can be applied prospectively to provide a forward-looking analysis that allows the design and manufacture of products in a manner that prevents or reduces negative environmental impacts and interactions in nature. For example, life cycle engineering (LCE), design for the environment (DfE) and green design approaches are used during product development to estimate the environmental impacts of different product designs and support decision-making aimed at reducing the environmental impact of products [12–14].
6.1.2
Applying Industrial Ecology to Nanotechnology
6.2
Life Cycle Assessment
6.2.1
Background on Life Cycle Assessment
The life cycle stages associated with a product are shown in Figure 6.1. Materials,
energy and labor are required to extract, process and transport raw materials and
to manufacture, transport, use, dispose of, reuse and recycle products. In addition to
consuming resources, the transformation of materials and energy into products
results in environmental discharges and generates waste. Evaluating the total
To date, LCA has been used only on a limited basis to assess the potential life cycle impacts of nanotechnology products. Table 6.2 provides a summary of LCAs of nanotechnology-based products conducted to date. Additional studies have been identified by Lekas [24]. For the most part, these studies focused on the life cycle implications associated with reducing the mass of materials incorporated into products and use-phase energy consumption. Evaluation of the impacts associated with processing nanomaterials, manufacturing and retiring nanotechnology products and releasing engineered nanoparticles into the environment is limited by a lack of data on producing nanomaterials, recovery of materials from nanotechnology products, fate and transport of engineered nanoparticles and risks from ecological and human exposure to the numerous types of engineered nanoparticles.
[Table: potential benefits (e.g. high-precision manufacturing, reduced waste) and potential risks (e.g. increased materials use and waste during manufacturing, self-assembly reactions using toxic substances) of nanotechnology, mapped to the affected life cycle stages (materials processing, manufacturing, end-of-life) and environmental impacts.]

There are currently insufficient life cycle inventory data for nanoscale manufacturing processes. LCA studies of nanotechnology-based products conducted to date have relied on surrogate data for these processes. However, recent work has focused on quantifying
the inputs and outputs associated with producing nanomaterials. For example, Zhang et al. [25] qualitatively examined the energy requirements for several important nanoscale manufacturing technologies. Preliminary findings indicated that bottom-up manufacturing processes are more energy intensive than top-down processes. With bottom-up approaches, nanoscale structures with fundamentally new molecular organization are built by precisely locating individual atoms and molecules where they are needed. With top-down approaches, nanoscale structures are made by reducing the dimensions of a larger structure using machining and etching techniques.
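As a purely hypothetical illustration of what such a comparison involves, the sketch below aggregates cumulative energy demand over the unit processes of a bottom-up and a top-down route; the process names and numbers are placeholders, not data from Ref. [25]:

from dataclasses import dataclass

@dataclass
class UnitProcess:
    name: str
    energy_mj_per_kg: float  # cumulative energy demand per kg of output

def cradle_to_gate_energy(processes):
    # Total cradle-to-gate energy demand (MJ per kg of product)
    return sum(p.energy_mj_per_kg for p in processes)

# Hypothetical inventories, for illustration only
bottom_up = [UnitProcess("precursor synthesis", 120.0),
             UnitProcess("vapor-phase growth", 900.0),
             UnitProcess("purification", 300.0)]
top_down = [UnitProcess("bulk material production", 80.0),
            UnitProcess("milling/etching", 150.0)]

print(cradle_to_gate_energy(bottom_up), cradle_to_gate_energy(top_down))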
Khanna and Bakshi [26] conducted a cradle-to-gate LCA to evaluate two nanoparticles: nanoclay synthesized from montmorillonite clay and carbon nanofibers produced via catalytic pyrolysis of hydrocarbons on a metallic catalyst. Rather than relying on surrogate life cycle inventory data for nanoscale manufacturing technologies, the inputs and outputs

[Table 6.2: Summary of LCAs of nanotechnology-based products conducted to date, comparing nanotechnology options (nanotube-based catalytic converters, nano-coatings, nano-sized titania, alumoxane nanoparticles, sol–gel nano-varnish and nanoclay-based nanocomposites) with conventional alternatives (e.g. conventional pulverization techniques, chromium-based coatings, bulk titania, sol–gel processing, water-, solvent- and powder-based varnish, emulsion aggregation) for applications including styrene synthesis, displays, lighting, automotive catalysts, copying/printing, electronics and computing, and medical applications; references include Europa [55], Beaver [56] and Fluharty [59].]
6.3
Substance Flow Analysis
6.3.1
Background on Substance Flow Analysis
Faced with many unknowns about nanomaterials and their penetration into our everyday lives, SFA provides an approach for investigating the quantity and location of specific nanomaterials in the economy (e.g., the quantity of carbon nanotubes that are produced and exist in Japan versus the USA). It is less useful to compare quantities of nanomaterials in general because the properties, effectiveness and hazard potential differ by nanomaterial.
Consideration of quantity may also be useful for characterizing the dematerialization or waste reduction that may result when nanomaterials replace conventional or larger materials. For instance, a back-of-the-envelope MFA calculation on lead in cathode-ray tube (CRT) monitors revealed that disposing of one CRT monitor sends as much as 0.45 kg of lead to a landfill; switching to nanotechnology-based substitutes such as flat panel displays with organic light-emitting diodes will reduce this type of lead waste in the future [32, 33]. By finding out where nanomaterials are produced and what they are used for, we can better understand where they will ultimately end up and who will be exposed to nanomaterials during the production, distribution or use of products containing these substances.
SFA can be an appropriate tool when the material of interest is linked to a particular impact and thus warrants a more focused analysis of its stocks, flows and concentrations in the environment [34]. Additionally, by identifying large accumulations of nanomaterials, SFA may also highlight unexpected impact areas.
6.3.3
Summary of Substance Flow Analysis Work Conducted to Date
Few SFAs have been conducted on nanomaterials to date. One of the authors performed a preliminary SFA on carbon nanotubes [35]. Carbon nanotubes were chosen because of the growing interest in, manufacture of and use of these materials, combined with increased concerns about their potential risks. For this analysis, carbon nanotube production and use information was gathered from the literature (both journals and news sources) and nanotube company Web sites. Information on nanotube production, raw material inputs and nanotube destination was requested from nanotube producers identified in a Small Times Magazine survey [36] and other producers identified during initial research.
Figure 6.3 Approximate substance flow diagram for carbon nanotubes (2004).
Since nanomaterials are relatively new and information on their production and use is often proprietary, this SFA required estimates and assumptions to characterize production, use and end-of-life flows of carbon nanotubes based on available information. Since the most comprehensive data were available on production, the initial analysis focused on this life cycle stage. Information provided by one firm was used to approximate the inputs and outputs from carbon nanotube growth and then as a basis for characterizing carbon nanotubes at other life stages. The dissipation of carbon nanotubes into various end uses was estimated using carbon nanotube patent filings for eight broad application categories. Sales data or product-specific projections may also provide ways to approximate nanomaterial penetration into end uses. End-of-life outcomes for nanotubes were modeled using surrogate data from existing waste management scenarios published by the US Environmental Protection Agency (EPA) for US municipal solid waste. In reality, different products and applications will likely result in different waste management outcomes. For instance, electronics containing carbon nanotubes may be refurbished and reused, whereas carbon nanotube waste from synthesis and processing may be incinerated.
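A minimal sketch of this allocation logic, with hypothetical numbers standing in for the proprietary production data, patent-derived application shares and EPA waste-management fractions used in the actual analysis:

def allocate(total, shares):
    # Split a total mass flow according to fractional shares (must sum to 1)
    assert abs(sum(shares.values()) - 1.0) < 1e-9
    return {name: total * f for name, f in shares.items()}

production_kg = 100_000.0  # hypothetical annual carbon nanotube production

# Hypothetical patent-derived shares for a subset of application categories
use = allocate(production_kg, {"composites": 0.4, "electronics": 0.3,
                               "energy storage": 0.2, "other": 0.1})

# Hypothetical end-of-life split, in the spirit of the EPA MSW scenarios
end_of_life = {u: allocate(m, {"landfill": 0.7, "incineration": 0.2,
                               "recycling": 0.1})
               for u, m in use.items()}

print(end_of_life["composites"])  # {'landfill': 28000.0, 'incineration': 8000.0, 'recycling': 4000.0}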
The flow diagram in Figure 6.3 presents a general overview of the SFA findings on carbon nanotubes. The estimates presented here are generalized, based on limited information, a small amount of nanotubes in commercial applications and a rapidly changing market. However, the findings improve our understanding of carbon nanotube flows throughout the economy. SFA provides a snapshot of a substance's flow in the economy at a given time, providing a better understanding of where these materials exist and are expected to go. The widespread projected application of these nanomaterials in many everyday products (from vehicle composites to tennis racquets and batteries) amidst uncertain risks makes the characterization of the flow of nanotubes increasingly important.
6.4
Corporate Social Responsibility
6.4.1
Background on Corporate Social Responsibility
6.4.2
Corporate Social Responsibility Implications for Nanotechnology
Small Times Magazine estimated that over 3500 firms were involved in some aspect related to micro- or nanotechnology [37]. This includes companies that research and manufacture nanomaterials, nanotechnology-based products and equipment for nano-production, trade magazines and firms that offer legal expertise, venture capital investment and intellectual property strategizing on nanotechnology. Nanomaterials are used in numerous sectors, including energy, cosmetics, medicine and electronics.
Nanomaterials offer significant opportunities for decreased environmental impacts in terms of materials and energy consumption, and also the possibility of targeted human health benefits in the future. Along with the potential benefits also comes the potential for uncertain risks. Uncertainty in the risks of nanotechnology stems from scientific uncertainty in several areas, such as the transport and dispersion of nanomaterials; the ability of nanomaterials to enter biological systems; the potential mode of action and toxicological effects of nanomaterials; the suitability of traditional worker protection techniques in nano-environments; and the ability of the life cycle benefits of specific nanotechnology applications to outweigh the life cycle costs.

The level of uncertainty surrounding nanotechnology makes the work of protecting the public, workers and the environment and communicating about potential risks and benefits challenging for CSR-committed corporations. How can a corporate entity prevent or reduce exposures when the mode of exposure is not understood? How can a corporation communicate a complex, uncertain message to the public? Given the potential advantages and risks offered by nanomaterials, what is the appropriate path forward? These challenges also represent a significant opportunity for CSR-oriented firms to collaborate with regulators, non-governmental organizations (NGOs) and other firms to achieve common corporate social responsibility goals. By being proactive in seeking partners, sharing information and conducting research, nanofirms can hope to reduce risk and obtain competitive advantage.
Increased public awareness of the potential risks of nanotechnology, the lack of sound strategies for identifying and assessing these risks and the increased production and use of nanomaterials have prompted NGOs, activist groups and members of the scientific community to call for more research investigating nanotechnology's risks [38–41]. For example, a report prepared for Greenpeace called for in-depth assessments of the environmental risks of near-term nanotechnology, specifically products or processes that might result in the release of nanoparticles into the environment [40]. The Canadian-based NGO ETC Group [38, 39] called for a global moratorium on the commercial production of nanomaterials. In particular, ETC Group [39] cautioned against scaling up nanomaterial production without understanding the potential adverse side-effects of using nanomaterials in many diverse commercial applications. Similar campaigns launched by activist groups have resulted in backlash against genetically modified (GM) crops. In the light of the GM setbacks and a responsibility to balance the pursuit of science with sensible precaution, editorials in Nature warned against leaving legitimate questions unanswered [41, 42]. Instead, they called for more research into nanotechnology risks accompanied by an open public discussion.
Companies face potential financial risks from the investment and insurance communities, which are paying greater attention to the environmental, health and safety (EHS) discussions surrounding nanotechnology. Insurers are encouraging more rigorous review of both the opportunities and hazards of nanotechnology. A report by the Swiss Reinsurance Company asserts that the insurance industry should analyze nanotechnology to identify potential risks [43]. MunichRe, GenRE and Allianz have also published reports on the need for nanotechnology risk assessment [44]. Cientifica urges companies working with nanomaterials to prepare for the potential impact of future liabilities or changes in legislation [53].
Companies that take the lead in dealing with EHS management and public perception concerns with nanotechnology (i.e. are first movers in their industry group) may attain competitive advantage. They will not only be ready for possible regulation and may avoid retrofitting operations later (in the event of future nano-specific regulations), but they may also appeal to consumers who value CSR efforts and be in a position to influence regulatory or reporting schemes. For example, the CEO of NanoDynamics testified before Congress [45] and DuPont and Environmental Defense worked together to develop a framework for the responsible development, production, use and disposal of nanomaterials and products [30, 46]. By taking measures to consider and evaluate the environmental, health and safety risks that nanomaterials may pose, corporations may avoid future risks, backlash and liability.
6.4.3
Summary of Work Conducted to Understand Nanofirm EHS Concerns and Actions
risks and what they could be doing to protect their workers, the environment and
consumers. A few examples include the following:
• Lux Research [48] interviewed nanotechnology firms and found that companies want more certainty in the type of regulations to expect. In order to handle real, perceived and regulatory risks, Lux Research suggested that firms inventory nanomaterials, map them to exposures throughout the life cycle, characterize the risks with available knowledge and mitigate them with appropriate controls, toxicity testing and product redesigns.

• Lekas [49] surveyed nanotechnology startup firms in Connecticut and New York and found that they have varied concerns and degrees of progress in addressing EHS issues; they indicated that they need information and guidance and prefer communication about nanotechnology risks through an electronic or online venue from a government source.

• Lindberg and Quinn [50] surveyed nanotechnology firms in the northeastern USA and learned that small firms, in particular, need a roadmap from suppliers, industry and government bodies to manage risks.

• A European Commission survey of 380 European nanotechnology startups indicated that they do not consider social acceptance and environmental and health regulations to be important barriers to the application of nanomaterials [51].
6.5
Conclusions
methods to reduce or mitigate potential risks. When applied prospectively, IE tools can be used to provide a forward-looking analysis that permits the design and manufacture of nanotechnology-based products in a manner that prevents or reduces their negative impacts. The current level of uncertainty surrounding the potential impacts and risks of nanotechnology presents a significant challenge in applying IE tools. As scientific understanding of nanotechnology's environmental and health impacts evolves, the efficacy of IE in mitigating potential harm and designing more effective products using nanotechnology will continue to increase.
References
1 White, R. (1994) Preface, in The Greening of Industrial Ecosystems (eds B. Allenby and D. Richards), The National Academy of Engineering, National Academy Press, Washington, DC, pp. v–vi.
2 Forrester, J.W. (1968) Principles of Systems, Wright-Allen Press, Cambridge, MA.
3 Ayres, R.U. and Kneese, A.V. (1969) Production, consumption and externalities. American Economic Review, 59 (3), 282–297.
4 Meadows, D., Randers, J. and Meadows, D. (1972) Limits to Growth, Universe Books, New York.
5 Garner, A. and Keoleian, G.A. (1995) Industrial Ecology: an Introduction, National Pollution Prevention Center for Higher Education, University of Michigan, Ann Arbor, MI.
6 Ayres, R. (1989) Industrial metabolism, in Technology and Environment (eds J.H. Ausubel and H.E. Sladovich), National Academy Press, Washington, DC, pp. 23–49.
7 Frosch, R.A. and Gallopoulos, N.E. (1989) Strategies for manufacturing. Scientific American, 261 (3), 144–152.
8 Ehrenfeld, J. (2002) Industrial ecology: coming of age. Environmental Science and Technology, 36 (13), 280A–285A.
9 ISIE (2006) A History of Industrial Ecology. International Society for Industrial Ecology. http://www.is4ie.org/history.htm [Accessed 14 October 2006].
10 Ullmann, D.G. (1992) The Mechanical Design Process, McGraw-Hill, New York.
55 Europa (2006) http://ec.europa.eu/research/industrial_technologies/articles/article_346_en.html [Accessed 9 October 2006].
56 Beaver, E. (2004) Implications of Nanomaterials Manufacture and Use, presentation given at the US EPA Nanotechnology STAR Progress Review Workshop.
57 Steinfeldt, M., Petschow, U., Haum, R. and von Gleich, A. (2004) Nanotechnology and Sustainability. Institute for Ecological Economy Research, Discussion Paper IOEW 65/04, October.
58 Lloyd, S.M., Lave, L.B. and Matthews, H.S. (2005) Life cycle benefits of using nanotechnology to stabilize PGM particles in automotive catalysts.
7
Composition, Transformation and Effects of Nanoparticles
in the Atmosphere
Ulrich Pöschl
7.1
Introduction
The effects of airborne particles in the nanometer to micrometer size range (aerosols) on the atmosphere, climate and public health are among the central topics in current environmental research [1, 2]. The particles scatter and absorb solar and terrestrial radiation, they are involved in the formation of clouds and precipitation as cloud condensation and ice nuclei and they affect the abundance and distribution of atmospheric trace gases by heterogeneous chemical reactions and other multiphase processes [3–6]. Moreover, they play an important role in the spread of biological organisms, reproductive materials and pathogens (pollen, bacteria, spores, viruses, etc.) and they can cause or enhance respiratory, cardiovascular, infectious and allergic diseases [3, 7–9].
An aerosol is generally defined as a suspension of liquid or solid particles in a gas, with particle diameters in the range 10⁻⁹–10⁻⁴ m (lower limit: molecules and molecular clusters; upper limit: rapid sedimentation) [6, 9]. The most evident examples of aerosols in the atmosphere are clouds, which consist primarily of condensed water with particle diameters on the order of 10 μm. In atmospheric science, however, the term aerosol traditionally refers to suspended particles which contain a large proportion of condensed matter other than water, whereas clouds are considered as separate phenomena [10].
Atmospheric aerosol particles originate from a wide variety of natural and anthropogenic sources. Primary particles are directly emitted as liquids or solids from sources such as biomass burning, incomplete combustion of fossil fuels, volcanic eruptions and wind-driven or traffic-related suspension of road, soil and mineral dust, sea salt and biological materials (plant fragments, microorganisms, pollen, etc.). Secondary particles, on the other hand, are formed by gas-to-particle conversion in the atmosphere (new particle formation by nucleation and condensation of gaseous precursors). As illustrated in Figure 7.1, airborne particles undergo various physical and chemical interactions and transformations (atmospheric aging),
Figure 7.1 Atmospheric cycling of airborne nano- and microparticles (aerosols) [1, 2].
For different locations, times, meteorological conditions and particle size fractions, however, the relative abundance of different chemical components can vary by an order of magnitude or more [3, 6, 11, 15]. In atmospheric research, the term fine air particulate matter is usually restricted to particles with aerodynamic diameters ≤1 μm or ≤2.5 μm (PM1 or PM2.5, respectively). In air pollution control, it sometimes also includes larger particles up to 10 μm (PM10).
The total number concentration of aerosol particles in the atmosphere is usually dominated by nanoparticles with diameters up to 100 nm, whereas the total mass concentration is generally dominated by particles with diameters >100 nm (microparticles). Characteristic examples of particle number concentration, size distribution and chemical composition of fine particulate matter in urban and high alpine air are illustrated in Figure 7.2. The displayed particle number size distributions (particle number concentration per logarithmic decade of particle diameter, dN/d log dp, plotted against particle diameter) have been observed in the city of Munich [500 m above sea level (asl); 8–14 December 2002] and at the Schneefernerhaus research station on Mount Zugspitze (2600 m asl; 6 November 2002) in Southern Germany. They correspond to total particle number concentrations of about 10² cm⁻³ in alpine air and 10⁴ cm⁻³ in urban air and to particle mass concentrations of about 1
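Such number size distributions are commonly represented as a sum of lognormal modes; integrating dN/d log dp over log dp recovers the total number concentration. A minimal sketch with a single, illustrative urban-like mode (the parameters are assumptions, not the Munich or Zugspitze data):

import numpy as np

def dN_dlogdp(dp_nm, n_total, dp_mode_nm, sigma_g):
    # Lognormal number size distribution dN/dlog10(dp), in cm^-3
    ln_sg = np.log(sigma_g)
    return (n_total * np.log(10) / (np.sqrt(2.0 * np.pi) * ln_sg)
            * np.exp(-np.log(dp_nm / dp_mode_nm) ** 2 / (2.0 * ln_sg ** 2)))

dp = np.logspace(0, 3, 400)  # 1 nm to 1 um
dist = dN_dlogdp(dp, n_total=1e4, dp_mode_nm=30.0, sigma_g=1.8)
print(np.trapz(dist, np.log10(dp)))  # integrates back to ~1e4 cm^-3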
7.2
Composition
particles, which can be pictured as more or less disordered stacks of graphene layers or large polycyclic aromatics [27–29]. Depending on the applied optical or thermochemical methods (absorption wavelength, temperature gradient, etc.), however, BC and EC measurements also include the carbon content of colored and refractory organic compounds, which can lead to substantially different results and strongly limits the comparability and suitability of BC, EC and OC data for the determination of mass balances and physicochemical properties of air particulate matter.
Nevertheless, most information available on the abundance, properties and effects of carbonaceous aerosol components so far is based on measurement data of TC, BC/EC and OC [21, 25]. These data are now increasingly complemented by measurements of water-soluble organic carbon (WSOC), its macromolecular fraction (MWSOC) and individual organic compounds, as detailed below. Moreover, the combination of thermochemical oxidation with ¹⁴C isotope analysis (radiocarbon determination in evolved CO2 by accelerator mass spectrometry) allows one to distinguish fossil fuel combustion from other sources of carbonaceous aerosol components. Recent results confirm that the EC is dominated by fossil fuel combustion and indicate highly variable anthropogenic and biogenic sources and proportions of OC [30].
Characteristic mass concentrations and concentration ratios of fine air particulate matter (PM2.5) and carbonaceous fractions in urban, rural and alpine air in central Europe are summarized in Table 7.1. The reported data were obtained on an altitude transect through Southern Germany, from the city of Munich (500 m asl), via the meteorological observatory Hohenpeissenberg (1000 m asl), to the environmental research station Schneefernerhaus on Mount Zugspitze (2600 m asl) throughout the period 2001–2003. The sampling locations and measurement procedures have been described in detail elsewhere [31, 32] and the results are consistent with those of other studies performed at comparable locations [11, 13, 15, 18–21].
Table 7.1 Characteristic mass concentrations of fine particulate matter (PM2.5) and proportions of total carbon (TC), elemental carbon (EC), organic carbon (OC), water-soluble OC (WSOC) and macromolecular WSOC (MWSOC, molecular mass >5 kDa) in urban, rural and high alpine air in central Europe (rounded arithmetic mean values ± standard deviation of about 30 filter samples collected at each location over the period 2001–2003).

                     Urban (Munich)   Rural (Hohenpeissenberg)   Alpine (Zugspitze)
PM2.5 (μg m⁻³)       20 ± 10          10 ± 5                     4 ± 2
TC/PM2.5 (%)         40 ± 20          30 ± 10                    20 ± 10
EC/TC (%)            50 ± 20          30 ± 10                    30 ± 10
OC/TC (%)            40 ± 20          70 ± 10                    70 ± 10
WSOC/OC (%)          20 ± 10          40 ± 20                    60 ± 20
MWSOC/WSOC (%)       30 ± 10          50 ± 20                    40 ± 20
On average, the total PM2.5 mass concentration decreases by about a factor of two from urban to rural and from rural to alpine air, while the TC mass fraction decreases from 40 to 20%. The EC/TC ratios in PM2.5 are as high as 50% in the urban air samples taken close to a major traffic junction and on the order of 30% in rural and high alpine air, demonstrating the strong impact of diesel soot and other fossil fuel combustion or biomass burning emissions on the atmospheric aerosol burden and composition. The water-soluble fraction of organic carbon (WSOC in OC), on the other hand, exhibits a pronounced increase from urban (20%) to rural (40%) and high alpine (60%) samples of air particulate matter. This observation can be attributed to different aerosol sources (e.g. water-insoluble combustion particle components versus water-soluble biogenic and secondary organic particle components; see below), but also to chemical aging and oxidative transformation of organic aerosol components, which generally increases the number of functional groups and thus the water solubility of organic molecules.
Black or elemental carbon accounts for most of the light absorption by atmospheric aerosols and is therefore of crucial importance for the direct radiative effect of aerosols on climate [33–35]. Despite a long tradition of soot and aerosol research, however, there is still no universally accepted and applied operational definition of BC and EC. Several studies have compared the different optical and thermal methods applied by atmospheric research groups to measure BC and EC. Depending on techniques and measurement locations, fair agreement has been found in some cases, but mostly the results deviated considerably (up to 100% and more) [36–38]. Optical methods for the detection of BC are usually non-destructive and allow (near) real-time operation, but on the other hand they are particularly prone to misinterpretation. They generally rely on the assumptions that BC is the dominant absorber and has a uniform mass-specific absorption coefficient or cross-section. While these assumptions may be justified under certain conditions, they are highly questionable in the context of detailed chemical characterization of aerosol particles ("How black is black carbon?") [26]. In addition to different types of graphite-like
material, there are at least two classes of organic compounds which can contribute to the absorption of visible light by air particulate matter (light-absorbing, "yellow" or "brown" carbon) [21]: polycyclic aromatics and humic-like substances. Therefore, optically determined BC values have to be considered as mass equivalent values but not as absolute mass or concentration values. Moreover, most conventional optical methods such as the aethalometer and integrating-sphere and integrating-plate techniques are based on the measurement of light extinction rather than absorption. As a consequence, these methods require aerosol composition-dependent calibrations or additional sample work-up processes to compensate for or minimize the influence of scattering aerosol components such as inorganic salts and acids on the measurement signal [37, 39, 40]. Alternatively, photoacoustic spectroscopy allows direct measurements of light absorption by airborne aerosol particles, and in recent years several photoacoustic spectrometers have been developed and applied for the measurement of aerosol absorption coefficients and BC equivalent concentrations [41–43].
Among the few methods available for the characterization of the molecular and crystalline structures of BC and EC (graphite-like carbon proportion and degree of order) are high-resolution electron microscopy, X-ray diffraction and Raman spectroscopy [28, 29, 44]. These measurement techniques have revealed dependences of the microstructure and spectroscopic properties of flame soot, diesel soot and related carbonaceous materials on the processes and conditions of particle formation and aging. So far, however, these methods have been too labor intensive for routine investigations of atmospheric aerosol samples and their applicability to quantitative analyses remains to be proven [28, 29]. Nevertheless, recently developed measurement systems promise to allow the quantification of graphite-like carbon and soot in aerosol filter samples by Raman spectroscopy [45, 46].
7.2.2
Primary and Secondary Organic Components
The total mass of organic air particulate matter (OPM), i.e. the sum of organic aerosol (OA) components, is usually estimated by multiplication of OC by a factor of about 1.5–2, depending on the assumed average molecular composition and accounting for the contribution of elements other than carbon contained in organic substances (H, O, N, S, etc.) [21, 47]. The only way, however, to determine accurately the overall mass, molecular composition, physicochemical properties and potential toxicity of OPM is the identification and quantification of all relevant chemical components. Also, trace substances can be hazardous to human health, and potential interferences of refractive and colored organics in the determination of BC or EC can be assessed only to the extent to which the actual chemical composition of OPM is known [26, 48].
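The OC-to-OPM conversion mentioned above is a one-line estimate; a sketch with the factor range given in the text (the OC value is an illustrative assumption):

def opm_from_oc(oc_ug_m3, om_oc_factor):
    # om_oc_factor: ~1.5-2 depending on the assumed average molecular
    # composition (contribution of H, O, N, S to the organic mass)
    return oc_ug_m3 * om_oc_factor

oc = 5.0  # ug m^-3, illustrative
print(opm_from_oc(oc, 1.5), opm_from_oc(oc, 2.0))  # 7.5 to 10.0 ug m^-3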
Depending on their origin, OA components can be classified as primary or secondary. Primary organic aerosol (POA) components are directly emitted in the condensed phase (liquid or solid particles) or as semi-volatile vapors, which are condensable under atmospheric conditions. The main sources of POA particles and components are natural and anthropogenic biomass burning (wildfires, slashing and burning, domestic heating); fossil fuel combustion (domestic, industrial, traffic related); and wind-driven or traffic-related suspension of soil and road dust, biological materials (plant and animal debris, microorganisms, pollen, spores, etc.), sea spray and spray from other surface waters with dissolved organic compounds.
Secondary organic aerosol (SOA) components are formed by chemical reaction and gas-to-particle conversion of volatile organic compounds (VOCs) in the atmosphere, which may proceed via different pathways:

1. new particle formation: formation of semi-volatile organic compounds (SVOCs) by gas-phase reactions and participation of the SVOCs in the nucleation and growth of new aerosol particles;
2. gas–particle partitioning: formation of SVOCs by gas-phase reactions and uptake (adsorption or absorption) by pre-existing aerosol or cloud particles;
3. heterogeneous or multiphase reactions: formation of low-volatility or non-volatile organic compounds (LVOCs, NVOCs) by chemical reaction of VOCs or SVOCs at the surface or in the bulk of aerosol or cloud particles.
The formation of new aerosol particles from the gas phase generally proceeds via the nucleation of nanometer-sized molecular clusters and subsequent growth by condensation of condensable vapor molecules. Experimental evidence from field measurements and model simulations suggests that new particle formation in the atmosphere is most likely dominated by ternary nucleation of H2SO4–H2O–NH3 and subsequent condensation of SVOCs [49–51]. Laboratory experiments and quantum chemical calculations indicate, however, that SVOCs might also play a role in the nucleation process (H2SO4–SVOC complex formation) [52]. The actual importance of different mechanisms of particle nucleation and growth in the atmosphere has not yet been fully unraveled and quantified. In any case, however, the formation of new particles exhibits a strong and non-linear dependence on atmospheric composition and meteorological conditions, may be influenced by ions and electric charge effects and competes with gas–particle partitioning and heterogeneous or multiphase reactions [53]. Among the principal parameters governing secondary particle formation are temperature, relative humidity and the concentrations of organic and inorganic nucleating and condensing vapors, which depend on atmospheric transport in addition to local sources and sinks such as photochemistry and pre-existing aerosol or cloud particles [23, 25, 49, 50]. The rate and equilibrium of SVOC uptake by aerosol particles depend on the SVOC accommodation coefficients and on the particle surface area, bulk volume and chemical composition (kinetics and thermodynamics of gas–particle partitioning) [54].
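In the kinetic (free-molecular) regime, the dependence of uptake on these quantities takes the standard flux form F = γ(c̄/4)·S·n_gas, with γ the uptake coefficient, c̄ the mean molecular speed and S the particle surface area concentration. A minimal sketch under this textbook approximation, with illustrative parameter values:

import math

R_GAS = 8.314  # J mol^-1 K^-1

def mean_speed(molar_mass_kg, temp_k=298.15):
    # Mean molecular speed (m s^-1) from kinetic gas theory
    return math.sqrt(8.0 * R_GAS * temp_k / (math.pi * molar_mass_kg))

def uptake_rate(gamma, n_gas_m3, surface_m2_per_m3, molar_mass_kg):
    # Gas-to-particle uptake rate (molecules m^-3 s^-1), kinetic regime
    return gamma * mean_speed(molar_mass_kg) / 4.0 * surface_m2_per_m3 * n_gas_m3

# Illustrative: gamma = 0.01, ~1 ppb of a 200 g mol^-1 SVOC (2.5e16 m^-3),
# aerosol surface area 1e-4 m^2 per m^3 of air
print(uptake_rate(0.01, 2.5e16, 1e-4, 0.200))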
Most earlier studies of SOA formation had focused on pathways (1) and (2). Several recent studies indicate, however, that heterogeneous and multiphase reactions may also play an important role and substantially contribute to the overall atmospheric burden of OPM [21, 55–58]. The term heterogeneous reaction generally refers to reactions of gases at the particle surface, whereas the term multiphase reaction refers to reactions in the particle bulk involving species from the gas phase. A variety of different reversible and irreversible mechanisms of acid-catalyzed condensation and
[Table 7.2: Characteristic substance classes of organic particulate matter, their approximate proportions (on the order of 10⁻³–10⁻¹ of the organic mass) and sources: aliphatic hydrocarbons, aliphatic alcohols and carbonyls, levoglucosan, fatty acids and other alkanoic acids, aliphatic dicarboxylic acids, aromatic (poly-)carboxylic acids, multifunctional aliphatics and aromatics (OH, CO, COOH), polycyclic aromatic hydrocarbons (PAHs) and nitro- and oxy-PAHs.]
Several studies have shown that macromolecules such as cellulose and proteins (molecular mass >1 kDa) and other compounds with relatively high molecular mass (>100 Da) such as humic-like substances (HULIS) account for large proportions of OPM and WSOC [21, 32, 68–72]. Obviously, biopolymers and humic substances are emitted as POA components (soil and road dust, sea spray, biological particles), which may be modified by chemical aging and transformation in the atmosphere (e.g. formation of HULIS by oxidative degradation of biopolymers). On the other hand, organic compounds with high molecular mass can also originate from SOA formation by heterogeneous and multiphase reactions at the surface and in the bulk of atmospheric particles, as outlined above (SOA oligomers/polymers).
For the identification and quantification of individual organic compounds, filter and impactor samples are usually extracted with appropriate solvents and the extracts are analyzed by advanced instrumental or bioanalytical methods of separation and detection: gas and liquid chromatography; capillary electrophoresis; absorption, fluorescence and mass spectrometry; immunosorbent, enzyme and dye assays; etc. [26, 32, 62, 63, 66, 73–76]. Alternatively, deposited or suspended particles can be partially or fully vaporized by thermal or laser desorption and directly introduced into a gas chromatograph or spectrometer [3, 77–80].
In recent studies, nuclear magnetic resonance [18], Fourier transform infrared spectroscopy [47], scanning transmission X-ray microscopy [81] and aerosol mass spectrometry [82] have been applied to the efficient characterization and quantification of functional groups in OPM (alkyl, carbonyl, carboxyl and hydroxy groups; carbon double bonds and aromatic rings). These methods give valuable insight into the overall chemical composition, oxidation state and reactivity of OPM, but they provide only limited information about the actual identity of the individual compounds present in the complex mixture. The molecular mass and structure of organic compounds, however, are crucial parameters for their physicochemical and biological properties and thus for their climate and health effects (volatility, solubility, hygroscopicity, CCN and IN activity, bioavailability, toxicity, allergenicity; Sections 7.3 and 7.4).
7.3
Transformation
Chemical reactions proceed at the surface and in the bulk of solid and liquid aerosol particles and can influence atmospheric gas-phase chemistry as well as the properties of atmospheric particles and their effects on climate and human health [3, 6, 54, 83–92]. For example, aerosol chemistry leads to the formation of reactive halogen species, changes of reactive nitrogen and depletion of ozone, especially in the stratosphere, upper troposphere and marine boundary layer [93–103]. On the other hand, chemical aging of aerosol particles generally changes their composition, decreases their reactivity, increases their hygroscopicity and cloud condensation activity and can change their optical properties [21, 26, 89, 104–109].
hydrates, soot or mineral dust; fresh or aged surfaces; low or high reactant concentration levels; transient or (quasi-)steady-state conditions; limited selection of chemical species and reactions (see Ref. [54] and references therein). The different and sometimes inconsistent rate equations, parameters and terminologies make it difficult to compare, extrapolate and integrate the results of different studies over the wide range of reaction conditions relevant for the atmosphere, laboratory experiments, technical processes and emission control.
A comprehensive kinetic model framework for aerosol and cloud surface chemistry and gas–particle interactions has recently been proposed by Pöschl, Rudich and Ammann, abbreviated to PRA [54]. It allows one to describe mass transport and chemical reactions at the gas–particle interface and to link surface processes with gas-phase and particle bulk processes in aerosol and cloud systems with unlimited numbers of chemical components and physicochemical processes. The key elements and essential aspects of the PRA framework are as follows:
1. a simple and descriptive double-layer surface model (sorption layer and quasi-static layer);
2. straightforward and additive flux-based mass balance and rate equations;
3. clear separation of mass transport and chemical reactions;
4. well-defined rate parameters (uptake and accommodation coefficients, reaction and transport rate coefficients);
5. clear distinction between different elementary and multistep transport processes (surface and bulk accommodation, etc.);
6. clear distinction between different elementary and multistep heterogeneous and multiphase reactions (Langmuir–Hinshelwood and Eley–Rideal mechanisms, etc.);
7. mechanistic description of complex concentration and time dependences;
8. flexible inclusion or omission of chemical species and physicochemical processes;
9. flexible convolution or deconvolution of species and processes;
10. full compatibility with traditional resistor model formulations.
Figure 7.6 illustrates the PRA model compartments and elementary processes at the gas–particle interface. The individual steps of mass transport are indicated by bold arrows beside the model compartments: gas-phase diffusion; reversible adsorption; mass transfer between sorption layer, quasi-static surface layer and near-surface particle bulk; and diffusion in the particle bulk. The slim arrows inside the model compartments represent different types of chemical reactions: gas-phase reactions; gas–surface reactions; surface layer reactions; surface–bulk reactions; and particle–bulk reactions [54]. Exemplary practical applications and model calculations demonstrating the relevance of these aspects have been presented in a companion paper [114].
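To make the flux-based formalism concrete, the following Python sketch integrates a minimal PRA-type sorption-layer mass balance: reversible adsorption of a trace gas X competing for surface sites, followed by a Langmuir–Hinshelwood surface-layer reaction with a quasi-static-layer component Y. All parameter values are hypothetical round numbers chosen for illustration; they are not taken from Refs. [54] or [114].

```python
# Minimal PRA-style sorption-layer mass balance (illustrative parameters only)
alpha_s0 = 1e-3    # surface accommodation coefficient on an adsorbate-free surface
omega    = 3.6e4   # mean thermal velocity of gas molecules X (cm s-1)
sigma_x  = 1e-15   # effective molecular cross-section of X (cm2)
k_des    = 1e-2    # first-order desorption rate coefficient (s-1)
k_slr    = 1e-17   # surface-layer reaction rate coefficient (cm2 s-1)
x_gs     = 7.5e11  # near-surface gas-phase concentration of X (cm-3), ~30 ppb

x_s  = 0.0         # sorption-layer concentration of X (cm-2)
y_ss = 1e14        # quasi-static-layer concentration of Y (cm-2)

dt = 1e-3          # time step (s), simple forward-Euler integration
for _ in range(int(60.0 / dt)):                            # integrate 60 s
    theta = sigma_x * x_s                                  # sorption-layer coverage
    j_ads = alpha_s0 * (1.0 - theta) * omega / 4.0 * x_gs  # adsorption flux (cm-2 s-1)
    j_des = k_des * x_s                                    # desorption flux (cm-2 s-1)
    l_slr = k_slr * x_s * y_ss                             # surface reaction loss of X
    x_s  += dt * (j_ads - j_des - l_slr)
    y_ss += dt * (-k_slr * x_s * y_ss)                     # loss of Y by surface reaction

print(f"surface coverage after 60 s: theta = {sigma_x * x_s:.2f}")
print(f"fraction of Y remaining:     {y_ss / 1e14:.2f}")
```

The separation of adsorption, desorption and reaction into additive fluxes mirrors key elements 2 and 3 of the framework; swapping rate expressions in or out corresponds to the flexible inclusion or omission of processes (element 8).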
The PRA framework is meant to serve as a common basis for experimental and theoretical studies investigating and describing atmospheric aerosol and cloud surface chemistry and gas–particle interactions. In particular, it will support research activities such as the planning and design of laboratory experiments.
Organic aerosol components and the surface layers of BC or EC can react with atmospheric photo-oxidants (OH, O3, NO3, NO2, etc.), acids (HNO3, H2SO4, etc.), water and UV radiation. The chemical aging of OA components basically follows the generic reaction pathways outlined in Figure 7.7, and it tends to increase the oxidation state and water solubility of OC. In analogy with the atmospheric gas-phase photochemistry of VOCs (methane, isoprene, terpenes, etc.) [59, 116, 117], oxidation, nitration, hydrolysis and photolysis transform hydrocarbons and derivatives with one or a few functional groups into multifunctional hydrocarbon derivatives. The cleavage of organic molecules and release of SVOCs, VOCs, CO or CO2 can also lead to volatilization of OPM. On the other hand, oxidative modification and degradation of biopolymers may convert these into HULIS (in analogy with the formation of humic substances in soil, surface water and groundwater). Moreover, condensation reactions and radical-initiated oligo- or polymerization can decrease the volatility of OA components and promote the formation of secondary organic particulate matter (SOA oligomers or HULIS, respectively; Table 7.2; Section 7.2.2).
The actual reaction mechanisms and kinetics, however, have been elucidated and fully characterized only for a small number of model reaction systems and components. So far, most progress has been made in the kinetic investigation and modeling of chemical reactions in cloud droplets [120, 121]. For the reasons outlined above, very few reliable and widely applicable kinetic parameters are available for organic reactions at the surface and in the bulk of liquid and solid aerosol particles [21, 54, 89, 122–124].
Several studies have shown that surface reactions of organic molecules and black or elemental carbon with gaseous photo-oxidants such as ozone and nitrogen dioxide tend to exhibit non-linear concentration dependences and competitive co-adsorption of different gas-phase components, which can be described by Langmuir–Hinshelwood reaction mechanisms and rate equations [54, 84, 89, 114, 125].
An example of such reactions is the degradation of benzo[a]pyrene (BaP) on soot by ozone. BaP is a polycyclic aromatic hydrocarbon (PAH) and prominent air pollutant with the chemical formula C20H12, consisting of five six-membered aromatic rings. It is one of the most hazardous carcinogens and mutagens among the 16 priority polycyclic aromatic hydrocarbon pollutants defined by the US Environmental Protection Agency (EPA). The main source of BaP in the atmosphere is combustion aerosols, and it resides to a large extent on the surface of soot particles [48, 90, 110].
Figure 7.8 shows pseudo-first-order rate coefficients for the degradation of BaP on soot by ozone at gas-phase mole fractions or volume mixing ratios (VMR) up to 1 ppm under dry conditions and in the presence of water vapor [relative humidity (RH) 25%, 296 K, 1 atm]. These and complementary results of aerosol flow tube experiments and model calculations indicate reversible and competitive adsorption of O3 and H2O, followed by a slower, rate-limiting surface reaction between adsorbed O3 and BaP on the soot surface. The kinetic parameters determined from the displayed non-linear least-squares fits (maximum pseudo-first-order rate coefficients and effective Langmuir adsorption equilibrium constants) allow the prediction of the half-life (50% decay time) of BaP on the surface of soot particles in the atmosphere. At a typical ambient ozone VMR of 30 ppb it would be only about 5 min under dry conditions and about 15 min at 25% RH.
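The order of magnitude of these half-lives follows directly from the Langmuir–Hinshelwood rate law. The sketch below evaluates the pseudo-first-order rate coefficient and half-life as a function of the O3 mixing ratio; the rate parameters are round numbers of the magnitude reported for this system [90] and should be read as illustrative assumptions, not as the fitted values behind Figure 7.8.

```python
import math

def bap_half_life_min(o3_ppb, k1_max=0.015, K_o3=2.8e-13):
    """Half-life (min) of surface BaP under a Langmuir-Hinshelwood rate law.

    k1_max : maximum pseudo-first-order rate coefficient (s-1), illustrative
    K_o3   : effective O3 adsorption equilibrium constant (cm3), illustrative
    """
    n_air = 2.46e19                    # air number density at 296 K, 1 atm (cm-3)
    o3 = o3_ppb * 1e-9 * n_air         # gas-phase O3 concentration (cm-3)
    # dry conditions; a competitive K_H2O*[H2O] term in the denominator
    # would slow the decay at higher relative humidity
    k1 = k1_max * K_o3 * o3 / (1.0 + K_o3 * o3)
    return math.log(2.0) / k1 / 60.0

print(f"t1/2 at 30 ppb O3 (dry): {bap_half_life_min(30):.0f} min")    # ~5 min
print(f"t1/2 at  1 ppm O3 (dry): {bap_half_life_min(1000):.1f} min")
```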
Figure 7.9 illustrates the recovery ratio (RR) of BaP from fine air particulate matter (PM2.5) collected with a regular filter sampling system from urban air at ambient ozone VMRs up to 80 ppb (Munich, 2001–2002). The plotted RRs refer to filter samples collected in parallel with a system that removes ozone and other photo-oxidants from the sample air flow by means of an activated carbon diffusion denuder [110]. Thus deviations from unity represent the fraction of BaP degraded by reaction with ozone and other photo-oxidants from the sampled air during the
sampling process, i.e. the BaP loss due to reactive filter sampling artifacts. The BaP recovery ratio is nearly identical with the recovery ratio of the sum of all particle-bound five- and six-ring US EPA priority PAH pollutants, PAH(5,6), and exhibits a negative linear correlation with ambient ozone. It decreases from unity at low ozone to about 0.5 at 80 ppb O3, which is a characteristic concentration level for polluted urban air in summer. Similar correlations have been observed in experiments performed at different locations and with different filter sampling and denuder systems [110].
With regard to chemical kinetics, the linear correlation between PAH recovery ratio and O3 VMR can be attributed to the near-linear dependence of the PAH degradation rate coefficient on O3 at low VMRs (VMR ≪ inverse of the effective adsorption equilibrium constant; Figure 7.8) [54, 84, 90, 114]. Moreover, it indicates efficient protection and shielding of the PAHs on deposited particles from further decay by coverage with subsequently collected particulate matter (build-up of a filter cake) on time scales similar to the half-life of the PAHs at the surface. Otherwise, the PAH recovery should be even lower and the ozone concentration dependence should be less pronounced.
In any case, the sampling artifacts observed by Schauer and co-workers and illustrated in Figure 7.9 imply that the real concentrations of particle-bound PAHs in urban air are up to 100% higher than the measurement values obtained with simple filter sampling systems (without activated carbon diffusion denuder or equivalent equipment) as applied for most atmospheric research and air pollution monitoring purposes [48, 90, 110]. Clearly, other OA components with similar or higher reactivity towards atmospheric oxidants (e.g. alkenes) are also prone to similar or even stronger sampling artifacts, which have to be avoided or at least minimized and quantified for accurate and reliable determination of atmospheric aerosol composition and properties. These and other potential sampling and analytical artifacts caused by reactive transformation of fine air particulate matter have to be taken into account not only in atmospheric and climate research activities, but also in air pollution control. In particular, the control and enforcement of emission limits and ambient threshold level values for OA components which pose a threat to human health (Section 7.4.2) require the development, careful characterization and validation and correct application of robust analytical techniques and procedures [48].
As far as atmospheric aerosol cycling and feedback loops are concerned (Figures 7.1 and 7.3), chemical aging and oxidative degradation of organics present on the surface and in the bulk generally make aerosol particles more hydrophilic or hygroscopic and enhance their ability to act as CCN. Besides their contribution to the water-soluble fraction of particulate matter, partially oxidized organics can act as surfactants and influence the hygroscopic growth, CCN and IN activation of aerosol particles (Section 7.3.2).
The chemical reactivity of carbonaceous aerosol components also plays an important role in technical applications for the control of combustion aerosol emissions. For example, the lowering of emission limits for soot and related diesel exhaust particulate matter (DPM) necessitates the development and implementation of efficient exhaust aftertreatment technologies such as diesel particulate filters or particle traps with open deposition structures. These systems generally require
regeneration by oxidation and gasification of the soot deposits in the filter or catalyst structures. Usually the regeneration is based on discontinuous oxidation by O2 at high temperatures (>500 °C) or continuous oxidation by NO2 at moderate exhaust temperatures (200–500 °C) [29, 44, 111–113, 125]. Efficient optimization of the design and operating conditions of such exhaust aftertreatment systems requires comprehensive kinetic characterization and mechanistic understanding of the chemical reactions and transport processes involved. Recent investigations have shown that the oxidation and gasification of diesel soot by NO2 at elevated concentration and temperature levels (up to 800 ppm NO2 and 500 °C) follows a Langmuir–Hinshelwood reaction mechanism similar to that of the oxidation of BaP on soot by O3 at ambient concentration and temperature levels (up to 1 ppm O3 and 30 °C) [29, 90, 125].
7.3.2
Restructuring, Phase Transitions, Hygroscopic Growth and CCN/IN Activation of
Aerosol Particles upon Interaction with Water Vapor
Water vapor molecules interacting with aerosol particles can be adsorbed on the particle surface or absorbed into the particle bulk. For particles consisting of water-soluble material, the uptake of water vapor can lead to aqueous solution droplet formation and a substantial increase of the particle diameter (hygroscopic growth), even at low relative humidities (RH <100%; atmospheric gas-phase water partial pressure below the equilibrium vapor pressure of pure liquid water) [10].
At water vapor supersaturation (RH >100%), aerosol particles can act as nuclei for the formation of liquid cloud droplets [cloud condensation nuclei (CCN)]. For the formation of water droplets from a homogeneous gas phase devoid of aerosol particles, supersaturations of up to several hundred percent would be required (thermodynamic barrier for the homogeneous nucleation of a new phase). In the atmosphere, however, water vapor supersaturations with respect to liquid water generally remain below 10% and mostly even below 1%, because aerosol particles induce heterogeneous nucleation, condensation and cloud formation. At low temperatures and high altitudes, clouds consist of mixtures of liquid water droplets and ice crystals or entirely of ice crystals. The formation of ice crystals is also induced by pre-existing aerosol particles, so-called ice nuclei (IN), as detailed below. Ice nucleation in clouds usually requires temperatures well below 0 °C, which can lead to high water vapor supersaturations with respect to ice [10, 126–131].
The minimum supersaturation at which aerosol particles can be effectively
activated as CCN or IN is called critical supersaturation. It is determined by the
physical structure and chemical composition of the particles and it generally
decreases with increasing particle size. For insoluble CCN the critical supersaturation depends on the wettability of the surface (contact angle of liquid water) and for
partially or fully soluble CCN it depends on the mass fraction, hygroscopicity and
surfactant activity of the water-soluble particulate matter [10, 22, 24, 132, 133].
The nucleation of ice crystals on atmospheric aerosol particles can proceed via different pathways or modes. In the deposition mode, water vapor is adsorbed and immediately converted into ice on the surface of the IN (deposition or sorption nuclei). In the condensation freezing mode, the aerosol particles act first as CCN and induce the formation of supercooled aqueous droplets which freeze later on (condensation freezing nuclei). In the immersion mode, the IN are incorporated into pre-existing aqueous droplets and induce ice formation upon cooling (immersion nuclei). In the contact mode, freezing of a supercooled droplet is initiated upon contact with the surface of the IN (contact nuclei). Obviously, the IN activity of aerosol particles depends primarily on their surface composition and structure, but condensation and immersion freezing can also be governed by water-soluble bulk material [10, 130, 134–140].
Most water-soluble aerosol components are hygroscopic and absorb water to form aqueous solutions at RH <100%. The phase transition of dry particle material into a saturated aqueous solution is called deliquescence and occurs upon exceeding a substance-specific RH threshold value [deliquescence relative humidity (DRH)]. The reverse transition and its RH threshold value are called efflorescence and efflorescence relative humidity (ERH), respectively. The hygroscopic growth and CCN activation of aqueous solution droplets can be described by the so-called Köhler theory, which combines Raoult's law or alternative formulations for the activity of water in aqueous solutions with the Kelvin equation for the dependence of vapor pressure on the curvature and surface tension of a liquid droplet [10, 22, 24, 132, 141–146].
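The structure of Köhler theory can be made explicit with a small calculation. The sketch below uses the common dilute-solution approximation of the Köhler equation, s(D) ≈ A/D − B/D³, for an idealized, fully dissociating NaCl particle; it does not reproduce the semi-empirical ion-interaction parameterization of Ref. [141], so the numbers are only indicative.

```python
import math

# Water properties and constants (SI units)
SIGMA, M_W, RHO_W = 0.072, 0.018, 1000.0   # surface tension, molar mass, density
R, T = 8.314, 298.0

def critical_supersaturation(d_dry, rho_s=2165.0, M_s=0.0585, i_vh=2.0):
    """Critical supersaturation (%) and critical droplet diameter (m) of a
    soluble dry particle, from the approximation s(D) = A/D - B/D**3.

    Defaults describe an idealized, fully dissociating NaCl particle."""
    A = 4.0 * SIGMA * M_W / (R * T * RHO_W)            # Kelvin (curvature) term (m)
    B = i_vh * rho_s * M_W * d_dry**3 / (RHO_W * M_s)  # Raoult (solute) term (m3)
    s_c = math.sqrt(4.0 * A**3 / (27.0 * B))           # critical supersaturation
    d_c = math.sqrt(3.0 * B / A)                       # critical droplet diameter
    return 100.0 * s_c, d_c

s_c, d_c = critical_supersaturation(100e-9)
print(f"100 nm dry NaCl: S_c = {s_c:.2f} %, D_c = {d_c * 1e6:.2f} um")
```

The Kelvin term A raises the equilibrium vapor pressure of small droplets, the Raoult term B lowers it in proportion to the dissolved solute; their competition produces the critical supersaturation discussed above, which decreases with increasing dry particle size.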
Figure 7.10a shows a typical example of the hygroscopic growth of water-soluble inorganic salts contained in air particulate matter: the hygroscopic growth curve (humidogram) of pure NaCl aerosol particles with dry particle diameters of 100 nm, measured in a hygroscopicity tandem differential mobility analyzer (H-TDMA) experiment at relative humidities up to 95%. Upon hydration (increase in RH), the crystalline NaCl particles undergo a deliquescence transition at DRH ≈ 75%. The water uptake and the dependence of the aqueous solution droplet diameter on RH agree very well with Köhler theory calculations, which are based on a semi-empirical ion interaction parameterization of water activity and account for the effects of particle shape transformation (cubic crystals and spherical droplets; mobility and mass equivalent diameters) [141]. The hysteresis branch measured upon dehydration (decrease in RH) is due to the existence of solution droplets in a metastable state of NaCl supersaturation (ERH < RH < DRH). The efflorescence transition, i.e. the formation of salt crystals and evaporation of the liquid water, occurs at ERH ≈ 40%.
Figure 7.10b displays the hygroscopic growth curve of aerosol particles composed of pure bovine serum albumin (BSA) as a model for globular proteins and similar organic macromolecules. The hygroscopic growth is much less pronounced than for inorganic salts but still significant, with deliquescence and efflorescence transitions at DRH ≈ ERH ≈ 40% (conversion of dry protein particles into saturated aqueous solution or gel-like droplets, and vice versa) and no significant deviations between hydration and dehydration (no hysteresis effect). The dependence of the deliquesced particle diameter on RH is in good agreement with Köhler theory calculations based on a simple osmotic pressure parameterization of water activity, which has been derived under the assumption that the dissolved protein macromolecules behave like inert solid spheres [141].
Figure 7.10c shows the hygroscopic growth curve of internally mixed NaCl–BSA particles (mass ratio 1:1) with dry particle diameters of 100 nm. The mixed aerosol particles were generated in full analogy with the pure NaCl and pure BSA particles (nebulization of an aqueous solution). Upon hydration, however, the particles exhibit a significant decrease in the measured (mobility equivalent) diameter as the relative humidity approaches the deliquescence threshold (DRH ≈ 75%). The observed minimum diameter is about 10% smaller than the initial diameter, indicating a high initial porosity of the particles (envelope void fraction ≈ 30%) and strong restructuring upon humidification. Upon dehydration, the efflorescence threshold is lower than for pure NaCl (ERH ≈ 37% vs. ≈ 40%), indicating that the protein macromolecules inhibit the formation of salt crystals and enhance the stability of supersaturated salt solution droplets. The particle diameters observed after efflorescence essentially equal the minimum diameter observed prior to deliquescence. The hygroscopic growth of the deliquesced particles (aqueous solution droplets) is in fair agreement with Köhler theory calculations based on the observed minimum diameter rather than the initial diameter and on the assumption of simple solute additivity (linear combination of the NaCl ion interaction and BSA osmotic pressure parameterizations of water activity) [141]. These and complementary measurement and modeling results can be explained by the formation of porous agglomerates due to ion–protein interactions and electric charge effects on the one hand, and by compaction of the agglomerate structure due to capillary condensation and surface tension effects on the other.
Depending on their origin and conditioning, aerosol particles containing inorganic salts and organic (macro-)molecules can have complex and highly porous microstructures, which are influenced by electric charge effects and interaction with water vapor. Proteins and other surfactants tend to be enriched at the particle surface and form an envelope which can inhibit the access of water vapor to the particle core and lead to kinetic limitations of hygroscopic growth, phase transitions and CCN and IN activation. The formation and effects of organic surfactant films on sea salt particles have been discussed by O'Dowd et al. [147]. These and other effects of (non-linear) interactions between organic and inorganic aerosol components have to be further elucidated and considered for consistent analysis of measurement data from laboratory experiments and field measurements and for reliable modeling of atmospheric aerosol processes (Figures 7.1 and 7.3).
Structural rearrangements, hygroscopic growth, phase transitions and CCN/IN activation of aerosol particles interacting with water vapor are not only important for the formation and properties of clouds and precipitation (number density and size of cloud droplets and ice particles; temporal and spatial distribution and intensity of precipitation). They also influence the chemical reactivity and aging of atmospheric particles (accessibility of particle components to reactive trace gases and radiation), their optical properties (absorption and scattering cross-sections) and their health effects upon inhalation into the human respiratory tract (deposition efficiency and bioavailability). Therefore, the water interactions of particles with complex chemical composition are widely and intensely studied in current aerosol, atmospheric and climate research. So far, however, their mechanistic and quantitative understanding is still limited.
7.4
Climate and Health Effects
Among the particle properties suspected to determine adverse health effects are the specific surface area, transition metals and organic compounds [7, 165–167]. Some of the possible mechanisms by which air particulate matter and other pollutants may affect human health are summarized in Table 7.3.
Ultrafine particles (dp < 100 nm) are suspected to be particularly hazardous to human health, because they are sufficiently small to penetrate the membranes of the respiratory tract and enter the blood circulation or be transported along olfactory nerves into the brain [168–170]. Neither for ultrafine nor for larger aerosol particles, however, is it clear which physical and chemical properties actually determine their adverse health effects (particle size, structure, number and mass concentration, solubility, chemical composition and individual components, etc.).
Particularly little is known about the relations between allergic diseases and air quality. Nevertheless, traffic-related air pollution with high concentration levels of fine air particulate matter, nitrogen oxides and ozone is one of the prime suspects, besides unnatural nutrition and exaggerated hygiene, which may be responsible for the strong increase in allergies in industrialized countries over recent decades [7, 171–173]. The most prominent group of airborne allergens are protein molecules, which account for up to 5% of urban air particulate matter. They are not only contained in coarse biological particles such as pollen grains (diameter >10 μm) but also in the fine fraction of air particulate matter, which can be explained by fine fragments of pollen, microorganisms or plant debris and by mixing of proteins dissolved in rain water with fine soil and road dust particles [69, 72, 174].
A molecular rationale for the promotion of allergies by traffic-related air pollution has been proposed by Franze and co-workers [72, 73], who found that proteins including the birch pollen allergen Bet v 1 are efficiently nitrated by polluted urban air. The nitration reaction converts the natural aromatic amino acid tyrosine into nitrotyrosine and proceeds particularly fast at elevated NO2 and O3 concentrations (photo-smog or summer smog conditions), most likely involving nitrate radicals (NO3) as reactive intermediates. From biomedical and immunological research, it is
known that protein nitration occurs upon inflammation of biological tissue, where it may serve to mark foreign proteins and guide the immune system. Moreover, conjugates of proteins and peptides with nitroaromatic compounds were found to evade immune tolerance and boost immune responses, and post-translational modifications generally appear to enhance the allergenicity of proteins [72]. Thus the inhalation of aerosols containing nitrated proteins or nitrating reagents is likely to trigger immune reactions, promote the genesis of allergies and enhance the intensity of allergic diseases and airway inflammations. This hypothesis is supported by first results of ongoing biochemical experiments with nitrated proteins [175].
By means of newly developed enzyme immunoassays, nitrated proteins have been detected in urban road and window dust and in fine air particulate matter, exhibiting degrees of nitration of up to 0.1%. Upon exposure of birch pollen extract to heavily polluted air at a major urban traffic junction and to synthetic gas mixtures containing NO2 and O3 at concentration levels characteristic of intense summer smog, the degrees of nitration rose by up to 20%. The experimental results indicate that Bet v 1 is more easily nitrated than other proteins, which might be an explanation for why it is a particularly strong allergen [72]. If the ongoing biochemical experiments and further studies confirm that protein nitration by nitrogen oxides and ozone is indeed an important link between air pollution, airway inflammations and allergies, the spread and enhancement of these diseases could be countered by the improvement of air quality and the reduction of emission limits for nitrogen oxides and other traffic-related air pollutants. Moreover, it might be possible to develop pharmaceuticals against the adverse health effects of nitrated proteins.
Efficient control of air quality and related health effects requires a comprehensive understanding of the identity, sources, atmospheric interactions and sinks of hazardous pollutants. Without this understanding, the introduction of new laws, regulations and technical devices for environmental protection runs the risk of being ineffective or even of doing more harm than good through unwanted side-effects.
For example, epidemiological evidence for adverse health effects of fine and ultrafine particles has led to a lowering of present and future emission limits for soot and related DPM [29, 113, 170, 176–178]. For compliance with these emission limits, different particle filter or trapping and exhaust aftertreatment technologies have been developed and are currently being introduced in diesel vehicles. Depending on the design of the particle filter or trap and catalytic converter systems, their operation can lead to substantial excess NO2 emissions [29, 125]. If, however, elevated NO2 concentrations and the nitration of proteins indeed promote allergies, such systems could reduce respiratory and cardiovascular diseases related to soot particles but at the same time enhance allergic diseases. Moreover, elevated NO2 concentrations and incomplete oxidation of soot in exhaust filter systems could also increase the emissions of volatile or semi-volatile hazardous aerosol components such as nitrated PAH derivatives [31, 48]. Hence effective mitigation of the adverse health effects of diesel engine exhaust may require the introduction of advanced catalytic converter systems which minimize the emissions of both particulate and gaseous pollutants (soot, PAHs and PAH derivatives, nitrogen oxides, etc.) rather than simple particle filters.
7.5
Summary and Outlook
With regard to atmospheric aerosol effects on human health, not only the quantitative but also the qualitative and conceptual understanding are very limited. Epidemiological and toxicological studies indicate strong adverse health effects of fine and ultrafine aerosol particles (nanoparticles) in addition to gaseous air pollutants, but the causative relations and mechanisms are hardly known [7, 164]. Their elucidation, however, is required for the development of efficient strategies of air quality control and medical treatment of related diseases, which permit the minimization of adverse aerosol health effects at minimum social and economic cost.
Particularly little is known about the relations between allergic diseases and air pollution and the interactions between natural aeroallergens and traffic-related pollutants. Several studies have shown synergistic and adjuvant effects of diesel particulate matter, O3, NO2 and allergenic pollen proteins, but the specific chemical reactions and molecular processes responsible for these effects have not yet been unambiguously identified. Recent investigations indicate that the nitration of allergenic proteins by polluted air may play an important role. Nitrated proteins are known to stimulate immune responses, and they could promote the genesis of allergies, enhance allergic reactions and influence inflammatory processes, which is confirmed by the results of ongoing biochemical investigations [72, 175].
For efficient elucidation and abatement of adverse aerosol health effects, the knowledge of atmospheric and biomedical aerosol research should be integrated to formulate plausible hypotheses that specify potentially hazardous chemical substances and reactions on a molecular level. These hypotheses have to be tested in appropriate biochemical and medical studies to identify the most relevant species and mechanisms of interaction and to establish the corresponding dose–response relationships. Ultimately, the identification and characterization of hazardous aerosol components and of their sources and sinks (emission, transformation, deposition) should allow the optimization of air pollution control and of the medical treatment of aerosol effects on human health.
Abbreviations

asl above sea level
BaP benzo[a]pyrene
BC black carbon
BSA bovine serum albumin
CCN cloud condensation nucleus
dp particle diameter
dN/d log dp particle number size distribution function
DPM diesel particulate matter
DRH deliquescence relative humidity
EC elemental carbon
ELPI electrical low-pressure impactor
EPA Environmental Protection Agency
ERH efflorescence relative humidity
HC hydrocarbon
H-TDMA hygroscopicity tandem differential mobility analyzer
HULIS humic-like substances
IN ice nucleus
k1 (pseudo-)first-order rate coefficient
LVOC low-volatility organic compound
MWSOC macromolecular water-soluble organic carbon
NVOC non-volatile organic compound
OA organic aerosol
OC organic carbon
OPM organic particulate matter
PAH polycyclic aromatic hydrocarbon
PAH(5,6) polycyclic aromatic hydrocarbons consisting of five or six aromatic rings
PM particulate matter
PM2.5 (1 or 10) particulate matter of particles with aerodynamic diameters ≤2.5 μm (1 or 10 μm)
POA primary organic aerosol
PRA Pöschl, Rudich and Ammann
RH relative humidity
RR recovery ratio
SMPS scanning mobility particle sizer
SOA secondary organic aerosol
SVOC semi-volatile organic compound
TC total carbon
UV ultraviolet
VMR volume mixing ratio
VOC volatile organic compound
WSOC water-soluble organic carbon
References
1 U. Pöschl, Angewandte Chemie International Edition 2005, 44, 7520.
2 S. Fuzzi, M. O. Andreae, B. J. Huebert, M. Kulmala, T. C. Bond, M. Boy, S. J. Doherty, A. Guenther, M. Kanakidou, K. Kawamura, V.-M. Kerminen, U. Lohmann, L. M. Russell, U. Pöschl, Atmospheric Chemistry and Physics 2006, 6, 2017.
3 B. J. Finlayson-Pitts, J. N. Pitts, Chemistry of the Upper and Lower Atmosphere, Academic Press, San Diego, CA, 2000.
87 A. R. Ravishankara, Science 1997, 276, 1058.
88 J. P. Reid, R. M. Sayer, Chemical Society Reviews 2003, 32, 70.
89 Y. Rudich, Chemical Reviews 2003, 103, 5097.
90 U. Pöschl, T. Letzel, C. Schauer, R. Niessner, Journal of Physical Chemistry A 2001, 105, 4029.
91 R. Sander, Surveys in Geophysics 1999, 20, 1.
92 S. P. Sander, B. J. Finlayson-Pitts, R. R. Friedl, D. M. Golden, R. E. Huie, C. E. Kolb, M. J. Kurylo, M. J. Molina, G. K. Moortgat, V. L. Orkin, A. R. Ravishankara, Chemical Kinetics and Photochemical Data for Use in Atmospheric Studies, Evaluation Number 14, JPL Publication 02-25, Jet Propulsion Laboratory, Pasadena, CA, 2002.
93 A. E. Waibel, T. Peter, K. S. Carslaw, H. Oelhaf, G. Wetzel, P. J. Crutzen, U. Pöschl, A. Tsias, E. Reimer, H. Fischer, Science 1999, 283, 2064.
94 D. J. Stewart, P. T. Griffiths, R. A. Cox, Atmospheric Chemistry and Physics 2004, 4, 1381.
95 J. Austin, D. Shindell, S. R. Beagley, C. Brühl, M. Dameris, E. Manzini, T. Nagashima, P. Newman, S. Pawson, G. Pitari, E. Rozanov, C. Schnadt, T. G. Shepherd, Atmospheric Chemistry and Physics 2003, 3, 1.
96 S. K. Meilinger, B. Kärcher, T. Peter, Atmospheric Chemistry and Physics 2002, 2, 307.
97 S. K. Meilinger, B. Kärcher, T. Peter, Atmospheric Chemistry and Physics 2005, 5, 533.
98 M. O. Andreae, P. J. Crutzen, Science 1997, 276, 1052.
99 R. Sander, W. C. Keene, A. A. P. Pszenny, R. Arimoto, G. P. Ayers, E. Baboukas, J. M. Cainey, P. J. Crutzen, R. A. Duce, G. Hönninger, B. J. Huebert, W. Maenhaut, N. Mihalopoulos, V. C. Turekian, R. Van Dingenen, Atmospheric Chemistry and Physics 2003, 3, 1301.
100 A. A. P. Pszenny, J. Moldanová, W. C. Keene, R. Sander, J. R. Maben,
8
Measurement and Detection of Nanoparticles Within
the Environment
Thomas A.J. Kuhlbusch, Heinz Fissan, and Christof Asbach
8.1
Introduction
Nanotechnology opens up opportunities for new and improved products and needs tailored production tools. Nanoparticles are one of the most important building blocks, for example to develop new or improved materials, allow for higher catalytic efficiencies or reduce energy and material consumption.
Nanoparticles (NPs) are defined in this chapter as intentionally produced particles for use in products, either as single particles or as agglomerates, with diameters below 100 nm. Another term often used is ultrafine particles (UFPs, Figure 8.1). Ultrafine particles in this chapter denote particles in the environment which at least partially consist of unintentionally produced and/or naturally formed particles. Nanoparticles are normally solid particles, whereas naturally occurring and unintentionally manmade particles may be of solid or liquid nature. Even though particles in the sub-100-nm range can be differentiated into nanoparticles and UFPs, the measurement and detection techniques are fundamentally the same.
Figure 8.2 shows the relative scale of natural and manmade materials. This figure shows that nanoparticles are about one hundredth the size of a blood cell. The small size of nanoparticles is actually one of the important characteristics which give the free, unbound particles a generally high mobility. Particles of sizes below 100 nm are within the size range of the pores of cells and hence may penetrate into cells, the blood or even end organs such as the brain [1]. Whether this may happen or is of relevance to health is still under debate and is not a topic of this chapter.
Nanoparticles are mainly defined by their particle diameter being <100 nm. Other specific properties related to nanoparticles and possible concentration metrics are listed in Table 8.1.
The overview of possible particle properties of interest in view of nanoparticle effects on humans, animals and plants in Table 8.1 is not exhaustive and only indicates the complexity with which nanoparticles may interact with the environment. The list of concentration metrics (Table 8.1) again is not exhaustive but indicates that it is possible to use different metrics for the quantitative investigation of exposure, dose and effects. More detailed information on how the above properties are related to particle-induced negative health effects and the detection of particles in different tissues is given elsewhere [2–4].
The specific properties listed in Table 8.1 may also be used for the measurement and detection of nanoparticles in the environment.
Another issue illustrated in Figure 8.2 is that natural and manmade nanosized particles co-exist. The properties listed in Table 8.1 may be used to determine, measure and quantify nanoparticles, but the methods generally do not differentiate between manmade nanoparticles and other nanosized particles. One of the major tasks in measuring and detecting nanoparticles in the environment is therefore the differentiation between UFPs and nanoparticles. A simple differentiation may not even be enough, since a nanoparticle attached to a larger particle will lose some of its properties, especially its size and hence its mobility. Therefore, generally three kinds of particles below 100 nm have to be differentiated.
Table 8.1 Specific particle properties and possible concentration metrics.

Particle property: shape/morphology; chemical composition; hygroscopicity; solubility; charge, mobility; particle reactivity (radical formation).

Concentration metric: mass concentration; surface area concentration; number concentration; size distribution <100 nm.
Nanoparticles attached to larger particles or bound in agglomerates lose some of their specific nanoparticle properties but may recover some of them when released from the agglomerate after uptake in plants, animals or humans. This issue, being specifically related to toxicology, will not be discussed further here, but it leads to the discussion of measurement techniques that may also include particle sizes up to about 400 nm. Still, it should be noted that, to our knowledge, no release of nanoparticles from agglomerates has yet been demonstrated under realistic conditions, for example after uptake of agglomerates in the lungs.
A further point of discussion related to the detection and measurement of nanoparticles in the environment is shown in Figure 8.3. This figure shows the life cycle of nanoparticles from production, handling and processing to the stage where they are used in products, for example for consumers, and finally to recycling, deposition in landfills or other processes. The life cycle is shown to indicate possible release of nanoparticles into the environment. They may be released during production, handling (e.g. packing), handling during nanoparticle processing (e.g. use of carbon black nanoparticles during tire production) or use of products (e.g. paint containing nanoparticles sprayed onto surfaces).
Emissions from products containing bound nanoparticles in a fixed matrix, such as carbon black in the plastics of computers or in tires, can generally be imagined but are not likely and cannot yet be detected. Hence the release of nanoparticles from products containing nanoparticles in a fixed matrix can be neglected, as was announced, for example, in California's Proposition 65 [5] for carbon black. Detection of nanoparticles in most of these fixed matrices in the environment is nearly impossible.
The environment and the environmental matrix containing nanoparticles are also of crucial importance for the detection and measurement of nanoparticles. Generally, three matrices can be differentiated in the environment: soils, water and air. These three matrices differ not only in their physical state (solid, liquid and gas) but also in their mobility, which increases from soils to air.
8.2
Occurrence of Nanoparticles in Environmental Media
Mainly two different environments can be differentiated when discussing nanoparticles: the ambient environment, and plants or workplaces where nanoparticles are produced, handled and processed. The environmental media in which nanoparticles may occur are the same, but the media soil and water will only be discussed for the ambient, public environment, since the information given there is also valid for the work environment.
8.2.1
Ambient Environment
8.2.1.1 Water and Soils
Nanoparticles may reach waters either by intentional use for remediation [6] or after unintentional release during or after production and subsequent wash-off. Nanoparticles may also be released to water when they are produced via the liquid phase and leaks occur or the effluent is not sufficiently cleaned up. Once suspended in the liquid phase, nanoparticles will likely attach to other particles or to surfaces such as those of sand grains. No information on transport distances and the spatial distribution of nanoparticles in waters is currently available, to our knowledge.
Soils may become contaminated with nanoparticles through airborne deposition, deposition from the water phase or direct deposition of powders and fluids, either intentionally or unintentionally. No systematic distinction is made between soils and sediments in this chapter, since no information on differences in the behavior and determination of nanoparticles is currently available. Once nanoparticles are attached to soil surfaces, three possible routes of (re)mobilization can be differentiated: (a) transport by water through the soils to the groundwater, (b) uptake by plants via roots and (c) wind erosion of soil particles containing nanoparticles.
No studies investigating the uptake of nanoparticles by plants are currently
available. Lecoanet and Wiesner [7] stated that nanosized particles do not move far
under environmental conditions. It was shown that the mobility of nanoparticles
correlates with size, with smaller nanosized particles being easily adsorbed on
surfaces of sand grains and therefore immobilized. Biological transport may still
occur from ingested sediments, but the physical movement of nanosized materials is
restricted by their small size and propensity to adsorb on surfaces.
Nevertheless, taking the example of fullerenes, it could be demonstrated that the mobility of nanoparticles in water and soils also depends strongly on their physical–chemical properties. The mobility and behavior of three different fullerene solutions and four different oxidized materials were shown to be highly variable when ionic strength and pH values were changed [7, 8]. While common models of particle transport through porous media described the behavior of mineral nanoparticles fairly well, the behavior of the fullerenes could not be modeled. Especially the latter component was found in the brains of artificially exposed fish, where it induced oxidative stress [9]. Fullerenes, on the other hand, showed the lowest mobility of the substances tested, reducing the likelihood of exposure. This example demonstrates the need for knowledge of the mobility of nanoparticles.
So far, to our knowledge, no studies have been conducted to determine nanoparticles in environmental media outside plants or experimental sites. This lack of current knowledge may be due to the low concentrations of nanoparticles in waters and soils, which make detection and quantification nearly impossible. Still, with increasing use of mobile nanoparticles, this may become important and should not be neglected in the future.
8.2.1.2 Air
The main matrix studied for the transport of nanoparticles so far is the air. This focus
on air has several reasons, all related to the implications and use of particles in
general. Some of the most important uses and effects of particles include the
industrial production of particles (carbon black, titanium dioxide, etc.), horizontal
dispersion in the atmosphere [10], climatic implications [11], cloud nucleation [12, 13],
transport of nutrients [14] and health effects [15].
Another important reason is the high mobility of nanoparticles and UFPs in air, leading to a wide dispersion of these particles in our atmosphere. UFP number concentrations in ambient air may vary from a few hundred to over 10^5 particles per cubic centimeter, depending on the distance to sources such as incomplete combustion [16], but UFPs are found anywhere in the Earth's atmosphere.
These UFPs do not stem from engineered nanoparticles but mainly from unintentional manmade emissions in addition to natural emissions. The occurrence of UFPs long before the intentional production of nanoparticles, their high mobility and ubiquitous presence and their implications for the environment and health have led to the development of various measurement devices and to innumerable studies.
8.2.2
Workplace Environment
8.3
Nanoparticle Detection and Measurement Techniques
8.3.1
Soil
Basically, no specific detection techniques for nanoparticles in soils are known. The only applicable method is the preparation of soil samples for analysis by transmission electron microscopy (TEM) coupled with, for example, energy-dispersive X-ray (EDX) analysis for the detection of nanoparticles of known chemical composition.
8.3.2
Water and Liquids
Nanoparticles can be produced in water, but the detection, identification and size measurement of nanoparticles in water are still not as advanced as for airborne particles. One generally feasible way of determining nanoparticles is the filtration of waters and fluids with subsequent analysis of the deposited particles by microscopy [TEM, scanning electron microscopy (SEM), atomic force microscopy (AFM)]. This off-line technique permits the determination of nanoparticles down to diameters of a few nanometers. No on-line technique for measurements down to a few nanometers is currently available.
8.3.2.1 Coulter Counter
The Coulter principle (electrical sensing zone method) is a widely used method for
particle size analysis in liquids and is the recommended limit test for particulate
matter in large-volume solutions. It sizes and counts particles based on measurable
changes in electrical resistance produced by nonconductive particles suspended in an
electrolyte. Suspended particles pass through the sensing zone consisting of a small
opening (aperture) between the electrodes (Figure 8.4). Particles displace their
own volume of electrolyte in the sensing zone, which is then measured as a
voltage pulse. The height of each pulse is proportional to the particle volume, and the number of pulses per unit time is proportional to the concentration during constant-volume flow. Several thousand particles per second can be individually counted and sized, and this information is converted into particle size distributions. By its measurement principle, this method is independent of particle shape, color and density. The detectable particle size range of the Coulter Counter extends from about 300 nm to 1200 μm; it is hence not applicable to nanoparticles themselves but to their larger agglomerates.
Figure 8.4 Principle setup of a Coulter Counter (adapted from Beckman Coulter [52]).
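Because each voltage pulse is proportional to the displaced electrolyte volume, converting pulses to particle diameters is essentially a cube-root operation. The sketch below assumes a hypothetical volume calibration constant k_cal relating pulse height to particle volume; it is not a parameter of any specific instrument.

```python
import numpy as np

def pulse_to_diameter(pulse_heights, k_cal):
    """Coulter principle: pulse height = k_cal * particle volume, so the
    volume-equivalent diameter is d = (6 V / pi)**(1/3).

    k_cal : hypothetical instrument calibration constant (V per m3)."""
    volumes = np.asarray(pulse_heights) / k_cal          # particle volumes (m3)
    return (6.0 * volumes / np.pi) ** (1.0 / 3.0)        # diameters (m)

# three hypothetical pulses with a hypothetical calibration constant
print(pulse_to_diameter([1e-3, 8e-3, 2.7e-2], k_cal=2e15) * 1e6)  # diameters in um
```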
8.3.2.2 Light Scattering
Another principle in use for the detection of particles in liquids is based on light scattering. Gas molecules and atmospheric particles are smaller than the wavelengths of visible light. When light hits a gas molecule, the molecule absorbs and scatters the light in different directions; this is why the beam of a torch can be seen at night even from outside the light's path. The different colors of light are scattered to different extents after collision, and this type of scattering is called Rayleigh scattering.
One of the scattering measurement techniques used for liquids is dynamic light scattering (DLS), which permits measurements of particle sizes from 2 nm to 6 μm. DLS measures the intensity fluctuations of light scattered by particles undergoing Brownian motion. The Brownian motion of the particles causes a Doppler shift of the incident light frequency. The magnitude of the frequency shift is related to the frequency of the Brownian motion, which in turn is directly related to the size of the particles. The backscattered light (at angles of about 160–180° relative to the incident light) is most commonly used for the detection and measurement of nanoparticles and their size distribution, since multiple scattering effects can be reduced and solutions with higher concentrations can be measured, because the light does not have to pass through the whole sample (Figure 8.5).
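In practice, DLS instruments recover the translational diffusion coefficient of the particles from the measured intensity fluctuations and convert it into a hydrodynamic diameter via the Stokes–Einstein relation. A minimal sketch follows; the diffusion coefficient in the example is an assumed input, not a measured value.

```python
import math

K_B = 1.381e-23   # Boltzmann constant (J K-1)

def hydrodynamic_diameter(d_diff, temp=293.0, eta=1.0e-3):
    """Stokes-Einstein relation: hydrodynamic diameter (m) from the
    translational diffusion coefficient d_diff (m2 s-1).

    eta is the viscosity of the suspending liquid (default: water at ~20 C)."""
    return K_B * temp / (3.0 * math.pi * eta * d_diff)

# an assumed diffusion coefficient of 4.3e-12 m2 s-1 in water
print(f"d_h = {hydrodynamic_diameter(4.3e-12) * 1e9:.0f} nm")   # ~100 nm
```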
8.3.3
Air
8.3.3.1 Basics
This section explains some of the basic particle and aerosol properties which are
necessary to understand the measurement techniques discussed thereafter.
Mechanical Mobility The terminal settling velocity of a particle can be viewed as the velocity of a particle released in still air and undergoing gravitational settling (e.g. [20]). The terminal velocity can be calculated from a force balance in which the drag force $F_D$ balances the gravitational force $F_G$:

$$F_D = F_G = mg$$ (8.1)

$$\frac{3\pi\eta V d}{C_C} = (\rho_p - \rho_g)\,\frac{\pi d^3 g}{6}$$ (8.2)

where $\eta$ is the viscosity of air, $V$ the particle velocity, $d$ the particle diameter, $\rho_p$ the particle density, $\rho_g$ the density of the gas and $g$ the acceleration due to gravity. $C_C$ is the dimensionless Cunningham slip correction factor, which takes into account that the motion of submicrometer particles is affected by interaction with single gas molecules [21, 22]. $C_C$ increases with decreasing particle diameter and approaches 1 for particle diameters above 1 μm:

$$C_C = 1 + \frac{2\lambda}{d_p}\left[\alpha + \beta\,\exp\left(-\gamma\,\frac{d_p}{2\lambda}\right)\right]$$ (8.3)

where $\lambda$ is the mean free path of the gas molecules and $\alpha = 1.165$, $\beta = 0.483$ and $\gamma = 0.997$ are empirical constants [23].

Generally $\rho_g$ is much smaller than $\rho_p$ and can hence be neglected. Equation (8.2) can then be solved for the terminal settling velocity $V_{TS}$ as used, for example, for the definition of equivalent diameters:

$$V_{TS} = \frac{\rho_p d^2 g\, C_C}{18\eta}$$ (8.4)

Stokes' law, which describes the total resisting force on a spherical particle due to its velocity $V$ relative to the fluid, can also be transformed to

$$F_D = \frac{3\pi\eta V d}{C_C} \quad\Longrightarrow\quad \frac{V}{F_D} = \frac{C_C}{3\pi\eta d} \equiv B$$ (8.5)

where $B$ denotes the mechanical mobility of a particle, i.e. the velocity attained per unit force. The terminal settling velocity [Equation (8.4)] can then be rewritten as

$$V_{TS} = F_G\, B$$ (8.6)
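As a compact numerical illustration of Equations (8.3)–(8.5), the following Python sketch evaluates the slip correction factor, terminal settling velocity and mechanical mobility for a few particle sizes. The viscosity and mean free path are typical values for air at ambient conditions (assumed here, not prescribed by the text).

```python
import math

ETA    = 1.81e-5   # dynamic viscosity of air at ~20 C (Pa s)
LAMBDA = 66e-9     # mean free path of air molecules at ~1 atm (m)
G      = 9.81      # gravitational acceleration (m s-2)

def cunningham(d):
    """Slip correction factor C_C, Equation (8.3), constants from Ref. [23]."""
    kn2 = 2.0 * LAMBDA / d                       # 2*lambda/d_p
    return 1.0 + kn2 * (1.165 + 0.483 * math.exp(-0.997 / kn2))

def settling_velocity(d, rho_p=1000.0):
    """Terminal settling velocity V_TS, Equation (8.4), in m s-1."""
    return rho_p * d**2 * G * cunningham(d) / (18.0 * ETA)

def mechanical_mobility(d):
    """Mechanical mobility B = C_C/(3*pi*eta*d), Equation (8.5), in m N-1 s-1."""
    return cunningham(d) / (3.0 * math.pi * ETA * d)

for d in (10e-9, 100e-9, 1e-6):
    print(f"d = {d * 1e9:6.0f} nm: C_C = {cunningham(d):6.2f}, "
          f"V_TS = {settling_velocity(d):.2e} m/s")
```

The output illustrates how strongly the slip correction grows below 100 nm, which is why gravitational settling is negligible for nanoparticles while their diffusion and mobility are large.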
Equivalent Diameter The morphology of airborne particles can vary over a wide range, from spherical to needle-like in shape, and from single particles to agglomerates. The latter denote small particles attached to each other by strong or weak bonds. Agglomerates usually exhibit irregular shapes with varying fractal dimensions. This variability in morphology prevents easy comparisons and necessitates the use of particle models. The most commonly used models in aerosol measurement are based on equivalent spheres. One concept is the assumption of equal settling velocity of particles. In this model, each irregularly shaped particle is assigned the equivalent diameter of a sphere with the same terminal settling velocity as given in Equations (8.4) and (8.6). If the equivalent sphere is assumed to have the same density $\rho_p$ as the irregularly shaped particle, it is referred to as the Stokes diameter $d_{st}$, whereas if unit density (1 g cm⁻³) is assumed for the sphere, the diameter is referred to as the aerodynamic diameter $d_{ae}$.
This concept of equivalent diameters now allows the comparison of particles of different shape and density (Figure 8.7). Stokes and aerodynamic particle diameters can be interconverted using the following equation:

$$d_{ae} = d_{st}\,\sqrt{\frac{\rho_p}{\rho_0}}$$ (8.7)

Figure 8.7 An irregular particle and its equivalent spheres (adapted from [20]).
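A one-line illustration of Equation (8.7); the particle density in the example is an assumption chosen for illustration.

```python
def aerodynamic_diameter(d_st, rho_p, rho_0=1000.0):
    """Convert Stokes diameter to aerodynamic diameter, Equation (8.7).

    Strictly, the slip corrections of both equivalent spheres should also be
    similar for this simple conversion to hold."""
    return d_st * (rho_p / rho_0) ** 0.5

# e.g. a 100 nm Stokes-diameter particle with an assumed density of 1.8 g cm-3
print(f"{aerodynamic_diameter(100e-9, 1800.0) * 1e9:.0f} nm")   # ~134 nm
```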
Inertia (Impactor, Cyclone) One common principle used in aerosol measurement is the fractionation of particles according to their size by exploiting differences in particle inertia. For the sake of simplicity, only impaction is discussed in more detail. Impactors are used to remove particles above a given size from an aerosol; for example, impaction is used as a separation technique in environmental standards to remove particles larger than 10 μm or 2.5 μm aerodynamic diameter from the aerosol (EN 12341, EN 14907). The impaction principle is based on the acceleration of the aerosol in a nozzle and the direction of the outflow from the nozzle onto a flat plate, called the impaction plate. The flow is deflected by 90° at the impaction plate, and particles above a certain size cannot follow the gas flow due to their inertia and are deposited on the impaction plate, as shown schematically in Figure 8.8.
The cut-off diameter $d_{P,50}$, i.e. the particle diameter collected with 50% efficiency, is given by

$$d_{P,50} = \sqrt{\frac{9\pi\,\mathrm{Stk}_{50}\,\eta\,N\,d_n^3}{4\,\rho_P\,C_C\,\dot{V}}}$$ (8.8)

where $\mathrm{Stk}_{50}$ is the Stokes number at 50% collection efficiency, $N$ the number of nozzles, $d_n$ the nozzle diameter and $\dot{V}$ the volumetric flow rate. Several design criteria should be fulfilled to obtain a sharp separation characteristic (a numerical example is given after the criteria below).
Ratio of nozzle–impaction plate distance to nozzle diameter. The ratio of the distance from the nozzle to the impaction plate (s) to the nozzle diameter ($d_n$) should be in the range 0.5 ≤ s/$d_n$ ≤ 5.0.
Ratio of nozzle length to nozzle diameter. The ratio of nozzle length (l) to nozzle diameter ($d_n$) should be in the range 0.25 ≤ l/$d_n$ ≤ 2.0. These limits ensure a homogeneous flow pattern at the exit of the nozzle, meaning the same velocity everywhere at the nozzle exit. The flow pattern will not be well developed if the ratio is too small (l/$d_n$ < 0.25), whereas distinct velocity profiles with lower velocities at the edge than in the middle of the nozzle develop if the ratio is too large (l/$d_n$ > 2.0).
Reynolds number. The Reynolds number should be in the range 40–3000 to ensure laminar flow conditions.
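As a numerical illustration of Equation (8.8), the sketch below computes the cut-off diameter of a hypothetical single-nozzle impactor. Stk50 = 0.24 is a commonly quoted design value for round nozzles (an assumption here), and since C_C itself depends on the unknown cut-off diameter, the equation is solved by fixed-point iteration.

```python
import math

ETA, LAMBDA = 1.81e-5, 66e-9   # air viscosity (Pa s), mean free path (m)

def cunningham(d):
    """Slip correction factor C_C, Equation (8.3)."""
    kn2 = 2.0 * LAMBDA / d
    return 1.0 + kn2 * (1.165 + 0.483 * math.exp(-0.997 / kn2))

def impactor_cut_diameter(n_nozzles, d_n, flow, rho_p=1000.0, stk50=0.24):
    """Cut-off diameter d_P,50 (m) from Equation (8.8).

    n_nozzles : number of nozzles N
    d_n       : nozzle diameter (m)
    flow      : volumetric flow rate (m3 s-1)
    stk50     : Stokes number at 50% efficiency (assumed design value)
    """
    d50 = 1e-6                              # initial guess (m)
    for _ in range(50):                     # fixed-point iteration for C_C(d50)
        d50 = math.sqrt(9.0 * math.pi * stk50 * ETA * n_nozzles * d_n**3
                        / (4.0 * rho_p * cunningham(d50) * flow))
    return d50

# hypothetical single 5 mm nozzle at 0.5 m3 h-1 (1.39e-4 m3 s-1)
print(f"d_P,50 = {impactor_cut_diameter(1, 5e-3, 1.39e-4) * 1e6:.1f} um")
```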
Electrical Mobility The electrical mobility $Z_p$ of a particle is defined as its drift velocity $v_e$ in an electric field per unit field strength $E$:

$$Z_p = \frac{v_e}{E}$$ (8.9)

The electrical mobility thus expresses the ability of a charged particle to move within an electric field. This ability depends on the particle charge, expressed as a multiple $n$ of the elementary charge $e$ ($1.602 \times 10^{-19}$ A s), and on the mechanical mobility $B$ of the particle, which represents the general ability of the particle to move in a gas:

$$Z_p = n e B$$ (8.10)

The mechanical mobility $B$ is a function of particle and gas properties and is given by Equation (8.5).
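Equations (8.5) and (8.10) combine into a short mobility calculation. The sketch below computes the drift velocity of a singly charged 100 nm particle; the field strength of 1 kV cm⁻¹ is an arbitrary illustrative choice.

```python
import math

E_CHARGE    = 1.602e-19   # elementary charge (A s)
ETA, LAMBDA = 1.81e-5, 66e-9

def cunningham(d):
    """Slip correction factor C_C, Equation (8.3)."""
    kn2 = 2.0 * LAMBDA / d
    return 1.0 + kn2 * (1.165 + 0.483 * math.exp(-0.997 / kn2))

def electrical_mobility(d, n_charges=1):
    """Z_p = n e B with B = C_C/(3 pi eta d), Equations (8.5) and (8.10)."""
    b = cunningham(d) / (3.0 * math.pi * ETA * d)   # mechanical mobility
    return n_charges * E_CHARGE * b                 # m2 V-1 s-1

e_field = 1e5                                       # 1 kV cm-1 in V m-1
v_drift = electrical_mobility(100e-9) * e_field
print(f"drift velocity: {v_drift * 100:.2f} cm/s")  # ~0.27 cm/s
```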
Particle Size Distributions (Mass, Surface, Number) One basic piece of information on airborne particles is their size distribution. Three modes can generally be differentiated: the nucleation mode (diameter <30 nm), the accumulation mode (diameter 200–700 nm) and the coarse mode (diameter >1000 nm).
Figure 8.10 Number, surface and mass size distributions (Adapted from [56]).
air due to the low surface to volume ratio but becomes important when assessing
indoor air and possibly workplaces (especially for nanoparticles and UFPs).
A standard way to quantify the longevity of a substance in the atmosphere is its lifetime, the time that it takes for an initial amount to be reduced by about two-thirds (i.e. to 1/e of its initial value). Particle lifetimes in the atmosphere are strongly size dependent.
Figure 8.12 gives examples of particle number size distributions. Figure 8.12a shows the variation between different site types, from rural background to urban traffic sites. A clear increase in number concentration and a shift towards ultrafine particles from the rural background to the urban traffic site can be seen. Figure 8.12b shows the same average urban particle number size distribution as in (a), the only difference being that the y-axis is linear in scale. This example demonstrates the influence of changing scales on the perception of particle size distributions.
Figure 8.12 (a) Particle number size distributions for various site types in a log–log plot. (b) Average urban distribution from (a) with a linear concentration axis [57].
8.3.3.2 Online Physical Characterization
Condensation Particle Counter Condensation particle counters (CPCs), sometimes also referred to as condensation nucleus counters (CNCs) or Aitken nucleus counters (ANCs), are used to measure the total number concentration of gas-borne particles
larger than some minimum detectable size. No upper size limit is given for CPCs
other than the collection efficiencies of the inlet system. CPCs are often used either as stand-alone instruments to measure the total concentration, for example in room or ambient air, or as detectors with other instruments, such as electrostatic classifiers (DMAs), to detect the number concentration of size-selected particles. Condensation particle counters are available as either hand-held or stationary instruments. The latter usually offer a larger liquid reservoir and can thus be used over longer periods, for example in conjunction with a scanning mobility particle sizer (SMPS). Hand-held CPCs are more mobile and are used, for instance, for mapping total particle concentrations in workplace environments.
In CPCs, particles are enlarged by vapor condensation and then detected optically. Without additional preconditioning, the lowest detectable particle size would be limited to the range of the wavelength of light. Such particle counters would therefore not be suitable to detect nanoparticles. In order to grow small particles to optically detectable sizes, they are exposed to a supersaturated vapor that condenses on the particle surfaces. Commonly used vapors are n-butanol [26] and water [27]. Diameter growth factors of 100–1000 are common [28], resulting in lower detection limits of 3 nm for n-butanol-based and 2.5 nm for water-based CPCs.
The fact that particles in air act as condensation nuclei was first described by Coulier in 1875 [53], who found that if air expands adiabatically, the condensation effect is stronger in unfiltered than in filtered air.
A first version of a condensation particle counter was developed by Aitken in the late 1880s [29, 30]. In his dust counter, illustrated in Figure 8.13, he first flushed a test receiver (A) with particle-free air before introducing a known amount (1 cm³) of aerosol into the receiver. An air pump (B) was used to produce a known expansion. The particles in the test receiver grew due to condensation, resulting in increased settling of the particles. Aitken used a magnifying glass (S) to manually count the particles settled on a deposition stage (O). Based on the assumption that the stage contained a representative sample of the particles in the receiver flask, he concluded that the total number of particles could be determined by means of the ratio of the volume above the stage to the total volume of the flask. Aitken used his dust counter and an improved portable dust counter [31] for intensive studies on atmospheric
particles and found that the particle concentrations were significantly higher when the wind was blowing from industrial sources and that the concentrations were affected by sunlight-driven photochemical reactions [32]. Based on these studies, he concluded [34]: "Though this investigation clearly shows that the sun produces certain kinds of fogs, yet it is by no means here contended that it is to be censured for their appearance. It would rather appear that it is doing its best to show us the state of pollution into which our modern civilization has brought our atmosphere, as it only inflicts these fogs on the areas upon which man has thrown the waste products of his industries and converted the atmosphere into a vast sewer, as a penalty for something wrong in his methods." Aitken's dust counter was thus an early instrument to help understand air pollution. Even now, well over a century later, the principle of enlarging particles by condensation remains an important tool for the determination of particle numbers.
The main difference is that today particles are grown not in order to increase their settling, but to change their optical properties. CPCs can generally be divided into direct and indirect detection instruments [32]. While direct instruments determine the total number by counting individual droplets formed by condensation, indirect instruments measure the attenuation of light transmitted through, or the light scattered by, a population of droplets. Direct instruments usually have a lower upper concentration limit, whereas indirect instruments generally require relatively high concentrations. Modern instruments include both direct and indirect counting, depending on the particle concentration. When a certain concentration limit is exceeded, the instrument automatically switches from direct to indirect mode.
The method of producing supersaturation has changed since Aitken's time. Whereas early instruments still used the discontinuous expansion method, modern CPCs use steady-flow, forced-convection heat transfer that allows particle counting in real time. In these instruments, the saturated aerosol at 35–40 °C enters a laminar flow condenser. The condenser walls are typically maintained at 10 °C, causing a forced heat transfer from the warm aerosol to the cool walls [33]. Figure 8.14 shows a schematic of a modern, butanol-based ultrafine condensation particle counter (UCPC, TSI Model 3025A) that can detect particles down to 3 nm.
Electrometer Electrometers are used to measure the current induced by particle-borne charges; the result is an integral measure of the total electrical charge deposited per unit time. Figure 8.15 shows a schematic diagram of an electrometer. The instrument consists of an absolute filter inside a grounded metal housing. The filter is connected to an electrometer, and all charges are removed from the particles and led to the electrometer. The housing forms a Faraday cup that shields the electrometer input from stray electric fields. The total current measured by the electrometer can be expressed as
I_e = N · n_p · e · Q_e    (8.11)

where N is the number concentration of particles deposited on the filter, n_p is the average number of elementary charges on a particle, e is the elementary charge (1.602 × 10−19 A s) and Q_e is the flow rate through the electrometer.
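As a minimal illustration of Equation 8.11, the sketch below rearranges it to recover the number concentration from a measured current; all numerical inputs are illustrative assumptions, not instrument specifications.

```python
# Minimal sketch of Equation 8.11, rearranged for the number concentration.
# All numerical inputs below are illustrative assumptions.

E = 1.602e-19  # elementary charge (A s)

def number_concentration(current_a, n_charges, flow_cm3_s):
    """Particle number concentration (cm^-3) from the electrometer
    current (A), the average number of elementary charges per particle
    and the flow rate through the electrometer (cm^3 s^-1)."""
    return current_a / (n_charges * E * flow_cm3_s)

# Example: 1 fA at 16.7 cm^3 s^-1 (1 L min^-1), singly charged particles
print(number_concentration(1e-15, 1.0, 16.7))  # ~3.7e2 cm^-3
```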
Aged particles in ambient air are usually nearly neutralized, that is, the sum of all particle charges is close to zero, because approximately the same number of particles
Figure 8.14 Schematic of a modern ultrafine condensation particle counter (TSI, Model 3025A).
is positively and negatively charged. In this case, the electrometer current would be close to zero. If, however, the particles are freshly generated, they usually bear mainly unipolar charges, with the polarity depending on the generation process. The polarity of the measured current can therefore yield insight into the origin of the deposited particles. If the particles are intentionally charged prior to deposition in an electrometer and the relationship between particle size and charge is known, the integral current allows (limited) interpretation with respect to particle size and concentration. If the particles are monodisperse or mono-mobile, as delivered, for example, by an electrostatic classifier, the current can be directly related to the number concentration of particles in the aerosol. This fact is used in several applications where condensation particle counters cannot be used, for example due to pressure or time resolution restrictions. Electrometers can detect arbitrarily small particles; however, they require a certain minimum current from the deposited particles and therefore higher minimum particle concentrations than CPCs (Figure 8.15).
Overview of Aerosol Particle Sizing Instrumentation Various techniques exist for the determination and characterization of airborne particle size distributions (e.g. by number, surface or volume). Three physically different basic principles for particle size determination can be distinguished: electrical mobility, mechanical mobility and optical scattering. Figure 8.16 gives an overview of the particle size ranges covered by the different techniques (abbreviations are explained in Table 8.2).
The optical scattering of light depends on the particle size in addition to the refractive index and is hence used directly for particle size classification and counting. The electrical and mechanical mobility are used for the fractionation of particle sizes, with subsequent measurement of the number concentration per size class by a separate particle counting device such as a CPC or electrometer. Particle number size distributions are then calculated from the known investigated volume and the counted particles per size class.
Table 8.2 Overview of aerosol particle sizing techniques.

Particle sizer                           Abbreviation   Sampling interval
Nano-scanning mobility particle sizer    Nano-SMPS      Continuous
Scanning mobility particle sizer         SMPS           Continuous
Differential mobility spectrometer       DMS            Continuous
Fast mobility particle sizer             FMPS           Continuous
Electrical diffusion battery             EDB            Continuous
Nano-Moudi                               Nano-Moudi     Discontinuous
Moudi                                    Moudi          Discontinuous
Electrical low-pressure impactor         ELPI           (Dis)continuous
Optical particle counter (sizer)         OPC            Continuous
Aerodynamic particle sizer               APS            Continuous
Table 8.2 presents an overview of the available techniques.
A very good overview of particle sizers based on optical and time-of-flight techniques can be found in a dissertation by Cole [34]. Further good overviews were given by Klaus and Baron [35], Chow [36] and McMurry [28].
The principles of the first two techniques (APS and OPC) are briefly described in this section; the other techniques are presented in separate sections hereafter.
In an APS, particles are accelerated by passing the aerosol through a nozzle. Particles experience acceleration in the nozzle in accordance with their aerodynamic diameter, and their speed after the nozzle is a direct measure of it. This speed is determined with a split laser beam after the nozzle, which simultaneously counts the particles. The main drawback of the APS is that it can only detect particles larger than about 500 nm aerodynamic diameter. Hence it can only be applied to the detection of large nanoparticle agglomerates, not of individual nanoparticles.
The OPC has a similar drawback, since it is normally limited to particle sizes in the range of the wavelength used. The lower detectable particle diameter is normally around 200 nm (Figure 8.16).
Differential and Scanning Mobility Particle Sizer To determine the number size distribution of submicron airborne particles, differential mobility particle sizers (DMPSs) and SMPSs are very commonly used. Depending on their configuration, they can cover a size range between 3 nm and 1 µm. The hardware of both instruments is substantially the same. The aerosol first passes through a size-selective inlet (impactor) before being neutralized in a neutralizer. The neutralized aerosol is then fractionated in a DMA, before the classified particles are counted in particle counting equipment (usually a CPC, sometimes an electrometer). Computer software is used to control the voltages applied to the DMA and to read the counts from the CPC. A schematic of a DMPS/SMPS is shown in Figure 8.17.
The general principle of a DMPS/SMPS is that a range of voltages is applied to the DMA. Each voltage corresponds to a certain electrical mobility. The concentration of mono-mobile particles is measured with the CPC and, together with the known charge distribution, evaluated to obtain the concentration of particles with this mobility diameter. In a DMPS, the voltages are applied sequentially and the particles are counted after the concentrations have stabilized, before the next voltage is applied. In the more recently developed SMPS [37], the voltage is continuously ramped and an algorithm is used to relate the measured concentrations to the applied voltages, that is, to the corresponding electrical mobility. While a DMPS needs 15–20 min for a full scan, an SMPS can accomplish a full scan within 2 min.
The inversion of the measured mobility data into size distributions was described by Hoppel [39]. Since the particles are neutralized prior to classification in the DMA, the counted numbers contain not only singly charged but also larger, multiply charged particles, as sketched in Figure 8.18a. In order to determine the number size distribution from the mobility distribution, the concentration of multiply charged particles has to be subtracted for each channel (Figure 8.18b) and the resulting concentration of singly charged particles divided by the probability of single charging for that particular size (Figure 8.18c), as given, for example, by Wiedensohler [40] (see Table 8.1). To understand the inversion technique, it is easiest to start with the channel of lowest electrical mobility, that is, with the largest particles. An appropriate size-selective inlet for a DMPS/SMPS is designed such that its cut-off diameter is below the diameter of doubly charged particles of the lowest detectable electrical mobility. Therefore, all particles in the lowest mobility channel are singly charged, because all particles bearing higher charge levels are captured in the pre-selector. The measured number concentration in the lowest mobility channel can thus be directly divided, without further correction, by the probability of single charging in order to obtain the airborne number concentration. The same is true for all low-mobility channels in which the size of doubly charged particles is large enough for them to be captured in the pre-selector. In higher mobility channels, doubly (or higher) charged particles are present. Their size can be calculated by means of Equation 8.10 in the section on electrical mobility. Their contribution to the total measured concentration in the channel can be determined by multiplying the airborne concentration of particles of this (larger) size by the probability of double charging. The concentration of doubly charged particles is then subtracted from the measured concentration in order to obtain the concentration of singly charged particles only, which then needs to be divided by the probability of single charging at that size in order to obtain the airborne concentration. Once the channels also contain triply or higher charged particles, these need to be subtracted from the measured data as described for the doubly charged particles and the result divided by the probability of single charging. A minimal sketch of this stepwise correction is given below.
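The following sketch condenses the stepwise correction into code. It is a simplification under stated assumptions: only singly and doubly charged particles are considered, and the doubly charged particles in a channel are taken to have the diameter of the neighboring larger channel (a real inversion computes their size from Equation 8.10 and interpolates). The function `charge_prob` stands for a charging-probability parameterization such as Wiedensohler's approximation [40]; it is assumed, not given, here.

```python
def invert(measured, diameters, charge_prob):
    """Stepwise multiple-charge correction (simplified sketch).

    measured[i]:  raw CPC concentration in mobility channel i, ordered
                  from the largest (lowest-mobility) to the smallest diameter
    diameters[i]: singly charged diameter of channel i
    charge_prob(d, k): probability that a particle of diameter d carries
                  k elementary charges (assumed external parameterization)
    Returns the airborne number concentration per channel.
    """
    true_conc = []
    for i, (n_meas, d) in enumerate(zip(measured, diameters)):
        if i == 0:
            # Lowest-mobility channel: the pre-selector has removed all
            # multiply charged particles, so every count is singly charged.
            n_singly = n_meas
        else:
            # Simplification: doubly charged particles in channel i are
            # assigned the diameter of the next larger channel.
            d_double = diameters[i - 1]
            n_doubly = true_conc[i - 1] * charge_prob(d_double, 2)
            n_singly = max(n_meas - n_doubly, 0.0)
        # Divide by the single-charging probability at this size.
        true_conc.append(n_singly / charge_prob(d, 1))
    return true_conc
```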
After the correction described above (Figure 8.18a–c), the result is the number concentration for each channel, where the channels now represent the average diameter of the singly charged particles. The magnitudes of the concentrations, however, do not necessarily represent the actual size distribution, as the width of the channels can vary with respect to the particle diameter. The number concentration for each channel (Figure 8.18c) therefore needs to be weighted with the channel width (Figure 8.18d). In SMPSs, the channel widths are based on either common or natural logarithms. Accordingly, the number concentrations dN per channel are weighted with either the common logarithm [dN/d log(dp)] or the natural logarithm [dN/d ln(dp)].
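As a small worked example, this weighting amounts to dividing each channel's concentration by its width on a logarithmic diameter axis; the channel bounds below are illustrative.

```python
import math

def dN_dlogdp(dN, d_lower, d_upper):
    """Weight a channel concentration (cm^-3) with its width on a
    common-logarithmic diameter axis (cf. Figure 8.18d)."""
    return dN / (math.log10(d_upper) - math.log10(d_lower))

print(dN_dlogdp(250.0, 10.0, 10.8))  # illustrative 10-10.8 nm channel
```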
All of the above-mentioned data inversion was based on the assumption that no particle losses occur in a DMPS/SMPS system. This is obviously not true. The data therefore need to be post-processed to correct for particle losses. Since DMPS/SMPS systems are designed for submicron particles, losses can mainly be attributed to diffusion, whereas other loss mechanisms can be neglected. Corrections for diffusion losses are not generally covered in DMPS/SMPS evaluation software packages and therefore in some cases need to be done manually. If the size distributions are not corrected, they tend to under-predict significantly the concentrations of small particles, particularly those below about 50 nm. In a DMPS/SMPS, diffusion losses occur in
- the size-selective inlet,
- the neutralizer and internal plumbing,
- the tubing to the DMA and CPC,
- the DMA,
- the CPC.
If the DMPS/SMPS does not sample directly through the size-selective inlet, but through tubing connected to the inlet, losses inside this upstream tubing also need to be considered. Furthermore, the CPC counting efficiency needs to be taken into account. While the losses in the DMA require special treatment as described by Reineking and Porstendörfer [67], the losses inside all tubing can be quantified as described by Gormley and Kennedy [40].
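A sketch of the Gormley–Kennedy penetration formula for a straight tube in laminar flow is given below, with the coefficients as commonly tabulated in standard aerosol references (the smallest high-order term is dropped); treat it as an illustration rather than a validated loss-correction routine.

```python
import math

def penetration(D, L, Q):
    """Fraction of particles penetrating a straight tube in laminar flow
    (Gormley-Kennedy). D: particle diffusion coefficient (m^2 s^-1),
    L: tube length (m), Q: volumetric flow rate (m^3 s^-1)."""
    mu = math.pi * D * L / Q  # dimensionless deposition parameter
    if mu < 0.02:
        return 1.0 - 2.5638 * mu**(2/3) + 1.2 * mu + 0.1767 * mu**(4/3)
    return (0.81905 * math.exp(-3.6568 * mu)
            + 0.09753 * math.exp(-22.305 * mu)
            + 0.0325 * math.exp(-56.961 * mu))

# Example: 10 nm particle (D ~ 5.2e-8 m^2/s), 0.5 m of tubing at 1 L/min
print(penetration(5.2e-8, 0.5, 1.667e-5))  # ~0.93
```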
Based on the assumption that all sampled particles are spherical, the measured number size distribution can be converted into a surface or volume size distribution, as illustrated in Figure 8.19. To compute the surface distribution, the number concentration for each channel i needs to be multiplied by the surface area of the particles in that channel:

dS_i = dN_i · π · d_p,i²    (8.12)
Figure 8.19 Number, surface and volume size distributions for spherical particles.
The volume distribution follows analogously from

dV_i = dN_i · (π/6) · d_p,i³    (8.13)
If the particles all have the same density, the volume distribution can also be converted into a mass size distribution. If the particles are not spherical, the surface and volume distributions in Figure 8.19 can be understood as the surface and volume of electrically equivalent spheres.
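Equations 8.12 and 8.13 translate directly into code; the sketch below converts one channel of a number distribution, with illustrative inputs.

```python
import math

def surface_and_volume(dN, dp_nm):
    """Convert a channel number concentration dN (cm^-3) at diameter
    dp_nm (nm) into surface (nm^2 cm^-3) and volume (nm^3 cm^-3)
    concentrations for spherical particles (Equations 8.12 and 8.13)."""
    dS = dN * math.pi * dp_nm**2
    dV = dN * (math.pi / 6.0) * dp_nm**3
    return dS, dV

print(surface_and_volume(1000.0, 50.0))  # 1000 cm^-3 of 50 nm spheres
```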
Under certain assumptions, the surface area and volume distributions of chain-like ultrafine aggregates, such as soot particles, can be estimated based on electrical mobility measurements, as described by Lall and co-workers [41, 42]. Their theory, however, is based on several assumptions:
- All aggregates are composed of primary particles, all of which have the same known diameter.
- The primary particles are much smaller than the mean free path of the surrounding gas molecules.
- The connections between the primary particles do not exhibit necks, that is, the surface area can be obtained by summing over the surface areas of the individual primary particles.
- The fractal dimensions of the aggregates are smaller than two.
They take into account a different charging efficiency for aggregates compared with spheres, resulting in a shifted size distribution, and calculate the surface area of the particles based on the number and size of the primary particles of which the agglomerates are composed.
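Under the no-neck assumption above, the surface area and volume of an aggregate reduce to sums over its primary particles. The sketch below shows only this final bookkeeping step; estimating the number of primary particles from the measured mobility is the substantial part of Lall and co-workers' theory and is not reproduced here, and the example numbers are invented.

```python
import math

def aggregate_surface_volume(n_primaries, d_primary_nm):
    """Surface (nm^2) and volume (nm^3) of a chain-like aggregate of n
    identical, non-necked primary spheres of diameter d_primary_nm."""
    s = n_primaries * math.pi * d_primary_nm**2
    v = n_primaries * (math.pi / 6.0) * d_primary_nm**3
    return s, v

print(aggregate_surface_volume(40, 20.0))  # e.g. a 40-primary soot chain
```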
Fast Mobility Particle Sizer SMPSs offer a time resolution of about 2 min. If the size distributions change quickly, for example due to a car driving by or the accidental release of nanoparticles in a workplace, this time resolution may not be sufficient. In 1998, Tammet et al. [55] of Tartu University in Estonia developed an electrical aerosol spectrometer that measures number size distributions based on electrical mobility measurements. However, the different mobility channels are not measured sequentially as in an SMPS, but simultaneously, resulting in significantly higher time resolution. The instrument has been commercialized by TSI in two versions, the fast mobility particle sizer (FMPS) with a time resolution of 1 s and the engine exhaust particle sizer (EEPS) with a time resolution of 0.1 s. The two instruments are fundamentally the same. Whereas the FMPS is designed for ambient or workplace measurements, the EEPS is tailored to measuring engine exhaust and additionally includes means for recording engine data. It is only the higher particle concentration in engine exhaust that allows the EEPS to offer a higher time resolution than the FMPS. Both instruments cover the same size range, from 5.6 to 560 nm. The principle of the two instruments is shown in Figure 8.20.
The aerosol first passes through a two-stage diffusion charger. The first stage puts a negative net charge on the particles in order to remove any potentially high positive charge levels. The second stage puts a predictable net positive charge on the particles. The main instrument functions very similarly to a DMA, except that the aerosol is introduced near the inner rod, with the sheath air surrounding the aerosol flow. The center rod is divided into three sections with different fixed voltages (increasing from top to bottom) applied to them. Since the voltages are fixed, the trajectories of the particles depend only on their electrical mobility. The outer cylinder contains a series of 22 electrodes, each separately connected to an electrometer. The current induced by deposited particles is continuously measured and used to determine the particle size distribution. The data inversion takes into account a number of parameters that affect the electrometer reading and the time resolution. The electrometer current can be affected by image charges that are induced when charged particles flow past a detection stage without being deposited. Additionally, there are time delays between the detection of small particles in an upper stage and larger particles in a lower stage. A complex inversion algorithm is used to deconvolute the measured data, taking into account image charges and time delays.
Figure 8.21 Schematic diagram of the original real-time single-particle mass spectrometer. The diagram shows the direct sample inlet directing an air flow to the filament for surface desorption/ionization of aerosol particles. Ions are mass analyzed in a magnetic sector mass analyzer (adapted from [58] in [46]; copyright 1977 American Chemical Society).
chemical ionization mass spectrometer. The aerosol charger generates singly charged aerosol from sampled ambient particles at flow rates of up to 6.6 slpm. An optional differential mobility analyzer placed downstream of the charger can be used to achieve size selectivity of the aerosol. Charged particles are then introduced into the electrostatic precipitator, a cylindrical chamber that contains a collection wire mounted on a ceramic rod that is biased to a high voltage (usually 4200 V) and located on the center line of the chamber (see Figure 8.22). Prior to collection, the wire is cleaned by applying a current pulse to it, resistively heating it to 500 °C. After allowing the wire to cool to ambient temperature, a high voltage is applied so that the charged particles pass through a clean buffer gas, which isolates the collection wire from contamination by the ambient gas, and deposit on the tip of the wire. Once a sufficient number of particles have been collected, usually 1–10 pg over a period of 5–10 min, the wire is inserted into the ion source region of the chemical ionization mass spectrometer. A current is applied to the wire to thermally desorb the aerosol at a temperature of 300 °C. The third part of the instrument is the chemical ionization mass spectrometer, which consists of the ion source region, a declustering collision cell and a mass spectrometer. The ion source consists of a 241Am α-source that ionizes the reactant gas mixture at atmospheric pressure to form H3O+, O2− and CO3− and their higher clusters as the primary stable ions. The reactant gas is cryogenic nitrogen that has passed through a liquid nitrogen trap at slightly elevated pressure to remove some impurities, but which nonetheless is able to generate these ions in abundance. These reagent ions react with the compounds evaporated from the aerosol to ionize them according to typical chemical ionization
mass spectrometry (CIMS) procedures [49], and electrostatic lenses direct these ions into the collision cell, where the ions are declustered from neutral compounds that may be present in the gas, most commonly water. The ions are detected using selected ion monitoring with a triple quadrupole mass spectrometer (paragraph adapted from [48]).
8.3.3.4 Offline Physical Characterization
Not all particle characteristics can be determined online, either because no online methods exist (e.g. for particle morphology) or because of detection limit constraints. Particles can therefore be sampled on suitable substrates to overcome these limitations.
Electrostatic Precipitation and Impaction Electrostatic precipitators (ESPs) provide a simple means for the collection of samples for offline analysis of, for example, the chemical composition or morphology of airborne particles. Common applications of ESPs include samples for electron microscopy (SEM/TEM), for example coupled with EDX analysis, samples for atomic force microscopy (AFM), samples for total reflection X-ray fluorescence (TXRF) analysis, nanomaterial evaluation and atmospheric particle sampling.
Inside an ESP, the particles are exposed to an electric field, which directs particles of one polarity towards a sample electrode, whereas particles of the other polarity are deposited on the ESP wall. Uncharged particles are not affected by the electric field and therefore follow the gas streamlines. For effective particle sampling, unipolarly charged particles should be introduced into an ESP. Unipolar charging of particles can, for example, be achieved with a corona charger upstream of the ESP. If connected directly to the outlet of a DMA, an ESP can be used to sample size-selected particles without additional charging.
Figure 8.23 shows an ESP as developed by Dixkens and Fissan [50], which has been commercialized (TSI Model 3089). In this ESP, particles are homogeneously distributed within a deposition spot. The spot size can be varied by controlling the aerosol flow rate and the voltage applied to the electrode system.
Nanoparticles may also be sampled using the impaction process. Commercially available cascade impactors separating and sampling particles down to about 20 nm are the Nano-Moudi, the Berner low-pressure impactor and the electrical low-pressure impactor (ELPI) (Figures 8.24 and 8.25; see also [51]).
These cascade impactors are all based on the same principle, the inertia of particles, which is explained in Section 8.3.3.1. The main extension to the principle explained therein is the low-pressure operation. Low pressure is necessary to exploit particle inertia for size separation in the nanometer range, since gas–particle interactions would otherwise interfere with the process and make size separation for these size classes impossible.
Once the particles have been deposited on the substrates, either by ESP or by impaction, they may be chemically analyzed in bulk or as single particles. ESP sampling is the better option for single-particle analysis.
8.4
Nanoparticle Detection and Measurement Strategies
to be traceable to the product or are of such high toxicity that even single particles can cause health effects (e.g. asbestos).
The measurement techniques to be employed may be different if only larger quantities of nanoparticles have to be detected. Even here, two different levels may be differentiated, discussed here exemplarily for number concentrations of particles of diameter <100 nm: nanoparticle contributions to airborne particles from 1000 to 100 000 cm−3 and contributions >100 000 cm−3. Measurement techniques as used by Möhlmann [18] are applicable to identify source contributions >100 000 cm−3. In his study, a single SMPS measuring particle number size distributions from 10 to 700 nm was used. This measurement allows the calculation of the particle number concentration of particles of diameter <100 nm and the determination of the mode (the particle size with the highest number concentration). Still, since particle number concentrations of up to 100 000 cm−3 may occur naturally or through contributions from outside, this kind of measurement strategy is only applicable for higher number concentrations.
A different approach must be pursued if particle contributions by processes, leaks and work activities at levels down to about a few thousand particles per cubic meter are to be determined. Kuhlbusch et al. [17] and Kuhlbusch and Fissan [19] showed that particle contributions from sources outside the plant can be significant and have to be taken into account. By choosing a setup of instruments with simultaneous measurements inside the working area and at a comparison site in the direct vicinity of the work area but outside it, they demonstrated the influence of outside
Table 8.3 Measurement strategies for different particle number concentration ranges.

Number concentration range (cm−3)   Strategy
>100 000                            Determination of particle number concentrations and/or number size distributions only at the location of interest
1000–100 000                        Determination of particle number concentrations and/or number size distributions simultaneously at the location of interest and a corresponding comparison location
<1000                               Determination of particle number concentrations and/or number size distributions, along with particle samplers for single particle analysis for nanoparticle identification
contributions to indoor measurements. With this kind of setup they were able to differentiate (a) sources from outside the work area, (b) continuous source contributions and (c) discontinuous source contributions. This differentiation of continuous and discontinuous source activities can be important when continuous sources are active during the assessment of a discontinuous activity: had no differentiation been made, the particles from the continuous source would have been attributed to the discontinuous source activity.
Still, even when determining source activities by the methods described above, single particle analysis may be necessary to avoid misinterpretation of the data, such as attributing welding particles to a nanoparticle bagging activity (Table 8.3).
References

1 Oberdörster, G., Sharp, Z., Atudorei, V., Elder, A., Gelein, R., Kreyling, W. and Cox, C. (2004) Translocation of inhaled ultrafine particles to the brain. Inhalation Toxicology, 16 (6–7), 437–445.
2 Donaldson, K., Li, X.Y. and MacNee, W. (1998) Ultrafine (nanometer) particle mediated lung injury. Journal of Aerosol Science, 29 (5–6), 553–560.
3 Oberdörster, G., Oberdörster, E. and Oberdörster, J. (2005) Nanotoxicology: an emerging discipline evolving from studies of ultrafine particles. Environmental Health Perspectives, 113 (7), 823–839.
4 Borm, P.J.A., Robbins, D., Haubold, S., Kuhlbusch, T., Fissan, H., Donaldson, K., Schins, R., Stone, V., Kreyling, W., Lademann, J., Krutmann, J., Warheit, D. and Oberdörster, E. (2006) The potential risks of nanomaterials: a review carried out for ECETOC. Particle and Fibre Toxicology, 3, 11.
9
Epidemiological Studies on Particulate Air Pollution
Irene Brüske-Hohlfeld and Annette Peters
9.1
Introduction
This chapter presents an overview of the main results stemming from epidemiological research on the health effects of exposure to particulate air pollution in the environment and at the workplace. Over the past two decades, evidence has accumulated that airborne particles are correlated with the incidence of respiratory and cardiovascular disease. Although remarkably consistent between numerous epidemiological studies in different geographic areas, these findings were at first received with some skepticism, as there appeared to be no plausible biological mechanism to explain the observed association between respiratory and cardiovascular mortality and the level of airborne particles. As epidemiology is an observational rather than an experimental science, it cannot establish causality on its own and is a rather blunt tool for elucidating biological mechanisms. In the meantime, however, complementary information from controlled in vivo and in vitro experimental studies has supplied supporting evidence, which is presented, for example, in Chapter 10.
9.1.1
Outline of the Chapter
heart, the lungs and the central nervous system. Finally, we will summarize epidemiological studies that have looked into dusty workplace environments and have investigated the impact on the health of exposed workers.
9.1.2
A Short Definition of Particle Sizes
mass concentration is relatively low. With regard to ultrafine particles, the number concentration (n cm−3), surface area concentration (µm2 m−3) or particle length concentration (µm cm−3) is more relevant than the particle mass. Such exposure data are not routinely available from monitoring stations, but have to be collected independently. It has been proposed that the adverse health effects of airborne particles are mainly associated with the number concentrations of ultrafine particles [13] rather than with the mass concentrations PM2.5 or PM10.
9.1.3
A Brief Comment on Epidemiological Study Design
between two groups. The relative risk expresses the ratio of the probability of the event occurring in the exposed group versus the control (non-exposed) group. A 95% confidence interval (CI) is defined as the interval between two numbers with an associated probability p, generated from a random sample of an underlying population, such that if the sampling were repeated numerous times and the CI recalculated from each sample according to the same method, a proportion p of the CIs would contain the population parameter in question. It must be noted that this is not equivalent to a (Bayesian) credible interval. We will cite OR and RR estimates from epidemiological studies along with their CIs to give the reader a perception of the magnitude of the measured association and of the precision of this estimate, mirrored by the width of the CI.
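For readers unfamiliar with these quantities, the sketch below computes an odds ratio and its Wald-type 95% CI from a generic 2×2 table; the counts are invented for illustration and do not come from any study cited in this chapter.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald 95% CI from a 2x2 table:
    a/b = exposed cases/controls, c/d = unexposed cases/controls."""
    or_ = (a * d) / (b * c)
    se = math.sqrt(1/a + 1/b + 1/c + 1/d)  # SE of ln(OR)
    lo = math.exp(math.log(or_) - z * se)
    hi = math.exp(math.log(or_) + z * se)
    return or_, (lo, hi)

print(odds_ratio_ci(30, 70, 15, 85))  # illustrative counts: OR ~2.4
```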
9.2
Potential Entry Routes for Nanoparticles into the Human Body
In principle, there are three main contact sites of the human organism with the environment: the skin, the lungs and the intestinal tract. The skin provides a relatively thick (10 µm) first barrier against hazardous compounds that is difficult to pass, as opposed to the lungs, where in the gas exchange region the barrier between the alveolar wall and the capillaries is very thin. The air in the lumen of the alveoli is on average only 0.5 µm away from the blood. Epidemiological studies, with their main focus on environmental air pollution, can only contribute scientific findings on the effect of inhaling particles. The dermal or oral uptake of particles, although probably important in the context of manufactured nanomaterials, has so far not been an objective of epidemiology.
9.2.1
Inhalation and Metabolism of Airborne Particles
Particles can be inhaled when their aerodynamic diameter is less than 10 µm; larger ones will be trapped in the nose. In general, as particle size decreases, access to the lower respiratory tract and the alveolar region increases [4]. This rule does not apply, however, to particles smaller than 100 nm, as the deposition of nanosized particles becomes governed by diffusional processes rather than gravity. For example, 20 nm UFP are predicted to deposit in the alveolar region at up to 50% and only at about 10% each in the nasopharyngeal and tracheobronchial regions; in contrast, about 90% of inhaled UFP around 1 nm in size deposit in the nasopharyngeal region [5], whereas only about 10% of this size deposit in the tracheobronchial region and essentially none in the alveolar region.
Inhaled particles are cleared by various human defense mechanisms. The mucociliary escalator dominates clearance from the upper airways, where particles in the size range 2.5–10 µm deposit. The mucociliary escalator is an efficient transport system, pushing the mucus that covers the airways, together with trapped solid materials, towards the mouth. Smaller particles (PM2.5 and particles <100 nm) reach the alveoli of the lung and can only be cleared by alveolar macrophages. The uptake of particles and fibers in the alveoli not only results in activation of macrophages, but also stimulates the release of chemokines and pro-inflammatory cytokines into the circulation and the production of reactive oxygen species. Although the inflammatory response is a key component of host defense, it can also contribute to persistent inflammation and the pathogenesis of disease [6]. Independently of particle size, there are specific characteristics of manufactured nanoparticles, such as shape (fibers versus crystals), surface (coated versus uncoated) and surface charge (hydrophilic versus hydrophobic properties), which affect deposition and clearance. Even physiological features of the host organism, such as blood circulation during strenuous physical activity [7, 8] or changed air flow due to pre-existing lung diseases [9], determine the extent and site of particle deposition.
The impact of inhaled particles on extrapulmonary organs has only recently been recognized. Nemmar et al. found in five healthy volunteers that inhaled ultrafine 99mTc-labeled carbon particles passed rapidly into the systemic blood circulation [10]. The literature on the translocation of very small particles from the lungs into the blood circulation is limited and still conflicting. In experimental animal studies, several authors have reported extrapulmonary translocation of ultrafine particles [11–13] after intratracheal instillation or inhalation. However, the amount of ultrafine particles that translocates into blood and extrapulmonary organs differed among these studies.
The difference in deposition characteristics is very important for understanding why nanoparticles can probably gain access to the human central nervous system (CNS) directly from deposits on the nasal mucosa via the olfactory epithelium and the olfactory nerves. This pathway has been well demonstrated for inhaled or nasally instilled compounds in animal experiments [14] and, if it also exists in humans, would be very important, as it circumvents the tight blood–brain barrier. Although the olfactory system occupies about 50% of the nasal mucosa in rodents compared with only 5% in humans, Elder et al. suggest that the direct access of nanoparticles to the brain via the olfactory epithelium and the olfactory nerves is also relevant in humans [15].
9.3
Studies of Environmental Air Pollution in the USA and Europe
9.3.1
PM10 and PM2.5
Based on health statistics and smog episodes in the past, a temporal correlation between high levels of air pollution and acute increases in morbidity and mortality was already observed in the 1950s (e.g. [16]). Since then, numerous epidemiological studies have shown that the association of daily deaths with daily air pollution is not confined to smog episodes, but exists at levels commonly observed in cities and rural areas.
obstructive pulmonary disease [25], congestive heart disease [26, 27], myocardial infarction [28, 29] or diabetes [29, 30]. All of these studies showed an increased risk of experiencing an acute exacerbation of the disease on days with high concentrations of air pollution or shortly afterwards. However, physiological responses with potentially negative effects, such as increases in plasma viscosity [31], fibrinogen [32] and C-reactive protein [4], were not restricted to frail populations, but were also observed in samples of randomly selected healthy subjects. Small increases in blood pressure may occur in association with elevated concentrations of ambient particles [33, 34].
9.3.2
Ultrafine Particles (UFP)
9.4
Cardiovascular Disease
Repeated exposures to elevated ambient air pollution concentrations might not only transiently deteriorate risk factor profiles. Several mechanisms have been hypothesized to contribute to deaths from cardiovascular diseases [49], as shown in Figure 9.2. The inhalation of particles provokes oxidative stress and triggers alveolar and systemic inflammation [2], the linchpin of further pathophysiological mechanisms leading to (1) altered blood rheology favoring coagulation [31, 50], (2) vascular dysfunction [43, 51] and (3) enhanced atherosclerosis, all increasing the risk of a subsequent myocardial infarction; in addition, (4) the alteration of the autonomic nervous control of the heart increases the likelihood of ischemic events and cardiac arrhythmias [52]. Patients with implanted cardioverter defibrillators were more likely to receive interventions when ambient air pollution had been high during the preceding 2 days [53].
An association was found between exposure to traffic and the onset of a myocardial infarction within 1 h afterwards (OR 2.92; 95% CI: 2.22 to 3.83, p < 0.001). The time the subjects spent in cars, on public transportation or on motorcycles or bicycles was consistently linked with an increase in the risk of
myocardial infarction. Adjusting for the level of exercise on a bicycle or for getting up in the morning changed the estimated effect of exposure to traffic only slightly (OR for myocardial infarction, 2.73; 95% CI: 2.06 to 3.61, p < 0.001). The subjects' use of a car was the most common source of exposure to traffic; nevertheless, there was also an association between time spent on public transportation and the onset of a myocardial infarction 1 h later [54].
Time-series studies have reported significant reductions in heart rate variability in association with higher ambient air pollution levels in elderly subjects [55, 56] and with occupational exposure concentrations in healthy young men [57]. Decreased heart rate variability reflects a disturbance of cardiac autonomic function and predicts an increased risk of sudden death. Peters [52] reviewed the association between particulate matter and heart disease and concluded that epidemiological studies have demonstrated coherent associations between daily changes in concentrations of ambient particles and cardiovascular disease mortality, hospital admissions, disease exacerbation in patients with cardiovascular disease and early physiological responses in healthy individuals consistent with a deterioration of the risk factor profile.
9.5
Respiratory Disease
There are few sources of widespread urban air pollution that rival diesel exhaust. The combustion of diesel fuel leads to an emission aerosol whose primary particles are nanosized but which rapidly forms aggregates with a solid carbon core in the 80 nm (accumulation) mode. In Europe, exhaust from motor vehicle traffic is considered to contribute more than 50% of ambient particulate matter (PM10) [59]. For ultrafine particles, the contribution of automobile traffic is even higher. Traffic-related air pollution increases the risk of non-allergic respiratory symptoms and disease. This has been observed in so many epidemiological studies that only one review [60] and one study from The Netherlands are cited here as examples.
Diesel exhaust is typically emitted at ground level, and ambient diesel levels are highest near highways and busy roads. Brunekreef et al. [61] studied children in six areas located near major motorways in The Netherlands and showed that lung function was associated with truck traffic density. The association was stronger in children living closest (<300 m) to the motorways. The results indicated that exposure to
Peterson and Saxon [64] reviewed the prevalence of allergic rhinitis and asthma and found an increase in frequency over the past two centuries. They suggested that certain pollutants, such as those produced from the burning of fossil fuels, which have been shown to enhance in vitro and in vivo IgE production, may be partly responsible for the increased prevalence of allergic respiratory disease.
Laboratory studies in humans and animals have shown that particulate toxic pollutants, particularly diesel exhaust particulates, can enhance allergic inflammation and can induce allergic immune responses. Although road traffic pollution from automobile exhausts may be a risk factor for atopic sensitization, the evidence in support of this view remains contradictory [65, 66]. Some investigators have reported a clear association between the prevalence of allergy and road traffic-related air pollution, whereas such a difference was not observed in other studies.
Asthma is characterized by airway obstruction, with air trapping and increases in
lung residual volume. Increases in alveolar volume would be expected to enhance
diffusional deposition, the primary mechanism of deposition for UFPs, although
impaired alveolar ventilation would counter this increase. A panel study of subjects with asthma [36] found that peak flow varied more closely with the 5-day mean of the UFP number than with the fine particle mass concentration, suggesting that the UFP component of fine particle pollution contributes to airway effects in asthmatics. Penttinen et al. [40] noted that UFP number concentrations tended to be inversely, but not significantly, associated with measures of lung function. However, some epidemiological studies have not found associations between UFP exposure and health effects [41]. Inhaled UFPs have a high predicted deposition efficiency in the pulmonary region [67]. Thus, the expected number of particles retained in the lung with each breath is greater for UFPs than for larger particles. A study on 16 subjects with mild to moderate asthma demonstrated an efficient respiratory deposition of ultrafine particles, especially in the subjects with asthma [68]. Deposition was measured during spontaneous breathing at rest and during exercise. The deposition fraction increased during exercise by particle number and mass concentration and reached a maximum for the smallest particles. When both the increased deposition fraction and the minute ventilation were considered, the total number of particles retained in the lung was 74% greater in subjects with asthma than in healthy subjects. Thus, people with asthma receive a higher total respiratory dose of UFPs for a given exposure, which may contribute to their increased susceptibility to the health effects of air pollution.
The association between particulate air pollution and asthma medication use and symptoms was assessed in a panel study of 53 adult asthmatics in Erfurt, Germany, in the winter of 1996–97. The results suggest that reported asthma medication use and symptoms increase in association with particulate air pollution (0.01–0.1 µm in diameter) and gaseous pollutants such as nitrogen dioxide [38].
9.5.3
Lung Cancer
Large cohort studies in the USA and Europe suggest that air pollution may increase lung cancer risk. For example, the Adventist Health Study found an increased risk of newly diagnosed lung cancers, associated with elevated long-term ambient concentrations of PM10, in a cohort of 6338 non-smoking, white Californian adults followed from 1977 to 1992 [69]. In the Harvard Six Cities Study, air pollution was positively associated with death from lung cancer [21]. Also, in the Cancer Prevention II study of the American Cancer Society, it was estimated that each 10 µg m−3 elevation in fine particulate air pollution is associated with an increase of 8% in lung cancer mortality [24].
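As a worked illustration of such an estimate, the reported 8% increase per 10 µg m−3 can be scaled to other concentration differences if one assumes a log-linear exposure–response relationship (an assumption of this sketch, not a claim of the cited study):

```python
# Scale the reported 8% increase in lung cancer mortality per 10 ug/m^3
# of fine particulate air pollution [24] to another concentration
# difference, assuming a log-linear exposure-response relationship.
def relative_risk(delta_conc_ugm3, rr_per_10=1.08):
    return rr_per_10 ** (delta_conc_ugm3 / 10.0)

print(relative_risk(25.0))  # ~1.21 for a 25 ug/m^3 difference
```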
In Europe, the association between the incidence of lung cancer and long-term air pollution exposure was investigated in a cohort of Oslo men followed from 1972–73 to 1998. During the follow-up period, 418 men developed lung cancer. For a 10 µg m−3 increase in average home address exposure to nitrogen oxides (NOx, a traffic-related gas of urban air pollution) between 1974 and 1978, the risk of developing lung cancer increased by 8%, controlling for age, smoking habits and length of education [70]. To estimate the relationship between air pollution and lung cancer, a nested case–control study was set up within EPIC (European Prospective
9.6
Diseases of the Central Nervous System
Transition metals such as copper, manganese and iron have been associated with pathological lesions of the brain characteristic of a variety of neurodegenerative diseases such as Parkinson's disease, Alzheimer's disease and amyotrophic lateral sclerosis [85]. Metals are essential in the synthesis of DNA and RNA and are also cofactors of numerous enzymes, particularly those involved in respiration. In addition, several modifications indicative of oxidative stress have been described in association with neurons, neurofibrillary tangles and senile plaques in Alzheimer's disease.
These findings became even more important after inhalation experiments with rats by Oberdörster [14] suggested that 13C-labeled nanoparticles with a size of about 35 nm may migrate along the olfactory nerve into the olfactory bulb of the brain after
deposition on the olfactory mucosa in the nasal region. If this observation proves to be a route of entry of nanoparticles into the brain, it would circumvent the tight blood–brain barrier and might play a role in neurodegenerative disease.
In Mexico, neuropathological findings for 32 dogs from Southwest Metropolitan Mexico City, a highly polluted urban region, were compared with those for eight dogs from Tlaxcala, a less polluted control city [86]. The report describes early and progressive alterations in the nasal respiratory and olfactory mucosa. Early changes included expression of nuclear neuronal NF-kappaB and iNOS in cortical endothelial cells occurring at ages 2 and 4 weeks; subsequent damage included alterations of the blood–brain barrier (BBB), degenerating cortical neurons, apoptotic glial white matter cells, deposition of apolipoprotein E (apoE)-positive lipid droplets in smooth muscle cells and pericytes, non-neuritic plaques and neurofibrillary tangles. The authors concluded that persistent pulmonary inflammation and deteriorating olfactory and respiratory barriers may play a role in the degenerative neuropathology observed in the brains of highly exposed dogs.
The greatest exposure to metals is likely to occur in occupational settings such as mining, alloy production and welding. Welding and laser operations are well known for their potential to produce large numbers of nanosized particles [87] (see Table 9.1); for example, manual metal arc welding with covered electrodes releases particles in the size range 20–400 nm, and gas-shielded metal arc welding in the range 10–20 nm.
A review by Tjälve and Henriksson [88] deals with the mechanisms of uptake and transport of metals in the olfactory system. The metals discussed are mainly manganese, cadmium, nickel and mercury. Manganese was found to have a unique capacity to be taken up via the olfactory pathways and to be passed transneuronally to other parts of the brain. It is considered that the occupational neurotoxicity of inhaled manganese may be related to an uptake of the metal into the brain via the olfactory pathways. Airborne manganese levels during welding practice were measured in a study on 97 welders engaged in electric arc welding at a vehicle manufacturer. Ambient manganese levels in the welders' breathing zone were highest inside the vehicle and lowest in the center of the workshop. Serum levels of manganese in welders were about three-fold (p < 0.01) higher than those of controls [89].
The neurotoxicity of manganese has been known since the nineteenth century; manganism was first described by Couper in 1837. Manganese is rapidly cleared from the blood by the liver, but elimination from the central nervous system takes a very long time. The neurological signs of manganism have received close attention because they resemble several clinical disorders collectively described as extrapyramidal motor system dysfunction and, in particular, Parkinson's disease and dystonia. Semchuk et al. [90], in a population-based case–control study in Calgary, Alberta, reported no significant increase in the risk of Parkinson's disease associated with a history of rural exposure to manganese. In contrast, Gorell et al. [91], in a population-based case–control study at the Henry Ford Health System (HFHS), Detroit, MI, found a significant association of Parkinson's disease with more than 20 years of occupational contact with manganese (OR 10.6, 95% CI: 1.06 to 105.83), although only three cases and one control subject had such a lengthy exposure to manganese. The small number reporting such an exposure requires that the association be interpreted with caution.
Racette et al. performed a case–control study [92] that compared the clinical features of 15 career welders with two control groups with idiopathic Parkinson's disease. One control group was ascertained sequentially to compare the frequency of clinical features, and the second control group was sex- and age-matched to compare the frequency of motor fluctuations. The welders were exposed to a mean of 47 144 welding hours. Welders had a younger age at onset of Parkinson's disease (46 years) compared with sequentially ascertained controls (63 years; p < 0.0001). There was no difference in the frequency of tremor, bradykinesia, rigidity, asymmetric onset, postural instability, family history, clinical depression, dementia or drug-induced psychosis between the welders and the two control groups. Parkinsonism in welders was distinguished clinically only by age at onset, suggesting that welding may be a risk factor for Parkinson's disease.
9.7
Particulate Air Pollution at the Workplace
The inhalation of dust at work has historically always been, and still remains, one of the most important causes of work-related ill-health. Dust is responsible for serious and disabling diseases such as pneumoconiosis, interstitial lung disease and fibrosis, lung cancer and asthma. Research related to dust exposure at the workplace has historically focused on the effects on the lung and only recently, probably triggered by environmental epidemiological studies, has started to look for implications regarding the cardiovascular system. For example, in a retrospective cohort study, an increased risk of mortality due to ischemic heart disease (OR 1.32, 95% CI: 1.13 to 1.55) was observed among heavy equipment operators [93] exposed to diesel motor emissions. The paragraphs below summarize the effects of particle inhalation on the lungs, as the impact of dust on the lungs has attracted the most attention over the last 50 years in occupational medicine.
Pneumoconiosis, one of civilization's oldest known occupational respiratory diseases, is caused by the inhalation of dust and is characterized by a reactive reparative process that leads to the formation of nodular fibrotic changes in the lungs. Gradually, the alveoli of the lungs become replaced by fibrotic tissue, causing an irreversible loss of the tissue's ability to transfer oxygen into the bloodstream. The fibrogenic potential of inorganic dusts varies considerably, with silica and asbestos having greater fibrogenic potential than coal dusts, iron and man-made mineral fibers. Silicosis, a condition of fibrosis of the lungs marked by shortness of breath and resulting from prolonged inhalation of crystalline silica dust, is also associated with lung cancer. The hypothesis that diffuse fibrotic disorders of the lung are associated with increased lung cancer risk stems from early observations at autopsy that lung cancer was often associated with fibrosis of the lung. This finding could be substantiated in a meta-analysis of lung cancer and silicosis [94].
The pooled RR estimate for the 23 studies that could be combined was 2.2, with a 95% CI of 2.1 to 2.4. The authors considered the association between silicosis and lung cancer to be causal, either due to silicosis itself or due to a direct effect of the underlying exposure to silica. The IARC concluded that there is sufficient evidence for the carcinogenicity of crystalline silica in humans [95].
In 2006, the IARC in Lyon, France, reassessed the carcinogenicity of carbon black and titanium dioxide; the results will be published as volume 93 of the IARC Monographs. Both substances are produced in particulate form. Exposure to carbon black occurs mainly as aggregates with particle sizes of 50–600 nm. The primary particles of titanium dioxide are typically 200–300 nm in diameter, but larger aggregates and agglomerates are readily formed. Ultrafine grades of titanium dioxide (10–50 nm) are used in sunscreens and plastics to block ultraviolet light. For carbon black and titanium dioxide, the Monograph Working Group of the IARC concluded that the existing epidemiological studies provided inadequate evidence of carcinogenicity, but overall, also taking into account the sufficient evidence of carcinogenicity from toxicological experiments in laboratory animals, carbon black and titanium dioxide were classified as possibly carcinogenic to human beings (Group 2B).
Asbestos is the name given to a group of minerals that occur naturally as bundles of fibers which can be separated into thin threads. These fibers are not affected by heat or chemicals and do not conduct electricity. For these reasons, asbestos has been widely used in many industries. When asbestos fibers are set free and inhaled, exposed individuals are at risk of developing an asbestos-related disease such as asbestosis, lung cancer, mesothelioma of the pleura or peritoneum and other cancers, such as those of the larynx and oropharynx [96]. Asbestos remains the primary occupational carcinogenic substance affecting workers all over the world. Outside the workplace, asbestos is second only to tobacco as an environmental source of cancer.
Carbon nanotubes have attracted a great deal of attention due to their potential technological applications, but their shape and physical appearance, resembling those of asbestos fibers, have also aroused considerable concern. Even though carbon nanotubes consist only of carbon, it does not seem adequate to classify them (and also fullerenes in general) as graphite: varying physical shapes might well be associated with entirely different properties. Toxicity studies on nanomaterials will be extremely complex, as 10, 20, 50 and 500 nm titanium dioxide crystals, for example, will all behave differently. Despite the prominence of carbon nanotubes in nanotechnology, exploration of their interactions with biological materials remains sparse. In part, this reflects the challenge of observing nanotubes in biological environments. Single-walled carbon nanotubes in tissues evade detection by elemental analysis, as they contain only carbon, and often also by electron microscopy. One successful method described recently is near-infrared fluorescence microscopy [97].
Lam et al. [98] and Warheit et al. [99] have published the results of studies of the toxicity of single-walled carbon nanotubes in mice and rats, respectively. Using an intratracheal route of administration, they compared different means of nanotube production with the effects of carbon black and quartz particles. In Lam et al.'s study, the nanotubes were found to produce dose-dependent lung lesions; the effects of carbon black were distinctly different. The study by Warheit et al. was more comprehensive. It showed multifocal pulmonary granulomas, but without evidence of ongoing pulmonary inflammation or cellular proliferation. These effects were different from those of quartz, carbon black and graphite. The conclusion from these two studies was that carbon nanotubes have different toxicological properties from other forms of carbon [100].
Another example of how the physical shape of nanoparticles can impact cellular function was provided by Zhao et al. [101]. C60 fullerenes were found to bind to double-stranded DNA, either at the hydrophobic ends or at the minor groove of the nucleotide. They also bound to single-stranded DNA, deforming the nucleotides significantly. When the DNA molecule was damaged (specifically, when a gap was created by removing a piece of the nucleotide from one helix), fullerenes could stably occupy the damaged site. The authors speculated that this strong association may negatively impact the self-repair process of double-stranded DNA.
There have been two reports [102, 103] describing fibrotic lung disease that developed after exposure to indium tin oxide (ITO). ITO is a sintered alloy containing a large proportion of indium oxide and a small proportion of tin oxide and is used in the making of thin-film transistor liquid crystal displays (LCDs) for television screens, portable computer screens, cellphone displays and video monitors. One patient was engaged in wet surface grinding of ITO targets for 3 years and the other was exposed for 4 years to ITO as an aerosol while making transparent conductive films. Both patients came from the same factory and developed pulmonary fibrosis. One died of bilateral pneumothorax. The autopsy demonstrated interstitial pneumonia with numerous fine particles scattered throughout the lungs. Intrapulmonary deposition of indium and tin in the fine particles was shown by X-ray energy spectrometry. The level of serum indium was extremely high. According to Chonan and Taguchi [104], among 115 workers from the same metal plant, 14 revealed interstitial fibrosis on chest CT.
In experimental and occupational settings, exposure to airborne particles, fibers and fumes has long been recognized as causing fibrotic lung disease, with idiopathic pulmonary fibrosis (IPF) being the most distinct entity. IPF is a progressive and devastating lung disorder with a median survival of 2–4 years after diagnosis [105], yet the course of individual patients is highly variable. In various populations, the prevalence estimates for IPF have ranged from 6 to 32 per 100 000 persons. Approximately two-thirds of cases do not have a known cause (idiopathic), whereas one-third result from known causes such as sarcoidosis, connective tissue disease, complications of certain drug exposures or radiation, and occupational exposures.
In a meta-analysis of six case–control studies conducted in three countries, several
exposures were significantly associated with IPF, including ever smoking (OR 1.58,
95% CI: 1.27 to 1.97), agriculture/farming (OR 1.65, 95% CI: 1.20 to 2.26), livestock
(OR 2.17, 95% CI: 1.28 to 3.68), wood dust (OR 1.94, 95% CI: 1.34 to 2.81), metal
dust (OR 2.44, 95% CI: 1.74 to 3.40) and stone/sand (OR 1.97, 95% CI: 1.09 to
3.55) [106]. Although multiple exogenous agents can initiate an inflammatory
alveolitis and result in interstitial lung disease, it is likely that the underlying
pathogenetic mechanisms that mediate the development and progression of pulmonary fibrosis are similar. The natural history and the pathogenic mechanisms remain
unknown; the long-prevailing hypothesis sustains the idea that chronic inflammation plays an essential role. According to this hypothesis, the alveolar epithelial
alterations are caused by an unresolved inflammatory process. More recently,
however, the research emphasis has shifted from a focus on inflammation to alveolar
epithelial injury and fibrogenesis in fibroblastic foci [107].
New cases of occupational asthma in France are collected by a national
surveillance program, based on voluntary reporting, named Observatoire National
des Asthmes Professionnels (ONAP) [108]. The most frequently incriminated
agents were flour (20.3%), isocyanates (14.1%), latex (7.2%), aldehyde (5.9%),
persulfate salts (5.8%) and wood dust (3.7%). The highest risks of occupational
asthma were found in bakers and pastry makers, car painters, hairdressers and
wood workers. Another voluntary surveillance scheme, SHIELD, for occupational
asthma is located in the West Midlands, a highly industrialized region of the
UK [109]. Spray painters represented the occupation at the highest risk of
developing occupational asthma, followed by electroplaters, rubber and plastic
workers, bakery workers and molders. Although the percentage of reported cases
was low among healthcare workers, there was an increasing trend. Isocyanates
still remained the most common causative agents, with 190 (17.3%) out of the
total 1097 cases reported to the surveillance scheme in 7 years. There was a
decrease in the reported cases due to colophony (from 9.5 to 4.6%) and flour and
wheat (from 8.9 to 4.9%). There was an increase of reported cases due to latex
(from 0.4 to 4.9%) and glutaraldehyde (from 1.3 to 5.6%).
The best evidence to support the hypothesis that it is the ultrafine fraction of PM10
that is responsible for the adverse health effects comes from toxicology [110].
Ultrafine particles have extra toxicity and inflammogenicity compared with fine,
respirable particles of the same material when delivered at the same mass dose.
This has been shown for a range of different materials of generally low toxicity, such
as carbon black and titanium dioxide. Ultrafine particles cause inflammation in the
lungs even when composed of relatively low-toxicity materials. The mechanism of the
induction of inflammation appears to be via oxidative stress and Ca²⁺ signaling
perturbations [111]. In particular, the large surface area of ultrafine particles provides
a unique interface for catalytic reaction of surface-located agents with biological
targets such as proteins and cells [112]. In vivo experiments showed that within hours
after the respiratory system is exposed to nanoparticles, they may appear in many
compartments of the body, including the liver, heart and nervous system. Inhalation
experiments with rats resulted in ultrafine titanium dioxide particles being found on
the luminal side of airways and alveoli, in all major lung tissue compartments and
cells and within capillaries. Particles within cells were not membrane bound and
hence had direct access to intracellular proteins, organelles and DNA, which may
greatly enhance their toxic potential [113].
Current aerosol standards at the workplace are expressed in terms of mass
concentration of particulate matter conforming to a particle size fraction.
Instruments able to measure particles below 100 nm were first introduced for
environmental studies and are not in use for operational supervision due to a
lack of regulations. This is surprising, as nanoparticles had been around the
workplace for a long time before nanotechnology appeared on the scene. Aerosols
in workplace environments may be derived from mechanical processes (e.g. the
breaking or fracture of solid or liquid material) and may come from a variety of
sources such as mining, chemical manufacture, textiles and agriculture. The size
range can be anything from micrometer and submicrometer particles down to
100 nm and below. In a workplace study [114], nanoparticles occurring in
different work processes were measured. Typical examples include welding
fumes, metal fumes, soldering fumes, plasma cutting fumes, plasma spraying
emissions, polymer fumes, vulcanization fumes, powder coating emissions, oil
mists, aircraft engine emissions, bakery oven emissions, meat smokery fumes
and particulate diesel motor emissions. The particles were for the most part the
products of condensation in thermal and chemical reactions, the primary
particles created having a size of only a few nanometers. The most frequently
occurring particle size was between 160 and 300 nm. The total concentration of
all particles in the measurement range 14–673 nm was between 500 000 and
2 500 000 particles per cm³. A comparison of the occurrence of nanoparticles in
different workplace atmospheres is given in Table 9.1 [87].
Most plasma and laser deposition and aerosol processes are performed in
evacuated or at least closed reaction chambers. Therefore, exposure to nanoparticles
is more likely to happen after the manufacturing process itself, except in those cases
of failures during the processing. In processes involving high pressure (e.g. supercritical fluid techniques) or with high-energy mechanical forces, particle release
could occur in the case of failure of the sealing of the reactor or the mills. Furthermore,
many particles, including metallic particles, are highly pyrophoric and there is a
considerable risk of dust explosions [116].
Table 9.1 Occurrence of nanoparticles in different workplace atmospheres [87].

Workplace         Total concentration in measurement      Maximum of number
                  range 14–673 nm (particles cm⁻³)        concentration (nm range)
Outdoor, office   Up to 10 000                            280–520
Silicon melt      100 000                                 17–170
Metal grinding    Up to 130 000                           36–64
Soldering         Up to 400 000                           120–180
Plasma cutting    Up to 500 000                           32–109
Bakery            Up to 640 000                           <45
Airport field     Up to 700 000                           33–126
Hard soldering    54 000–3 500 000                        40–600
Welding           100 000–40 000 000
References
1 G. Oberdörster, J. Ferin, B. E. Lehnert, Environ. Health Perspect. 1994, 102, Suppl 5, 173–179.
2 A. Seaton, W. MacNee, K. Donaldson, D. Godden, Lancet 1995, 345, 176–178.
3 H. E. Wichmann, C. Spix, T. Tuch, G. Woelke, A. Peters, J. Heinrich, W. G. Kreyling, J. Heyder, Health Effects Institute Research Report 2000, Health Effects Institute, Boston, 2001.
4 International Commission on Radiological Protection (ICRP), Human respiratory tract model for radiological protection. ICRP Publication 66. Ann. ICRP 1994, 24, 1–3.
5 D. L. Swift, N. Montassier, P. K. Hopke, K. Karpen-Hayes, Y. S. Cheng, Y. F. Su, J. C. Strong, J. Aerosol Sci. 1992, 23, 65–72.
6 K. E. Driscoll, J. M. Carter, D. G. Hassenbein, B. Howard, Environ. Health Perspect. 1997, 105, Suppl 5, 1159–1164.
7 P. A. Jaques and C. S. Kim, Inhal. Toxicol. 2000, 12, 715–731.
8 C. C. Daigle, D. C. Chalupa, F. R. Gibb, P. E. Morrow, G. Oberdörster, M. J. Utell, M. W. Frampton, Inhal. Toxicol. 2003, 15, 539–552.
9 J. S. Brown, K. L. Zeman, W. D. Bennett, Am. J. Respir. Crit. Care Med. 2002, 166, 1240–1247.
10 A. Nemmar, P. H. Hoet, B. Vanquickenborne, D. Dinsdale, M. Thomeer, M. F. Hoylaerts, H. Vanbilloen, L. Mortelmans, B. Nemery, Circulation 2002, 105, 411–414.
11 W. G. Kreyling, M. Semmler, F. Erbe, P. Mayer, S. Takenaka, H. Schulz, G. Oberdörster, A. Ziesenis, J. Toxicol. Environ. Health A 2002, 65, 1513–1530.
12 G. Oberdörster, Z. Sharp, V. Atudorei, A. Elder, R. Gelein, A. Lunts, W. Kreyling, C. Cox, J. Toxicol. Environ. Health A 2002, 65, 1531–1543.
13 S. Takenaka, E. Karg, C. Roth, H. Schulz, A. Ziesenis, U. Heinzmann, P. Schramel,
14
15
16
17
18
19
20
21
22
23
24 C. A. Pope, R. T. Burnett, M. J. Thun, E. E. Calle, D. Krewski, K. Ito, G. D. Thurston, J. Am. Med. Assoc. 2002, 287, 1132–1141.
25 J. Sunyer, J. Schwartz, A. Tobias, D. Macfarlane, J. Garcia, J. M. Anto, Am. J. Epidemiol. 2000, 151, 50–56.
26 M. S. Goldberg, R. T. Burnett, J. C. Bailar III, R. Tamblyn, P. Ernst, K. Flegel, J. Brook, Y. Bonvalot, R. Singh, M. F. Valois, R. Vincent, Environ. Health Perspect. 2001, 109, Suppl 4, 487–494.
27 H. J. Kwon, S. H. Cho, F. Nyberg, G. Pershagen, Epidemiology 2001, 12, 413–419.
28 S. von Klot, G. Wolke, T. Tuch, J. Heinrich, D. W. Dockery, J. Schwartz, W. G. Kreyling, H.-E. Wichmann, A. Peters, Eur. Respir. J. 2002, 20, 691–720.
29 T. F. Bateson and J. Schwartz, Epidemiology 2004, 15, 143–149.
30 C. A. Pope, R. T. Burnett, M. J. Thun, E. E. Calle, D. Krewski, K. Ito, G. D. Thurston, J. Am. Med. Assoc. 2002, 287, 1132–1141.
31 A. Peters, A. Doring, H. E. Wichmann, W. Koenig, Lancet 1997, 349, 1582–1587.
32 J. Pekkanen, E. J. Brunner, H. R. Anderson, P. Tiittanen, R. W. Atkinson, Occup. Environ. Med. 2000, 57, 818–822.
33 A. Ibald-Mulli, J. Stieber, H. E. Wichmann, W. Koenig, A. Peters, Am. J. Public Health 2001, 91, 571–577.
34 A. Ibald-Mulli, K. L. Timonen, A. Peters, J. Heinrich, G. Wolke, T. Lanki, G. Buzorius, W. G. Kreyling, J. de Hartog, G. Hoek, H. M. Ten Brink, J. Pekkanen, Environ. Health Perspect. 2004, 112, 369–377.
35 H. E. Wichmann, C. Spix, T. Tuch, G. Wolke, A. Peters, J. Heinrich, W. G. Kreyling, J. Heyder, Health Effects Institute Research Report, Health Effects Institute, Boston, 2000, 5–86.
36 A. Peters, H. E. Wichmann, T. Tuch, J. Heinrich, J. Heyder, Am. J. Respir. Crit. Care Med. 1997, 155, 1376–1383.
37 J. Pekkanen, K. L. Timonen, J. Ruuskanen, A. Reponen, A. Mirme, Environ. Res. 1997, 74, 24–33.
51
52
53
54
55
56
57
58
59
60
61
62
74 U. Heinrich, H. Muhle, S. Takenaka, H. Ernst, R. Fuhst, U. Mohr, F. Pott, W. Stober, J. Appl. Toxicol. 1986, 6, 383–395.
75 J. L. Mauderly, Environ. Health Perspect. 1994, 102, Suppl 4, 165–171.
76 R. B. Hayes, T. Thomas, D. T. Silverman, P. Vineis, W. J. Blot, T. J. Mason, L. W. Pickle, P. Correa, E. T. Fontham, J. B. Schoenberg, Am. J. Ind. Med. 1989, 16, 685–695.
77 P. Gustavsson, N. Plato, E. B. Lidstrom, C. Hogstedt, Scand. J. Work Environ. Health 1990, 16, 348–354.
78 K. Steenland, D. Silverman, D. Zaebst, Am. J. Ind. Med. 1992, 21, 887–890.
79 A. Emmelin, L. Nystrom, S. Wall, Epidemiology 1993, 4, 237–244.
80 I. Brüske-Hohlfeld, M. Mohner, W. Ahrens, H. Pohlabeln, J. Heinrich, M. Kreuzer, K. H. Jockel, H. E. Wichmann, Am. J. Ind. Med. 1999, 36, 405–414.
81 I. Brüske-Hohlfeld, Environ. Health Perspect. 1999, 107, Suppl 2, 253–258.
82 R. Bhatia, P. Lopipero, A. H. Smith, Epidemiology 1998, 9, 84–91.
83 International Agency for Research on Cancer, IARC Monographs on the Evaluation of Carcinogenic Risks to Humans, Vol. 46, IARC, Lyon, 1989.
84 P. Vineis, K. Husgafvel-Pursiainen, Carcinogenesis 2005, 26, 1846–1855.
85 A. Campbell, M. A. Smith, L. M. Sayre, S. C. Bondy, G. Perry, Brain Res. Bull. 2001, 55, 125–132.
86 L. Calderon-Garciduenas, B. Azzarelli, H. Acuna, R. Garcia, T. M. Gambling, N. Osnaya, S. Monroy, M. R. Del Tizapantzi, J. L. Carson, A. Villarreal-Calderon, B. Rewcastle, Toxicol. Pathol. 2002, 30, 373–389.
87 C. Möhlmann, Gefahrstoffe – Reinhaltung der Luft 2005, 65, 469–471.
88 H. Tjalve and J. Henriksson, Neurotoxicology 1999, 20, 181–195.
89 L. Lu, L. L. Zhang, G. J. Li, W. Guo, W. Liang, W. Zheng, Neurotoxicology 2005, 26, 257–265.
90 K. M. Semchuk, E. J. Love, R. G. Lee, Can. J. Neurol. Sci. 1991, 18, 279–286.
10
Impact of Nanotechnological Developments on the Environment
Harald F. Krug and Petra Klug
10.1
Problem
Since Feynman's legendary statement, "There's plenty of room at the bottom" [1],
natural scientists in physics, chemistry, electronics and other fields have been
occupied with newly combining the smallest units of matter. Using the resources
of modern analysis, especially atomic force microscopy, not only can the properties of
matter and atoms be examined, but even single atoms can be manipulated. Aside
from the known results, this also gave rise to speculation: the manipulation of matter
at the single-atom level was interpreted in such a way that the possibility exists of
generating engines and machines that are able to replicate themselves and therefore
possibly get out of control. Without wanting to evoke once again the long-running
discussion between Eric Drexler, representative of the hazard
hypothesis, and Richard Smalley, representative of the safety hypothesis [2], the
reactions to this nevertheless show that (i) sensible communication is necessary
in order to point out the real hazards and (ii) accompanying safety-relevant research is
needed in order to identify the real hazards and to face them. What we have to count
on in all probability in the near future is an increased production of nanomaterials and
nanoparticles and, associated with that, their possible release into the air, water and soil.
Bayer, as an example, has produced carbon nanotubes for over 2 years, and the
production capacity will increase over the next 5 years:

2005: 3 tons
2006: 30 tons
2007: 60 tons
2009: 200 tons
2012/13: 3000 tons
Within this book, several examples have been shown where nanomaterials can be
used within the environment. The opportunities for applications are nearly endless;
nevertheless, along with their use, numerous threats may arise when they are released
into the environment and become distributed everywhere. Nanoparticles and nanomaterials are used in various applications from which they reach the water and the soil.
Titanium dioxide and zinc oxide from sunscreens and surface coatings from textiles,
glass or other surfaces may be washed off and contaminate natural water [3]. We are
responsible for these products, their use and their disposal, hence we must be careful in
distributing all these new materials before we know their exact behavior and fate within
the environment. It is obvious that nanomaterials will reach the environment and
exposure is therefore probable. If there is a biological effect within the organisms that
are exposed to these products, then we can postulate a possible risk that has to be
addressed:
risk = f(exposure, hazard)
10.2
Risk Management
The first step in risk management is the identification of potential risks and their causes. A
reasonable risk identification must include all areas of a technology, both the internal and
the external factors (Figure 10.1). In order to achieve this, intensive research is necessary
concerning both health-relevant and environment-relevant questions [4–6].
A very important aspect in the judgment of possible risks from nanotechnological products lies in the differentiation of free and fixed nanoparticles, because
there is obviously a large difference in their mobility. Furthermore, one must
differentiate between particles and materials that are being manufactured as
technical products and those that are created accidentally in technical processes
and are released into the environment (e.g. diesel exhaust, fly ash, catalytic dust,
candle soot). Humans are and have been exposed to such ultrafine particles
(UFPs), which mainly result from combustion processes, since the beginning of
their biological development. Whereas in former times forest fires, volcanoes and
sand- and other storms were the main sources, since the industrial revolution and the
increase in motor traffic a dramatic rise in UFPs in the air has taken place over
the last century.
With the presumed and very fast development in the area of nanotechnological
applications, it must now be taken into account that a further source of such minute
particles will emerge that will reach humans via the environment: either from the
environment to the people or vice versa.
The associated exposure of humans via the respiratory system, nutrition and skin,
and also the direct injection of nanoparticles in the medical sector,
could lead to adverse effects [7–10]. With the knowledge that newly synthesized
nanomaterials possess completely new properties in view of chemical, physical and
electronic applications, entirely new effects on living organisms can be postulated.
For these reasons, the behavior of nanoparticles in the environment and in living
organisms cannot simply be extrapolated; a significant prediction of the toxicity of
nanoparticles on the basis of the knowledge concerning conventional materials
cannot be made. The situation is, in addition to the above-mentioned new effects, not
assessable for further reasons as well.
Hence information about the safety and the possible hazards from nanomaterials
is urgently needed. Toxicologists can, by all means, benefit from previously
performed studies on the effects of ultrafine particles on the environment, since a
multitude of findings already exist there. Since the Middle Ages and earlier there
have been well-documented cases of workplace-related exposure with effects on
health. Workers in mines especially are subject to exposure to inhaled dust of any size
for long periods of their lives, which can lead to pneumoconiosis and fibrosis of the
lung. It has been shown that especially the fractions of ultrafine particles in the air
lead to the greatest effects on health [11–17].
Present scientific knowledge about substances and devices produced using
nanotechnologies precludes going further than identifying hazards (the first step
in risk assessment) and providing some elements of hazard characterization (the
second step).
10.3
Sources of Nanoparticles: New Products
The results of these and many other studies could lead us to expect a cautionary and
understandably biased view towards new sources of nanoparticles. Various products
have already been on the market for a long time (Table 10.1).
Some of these products will sooner or later lead to a rise in the amount of particles
in the environment, even if not all released particles will be in the ultrafine range
under 100 nm. The use of nanotechnological products in all areas of life raises some
important questions.
Table 10.1 Nanoparticulate products already on the market (examples).

Type                    Products (examples)
Metal oxides            Silicon dioxide (SiO2), titanium dioxide (TiO2), aluminum oxide (Al2O3), iron oxide (Fe3O4, Fe2O3), zirconium oxide (ZrO2), zinc oxide (ZnO)
Carbon modifications    Carbon black; fullerenes, e.g. buckminsterfullerene (C60); carbon nanotubes (single-wall and multi-wall); carbon nanowires; various conformations
Data for assessing exposure and the hazards are still lacking, and the gaps in
knowledge must definitely be closed. For this, a new research field has been
established: nanotoxicology [18]. It specializes in analyzing the biological safety
of technical nanostructures and nanomaterials. The same rules apply for this part of
toxicological research as for the hazard assessment of other chemical impacts on the
environment (Figure 10.3).
Within the assessment of relevant environmental problems of xenobiotics
or chemicals, the predicted environmental concentration (PEC) and the predicted
no-effect concentration (PNEC) are of great importance. If the PEC value is low
and the PNEC value is high (PEC/PNEC < 1), a low risk can be calculated and no
measures for reduction are necessary, whereas in the opposite case, a high PEC
value and a low PNEC value (PEC/PNEC > 1), further measures are indicated. On
the one hand this can mean that use will be restricted; on the other, the assessment
can be revised by extended test procedures if these bring the ratio between the two
values below 1 again.
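As a minimal illustration of this decision rule, the following Python sketch computes the PEC/PNEC quotient; the numerical values are purely hypothetical assumptions, not measured data.

# Minimal sketch of the PEC/PNEC risk quotient described above.
# All values are hypothetical and for illustration only.
def risk_quotient(pec: float, pnec: float) -> float:
    """Return PEC/PNEC: < 1 indicates low risk, > 1 indicates that
    further risk-reduction measures (or extended testing) are needed."""
    return pec / pnec

# Assumed values: predicted environmental concentration 0.2 ug/L,
# predicted no-effect concentration 1.5 ug/L.
rq = risk_quotient(pec=0.2, pnec=1.5)
verdict = "low risk, no reduction measures necessary" if rq < 1 else "further measures indicated"
print(f"PEC/PNEC = {rq:.2f}: {verdict}")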
10.4
Production and Use of Nanomaterials
As mentioned above, the production and use of nanomaterials will increase dramatically and all areas of daily life will be affected. Exposure to nanomaterials will become
more probable, whereas an important differentiation will be whether these materials
are coated or uncoated. Coatings change the properties of the material, which has a
direct impact on the particle and its behavior in biological systems. Also of importance in view of possible far-reaching effects is the stability and, along with it, the
survival time of the particles in the environment. Both the environment-relevant
and the health-relevant effects are largely dependent on whether the nanoparticles are
persistent or not. As far as stable particles are concerned, basically the same is true as
for long-lived chemicals: they could, if absorbed by living organisms, accumulate,
concentrate in certain target organs and eventually develop a critical effect if the
dosage reached is sufficiently high. Therefore, precautionary safety measures must
be taken in good time, in order to recognize those stable materials with critical
consequences and to prevent their release. It has been demanded by several
researchers that a number of points must be strictly observed.
These issues can be transferred in this or a similar fashion to the conditions in the
environment; for example, nanoparticles in personal care products reach wastewater
and therefore also the environment, in small but continuous quantities, through
body cleansing [3].
10.5
Workplace and the Environment: Effects and Aspects of Nanomaterials
- Syntheses or fabrication processes can take place at room temperature and under normal pressure in order to save energy.
- The use of non-toxic catalysts leads to the minimal formation of pollutants and reduces material usage and emission.
- Water-based reactions can help save on solvents and reduce contaminants, and an adapted just-in-time production can reduce ecological environmental pollution and over-production.
- Nanoscale information technology will improve the tracking of products and product routes in order to control recycling, further use and end-of-life disposal in an environmentally safe way.
- Nanoscale iron could be applied very efficiently for groundwater treatment, with further possibilities for improvement by using additional metals, such as palladium.
- Various nanomaterials can be used as semiconductor films for producing sensors or photocatalysts, and these sensors could detect organic pollutants and degrade them by photocatalytic reactions.
- Single-molecule detection could help recognize pollutants early so that precautionary measures are possible.
- Nano-building blocks could analyze chemicals by specific reactions; biogenous toxins could be detected and food could be monitored more efficiently.
Lead-off studies are already being published in which carbon nanotubes are used as
sensors for gases [19], metal oxides show a sensitive reaction to moisture or
hydrogen [20], silver polymer nanoparticles are used for the detection of aromatic
hydrocarbons [21] and nanoparticles are used as sorbents of environmental
contaminants [22].
10.6
Distribution of Nanoparticles in Ambient Air
Depending on their production and use, nanoparticles can reach the water or air,
from where they will eventually reach the groundwater or the soil. Furthermore, their
use in disposable articles necessitates increased caution where recycling and waste
management are concerned. Since nanoparticles in the air act more like gas
molecules and hardly sediment at all, they could cover great distances. Currently,
the long-term effects cannot be calculated by any means based on the poor data
available. However, the air is one of the best-examined environmental compartments
and its pollution has been recorded for many decades. The organisms that are
exposed in ambient air are being examined intensively and many studies are
examining the adverse effects of particulate mass in the air (see Chapter 9 for detailed
information). Nevertheless, the particularly hazardous effects of ultrafine particles in the air have only been discovered during the past 10 years [16, 17, 25–27].
It could be assumed that the number of ultrafine particles in the air will increase in
the coming years, first at the workplace and then in the environment. Especially due
to the surface treatment of engineered nanoparticles, release of nanoscale particles
could occur, while oxidic metal particles and carbon nanomaterials normally start to
aggregate soon after their synthesis and form agglomerates, which frequently
sediment faster than the primary particles. Immediately after their formation, the
primary particles act more like a gas or a vapor phase where diffusion processes
prevail (for details, see Chapter 7). The primary particles show high diffusion
coefficients and blend in well in aerosol systems. This is an important aspect in
the control of such minute particles in the air. In closed synthesis systems that show a
leak, it is easier for nanoparticles to escape unnoticed, due to their much higher
mobility, than it is for their larger counterparts.

Table 10.2 Half-life of nanoparticles in air as a function of particle diameter and mass concentration.

Particle diameter (nm)    Half-life at a concentration of:
                          1 g m⁻³      1 mg m⁻³     1 µg m⁻³      1 ng m⁻³
1                         2.20 µs      2.20 ms      2.20 s        36.67 min
2                         12.00 µs     12.00 ms     12.00 s       3.34 h
5                         0.12 ms      0.12 s       2.00 min      33.34 h
10                        0.70 ms      0.70 s       11.67 min     8.10 d
20                        3.80 ms      3.80 s       63.34 min     43.98 d

Due to the faster distribution, the
measurable concentration directly at the leakage is rather low; in return, a larger
number of individuals/workers could be exposed since the particles are distributed
more extensively over a larger area. For this reason, detection systems resembling
gas detectors rather than conventional particle detectors must be used, and in any case
ultra-sensitive systems. On the other
hand, the higher speed of the smaller particles also leads to a larger number of
collisions, which support the tendency for aggregation and agglomeration and
facilitate particle growth. This growth process is directly dependent on the number
of particles in the given volume and their mobility. Although this process runs very
fast (Table 10.2), it must be considered that a newly formed nanoparticle resulting
from these collisions, which is obviously larger than the primary particles, will
nevertheless have at least one of its dimensions smaller than 100 nm.
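The concentration scaling in Table 10.2 can be illustrated with a short Python sketch. It assumes simple bimolecular coagulation, dN/dt = -K N², for which the half-life t½ = 1/(K N₀) is inversely proportional to the initial concentration; this assumption is consistent with the collision-driven growth described above, and the reference value is taken from the 20 nm row of the table.

# Minimal sketch: for bimolecular coagulation dN/dt = -K*N^2 the
# half-life is t_1/2 = 1/(K*N0), i.e. inversely proportional to the
# concentration. Scaling one entry of Table 10.2 (20 nm particles at
# 1 g/m^3: 3.80 ms) reproduces the other columns of that row.
def half_life(t_ref_s: float, c_ref: float, c: float) -> float:
    """Half-life at mass concentration c, scaled from a reference value."""
    return t_ref_s * c_ref / c

t_ref = 3.80e-3  # s, 20 nm particles at 1 g/m^3 (Table 10.2)
for c, label in [(1.0, "1 g/m^3"), (1e-3, "1 mg/m^3"),
                 (1e-6, "1 ug/m^3"), (1e-9, "1 ng/m^3")]:
    print(f"{label}: t_1/2 = {half_life(t_ref, 1.0, c):.3g} s")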
One difficulty, now and also in future studies, is that airborne particles do not
display a uniform population, but rather result from very different sources. Hence,
possibly emitted engineered particles can mix already at the workplace with, for
example, diesel exhaust fumes during packing and loading of nanomaterials [28], so
that assessment and distinct source definition are often difficult. Nevertheless,
there are indications that during the processing of nanomaterials, these same materials
are released into the air [29]. However, these are so far the only current studies that
have dealt with the release of produced nanomaterials. Two further reports deal with
the general aspects of the distribution of ultrane particles in the atmosphere and
at the workplace, but largely particles that are created unintentionally and not
those produced technically [30, 31].
10.7
Distribution of Nanoparticles in Water
In all probability, and viewed over a longer period of time, distribution and exposure
to the environment and to humans in the water and the soil will take place. If the
growth in production in fact moves as fast as is predicted, increased concentrations of
nanomaterials in the groundwater or in the soil could be an essential exposure route
that should be taken into consideration when assessing the risk for the environment [32]. Such products are already in use as titanium dioxide nanoparticles in
sunscreens and paints, as carbon nanotubes in composite materials (car tires) or as
aluminum particles in shampoos [3].
As mentioned above already, different metallic and polymeric nanoparticles can be
used for groundwater remediation (for details, see Chapter 4) and are therefore a
possible source of pollution [33–35].
In this context, two issues are of particular importance: on the one hand, the direct
effects of nanomaterials in the environment must be examined, and on the other, the
routes of the nanomaterials into the environment (and through it, e.g. through
the porous soil matrix) must be clarified. Some studies have shown a large difference
between the mobility of nanoparticles in porous media, depending on whether they
are oxidic particles or carbon particles such as fullerenes [34, 36]. During their
transport in the water or through the soil matrix, even harmless particles could react
with other chemicals or adsorb them and therefore contribute to a possible hazard
themselves [37–39]. Many substances have the capability of sticking to the surface of
nanoparticles [21, 40]; these adsorbed chemicals could then cause a biological effect,
while the nanoparticles themselves may in particular increase the uptake by organisms
or directly lead to adverse effects (Figure 10.4). It is conceivable that the particles themselves or the
bound pollutants after they have been taken up by living organisms cause, for
instance, lysosomal damage, thereby increasing autophagy and in effect cell damage,
as suggested by Moore [41].
Using the fullerenes as an example, it could be shown that the availability and the
mobility of nanomaterials in the water and in the soil depend very strongly on the
physico-chemical properties of the surface of these particles. At different ionic
strengths and pH values, three fullerene preparations and four different oxidic
materials behaved very differently with regard to their transport properties in the
aquifer [34, 36]. Whereas the transport of mineral nanoparticles could be described by
established models for the transport of particles in porous media, the same did not
hold true for the fullerenes since they behaved in a totally unexpected way and showed
unusual properties. Especially that particular form of fullerenes that has recently
caused concern in biological tests [10] is the least mobile one in soil and water tests, so
that we can rather expect a reduced risk here.

Figure 10.4 Distribution and possible reactions of nanoparticles in the aquatic system.

This example is intended to demonstrate that our knowledge about the behavior of nanomaterials in the environment is
still totally insufficient and that the resulting debate about possible risks has not yet
been put on a professional footing. Hence a decision should be made for every single
material, based on the knowledge about the availability and transport in water and/or
soil (Figure 10.5).
The above-mentioned work by Oberdörster [10] showed, for the first time, a direct
effect of fullerenes on fish. After exposure to 0.5 ppm of water-soluble buckyballs
(C60) for 48 h, the fish showed significantly increased levels of lipid peroxidation in
the brain. Furthermore, the gene expression was checked, for example in the liver,
where it was striking that there were genes that were turned on and others that were
turned off. Also, there was evidence of a systemic effect of the fullerenes, despite
the fact that the author herself judged the concentration to be very high and allocated
the fullerenes only moderate toxicity. Unanswered are the questions of how much of
the material was really taken up by the fish and whether this material can really
accumulate in the food chain or in specic organs, such as the brain. Such an
accumulation would then suggest a hazard for other organisms. Therefore, it is
necessary to increase the efforts in order to acquire more information about the
bioaccumulation of nanomaterials, since many of them are not likely to be biodegradable and could therefore, under certain conditions, behave as persistent
pollutants. Some follow-up studies by different authors resulted in controversial
data, as the high capacity for toxic effects of fullerenes could not be documented by the
same group when fullerenes were resuspended directly in water [42]. A solution of
fullerenes in tetrahydrofuran (THF) has been shown to be highly toxic [10, 43], and
we could demonstrate that this depends on the peroxides formed spontaneously in
THF. Hence it is not always the nanomaterials that are the toxic component,
and the solvents or contaminants often have a more dramatic effect on living
organisms, as has been published elsewhere [44, 45].
10.8
Conclusions
On the basis of the level of knowledge displayed so far, five fundamental considerations can be taken into account for the eco-toxicological risk management of
nanomaterials:
1. Research on the toxicology of nanomaterials has so far been more of a description
of the symptoms; hence it is essential to find out more about the biological
mechanisms at the cellular and molecular level.
2. The further development of models and model systems is necessary in order to
understand cellular and physiological processes better and to be able to include the
communication between the cells within the investigations.
3. It should be possible to establish a relationship between molecular, cellular and
pathophysiological end-points and their ecological consequences.
4. A more precise and preventive assessment of hazardous effects and risks of new
developments in the sector of nanotechnology should be possible by strong
improvement of the available data.
5. Of essential relevance is an increasing knowledge of the life-cycle of nanotechnological products, from production through use to disposal.
This includes possible exposure scenarios in addition to the systemic effects in
living organisms; to close knowledge gaps, screening studies may be
preferred over detailed analyses.
Only on the basis of improved knowledge about the potential dangers in the entire
life cycle of the products will a risk assessment be possible and the corresponding
measures can then be implemented in order to reduce a possible hazard.
This was also made clear at a workshop of the National Science Foundation (NSF)
and the Environmental Protection Agency (EPA) in the USA, whose results were
summarized by Dreher [46].
Hence the question remains unanswered of whether all nanomaterials are also
simultaneously "nanonoxes". The entire subject matter of environment and health is
displayed very clearly in a recent review that deals with the possible routes of
nanoparticles in the environment and the exposure routes of organisms [18]. At this
point it is important to state that the currently available data are not sufficient for a
realistic assessment of exposure, hazards and the associated risks. Moreover, nobody
takes into account that there is a natural background exposure to several particle types
(Table 10.3).
Table 10.3 Most frequent elements in the Earth's crust (italics: typically occurring as oxides comparable to engineered nanoparticles).

Element      Concentration (%)
Oxygen       47
Silicon      28
Aluminum     8
Iron         4.5
Calcium      3.5
Potassium    2.5
Sodium       2.5
Magnesium    2
Titanium     0.5
Hydrogen     0.2
Carbon       0.2
Others       <1
References
1 Feynman, R.P. (1960) There's plenty of room at the bottom. Engineering and Science, 23, 22–36.
2 Baum, R. (2003) Nanotechnology: Drexler and Smalley make the case for and against 'molecular assemblers'. Chemical & Engineering News, 81 (48), 37–42.
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
I
Basic Principles and Theory
1
Phase-Coherent Transport
Thomas Schäpers
1.1
Introduction
1.2
Characteristic Length Scales
The elastic mean free path le is a measure of the distance between subsequent elastic
scattering events. Such events occur due to the fact that the conductor is not ideal but
rather contains irregularities in the lattice, such as impurities or dislocations. The
scattering can be considered as elastic, which means that the electron energy is
conserved. A typical example is the scattering of an electron at a charged impurity.
If we assume a stationary scattering center, then effectively no energy is transferred
during the scattering event, whereas the direction of the electron momentum can
change greatly.
In order to determine the elastic mean free path le within the Drude model, one
must first calculate the average time between elastic scattering events, τe. Its value
can be extracted from the electron mobility μe, given by

\mu_e = \frac{e\,\tau_e}{m^*}    (1.1)
The quantities m* and e are the effective electron mass and the elementary charge,
respectively. The electron mobility is a measure of the increase of the drift velocity
vdrift in a conductor with increasing electric field E: vdrift = μe E. In practice, the
electron mobility is determined from the electron concentration ne and the Drude
conductivity σ0 by

\mu_e = \frac{\sigma_0}{e\,n_e}    (1.2)
Experimentally, the electron concentration ne is obtained from Hall measurements, while the conductivity σ0 is deduced from resistance measurements at zero
magnetic field.
Effectively, only electrons at the Fermi energy EF contribute to the electron
transport. Therefore, the elastic mean free path le is given by the length an electron
with the Fermi velocity vF propagates until it is elastically scattered after the elastic
scattering time τe:

l_e = \tau_e\, v_F    (1.3)
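As a numerical illustration of Equations 1.1–1.3, the following Python sketch estimates τe, vF and le for a high-mobility AlGaAs/GaAs 2-D electron gas. The mobility and sheet density are assumed, typical-order values, not data from the text.

# Minimal sketch of Eqs. (1.1)-(1.3) with assumed 2-D electron gas values.
import math

e = 1.602176634e-19                 # elementary charge (C)
hbar = 1.054571817e-34              # reduced Planck constant (J s)
m_star = 0.067 * 9.1093837015e-31   # effective mass in GaAs (kg)

mu = 100.0    # electron mobility (m^2/Vs), i.e. 10^6 cm^2/Vs (assumed)
n_e = 3e15    # sheet electron concentration (m^-2, assumed)

tau_e = mu * m_star / e              # elastic scattering time, Eq. (1.1)
k_F = math.sqrt(2 * math.pi * n_e)   # Fermi wavevector of a 2-D gas
v_F = hbar * k_F / m_star            # Fermi velocity
l_e = tau_e * v_F                    # elastic mean free path, Eq. (1.3)
print(f"tau_e = {tau_e:.2e} s, v_F = {v_F:.2e} m/s, l_e = {l_e*1e6:.1f} um")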
In addition to the elastic scattering discussed above, electron scattering can also be
connected to an energy transfer. A typical example is the effect of lattice vibrations on
electron transport. An electron moving within a crystal will be scattered by these
lattice vibrations and either lose or gain energy, depending on whether it excites the
lattice vibrations or is excited by them. As an energy transfer occurs, these scattering
processes are considered to be inelastic. Similar to the previous discussion, one can
define an inelastic scattering length lin as a measure for the length between inelastic
scattering events. Besides electron–phonon scattering, electron–electron scattering
is another possible process, where a considerable amount of energy can be exchanged
between both scattering partners [3].
1.2.3
Phase-Coherence Length
Table 1.1 Classification of transport regimes.

Regime                  Criteria
Diffusive, classical    λF ≪ le ≪ L, lφ < le
Diffusive, quantum      λF ≪ le ≪ L, lφ > le
Ballistic, classical    λF ≪ L < le, lφ
Ballistic, quantum      λF ≈ L < le, lφ
until the phase is broken, implying that the characteristic phase-breaking time τφ is
larger than the elastic scattering time τe. Owing to the diffusive motion during the
time τφ, the phase-coherence length lφ must be expressed by

l_\varphi = \sqrt{D\,\tau_\varphi}    (1.4)
Here, D is the diffusion constant defined as

D = \frac{1}{d}\, v_F^2\, \tau_e    (1.5)

with d the dimensionality of the system. Typical values for the phase-coherence
length of an AlGaAs/GaAs 2-D electron gas below 1 K are of the order of several
micrometers [5].
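Continuing the numerical sketch above, Equations 1.4 and 1.5 give the phase-coherence length once a phase-coherence time is assumed; the values below are again assumptions of typical order, chosen to land in the quoted micrometer range.

# Minimal sketch of Eqs. (1.4) and (1.5) for a 2-D system (d = 2).
v_F, tau_e = 2.4e5, 3.8e-11   # m/s and s (assumed, as in the previous sketch)
tau_phi = 5e-11               # s, assumed low-temperature phase-coherence time

D = v_F**2 * tau_e / 2        # diffusion constant, Eq. (1.5) with d = 2
l_phi = (D * tau_phi) ** 0.5  # phase-coherence length, Eq. (1.4)
print(f"D = {D:.2f} m^2/s, l_phi = {l_phi*1e6:.1f} um")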
1.2.4
Transport Regimes
By comparing le and lφ with the dimension L of the sample and the Fermi
wavelength λF, different transport regimes can be classified, and these are summarized in Table 1.1. For the case where the elastic mean free path le is smaller than
the dimensions of the sample, many elastic scattering events occur while the
electrons propagate through the structure. The carriers are traveling randomly
(diffusively) through the crystal, as illustrated in Figure 1.1a. If the phase-coherence
length lφ is shorter than the elastic mean free path le, the transport is considered
as classical. In contrast, if lφ > le, then quantum effects owing to the wave nature
of the electrons can be expected. This diffusive regime is thus called the quantum
regime. As illustrated in Figure 1.1b, in the case that le is larger than the dimensions
of the sample, the electrons can traverse the system without any scattering;
this regime is called ballistic. Depending on the magnitude of the Fermi wavelength
λF in comparison to the dimension of the sample, the transport can either be
regarded as classical ballistic or quantum ballistic. In the following section, ballistic
transport will first be discussed, and later the transport phenomena in the diffusive
regime.
1.3
Ballistic Transport
In this section transport in the ballistic transport regime will be discussed; that is,
where the elastic mean free path exceeds the dimensions of the sample. First, the
Landauer–Büttiker formalism is explained, where the resistance of a sample is
described in terms of transmission and reflection probabilities, which is a very
convenient scheme to analyze the transport in the ballistic regime. Subsequently, the
quantized conductance of a split-gate point contact will be discussed, making use of
the Landauer–Büttiker formalism.
1.3.1
Landauer–Büttiker Formalism
The resistance

R_{mn,kl} = \frac{U_{kl}}{I_{mn}}    (1.6)

is defined by the voltage measured between contacts k and l and the current flowing
between contacts n and m.
In order to keep things simple, the discussion is restricted to a conductor
connected via ideal one-dimensional (1-D) ballistic leads to four corresponding
reservoirs. The geometry of the sample is depicted in Figure 1.2. The ballistic wires
should consist of only a single 1-D channel. The reservoirs with the corresponding chemical
potentials μi (i = 1, . . . , 4) serve as source and drain for carriers flowing in and out of
the conductor. At zero temperature, the i-th reservoir can supply electrons to the
conductor up to a maximum energy of μi. Each carrier from a lead which reaches
a reservoir is absorbed by that reservoir, irrespective of the phase and energy of the
carriers. As discussed above, inelastic scattering is forbidden within the leads, so that
electrons once injected into the conductor maintain their energy until they reach one
of the reservoirs.
As an example, we will study the current contributions in the 1-D lead 1, which
results in the net current I1. The current injected from reservoir 1 is given by:
I_{\mathrm{inj}} = e \int_0^{\mu_1} D_{1D}(E)\, v(E)\, dE    (1.7)
where v(E) is the velocity of the electrons. As the wire is 1-D, the density of states of a
1-D system must be inserted, which is given by
D_{1D}(E) = \frac{2}{h\, v(E)}    (1.8)
So far, only the states propagating from reservoir 1 are considered, and the density
of states used here is half of the commonly known value because there is only one
direction of propagation [1]. It can be seen directly that the product of the 1-D density
of states D1D(E) and the velocity v(E) is constant, and therefore the current leaving
reservoir 1 has the following simple form:
I_{\mathrm{inj}} = \frac{2e}{h}\,\mu_1    (1.9)
Part of the current supplied by reservoir 1 will be reflected back into the conductor.
If Rii is defined as the reflection probability for a reflection of carriers from lead i back
into lead i, then the current reflected into lead 1 can be written as
I_R = \frac{2e}{h}\,R_{11}\,\mu_1    (1.10)
In addition, electrons are transmitted from the other three leads into lead 1. By
defining the transmission probability from lead j into lead i (i ≠ j) as Tij, we arrive at
the following expression for the current transmitted into lead 1:
I_T = \frac{2e}{h} \sum_{j=2}^{4} T_{1j}\,\mu_j    (1.11)
By summing all of these contributions it can be seen that the net current flowing in
lead 1 is finally given by

I_1 = I_{\mathrm{inj}} - I_R - I_T = \frac{2e}{h}\left[(1 - R_{11})\,\mu_1 - \sum_{j=2}^{4} T_{1j}\,\mu_j\right]    (1.12)

or, more generally, the current in lead i is given by

I_i = \frac{2e}{h}\left[(1 - R_{ii})\,\mu_i - \sum_{j \neq i} T_{ij}\,\mu_j\right]    (1.13)
By using Equation 1.13, the above-defined resistance R_{mn,kl} can be determined for
given reflection and transmission probabilities of the sample. According to the initial
definition of R_{mn,kl}, as given in Equation 1.6, the net current I_{mn} flows between
contacts n and m. The leads k and l do not carry a net current in the case of an ideal voltage
measurement. The voltage drop U_{kl} is given by the difference of the electrochemical
potentials divided by e: (μk − μl)/e. In the following section, Equation 1.13 will serve
as a basis to describe the transport properties of a split-gate quantum point contact.
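To make the bookkeeping of Equation 1.13 concrete, the following Python sketch solves the four-terminal equations for a hypothetical, arbitrarily chosen transmission matrix: a current is driven between leads 1 and 2 while leads 3 and 4 act as ideal voltage probes, and the four-terminal resistance R12,34 is extracted. The numbers are illustrative only, not data for a real sample.

# Minimal sketch of Eq. (1.13) for a four-terminal conductor.
# T[i, j] is the transmission probability from lead j into lead i
# (zero-based indices); the diagonal holds the reflections R_ii.
import numpy as np

T = np.array([[0.2, 0.5, 0.2, 0.1],
              [0.5, 0.2, 0.1, 0.2],
              [0.2, 0.1, 0.2, 0.5],
              [0.1, 0.2, 0.5, 0.2]])

# I_i = (2e/h) * sum_j M[i, j] * mu_j, with M built from Eq. (1.13):
# M[i, i] = 1 - R_ii and M[i, j] = -T_ij.
M = np.diag(1 - np.diag(T)) - (T - np.diag(np.diag(T)))

# Unit current (in units of 2e/h) enters lead 1 and leaves lead 2;
# leads 3 and 4 are ideal voltage probes with zero net current.
I = np.array([1.0, -1.0, 0.0, 0.0])

# M is singular (potentials are defined only up to a constant), so we
# fix the gauge by grounding lead 4: replace the last equation.
A = M.copy()
A[3, :] = [0.0, 0.0, 0.0, 1.0]
rhs = I.copy()
rhs[3] = 0.0
mu = np.linalg.solve(A, rhs)

# Four-terminal resistance R_12,34 = (mu_3 - mu_4)/I in units of h/2e^2.
print(f"R_12,34 = {mu[2] - mu[3]:.3f} (h/2e^2)")  # -> 0.119 for this T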
1.3.2
Split-Gate Point Contact
In split-gate quantum point contacts the transport is limited to only one dimension.
This is obtained by first restricting the propagation of the electrons to a plane. In these
so-called two-dimensional electron gases (2DEGs), the carriers are confined at an
interface of two different semiconductor layers. A typical example of a 2DEG realized
in an AlGaAs/GaAs layer system is depicted in Figure 1.3. Here, the carriers are
located at the AlGaAs/GaAs interface and, owing to the conduction band offset
between AlGaAs and GaAs, a triangular quantum well is formed at the interface. The
electrons in the quantum well are supplied by an n-type δ-doped (very thin) layer. In
order to prevent ionized impurity scattering, the electrons in the quantum well are
separated from the δ-doped layer by an undoped AlGaAs spacer layer. Using this
scheme, very large electron mobilities and thus very long elastic mean free paths of
the order of several micrometers can be achieved.
A further restriction of the electron propagation to only one dimension can be
realized by using split-gate point contacts [10, 11]. As illustrated in Figure 1.4, two
opposite gate fingers are separated by a distance of a few hundreds of nanometers.
Split-gate electrodes are usually prepared by using electron beam lithography. Since
the Fermi wavelength λF of a 2-D electron gas is typically a few tens of nanometers,
the separation of the split-gates is comparable with λF. The length of the channel
formed by the gate electrodes is usually smaller than 1 µm, and thus smaller than the
elastic mean free path le. According to the classification introduced in Section 1.2, the
transport can be considered as ballistic.
By applying a sufficiently large negative voltage to the gate fingers, the underlying
2-D electron gas is depleted underneath the gate fingers (see Figure 1.4a). Only a
small opening between the gate fingers remains for the electrons to propagate from
one side to the opposite side; however, by varying the gate voltage it is possible to
control the effective width of the opening. An increase of the negative bias voltage
enlarges the depletion area and thus reduces the opening width. At sufficiently large
negative bias voltages the opening can even be closed completely (pinch-off).
Owing to the depletion area underneath the split-gate electrodes, it can be assumed
that the electrons in the 2DEG are confined in a potential well along the y-axis, while
the free propagation takes place along the x-axis. If the potential profile in the plane of
the 2DEG induced by the split-gate electrodes is expressed by V(x, y), the Hamiltonian is

H = -\frac{\hbar^2}{2m^*}\left(\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2}\right) + V(x, y)    (1.14)
In order to determine the precise shape of the potential V(x, y) as a function of the
gate voltage, elaborate self-consistent simulations are required [12]. However, for
most applications it is sufficient to assume an approximated potential profile. For low
gate voltages an appropriate approximation is a rectangular potential profile, while for
higher negative gate voltages the potential well can be approximated by a parabolic
potential. As an example, we will consider here the latter potential shape. Due to the
short length of the channel formed by the split-gates, the 2-D potential profile will be
saddle-shaped. However, if the potential along the constriction is smooth
(adiabatic limit), it is sufficient to consider only the narrowest point of the channel,
which can be expressed by

V(y) = \frac{1}{2}\, m^* \omega_0^2\, y^2 + V_0    (1.15)
The resulting energy eigenvalues form 1-D subbands:

E_n(k_x) = E_n^0 + \frac{\hbar^2 k_x^2}{2m^*}, \qquad n = 1, 2, 3, \ldots    (1.16)

with

E_n^0 = (n - 1/2)\,\hbar\omega_0    (1.17)
the energy eigenvalues of the harmonic oscillator. By changing the gate voltage at the
split-gate electrodes, the effective width of the opening can be adjusted. In the
parabolic approximation ω0 is increased if a more negative gate voltage is applied, and
this leads to an increased separation of the energy eigenvalues. As a consequence,
fewer levels are occupied up to the Fermi energy (see Figure 1.5a and b).
The current through the constriction follows from the Landauer–Büttiker relations, Equation 1.13, applied to the two leads:

\frac{h}{2e}\,I = (1 - R_{11})\,\mu_1 - T_{12}\,\mu_2    (1.18)

\frac{h}{2e}\,I = -(1 - R_{22})\,\mu_2 + T_{21}\,\mu_1    (1.19)
At zero magnetic field (B = 0), the transport is time-inversion invariant so that the
following relationships hold:

T_{12} = T_{21} \equiv T = 1 - R_{11} = 1 - R_{22}    (1.20)
Thus, finally we arrive at the expression for the conductance of the constriction:

G = \frac{I}{U} = \frac{I\,e}{\mu_1 - \mu_2} = \frac{2e^2}{h}\,T    (1.21)

For an ideally transmitting channel (T = 1), the conductance takes the quantized value

G = \frac{2e^2}{h}    (1.22)
If several subbands are occupied, the transmission probability generalizes to a sum over all subband channels:

T = \sum_{m,n}^{N} T_{ij,mn}    (1.23)

where Tij,mn denotes the transmission probability from the n-th subband of lead j into
the m-th subband of lead i. If ideal transmission and no intersubband scattering is
assumed, then the total transmission probability of a 1-D channel with N subbands is
given by T = N. Thus, each subband contributes 2e²/h to the conductance, so that
the total conductance of a constriction with N subbands occupied is given by
G = \frac{2e^2}{h}\,N    (1.24)
Figure 1.6 Resistance and conductance of an AlGaAs/GaAs split-gate point contact as a function of the gate voltage. The
conductance is plotted in units of 2e²/h.
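The counting argument behind Equations 1.16, 1.17 and 1.24 can be sketched in a few lines of Python: a subband n is occupied if (n − 1/2)ℏω0 + V0 lies below the Fermi energy, and each occupied subband adds one conductance quantum. The energies used are assumed values for illustration only.

# Minimal sketch of Eqs. (1.16), (1.17) and (1.24): count occupied
# parabolic subbands and convert to conductance plateaus.
def conductance_quanta(E_F, hbar_omega0, V0=0.0):
    """Number N of subbands with (n - 1/2)*hbar_omega0 + V0 < E_F,
    giving G = (2e^2/h) * N for ideal transmission."""
    N = 0
    while (N + 0.5) * hbar_omega0 + V0 < E_F:
        N += 1
    return N

E_F = 10.0  # meV, assumed Fermi energy
for hw0 in (1.0, 2.0, 4.0):  # meV; larger hw0 mimics more negative gate voltage
    print(f"hbar*omega0 = {hw0} meV -> G = {conductance_quanta(E_F, hw0)} x 2e^2/h")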
1.4
Weak Localization
Interference effects of electron waves due to phase coherent transport can be seen even
in large samples, where the phase coherence length is much smaller than the
dimensions of the sample. This effect, called weak localization, results in an increased
resistance compared to the classically expected value [15, 16]. Weak localization is
observed if the temperature is sufficiently low so that the phase-coherence length lφ is
larger than the elastic scattering length le. As we will see below, the effect of weak
localization depends strongly on the dimensionality of the system: the lower the
dimension, the stronger the effect, so that weak localization is most pronounced in
quasi-one-dimensional wire structures. In order
to illustrate the general mechanisms leading to weak localization, we will first
introduce a simple model. Later on, more quantitative expressions for the conductivity
corrections will be given.
1.4.1
Basic Principles
C_j = c_j\, e^{i\varphi_j}    (1.25)
Here, φj is the phase shift that the electron acquires on its way from A to Q while
propagating along path j. Often, there are many possible paths for an electron to
propagate between A and Q. For example, for free electron propagation the phase
accumulation along the path j can be calculated from the action Sj by
\varphi_j = \frac{S_j}{\hbar}    (1.26)

where the action along path j is

S_j = \int_{t_A}^{t_Q} dt\, L(\mathbf{r}, \dot{\mathbf{r}}, t)    (1.27)
with

L(\mathbf{r}, \dot{\mathbf{r}}, t) = \frac{m}{2}\,\dot{\mathbf{r}}^2    (1.28)
the Lagrangian function of a freely propagating electron. Here, tA is the time when the
electron starts at A, and tQ the time when it arrives at Q. The quantities r and ṙ are the
position and velocity of the particle, respectively. However, the electron acquires not
only a phase shift during free propagation but also well-defined phase shifts by the
elastic scattering events, so that the total phase accumulated along the path is the sum
of both contributions. The total amplitude for the propagation from A to Q is given by
the sum of the amplitudes Cj of all indistinguishable paths. Finally, the total probability
PAQ for an electron to be transported from A to Q is determined by the square of the
total amplitude:

P_{AQ} = \left| \sum_j c_j\, e^{i\varphi_j} \right|^2    (1.29)
In systems with a large number of possible paths, the phases φj are usually randomly
distributed, and therefore the wave nature should have no effect on the electron transport due to averaging. Nevertheless, an increase of the resistance compared to the
classical transport is observed, and this is a result of closed loops (see Figure 1.7a,
trajectory 3a). Along these loops, an electron can propagate in two opposite orientations
with the corresponding complex amplitudes C1,2 = c1,2 exp(iφ1,2). The probability of an electron returning to the starting point of the loop (O) is given by

P_{OO} = |C_1 + C_2|^2 = |C_1|^2 + |C_2|^2 + 2\,\mathrm{Re}(C_1^* C_2)    (1.30)

For the time-reversed paths of a loop at zero magnetic field the two amplitudes are identical, C1 = C2, so that

P_{OO} = 4\,|C_1|^2    (1.31)
In the following section, it is briefly sketched how a value for the correction of the
conductivity due to weak localization can be obtained quantitatively [18]. For the weak
localization effect we are interested only in those processes where the electrons
return to their starting points. The discussion will first be restricted to a 2-D system,
for example a 2-D electron gas in an AlGaAs/GaAs heterostructure. A larger number
of scattering centers increases the probability for backscattering of the electrons. The
larger the number of scattering centers, the smaller the diffusion constant; as a
consequence, one obtains for the return probability due to diffusive motion 1/(4πDt).
For the total return probability, it must be ensured that the phase of the electrons is
preserved up to the time τφ, which provides a pre-factor exp(−t/τφ). Furthermore, it is
required that the electron is at least once elastically scattered; thus, a pre-factor
[1 − exp(−t/τe)] must be included. In total, the correction to the conductivity can be
expressed as [19, 20]:

\Delta\sigma_{2D} = -\frac{2\hbar}{m^*}\,\sigma_0 \int_0^\infty dt\,\frac{1}{4\pi D t}\,\left(1 - e^{-t/\tau_e}\right) e^{-t/\tau_\varphi} = -\frac{e^2}{2\pi^2\hbar}\,\ln\!\left(1 + \frac{\tau_\varphi}{\tau_e}\right)    (1.32)
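The magnitude of the zero-field correction in Equation 1.32 is easily evaluated numerically; the scattering times in the following sketch are assumed, typical-order values.

# Minimal numerical check of the zero-field result in Eq. (1.32).
import math

e = 1.602176634e-19
hbar = 1.054571817e-34

tau_e, tau_phi = 1e-13, 1e-11   # assumed elastic and phase-breaking times (s)
delta_sigma = -(e**2 / (2 * math.pi**2 * hbar)) * math.log(1 + tau_phi / tau_e)
print(f"Delta sigma_2D = {delta_sigma:.3e} S (per square)")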
In a magnetic field, the vector potential A modifies the phases of the two counter-propagating partial waves:

C_1 \rightarrow C_1 \exp\!\left(i\,\frac{e}{\hbar}\oint \mathbf{A}\cdot d\mathbf{l}\right) = C_1 \exp\!\left(i\,\frac{2\pi\Phi}{\Phi_0}\right)    (1.35)
Here, Φ = BS is the magnetic flux penetrating the enclosed area S of the loop, with
Φ0 = h/e the magnetic flux quantum. For the propagation in the opposite orientation
one obtains

C_2 \rightarrow C_2 \exp\!\left(-i\,\frac{2\pi\Phi}{\Phi_0}\right)    (1.36)
The phase difference accumulated between both time-reversed paths is therefore

\Delta\varphi = 4\pi\,\frac{\Phi}{\Phi_0}    (1.37)
Thus, the magnetic length lm is defined by lm = √(ℏ/eB). As outlined above, for a flux Φ0 the phase difference
between time-reversed paths is already significant. The characteristic magnetic
relaxation time τB related to lm can be estimated from the relationship lm = √(D τB),
in analogy to Equation 1.4 defining lφ. The expression that quantitatively describes the
increase of the conductivity with increasing magnetic field is given by [21, 22]:

\Delta\sigma_{2D}(B) - \Delta\sigma_{2D}(0) = \frac{e^2}{2\pi^2\hbar}\left[\Psi\!\left(\frac{1}{2} + \frac{\tau_B}{2\tau_\varphi}\right) - \Psi\!\left(\frac{1}{2} + \frac{\tau_B}{2\tau_e}\right) + \ln\frac{\tau_\varphi}{\tau_e}\right]    (1.38)
where Ψ(x) is the digamma function. The exact expression for τB, which must be
inserted into Equation 1.38, is given by τB = lm²/(2D). At zero magnetic field, the
relevant maximum size of the loops at which the phase coherence is broken is given by
lφ². In a finite magnetic field, weak localization is suppressed if a noticeable phase shift
between time-reversed loops is accumulated. This is the case for loops with an area of
about lm². By comparing both relationships, it is clear that the magnetic field has a
significant effect on the conductance for lφ² ≈ lm². This relationship defines a critical
magnetic field
Figure 1.8 Comparison of the weak localization effect in a two-dimensional (upper graph) and a one-dimensional electron gas
(lower graph) in AlGaAs/GaAs. For the one-dimensional
structures a much higher magnetic field is required to suppress
the weak localization effect. (Reprinted with permission from [23].
Copyright (1987) by the American Physical Society.)
B_c \approx \frac{\hbar}{2e\, l_\varphi^2}    (1.39)
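Equation 1.38 can be evaluated directly with the digamma function from SciPy, as in the following sketch. The diffusion constant and scattering times are assumed values; the output illustrates how the weak localization correction is progressively suppressed, i.e. the conductivity recovers, with increasing field.

# Minimal sketch of Eq. (1.38) using scipy's digamma function.
import numpy as np
from scipy.special import digamma

e = 1.602176634e-19
hbar = 1.054571817e-34

tau_e, tau_phi = 1e-13, 1e-11   # assumed scattering times (s)
D = 0.1                         # assumed diffusion constant (m^2/s)

for B in (1e-4, 1e-3, 1e-2, 1e-1):   # magnetic field (T)
    l_m2 = hbar / (e * B)             # magnetic length squared
    tau_B = l_m2 / (2 * D)            # magnetic relaxation time
    dsig = (e**2 / (2 * np.pi**2 * hbar)) * (
        digamma(0.5 + tau_B / (2 * tau_phi))
        - digamma(0.5 + tau_B / (2 * tau_e))
        + np.log(tau_phi / tau_e))
    print(f"B = {B:.0e} T: Delta sigma(B) - Delta sigma(0) = {dsig:.3e} S")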
In the one-dimensional case of a wire of width W, the corresponding field-dependent correction can be written as

\Delta\sigma_{1D}(B) = -\frac{e^2}{\pi\hbar}\,\frac{\sqrt{D}}{W}\left[\left(\frac{1}{\tau_\varphi} + \frac{1}{\tau_B}\right)^{-1/2} - \left(\frac{1}{\tau_\varphi} + \frac{1}{\tau_e} + \frac{1}{\tau_B}\right)^{-1/2}\right]    (1.40)

with the magnetic relaxation time in this case given by τB = 3lm⁴/(W²D). It should be noted
that, at zero magnetic field, Equation 1.32 is recovered. Furthermore, a closer
inspection of Bc reveals that, if the width of the wire is reduced, the critical field is
increased, ensuring that the weak localization effect is preserved up to much higher
magnetic fields compared to the 2-D case. This is confirmed by the measurements
shown in Figure 1.8, where the magnetoresistance peak is wider in the 1-D case. In
wire structures based on high-mobility 2-D electron gases, the elastic mean free path
le may be larger than the width of the wire: W < le. In this ballistic regime, the
electrons propagate without any scattering between the wire boundaries. As illustrated in Figure 1.9b, owing to diffusive boundary scattering the typical closed loops
will self-interact. As both parts of the loop area are traversed in opposite orientation,
the net flux is basically cancelled [20]. Clearly, the flux cancellation results in a further
increase of the critical field.

Figure 1.9 (a) Typical closed trajectory in a dirty metal one-dimensional conductor (le ≪ W ≪ lφ). (b) Typical closed
trajectory in a narrow one-dimensional structure with W < le.
Here, diffusive boundary scattering results in loops which self-interact. The net flux is cancelled in this configuration.
1.5
Spin-Effects: Weak Antilocalization
So far, the effect of spin on the electron interference has been neglected, and this
approach is valid as long as the spin orientation is conserved. However, in many
materials the spin changes its orientation while the electron propagates along the
closed loops responsible for the weak localization effect.
Let |s⟩ be the initial spin state, this generally being a
superposition of the spin-up |↑⟩ and spin-down |↓⟩ states. In principle, there are
two possibilities of how the spin orientation can be changed:
- The Elliott–Yafet mechanism. Here, the potential profile of the scattering centers can lead to spin–orbit coupling; this results in a spin rotation while the electron is scattered at the impurities (see Figure 1.10a).
- The so-called Dyakonov–Perel' mechanism, where the spin precesses while the electron propagates between the scattering centers (see Figure 1.10b). The origin of the spin precession may either be a lack of inversion symmetry of the crystal lattice (as in zinc blende crystals, giving rise to the Dresselhaus contribution) or a structural inversion asymmetry of the confining potential (the Rashba contribution).
When the electron traverses a closed loop in the forward direction (f), the accumulated spin rotation transforms the spin state according to

|s_f\rangle = U\,|s\rangle    (1.41)
where U is the corresponding rotation matrix. For propagation along the loop in the
backwards direction (b), the final spin state is given by

|s_b\rangle = U^{-1}\,|s\rangle    (1.42)
Here, use is made of the fact that the rotation matrix of the counter-clockwise
propagation is simply the inverse of U. For the interference between the clockwise and
counter-clockwise electron waves, not only the spatial component is relevant but also
the interference of the spin component:

\langle s_b | s_f \rangle = \langle U^{-1} s | U s \rangle = \langle s | U\,U | s \rangle = \langle s | U^2 | s \rangle    (1.43)
The nal expression was obtained by making use of the fact that U is a unitary
matrix: U1 U , with U the adjoint (complex conjugated and transposed) matrix
of U. Weak localization and thus constructive interference is recovered if the spin
orientation is conserved in the case that U is the unit matrix 1.
However, if the spin is rotated during the electron propagation along a loop, in general no constructive interference can be expected; moreover, a different interference is expected for each loop. Interestingly, averaging over all possible trajectories even leads to a reversal of the weak localization effect: instead of an increase in the resistance, a decrease occurs [22, 27, 28]. As the sign of the quantum mechanical correction to the conductivity is reversed, this effect is referred to as weak antilocalization.
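The sign reversal can be made plausible with a small Monte Carlo experiment (a sketch of ours, not taken from the chapter): averaging the spin interference factor ⟨s|U²|s⟩ of Equation 1.43 over Haar-random SU(2) rotations gives −1/2, that is, a reversed and halved interference contribution compared to the spin-conserving case, where the factor is +1.

import numpy as np

rng = np.random.default_rng(1)
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)

def random_su2(rng):
    # Haar-random SU(2) spin rotation built from a uniform unit quaternion
    q = rng.normal(size=4)
    q /= np.linalg.norm(q)
    return q[0] * np.eye(2) + 1j * (q[1]*sx + q[2]*sy + q[3]*sz)

s = np.array([1.0, 0.0], dtype=complex)    # initial spin state |s> = |up>
vals = []
for _ in range(100_000):
    U = random_su2(rng)
    vals.append(s.conj() @ U @ U @ s)      # spin interference factor <s|U^2|s>, Eq. (1.43)
print(np.mean(np.real(vals)))              # -> approximately -0.5: reversed sign, half magnitude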
The weak antilocalization measurements of a set of InGaAs/InP wires are shown in Figure 1.11. In contrast to the weak localization effect, an enhanced conductivity is found at B = 0. However, if a magnetic field is applied, the weak antilocalization effect is gradually suppressed. Notably, important parameters characterizing the spin scattering and spin precession can be extracted from weak antilocalization measurements. In fact, detailed information on the Rashba and Dresselhaus contributions in a particular material can be obtained by fitting the experimental curves to the appropriate theoretical model. It should be noted that both contributions are important for the spin field-effect transistor, as introduced in Chapter 3 of this volume and Chapter 5 of Volume 4 of this series (Bandyopadhyay, S., Monolithic and Hybrid Spintronics. In: Schmid, G. (ed.), Nanotechnology, Vol. 4, Chapter 5).
1.6
Altshuler–Aronov–Spivak Oscillations
The fact that, in a metallic conductor, closed loops are responsible for the reduction of the resistance when a magnetic field is applied raises the question: is it possible to observe resistance oscillations due to interference if the shape of the closed loops is restricted by a fixed, well-defined geometry? In the following sections it will be shown that these oscillations, the so-called Altshuler–Aronov–Spivak oscillations, can indeed be observed in ring-shaped conductors if the rings are penetrated by a magnetic flux.
A series of interconnected ring-shaped conductors is shown schematically in
Figure 1.12, where the enclosed magnetic flux Φ is the same for each ring. Thus, the phase shift Δφ between time-reversed paths, as given by Equation 1.37, is approximately the same in all rings. By using Equation 1.30, the total amplitude in a loop is given by [30, 31]:
$$P = \left|C_1\exp\!\left(i2\pi\frac{\Phi}{\Phi_0}\right) + C_2\exp\!\left(-i2\pi\frac{\Phi}{\Phi_0}\right)\right|^2 = 2|C_1|^2\left[1+\cos\left(4\pi\Phi/\Phi_0\right)\right], \qquad (1.44)$$

where $|C_1| = |C_2|$ has been used for the time-reversed partial waves.
From the equation given above it can be concluded that the resistance in this type of structure should oscillate with a period of Φ₀/2.
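As a quick numerical illustration (with a hypothetical loop size, not one of the samples discussed below), the field period ΔB of the Altshuler–Aronov–Spivak oscillations follows from the flux period Φ₀/2 divided by the enclosed area:

h = 6.62607015e-34        # J s
e = 1.602176634e-19       # C
Phi0 = h / e              # magnetic flux quantum, about 4.14e-15 Wb

A = (500e-9)**2           # enclosed area of a hypothetical 500 nm x 500 nm loop (m^2)
dB_AAS = Phi0 / (2 * A)   # flux period Phi0/2 translated into a field period
print(f"AAS oscillation period: {dB_AAS*1e3:.1f} mT")   # about 8.3 mT

Smaller loops give larger field periods, which is exactly the trend noted for the chain and mesh structures discussed below.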
The first demonstration of this type of weak localization resistance oscillation was provided by Sharvin and Sharvin [31], who evaporated a thin Mg film onto the surface of a quartz filament. The magnetic field was applied in axial orientation with respect to the filament, while the current was flowing through the Mg film along the filament. A comparison with the cross-section of the filament confirmed that the resistance oscillations indeed had a period of Φ₀/2.
Besides cylindrical samples, Altshuler–Aronov–Spivak oscillations can also be observed in planar quantum wire networks, similar to the structure shown in Figure 1.12 [33]. The closed trajectories are realized by squares connected to a chain or to a mesh, as shown in Figure 1.13 (inset).
The relative resistance difference ΔR/R₀ as a function of a perpendicular magnetic field is shown in Figure 1.13. Pronounced oscillations are found in the chain as well as in the mesh structure. For smaller square elements the oscillation period is larger, as a larger magnetic field is required to generate a magnetic flux of Φ₀/2. In order to observe Altshuler–Aronov–Spivak oscillations, the phase-coherence length must be larger than the circumference of the squares. For the chain structure, the total resistance is given by adding the contributions of the single squares. Depending on the type of material, each ring produces either a maximum or a minimum at B = 0, depending on the absence or presence of spin scattering. This ensures that, after summation of the contributions of all elements of the chain, the Altshuler–Aronov–Spivak oscillations are not averaged out.
1.7
The Aharonov–Bohm Effect
The phase difference Δφ of two electron waves propagating along the upper and the lower branches of the ring (paths 1 and 2 in Figure 1.14a) and interfering at the end point Q of the ring is given by

$$\Delta\varphi = \psi_1 - \psi_2 + \frac{e}{\hbar}\int_{\mathrm{path}\,1}\mathbf{A}\cdot d\mathbf{l} - \frac{e}{\hbar}\int_{\mathrm{path}\,2}\mathbf{A}\cdot d\mathbf{l} = \Delta\psi + \frac{e}{\hbar}\oint\mathbf{A}\cdot d\mathbf{l}. \qquad (1.45)$$
Here, ψ₁ and ψ₂ are the phases that the electron waves acquire during their propagation along path 1 and path 2 at zero magnetic field in the interior of the ring. In contrast to weak localization, the two paths are different and therefore not time-reversed. Since the impurity configurations in the two branches usually differ, the accumulated phases also differ between the branches.
We will now return to the Aharonov–Bohm effect itself. By making use of rot A = B, Equation 1.45 results in

$$\Delta\varphi = \Delta\psi + \frac{e}{\hbar}\int \mathbf{B}\cdot d\mathbf{f} = \Delta\psi + 2\pi\frac{\Phi}{\Phi_0}. \qquad (1.46)$$
The surface integral over B corresponds to the magnetic flux Φ penetrating the ring. As illustrated in Figure 1.14a, the area penetrated by the magnetic field does not need to be as large as the opening of the ring. As can be inferred from Equation 1.46, a phase shift of 2π is acquired if the magnetic flux is changed by one magnetic flux quantum Φ₀. Thus, the period is twice as large as the period of the Altshuler–Aronov–Spivak oscillations discussed above.
For the first experiments demonstrating the Aharonov–Bohm effect, a set-up was used in which the electrons were not exposed to a magnetic field [36, 37]. These experiments were performed with an electron beam in a vacuum and a shielded magnet coil. In the solid state, the Aharonov–Bohm effect was first demonstrated in Au rings, with the diameter of the ring structure being less than 1 µm and a wire width of a few tens of nanometers [38]. In metallic ring structures, the magnetic field cannot usually be prevented from penetrating the wire itself. Nevertheless, the vector potential A is still responsible for the effect on the electron interference pattern. A typical ring structure defined in an AlGaAs/GaAs semiconductor heterostructure is shown in Figure 1.14b [35]. One important difference between the Aharonov–Bohm experiments on nanoscaled rings and experiments using electron beams in vacuum is that, in the former case, the electrons are usually scattered many times within the conductor before reaching the opposite side of the ring. Thus, the elastic mean free path is most often smaller than the ring size. In addition, in metallic or semiconducting ring structures the phase-coherence length is of the order of the ring diameter at low temperatures, and consequently many electrons lose their phase memory while propagating through the ring. This is the reason why the oscillation amplitude is considerably smaller than the total resistance of the structure. As can be seen in Figure 1.15, pronounced Aharonov–Bohm oscillations were observed in ring structures based on 2-D electron gases in an In0.77Ga0.23As/InP heterostructure [39]. A comparison with the enclosed area of the ring confirmed that the oscillation period corresponded to one magnetic flux quantum Φ₀. Owing to the low effective electron mass and the high mobility in these heterostructures, the phase-coherence length can exceed 1 µm at temperatures below 1 K, and consequently large oscillation amplitudes are achieved. Resistance modulations of up to 12% have been observed in this type of structure.
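A minimal two-path model reproduces the h/e periodicity. In the sketch below (our illustration; ring size, modulation depth and impurity phase are assumed values), the conductance is modulated as cos(2πΦ/Φ₀ + Δψ); note that the field period is twice that of the Altshuler–Aronov–Spivak oscillations for the same enclosed area.

import numpy as np

h = 6.62607015e-34
e = 1.602176634e-19
Phi0 = h / e

A = np.pi * (0.5e-6)**2        # area of a hypothetical ring of 1 um diameter (m^2)
a, dpsi = 0.05, 1.2            # 5% modulation depth and impurity phase (assumed)

B = np.linspace(0, 50e-3, 1001)                          # field sweep (T)
G = 1 + a * np.cos(2*np.pi * B * A / Phi0 + dpsi)        # normalized two-path conductance
print(f"Aharonov-Bohm period: {Phi0/A*1e3:.2f} mT")      # about 5.3 mT for this area
print(f"conductance modulation: {G.min():.3f} .. {G.max():.3f} (in units of the mean)")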
and gate voltage. The symmetric pattern is due to the fact that, although four terminals are used during the measurement, the measurement is effectively a two-terminal measurement. For such a measurement it can be shown in general that the resistance is symmetric under reversal of the magnetic field: R(B) = R(−B) [9]. The reason for the two-terminal nature of the measurement is the coupling of the two contacts on each side, which are therefore not independent, as would be required for a genuine four-terminal measurement.
1.8
Universal Conductance Fluctuations
The reason why each sample shows a different conductance fluctuation pattern becomes clear when it is realized that very few scattering centers (i.e., impurities) are present in very small structures. In fact, if the sample possesses only a few impurities, it is the particular impurity configuration that governs the transport properties. Moreover, if only a finite number of scattering centers is present, an ensemble average cannot be applied in the theoretical description, as this would not take into consideration the particular spatial distribution of the scattering centers. Of course, fluctuations may be observed not only in ring structures but also in a single quantum wire, as they originate from the spatial configuration of the scattering centers in the wire. The conductance fluctuations of a single Au wire are shown in Figure 1.17.
The fluctuations shown in Figure 1.17 cover an interval of approximately e²/h; this fluctuation amplitude, as proven by detailed theoretical analysis, is universal [47, 48]. For a qualitative explanation of the physical origin of the conductance fluctuations, the reader is referred to the above-mentioned Aharonov–Bohm effect. As illustrated in Figure 1.18a, the electron is able to propagate along a certain number of paths in order to cross the wire.
The total transmission probability results from the squared amplitude of the sum over all possible trajectories. Among these trajectories, a limited number of paths may be found which meet again after a certain distance. If a magnetic field is applied, such pairs of paths become penetrated by a magnetic flux Φ; if the magnetic field is then varied, the superposition of the electron waves of two paths (which cross twice) leads to a variation in the transmission probability due to the Aharonov–Bohm effect. In contrast to a well-defined ring structure, the encircled areas differ among the various locations in the wire, such that a different Aharonov–Bohm period is developed for each area. Superposition of the different quasi Aharonov–Bohm rings then produces an irregular conductance pattern [43]. It is important that only a limited number of trajectories exists, so that an effective averaging out of the interference contributions does not occur. Writing the conductance in terms of the reflection probabilities R_mn of the N channels,

$$G = \frac{e^2}{h}\left(N - \sum_{m,n} R_{mn}\right), \qquad (1.48)$$

the variance of the conductance follows as
$$\mathrm{var}(G) = \left(\frac{e^2}{h}\right)^2 \mathrm{var}\!\left(\sum_{m,n} R_{mn}\right) = \left(\frac{e^2}{h}\right)^2 N^2\,\mathrm{var}(R_{mn}), \qquad (1.49)$$

where the reflection probabilities of the N² channel pairs (m, n) are taken to be uncorrelated.
According to Equation 1.49, the variance of R_mn must first be calculated in order to obtain the variance of G. Each reflection probability is the squared sum of the complex amplitudes C_i of all trajectories i contributing to reflection from channel n into channel m, $R_{mn} = |\sum_i C_i|^2$. The variance of R_mn is given by

$$\mathrm{var}(R_{mn}) = \langle R_{mn}^2\rangle - \langle R_{mn}\rangle^2. \qquad (1.51)$$

The last term is the square of the average reflection probability ⟨R_mn⟩, which can be expressed by

$$\langle R_{mn}\rangle = \sum_{i,j}\langle C_i C_j^*\rangle. \qquad (1.52)$$

Since the phases of different trajectories are uncorrelated, only the diagonal terms survive the averaging,

$$\langle C_i C_j^*\rangle = \langle |C_i|^2\rangle\,\delta_{ij}, \qquad (1.53)$$

so that

$$\langle R_{mn}\rangle = \sum_i \langle |C_i|^2\rangle. \qquad (1.54)$$
For the first term in Equation 1.51, the following can be written:

$$\langle R_{mn}^2\rangle = \sum_{ijkl}\langle C_i C_j C_k^* C_l^*\rangle = \sum_{ijkl}\langle |C_i|^2\rangle\langle |C_j|^2\rangle\left(\delta_{ik}\delta_{jl} + \delta_{il}\delta_{jk}\right) = 2\sum_{ij}\langle |C_i|^2\rangle\langle |C_j|^2\rangle = 2\langle R_{mn}\rangle^2. \qquad (1.55)$$
Thus, var(R_mn) = ⟨R_mn⟩². In the diffusive limit almost all electrons are reflected, so that the average reflection probability per channel pair is

$$\langle R_{mn}\rangle \approx \frac{1}{N}, \qquad (1.56)$$

and inserting this into Equation 1.49 yields

$$\mathrm{var}(G) \approx \left(\frac{e^2}{h}\right)^2 \qquad (1.57)$$
for the variance of the conductance in the diffusive limit. As can be seen here, the conductance fluctuations are of the order of e²/h. The universal magnitude of the conductance fluctuations is found, for example, in the measurement shown in Figure 1.17.
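The central step above, var(R_mn) = ⟨R_mn⟩², can be checked directly with a random-phase model (our sketch; the number of trajectories and their amplitude moduli are arbitrary assumptions):

import numpy as np

rng = np.random.default_rng(7)
M, trials = 50, 50_000            # number of interfering trajectories and of samples

mags = rng.uniform(0.5, 1.5, size=M)               # fixed amplitude moduli |C_i|
phases = rng.uniform(0, 2*np.pi, size=(trials, M))
C = mags * np.exp(1j * phases)                     # amplitudes with random phases
R = np.abs(C.sum(axis=1))**2                       # reflection probability |sum_i C_i|^2

print(np.var(R) / np.mean(R)**2)   # -> close to 1, i.e. var(R_mn) = <R_mn>^2

The ratio approaches 1 up to a correction of order 1/M, which stems from the diagonal terms neglected in Equation 1.55.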
1.8.3
Fluctuations in Long Wires
Until now, it has been assumed that the wire length is smaller than the phase-coherence length, but the question must also be addressed as to what happens if the length L of the wire exceeds the phase-coherence length $l_\varphi$. In this situation, the wire may be cut into $N = L/l_\varphi$ phase-coherent pieces connected in series (see Figure 1.19). Each of these segments produces resistance fluctuations δR₀, so that the total resistance fluctuations are given by

$$\delta R = \sqrt{N}\,\delta R_0. \qquad (1.58)$$
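The consequence for the conductance follows by elementary circuit reasoning (a standard argument, sketched here under the assumption that the phase-coherent segments add like ordinary series resistors): since the N segments give a total resistance $R = NR_0$,

$$\delta G \simeq \frac{\delta R}{R^2} = \frac{\sqrt{N}\,\delta R_0}{N^2 R_0^2} = N^{-3/2}\,\delta G_0, \qquad N = \frac{L}{l_\varphi},$$

so that in a long wire the universal fluctuation amplitude of order $e^2/h$ is reduced by the factor $(l_\varphi/L)^{3/2}$.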
In semiconductor structures, the electron concentration and thus the Fermi energy can be controlled by means of a gate electrode. An impression of how the Fermi energy affects the conductance fluctuations can be gained by comparing the change of E_F with the characteristic correlation energy E_Th (Thouless energy).
First, an insight must be obtained into the nature of the correlation energy E_Th. For the sake of simplicity, an ideal system can be assumed where any scattering is neglected. Along the length L, the phase develops as

$$\varphi = kL. \qquad (1.61)$$

Two states that differ in wavenumber by Δk therefore accumulate a phase difference

$$\Delta\varphi = \Delta k\,L. \qquad (1.62)$$

The corresponding energy separation is

$$\delta E = \frac{dE}{dk}\,\Delta k = \hbar v_F\,\frac{\Delta\varphi}{L} = \frac{\hbar\,\Delta\varphi}{t_{Th}}. \qquad (1.63)$$

Here, t_Th is the time that the wave requires to propagate along the length L. The Thouless energy, E_Th, is defined as the energy difference at which the two states become uncorrelated, that is, when the phase difference Δφ is equal to 1:

$$E_{Th} = \frac{\hbar}{t_{Th}} = \frac{\hbar v_F}{L}. \qquad (1.64)$$
Thus, the Thouless energy is connected to the time $t_L = L/v_F$ that an electron wave requires to cover the dimension L of the system. Up to now, only a ballistic system has been considered; in the diffusive regime, the corresponding characteristic time is given by $t_{Th} = L^2/D$. Only if the Fermi energy is changed by a value comparable to E_Th is the next energy level reached and the conductance changed to a value uncorrelated with the conductance at the previous energy. The correlation energy for small quantum wires has a relatively large value, which would not be expected if a large number of uncorrelated trajectories were averaged.
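To get a feeling for the magnitudes, the following sketch evaluates $E_{Th} = \hbar/t_{Th}$ with the diffusive traversal time $t_{Th} = L^2/D$ for two hypothetical wire lengths (the diffusion constant is an assumed, typical value):

hbar = 1.054571817e-34    # J s
e = 1.602176634e-19       # C
D = 0.02                  # assumed diffusion constant (m^2/s)

for L in (0.5e-6, 2.0e-6):            # wire lengths (m)
    E_Th = hbar * D / L**2            # Thouless energy, Eq. (1.64) with t_Th = L^2/D
    print(f"L = {L*1e6:.1f} um:  E_Th = {E_Th/e*1e6:5.1f} ueV")

For the shorter wire E_Th is about 53 µeV, compared to roughly 3 µeV for the longer one: the correlation energy indeed grows rapidly as the wire shrinks.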
If the temperature is increased, the Fermi distribution becomes smeared out. However, as long as the temperature remains sufficiently low that the width of the Fermi distribution is smaller than E_Th, the maximum fluctuation amplitude is maintained. When the smearing of the Fermi distribution exceeds E_Th, approximately N = k_BT/E_Th uncorrelated segments contribute. As these N segments are uncorrelated, the fluctuation amplitude decreases as 1/√N, and thus as 1/√T. This behavior was confirmed experimentally, as shown in Figure 1.20 [46]. At low temperature the fluctuation amplitude is initially found to be constant, but it decreases if the temperature is increased further.
1.9
Concluding Remarks
coherence in nanostructures has attracted much attention in connection with solid-state quantum computation, where the maintenance of phase coherence is a critical issue. Furthermore, spin-related phenomena are a subject of current interest, as phase-coherent spin manipulation is regarded as an interesting option for future electronic devices.
Clearly, this chapter can provide only a brief overview of the most important phenomena connected with phase-coherent transport in nanostructures. For further information, the reader is referred to various textbooks [1, 51, 52] and reviews [2, 18, 46].
References
1 Datta, S. (1995) Electron Transport in Mesoscopic Systems, Cambridge University Press, Cambridge.
2 Lin, J.J. and Bird, J.P. (2002) Journal of Physics: Condensed Matter, 14, R501.
3 Giuliani, G.F. and Quinn, J.J. (1982) Physical Review B, 26, 4421.
4 Altshuler, B.L., Aronov, A.G. and Khmelnitsky, D.E. (1981) Solid State Communications, 39, 619.
5 Choi, K.K., Tsui, D.C. and Alavi, K. (1987) Physical Review B, 36, 7751.
6 Landauer, R. (1957) IBM Journal of Research and Development, 1, 223.
7 Landauer, R. (1987) Zeitschrift für Physik B, 68, 217.
8 Büttiker, M., Imry, Y., Landauer, R. and Pinhas, S. (1985) Physical Review B, 31, 6207.
9 Büttiker, M. (1986) Physical Review Letters, 57, 1764.
10 van Wees, B.J., van Houten, H., Beenakker, C.W.J., Williamson, J.G., Kouwenhoven, L.P., van der Marel, D. and Foxon, C.T. (1988) Physical Review Letters, 60, 848.
11 Wharam, D.A., Thornton, T.J., Newbury, R., Pepper, M., Ahmed, H., Frost, J.E.F., Hasko, D.G., Peacock, D.C., Ritchie, D.A. and Jones, G.A.C. (1988) Journal of Physics C: Solid State Physics, 21, L209.
12 Laux, S.E., Frank, D.J. and Stern, F. (1988) Surface Science, 196, 101.
13 Szafer, A. and Stone, A.D. (1989) Physical Review Letters, 62, 300.
2
Charge Transport and Single-Electron Effects
in Nanoscale Systems
Joseph M. Thijssen and Herre S.J. van der Zant
2.1
Introduction: Three-Terminal Devices and Quantization
In electronics, charges are manipulated by sending them through devices which have a few terminals: a source, which injects the charge, and a drain, which removes the charge from the device. Occasionally, a third terminal, called a gate, is present, and this is used to manipulate the charge flow through the device. The gate neither injects charge into, nor removes it from, the device. Three-terminal devices are standard elements of electronic circuits, where they act as switches or as amplifying elements. Semiconductor-based three-terminal switches are responsible for the tremendous increase in computer speed achieved over the past few decades.
Feynman, in his famous lecture [1], pointed out that the possible scale reduction from the standards of that period was still enormous, and he also suggested that quantum mechanical behavior might result in a different mode of operation of such devices, which might open new horizons for applications. Indeed, as we now know, two aspects become important when the size of the device is reduced: the first is the quantum mechanical behavior, and the second is the quantization of the charges flowing into and out of the device. It is interesting to analyze how the energy scales at which the two effects become noticeable depend on the device size.
The charge quantization is subtle in view of quantum mechanics: in principle, the charge carried by an electron is distributed in space. In quantum mechanics, a single charge may be distributed according to |ψ(r)|², where ψ(r) is the quantum mechanical wave function, and this leaves open the possibility of having a fractional charge inside the device. Therefore, the discrete nature of charge does not seem to play a role in the charge transport. However, if the device were to be uncoupled from its surroundings, we would only find integer charges residing on it. This puzzle is solved by realizing that the expectation value of the electrostatic energy, which must be included in the Hamiltonian governing the electron behavior, is dominated by the charge distribution which occurs most of the time. It can be shown that the charge within a device that is weakly coupled to its surroundings is always very close to an integer. Therefore, in order to observe Coulomb effects resulting from the discreteness of the electron charge, it is necessary to consider devices that are weakly coupled to the surroundings.
For the charge quantization, the energy scale associated with the discreteness of the electron charge is given by

$$E_C = \frac{e^2}{2C},$$

where C is the capacitance of the device. This is the energy needed to add a unit charge to the device; it is called the charging energy. Taking as an estimate the capacitance of a sphere with radius R, we have

$$E_C = \frac{e^2}{8\pi\varepsilon_0 R} = \frac{1}{2R}\,E_H, \qquad (2.1)$$

where, in the rightmost expression, R is given in atomic units (Bohr radii), as is the energy (E_H is the atomic unit of energy; it is called the Hartree and is given by 27.212 eV). In Section 2.4, we shall present a more detailed analysis for the case where the device is (weakly) coupled to a source, drain and gate.
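A one-line estimate based on Equation 2.1 already shows why small islands are needed for room-temperature Coulomb effects (the radii below are illustrative choices):

import numpy as np

e = 1.602176634e-19       # C
eps0 = 8.8541878128e-12   # F/m

def charging_energy_eV(R):
    # Charging energy e^2/(8*pi*eps0*R) of an isolated sphere, Eq. (2.1), in eV
    return e / (8 * np.pi * eps0 * R)

for R in (2e-9, 10e-9, 100e-9):       # sphere radii (m)
    print(f"R = {R*1e9:5.0f} nm:  E_C = {charging_energy_eV(R)*1e3:6.1f} meV")

A 2 nm sphere gives E_C of roughly 360 meV, far above k_BT of about 25 meV at room temperature, whereas a 100 nm island (E_C of about 7 meV) already requires cryogenic temperatures.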
The energy scale for quantum effects is given by the distance between the energy levels of an isolated device. As a rough estimate, we consider the particle-in-a-(cubic)-box problem, with energy levels separated by a level splitting Δ given by

$$\Delta = \mathrm{const}\cdot\frac{\hbar^2}{mL^2} = \mathrm{const}\cdot\frac{1}{L^2}\,E_H, \qquad (2.2)$$

where m is the electron mass and L is the box size (which must be given in atomic units in the rightmost expression). The first multiplicative constant is of order 1; it depends on the geometry and on the details of the potential.
In the case of carbon nanotubes, the device is much smaller in the lateral direction than along the tube axis. In such cases it is useful to distinguish between the two sizes. The lateral size leads to a large energy splitting, while the longitudinal splitting may become vanishingly small. For a metallic nanotube, the level spacing associated with the tube length L is

$$\Delta = \frac{h v_F}{2L},$$

where v_F is the Fermi velocity.
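Inserting numbers (with a nanotube Fermi velocity of about 8 × 10⁵ m/s, an assumed, commonly quoted value) reproduces the meV scale listed in Table 2.1:

h = 6.62607015e-34    # J s
e = 1.602176634e-19   # C
v_F = 8.1e5           # assumed Fermi velocity of a metallic nanotube (m/s)

for L in (0.3e-6, 1.0e-6):            # tube lengths (m)
    delta = h * v_F / (2 * L)         # level spacing Delta = h v_F / (2L)
    print(f"L = {L*1e6:.1f} um:  Delta = {delta/e*1e3:.2f} meV")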
When studying transport through a small island that is weakly coupled to a source and a drain, information can be obtained about the quantum level splitting Δ and the charging energy E_C, provided the energy of the particles flowing through the device can be controlled with a precision high enough to resolve these energy splittings. Pauli's principle states that electrons can only flow from an occupied state in the source to an empty state in the drain. The separation between empty and occupied states in the leads is only sharp enough when the temperature is sufficiently low. It can be seen that a low operation temperature is essential for observing the quantum and charge quantization effects. The energy scale associated with the temperature is given by k_BT, so we must have

$$k_BT \ll \Delta,\ E_C.$$

Note that for molecular devices, with their relatively large values of Δ and E_C, quantum and charge quantization effects should still be observable at room temperature. In a typical metallic island, Δ ≪ k_BT, and the Coulomb blockade dominates the level separation. In this case we speak of a classical dot (see also Chapter 21 of this volume).
In the present chapter we explain the different aspects of charge transport, with emphasis on those devices in which the level spacing and the charging energy play an essential role in the transport properties. This is the case in quantum dots and in many molecular devices.
Table 2.1 Typical charging energies and level spacings for various three-terminal devices.

                          E_C             Δ
GaAs quantum dot          0.2–2 meV       0.02–0.2 meV
Carbon nanotube           3 meV           3 meV
Molecular transistor      >0.1 eV         >0.1 eV
2.2
Description of Transport
Although it often cannot be used for the transport itself, the single-particle picture is still suitable for the bulk-like systems to which the device is coupled, and for the narrow leads which may be present between the island and the bulk reservoirs. These elements are described in Chapter 1, and their properties will be recalled only briefly here, with emphasis on the issues needed in the context of the present chapter.
2.2.1.1 The Reservoirs
The reservoirs are bulk-like regions where the electrons are in equilibrium. These regions are maintained at a specified temperature, and the number of electrons in them is variable, as they are connected to the voltage source and to the leads to the device (see below). The electrons in these reservoirs are therefore distributed according to Fermi functions with a given temperature T and chemical potential μ:

$$f_{FD}(E) = \frac{1}{\exp[(E-\mu)/k_BT] + 1}.$$

This function falls off from 1 at low energy to 0 at high energy. For (μ − E₀) ≫ k_BT, where E₀ is the ground state energy, it reduces to a sharp step down from 1 to 0 at E = μ, and μ can in that case be identified with the Fermi energy (the highest occupied single-particle energy level).
In order to have a current running through the device and the leads, the source and
drain reservoirs are connected to a voltage source. A bias voltage causes the two leads
to have different chemical potentials.
2.2.1.2 The Leads
Sometimes it is useful to consider the leads as a separate part of the system, in
particular for convenience of the theoretical analysis. The leads are channels, which
may be considered to be homogeneous. They form the connection between the
reservoirs and the island (see below). They are quite narrow and relatively long.
Electrons in the leads can still be described by single-particle orbitals. If the leads have a discrete or continuous translational symmetry, the states inside them are Bloch waves. By separation of variables, we can write the states as

$$e^{ik_z z}\,u_T(x,y)$$

with energy

$$E = E_T + \frac{\hbar^2 k_z^2}{2m}.$$

It is seen that the states can be written as a transverse state u_T(x, y), which contributes an amount E_T to the total energy, times a plane wave along z. The quantum numbers of the transverse wave function u_T(x, y) are used to identify a channel.
In this chapter, usually no distinction is made between reservoirs and leads: rather, both are simply described as baths in equilibrium at a particular temperature and chemical potential (which may be different for the source and drain lead). However, for a theoretical description of transport it is often convenient to study the scattering of incoming states into outgoing states; in that case, the simple and well-defined states of the leads facilitate the description.
2.2.1.3 The Island
This is the part of the system which is small in all directions (although in a nanotube, the transverse dimensions are much smaller than the longitudinal one); hence, this is the part where the Coulomb interaction plays an important role. To understand the device, it is useful to take the isolated island as a reference. In that case we have a set of quantum states with discrete energies (levels). The density of states of the device then consists of a series of delta functions corresponding to the bound-state energies.
Now imagine there is a knob by which we can tune the coupling to the leads. This coupling is given in terms of the rate Γ/ħ at which electrons cross the tunnel barriers separating the island from the leads. The transport through the barriers is a tunneling process which is fast and, in most cases, can be considered elastic: the energy is conserved in the tunneling process. Generally speaking, when the island is coupled to the leads (or directly to the reservoirs), the levels broaden as a result of the continuous density of states in the leads (or reservoirs), and they may shift due to charge transfer from the leads to the island. Two limits can be considered. For weak coupling, Γ ≪ E_C, Δ, the density of states should be close to that of the isolated device: it consists of a series of peaks, the width of which is proportional to Γ. Sometimes we wish to distinguish between the couplings to the source and drain leads, and we then use Γ_S and Γ_D, respectively. For strong coupling, that is, Γ ≫ E_C, Δ, the density of states is strongly influenced by that of the leads, and the structure of the spectrum of the isolated device is much more difficult to recognize in the density of states of the coupled island.
If we keep the number of electrons on the island fixed, we still have the freedom of distributing the electrons over the energy spectrum. The only constraint is that no more than one electron can occupy a quantum state, as a consequence of Pauli's principle. The change in total energy of the device is then mainly determined by the level splitting, which is characterized by the energy scale Δ. If we wish to add an electron to, or remove one from, the device, we must in addition pay, or we gain, a charging energy, respectively.
It should be noted that, in principle, Γ may depend on the particular charge state of the island. This is expected to be the case in molecules: the charge distribution usually differs strongly between the different orbitals, and this will certainly influence the degree to which a given orbital couples to the lead states.
At this stage, an important point should be emphasized. From statistical mechanics it is known that a particle current is driven by a chemical potential difference. Therefore, the chemical potential of the island is the relevant quantity driving the current to and from the leads. However, in an independent-particle picture, a single-particle energy is identical to the chemical potential (which is defined as the difference in total energy between a system with N + 1 and N particles). Therefore, if we speak of a single-particle energy of the island, this should often be read as a chemical potential.
2.2.2
Transport
For an extensive discussion of the issues discussed in this section, the reader is referred to the monograph by Datta [2].
As seen above, in the device we can often distinguish discrete states as (Lorentzian) peaks of finite width in the density of states. A convenient representation of transport is then given in Figure 2.3, which shows that the effect of the gate is to shift the levels of the device up and down, while leaving the chemical potentials μ_S and μ_D of the leads unchanged (for small devices, the gate field is inhomogeneous due to the effect of the leads; moreover, the electrostatic potential in the surface region of the leads will be slightly affected by the gate voltage).
The transport through the device can take place in many different ways. A few classifications will now be provided which may help in understanding the transport characteristics of a particular transport process.
2.2.2.1 Coherent–Incoherent Transport
First, the transport may be either coherent or incoherent, a notion which pertains to an independent-particle description of the electrons, where the electrons occupy one-particle orbitals. In the case of coherent transport, the phase of the orbitals evolves deterministically. In the case of incoherent processes, the phase changes in an unpredictable manner due to interactions which are not contained in the independent-particle description.
Molecules can often be viewed as chains of weakly coupled sites. If the Fermi energy of the source lead (i.e., the injection energy) lies some distance below the on-site energies of the molecule, the dominant transport mechanism is through higher-order processes, which in electron transfer theory are known as superexchange processes. This term also includes hole transport through levels below the Fermi energy of the leads.
2.2.2.5 Direct Tunneling
It should be noted that, if the device is very small (e.g., a molecule), there is the possibility of direct tunneling from the source to the drain, in which the resonant states of the device are not used for the transport.
2.3
Resonant Transport
We start this section by studying resonant transport qualitatively [2]. Suppose we have one or more sharp resonant levels which can be used in the transport process from source to drain. We neglect inelastic processes inside the device and during tunneling from the leads to the device, or vice versa. In order to send an electron into the device at the resonant energy, we need occupied states in the source lead. This means that the density of states in that lead must be non-zero at the resonant energy (otherwise there is no lead state at that energy), and that the Fermi–Dirac distribution must allow for that energy level to be occupied. Furthermore, for the electron to end up in the drain, the states in the drain at the resonant energy should be empty, according to Pauli's principle. We conclude that, for transport to be possible, the resonance should lie inside the bias window. This window is defined as the range of energies between the Fermi energies of the source and the drain.
The process is depicted in Figure 2.3 (top), and from this picture we can infer the behavior of the current as a function of the bias voltage. It can be seen that no current is possible for a small bias voltage (left panel), as a result of a finite difference in energy ΔE between the energy of the resonant state on the island and the nearer of the two chemical potentials of the leads. The current sets in as soon as the bias window encloses the resonance energy (right panel). Any further increase of the bias voltage does not change the current, until another resonance is included. The mechanism described here gives rise to the current–voltage characteristics shown in Figure 2.4.
At this point, two remarks are in order. First, the picture sketched here supposes weak coupling and a low temperature. Increasing the temperature blurs the sharp edge in the spectrum between the occupied and empty states, and this will cause the sharp steps seen in the I/V curve to become rounded. Second, the differential conductance dI/dV as a function of the bias voltage V shows a peak at the positions where the current steps up.
In the previous section it was noted that the coupling Γ = Γ_S + Γ_D between the leads and the device can be given in terms of the rate at which electrons hop from the lead onto the device. From this, a heuristic argument leads, via the time–energy uncertainty relation, to the conclusion that Γ gives the extent to which an energy level E₀ on the island is broadened (note that this energy should again be identified with the chemical potential of the island; see the comment in the previous section). Simple models for the leads and the device yield a Lorentzian density of states on the device [2]:

$$D(E) = \frac{1}{2\pi}\,\frac{\Gamma}{(E-E_0)^2 + (\Gamma/2)^2}.$$

The current carried by this level is
$$I = \frac{e}{\hbar}\,\frac{\Gamma_S\Gamma_D}{\Gamma_S+\Gamma_D}\int D(E)\left[f_{FD}(E-\mu_S) - f_{FD}(E-\mu_D)\right]dE. \qquad (2.4)$$
It must be remembered that the bias voltage (the potential difference between source and drain) is related to the chemical potentials μ_D and μ_S as

$$eV = \mu_S - \mu_D,$$

where e > 0 is the unit charge. A positive bias voltage drives the electrons from right to left, so that the current then runs from left to right; this is defined as the positive direction of the current.
If the density of states has a single sharp peak, current is only possible when this peak lies inside the bias window. Indeed, replacing D(E) by a delta function centered at E₀ directly gives

$$I = \frac{e}{\hbar}\,\frac{\Gamma_S\Gamma_D}{\Gamma_S+\Gamma_D}\left[f_{FD}(E_0-\mu_S) - f_{FD}(E_0-\mu_D)\right].$$
At low temperature, the factor in square brackets is 1 when E₀ lies inside the bias window and 0 otherwise. It can be seen that the maximum value of the current is found as

$$|I_{max}| = \frac{e}{\hbar}\,\frac{\Gamma_S\Gamma_D}{\Gamma_S+\Gamma_D}. \qquad (2.5)$$
For low temperatures, the Fermi functions in Equation 2.4 become sharp steps, and the integral over the Lorentzian can be carried out analytically, yielding

$$I = \frac{e}{\pi\hbar}\,\frac{\Gamma_S\Gamma_D}{\Gamma_S+\Gamma_D}\left[\arctan\!\left(2\,\frac{\mu_S-E_0}{\Gamma}\right) - \arctan\!\left(2\,\frac{\mu_D-E_0}{\Gamma}\right)\right]. \qquad (2.6)$$
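The step-like I(V) characteristics of Figure 2.4 follow directly from Equation 2.6. The sketch below evaluates it for a single level; the level position, the couplings and the symmetric bias division are assumed values chosen for illustration.

import numpy as np

hbar = 1.054571817e-34
e = 1.602176634e-19

E0 = 5e-3 * e                       # level 5 meV above the equilibrium Fermi energy (assumed)
Gamma_S = Gamma_D = 0.1e-3 * e      # couplings of 0.1 meV each (assumed)
Gamma = Gamma_S + Gamma_D

def current(V):
    # Zero-temperature current through a Lorentzian level, Eq. (2.6), symmetric bias
    mu_S, mu_D = 0.5 * e * V, -0.5 * e * V
    pref = (e / (np.pi * hbar)) * Gamma_S * Gamma_D / Gamma
    return pref * (np.arctan(2*(mu_S - E0)/Gamma) - np.arctan(2*(mu_D - E0)/Gamma))

for v in (0e-3, 5e-3, 10e-3, 15e-3, 20e-3):     # bias voltages (V)
    print(f"V = {v*1e3:5.1f} mV:  I = {current(v)*1e9:6.2f} nA")

The current switches on around V = 10 mV, where the bias window of ±eV/2 first encloses E₀, and then saturates near |I_max| = (e/ħ)Γ_SΓ_D/(Γ_S+Γ_D), about 12 nA here, in accordance with Equation 2.5.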
Equation 2.4 is valid in the limit where we can describe the transport in terms of the independent-particle model. It has the form of the Landauer formula:

$$I = \frac{e}{h}\int T(E)\left[f_{FD}(E-\mu_D) - f_{FD}(E-\mu_S)\right]dE,$$

which is discussed extensively in Chapter 1 of this volume. In that chapter it is shown that the transmission per channel (corresponding to the eigenvalues of the matrix T(E)) has a maximum value of 1, so that at low temperatures the current assumes a maximum value of

$$I_{max} = \frac{e^2}{h}\,nV, \qquad (2.7)$$
where n is the number of channels inside the bias window. It should be noted that this maximum occurs only for reflectionless contacts, for which a wave incident from the leads onto the device is completely transmitted. This usually occurs when the device and the leads are made from the same material. The strong-coupling result in Equation 2.7 has been given in order to emphasize that the two Equations 2.5 and 2.7 hold in quite opposite regimes.
Often, in experiments, the differential conductance dI/dV is measured; this can be calculated from the expression in Equation 2.4:

$$\frac{dI}{dV} = -\frac{e^2}{\hbar}\,\frac{\Gamma_S\Gamma_D}{\Gamma_S+\Gamma_D}\int dE\,D(E)\left\{\eta\,f'_{FD}(E-\bar\mu-\eta eV) + (1-\eta)\,f'_{FD}\big(E-\bar\mu+(1-\eta)eV\big)\right\}, \qquad (2.8)$$

where $f'_{FD}$ denotes the first derivative of the Fermi–Dirac distribution with respect to its argument, and $\bar\mu = (\mu_S + \mu_D)/2$. The parameter η specifies how the bias voltage is distributed over the source and drain contacts; for η = 1/2, this distribution is symmetric. For T = 0, the Fermi–Dirac distribution function reduces to a step function, and its derivative is then a delta function. For low bias (V ≈ 0), the integral picks up contributions from both delta functions occurring in the integral in Equation 2.8. The result is
$$\frac{dI}{dV} = \frac{e^2}{\hbar}\,\frac{\Gamma_S\Gamma_D}{\Gamma_S+\Gamma_D}\,D(\bar\mu),$$

where the energy in Equation 2.8 is taken at the Fermi energy of either the source or the drain. As the maximum value of D(E) is given by

$$D(E)_{max} = \frac{2}{\pi}\,\frac{1}{\Gamma_S+\Gamma_D},$$

the zero-bias differential conductance at resonance is at most (e²/h)·4Γ_SΓ_D/(Γ_S+Γ_D)², which reduces to the conductance quantum e²/h for symmetric barriers (Γ_S = Γ_D). At temperatures k_BT ≫ Γ, the resonance instead appears as a thermally broadened peak in the conductance as a function of the gate voltage:
$$\frac{dI}{dV} = \frac{e^2}{4k_BT}\,\frac{\Gamma_S\Gamma_D}{\hbar(\Gamma_S+\Gamma_D)}\,\cosh^{-2}\!\left(\frac{e\alpha(V_G-V_0)}{2k_BT}\right). \qquad (2.9)$$
This line shape (see Figure 2.5) is characterized by a maximum value e²Γ_SΓ_D/[4ħk_BT(Γ_S+Γ_D)], attained when the gate voltage reaches the resonance V₀ = E₀/e. The full-width at half maximum (FWHM) of this peak is 3.525 k_BT/(eα). The parameter α is the gate coupling parameter: the potential on the island varies linearly with the gate voltage, ΔV_I = αΔV_G. These features are often used as a signature of true quantum resonant behavior, as opposed to classical dots, where the small value of Δ renders the spectrum of levels accessible to an electron continuous. For a classical dot, the peak height is independent of temperature and the FWHM is predicted to be larger by a factor of 1.25 [3,4]. It should be noted that, in a quantum dot, Γ sets a lower bound on the temperature dependence of the peak shape: for Γ > k_BT, the peak height and shape are independent of temperature (this is not visible in Figure 2.5 due to the small value of Γ chosen there).
Interestingly, the finite width of the density of states, which is given by Γ_S + Γ_D, can in principle be measured experimentally from the resonance line widths at low temperature. It should be noted that the expressions for the current and the differential conductance depend only on the combinations Γ_S + Γ_D and Γ_SΓ_D/(Γ_S + Γ_D). If both are extracted from experimental data, the values of Γ_S and Γ_D can be determined (although the symmetry under exchange of source and drain prevents us from identifying which value belongs to the source).
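A short numerical check of the line shape in Equation 2.9 (the gate coupling and temperature are assumed values) confirms the quoted FWHM of 3.525 k_BT/(eα):

import numpy as np

e = 1.602176634e-19
kB = 1.380649e-23

alpha, T, V0 = 0.1, 1.0, 0.0        # assumed gate coupling, temperature (K), peak position

VG = np.linspace(-10e-3, 10e-3, 200001)
g = np.cosh(e * alpha * (VG - V0) / (2 * kB * T))**-2.0   # Eq. (2.9), peak-normalized

above = VG[g >= 0.5]
fwhm = above.max() - above.min()
print(f"numerical FWHM       = {fwhm*1e3:.3f} mV")
print(f"3.525 kB T/(e alpha) = {3.525*kB*T/(e*alpha)*1e3:.3f} mV")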
2.4
Constant Interaction Model
In Section 2.1.1 it was seen that, in the weak-coupling regime, energy levels can be discrete for two reasons: quantum confinement (the fact that the state must fit into a small island), and charge quantization effects. The scale for the second type of splitting is the charging or Coulomb energy, E_C. It is important to realize that this energy will only be noticeable when the coupling to the leads is small in comparison with E_C; this situation is referred to as the Coulomb blockade regime. In this situation, a clear distinction should be made between one or two electrons occupying a level, as their Coulomb interaction contributes significantly to the total energy. The transport process may be analyzed using the so-called constant interaction model [3], which is based on the set-up shown in Figure 2.6. Elementary electrostatics provides the following relation between the different potentials and the charge Q on the island:

$$CV_I - C_SV_S - C_DV_D - C_GV_G = Q,$$

where C = C_S + C_D + C_G. Note that this equation can be written in the form

$$V_I = V_{ext} + \frac{Q}{C},$$

with

$$V_{ext} = (C_SV_S + C_DV_D + C_GV_G)/C.$$

It is seen that the potential on the dot is determined by the charge residing on it and by the potential V_ext induced by the source, drain and gate.
We take as a reference configuration the one for which all voltages and the charge are zero. For total energies, U rather than E is used, in order to avoid confusion with the single-particle energies E_n resulting from solving the single-particle Schrödinger equation. The electrostatic energy U_ES(N) with respect to this reference configuration, after changing the source, drain and gate potentials and putting N electrons (of charge −e) on the island, is then identified as the work needed to place this extra charge on the island, plus the energy cost involved in changing the external potential when a charge Q is present:

$$U_{ES}(N) = \int_{(Q=0,\,V_{ext}=0)}^{(-Ne,\,V_{ext})}\left(V_I\,dQ + Q\,dV_{ext}\right) = \frac{(Ne)^2}{2C} - NeV_{ext}.$$
The integral is over a path in (Q, V_ext) space; it is independent of the path, that is, of how the charge and the external potential are changed in time.
The result for the total energy, including the quantum energy due to the orbital energies, is

$$U(N) = \frac{(Ne)^2}{2C} - NeV_{ext} + \sum_{n=1}^{N}E_n.$$
The energy levels E_n correspond to states which can be occupied by the electrons in the device, provided that their total number does not change, as changing this number would change the Coulomb energy, which is accounted for by the first term. This expression for the total energy is essentially the constant interaction model.
From non-equilibrium thermodynamics it is known that a current is driven by a chemical potential difference; hence, we should compare the chemical potential of the device,

$$\mu(N) = U(N) - U(N-1) = \left(N-\tfrac{1}{2}\right)\frac{e^2}{C} - eV_{ext} + E_N, \qquad (2.10)$$

with that of the source and drain, in order to see whether a current flows through the device. From the definition of V_ext we see that the effective change in the chemical potential due to a change of the gate voltage (while keeping the source and drain voltages constant) carries a factor C_G/C; this is precisely the gate coupling, which is called the α factor (as referred to at the end of Section 2.3).
It is important to be aware of the conditions under which the constant interaction model provides a reliable description of the device. The first condition is weak coupling to the leads; the second condition is that the size of the device should be sufficiently large to make a description with single values for the capacitances possible. Finally, the single-particle levels E_n must be independent of the charge N. The constant interaction model works well for weakly coupled quantum dots, for which it is very often used. However, for molecular devices, the presence of a source and drain, both of which are large chunks of conducting material separated by very narrow gaps, reduces the gate field until it is barely noticeable close to the leads and far from the gate. This inhomogeneity of the gate field may lead to a dependence of the gate capacitance C_G on N, due to the difference in structure of subsequent molecular orbitals, and the chemical potential on the molecule will then vary non-linearly with the gate potential.
As will be seen below, the distance between the different chemical potential levels can be inferred from three-terminal measurements of the (differential) conductance. From Equation 2.10, this distance is given by

$$\mu(N+1) - \mu(N) = \frac{e^2}{C} + (E_{N+1} - E_N).$$

It should be noted that the difference in energy levels occurring in this expression, (E_{N+1} − E_N), is nothing but the splitting Δ mentioned at the very start of this chapter. However, for typical metallic and semiconductor quantum dots this splitting is usually significantly smaller than the charging energy, so that the latter determines the distance between the energy levels:

$$\mu(N+1) - \mu(N) \approx \frac{e^2}{C}.$$

Note that this addition energy is twice the electrostatic energy of a single charge on the dot (as the addition energy is the second derivative of the energy with respect to the charge).
Now the current can be studied as a function of the bias and gate voltages. In Section 2.2.2 it was seen that, in the weak-coupling regime and at low temperature, the current is suppressed when all chemical potential levels lie outside the bias window. As the location of these levels can be tuned by the gate voltage, it is interesting to study the current and the differential conductance of the device as a function of both the bias and the gate voltage.
We can calculate the line in the (V, V_G) plane which separates a region of suppressed current from a region with finite current; this line is determined by the condition that the chemical potential of the source (or drain) is aligned with that of a level on the island. Again, it is assumed that the drain is grounded (as in Figure 2.2). From the expression in Equation 2.10 for the chemical potential, and using the definition of V_ext, we find the following condition for the chemical potential to be aligned with the source (keeping the dot's charge constant):

$$V = \beta(V_G - V_C),$$

where β = C_G/(C_G + C_D) and

$$V_C = \left(N-\tfrac{1}{2}\right)\frac{e}{C_G} + \frac{C}{C_G}\,\frac{E_N}{e},$$

that is, the voltage corresponding to the chemical potential on the dot in the absence of an external potential. If the chemical potential is aligned with the drain, we have

$$V = \gamma(V_C - V_G),$$

with γ = C_G/C_S. The expressions given here are specific to a grounded drain electrode. It is easily verified that, irrespective of the grounding, it holds that

$$\frac{C}{C_G} = \frac{1}{\alpha} = \frac{1}{\beta} + \frac{1}{\gamma}.$$
Each resonance generates two straight lines separating regions of suppressed current from those with finite current. For a sequence of resonances, the arrangement shown in Figure 2.7a is obtained. The diamond-shaped regions are traditionally called Coulomb diamonds, as they were very often studied in the context of metallic dots, where the chemical potential difference of the levels is mainly made up of the Coulomb energy. The name is also used in molecular transport, although it is, strictly speaking, not justified there, as Δ may be of the same order as the Coulomb interaction.
From the Coulomb diamond picture we can infer the values of some important quantities. First, we consider two successive states on the molecule with chemical potentials μ(N) and μ(N+1). If we suppose that both states have the
Figure 2.7 Linear transport. (a) Two-dimensional plot of the current as a function of bias and gate voltage (stability diagram). For a small bias, current flows only at the three points corresponding to the situation shown in Figure 2.3 (top right); these points are known as the degeneracy points.
same gate-coupling parameter α, it can be seen that the upper and lower vertices of the diamond are both at a distance

$$\Delta V = \frac{|\mu(N) - \mu(N+1)|}{e} = \frac{E_{add}}{e}$$

from the zero-bias line. This difference in chemical potentials is the electron addition energy, E_add. If the addition energy is dominated by the charging energy, the total capacitance can be determined from it. Combining this with the slopes of the sides of the diamond, which provide the relative values of C_G, C_S and C_D, all of these capacitances can be determined explicitly.
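The following sketch carries out this extraction for a hypothetical set of diamond readings (the slopes and addition energy are invented for illustration), assuming the addition energy is purely of Coulomb origin:

e = 1.602176634e-19

beta = 0.30          # slope of the source-aligned edge, C_G/(C_G + C_D)  (assumed reading)
gamma = 0.15         # slope magnitude of the drain-aligned edge, C_G/C_S (assumed reading)
E_add = 2e-3 * e     # addition energy from the diamond height, 2 meV    (assumed reading)

C = e**2 / E_add                 # total capacitance, assuming E_add = e^2/C
alpha = 1 / (1/beta + 1/gamma)   # gate coupling, 1/alpha = 1/beta + 1/gamma
C_G = alpha * C
C_S = C_G / gamma
C_D = C_G / beta - C_G           # from beta = C_G/(C_G + C_D)

print(f"C   = {C*1e18:5.1f} aF")
print(f"C_G = {C_G*1e18:5.1f} aF, C_S = {C_S*1e18:5.1f} aF, C_D = {C_D*1e18:5.1f} aF")

The three partial capacitances add up to C, which provides a useful consistency check on the readings.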
One interesting consequence of the previous analysis is that, if the capacitances do not depend on the particular state being examined, then the height of successive Coulomb diamonds is constant. If, in addition to the Coulomb energy, a level splitting is present, this homogeneity is destroyed, as can be seen in Figure 2.7b, which shows the diamonds for a carbon nanotube (CNT) [5]. The alternation of one large diamond with three smaller ones can be nicely explained with a model Hamiltonian [6]. In the case of transport through molecules, there is no obvious underlying structure in the diamonds.
The electron addition energy is sometimes connected to the so-called HOMO–LUMO gap. [These acronyms represent the Highest Occupied (Lowest Unoccupied) Molecular Orbital, and denote orbitals within an independent-particle scheme.] Even if the Coulomb interaction is significant, the HOMO–LUMO gap can be related to the excitation energy of an optical absorption process, in which an electron is promoted from the ground state to the first excited state without leaving the system. In that case, the change in Coulomb energy is modest, and the energy difference is mostly made up of the quantum splitting Δ. It should be noted, however, that the HOMO and LUMO are usually calculated using a computational scheme whereby the orbitals are obtained for the ground-state configuration, that is, without explicitly taking into account the fact that all orbitals change when, for example, an electron is excited to a higher level.
The addition energies are partly determined by quantum confinement effects and partly by Coulomb effects. A difficulty here is that these energies will be different for a molecular junction, in which a molecule is either physisorbed or chemisorbed to conducting leads, than for a molecule in the gas phase. There are several effects responsible for this difference. First, if a chemical bond is present, the electronic orbitals extend over a larger space, which reduces the confinement splitting. Second, a chemical bond may cause a charge transfer from lead to molecule, which in turn causes the potential on the molecule to change. Third, the charge distribution on the molecule will polarize the surface charge on the leads, which can be represented as an image charge. Such charges have the effect of reducing the Coulomb part of the addition energy. In experiments with molecular junctions, much smaller addition energies are often observed than in gas-phase molecules. At the time of writing, there is no quantitative understanding of the addition energy in molecular three-terminal junctions, although the effects mentioned here are commonly held responsible for the observed gaps.
2.5
Charge Transport Measurements as a Spectroscopic Tool
A stability diagram can be used not only for finding addition energies, but also as a spectroscopic tool for revealing the subtle excitations that arise on top of the ground-state configurations of an island with a particular number of electrons on it. These excitations appear as lines running parallel to the Coulomb diamond edges. An example taken from Ref. [7] is shown in Figure 2.8a, where the white arrows indicate the excitation lines. At such a line, a new state (electronic or vibrational) enters the bias window, thus creating an additional transport channel. The result is a step-wise increase in the current, and a corresponding peak in the differential conductance.
The energy of an excitation can be determined by reading off the bias voltage at the intersection point between the excitation line and the Coulomb diamond edge, using the same argument as for finding the addition energies. The excitations correspond to the charge state of the Coulomb diamond at which they ultimately end (see Figure 2.8c). The width of the lines in the dI/dV plot (or, equivalently, the voltage range over which the step-wise increase in current occurs) is determined by the larger of the energies k_BT and Γ. In practice, this means that sharp lines, and thus accurate information on spectroscopic features, are obtained at low temperatures and
for weak coupling to the leads. However, it should be noted that the current is proportional to Γ (Equations 2.4 and 2.5), so that Γ should not be too small; in fact, a Γ value of the order of 0.1–1 meV seems typical in experiments that allow for spectroscopy.
An important experimental issue here is that, for a particular charge state, the lines are often visible on only one side of the Coulomb diamond (see Figure 2.8a, lower right panel). This is due to an asymmetry in the coupling, that is, Γ_D ≫ Γ_S (or Γ_S ≫ Γ_D). The situation at the two main diamond edges is illustrated in Figure 2.9, where a thick and a thin barrier between the island and the source/drain represent these asymmetric couplings. If the chemical potential in the lead connected through the thin barrier is the higher one, the island will have one of its transport channels filled. The limiting step for transport is then the thick barrier, and only the occupied orbital contributes to the current; when an extra transport orbital becomes available, this has only a minor effect on the total current. If, however, the chemical potential of the lead beyond the thick barrier is the higher one, the transport levels on the island will all be empty. The lead electrons which must tunnel through the thick barrier then have as many channels at their disposal as there are empty states: the more orbitals, the more channels there are, and therefore a step-wise increase occurs each time a new excitation becomes available.
2.5.1
Electronic Excitations
In order to study how detailed information on the electronic structure of the island can be obtained from conduction measurements, we consider a system consisting of levels that are separated in energy by splittings Δ_i (see Figure 2.10). It should be noted that this level splitting does not include a charging energy: the levels can be occupied in charge-neutral excitations. For one extra electron on the island, N = 1, the ground
state is the one in which it occupies the lowest level. As discussed above, as soon as this level is inside the bias window, the current begins to flow, thereby defining the edges of the Coulomb diamonds. When the bias increases further, transport through the excited level becomes possible. This leads to a step-wise increase of the current, as there are now two states available for resonant transport, which increases the probability for electrons to pass through the island. It should be noted that the two levels cannot be occupied at the same time, as this would require a charging energy in addition to the level splitting. The resulting peak in the dI/dV forms a line (red) inside the conducting region (blue), ending at the N = 1 diamond (white), as shown in Figure 2.8c (E_ex = Δ₁ in this case). A second excitation is found at Δ₁ + Δ₂; subsequent excitations intersect the diamond edge at bias voltages $\sum_i \Delta_i$, but they are only visible if $\sum_i \Delta_i < e^2/C$.
Now we consider the case where two electrons are added to the neutral island (N = 2). When two electrons occupy the lower orbital, the Pauli principle requires their spins to be opposite. The first excited state is the one in which one of the electrons is transferred to the higher orbital, which costs an energy Δ₁. A ferromagnetic exchange coupling favors a triplet state with parallel spin alignment. If we take only exchange interactions between different orbitals into account, this results in an energy gain of J with respect to the situation with opposite spins. Thus, the first excitation is expected at Δ₁ − J, and the second one (corresponding to opposite spins) at Δ₁. The energy difference between the two excitations in Figure 2.8c provides a direct measure of J. In some systems J may be negative (the antiferromagnetic case), and the antiparallel configuration then has the lower energy.
The simple analysis presented here captures some of the basic features of few-electron semiconducting quantum dots [8], in which the charge states to which the levels belong can be identified. The complete electron spectrum has also been determined in metallic CNT quantum dots [5,9]. Although, for a nanotube, many densely spaced excitations occur, level spectroscopy is possible as the regularly spaced levels are well separated from each other, with E_C ≈ Δ. Careful inspection of the excitation and addition spectra of CNTs shows that the exchange coupling J is ferromagnetic and that it is small, of the order of a few meV or less. Further identification of the states can be performed in a magnetic field, using the Zeeman effect as a diagnostic tool: doublet states are expected to split into two levels, and triplet states into three.
One final remark concerns the N = 0 diamond. In systems such as semiconducting quantum dots, where there is a gap separating the ground state from the first excited state, Δ₁ may be of the order of hundreds of meV, and in that case no electronic excitations are expected to end up in this diamond.
2.5.2
Including Vibrational States
where R_iα is the Cartesian coordinate α = x, y, z of nucleus i; l labels the normal mode; $X^l_{i,\alpha}$ is a fixed vector which determines the amplitude of the oscillation for the degree of freedom labeled by (i, α). The vibrations are described by a harmonic oscillator, which has a spectrum with energy levels separated by an amount ħω_l:

$$E^l_v = \hbar\omega_l\left(v + \tfrac{1}{2}\right), \qquad v = 0, 1, 2, \ldots$$
For molecular systems, the normal modes are often referred to as vibrons (in analogy with phonons in a periodic solid). These modes couple to the electrons, as the electrons feel a change in the electrostatic potential when the nuclei move in a normal mode. The coupling is determined by the electron–vibron coupling constant, called λ. The presence of vibrational excitations can be detected in transport measurements. However, it should be noted that, for this to happen, the vibrational modes must be excited, which can occur for two reasons: (i) thermal fluctuations excite these modes; or (ii) they can be excited through the electron–vibron coupling.
In order to study the effect of the electron–vibron coupling on transport, the discussion is for simplicity restricted to a single vibrational mode and a single electronic level. The nuclear part of the Hamiltonian is

$$H = \frac{P^2}{2M} + \frac{1}{2}M\omega^2X^2,$$

where P, X and M represent the momentum, position and mass of the oscillator.
It turns out [11] that the electron–vibron coupling has the form

$$H_{e-v} = \lambda\hbar\omega\,\hat{n}\,X/u_0,$$

where n̂ is the number operator, which takes the values 0 or 1 depending on whether there is an electron in the orbital under consideration, and $u_0 = \sqrt{\hbar/2M\omega}$ is the zero-point fluctuation associated with the ground state of the harmonic oscillator. The electron–vibron coupling λ is given by (j is the electronic orbital):

$$\lambda = \frac{1}{\hbar\omega}\sqrt{\frac{\hbar}{2M\omega}}\left\langle j\left|\frac{\partial H_{el}}{\partial X}\right|j\right\rangle.$$
When the charge in the state j increases from 0 to 1, the equilibrium position of the harmonic oscillator (i.e., the minimum of the potential energy) is shifted by a distance 2λu₀ along X, and it is also shifted down in energy (see Figure 2.11a). Fermi's golden rule states that the transition rate for going from the neutral island in the conformational ground state to a charged island in some excited vibrational state is proportional to the square of the overlap between the initial and final states. Hence, this rate is proportional to the overlap of the ground state of the harmonic oscillator corresponding to the higher parabola with the excited state of the oscillator corresponding to the shifted parabola (to be multiplied by the coupling between lead and island). This overlap is known as the Franck–Condon factor. It is clear that, for large displacements, this overlap may be larger for passing to a vibrationally excited state than for passing to the vibrational ground state of the shifted oscillator. The Franck–Condon factors can be calculated analytically (see, for example, Ref. [10]).
The sequential tunneling regime, which corresponds to weak coupling, can be
described in terms of a rate equation: the master equation. This describes the time
evolution of the probability densities for the possible states on the molecular island.
The master equation can be used for any sequential tunneling process and is
particularly convenient when vibrational excitations play a role. The details of
formulating and solving master equations are beyond the scope of this chapter, but
the interested reader is referred to Refs. [11,12] for further details.
Figure 2.11b and c were prepared using such a master equation analysis. For sufficiently strong electron–phonon coupling, steps appear in the current–voltage characteristics (Figure 2.11b), which for a suitable asymmetry between $\Gamma_D$ and $\Gamma_S$ leads to the lozenge pattern in the stability diagram, as illustrated in Figure 2.11c. It should be noted that, if the vibrational modes are excited, they may in turn lose their energy through coupling to the leads or other parts of the device. This can be represented by an effective damping term for the nuclear degrees of freedom. For actual molecules, solving the master equations using Franck–Condon factors obtained from quantum chemical calculations makes it possible to compare theory with experiment. This is especially useful because the observed vibrational frequencies can be used as a fingerprint of the molecule under study [7,13–15] (see also Figure 2.8a).
2.6
Second-Order Processes
2.6.1
The Kondo Effect in a Quantum Dot with an Unpaired Electron
The Kondo effect has long been known to cause a resistance increase at low temperatures in metals with magnetic impurities [16]. However, in recent years Kondo physics has also been observed in semiconducting [17], nanotube [18] and single-molecule quantum dots [19]. It arises when a localized unpaired spin interacts by antiferromagnetic exchange with the spin of the surrounding electrons in the leads (see Figure 2.12a). The Heisenberg uncertainty principle allows the electron to tunnel out for only a short time of about $\hbar/\Delta E$, where $\Delta E$ is the energy of the electron relative to the Fermi energy and is taken as positive. During this time, another electron from the Fermi level at the opposite lead can tunnel onto the dot, thus conserving the total energy of the system (elastic co-tunneling). The exchange interaction causes the majority spin in the leads to be opposite to the original spin of the dot. Therefore, the new electron entering from these leads is more likely to have the opposite spin. This higher-order process gives rise to a so-called Kondo resonance centered around the Fermi level. The width of this resonance is proportional to the characteristic energy scale for Kondo physics, $T_K$. For $\Delta E \gg \Gamma$, $T_K$ is given by:
$$k_B T_K = \frac{\sqrt{\Gamma U}}{2}\exp\left[\frac{\pi\Delta E\,(\Delta E - U)}{\Gamma U}\right]. \qquad (2.11)$$
Typical values for $T_K$ are 1 K for semiconducting quantum dots, 10 K for CNTs, and 50 K for molecular junctions. This increase of $T_K$ with decreasing dot size can be understood from the prefactor, which contains the charging energy ($U = e^2/C$).
In contrast to bulk systems, the Kondo effect in quantum dots leads to an increase of the conductance, as exchange makes it easier for the spin states belonging to the two electrodes to mix with the state (of opposite spin) on the dot, thereby facilitating transport through the dot. The conductance increase occurs only for small bias voltages, and the characteristic feature is a peak in the trace of the differential conductance versus bias voltage (see Figure 2.10b, red lines). The peak occurs at zero bias inside the diamond corresponding to an odd number of electrons. (For zero spin, no Kondo resonance is expected; for $S = 1$ a Kondo resonance may be possible, but the Kondo temperature is expected to be much smaller.) The full width at half maximum (FWHM) of this peak is proportional to $T_K$: FWHM $\approx 2k_B T_K/e$. Equation 2.11 indicates that $T_K$ is gate-dependent, because $\Delta E$ can be tuned by the gate voltage. Consequently, the width of the resonance is smallest in the middle of the Coulomb blockade valley and increases towards the degeneracy points on either side.
Another characteristic feature of the Kondo resonance is the logarithmic decrease in peak height with temperature. In experiments, this logarithmic dependence of the conductance maximum is often used for diagnostic purposes; in the middle of the Coulomb blockade valley it is given by:

$$G(T) = G_C\left[1 + \left(2^{1/s} - 1\right)\left(\frac{T}{T_K}\right)^2\right]^{-s}, \qquad (2.12)$$

where $s = 0.22$ for spin-1/2 impurities and $G_C = 2e^2/h$ for symmetric barriers. For asymmetric barriers, $G_C$ is lower than the conductance quantum. Equation 2.12 shows that at low temperatures the maximum conductance of the Kondo peak saturates at $G_C$, while at the Kondo temperature it reaches the value $G_C/2$.
2.6.2
Inelastic Co-Tunneling
The inelastic co-tunneling mechanism becomes active above a certain bias voltage, which is independent of the gate voltage. At this point, the current increases stepwise because an additional transport channel opens up. In the stability diagram, this results in a horizontal line inside the Coulomb-blockaded regime. This conductance feature appears symmetrically around zero at a source-drain bias of $\Delta/e$ for an excited level that lies at an energy $\Delta$ above the ground state. Co-tunneling spectroscopy therefore offers a sensitive measure of excited-state energies, which may be either electronic or vibrational. In combination with Kondo peaks, inelastic co-tunneling lines are commonly observed in semiconducting, nanotube and molecular quantum dots. An example of inelastic co-tunneling lines (dashed horizontal lines) for a metallic nanotube quantum dot is shown in Figure 2.13a.
Acknowledgments
The authors thank Menno Poot for his critical reading of the manuscript.
References
1 Feynman, R.P. (1961) There is plenty of room at the bottom, in Miniaturization (ed. H.D. Gilbert), Reinhold, New York, pp. 282–286.
2 Datta, S. (1995) Electronic Transport in Mesoscopic Systems, Cambridge University Press, Cambridge.
3 Beenakker, C.W.J. (1991) Theory of Coulomb-blockade oscillations in the conductance of a quantum dot. Physical Review B - Condensed Matter, 44, 1646–1656.
4 Foxman, E.B. et al. (1994) Crossover from single-level to multilevel transport in artificial atoms. Physical Review B - Condensed Matter, 50, 14193–14199.
5 Sapmaz, S., Jarillo-Herrero, P., Kong, J., Dekker, C., Kouwenhoven, L.P. and van der Zant, H.S.J. (2005) Electronic excitation spectrum of metallic nanotubes. Physical Review B - Condensed Matter, 71, 153402.
6 Oreg, Y., Byczuk, K. and Halperin, B.I. (2000) Spin configurations of a carbon nanotube in a nonuniform external potential. Physical Review Letters, 85, 365–368.
7 Park, H., Park, J., Lim, A.K.L., Anderson, E.H., Alivisatos, A.P. and McEuen, P.L. (2000) Nanomechanical oscillations in a single-C60 transistor. Nature, 407, 57–60.
8 Kouwenhoven, L.P., Austing, D.G. and Tarucha, S. (2001) Few-electron quantum dots. Reports on Progress in Physics, 64, 701–736.
9 Liang, W., Bockrath, M. and Park, H. (2002) Shell filling and exchange coupling in metallic single-walled carbon nanotubes. Physical Review Letters, 88, 126801.
10 Flensberg, K. and Braig, S. (2003) Incoherent dynamics of vibrating single-molecule transistors. Physical Review B - Condensed Matter, 67, 245415.
11 Mitra, A., Aleiner, I. and Millis, A.J. (2004) Phonon effects in molecular transistors: quantal and classical treatment. Physical Review B - Condensed Matter, 69, 245302.
12
13
14
15
16
17
18
19
3
Spin Injection–Extraction Processes in Metallic and Semiconductor Heterostructures
Alexander M. Bratkovsky
3.1
Introduction
The key quantities are the spin polarization of the carrier density,

$$P = \frac{n_\uparrow - n_\downarrow}{n_\uparrow + n_\downarrow}, \qquad (3.1)$$

and the spin injection efficiency (polarization of the current),

$$\Gamma = \frac{J_\uparrow - J_\downarrow}{J_\uparrow + J_\downarrow}, \qquad (3.2)$$
where $\uparrow$ ($\downarrow$) refers to the electron spin projection on a quantization axis. In the case of ferromagnetic materials, the axis is antiparallel to the magnetization moment $\vec{M}$. Generally, $P \neq \Gamma$, but in some cases they can be close. Since in the ferromagnets the spin density is constant, a reasonable assumption can be made that the current is carried independently by two electron fluids with opposite spins (Mott's two-fluid model [20]). Then, in the FM bulk the injection efficiency parameter is

$$\Gamma = \Gamma_F = \frac{\sigma_\uparrow - \sigma_\downarrow}{\sigma}, \qquad \sigma = \sigma_\uparrow + \sigma_\downarrow. \qquad (3.3)$$
3.2
Main Spintronic Effects and Devices
3.2.1
TMR
Indeed, denoting $D = g_\uparrow$ and $d = g_\downarrow$ as the partial densities of states (DOS), we can write down the following golden rule expressions for parallel and antiparallel moments on the electrodes:

$$G_P \propto D^2 + d^2, \qquad G_{AP} \propto 2Dd, \qquad (3.4)$$

and arrive at the expression for TMR first derived by Julliere [12]:

$$\mathrm{TMR} = \frac{G_P - G_{AP}}{G_{AP}} = \frac{(D - d)^2}{2Dd} = \frac{2P^2}{1 - P^2}, \qquad (3.5)$$

where we have introduced the polarization $P$, which fairly approximates the polarization introduced in Equation 3.1, at least for a narrow interval of energies:

$$P = \frac{D - d}{D + d} = \frac{g_\uparrow - g_\downarrow}{g_\uparrow + g_\downarrow}. \qquad (3.6)$$
Below, we shall see that the polarization entering the expression for a particular process depends on the particular physics and also on the nature of the electronic states involved. It should be noted that, for instance, the DOS entering the above expression for TMR is not the total DOS, but rather the one for the states that contribute to the tunneling current. Thus, Equation 3.6 may lead one to believe that the tunnel polarization in elemental Ni should be negative, as there is a sharp peak in the minority carrier density of states at the Fermi level. The data, however, suggest unambiguously that the tunnel polarization in Ni is positive [14], $P > 0$. This finds a simple explanation in a model by M.B. Stearns, who highlighted the presence, in elemental 3d metals, of parts of the Fermi surface with almost 100% d-character and a small effective mass close to that of a free electron [21]. A detailed discussion of TMR effects is provided below.
3.2.2
GMR
There are important differences between TMR and GMR processes. Indeed, GMR is most reminiscent of TMR in the current-perpendicular-to-planes (CPP) geometry in FM-N-FM-... stacks, where N stands for a normal metal spacer (Figure 3.2a). In the CPP geometry, the spins cross the nanometer-thin normal spacer layer (N) without a spin flip, similarly to tunneling through the oxide barrier, but the elastic mean free path is comparable to or smaller than the N thickness, so that drift-diffusive electron transport takes place in metallic GMR stacks. In the commonly used current-in-plane (CIP) geometry, the electrons bounce between the different ferromagnetic layers (Figure 3.2b), effectively showing the same motif in transport across the layers as in the CPP geometry. Compared with GMR, TMR (and the spin injection efficiency) depends on the difference between the densities of states $g_\sigma$, spin $\sigma = \uparrow(\downarrow)$, at the Fermi level, while GMR depends on the relative conductivity

$$\sigma_\sigma = e^2\langle g_\sigma v_\sigma^2 \tau_\sigma\rangle_F,$$

where the angular brackets indicate an average over the Fermi surface that involves the Fermi velocity $v_\sigma$ and the momentum relaxation time $\tau_\sigma$. One can still use Mott's picture of two independent spin fluids but, as one is dealing with a metallic heterostructure, the continuity (or Boltzmann) equations must be solved for a periodic FM-N-FM-... stack to find the ramp of the electrochemical potential that defines the total current. Neglecting any slight imbalance of the electrochemical potentials for the two spins in the N regions (spin accumulation), one may construct an equivalent circuit model for the CPP stack in the spirit of Mott's model, and thus qualitatively explain the GMR. The parallel spin layer resistances would be $R_{\uparrow(\downarrow)} = \sigma_{\uparrow(\downarrow)}^{-1} L_F$ for the FM layers, and $r = \sigma_N^{-1} L_N$ in the normal N regions, with thicknesses $L_F$ ($L_N$), respectively. For the conductances in the two configurations of the moments, we then obtain (Figure 3.2c):
$$G_P = \frac{1}{R_P} = \frac{1}{2R_\uparrow + r} + \frac{1}{2R_\downarrow + r}, \qquad (3.7)$$

$$G_{AP} = \frac{1}{R_{AP}} = \frac{2}{R_\uparrow + R_\downarrow + r}, \qquad (3.8)$$

$$\mathrm{GMR} = \frac{G_P - G_{AP}}{G_{AP}} = \frac{(R_\downarrow - R_\uparrow)^2}{(2R_\uparrow + r)(2R_\downarrow + r)} \qquad (3.9)$$
$$= \frac{\Gamma_F^2}{(1 + r/R)^2 - \Gamma_F^2} \leq \frac{\Gamma_F^2}{1 - \Gamma_F^2}, \qquad (3.10)$$

where $R = R_\uparrow + R_\downarrow$. The latter is very similar to the expression for TMR [Equation 3.5], but with an absence of the factor of two in the numerator. It can be seen that the polarization entering GMR is different from that entering TMR. Even in the more common CIP geometry, the electrons certainly do scatter across the interfaces; hence, this equation can also be used for semi-quantitative estimates of the GMR effect in the CIP geometry (Figure 3.2c). Obviously, the effective circuit model [Equations 3.7 and 3.8] remains exactly the same, because it simply reflects the two-fluid approximation for the contributions of both spins. However, all the effective resistances depend in a rather nontrivial manner on the geometry (CPP or CIP) and on the electronic structure of the metals involved, which is particularly complicated in magnetic transition metals.
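The two-current circuit model of Equations 3.7–3.9 is easily evaluated numerically. A minimal sketch, assuming layer resistances corresponding to $\Gamma_F = 0.5$, shows that GMR is largest for a thin, well-conducting spacer ($r \ll R$) and is suppressed as $r$ grows, in line with Equation 3.10:

    def gmr(R_up, R_dn, r):
        """GMR ratio from the resistor network of Eqs. 3.7-3.9."""
        G_P = 1.0 / (2 * R_up + r) + 1.0 / (2 * R_dn + r)
        G_AP = 2.0 / (R_up + R_dn + r)
        return (G_P - G_AP) / G_AP

    R_up, R_dn = 1.0, 3.0      # assumed spin-resolved layer resistances, Gamma_F = 0.5
    for r in (0.0, 1.0, 10.0):
        print(f"r = {r:4.1f}: GMR = {100 * gmr(R_up, R_dn, r):5.1f}%")
    # r = 0 reproduces the bound Gamma_F**2 / (1 - Gamma_F**2) = 33.3%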
In terms of applications, the TMR effect is used in non-volatile magnetic random access memory (MRAM) devices and as a field sensor, while GMR is widely used in magnetic read heads as a field sensor. MRAM devices are in tight competition with semiconductor memory cards (FLASH), since MRAM is technologically more involved (and hence more expensive) than standard silicon technology. It is problematic to use these effects for building three-terminal devices with gain that would show any advantage over standard CMOS transistors.
3.2.3
(Pseudo)Spin-Torque Domain Wall Switching in Nanomagnets
Magnetic memory based on TMR is non-volatile, may be rather fast (~1 ns), and can be scaled down considerably, towards the paramagnetic limits observed in nanomagnets. Switching, however, requires the MTJ to be placed at a crosspoint of the bit and word wires carrying a current that produces a magnetic field sufficient for switching the domain orientation in the free (unpinned) MTJ ferromagnetic electrodes. The undesirable side effects are crosstalk between cells, a rather complex layout, and the power budget. Alternatively, one may take a GMR multilayer in a nanopillar form with antiparallel orientation of the magnetic moments and run a current through it (this would obviously correspond to a CPP geometry) (Figure 3.3). In this case, there will be a spin accumulation in the drain layer; that is, an accumulation of minority spins in the drain electrode for the antiparallel configuration of moments on the electrodes (for electrodes made of the same material). This formally means a transfer of spin (angular) momentum across the spacer layer. Injection of angular momentum means that there is a change in the spin momentum on the drain electrode: the spin projection on the quantization axis then evolves with time simply because of an influx of electrons with a different projection of polarization on the (drain electrode) quantization axis, $dP_z^R/dt \neq 0$, $P_z^R = n_+ - n_-$, where $+$ ($-$) marks the direction along (against) the quantization axis on the right electrode, with $n_{+(-)}$ the densities of majority (minority) electrons. One may call this a (pseudo)torque effectively acting on the moment of the drain electrode, although this is obviously not a good term. This simple effect was predicted in Refs. [22, 23], and observed experimentally in nanopillar multilayer stacks by Tsoi et al. [24], and also in other studies.
The spin accumulation is proportional to the density $J$ of the spin-polarized current through the drain magnet or magnetic particle, which carries a spin moment $(\hbar J/2q)$ with it per second. The change in the longitudinal component of the spin polarization $P_z^R$ in the right electrode next to the interface is then simply

$$\frac{dP_z^R}{dt} = \frac{\hbar(J_+ - J_-)}{2q} = \frac{\hbar V(G_+ - G_-)}{2q} = \frac{qV(T_+ - T_-)}{4\pi} = \frac{\hbar V}{4q}\left(G_{+\uparrow} + G_{+\downarrow} - G_{-\uparrow} - G_{-\downarrow}\right), \qquad (3.11)$$
in the linear regime, where $G_\pm = (q^2/h)T_\pm$ are the conductances for the spin-up and -down channels, expressed through the transmission probabilities $T_\pm$ and the partial spin conductances $G_{\pm\sigma}$. There is an angle $\theta$ between the quantization axes on the source and drain electrodes, $\cos\theta = \vec{P}_L\cdot\vec{P}_R/(P_L P_R)$. The partial spin conductances are in turn given by the standard Landauer expression through the transmission coefficients $t_{\pm\sigma'}$ as $T_\pm = \sum_{\sigma' = \uparrow,\downarrow}|t_{\pm\sigma'}|^2$. The above expression can be written through the partial conductances for an arbitrary configuration of spins on the electrodes, $G_{+\uparrow} = (q^2/h)|t_{+\uparrow}|^2$, $G_{-\uparrow} = (q^2/h)|t_{-\uparrow}|^2$, ..., where the orientation along (against) the spin on the right electrode, $\vec{P}_R$, is marked by the subscript $+$ ($-$). Assuming that there is no spin-flip in the oxide (non-magnetic metal) spacer, we can express the transmission amplitudes through those calculated for the parallel/antiparallel configuration of spins on the electrodes with the use of the standard rule for spin wave functions in a rotated frame.
Finally, the rate of change in the polarization of the drain electrode due to the influx of polarized current is simply
$$\frac{dP_z^R}{dt} = \frac{\hbar V}{4q}\left[\left(G_{\uparrow\uparrow} - G_{\downarrow\downarrow}\right)\cos^2\frac{\theta}{2} + \left(G_{\uparrow\downarrow} - G_{\downarrow\uparrow}\right)\sin^2\frac{\theta}{2}\right], \qquad (3.12)$$
and this term should be added to the driving force on the right-hand side of the Landau–Lifshitz (LL) equation. This expression should be better suited for MTJs, as in metallic spin valves one must consider the spin accumulation in the metallic spacer. Note that Slonczewski obtains $dP_z^R/dt \propto \sin\theta$ for MTJs [25], which may be inaccurate. Indeed, consider the antiparallel configuration ($\theta = \pi$) of unlike electrodes. Then $dP_z^R/dt \propto G_{\uparrow\downarrow} - G_{\downarrow\uparrow} \neq 0$, since for unlike electrodes $G_{\uparrow\downarrow} \neq G_{\downarrow\uparrow}$, and there obviously will be a change in the spin density in the right electrode, because the influx of spins into the majority states would not be equal to the influx into the minority states. To handle the resulting spin dynamics properly, one needs to write down the continuity equation for the spin, similar to Equation 3.23 below, with Equation 3.12 as the boundary condition at the interface.
Time-resolved measurements of the current-induced reversal of a free magnetic layer in permalloy/Cu/permalloy elliptical nanopillars at temperatures from 4.2 to 160 K can be found in Ref. [26]. There is a considerable device-to-device variation in the spin torque, attributed to the presence of an antiferromagnetic oxide layer around the perimeter of the permalloy free layer (and to some ambiguity in the expression used for the torque itself). Obviously, controlling this layer would be very important for the viability of the whole approach for memory applications, and so on. There are reports of an activation character of the switching, which may be related to pinning of the domain walls at the side walls of the pillar. The injected DC polarized current may also induce a magnetic vortex oscillation, when a vortex is formed in a magnetic island in, for example, a pillar-like spin valve. Such induced oscillations have recently been found [27]. It is worth noting that the agreement between theory and experiment may be fortuitous: thus, in permalloy nanowires the speed of the domain wall substantially exceeds the rate of spin angular momentum transfer [28].
3.3
Spin-Orbital Coupling and Electron Interference Semiconductor Devices

In an asymmetric confining potential, the conduction electrons acquire the Vas'ko–Rashba (VR) spin-orbit term

$$H_R = \alpha\,(\vec{\sigma}\times\vec{k}\,)\cdot\hat{z} = \alpha\,(\sigma_x k_y - \sigma_y k_x). \qquad (3.14)$$

The magnitude of the coupling constant $\alpha$ depends on the confining potential, and this can in principle be modified by gating. It also defines the spin-precession wave vector $k_\alpha = \alpha m^*/\hbar^2$. Such a term, $H_R$, is also present in cubic systems with strain [35]. The Vas'ko–Rashba Hamiltonian for heavy holes is cubic in $k$ and, generally, very small.
Electric fields due to impurities (and external fields) lead to extrinsic contributions to the spin-orbit coupling in the standard form

$$H_{ext} = \lambda\,(\vec{k}\times\nabla U)\cdot\vec{\sigma}, \qquad (3.15)$$

where $U$ is the potential due to impurities and an externally applied field, with the coupling constant $\lambda$ derived from the $8\times 8$ Kane Hamiltonian in third-order perturbation theory [33, 34]:

$$\lambda = \frac{P^2}{3}\left[\frac{1}{E_g^2} - \frac{1}{(E_g + \Delta)^2}\right], \qquad (3.16)$$

where $P$ is the matrix element of momentum found from $\langle S|p_x|X\rangle = \langle S|p_y|Y\rangle = \langle S|p_z|Z\rangle = iPm_0/\hbar$. This is the same analytical form as the vacuum spin-orbit coupling but, for $\Delta > 0$, the coupling has the opposite sign. Numerically, $\lambda \approx 5.3$ Å$^2$ for GaAs and 120 Å$^2$ for InAs; that is, the spin-orbit coupling in n-GaAs is six orders of magnitude larger than in vacuum [31]. This helps to generate the relatively large extrinsic spin currents observed in the spin-Hall effect (see below). In 2-D, $H_{ext} = \lambda\,(\vec{k}\times\nabla U)_z\,\sigma_z$.
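Equation 3.16 is easy to check numerically. A minimal sketch, assuming the standard literature Kane parameters for GaAs ($P \approx 9.9$ eV Å, $E_g \approx 1.52$ eV, $\Delta \approx 0.34$ eV, which are not given in the text), reproduces the order of magnitude of the quoted $\lambda \approx 5.3$ Å$^2$:

    def lambda_so(P_eVA, Eg, Delta):
        """Extrinsic SO coupling, Eq. 3.16; P in eV*Angstrom, energies in eV, result in Angstrom^2."""
        return (P_eVA ** 2 / 3.0) * (1.0 / Eg ** 2 - 1.0 / (Eg + Delta) ** 2)

    print(f"GaAs: lambda ~ {lambda_so(9.9, 1.52, 0.34):.1f} A^2")   # ~4.7 A^2, cf. 5.3 A^2 quoted above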
Now we can analyze the behavior of polarized electrons injected into the 2DEG channel. The free-electron VR Hamiltonian $H_{VR}$ has two eigenstates with momenta $k_\pm(E)$ for opposite spins at each energy $E$, with $k_+ - k_- = 2m^*\alpha/\hbar^2$. Datta and Das [3] noted that the conductivity of the device depends on the phase difference $\Delta\theta = (k_+ - k_-)L = 2m^*\alpha L/\hbar^2$ between the electron carriers after crossing a ballistic channel of length $L$, and oscillates with a period defined by the interference condition $(k_+ - k_-)L = 2\pi n$, with $n$ an integer. An equivalent description of the same phenomenon is the precession of the electron spin in an effective magnetic field $\vec{B}_{so} = (2\alpha/g\mu_B)(\vec{k}\times\hat{z})$, with $\mu_B$ the Bohr magneton and $g$ the gyromagnetic ratio. The device is supposed to be controlled by the gate voltage $V_g$, which modulates the SO coupling constant $\alpha$, $\alpha = \alpha(V_g)$. This pioneering report generated much attention, yet to date the device appears not to have been demonstrated. Using typical parameters from Refs. [36, 37], $\alpha \approx 1\times 10^{-11}$ eV m and $m^* = 0.1m_0$ for the effective carrier mass, current modulation would be observable in channels with a relatively large length $L \gtrsim 1$ $\mu$m. Given the above, the observation of the effect would require: (i) efficient spin injection into the channel from the FM source, which is tricky and requires a modified Schottky barrier (see below); (ii) the splitting should well exceed the bulk inversion asymmetry effect; (iii) the inhomogeneous broadening of $\alpha$ due to impurities, which masks the Vas'ko–Rashba splitting, should be small; and (iv) one should be able to control $\alpha$ with the gate. All of these represent great challenges for building a room-temperature interference device, where one needs to use narrow-gap semiconductors and structures with ballistic channels. The device is not efficient in the diffusive regime. The gate control of $\alpha$ has been demonstrated (see Refs. [36, 38] and references therein).
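The interference condition is simple to estimate with the parameter values quoted above; a minimal sketch evaluating $\Delta\theta = 2m^*\alpha L/\hbar^2$:

    import math

    hbar = 1.0546e-34          # J s
    m0 = 9.109e-31             # kg
    eV = 1.602e-19             # J

    alpha = 1e-11 * eV         # Rashba constant ~1e-11 eV m (value quoted above), in J m
    m_eff = 0.1 * m0           # effective mass quoted above
    L = 1e-6                   # channel length, 1 um

    dtheta = 2 * m_eff * alpha * L / hbar ** 2
    print(f"delta-theta over 1 um: {dtheta:.0f} rad (~{dtheta / (2 * math.pi):.1f} oscillation periods)")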
3.3.1
Spin-Hall Effect (SHE) and Magnetoresistance due to Edge Spin Accumulation
Recently, there has been a resurgence of interest in the spin-Hall effect (SHE), which is another general consequence of the spin-orbital interaction, predicted by Dyakonov and Perel in 1971 [40]. These authors found that, because of the spin-orbital interaction, the electric and spin currents are intertwined: an electrical current produces a transverse spin current, and vice versa. In the case when impurity scattering dominates, which is quite often, the transverse spin current is caused by the Mott skew scattering of spin-polarized carriers due to the SO interaction [see Eq. (3.13)]. Since the current drags along the polarization of the carriers, spin accumulation at the edges occurs, the so-called spin-Hall effect (SHE). In ferromagnets, the resulting Hall current is termed anomalous, and is always accompanied by the SHE. Importantly, the edge spin accumulation results in a slight decrease in the sample resistance. An external magnetic field would destroy the spin accumulation (Hanle effect) and lead to a positive magnetoresistance, recently identified by Dyakonov [41].
We present here a simple phenomenological description of the spin-Hall effects (direct and inverse) and of Dyakonov's magnetoresistance [42]. To this end, we introduce the electron charge flux $\vec{f}$, related to the current density as $\vec{J} = -q\vec{f}$, where $q$ is the elementary charge. For the part not related to the SO interaction, we have the usual drift-diffusion expression:

$$\vec{f}^{\,(0)} = -\mu n\vec{E} - D\nabla n, \qquad (3.17)$$

where $\mu$ and $D$ are the usual electron mobility and diffusion coefficient, connected by the Einstein relation, $\vec{E}$ is the electric field, and $n$ is the electron density. The spin polarization flux $t_{ij}$ is a tensor characterizing the flow of the $j$th component of the polarization density $P_j = n_{j\uparrow} - n_{j\downarrow}$ in the direction $i$ (the spin density is $s_j(\vec{r}\,) = \frac{1}{2}P_j$). It is non-zero even in the absence of spin–orbit interaction, simply because the spins are carried by the electron flux, and we mark the corresponding quantity $t_{ij}^{(0)}$. Then we have

$$t_{ij}^{(0)} = -\mu E_i P_j - D\partial_i P_j, \qquad (3.18)$$

where $\partial_i = \partial/\partial x_i$, and one can add other sources of current, such as a temperature gradient, to Equations 3.17 and 3.18. Spin–orbit interaction couples the charge and spin currents. For a material with inversion symmetry, we have [42]:

$$f_i = f_i^{(0)} + \gamma\epsilon_{ijk} t_{jk}^{(0)}, \qquad (3.19)$$

$$t_{ij} = t_{ij}^{(0)} - \gamma\epsilon_{ijk} f_k^{(0)}, \qquad (3.20)$$

$$\vec{J}/q = \mu n\vec{E} + D\nabla n + \beta\,\vec{E}\times\vec{P} + \delta\,\nabla\times\vec{P}, \qquad (3.21)$$

$$t_{ij} = -\mu E_i P_j - D\partial_i P_j + \epsilon_{ijk}\left(\beta n E_k + \delta\,\partial_k n\right), \qquad (3.22)$$
where the parameters $\beta = \gamma\mu$ and $\delta = \gamma D$ satisfy the same Einstein relation as do $\mu$ and $D$. The spin polarization vector evolves with time in accordance with the continuity equation [41, 42]:

$$\frac{\partial P_j}{\partial t} + \partial_i t_{ij} + \left[\vec{\Omega}\times\vec{P}\right]_j + \frac{P_j}{\tau_s} = 0, \qquad (3.23)$$

where the vector $\vec{\Omega} = g\mu_B\vec{H}/\hbar$ is the spin precession frequency in the applied magnetic field $\vec{H}$, and $\tau_s$ is the spin relaxation time. The term $\beta\,\vec{E}\times\vec{P}$ describes the anomalous Hall effect, where the spin polarization plays the role of the magnetic field. We ignore the action of the magnetic field on the particle dynamics, which is justified if $\omega_c\tau \ll 1$, where $\omega_c$ is the cyclotron frequency and $\tau$ is the momentum relaxation time. Since normally $\tau_s \gg \tau$, it is possible to have both $\Omega\tau_s \gtrsim 1$ and $\omega_c\tau \ll 1$ in a certain range of magnetic fields. It is also assumed that the equilibrium spin polarization in the applied magnetic field is negligible. The fluxes [Equations 3.21 and 3.22] need to be modified for an inhomogeneous magnetic field by adding a counter-term
proportional to $\partial B_j/\partial x_i$, which takes care of the force acting on an electron with a given spin in an inhomogeneous magnetic field $\vec{H}(\vec{r}\,)$.
Equations 3.21–3.23, derived in Ref. [41], fully describe all the physical consequences of spin–charge current coupling. For instance, the term $\delta\,\nabla\times\vec{P}$ describes an electrical current induced by an inhomogeneous spin density (the so-called inverse spin-Hall effect), found experimentally for the first time by Bakun et al. [42] under the conditions of optical spin orientation. The term $\beta n\epsilon_{ijk}E_k$ (and its diffusive counterpart $\delta\epsilon_{ijk}\partial n/\partial x_k$) in Equation 3.22 describes what is now called the spin-Hall effect: an electrical current induces a transverse spin current, resulting in spin accumulation near the sample boundaries [41]. Recently, a spin-Hall effect was detected optically in thin films of n-doped GaAs and InGaAs [44] (with bulk electrons) and in 2-D holes [45]. All of these phenomena are closely related and have their common origin in the coupling between the spin and charge currents given by Equations 3.21 and 3.22. Any mechanism that produces the anomalous Hall effect will also lead to the spin-Hall effect, and vice versa. Remarkably, there is a single dimensionless parameter, $\gamma$, that governs the resulting physics.
It was found recently by Dyakonov that the spin-Hall effect is accompanied by a positive magnetoresistance due to the spin accumulation near the sample boundaries [41]. The spin accumulation occurs over the spin diffusion length $L_s = \sqrt{D\tau_s}$. Therefore, it depends on the ratio of $L_s$ to the sample width $L$, and vanishes in wide samples with $L_s/L \ll 1$. In a stripe sample of width $L$ (in the $xy$ plane), the $z$-component of $\vec{P}$ varies across the stripe ($y$-axis), $\nabla\times\vec{P} \neq 0$; this creates a positive correction to the current compared with the hypothetical case of absent spin–orbit coupling, and the sample resistivity goes down. By applying a magnetic field in the $xy$ plane, one may destroy the spin polarization (the Hanle effect) and observe the positive (Dyakonov's) magnetoresistance in magnetic fields on the scale $\Omega\tau_s \sim 1$.
The data for 3-D [44] and 2-D [46] GaAs suggest the estimate $\gamma \approx 10^{-2}$; for platinum at room temperature [48] one finds $\gamma \approx 3.7\times 10^{-3}$, so in these cases the magnetoresistance due to spin accumulation is on the order of $10^{-4}$ and $10^{-5}$, respectively. It should be possible to find this MR due to its characteristic dependence on the field and on the width of the sample, when the latter becomes comparable to the spin diffusion length. Because of the high sensitivity of electrical measurements, magnetoresistance might provide a useful tool for studying the spin–charge interplay in semiconductors and metals.
3.3.2
Interacting Spin Logic Circuits
There are various suggestions for more exotic devices based on, for example, arrays of spin-polarized quantum dots with exchange-coupled single spins, with a typical exchange coupling energy $\delta \sim 1$ meV [47], or magnetic quantum cellular automata [48]. It is assumed that one can apply a local field or a short magnetic $\pi$-pulse to flip the input spin, which would result in the nearest-neighbor spins flipping in accordance with the new ground state (of the antiferromagnetically coupled circuit of quantum dots). The idea is that such spin arrays (no quantum coherence is required) can be used to perform classical logic on bits represented by spins pointing along or against the quantization axis, $|\uparrow_z\rangle \to 1$, $|\downarrow_z\rangle \to 0$. However, there are problems with using these schemes. Indeed, the standard Zeeman splitting for an electron in a field of 1 T is only ~0.5 K in vacuo, so that one needs a field of at least 150 T to flip the spin (or must use materials with unusually large gyromagnetic factors); alternatively, one can apply a 1 T transversal B-field for some 30 ps to do the same. The practicalities of building such a control system at the nanoscale are a major challenge, and it would require a steep power budget. The other challenge is that, instead of the nearest-neighbor spins falling into a new shallow ground state with the directions of all the other spins fixed, the initial flip would trigger spin wave(s) in the circuit, thus destroying the initial set-up. Indeed, the spin wave spectrum in large coupled arrays of $N$ spins is almost gapless, with the excited state just $\delta/N$ above the ground state (see, e.g., Ref. [49]). Additionally, the spins are subject to a fluctuating external (effective) magnetic field that tends to excite the spin waves and destroy the direction of the spins along the set quantization axis $z$. For the same reason, keeping the spins in a coherent superposition state is unlikely, so quantum computing with coupled spins is even less practical [50].
It is clear from the above discussion, however, that it is unlikely that the Datta–Das or any other interference device can offer advantages over standard MOSFETs, especially as they do not have gain, should operate in a ballistic regime (i.e., at low temperatures in clean systems), and require a new fabrication technology.
3.4
Tunnel Magnetoresistance
Here, we describe some important aspects of TMR on the basis of a simple microscopic model for the elastic, impurity-assisted, surface-state-assisted and inelastic contributions. Most of these results are generic, and some will be useful later for analyzing room-temperature spin injection into a semiconductor through a modified Schottky barrier. A model for spin tunneling was formulated by Julliere [12], and further developed in Refs. [17, 18, 21]. It is expected to work rather well for iron-, cobalt-, and nickel-based metals, according to theoretical analysis [21] and experiments [15]. However, it disregards important points such as impurity scattering and the reduced effective mass of the carriers inside the barrier. Both issues have important implications for the magnetoresistance and will be considered here, along with proposed novel half-metallic systems, which should, in principle, show the ultimate performance. Enhanced performance is also found in MTJs with epitaxial MgO oxide barriers, which may be a combination of band-structure and surface effects [16, 51]. In particular, Zhang and Butler [51] predicted a very large TMR in Fe/MgO/Fe, bcc Co/MgO/Co, and FeCo/MgO/FeCo tunnel junctions, having to do with the peculiar matching of the majority spin states in the metal with the band states in the MgO tunnel barrier.
We shall describe electrons in ferromagnet–insulating barrier–ferromagnet (FM–I–FM) systems by the Schrödinger equation [17]

$$\left[-\frac{\hbar^2\nabla^2}{2m} + U(\vec{r}\,) - \frac{1}{2}\vec{\Delta}_{xc}(\vec{r}\,)\cdot\hat{\sigma}\right]\psi = E\psi,$$

with $U(\vec{r}\,)$ the potential (barrier) energy and $\vec{\Delta}_{xc}(\vec{r}\,)$ the exchange splitting of, for example, the ferromagnetic electrodes.
The spin current density is written as

$$J_{\sigma\sigma'} = \frac{q}{h}\int\frac{d^2 k_\parallel}{(2\pi)^2}\int dE\,\left[f\!\left(E - F^{FM1}_{\sigma}\right) - f\!\left(E - F^{FM2}_{\sigma'}\right)\right]T_{\sigma\sigma'}(E, \vec{k}_\parallel), \qquad (3.24)$$

where $f(x)$ is the Fermi–Dirac distribution function with local Fermi level $F^{FM1(2)}_{\sigma(\sigma')}$ for the ferromagnetic electrode FM1(2), and $T = \sum_{\sigma\sigma'}T_{\sigma\sigma'}$ is the transmission probability from the majority (minority) spin subband in FM1, $\sigma = \uparrow$ or $\downarrow$, into the majority (minority) spin subband in FM2, $\sigma' = \uparrow$ ($\downarrow$). It has a particularly simple form for a square barrier and collinear [parallel (P) or antiparallel (AP)] moments on the electrodes:
$$T_{\sigma\sigma'} = \frac{16\,m_1 m_3 m_2^2\, k_1 k_3\kappa^2}{\left(m_2^2 k_1^2 + m_1^2\kappa^2\right)\left(m_2^2 k_3^2 + m_3^2\kappa^2\right)}\,e^{-2\kappa w}, \qquad (3.25)$$

where $\kappa$ is the attenuation constant for the wavefunction in the barrier ($k_1 = k_{1\sigma}$, $k_2 = i\kappa$, $k_3 = k_{3\sigma'}$ are the momenta normal to the barrier for the corresponding spin subbands), $w$ is the barrier width, and we have used the limit of $T$ at $\kappa w \gg 1$ [18]. With the use of Equations 3.17 and 3.18, and accounting for the misalignment of the magnetic moments in the ferromagnetic terminals (given by the mutual angle $\theta$), we obtain the following expression for the junction conductance per unit area, assuming $m_1 = m_3$:

$$G = G_0\left(1 + P_1 P_2\cos\theta\right), \qquad (3.26)$$

with the effective polarizations $P_{1(2)}$ of the electrodes given by Equation 3.27.
Tunneling assisted by polarized surface states may lead to a much larger TMR; this may be relevant to the observed large values of TMR [19].
The most striking feature of Equation 3.26 is that the MR tends to infinity for vanishing $k_\downarrow$; that is, when the electrodes are made of a 100% spin-polarized material ($P = P' = 1$) because of a gap in the density of states (DOS) for minority carriers up to their conduction band minimum $E_{CB\downarrow}$. Then $G_{AP}$ vanishes together with the transmission probability, Equation 3.25, as there is zero DOS at $E = \mu$ for both spin directions. Such half-metallic behavior is rare, but some materials possess this amazing property, most interestingly the oxides CrO$_2$ and Fe$_3$O$_4$ (e.g., see the recent discussion in Ref. [2]). These oxides are very interesting for future applications in combination with matching materials, as will be seen below.
Remarkably, for $|V| < V_c$ in the AP geometry one has MR $= \infty$. From the middle and bottom panels in Figure 3.4 we see that even at a 20° deviation from the AP configuration, the value of MR exceeds 3000% in the interval $|V| < V_c$, which is indeed a very large value.
3.4.1
Impurity Suppression of TMR
The conductance due to resonant tunneling through impurity levels in the barrier is

$$G^{(1)}_\sigma = \frac{2e^2}{\pi\hbar}\sum_i\frac{\Gamma_{L\sigma}\Gamma_{R\sigma}}{(E_i - \mu)^2 + \Gamma_\sigma^2}, \qquad (3.28)$$
where $\Gamma_\sigma = \Gamma_{L\sigma} + \Gamma_{R\sigma}$ is the total width of a resonance, given by the sum of the partial widths $\Gamma_{L(R)\sigma}$ corresponding to electron tunneling from the impurity state at energy $E_i$ to the left (right) terminal. It is easiest to analyze the cases of parallel (P) and antiparallel (AP) mutual orientation of the magnetic moments $M_1$ and $M_2$ on the electrodes, with the angle $\theta$ between them. In this case, one looks at the tunneling of majority (maj) and minority (min) carriers from the left electrode, $L\sigma = (L\mathrm{maj}, L\mathrm{min})$, into the states $R\sigma = (R\mathrm{maj}, R\mathrm{min})$ for the parallel orientation ($\theta = 0$), or $R\sigma = (R\mathrm{min}, R\mathrm{maj})$ for the antiparallel orientation ($\theta = \pi$), respectively. The general case is then easily obtained from standard spinor algebra for the spin projections. The tunnel widths can be evaluated analytically for a rectangular barrier, $\Gamma_{L\sigma} = g_{L\sigma}\Omega\exp[-\kappa(w + 2z_i)]$, where $z_i$ is the coordinate of the impurity with respect to the center of the barrier, $g_{L\sigma}$ is the density of states in the (left) electrode, and $\Omega$ is a constant prefactor [18].
The resonant conductance, Equation 3.28, has a sharp maximum [$\approx e^2/(2\pi\hbar)$] when $\mu = E_i$ and $\Gamma_L = \Gamma_R$, that is, for a symmetric position of the impurity in the barrier in the parallel configuration. For the antiparallel configuration, the most effective impurities will be positioned somewhat off-center, since the DOS for the majority and minority spins may be quite different. The asymmetric position of the effective impurities in the AP orientation immediately suggests a smaller conductance $G_{AP}$ than $G_P$, and a positive (normal) impurity TMR $> 0$. This result is confirmed by direct calculation. Indeed, if we assume that there are $\nu$ defect/localized levels per unit volume and unit energy interval in the barrier then, replacing the sum by an integral in Equation 3.28 and considering a general configuration of the magnetic moments on the terminals, we obtain the following formula for the impurity-assisted conductance per unit area, in leading order in $\exp(-\kappa w)$:
$$G_1 = g_1\left(1 + P_L P_R\cos\theta\right), \qquad (3.29)$$

$$g_1 = \frac{e^2}{\pi\hbar}N_1, \qquad N_1 = \pi^2\nu\Gamma_1/\kappa, \qquad P_{L(R)} = \frac{\rho_\uparrow - \rho_\downarrow}{\rho_\uparrow + \rho_\downarrow}, \qquad (3.30)$$

with $N_1$ the effective number of one-impurity channels per unit area; one may call $P_{L(R)}$ the polarization of the impurity channels, defined by the factor $\rho_\sigma = m_2\kappa k_\sigma/(\kappa^2 + m_2^2 k_\sigma^2)^{1/2}$, with the momenta $k_\sigma$ for the left (right) [L (R)] electrode.
Comparing the direct and impurity-assisted contributions to the conductance, we see that the latter dominates when the density of localized states $\nu \gtrsim (\kappa/\pi)^3 E_i^{-1}\exp(-\kappa w)$; in our example, a crossover takes place at a density of localized states $\nu \gtrsim 10^{17}$ eV$^{-1}$ cm$^{-3}$. When resonant transmission dominates, the magnetoresistance is given by

$$\mathrm{MR}_1 = \frac{2P_L P_R}{1 - P_L P_R}, \qquad (3.31)$$

which is only about 4% in the case of Fe. We see that MR$_1$ is indeed suppressed, yet remains positive (unless the polarization of the tunnel carriers is opposite to the magnetization direction on one of the electrodes, in which case the MR is obviously inverted for trivial reasons). There are speculations about the possibility of a negative MR$_1$, which is analyzed in the following subsection.
We have estimated the above critical DOS of localized states for the case of an Al$_2$O$_3$ barrier; in systems such as amorphous Si, the density of localized states is higher because of the considerably smaller band gap, and is estimated as $8\times 10^{18}$ eV$^{-1}$ cm$^{-3}$, mainly due to dangling bonds and band-edge smearing caused by disorder [55]. One can appreciate that in junctions with the thin amorphous Al$_2$O$_3$ barriers (<20–25 Å) of practical interest, impurity-assisted tunneling is not the major effect, so the above consideration of elastic tunneling applies. In a seminal work, Beasley and coworkers studied a-Si barriers with a wide variety of thicknesses, $w = 30$–1000 Å, and obtained detailed data on the crossover from direct tunneling to directed inelastic hopping along statistically rare, yet highly conductive, chains of localized states. The crossover thicknesses depend heavily on the materials parameters of the barrier. The above-described suppression of TMR by impurities was confirmed experimentally for magnetic tunnel junctions in Ref. [56].
3.4.2
Negative Resonant TMR?
It should be noted that the MR becomes suppressed yet remains positive. Indeed, when impurity-assisted tunneling dominates the conductance, can the MR change sign, or become inverted? It was shown above that the asymmetry of the polarized DOS in the contacts gives a positive resonant-tunneling TMR. A negative (inverse) MR$_1$ can only appear if the dominating impurity levels were lined up with the Fermi level [Equation 3.28] and positioned asymmetrically in the barrier, with, for example, the right width of the resonance much larger than the left width:

$$\Gamma_{R\sigma} \approx \Gamma_\sigma \gg \Gamma_{L\sigma} \propto g_\sigma, \qquad G_\sigma \approx \frac{e^2}{\pi\hbar}\frac{g_\sigma}{\Gamma_\sigma}, \qquad \mathrm{TMR} \approx -P_1 P_2; \qquad (3.32)$$
the latter was noted in Ref. [57]. However, the required coincidence is statistically very unlikely in tunnel junctions, where the number of impurity states involved is $\gg 1$, as in all usual situations, with the possible exception of very small-area tunnel junctions. Indeed, the data in Ref. [57] suggest that, in a tiny percentage of small-area junctions, the TMR is negative. Attempts to simulate an amorphous barrier that might produce such a result in the resonant tunneling regime showed, however, that one needs an unphysically large amount of disorder in the barrier to obtain traces of negative TMR. Indeed, for a barrier with height $U = 1.5$ eV, an unphysical amount of on-site disorder, $g = 4U = 6$ eV, would have to be assumed. It must be concluded that the speculations about negative resonant TMR in Ref. [57] have nothing to do with most observations of inverse TMR. Averaging over disorder suppresses the TMR, as predicted in Ref. [18] and observed in, for example, Ref. [56]. It is noted, however, that in the case when the impurity states are located at a particular interface in the barrier, perhaps as in tunnel junctions with composite MgO/NiO barriers [58], there may be a suppression and a slight inversion of the TMR in a certain window of bias voltages, given by the energy interval occupied by the interfacial states, as described elsewhere [59].
3.4.3
Tunneling in Half-Metallic Ferromagnetic Junctions
Now we shall discuss a couple of systems with half-metallic behavior, CrO$_2$/TiO$_2$ and CrO$_2$/RuO$_2$ (Figure 3.5). These are based on half-metallic CrO$_2$, and all species have the rutile structure type with almost perfect lattice matching, which should yield a good interface and should help in keeping the system at the desired stoichiometry. TiO$_2$ and RuO$_2$ are used as the barrier/spacer oxides. The electronic structure of CrO$_2$/TiO$_2$ is truly stunning in that it has a half-metallic gap which is 2.6 eV wide and extends on both sides of the Fermi level, where there is a gap in either the minority or the majority spin band. Thus, a huge magnetoresistance should, in principle, be seen not only for electrons at the Fermi level biased up to 0.5 eV, but also for hot electrons starting at about 0.5 eV above the Fermi level. We note that the states at the Fermi level are a mixture of Cr(d) and O(2p) states, so that the p–d interaction within the first coordination shell produces a strong hybridization gap, and the Stoner spin-splitting moves the Fermi level right into the gap for minority carriers (Figure 3.5). It is worth noting that CrO$_2$ and RuO$_2$ are very similar in terms of their paramagnetic band structures, but the difference in the number of conduction electrons and in the exchange splitting results in the usual metallic behavior of RuO$_2$, as compared with the half-metallic ferromagnet CrO$_2$.
An important difference between the two spacer oxides is that TiO$_2$ is an insulator whereas RuO$_2$ is a good metallic conductor. Thus, the former system can be used in a tunnel junction, whereas the latter will form a metallic multilayer. In the latter case the physics of conduction is different from tunneling, but the effect of the vanishing phase volume for transmitted states still works when a current is passed through such a system perpendicular to the planes. One interesting possibility is to form three-terminal devices with these systems, like a spin-valve transistor [60], and to check the effect in the hot-electron region. CrO$_2$/TiO$_2$ seems to be a natural candidate for checking the present predictions about half-metallic behavior, and for a possible record tunnel magnetoresistance. One important advantage of these systems is the almost perfect lattice match at the oxide interfaces. The absence of such a match between the conventional Al$_2$O$_3$ barrier and the Heusler half-metallics (NiMnSb and PtMnSb) may have been among the reasons for their unimpressive performance [2]. The main concerns for achieving a very large value of magnetoresistance will be spin-flip centers, magnon-assisted events, and imperfect alignment of the moments. As for conventional tunnel junctions, the present results show that the presence of defect states in the barrier, or of a resonant state, as in a resonant tunnel diode-type structure, reduces their magnetoresistance severalfold, but may dramatically increase the current through the structure.
3.4.4
Surface States Assisted TMR
Direct tunneling, as we have seen, gives a TMR of about 30%, whereas in recent experiments the TMR is well above this value, approaching 40–50% in systems with an amorphous Al$_2$O$_3$ barrier, and 200% in systems with epitaxial MgO barriers [16, 52]. It will become clear below that this enhancement is unlikely to come from inelastic processes. Until now, we have disregarded the possibility of localized states at the metal–oxide interfaces. Bearing in mind that the usual AlO$_x$ barrier is amorphous, the density of such surface states may be high, and we must take into account tunneling into/from those states. The results for Tamm states, which may exist at clean interfaces, are similar. The corresponding tunneling conductance per unit area is [19]:
$$G_s(\theta) = \frac{e^2}{\pi\hbar}B\bar{D}_s\left(1 + P_F P_s\cos\theta\right), \qquad P_s = \frac{D_{s\uparrow} - D_{s\downarrow}}{D_{s\uparrow} + D_{s\downarrow}}, \qquad \bar{D}_s = \frac{1}{2}\left(D_{s\uparrow} + D_{s\downarrow}\right). \qquad (3.33)$$
The magnon-assisted (inelastic) contributions to the current in the P and AP configurations are:

$$I_x^P = \frac{2\pi e}{\hbar}\sum_{a = L,R}X^a g^L_\downarrow g^R_\uparrow\int d\omega\,\rho^{mag}_a(\omega)\,(eV - \omega)\,\theta(eV - \omega),$$

$$I_x^{AP} = \frac{2\pi e}{\hbar}\left[X^R g^L_\uparrow g^R_\uparrow\int d\omega\,\rho^{mag}_R(\omega)\,(eV - \omega)\,\theta(eV - \omega) + X^L g^L_\downarrow g^R_\downarrow\int d\omega\,\rho^{mag}_L(\omega)\,(eV - \omega)\,\theta(eV - \omega)\right]. \qquad (3.34)$$
3.5
Spin Injection/Extraction into (from) Semiconductors
Much attention has been devoted recently to exploring the possibility of a three-terminal spin injection device, where spin is injected into a semiconductor either from a metallic ferromagnetic electrode [67, 68] or from a ferromagnetic semiconductor (FMS) electrode, as demonstrated in Ref. [64]. However, the magnetization in an FMS usually vanishes or is too small at room temperature. Relatively high spin injection from ferromagnets (FM) into non-magnetic semiconductors (S) has recently been demonstrated at low temperatures [65], but attempts to achieve efficient room-temperature spin injection have faced substantial difficulties [66]. Theoretical studies of spin injection from ferromagnetic metals, initiated in Refs. [68, 69], have been the subject of extensive research in Refs. [5–10, 69–77], which has gained much insight into the problem of spin injection/accumulation in semiconductors. As a consequence, some suggestions for spin transistors and other spintronic devices have appeared that are experimentally realizable, can work at room temperature, and exceed the parameters of standard semiconductor analogues [5, 6].
As an important distinction from spin transport in magnetic tunnel junctions, one would like to create a non-equilibrium spin polarization in the semiconductor and to manipulate it with external fields, with the possible advantage of a long spin relaxation time in comparison with the mean collision time. To be interesting for applications, there should be a straightforward method of creating a substantial non-equilibrium spin polarization density in the semiconductor. This is different from tunnel junctions, where one is interested in a large spin injection efficiency; that is, in a large resistance change with respect to the magnetic configuration of the electrodes. Obviously, a spin imbalance is created in the drain ferromagnetic electrode of an MTJ, proportional to the current density, but it is relatively minute, given the huge density of carriers in a metal. The principal difficulty of spin injection into a semiconductor from a ferromagnet is that the materials in FM-S junctions usually have very different electron affinities and, therefore, a high Schottky barrier forms at the interface [78] (Figure 3.7, curve 1). Thus, in GaAs and Si the barrier height is $\Delta \approx 0.5$–0.8 eV with practically all metals, including Fe, Ni, and Co [65, 78], and the barrier width is large, $l \gtrsim 100$ nm, for doping concentrations $N_d \lesssim 10^{17}$ cm$^{-3}$. Spin injection corresponds to a reverse current in the Schottky contact, which is saturated and usually negligible due to such large $l$ and $\Delta$ [78]. Therefore, a thin heavily doped n$^+$-S layer between the FM metal and S is used to increase the reverse current [78] determining the spin injection [6, 8, 65, 72]. This layer sharply reduces the thickness of the barrier and increases its tunneling transparency [6, 78]. Thus, a substantial spin injection has been observed in FM-S junctions with a thin n$^+$ layer [65].
One usually overlooked formal paradox of spin injection is that the current through a Schottky junction (as derived in textbooks) depends solely on the parameters of the semiconductor [78] and cannot formally be spin-polarized. Some authors even
3.5.1
Spin Tunneling through Modified (Delta-Doped) Schottky Barrier
The equilibrium electron density in the semiconductor is

$$n = N_c\exp\left(\frac{F^S - E_{c0}}{T}\right) = N_c\exp\left(-\frac{\Delta_0}{T}\right), \qquad (3.35)$$
where $F^S$ is the Fermi level in the bulk of the semiconductor, $N_c = 2M_c(2\pi m^* T)^{3/2}h^{-3}$ is the effective density of states and $M_c$ the number of effective minima of the semiconductor conduction band; $n$ and $m^*$ are the concentration and effective mass of the electrons in S [79]. Owing to the small barrier thickness $l$, the electrons can rather easily tunnel through the $\delta$-spike, but only those with an energy $E \gtrsim E_c$ can overcome the wide barrier $\Delta_0$ due to thermionic emission, where $E_c = E_{c0} + qV$. We assume here the standard convention that the bias voltage $V < 0$ and the current $J < 0$ in the reverse-biased FM-S junction, and $V > 0$ ($J > 0$) in the forward-biased FM-S junction [78]. At positive bias voltage $V > 0$, the bottom of the conduction band shifts upwards to $E_c = E_{c0} + qV$ with respect to the Fermi level of the metal. The presence of the mini-well allows the thickness of the $\delta$-spike barrier to be kept equal to $l \lesssim l_0$, and its transparency high, at voltages $qV \lesssim rT$ (see below).
The calculation of the current through the modified barrier is rather similar to that for the tunnel junctions above, with the distinction that in the present case the barrier is triangular (Figure 3.7). We again assume elastic coherent tunneling, so that the energy $E$, spin $\sigma$, and $\vec{k}_\parallel$ (the component of the wave vector $\vec{k}$ parallel to the interface) are conserved. The exact current density of electrons with spin $\sigma = \uparrow,\downarrow$ through the FM-S junction containing the $\delta$-doped layer (at the point $x = l$; Figure 3.7) is written, similarly to Equation 3.24, as:

$$J_{\sigma 0} = \frac{q}{h}\int\frac{d^2 k_\parallel}{(2\pi)^2}\int dE\,\left[f\!\left(E - F^S_{\sigma 0}\right) - f\!\left(E - F^{FM}_{\sigma 0}\right)\right]T_\sigma, \qquad (3.36)$$
where $F^S_{\sigma 0}$ ($F^{FM}_{\sigma 0}$) are the spin quasi-Fermi levels in the semiconductor (ferromagnet) near the FM-S interface, and the integration includes a summation with respect to the band index. Note that here we study a strong spin accumulation in the semiconductor. Therefore, we use the nonequilibrium Fermi levels $F^{FM}_{\sigma 0}$ and $F^S_{\sigma 0}$, describing the distributions of electrons with spin $\sigma = \uparrow,\downarrow$ in the FM and the S, respectively, which is especially important for the semiconductor. In reality, due to the very high electron density in the FM metal in comparison with the electron density in S, $F^{FM}_{\sigma 0}$ differs negligibly from the equilibrium Fermi level $F$ for the currents under consideration; therefore, we can assume that $F^{FM}_{\sigma 0} = F$, as in Refs. [18, 52] (see the discussion below).
The current in Equation 3.36 should generally be evaluated numerically for a complex band structure $E_{k\sigma}$ [79]. Analytical expressions for $T_\sigma(E, k_\parallel)$ can be obtained in the effective mass approximation, $\hbar k_\sigma = m_\sigma v_\sigma$, where $v_\sigma = |\nabla E_{k\sigma}|/\hbar$ is the band velocity in the metal. The present Schottky barrier has a pedestal with a height $(E_c - F) = \Delta_0 + qV$, which is opaque at energies $E < E_c$. For $E > E_c$ we approximate the $\delta$-barrier by a triangular shape; one can then use an analytical expression for $T_\sigma(E, k_\parallel)$ [5] and find the spin current at the bias $0 < qV \lesssim rT$, including at room temperature:

$$J_{\sigma 0} = j_0 d_\sigma\left[\frac{2n_{\sigma 0}}{n}e^{qV/T} - 1\right], \qquad (3.37)$$

$$j_0 = \alpha_0 n q v_T\exp(-\eta\kappa_0 l), \qquad (3.38)$$

$$d_\sigma = \frac{v_T v_{\sigma 0}}{v_{\sigma 0}^2 + v_{t0}^2}. \qquad (3.39)$$
Here $\alpha_0 \approx 1.2(\kappa_0 l)^{1/3}$, $\kappa_0 = 1/l_0 = (2m^*/\hbar^2)^{1/2}(\Delta - \Delta_0 - qV)^{1/2}$, $v_{t0} = \sqrt{2(\Delta - \Delta_0 - qV)/m^*}$ is the characteristic tunnel velocity, $v_{\sigma 0} = v_\sigma(E_c)$ is the velocity of the polarized electrons in the FM with energy $E = E_c$, and $v_T = \sqrt{3T/m^*}$ is the thermal velocity. At larger reverse bias, the mini-well to the right of the spike in Figure 3.7 disappears and the current practically saturates. Quite clearly, the electrons incident almost normally on the interface tunnel most easily and contribute most of the current (a more careful sampling can be done numerically [79]).
One can see from Equations 3.37 and 3.38 that the total current $J = J_{\uparrow 0} + J_{\downarrow 0}$ and its spin components $J_{\sigma 0}$ depend on the conductivity of the semiconductor, but not of the ferromagnet, as in the usual Schottky junction theories [79]. On the other hand, $J_{\sigma 0}$ is proportional to the spin factor $d_\sigma$ and the coefficient $j_0 d_\sigma \propto v_T^2 \propto T$, but not to the usual Richardson factor $T^2$ [78]. Equation 3.37, for the current in the FM-S structure, is valid for any sign of the bias voltage $V$. Note that at $V > 0$ (forward bias) it determines the spin current from S into FM; hence, it describes spin extraction from S [8].
Following the pioneering studies of Aronov and Pikus [67], one customarily assumes the boundary condition $J_{\uparrow 0} = (1 + \Gamma_F)J/2$. Since there is spin accumulation in S near the FM-S boundary, the density of electrons with spin $\sigma$ in the semiconductor is $n_{\sigma 0} = n/2 + \delta n_{\sigma 0}$, where $\delta n_{\sigma 0}$ is a non-linear function of the current $J$, with $\delta n_{\sigma 0} \propto J$ at small currents [67] (see also below). Therefore, the larger $J$, the higher $\delta n_{\sigma 0}$ and the smaller the current $J_{\sigma 0}$ [see Equation 3.37]. In other words, a type of negative feedback is realized, which decreases the spin injection efficiency $\Gamma$ and makes it a non-linear function of $J$ (see below). We show that the spin injection efficiency $\Gamma(0)$ and the polarization $P(0) = [n_\uparrow(0) - n_\downarrow(0)]/n$ in the semiconductor near the FM-S junction differ essentially, and that both are small at small bias voltage $V$ (and current $J$) but increase with the current up to $P_F$. Moreover, $P_F$ can differ essentially from $\Gamma_F$, and may ideally approach 100%.
The current in a spin channel $\sigma$ is given by the standard drift-diffusion approximation [67, 76]:

$$J_\sigma = q\mu n_\sigma E + qD\nabla n_\sigma, \qquad (3.40)$$

where $E$ is the electric field, and $D$ and $\mu$ are the diffusion constant and mobility of the electrons, respectively. $D$ and $\mu$ do not depend on the electron spin $\sigma$ in non-degenerate semiconductors. From the current continuity and electroneutrality conditions,

$$J_x = \sum_\sigma J_\sigma = \mathrm{const}, \qquad n(x) = \sum_\sigma n_\sigma = \mathrm{const}, \qquad (3.41)$$

we find

$$E_x = J/q\mu n = \mathrm{const}, \qquad \delta n_\downarrow(x) = -\delta n_\uparrow(x). \qquad (3.42)$$

The spin density relaxes according to the continuity equation

$$\nabla J_\sigma = \frac{q\,\delta n_\sigma}{\tau_s}, \qquad (3.43)$$
where, in the present one-dimensional case, $\nabla = d/dx$. With the use of Equations 3.40 and 3.42, we obtain the equation for $\delta n_\uparrow(x) = -\delta n_\downarrow(x)$ [67, 76]. Its solution, satisfying the boundary condition $\delta n_\uparrow \to 0$ at $x \to \infty$, is

$$\delta n_\uparrow(x) = C\exp\left(-\frac{x}{L}\right) = \frac{n}{2}P(0)\exp\left(-\frac{x}{L}\right), \qquad (3.44)$$

$$L = L_{inject(extract)} = \sqrt{\frac{L_E^2}{4} + L_s^2} \pm \frac{L_E}{2} = L_s\left(\sqrt{\frac{J^2}{4J_S^2} + 1} \pm \frac{|J|}{2J_S}\right), \qquad (3.45)$$

where $L_s = \sqrt{D\tau_s}$ is the usual spin-diffusion length and $L_E = \mu|E|\tau_s = L_s|J|/J_S$ is the spin-drift length. Here we have introduced the characteristic current density

$$J_S = qDn/L_s, \qquad (3.46)$$

and the plus and minus signs in the expression for the spin penetration depth $L$, Equation 3.45, refer to spin injection at a reverse bias voltage, $J < 0$, and spin extraction at a forward bias voltage, $J > 0$, respectively. Note that $L_{inject} > L_{extract}$, and the spin penetration depth for injection increases with current: at large currents, $|J| \gg J_S$, $L_{inject} \approx L_s|J|/J_S \gg L_s$, whereas $L_{extract} \approx L_s J_S/|J| \ll L_s$.
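Equation 3.45 is simple to tabulate. A minimal sketch, with lengths in units of $L_s$ and $L_E/L_s = |J|/J_S$, makes the asymmetry between injection and extraction explicit:

    import math

    def L_over_Ls(J_over_JS, inject=True):
        """Spin penetration depth, Eq. 3.45, in units of L_s."""
        x = abs(J_over_JS) / 2.0
        root = math.sqrt(x * x + 1.0)
        return root + x if inject else root - x

    for j in (0.1, 1.0, 10.0):
        print(f"|J|/J_S = {j:5.1f}: L_inject = {L_over_Ls(j, True):6.2f} L_s, "
              f"L_extract = {L_over_Ls(j, False):.3f} L_s")
    # note that L_inject * L_extract = L_s**2 for any current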
The degree of spin polarization of the non-equilibrium electrons (i.e., the spin accumulation in the semiconductor near the interface) is given simply by the parameter $C$ in Equation 3.44:

$$\frac{2C}{n} = \frac{n_\uparrow(0) - n_\downarrow(0)}{n} \equiv P_n(0) = P(0). \qquad (3.47)$$

Matching the diffusion current, Equations 3.40 and 3.44, with the tunneling current, Equation 3.37, at the interface gives

$$J_{\uparrow 0} = \frac{J}{2} + \frac{P(0)}{2}\left(J - \frac{qDn}{L}\right) = j_0 d_\uparrow\left[\gamma + (\gamma + 1)P(0)\right], \qquad (3.48)$$

where $\gamma = \exp(qV/T) - 1$. From Equation 3.48, one obtains a quadratic equation for $P_n(0)$ with a physical solution that can be written fairly accurately as

$$P(0) = -\frac{P_F\gamma L_E}{\gamma L + L_E}. \qquad (3.49)$$
By substituting Equation 3.49 into Equation 3.37, we find for the total current $J = J_{\uparrow 0} + J_{\downarrow 0}$:

$$J \approx J_m\gamma = J_m\left(e^{qV/T} - 1\right), \qquad (3.50)$$

$$J_m = j_0\left(d_\uparrow + d_\downarrow\right), \qquad (3.51)$$

for the bias range $|qV| \lesssim rT$. The sign of the Boltzmann exponent is unusual because we consider the tunneling thermoemission current in a modified barrier. Obviously, we have $J > 0$ ($J < 0$) when $V > 0$ ($V < 0$), for forward (reverse) bias.
We notice that at a reverse bias voltage $qV \approx -rT$ the shallow potential mini-well vanishes, and $E_c(x)$ takes the shape shown in Figure 3.7 (curve 3). For $|qV| > rT$, a wide potential barrier at $x > l$ (in S, behind the spike) remains flat (with a characteristic length scale of $\approx$100 nm at $N_d \lesssim 10^{17}$ cm$^{-3}$), as in usual Schottky contacts [78]. Therefore, the current becomes weakly dependent on $V$, since the barrier is opaque for electrons with energies $E < E_c + rT$ (curve 4). Thus, Equation 3.50 is valid only at $|qV| \lesssim rT$, and the reverse current at $|qV| \gtrsim rT$ practically saturates at the value

$$J_{sat} \approx qn\alpha_0 v_T\left(d_\uparrow + d_\downarrow\right)\left(1 - P_F^2\right)\exp(r - \eta\kappa_0 l). \qquad (3.52)$$
With the use of Equations 3.50 and 3.45, we obtain from Equation 3.49 the spin polarization of the electrons near the FM-S interface:

$$P(0) = -P_F\,\frac{2J}{2J_m + \sqrt{J^2 + 4J_S^2} + J}. \qquad (3.53)$$

The spin injection efficiency at the FM-S interface is, using Equations 3.40, 3.44, 3.45 and 3.46,

$$\Gamma(0) = \frac{J_{\uparrow 0} - J_{\downarrow 0}}{J_{\uparrow 0} + J_{\downarrow 0}} = P(0)\frac{L}{L_E} = P_F\,\frac{\sqrt{4J_S^2 + J^2} - J}{2J_m + \sqrt{J^2 + 4J_S^2} + J}. \qquad (3.54)$$
One can see that $\Gamma(0)$ strongly differs from $P(0)$ at small currents. As expected, $P(0) \approx P_F|J|/J_m \to 0$ vanishes with the current (Figure 3.8), with a prefactor that differs from those obtained in Refs. [67, 71, 73, 75, 76].
These expressions should be compared with the results for the case of a degenerate semiconductor, Ref. [10]: for the polarization,

$$P(0) = -\frac{6P_F J}{3\left(\sqrt{J^2 + 4J_S^2} + J\right) + 10J_m}, \qquad (3.55)$$

and for the spin injection efficiency,

$$\Gamma(0) = P_F\,\frac{\sqrt{4J_S^2 + J^2} - J}{2J_m + \sqrt{J^2 + 4J_S^2} + J}. \qquad (3.56)$$
With increasing current $J$, the spin polarization (of the electron density) approaches the maximum value $P_F$. Unlike the spin accumulation $P(0)$, the spin injection efficiency (polarization of the current) $\Gamma(0)$ does not vanish at small currents, but approaches the value $\Gamma(0) \approx P_F J_S/(J_S + J_m) \ll P_F$ in the present system with a transparent tunnel $\delta$-barrier. There is an important difference here with magnetic tunnel junctions, where the tunnel barrier is relatively opaque and the injection efficiency (polarization of the current) is high, $\Gamma \approx P_F$ [18]. However, the polarization of the carriers, $P(0)$, measured in, for example, spin-LED devices [66], would be minute (see below). Both $P(0)$ and $\Gamma(0)$ approach the maximum $P_F$ only when $|J| \gg J_S$ (Figure 3.9). The condition $|J| \gg J_S$ is fulfilled at $|qV| \approx rT \gtrsim 2T$, when $J_m \gtrsim J_S$.
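The contrasting behavior of $P(0)$ and $\Gamma(0)$ is transparent in a direct evaluation of Equations 3.53 and 3.54. A minimal sketch, assuming the ratio $J_m/J_S = 10$ and keeping reverse-bias currents within $|J| < J_m$:

    import math

    def P0(J, Jm, JS, PF=1.0):
        """Spin accumulation near the interface, Eq. 3.53, in units of P_F."""
        return -PF * 2 * J / (2 * Jm + math.sqrt(J * J + 4 * JS * JS) + J)

    def Gamma0(J, Jm, JS, PF=1.0):
        """Spin injection efficiency, Eq. 3.54, in units of P_F."""
        return PF * (math.sqrt(J * J + 4 * JS * JS) - J) / (2 * Jm + math.sqrt(J * J + 4 * JS * JS) + J)

    Jm, JS = 10.0, 1.0
    for J in (-0.1, -1.0, -9.9):    # reverse bias (injection), J < 0
        print(f"J/J_S = {J:5.1f}: P(0) = {P0(J, Jm, JS):.3f} P_F, Gamma(0) = {Gamma0(J, Jm, JS):.3f} P_F")
    # at small |J| the accumulation P(0) vanishes while Gamma(0) stays near J_S/(J_m + J_S)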
Another situation is realized in forward-biased FM-S junctions, where $J > 0$. Indeed, according to Equations 3.53 and 3.54, at $J > 0$ the electron density distribution is such that $\mathrm{sign}(\delta n_{\uparrow 0}) = -\mathrm{sign}(P_F)$. If a system like elemental Ni is considered (Figure 3.7), then $P_F(F + \Delta_0) < 0$ and $\delta n_{\uparrow 0} > 0$; that is, the electrons with spin $\sigma = \uparrow$ would be accumulated in the non-magnetic semiconductor (NS), whereas the electrons with spin $\sigma = \downarrow$ would be extracted from the NS (the opposite situation would take place for $P_F(F + \Delta_0) > 0$). One can see from Equation 3.53 that $|P(0)|$ can reach the maximum $P_F$ only when $J \gg J_S$. According to Equation 3.50, the condition $J \gg J_S$ can only be fulfilled when $J_m \gg J_S$. In this case, Equation 3.53 reduces to

$$P(0) = -\frac{P_F J}{J_m} = P_F\left(1 - e^{qV/T}\right). \qquad (3.57)$$
In this regime, the spin penetration depth is

$$L \simeq \frac{L_s^2}{L_E} \simeq \frac{L_s J_S}{J} \ll L_s \quad\text{at } J \gg J_S; \tag{3.58}$$
that is, it decreases as L ∝ 1/J (Figure 3.8). We see from Equation 3.54 that at J ≫ JS

$$\Gamma(0) \simeq \frac{P_F\,J_S^2}{J_m J} \to 0. \tag{3.59}$$
Hence, the behavior of the spin injection efficiency at forward bias (extraction) is very different from the spin injection regime, which occurs at a reverse bias voltage: here, the spin injection efficiency Γ(0) remains ≪ PF and vanishes at large currents as Γ(0) ∝ JS/J. Therefore, we come to the unexpected conclusion that the spin polarization of electrons accumulated in a non-magnetic semiconductor near a forward-biased FM-S junction can be relatively large for the parameters of the structure when the spin injection efficiency is actually very small [8]. Similar, albeit much weaker, phenomena are possible in systems with wide opaque Schottky barriers [80] and have probably been observed [81]. Spin extraction may also be observed at low temperature in FMS-S contacts [82]. A proximity effect leading to polarization accumulation in FM-S contacts [83] may be related to the same mechanism.
3.5.2
Conditions for Efficient Spin Injection and Extraction
According to Equations 3.51 and 3.46, the condition for maximal polarization of electrons Pn can be written as

$$J_m \gtrsim J_S. \tag{3.60, 3.61}$$

Note that when l ≲ l0, the spin injection efficiency at small current is small, Γ0(0) ≈ PF/(1 + b) ≪ PF, since in this case the value b = (d↑0 + d↓0)a0vTτs/Ls ≫ 1 (Equation 3.62). The condition can be met only when the δ-doped layer is very thin, l ≲ l0 ≪ κ0⁻¹. With typical semiconductor parameters at T = 300 K (D = 25 cm² s⁻¹, (Δ − Δ0) = 0.5 eV, v0 ≈ 10⁸ cm s⁻¹ [78]), the condition in Equation 3.62 is satisfied at l ≲ l0 when the spin-coherence time τs ≫ 10⁻¹² s. It is worth noting that this can certainly be met: for instance, τs can be as large as 1 ns even at T = 300 K (e.g., in ZnSe [84]).
Note that the higher the semiconductor conductivity, σs = qμn ∝ n, the larger the threshold current Jm ∝ n [Equation 3.51] that must be exceeded for achieving the maximal spin injection. In other words, the polarization P(0) reaches the maximum value PF at a smaller current in high-resistance, lightly doped semiconductors compared to heavily doped semiconductors. Therefore, the conductivity mismatch [70, 74, 75] is actually irrelevant for achieving an efficient spin injection.
The necessary condition |J| ≫ JS can be rewritten at small voltages, |qV| ≪ T, as

$$r_c \ll \frac{L_s}{\sigma_s}, \tag{3.63}$$

where rc = (dJ/dV)⁻¹ is the tunneling contact resistance. Here, we have used the Einstein relation D/μ = T/q for non-degenerate semiconductors. We emphasize that Equation 3.63 is opposite to the condition found by Rashba in Ref. [74] for small currents.
We also emphasize that the spin injection in the structures considered in the literature [4, 65-76] has been dominated by electrons at the Fermi level and, according to calculation [85], g↓(F) and g↑(F) are such that PF ≲ 40%. We also note that the condition in Equation 3.61 for the parameters of the Fe/AlGaAs heterostructure studied in Refs. [65] (l = 3 nm, l0 = 1 nm and Δ0 = 0.46 eV) is satisfied when τs ≳ 5 × 10⁻¹⁰ s, and can be fulfilled only at low temperatures. Moreover, for the concentration n = 10¹⁹ cm⁻³, Ec lies below F, so that the electrons with energies E ≈ F are involved in tunneling, but for these states the polarization is PF ≲ 40%. Therefore, the authors of Ref. [65] were indeed able to estimate the observed spin polarization as being 32% at low temperatures.
Better control of the injection can be realized in heterostructures where a δ-layer between the ferromagnet and the n-semiconductor layer is made of a very thin, heavily doped n⁺-semiconductor with a larger electron affinity than the n-semiconductor. For instance, FM-n⁺-GaAs-n-Ga1−xAlxAs, FM-n⁺-GexSi1−x-n-Si or FM-n⁺-Zn1−xCdxSe-n-ZnSe heterostructures can be used for this purpose. The GaAs, GexSi1−x or Zn1−xCdxSe layer must have the width l < 1 nm and the donor concentration Nd > 10²⁰ cm⁻³. In this case, the ultrathin barrier forming near the ferromagnet-semiconductor interface is transparent for electron tunneling. The barrier height Δ0 at the GexSi1−x-Si, GaAs-Ga1−xAlxAs or Zn1−xCdxSe-ZnSe interface is controlled by the composition x, and can be selected as Δ0 ≈ 0.05-0.15 eV. When the donor concentration in the Si, Ga1−xAlxAs, or ZnSe layer is Nd < 10¹⁷ cm⁻³, the injected electrons cannot penetrate the relatively low and wide barrier Δ0 when its width is l0 > 10 nm.
3.5.3
High-Frequency Spin-Valve Effect
Here we describe a new high-frequency spin-valve effect that can be observed in a FM-S-FM device with two back-to-back modified Schottky contacts (Figure 3.10). We find the dependence of the current on the magnetic configuration of the FM electrodes and an external magnetic field. The spatial distribution of spin-polarized electrons is determined by the continuity equation [Equation 3.43], and the current in spin channel σ is given by Equation 3.40. Note that J < 0, and thus E < 0, in the spin injection regime. With the use of the kinetic equation and Equation 3.40, we obtain the equation for δn↑, Equation 3.43 [68]. Its general solution is
$$\delta n_{\uparrow}(x) = c_1 e^{-x/L_1} + c_2 e^{-(w-x)/L_2}, \tag{3.64}$$

where $L_{1,2} = \frac{1}{2}\left(\sqrt{L_E^2+4L_s^2} \mp L_E\right)$ is the same as found earlier in Equation 3.45. Consider the case when w ≪ L1 and the transit time ttr ≈ w²/(D + μ|E|w) of the electrons through the n-semiconductor layer is shorter than τs. In this case, a
spin ballistic transport takes place; that is, the spin of the electrons injected from the FM1 layer is conserved in the semiconductor layer, σ′ = σ. The probabilities for the electron spin σ = ↑ to have projections along M2 are cos²(θ/2) and sin²(θ/2), respectively, where θ is the angle between the spin and M2. Accounting for this, we find that the resulting current through the structure saturates at bias voltage qV > T at the value
$$J = J_0\,\frac{1 - P_R^2\cos^2\theta}{1 - P_L P_R\cos\theta}, \tag{3.65}$$
where J0 is a prefactor similar to Equation 3.38. For the opposite bias, the total current J is given by Equation 3.65 with the replacement PL ↔ PR. The current J is minimal for antiparallel (AP) moments M1 and M2 in the electrodes, when θ = π, and near maximal for parallel (P) magnetic moments M1 and M2.
The present heterostructure has an additional degree of freedom, compared to tunneling FM-I-FM structures, that can be used for magnetic sensing. Indeed, the spins of the injected electrons can precess in an external magnetic field H during the transit time ttr of the electrons through the semiconductor layer (ttr < τs). The angle between the electron spin and the magnetization M2 in the FM2 layer in Equation 3.65 is in general θ = θ0 + θH, where θ0 is the angle between the magnetizations M1 and M2, and θH = γ0gHttr(m0/m) is the spin rotation angle. Here, H is the magnetic field normal to the spin direction, γ = qg/(mc) is the gyromagnetic ratio, and g is the g-factor. According to Equation 3.65, with increasing H the current oscillates with an amplitude ∝ (1 + PLPR)/(1 − PLPR) and period ΔH = 2πm(γ0gm0ttr)⁻¹ (Figure 3.11, top panel). The maximum operating speed of the field sensor is very high, since the redistribution of non-equilibrium injected electrons in the semiconductor layer occurs over the transit time ttr ≈ w/μ|E| ≈ JSwτs/(JLs); ttr ≲ 10⁻¹¹ s for w ≲ 200 nm, τs = 3 × 10⁻¹⁰ s, and J/JS ≳ 10 (D = 25 cm² s⁻¹ at T = 300 K [78]). Therefore, an operating frequency f = 1/ttr ≳ 100 GHz may be achievable at room temperature. We see that: (i) the present heterostructure can be used as a sensor for ultrafast nanoscale reading of an inhomogeneous magnetic field profile; (ii) it includes two FM-S junctions and can be used for measuring the spin polarizations of these junctions; and (iii) it is a multifunctional device where the current depends on the mutual orientation of the magnetizations in the ferromagnetic layers, an external magnetic field, and a (small) bias voltage. Thus, it can be used as a logic element, a magnetic memory cell, or an ultrafast read head.
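The transit-time and frequency estimates above are easily reproduced numerically. In the short Python sketch below, w, τs and J/JS follow the text, while the spin diffusion length Ls is an illustrative assumption not fixed by the text:

# Numerical check of the transit-time and operating-frequency estimates.
w = 200e-9        # n-layer thickness [m]
tau_s = 3e-10     # spin-coherence time [s]
J_over_Js = 10.0  # drift regime, J/J_S >~ 10
Ls = 1e-6         # assumed spin diffusion length [m] (illustrative)

t_tr = w * tau_s / (J_over_Js * Ls)   # t_tr ~ J_S*w*tau_s/(J*L_s)
print(f"t_tr ~ {t_tr:.1e} s, f = 1/t_tr ~ {1/t_tr/1e9:.0f} GHz")

With these numbers ttr ≈ 6 × 10⁻¹² s, that is, an operating frequency of order 100 GHz, consistent with the estimate in the text.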
3.5.4
Spin-Injection Devices
The high-frequency spin-valve effect described above can be used for designing a new class of ultrafast spin-injection devices, such as an amplifier, a frequency multiplier, and a square-law detector [6]. Their operation is based on the injection of spin-polarized electrons from one ferromagnet to another through a semiconductor layer, and on the spin precession of the electrons in the semiconductor layer in a magnetic field induced by a (base) current in an adjacent nanowire. The base current can control the emitter current between the magnetic layers at frequencies up to several 100 GHz. Here, we shall describe a spintronic mechanism of ultrafast amplification and frequency conversion, which can be realized in heterostructures comprising a metallic ferromagnetic nanowire surrounded by thin semiconductor (S) and ferromagnetic (FM) shells (Figure 3.12a). Practical devices may have
various layouts, with two examples shown in Figure 3.12b and c.
Let us consider the principle of operation of the spintronic devices shown in Figure 3.12a. When the thickness w of the n-type semiconductor layer is not very small (w ≳ 30 nm), tunneling through this layer is negligible. The base voltage Vb is applied between the ends of the nanowire. The base current Jb, flowing through the nanowire, induces a cylindrically symmetric magnetic field Hb = Jb/(2πr) in the S layer, where r is the distance from the center of the nanowire.
Figure 3.12 Schematic of the spin injection-precession devices having (a) cylindrical, (b) semi-cylindrical, and (c) planar shape. Here, FM1 and FM2 are the ferromagnetic layers; n-S is the n-type semiconductor layer; w the thickness of the n-S layer; δ the δ-doped layers; and NW the nanowire.

When the emitter junction is biased, the emitter current through the structure is

$$J_e = J_{e0}\left(1 + P_1 P_2\cos\theta\right), \tag{3.66}$$
!
where y y0 yH, y0 is the angle between M1 and M 2 , and qH is the angle of the spin
precesses with the frequency O gH?, where H? is the magnetic eld component
normal to the spin and g is the gyromagnetic ratio. One can see from Figure 3.12a that
H? Hb Jb/(2pr). Thus, the angle of the spin rotation is equal to yH gHbttr
ttrJb/2prs, where rs is the characteristic radius of the S layer. Then, according to
Equation 3.57,
J e J e0 1 P1 P 2 cosq0 kj J b ;
3:67
where kj gttr/2prs g/ors and o 2p/ttr is the frequency of a variation of the base
current, Jb Jscos (ot).
Equation 3.67 shows that, when the magnetization M1 is perpendicular to M2, θ0 = π/2, and for θH ≪ π,

$$J_e = J_{e0}\left(1 - k_j P_1 P_2 J_b\right), \qquad G \equiv |dJ_e/dJ_b| = J_{e0}\,k_j P_1 P_2. \tag{3.68}$$
Hence, amplification of the base current occurs with the gain G, which can be relatively high even for ω ≳ 100 GHz. Indeed, γ = q/(mc) ≈ 2.2(m0/m) × 10⁵ m (A s)⁻¹, where m0 is the free electron mass, m the effective mass of electrons in the semiconductor, and c the velocity of light. Thus, the factor kj ≈ 10³ A⁻¹ when rs = 30 nm, m0/m = 14 (GaAs) and ω ≈ 10¹¹ s⁻¹, so that G > 1 at Je0 > 0.1 mA/(P1P2).
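The gain factor kj is easily checked; a minimal Python sketch with the parameter values quoted in the text:

# Check of the gain-factor estimate: k_j = gamma/(omega*r_s),
# with gamma ~ 2.2*(m0/m)*1e5 m/(A*s); values follow the text (GaAs).
m0_over_m = 14.0            # GaAs effective-mass ratio
gamma = 2.2e5 * m0_over_m   # [m/(A*s)]
omega = 1e11                # [1/s], ~100 GHz operation
r_s = 30e-9                 # characteristic S-layer radius [m]

k_j = gamma / (omega * r_s)
print(f"k_j ~ {k_j:.0f} A^-1")   # ~1e3 A^-1, so G = J_e0*k_j*P1*P2 can exceed 1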
When M1 is collinear with M2 (θ0 = 0, π) and θH ≪ π, then, according to Equation 3.67, the emitter current is

$$J_e = J_{e0}\left(1 \pm P_1 P_2\right) \mp \frac{1}{2}J_{e0}P_1 P_2\,k_j^2 J_b^2. \tag{3.69}$$
The spin extraction effect can be used for making an efficient source of (modulated) polarized radiation. Consider a structure containing a FM-S junction with the δ-doped layer and a double p-n′-n heterostructure, where the n′-region is made from a narrower-gap semiconductor (Figure 3.13). We show that the following effects can be realized in the structure when both the FM-S junction and the heterostructure are biased in the forward direction, and the electrons are injected from the n-semiconductor region into the FM and p-regions. Due to the spin selection property of the FM-S junction [7], spin-polarized electrons appear in the n-region with a spatial extent L ≲ Ls near the FM-S interface, where Ls is the spin diffusion length in the NS. When the thickness w of the n-region is smaller than L, the spin-polarized electrons from the n-region and
holes from the p-region are injected and accumulated in a thin narrow-gap n′-region (quantum well), where they recombine and emit polarized photons.
The conditions for maximal polarization are obtained as follows. When the thickness of the n-region is w < L, we can assume that δn↑(x) ≈ δn↑(0) and Pn ≈ P(0). In this case, integrating Equation 3.43 over the volume of the n-semiconductor region (with area S and thickness w), we obtain the balance equation (3.70) for the spin-↑ currents flowing into and out of the n-region.
Here, IS = JSS, and I↑FS = J↑(0)S and I↑pn are the electron currents with spin σ = ↑ flowing into the n-region from the FM and the p-region, respectively; I↑c is the spin current out of the n-region into the contact (Figure 3.13a). The current I↑pn is determined by the injection of electrons with σ = ↑ from the n-region into the p-region through the cross-sectional area S, equal to I↑pn = Ipn n↑(0)/n = Ipn(1 + P(0))/2, where Ipn is the total current in the p-n junction. The current of the metal contact, Ic, is not spin-polarized; hence I↑c = (Ipn + IFS)/2, where IFS is the total current in the FM-S junction. The current in the FM-S junction, IFS, approaches its maximal value Im = JmS at rather small bias, qVFS > 2T. When Ipn ≪ IFS ≈ Im and Im ≫ ISw/Ls, we get |P(0)| → PF. The way to maximize the polarization is by adjusting VFS. The maximal |P(0)| can be achieved for the process of electron tunneling through the δ-doped layer when the bottom of the conduction band in the semiconductor, Ec = F + Δ0 − qVFS, is close to a peak in the density of states of minority carriers in the elemental ferromagnet (Figure 3.13c).
The rate of polarized radiative recombination is Rσ = qn′σd/τR, and the polarization of radiation is p = (R↑ − R↓)/(R↑ + R↓) = 2δn′↑/n′. Since Ipn = qn′d/τn, we find 2δn′↑/n′ = P(0)τ′s(τ′s + τn)⁻¹, so that p = P(0)τ′s(τ′s + τn)⁻¹. Thus, the radiation polarization p can approach its maximum, p = |PF|, at large current I ≈ Im when τn < τ′s. The latter condition can be met at high concentration n′, when the time of radiative recombination τR ≈ τn < τ′s. For example, in GaAs τR = 3 × 10⁻¹⁰ s at n′ ≳ 5 × 10¹⁷ cm⁻³ [86], and τ′s can be larger than τR [84]. We emphasize that the spin injection efficiency near a forward-biased FM-S junction is very small.
Practical structures may have various layouts, with one example shown in Figure 3.14. It is clear that the distribution of δn↑(r) in such a 2-D structure is characterized by the length L ≲ Ls in the direction x, where the electric field E can be strong, and by the diffusion length Ls in the (y, z) plane, where the field is weak. Therefore, the spin density near the FM and p-n junctions will be close to δn↑(0) when the size of the p-region is d < Ls. Thus, the above results for the one-dimensional structure (Figure 3.13) are also valid for the more complex geometry shown in Figure 3.14. The predicted effect should also exist for a reverse-biased FM-S junction, where the radiation polarization p can approach PF.
3.6
Conclusions
In this chapter we have described a variety of heterostructures in which the spin degree of freedom can be used to efficiently control the current: magnetic tunnel junctions, metallic magnetic multilayers exhibiting giant magnetoresistance, and spin-torque effects in magnetic nanopillars. We also described a method of facilitating efficient spin injection/accumulation in semiconductors from standard ferromagnetic metals at room temperature. The main idea is to engineer the band structure near the ferromagnet-semiconductor interface by fabricating a delta-doped layer there, thus making the Schottky barrier very thin and transparent for tunneling. A long spin lifetime in a semiconductor allows the suggestion of some interesting new devices, such as field detectors, spin transistors, square-law detectors, and the sources of polarized light described in the present text. This development opens up new opportunities in potentially important novel spin injection devices. We also discussed a body of various spin-orbit effects and systems of interacting spins. In particular, spin Hall effects result in a positive magnetoresistance due to spin accumulation, which may be used to extract the coefficients of spin-orbit transport. We note, however, that the Datta-Das spin-FET would have inferior characteristics compared to the MOSFET. We also discussed the severe challenges facing single-spin logic and, especially, spin-based quantum computers.
References
1 (a) Wolf, S.A., Awschalom, D.D.,
Buhrman, R.A., Daughton, J.M., von
Molnar, S., Roukes, M.L., Chtchelkanova,
A.Y. and Treger, D.M. (2001) Science, 294,
1488; (b) Awschalom D.D., Loss D. and
Samarth N. (eds) (2002) Semiconductor
Spintronics and Quantum Computation,
Springer, Berlin.
2 Žutić, I., Fabian, J. and Das Sarma, S. (2004) Reviews of Modern Physics, 76, 323.
3 (a) Datta, S. and Das, B. (1990) Applied Physics Letters, 56, 665. (b) Gardelis, S., Smith, C.G., Barnes, C.H.W., Linfield, E.H. and Ritchie, D.A. (1999) Physical Review B-Condensed Matter, 60, 7764.
9 Bratkovsky, A.M. and Osipov, V.V. (2005)
Applied Physics Letters, 86, 071120.
10 Osipov, V.V. and Bratkovsky, A.M. (2005)
Physical Review B-Condensed Matter, 72,
115322.
11 (a) Baibich, M.N., Broto, J.M., Fert, A.,
Nguyen Van Dau, F., Petroff, F., Etienne,
P., Creuzet, G., Friederich, A. and
Chazelas, J. (1988) Physical Review Letters,
61, 2472; (b) Berkowitz, A.E., Mitchell, J.R.,
Carey, M.J., Young, A.P., Zhang, S., Spada,
F.E., Parker, F.T., Hutten, A. and Thomas,
G. (1992) Physical Review Letters, 68, 3745.
12 Julliere, M. (1975) Physics Letters, 54A, 225.
13 Maekawa, S. and Gäfvert, U. (1982) IEEE Transactions on Magnetics, 18, 707.
14 Meservey, R. and Tedrow, P.M. (1994)
Physics Reports, 238, 173.
15 Moodera, J.S., Kinder, L.R., Wong, T.M.
and Meservey, R. (1995) Physical Review
Letters, 74, 3273.
16 (a) Yuasa, S., Nagahama, T., Fukushima,
A., Suzuki, Y. and Ando, K. (2004) Nature
Materials, 3, 858; (b) Parkin, S.S.P., Kaiser,
C., Panchula, A., Rice, P.M., Hughes, B.,
Samant, M. and Yang, S.-H. (2004) Nature
Materials, 3, 862.
17 Slonczewski, J.C. (1989) Physical Review
B-Condensed Matter, 39, 6995.
18 Bratkovsky, A.M. (1997) Physical Review
B-Condensed Matter, 56, 2344.
19 Bratkovsky, A.M. (1998) Applied Physics
Letters, 72, 2334.
20 Mott, N.F. (1936) Proceedings of the Royal
Society of London. Series A, 153, 699.
21 Stearns, M.B. (1977) Journal of Magnetism
and Magnetic Materials, 5, 167.
22 Berger, L. (1996) Physical Review
B-Condensed Matter, 54, 9353.
23 Slonczewski, J.C.(1996) Journal of
Magnetism and Magnetic Materials, 159, L1.
24 Tsoi, M.V. et al. (1998) Physical Review
Letters, 80, 4281.
25 (a) Slonczewski, J.C. (2002) Journal of
Magnetism and Magnetic Materials,
247, 324; (b) Slonczewski, J.C. (2005)
Physical Review B-Condensed Matter, 71,
024411; (c) Slonczewski, J.C., Sun, J.Z.
(2007) J. Magn. Magn. Mater., 310, 169.
68 (a) Johnson M. and Silsbee, R.H. (1987)
Physical Review B-Condensed Matter, 35,
4959; (b) Johnson M. and Byers, J. (2003)
Physical Review B-Condensed Matter, 67,
125112.
69 (a) van Son, P.C., van Kempen, H. and
Wyder, P. (1987) Physical Review Letters, 58,
2271; (b) Schmidt, G., Richter, G., Grabs,
P., Gould, C., Ferrand, D. and Molenkamp,
L.W. (2001) Physical Review Letters, 87,
227203.
70 Schmidt, G., Ferrand, D., Molenkamp,
L.W., Filip A.T. and van Wees, B.J. (2000)
Physical Review B-Condensed Matter, 62,
R4790.
71 Yu Z.G. and Flatte, M.E. (2002) Physical
Review B-Condensed Matter, 66, R201202.
72 Albrecht J.D. and Smith, D.L. (2002)
Physical Review B-Condensed Matter, 66,
113303.
73 Hersheld, S. and Zhao, H.L. (1997)
Physical Review B-Condensed Matter, 56,
3296.
74 Rashba, E.I. (2000) Physical Review B-Condensed Matter, 62, R16267.
75 Fert A. and Jaffres, H. (2001) Physical
Review B-Condensed Matter, 64, 184420.
76 Yu Z.G. and Flatte, M.E. (2002) Physical
Review B-Condensed Matter, 66, 235302.
77 (a) Shen, M., Saikin, S. and Cheng, M.-C.
(2005) IEEE Transactions on
Nanotechnology, 4, 40; (b) Shen, M., Saikin,
S. and Cheng, M.-C. (2004) Journal of
Applied Physics, 96, 4319.
78 (a) Sze, S.M. (1981) Physics of
Semiconductor Devices, Wiley, New York; (b)
Mönch, W. (1995) Semiconductor Surfaces
and Interfaces, Springer, Berlin; (c) Tung,
R.T. (1992) Physical Review B-Condensed
Matter, 45, 13509.
4
Physics of Computational Elements
Victor V. Zhirnov and Ralph K. Cavin
4.1
The Binary Switch as a Basic Information-Processing Element
4.1.1
Information and Information Processing
Information can be defined as a technically quantitative measure of the distinguishability of a physical subsystem from its environment [1]. One way to create distinguishable states is by the presence or absence of material particles (information carriers) in a given location. For example, one can envision the representation of information as an arrangement of particles at specified physical locations, as for instance the depiction of the acronym IBM by atoms placed at discrete locations on a material surface (Figure 4.1).
Information of an arbitrary type and amount, such as letters, numbers, colors, or graphics in specific sequences and patterns, can be represented by a combination of just two distinguishable states [1-3]. The two states, which are known as binary states, are usually marked as state 0 and state 1. The maximum amount of information which can be conveyed by a system with just two states is used as a unit of information known as 1 bit (abbreviated from binary digit). A system with two distinguishable states forms the basis for a binary switch.
The binary switch is the fundamental computational element in information-processing systems (Figure 4.2), which, in its most fundamental form, consists of a physical system with two distinguishable states, together with the means to conditionally change and to read its state.
4.1.2
Properties of an Abstract Binary Information-Processing System
The maximum number of binary switches per unit area is

$$n_{bit} = \frac{1}{L^2}, \tag{4.1}$$

and the binary information throughput, that is, the number of binary transitions per unit area per unit time, is

$$BIT = \frac{n_{bit}}{t_{sw}}. \tag{4.2}$$

One can increase the binary throughput by increasing the number of binary switches per unit area, nbit, and/or by decreasing the switching time, that is, the time to transition from one state to the other, tsw.
It should be noted that, as each binary switching transition requires an energy Esw, the total power dissipation grows in proportion to the information throughput:

$$P = \frac{n_{bit}}{t_{sw}}\,E_{sw} = BIT \cdot E_{sw} \tag{4.3}$$
The above analysis does not make any assumptions on the material system or the
physics of switch operation. In the following sections we investigate the fundamental
relations for nbit, tsw, Esw and the corresponding implications for the computing
systems.
4.2
Binary State Variables
4.2.1
Essential Operations of an Abstract Binary Switch
Information-processing systems represent system states in terms of physical variables. One way to create physically distinguishable states is by the presence or absence of material particles or fields in a given location. Figure 4.3a illustrates an abstract model for a binary switch, the state of which is represented by different positions of a material particle. In principle, the particle can possess arbitrary mass, charge, and so on. The only two requirements for the implementation of a particle-based binary switch are the ability to detect the presence/absence of the particle in, for example, the location x1, and the ability to move the particle from x0 to x1 and from x1 to x0.
Pcorrect is defined as the probability that the binary switch is in the correct state at an arbitrary time after the command to achieve that state is given. (Alternatively, one can use the probability of error, Perr = 1 − Pcorrect.) A necessary condition for the distinguishability of a binary switch is

$$P_{correct} > P_{err}, \tag{4.4}$$
or, equivalently:

$$P_{err} < 0.5 \tag{4.5}$$
Maximum distinguishability requires (i) a long lifetime of each binary state in STORE mode, and (ii) a fast transition between binary states in the presence of a control signal, tsw → 0, in CHANGE mode (say, a negligible fraction of the clock period).
As Tstate → max in STORE mode, the particle velocity in both the 0 and 1 states must be zero, as the particle must be at rest; that is, v0 = v1 = 0, so the kinetic energy E = mv²/2 of the particle should ideally be zero in both STORE modes. In the CHANGE mode, the average particle velocity is ⟨v01⟩ > 0, and E > 0. The switching time can then be estimated as

$$t_{sw} \approx \frac{L}{\langle v_{01}\rangle} = L\sqrt{\frac{m}{2E}}. \tag{4.6}$$
Equation 4.6 sets a limit for the switching speed in the non-zero distance case (non-relativistic approximation). Note that in binary switch operation, an amount of energy E must be supplied to the particle before the CHANGE operation begins, and taken out of the particle after the CHANGE.
If energy remains in the system, the information-bearing particle will oscillate between the two states with the period 2tsw. In an oscillating system, if friction is neglected, then energy is preserved (no dissipation) but the information state is not: the lifetime of each binary state Tstate → 0. The conditions (i) and (ii) of maximum distinguishability will be violated, and such a system cannot act as a binary switch. Alternatively, if one wishes to preserve the binary state (i.e., Tstate > 0), the energy must be rapidly taken out of the system. The time of energy removal, tout, must be less than half of the switching time, tout < tsw/2, otherwise an unintended transition to another state may occur. One rapid way to remove energy is by thermal dissipation to the environment. If, instead, the aim is to remove the energy in a controllable manner, for example for possible re-use, a faster switch will be needed which, according to Equation 4.6, will require a greater energy for its operation. It is concluded that non-zero energy dissipation is a necessary attribute of binary switch operations.
The above analysis considers binary switches with states represented by the presence or absence of material particles (the information-defining particle, or information carrier) in given locations, for example the utilization of electrons as information carriers. As mentioned above, the electromagnetic field can also, in principle, be used to represent information. For example, a popular candidate is a binary switch that uses the electron spin magnetic moment, where the two opposing directions of the magnetic moment represent 0 and 1 (Figure 4.3b).
4.3
Energy Barriers in Binary Switches
4.3.1
Operation of Binary Switches in the Presence of Thermal Noise
Figure 4.4 Illustration of an energy barrier to preserve the binary states in the presence of noise.

Consider again a binary switch where the binary state is represented by particle location (see Figure 4.3a). Until now, it has been assumed that the information-defining particle in the binary switch has zero velocity/kinetic energy prior to a WRITE
command. However, each material particle at equilibrium with the environment possesses a kinetic energy of ½kBT per degree of freedom due to thermal interactions, where kB is Boltzmann's constant and T is temperature. The permanent supply of thermal energy to the system occurs via the mechanical vibrations of atoms (phonons), and via the thermal electromagnetic field of photons (background radiation).
The existence of random mechanical and electromagnetic stimuli means that the information carrier/material particle located in x0 (Figure 4.4a) has a non-zero velocity in a non-zero-T environment, and that it will spontaneously move from its
intended location. According to Equation 4.6, the state lifetime in this case will be

$$T_{state} \approx L\sqrt{\frac{m}{k_B T}}. \tag{4.7}$$

For an electron-based switch (m = me = 9.11 × 10⁻³¹ kg) of length L = 1 μm at T = 300 K, Equation 4.7 gives Tstate ≈ 15 ps, and hence the time before the system would lose distinguishability would be very small.
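The 15 ps figure is easily verified; a minimal Python check of Equation 4.7, using only standard constants:

import math

# Thermal state lifetime, Equation 4.7: T_state = L*sqrt(m/(k_B*T))
# for an electron-based switch, L = 1 um, T = 300 K.
kB, me = 1.380649e-23, 9.11e-31
L, T = 1e-6, 300.0

T_state = L * math.sqrt(me / (kB * T))
print(f"T_state ~ {T_state*1e12:.0f} ps")   # ~15 ps, as quoted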
In order to prevent the state from changing randomly, it is possible to construct energy barriers that limit particle movement. The energy barrier separating the two states in a binary switch is characterized by its height Eb and width a (Figure 4.4b). The barrier height, Eb, must be large enough to prevent spontaneous transitions (errors). Two types of unintended transition can occur: classical and quantum. The classical error occurs when the particle jumps over the barrier, and this can happen if the kinetic energy E of the particle is larger than Eb. The corresponding classical error probability, PC, is obtained from the Boltzmann distribution as:

$$P_C = \exp\left(-\frac{E_b}{k_B T}\right) \tag{4.8}$$

The presence of an energy barrier of width a sets the minimum device size to be Lmin ≈ a.
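As a quick numerical illustration of Equation 4.8 (the barrier heights below are illustrative choices, expressed in units of kBT):

import math

# Classical error probability, Equation 4.8: P_C = exp(-E_b/(k_B*T)).
for Eb_over_kT in (math.log(2), 10.0, 40.0):
    print(f"E_b = {Eb_over_kT:5.2f} k_BT -> P_C = {math.exp(-Eb_over_kT):.2e}")
# E_b = k_BT*ln2 gives P_C = 0.5, the distinguishability limit; practical
# low-error operation requires much taller barriers.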
4.3.2
Quantum Errors
Another class of errors, termed quantum errors, occurs due to quantum mechanical tunneling through a barrier of finite width a. If the barrier is too narrow, then spontaneous tunneling through the barrier will destroy the binary information. The conditions for significant tunneling can be estimated using the Heisenberg uncertainty principle, as is often carried out in texts on the theory of tunneling [4]:

$$\Delta x\,\Delta p \geq \frac{\hbar}{2} \tag{4.9}$$
The uncertainty relationship of Equation 4.9 can be used to estimate the limits of distinguishability. Consider again the two-well bit in Figure 4.4b. As is known from quantum mechanics, a particle can pass (tunnel) through a barrier of finite width, even if the particle energy is less than the barrier height, Eb. An estimate of how thin the barrier must be to observe tunneling can be made from Equation 4.9; for a particle at the bottom of the well, the uncertainty in momentum is √(2mEb), which gives:

$$\sqrt{2mE_b}\,\Delta x \geq \frac{\hbar}{2} \tag{4.10}$$

Equation 4.10 states that, by initially setting the particle on one side of the barrier, it is possible to locate the particle on either side, with high probability, if Δx is of the order of the barrier width a. That is, the condition for losing distinguishability is Δx ≳ a, and the minimum barrier width is:

$$a_{min} = a_H = \frac{\hbar}{2\sqrt{2mE_b}}, \tag{4.11}$$

so that the tunneling condition is

$$a \lesssim a_H. \tag{4.12}$$
From Equation 4.12, the tunneling condition can also be written in the form

$$1 - \frac{2\sqrt{2m}}{\hbar}\,a\sqrt{E_b} \approx 0. \tag{4.13}$$

Since for small x, e⁻ˣ ≈ 1 − x, the tunneling condition then becomes

$$\exp\left(-\frac{2\sqrt{2m}}{\hbar}\,a\sqrt{E_b}\right) \approx 0 \tag{4.14}$$
The left-hand side of Equation 4.14 has the properties of a probability. Indeed, it represents the tunneling probability through a rectangular barrier given by the Wentzel-Kramers-Brillouin (WKB) approximation [5]:

$$P_{WKB} = \exp\left(-\frac{2\sqrt{2m}}{\hbar}\,a\sqrt{E_b}\right) \tag{4.15a}$$

This equation also emphasizes the parameters controlling the tunneling process, which include the barrier height Eb and the barrier width a, as well as the mass m of the information carrier.
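Equation 4.15a can be evaluated numerically; in the sketch below, the barrier height is set to kBT ln 2 at 300 K (an illustrative choice matching the limit analysis of Section 4.4), with an electron as the information carrier:

import math

# WKB tunneling probability, Equation 4.15a, for an electron.
hbar, me, kB = 1.054571817e-34, 9.11e-31, 1.380649e-23
Eb = kB * 300.0 * math.log(2)   # illustrative barrier height

for a_nm in (1, 2, 5, 10):
    P = math.exp(-2.0 * math.sqrt(2.0 * me * Eb) * (a_nm * 1e-9) / hbar)
    print(f"a = {a_nm:2d} nm -> P_WKB = {P:.2e}")
# Tunneling is appreciable at a ~ 1-2 nm and becomes small beyond ~5 nm.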
4.4
Energy Barrier Framework for the Operating Limits of Binary Switches
4.4.1
Limits on Energy
The minimum energy of a binary transition is determined by the energy barrier. The work required to suppress the barrier is equal to or larger than Eb; thus, the minimum energy of a binary transition is given by the minimum barrier height in the binary switch. The minimum barrier height can be found from the distinguishability condition [Equation 4.5], which requires that the probability of error Perr < 0.5. First, the case is considered when only classic (i.e., thermal) errors can occur. In this case, according to Equation 4.8:

$$P_{err} = P_C = \exp\left(-\frac{E_b}{k_B T}\right) \tag{4.18}$$
By solving Equation 4.18 for Perr = 0.5, we obtain the Boltzmann limit for the minimum barrier height:

$$E_{bB} = k_B T\,\ln 2 \tag{4.19}$$

Equation 4.19 corresponds to the minimum barrier height, the point at which the distinguishability of states is completely lost due to thermal over-barrier transitions. In deriving Equation 4.19, tunneling was ignored; that is, the barrier width is assumed to be very large, a ≫ aH.
Next, we consider the case where only quantum (i.e., tunneling) errors can occur. In this case, according to Equation 4.15a,

$$P_{err} = P_Q = \exp\left(-\frac{2\sqrt{2m}}{\hbar}\,a\sqrt{E_b}\right) \tag{4.20}$$

By solving Equation 4.20 for Perr = 0.5, we obtain the Heisenberg limit for the minimum barrier height, EbH:

$$E_{bH} = \frac{\hbar^2(\ln 2)^2}{8ma^2} \tag{4.21}$$

Combining the two contributions gives

$$E_{b\,min} = k_B T\,\ln 2 + \frac{\hbar^2(\ln 2)^2}{8ma^2} \tag{4.22}$$
4:22
Equation 4.22 provides a generalized value for minimum energy per switch
operation at the limits of distinguishability, that takes into account both classic and
quantum transport phenomena. The graph in Figure 4.5 shows the numerical
solution of Equation 4.17 and its approximate analytical solution given by Equation 4.22 for Perr 0.5. Is it clearly seen that for a > 5 nm, the Boltzmanns limit,
EbB kBT ln2, is a valid representation of minimum energy per switch operation,
while for a < 5 nm, the minimum switching energy can be considerably larger.
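A numerical check of Equation 4.22 as reconstructed above (Boltzmann term plus Heisenberg tunneling term), for an electron at 300 K:

import math

# Minimum barrier height vs. width, Equation 4.22.
hbar, me, kB, T = 1.054571817e-34, 9.11e-31, 1.380649e-23, 300.0
EbB = kB * T * math.log(2)

for a_nm in (0.5, 1, 2, 5):
    a = a_nm * 1e-9
    EbH = (hbar * math.log(2))**2 / (8.0 * me * a**2)
    print(f"a = {a_nm:4.1f} nm -> E_b,min = {(EbB + EbH)/EbB:.2f} x k_BT*ln2")
# The tunneling term is ~1% of the Boltzmann term at a = 5 nm, but becomes
# comparable to it below ~1 nm.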
4.4.2
Limits on Size
Figure 4.5 Minimum energy per switch operation as a function of minimum switch size.

The minimum size L of a binary switch cannot be smaller than the distinguishability length aH. From Equations 4.11 and 4.19, one can estimate the Heisenberg length for binary switch operation at the Boltzmann limit of energy:

$$a_{HB} = \frac{\hbar}{2\sqrt{2m k_B T \ln 2}} \tag{4.23}$$

The corresponding maximum density of switches is

$$n_{max} = \frac{1}{(2a_{HB})^2}. \tag{4.24}$$

For electron-based binary switches at 300 K, aHB ≈ 1 nm and nmax ≈ 10¹³ cm⁻².
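Equations 4.23 and 4.24 are easily evaluated; a minimal Python check for electrons at 300 K:

import math

# Heisenberg length at the Boltzmann limit (Equation 4.23) and the
# maximum switch density (Equation 4.24).
hbar, me, kB, T = 1.054571817e-34, 9.11e-31, 1.380649e-23, 300.0

a_HB = hbar / (2.0 * math.sqrt(2.0 * me * kB * T * math.log(2)))
n_max = 1.0 / (2.0 * a_HB)**2
print(f"a_HB ~ {a_HB*1e9:.2f} nm, n_max ~ {n_max*1e-4:.1e} cm^-2")
# Of order 1 nm and 1e13 cm^-2, as quoted in the text.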
4.4.3
Limits on Speed
The next pertinent question concerns the minimum switching time, tmin, which can be derived from the Heisenberg relationship for time and energy:

$$\Delta E\,\Delta t \geq \frac{\hbar}{2}, \tag{4.25a}$$

or

$$t_{min} = \frac{\hbar}{2\Delta E}. \tag{4.25b}$$

Equation 4.25b is an estimate for the maximum speed of dynamic evolution [6], or the maximum passage time [7]. It represents the zero-length approximation for the switching time. At the Boltzmann limit, ΔE = kBT ln 2, and

$$t_{min} = \frac{\hbar}{2k_B T \ln 2} \approx 2\times10^{-14}\,\text{s} \tag{4.26}$$

It should be noted that Equation 4.26 is applicable to all types of device, and no specific assumptions were made about any physical device structure.
4.4.4
Energy Dissipation by Computation
Using Equations 4.1-4.3 allows one to estimate the power dissipation of a chip containing the smallest binary switches, L = aHB [Equation 4.23], packed to maximum density [Equation 4.24] and operating at the lowest possible energy per bit [Equation 4.19]. The power dissipation per unit area of this limit technology is given by:

$$P = \frac{n_{max}E_{bB}}{t_{min}} \tag{4.27}$$

The power density bound in the range of MW cm⁻², obtained by invoking kBT ln 2 as the lower bound for the device energy barrier height, is an astronomic number. If known cooling methods are employed, it appears that a heat-removal capacity of several hundred W cm⁻² represents a practically achievable limit. The practical usefulness of alternative electron-transport devices may be derived from lower fabrication costs or from specific functional behavior; however, the heat-removal challenge will remain.
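The limit-technology power density follows by combining Equations 4.19 and 4.23-4.27; a short Python sketch:

import math

# Limit-technology power density, Equations 4.19 and 4.23-4.27.
hbar, me, kB, T = 1.054571817e-34, 9.11e-31, 1.380649e-23, 300.0

E_bB = kB * T * math.log(2)                       # minimum energy per bit
t_min = hbar / (2.0 * E_bB)                       # Equation 4.26
a_HB = hbar / (2.0 * math.sqrt(2.0 * me * E_bB))  # Equation 4.23
n_max = 1.0 / (2.0 * a_HB)**2                     # Equation 4.24 [m^-2]

P = n_max * E_bB / t_min                          # Equation 4.27 [W/m^2]
print(f"t_min ~ {t_min:.1e} s, P ~ {P*1e-4/1e6:.0f} MW/cm^2")
# Several MW/cm^2: far above the few hundred W/cm^2 that cooling can remove.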
4.5
Physics of Energy Barriers
The first requirement for the physical realization of any binary switch is the creation of distinguishable states within a system of material particles. The second requirement is the capability for a conditional change of state. The properties of distinguishability and conditional change of state are two fundamental and essential properties of a material subsystem that represents binary information. These properties are obtained by creating and controlling energy barriers in a material system.
The physical implementation of an energy barrier depends on the choice of the state variable used by the information-processing system. The energy barrier creates a local change of the potential energy of a particle from a value U1 at the generalized coordinate q1 to a larger value U2 at the generalized coordinate q2. The difference ΔU = U2 − U1 is the barrier height. In a system with an energy barrier, the force exerted on a particle by the barrier is of the form F = −∂U/∂q. A simple illustration of a one-dimensional barrier in linear spatial coordinates, x, is shown in Figure 4.6a. It should be noted that spatial changes in potential energy require a finite spatial extension (Δx = x2 − x1) of the barrier (Figure 4.6a and b). This spatial extension defines a minimum dimension of the energy barrier, amin: amin > 2Δx. In this section, we consider the physics of barriers for electron charge, electron spin, and optical binary switches.
4.5.1
Energy Barrier in Charge-Based Binary Switch
For electrons, the basic equation for the potential energy is the Poisson equation

$$\nabla^2\varphi = -\frac{\rho}{\varepsilon_0}, \tag{4.28}$$

where ρ is the charge density, ε0 = 8.85 × 10⁻¹² F m⁻¹ is the permittivity of free space, and φ is the potential, φ = U/e. According to Equation 4.28, the presence of an energy barrier is associated with changes in the charge density in the barrier region. The barrier-forming charge is introduced into a material, for example, by the doping of semiconductors. This is illustrated in Figure 4.6b for a silicon n-p-n structure, where the barrier is formed by ionized impurity atoms such as P in the n-region and B in the p-region. The barrier height Eb0 depends on the concentration of the ionized impurity atoms [8]:

$$E_{b0} = k_B T\,\ln\left(\frac{N_A N_D}{n_i^2}\right), \tag{4.29}$$

where NA and ND are the acceptor and donor concentrations and ni is the intrinsic carrier concentration.
4:29
Dq
Dj
4:31
4:32
The voltage needed to suppress the barrier from Eb0 to zero (the threshold voltage)
is:
Vt
E b0
e
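A numerical illustration of Equations 4.29 and 4.33; the doping levels here are illustrative assumptions, and ni is the intrinsic carrier concentration of silicon at 300 K:

import math

# Barrier height and threshold voltage from doping, Equations 4.29 and 4.33.
kB, T, e = 1.380649e-23, 300.0, 1.602176634e-19
NA, ND = 1e17, 1e17   # acceptor/donor concentrations [cm^-3] (assumed)
ni = 1e10             # intrinsic carrier concentration of Si [cm^-3]

Eb0 = kB * T * math.log(NA * ND / ni**2)
print(f"E_b0 ~ {Eb0/e:.2f} eV, V_t = E_b0/e ~ {Eb0/e:.2f} V")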
4:33
Thus, the operation of all charge transport devices involves charging and discharging capacitances to change barrier height, thereby controlling charge transport in
the device. When a capacitor C is charged from a constant voltage power supply Vg,
the energy Edis is dissipated, that is, it is converted into heat [9]:
E dis
CV 2g
2
4:34
The minimum energy needed to suppress the barrier (by charging the gate
capacitor) is equal to the barrier height Eb. Restoration of the barrier (by discharging
gate capacitance) also requires a minimum energy expenditure of Eb. Thus, the
minimum energy required for a full switching cycle is at least 2Eb.
It should be noted that in the solid-state implementation of a binary switch, the number of electrons in both wells is large. This is different from an abstract system having only one electron (see above). In a multi-electron system, the electrons strike the barrier from both sides, and the binary transitions are determined by the net electron flow, as shown in Figure 4.7.
Let N0 be the number of electrons that strike the barrier per unit time. The number of electrons NA that transition over the barrier from well A per unit time is then

$$N_A = N_0\exp\left(-\frac{E_b}{k_B T}\right) \tag{4.35}$$

The corresponding current IA→B is

$$I_{A\to B} = eN_A = eN_0\exp\left(-\frac{E_b}{k_B T}\right) \tag{4.36}$$
Electrons in well B can also strike the barrier and therefore contribute to the over-barrier transitions with a current IB→A. Thus, the net over-barrier current is

$$I = I_{A\to B} - I_{B\to A} \tag{4.37}$$

The energy diagram of Figure 4.7a is symmetric, hence IA→B = IB→A, and I = 0. Therefore, no binary transitions occur in the case of a symmetric barrier. In order to enable the rapid and reliable transition of an electron from well A to well B, an energy asymmetry between the two wells must be created. This is achieved by an energy difference eVAB between the wells A and B (Figure 4.7b).
For such an asymmetric diagram, the barrier height for electrons in well A is Eb, and for electrons in well B it is (Eb + eVAB). Correspondingly, from Equations 4.36 and 4.37 the net current is

$$I = eN_0\exp\left(-\frac{E_b}{k_B T}\right) - eN_0\exp\left(-\frac{E_b + eV_{AB}}{k_B T}\right) = eN_0\exp\left(-\frac{E_b}{k_B T}\right)\left[1 - \exp\left(-\frac{eV_{AB}}{k_B T}\right)\right] \tag{4.38a}$$

By substituting Equation 4.32 for Eb, we obtain:

$$I = eN_0\exp\left(-\frac{E_{b0} - eV_g}{k_B T}\right)\left[1 - \exp\left(-\frac{eV_{AB}}{k_B T}\right)\right] \tag{4.38b}$$
By expressing Eb0 as eVt from Equation 4.33 and using the conventional notations I = Ids (source-drain current) and VAB = Vds (source-drain voltage), we obtain the equation for the subthreshold I-V characteristics of a FET [10]:

$$I_{ds} = I_0\exp\left(\frac{e(V_g - V_t)}{k_B T}\right)\left[1 - \exp\left(-\frac{eV_{ds}}{k_B T}\right)\right] \tag{4.38c}$$
An example plot of Equation 4.38c is shown in Figure 4.8.
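Equation 4.38c can also be explored numerically; in the Python sketch below, I0 and Vt are illustrative device parameters, and the point is the roughly 60 mV per decade subthreshold slope at 300 K:

import math

# Subthreshold I-V of a FET, Equation 4.38c.
kB, T, e = 1.380649e-23, 300.0, 1.602176634e-19
kT_e = kB * T / e                 # thermal voltage, ~26 mV
I0, Vt, Vds = 1e-6, 0.3, 0.1      # [A], [V], [V] (assumed)

for Vg in (0.0, 0.1, 0.2, 0.3):
    Ids = I0 * math.exp((Vg - Vt) / kT_e) * (1.0 - math.exp(-Vds / kT_e))
    print(f"V_g = {Vg:.1f} V -> I_ds = {Ids:.2e} A")
# Every k_B*T/e * ln(10) ~ 60 mV of gate voltage changes I_ds by 10x.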
The minimum energy difference between the wells, eVAB min, can be estimated from the distinguishability arguments for the CHANGE operation, when Eb is suppressed by applying the gate voltage, for example to Eb = 0 (Figure 4.7c). For a successful change operation, the probability that an electron flowing from well A to well B is counterbalanced by another electron moving from well B to well A should be less than 0.5 in the limiting case. The energy difference eVAB forms a barrier for electrons in well B, but not for electrons in well A, and therefore, from Equation 4.18 we obtain

$$eV_{AB\,min} = E_{b\,min} = k_B T\ln 2 \tag{4.39}$$

The full switching cycle thus requires suppressing and restoring the barrier (2Eb) and maintaining the well asymmetry for each of the N transferred electrons:

$$E_{SW\,min} = (2 + N)\,k_B T\ln 2 \tag{4.40a}$$

If N = 1, then

$$E_{SW\,min} = 3k_B T\ln 2 \approx 10^{-20}\,\text{J} \tag{4.40b}$$
4.5.2
Energy Barrier in Spin-Based Binary Switch
The spin magnetic moment of an electron is

$$m = \frac{g\,\mu_B}{2}, \tag{4.41}$$

where μB is the Bohr magneton, μB = eħ/(2me), and g is the coupling constant known as the Landé gyromagnetic factor, or g-factor. For free electrons and electrons in isolated atoms, g0 ≈ 2.00. In solids, consisting of a large number of atoms, the effective g-factor can differ from g0.
The energy of interaction, EmB, between a magnetic moment m and a magnetic field B is:

$$E_{mB} = -\vec{m}\cdot\vec{B} \tag{4.42}$$

For the electron spin magnetic moment in a magnetic field applied in the z direction, the energy of interaction takes two values, depending on whether the electron spin magnetic moment is aligned or anti-aligned with the magnetic field. From Equations 4.41 and 4.42 one can write, assuming g = 2:

$$E_{\uparrow\uparrow} = -\frac{e\hbar}{2m_e}B_z, \qquad E_{\uparrow\downarrow} = +\frac{e\hbar}{2m_e}B_z \tag{4.43}$$

The energy difference between the aligned and anti-aligned states represents the energy barrier in the spin binary switch, and is

$$E_b = E_{\uparrow\downarrow} - E_{\uparrow\uparrow} = 2\mu_B B_z \tag{4.44}$$
The total energy required to operate a spin-based binary switch is

$$E_{spin} = E_M + 12\,k_B T, \tag{4.45}$$

where EM is the energy required to generate the magnetic field needed to change the spin state.
One possible way to address the electrical challenge for spin devices is to change the spin-control paradigm. The paradigm described above uses a system of binary switches, each of which can be controlled independently by an external stimulus, and each switch can, in principle, control any other switch in the system. However, it is not clear whether a spin-state-based device can be used to control the state of subsequent spin devices without going through an electrical switching mechanism as discussed above. Although it is possible that local interactions may be used to advantage for this purpose [15, 16], feasibility assessments of these proposals in general information-processing applications are clearly required.
Let us now consider a hypothetical single-spin binary switch that, ideally, might have atom-scale dimensions. At thermal equilibrium there is a probability of spontaneous transition between spin states 1 and 0 in accordance with Equation 4.18. Correspondingly, for Perr < 0.5, according to Equation 4.19 the energy separation between the two states should be larger than kBT ln 2:

$$2\mu_B B_{min} = k_B T\ln 2 \tag{4.46}$$

From Equations 4.19 and 4.44 we can obtain the minimum value of B for switch operation:

$$B_{min} = \frac{k_B T\ln 2}{2\mu_B} = \frac{k_B T\ln 2\; m_e}{e\hbar} \tag{4.47}$$
Table 4.1 Technologies for the generation of high magnetic fields.

Magnet | Bmax [T] | Limiting factor
Conventional magnetic-core electromagnet | 2 |
Steady-field air-core NbTi and Nb3Sn superconducting electromagnet | 20 |
Steady-field air-core water-cooled electromagnet | 30 | Joule heating
Pulsed-field hybrid electromagnet | 50 | Maxwell stress

At T = 300 K, Equation 4.47 results in Bmin ≈ 155 Tesla (T), which is much larger than can be practically achieved (a summary of the technologies used to generate high magnetic fields is provided in Table 4.1).
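A one-line check of Equation 4.47:

import math

# Minimum field for a single-spin switch at 300 K, Equation 4.47.
kB, muB, T = 1.380649e-23, 9.274e-24, 300.0

B_min = kB * T * math.log(2) / (2.0 * muB)
print(f"B_min ~ {B_min:.0f} T")   # ~155 T, far beyond Table 4.1 capabilities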
One of the most difficult problems of very high magnetic fields is the excessive power consumption and Joule heating in electromagnets. The relationship between the power consumption, P, and the magnetic field, B, in an electromagnet is [17]:

$$P \propto B^2 \tag{4.48}$$

Table 4.2 Examples of magnets producing fields from 0.01 T to 16 T, with their approximate mass and power requirements (up to 60-65 MW for the largest).
4.5.3
Energy Barriers for Multiple-Spin Systems
In a system of N spins in an external magnetic field, there are N↑↑ spin magnetic moments parallel to the external magnetic field, and the resulting magnetic moment is

$$m = \mu_B N_{\uparrow\uparrow} = \mu_B N(1 - P_{err}) = \mu_B N\left[1 - \exp\left(-\frac{\mu_B B}{k_B T}\right)\right] \tag{4.49}$$

As seen above, for all practical cases μBB ≪ kBT, and since (1 − e⁻ˣ) ≈ x for x → 0, there results

$$m = \frac{N\mu_B^2 B}{k_B T} \tag{4.50}$$

Equation 4.50 is known as the Curie law for paramagnetism [11]. From Equations 4.50 and 4.44 one can calculate the minimum number of electron spins required for a spin binary switch operating at realistic magnitudes of the magnetic field:

$$N_{min} = \frac{\ln 2}{2}\left(\frac{k_B T}{\mu_B B}\right)^2 \tag{4.51}$$
For example, for B = 0.1 T (a small neodymium-iron-boron magnet; see Table 4.1), Nmin ≈ 7 × 10⁶. If the number of electrons with unpaired spins per atom is f (f varies between 1 and 7 for different atoms), then the number of atoms needed is Nmin/f. Correspondingly, one can estimate the minimum critical dimension amin of the binary switch as:

$$a_{min} = \left(\frac{N_{min}}{f\,n_V}\right)^{1/3}, \tag{4.52}$$

where nV is the density of atoms in the material structure. Assuming an atomic density close to the largest known in solids, nV = 1.76 × 10²³ cm⁻³ (the atomic density of diamond), and B = 0.1 T, we obtain amin ≈ 41 nm for f = 1 and amin ≈ 22 nm for f = 7. Thus, it is concluded that for reliable operation at moderate magnetic fields, the physical size of a multi-spin-based binary switch is larger than that of the ultimate charge-based devices. The effect of collective spin behavior is currently used, for example, in magnetic random access memory (MRAM) [14] and in electron spin resonance (ESR) technologies.
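Equations 4.51 and 4.52, as reconstructed above, can be checked numerically (the computed sizes come out slightly below the values quoted in the text, consistent to within the rounding of Nmin):

import math

# Multiple-spin switch: N_min (Equation 4.51) and a_min (Equation 4.52)
# for B = 0.1 T at 300 K, using the atomic density of diamond.
kB, muB = 1.380649e-23, 9.274e-24
T, B = 300.0, 0.1
nV = 1.76e23 * 1e6      # [m^-3]

N_min = 0.5 * math.log(2) * (kB * T / (muB * B))**2
print(f"N_min ~ {N_min:.1e}")                     # ~7e6
for f in (1, 7):        # unpaired spins per atom
    a_min = (N_min / (f * nV))**(1.0/3.0)
    print(f"f = {f}: a_min ~ {a_min*1e9:.0f} nm") # ~34 nm and ~18 nm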
As an example, we estimate the energy needed to operate a spin-based binary switch for the case when the magnetic field is produced by an electrical current I in a circular loop of wire surrounding the switch, as shown in Figure 4.10. The magnetic field at the center of the loop is [20]:

$$B = \frac{\mu_0 I}{2r}, \tag{4.53}$$

where r is the radius of the loop.
In order to switch the external magnetic field, for example from zero to B or from B to −B, work must be done. If the magnetic field is formed by an electrical current I produced by an external voltage source, then the energy dissipated during one half of the switching cycle (e.g., the rise from zero to B) is

$$E_{dis} = \frac{LI^2}{2}, \tag{4.54}$$

and, if the field must also be sustained for a time Δt against the wire resistance R,

$$E_M = \frac{LI^2}{2} + I^2 R\,\Delta t. \tag{4.55}$$

If the magnetic field does not need to be sustained (e.g., in ferromagnetic devices), after switching the current must be reduced to zero, in which case another amount of energy given by Equation 4.54 is dissipated. Thus, the energy expenditure needed to generate the magnetic field EM [see Equation 4.55] in a full switching cycle is

$$E_M = LI^2. \tag{4.56}$$

From Equation 4.53, the current required to produce a field B is I = 2rB/μ0 (4.57, 4.58), and taking the inductance of the loop as

$$L \approx \frac{\pi\mu_0 r}{2}, \tag{4.59}$$

we obtain

$$E_M = \frac{2\pi}{\mu_0}\,r^3 B^2. \tag{4.60}$$
For the minimum device size given by Equation 4.52, and noting that r ≈ amin, we obtain from Equations 4.19 and 4.60:

$$E_{M\,min} = \frac{\pi\ln 2}{\mu_0 n_V}\left(\frac{k_B T}{\mu_B}\right)^2 \tag{4.61}$$

For the largest atomic density of solids, nV = 1.76 × 10²³ cm⁻³ (diamond), we obtain:

$$E_{M\,min} \approx 2\times10^{-18}\,\text{J} \approx 480\,k_B T \tag{4.62}$$
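A check of Equations 4.61 and 4.62 (note that the result is independent of B):

import math

# Field-generation energy for the minimum-size multi-spin switch.
kB, muB, T = 1.380649e-23, 9.274e-24, 300.0
mu0 = 4e-7 * math.pi
nV = 1.76e23 * 1e6      # diamond atomic density [m^-3]

E_M = (math.pi * math.log(2) / (mu0 * nV)) * (kB * T / muB)**2
print(f"E_M,min ~ {E_M:.1e} J ~ {E_M/(kB*T):.0f} k_BT")   # ~2e-18 J ~ 480 k_BT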
4.5.4
Energy Barriers for the Optical Binary Switch
Optical digital computing was, and still is, considered by some as a viable option for massive information processing [22]. Sometimes it is referred to as computing at the speed of light [23] and, indeed, photons do move at the speed of light. At the same time, a photon cannot have a speed other than the speed of light, c, and therefore it cannot be confined within a binary switch of finite spatial dimensions.
The minimum dimension of an optical switch is given by the wavelength of light, λ. If amin < λ, there is a high probability that the light will not sense the state; that is, the error probability will increase. For visible light, amin ≈ 400 nm.
Binary state control in the optical switch is accomplished by local changes in the optical properties of the medium, such as the refractive index, reflectivity or absorption, while photons are used to read the state. In many cases, the changes in optical properties are related to a rearrangement of atoms under the influence of electrical, optical, or thermal energy. The energy barrier in this case is therefore related to either inter-atomic or inter-molecular bonds. Examples of such optical switches are liquid crystal spatial light modulators [22] and non-linear interference filters [22]. Another example is a change in refractive index as the result of a crystalline-to-amorphous phase change, which is used, for example, in the rewritable CD. The minimum switching energy of this class of optical switches is related to the number of atoms, N, and therefore to the size L. In the limiting case, L = amin ≈ λ. In order for the atoms of an optical switch to have a distinguishable change of their position, the energy supplied to each atom should be larger than kBT. The total switching energy is therefore:

$$E = N\,k_B T \tag{4.63}$$

For a minimum energy estimate, we must consider the smallest possible N, which corresponds to a single-atom plane of size λ. If the optical switch material has an atomic density n, then one obtains:

$$E = n^{2/3}\lambda^2\,k_B T \tag{4.64}$$

For most solids, n ≈ 10²²-10²³ cm⁻³. Taking n = 5 × 10²² cm⁻³, T = 300 K, and amin = 400 nm, we obtain E ≈ 10⁻¹⁴ J, which is in agreement with estimates of the physical limit of the switching energy of optical digital switches, as reported in the literature [22].
Optical switches may also be based on electroabsorption. In these devices, the absorption is changed by the application of an external electric field that deforms the energy band structure. One example that has attracted considerable interest for practical application is the quantum-confined Stark effect [24]. If an electric field is applied to a semiconductor quantum well, the shape of the well is changed, perhaps from rectangular to triangular. As a result, the positions of the energy levels also change, and this affects the optical absorption. As the formation of an electric field requires changes in the charge distribution [Equation 4.28], the analysis of an electroabsorption optical switch is analogous to that of a charge-based switch, where the energetics are determined by the charging and discharging of a capacitor [Equations 4.31 and 4.32]. It should be noted that the capacitance of an optical switch is considerably larger than the capacitance of an electron switch, because of the larger capacitance area of the optical switch (~λ²). By using the estimated minimum size of an electron switch, ae min ≈ 1 nm (as estimated in Section 4.4.2), and taking into account Equation 4.40b, we obtain an estimate for the switching energy of an electroabsorption device:

$$E \approx 3k_B T\left(\frac{\lambda}{a_{e\,min}}\right)^2 = 1.2\times10^{-20}\,\text{J}\times\left(\frac{400\,\text{nm}}{1\,\text{nm}}\right)^2 \approx 10^{-15}\,\text{J} \tag{4.65}$$
The result from Equation 4.65 is in agreement with estimates of the physical limit of electroabsorption optical switches, as detailed in the literature [22, 25].
The energy barrier, and therefore the switching energy, of an optical binary switch is thus relatively high, with estimates of the theoretical limit for the optical device switching energy varying between 10⁻¹⁴ and 10⁻¹⁵ J for different types of optical switch [22]. It should be noted that the switching speed is the speed of re-arrangement of the atoms or of the charge in the material, and is not related to the speed of light.
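Both optical estimates (Equations 4.64 and 4.65) are reproduced by the short sketch below; the parameters follow the text:

import math

# Optical switching energies: atomic-rearrangement switch (Equation 4.64)
# and electroabsorption switch (Equation 4.65).
kB, T = 1.380649e-23, 300.0
lam = 400e-9            # minimum optical switch size ~ wavelength [m]
n = 5e22 * 1e6          # atomic density [m^-3]
a_e_min = 1e-9          # minimum electron-switch size [m]

E_atom = n**(2.0/3.0) * lam**2 * kB * T
E_ea = 3.0 * kB * T * (lam / a_e_min)**2
print(f"E(atom rearrangement) ~ {E_atom:.0e} J")   # ~1e-14 J
print(f"E(electroabsorption)  ~ {E_ea:.0e} J")     # ~2e-15 J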
4.6
Conclusions
Based on the idea that information is represented by the state of a physical system (for example, the location of a particle), we have shown that energy barriers play a fundamental role in evaluating the operating limits of information-processing systems. In order for the barrier to be useful in information-processing applications, it must prevent changes in the state of the processing element with high probability, and it must also support rapid changes of state when an external CHANGE command is given. If one examines the limit of tolerable operation, that is, the point at which the state of the information-processing element loses its ability to sustain a given state, it is possible to advance estimates of the limits of performance for various types of information-processing element. In these limit analyses, the Heisenberg uncertainty relationship can serve as a basis for estimating performance using algebraic manipulations only.
It was shown that charge-based devices in the limit could offer extraordinary performance and scaling into the range of a few nanometers, albeit at the cost of enormous and unsustainable power densities. Nonetheless, it appears that there is considerable room for technological advances in charge-based technologies. One could consider using electron spin as a basis for computation, as the binary system
state can be defined in terms of spin orientation. However, an energy barrier analysis based on the equilibrium room-temperature operation of a digital spin-flipping switch has revealed that extraordinarily large external magnetic fields are required to sustain the system state, and hence that a high energy consumption would result. (Although it has been proposed that, if devices can be operated out of equilibrium with the thermal environment, then perhaps computational state variables can be chosen to improve on the switching energy characteristic of spin-based devices [26].) The need for very large magnetic fields can be eased by utilizing multiple electron spins to represent the state of the processing element. Unfortunately, the number of electrons that must be utilized is such that the size of the processing elements would be about an order of magnitude larger than that of a charge-based device. Finally, we briefly examined the different physical realizations of optical binary elements, and found that the inability to localize a photon, although an advantage for communication systems, works against the implementation of binary optical switches. As a general rule, optical binary switches are significantly larger than charge-based switches.
Although it appears that it will be difficult to supplant charge as the mainstream information-processing state variable, there may be important application areas where the use of spin or optics could be employed to advantage. While the present chapter has focused on the properties of the processing elements themselves, information-processing systems are clearly composed of interconnected systems of these elements, and it is the system consideration that must remain paramount in any application. Nonetheless, charge-based systems, by using the movement of charge to effect element-to-element communication, avoid changing any state variable to communicate, and this is a decided advantage.
References
1 Ayres, R.U. (1994) Information, Entropy
and Progress, AIP Press, New York.
2 Brillouin, L. (1962) Science and
Information Theory, Academic Press,
New York.
3 Yaglom, A.M. and Yaglom, I.M. (1983)
Probability and Information, D. Reidel,
Boston.
4 Gomer, R. (1961) Field Emission and Field
Ionization, Harvard University Press.
5 French, A.P. and Taylor, E.F. (1978) An
Introduction to Quantum Physics, W.W.
Norton & Co, Inc.
6 Margolus, N. and Levitin, L.B. (1998) The maximum speed of dynamical evolution. Physica D, 120, 188.
12 Pearton, S.J., Norton, D.P., Frazier, R.,
Han, S.Y., Abernathy, C.R. and Zavada,
J.M. (2005) Spintronics device concepts.
IEE Proceedings-Circuits Devices and
Systems, 152, 312.
13 Jansen, R. (2003) The spin-valve transistor:
a review and outlook. Journal of Physics
D-Applied Physics, 36, R289.
14 Daughton, J.M. (1997) Magnetic tunneling
applied to memory. Journal of Applied
Physics, 81, 3758.
15 Bandyopadhyay, S., Das, B. and Miller, A.E. (1994) Supercomputing with spin-polarized single electrons in a quantum coupled architecture. Nanotechnology, 5, 113.
16 Cowburn, R.P. and Welland, M.E. (2000) Room temperature magnetic quantum cellular automata. Science, 287, 1466.
17 Motokawa, M. (2004) Physics in high magnetic fields. Reports on Progress in Physics, 67, 1995.
18 (a) Marshall, W.S., Swenson, C.A.,
Gavrilin, A. and Schneider-Muntau, H.J.
(2004) Development of Fast Cool pulse
magnet coil technology at NHMFL. Physica
B, 346, 594. (b) National High Magnetic
Field Laboratory website at: http://www.
magnet.fsu.edu/magtech/core.
19 Lietzke, A.F., Bartlett, S.E., Bish, P., Caspi,
S., Dietrich, D., Ferracin, P., Gourlay, S.A.,
II
Nanofabrication Methods
5
Charged-Particle Lithography
Lothar Berger, Johannes Kretz, Dirk Beyer, and Anatol Schwersenz
5.1
Survey
The minimum feature size (critical dimension, CD) achievable with photolithography is given by

$$CD = k_1\,\frac{\lambda}{NA}, \tag{5.1}$$

where k1 is a factor determined by the projection optics and process flow, λ is the wavelength of the photons, and NA is the numerical aperture between the objective optical lens and the resist plane.
Figure 5.2 The photolithographic process for making electrical connections to a transistor.
A further improvement is achieved by conducting the exposure not in air, but in a liquid. With NA = 1.4 for a technically feasible projection in water, CD = 42 nm can be achieved. Much progress has been made with immersion photolithography, and the technique is currently at the stage of pilot production.
Another means of reducing the wavelength is to forego the use of transmission photomasks and projection optics with lenses, and to utilize reflective masks and mirror projection optics. Lithography involving reflection is no longer considered as classical photolithography. The wavelength of the photons at which mirrors can be applied most effectively is 13.5 nm, and lithography at 13.5 nm involving masks with multilayer Bragg reflectors is referred to as extreme ultraviolet lithography (EUVL), for which considerable research effort is currently being expended. At this point, it should be mentioned that EUVL differs greatly from DUVL in that it requires the redevelopment of almost all the exposure equipment and lithography processes currently in use. A comprehensive study of the process is presented in Ref. [3].
An alternative to any lithography involving photons is charged-particle lithography,
where charged particles (electrons, ions) are used for patterning. While certain
charged particle lithography techniques are already used for special applications,
such as fabricating masks for photolithography, or prototyping, promising new
charged-particle lithography techniques for preparing integrated circuits are currently under development, and may in time complement or even replace
photolithography.
In order to illustrate the relationship between charged-particle lithography and
photolithography, it is of help to examine the International Technology Roadmap for
Semiconductors (ITRS), which demonstrates the status of lithographic techniques
currently in use and under development within the microelectronics industry. The
ITRS for the year 2006 is shown in Figure 5.3 [4]. According to the current ITRS, in
2007 the most likely successors of DUVL include EUVL, a multiple-electron-beam
lithography technique called maskless lithography (ML2), and nanoimprint lithography
(NIL). Charged-particle lithography techniques known as electron projection lithography (EPL) and ion projection lithography (IPL), both of which use a transmission
mask and projection optics with electromagnetic lenses to direct electrons and ions,
were removed from the ITRS in 2004. Proximity electron lithography (PEL), which
uses a 1 : 1 transmission mask, was removed in 2005, although its re-emergence
cannot be ruled out completely.
In this chapter we discuss the physical concepts and the principal advantages and
limitations of charged-particle lithography techniques. A brief insight is also provided into the charged-particle lithography techniques currently in use and under
development, while a strong focus is placed on ML2 techniques. The chapter
comprises four sections: Section 5.1 includes a survey of the field, while Section 5.2 incorporates discussions of electron beam lithography and electron resists, and their major applications:

- the fabrication of transmission masks for DUVL and reflective masks for EUVL
- the direct-writing of patterns onto wafers with single beams for prototyping, low-volume production, and mix-and-match with photolithography
- the direct-writing of patterns on wafers with multiple beams for volume production: ML2
- the fabrication of imprint templates for NIL.
Section 5.2 concludes with a discussion of the special requirements for mix-and-match, namely the integration of electron beam lithography (EBL) with photolithography. Section 5.3 presents details of ion beam lithography (IBL), for which the major applications include:

- the direct-structuring of patterns on wafers without resist processing, for prototyping, low-volume production, and special applications
- the fabrication of imprint templates for NIL, with direct-structuring of patterns without resist processing.
5.2
Electron Beam Lithography
5.2.1
Introduction
Electron beam lithography involves the use of electrons to induce a chemical reaction in an electron resist for pattern formation (the properties of electron resists are discussed in detail in Section 5.2.2). Because of the extremely short wavelength of accelerated electrons, EBL is capable of very high resolution, as λ = h/(mv), E_kin = mv²/2, and E = eU gives:

\[
\lambda = \frac{h}{\sqrt{2\,m\,e\,U}} \qquad (5.2)
\]
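As a quick plausibility check, the non-relativistic expression of Equation 5.2 can be evaluated for typical acceleration voltages; the Python sketch below does so (at 100 kV the relativistic correction already shortens λ by a few percent, which is ignored here).

import math

# Physical constants (SI units)
H = 6.626e-34      # Planck constant (J s)
M_E = 9.109e-31    # electron rest mass (kg)
E = 1.602e-19      # elementary charge (C)

def electron_wavelength(u_volts):
    """De Broglie wavelength of an accelerated electron per Eq. 5.2 (meters)."""
    return H / math.sqrt(2.0 * M_E * E * u_volts)

for u in (10e3, 50e3, 100e3):
    print(f"U = {u/1e3:5.0f} kV  ->  lambda = {electron_wavelength(u)*1e12:.2f} pm")
# -> about 12.2 pm, 5.5 pm and 3.9 pm: many orders of magnitude below the
#    feature sizes of interest, so diffraction does not limit EBL resolution.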
Typical parameters of the electron sources used in EBL tools are:

Parameter             Thermal: tungsten   Thermal: LaB6   Field emission
Temperature (K)       2700                1700            300
Energy spread (eV)    3                   1.5             0.3
When an electron crosses a boundary between regions of different potential, its velocity changes analogously to that of light at a refractive index step. With E_kin,1 = m v₁²/2 = e U₀ and E_kin,2 = m v₂²/2, Equation 5.4 can be expressed as an electron-optical refraction law:

\[
\frac{\sin\alpha_1}{\sin\alpha_2} = \frac{v_2}{v_1} = \sqrt{1 + \frac{U}{U_0}} \qquad (5.5)
\]
Electron Lenses A simple electrostatic lens (Einzel lens) consists of three rings,
where the outlying rings have the same electrostatic potential (Figure 5.7).
For this configuration, if the charge effects are negligible, then a simple equation for the paraxial trajectories of electrons can be obtained [6]:

\[
\frac{d^2 r}{dz^2} + \frac{1}{2U}\,\frac{dU}{dz}\,\frac{dr}{dz} + \frac{1}{4U}\,\frac{d^2 U}{dz^2}\, r = 0 \qquad (5.6)
\]
It should be noted that, as Equation 5.6 is invariant towards the scaling of the voltage U, voltage instabilities in general do not cause a jitter of trajectories through electrostatic lenses. With the geometrical relation between the trajectory r and the focal length f, dr(z₁)/dz = -r₁/f, and r ≈ r₁, Equation 5.6 yields [6]:

\[
\frac{1}{f} = \frac{1}{8\sqrt{U_0}} \int_{z_1}^{z_2} \frac{1}{U^{3/2}} \left(\frac{dU}{dz}\right)^{2} dz \qquad (5.7)
\]
In magnetic lenses, the electrons are deflected by the Lorentz force:

\[
\vec{F} = -e\,(\vec{v} \times \vec{B}) \qquad (5.8)
\]

For the axial field B_z of a rotationally symmetric magnetic lens, the paraxial trajectory equation becomes:

\[
\frac{d^2 r}{dz^2} + \frac{e\,B_z^2}{8\,m\,U_0}\, r = 0 \qquad (5.9)
\]

With the geometrical relation between the trajectory r and the focal length f, dr(z₂)/dz = -r₁/f, and r ≈ r₁, Equation 5.9 yields [6]:

\[
\frac{1}{f} = \frac{e}{8\,m\,U_0} \int_{z_1}^{z_2} B_z^2\, dz \qquad (5.10)
\]
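Equation 5.10 can be evaluated numerically for any measured or modeled axial field. The sketch below assumes a bell-shaped (Glaser-type) field B_z(z) = B0/(1 + (z/a)²) with illustrative values B0 = 0.1 T and a = 5 mm; both the field model and the numbers are assumptions for demonstration, not data from this chapter.

import numpy as np

E_CHARGE = 1.602e-19   # elementary charge (C)
M_E = 9.109e-31        # electron mass (kg)

def focal_length(z, b_z, u0):
    """Focal length from Eq. 5.10: 1/f = e/(8 m U0) * integral of Bz^2 dz."""
    dz = z[1] - z[0]
    integral = float(np.sum(b_z**2) * dz)      # simple rectangle-rule integration
    return 1.0 / (E_CHARGE / (8.0 * M_E * u0) * integral)

z = np.linspace(-0.05, 0.05, 2001)   # +/- 5 cm around the lens center (m)
b0, a = 0.1, 5e-3                    # assumed peak field (T) and half-width (m)
b_z = b0 / (1.0 + (z / a)**2)        # bell-shaped axial field model

print(f"f = {focal_length(z, b_z, 5e4) * 1e3:.1f} mm at U0 = 50 kV")
# Analytically the integral equals B0^2 * a * pi / 2, giving f of about 29 mm.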
Electron Optical Columns For the design of electron optical columns, the previous simple considerations are not adequate. It is required to derive the trajectories of electrons in a general form, which is valid also for non-rotationally-symmetric lenses. A straightforward calculation can be based on the general equation of motion of electrons in an electron optical column:

\[
\frac{d}{dt}\left(m\vec{v}\right) = -e\left[\vec{E}(\vec{r},t) + \vec{v}\times\vec{B}(\vec{r},t)\right] \qquad (5.11)
\]

together with the parametrization of the trajectory by its arc length s:

\[
\frac{d\vec{r}}{ds} = \frac{\vec{v}}{v} \qquad (5.12)
\]
The trajectory equation can be derived in Cartesian coordinates, and with the introduction of the abbreviations

\[
\eta = \sqrt{\frac{e}{2 m_0}}\,, \quad \varepsilon = \frac{e}{2 m_0 c^2}\,, \quad \Phi_0 = \frac{E_0}{e}\,, \quad \hat{\Phi} = (\Phi_0 + \Phi)\left[1 + \varepsilon\,(\Phi_0 + \Phi)\right]\,, \quad \rho = |\vec{r}\,'| = \sqrt{1 + x'^2 + y'^2}
\]

where E₀ is the initial kinetic energy of an electron at the source, the trajectory equation results as follows [7]:

\[
x'' = \frac{\rho^2}{2\hat{\Phi}}\left(\frac{\partial\hat{\Phi}}{\partial x} - x'\,\frac{\partial\hat{\Phi}}{\partial z}\right) + \frac{\eta\,\rho^2}{\sqrt{\hat{\Phi}}}\left(\rho B_y - y' B_t\right)
\]
\[
y'' = \frac{\rho^2}{2\hat{\Phi}}\left(\frac{\partial\hat{\Phi}}{\partial y} - y'\,\frac{\partial\hat{\Phi}}{\partial z}\right) - \frac{\eta\,\rho^2}{\sqrt{\hat{\Phi}}}\left(\rho B_x - x' B_t\right) \qquad (5.13)
\]
It should be noted that this trajectory equation is valid only if all the trajectories are continuous, that is, if x′(z), y′(z) are finite. This is not the case with electron mirrors, although these are not used in current electron optical columns.
Aside from the trajectory representation of Equation 5.13, more elaborate mathematical methods have been applied to the analytic investigation of electron optical column concepts and designs, focusing on the prediction of projection imperfections, called aberrations. These methods include the classical mechanics approach of the Lagrange or Hamilton formalism [7], with the latter leading towards the Hamilton-Jacobi theory of electron optics, which is capable of treating whole sets of trajectories, and therefore is a standard tool for the design of electron optical columns with minimal aberrations [8].

Recently, with the support of computer algebra tools, a Lie algebraic electron optical aberration theory has been derived, which makes high-order canonical aberration formulas accessible, and therefore may open up new possibilities in the design of high-performance electron beam projection systems [9].
Electron Optical Aberrations Electron optical columns suffer from projection deviations termed aberrations, which are caused either by non-ideal electron optical elements, or by the physical limits of electron optics. An ideal electron optical lens should project an electron beam crossing the entrance plane at the point (x₀, y₀) onto the exit plane at the point (x₁, y₁) = m(x₀, y₀), where m is a scalar called the magnification. Unfortunately, however, a real lens suffers from imperfections. For example, a real lens has the same focal length only for paraxial electron beams, while off-axis beams are slightly deflected. This is called spherical aberration, and is expressed by a coefficient C_s, relative to the beam position in the aperture plane (x_a, y_a):

\[
(x_0,\, y_0) \;\rightarrow\; m\left(x_0 + C_s\, x_a\,(x_a^2 + y_a^2),\;\; y_0 + C_s\, y_a\,(x_a^2 + y_a^2)\right) \qquad (5.14)
\]
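The mapping of Equation 5.14 is easily evaluated; the sketch below (with unit magnification and an assumed, arbitrary C_s, neither taken from this chapter) shows how the image displacement grows with the third power of the beam's distance from the axis in the aperture plane.

def project(x0, y0, xa, ya, m, cs):
    """Image position including spherical aberration, following Eq. 5.14."""
    r2 = xa**2 + ya**2
    return m * (x0 + cs * xa * r2), m * (y0 + cs * ya * r2)

M, CS = 1.0, 1e-4        # unit magnification and an assumed Cs (arbitrary units)
for xa in (1.0, 2.0, 4.0):
    x1, _ = project(0.0, 0.0, xa, 0.0, M, CS)
    print(f"xa = {xa}: displacement = {x1:.4f}")
# Doubling the aperture radius increases the displacement eightfold (cubic
# growth), which is why a small aperture reduces, but never removes, the
# aberration disk.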
With C_s > 0, an electron beam is blurred into a finite disk. Electron optical aberration theory shows that C_s cannot be made to vanish completely for rotationally symmetric lenses [7], which in turn led to the introduction of non-rotationally symmetric elements such as quadrupoles or octopoles. An overview of lens aberrations occurring in an electron optical column is provided in Ref. [7].

Additionally, the non-ideality of the electron source must be considered: a real electron source has a small spread in electron energy and, as the focal length of electron lenses depends on the electron energy, a non-monochromatic electron beam is blurred into a finite disk. This is referred to as chromatic aberration. In addition to these lens aberrations, other electron optical elements, such as beam deflectors, also introduce aberrations.
For a high-resolution EBL system, optimal projection is crucial, and therefore the aberrations discussed above must either be minimized by the design of the electron optical column, or compensated by additional electron optical elements. For example, since it is not possible to prepare a column with perfect rotational symmetry, an electron beam always suffers from astigmatism and misalignment. In order to compensate for astigmatism, a non-rotationally symmetric element is required, a stigmator. This may be an electrostatic quadrupole, which is a circular arrangement of four electrodes. However, stigmators are usually designed as electrostatic octopoles which, in addition to quadrupole fields, can also generate dipole fields at any angle to correct beam misalignment [10].
While major efforts have been made to apply aberration theory to the design of an electron column with minimal aberrations, only powerful numerical optimization tools can be utilized for any systematic approach [11]. Such a tool generally combines several packages that together compute the trajectories and aberrations of a candidate column design and condense them into a merit function (Equation 5.15), which is then minimized. With this method, existing electron column designs can be improved, and new designs optimized automatically.
In addition to the design of electron optical columns with minimal aberrations, it has been proposed to implement adaptive aberration correction by introducing novel electron optical elements: for example, an electrostatic dodecapole with time-dependent voltage control for the poles [12]. The learning process could be based on exposure pattern images.
5.2.1.3 Gaussian Beam Lithography
Patterning in EBL can be accomplished by focusing the electrons into beams with very narrow diameters. Because the electrons are created by a thermal source, they have a spread in energy. The trajectories of the electrons therefore vary slightly, resulting in electron beams with a near-Gaussian intensity distribution after traversing the electron optics [13].

The basic principle of Gaussian electron beam exposure is raster scanning. Similar to a television picture tube, the electron beam is moved in two dimensions across the scanning area on the electron resist, which typically is about 1 mm². Within that area, which is termed the deflection field, the electron beam can be moved very rapidly by the electron optics. In order to change the position of this area on the substrate, a mechanical movement of the precision stage is required. Patterns which stretch over more than 1 mm² must be stitched together from separate deflection fields; in order to avoid discontinuities in the patterns at the boundaries of the deflection fields (these are known as butting errors, and are potentially caused by mechanical movement of the stage), the positions of these boundaries are calibrated and corrected in a sophisticated manner.
In order to create the pattern shown in Figure 5.9a, the electron beam is moved across the area where the desired pattern is located, and is blanked on spots not intended for exposure. Raster scanning has the major problem that the electron beam must target the whole scanning area, which takes considerable time. A great improvement, therefore, is to target only the area of the desired pattern; this is termed vector scanning (Figure 5.9b). However, another major limitation remains in that, if a pattern consists of large and small features, the diameter of the electron beam must be adapted to the smallest feature, thereby greatly increasing the exposure time of the large features.

Figure 5.9 Writing a pattern with: (a) raster, (b) vector, and (c) vector shaped-beam strategies.
5.2.1.4 Shaped Beam Lithography
In order to overcome the limitations of both raster and vector scanning, EBL tools have been developed which can shape electron beams. A shaped electron beam is created by special aperture plates and, in contrast to the near-Gaussian intensity distribution of standard EBL tools, shaped electron beam tools can apply rectangular, or even triangular, intensity distributions [14]. At present the fastest technique available is vector scanning using shaped electron beams; when using this technique the pattern shown in Figure 5.9c requires only two exposures.
The principal function of a variable-shaped beam (VSB) column is illustrated in Figure 5.10. The electron source illuminates a first shaping aperture, after which a first condenser lens projects the shaped beam onto a second shaping aperture. The beam position on the second aperture is controlled by an electrostatic deflector. A second condenser lens projects the shaped beam onto the demagnification system, consisting of two lenses with the final aperture in between. After demagnification, the shaped beam is projected onto the substrate by the final projection system, which consists of a stigmator and a projection lens with an integrated deflector.

The formation of shaped beams is illustrated for the example of rectangular spots in Figure 5.11. Two square apertures shape the spot; the image of the first square aperture, which appears in the plane of the second square aperture, can be shifted laterally with respect to the second aperture. This results in a rectangular spot, which is then demagnified and projected onto the substrate.
Modern shaped electron beam tools can apply both rectangular and triangular spots. For example, the Vistec SB3050 [16] employs a LaB6 thermal source, and utilizes a vector scan exposure strategy, a continuously moving stage, and the variable-shaped beam principle. The maximum shot area is 1.6 × 1.6 µm, and rectangular shapes with 0° and 45° orientation, as well as triangles, can be exposed in a single shot. A detailed view of the two shaping aperture plates is shown in Figure 5.12.
The architecture and motion principle of the stage is decisive for pattern placement accuracy, so as to avoid butting errors. Position control by interferometer with a resolution of <1 nm, and the use of a beam tracking system, allow write-on-the-fly exposures with stage speeds of up to 75 mm s⁻¹. The driving range of the stage is 310 × 310 mm, thus enabling the exposure of 6-in and 9-in masks, as well as 300 mm wafers (see Figure 5.13).

Figure 5.12 Schematic of electron beam shaping by double aperture for rectangular and triangular spots [16].
However, even such a precision stage cannot eliminate butting errors completely, and therefore the vector shaped-beam strategy involves overlapping exposure shapes, resulting in features being exposed at least twice, a process known as multi-pass writing.
A production-worthy EBL system is highly automated, with no human intervention required for operation, except for an operator issuing a command for the system to start loading the substrate and writing the pattern. The pattern is encoded in a digital data file, and stored in a computer memory or a mass storage device. Prior to writing, the original design data must be converted to a format which is usable by the writing tool. This data fracturing is accomplished using separate computer hardware, usually in advance of the exposure.
Figure 5.16 Forward scattering at: (a) low electron energy and (b) high electron energy.
Some electrons are also scattered back out of the substrate into resist areas not intended for exposure; this is back-scattering (b). Both scattering modes lead to unintended exposure, and therefore to degraded resolution and distortion of patterns. This issue is known as the proximity effect [19], and should be minimized as much as possible.
Forward scattering into the resist is addressed by increasing the acceleration voltage of the electrons, as shown in Figure 5.16. Although the first electron lithography tools used U = 10 kV, and current tools employ 50 kV, more recently 100 kV tools have been introduced. This increase in acceleration voltage also leads to an improved projection performance, because the imperfections in the electron optics are less pronounced.

However, back-scattering intensifies with greater electron energy. With the currently required feature size being less than the range of back-scattered electrons, the features are broadened by back-scattering, thus offsetting the resolution improvement gained by reduced forward scattering.
For optimal resolution and minimal distortion, correction methods are therefore applied to overcome the proximity effect. A simple compensation technique, which accounts for back-scattering only, is to use a second exposure that equals the background exposure of the first, with a reversed additional energy distribution [20]. However, as a second exposure is undesirable due to the additional writing time, a proximity correction by dose variation is instead introduced during the data processing used to convert the circuit designs [21]. Such a correction must be based on suitable models for proximity effect predictions. Here, a useful tool is the Monte Carlo-based simulation of scattering from electron impact. Simulated trajectories for 100 electrons impacting into one point are shown in Figure 5.17, where both forward scattering and back-scattering appear distinctly.

Figure 5.17 Simulated trajectories for 100 electrons impacting into one point.
Figure 5.18 Proximity function (point exposure distribution fitted with two Gaussian functions).
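The fitted point-exposure distribution of Figure 5.18 is commonly written as a sum of two normalized Gaussians: a narrow term for forward scattering and a wide term for back-scattering. The sketch below uses this standard two-Gaussian model; the forward range alpha, back-scatter range beta, and back-scatter ratio eta are assumed, illustrative values, not data from this chapter.

import math

def proximity(r_um, alpha=0.03, beta=3.0, eta=0.7):
    """Standard two-Gaussian proximity function f(r), normalized in 2-D.

    alpha: forward-scattering range (um); beta: back-scattering range (um);
    eta:   ratio of back-scattered to forward-scattered energy (all assumed).
    """
    fwd = math.exp(-(r_um / alpha)**2) / alpha**2
    back = eta * math.exp(-(r_um / beta)**2) / beta**2
    return (fwd + back) / (math.pi * (1.0 + eta))

for r in (0.0, 0.05, 1.0, 5.0):
    print(f"r = {r:5.2f} um -> f(r) = {proximity(r):.3e}")
# The narrow term dominates within ~alpha of the impact point, while the broad
# term produces the low, micrometer-range background that dose-correction
# schemes [21] must compensate.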
Electron resists are divided into positive resists (which become soluble when exposed) and negative resists (which are soluble from the start, and become insoluble when exposed). While the chemistry of electron resists is usually considerably different from that of photoresists, a parameter termed contrast can be defined to characterize the resolution of the resists.

In order to determine the contrast of an electron resist, the developing rate is plotted as a function of the exposure dose; this is also called the characteristic of the electron resist. For a low exposure dose, the resist still behaves like an unexposed resist, whereas for a high dose it is fully activated. The idealized characteristics of positive and negative electron resists are shown in Figure 5.20.

Linearization yields two parameters that define the characteristic: the resist sensitivity D₀, where the resist activation starts, and the resist activation D_V, after which the resist is fully activated. The contrast is then defined as:

\[
\gamma = \frac{1}{\log\left(D_V / D_0\right)} \qquad (5.17)
\]
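In practice, D₀ and D_V are read off a measured characteristic, and γ then follows directly from Equation 5.17. A minimal sketch (the dose values below are hypothetical, chosen only for illustration):

import math

def contrast(d0, dv):
    """Resist contrast from Eq. 5.17 (common logarithm)."""
    return 1.0 / math.log10(dv / d0)

# Hypothetical positive-resist characteristic: activation starts at D0 and the
# resist is fully activated at DV (doses in uC/cm^2).
d0, dv = 20.0, 40.0
print(f"gamma = {contrast(d0, dv):.1f}")   # -> about 3.3 for DV/D0 = 2
# The narrower the dose window between D0 and DV, the higher the contrast,
# and the steeper (better-resolved) the developed resist profile.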
One of the first resists to be developed for EBL was polymethyl methacrylate (PMMA) [23]. Electron beam exposure breaks the polymer into fragments that are dissolved by a solvent-based developer (see Figure 5.21). Because of its very high resolution capability of <10 nm, PMMA is still used for certain R&D applications and electron beam writer resolution tests. However, it is not suitable for commercial lithography, mainly because of its poor resistance to dry etching.

Another group of early electron resists is based on a copolymer of chloromethacrylate and methylstyrene. One commercial resist for direct-write applications is ZEP-520 [24], which has considerable advantages compared to PMMA. While providing a comparable resolution of <10 nm, its sensitivity towards electrons is 10-fold higher, and its resistance to dry etching is 2.5-fold higher. Although these resists are still used for R&D applications, they have fallen out of favor for commercial lithography because they require solvent-based developers which, owing to their rapid evaporation rate in air, introduce temperature gradients on wafers, and therefore uniformity problems. A 45 nm 1 : 1 dense-lines pattern in ZEP-520 resist is shown in Figure 5.22 [25].
For some time, UV photoresists have also been used as electron resists. This group of resists is based on diazonaphthoquinone (DNQ), and has the advantage of using an aqueous developer. Their resistance to dry etching is also threefold higher. However, with resolutions of <250 nm, and despite still being used extensively in optical lithography for micro-electromechanical systems (MEMS) fabrication, their time in EBL has passed. A detailed presentation of the chemistry of DNQ photoresists can be found in Ref. [26].
The impetus for the development of the electron resists used today was derived from the need for a new type of photoresist required for the introduction of DUV optical lithography. Because of the limitations of DNQ-based resists in resolution and sensitivity, chemically amplified (CA) resists were introduced [27]. These are based on a multi-component scheme, where a sensitizer chemical causes dissolution modification within the exposed areas of the polymer matrix. The latent image is obtained from energy transfer to the sensitizer molecules, causing their degradation into ionic pairs or neutral species, which can catalyze the reaction events needed for solubility distinction. Commonly, photo acid generators (PAGs) or photo base generators (PBGs) are utilized as sensitizers in CA resists.

In a positive CA resist the PAG, upon exposure, releases an acid. During heating of the substrate after exposure (the post-exposure bake; PEB), this acid reacts with the resin, which in turn becomes soluble in an aqueous developer. In addition, further acid is produced. Through this multiplication reaction, an exposed PAG molecule can trigger up to 1000 reactions. It is also acknowledged that a CA resist shows a high quantum yield compared to a DNQ-based resist.
Following the establishment of CA photoresists, specialized electron CA resists have now been developed. The reaction mechanism of a positive electron CA resist is shown in Figure 5.23. The CA electron resists used mainly in current commercial EBL are the positive-tone FEP-171 [28] and the negative-tone NEB-22 [29]. These resists both have resolutions of <100 nm and show excellent process performance, especially in photomask fabrication [30, 31].

However, with the need for <50 nm resolution, especially for direct-write applications, the development and evaluation of more advanced positive and negative CA resists is currently the subject of intense investigation [32]. For example, the 50 nm dense lines and 70 nm dots with high contrast, obtained with a recently developed and evaluated positive CA resist [33], are shown in Figure 5.24.
The major challenges in the development of CA resists for <50 nm resolution are to: (i) reduce the diffusion length during the PEB; (ii) improve etch stability; and (iii) reduce the line edge roughness. For very small features, the molecular structure of the resist contributes to the roughness of the lines, which can be a significant fraction of the linewidth. A measure of line edge roughness is the standard deviation σ of the actual line edge relative to the average line edge; its reduction is pursued by the application of resins with shorter molecules (a simple way of quantifying this roughness is sketched below).
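The edge-roughness measure just mentioned is straightforward to compute from a detected edge profile. A minimal sketch, using hypothetical edge positions (the data below are invented for illustration; 3σ is a commonly reported figure of merit):

import numpy as np

def line_edge_roughness(edge_nm):
    """Standard deviation (sigma) of edge positions about the mean edge."""
    edge = np.asarray(edge_nm, dtype=float)
    return float(np.std(edge - edge.mean()))

# Hypothetical edge positions (nm) sampled along a nominally straight line:
edge = [50.0, 51.2, 49.1, 50.8, 48.9, 50.3, 51.0, 49.5]
sigma = line_edge_roughness(edge)
print(f"sigma = {sigma:.2f} nm (3*sigma = {3*sigma:.2f} nm)")
# For a 50 nm line, a 3-sigma roughness of ~2-3 nm is already a significant
# fraction of the linewidth, illustrating the challenge described above.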
All of the electron resists discussed so far have been based on organic polymers. Although, in principle, it has been shown that such resists can achieve a resolution close to 10 nm, additional points must be taken into consideration before they are applicable in manufacturing, including the above-mentioned line edge roughness. As polymers are relatively large molecules, they cannot easily form smooth edges close to the atomic scale. Hence, in parallel to the improvement of CA resists, inorganic electron beam resists, such as hydrogen silsesquioxane (HSQ), are being pursued [34]. Initially, HSQ was used as a low-dielectric (low-k) material, with a k-factor of 2.5-3.0. In addition, HSQ demonstrates good spin-coating properties, such as good gap-fill, global planarization, and crack-free adhesion. It also shows excellent processing characteristics.
5.2.3
Applications
Although EBL has a wide range of applications, the major focus is currently on the fabrication of transmission masks for DUVL and reflective masks for EUVL. Another important application is the direct-writing of patterns on wafers with single beams (also referred to as direct-write EBL; EBDWL). EBDWL is mainly used for fabricating device prototypes and, with recent improvements in shaped electron beam lithography tools, is also suitable for the low-volume production of application-specific integrated circuits (ASICs) and other specialized devices, for example hard disk heads. Multiple-beam EBL such as maskless lithography (ML2), where a single electron beam is split into multiple beams to enable massively parallel EBDWL, shows the potential to complement or even replace optical lithography. The fabrication of imprint templates for NIL, using EBL techniques similar to photomask making, is steadily gaining in importance. Finally, EBL is also combined with optical lithography in volume production, where it is used to fabricate critical structures such as gates. This approach, termed mix-and-match lithography production (or hybrid lithography if a single resist is utilized as both electron resist and photoresist), has special requirements.
5.2.3.1 Photolithography Masks
In optical lithography, the patterns on wafers are reproductions of those on a photomask. As the photomask is used for thousands of chip exposures, the quality of photomasks is critical for optical lithography. Photomasks are fabricated with techniques similar to those used in wafer processing (see Section 5.1). A photomask blank, a glass substrate with a deposited opaque film (usually chromium), is coated with a resist; the latter is exposed with the pattern and developed, and the opaque film is etched. A processed photomask for DUV optical lithography is shown in Figure 5.25 [35].

Three types of photomask are currently in use. The simplest type is the binary mask (Figure 5.26a) [36], which employs only clear areas in the opaque chromium film to project the pattern onto the wafer. Unfortunately, binary masks have the problem that, because of diffraction, the edges of the resist lines do not become straight. To rectify this problem, masks with molybdenum silicide films, which function as phase shifters, are used in phase-shift masks (PSM) (Figure 5.26b) [36]. Recently, so-called chromeless phase lithography (CPL) masks have been introduced for a resolution-enhancement technique (RET) called off-axis illumination, as shown in Figure 5.26c [36].

The pattern of a CPL mask featuring 125 nm lines, applicable for 32 nm optical lithography, is shown in Figure 5.27 [37]. An introduction to the functional principles of the different photomask types and their application in optical lithography is provided in Ref. [1].
Typically, photomask patterning is carried out with beam writers. Whilst optical beam writers can be used for low- and medium-resolution masks, high-resolution masks are prepared using electron beam writers [38]. The general EBL techniques were described in Section 5.2.1; today, raster scanning has been replaced by vector scanning, and both Gaussian-beam and shaped-beam writers are currently employed for mask-making.

Gaussian-beam lithography is usually applied in a different way for photomasks than as described in Section 5.2.1.3. In addition to the IC patterns, photomasks also contain features smaller than the CD, for optical proximity correction (OPC). When using Gaussian-beam writing, creation of the pattern in Figure 5.28 requires a beam size equivalent to the smallest feature (here, the right upper edge), but this leads to long writing times. However, it is possible to expose this pattern with a beam size which is twice the size of the smallest feature. Because of the Gaussian intensity distribution, the 2σ circle touches the edges of the writing grid, whereupon the outline of the pattern can be made by multiple exposures of the spots; this is referred
to as multi-pass gray writing [39]. The straight outlines are exposed four times, while the outlines of the pattern can be moved locally by exposing spots only three, two, or one time. Because of the overlap between spots and the multiple passes, this method has the additional effect of smoothing the exposure.
Although the Gaussian-beam strategy is still used today, high-end mask-making has become a domain of 50 kV variable-shaped beam writers, which have a significantly higher throughput than Gaussian-beam writers. The shaped-beam strategy is applied as discussed in Section 5.2.1; however, mask-making encounters several specific issues, each of which must be addressed.

In addition to the proximity effect (see Section 5.2.1.5), which is a short-range phenomenon, long-range effects also appear which may stretch over large portions of the mask. The re-scattering of incident electrons at the objective electron lens can lead to a long-range background exposure, called the fogging effect. Further, an optimal exposure result may be compromised by subsequent processing steps, such as developing or etching. During developing, the concentration of the developer chemical decreases faster in areas with dense patterns than in areas with sparse patterns. Similar effects appear in both wet and dry etching, and this may lead to long-range distortions of the final patterns, known as the loading effect. Whilst both fogging and loading effects can, in principle, be corrected within post-exposure process steps such as the PEB, the proximity effect correction methods of electron beam writers have been successfully augmented for handling both fogging and loading effects.
As shown in Section 5.2.1.5, forward scattering into the resist can be reduced by increasing the acceleration voltage of the electrons. However, this approach causes a significant increase in scattering within the substrate, and as a result the substrate is heated. For silicon wafers this problem is less pronounced, because silicon has a high thermal conductivity, and distortions of the wafer can be rectified by electrostatic chucks with wafer backside cooling. However, photomasks have a low thermal conductivity and, because of their large thermal capacity, the local temperature increases significantly during exposure. Therefore, mask writing tools currently do not exceed 50 kV acceleration voltage, and there is a reluctance to employ 100 kV tools.
The fabrication of high-end photomasks with EBL requires specialized post-exposure processing equipment. The PEB is a critical process step for CA resists, requiring a temperature uniformity of <0.1 K within the resist plane of the mask. Because of the large thermal capacity of photomasks, and their non-circular shape, such temperature uniformity is difficult to achieve. The PEB equipment is preferably connected directly to the electron beam writer, thus enabling the PEB and development to be conducted immediately after exposure. With such a direct connection, a constant delay between exposure and PEB can also be ensured.

Figure 5.28 Writing a pattern with: (a) vector single-pass; and (b) vector multi-pass strategies.
5.2.3.2 Direct Writing

Figure 5.30 Nanoscale planar double-gate transistor. (a) Top view of the design. (b) Scanning electron microscopy image of the first fabricated structure: bottom gate (produced by EBDWL) [42].

Although increasing the acceleration voltage leads to a significant increase of the scattering within the substrate, and also heats up the substrate, in the case of silicon wafers this problem is less pronounced due to silicon's high thermal conductivity. As distortions of the wafer can be rectified by electrostatic chucks with wafer backside cooling, modern direct-writing tools employ an acceleration voltage of up to 100 kV.
Aside from the electron beam writer, specialized post-exposure processing equipment is required for reliable EBDWL. Similar to photomask fabrication, CA resists are employed, for which the PEB is a critical process step requiring a temperature uniformity of <0.1 K within the resist plane of the wafer. However, because of the high thermal conductivity of silicon wafers, such uniformity is less difficult to achieve than with photomasks. As with photomask fabrication, the post-exposure processing equipment is preferably connected directly to the electron beam writer, enabling the PEB and development to be conducted immediately after exposure.

The fabrication of integrated circuits, either completely with EBL or with mix-and-match lithography, poses some significant challenges in integrating the required

Figure 5.31 FinFET. (a) The design model [43]. (b) Scanning electron microscopy image of the actual device, with 50 nm gate (produced by EBDWL) [44].
process steps, and especially when aligning the different layers of patterned functional films. These challenges, and the methods used to overcome them, are discussed in Section 5.2.4.
5.2.3.3 Maskless Lithography
Considerable effort has been made towards making EBL available for volume production. The initial approach had been to implement either separate multiple electron optical columns [46] or separate multiple electron beams [47] to achieve massively parallel EBDWL. However, as the adequate calibration of either multiple columns or beams is a very challenging task, the suggestion was made to mimic optical lithography by introducing a transmission mask and projection optics with electromagnetic lenses to direct the electrons, which led in time to the development of electron projection lithography (EPL). In parallel, the use of a 1 : 1 transmission mask was also investigated, which led to the development of proximity electron lithography (PEL). As mentioned in Section 5.1, EPL was removed from the ITRS in 2004, due to significant difficulties with the fabrication and application of the EPL transmission masks. PEL was subsequently removed in 2005, although its re-emergence cannot be ruled out completely. Whilst EPL and PEL are currently dormant, the advances made in electron optics have been significant, and hence the idea was conceived to devise electron projection and proximity techniques without the use of a mask, hence the term maskless lithography (ML2). As the current ML2 techniques are, loosely, derivatives of EPL and PEL, the details of both processes are explained in the following sections.
EPL: Principles and Limitations The transmission mask for EPL is a 4 : 1 stencil mask, a thin membrane through which holes are etched for the transmission of electrons. The stencil masks are themselves prepared using EBL, utilizing similar processes as for photomasks (see Section 5.2.3.1). Due to the instability of the membrane, fabrication has proven very challenging. In addition, a stencil mask absorbs electrons where there are no holes, thus causing the mask to undergo considerable heating, which then leads to distortions. Two concepts were devised to overcome this problem:

- SCALPEL (SCattering with Angular Limitation Projection Electron-beam Lithography) employed a scattering mask made from an extremely thin membrane (<150 nm) of low-atomic-number material (e.g., silicon nitride), through which the electrons can pass. The development of SCALPEL has been stopped, mainly because even the smallest deviation in mask membrane thickness resulted in intolerable intensity variations on the wafer. The main principles of SCALPEL are detailed in Ref. [48].
- PREVAIL (PRojection Exposure with Variable Axis Immersion Lenses) employed a stencil mask with a thick membrane (1-2 µm), thereby scattering away the electrons aimed at unexposed spots. This concept is quite similar to the vector shaped-beam strategy presented in Section 5.2.1.4, but instead of a simple shape, a quadratic portion of the stencil mask was printed onto the wafer. The development of PREVAIL has also been stopped, mainly because even today it is not possible to make a 4 : 1 stencil mask with sufficient accuracy.
A further limitation arises from the mutual electrostatic repulsion of the electrons: the space charge density ρ of the beam gives rise to an electrostatic potential Φ according to

\[
\Delta\Phi = -\frac{\rho}{\varepsilon_0} \qquad (5.18)
\]

This electrostatic potential acts as an extended diverging lens. The global charge effect can, in principle, be compensated for by a suitable electron lens [50].
This is not the case for the stochastic charge effect, which is especially pronounced in a crossover, for example at demagnification, where all electrons interact within a small space. The stochastic charge effect leads not only to a beam blur, but also to a beam energy spread, which in turn leads to chromatic aberrations. Global and stochastic charge effects are illustrated in Figure 5.33.
PEL: Principles and Limitations The transmission mask for PEL is a 1 : 1 stencil mask, and electrons are used for the proximity printing of this mask onto the wafer. Initially, PEL employed 10 keV electrons, but currently low-energy electron-beam proximity lithography (LEEPL) [51], which utilizes low-energy electrons of 2 keV, is the PEL technique of choice.

The principle of LEEPL is shown in Figure 5.34. A single electron beam is generated in an electron beam column, and the mask is scanned. The low-energy electrons minimize the proximity effect, but forward scattering degrades the resolution.
As the range of 2 keV electrons in the resist is <150 nm, the resist thickness for LEEPL is limited to 100 nm. In order to achieve the required aspect ratios, bi-layer resist (BLR) processes [52], which were initially developed for extending DUV lithography towards smaller features, must be applied (see Figure 5.35). Currently, LEEPL is not included in the ITRS, as it has encountered several problems. For example, as this is a proximity technique, the distance between the stencil mask and the wafer is very small, usually <50 µm. Therefore, any distortion of the mask or wafer, or the presence of particles, would severely compromise the exposure. Furthermore, the implementation of bi-layer electron resist processes is not yet satisfactory. One positive development, however, has been that stencil masks with a 26 × 26 mm exposure field could recently be prepared.

Figure 5.35 Comparison of single-layer (a) and bi-layer (b) resist processes [51].
Projection Maskless Lithography Projection ML2 (PML2) [53] can be seen as the ML2 equivalent of EPL. Here, a single electron beam is split into multiple beams, with imaging being accomplished by a programmable aperture plate system (APS) [54, 55]. A range of innovative technologies was introduced to overcome the specific problems of both electron beam direct-write and multiple beam application. The column of the demonstration system is shown in Figure 5.36 [53]. This employs a single electron source, therefore avoiding the control problems associated with multiple sources. The primary electron beam has a low acceleration voltage of U = 5 kV, and is widened by condenser optics to fully cover the APS. Because of the low energy of the electrons, the APS cover plate experiences no significant thermal expansion problems. Subsequently, hundreds of thousands of separate electron beams emerge from the APS,

Figure 5.36 Schematic diagram of the PML2 multi-electron-beam column demonstrator [53, 54].
and are accelerated by U = 100 kV, which results in a very high contrast when imaging on the wafer.

The APS, which is shown in detail in Figure 5.37 [53-55], consists of a cover plate, a blanking plate, and an aperture plate. The blanking plate employs MEMS-based structured electrodes for each of the transmission holes, which deflect the electron beam to strike the aperture plate, thereby providing opaque features on the wafer. The diameter of each transmission hole is 5 µm. By using a two-stage electron optics, a reduction of 200× is achieved, leading to a beam size of 25 nm on the wafer. Currently, the exposure area of the APS is 100 × 100 mm.

With this exposure area, patterns on the wafer are written in stripes, as shown in Figure 5.38 [53]. As discussed in Section 5.2.1, this may potentially lead to butting errors but, due to the small stripe size of <300 µm compared to current vector shaped-beam tools, no difficulties are expected.
Figure 5.38 The writing strategy of a multi-electron-beam system [53, 54].

The major challenge for PML2 is certainly to provide the required data transmission rate to the APS control electronics. A proof-of-concept (POC) tool was planned within the MEDEA CMOS logic 0.1 µm project [56], for which a data transmission rate of 36 Gbit s⁻¹ was demonstrated, scalable by channel count. A throughput of approximately 0.1 of a 300-mm wafer per hour was intended with this POC tool, although commercial tools should expose up to five 300-mm wafers per hour. However, this is still a small throughput compared to optical lithography steppers, with typical exposure throughputs of >100 wafers per hour. Nonetheless, it would be a great leap from EBDWL, which accomplishes much less than 0.1 wafer per hour in 65-nm patterning. Another issue is that the 200× reduction requires all electrons to cross in a single region, thus leading to global and stochastic space charge effects, which potentially limit the throughput for the 22-nm node to less than one wafer per hour. To address this problem, an innovative PML2 scheme, with a throughput potential of up to 20 wafers per hour for the 32 and 22-nm nodes, is currently being investigated within the Radical Innovation MAskless NAnolithography (RIMANA) project [57].
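The scale of the data problem can be estimated in a few lines. The sketch below assumes a 300-mm wafer, a 25 nm pixel and 1 bit per pixel; all three are simplifying assumptions, and real tools carry further overheads for gray levels and redundancy, so the result is only an order-of-magnitude check.

import math

def raw_data_rate_gbit_s(wafers_per_hour, pixel_nm=25.0, bits_per_pixel=1.0):
    """Order-of-magnitude aggregate data rate for bitmap-driven ML2 writing."""
    wafer_area_m2 = math.pi * 0.15**2                    # 300-mm wafer
    pixels = wafer_area_m2 / (pixel_nm * 1e-9)**2        # pixels per wafer
    bits_per_second = pixels * bits_per_pixel * wafers_per_hour / 3600.0
    return bits_per_second / 1e9

for wph in (0.1, 5.0):
    print(f"{wph:4.1f} wafers/h -> ~{raw_data_rate_gbit_s(wph):6.0f} Gbit/s")
# -> roughly 3 Gbit/s at 0.1 wafers per hour, but around 160 Gbit/s at five
#    wafers per hour, which illustrates why the demonstrated 36 Gbit/s channel
#    rate must be "scalable by channel count" for commercial throughputs.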
Proximity Maskless Lithography Proximity ML2 (Mapper) [58] can be seen as the ML2 equivalent of LEEPL. Within Mapper, low-energy electrons of 5 keV are used, and the multiple electron beams are generated by splitting a single electron beam that originates from a single electron source. The multiple beams are then separately focused within an electrostatic lens array. The electron beams are arranged in such a way that they form a rectangular slit with a width of 26 mm, the same width as a field in current optical steppers. During exposure, the beams are deflected over 2 µm perpendicular to the wafer stage movement. With one scan of the wafer, a full field of 26 × 33 mm can be exposed. During simultaneous scanning of the wafer and deflection of the electron beams, these beams are switched on and off by light signals, one for each beam. The light signals are generated in a data system that contains the chip patterns in a bitmap format. The column of the proof-of-lithography (POL) tool, implementing 110 electron beams, is shown in Figure 5.39 [58]. A commercial tool would most likely implement 13 000 electron beams, so the bitmap would be divided over 13 000 data channels and streamed to the electron beams at up to 10 GHz, thus enabling a throughput of ten 300-mm wafers per hour.

As with LEEPL, the major challenges for Mapper are the problems arising from the proximity of the exposure, and the resist process. Additionally, as Mapper imple-
ments an electrostatic lens array within a <50 µm distance to the wafer, it is still unclear how the cleanliness of this array can be maintained during wafer throughput.

Overall, as ML2 techniques may provide the performance and throughput advantages of EPL and PEL, whilst avoiding the problems arising from mask fabrication and application, they should have the potential to rival optical lithography [59].
5.2.3.4 Imprint Templates
Nanoimprint lithography (NIL), considered to be a lithography method with the potential to rival optical lithography, is a technique where a patterned template is pressed onto a substrate coated with resist [60]. Currently, photoactivated NIL (PNIL), which uses a monomer resist with low viscosity, is considered to show the highest potential for volume production. The template, which must be constructed from a transparent material such as fused silica, is pressed onto the sample, after which a polymerization reaction is induced in the resist by applying UV light (the technique is thus also called UV-NIL), and the template is removed.

Whilst NIL in itself does not involve exposure with photons or charged particles, the patterned template must first be fabricated, similar to a photomask for optical lithography. The templates are prepared by electron beam lithography, using similar processes as for photomasks (see Section 5.2.3.1). As the template is reproduced without demagnification, and therefore requires the same feature size as the pattern on the wafer, both fabrication and application still pose certain challenges. Currently, efforts are under way to utilize photomask fabrication methods for making PNIL templates [61]. As mentioned above, the templates must be transparent to UV light, and therefore fused silica photomask blanks may be used for template making. An overview of the template process flow is shown in Figure 5.40: in a first lithography step, the template is structured by EBL (first write); in a second lithography step, the pedestal required for imprint is made (second-level write). Further details of the template process flow are presented in Refs. [62, 63].

When using photomask fabrication methods, currently four imprint templates are structured on a single photomask blank, as shown in Figure 5.41a and b. The photomask blank is then diced into separate templates (Figure 5.41c). As the dicing introduces contamination and mechanical strain, a modified fabrication approach must be developed before NIL can be employed in volume production. The size standard for templates resembles the exposure field of current optical lithography steppers (Figure 5.41d).

Complete details of the NIL imprint processes are described in Chapter 7 of this volume.
Figure 5.40 An overview of the fabrication process for two-dimensional templates [62, 63].
Figure 5.41 Application of photomask fabrication methods for template making [64].

5.2.4
Integration
Although EBL is widely used in research because of its flexibility and high resolution, the low throughput and complex maintenance requirements of electron beam writing tools have limited the use of EBDWL in volume production. However, continuous improvements have led to the development of reliable tools with a shaped-beam writing strategy, which fulfill the requirements of current 65 nm node fabrication; according to the ITRS, these are 65 nm dense lines, 45 nm isolated lines, 90 nm contact holes, and an overlay accuracy of <25 nm [16]. Yet, while the required resolution and repeatability have been achieved through developments in tools and CA resists, the overlay accuracy has long posed a major challenge.

Integrated circuits consist of several functional layers, for example metallizations and barriers, which must be fabricated sequentially. In optical lithography, each layer is patterned by a suitably made photomask. When using EBL, two scenarios can occur:
- Either all layers are made by EBL, which is the case for making device prototypes [65], or
- just one critical layer, for example the gates, is made by EBL, and all other layers are made by optical lithography.
The use of mixed EBL and optical lithography in volume production is referred to as mix-and-match lithography production [66], or hybrid lithography [67] if a single resist is utilized as both electron resist and photoresist.

In order to ensure that all subsequent layers are exactly matched, aligning techniques are used in optical lithography. One widely used alignment method in wafer steppers is through-the-lens alignment, where an alignment mark on the wafer is projected onto an alignment mark on the photomask, and a comparison is made. However, this approach is not possible with EBL tools; rather, two types of EBL alignment mark are currently used. The first option is to employ marks made from a film of high-atomic-weight material. This type of mark can be detected by secondary electron emission, but the method may lead to contamination issues and is, therefore, mostly used only for back-end processing [68]. The second option is to create trenches as marks (see Figure 5.42), which are then scanned. This EBL alignment strategy has been used successfully in creating 65 nm node integrated circuits with hybrid lithography [69].

Figure 5.42 Trench alignment mark for EBDWL of a 25 × 25 mm integrated circuit [69].
5.3
Ion Beam Lithography
5.3.1
Introduction
Ion beam lithography (IBL) either utilizes ions to induce a chemical reaction in an ion resist for pattern formation, or directly structures a functional film such as a metallization or barrier layer. When using ion resists, the wavelength of accelerated ions is even smaller than that of electrons, because of their higher mass. For example, the mass of a hydrogen ion is nearly 2000-times the mass of an electron, and therefore a calculation analogous to that in Section 5.2.1 yields, for an ion of mass m and charge Q:

\[
\lambda = \frac{h}{\sqrt{2\,m\,Q\,U}} \qquad (5.19)
\]

A proton accelerated through 100 kV, for example, has λ ≈ 0.09 pm, roughly 43-times (≈ √1836) shorter than that of an electron at the same voltage.

Although IBL is not used for integrated circuit fabrication, it is being applied and continuously improved for the direct-structuring of functional films in the
fabrication of special devices, such as nano-electromechanical systems (NEMS), nanophotonics, nanomagnetics, and molecular nanotechnology devices. Direct-structuring
is also currently being investigated for the fabrication of imprint templates for NIL.
5.3.2
Applications
5.3.2.1 Direct-Structuring Lithography
Focused ion beam (FIB) tools can be used to create patterns directly in functional films on wafers, such as metallizations or barriers, which is referred to as ion direct-structuring (IDS) lithography. The ions, when striking the functional film, cause the material to sputter, such that IDS is also known as ion milling. Another possibility is the local deposition of a functional film, with the ions inducing the decomposition of a process gas at the surface of the wafer.

One major application of FIB tools is the repair of photomasks for optical lithography. As noted above (see Section 5.2.3.1), photomask quality is of utmost importance for yield. If, due to a problem in the manufacturing process, a part of the opaque film remains where it should have been removed, it can be sputtered away (see Figure 5.43) [71]; or, if part of the opaque film is damaged, it can be partially redeposited (see Figure 5.44) [71].
For some applications it is also reasonable to employ techniques initially developed for ion projection lithography (IPL), which introduced a transmission mask and projection optics with electromagnetic lenses to direct the ions. This derivative of IPL is called ion projection direct-structuring (IPDS). As mentioned in Section 5.1, IPL was removed from the ITRS in 2004 due to significant problems with the fabrication and application of the transmission masks.

IPL requires the use of stencil masks (see Section 5.2.3.3), because ions cannot pass through membrane masks, even with the thinnest imaginable membrane. However, a stencil mask absorbs the ions where there are no holes, and the resultant heating of the mask leads to its distortion. Further, with stencil masks it is not possible to make all required patterns with a single exposure, and so sets of complementary masks are required (see Section 5.2.3.3). Although these issues initially appear to make IPL impractical, studies to rectify the situation are ongoing. For example, thermal radiation cooling could be utilized to solve the heating problem, while single stencil masks with fourfold exposure could enable complex patterns, but would require half-sized features [54]. A comprehensive discussion of IPL, in addition to the details of a proposed IPL system, is presented in Refs. [54, 72].

Figure 5.43 Opaque film defect: the repair of an undersized contact hole [71].
IPDS can, for example, be applied to structure magnetic media for high-density data storage in a single exposure, by inter-mixing the films of a multilayer structure [73]. For such an application a single stencil mask is sufficient [74].

Considerable effort has also been made towards developing IPDS for volume production. The most-often investigated approach is to adopt the technique used for multiple-beam EBL (see Section 5.2.3.3): to split a broad ion beam into multiple beams, and to image with a programmable APS. This multiple ion beam projection maskless patterning (PMLP) technique is currently being developed within the project CHARPAN (CHARged PArticle Nanotech) [53, 75], and the column of the demonstration system used is shown schematically in Figure 5.45 [53, 75].

A possible multi-ion-beam tool resulting from CHARPAN would have a wide range of applications; in addition to IPDS, several other ion-beam-induced patterning techniques could be implemented with such a tool.
5.4
Conclusions
The continuous improvement of EBL has always placed it one step ahead of the most advanced optical lithography in integrated circuit fabrication. Although mask fabrication for optical lithography is still its principal application, EBL has become a time- and cost-effective technique for early device and technology development. Further, with mask costs currently increasing enormously, EBL represents a viable option for small-volume production, despite its comparatively low throughput. Additionally, even in medium-volume production, EBL is employed for writing critical layers within mix-and-match and hybrid lithography. Because of ever-increasing device complexity, applying EBDWL using shaped-beam writing tools in combination with advanced CA resists is mandatory. In parallel, efforts are continuing in the investigation of parallel electron beam writing systems (ML2), which show the potential to almost match the throughput of optical lithography, and thus may in time complement or even replace the latter process for high-volume production.

In contrast, in its current form IBL is not applicable to integrated circuit fabrication, although further improvements in FIB tools towards IPDS, as well as the development of parallel ion beam writing systems (PMLP), may lead to its feasible application.

Integrated circuit fabrication aside, both EBL and IBL techniques are currently being used and continuously improved for the fabrication of special devices in low volume, such as nano-electromechanical systems, nanophotonics, nanomagnetics, and molecular nanotechnology devices.
Acknowledgments
The authors thank L. Markwort of Carl Zeiss AG, H. Loeschner of IMS Nanofabrication GmbH, and H.J. Doering and T. Elster of Vistec Electron Beam GmbH, for their support.
References
1 Levinson, H.J. (2001) Principles of
Lithography, SPIE, The International
Society for Optical Engineering,
Bellingham, WA.
2 Rai-Choudhury, P. (1997) Handbook of
Microlithography, Micromachining and
Microfabrication Vol. 1, SPIE, The
International Society for Optical
Engineering, Bellingham, WA.
3 Attwood, D. (2000) Soft X-rays and Extreme
Ultraviolet Radiation, Principles and Applications, Oxford University Press, Oxford.
4 International Technology Roadmap for
Semiconductors, www.itrs.net.
5 Craighead, H.G. (1984) 10 nm resolution
electron-beam lithography. Journal of
Applied Physics, 55, 4430.
6 Breton, B. (2004) Fifty Years of Scanning
Electron Microscopy, Academic Press.
7 Hawkes, P.W. (1989) Electron Optics,
Academic Press.
8 Ximen, J. (1990) Canonical aberration
theory in electron optics. Journal of Applied
Physics, 68, 5963.
9 Hu, K. and Tang, T.T. (1998) Lie algebraic aberration theory and calculation method for combined electron beam focusing-deflection systems. Journal of Vacuum Science & Technology B, 16, 3248.
10 Rose, H. and Wan, W. (2005) Aberration
correction in electron microscopy. IEEE
Particle Accelerator Conference Proceedings,
p. 44.
11 Chu, H.C. and Munro, E. (1998) Computerized optimization of electron-beam lithography systems. Journal of Vacuum Science & Technology B, 19, 1053.
12 Uno, S., Honda, K., Nakamura, N.,
Matsuya, M. and Zach, J. (2005) Aberration
correction and its automatic control in
scanning electron microscopes. Journal for
Light and Electron Optics, 116, 438.
33 Qimonda, www.qimonda.com.
34 Lutz, T., Kretz, J., Dreeskornfeld, L., Ilicali, G. and Weber, W. (2005) Comparative study of calixarene and HSQ resist systems for the fabrication of sub-20 nm MOSFET device demonstrators. Microelectronic Engineering, 78-79, 479.
35 Advanced Mask Technology Center, www.
amtc-dresden.com.
36 ASML, www.asml.com.
37 Koepernik, C., Becker, H., Birkner, R.,
Buttgereit, U., Irmscher, M., Nedelmann,
L. and Zibold, A. (2006) Extended process
window using variable transmission PSM
materials for 65 nm and 45 nm node. SPIE
Proceedings, Vol. 6283, p. 1D.
38 Beyer, D., Loffelmacher, D., Goedel, G., Hudek, P., Schnabel, B. and Elster, Th. (2001) Tool and process optimization for 100 nm mask making using a 50 kV variable shaped e-beam system. SPIE Proceedings, Vol. 4562, p. 88.
39 Dameron, D.H., Fu, C.C. and Pease, R.F.W.
(1988) A multiple exposure strategy for
reducing butting errors in a raster-scanned
electron beam exposure system. Journal of
Vacuum Science & Technology B, 6, 213.
40 Berger, L., Dress, P., Gairing, T., Chen,
C.J., Hsieh, R.G., Lee, H.C. and Hsieh,
H.C. (2004) Global CD uniformity
improvement for mask fabrication with
nCARs by zone-controlled post-exposure
bake. Journal of Microlithography,
Microfabrication, and Microsystems, 3, 203.
41 Ehrlich, C., Edinger, K., Boegli, V. and
Kuschnerus, P. (2005) Application data of
the electron beam based photomask repair
tool MeRiT MG. SPIE Proceedings, Vol.
5835, p. 145.
42 Weber, W., Ilicali, G., Kretz, J., Dreeskornfeld, L., Roesner, W., Haensch, W. and Risch, L. (2005) Electron beam lithography for nanometer-scale planar double-gate transistors. Microelectronic Engineering, 78-79, 206.
43 Kretz, J., Dreeskornfeld, L., Hartwich, J. and Roesner, W. (2003) 20 nm electron beam lithography and reactive ion etching for the fabrication of double gate FinFET devices.
6
Extreme Ultraviolet Lithography
Klaus Bergmann, Larissa Juschkin, and Reinhart Poprawe
6.1
Introduction
6.1.1
General Aspects
The attainable resolution, R, and the depth of focus, DOF, are determined by the wavelength λ, the numerical aperture NA, and the process constants k1 and k2:

$$ R = k_1 \frac{\lambda}{NA}; \qquad DOF = k_2 \frac{\lambda}{NA^2} \qquad (6.1) $$

Reducing the exposure wavelength therefore leads from the deep ultraviolet (DUV) into the extreme ultraviolet
(EUV) range. A reduction from 193 to 13.5 nm in EUV lithography relieves the situation with the process constants and the numerical aperture. Thus, 16 nm is expected to be printable with an NA of 0.35 and k1 equal to 0.41 [1].
According to the current roadmap for semiconductors [2], EUV lithography will be introduced into the production process at the 45 nm node during the year 2011, together with improved 193 nm technologies. In moving towards ever smaller features below 22 nm and approaching the physical limits of silicon-based chips, EUV seems to be the only photon-based solution from today's point of view [1].
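These numbers follow directly from Equation 6.1, as the minimal sketch below shows; the process constant k2 = 1 used for the depth of focus is an assumption of this sketch, not a value from the text.

```python
# Rayleigh scaling, Eq. 6.1, with the EUV numbers quoted above
k1, wavelength, na = 0.41, 13.5, 0.35     # process constant, nm, numerical aperture
resolution = k1 * wavelength / na          # ~15.8 nm, consistent with "16 nm printable"
k2 = 1.0                                   # assumed order-of-unity process constant
dof = k2 * wavelength / na**2              # ~110 nm depth of focus under this assumption
print(f"resolution ~ {resolution:.1f} nm, DOF ~ {dof:.0f} nm")
```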
The step from DUV to EUV, however, implies a variety of technological changes compared to the conventional technology. Operating in the EUV range requires that all components are held in a vacuum in order to avoid absorption in an ambient gas. EUV radiation has the strongest interaction with matter, that is, the highest cross-section for absorption; typical penetration depths of EUV radiation into solids are in the range of a few hundreds of nanometers. The optical system and the mask to be imaged onto the wafer consist of reflecting multilayer mirrors, and the source is no longer a UV laser but rather an incoherent plasma source emitting isotropically at a wavelength around 13.5 nm. The technology still requires further development before reaching industrial maturity, which is expected to be achieved by about 2010.
A simple discussion of Equation 6.1 allows an estimation to be made of the required wavelength region when aiming at a resolution of, for example, better than 70 nm and a depth of focus of more than several hundreds of nanometers. This consideration leads to a wavelength below 20 nm. Multilayer-based mirrors with a high normal-incidence reflectivity at a wavelength of about 13 nm are currently available, and the semiconductor industry has fixed the wavelength to 13.5 nm as a standard in order to maintain lithium-based plasmas as an option, as these have a strong line emission at this wavelength. With our current knowledge of all components, such as the mirror reflectivity, the sensitivity of the resist, the parameters of the optical system and the desired wafer throughput, it is possible to specify the requirements for the source, which can be considered as the least known component of the system. Although a solution for the final concept has not yet been found, some early examples of source and system specifications have been identified, for example in Refs. [3, 4]. In addition, the actual requirements are updated continuously, taking into account new aspects and increasing knowledge [5].
The present chapter provides an overview of the system architecture of an EUV
scanner, together with the demands made on each component, namely the light
source, optical components for beam propagation and imaging, masks and resists.
The current status of development and future challenges are also addressed.
6.1.2
System Architecture
A variety of EUV lithography systems have been installed during the past few years in order to test the whole chain, beginning at the source and ending at the wafer level, or using simpler, high-NA systems with small imaging fields for printing fine structures [4, 6, 7].

Figure 6.1 Schematic view of the ASML alpha demo tool, the first full-field scanner for extreme ultraviolet (EUV) lithography. The collector and source themselves are not shown; the beam propagation at the respective location of the source-collector module is shown on the left.

The principle of an EUV scanner will be explained with the example of
the ASML alpha demo tool, which can be regarded as the latest development and which is close to the future lithography tool with respect to design. A schematic of the alpha demo tool is shown in Figure 6.1 (taken from Ref. [8]). Using a collector, preferably a Wolter-type nested-shell collector, the light of a plasma source is focused into the so-called intermediate focus, the second focal point of the collector (Figure 6.1 shows only the beam propagation from the source and the collector to the second focus, but not the hardware itself). The light is fed into the illuminator, which consists of a set of spherical multilayer mirrors and is used to produce a banana-shaped illumination of the mask (the top optical element in Figure 6.1). Another set of mirrors is used to image this field onto the wafer (the bottom optical element in Figure 6.1), with a typical magnification of 0.25. Wafer and mask are moved continuously to scan the whole mask and to transfer the structures onto a wafer of typically 300 mm diameter. In contrast to conventional DUV scanners, all of the components are contained inside a vacuum in order to achieve a high optical transmission for the EUV light. The etendue of the current optical system is around 3.3 mm² sr, which leads directly to a specification for the source size [9]. The optical system is able to use all the light from a spatially extended plasma source of around 1.6 mm in length along the optical axis and 1 mm in diameter in the radial direction, which is emitted into the solid angle of the collector. The plasma source is operated in a pulsed mode, ultimately requiring repetition rates of 7–10 kHz to guarantee a sufficiently homogeneous illumination of the resist. There are requirements on all components of the scanner, that is, on the source, collector, optical system and components, masks and resist, and on the system itself, concerning, for example, vacuum conditions and contamination issues. Today, the source is regarded as the most critical component, although this might also be due to the fact that other components were developed prior to the plasma source, and consequently less experience is available with this component.
The wafer throughput is furthermore dependent on the mechanical properties of the mask and the wafer handling system. According to Ref. [10], a simplified wafer throughput model can be formulated:
$$ T = T_\mathrm{scan}\,N + T_\mathrm{oh} = \left(t_\mathrm{acc} + t_\mathrm{settle} + t_\mathrm{exp} + t_\mathrm{settle} + t_\mathrm{dec}\right) N + T_\mathrm{oh} = N\left[\frac{2P}{a_w W R} + \frac{(L+H)\,W R}{P} + 2\,t_\mathrm{settle}\right] + T_\mathrm{oh} \qquad (6.2) $$
where Tscan is the scanning time per field, N is the number of fields per wafer, Toh is the overhead time (wafer exchange, wafer alignment, . . .), tacc (tdec) is the acceleration (deceleration) time, texp is the field exposure time, tsettle is the stage settling time after acceleration and before deceleration, P is the EUV power at the wafer level, aw is the acceleration of the wafer stage, W (H) is the field width (arc height, i.e. slit width) of the banana-shaped field, L is the field height, and R is the sensitivity of the resist. With this model the wafer throughput as a function of the reticle stage acceleration has been estimated for different illumination power levels ranging from 160 to 640 mW on the wafer [5]. The higher dose leads to a higher throughput only if the acceleration is increased; for example, a throughput of 80 wafers per hour can only be achieved for the 640 mW case if the acceleration is more than 1.5 g, whereas in the 160 mW case a lower acceleration is sufficient. With higher acceleration values and higher power levels at the wafer, the throughput is, of course, higher. These numbers are based on a resist sensitivity of R = 5 mJ cm⁻², a stage settling time of 25 ms, an overhead time of 11.5 s, a field size of 25 × 25 mm², 89 fields per wafer, and an exposure slit of 2 × 25 mm².
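Evaluating Equation 6.2 with exactly these parameters reproduces the quoted throughput. The only assumption in the following sketch is the mapping of the 1.5 g reticle acceleration onto the wafer stage via the 4x demagnification (a_w = 1.5 g/4):

```python
g = 9.81
P = 0.640                       # W, EUV power at wafer level (the 640 mW case)
R = 50.0                        # J/m^2, i.e. 5 mJ/cm^2 resist sensitivity
W, H, L = 25e-3, 2e-3, 25e-3    # slit width, arc height, field height (m)
N, T_oh, t_settle = 89, 11.5, 25e-3
a_w = 1.5 * g / 4.0             # wafer-stage acceleration (assumed reticle value / 4)

T_scan = 2*P/(a_w*W*R) + (L + H)*W*R/P + 2*t_settle   # bracketed term of Eq. 6.2
T = N * T_scan + T_oh
print(f"{T:.1f} s per wafer -> {3600/T:.0f} wafers per hour")   # ~80 wph
```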
The first results have been obtained with the alpha demo tool, and the possibility of full-field imaging has been successfully proven. An example of printed lines from 50 nm down to 35 nm and the corresponding line edge roughness (LER), using a resist of 18 mJ cm⁻² sensitivity, is shown in Figure 6.2. Full-field imaging over more than 20 mm slit height at the wafer level, with a depth of focus of more than 240 nm, has been demonstrated experimentally [6, 8].

Figure 6.2 First results obtained with the ASML alpha demo tool of printed lines and spaces with resolution down to 35 nm and the respective LERs. The sensitivity of the resist was 18 mJ cm⁻².
6.2
The Components of EUV Lithography
6.2.1
Light Sources
According to Ref. [5], some of the requirements for the source and the lifetime of the system for a production tool are as follows:

Wavelength: 13.5 nm
In-band bandwidth (2%): 0.27 nm
In-band power at intermediate focus: >115 W
Repetition rate: 7–10 kHz
Lifetime: 30 000 h
Etendue of source output: <3.3 mm² sr
Spectral purity (out-of-band radiation): to be determined
The use of a multilayer mirror system (see Section 6.2.3) restricts the usable bandwidth to 2% around 13.5 nm, or 0.27 nm, which is termed in-band radiation. The throughput model is based on a 5 mJ cm⁻² sensitivity of the resist, which has not yet been achieved (as discussed below); a less sensitive resist would lead to higher source power specifications. The incoherent plasma source emits not only light but also debris in the form of particles, at least from the EUV-emitting plasma, irrespective of the source concept. Thus, some type of debris mitigation element is required between the source and collector, appropriate to the actual source design. Typical collector half-opening angles range up to 70–80°. The total overall efficiency of the collector and the debris mitigation system can be estimated as around 20% of all the in-band light emitted into the hemisphere of 2π sr [12], assuming a transmission of the debris mitigation system of 50%. This requires an in-band emission of the source of at least 600 W/(2% b.w. 2π sr). Reasonable conversion efficiencies of, at maximum, a few percent of the input energy into usable EUV radiation require a power input in the range of several tens of kilowatts. This imposes strict thermal demands, especially for the cooling of the debris mitigation system and the collector, which is the closest optical element to the source.
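The power chain can be followed with simple arithmetic, as in the sketch below, which uses only numbers given in this section; the 2% conversion efficiency is one plausible value within the stated "few percent".

```python
p_if = 115.0                 # W, in-band power required at intermediate focus
eta_col = 0.20               # collector + debris mitigation overall efficiency
p_source = p_if / eta_col    # ~575 W -> "at least 600 W/(2% b.w. 2*pi sr)"
ce = 0.02                    # assumed conversion efficiency (a few percent at most)
p_input = p_source / ce      # ~29 kW -> "several tens of kilowatts"
print(f"source: {p_source:.0f} W in-band, input: {p_input/1e3:.0f} kW")
```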
The multilayer mirrors have a finite transmission in the DUV range of 130 to 400 nm, which means that the emission from the source should not be too great within this wavelength region, as the resists are also sensitive in the DUV. The final specification is still under consideration and is, for example, dependent on progress in spectral purity filters with minimum losses for EUV radiation.
6.2.1.1 Plasmas as EUV Radiators
Extreme ultraviolet radiation sources can be divided into thermal and non-thermal emitters. Non-thermal emitters are X-ray tubes or synchrotron radiation sources, where the radiation is generated by deflecting charged particles. Thermal emitters, based on the generation of hot plasmas, are a cost-effective and compact solution for EUV lithography. Generally, for thermal emitters matter is heated up to a high temperature, T, where the limit of the emission of light can be described by Planck's law of radiation:

$$ B_\lambda(T) = \frac{2hc^2}{\lambda^5}\,\frac{1}{e^{hc/(k_B T \lambda)} - 1} \qquad (6.3) $$
with the speed of light, c, Planck's constant, h, and the Boltzmann constant, kB. For a blackbody radiator, the temperature and the wavelength of maximum emission are related by Wien's law, which is derived from Equation 6.3:

$$ \lambda_{\max}\,T \approx 250\ \mathrm{nm\,eV} \qquad (6.4) $$
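For orientation, Equation 6.4 implies that a blackbody whose emission peaks at 13.5 nm has a temperature of about 250 nm eV / 13.5 nm ≈ 18.5 eV (roughly 2 × 10⁵ K), which sets the temperature scale required of EUV-emitting plasmas.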
Figure 6.3 Typical emission spectra in the EUV for tin- and xenon-based gas discharge plasma sources. In the case of tin (bold line), the laser-induced plasma appears similar, whereas for xenon the laser-induced emission spectrum is smoother due to overlapping emission lines. The transitions of tin around 13.5 nm are iso-electronic to those of xenon around 11.0 nm.
6.2.1.2 Laser-Induced Plasmas
In laser-induced plasmas, the laser light is absorbed by the plasma electrons only up to the critical electron density, which depends on the angular frequency ω, or wavelength, of the laser:

$$ n_\mathrm{crit} = \frac{\varepsilon_0 m_e \omega^2}{e^2} = 1.11 \times 10^{21}\ \mathrm{cm}^{-3} \left(\frac{\lambda_\mathrm{laser}}{\mu\mathrm{m}}\right)^{-2} \qquad (6.5) $$

The temperature of the resulting plasma roughly scales with the laser intensity, ILaser, according to Te ∝ ILaser^(4/9) [14, 15]. For a Nd:YAG laser with λlaser = 1.064 μm, the electron temperature, Te, can be estimated as described in Ref. [16].
Maximum conversion efficiencies of 5%/(2π sr, 2% b.w.) for solid tin targets have been reported in the literature [18]. The conversion efficiency is defined as the ratio of the usable in-band EUV radiation emitted into 2π sr to the incident laser light energy. In order to meet the source power requirement of 115 W in the intermediate focus, an average laser power of more than 5 kW is required, assuming an optimistic efficiency of 50% for the collector and the debris mitigation system. Obtaining high-power pulsed lasers at this level is an issue in current research and development activities. Different laser concepts are under discussion, such as pulsed CO2 lasers or diode-pumped solid-state lasers, as reported elsewhere [19, 20, 22, 24]. However, it has not yet been shown that laser-induced plasmas can operate continuously at this power level. Further details on laser-induced plasmas are available elsewhere [21, 23].
Besides the availability of the laser itself, the target is still an issue. Currently, different target concepts are under discussion, such as mass-limited targets to reduce debris production to a minimum level, and gaseous targets or droplet targets of frozen liquids or gases [18]. Most of the effort is currently being expended on tin-based targets, which have the highest expected conversion efficiencies.
6.2.1.3 Gas Discharge Plasmas
Producing the hot plasma by an electrical discharge is another well-known method for the generation of light. For EUV-emitting plasmas, a pulsed electrical current is fed into an electrode system, which is filled with the working gas to be heated at a neutral gas pressure of several tens of Pa. In a simplified picture, the current can be assumed to flow through a plasma cylinder, which is compressed by the self-magnetic field of the current to a high density of typically up to ne ≈ 10¹⁹ cm⁻³. The plasma also experiences ohmic heating, finally resulting in a dense and hot plasma column with an electron temperature of several tens of eV, a typical diameter of several hundreds of micrometers, and a length in the range of a few millimeters. The necessary current, I0, can be approximated by assuming a Bennett equilibrium of the magnetic force and the plasma pressure [25]:

$$ \frac{\mu_0 I_0^2}{8\pi^2 r_p^2} = (n_i + n_e)\,k_B T_e \qquad (6.7) $$
where μ0 is the magnetic field constant, rp is the radius of the compressed plasma column, and ne and ni are the electron and ion densities, respectively. The term rp²(ni + ne) can be expressed through the starting radius, a, of the neutral gas column and the neutral gas pressure, p, as being proportional to pa² [26]. As an example, for a xenon plasma with 10-fold ionized ions (⟨Z⟩ = 10, ne = ⟨Z⟩ni) and a desired electron temperature of 35 eV, a current of 8 kA results, which is also characteristic of the devices under investigation. This pulsed current is usually produced by the fast discharge of a charged capacitor, C, which is connected in a low-inductance manner to the electrode system. Typical values for the inductance of the system are around 10 nH, with a few 100–1000 nF for the capacitance. Stored pulse energies are in the range of 1 to 10 J.
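The 8 kA figure can be reproduced from Equation 6.7 together with particle conservation, rp²(ni + ne) = n0·a²(1 + ⟨Z⟩). In the following sketch the neutral pressure p = 20 Pa and the starting radius a = 2 mm are assumed values, chosen within the ranges quoted above:

```python
import numpy as np

mu0, kB, eV = 4e-7 * np.pi, 1.381e-23, 1.602e-19
p, a, T_gas = 20.0, 2e-3, 300.0   # assumed: neutral pressure (Pa), start radius (m), gas temp (K)
Z, Te = 10, 35 * eV               # mean ionization <Z>, electron temperature (J)

n0 = p / (kB * T_gas)             # initial neutral gas density, m^-3
rhs = n0 * a**2 * (1 + Z) * Te    # (n_i + n_e) * kB * Te * r_p^2 via particle conservation
I0 = np.sqrt(8 * np.pi**2 * rhs / mu0)   # Eq. 6.7 solved for the current
print(f"I0 ~ {I0/1e3:.1f} kA")    # ~8-9 kA, as quoted in the text
```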
A variety of different concepts exist for discharge-based plasmas, which differ
mainly in the special geometry of the electrode system and the ignition of the plasma.
For further information the reader is referred to numerous other reports [11, 27, 28].
Figure 6.5 The Philips NovaTin EUV source based on the vacuum arc concept, which is used in ASML's alpha demo tool.
6.2.2
Collector
The currently preferred technical solution for collecting the light of the isotropically emitting plasma is the nested Wolter-type multi-shell collector [41]. A single collector shell consists of a hyperboloidal and an ellipsoidal shell with identical focal points. The source is located at this common focal point, and is focused into the other focal point (the intermediate focus); the collector, although not an imaging element, leads to a magnification of the source by a factor of about 10 in the intermediate focus. The first optical element is about 50–100 cm behind this second focus. This type of collector allows light to be collected over a relatively large opening angle with a moderate grazing incidence angle at the reflecting surface, which is of particular importance for high reflectivity in the EUV. Such grazing incidence optics usually have a ruthenium coating with large reflectivity up to angles of 20°, which is typical of applications in EUV lithography. An example of a multi-shell collector, which is used for modeling the light distribution after the intermediate focus at the first optical element of the illuminator, is shown in Figure 6.7. A collection angle of more than 80° half-opening angle, with a total efficiency, including the finite reflectivity of the ruthenium coating, of more than 40%/(2π sr), is reported by the collector supplier [12].
Figure 6.8 shows the theoretical angular-dependent reflectivity of a ruthenium coating for different surface roughnesses, σ, and the typical range of operation for the collector. It should be noted that two reflections occur, one at the hyperboloid and one at the ellipsoid. The data points are based on the atomic data for the refractive index published by CXRO [42]. The state of the art at the collector manufacturers (for example, Zeiss or Media Lario) involves surface roughnesses below 1 nm and chemically clean surfaces based on a physical vapor deposition (PVD) coating technology. Thus, the transmissions of the collectors are close to the theoretical limit. In addition, there is no longer any difficulty in fabricating substrates for the shells with diameters of several tens of centimeters.
Figure 6.7 also shows a typical light distribution for an eight-shell collector and a point source in a plane after the intermediate focus; this indicates the contributions of the different shells and the shadow of the mechanical support structure for the shells. Usually, the collector is located at a distance of only a few tens of centimeters from the plasma, which is also a thermal source in the 10 kW range. Cooling of the collector is achieved by using water-cooled lines around each collector shell. Results relating to the cooling capabilities are reported in Ref. [43], with a temperature increase of less than 1 K when operated in the vicinity of a high-power source. Currently, research is going on in order to clarify whether this approach is sufficient, or whether more sophisticated cooling strategies must be applied, including, for example, a homogeneous temperature increase of the shell surfaces.
Another option is a normal-incidence collector based on a Schwarzschild design, with two spherical, multilayer-coated mirrors. Some possible solutions to this problem are presented in Refs. [44–46].
As the optical element closest to the source, the collector experiences most of the heat load and debris emitted from the source. Consequently, overcoming the problems of a limited collector lifetime represents one of the major issues in current EUV lithography development activities. Today, such investigations are under way not only in industry but also at various academic institutes, all of which have encountered this problem [47, 48]. Within the present chapter, the details of various debris mitigation schemes, and the results obtained, are restricted to the activities at Philips, whose clear aim is to integrate the above-mentioned tin-based gas discharge source.
The minimum number of particles necessary for the effective generation of an EUV-emitting pinch plasma is about 10¹⁵ atoms per pulse. These particles can be assumed to be emitted into 4π sr, partly redeposited onto the electrodes, and also emitted towards the optical system. For a 1-hour operation at 5 kHz this corresponds to approximately 3.5 g of tin. Such an amount is clearly excessive, assuming that this will be deposited on the collector surface, where even a few nanometers thickness is unacceptable due to the reduced reflectivity of a tin-coated surface. Hence, both the deposition of material and sputtering of the optical coating by the emitted particles must be avoided. The particles are emitted in the form of fast ions with energies exceeding 10 keV, as neutrals, and also as droplets from the wet electrode surfaces. One highly effective method of stopping and removing particles is the so-called foil trap concept, which is described in more detail in Ref. [48]. The approach, which is shown schematically in Figure 6.9, uses a system of lamellas located between the source and the collector. The foil trap is operated in combination with a buffer gas, usually argon, which has a high transmission in the EUV. The emitted particles are deflected and finally stopped by the ambient argon atoms, and then stick to the walls of the foil trap. In the case of tin, the foil trap is heated above the tin melting point in order to avoid an accumulation of tin in the system of lamellas. Using only this technique, a collector lifetime of more than 10⁹ shots has been reported for operation with a tin-based discharge source [39]. However, the foil trap concept does not permit complete suppression of the emission of particles towards the collector, and consequently for longer operating times a deposition of tin on the collector surface cannot be avoided.
6.2.3
Multilayer Mirrors
For EUV radiation, the index of refraction is close to unity, and the absorption in matter is relatively high. A high reflectivity at surfaces is only achieved for incidence angles of the light typically below 20°. This feature is, for example, exploited with grazing incidence optics, as presented above. A high reflectivity at normal incidence is only achieved with multilayer systems, as shown schematically in Figure 6.10. Such multilayer systems consist of alternating layers of so-called spacer material and absorber material, which have different indices of refraction and are thus reflective at their boundaries. Part of the incident light is reflected at each layer boundary, and the superimposed beam exhibits a high intensity. A well-known example, which is also used in EUV lithography, is a system consisting of silicon and molybdenum with a high peak reflectivity around a central wavelength of 13–14 nm. A transmission electron microscopy (TEM) image of a real mirror is also shown in Figure 6.10.
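The bilayer period follows from standard thin-film interference (a textbook relation, not reproduced from this chapter): the partial reflections add constructively when

$$ m\,\lambda \approx 2 d \cos\theta \quad\Rightarrow\quad d \approx \frac{\lambda}{2} = \frac{13.5\ \mathrm{nm}}{2} \approx 6.8\ \mathrm{nm} $$

at normal incidence (θ = 0, m = 1). Refraction in the layers shifts the period of practical Mo/Si mirrors slightly, to about 7 nm, with several tens of bilayer pairs contributing to the reflectivity.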
The mirror diameters are 165 mm for M1, 209 mm for M2, 104 mm for M3, and 170 mm for M4. The corresponding radii of curvature are 3055 mm for M1, 1088 mm for M2, 389 mm for M3, and 504 mm for M4 [55], where the sign of the radius indicates whether a surface is concave or convex. Usually, a system of several mirrors is chosen in order to have sufficient degrees of freedom for the correction of aberrations and other imaging errors. Typical diameters of the mirrors reach 200 mm for the ETS system, and even more for other optical systems. In order to meet the imaging specifications, the root mean square (RMS) figure error and the roughness must be below a certain level. The surface specifications will be discussed in more detail below. Usually, the surface topology is described by a function z(x, y). For simplicity, the following discussion and definitions are for the one-dimensional case z(x); the extension to two dimensions is described elsewhere [56].
The average of the surface height is defined as:

$$ \bar{z} = \lim_{L\to\infty} \frac{1}{L} \int_{-L/2}^{L/2} z(x)\,dx \qquad (6.9) $$
with L being the spatial extension of the surface under consideration. The surface roughness, σ, is given by:

$$ \sigma^2 = \lim_{L\to\infty} \frac{1}{L} \int_{-L/2}^{L/2} \left(z(x) - \bar{z}\right)^2 dx \qquad (6.10) $$
It is useful to discuss the Fourier transform of the surface in terms of the spatial frequency, fx:

$$ Z(f_x, L) = \int_{-L/2}^{L/2} z(x)\,e^{-2\pi i f_x x}\,dx \qquad (6.11) $$
The power spectral density (PSD) function is often used for the characterization of a surface; it can also be measured directly by scatterometry [56], and is related to the Fourier transform Z(fx, L):

$$ \mathrm{PSD}(f_x) = \lim_{L\to\infty} \frac{1}{L} \left| Z(f_x, L) \right|^2 \qquad (6.12) $$
As implied by Equation 6.10, the squared roughness is the integral of the PSD function over all spatial frequencies. Often, different regions are defined in the specifications depending on the respective frequency interval fmin to fmax:

$$ \sigma_{\Delta f}^2 = 2 \int_{f_{\min}}^{f_{\max}} \mathrm{PSD}(f_x)\,df_x \qquad (6.13) $$

The surface figure error corresponds to frequencies typically ranging from the inverse aperture to 1 mm⁻¹. This type of error is responsible for aberrations.
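The definitions in Equations 6.9–6.13 translate directly into a discrete estimate. The following sketch applies them to a synthetic profile; the trace length, sampling and frequency band are arbitrary illustrative choices:

```python
import numpy as np

N, dx = 1000, 1.0e-3                 # samples and spacing (mm): a 1 mm trace
L = N * dx                           # evaluation length, mm
rng = np.random.default_rng(0)
z = rng.normal(0.0, 0.2e-6, N)       # heights in mm (about 0.2 nm RMS)

z = z - z.mean()                     # subtract the average height, Eq. 6.9
sigma = np.sqrt(np.mean(z**2))       # RMS roughness, Eq. 6.10

Z = np.fft.rfft(z) * dx              # discrete form of Eq. 6.11
f = np.fft.rfftfreq(N, dx)           # spatial frequencies, mm^-1
psd = np.abs(Z)**2 / L               # Eq. 6.12, one-sided up to Nyquist
df = f[1] - f[0]

band = (f >= 1.0) & (f <= 100.0)     # band-limited roughness, Eq. 6.13
sigma_band = np.sqrt(2 * np.sum(psd[band]) * df)   # factor 2 counts +/- f
print(sigma, sigma_band)
```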
6.2.4
Masks
In contrast to conventional DUV lithography, the masks are also based on multilayer-coated reflective mirrors. A cross-section of a mask is shown schematically in Figure 6.14. The mask blank is defined as the part comprising the substrate and a protective layer of, for example, SiO2, necessary for the patterning process. The structures to be imaged onto the wafer are written onto the surface using an absorber layer of typically 100 nm thickness. The preferred absorber materials are Cr, TaN, Al or W. However, as the mask is imaged, there are additional specifications in comparison to the multilayer mirrors for the optical system [58, 59], and these will be addressed in the following.
The substrate must have a low coefficient of thermal expansion (CTE) of typically less than 5 ppb K⁻¹, as approximately 40% of the incident EUV light is absorbed and heats up the mask. A low thermal expansion is required in order to avoid any magnification correction between wafer changes, and also to minimize image placement distortion due to thermal expansion of the mask. As for the multilayer optics, the roughness specification is divided into high spatial frequency roughness (HSFR) and mid-spatial frequency roughness (MSFR). The HSFR (λspatial < 1 μm) should be below 0.10–0.15 nm (RMS) in order to reduce the losses due to scattering of light out of the entrance pupil of the optical system. In order to reduce small-angle scattering and image speckle, the MSFR (1 μm < λspatial < 10 mm) should also be below 0.1–0.2 nm (RMS). A peak reflectivity of more than 67%, with a centroid wavelength uniformity across the mask of below 0.03 nm, is required.
Another specification refers to the defect density, which must be below 0.003 defects cm⁻² for defects larger than 30 nm. This is the most challenging demand for the masks, and is one of the most critical issues in EUV lithography. As EUV light has a strong interaction with matter, and thus a short penetration depth of typically <100 nm, defects on the masks have a much higher probability of being printed than at other wavelengths such as in the UV region. Thus, special care must be taken to reduce the defects on the masks to a level of 0.003 per cm², or, in other words, to less than a few defects per mask.
Many different categories of defect have been defined, and many activities are required simply to reduce their number on masks, mask blanks and substrates by cleaning and repair techniques [60], by detecting printable defects [61], and by simulating their influence on the image at the wafer level [62]. Although defects on the substrate will be buried after the multilayer coating is applied, the various types of defect may lead to phase errors of the reflected light. Once such a defect has been localized, the absorber structure can be aligned appropriately to cover the defect and thus reduce its influence. The influence of defects (particles) on the mask depends on their size. If they are sufficiently small, they are not seen in the de-magnified image on the wafer; hence, only defects larger than 20–30 nm are of interest. The influence of the defect size on the image is discussed in Ref. [62].
In order to obtain an impression of the current status, the defect densities achieved on mask blanks are taken from Ref. [1]. For defects larger than 120 nm the density is 0.03 defects cm⁻², while for defects >60 nm a density of 0.3 defects cm⁻² is achieved, this being more than two orders of magnitude away from the final specification. As yet, no appropriate metrology is available for smaller defects.
6.2.5
Resist
The use of higher photon energies and the printing of ever-smaller features require the development of a new generation of photoresists compared to those currently used in DUV lithography. According to the International Technology Roadmap for Semiconductors (ITRS), a resist thickness of between 40 and 80 nm is required for EUV, while the line edge roughness (LER) and the critical dimension control (resolution) should be below 1 nm (3σ) [63, 64]. It should be noted that these specifications imply near-atomic-scale resolution, which is not achievable with sufficiently high sensitivity using the resists and concepts currently available. To ensure a certain wafer throughput, the resist sensitivity, that is, the number of photons or energy per unit area required to convert the resist molecules into soluble components, should be in the range of a few mJ cm⁻². The required thickness of less than 80 nm is lower than for DUV resists because of the short absorption length of EUV radiation in the resist.
In a chemically amplified (CA) resist, the photogenerated acid diffuses during the post-exposure bake and smears out the initial exposure distribution. For a line pattern of pitch p, an acid diffusion constant D and a bake time tf, the corresponding diffusion-related modulation transfer function can be written as:

$$ \mathrm{MTF}_\mathrm{diff} = \frac{p^2}{4\pi^2 D t_f}\left(1 - e^{-4\pi^2 D t_f / p^2}\right) \qquad (6.14) $$
The respective diffusion length, Ld, is defined by Ld² = 2Dtf. The modulation transfer function is shown as a function of the ratio p/Ld in Figure 6.15, where MTFdiff = 1 implies no change of the initial distribution. For example, if a deterioration to 70% is accepted, the diffusion length should not exceed a value of 0.2p. This limitation of the diffusion length also implies a limitation in resist sensitivity. It should be noted that, with decreasing pitch, the absolute value of the diffusion length must also decrease, which therefore means less sensitivity in the transition from DUV to EUV.
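A two-line check of Equation 6.14 confirms the 70% figure; this is a minimal sketch using only values from the text:

```python
import numpy as np

def mtf_diff(ld_over_p):
    # Eq. 6.14 with Ld^2 = 2*D*tf, so 4*pi^2*D*tf/p^2 = 2*pi^2*(Ld/p)^2
    x = 2 * np.pi**2 * ld_over_p**2
    return (1 - np.exp(-x)) / x

print(mtf_diff(0.2))   # ~0.69: a diffusion length of 0.2*p gives ~70% contrast
```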
Another parameter describing the quality of a resist is the LER, which is distinct from the achievable resolution in terms of the above discussion. Generally, the LER is dependent on the spatial distribution of soluble components after exposure, with a number density A. The LER is given by the ratio of the standard deviation of this density and its gradient, leading to LER ∝ σA/∇A. For a sufficiently low number of photons, both parameters are determined by the incident number of photons. Here, a variation due to the Poisson statistics (shot noise) of the absorbed photons and, in the case of a CA resist, of the number of produced acids also comes into play. In general, the standard deviation, σN, of the number of photons, N, in a certain volume is proportional to √N, while A ∝ N. Consequently, the LER scales as LER ∝ 1/√N, or 1/√E, where E is the incident dose. For a chemically amplified resist, the volume which is relevant to the estimation of the number of photons is the diffusion sphere. Thus, for a low diffusion length the LER is proportional to Ld^(-3/2) when the dose is kept constant. With increasing diffusion length and lower variation due to photon statistics, the diffusion process and the MTF become dominant. The scaling of LER with the diffusion length can be expressed as [66]:

$$ \mathrm{LER} \propto \frac{1}{L_d^{3/2}}\;\frac{1}{\mathrm{MTF}_\mathrm{diff}(L_d/p)} \qquad (6.15) $$
The LER scaling factor according to Equation 6.15 is shown in Figure 6.16 as a function of the diffusion length relative to the pitch. Two regions can be distinguished: for Ld/p < 0.33 the scaling is dominated by the photon statistics, whereas for Ld/p > 0.33 the acid diffusion process is relevant for the LER. These two scaling regions are also observed experimentally [66].
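The crossover can be recovered numerically from Equations 6.14 and 6.15; in this sketch the proportionality constant in Equation 6.15 is irrelevant, since only the location of the minimum is of interest:

```python
import numpy as np

def mtf_diff(x):                      # Eq. 6.14 in terms of x = Ld/p
    u = 2 * np.pi**2 * x**2
    return (1 - np.exp(-u)) / u

x = np.linspace(0.05, 1.0, 2000)      # diffusion length relative to pitch
ler = x**-1.5 / mtf_diff(x)           # relative LER scaling, Eq. 6.15
print(x[np.argmin(ler)])              # ~0.33-0.34: where the two regimes meet
```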
The absolute value of the LER is still dependent on the resist sensitivity, or the necessary dose, which leads to the scaling with 1/√E. This is illustrated in Figure 6.17, which shows the LER achieved for resists of different sensitivity [67]. The estimated shot noise limit is also indicated. The theoretical limit has clearly not yet been reached, as the experimental data lie slightly above this limit.
A number of other parameters, such as resist thickness, molecular size or outgassing, are also relevant for use in EUV lithography (see the discussion in Ref. [65]). In summary, it is quite challenging to meet the specifications for a chemically amplified resist for use in EUV lithography, and in fact such a resist does not yet exist. In terms of LER and resolution, the specifications may be achieved with a less sensitive resist, at the expense of throughput.
Finally, it is illustrative to discuss the role of shot noise simply by estimating the number of photons involved in the exposure process. The incident number of photons per unit area, I, can be rewritten in terms of the necessary dose, E, and the wavelength, λ, of the photons:

$$ I = 5.0\times10^{-2}\;\frac{\mathrm{Ph}}{\mathrm{nm}^2}\;\frac{\lambda}{\mathrm{nm}}\;\frac{E}{\mathrm{mJ\,cm^{-2}}} \qquad (6.16) $$
In the transition from DUV at 193 nm to EUV at 13.5 nm, the reduced wavelength alone leads to a more serious influence of the photon statistics. For EUV radiation and an envisioned resist sensitivity of E = 5 mJ cm⁻², we obtain 3.4 Ph nm⁻². With regard to the specifications of LER and CD control below 1 nm, it is clear that shot noise becomes a limiting factor in EUV lithography. This not only requires the development of new resist materials, but also implies that many research investigations will be necessary over the next few years.
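For a concrete feel for these numbers, the following sketch evaluates Equation 6.16; the 20 nm × 20 nm pixel is a hypothetical choice at the scale of the features discussed:

```python
wavelength, dose = 13.5, 5.0                  # nm, mJ/cm^2
photons_per_nm2 = 5.0e-2 * wavelength * dose  # Eq. 6.16 -> ~3.4 Ph/nm^2
n = photons_per_nm2 * 20 * 20                 # photons in a 20 nm x 20 nm pixel
rel_noise = n ** -0.5                         # relative Poisson fluctuation
print(f"{n:.0f} photons, {100*rel_noise:.1f} % shot noise")   # ~1350, ~2.7 %
```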
6.3
Outlook
Intensive research and development activities conducted during the past decade have shown that EUV lithography has the potential to provide a solution for the high-volume manufacture of semiconductor devices. Moreover, the technique has the potential to decrease structure sizes to 11 nm, into the range of the physical limits of silicon-based semiconductor technology. Several machines have been installed to demonstrate the capability of printing small structures using EUV radiation; most notably, the ASML alpha demo tool exhibits the full architecture with respect to the optical system, wafer and mask handling for a scanning operation, and full-field imaging. The diffraction-limited printing of small structures down to 29 nm has also been successfully demonstrated, though major efforts are still needed to meet the requirements on the components and to drive the technology to its theoretical limits in different areas. These challenges extend not only to the source power but also to the simultaneous high reliability and long lifetime of the source, and this is valid for both laser-induced and discharge-based plasmas. Further issues here include debris mitigation in order to increase the collector lifetime, collector thermal management, and increasing the collection opening angle. Additional studies are also required on the lifetime and contamination of the optical system by oxygen and hydrocarbons under EUV radiation, on defect-free masks, and on resists with a sufficiently high sensitivity at high resolution and low LER.
Despite the final specifications not yet having been met for several components, progress is nonetheless being made in all fields. For example, activities during the past few years have led to plasma sources which emit in-band radiation into the hemisphere at a power level of several hundred watts, close to the final specification, something that in the past was believed to be the most critical issue in EUVL. Major progress is also being achieved in improving the lifetime of both the source and the collector, using sophisticated debris mitigation techniques. In fact, a collector lifetime of more than 1 Gshot has recently been demonstrated in operation with a tin-emitting plasma source, and an ultimate lifetime of 100 Gshot seems feasible.
References
1 van den Brink, M. (2006) The only cost effective extendable lithography option: EUV. Third International Symposium on EUV Lithography, Barcelona.
2 International Technology Roadmap for Semiconductors (ITRS). The current version is available from www.sematech.org or www.itrs.net.
3 Ceglio, N.M., Hawryluk, A.M. and Sommargren, G.E. (1993) Front-end design issues in soft-X-ray projection lithography. Applied Optics, 32 (34), 7050–7056.
4 Gwyn, C.W., Stulen, R., Sweeney, D. and Attwood, D. (1998) Extreme ultraviolet lithography. Journal of Vacuum Science & Technology B, 16 (6), 3142–3149.
5 Ota, K., Watanabe, Y., Banine, V. and Franken, H. (2006) EUV source requirements for EUV lithography, in EUV Sources for Lithography (ed. V. Bakshi), SPIE Press, Bellingham, Washington, pp. 27–43.
7
Non-Optical Lithography
Clivia M. Sotomayor Torres and Jouni Ahopelto
7.1
Introduction
In the quest to use nanofabrication methods to exploit the know-how and potential of nanotechnology, one major roadblock is the high cost factor which characterizes high-resolution fabrication technologies such as electron beam lithography (EBL) and extreme ultraviolet (EUV) lithography. The need to circumvent these problems of cost has inspired research and development in alternative nanofabrication, also referred to as emerging or bottom-up approaches. Hence, it is within this context that the status and prospects of nanoimprint lithography (NIL) are presented in this chapter.
Nanofabrication needs are highly diverse, not only in the materials used but also in the range of applications. Within the physical sciences, the drive is to realize nanostructures in order to produce artificial electronic, photonic, plasmonic or phononic crystals. This, in turn, depends on an ability to realize periodic or quasi-periodic arrays of nanostructures: on the one hand, to meet the stringent demands of periodicity, order and critical dimensions needed to obtain the desired dispersion relation; and, on the other hand, to identify a reproducible, cost-effective and reliable way in which such materials may be fabricated, using a suitable form of nanopatterning. Nanopatterning covers a wide range of methods, from top-down to bottom-up approaches (for discussions, see Chapters 5, 6, 8 and 9 of this volume).
In fact, the needs for lithography are found in several fields:
• In (nano)photonics, a field in which, in addition to packaging, the cost of fabrication of III-V semiconductor optoelectronic devices containing nanostructures in the form of photonic crystals is prohibitive.
• In organic opto- and nano-electronics, where the lifetime issue of the organic materials is compounded with that of cost-effective volume production with lateral resolution down to a few hundreds of nanometers for electrodes and pixels.
• In micro-electro-mechanical systems (MEMS) and nano-electro-mechanical systems (NEMS), where the fabrication of resonators, cantilevers and many other structures, with and without direct interfaces to Si-based electronics, requires the control of 3-D nanofabrication with minimum damage to the underlying electronic platform.
7.2
Nanoimprint Lithography
7.2.1
The Nanoimprint Process
Historically, NIL was preceded by some remarkable events. In the twelfth century, metal type printing techniques were developed in Korea; for example, in 1234 the Kogumsangjong-yemun (Prescribed Ritual Text of Past and Present) appeared, while in 1450 Gutenberg introduced his press and printed 300 copies of the two-volume Bible. Somewhat strangely, an extensive time lapse then occurred until the early twentieth century, when the first vinyl records were produced by hot embossing [11]. The next major step occurred during the 1970s, when compact discs were fabricated by injection molding.
The term nanoimprint lithography was most likely used for the first time by Stephen Y. Chou, when referring to the patterning of the surface of a polymer film or resist with lateral feature sizes below 10 nm [3]. Previously, within a larger lateral size range, the method was referred to as hot embossing. At about the same time, Jan Haisma reported the molding of a monomer in a vacuum contact printer and subsequent curing by UV radiation, which became known as mold-assisted lithography [12]. The first comprehensive review of these two approaches appeared in 2000 [13], but since then several excellent reviews of NIL have been produced [14, 15]. During recent years these methods have developed further and have come to be known as thermal nanoimprint lithography and UV-nanoimprint lithography (UV-NIL), respectively.
The question must be asked, however: what is NIL? Nanoimprint lithography is basically a polymer surface-structuring method which functions by making a polymer flow into the recesses of a hard stamp in a cycle involving temperature and pressure. In order to nanoimprint a surface, three basic components are required: (i) a stamp with suitable feature sizes; (ii) a material to be printed; and (iii) the equipment for printing, with adequate control of temperature and pressure and of the parallelism of the stamp and substrate. The NIL process is illustrated schematically in Figure 7.1. In essence, the process consists of pressing the solid stamp, using a pressure in the range of about 50 to 100 bar, against a thin polymer film. This takes place while the polymer is held some 90–100 °C above its glass transition temperature (Tg), on a time scale of a few minutes, during which the polymer can flow to fill the volume delimited by the surface topology of the stamp. The stamp is detached from the printed substrate after cooling both it and the substrate. The cycle, which is illustrated graphically in Figure 7.2, involves time, temperature, and pressure. Here we meet the main issues of NIL: polymer flow and rheology. Although these points have been addressed from the materials point of view in Refs. [16, 17], they remain a serious challenge for feature sizes below 20 nm. These aspects will be discussed in the following sections.
In UV-NIL the thermal cycle is replaced by curing the molded polymer with UV light through a transparent stamp. This requires different polymer properties, as will be discussed in the next section.
7.2.2
Polymers for Nanoimprint Lithography
The polymers used in NIL play a critical role. A comparison of the ten most often used polymers (resists) for thermal NIL is provided in Ref. [15]. The resists determine both the quality of printing and the throughput. Quality is achieved via the thickness uniformity of the spin-coated film, strong adhesion to the substrate, and weak adhesion to the stamp. Throughput is determined by the duration of the printing cycle, which in turn is governed by several time scales, including:
• the time needed to reach the printing temperature (the higher the Tg, the longer the cycle, unless there is a pre-heating stage);
• the hold time for optimum flow at the printing temperature (the more viscous the polymer, the longer the hold time);
• the time needed for cooling and demolding.
Different criteria must be met for thermal NIL and for UV-NIL [18]. For thermal NIL, the polymer is used as a thin film, a few hundred nanometers thick, which is spin-coated onto the support substrate. The key properties are of a thermodynamic nature, and therefore these polymers are of the thermoplastic and thermosetting varieties, with varying molecular weights, chemical structures, and rheological and mechanical behaviors. The dependence of the viscosity of a thermoplastic polymer on temperature is shown graphically in Figure 7.3, which illustrates the region where thermal NIL takes place. The mechanical behaviors of polymers in different temperature regimes, in relation to the molecular mobility, are listed in Table 7.1.
Table 7.1 Mechanical behavior of polymers in the different temperature regimes, in relation to the molecular mobility.

Below the sub-transition temperature: glassy state; brittle mechanical appearance; Young's modulus about 3000 N mm⁻²; molecular conformation completely fixed; stress causes energy-driven elastic distortions.

Between the sub-transition temperature and Tg: glassy, viscoelastic state; hard elastic, rigid appearance; Young's modulus about 1000 N mm⁻²; molecular conformation largely fixed, with occasional changes in the molecular positions of side groups and chain segments; stress causes energy-driven elastic distortions.

Between Tg and Tflow: rubber-elastic state and appearance; Young's modulus about 1 N mm⁻²; no restricted rotation around single bonds; stress causes entropy-driven elastic distortion, and besides temperature the deformation rate affects the mechanical behavior. Imprinting is possible, but will show memory effects.

Above Tflow: viscous state; plastic appearance; Young's modulus too small to measure; whole macromolecules change their positions, gliding past each other (plastic flow, macro-Brownian motion); stress causes pseudoplasticity and shear thinning. This is the best printing temperature range.
One of the polymer strategies used to reduce the printing temperature and to improve thermal stability has been to cure prepolymers (special precursors of crosslinked polymers). Here, the term curing refers to the photochemically (UV)- or thermally induced crosslinking of macromolecules to generate a spatial macromolecular network. The prepolymers are low-molecular-weight products, with a low Tg, which are soluble and contain functional groups for further polymerization. Thus, lower printing temperatures of about 100 °C can be used. Curing can take place during the printing time, or thereafter, with the thermal stability enhancement arising from the crosslinking of the macromolecules.
Polymers for UV-NIL must be suitable for liquid resist processing; that is, they are characterized by a lower viscosity than the polymers for thermal NIL. Naturally, they must also be UV-curable over short time scales [19]. The characteristics of these polymers after printing, whether for direct use, as in polymer optics or microfluidics, or as a mask for subsequent pattern transfer by means of dry etching, demand high mechanical, thermal and temporal stability. In photonic applications, stability in terms of optical properties, such as the refractive index, is also essential. Recently, micro resist technology GmbH [20] has developed a whole range of polymers for thermal and UV-NIL (for a discussion, see Ref. [21]). Moreover, tailoring the polymer properties to increase the control of critical dimensions remains an area where, although rapid progress has recently been made [18, 19, 21], further research investigations are still required. The importance of this research may be appreciated especially in the one-to-one filling of the stamp cavities, thereby making the printed polymer features resilient to residual layer removal. Moreover, polymer engineering is also a determinant of larger throughputs, in terms of shorter times for the curing and pressure cycles.
In recent years several reports detailing mechanical studies of thermal NIL have been made, and the interested reader is referred to the data of Hirai [22], the review of Schift and Kristensen [15], and a recent review of the research on simple viscous squeeze flow theory [23].
7.2.3
Variations of NIL Methods
To date, four main variations of the NIL process have been developed, and these are briefly described below.
7.2.3.1 Single-Step NIL
This is the most commonly used method, in which a polymer is printed in one temperature-pressure cycle; it has been extended to the printing of 150-mm [24] and 200-mm wafers [25]. A scanning electron microscopy (SEM) image of an array of lines of 200 nm width printed over a 200-mm silicon wafer is shown in Figure 7.4. Although one-step thermal NIL can be performed using regular laboratory-scale equipment, commercially available tools include, among others, those of OBDUCAT [26] and EVG [28], which are available in Asia and the Americas. Thermal expansion may cause distortions in the imprinted pattern; to avoid this, strategies for room-temperature NIL have been investigated [28, 29].
(usually quartz), the fabrication of which is at present less straightforward than for
thermal NIL.
7.2.3.4 Roll-to-Roll Printing
This is an advanced sequential method stemming from production methods used in, for example, the newspaper industry. Roll-to-roll nanoimprinting is a versatile method that can be combined with other continuous printing techniques, as shown schematically in Figure 7.8. Its extension to 100-nm lateral resolution has been reported [36]. Recent developments suggest that roll-to-roll nanoimprinting is the closest to an industrial technology for organic opto- and nano-electronics, as well as for lab-on-chip device fabrication. The challenge is to fabricate the round stamps, that is, the printing rolls with nanometer-scale features. Moreover, due to the nature of the continuous process, some restrictions may arise in applications requiring multilevel patterning with high alignment accuracy between the layers. Examples of the feature sizes that can be obtained with a laboratory-scale roll-to-roll printer are shown in Figure 7.9.
7.2.4
Stamps
Stamps for NIL have been discussed extensively in Ref. [15]. The main considerations from the materials aspect include:
• Hardness (e.g., typically from 500 to thousands of kg mm⁻²), which determines the stamp lifetime and the way in which it wears out.
• Thermal expansion coefficient (e.g., typically from 0.6 to 3 × 10⁻⁶ K⁻¹), as well as Poisson's ratio (e.g., typically from 0.1 to 0.3), which have a strong impact on distortion during demolding.
• Surface smoothness (e.g., better than 0.2 nm), as a rough surface will require large demolding forces and may lead to stronger than needed adhesion.
• Young's modulus (e.g., typically from 70 to hundreds of GPa), which in turn controls possible stamp bending. The latter effect may lead to uneven residual layer thickness, and thus compromise critical dimensions.
• Thermal conductivity (e.g., typically from 6 to hundreds of W m⁻¹ K⁻¹), which determines the duration of the heating and cooling cycles.
With regard to fabrication, the parameters to be considered include the minimum lateral feature size or resolution, the aspect ratio (feature lateral size : feature height), the homogeneity of the feature height across the stamp, as well as depth homogeneity, sidewall roughness, and sidewall inclination. For thermal NIL, stamps are usually fabricated in silicon by using EBL and reactive ion etching, for the highest resolution and versatility. Unfortunately, these procedures are rather expensive, so strategies for lower-cost replication have been developed, such as SSIL, or the use of a master stamp and a (negative) first-generation replica made by NIL, or a (positive) second-generation replica, again made by NIL. Replication while maintaining a resolution of 100 nm or better requires electroplating to replicate the original in metal. Current developments in the replication of a master stamp in thermosetting polymers show great promise, as they are expected to greatly reduce the cost of stamp replication [37]. As nanoimprint is a 1-to-1 replication technology, it is essential that the stamp has the correct feature sizes required on the wafer, thus emphasizing the need for quality stamps.
In the past, stamps have been realized for wafer-scale thermal NIL (see Figure 7.10). In addition, electron-beam-written silicon stamps for thermal NIL are commercially available from NILTechnology [38]; an example is shown in Figure 7.11.
In recent years, the adhesion between the stamp and the printed polymer film has been the subject of significant research effort in thermal NIL. Here, the main issue is to ensure that the interfacial energy between the stamp and the polymer film to be printed is smaller than the respective interfacial energy between the substrate and the polymer film [39]. However, with the materials commonly used, this matching alone is not sufficient for easy detachment, in which frozen strain also plays a role. The normal practice, in order to facilitate demolding and to prolong the stamp lifetime, is to coat the stamp with an anti-adhesive layer to minimize the interfacial energy and, therefore, the adhesion. Values of the surface energies of materials commonly used in the NIL process are listed in Table 7.2. These data show that a fluorinated compound can dramatically reduce the surface energy and minimize adhesion while demolding a stamp from the printed polymer.

Figure 7.11 Silicon stamp from NILTechnology [38]. (Illustration courtesy of NIL Technology.)

Table 7.2 Surface energies of materials commonly used in the NIL process (mJ m⁻²).

PMMA: 41.1
PS: 40.7
PTFE: 15.6
CF3 and CF2: 15–17
Silicon surface: 20–26
For both UV-NIL and SFIL, UV-transparent stamps are required, and these are typically made of quartz. Although the fabrication of quartz stamps for high resolution has not yet been standardized, various efforts have been made to use photomask fabrication methods to prepare stamps or templates for UV-NIL [40]. A schematic overview of the stamp fabrication process is shown in Figure 7.12. In a first lithography step, the stamp is structured using EBL (first-level writing), while in a second lithography step the pedestal required for imprinting is made (second-level writing). Further details on the process flow for UV-NIL stamps can be found in Refs. [41, 42]. By using photomask fabrication methods, four imprint stamps or templates may be structured on a single photomask blank (see Figure 7.13a and b). The photomask blank is then diced into separate stamps (Figure 7.13c). As dicing introduces some contamination and mechanical strains, a modified fabrication process must be introduced before step-and-repeat and step-and-flash UV-NIL can be employed in volume production. The size standard for stamps resembles the exposure field of current optical lithography steppers (see Figure 7.13d).

Figure 7.12 Schematics of the fabrication process of two-dimensional stamps for step-and-stamp and/or step-and-flash UV-NIL [41, 42].

Figure 7.13 Photomask fabrication methods for UV-NIL stamp fabrication [43].
Recently, stamps for the 3-D structuring tests of several layers of functional films using UV-NIL, targeting back-end CMOS processes, have been developed, using stamps similar to that shown schematically in Figure 7.13d.
The fabrication of stamps with high-resolution features for roll-to-roll nanoimprinting is more complicated, because the patterning of curved surfaces is not straightforward. One possible way to overcome this is to make a bendable shim that is wrapped around the printing roll. Such bendable large-area stamps can be fabricated by electroplating, and exploiting SSIL in large-area patterning has been shown to reduce the fabrication time remarkably [44]. A 100 mm-diameter bendable Ni stamp is shown in Figure 7.14; this figure also shows that sub-100-nm features can be easily reproduced by using an electroplating process.
The details of stamps used for 3-D printing are discussed later in the chapter.
7.2.5
Residual Layer and Critical Dimensions
Most of the processes described above yield a nanostructured polymer layer (as shown schematically in Figure 7.1), with a residual layer under the features of the stamp. If the desired nanostructured surface is the polymer itself, with no material between the features, then the residual layer must be removed. The same applies [44] if the patterned polymer or resist is to be used directly as a mask for pattern transfer into the substrate by, for example, reactive ion etching, or if a metal lift-off step will be needed to produce a metal mask for further pattern transfer [45]. An etching step applied to the printed polymer, whether to remove the residual layer or to form a mask, necessarily changes the feature sizes. This was shown in the variation of the width of printed Aharonov-Bohm ring leads following removal of the residual layer by etching, and after metal lift-off: the leads increased in width by 15 nm from the targeted width of 500 nm, averaged over 20 samples [32].
This means that, in order to control the critical dimensions of the printed features, the residual layer thickness uniformity must also be controlled, as its removal leads to a size fluctuation of the resulting nanostructures. Significant efforts have been made to develop non-destructive metrology for nanoimprinted polymers. One of the most salient approaches is to apply scatterometry to NIL [46], in order to determine both the feature height and the residual layer thickness. Being based on the principles of ellipsometry, a laser spot is used, which is scanned over the region of interest. An example is shown in Figure 7.15 (right panel), with a cross-sectional SEM image of the printed ridges and the corresponding scatterometry data and curve fitting. The left panel of Figure 7.15 depicts an optical reflection image of a printed wafer, showing the thickness variation of the residual layer across the wafer [47]. An in-situ and non-destructive method was demonstrated by adding chromophores to the printed polymer and using their emission as an indicator of stamp deterioration (such as missing features), mirrored in the printed fields. Although the resolution of this method was poor, it did at least demonstrate the feasibility of the in-line monitoring of printed arrays of nanostructures [48].
motion in thin films over large distances. There are two basic considerations on this point:
• The force has only a linear dependency on the viscosity, which changes by orders of magnitude when the temperature is in the vicinity of Tg.
• For a Newtonian fluid, or a fluid in the limit of small shear rates, the viscosity does not depend on the shear rate. However, at moderate or high shear rates, non-linear flow can lead to a decrease in viscosity by several orders of magnitude. The effect of this on the NIL process would be seen as a reduction either in the pressure needed or in the processing time.
The question is, therefore: what are the contributions to the force from the pressure and the shear stress of the fluid motion? Clearly, the pressure is related to the contact area of the stamp and the fluid (the polymer above Tg), whereas the shear stress is related to the flow velocity, which in turn depends on the distances over which the fluid must be transported, and therefore on the particular stamp design. At any given time, the conditions of continuity and conservation of momentum of an incompressible liquid require that the velocity must increase with radial distance, which results in a parabolic velocity profile in the z-direction. The velocity is lowest at the interfaces with the disk walls, and greatest in the middle of the gap. To this, the inversely proportional cubic dependence on the liquid layer thickness must be added. The calculated values of viscosity and transport thickness tend to agree with the observed experimental values for polymethyl methacrylate (PMMA) and, rather simplistically, some basic trends can be obtained (see the sketch following this list):
• Thermal NIL works best for the smallest features (sub-100 nm), which are close
together and for which a local flow takes place, allowing easy and reliable filling
of the stamp cavities.
• Conversely, large features (>10 µm) separated by large distances require a large
displacement of material, and larger forces. (A numerical sketch of these scalings follows.)
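These trends follow from the classical Stefan solution for the squeeze flow of a Newtonian film under a circular, disk-like protrusion, in which the required force scales as R^4/h^3. The sketch below is ours, not from the chapter, and all numbers are purely illustrative (a PMMA-like melt well above Tg); it is a rough estimate, not a process model:

import math

def stefan_force(eta, R, h, thinning_rate):
    """Force needed to thin a Newtonian film of viscosity eta (Pa.s) under a
    disk of radius R (m) at thickness h (m): the Stefan squeeze-flow law."""
    return 3.0 * math.pi * eta * R**4 * thinning_rate / (2.0 * h**3)

def stefan_time(eta, R, F, h0, hf):
    """Time to squeeze the film from thickness h0 down to hf at constant force F."""
    return 3.0 * math.pi * eta * R**4 / (4.0 * F) * (1.0 / hf**2 - 1.0 / h0**2)

eta = 1e4            # Pa.s: polymer melt well above Tg (illustrative)
R = 100e-6           # 100 µm protrusion radius
h0, hf = 200e-9, 50e-9
F = 100.0            # N applied to this single protrusion
print(f"squeeze time: {stefan_time(eta, R, F, h0, hf):.1f} s")

Doubling the protrusion radius in this model multiplies the required force (or time) by sixteen, which is the essence of why widely separated large features are so much harder to print than dense small ones.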
Here, the force-displacement curve results reviewed in Ref. [23] are highly illuminating, as a more complete model requires the consideration of several flow fields
arising from the different shapes, depths, and separations of cavities in the stamp.
Schift and Heyderman carried out a thorough analysis in this respect in the linear
micrometer regime [50].
In the linear regime, the temperature dependence of the viscosity is viewed as a
thermally activated process [49, 50], following a formalism for amorphous polymers and
remaining within the limit of small shear rates. The non-linear regime
is substantially more complex, and is basically exemplified by shear thinning and
extrudate swelling. Hoffmann suggested that shear thinning, with its inherent shear
rate-dependent viscosity, may influence the thermal NIL process, especially for small
features [49].
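As a rough illustration of both regimes (this sketch is ours, not from the chapter): the Williams-Landel-Ferry (WLF) equation is one common formalism for the thermally activated drop of the zero-shear viscosity of amorphous polymers above Tg, and a Carreau-type model is one common description of shear thinning. The constants below are the conventional "universal" WLF values together with an assumed nominal Tg for PMMA; real materials deviate:

def wlf_viscosity(T, Tg, eta_Tg, C1=17.44, C2=51.6):
    """WLF shift of the zero-shear viscosity above Tg ('universal' constants)."""
    aT = 10.0 ** (-C1 * (T - Tg) / (C2 + (T - Tg)))
    return eta_Tg * aT

def carreau_viscosity(eta0, gamma_dot, lam=1.0, n=0.4):
    """Carreau model: Newtonian plateau at small shear rates, shear thinning
    (slope n-1 on a log-log plot) at moderate and high shear rates."""
    return eta0 * (1.0 + (lam * gamma_dot) ** 2) ** ((n - 1.0) / 2.0)

Tg = 105.0   # deg C, nominal value for PMMA
for T in (120.0, 150.0, 180.0):
    print(T, wlf_viscosity(T, Tg, eta_Tg=1e12))   # Pa.s, drops by many decades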
A key remaining issue is the understanding of how the stored deformation energy
depends on the rate at which the temperature and pressure are applied and released,
and to what degree these influence the mechanical stability of the printed polymer
features on time scales of weeks, months, and years.
In practice, in order to gain an understanding of the filling dynamics of a stamp
cavity under the combined effects of squeezing flow, polymer rheology, surface
tension and contact angle in typical NIL experiments, full fluid-solid interaction
models based on the continuum approach have been devised [51, 52]. In these, both
the fluid bed and the solid stamp are represented, and continuity of displacement
and pressure is applied at the interface. As these are based on finite elements, there
are almost no limits to the choice of the material's constitutive behavior, and they
clearly reflect the effects of stamp anisotropy and the shear-thinning behavior of
the polymer. In particular, they are especially efficient at predicting the shape of the
polymer in partially filled cavities.
Coarse-grain methods have proven powerful in computing the residual layer
thickness of the embossing process [53]. Based on the Stokes equation, they solve the
simple squeeze-flow equation for Newtonian fluids over embossed areas of up to
several square millimeters within a matter of minutes. Here, the calculation is based
on the determination of a homogenized depth which is representative of the average
pressure applied to the area. The quantitative agreement has proven excellent, and is
generally acceptable when the polymer layer is embossed well above Tg [54]. Freezing
of the embossed structures through Tg has also been modeled [55, 56], and has
provided valuable insight into the build-up of internal stresses prior to stamp release
and at the polymer-stamp interface. It has also been possible to simulate the
demolding process, and thus the final shape of the embossed structures, both after
stamp release and after relaxation for a given period of time [57]. Two examples of the
models described in this section are shown in Figures 7.16 and 7.17.
7.2.6
Towards 3-D Nanoimprinting
One special aspect of NIL and SFIL, compared to other lithographies, is their ability to
pattern in three dimensions. Several applications require this ability, from
MEMS to photonic crystals, including a myriad of sensors. One of the first demonstrations of 3-D patterning by NIL was the realization of a T-gate for microwave
transistors with a footprint of 40 nm by a single-step NIL and metal lift-off [58]. SFIL
also showed its 3-D patterning ability in the fabrication of multi-tiered structures,
maintaining a high aspect ratio [59].
If metal lift-off is to be avoided, then 3-D NIL requires 3-D stamps. These
are produced by gray-scale lithography with sub-100 nm resolution, but are limited
in depth and volume production due to the sequential nature of EBL [60]. One
recent variation of this approach consists of using inorganic resists and low-acceleration electron-beam writing, thus allowing the depth to be controlled to tens
of nanometers [61].
A combination method based on focused ion beam milling and isotropic wet
etching has been demonstrated by Tormen et al. [62], and resulted in tightly controlled
3-D profiles in the range from 10 nm to 100 µm.
Within the microelectronics industry, one of the main expectations from NIL was
its application as a lithography method in the dual damascene process for back-end
CMOS fabrication [1], and this process is still undergoing testing today.
Bao et al. showed that it is possible to print over non-flat surfaces by thermal NIL,
using polymers with different mechanical properties and progressively lower Tg-values
for each subsequent layer [63]. This means that a different
polymer must be used for each layer. To overcome this limitation, several other
variations and combinations of methods based on NIL have been developed. One
such development is that of reversed contact ultraviolet nanoimprint lithography
(RUVNIL) [64], which combines the advantages of both reverse nanoimprint
lithography (RNIL) and contact ultraviolet (UV) lithography. In this process, a UV
crosslinkable polymer and a thermoplastic polymer are spin-coated onto a patterned
hybrid metal-quartz stamp. The thin polymer films are then transferred from the
stamp to the substrate by contact at a suitable temperature and pressure, after which
the whole assembly is exposed to UV light. Following separation of the stamp and
substrate, the unexposed polymer areas are rinsed away with a suitable developer,
leaving behind the negative features of the original stamp. The process is shown
schematically in Figure 7.18.
By using the same UV-curable polymer for each layer, 3-D nanostructures have
been obtained (Figure 7.19). This technique offers a unique advantage over reverse-contact NIL and thermal NIL: by controlling the UV exposure, no residual layer is
obtained. This avoids the usual post-imprinting etching step, and therefore
results in much better control of the critical dimensions. Another interesting
feature is that it is not necessary to treat the stamp with an anti-adhesive coating.
Three-dimensional UV-NIL can potentially be used in the fabrication of modern integrated circuits, which employ several layers of copper interconnects, separated by an interlayer dielectric (ILD) and connected by copper vias (Figure 7.20).
An imprintable, UV-curable ILD material is deposited on an existing interconnect layer (Figure 7.20a) and structured by a three-dimensional
stamp (Figure 7.20b). After UV-curing, the material resembles a structured ILD
(Figure 7.20c), which is then filled with copper to form two layers of vias and
interconnects (Figure 7.20d).
7.2.7
The State of the Art
[Table: the state of the art of imprint lithography, comparing NIL, SSIL(d), UV-NIL(k), soft UV-NIL and SFIL(e). Recoverable column headings: technique; minimum pitch (nm); smallest/largest features in the same print; largest wafer printed (mm); overlay accuracy (nm); alignment, print, release and cycle times; number of times the stamp has been used; and materials. Recoverable entries include minimum pitches of 14, 50, 50(f), 12(m) and 150(t) nm; feature ranges such as 8 nm minimum features, 5 nm(a)/N/A, 25 nm(f)/µm, 9 nm/100 µm(l), 25 nm/20 µm(s), and 50 nm/5 mm on the same stamp; wafers from 150 mm(u) up to 200 mm(b); overlay accuracies of about 20(o), <250 and 500 nm; cycle times from 20 s per step(p) (three wafers/h(q)), through a full cycle of 2.5 min with, or 20 s without, full auto-collimation (20 wafers/h(j)), to 45 min (about 12 wafers/h(v)); stamp reuse of >50(c), >50(w), 800(g), 1000 and >1000(r); and materials including mr-I 7000/8000, NXR, NXR-Mod, PAK 01, AMONIL 1/2, Laromer, NILTM105 and MRT07xp. Superscript letters refer to the notes below; the original row-column alignment was not recoverable.]
Notes to the table:
- T.-W. Wu, M. Best, D. Kercher, E. Dobisz, Z. Bandic, H. Yang and T.R. Albrecht, Nanoimprint Applications on Patterned Media, NNT '06, San Francisco, USA, November 15-17, 2006.
- S.V. Sreenivasan, P. Schumaker, I. McMackin and J. Choi, Nano-Scale Mechanics of Drop-On-Demand UV Imprinting, NNT '06, San Francisco, USA, November 15-17, 2006.
k R. Hershey, M. Miller, C. Jones, M.G. Subramanian, X. Lu, G. Doyle, D. Lentz and D. LaBrake, SPIE 2006, 6337, 20.
l UV-NIL includes single-step and step-and-repeat on an EVG770 tool.
m B. Vratzov et al., J. Vac. Sci. Technol. B 2003, 21, 2760; and http://www.amo.de.
n S.Y. Chou et al., Nanotechnology 2005, 16, 10051.
o 4-inch (10 cm) single-step, 8-inch (20 cm) step-and-repeat on the EVG770.
p A. Fuchs et al., J. Vac. Sci. Technol. B 2004, 22, 3242-3245, and to be published.
q Without fine alignment or automation yet.
r http://www.molecularimprints.com.
s M. Otto et al., Microelectronic Eng. 2004, 73-74, 152.
t U. Plachetka et al., Microelectronic Eng. 2006, 83, 944.
u Pitch for soft UV-NIL tested so far; to be published.
v Only coarse alignment available; depends on the stamp material used.
w 70-80% of the given time is needed to cure the resist.
x Based on laboratory tests to date.
7.3
Discussion
challenges. Here, the impact will be upon stamp layout, and on the design rules. With
regard to critical dimensions, those applications with strict control of the periodicity
and smoothness of features will serve as the acid test for NIL. For example, if NIL is
to be used as a mask-making method for the transfer of 2-D photonic crystal patterns
into a high-refractive-index material then, in addition to alignment, the disorder
must be controlled to better than a few nanometers after pattern transfer, that is, after
reactive ion etching. Although these critical dimensions are more relaxed in lower-refractive-index photonic crystals, such as those printed directly in polymers [71], the
verticality of the side walls is still of paramount importance. While the current resolution of NIL is already a few nanometers, such demands will become even more
stringent in the fabrication of hypersonic phononic crystals and nanoplasmonic
structures, where the relevant wavelengths are of the order of only a few nanometers.
The versatility of the printable polymers, the resolution of NIL and the ability to
realize 3-D structures open further possibilities. One such advance is the use of 3-D
templates or scaffolds to provide not only the support but also input and output
contacts for supramolecular structures. To be successful, this has two requirements:
first, the feature sizes must be commensurate, and the structured polymer surface
site-selectively functionalized. Whilst the complete proof of concept is still missing,
some degree of progress has been made towards spatially selective functionalization
by means of NIL, chemical functionalization and lift-off, and this has resulted in
electrical contacts to 150 nm-wide arrays of polypyrrole nanowires [76]. The second
requirement is the use of 3-D nanostructures, beyond the face-centered cubic (fcc)
and cubic symmetries, the properties of which can be modified by subsequent
surface treatment while preserving the symmetry. Modifications may include
coating with oxides, and also removal of the polymer template followed by infilling with another material. In this way, an artificial 3-D superlattice may be
realized, thereby providing a periodic or quasi-periodic arrangement for electronic
and/or optical excitations.
7.4
Conclusions
In this chapter we have reviewed some of the key developments of NIL as a non-optical lithography method, paying particular attention to the schematics of
the process, and discussing: (i) materials issues and their impact on the process;
(ii) the variations of NIL (among which roll-to-roll appears particularly promising for
volume production); (iii) stamp design, both in terms of robustness and adhesion;
(iv) the issue of residual layer thickness and its impact on critical dimensions;
and (v), finally, briefly reviewing the latest progress in 3-D NIL. In addition, we have
discussed the importance of understanding polymer flow as enabling knowledge
for optimizing stamp design, and thereby throughput. When discussing 3-D NIL,
the scope for further progress was outlined, as to date this area remains largely unexplored.
NIL has been said to have short-term prospects in back-end CMOS fabrication
processes, but more so in the fabrication of photonic structures and circuits, with a
proviso in the case of photonic crystals, where stringent tolerances are still to be met. On
the other hand, applications in less-demanding areas, such as gene chips for diagnostic
screening, appear to have a very bright future.
References
1 http://www.itrs.net/Links/2006/Update/
Final/ToPost/08_Lithography2006.
Update.pdf.
2 Sotomayor Torres, C.M. (Ed.) (2003)
Alternative Lithography: Unleashing the
Potentials of Nanotechnology, Kluwer
Academic Plenum Publishers, New York.
3 (a) Chou, S.Y., Krauss, P.R. and Renstrom,
P.J. (1995) Applied Physics Letters, 67, 3114.
(b) Chou, S.Y. (1998) United States Patent,
No. 5,772,905, 57.
4 Garcia, R. (2003) Alternative Lithography:
Unleashing the Potentials of Nanotechnology
(ed. C.M. Sotomayor Torres), Kluwer
Academic Plenum Publishers, New York,
p. 213.
5 Piner, R.D., Zhu, J., Xu, F., Hong, S. and
Mirkin, C.A. (1999) Science, 283, 661.
6 Xia, Y., Zhao, X.-M. and Whitesides, G.M.
(1996) Microelectronic Engineering, 32,
255.
7 Michel, B., Bernard, A., Bietsch, A.,
Delamarche, E., Geissler, M., Juncker, D.,
Kind, H., Renault, J.-P., Rothuizen, H.,
Schmid, H., Schmidt-Winkel, P., Stutz, R.
and Wolf, H. (2001) IBM Journal of
Research and Development, 45, 697.
8
Nanomanipulation with the Atomic Force Microscope
Ari Requicha
8.1
Introduction
The Scanning Probe Microscope (SPM) provides a direct window into the nanoscale
world, and is one of the primary tools that are making possible the current
development of nanoscience and nanoengineering. The first type of SPM was the
Scanning Tunneling Microscope (STM), invented at the IBM Zürich laboratory by
Binnig and Rohrer [1], who received the Nobel Prize for it only a few years later (1986).
The STM provided, for the first time, the ability to image individual atoms and small
molecules, and it is still widely used, especially to study the physics of metals and
semiconductors. Much of the STM work is conducted in ultra-high vacuum (UHV)
and often at low temperatures. The STM's main drawback is the need for conductive
samples, which rules out many of its potential applications in biology and other
important areas.
The next instrument to be developed in the SPM family was the Atomic Force
Microscope (AFM), sometimes also called Scanning Force Microscope [2]. The AFM
has become the most popular type of SPM because, unlike the STM, it can be used
with non-conductive samples, and therefore has broad applicability. Today, there are
many other types of SPMs. All of these instruments scan a surface with a sharp tip
(with an apex radius on the order of a few nm) placed very close to the surface
(sometimes at distances of 1 nm or less), and measure the interaction between tip and
surface. For example, STMs measure the tunneling current between tip and sample,
and AFMs measure interatomic forces; see Section 8.2 below for a lengthier
discussion of SPM principles.
It was noticed from the beginnings of SPM work that scanning a sample with the
tip often modified the sample. This was initially considered undesirable, but
researchers soon recognized that the ability to modify a surface could be exploited
for nanolithography and nanomanipulation. In SPM nanolithography one writes
lines and other structures directly on a surface by using the SPM tip. A well-known
technique is called local oxidation, first demonstrated by passing an STM tip in air
over a surface of hydrogen-passivated silicon [3]. Other materials can be used, as
well as a conductive AFM tip in lieu of an STM [4]. Another STM method involves
removing atoms from a silicon surface by applying voltage pulses to the tip [5].
Lines as narrow as a silicon dimer have been produced by this method. Lithography by material deposition, as opposed to material removal, has also been
demonstrated in early work. For example, in [6] atomic-level structures of germanium were deposited on a Ge surface in UHV by pulsing the voltage on an STM tip,
whereas in [7] gold clusters were deposited on a gold surface also by applying
voltage pulses to the STM tip, but in air and at room temperature. Many other SPM
nanolithography approaches have been demonstrated; see [8] for a survey of early
work.
More recently, several other SPM nanolithographic techniques have been
developed. Some examples follow. Dip Pen Nanolithography [9] involves depositing material on a surface much like one writes with a pen on paper. A pen (the AFM
tip) is inked by dipping it into a reservoir containing the material to be deposited,
and then it is moved to the desired locations on the sample. As the tip approaches
the sample, a capillary meniscus is formed, which drives the material onto the
sample. Other approaches are discussed, for example, in [10-15]. For a recent review
see [16].
Nanomanipulation is defined in this chapter as the motion of nanoscale objects
from one position to another on a sample under external control. Precise, high-resolution nanolithography shares with nanomanipulation the need to accurately
position the tip on the sample. This is a challenging issue, which we will discuss
later in this chapter.
Given the atomic resolution achieved by SPM imaging, one would expect also that
atoms might be moved individually. This is indeed the case, and it was demonstrated
in the early 1990s [17]. At the IBM Almaden laboratory, Eigler's group has been able to
precisely position xenon atoms on a nickel surface, platinum atoms on platinum,
carbon monoxide molecules on platinum [18], iron on copper [19], and so on, by using
a sliding, or dragging, process. The tip is brought sufficiently close to an adsorbed
atom for the attractive forces to prevail over the resistance to lateral motion. The tip
then moves over the surface, and the atom moves along with it. Tip withdrawal leaves
the atom in its new position.
Eigler has also succeeded in transferring xenon atoms on platinum and nickel,
platinum on platinum, and benzene molecules to and from an STM tip. This was done
by approaching the atoms or molecules with the tip until contact (or near contact) was
established. In addition, xenon atoms on nickel were transferred to the tip by applying
a voltage pulse to the tip. All of Eigler's work cited above was done in ultra-high
vacuum (UHV) at 4 K.
Avouris' group, at the IBM Yorktown laboratory, and Aono's group in Japan have
transferred silicon atoms between an STM tip and a surface in UHV at room
temperature, by applying voltage bias pulses to the tip [20, 21].
8.2
Principles of Operation of the AFM
8.2.1
The Instrument and its Modes of Operation
The AFM links the macroworld inhabited by users, computers and displays to the
nanoworld of the sample by using a microscopic cantilever that reacts to the
interatomic forces between its sharp tip and the sample. Cantilevers are typically
built from silicon or silicon nitride by using MEMS (microelectromechanical
systems) mass-fabrication techniques. They have typical dimensions on the order
of 100 × 20 × 5 µm. Tips are built at one end of the cantilever and are usually
pyramidal or conical, with apex diameters on the order of 10-50 nm. The tip apex has
dimensions comparable to those of the sample's features and can interact with them
effectively.
Figure 8.1 shows diagrammatically the interaction between atoms in the tip and in
the sample. The main forces involved are the long-range electrostatic force (if the tip
or sample are charged), the relatively short-range van der Waals force, the capillary
force (when working in ambient air, which always contains some humidity), and the
repulsive force that arises when contact is established [28]. The cantilever bends
under the action of these forces, and its deflection is usually measured by an optical
system, as shown in Figure 8.2. Laser light bounces off the back of the cantilever,
opposite to the tip, and is collected in the two halves of a photodetector. In the AFM
jargon, the electrical signals output from the two photodetector halves are called A
and B, and the differential signal is called A-B. For zero deflection and a calibrated
instrument, the differential signal from the detector is also zero, and in general A-B
is approximately proportional to the cantilever deflection. The force between tip and
sample is simply the product of the measured deflection and the cantilever's spring
constant k, which can be determined in several ways [29].
Relative motion between the tip and the sample is accomplished by means of a
scanner, which consists of piezoelectric actuators capable of imparting x, y, z
displacements to the tip or, more commonly, to the sample. (We assume that the
sample is on the x, y horizontal plane, and the tip is approximately aligned with the z
axis.) Piezo drives react quickly and are very precise, but require high voltages on the
order of 100-200 V, and have a small range of motion. They are also highly nonlinear,
as we will discuss later.
In contact-mode operation, the tip first moves in z until it contacts the sample and a
desired value of the tip-sample force, called the force setpoint, is achieved. Then it
scans the sample by moving in x, y in a raster fashion, while also moving in z under
feedback control to maintain a constant force. The feedback circuitry is driven by the
deviation (or error) between the (scaled) photodetector signal A-B and the force
setpoint. Suppose that the tip moves in a straight line along the x direction, at a constant
y. When it encounters a change of height Δz of the sample, the scanner must move
the sample by the same Δz in the z direction to maintain the contact force (i.e.,
cantilever deflection). Therefore, the amount of motion of the scanner in z at point x
gives us exactly the height of the surface z(x), usually called the topography of the
surface. Now we move the tip back to the beginning of the line, increment y by Δy and
scan again in the x direction. If we do this for a large number of y values we obtain a
series of line scans that closely approximate the surface of the sample z(x, y). In this
example, x is called the fast scan direction, and y the slow scan direction. In practice,
line scan signals are sampled (perhaps after some time-averaging) and discretized,
and the output signal becomes a series of values z(xi, yj), often called pixel values,
since the output is a digital image. These images are normally displayed by encoding
the height values z as intensities. Of course, other display options are also available,
such as perspective images that provide a better feel for the three-dimensional
structure of the sample, and so on. A typical AFM image contains 256 × 256 pixels.
For this resolution and a square scan of size 1 × 1 µm, pixels are 4 × 4 nm, which is a
rather large size. Therefore, very small scan sizes are necessary for precise
operations.
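The constant-force principle can be made concrete with a toy simulation (ours, and deliberately idealized): a pure integral controller servos the scanner z so that a hard-contact deflection stays at the setpoint, and the recorded scanner motion reproduces the topography:

import numpy as np

# Synthetic 1-D topography: a 10 nm step halfway along the line (units: nm).
surface = np.where(np.arange(256) > 128, 10.0, 0.0)

setpoint = 2.0                   # desired cantilever deflection (nm of bending)
ki = 0.5                         # integral gain of the z feedback
z = 0.0                          # scanner z extension
topo = np.empty_like(surface)

for i, h in enumerate(surface):
    for _ in range(50):                    # let the feedback settle at each pixel
        deflection = max(z + h, 0.0)       # idealized hard contact: bending = overlap
        z += ki * (setpoint - deflection)  # servo the scanner to hold the setpoint
    topo[i] = setpoint - z                 # scanner motion mirrors the topography

print(np.allclose(topo, surface))          # True: z(x) reproduces the step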
In contact mode, essentially, the tip is pushed against the sample and dragged
across it. This may damage the sample and the tip, and may dislodge nanoobjects
from the surface on which they are deposited, thereby making it impossible to image
them. An alternative mode of AFM operation that avoids the drawbacks of contact
mode and is kinder to the tip and sample is the dynamic mode, also known by other
designations such as non-contact, tapping, intermittent-contact, AC, and oscillatory.
Here the cantilever is vibrated at a frequency near its resonant frequency, which is
typically on the order of 100-300 kHz in air and 1-30 kHz in water. The vibration is
usually generated by a dedicated piezo drive installed at the base of the cantilever. The
piezo moves the cantilever endpoint opposite to the tip up and down at a frequency
near resonance, and the vibration is mechanically amplified by the cantilever,
resulting in a much larger amplitude of oscillation at the tip. Alternatively, the
cantilever can be coated with a magnetic material and oscillated by means of an
external electromagnet. This is usually called MAC mode (for magnetic AC), and
does not require operation near the resonance frequency since it does not rely on
mechanical amplification.
The amplitude of the vibration at the output of the photodetector is computed,
typically by a lock-in amplifier or an analog or digital demodulation technique, and
compared with an amplitude setpoint Aset. Feedback circuitry drives the z piezo and
adjusts the vertical displacement to keep the amplitude constant; see Figure 8.3.
The principles of dynamic-mode operation may be explained by approximating the
vibrating tip with a harmonic oscillator in a nonlinear force field, as follows. Suppose
initially that the tip is at some distance $z_0$ from the sample, with a force $F(z_0)$ between tip
and sample. Then, the following equation of motion must be satisfied:

$$ m\ddot{z}_0 + c\dot{z}_0 + kz_0 = F(z_0), $$

where $m$ is the effective mass, $c$ is the damping coefficient, $k$ is the spring constant,
and the dots denote derivatives with respect to time. Now, consider deviations from
this point that are small compared to the tip-sample distance, and denote by $z$ the
deviation from the value $z_0$. The equation of motion may now be written as

$$ m(\ddot{z}_0 + \ddot{z}) + c(\dot{z}_0 + \dot{z}) + k(z_0 + z) = F(z_0 + z). $$

Subtracting the two previous equations, expanding $F$ in a Taylor series and keeping
only the first-order term yields

$$ m\ddot{z} + c\dot{z} + kz = zF'(z_0), $$

where $F'$ denotes the derivative of $F$ with respect to $z$. This equation may be rewritten as

$$ m\ddot{z} + c\dot{z} + k'z = 0, \qquad k' = k - F'(z_0). $$

This means that small deviations of the cantilever satisfy the equations of motion
of a simple harmonic oscillator with a spring constant $k'$, which depends on the
actual spring constant of the cantilever and on the gradient of the tip-sample force at
the equilibrium point. Therefore, the resonance frequency changes from its initial
value to

$$ \omega_0' = \sqrt{k'/m}. $$
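As a numerical illustration of this shift (ours, not from the chapter; the cantilever parameters are merely typical values), the steady-state amplitude of the damped, driven oscillator can be compared far from and near the surface:

import numpy as np

k = 40.0          # cantilever spring constant, N/m
f0 = 300e3        # free resonance frequency, Hz
Q = 400.0         # quality factor in air
m = k / (2 * np.pi * f0) ** 2    # effective mass inferred from k and f0

def amplitude(f_drive, k_eff, drive_force=1e-9):
    """Steady-state amplitude of the damped, driven harmonic oscillator."""
    w = 2 * np.pi * f_drive
    w0 = np.sqrt(k_eff / m)
    return (drive_force / m) / np.sqrt((w0**2 - w**2)**2 + (w0 * w / Q)**2)

A_free = amplitude(f0, k)             # far from the surface: F' = 0
Fprime = 2.0                          # assumed tip-sample force gradient, N/m
A_near = amplitude(f0, k - Fprime)    # k' = k - F'(z0) shifts the resonance
print(A_free, A_near, A_near / A_free)

With these numbers the free amplitude is about 10 nm, and a 2 N/m force gradient drops the amplitude at the same drive frequency by more than an order of magnitude, which is the mechanism the feedback loop exploits.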
When the cantilever is at a large distance from the sample, the interaction forces
between the two are negligible, the cantilever has a resonance frequency fres
corresponding to its spring constant k, and has a resonance curve (amplitude vs.
frequency) as shown by the red curve in Figure 8.4.
Suppose now that we drive the cantilever at a frequency fdrive, which generally is
near fres. If the tip is sufficiently far from the sample for the interaction force to
be negligible, F ≈ 0, the cantilever oscillates at the frequency fdrive with an amplitude Afree that we can read directly from the resonance curve. This is called the free
amplitude. Now we specify an amplitude setpoint smaller than Afree. For the
cantilever to oscillate at this amplitude, the resonance curve must shift as shown
by the blue curve in the figure. Therefore, the feedback system must move the
cantilever closer to the sample until the force gradient is such that the spring
constant and corresponding resonant frequency shift appropriately. We see that the
DC, or average, position of the cantilever is controlled in a rather indirect manner,
via the fdrive, Afree, and Aset parameters. (The free amplitude essentially scales the
resonance curve.) Aset is usually specified as a percentage of Afree; typical values of
Aset are on the order of 80%. Lower setpoints imply large damping, which means
that a considerable amount of the cantilever's oscillation energy is being transferred to the sample. In essence, we are tapping hard on the sample, and this is
usually undesirable.
The theory just presented is simple and provides an intuitive understanding of the
dynamic mode operation. Unfortunately, it is predicated on a linearization about an
operating point, and is valid only for small oscillation amplitudes. Usually, however,
the AFM is operated with a setpoint that implies a relatively large amplitude and
causes the tip to hit (tap on) the surface of the sample at the lower part of each
oscillation cycle. The actual behavior of the cantilever when tapping is involved is
rather complicated; see, for example, [30, 31]. But under normal imaging conditions the
oscillation amplitude varies approximately linearly with the DC tip position, as shown
in Figure 8.5. Note that such A-d (amplitude-distance) curves vary from cantilever to
cantilever and depend on several parameters.
The feedback circuitry in dynamic mode maintains a constant amplitude, and
therefore a constant distance to the sample. (Here we are assuming that the sample is
of a homogeneous material, or at least that the tip-sample forces do not vary over the
sample's extent.) Scanning the tip over the sample in dynamic mode produces a
topographic image of the sample. There are other modes of AFM operation, but the
two discussed above are the most common and important.
More information on AFM theory and practice is available, for example, in
[32-34].
8.2.2
Spatial Uncertainties
Let us now turn our attention to the sources of positional errors in AFM
operation. These give rise to spatial uncertainty and are important for accurate
imaging and especially for nanomanipulation. User intervention is normally used
to compensate for spatial uncertainties in nanomanipulation, but extensive user
interaction is slow and labor-intensive, and therefore severely limits the
complexity of the structures that can be built. Automatic operation is highly desirable
but cannot be accomplished without compensating for spatial uncertainties, as we
will show below. Compensation techniques are described later in this chapter, in
Section 8.4.2.
We noted earlier that the output of a line scan along the x direction is the
topography z(x). In the actual implementation this is not quite true. If no error
compensation is used, the output is Vz(Vx), where Vz and Vx are the voltages applied to
the z and x piezos. In an ideal situation these voltages would be linearly related to the
piezo extensions, and the signals Vz(Vx) and z(x) would coincide modulo scale
factors. But in practice they don't.
There are many nonlinearities involved. Some of these are normally taken into
account by the AFM vendors' hardware and software; for example, non-linearities in
the voltage-extension relationship for the piezos, coupling between the different
axes of motion, and so on. The most pernicious are drift, creep and hysteresis. As
far as we know, at the time of this writing (2006), drift is not adequately compensated for in any commercial instrument, and creep and hysteresis are negligible
only in top-of-the-line AFMs that have feedback control for the x, y directions, and
whose controller noise r.m.s. is under 1 nm. The vast majority of AFMs in use today
either have no x, y feedback or have noise levels on their feedback circuitry that are
too large for precise lithography and manipulation. Open-loop operation with a
small scan size (e.g., 1 × 1 µm) is preferable for nanomanipulation operations with
such instruments.
Drift is caused by changes of temperature in an instrument made from several
materials with different coefficients of expansion. At the very low temperatures often
used for atomic manipulation the effects are negligible, but at room temperature they
can be quite large. Figure 8.6 shows four AFM images of gold nanoparticles with
15 nm diameters, taken at successive times, 8 min apart. The particles appear to be
moving, but in reality they are fixed on the substrate surface. The piezos are driven by
the same voltage signals in all the panels of the figure, and have the same extensions,
but the position of the tip relative to the sample is the sum of the piezo extension and
the drift, and therefore changes with time. Experimental observations in our lab
indicate that drift is a translation with speeds on the order of 0.01-0.1 nm/s. The drift
velocity remains approximately constant for several minutes, and then appears to
change randomly to another value.
Hysteresis is also present in piezo actuators and has non-negligible effects; see
Figure 8.8. Hysteresis is a nonlinear process with memory. The extension of a
piezo depends not only on the currently applied voltage, but also on past extremal
values.
Finally, the images of the features that appear in a topography scan differ from the
actual physical features in the sample because of tip effects. The tip functions as a low-pass filter, and broadens the images. To a first approximation, the imaged lateral
dimensions of a feature equal the true dimensions plus the tip diameter. Algorithms
are known for combating this effect [35]. Note that the vertical dimensions of a
feature's image are not affected by the tip's dimensions.
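A common correction of this kind, in the spirit of (though not necessarily identical to) the algorithms of [35], treats the image as a grey-scale dilation of the surface by the tip shape and applies the corresponding erosion as a first-order reconstruction. A minimal 1-D sketch of ours, with an assumed parabolic tip:

import numpy as np
from scipy.ndimage import grey_dilation, grey_erosion

# 1-D illustration: a narrow, 15 nm-high feature scanned by a blunt tip (nm).
x = np.arange(-100, 101, 1.0)
surface = np.where(np.abs(x) < 5, 15.0, 0.0)       # true topography

tip_x = np.arange(-20, 21, 1.0)
tip = -tip_x**2 / (2 * 10.0)                       # parabolic apex, radius ~10 nm

image = grey_dilation(surface, structure=tip)      # what the AFM records: broadened
estimate = grey_erosion(image, structure=tip)      # erosion narrows it back

# Heights are preserved; only the lateral width is inflated and then reduced.
print(image.max(), (image > 7.5).sum(), (estimate > 7.5).sum())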
Compensation of spatial uncertainties due to drift, creep and hysteresis in AFMs
will be discussed later, in Section 8.4.2, in the context of automated nanomanipulation.
Figure 8.8 Hysteresis effects. Left-to-right vs. right-to-left single-line scans of 15 nm Au particles on mica. Scan size 100 nm.
8.3
Nanomanipulation: Principles and Approaches
8.3.1
LMR Nanomanipulation by Pushing
Here we discuss the approach to nanomanipulation that has been under development at USC's Laboratory for Molecular Robotics (LMR) over the last decade. It
was first presented at the Fourth International Conference on Nanometer-Scale
Science and Technology, Beijing, P.R. China, September 8-12, 1996, and later
reported in a string of papers [36-41]. Other approaches are considered in the next
subsection.
We begin by preparing a sample with nanoparticles or other structures to be
manipulated. A typical sample consists of a mica surface coated with poly-L-lysine, on
which we deposit Au nanoparticles. The coating is needed because freshly cleaved
mica is negatively charged, and so are the nanoparticles; the poly-L-lysine attaches to
the mica and offers a positively-charged surface to the nanoparticles. We have also
used other surfaces such as (oxidized) silicon, glass and ITO (indium tin oxide), other
coatings such as silane layers [42], other particles such as latex, silver or CdSe, and
rods or wires of various kinds. We typically manipulate particles with diameters
between 5 and 30 nm, but have occasionally moved particles as small as 2 nm and as
large as 100 nm. In all cases the structures to be moved are weakly attached to the
underlying surfaces and cannot be imaged by contact mode AFM. We image them in
dynamic mode, apply a flattening procedure to remove any potential surface tilt, and
then proceed with the manipulation. The bulk of our experiments have been
conducted in ambient air at room temperature and without humidity control, but
we also have demonstrated manipulation in a liquid environment [43]. We use stiff
cantilevers, with spring constants >10 N/m.
The nanomanipulation process is very simple. We move in a straight line with
an oscillating tip towards the center of a particle and, before reaching the particle,
turn off the z feedback. We turn the feedback on when we reach the desired end
of the particle trajectory. With the feedback off, the tip does not move up to keep
constant distance to the sample when it encounters a nanoparticle. Rather, it hits
the particle and pushes it. We use the same dynamic AFM parameters for
pushing as for imaging, but sometimes we force the tip to approach the surface
by directly issuing a command to move by Δz immediately after turning off the
feedback.
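In outline, the protocol amounts to a few primitive calls. The sketch below is hypothetical: the afm object and its methods (move_to, set_z_feedback, move_z) stand in for whatever primitives a given controller exposes, and are not the actual PCS/PyPCS interface described in Section 8.4:

def push_particle(afm, start, end, dz_extra=None):
    """LMR-style push: move in dynamic mode along the line start -> end,
    aimed at the particle center, with the z feedback off over the segment."""
    afm.move_to(*start)           # tip well before the particle, feedback on
    afm.set_z_feedback(False)     # stop following the topography
    if dz_extra is not None:
        afm.move_z(-dz_extra)     # optionally force the tip nearer the surface
    afm.move_to(*end)             # the tip hits the particle and pushes it
    afm.set_z_feedback(True)      # resume normal imaging conditions

# Example (hypothetical coordinates, nm): start 30 nm before the particle
# center (x0, y0) and push it 50 nm along +x.
# push_particle(afm, start=(x0 - 30.0, y0), end=(x0 + 50.0, y0))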
Figure 8.9 shows experimental data acquired during a pushing operation for a
15 nm Au particle on mica. The two vertical dashed lines indicate the points where the
feedback is turned off and on. The top trace (A) is simply the topography signal
acquired by a single line scan in dynamic mode. The next trace (B) is the topography
signal during the push. The topography signal is flat while the feedback is off,
because the tip does not move up and down to follow the sample topography. Observe that as
soon as the feedback is turned back on we immediately get a non-null topography
signal that indicates that the tip was somewhat below the top of the particle at the end
of the manipulation. We conclude from these data that the tip is pushing the particle
forward rather than dragging it behind itself.
Trace C shows the amplitude of the vibration during the manipulation. The
amplitude is constant at the setpoint value when the feedback is on, but it decreases
as the tip approaches the particle with the feedback off, and eventually reaches zero
and stays at zero for the remainder of the push. At the same time that the amplitude
decreases, the average (DC) value of the cantilever deflection increases and then
reaches an approximately constant level; see trace D in the figure. Finally, trace E is a
single line scan after the push.
We interpret the data in Figure 8.9 as follows. When the tip approaches the particle
with the feedback off, it starts to exchange energy with the particle and the vibration
amplitude decreases, much like in standard A-d curves (see Section 8.2.1). When the
vibration goes to zero, the tip touches the particle, and remains in contact with it until
the feedback is turned on. While the tip contacts the particle the cantilever starts to
climb the particle, and the DC deflection increases. When enough force is exerted
on the particle for it to overcome the surface adhesion forces, the particle moves, and
the deflection (and hence the force) remains approximately constant. Our experiments reveal that there is a deflection (or force) threshold above which the particle
moves.
For successful pushing, the trajectory has to pass close to the center of the particle.
Here the spatial uncertainties discussed in Section 8.2.2 are a major source of
problems. We address these problems in the interactive version of our Probe Control
Software (PCS) as follows. The user draws a line over the image and instructs the
AFM to scan along that line and output the corresponding topography signal. The
user moves the line over (or perhaps near) the particle to be pushed until he or
she detects the maximum height of the particle; see Figure 8.10. This indicates that
the line is going through the center of the particle. Usually this is several nm away
from the apparent center because of drift and other spatial uncertainties. After the
center is found, the user sets two points along the trajectory for turning the feedback
off and on, and instructs the AFM to proceed with the push. The result can be
assessed immediately by looking at the single line scan after the push; see trace E in
Figure 8.9.
The amplitude and deflection signals (traces C and D in Figure 8.9) are useful
for assessing whether a pushing operation is proceeding normally. For example, we have
observed experimentally that when we lose a particle (i.e., when it does not move
as far as specified) the deflection drops to zero prematurely. In principle, one could
monitor the amplitude and deflection signals while pushing and, for example, stop
and locate the particle when the signals are not as expected. In practice, however, this
may require substantial modications to the controller, if decisions are to be made
automatically based on this information while the manipulation operation is taking
place.
How reliably can particles be moved by the LMR approach? Figure 8.11 attempts to
answer this question. Observe that operations in which the commanded motion is
below 50 nm are very successful, whereas for distances of about 80 nm or more the actual and
desired displacements of the particles begin to differ considerably. The reasons why
pushing over large distances is unsuccessful are not fully understood.
8.3.2
Other Approaches
• As the tip approaches the particle, instead of turning the feedback off and on,
change the amplitude setpoint so that the tip gets closer to the surface.
• Move towards the particle while tapping hard on the substrate, and then turn the
feedback off and on. This appears to induce a lateral push, in which the cantilever
deflection does not increase as in the standard pushing protocol of the previous
subsection.
Sweden [47]. The Purdue group pushed 10-20 nm gold clusters on graphite or WSe2
substrates with an AFM, in a nitrogen environment at room temperature [46]. They
image with non-contact AFM, but then stop the cantilever oscillation, approach the
substrate until contact, disable the feedback, and push. Samuelson's group at the
University of Lund succeeded in pushing gallium arsenide (GaAs) nanoparticles of
sizes on the order of 30 nm on a GaAs substrate at room temperature in air [47]. They
use an AFM in non-contact mode, approach the particles, disable the z feedback and
push. This is the protocol investigated in detail later on by the LMR, and discussed in
the previous subsection. Essentially the same protocol is used in [48] to push Ag
nanoparticles. They observe that the vibration amplitude decreases as the particle is
approached, and then essentially vanishes during pushing, which agrees with the
findings in our own laboratory.
Lieber's group at Harvard has moved nanocrystals of molybdenum oxide
(MoO3) on a molybdenum disulfide (MoS2) surface in a nitrogen environment
by using a series of contact AFM scans with large force setpoints [49]. The
nanoManipulator group at the University of North Carolina at Chapel Hill
moves particles by increasing the contact force, under user control through a
haptic device [50]. Sitti's group reports manipulation of Au-coated latex particles
with nominal diameters of 242 and 484 nm on a Si substrate, with accuracies on the
order of 20-30 nm [51]. First, they move the tip until it contacts the surface, then
move it horizontally to a point near the particle, and up by a predetermined
amount. Next, they move against the particle using feedback to maintain either
constant height or constant force on the particle. In constant-height pushing, the
force signal exhibits several characteristic signatures that may be interpreted as
signifying that the particle is sliding, rolling or rotating. Constant-force pushing
is equivalent to contact-mode manipulation. Xi's group at Michigan State
University has demonstrated pushing of latex nanoparticles with 110 nm diameters on a polycarbonate surface by two methods [52]. The first consists of
scanning in contact mode with a high force. The second disables the feedback
and moves the tip open-loop along a computed trajectory based on a model of
the surface acquired by a previous scan. This requires an accurate model of the
surface.
Theil Hansen and coworkers moved Fe particles on a GaAs substrate by approaching a particle in dynamic mode, switching to contact mode and pushing with the
feedback on [53]. This has the advantage that the pushing force can be controlled by
the AFM. However, switching modes is a non-trivial operation that can cause
damaging transients. (In the AutoProbe CP-R AFM that we use routinely for
nanomanipulation at the LMR, such a switch is not allowed by the vendor's software.)
Furthermore, switching from tapping to contact mode implies that the tip in contact
mode does not touch the same point of the surface it was tapping on, because the
cantilever normally is not horizontal; see Figure 8.12.
The LMR and related protocols are essentially open-loop, because for commercial
AFMs it is virtually impossible to incorporate the force sensed by the cantilever into a
feedback loop during actuation. For example, in the
AutoProbe CP-R we are completely blind during a pushing operation. We can
record the force (deflection) signal while pushing, but cannot do anything with it
until the motion stops. To do otherwise would require a major change to the
controller. On the other hand, it is not difficult to make the force signal available
for visualization in interactive pushing, and several research groups have reported
such capabilities. By developing their own controllers, Sitti and co-workers have
been able to use the force signal during pushing, primarily to determine when an
operation should be stopped because it is not going to succeed. When the tip-particle force drops to zero, the particle is no longer being pushed. One should then stop
the motion, locate the particle and schedule a new operation to deliver it to its
target position.
The AFM is both an imaging device and a manipulator, but not both simultaneously. For example, it would be useful to see a particle while it is being pushed, but
this cannot be done solely with an AFM. An interesting approach that provides real-time visualization consists of operating the AFM within the chamber of a Scanning
Electron Microscope (SEM), or sometimes a Transmission Electron Microscope
(TEM). The motion of the tip can then be monitored by the electron microscope, and
known techniques for visual feedback developed at larger spatial scales can be
deployed [54].
The AFM-SEM approach was pioneered by Sato's group for microscopic objects [55, 56], and has been used successfully by several groups [50, 57-59]. In some of
this work an AFM cantilever is used as an end effector for a specially-built
micromanipulator. Working inside an SEM has its drawbacks: electron microscopes
are expensive instruments, they are less precise than AFMs, they require more elaborate
sample preparation, and they normally operate in a vacuum environment, which precludes their use for certain applications, for example, in biology.
All of the work on AFM nanomanipulation discussed above involves essentially
pushing objects on a flat surface. Pushing nanoparticles over steps [37] and onto
other particles [60] has been demonstrated by our group, but this is a very
rudimentary 3-D capability. More sophisticated 3-D tasks would be feasible if there
were an equivalent of the macroscopic pick-and-place operation for nanoparticles.
(Pick-and-place is possible with atoms and small molecules, as noted in the
manipulated by the AFM. Figure 8.14 shows on the left a SnO2 wire with a
diameter of 10 nm and a length of 9 µm, and on the right the result of cutting
the wire in two spots and then moving the three smaller wires. The manipulation
was done at the LMR by using our standard protocols.
For many applications, particles and other nanoscale objects must be linked
together to form a single, larger object. Linking may be accomplished by various
methods: chemically, or by material deposition, sintering, or welding. We
showed that Au nanoparticles can be connected chemically by using linkers with thiol
functional ends [79]. This can be done in two ways: (1) the particles are first
functionalized with the di-thiols, then deposited and manipulated against one
another to form the target structure, or (2) the manipulation is done first and then
the thiol treatment is applied. In either case the result appears to be the same. The
resulting assemblies can then be manipulated by using the same protocols, and
joined to make larger assemblies. Therefore, we have demonstrated that hierarchical
assembly is possible at the nanoscale [80]. Figure 8.15 shows on the left clusters of 2
and 3 particles, which were constructed by manipulating individual particles (initial
configuration not shown). On the right is a ring-like structure obtained by moving the
clusters on the left.
A different approach is reported in [81]: nanoparticles are manipulated to form a
target structure, which is then grown by deposition of additional material. Growth
is accomplished essentially by electroless deposition, by immersing the sample in
a hydroxylamine seeding solution. Figure 8.16 illustrates the results. On the top left
is a wire made by manipulating Au particles, and on the top right is a single line
scan through the centerline of the structure, showing that the height of the
particles is 8 nm. On the bottom left is the structure after deposition by
hydroxylamine seeding, and on the bottom right is a single line scan, which now
shows a particle size of 20 nm. The initial structure looks like a continuous wire
in the figure but is not mechanically stable; touching it with the tip causes the
structure to fall apart. In contrast, the final structure is a solid wire. One
disadvantage of this method is that after the seeding the structures can no longer
be moved on the substrate surface.
8.4
Manipulation Systems
8.4.1
Interactive Systems
Much more sophisticated interfaces have been developed by others. Hollis' group
at the IBM Yorktown laboratory (now at Carnegie Mellon University) built an
interface to an STM in which the user could drive the tip over the sample by moving
a mechanical wrist [84]. The z servo signal is fed back to the wrist so that the user feels
the topography of the surface as vertical wrist motions. This force, however, is not
(a scaled version of) the actual force between tip and sample.
The nanoManipulator group in North Carolina has developed virtual reality user
interfaces, first for STMs and then for AFMs [45, 50, 85, 86]. In the AFM interface
a user can either be in imaging or manipulation mode. During imaging, the
topographical data collected by the AFM is presented to the user in virtual reality, as
a 3-D display. In addition, the user can feel the surface by using a haptic device, as if
moving a stylus over a hard surface. Note, however, that the forces felt through the
haptic device are not the cantilever-sensed forces, but rather are forces computed by
standard virtual reality techniques so as to simulate the feel of a surface that
approximates the measured topography of the sample. In the imaging mode, the
user haptic input does not control the actual motion of the instrument, but rather
the position of a virtual hand over the image of the surface. In contrast, the hand can
be used to move the tip over the sample in manipulation mode. As the hand moves
in virtual space and the tip moves correspondingly over the sample, the topography
data generated by the AFM is used on the fly to compute a planar approximation to
the surface. The user feels this approximated surface through the haptic stylus.
Although the user does not feel the actual forces sensed by the cantilever, he or she
can control the force applied to the sample during manipulation by using a set of
knobs.
Sitti and co-workers also implement a virtual reality graphics interface, and add a
one degree-of-freedom haptic device [51, 87]. Through a bilateral feedback system
based on theoretical models of the forces between tip, sample and particle, the user
can drive the tip over the sample by using a mouse, while at the same time feeling
with the haptic device the forces experienced by the cantilever.
Xi's group has developed an augmented reality system in which cantilever forces
are reflected in a haptic device [88, 89]. They develop a theoretical model for the
interaction forces between tip, object and surface, and use it to compute the position
of the tip based on the real-time force being measured. The visual display in a small
window around the point of manipulation is updated in real-time to reect the
computed particle position. Thus, a user can follow the (computed) motion of the
particle in real time during the manipulation.
8.4.2
Automated Systems
The automatic assembly of nanoobject patterns with the AFM consists of planning
and executing the motions required for moving a set of objects from a given initial
configuration into a goal configuration. The initial state usually corresponds to
nanoobjects randomly dispersed on a surface.
As far as we know, there are only two systems today that are capable of building
nanoobject patterns automatically by AFM nanomanipulation. One is being developed by Xi's group at Michigan State University, and the other at the LMR [73, 90].
The two systems use different planning algorithms and pushing protocols. Xi's
system addresses at length the issues that arise in nanorod manipulation, and
therefore is more general than ours, which focuses on nanoparticles. On the other
hand, the Michigan State system has a more rudimentary drift compensator than
ours, and does not compensate for creep or hysteresis, which are important for the
manipulation of small objects with dimensions of 10 nm or less. The manipulation
tasks demonstrated in [73] involve objects which are roughly one order of magnitude
larger than those we normally manipulate, and the positional errors in the final
structures shown in the figures of [73] also appear to be correspondingly larger than ours.
In the remainder of this section we describe the LMR automatic manipulation
systems, from the top down, starting with high-level planning and ending with the
system software architecture.
The input to the planner consists of a specification for a goal assembly of
nanoparticles, and an initial arrangement that is obtained by imaging a physical
sample with a compensated AFM (compensation is discussed below). In an initial
step the planner assigns particles to target locations by using the Hungarian
algorithm for bipartite matching, which is optimal [91]; a sketch of this step appears
below. It uses direct, straight-line paths if they are collision-free, or indirect paths
around obstacles computed by the optimal visibility algorithm [92]. Next, the planner
computes a sequence of positioning paths to connect the location of the tip at the end
of one push (determined in the previous step) with the beginning of another. This is
done by a greedy algorithm, which sequentially selects the shortest paths; it is
sub-optimal but performs well in practice. In the general case, collisions between
particles may arise. The planner handles collisions by exploiting the fact that all
particles are assumed identical. It simulates the sequence of operations previously
computed, at each step updating the state of the particle arrangement. If a collision is
detected, it swaps operations, and does this recursively, because solving one collision
problem may generate new ones.
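A minimal sketch of the assignment step (ours, not the LMR code), using SciPy's implementation of the Hungarian algorithm and assuming straight-line, collision-free pushes; the coordinates are illustrative:

import numpy as np
from scipy.optimize import linear_sum_assignment   # Hungarian algorithm

# Measured particle positions and goal positions, in nm (illustrative).
particles = np.array([[120.0, 80.0], [40.0, 200.0], [260.0, 150.0]])
targets   = np.array([[100.0, 100.0], [200.0, 100.0], [150.0, 220.0]])

# Cost = Euclidean push distance for every particle-target pair.
cost = np.linalg.norm(particles[:, None, :] - targets[None, :, :], axis=2)

rows, cols = linear_sum_assignment(cost)           # optimal bipartite matching
for p, t in zip(rows, cols):
    print(f"particle {p} -> target {t}, push {cost[p, t]:.1f} nm")

The real planner then orders these pushes, replaces any colliding straight-line path by a detour around obstacles, and swaps operations recursively when the simulated execution detects a collision.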
The planner just outlined is the second one we have written. Our first planner, developed
several years ago [93], was more complicated and slower, and did not perform better
than the current one. However, we abandoned work on planning at that time not
because of planner problems, but rather because we could not reliably implement the
primitive operation assumed by the planner, which is simply to move from an initial
point P to another goal point Q. The spatial uncertainties associated with the AFM
(see Section 8.2.2) were such that after a few operations we could no longer find the
particles and push them without user interaction, and the task could not be
completed automatically. We embarked on a research program aimed at the compensation of uncertainties, and developed the compensators described briefly below.
Details are available in [71, 94, 95].
The drift compensator is based on Kalman filtering, a standard technique in
robotics and dynamic systems. We assume a simple (but incorrect) model for the
time evolution of the drift. The model can be used to predict future values of the drift,
but these values become increasingly wrong as time goes by, because the model is
not perfect. The Kalman equations provide us with a means to estimate the prediction
error. When this value exceeds a threshold, drift measurements are scheduled, and
the measured and predicted values are combined to produce better estimates, again
by using the Kalman equations. A decade-long series of experiments indicates that
the drift is accurately approximated as a translation, with a direction and speed that
vary slowly. The estimated drift values obtained from the Kalman filter are added as
offsets to all the motion commands of the AFM, thus compensating for this
translation. Drift measurement techniques require that the AFM tip move on the
sample to acquire data; therefore, manipulation operations must be suspended
when a measurement is needed. In contrast, when the filter is in prediction mode the
offsets can be calculated very quickly and used to update the coordinates without
interrupting the manipulation task.
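A minimal sketch of such a drift compensator is shown below, assuming a constant-velocity model for the drift offset; the matrices, noise levels, and scheduling threshold are illustrative placeholders rather than the published LMR parameters.

```python
# Minimal sketch of a drift compensator in the spirit described above: a
# constant-velocity Kalman filter whose state is the drift offset (dx, dy)
# and its slowly varying rate (vx, vy). All numbers are placeholders.
import numpy as np

class DriftFilter:
    def __init__(self, q=1e-4, r=1.0):
        self.x = np.zeros(4)              # [dx, dy, vx, vy] in nm, nm/s
        self.P = np.eye(4) * 100.0        # state covariance
        self.Q = np.eye(4) * q            # process noise (model imperfection)
        self.R = np.eye(2) * r            # measurement noise
        self.H = np.array([[1.0, 0, 0, 0],
                           [0.0, 1, 0, 0]])  # only the offset is measured

    def predict(self, dt):
        F = np.eye(4)
        F[0, 2] = F[1, 3] = dt            # offset grows by rate * dt
        self.x = F @ self.x
        self.P = F @ self.P @ F.T + self.Q
        return self.x[:2]                 # predicted drift offset

    def needs_measurement(self, threshold=25.0):
        # Schedule a drift measurement (suspending manipulation) when the
        # predicted offset uncertainty exceeds a threshold.
        return np.trace(self.P[:2, :2]) > threshold

    def update(self, z):
        y = z - self.H @ self.x           # innovation from a measurement z
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)
        self.x = self.x + K @ y
        self.P = (np.eye(4) - K @ self.H) @ self.P
```

In prediction mode only `predict` is called, which is fast enough to add the estimated offset to every motion command without interrupting manipulation.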
Creep and hysteresis compensation is achieved through a feedforward scheme. A
model for the two phenomena together is constructed as explained below, using a
Prandtl-Ishlinskii operator [95]. This operator has the important property of
invertibility. The inverse operator is computed and the desired trajectory is fed to the
inverse system. The result is the signal required to drive the AFM piezos so as to
follow the goal trajectory, assuming that the model is perfect. The model, of course, is
not perfect, but experimental results show that it is sufficiently accurate for obtaining
very good results: an order-of-magnitude decrease in the effects of creep and
hysteresis has been verified experimentally. Creep is modeled by a linear term plus a
superposition of exponentially decaying terms with different time constants. Hysteresis
is modeled by a superposition of operators which are essentially simple
hysteresis loops. The piezo extension is the sum of the values of creep and hysteresis
obtained from their models, and can be expressed in terms of a Prandtl-Ishlinskii
operator. The combined model depends on several parameters, which can be
estimated by analyzing the AFM topography signal for a line scan over a few particles.
The line should span the entire region in which the manipulations will take place. The
parameters are valid as long as the scan size of the AFM is not changed, and can be
computed rapidly by running the tip back and forth a few times with the compensator
on. Details may be found in [95].
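The following sketch illustrates the structure of such a model: a weighted superposition of play (backlash) operators for the hysteresis, plus a linear-plus-exponentials creep term, following the description above. All thresholds, weights, and time constants are invented placeholders; the real values are fitted from the line-scan procedure just described.

```python
# Illustrative sketch of the combined creep/hysteresis model. Hysteresis is a
# weighted superposition of elementary play (backlash) operators, creep is a
# linear term plus exponentially decaying terms. All numbers are placeholders.
import numpy as np

def play_operator(u, r, y0=0.0):
    """Elementary hysteresis loop ('play') with threshold r."""
    y = np.empty_like(u)
    prev = y0
    for k, uk in enumerate(u):
        prev = max(uk - r, min(uk + r, prev))
        y[k] = prev
    return y

def pi_hysteresis(u, thresholds, weights):
    """Prandtl-Ishlinskii operator: weighted sum of play operators."""
    return sum(w * play_operator(u, r) for r, w in zip(thresholds, weights))

def creep(t, a_lin, amps, taus):
    """Linear term plus a superposition of exponentially decaying terms."""
    return a_lin * t + sum(a * np.exp(-t / tau) for a, tau in zip(amps, taus))

t = np.linspace(0.0, 1.0, 200)
u = np.sin(2 * np.pi * t)  # commanded piezo drive
extension = (pi_hysteresis(u, [0.0, 0.1, 0.2], [0.6, 0.3, 0.1])
             + creep(t, 0.01, [0.05], [0.3]))
```

In the feedforward scheme it is the inverse of this operator that is driven with the goal trajectory; the invertibility of the Prandtl-Ishlinskii form is what makes that step tractable.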
Running both compensators together results in a software-compensated AFM
with sufficiently low spatial uncertainties to provide a reliable implementation of the
most basic robotic primitive: move from point P to point Q on the sample. However,
this is not sufficient to reliably push particles between arbitrary points, because
long pushes tend to be unreliable (see Figure 8.11). Therefore, we break down any
long pushing trajectory into smaller segments, currently 30 nm long. Having a
reliable pushing routine, the output of the high-level planner can now also be
executed with high reliability.
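The segmentation itself is elementary; a sketch (with our own helper name, positions in nm) follows.

```python
# Sketch: break a long push from P to Q into collinear segments of at most
# 30 nm, the segment length quoted in the text.
import numpy as np

def segment_push(p, q, max_len=30.0):
    p, q = np.asarray(p, float), np.asarray(q, float)
    n = max(1, int(np.ceil(np.linalg.norm(q - p) / max_len)))
    return [p + (q - p) * k / n for k in range(1, n + 1)]

waypoints = segment_push([0.0, 0.0], [100.0, 0.0])  # four pushes of 25 nm
```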
Now that we have discussed the high-level planner and its primitive commands, let
us turn our attention to the software needed to implement the system. We found in
the early days of the LMR, in 1994, that commercial AFM software was designed for
imaging and not suitable for manipulation. Therefore, we designed and implemented
a manipulation system, called Probe Control Software (PCS), running on top of the
vendor-supplied Application Programming Interface (API) [36]. It was implemented
on AutoProbe AFMs (Park Scientific Instruments, which later became
Thermomicroscopes and is now Veeco), which to our knowledge were the only instruments
sold with an available API. PCS evolved as time went by, and was the workhorse for
the interactive manipulation research in our laboratory until recently. Research often
moves in unpredictable ways, and we found that the ability to easily modify the
software was fundamental to our experimental work. Unfortunately, the API was
written for a 16-bit Windows system, which is far from convenient to program.
We concluded that we were spending too much time fighting an inhospitable
programming environment and launched a re-write of the whole system, which
has recently been completed. (We find that research software is in a permanent state
of flux.)
The new system is called PyPCS, for Python PCS. It is written in C and Python,
a scripting language that greatly facilitates program development. For
example, new modules may be added to the system without the need to recompile and
link the whole system. PyPCS has a client-server architecture. The server is written in
C for 16-bit Windows and runs on the PC that controls the instrument. The client
is written in Python, communicates with the server via standard interprocess
communication primitives, and may run on the AFM PC or on any computer
connected to it by Ethernet [96].
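By way of illustration only, the client side of such a split might look like the sketch below; the host name, port, and command syntax are invented for this example, as the actual PyPCS wire protocol is not described here.

```python
# Hedged illustration of a PyPCS-like client-server split: a Python client
# sends a textual command to an instrument-control server over TCP. The host,
# port, and 'move' command syntax are invented for this sketch.
import socket

def send_command(cmd, host="afm-pc.local", port=5000):
    with socket.create_connection((host, port), timeout=5.0) as s:
        s.sendall((cmd + "\n").encode())
        return s.makefile().readline().strip()  # server's reply

# e.g. send_command("move 120.0 85.5")  # request a tip move, positions in nm
```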
We conclude this section with a complete example, including planning and
execution in the AFM. Figure 8.18 shows in the left panel the initial random
dispersion of nanoparticles on the sample. On the right is the goal configuration
(yellow crosses) plus the result of planning, with the pushing paths in red and
the positioning paths between pushes in green. The particles marked with a black
cross are extraneous and should be removed from the area where the pattern is
being built. (This is also done automatically.)
Figure 8.18 Left: Initial state. Right: Goal state (yellow) and planner output
superposed on the initial image, showing pushing paths (red) and positioning
paths (green). 15 nm Au particles on mica. Reproduced with kind permission
from [90].
Figure 8.19 shows on the left the
result of executing the plan. On the right is the result of another, similar operation,
also performed automatically. The pattern on the right of Figure 8.19 represents a
different encoding of ASCII characters into nanoparticle positions (compare with
Figure 8.13). Here, a particle on the top row of each 2-row group signifies a 1 and
a particle on the bottom row signifies a 0. The 4 groups of 2 rows read NANO
in ASCII. This encoding uses twice as many particles as that of Figure 8.13, but has
an interesting advantage: editing the stored data amounts simply to pushing
particles up or down by a fixed amount, and could be achieved very efficiently by
an AFM with a multi-tip array with spacing equal to that of the particle grid
(see [97] and the discussion in the next section). The patterns shown in Figure 8.19
were built in a few minutes with the automated system; they would have taken at
least one day of work by a skilled user if they had been built with our interactive
system.
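A sketch of this encoding is given below: one particle per bit, top row for 1, bottom row for 0, with an assumed grid pitch (the actual particle spacing is not specified here).

```python
# Sketch of the 2-row-per-character encoding described above: each ASCII
# character occupies one group of two rows; a 1-bit puts a particle on the top
# row of its column, a 0-bit on the bottom row. The pitch is an assumption.
def encode_word(word, pitch=60.0):
    """Return (x, y) particle positions in nm for a 2-row-per-char pattern."""
    positions = []
    for row, ch in enumerate(word):
        bits = format(ord(ch), "08b")  # 8-bit ASCII code of the character
        for col, b in enumerate(bits):
            x = col * pitch
            y = row * 2 * pitch + (0.0 if b == "1" else pitch)  # top row = 1
            positions.append((x, y))
    return positions

print(len(encode_word("NANO")))  # 32 particles, one per bit
```

Editing a stored bit then corresponds to moving one particle by exactly one pitch, which is what makes the multi-tip editing scheme attractive.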
8.5
Conclusion and Outlook
Manipulation with the AFM of nanoscale objects with dimensions of 5-100 nm has
been under study for over a decade, and is now routinely performed in several
laboratories. Nevertheless, some basic questions remain unanswered. For example,
how far off center can we hit a particle for it to be pushed reliably? How high above the
surface can we strike a particle for it to move? Does the size of a particle matter? Do
the shape and size of a tip matter? Why do particles fall off the desired trajectories?
What is the force threshold needed to move a particle? Are there preferred directions
The LMR work on nanotechnology was supported in part by the NSF Grants EIA-9871775, IIS-99-87977, EIA-01-21141, DMI-02-09678 and Cooperative Agreement
CCR-01-20778; and the Okawa Foundation.
I would like to thank my LMR faculty colleagues, postdocs and students, too many
to mention here, who did much of the LMR work reported in this chapter and from
whom I have learned much over the last decade. I wish to single out the students who
built the probe control software, which made possible all of our work on nanomanipulation:
Cenk Gazen, who started it all; Nick Montoya, who extended and
maintained PCS for a couple of years; Jon Kelly, who wrote the first version of
PyPCS; Babak Mokaberi, who built the drift, creep and hysteresis compensators; Dan
Arbuckle, who is the architect of the current version of PyPCS, which integrates the
old PCS with Mokaberi's work; and Jaehong Yun, who wrote the high-level planning
software and, with Arbuckle, integrated it into PyPCS.
An exhaustive bibliography on nanomanipulation and related topics is beyond the
scope of this chapter. In many cases I have attempted to cite the pioneering works on
specific subjects, but I may have failed to acknowledge some of them. I offer my
apologies to the colleagues whom I may not have cited, or whose work I may have
misinterpreted. There is simply too much research in this area for me to be able to
keep up with all of it.
References
1 Binnig, G., Rohrer, H., Gerber, Ch. and Weibel, E. (1982) Surface studies by scanning tunneling microscopy. Physical Review Letters, 49, 57-61.
2 Binnig, G., Quate, C.F. and Gerber, Ch. (1986) Atomic force microscope. Physical Review Letters, 56, 930-933.
3 Dagata, J.A., Schneir, J., Harary, H.H., Evans, C.J., Postek, M.T. and Bennett, J. (1990) Modification of hydrogen-passivated silicon by a scanning tunneling microscope operating in air. Applied Physics Letters, 56, 2001-2003.
4 Snow, E.S. and Campbell, P.M. (1995) AFM fabrication of sub-10-nanometer metal-oxide devices with in situ control of electrical properties. Science, 270, 1639-1641.
5 Salling, C.T. and Lagally, M.G. (1994) Fabrication of atomic-scale structures on Si(001) surfaces. Science, 265, 502-506.
6 Becker, R.S., Golovchenko, J.A. and Swartzentruber, B.S. (1987) Atomic-scale surface modifications using a tunneling microscope. Nature, 325, 419-421.
33 Waser, R. (ed.) (2003) Nanoelectronics and Information Technology, Wiley-VCH, Weinheim, Germany.
34 Meyer, E., Hug, H.J. and Bennewitz, R. (2004) Scanning Probe Microscopy, Springer Verlag, Heidelberg, Germany.
35 Villarrubia, J.S. (1994) Morphological estimation of tip geometry for scanned probe microscopy. Surface Science, 321, 287-300.
36 Baur, C., Gazen, B.C., Koel, B., Ramachandran, T.R., Requicha, A.A.G. and Zini, L. (1997) Robotic nanomanipulation with a scanning probe microscope in a networked computing environment. Journal of Vacuum Science & Technology B, 15, 1577-1580.
37 Baur, C., Bugacov, A., Koel, B.E., Madhukar, A., Montoya, N., Ramachandran, T.R., Requicha, A.A.G., Resch, R. and Will, P. (1998) Nanoparticle manipulation by mechanical pushing: underlying phenomena and real-time monitoring. Nanotechnology, 9, 360-364.
38 Bugacov, A., Resch, R., Baur, C., Montoya, N., Woronowicz, K., Papson, A., Koel, B.E., Requicha, A.A.G. and Will, P. (1999) Measuring the tip-sample separation in dynamic force microscopy. Probe Microscopy, 1, 345-354.
39 Requicha, A.A.G., Baur, C., Bugacov, A., Gazen, B.C., Koel, B., Madhukar, A., Ramachandran, T.R., Resch, R. and Will, P. (1998) Nanorobotic assembly of two-dimensional structures. Proceedings IEEE International Conference on Robotics and Automation (ICRA '98), Leuven, Belgium, May 16-21, pp. 3368-3374.
40 Requicha, A.A.G., Meltzer, S., Teran Arce, P.F., Makaliwe, J.H., Siken, H., Hsieh, S., Lewis, D., Koel, B.E. and Thompson, M.E. (2001) Manipulation of nanoscale components with the AFM: principles and applications. Proceedings 1st IEEE International Conference on Nanotechnology, Maui, HI, October 28-30, pp. 81-86.
97 Requicha, A.A.G. (1999) Massively parallel nanorobotics for lithography and data storage. International Journal of Robotics Research, 18, 344-350.
98 Vettiger, P., Cross, G., Despont, M., Drechsler, U., Dürig, U., Gotsmann, B., Haberle, W., Lantz, M.A., Rothuizen, H.E., Stutz, R. and Binnig, G.K. (2002) The millipede - nanotechnology entering data storage. IEEE Transactions on Nanotechnology, 1, 39-55.
9
Harnessing Molecular Biology to the Self-Assembly of
Molecular-Scale Electronics
Uri Sivan
9.1
Introduction
Microelectronics and biology provide two distinct paradigms for complex systems. In
microelectronics, the information guiding the fabrication process is encoded into
computer programs or glass masks and, based on that information, a complex circuit
is imprinted in silicon in a series of chemical and physical processes. This top-to-bottom
approach is guided by a supervisor whose wisdom is external to the circuit
being built. Biology adopts the opposite strategy, whereby complex constructs are
assembled from molecular-scale building blocks, based on the information encoded
into the ingredients. For example, proteins are synthesized from amino acids based
on the instructions coded in the genome and other proteins. The assembled objects
process further molecules to form larger structures capable of executing elaborate
functions, and so on. This autonomous bottom-up strategy allows, at critical bottlenecks,
for an exquisite control over the molecular structure in a way which is
unmatched by man-made engineering. In other cases it allows for the errors that
are so critical for evolution.
The fact that man-made engineering evolved so differently from nature's
engineering deserves a separate discussion that is beyond the scope of this chapter.
Here, we will only comment that the perception of nature as a type of engineering is
somewhat oversimplified. While engineering aims at meeting a predefined challenge,
namely to execute a desired function, nature evolved with no aim. Yet, the
hope behind biomimetics is that concepts and tools which evolved during several
billion years of evolution may find applications in engineering.
Electronics is particularly alien to biology. With the exception of short-range
electron hopping in certain proteins, biology relies on ion transport rather than
electrons. The electronic conductivity of biomolecules is orders of magnitude too
small for implementing them as useful electronic components. For instance, despite
earlier reports to the contrary, DNA has been found to be an excellent insulator [1-3]. The foreseen
potential of biology in the context of electronics is, therefore, in the assembly process
rather than in electronic functionality per se. This observation is reflected in the
scientific research described below; it concerns the bioassembly of electronic
materials to form devices, rather than attempts to use biomolecules as electronic
components.
The term self-assembly is widely used to describe a variety of processes, including
the self-assembly of organic molecules to form uniform monolayers on
substrates. This is not the type of self-assembly under consideration in this chapter,
where the term refers to the construction of an elaborate object, namely, the
embedding of a significant amount of information into the object being built. The
subject of the intimate relationship between self-assembly, information, and complexity
will be revisited in Section 9.4.
The term complex self-assembly deserves some introductory remarks. When
looking at nature, one realizes that complex objects are typically assembled in a
modular way. Most protein machines, for instance, comprise several subunits, each
made of a separate protein. Each such protein is synthesized in the cell from amino
acids, which are in turn synthesized from atoms. In this example one can identify four
levels of hierarchy, namely atoms, amino acids, proteins, and machines made of
several protein subunits. This hierarchical or modular assembly is an essential
ingredient of complex self-assembly, the reason being that none of the modules
reflects a global minimal free energy of its elementary constituents. The protein
machine, for instance, does not correspond to a minimal free energy of the collection of
amino acids making it up, and so on.
In many instances the system is guided to a certain configuration by auxiliary
molecules (enzymes, chaperones, etc.) which at times consume energy. However, in
the cases of interest here, where self-assembly is governed by non-covalent interactions
and relatively simple configurations, each step can be driven by a downhill drift
in free energy towards a long-lived metastable state, thus rendering the module
amenable to the next assembly step. Clearly, complex electronics cannot be assembled
from its elementary building blocks in a single step, and so requires modular
assembly.
The next comment concerns the unavoidable errors characterizing self-assembly.
In order for molecular recognition to take place, the molecules must effectively
explore multiple docking configurations with other parts of the target molecule or
with other molecules. The free energy landscape corresponding to the collection of all
such configurations should, therefore, facilitate thermally assisted hops between
local minima corresponding to wrong configurations, in addition to the desired
configuration. Special measures must be devised in order to produce overwhelming
discrimination in favor of the desired configuration in finite-time experiments. In
the absence of such measures, the yield of self-assembly is intrinsically limited by
the same fluctuations that facilitate molecular recognition. Over time, biology
has evolved sophisticated error suppression and correction tools, and equivalent
9.2
DNA-Templated Electronics
9.2.1
Scaffolds and Metallization
Figure 9.1 Heuristic scheme of a DNA-templated electronic circuit. (a) Gold pads are
defined on an inert substrate. Panels (b-d) correspond to the circle of (a) at different stages
of circuit construction. (b) Oligonucleotides of different sequences are attached to the
different pads. (c) DNA network is constructed and bound to the oligonucleotides on the
gold electrodes.
except on the silver aggregates attached to the DNA, which catalyzed the process. Under the
experimental conditions, metal deposition therefore occurred only along the DNA
skeleton, leaving the passivated substrate practically clean of silver. An atomic force
microscopy (AFM) image of a segment of a 100 nm-wide, 12 µm-long silver wire
prepared in this way is shown in Figure 9.3.
Since the publication of Ref. [1], the metallization scheme has been improved in
two essential ways. First, silver has been replaced with gold [6] in the enhancing step
and, after a few hours of sintering at 300 °C, excellent wires were obtained. Second, the
hydroquinone has been replaced by glutaraldehyde [6, 14] localized on the DNA
itself. The confinement of the reducing agent to the DNA molecule suppressed
non-specific metal deposition on other objects in the system, leading to much cleaner
circuits. A DNA-templated gold wire is depicted in the inset of Figure 9.4, together
with its current-voltage (I-V) characteristics.
Other research groups have since extended the scope of the metallization of
biomolecules to proteins, amyloid fibrils, protein S-layers, microtubules, actin fibers,
and even complete viral particles. Today, the choice of metals includes Pd, Pt, Au, Cu,
and Co. An account of biomolecule metallization can be found in Refs. [15-20].
9.2.2
Sequence-Specific Molecular Lithography
contacted within the exposed DNA sequences present in the unmetallized gaps.
Further manipulations of DNA templates, including the localization of man-made
objects at specific addresses along the DNA molecule, the generation of three- and
four-armed junctions, and elaborate metallization patterning, can be found in
Refs. [6, 7, 14, 16].
9.2.3
Self-Assembly of a DNA-Templated Carbon Nanotube Field-Effect Transistor
The superb electronic properties of CNTs [23], their large aspect ratio, and their
inertness with respect to the DNA metallization process make them an ideal choice
for the active elements in DNA-templated electronics. The ability to localize molecular
objects at any desired address along a dsDNA molecule and to pattern the DNA
metallization sequence-specifically (as described above) facilitates the incorporation of
CNTs into DNA-templated functional devices, and their wiring. In the assembly of
the field-effect transistor (FET), a DNA scaffold molecule provided the address for the
precise localization of a semiconducting single-wall carbon nanotube (SWNT), and
templated the extended wires contacting it. The localization of the SWNT relied on
proteins were first polymerized on the auxiliary ssDNA molecules to form nucleoprotein
filaments (Figure 9.7, step i), which were then mixed with the scaffold dsDNA
molecules. The nucleoprotein filament bound to the dsDNA molecule according to
the sequence homology between the ssDNA and the designated address on the
dsDNA (Figure 9.7, step ii). The RecA later served to localize a SWNT at that address
and to protect the covered DNA segment against metallization. A streptavidin-functionalized
SWNT was guided to the desired location on the scaffold dsDNA
molecule using antibodies to the bound RecA and biotin-streptavidin-specific
binding (Figure 9.7, step iii). The SWNTs were solubilized in water by micellization
in sodium dodecyl sulfate (SDS) [24] and functionalized with streptavidin by
non-specific adsorption [25, 26].
Primary anti-RecA antibodies were reacted with the product of the homologous
recombination reaction, and this resulted in specific binding of the antibodies to
the RecA nucleoprotein filament. Next, biotin-conjugated secondary antibodies,
having high affinity to their primary counterparts, were localized on the primary
anti-RecA antibodies. Finally, the streptavidin-coated SWNTs were added, leading to
their localization on the RecA via biotin-streptavidin-specific binding (Figure 9.7,
step iii). The DNA/SWNT assembly was then stretched on a passivated oxidized
silicon wafer. An AFM image of a SWNT bound to a RecA-coated 500-base-long
ssDNA, localized at the homologous site in the middle of a scaffold λ-DNA molecule, is
shown in Figure 9.8a. The conducting CNT can be clearly distinguished from the
insulating DNA by the use of scanning conductance microscopy [27, 28]. The
topographic and conductance images of the same area are depicted in Figure 9.8b
and c, respectively. The evident difference between the two images identifies the
SWNT on the DNA molecule. It should be noted that the CNT is aligned with the
DNA, which is almost always the case due to the stiffness of the SWNT and the
stretching process.
Following stretching on the substrate, the scaffold DNA molecule was metallized.
The RecA, doubling as a sequence-specific resist, protected the active area of the
transistor against metallization. The metallization scheme described above was
employed, in which aldehyde residues, acting as reducing agents, were bound to
the scaffold DNA molecules by reacting the latter with glutaraldehyde. Highly
conductive metallic wires were formed by silver reduction along the exposed parts
of the aldehyde-derivatized DNA (Figure 9.7, step v) and subsequent electroless gold
plating using the silver clusters as nucleation centers (Figure 9.7, step vi). As the
SWNT was longer than the gap dictated by the RecA, the deposited metal covered the
ends of the nanotube and contacted it. A SEM image of an individual SWNT contacted
by two DNA-templated gold wires is depicted in Figure 9.8d.
The extended DNA-templated gold wires were contacted by electron-beam
lithography, and the device was characterized by direct electrical measurements
under ambient conditions. The p-type substrate was used to gate the transistor.
The electronic characteristics of the device are shown in Figure 9.9a and b. The
gating polarity indicated p-type conduction of the SWNT, as is usually the case
with semiconducting CNTs in air [29]. The saturation of the drain-source current
for negative gate voltages indicated a resistance in series with the SWNT; this
resistance was attributed to the contacts between the gold wires and the SWNT, as
the four-terminal resistance of the DNA-templated gold wires was typically
smaller than 100 Ω. Each of the different devices had a somewhat different
turn-off voltage.
9.3
Recognition of Electronic Surfaces by Antibodies
was found to bind preferentially to the (1 0 0) facet through its coat protein (Figure 1S,
supplementary material to Ref. [9]). As those phages were identical to the library
phages used in Ref. [46], and given the lack of selectivity displayed by the only free
peptide tested thus far [47], it seemed that further experiments with free peptides
would be needed in order to either confirm or disprove semiconductor facet
recognition by short peptides. In contrast to Ref. [46], the present antibodies were
also tested, and found to be selective towards crystal orientation, when detached from
the phage.
The 7- and 12-mer peptides used in most in-vitro selections of binders to inorganic
crystals are typically too short to assume a stable structure. Antibodies, on the other
hand, display a rigid three-dimensional (3-D) structure which is potentially essential
for high-affinity selective binding [30, 31]. Moreover, the recognition site in the
latter case involves six amino acid sequences grouped into three complementarity-determining
regions (CDRs). All together, these CDRs form a large, structured
binding site spanning up to 3 × 3 nm. The critical role of the antibody 3-D structure
for the recognition of organic crystal facets is well established [30, 31].
Another hint at the importance of rigidity for facet recognition is provided by the
rigid structure characterizing antifreeze peptides that target specific ice facets [48]. It
has also been shown that the stable helical structure of a 31-mer peptide catalyzing
calcite crystallization is essential for inducing directed crystal growth along a
preferred axis [49], possibly due to its differential binding to the various facets.
Hence, structural rigidity may turn out to be central to facet recognition by biomolecules,
thereby underscoring the importance of antibody libraries as a promising source of
selective binders.
Selective binding to specific crystalline facets can be directly utilized for numerous
micro- and nanotechnological applications, including the positioning of nanocrystals
at a well-defined orientation, governing crystal growth and forcing it in certain
directions [49], and positioning nanometer-scale objects at specific sites on a
substrate marked by certain crystalline facets. An application of one of these soluble
antibodies to the latter task is demonstrated in Figure 9.10. By using conventional
photolithography and H3PO4 : H2O2 : H2O etching, a long trench has been defined
on a GaAs (1 0 0) substrate in the (1 1 0) direction (Figure 9.10a). Due to the slow
etching rate of phosphoric acid in the (1 1 1A) direction, the process leads to slanted
(1 1 1A) side walls and a flat (1 0 0) trench floor (Figure 9.10a). A SEM image of a cut
across the trench, proving that the slanted walls are indeed tilted in the (1 1 1A)
direction (54.7° relative to the (1 0 0) direction), is depicted in Figure 9.10b. When the
isolated scFv antibodies are applied to the GaAs substrate, they attach themselves
selectively to the (1 1 1A) slopes.
In order to image the bound antibody molecules, they were targeted with
anti-human secondary antibodies conjugated to a fluorescent dye, Alexa Fluor. As shown
in Figure 9.10c, fluorescence is limited solely to the (1 1 1A) slopes, with practically no
background signal coming from the (1 0 0) surfaces. Control experiments depleted
of the scFv fragments exclude possible artifacts such as the natural fluorescence of
the (1 1 1A) facets, or selective binding of the fluorescent dye or secondary antibodies
to that facet.
The images in Figure 9.10 prove that the selected scFv antibody molecules
recognize and bind selectively to GaAs (1 1 1A) as opposed to GaAs (1 0 0). As such,
they can be used to localize practically any microscopic object on (1 1 1A) surfaces,
with negligible attachment to other crystalline facets. The isolation of such binders
using phage display technology, and the quantification of their selectivity, is described
in the following section.
The Ronit1 scFv antibody phage library [50] used in the present study is a
phagemid library [51] comprising 2 × 10⁹ different human semi-synthetic
single-chain Fv fragments, in which in-vivo-formed CDR loops were shuffled combinatorially
onto germline-derived human variable region framework regions of the heavy (VH)
and light (VL) domains.
To select scFv binders to GaAs (1 1 1A), approximately 10¹¹ phages (100 copies of
each library clone) were applied to the semiconductor crystal (panning step). After
washing off the unbound phages, the bound units were recovered by rinsing the sample
in an alkaline solution. The recovered viruses were then quantified by infecting
bacteria and plating dilution series on Petri dishes. The amplified sublibrary was
applied again to the target crystal facet, and so on. Typically, three to four panning
rounds were required to isolate excellent binders to the target. As is evident from
Figure 9.11, the number of bound phages retrieved from the semiconductor grew
300-fold when panning was repeated three times. For comparison, the non-specific
binding of identical phages (M13) carrying no scFv fragments remained low
throughout the selection process. It was found experimentally that blocking with
milk was essential to prevent the non-specific binding of phages to the GaAs targets.
Interestingly, as shown in the supplementary material to Ref. [9], in the absence of
blocking against non-specific binding (a step missing in Ref. [46]), the non-specific
binding of phages through their coat protein to GaAs (1 0 0) was larger than to GaAs
(1 1 1A). The data in Figure 9.11 prove the selection of increasingly better binders to
GaAs (1 1 1A), but provide no indication of selectivity with respect to GaAs (1 0 0).
Indeed, as indicated by the two left-hand columns of Figure 9.12, application of the
polyclonal population of binders selected on GaAs (1 1 1A) to GaAs (1 0 0) shows
similar binding to the latter crystalline facet. Hence, the process described above
produced good, but non-selective, binders.
Preferential binding to a given crystalline facet was achieved by a slight modification
of the process. The phages recovered from the first panning on GaAs (1 1 1A)
were amplified in E. coli and then applied to GaAs (1 0 0). This time, however, the
unbound phages were collected and applied in a second panning step to GaAs (1 1 1A).
As is evident from the two right-hand columns of Figure 9.12, the depletion step on
GaAs (1 0 0) enriches the population for specific phage clones that both bind GaAs (1 1 1A)
and lack binding to GaAs (1 0 0). On this occasion, binding of the selected phages to the
(1 1 1A) facet was almost 100-fold higher than to the (1 0 0) facet. This depletion step,
which was crucial in the present case, was missing in Ref. [46].
The polyclonal population of selected phages contains different scFv fragments,
each characterized by a different affinity and selectivity to the two crystalline facets. In
order to correlate specificity with sequence, the binding selectivity of the individual
clones was next analyzed. Monoclonal binders were isolated by infecting E. coli
bacteria with the sublibrary and plating them on solid agar. As each bacterium can be
infected by a single phage, all bacteria within a given colony carry DNA coding for the
same scFv fragment. Infection of the colony with helper phages resulted in the
release of phages displaying the same scFv on their pIII coat proteins. The isolated
monoclonal phages were then analyzed with ELISA against GaAs (1 1 1A) and (1 0 0).
The sequences of the light (VL) and heavy (VH) CDRs of ten monoclonal binders that
were identified by the ELISA assay can be found in Ref. [9] and its supplementary
material, together with a discussion of their main features.
Figures 9.11 and 9.12 correspond to the scFv fragments displayed on phage
particles. For practical applications (such as that demonstrated in Figure 9.10) it is
preferable to have soluble monoclonal scFv fragments detached from the phage coat
proteins. The results of the ELISA assays of the scFv fragment of Figure 9.10, in its
soluble form, are presented in Figure 9.13.
In Figure 9.13, bars 1-6 correspond to the six ELISA assays on GaAs (1 1 1A) and
GaAs (1 0 0) pieces, each of 4 × 4 mm. After washing the substrates, the bound
antibodies were reacted with anti-human horseradish peroxidase (HRP), and the
binding was quantified by adding tetramethylbenzidine (TMB) as a colorimetric
substrate and reading the resulting optical density (OD) at 450 nm. Bars 7-9
provide the following controls. Bars 7 quantify the non-specific binding of the
secondary anti-human HRP to the ELISA plate in the absence of the EB scFv and
semiconductor substrates. Bars 8 correspond to the non-specific binding of the
scFv to the plate, and bars 9 to the non-specific binding of the secondary antibodies to the
semiconductor substrates.
The background ELISA signal, depicted by bars 7-9, accounts for most of the GaAs
(1 0 0) signal in columns 1-6. When this background is subtracted from columns 1 to
6, a remarkable preference is found for GaAs (1 1 1A) compared with (1 0 0). Interestingly,
the binding of the secondary antibody to GaAs (1 0 0) was almost twice as large
as its binding to GaAs (1 1 1A), opposite to the selectivity of the isolated
scFv fragments. Overall, the data in Figure 9.13 prove that the isolated scFv preserves
its selectivity also when detached from the phage.
Little is known of the interaction between biomolecules and inorganic surfaces, let
alone the recognition of such surfaces by antibody molecules. The GaAs surface is
modified by surface reconstruction, oxidation, and possibly other chemical reactions.
Moreover, it displays atomic steps and possibly surface defects. It is therefore difficult
to estimate how much of the underlying crystalline order manifests itself in the
recognition process. Unfortunately, as no experimental tools capable of determining
these parameters with atomic resolution exist at present, the recognition mechanism
is unclear, except for the accumulating indications of the importance of structural
rigidity (as discussed in the introduction to this section). The discrimination between
the two crystalline facets may reflect the different underlying crystalline structures,
it may stem from the different surface chemistries of the two facets, or it may
result from global properties such as atom density and different electronegativity. The
latter factor has been found to be important for the differential binding of short
peptides to different semiconductors [47]. The abundance of positively charged
amino acids in the heavy chain of CDR1 and CDR3 and the light chain of CDR1 may
indicate an affinity to the exposed gallium atoms. The negatively charged amino acid
in CDR3 VL (missing in an anti-gold scFv isolated from the same library), combined with
the positively charged CDR3 VH, may match the polar nature of GaAs.
The recognition of man-made materials by antibodies opens new opportunities for
a functional interface between biology and nanotechnology, far beyond what has
been exercised to date.
9.4
Molecular Shift-Registers and their Use as Autonomous DNA Synthesizers [11]
9.4.1
Molecular Shift-Registers
The DNA-templated assembly of elaborate circuits requires distinct dsDNA molecules
with non-recurring sequences. For the assembly of periodic structures, such as
memories, a segment of non-recurring sequences should be replicated to form a
periodic molecule, and the synthesis of such molecules presents a remarkable
challenge to biotechnology. The two existing strategies for generating long molecules,
namely PCR [52] and ligase assembly [53], utilize synthetic oligonucleotides which
together span (with overlap) the full length of the desired molecule. Hence, when
following either of these strategies, the assembly of an N-base-long molecule with
distinct p-long segments requires O(N/p) oligonucleotides. These approaches therefore
quickly become impractical when a rich variety of distinct molecules, or of
addresses along a given molecule, is needed for the construction of an elaborate
template for molecular electronics [1, 5]. Motivated by the concept of DNA-templated
electronics, the present author and colleagues were therefore forced to invent an
exponentially more economical synthesis strategy based on the chemical realization of
molecular shift registers (SRs). The dramatic reduction in synthesis effort by SRs is
achieved by exploiting a novel concept in DNA synthesis: a sliding, overlapping reading frame.
Rather than the fixed frame that directs segment ligation or polymerization in the two
schemes listed above, or in hairpin-based DNA logic [54, 55] and programmed
mutagenesis [56], the SRs utilize a previously synthesized sequence to dictate the
synthesis of the next bases. The automaton is an example of DNA computing where
the result of the computation (the tape) is a useful molecule.
An autonomous binary p-shift register (p-SR) is a computing machine with 2^p
internal states, represented by an array of p cells (Figure 9.14a), each holding one bit,
x_i, i = 1 ... p. In each step a binary function, f(x_1, x_2, ..., x_p), is computed and its value
is inserted into cell p. Simultaneously, x_j is shifted to cell j − 1, for j = 2 ... p. On printing
x_1 to a tape, a long periodic binary sequence is generated. Electronic SRs are utilized in
many applications including secure communication, small-signal recovery, and
sequence generation [57]. Here, it is shown that molecular SRs can be realized and
utilized for the autonomous synthesis of DNA molecules, the sequence of which is
uniquely determined by a chemical embodiment of the function f(x_1, x_2, ..., x_p).
Consider a 3-SR with x_{n+1} = f(x_{n−2}, x_{n−1}, x_n) = x_{n−2} ⊕ x_n (⊕ ≡ XOR) and an initial
setting (seed) x_1 x_2 x_3 = 001. Repetitive application of f generates the sequence
001110100111010.... The sequence is periodic with a period of seven, and each of the
seven L = 3 bit-long consecutive strings within a period is different from the rest. In
general, it is well known [57] that for any p a SR can be found with a linear feedback
function [58],

f = Σ_{i=1}^{p} a_i x_i,   a_i ∈ {0, 1}  (the sum is mod 2),

that generates a sequence
of the tape molecules are elongated by one bit according to the rule x_{n+1} = x_{n−2} ⊕ x_n.
Elongation is terminated by the addition of excess stop primers that intercept the tape
molecules as soon as the latter display a desired tail (001 in the example of Figure 9.15,
step xii). The polymerase then copies the stop primer and adds its alien sequence, which
is unrecognizable by any rule strand, to the tape. As a result, elongation terminates.
The 5′ seed and 3′ stop-primer tails are later used for PCR amplification of the tape.
Elongation is guided by a sliding reading frame, in which all the shifted bits from the
previous reading frame, except the first, plus a single new bit provide the current reading
frame. The sliding frame is the crux of our concept, as it facilitates an exponentially smaller
synthesis effort compared with any of the previous, fixed-frame approaches.
It should be noted that rule strands are not consumed during synthesis; rather, they
serve only as enzymes to direct the reaction. Thus, synthesis in flow may be
envisioned, where the rule strands are attached, in synthesis order, to subsequent
segments of a tube or a column. While the reactants flow through the tube the correct
sequence is generated. This strategy is advantageous compared with straightforward
synthesis in a DNA synthesizer, as faulty strands are not recognized (and hence not
elongated) by rule strands. Clearly, errors are doomed to be short.
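As a sanity check on the logic of the scheme, a software model of the 3-SR is easy to write; the sketch below (our own naming) applies the feedback rule to the seed 001 and reproduces the period-7 tape sequence quoted above.

```python
# Software model of the molecular 3-bit shift register: the feedback rule
# x_{n+1} = x_{n-2} XOR x_n applied to the seed 001 reproduces the period-7
# tape sequence quoted in the text.
def shift_register(seed, steps):
    cells = list(seed)                  # [x1, x2, x3]
    out = list(seed)
    for _ in range(steps):
        new_bit = cells[0] ^ cells[-1]  # f = x_{n-2} XOR x_n
        cells = cells[1:] + [new_bit]   # shift left, insert new bit
        out.append(new_bit)
    return "".join(map(str, out))

print(shift_register([0, 0, 1], 12))    # prints 001110100111010
```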
Now, an actual demonstration of the concept may be described. In the first
implementation each bit is realized by a sequence of three nucleotides: 5′-TGC for
0 and 5′-GCT for 1. These sequences were chosen as they minimize errors due to one-
and two-base shifts in the annealing step. The demonstration starts with the three-bit
maximal SR discussed above. Such a SR requires seven 4-bit strands (3-bit rules plus one
function bit) (Figure 9.14b), but in order to suppress synthesis errors, longer,
redundant 6-bit rules (5-bit rules plus one function bit) are employed. Error suppression
by redundancy is discussed in Section 9.4.2. The seven 6-bit rule strands [60] used in the
synthesis comprise 3′-001110, 3′-011101, 3′-111010, 3′-110100, 3′-101001, 3′-010011, and
3′-100111. The complementary bits, 0̄ and 1̄, correspond to 3′-ACG and 3′-CGA,
respectively. The rule strands are synthesized with three nucleotides only (G, C, A) in order to
prevent their extension by the polymerase (a poor man's ddDNA). The 2/3 GC content
gives [61] ΔG ≈ 8.5-10.5 k_BT of free energy per bit (stacking included), which in a 5-bit
realization of a 3-bit SR translates, ideally, to a suppression of the error rate by a factor
proportional to exp(3ΔG/k_BT) ≈ exp(25.5) (see Section 9.4.2). The seed strand comprises
a 5′ tail followed by a 5-bit sequence [62]: 5′-GCATGCGCCCGTCAGGCG-00111.
The tail is later used to amplify the SR sequence by PCR. The seed, the rule strands, and
three nucleotides (dGTP, dCTP, dTTP) are mixed together and subjected to 45 thermal
cycles [63], after which a stop primer, 3′-01001-GACGTC, is added in 10-fold excess
relative to each rule strand. During an additional five to ten cycles the tape
molecules are further elongated until, in some cycle, their last five bits read 01001. At
that point a stop primer binds to the tape and its complementary sequence is added to the
tape by the polymerase. The elongation then terminates, as the sequence added by the
stop primer is alien to all rule strands. The absence of dATP guarantees single-strand
synthesis. The expected synthesized sequences read

5′-GCATGCGCCCGTCAGGCG-00111 (0100111)^n 01001-CTGCAG,   n = 0, 1, ...   (9.1)

where the first segment, 5′-GCATGCGCCCGTCAGGCG-00111, is the seed primer.
Finally, the elongation products are PCR-amplified with two primers, identical to
the first 19 nucleotides of the seed (5′-GCATGCGCCCGTCAGGCGT) and to the last
19 nucleotides of the stop primer (5′-CTGCAGAGCGCAGCAAGCG).
The resulting PCR products, when run against a standard ruler in a polyacrylamide
gel, are depicted in Figure 9.16a. Four bands, corresponding to Equation 9.1 with
n = 0, 1, 2, 3, are clearly resolved. Sequencing of the four bands with a primer identical
to the first 19 nucleotides on the 5′ end of the seed primer proves the identification of
the bands with the respective n values in Equation 9.1. The high fidelity of the
automaton is reflected in the perfect matching of the sequencing with Equation 9.1,
and in the absence of any unexpected bands.
As shown in Figure 9.16b, after 100 elongation cycles it was possible to resolve
10 bands, n = 0-9, corresponding to 54, 75, 96, 117, 138, 159, 180, 201, 222, and
243 base-long sequences. The automaton thus synthesizes at least 204 bases at a
remarkable fidelity. The n = 9 sequence comprises 10 periods, each of 21 bases,
with exactly one repetition of each 3-bit (or longer) address per period. Direct
sequencing of the bands confirmed the results up to n = 6. The small
material quantities in the higher bands were insufficient for reliable sequencing. As
PCR amplification favors shorter sequences, the relative band brightness cannot be
taken as a measure of the synthesis efficiency of molecules with different n-values.
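The quoted band lengths follow from simple arithmetic on Equation 9.1; the sketch below reproduces them under the segment lengths given in the text (3 nucleotides per bit, the 18-nt seed tail, 5-bit seed and stop recognition sequences, and the 6-nt CTGCAG alien tail shown in Equation 9.1).

```python
# Cross-check of the gel band lengths: with 3 nucleotides per bit, an 18-nt
# seed tail, a 5-bit seed, a 5-bit stop recognition sequence, and the 6-nt
# alien stop tail, the product of Equation 9.1 has 54 + 21n bases.
def tape_length(n, nt_per_bit=3, tail=18, seed_bits=5, stop_bits=5, alien=6):
    period = 7 * nt_per_bit  # one 7-bit period = 21 nt
    return (tail + seed_bits * nt_per_bit + n * period
            + stop_bits * nt_per_bit + alien)

print([tape_length(n) for n in range(10)])
# [54, 75, 96, 117, 138, 159, 180, 201, 222, 243] -- the ten resolved bands
```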
The synthesis of longer-period molecules, as well as of non-binary sequences, is
demonstrated in Figure 9.16c and d; details can be found in Ref. [11]. The
synthesis of the last two examples was carried out in a thermal ratchet mode at a fixed
temperature [11].
9.4.2
Error Suppression and Analogy Between Synthesis and Communication Theory
rate (the ratio between incorrect and correct annealing) can be reduced in this way
to exp(−ΔG/k_BT), where ΔG is the corresponding free energy per bit. The error
rate may be systematically suppressed by using longer rule strands to generate the
same sequence. By using de Bruijn graphs it can be shown that, optimally, each extra
bit reduces the error rate by an additional factor of exp(−ΔG/k_BT). This is the
reason for the 6-bit-long rule strands used in the realization of the 3-SR: the two
extra bits are meant to suppress synthesis errors.
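For orientation, the size of this redundancy gain can be evaluated directly; the sketch below uses the ΔG ≈ 8.5 k_BT per-bit value quoted in Section 9.4.1.

```python
# Back-of-envelope for the redundancy gain described above: with a per-bit
# free energy of ~8.5 kT, each extra rule-strand bit multiplies the annealing
# error rate by exp(-8.5), so the two extra bits buy roughly exp(-17).
import math

delta_g_over_kt = 8.5
for extra_bits in (0, 1, 2):
    print(extra_bits, math.exp(-extra_bits * delta_g_over_kt))
```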
To the best of the present author's knowledge, this is the first incorporation of a
redundancy code in chemical synthesis. The analogy drawn between chemical
synthesis and the transmission of messages over noisy lines suggests further
applications of communication theory to chemical synthesis.
9.5
Future Perspectives
In Sections 9.2 to 9.4, a novel concept was outlined, namely the harnessing of the
remarkable assembly strategies and tools of molecular biology to the self-assembly
of molecular-scale electronics. Central issues such as instilling biomolecules with
electrical conductance, molecular lithography for patterning metallization and
localizing devices on DNA templates, the direct recognition of electronically
relevant man-made objects by biomolecules, and the economical synthesis of DNA
molecules characterized by non-recurring sequences have now been resolved to a
point where the formidable challenge of complex self-assembly can be faced with
confidence. However, harnessing the power of the bioassembly presented here to the
realization of even simple circuits requires more than the mere optimization of the
tools developed to date. As argued above, complex self-assembly will require a
hierarchical, modular approach and, hence, the development of molecular switches
that test for electronic functionality and feed back on the bioassembly process. Such
switches will involve a functional interface between molecular biology and electronics,
namely the ability of biomolecules to read electronic signals presented to
them by the assembled devices and circuits, and then to affect the assembly process
based on those findings. Only then can a full merging of biology and electronics be
achieved.
Acknowledgments
The concepts and tools described in this chapter have been developed over the past
decade by a significant group of researchers at the Technion - Israel Institute of
Technology. The author is especially grateful to Erez Braun, Kinneret Keren, Yoram
Reiter, Arbel Artzi, Stav Zeitzev, Ilya Baskin, and Doron Lipson, whose contributions
were immeasurable. Different areas of the research were funded by the Israel
Science Foundation, Bikura, the fifth EU program, the German-Israeli DIP, the
Rosenbloom family, and the Russell Berrie Nanotechnology Institute.
References
1 Braun, E., Eichen, Y., Sivan, U. and Ben-Yoseph, G. (1998) DNA-templated assembly and electrode attachment of a conducting silver wire. Nature, 391, 775-778.
2 Endres, R.G., Cox, D.L. and Singh, R.R.P. (2004) The quest for high-conductance DNA. Reviews of Modern Physics, 76, 195, and references therein.
3 Legrand, O., Côté, D. and Bockelmann, U. (2006) Single molecule study of DNA conductivity in aqueous environment. Physical Review E, 73, 031925, and references therein.
4 Heath, J.R., Kuekes, P.J., Snider, G.S. and Stanley Williams, R. (1998) A defect-tolerant computer architecture: opportunities for nanotechnology. Science, 280, 1716.
5 Eichen, Y., Braun, E., Sivan, U. and Ben-Yoseph, G. (1998) Self-assembly of nanoelectronic components and circuits using biological templates. Acta Polymerica, 49, 663-670.
6 Keren, K., Krueger, M., Gilad, R., Ben-Yoseph, G., Sivan, U. and Braun, E. (2002) Sequence-specific molecular lithography on single DNA molecules. Science, 297, 72.
7 Keren, K., Berman, R.S. and Braun, E. (2004) Patterned DNA metallization by sequence-specific localization of a reducing agent. Nano Letters, 4 (2), 323-326.
8 Keren, K., Berman, R., Sivan, U. and Braun, E. (2003) DNA-templated carbon-nanotube field-effect transistor. Science, 302, 1380-1382.
9 Artzy-Schnirman, A., Zahavi, E., Yeger, H., Rosenfeld, R., Benhar, I., Reiter, Y. and Sivan, U. (2006) Antibody molecules discriminate between crystalline facets of a gallium arsenide semiconductor. Nano Letters, 6, 1870.
10 Brod, E., Nimri, S., Turner, B. and Sivan, U. (2008) Electrical control over antibody-antigen binding. Sensors and Actuators B: Chemical, 128, 560.
10
Formation of Nanostructures by Self-Assembly
Melanie Homberger, Silvia Karthäuser, Ulrich Simon, and Bert Voigtländer
10.1
Introduction
The increasing demand for high-density electronic devices has triggered, and
continues to trigger, the development of new nanofabrication methods. Two
conceptually different strategies are applied for the fabrication of nanostructures,
namely: (i) the top-down strategy; and (ii) the bottom-up strategy.
The top-down approaches utilize lithographic methods to fabricate nanostructures
starting from bulk materials (see Chapters 5, 6, and 7), whereas in the
bottom-up approaches nanostructures are built up from atoms, molecules, or
nanoscale sub-units. The top-down methods enable the generation of a large variety
of defined structures, but these are limited by the resolution of current lithography
techniques. The bottom-up methods offer the opportunity to fabricate structures even
in the single-digit nanometer range, but they suffer from the fact that it is still a great
challenge to direct the functional sub-units into the desired structures. One extreme
approach in this context is the utilization of a scanning probe microscope for building
up nanostructures atom by atom at low temperatures (see Chapter 9). However,
although this approach is the ultimate in terms of the size of the nanostructures, it is a
very slow and sophisticated method. Compared to this method, processes based on
self-organization or self-assembly have the key advantage that they enable the
formation of billions of nanostructures with control over size, shape, and composition
in a fast and parallel fashion. Due to entropic effects during the formation of
nanostructures by self-assembly, defects are always expected to be present, and
fault-tolerant architectures are required to cope with this problem. The combination of the
self-assembly of atoms, molecules and nanoscale subunits could lead to well-ordered
functional nanostructures. For example, inorganic nanostructures, generated by the
self-assembly of atoms via epitaxial growth, may serve as templates for the selective
adsorption of functional molecules, which themselves display anchor points at
which size-selected clusters could be attached, altogether leading to highly ordered
functional nanostructures with applications in molecular electronics. One critical
factor determining the benefits of this approach for electronic systems will be
surface-selective SAM formation, that is, the selective assembly of functional
molecules on special device patterns to form an ordered array. In this context,
the following chapter focuses on the formation of nanostructures by self-assembly
via epitaxial growth, the self-assembly of molecules, and the formation and
self-assembly of nanoscale subunits. Basic physical principles and selected examples
will be presented.
10.2
Self-Assembly by Epitaxial Growth
One approach to the fabrication of nanostructures is epitaxial growth. Such growth
usually occurs under kinetic conditions, so that the sizes can be tuned down to the
single-digit nanometer range by choosing appropriate growth conditions. However,
size uniformity is the greatest challenge here. If the growth takes place under
(near-) equilibrium conditions, then the size distribution of the nanostructures may
be narrow, but it is dictated by the material system and cannot be varied easily. The
formation of islands, wires and rods will be presented as examples of nanostructures
grown by epitaxy. Subsequently, the growth of nanostructures on template substrates
structured by step arrays or underlying dislocation networks will be considered. The
combination of self-organized growth with lithography (hybrid methods) allows the
self-assembled nanostructures to be aligned relative to predefined patterns. It is
possible that such inorganic nanostructured templates may be used in the future for
the selective formation of molecular layers.
10.2.1
Physical Principles of Self-Organized Epitaxial Growth
10.2.1.1 Epitaxial Growth Techniques
The main methods used for semiconductor epitaxial growth are chemical vapor
deposition (CVD) [1] and molecular beam epitaxy (MBE) [2, 3]. In CVD, growth gases
containing compounds of the elements to be deposited are introduced into the
growth chamber. When the gas molecules hit the substrate surface, they decompose
(partially) and the gaseous products desorb from the surface. The different chemical
reactions taking place at the surface, or even in the gas phase, make the fundamental
processes of epitaxial growth in CVD quite complex. Molecular beam
epitaxy is conceptually simpler; here, the elements to be deposited are heated in
evaporators until they evaporate, whereupon the beam of atoms hits the surface
and the atoms diffuse over the surface and finally bind at surface lattice sites
(Figure 10.1).
In spite of the fact that MBE growth is, in principle, much simpler than CVD
growth, there are still many different fundamental processes occurring during
epitaxial growth by MBE [4]. Some of these are illustrated schematically in Figure 10.1.
Atoms from the molecular beam arrive at the surface of the crystalline substrate (a)
and may diffuse over the surface when the activation energy for diffusion is overcome
(b). When two atoms (or sometimes more than two atoms) meet, they form a
nucleus for a stable island (c). Such a nucleus may grow to a stable two-dimensional
(2-D) island by the attachment of further diffusing adatoms (d). The nucleus for which
the probabilities to grow or decay are equal is called the critical nucleus [5]. Nuclei
which are larger than the critical nucleus are termed stable 2-D islands, while nuclei
smaller than the critical nucleus are called subcritical nuclei or embryos. Another
process is diffusion and attachment at pre-existing steps, if the diffusion length is
sufficient (e).
10.2.1.2 Kinetically Limited Growth in Homoepitaxy
In kinetically limited growth the system is governed by energetic barriers, such as
barriers for the diffusion of adatoms and barriers for the incorporation of atoms into the
crystal, and additionally by external conditions such as the growth rate. The 2-D islands
(which are one atomic layer high) represent the simplest example of the self-assembled
growth of nanostructures. In the following it will be shown how
the density and size of these islands can be controlled by the kinetic parameters
temperature and growth rate. First, the deposition temperature influences the island
density strongly, as shown by the comparison of Figure 10.2a and b. The island
density as a function of temperature follows an Arrhenius law, n ∝ exp(E_act/kT), where
E_act is an effective activation energy consisting of a diffusion energy and a binding
energy component, with values around 1 eV in the case of semiconductors [5]. The
temperature is one important parameter of growth kinetics; the deposition rate is
another. It has been found that the island density n scales with the deposition rate F
in the form of a power law, n ∝ F^α, with a scaling exponent α. Combining the
temperature and rate dependences results in the following scaling law: n ∝ F^α
exp(E_act/kT) [5], which shows that the island density can be controlled over a wide
range by adjusting the kinetic growth parameters of temperature and growth rate.
The average island distance is simply the inverse square root of the island
density, L = 1/√n.
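To make the scaling law concrete, the following sketch evaluates n ∝ F^α exp(E_act/kT) for assumed parameters (the 1 eV activation energy is the value quoted above; the exponent α and the prefactor are placeholders, as they depend on the material system).

```python
# Numerical illustration of the kinetic scaling law n ~ F^alpha * exp(E_act/kT)
# and of the mean island distance L = 1/sqrt(n). E_act = 1 eV is the value
# quoted in the text; alpha and the prefactor are invented placeholders.
import numpy as np

K_B = 8.617e-5  # Boltzmann constant in eV/K

def island_density(F, T, alpha=0.33, e_act=1.0, prefactor=1e-12):
    """Island density per nm^2 for deposition rate F (ML/s), temperature T (K)."""
    return prefactor * F**alpha * np.exp(e_act / (K_B * T))

for T in (500.0, 600.0):
    n = island_density(F=0.1, T=T)
    print(f"T = {T:.0f} K: n = {n:.2e} nm^-2, L = {1 / np.sqrt(n):.1f} nm")
```

The output shows the expected trend: a higher growth temperature gives a lower island density and hence a larger mean island distance.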
Although the nucleation of the islands is a random process, the distribution of the
island sizes is centered around a mean value (Figure 10.3). This arises due to a
saturation of the island nucleation, as will be explained in the following. During the
early stage of growth (the nucleation regime), the islands nucleate randomly on the
surface and the distance between them decreases. If the distance between the islands
is equal to the mean distance that an adatom travels before a nucleation event
happens, then the incorporation of adatoms into existing islands becomes a more
probable event than the nucleation of new islands; hence, a capture zone forms
around each island. Adatoms deposited in this capture zone attach to the corresponding
island; without this effect the distribution of island sizes would be even broader.
The nucleation of further islands is suppressed beyond a certain coverage (the growth
regime), and the average island size can then be controlled by the deposited amount. The
island size distributions for two different temperatures are shown in Figure 10.3,
where it can be seen that the peak in the island size distribution shifts towards larger
sizes at higher temperatures. For very small islands, the surface reconstruction can
also modify the island size distribution [4]. In the kinetic growth regime the density
of 2-D islands can thus be controlled by the kinetic parameters temperature and
deposition rate, while the size distribution is quite broad due to the stochastic nature
of island nucleation.
comes from the edge energy (β is the edge energy per unit length). The energy of an island
is E = E_edge = 4Lβ. The number of atoms in an island, N, depends on the dimension L
as N = L²/ω, with ω being the area per atom. The chemical potential is then

μ = dE/dN = 2ωβ/L   (10.1)

Since μ decreases for larger islands, islands of infinite size have the lowest
chemical potential (Figure 10.4b), which means that the stable island is infinitely
large. In this case, equilibration does not result in a stable finite island size;
equilibration in this model by material transport between islands is also referred to as
coarsening, because it results in the shrinkage of small islands and the growth
(coarsening) of large islands (Ostwald ripening).
An infinitely large stable island size is the result for homoepitaxial growth, taking into account only the edge energy. However, the situation becomes different when elastic stress is also taken into account, as occurs in heteroepitaxy, where two different materials grow onto each other. Here, stress is induced by the different lattice constants of the substrate material and the material of the islands. The elastic effect of strained 2-D islands can be approximated by that of a surface-stress domain; that is, the surface stress at the area of the island is different from that at the rest of the surface (Figure 10.5). The strain energy of a quadratic surface-stress domain can be calculated using elastic theory as $E_{\mathrm{strain}} = -2C'L\ln L$ [6]. Adding the step edge energy results in a total energy of a strained island of

$$E = E_{\mathrm{edge}} + E_{\mathrm{strain}} = 2L\,(2\beta - C'\ln L) \qquad (10.2)$$
This results in the following chemical potential:

$$\mu = \omega\left(\frac{2\beta - C'}{L} - \frac{C'\ln L}{L}\right) \qquad (10.3)$$
which is illustrated in Figure 10.5. In this case, the chemical potential has a minimum at the size $L_{\mathrm{min}} = \exp(2\beta/C')$, which would mean that during coarsening the islands would approach this size. Larger islands would dissolve and smaller islands grow until all islands have the size $L_{\mathrm{min}}$, that is, the lowest chemical potential, and this would result in a very narrow size distribution. Unfortunately, step energies are only very poorly known, so that it is not possible to predict a reliable number for the equilibrium island size. An experimental realization of thermodynamically stable islands has not yet been confirmed, apart from surface reconstructions with a relatively large unit cell.
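The shape of this chemical potential can be made concrete with a short numerical sketch. Since, as noted above, the step energies are only poorly known, the values of $\beta$ and $C'$ below are arbitrary placeholders chosen purely to illustrate Eq. (10.3) and its minimum at $L_{\mathrm{min}} = \exp(2\beta/C')$:

```python
import numpy as np

omega = 1.0   # area per atom (arbitrary units)
beta = 1.0    # edge energy per length (placeholder)
C = 0.5       # strain-relaxation constant C' (placeholder)

def mu(L):
    """Chemical potential of a strained 2-D island, Eq. (10.3)."""
    return omega * ((2.0 * beta - C) - C * np.log(L)) / L

L_min = np.exp(2.0 * beta / C)                 # analytic minimum
L = np.linspace(1.0, 5.0 * L_min, 100_000)     # numerical scan
print(f"analytic  minimum: L_min = {L_min:.2f}")
print(f"numerical minimum: L     = {L[np.argmin(mu(L))]:.2f}")
```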
If the formation of nanostructures in equilibrium is compared to the formation of nanostructures by growth kinetics, the following advantages and disadvantages emerge. Nanostructures grown under equilibrium conditions have (under specific conditions) the advantage of a narrow size distribution around the optimum size. However, a disadvantage is that the size is determined by the material parameters (strain energy and step edge energy, for instance), and cannot be tuned freely. The size and density of nanostructures formed under kinetic conditions can be tuned easily by variations of the growth parameters such as growth rate and temperature. On the other hand, the size uniformity of islands grown under kinetic conditions is relatively poor.
10.2.1.4 Nanostructure Formation in Heteroepitaxial Growth
Semiconductor nanostructures can be fabricated by self-organization using heteroepitaxial growth, that is, the growth of a material B on a substrate of a different material A. In heteroepitaxial growth, the lattice constants of the two materials are
often different. The lattice mismatch for the two most commonly used material
systems, Si/Ge and GaAs/InAs, is 4.2% and 7%, respectively (shown schematically in
Figure 10.6a). This lattice mismatch leads to a build-up of elastic stress in the initial
2-D growth in heteroepitaxy. In the case of Ge heteroepitaxy on Si, the Ge is confined to the smaller lattice constant of the Si substrate; that is, the Ge is strained to the Si
lattice constant (Figure 10.6b). One way to relax this stress is via the formation of
three-dimensional (3-D) Ge islands, in which only the bottom of the islands is
confined to the substrate lattice constant. In the upper part of the 3-D island the lattice
constant can relax to the Ge bulk lattice constant and reduce the stress energy in this
way (Figure 10.6c). This growth mode, which is characterized by the formation of a 2-D wetting layer and the subsequent growth of (partially relaxed) 3-D islands, is referred to as the Stranski–Krastanov growth mode; some examples of this are described in Section 10.2.2.
The driving force for the formation of self-organized nanoislands in heteroepitaxial growth is the build-up of elastic strain energy in the stressed 2-D layer. As a
reaction to this, a partial stress relaxation by the formation of 3-D islands can lower
the free energy of the system. The process of island formation close to equilibrium is
a trade-off between elastic relaxation by the formation of 3-D islands, which lowers
the energy of the system, and an increase of the surface area, which increases the
energy.
In a simple model, where the islands are cubes with edge length $x$, the additional surface energy for a film in an island morphology (compared to a strained film) is proportional to the island length squared ($x^2$). The gained elastic relaxation energy compared to that of a flat film is, in the simplest assumption, proportional to the volume of the island ($x^3$). For the same total volume in the film, the energy difference between the 3-D island morphology and the flat morphology is

$$\Delta E = E_{\mathrm{surf}} - E_{\mathrm{relax}} = C\gamma x^2 - C'\varepsilon^2 x^3 \qquad (10.4)$$

where $\gamma$ is the surface energy, $\varepsilon$ is the lattice mismatch, and $C$ and $C'$ are constants. The contributions of $E_{\mathrm{surf}}$, $E_{\mathrm{relax}}$ and the total energy difference between the 3-D island morphology and a flat film are shown in Figure 10.7, as a function of the island size $x$.
For small sizes of the 3-D islands, the 3-D island morphology is unfavorable, up until the point where the absolute value of the gained elastic relaxation energy ($\propto x^3$) becomes larger than the cost of the surface energy ($\propto x^2$). For islands larger than a critical island size, $x_{\mathrm{crit}}$, the formation of 3-D islands is energetically preferred over the 2-D film morphology. While this simple model shows the basic driving forces for the 2-D to 3-D transition, it contains several simplifications. For example, in this simple model the island morphology is assumed to be cuboid, which does not correspond to the experimentally observed island shapes. Further, the simple model contains only energetic considerations of two final states. Kinetic effects, such as the material transport necessary during the 2-D to 3-D transition, are not considered.
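A minimal numerical sketch of this energy balance is given below. All constants are placeholder assumptions, chosen only to show how the $x^2$ surface term dominates for small islands while the $x^3$ relaxation term wins beyond the critical size:

```python
C, Cp = 1.0, 1.0      # geometry-dependent constants (assumed)
gamma = 1.0           # surface energy (assumed, arbitrary units)
eps = 0.04            # lattice mismatch, e.g. ~4% for Ge on Si

def delta_E(x):
    """Energy difference of Eq. (10.4): surface cost minus elastic relaxation."""
    return C * gamma * x**2 - Cp * eps**2 * x**3

# Besides x = 0, delta_E changes sign at the critical island size:
x_crit = C * gamma / (Cp * eps**2)
print(f"x_crit = {x_crit:.0f} (3-D islands favoured for x > x_crit)")
print(f"delta_E at 0.9*x_crit: {delta_E(0.9 * x_crit):+.0f}")   # still positive
print(f"delta_E at 1.1*x_crit: {delta_E(1.1 * x_crit):+.0f}")   # negative
```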
Apart from the formation of 3-D islands, there is another process which can partially relax the stress of a strained 2-D layer, namely the introduction of misfit dislocations. This corresponds to the removal of one lattice plane of a compressively strained 2-D layer. If lattice planes are removed at regular distances in the 2-D layer, then a misfit dislocation network forms. Depending on the growth parameters of temperature and growth rate, the self-organized growth can either be close to equilibrium or in the kinetically limited regime. Close to equilibrium (i.e., at high growth temperatures or low deposition rates), the resulting morphology (strained layer, 3-D islands, or a film with dislocations) is determined only by the energies of the particular configurations, and the morphology with the lowest energy will be formed. If the growth is kinetically limited, then the activation barriers are important. For instance, an initially flat strained layer can transform to a morphology with 3-D islands or to a film with dislocations. What actually happens depends on the kinetics of the growth process, that is, on the activation energy for the formation of 3-D islands compared to the activation energy for the introduction of misfit dislocations.
10.2.2
Semiconductor Nanoislands and Nanowires
10.2.2.1 Stranski–Krastanov Growth of Nanoislands
Stranski–Krastanov growth occurs, for example, in InGaAs/GaAs growth [7]. An
example of InAs nanoislands grown on a GaAs substrate is shown in the transmission
electron microscopy (TEM) image in Figure 10.8. The InAs islands were grown by MBE at a growth temperature of 775 K, and the density of the islands was 4.5 × 10¹⁰ cm⁻², with an average lateral size of 17.5 ± 0.5 nm. The challenges in the growth of these semiconductor islands are to grow islands of the desired size and density, and
with a high size uniformity. As in the case of the 2-D islands, a higher growth
temperature generally leads to the formation of larger islands, while a higher
growth rate leads to the formation of smaller islands. The size of the islands increases
with coverage; often, the density of the islands saturates during an early stage
of the growth. These are general trends which may depend on the material system
and the particular deposition technique. In some cases (self-limiting growth), the size
of the islands saturates while the density increases with coverage, and this type of
growth mode leads to a high size uniformity of the islands. The size uniformity
achieved in the self-assembled growth of semiconductor islands may be as good as a few percent. The confinement of charge carriers in all three directions gives rise to atomic-like energy levels. Quantum dot lasers operating at room temperature have now been realized [8]. The islands grown on a flat substrate are usually not laterally ordered, due to the random nature of the nucleation process. In the following section, it will be shown how nucleation at specific sites can be achieved.
10.2.2.2 Lateral Positioning of Nanoislands by Growth on Templates
An example of ordered nucleation at a prestructured substrate is shown in
Figure 10.9a [9], where Ge islands nucleate above dislocation lines. When a SiGe film is grown on a Si(0 0 1) substrate, dislocations form at the interface between the SiGe film and the substrate. The driving force for the formation of dislocations is
the relief of elastic strain, which arises due to the different lattice constants of the Si substrate and a Ge/Si film on this substrate. During annealing, the dislocations form
a relatively regular network, due to a repulsive elastic interaction between the
dislocations. The preferred nucleation of Ge islands above the dislocation lines
(Figure 10.9a) can be explained by local stress relaxation above the dislocation lines
providing a lattice constant closer to the Ge one. The nucleation does not occur
randomly at the surface, but rather occurs simultaneously at sites which have the same
structure. This leads to a narrower size distribution than that for the growth on unstructured Si(0 0 1) substrates (Figure 10.9b).
10.2.2.3 Silicide Nanowires
If the crystal structure of the deposited material is different from that of the substrate,
then effects related to the anisotropic match of both crystal structures may appear. If
the overlayer material has a crystal structure which is closely lattice-matched to the
substrate along one major crystallographic axis, but has a significant lattice mismatch along the perpendicular axis, this should allow unrestricted growth of the epitaxial crystal in the first direction but limit the width in the other direction. Such a strategy
has been applied to grow silicide nanowires [10]. Here, the substrate is a Si(1 0 0) surface (Si has the diamond crystal structure), and by the deposition of Er and subsequent annealing, oriented ErSi2 crystallites with a hexagonal AlB2-type crystal structure were formed on the Si substrate. The [0 0 0 1] axis of the ErSi2 was oriented along a [1 1 0] axis of the Si(0 0 1) substrate, and the [1 1 2 0] axis of the ErSi2 along the perpendicular [1 1 0] axis, with lattice mismatches of 6.5% and 1.3%, respectively; this almost satisfies the proposed growth conditions for nanowires.
ErSi2 nanowires grown on the Si(1 0 0) surface are shown in Figure 10.10. The ErSi2
nanowires align along one of the two perpendicular <1 1 0> Si directions, which are
the small mismatch directions. In these directions the crystal can grow without much build-up of stress. The width of the ErSi2 nanowires is about 4 nm, the height about 0.8 nm, and the length several hundred nanometers. Such self-assembled arrays
of nanowires may also be used as conductors for defect-tolerant nanocircuits, or as a
template for further nanofabrication.
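The two quoted mismatches can be cross-checked from approximate literature lattice constants; the values below are assumptions taken from standard crystallographic tables, not from Ref. [10] directly:

```python
# Anisotropic lattice match of hexagonal ErSi2 on Si, after the strategy of [10].
a_ErSi2 = 0.379             # nm, ErSi2 a axis, i.e. along [1 1 2 0] (approx.)
c_ErSi2 = 0.409             # nm, ErSi2 c axis, i.e. along [0 0 0 1] (approx.)
a_Si110 = 0.5431 / 2**0.5   # nm, atomic row spacing along Si <1 1 0>

def mismatch(a_film, a_sub):
    """Lattice mismatch (film - substrate) / substrate, in percent."""
    return 100.0 * (a_film - a_sub) / a_sub

print(f"[0 0 0 1] along Si[1 1 0]: {mismatch(c_ErSi2, a_Si110):+.1f} %")  # ~ +6.5 %
print(f"[1 1 2 0] along Si[1 1 0]: {mismatch(a_ErSi2, a_Si110):+.1f} %")  # ~ -1.3 %
```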
10.2.2.4 Monolayer-Thick Wires at Step Edges
Monolayer-high surface steps can be used to fabricate Ge nanowires using step-flow growth. Here, pre-existing step edges on the Si(1 1 1) surface are used as templates for
the growth of 2-D Ge wires at the step edges. When the diffusion of the deposited atoms is sufficient to reach the step edges, the deposited atoms are incorporated exclusively at the step edges, and the growth proceeds by a homogeneous advancement of the steps (step-flow growth mode) [4]. If small amounts of Ge are deposited,
then the steps will advance only a few nanometers and narrow Ge wires can be grown.
One key issue for the controlled fabrication of nanostructures consisting of
different materials is a method of characterization which can distinguish
between the different materials on the nanoscale. If the surface is terminated with
a monolayer of Bi, it is possible to distinguish between Si and Ge [11]. Figure 10.11a
shows a scanning tunneling microscopy (STM) image after repeated alternating
deposition of 0.15 atomic layers of Ge and Si, respectively. Due to the step-flow growth, the Ge and Si wires are formed at the advancing step edge. Both elements can easily be distinguished by their apparent heights in the STM images: the height measured by the STM was greater in areas consisting of Ge (red stripes) than in areas consisting of Si (yellow stripes). The apparent height of the Ge areas was about 0.1 nm greater than that of the Si wires (Figure 10.11b), and the
cross-section of a 3.3 nm-wide Ge nanowire was seen to contain only approximately
20 atoms (Figure 10.11c). The apparent height difference arises due to an atomic layer
of Bi which is deposited initially and always floats on top of the growing layer. The
different widths of the wires can easily be achieved by depositing different amounts of Ge and Si.
10.2.3
Hybrid Methods: The Combination of Lithography and Self-Organized Growth
Since the Ge does not grow on the oxide, the edges of the oxide hole cannot serve as sinks for
deposited Ge atoms, and therefore the Ge adatom concentration is homogeneous
across the hole and nucleation of the Ge island is random within the oxide hole. If
the edges of the hole were to serve as sinks for Ge atoms (e.g., if the edges of the
hole were to consist of Si), then the adatom density would have a maximum at the
center of the hole and the nucleation of Ge islands would occur preferentially at
the center of the oxide holes (Figure 10.12e).
10.2.4
Inorganic Nanostructures as Templates for Molecular Layers
Several of the nanostructures discussed here can potentially be used as templates for
the selective formation of molecular structures onto specific areas of the inorganic nanostructures generated by the self-assembly of atoms via epitaxial growth. The
importance of the inorganic substrate for the formation of molecular layers, which is
discussed in detail in the following section, is manifold. The role of the inorganic
nanostructured template for the molecular self-assembly may be to steer the
adsorption process kinetically, and to direct the molecules towards predefined adsorption sites. The first steps in this direction have been taken recently. Initially,
special substrate surfaces were selected in accordance with their ability to adsorb
molecules. For example, substrates with only weak adsorption properties are useful
for molecular assemblies with weak intermolecular interactions, because such
substrates allow for the necessary reorganization of molecules. In addition, it is
with a larger periodicity (five times the Ag/Si(1 1 1)-$(\sqrt{3}\times\sqrt{3})R30^{\circ}$ lattice
constant; see Figure 10.13d) has been created by the assembly of two types of
molecule on the Ag-terminated silicon surface [13]. This hexagonal molecular
network is shown in Figure 10.13c, and is discussed in detail in Section 10.3.3. The
registry of the molecules with respect to the underlying silver-terminated Si surface
has been determined, and is shown schematically in Figure 10.13d. The calculated
melamine–melamine separation has a near-commensurability with the surface
lattice, showing the importance of the underlying inorganic template for the
formation of the supramolecular structure.
In the future, the selective bonding of molecular species to inorganic template
structures, which would enable site direction, will also represent a major challenge
for the successful combination of inorganic templates and molecular structures.
10.3
Molecular Self-Assembly
10.3.1
Attaching Molecules to Surfaces
Bare surfaces of metals and metal oxides tend to adsorb organic materials because the
adsorbates lower the free energy of the interface between the respective material and
the ambient environment. The character of the chemical bond between the adsorbed
molecules and the metal surface determines the interfacial electronic contact and the
strength of the geometric fixation. Two main groups of links between molecules and
solids can be distinguished: (i) covalent bonds, which result from the overlap of
partially occupied orbitals of interacting atoms; and (ii) non-covalent bonds, which
are based on the electrical properties of the interacting atoms or molecules.
Planar molecules with extended π-systems have been found to physisorb onto surfaces, such as highly oriented pyrolytic graphite (HOPG), Au(1 1 1) and Cu(1 1 0), in a flat-lying geometry. This allows functional groups at the molecular periphery to approach each other easily and to build up intermolecular interactions, predominantly comprising hydrogen bonds and metal–ligand interactions. If the molecules
are sufficiently mobile to diffuse on the surface, then the intermolecular interactions
will guide the adsorbed molecules into 2-D supramolecular systems. Then, by
adjusting the molecular backbone size and the position or number of the functional
recognition groups, complex supramolecular nanostructures can be designed [31].
Covalent bonds are established if there is a significant overlap of the electron densities of the molecules and the metal, and this results in a strong electronic and structural coupling. The spontaneous formation of SAMs on substrates through covalent bonds requires organic molecules with a chemical functionality, or headgroup, with a specific affinity for a selected substrate. A number of headgroups exist which bind to specific substrates, forming directed covalent links. One frequently used covalent link is the bond between a thiol group on the molecular side and a noble metal substrate. Here, gold is favorable due to its non-oxidizing surface, although thiol or selenol bonds are also possible to Ag, Pt, Cu, Hg, Ge, Ni, and even semiconductor surfaces. The reason for the great success of the S–Au bond is its good stability at ambient temperature, and the ease of reorganization to form an ordered array. Both are elementary requirements for the building up of a self-assembled monolayer.
Besides the prominent thiolates, other functional molecules, such as alcohols (ROH) or acids, have been demonstrated to form organized monolayers on metal or metal oxide surfaces, such as Al2O3, TiO2, ZrO2, or HfO2. SAMs of alkylchlorosilanes (RSiCl3) and other silane derivatives require hydroxylated surfaces as substrates for their formation. The driving force for this self-assembly is the in-situ formation of polysiloxane, which is connected to surface silanol groups (Si–OH) via robust Si–O–Si bridges [32]. Substrates on which these monolayers have been prepared successfully include silicon oxide, aluminum oxide, quartz, glass, and mica.
During the past few years, significant advances have been made by coupling alkenes and alkynes onto Si and Si–H surfaces. The covalent coupling of vinyl compounds on H-terminated silicon yields very stable Si–C covalent bonds [33], and recently a method for the direct assembly of aryl groups on silicon and gallium arsenide using aryl diazonium salts has also been developed. There is a spontaneous ejection of N2 and direct carbon–silicon bond formation [34], but the C–Si bonds are so strong that a facile reconstruction to form a highly ordered SAM is implausible.
10.3.1.1 Preparation of Substrates
For the deposition of a SAM, a 2-D film with a thickness of one molecule, a high-quality surface with a very low surface roughness is required. Depending on any further use of the SAM, the quality of the surface must be adapted. For example, if the SAMs are applied as etch resists, protection layers, chemical sensors or model surfaces for biological studies, then polycrystalline films will mostly suffice as substrates. In contrast, if the properties of the SAMs themselves are to be studied in detail, such as their organization, structure or electronic properties, then oriented single-crystalline surfaces are required as substrates.
Planar substrates for SAMs are either thin films or single crystals of metals, semiconductors, or metal oxides. Thin films can be grown on silicon wafers, glass, single crystals or mica by CVD, physical vapor deposition (PVD), electrodeposition, or electroless deposition. Metal films on glass or silicon are polycrystalline, and composed of grains that can range in size from 10 to 1000 nm.
As pseudo-single crystals, thin films of metals on freshly cleaved mica are commonly used. Gold films grow epitaxially with a strongly oriented (1 1 1) texture on the (0 0 1) cleavage surface of mica. The films are usually prepared by thermal evaporation of gold at rates of 0.1–0.2 nm s⁻¹ onto a heated (400–650 °C) sample [35]. By using an optimized two-step process, a surface roughness down to 0.4 nm over areas of 5 × 5 µm² can be achieved [36]. Surfaces with almost comparable roughness can be created by a method known as template stripping [37]. Here, a glass slide or a silicon wafer is glued to the exposed surface of a gold film on mica, and subsequently the gold film is peeled from the mica to expose the surface that had been in direct contact with the mica. Typically, these methods ultimately result in surface roughnesses of about 1 nm over areas of 200 × 200 nm². For fundamental studies of SAMs by ultra-high vacuum (UHV) methods, single-crystal metal substrates provide the highest quality with respect to surface roughness, orientation, and cleanliness. These substrates result in densely packed SAMs of the highest order.
10.3.1.2 Preparation of Self-Assembled Monolayers
In principle, there are two ways of preparing SAMs, namely deposition from solution and deposition from the vapor phase. For deposition from solution, a clean, freshly prepared substrate is immersed into a highly dilute solution of the corresponding organic molecules. After only a few minutes of immersion, a dense molecular monolayer has built up; however, to ensure that the film reaches equilibrium, the substrates are kept in solution for several hours to allow reorganization (Figure 10.14). In particular, the structure of the adsorbate determines the highest achievable density of the SAM on a given surface, or whether a SAM can be formed at all. The other parameters, such as solvent, temperature, concentration and immersion time, should be chosen appropriately to achieve the best possible result. The advantages of this method are the simplicity of the equipment and the ease of preparation.
In the case of alkanethiols on Au(1 1 1), annealing at elevated temperatures to increase the quality of the films has been studied extensively. Annealing the SAMs in a dilute solution of their molecules for short periods at 80 °C often results in a reduction in the number of vacancy islands, and an enlargement of the domain sizes due to Ostwald ripening. This behavior is explained by an intralayer diffusion of monovacancies towards larger holes, which grow at the expense of smaller holes. Furthermore, some vacancy islands diffuse towards the gold step edges and annihilate there, which explains the decrease in the area occupied by the vacancy islands. In addition, the number of conformational defects in the SAMs decreases, and this results in a higher order.
In the case of gas-phase deposition, UHV systems with base pressures in the range of 10⁻⁵ to 10⁻⁷ mbar are used. The amount of deposited molecules is controlled by the pressure, the temperature and the time. Vapor deposition has the advantages that absolutely clean surfaces can be used, that good control of the amount of deposited molecules is possible, and that the SAM can be transferred to an analyzing tool without breaking the vacuum. By applying this method, submonolayers and highly ordered monolayers of extended size can be created (Figure 10.14).
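For a rough feel of how pressure, temperature and time set the dose, the impingement rate from kinetic gas theory can be used. The sketch below assumes dodecanethiol (molar mass ≈ 202 g mol⁻¹) as the adsorbate and a sticking coefficient of unity, both of which are illustrative assumptions:

```python
import math

k_B = 1.380649e-23     # J/K
N_A = 6.02214076e23    # 1/mol

def impingement_rate(P_mbar, T, molar_mass_g):
    """Molecular flux Phi = P / sqrt(2*pi*m*k_B*T) in molecules m^-2 s^-1."""
    P = P_mbar * 100.0                 # mbar -> Pa
    m = molar_mass_g / 1000.0 / N_A    # kg per molecule
    return P / math.sqrt(2.0 * math.pi * m * k_B * T)

phi = impingement_rate(1e-6, 300.0, 202.0)   # dodecanethiol at 1e-6 mbar, 300 K
sites = 1.0 / 0.216e-18                      # ~4.6e18 molecules/m^2 for a dense SAM
print(f"flux = {phi:.2e} m^-2 s^-1 -> roughly {sites / phi:.0f} s per monolayer dose")
```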
10.3.1.3 Preparation of Mixed Self-Assembled Monolayers
Mixed SAMs, that is, SAMs built up from different organic molecules and showing a well-defined structure, can be created in several ways; however, only the two most widely used approaches will be described here.
The first method is coadsorption from solutions containing mixtures of the selected organic molecules, and results in mixtures of molecular structures (Figure 10.15). This process allows the formation of SAMs with widely varying compositions and physical properties [39, 40].
The second method is a two-step deposition process which begins with a full-coverage monolayer of one organic species, which is used as the host matrix [41–43]. In a second step, the substrate covered with this host matrix is immersed into the solution of an organic molecule of interest. Insertion of this guest molecule takes place preferentially at defect sites such as pinholes or domain boundaries in the host matrix. The rate-determining step is the replacement of host molecules by guest molecules. Depending on the immersion time, domains of inserted molecules, bundles, or even single guest molecules can be identified in the resulting mixed monolayer (Figure 10.15). A well-ordered surrounding matrix can be used as the reference system for the analysis of the structural and electrical properties of the inserted molecules. This matrix-isolation method, in combination with scanning probe microscopy (SPM) techniques, is suitable for investigating series of organic molecules in order to determine new physical properties.
10.3.2
Structure of Self-Assembled Monolayers
Figure 10.16 A schematic diagram of a SAM, with the characteristic features highlighted.
hollow site on the Au(1 1 1) surface. The symmetry of the alkanethiolates is hexagonal, with a $(\sqrt{3}\times\sqrt{3})R30^{\circ}$ structure relative to the underlying Au(1 1 1) substrate, an S–S spacing of 0.4995 nm, and a calculated area per molecule of 0.216 nm². The alkanethiols are tilted about 30° off the surface normal, and the hydrocarbon backbones are in the all-trans configuration. Additionally, the alkanethiolates on Au(1 1 1) surfaces exhibit a c(4 × 2) superlattice, which is characterized by a systematic arrangement of molecules showing a distinct height difference (Figure 10.17) [44]. The height differences in STM images are believed to be due to different conformations of the molecules.
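These lattice parameters are mutually consistent, as the short check below shows; the only input is the bulk gold lattice constant of about 0.4079 nm:

```python
import math

a_Au = 0.4079                     # nm, bulk fcc gold lattice constant
d_Au = a_Au / math.sqrt(2)        # 0.2884 nm, nearest-neighbour spacing on Au(1 1 1)
d_S = math.sqrt(3) * d_Au         # S-S spacing of the sqrt(3) x sqrt(3) overlayer
area = math.sqrt(3) / 2 * d_S**2  # area per molecule in a hexagonal lattice

print(f"S-S spacing      : {d_S:.4f} nm   (text: 0.4995 nm)")
print(f"area per molecule: {area:.3f} nm^2 (text: 0.216 nm^2)")
```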
Highly ordered SAMs can easily be built up from alkanethiols, although their
structure is affected directly by the addition of any sterically demanding top-end
group. The size and the chemical properties (e.g., high polarity) of additionally
introduced surface functionalities may reduce the monolayer order.
10.3.2.2 Carboxylates on Copper
Compared to the extensive studies of organothiols on gold surfaces, very few
investigations have been undertaken to study the self-assembly process of carboxylic
acids on metal surfaces. The carboxyl group is known to be an anchoring group for
the chemical bonding to metal surfaces. During the adsorption process of simple carboxylic acids, the acid group is deprotonated to the carboxylate functionality, resulting in an upright adsorption configuration on copper or nickel surfaces, as is observed for formic, acetic, and thiophene carboxylic acids [20]. The oxygen atoms in
the carboxylate group are equidistant to the surface and form a rigid adsorption
geometry.
More recently, tartaric acid adsorbed onto copper or nickel surfaces [20] has
attracted interest because of a possible use in chiral technology. The aim of this technology is to establish enantioselective catalytic methods to produce pure enantiomeric forms of materials such as pharmaceuticals and flavors. One way to create heterogeneous chiral catalysts is to adsorb chiral organic molecules at metal surfaces in order to introduce asymmetry. Tartaric acid has two chiral centers, and is therefore of potential interest as a chiral modifier; indeed, it has recently been used successfully to stereodirect hydrogenation reactions with a yield of >90% of one enantiomer.
The self-assembly of R,R-tartaric acid on Cu(1 1 0) under varying coverage and temperature conditions leads to a variety of different structures. Due to its two carboxylic acid functionalities, R,R-tartaric acid can adsorb in the monotartrate, the bitartrate, or the dimer form. It can be seen from Figure 10.18 that the bitartrate and the dimer–monomer assemblies have no symmetry elements and create a chiral surface which is non-superimposable on its mirror image. This is a result of the inherent chirality of the R,R-tartaric acid molecules and their two-point bonding at the surface, which uniquely dictates the positions of all the functional groups. Subsequently, the intermolecular interactions control the placement of the neighboring molecules. Due to the chirality of the adsorbates, the lateral interactions are anisotropic and lead to organized chiral structures.
10.3.3
Supramolecular Nanostructures
Highly ordered 2-D supramolecular nanostructures can be created by using the MBE technique, or at the solid–liquid interface from solution [31], by tuning the molecular backbone size and controlling the supramolecular binding. This approach is based, in principle, on the concepts of supramolecular chemistry directing 3-D structures [16], but in the case of 2-D structures the influence of the substrate must also be taken into account [26].
form a brick-wall structure on the Au(1 1 1) surface. The bright protrusions in the STM image (Figure 10.20b) may be attributed to tert-butyl groups at both ends of the molecules; the positions of these groups with respect to the molecular backbone identify the enantiomeric form of the molecule. Two chiral enantiomers (LL and RR) and one achiral meso-form (LR/RL) of this molecule exist. The molecules are not completely stereochemically fixed by the substrate, but change between different surface conformers. An intermolecular trans configuration of the headgroups of two adjacent molecules has been found to exhibit the lowest potential energy (ΔE = 4 kJ mol⁻¹).
Supramolecular structures built by strong metal–ligand interactions are of high stability, and the incorporated metal centers offer additional functionalities. A system which forms a variety of 2-D surface-supported networks is based on iron (Fe) and aromatic dicarboxylic acids in different relative concentrations on copper surfaces. Mononuclear metal–carboxylate clusters are obtained from one Fe center per four terephthalic acid (TPA) molecules on Cu(1 0 0) surfaces [46]. The Fe(TPA)4 complexes form large, highly ordered arrays which are thought to be stabilized by substrate templating and weak hydrogen bonds between neighboring complexes (Figure 10.21a). From this, a perfect arrangement of the Fe ions results, which cannot be achieved by using top-down methods.
A completely different network is obtained when two Fe atoms per three dicarboxylic acid (DCA) molecules are deposited onto the Cu(1 0 0) surface. The resulting array can be described as a ladder structure forming a regular array of nanocavities [26] (Figure 10.21b). The ladders are formed by metal–ligand interactions, while the connections between the ladders are formed by hydrogen bonds. If one Fe atom is deposited per linker molecule, a fully interconnected metal–ligand 2-D network results [47] (Figure 10.21c). By using DCAs of different lengths as linker molecules between the Fe centers, the size of the resulting nanocavities in the network can be tuned.
10.3.4
Applications of Self-Assembled Monolayers
10.3.4.1 Surface Modifications
One potential application for SAMs is the modification of surfaces in order to change the surface properties for special applications. For example, polar surfaces can be created by the adsorption of SAMs with terminal groups such as cyano (CN). These polar surfaces are useful for the investigation of dipole–dipole interactions in surface adhesion. On the other hand, SAMs with terminal OH groups can vary the wetting behavior, and are used in investigations to study the importance of H-bonding in surface phenomena. Additionally, surface OH and COOH groups, and especially acid chlorides, are very useful groups for chemical transformations. For example, reacting the acid chloride with a carboxylic acid-terminated thiol provides the corresponding thioester. The control of surface reactions opens up the way to chemical sensors [21], and is the basis of chemical force microscopy [22].
10.3.4.2 Adsorption of Nanocomponents
Mixed SAMs containing two or more constituent molecules can be used as test
systems to study the interactions of surfaces with bioorganic nanocomponents
(proteins, carbohydrates, antibodies). Usually, the SAM contains alkanethiols with
a surface terminal group of interest (e.g., suitable for hydrophobic or hydrophilic
interactions) and an alkanethiol with a reactive site for linking to a biological ligand.
SAMs make it possible to generate surfaces with anchored biomolecules that remain
biologically active and in their native conformations.
Additionally, it is possible to use the specific chemical binding properties of SAM surface groups to direct nanocomponents into desired structures. This approach has been used extensively to form selected assemblies of nanoparticles, and opens up a pathway to a variety of different structures (this subject is discussed in detail in Section 10.4.2). Another example of the fabrication of desired structures through the binding properties of SAMs is the directed assembly of carbon nanotubes (CNTs). Recently, a method was developed which is based on the observation that CNTs are strongly attracted to COOH-terminated SAMs, or to the boundary between COOH- and CH3-terminated SAMs. By using nanopatterned affinity templates, desired structures of CNTs can be formed (Figure 10.22). Useful methods for the generation of appropriate templates include dip-pen nanolithography (see Chapter 8) and micro- or nano-contact printing [48].
10.3.4.3 Steps to Nanoelectronic Devices
Molecular electronics requires several structural elements, such as wires, diodes, switches, and transistors, in order to build up nanodevices. In the studies conducted by Weiss and colleagues [24], conjugated oligophenylene ethynylenes (OPEs) have been investigated, which possess potentially interesting features, including negative differential resistance (NDR; increased resistance with increasing driving voltage), bistable conductance states, and controlled switching under an applied electric field. Single OPEs have been studied in a 2-D isolation matrix of host SAMs of dodecanethiolate on a gold electrode. As a result, series of surface images (Figure 10.23) showed conductance switching due to conformational changes of the OPE molecules at a low rate if the surrounding matrix was well ordered. Conversely, when the surrounding matrix was poorly ordered, the inserted molecules switched more often [24]. The switching of OPE molecules can only be observed in arrays of small bundles of molecules. It is therefore assumed that the forming and breaking of hydrogen bonds between adjacent molecules, and the consequent twisting of the molecule, which prevents conjugation of the π-orbitals of the molecular backbone, is responsible for the two conduction states. Such a device, constituted by a switching molecule attached to a bottom electrode and a conductive tip, represents a simple form of memory.
According to Aviram and Ratner in 1974 [49], the working principle of a molecular diode should be based on two separated electron-donor and electron-acceptor π-systems (see Chapter 24). However, in recent years, asymmetric electron transmission through symmetrical molecules has also been observed in SPM investigations at the molecular level [50]. Diode behavior is possible for symmetric molecules if they are connected asymmetrically to the electrodes, that is, with two different molecule–electrode spacings, or if asymmetric electrodes are used. A further development of this idea leads to the assumption that engineering the frontier orbitals of the molecule in the asymmetric junction should make it possible to control the orientation of the diode, that is, whether an alignment of the cathode to the LUMO of the molecule or of the anode to the HOMO is achieved at lower bias. If, additionally, the frontier orbitals of the molecule can be changed reversibly by an electrical pulse, then an optical pulse
Figure 10.24 (a) Chemical formula of hexa-peri-hexabenzocoronene (HBC) decorated with six anthraquinone (AQ) functions. (b) STM current image of HBC–AQ6 molecules with coadsorbed charge-transfer (CT) complexes. (c) Current–voltage (I–V) relationships.
10.4
Preparation and Self-Assembly of Metal Nanoparticles
Generally, metal nanoparticles are prepared by the reduction of a soluble metal salt with suitable reducing agents, by electrochemically [53–55] or physically assisted methods (e.g., thermolysis [56], sonochemistry [57], photochemistry [58]), or directly by the decomposition of labile zero-valent organometallic complexes. In all cases, the synthesis must be performed in the presence of surfactants, which form SAMs on the nanoparticle surfaces (see also Section 10.3) and thus stabilize the formed nanoparticles. The stabilizing effects of the surfactants comprise: (i) steric effects, meaning stabilization due to the space required by the ligand shell; and (ii) electrostatic effects, implying stabilization due to coulombic repulsion between the particles. Furthermore, the surfactants influence the size, shape, physical properties and assembly patterns of the nanoparticles. In recent years, many excellent reviews have been produced providing detailed overviews of the preparation techniques, properties and surfactant influences [59–63]. In this context it should be mentioned that, because of the protecting SAM on the surface, ligand-stabilized nanoparticles are also sometimes denoted monolayer-protected clusters (MPCs). However, in comparison to SAMs on planar surfaces, the structures of the SAMs on nanoparticle surfaces differ greatly due to the surface curvature [64].
Important and already more or less standardized examples of the preparation of non-stoichiometrically composed metal nanoparticles via the reduction of a metal salt with a suitable reducing agent include the route of Turkevich et al. [65] and that of Brust et al. [66, 67]. Turkevich and colleagues were the first to introduce a standardized method for the preparation of gold nanoparticles with diameters ranging from 14.5 ± 1.4 nm to 24 ± 2.9 nm, via the reduction of HAuCl4 with sodium citrate in water. Thereby, the nanoparticle size can be controlled by variation of the HAuCl4/sodium citrate ratio. This route is often applied because citrate-stabilized gold nanoparticles can simply be surface-modified, owing to the weakly electrostatically bound, and thus easily exchangeable, citrate ligand. Brust et al. utilized sodium borohydride as a reducing agent, and took advantage of the high binding affinity of thiols to gold; this enabled the preparation of relatively stable nanoparticles that could be precipitated, redissolved, analyzed chromatographically, and further surface-modified without any apparent change in properties. This high stability represents an important property in terms of controlling nanoparticle assembly.
A recently published report described the surfactant-free synthesis of gold nanoparticles [68]. This approach is especially interesting in terms of the preparation of small gold nanoparticles with narrow dispersity, protected by ligands carrying functional groups that are typically not stable towards reducing agents. In this method, a solution of HAuCl4 in diethylene glycol dimethyl ether (diglyme) is reduced by a solution of sodium naphthalenide in diglyme to yield weakly solvent-molecule-protected gold nanoparticles. These nanoparticles are then further stabilized and functionalized simply by the addition of various ligands (1-dodecanethiol, dodecaneamine, oleylamine and triphenylphosphine sulfide). The size of the nanoparticles can be tuned within the range of 1.9 to 5.2 nm, with dispersities of 15–20%, depending on the volume of the added reduction solution and the time between the addition of the reduction solution and the ligand molecule solution.
Magnetic nanoparticles, such as cobalt or iron nanoparticles, are typically prepared via the decomposition of a zero-valent organometallic precursor, for example metal carbonyl complexes. One example is the synthesis of monodisperse (± one atomic layer) Co and Fe nanoparticles with sizes of approximately 6 nm via thermal decomposition of the respective carbonyl compounds (Fe(CO)5, Co2(CO)8) under an inert atmosphere [69]. In this way, the nanoparticle size can be controlled by adjusting the temperature and the metal precursor:surfactant ratio. For example, higher temperatures and higher metal precursor:surfactant ratios produce larger nanoparticles. An additional control parameter is the ratio of the surfactants tributyl phosphine and oleic acid, both of which bind to the nanoparticle surface. Tributyl phosphine binds weakly, allowing rapid growth, while oleic acid binds tightly and favors slow growth, producing smaller particles. As might be expected, iron
nanoparticles show great sensitivity towards oxidation in air; even a short contact of the nanoparticle surface with air results in the formation of an Fe3O4 layer with a thickness of approximately 2 nm (Figure 10.25).
The thermal decomposition of metal carbonyl complexes for the preparation of nanoparticles or nanostructured materials can also be achieved by treatment with ultrasound. Treatment of a liquid with ultrasound causes the formation, growth and implosive collapse of bubbles in the liquid, and this in turn generates localized hot spots [70]. As an example, amorphous Fe/Co nanoparticles are prepared by the sonolysis of Fe(CO)5 and Co(NO)(CO)3 in decane/diphenylmethane at 293–300 K under an argon atmosphere, to produce pyrophoric amorphous Fe/Co alloy nanoparticles. Annealing of these particles in an argon atmosphere at 600 °C leads to growth of the Fe/Co particles, and this finally yields air-stable nanocrystalline Fe/Co particles, due to a carbon coating on the surface [70].
While the above-mentioned examples were all non-stoichiometrically composed nanoparticles, one famous example of a stoichiometrically composed gold nanoparticle, and thus of great control over size-dependent properties, is the so-called Schmid cluster [Au55(PPh3)12Cl6], which was introduced in 1981 [71]. The cluster is prepared by the reduction of Au(PPh3)Cl with in-situ-formed B2H6 in warm benzene. The relevance of this cluster refers to its quantum size behavior and to the fact that it can be regarded as a prototype of a metallic quantum dot [72, 73]. The defined stoichiometric composition of Au55(PPh3)12Cl6 is based on the so-called full-shell cluster principle, whereby the cluster is seen as a cut-out of the metal lattice of the bulk metal. This implies that the cluster consists of a metal nucleus surrounded by shells of close-packed metal atoms, such that the $n$-th shell contains $10n^2 + 2$ atoms ($n$ = number of the shell) [59, 74]. Further examples in this context are [Pt309phen*36O30] (four-shell cluster) and [Pd561phen36O200] (five-shell cluster) (phen* = bathophenanthroline; phen = 1,10-phenanthroline) [75–77].
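The full-shell magic numbers follow directly from this rule; the short sketch below reproduces the 55-, 309- and 561-atom clusters mentioned in the text:

```python
def full_shell_cluster_size(n_shells):
    """Central atom plus n closed shells of 10*n**2 + 2 atoms each [59, 74]."""
    return 1 + sum(10 * n**2 + 2 for n in range(1, n_shells + 1))

for n in range(1, 6):
    print(f"{n}-shell cluster: {full_shell_cluster_size(n):4d} atoms")
# prints 13, 55 (Au55), 147, 309 (Pt309), 561 (Pd561)
```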
10.4.2
Assembly of Metal Nanoparticles
The influence of the ligand type and the linker length on the charge-transport properties of metal nanoparticle assemblies has been studied for a variety of cases. For example, the insertion of bifunctional amines into the 3-D arrangement of Pd561phen36O200 clusters yields an increased interparticle spacing compared to the close sphere packing obtained from solution. The increased interparticle spacing is reflected in an increase of the activation energy of the electron transport through the material [84]. The influence of the ligand type was recently discussed for the case of Au55-cluster arrangements by Simon and Schmid [85]. When comparing the charge-transport properties of networks of Au55 clusters interconnected by either weak ionic interactions or by covalent bond formation between bifunctional ligands, it transpired that the two types showed characteristically different charge-transport properties. The non-covalently interconnected cluster systems showed a continuous increase of the activation energy for charge transport with increasing interparticle distance, whereas the covalently linked cluster systems showed a decrease in activation energy, to significantly lower values [85].
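Assuming simple thermally activated transport, $\sigma(T) = \sigma_0 \exp(-E_{\mathrm{A}}/kT)$, the impact of the activation energy can be illustrated as follows. The two activation energies are placeholders standing in for an ionically and a covalently linked network; they are not measured values from Ref. [85]:

```python
import math

k_B = 8.617e-5                                            # eV/K
E_A = {"ionic linking": 0.10, "covalent linking": 0.05}   # eV (illustrative)

for label, Ea in E_A.items():
    # ratio of conductivities at room temperature and at 77 K
    ratio = math.exp(-Ea / (k_B * 300.0)) / math.exp(-Ea / (k_B * 77.0))
    print(f"{label}: E_A = {Ea:.2f} eV -> sigma(300 K)/sigma(77 K) = {ratio:.1e}")
```

The smaller the activation energy, the weaker the temperature dependence of the transport, which is how the two linking schemes can be distinguished experimentally.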
10.4.2.2 Two-Dimensional Assemblies: The Formation of Monolayers
In general, the assembly of nanoparticles in two dimensions is achieved by binding the metal nanoparticles onto a substrate surface. Thus, in order to direct the assembly into a defined pattern, the key point is the existence of functional groups on the substrate surface which enable a specific interaction between the nanoparticle and the surface. These interactions may be either weak electrostatic forces or weaker van der Waals forces; alternatively, covalent bonds may be formed between the monolayer-protected nanoparticle and the substrate surface. The weak interactions have the advantage that they allow a reasonable mobility on the surface, and so enable ordering due to self-organization. Covalent bond formation has the advantage of building
approximately 1500 particles per µm² (Figure 10.30) [88]. The average interparticle spacing is adjusted by the thickness of the citrate shell.
SAMs based on the covalent attachment of nanoparticles were presented by Schmid and coworkers, who described the formation of 2-D arrangements of ligand-stabilized gold clusters and gold colloids on various inorganic conducting and insulating surfaces [89]. For this purpose, oxidized silicon as well as quartz glass surfaces were treated with (3-mercaptopropyl)trimethoxysilane to generate monolayers of the SH-functionalized silane. When dipped into an aqueous solution of 13-nm gold colloids, stable covalent S–Au bonds were formed, thus fixing the colloids in a highly disordered arrangement. The coating of the surface was visualized using AFM.
Microcontact printing (µCP) is also used to create nanoparticle surface assemblies based on covalent forces, with predefined positions of the nanoparticles within the arrays. A recent example of µCP use was the chemically directed assembly of monolayer-protected gold nanoparticles on lithographically generated patterns [90]. Here, gold surfaces patterned with mercaptohexadecanoic acid (MHA) were prepared using µCP and dip-pen nanolithography. A diamine molecule was used to link the mercaptoundecanoic acid-coated gold nanoparticles onto the MHA-defined patterns. Sonication was then used to remove the non-specifically adsorbed nanoparticles, and the nanoparticle assembly was proven by AFM, with height increases of 2–6 nm with respect to the preformed MHA SAM (Figure 10.31).
10.4.2.3 One-Dimensional Assemblies
The 1-D assembly of nanoparticles remains a major challenge, as 1-D (or at least quasi-1-D) assemblies require appropriate nanoparticle surface modification, appropriate templates, or special techniques (including special substrate modifications, e.g., AFM-based methods).
One recently published example is the spontaneous quasi-1-D arrangement of spherical Au nanoparticles protected by a liquid crystal ligand (the 4′-(12-mercapto-
10.5
Conclusions
In this chapter, we have discussed the basic principles and selected examples of the formation of nanostructures via the self-assembly of atoms by epitaxial growth, of molecules, and of metal clusters. The examples presented provide an impressive demonstration that these methods represent powerful tools for the controlled preparation of highly ordered nanostructures. In the future, a combination of the three methods should represent the next key step towards building up well-defined functional nanostructures with suitable properties for applications in molecular electronics. Moreover, besides the applications of self-assembled structures discussed here, such nanoscale structures are of growing interest in the development of new sensors and catalysts. Inorganic substrates with epitaxially grown nanostructures provide suitable starting points for the site-selective attachment of molecules, and such molecules may serve as intelligent adhesives for the binding of nanoscale subunits. The future goal of fabricating complex functional nanoscale structures will be achieved only by employing a hierarchical self-assembly approach, using different scales and materials.
References
1 Vescan, L. (1995) Handbook of Thin Film
Process Technology, D.A. Glocker and S.I.
Shah (Eds.), IOP, Bristol.
2 Kasper, E. (1988) Silicon Molecular Beam
Epitaxy, Vol. 12, CRC Press.
3 Herman, M.A., Richter, W. and Sitter, H.
(2004) Epitaxy: Physical Principles and
Technical Implementation, Springer.
4 Voigtländer, B. (2001) Surface Science
Reports, 43, 127.
5 Venables, J.A. (1994) Surface Science, 299/
300, 798.
6 Jesson, D.E., Voigtländer, B. and Kästner, M. (2000) Physical Review Letters,
84, 330.
7 Shchukin, V.A., Ledentsov, N.N. and
Bimberg, D. (2003) Epitaxy of
Nanostructures, Springer, Heidelberg.
8 Heinrichsdorff, F., Ribbat, Ch., Grundmann, M. and Bimberg, D. (2000) Applied Physics Letters, 76, 556–558.
9 Shiryaev, S.Yu., Jensen, F., Lundsgaard
Hansen, J., Wulff Petersen, J. and
Nylandsted Larsen, A. (1997) Physical
Review Letters, 78, 503.
10 Chen, Y., Ohlberg, D.A.A., Medeiros-Ribeiro, G., Chang, Y.A. and Williams, R.S.
(2000) Applied Physics Letters, 76, 4004.
11 Kawamura, M., Paul, N., Cherepanov, V. and Voigtländer, B. (2003) Physical Review
Letters, 91, 096102.
12 Kim, E.S., Usami, N. and Shiraki, Y. (1998)
Applied Physics Letters, 72, 1617.
13 Theobald, J.A., Oxtoby, N.S., Phillips, M.A., Champness, N.R. and Beton, P.H. (2003) Nature, 424, 1029–1031.
14 Butcher, M.J., Nolan, J.W., Hunt, M.R.C.,
Beton, P.H., Dunsch, L., Kuran, P., Georgi,
P. and Dennis, T.J.S. (2001) Physical Review
B-Condensed Matter, 64, 195401.
15 Wan, K.J., Lin, X.F. and Nogami, J. (1992)
Physical Review B-Condensed Matter, 45,
9509.
16 Lehn, J.M. (1995) Supramolecular Chemistry, Wiley-VCH, Weinheim, Germany.
17 Prime, K.L. and Whitesides, G.M. (1991) Science, 252, 1164–1167.
18 Atre, S.V., Liedberg, B. and Allara, D.L. (1995) Langmuir, 11, 3882–3893.
19 Scherer, J., Vogt, M.R., Magnussen, O.M. and Behm, R.J. (1997) Langmuir, 13, 7045–7051.
20 Barlow, S.M. and Raval, R. (2003) Surface Science Reports, 50, 201–341.
21 Rickert, J., Weiss, T. and Göpel, W. (1996) Sensors and Actuators B: Chemical, 31, 45–50.
22 Schönherr, H. and Vancso, G.J. (2006) Chemical force microscopy, in Scanning Probe Microscopies Beyond Imaging (ed. P. Samorì), Wiley-VCH, Weinheim, Germany.
23 Joachim, C., Gimzewski, J.K. and Aviram, A. (2000) Nature, 408, 541–548.
24 Donhauser, Z., Mantooth, B., Kelly, K., Bumm, L., Monnell, J., Stapleton, J., Price, D., Rawlett, A., Allara, D., Tour, J. and Weiss, P. (2001) Science, 292, 2303–2307.
25 Gölzhäuser, A., Geyer, W., Stadler, V., Eck, W., Grunze, M., Edinger, K., Weimann, Th. and Hinze, P. (2000) Journal of Vacuum Science & Technology, B18, 3414–3418.
26 Barth, J.V., Costantini, G. and Kern, K. (2005) Nature, 437, 671–679.
27 Blinov, L.M. (1988) Soviet Physics Uspekhi, 31, 623–644.
28 Metzger, R.M. (2003) Chemical Reviews, 103, 3803–3834.
29 Schreiber, F. (2000) Progress in Surface Science, 65, 151–256.
30 Yang, G. and Liu, G. (2003) Journal of Physical Chemistry B, 107, 8746–8759.
31 De Feyter, S. and De Schryver, F.C. (2005) Journal of Physical Chemistry B, 109, 4290–4302.
32 Sugimura, H., Hanji, T., Hayashi, K. and Takai, O. (2002) Ultramicroscopy, 91, 221–226.
33 Buriak, J.M. (2002) Chemical Reviews, 102, 1271–1308.
34 Kosynkin, D.V. and Tour, J.M. (2001) Organic Letters, 3, 993–995.
35 DeRose, J., Thundat, T., Nagahara, L.A. and Lindsay, S.M. (1991) Surface Science, 256, 102–108.
36 Lüssem, B., Karthäuser, S., Haselier, H. and Waser, R. (2005) Applied Surface Science, 249, 197–202.
III
High-Density Memories
11
Flash-Type Memories
Thomas Mikolajick
11.1
Introduction
The trend towards mobile electronic devices drives an increasing demand for non-volatile memories [1]. In 2007, the market for NAND Flash memories approached the size of the dynamic random access memory (DRAM) market with regard to bit volume, and continues to grow. The hierarchy of today's non-volatile memories is illustrated schematically in Figure 11.1. From this, with the technologies available today, a trade-off between cost and flexibility must be made.
At the low end of flexibility there is the read-only memory (ROM), which can only be programmed during production, but delivers the lowest cost per bit. In practical applications, the most important feature today is electrical rewritability. The electrically erasable and programmable ROM (EEPROM) allows this reprogrammability at the Byte level. To achieve this, each memory cell must be constructed from two transistors (the storage transistor and a select device), but this leads to a large cell size. The electrically programmable ROM (EPROM), on the other hand, consists of only one memory transistor, but does not allow an electrical erase. The Flash-type memory combines the small cell size of the EPROM with the electrical erasability of the EEPROM, simply by allowing the erase operation not on a Byte level but only on large blocks of 16 kB to 1 MB (see Figure 11.2). At the high end of flexibility there is a memory that allows random access-like operation as in DRAMs or static random access memories (SRAMs), and is non-volatile. Today, this non-volatile RAM can only be realized by the combination of DRAM or SRAM [2] with EEPROM or Flash, or by ferroelectric RAMs [3].
Today's standalone Flash memories can be divided into memories for code applications and for data applications. In code applications, the memory must allow fast random access to enable real-time code execution. In data applications, the focus is on highest density and fast program and erase throughput. The implications of this difference for the array architecture and cell construction will be explained in Sections 11.3.2 and 11.3.3. Additionally, a number of applications such as smart cards call for Flash memory embedded into a high-performance logic circuit [4]. In the embedded Flash segment the density is typically much lower than in standalone memories; therefore, the focus lies on easy integration into the standard complementary metal oxide semiconductor (CMOS) flow and a low circuit design overhead for the memory module. The requirements, however, depend on the actual application, which in turn has led to the development of a large number of different concepts for embedded Flash memories. In the standalone segment of the market, in contrast, one mainstream solution for code and one mainstream solution for data Flash memories have evolved.
11.2
Basics of Flash Memories
In order to integrate a Flash memory into a product, two basic elements are required. First, a memory cell is necessary that can perform the program, erase and read operations with the required parameters. A generic charge storage memory cell, illustrating the program, erase and read operations, is shown in Figure 11.3. To build a large memory, these memory cells must be connected into memory arrays, with the final memory parameters being governed by the combination of memory cell and array architecture.
11.2.1
Programming and Erase Mechanisms
In programming and erase, the charge must be transferred to and from the charge
storage layer, overcoming the large potential barrier of the bottom or tunneling
dielectric. In principle, two different mechanisms are possible (see Figure 11.4). In
the hot carrier injection mode, the energy of the carriers is heated up to a level which is
sufcient to overcome the barrier. In the tunneling mode a large voltage is applied to
the barrier in order to reduce its effective width. Variants of both effects are used in
different type of Flash concepts. The ways in which several combinations of these
effects are realized in Flash concepts are listed in Table 11.1, and the most important
concepts will be explained in Section 11.3. Details of other concepts may be found in
the references listed in Table 11.1. At this point it should be noted that in Table 11.1
and the remainder of this chapter, the operation performed on a byte, word or page
level is called programming and the operation performed on a block level is called
erase.
11.2.1.1 Hot Carrier Injection
In channel hot electron programming, the electrons are accelerated until they have
enough energy to surmount the barrier between silicon and silicon dioxide. For
electrons this barrier is about 3.1 eV [18], whilst for holes the silicon bandgap must be added, resulting in a barrier of 4.2 eV.

Table 11.1 Combination of program and erase mechanisms used in different Flash cell concepts. Programming is performed by channel hot electrons in ETOX till 0.18 µm [11] and in NROM [14], TwinFlash [78] and MirrorBit [76]; by channel hot electrons with secondary impact ionization in CHISEL [19]; by source side injection in Twin MONOS [85, 86], AG-AND [44] and HMOS [21]; by hot holes generated by band-to-band tunneling (BBT) in PHINES [89]; and by Fowler-Nordheim tunneling in AND [32], DINOR [16], FLOTOX [9] and HiCR [17]. Erase is performed, depending on the concept, by BBT hot holes, by Fowler-Nordheim tunneling to the source, to the drain or to the poly, or via a field-enhancing tunneling injector [47, 48].

To achieve this energy, a high field must be
generated in the channel by applying a sufficiently high drain voltage. Additionally, a
gate voltage that attracts the generated carriers must be applied. This method has the
advantage of microsecond programming speed for a single bit, as well as the fact that
it is a three-terminal operation making the disturb optimization easy in a NOR-type
architecture. On the other hand, the mechanism has the problem of being very inefficient, as typically 10^5 to 10^6 channel electrons are needed to inject one electron into the storage layer. This leads to a current consumption in the range of 100 µA per cell. The consequence is a limited parallelism of cells during
programming, and therefore a limited programming throughput.
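As a rough illustration of this limitation, the sketch below (a minimal Python estimate; the total current budget and the pulse duration are assumed round numbers, not values from a specific product) converts the per-cell programming current quoted above into a parallelism and throughput figure.

    # Back-of-the-envelope estimate of channel-hot-electron programming
    # throughput. The per-cell current follows the text; the current budget
    # and pulse time are illustrative assumptions.

    CELL_CURRENT_A = 100e-6    # ~100 uA drawn per cell during programming
    CURRENT_BUDGET_A = 10e-3   # assumed total current budget of the charge pumps
    PULSE_TIME_S = 1e-6        # assumed microsecond-range programming pulse

    parallel_cells = round(CURRENT_BUDGET_A / CELL_CURRENT_A)
    throughput_bits_per_s = parallel_cells / PULSE_TIME_S

    print(f"cells programmed in parallel: {parallel_cells}")
    print(f"programming throughput: {throughput_bits_per_s / 8 / 1e6:.1f} MB/s")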
In order to reduce both the programming current as well as the drain
voltage required during programming, two alternative approaches are possible
(see Figure 11.5). First, it is possible to apply a bulk voltage during programming,
and in this case the field between drain and bulk is increased. As hot electrons at the
drain side will lead to electron hole pair creation by impact ionization, the generated
holes can be accelerated to the bulk and thereby create a second impact ionization.
The so-generated tertiary electrons again may be accelerated towards the charge
storage layer. By using this approach the drain voltage can be reduced below 3 V and
the current can also be significantly reduced [19]. However, care must be taken to maintain the benefit when scaling down the channel length [20].
Another approach is to use a so-called split gate transistor, where the channel
region is divided into two serial regions with individual gates. The storage layer is
present only below one of the two gates. The gate voltage at the gate close to the source
is chosen at a value slightly above the threshold voltage, which limits the current
flowing through the channel. The voltage of the second gate is set to a voltage high enough to accelerate the carriers into the charge storage layer. By doing this, a high field is created in the region between the two gates, such that the electrons become hot in that area of the device. As the carriers are injected at a location close to the source, this method is referred to as source side injection (SSI) [21]. The channel current can be reduced to the single µA range. Due to the necessity of a second gate electrode, however, there is a cell size drawback and a circuit overhead associated with this source side injection.

Figure 11.5 Channel hot carrier injection mechanisms. In the classical channel hot electron injection, the injection is mainly by primary channel electrons or secondary electrons (a). With applied back bias, the injection current is significantly increased by the carriers additionally generated during the secondary impact ionization (b).
Besides channel hot electron generation, carriers generated by band-to-band tunneling [27] may also be used for programming and erase. In this case, band-to-band tunneling in the drain junction is induced by applying a high drain potential
while the gate is turned off. The so-generated carriers are accelerated towards the
channel by the electric field, collecting enough energy to surmount the potential
barrier. This is illustrated in Figure 11.6 for the case of hot hole generation in an n-channel device. A modified version uses a highly doped buried layer to generate the band-to-band tunneling [28].
11.2.1.2 Fowler-Nordheim Tunneling
In Fowler-Nordheim (FN) tunneling, a high electric field applied to the barrier creates a trapezoidal barrier which significantly reduces the effective barrier for the carriers (see Figure 11.4). The current can be calculated according to the well-known equation [29]:

\[ I_G = A_{FN} \, E_{ox}^2 \, \exp\!\left(-\frac{B_{FN}}{E_{ox}}\right) \tag{11.1} \]
where E_ox is the electric field in the oxide and A_FN and B_FN are material-specific constants. If a dielectric charge-trapping layer is used for charge storage, the material stack relevant for the tunneling will also depend on the applied field [33]. For low fields, the carriers must tunnel through part of the trapping layer in addition to the tunneling dielectric. For higher fields and very thin layers, direct tunneling is
dominant. Finally, for high fields the FN tunneling [as given by Equation 11.1] is the
most important mechanism.
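Equation 11.1 is straightforward to evaluate numerically. In the minimal sketch below, the constants A_FN and B_FN are placeholders assumed only to demonstrate the steep field dependence; they are not values given in this chapter, and the absolute current scale is arbitrary.

    import math

    # Fowler-Nordheim tunneling, Equation 11.1:
    #   I_G = A_FN * E_ox^2 * exp(-B_FN / E_ox)
    # A_FN sets the (arbitrary) overall scale; B_FN controls the exponential
    # field dependence. Both are assumed placeholder values.

    A_FN = 1.0e-6   # assumed prefactor
    B_FN = 2.5e10   # V/m, assumed exponential constant

    def fn_current(e_ox: float) -> float:
        """Gate current (arbitrary scale) for an oxide field E_ox in V/m."""
        return A_FN * e_ox**2 * math.exp(-B_FN / e_ox)

    for e_ox in (8e8, 1.0e9, 1.2e9):   # fields around 8-12 MV/cm
        print(f"E_ox = {e_ox:.1e} V/m -> I_G = {fn_current(e_ox):.3e} (a.u.)")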
Until now, we have been considering charge transfer between the transistor
channel and the charge storage layer. However, by using tunneling the charge can
also be transferred to the gate electrode [30] or to a specially designed erase gate [31].
In this case the modified properties of the tunneling barrier created on a polysilicon electrode, as well as field enhancement by poly tips, must be taken into account in the
FN tunneling current.
11.2.2 Array Architecture
When combining memory cells to a memory array, different versions are possible. The
straightforward way is to connect the gate of every memory cell to a wordline, the drain of
every memory cell to the bitline, and to connect all the sources of the memory cells to
ground. By using this construction all n cells on one bitline are connected in parallel to
each other. As this resembles the n-channel portion of an n-input NOR gate, this
architecture is referred to as NOR, or more precisely common ground NOR. Other
types of NOR architecture will be discussed later in this section. In contrast, it is also
possible to connect the cells on the bitline in a serial connection, leading to a NAND-type
array. In practical applications the number of cells in series must be limited to keep the
read current on an acceptable level; therefore, typically 32 cells are placed between two
select transistors and the so-constructed NAND strings are connected to the bitline in a
similar manner as the individual cells in the standard NOR arrangement. When
comparing NAND and NOR architectures, it is clear that the random access time is
much faster in the NOR-type array, as every cell is directly accessed by a bitline. In the
NAND architecture, in contrast, the serial connection of cells results in a high resistance
through which the current must ow to the bitline. The result is a much lower read
current and therefore a much slower random access. On the other hand, the NAND
arrangement has a distinct size advantage, as the contacts are shared between all cells of
the string, whereas one contact for every two cells is necessary in common ground NOR.
The program and erase mechanism used in the cell also has an impact on the
possible array architecture. For a cell programmed by channel hot electron injection,
the common ground NOR arrangement is an ideal fit, as the necessary voltages can be
precisely applied to the cell, thus minimizing any disturbance to other cells. For other
injection mechanisms, or to minimize the cell size, the common ground NOR
architecture may be modied. An overview of the possible array architectures is
shown in Figure 11.7. In NOR, as well as the above-mentioned common ground NOR architecture, a separate source line may also be used for every bitline; such an array is commonly known as an AND [32] array. Although the most compact realization uses
buried bitlines, some versions with metal bitlines are also used [34] if access time has
to be minimized rather than cell size. Finally, in such an array each pair of neighboring bit line and source line can be combined into one line. Since here the ground is defined only by the operation of the array, this architecture is referred to as a virtual ground NOR array [35]. This has the advantage of a very small cell size
(similar to NAND), and also enables a symmetrical operation of the cell, which is essential in multi-bit operation (see Section 11.4.2).
11.3
Floating-Gate Flash Concepts
11.3.1
The Floating-Gate Transistor
In essence, every Flash cell is a metal oxide semiconductor (MOS) transistor with the charge
storage layer placed in between the control gate and the channel. The drain current ID
of a MOS transistor can be expressed by the set of Equations 11.2 [36]:

\[ I_D = \beta \left( V_G - V_T - \frac{V_D}{2} \right) V_D \quad \text{for } V_G - V_T > V_D > 0 \]
\[ I_D = \frac{\beta}{2} \left( V_G - V_T \right)^2 \quad \text{for } 0 < V_G - V_T \le V_D \]
\[ I_D = 0 \quad \text{for } V_G - V_T < 0 \tag{11.2} \]

where V_G is the gate voltage, V_D is the drain voltage, V_T is the threshold voltage, and β is the transconductance. β and V_T are given by Equation 11.3:
\[ \beta = \frac{W}{L} \, C_{IS} \, \mu, \qquad V_T = \Phi_{MS} + 2\phi_B + \frac{Q_S}{C_{IS}} - \frac{Q_{IS}}{C_{IS}} \tag{11.3} \]
where W and L are the channel width and channel length of the device, C_IS is the total capacitance of the gate insulator, µ is the channel mobility, Φ_MS is the workfunction difference between the gate electrode and the channel, φ_B is the Fermi potential, Q_S is the charge in the depletion layer, and Q_IS is the total charge in the insulator normalized to the silicon/insulator interface. In principle, the stored charge in a non-volatile memory cell can be modeled by the insulator charge Q_IS. For a charge-trapping device (as introduced in Section 11.4) this holds true, without further modifications.
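The memory effect follows directly from Equations 11.2 and 11.3: stored charge shifts V_T, which changes the current read at a fixed bias. The following sketch implements the piecewise drain current; the bias and transconductance values are assumed, purely illustrative numbers.

    def drain_current(v_g: float, v_d: float, v_t: float, beta: float) -> float:
        """Piecewise MOS drain current of Equation 11.2 (volts, A/V^2)."""
        v_ov = v_g - v_t                # gate overdrive
        if v_ov <= 0:
            return 0.0                  # cut-off
        if v_d < v_ov:
            return beta * (v_ov - v_d / 2) * v_d   # triode region
        return beta / 2 * v_ov**2       # saturation

    # The same read bias gives clearly different currents for an erased
    # (low V_T) and a programmed (high V_T) cell -- assumed numbers only.
    print(drain_current(v_g=4.0, v_d=0.8, v_t=2.0, beta=1e-4))  # erased
    print(drain_current(v_g=4.0, v_d=0.8, v_t=5.0, beta=1e-4))  # programmed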
In the floating-gate device, however, the gate controlling the device is the floating gate (FG), whereas from the outside world only the control gate (CG) can be accessed. Therefore, the gate potential V_G in Equation 11.2 can only be controlled according to the capacitive coupling of the floating gate to the external terminals. The capacitive coupling of the floating gate to the externally accessible terminals is illustrated schematically in Figure 11.8; from this figure, under the assumption that the bulk and source of the device are grounded, Equation 11.4 can readily be deduced, giving the floating-gate voltage V_FG as a function of the gate and drain voltage:
\[ V_{FG} = \alpha_G \left( V_{CG} + f \, V_D \right), \qquad f = \frac{\alpha_D}{\alpha_G} = \frac{C_D}{C_C} \tag{11.4} \]

\[ \text{gate coupling: } \alpha_G = \frac{C_C}{C_T}, \qquad \text{drain coupling: } \alpha_D = \frac{C_D}{C_T} \tag{11.5} \]

\[ V_T = \frac{V_{T,FG}}{\alpha_G} - \frac{\alpha_D}{\alpha_G} V_D - \frac{Q_{FG}}{C_C} \tag{11.6} \]

where C_C is the control gate-to-floating gate capacitance, C_D the drain-to-floating gate capacitance, C_T the total capacitance of the floating gate, V_T,FG the threshold voltage referred to the floating gate, and Q_FG the charge stored on the floating gate; Equation 11.6 gives the threshold voltage as seen from the control gate.
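A short numerical sketch of Equations 11.4 to 11.6 may help. All capacitance values below are assumed round numbers; the point is only that electrons stored on the floating gate (negative Q_FG) raise the threshold voltage seen from the control gate.

    # Floating-gate coupling, Equations 11.4-11.6 (illustrative values only).
    C_C = 1.00e-15   # control gate to floating gate capacitance (F)
    C_D = 0.10e-15   # drain to floating gate capacitance (F)
    C_S = 0.05e-15   # source to floating gate capacitance (F, assumed)
    C_B = 0.35e-15   # bulk/channel to floating gate capacitance (F, assumed)

    C_T = C_C + C_D + C_S + C_B      # total floating-gate capacitance
    alpha_g = C_C / C_T              # gate coupling ratio (Eq. 11.5)
    alpha_d = C_D / C_T              # drain coupling ratio (Eq. 11.5)

    def vt_at_control_gate(v_t_fg: float, v_d: float, q_fg: float) -> float:
        """Threshold voltage seen at the control gate, Equation 11.6."""
        return v_t_fg / alpha_g - (alpha_d / alpha_g) * v_d - q_fg / C_C

    print(vt_at_control_gate(v_t_fg=1.0, v_d=0.5, q_fg=0.0))      # erased
    print(vt_at_control_gate(v_t_fg=1.0, v_d=0.5, q_fg=-2e-15))   # ~2 fC stored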
Table 11.2 Voltage conditions in read, program and erase for a typical ETOX-type cell.

            Gate          Drain       Bulk
Read        4 V           <1 V        0 V
Program     7 to 10 V     3 to 6 V    0 to -1.5 V (optional)
Erase       -6 to -8 V    float       6 to 8 V
A read disturb is caused by the fact that, even for the very low drain voltages applied in read, there is a probability of hot carriers being injected into the floating gate. Additionally, during read the applied gate voltage may cause FN tunneling into the floating gate. Both of these effects give rise to an unwanted programming of a cell during read. During
programming, all cells connected to the same bitline see a drain disturb, while all cells connected to the same wordline see a gate disturb (see Figure 11.10).

Figure 11.10 Gate and drain disturb in a common ground NOR-type Flash memory array.

In NAND, only one concept exists, as illustrated in Figure 11.7. As with the NOR-type
memory, the cell is a stacked gate but, due to the serial connection of cells, hot electron
programming is not practical. The cells are therefore programmed and erased by FN tunneling between the channel and the floating gate. A schematic overview of a typical
NAND string, including cell cross-sections taken from a 60-nm cell, is shown in
Figure 11.11 [38]. Today, 32 cells are connected between two selects; this number has
increased from eight cells via 16 cells, and may increase to 64 cells in the future [59].
The ground select connects the string to a sourceline, while the bitline select connects
each string to a metal bitline.
In the wordline direction the cell resembles the ETOX cell described earlier. In the
bitline direction, the cell is much denser due to the lack of contacts, as well as to the channel length, which can be scaled down to the minimum feature size because the cell only has to isolate very low voltages. This becomes clear from
Figure 11.12, where the main operations (read, write and erase) are explained in more detail. Erase is achieved simply by applying a sufficiently high voltage for FN tunneling to the well and applying 0 V to all the wordlines in the sector that must be erased, while leaving the wordlines of the non-erased sectors floating.
In the read operation a voltage higher than the highest VT of a programmed cell
must be applied to all the non-selected wordlines. In the erased state, the threshold
voltage of the cells is chosen to be negative. Therefore, 0 V can be used on the
wordlines that must be read. If a voltage of about 1 V is applied to the bitline,
the selected cell will conduct if it is in the erased state and will be below threshold
in the programmed state. For writing, a voltage high enough for FN tunneling (e.g., 15-20 V) is applied to the selected wordline. The channel of the selected cell is
set to 0 V by applying 0 V to the selected bitline, turning the bitline selector on, and
applying a high enough voltage to all the other wordlines; this will allow all other cells
in the string to be turned on.
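The wordline bias pattern for such a read can be summarized in a few lines. The pass voltage below is an assumed typical value (it must only exceed the highest programmed V_T), not a number taken from the text.

    # Minimal sketch of NAND read biasing: the selected wordline is at 0 V
    # (erased cells have negative V_T and conduct), all others at a pass
    # voltage. V_PASS is an assumed value.

    V_READ = 0.0
    V_PASS = 4.5

    def read_biases(selected_wl, cells_per_string=32):
        """Return the bias applied to every wordline of the string."""
        return [V_READ if wl == selected_wl else V_PASS
                for wl in range(cells_per_string)]

    print(read_biases(selected_wl=7)[:10])  # wordline 7 at 0 V, rest at V_PASS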
Care must be taken to avoid programming of the cells on the same wordlines. In
early NAND Flash implementations a voltage of about 7 V was applied to the
unselected bitlines and transferred to the channel of the cells on the selected
wordline [39]. This, however, had two significant drawbacks. First, the junctions of
the cells had to withstand the high voltage applied to the unselected bitlines, which
restricts the scalability of the cell. Second, all the unselected bitlines had to be charged
to a high voltage during programming. When the supply voltage was reduced from 5
to 3.3 V a new solution was implemented [40]. In that solution the bitline selects of the unselected strings are switched off by applying VDD to the unselected bitlines. The
channel of the inhibited cell will then raise its potential by capacitive coupling via
the tunnel oxide capacitance CTunnel and the ONO capacitance CONO to the high
voltage applied to the wordline. A high voltage applied to the passing wordlines will
further help to raise the channel potential in the disturbed cell. Now, the programming voltage and the voltage applied to the passing wordlines must be optimized to
minimize the disturb effect. Examples for the 120 and 90 nm technology generations can be found in Ref. [41].
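A minimal sketch of this self-boost mechanism is given below, under strongly simplifying assumptions: the channel is treated as a pure capacitive divider, and all relative capacitance values are invented for illustration.

    # Self-boost program inhibit: the floating channel rises by capacitive
    # coupling to the wordline. All values are assumed, schematic numbers.

    C_ONO = 1.0    # relative inter-gate (ONO) capacitance
    C_TUN = 3.0    # relative tunnel oxide capacitance
    C_DEP = 2.0    # relative channel depletion/junction capacitance

    # Gate-to-channel capacitance: series combination of C_ONO and C_TUN.
    c_gate_channel = 1.0 / (1.0 / C_ONO + 1.0 / C_TUN)
    boost_ratio = c_gate_channel / (c_gate_channel + C_DEP)

    V_PGM = 18.0   # assumed programming voltage on the selected wordline
    v_channel = boost_ratio * V_PGM

    print(f"boost ratio: {boost_ratio:.2f}")
    print(f"inhibited channel potential: {v_channel:.1f} V")
    print(f"voltage left across the gate stack: {V_PGM - v_channel:.1f} V")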
A large number of floating-gate concepts other than standard NOR or NAND have been proposed, and some of these are or were in production. Good overviews can be found in Refs. [42, 43]. Embedded Flash memories have somewhat different requirements than standalone memories. Normally, much smaller memory densities are required than in standalone memories, and therefore the cell size is not as important; rather, the size of the complete memory module, including charge pumps, decoding, and so on, must be minimized. This results in a high incentive to
minimize the voltage requirements. Besides, every application focuses on different
requirements, such as very fast random access, high endurance, high reliability, or low power, and therefore a large number of concepts coexist. In general, the ETOX is a good fit to many requirements, as long as the power consumption during programming can be tolerated. Among the large number of different concepts (some of these are referred to in Table 11.1), the field-enhancing tunneling injector cell [45, 47] is very popular and has been adopted by many foundries. Here, source side injection is used for the programming.

Figure 11.12 Read, write and erase operations of a NAND memory array.
11.3.4
Reliability Aspects of Floating-Gate Flash
The most severe issue in scaling down floating-gate devices is the non-scalability of the tunnel oxide and the interpoly dielectric. In order to achieve non-volatile retention, the tunneling dielectric must be at least 6 nm thick [50], although in practical memories thicknesses of 8 to 10 nm are used, based on the concrete reliability specification. This margin allows the covering of extrinsic effects such as moving bits. Scaling of the oxide-nitride-oxide (ONO) layer used as the interpoly dielectric is limited to an electrically effective thickness of about 13 nm due to retention and V_T stability constraints [51].
Further scaling of the tunneling oxide can only be obtained by radically re-engineering the tunnel barrier. Materials with a higher dielectric constant (e.g., HfO2, ZrO2) that are currently being investigated for logic transistors can help. Crested
barriers [52] could further improve the basic memory cell by increasing the ratio
between the on and off current, leading to much faster write times as well as lower
programming voltages. The principle of such an approach is shown schematically in
Figure 11.15. The triangular shape of the barrier maintains the maximum barrier
height if no voltage is applied (retention case), but drastically reduces the effective
barrier in case of an applied voltage (programming or erase case). As a crested barrier
is not achievable with those materials that have the required barrier heights, a
staircase approximation using three layers with different band offsets, as well as
different dielectric constants, is a reasonable approach. In the optimum structure the
center layer would have a high band offset and a high dielectric constant, while the
surrounding layer has lower band offset as well as a lower dielectric constant
(Figure 11.15c). In most materials, however, a high band offset is correlated with a low dielectric constant and vice versa, making the optimum choice very difficult. A stack consisting of Si3N4/Al2O3/Si3N4 could be a reasonable and producible solution.
The reduced spacing between floating gates will also lead to higher capacitive coupling between floating gates, and result in severe crosstalk between cells [55]. This calls for a material with a lower dielectric constant between the floating gates, as
shown in Figure 11.16b. This must also be implemented in the area between the word
lines. Replacement of the silicon nitride spacer of the cell transistor with a silicon
dioxide spacer (as shown in Ref. [58]) may help to significantly reduce the effect, but in the long term real low-k materials will be necessary. The recently demonstrated air gaps between the floating gates may represent an ultimate solution [59].
Although the scaling challenges presented so far are valid for all types of floating-gate Flash memory devices, in the NOR-type architecture two more limitations must be considered [60]. First, a contact is required for every two cells, and this results in a significant area overhead. The overhead can be minimized by using a contact that is self-aligned to the control gate [61], or by using a virtual ground NOR array rather than a common ground NOR array [62]. The second limitation is that the channel length scaling is limited by the high voltages required during channel hot electron programming. Ultimately, a vertical device may be required to overcome this issue [63].
Instead of reducing the feature size, the cost reduction and density increase of a Flash memory can also be achieved by increasing the number of bits stored on the same surface area, rather than reducing the size of a physical cell. As the charge storage is analogue in nature, more than two levels can be stored on one floating gate. To code n bits, 2^n levels are required. Figure 11.17 illustrates the corresponding V_T distributions for both NOR and NAND devices. For NOR devices, the multi-level approach was introduced to the market back in 1997 [64], and today it is also widely used in NAND devices.
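The level arithmetic can be stated in a few lines of code. The sketch below assumes an arbitrary V_T window and evenly spaced nominal levels; real products place and shape the distributions far more carefully.

    # Multi-level cell arithmetic: n bits per cell require 2**n V_T levels.
    # The window limits are assumed, illustrative values.

    def vt_levels(n_bits, vt_min=-2.0, vt_max=4.0):
        """Evenly spaced nominal V_T targets for an n-bit cell (a sketch)."""
        n_levels = 2 ** n_bits
        step = (vt_max - vt_min) / (n_levels - 1)
        return [vt_min + i * step for i in range(n_levels)]

    print(vt_levels(1))   # 2 levels: classical single-bit cell
    print(vt_levels(2))   # 4 levels: 2 bits per cell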
11.4
Charge-Trapping Flash
Instead of using a floating gate to store the charge, an insulator with a high density of traps (a so-called charge-trapping layer) may also be used. Although this approach enjoyed some success during the 1970s and 1980s for EPROM and EEPROM memories [67, 68], in Flash memories the floating gate, in its ETOX and NAND versions, became dominant during the 1990s. The charge-trapping concept has some interesting advantages over floating-gate devices. As the charge is highly localized, no capacitive coupling effects (as described in Section 11.3.1) need to be considered. The cell can be described by Equations 11.2 and 11.3 with the storage charge included in Q_IS, which means that there is no drain turn-on effect. The localization of charge also makes the cell less sensitive to any local defects responsible for erratic and moving bits in floating-gate cells. The coupling between
cells is also much less pronounced. A few years ago the localization was utilized to
store two physically separated bits in the same memory cell to create a multi-bit
cell [69] (see Section 11.4.2), and this led to a revival of charge-trapping devices in
the Flash world.
11.4.1
SONOS
reduction of hole tunneling from the channel to the trapping layer. Finally, a steady
state of the two processes and therefore a saturation of the erase will occur.
Due to the asymmetry of the two interfaces, the increase in erase voltage will lead to
a strong increase in the injection from the gate, and hence to an even higher
saturation level. This can be seen in the lower right-hand graph in Figure 11.18. In order to increase the erase speed, it is possible to increase the field over the bottom oxide, to reduce the field over the top oxide, or to increase the barrier for electrons at the control gate-top oxide interface. Although the simplest choice is to reduce the bottom oxide thickness, which will lead to a significantly increased field over the bottom oxide, it will also degrade the retention properties. For many years, therefore,
the SONOS development was caught in the trade-off between bad retention and slow
erase.
The increase in the barrier for electron injection from the gate electrode is very effective, as can be seen in Figure 11.18 (lower right), where n-doped gates and p-doped gates are compared. However, this alone is insufficient to achieve an erase performance that will be fast enough for a data Flash memory, yet still be secure after 10 years of retention. The field over the top oxide can be reduced by replacing the silicon dioxide with a high-k material such as Al2O3 [71]. Indeed, recently the combination of an Al2O3 top oxide and a high-workfunction TaN gate electrode was implemented in order to achieve NAND Flash-compatible performance on a 63 nm demonstrator using a bottom oxide as thick as 4 nm [72]. The structure used is referred to as TANOS (TaN/Al2O3/Si3N4/SiO2/Si). The cross-section of the stack used, as well as the erase curves demonstrating a memory window of 5 V, are shown in Figure 11.19. This device represents a very
promising candidate for NAND Flash memories with sub-40 nm ground rules.
11.4.2
Multi-Bit Charge Trapping
placed in a very narrow region which is self-aligned to the drain junction of the device.
By interchanging the source and drain of such a device, two physically separated bits
can be stored (see Figure 11.20a and Ref. [69]). In order to read the bits separately
from one another, the source and drain must be interchanged compared to the programming conditions, and a drain voltage must be applied that is large enough to punch through the region below the charge stored above the drain junction used in read
(Figure 11.20c and d). As the charge is localized in a region close to the drain
junction during programming, hot holes generated by band-to-band tunneling can
be used for erase to compensate the stored charge (Figure 11.20b; see also Figure 11.7
and Ref. [13]). This allows for a sufficiently thick bottom oxide layer so as to avoid vertical charge loss and circumvent the erase saturation issue described in
Section 11.4.1.
This device is implemented under different trade names such as NROM [13],
MirrorBIT [73], NBit [74] and Twin Flash [75], both in code and data Flash products.
In order to operate the multi-bit charge-trapping cell described above, it is essential
to have an architecture where no difference between bitlines and sourceline exists.
The virtual ground NOR array (see Figure 11.7) is therefore the natural choice. This can be constructed by structuring an ONO layer, implanting the bitlines through the openings, growing an oxide for isolation, and finally forming the wordlines perpendicular to the bitlines. This method was used in the first-generation systems (see Figure 11.21a), but it has the drawback of a high thermal budget that must be applied to the bitline implants.
Although the multi-bit charge-trapping type of memory cell has all the advantages
described in Section 11.4.1, plus the inherent two bit per cell operation, there remain
two challenges that must be considered. First, the unique mechanism of localized
charge storage makes an understanding of the reliability-governing factors more
complex. On an empirical basis, all effects that are necessary to create a reliable
product are well understood and under control [76]. The physical basis for the observed results, however, remains the subject of debate among the scientific
community. In principle, two effects may occur: First, the injection of hot holes
during erase may damage the bottom oxide, leading to traps that can cause a vertical
loss of the stored electrons [77]. Second, due to the fact that in programming and
erase two localized mechanisms are used that will not be totally aligned to each other,
a dipole will be created. This dipole may lead to lateral charge movement and
therefore a change in VT. In practice, however, both mechanisms may be involved,
leading to a well-behaved and predictably reliable unit [78].
A number of modications of the above-described multi-bit charge-trapping cell
have been reported. For example, by adding an assist gate, programming can be carried out using source side injection, which significantly reduces the cell current
during programming. Examples of multi-bit charge-trapping cells using source side
injection may be found in Refs. [82-84]. Another interesting variant is created by
programming the cell with hot holes rather than hot electrons. In that case, the erase
may be performed by FN tunneling either to the channel or to the gate [85, 86]. This
concept, which is referred to as PHINES (programming by hot-hole injection nitride
electron storage), has one main drawback in that programming of the second bit on
the same junction must be avoided. However, the same basic cell can also be used in a
NAND-type architecture [87].
11.4.3
Scaling of Charge-Trapping Flash
When further scaling down the planar type of SONOS cell, the cell properties will suffer from low gate control, low read current, and a small number of electrons. In principle, the options of tailoring the tunneling barrier described for floating-gate device scaling can also be applied to the bottom oxide of a charge-trapping cell. Charge trapping, however, also enables another scaling path, by utilizing a FinFET device [88]. In principle, this could also be achieved with a floating-gate device, but two problems are encountered: First, the stack of tunnel oxide, floating gate and interpoly dielectric is too thick to fit the space between two neighboring cells; and second, the high coupling of the floating gate to the channel in a FinFET device will cause a deterioration in the gate coupling ratio [see Equation 11.5]. In a charge-trapping device the implementation of a FinFET device is straightforward; the general concept, as well as the excellent programming and erase curves that may be achieved with devices as short as 20 nm [89], are illustrated in Figure 11.22.
In order to further increase the memory density, a 3-D NAND-type memory would be very beneficial. Although the straightforward approach here would be to use thin-film transistors, charge trapping is again the favored solution, as it is much easier to integrate in a stacked manner compared to a floating-gate device. The first results of a thin-film transistor-based charge-trapping memory cell have recently been published by Walker and colleagues [90].
For the multi-bit charge-trapping memory cell it is essential to scale down the channel length, and indeed some excellent cell properties have been demonstrated down to the 60 nm generation with the type of cell shown in Figure 11.21b [75, 81]. For further scaling down, a cell which uses structured nitride areas to store the charge was proposed. This has the advantage of controlling the cross-talk between bits, as well as being able to use thinner gate dielectrics between the ONO layers [91, 92]. A more radical approach that utilizes the third dimension by adopting a U-shaped device [93] is shown in Figure 11.23. Another approach to increase the storage density and reduce the area usage per bit is to combine the multi-bit concept with a multi-level approach. As a result, with four levels on each side of the cell (a total of eight levels), 4 bits can be stored [94], whereas by comparison a floating-gate cell requires 16 levels to store 4 bits (see Section 11.3.5).
11.5
Nanocrystal Flash Memories
a small nanocrystal embedded into a thin oxide layer is placed below the larger
nanocrystal which actually carries the stored charge.
In most cases, either a low-pressure chemical vapor deposition (LPCVD) from
SiH4 [104] or ion-implantation with subsequent thermal treatment [105] are used to
fabricate nanocrystals. Although other techniques have shown promise [106],
LPCVD and ion implantation are the easiest procedures for integration into a
standard CMOS process. In both cases, the nanocrystal formation is a statistical
process, leading to controllability issues in scaled-down devices [107]. Methods for the controlled fabrication of nanocrystal size and distance by using templates or self-organization would, therefore, significantly improve the outcome [108]. When further scaling down the nanocrystal device, this path may in time lead to a single-electron memory [109].
11.6
Summary and Outlook
The trend towards mobile electronic devices has created and continues to create a
rapidly increasing demand for non-volatile memories. Today, Flash memories
represent the best solution for most of these applications, where code Flash applications are typically covered by NOR Flash devices and data Flash applications
by NAND Flash devices. Currently, the floating-gate transistor is seen as the workhorse of those cell devices used in many of today's technologies. Indeed, floating-gate technology shows a scaling potential for further generations if innovations such as high-k coupling dielectrics or low-k isolation oxides can be mastered. By contrast, charge-trapping devices are possibly due to make a return, with multi-bit charge-trapping having recently emerged in a number of applications. In fact, a modified version of the classical SONOS device, programmed and erased by tunneling, may replace the floating-gate transistor in future generations of NAND Flash. Nanocrystals represent another option to replace the floating gate, although at present the challenges that they face seem much more severe than for the charge-trapping case. However, in the long term this development may lead to a single-electron device.
Unfortunately, Flash-type memories based on charge storage in either floating gates or charge-trapping layers still suffer from important drawbacks, including
limited endurance, slow write/erase, and no direct overwrite. Hence, for many
years research groups have sought new storage mechanisms that could supply a
non-volatile memory without such shortcomings. To achieve this goal, new
materials with innovative switching effects must be integrated into the CMOS flow [110, 111]. Although these technologies are beyond the scope of this chapter, it is important to note that although they may have distinct advantages over Flash memories, and indeed some have now reached the production stage (see Chapters 13 to 16), the scaling of Flash memories has to date been much more successful. Such scaling possibilities provide Flash with a major competitive
advantage in terms of system cost.
References
1 Niebel, A. (2004) Proceedings of the 20th
Nonvolatile Semiconductor Memory
Workshop, p. 14.
2 Harari, E., Schmitz, L., Troutman, B. and
Wang, S. (1978) ISSCC Digest of Technical
Papers, 108.
3 Mikolajick, T. et al. (2001) Microelectronics
Reliability, 7, 947.
4 Yoshikawa, K. (1999) VLSI Symposium on
Technology, Systems and Applications,
p. 183.
5 Kahng, D. and Sze, S.M. (1967) BELL
System Technical Journal, 46, 1288.
6 Wegener, H.A.R. et al. (1967) IEDM Digest
of Technical Papers, 70.
7 Frohman-Bentchkowsky, D. (1971)
ISSCC Digest of Technical Papers, 80.
8 Cricchi, J.R., Blaha, F.C. and Fitzpatrick,
M.D. (1974) IEDM Digest of Technical
Papers, 204.
9 Johnson, W. et al. (1980) ISSCC Digest of
Technical Papers, 152.
10 Masuoka, F. et al. (1984) IEDM Digest of
Technical Papers, 464.
11 Kynett, V.N. et al. (1988) ISSCC Digest of
Technical Papers, 132.
12 Shirota, R. et al. (1988) Symposium on
VLSI Technology, 33.
13 Chan, T.Y., Young, K.K. and Hu, C. (1987)
IEEE Electron Device Letters, 8, 93.
14 Eitan, B. et al. (1999) Proceedings SSDM,
522.
15 Keeney, S. (2001) IEDM Digest of Technical
Papers, 41.
16 Onoda, H. et al. (1992) IEDM Digest of
Technical Papers, 599.
17 Hisamune, Y.S. et al. (1993) IEDM Digest
of Technical Papers, 19.
18 Sze, S.M. (1981) Physics of Semiconductor
Devices, John Wiley & Sons, New York, p. 397.
19 Bude, J.D. et al. (1997) IEDM Digest of
Technical Papers, 279.
20 Mahapatra, S., Shukuri, S. and, Bude, J.D.
(2002) IEEE Transactions on Electron
Devices, 7, 1296.
83 Ogura, T. et al. (2003) Symposium on VLSI
Technology, 207.
84 Tomiye, H. et al. (2002) Symposium on
VLSI Technology, 206.
85 Yeh, C.C. et al. (2002) IEDM Digest of
Technical Papers, 931.
86 Yeh, C.C. et al. (2005) IEEE Transactions on
Electron Devices, 52, 541.
87 Yeh, C.C. et al. (2006) Proceedings of the
21st Nonvolatile Semiconductor Memory
Workshop, p. 76.
88 Hisamoto, D. et al. (2000) IEEE
Transactions on Electron Devices, 47,
2320.
89 Specht, M. et al. (2004) IEDM Digest of
Technical Papers, 1083.
90 Walker, A.J. et al. (2003) Symposium on
VLSI Technology, 29.
91 Lee, Y.K. (2004) Proceedings of the 20th
Nonvolatile Semiconductor Memory
Workshop, p. 96.
92 Choi, B.Y. (2006) Proceedings of the 21st
Nonvolatile Semiconductor Memory
Workshop, p. 72.
93 Willer, J. et al. (2003) Proceedings of the 19th
Nonvolatile Semiconductor Memory
Workshop, p. 42.
94 Eitan, B. (2005) IEDM Digest of Technical
Papers, 22.1.1.
95 Tiwari, S. (1996) Applied Physics Letters, 68,
1377.
96 Bostedt, C. et al. (2004) Applied Physics
Letters, 84, 4056.
12
Dynamic Random Access Memory
Fumio Horiguchi
12.1
DRAM Basic Operation
Dynamic random access memories (DRAMs) use the charge stored in a capacitor to
represent binary digital data values. They are called dynamic because the stored
charge leaks away after several seconds, even with power continuously applied.
Therefore, the cells must be read and refreshed at periodic intervals. Despite this complex operating principle, their advantages of small cell size and high density have made
DRAMs the most widely used semiconductor memories in commercial applications.
In 1970, the three-transistor cell used for the 1 kbit DRAM was first reported [1], and the one-transistor (1T-1C) cell became the standard in 4 kbit DRAMs [2]. During the following years, the density of DRAMs increased exponentially, with rapid improvements to the cell design, its supporting circuit technologies, and fine patterning techniques.
The equivalent circuit of the 1T-1C DRAM cell is shown in Figure 12.1. The array
transistor acts as a switch and is addressed by the word line (WL), which controls the
gate. The storage capacitor, CS, represents the charge storage element containing the
information and is connected to the bit line, BL, via the array transistor. When the array transistor switch is closed, the voltage level +VDD/2 or -VDD/2 is applied to CS via the bit line. The corresponding charge on CS represents the binary information, 1 or 0. After this write pulse, the capacitor is disconnected by opening the array transistor switch.
The memory state is read by turning on the array transistor and sensing the charge
on the capacitor via the bit line, which is precharged to VDD/2 (where VDD is the power
supply voltage). The cell charge is redistributed between the cell capacitance, CS, and
the bit line capacitance, CB, leading to a voltage change in the bit line. This voltage
change is detected by the sense amplier in the bit line and amplied to drive the
input/output lines. Because a read pulse destroys the charge state of the capacitor, it
must be followed by a rewrite pulse to maintain the stored information. The plate, PL,
is kept at VDD/2 to reduce the electric voltage stress on the capacitor dielectric, which
is charged to VDD/2 or VDD/2 instead of being discharged to 0 Vand charged to the
full power supply voltage, VDD.
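The available read signal follows from simple charge sharing: with the bit line precharged to V_DD/2, the voltage change is dV = (V_DD/2) * C_S / (C_S + C_B). The sketch below evaluates this with assumed typical values for the capacitances and the supply voltage.

    # Bit-line signal of a 1T-1C DRAM cell from charge sharing.
    # All numbers are assumed typical values, not from a specific design.

    C_S = 40e-15    # storage capacitance, ~40 fF
    C_B = 100e-15   # assumed bit-line capacitance
    V_DD = 1.8      # assumed supply voltage

    dv = (V_DD / 2) * C_S / (C_S + C_B)
    print(f"bit-line swing at the sense amplifier: {dv * 1000:.0f} mV")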
Figure 12.1 DRAM memory cell equivalent circuit. See text for details.
DRAM has required an almost constant storage capacitance (more than 40 fF, i.e., 4 × 10^-14 F) across the generations, despite the scaling to smaller cell sizes, and this
is the reason for the requirement of three-dimensional (3-D) structures such as
trench or stacked capacitor. A trench capacitor uses the inner surface of a Si hole to
store charge, while a stacked capacitor uses a poly Si capacitor above the array
transistor and bit line (see Figure 12.2).
12.2
Advanced DRAM Technology Requirements
Historically, the cost of DRAM has been forced to decrease to retain its share in
the huge and competitive market for high-density memory. This has resulted in a
decrease in DRAM cell size because the chip cost is directly related to the cell area.
Thus, every part of the cell is required to be as small as possible, and the cell size must
be less than or equal to 8 F^2 (where F is the feature size). A summary of the technological requirements for a DRAM memory cell is provided in Table 12.1. The most critical part in the cell shrinkage is the capacitor; thus, a 3-D structure such as a trench or stacked capacitor has been adopted to retain sufficient capacitor area within a limited space.
As the DRAM cell size shrinks to sub-100 nm, it becomes critically important to realize a sufficient on-off-current ratio in the array transistor. In general, the scaling
approach implies that the transistor sizes L and W, the gate oxide thickness Tox, the
supply voltage, and the threshold voltage Vth should be reduced by a factor of 1/k
(k: scale factor), and that channel doping should be increased by a factor of k in order
to sustain or improve the transistor performance. In a DRAM cell, the charge must be
stored in storage capacitors; therefore, an extremely low off-current in the array
transistor is required for data retention. Thus, Vth cannot be made small, in order to keep the channel leakage current low, and the supply voltage cannot be minimized if sufficient charge is to be written into the capacitor. The gate oxide thickness also cannot be reduced aggressively if a sufficient breakdown voltage of the gate dielectrics is to be maintained. All this means that the array transistor cannot be scaled down in a
conventional manner. On the other hand, a sufcient on-current in the array
transistor is required for fast writing characteristics. This suggests a short L and
large W, but scaling difficulties prevent the reduction of L and the cell size limits W. Thus, maintaining sufficient on-current in the array transistor is difficult and, moreover, increasing channel doping degrades the channel mobility, which further
decreases the on-current in the array transistor.
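For reference, the conventional scaling rule quoted above can be written out as a small sketch; the starting parameter set is an invented example, not data for a real process, and (as just argued) the DRAM array transistor cannot actually follow this rule.

    # Conventional 1/k scaling: dimensions and voltages shrink by 1/k,
    # channel doping grows by k. Parameter values are assumed examples.

    def scale(params, k):
        """Apply the conventional scaling rule to one parameter set."""
        return {
            "L_nm": params["L_nm"] / k,
            "W_nm": params["W_nm"] / k,
            "Tox_nm": params["Tox_nm"] / k,
            "Vdd_V": params["Vdd_V"] / k,
            "Vth_V": params["Vth_V"] / k,
            "doping_cm3": params["doping_cm3"] * k,
        }

    gen = {"L_nm": 100, "W_nm": 120, "Tox_nm": 5.0,
           "Vdd_V": 2.5, "Vth_V": 0.6, "doping_cm3": 1e18}
    print(scale(gen, k=1.4))  # one hypothetical generation step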
In view of these considerations, a different structural approach for array transistors
in DRAMs is necessary.
12.3
Capacitor Technologies
The first generation of DRAMs used a planar storage capacitor for memories of up to
1 Mbit. However, from the 4 Mbit generation onwards, trench or stacked capacitors
were used for maintaining the same storage capacitance within a limited cell area. A
comparison between the stacked capacitor cell and the trench capacitor cell is shown
in Table 12.2, which covers the complexity of capacitor formation, the reduction of memory cell parasitics, the shrinkability of the memory cell, the compatibility with logic process integration, logic device characteristics and logic layout design rules, and the number of additional mask steps at the 130 nm node. The differences arise mainly from the transistor formation process
compared with the capacitor formation process. The array transistor in a stacked capacitor cell is formed before the capacitor; thus, the transistor source/drain junctions can easily be extended by the thermal processing applied after the stacked capacitor is fabricated, which results in a degradation in transistor performance. Figures 12.2 to 12.4 show
examples of trench and stacked capacitor cells, where each cell has high-aspect-ratio
storage nodes under (trench) and over (stacked) the array transistor [3, 4].
Figure 12.5 shows the relationship between CS and the trench diameter. CS becomes
smaller than 40 fF for a design rule of less than 90 nm because of the smaller capacitor
area. To overcome the capacitor area reduction, hemispherical grains (HSG) [5, 6] can
be used to enhance the surface area of the storage node in a deep trench cell (see
Figure 12.6). HSG technology is an approach more typically used in the stacked
capacitor cell. Another technique to enhance the capacitance is to use high-k dielectric materials such as Al2O3 or Ta2O5, where k denotes the dielectric permittivity. An
example of a trench capacitor with a high-k Al2O3 dielectric is shown in Figure 12.7 [7].
The capacitance is increased by more than 30% compared with the standard nitride-oxide (NO) dielectric, which must be scaled to 1-2 nm thickness in a typical 65-nm process. As capacitor dielectrics are scaled, the leakage current increases (1.2 nm SiO2 consists of only five atomic layers). Consequently, high-k dielectrics have been
Figure 12.5 The relationship between capacitance (CS) and trench diameter (DT).
proposed as alternatives for capacitor dielectrics to address the leakage problem. The
most common high-k capacitor dielectrics used for gate dielectrics are Al2O3, Ta2O5
and hafnium-based dielectrics (e.g., HfO2, HfSixOy). However, high-k gate dielectrics are not compatible with the polysilicon gate electrodes commonly used in today's integrated circuit technology. Their combination leads to threshold voltage uncontrollability and on-current reduction. Thus, metal gate electrodes with appropriate work functions must be used. Despite this, the implementation of high-k dielectrics and metal gate electrodes into complementary metal oxide semiconductor (CMOS) technology is difficult, and involves many technical issues such as deposition methods, dielectric reliability, charge trapping and interface quality. For
capacitor dielectrics, it is much easier to implement high-k dielectrics because the
threshold voltage shift induced by the charge trapping and the interface quality do not
affect the capacitor characteristics compared with their effect on gate dielectrics.
Thus, Al2O3 or Ta2O5 have been used as capacitor high-k dielectrics for a few DRAM
products. For future DRAMs, high-k dielectrics will most likely be used not only for
capacitor dielectrics but also for peripheral transistor gate dielectrics to overcome
scaling problems.
12.4
Array Transistor Technologies
The recess-channel-array transistor (RCAT) [8] is used to reduce the electric field near the drain to achieve a long data retention time in a stacked capacitor cell. Figure 12.8 shows the RCAT structure, which increases the effective gate length of the array transistor and mitigates the short-channel effect without increasing area. Usually, channel doping enhances the electric field near the drain and degrades the data retention characteristics because of the increased drain leakage current. To overcome this effect, the RCAT is used to reduce the electric field by separating the channel from the drain using an engraved channel region. This is effective in reducing the electric field and short-channel effects; however, the longer channel results in a small on-current in the array transistor. Thus, the RCAT is not suitable for high-speed writing.
The RCAT can be used down to around the 50 nm node, but below this another 3-D
approach will be needed to satisfy the requirement of current drivability and to reduce
the short-channel effect.
Figure 12.9 shows the on-current (Ion) trend of array transistors. The Ion decreases
with transistor size in accordance with the design rule scaling; this in turn increases
the signal delay in data sensing on the bit line (see Figure 12.10), in the case of reading
a 1. When Ion is small, the signal appears on the bit line with a delay and approaches
the 1 target level slowly.
To overcome these constraints in the array field-effect transistor (FET), a trench isolated transistor using sidewall gates (TIS) or a fin-array-FET can be adopted to improve the transistor performance, as in the case of silicon-on-insulator (SOI) transistors [9-12]. Figure 12.11 shows a bird's-eye view of a TIS-array FET, where the TIS gate structure, which consists of a top gate and a sidewall gate, enables a high on-current and a low off-current simultaneously because of the double-gate structure and high gate controllability. Figure 12.12 shows the Tfin (the width of the fin) dependence of the minimum gate length (Lg). A thinner Tfin would be expected to result in a marked reduction of off-current, which means that the TIS gate structure is very suitable for array transistors.
In the TIS structure, the fin substrate is fully depleted and the double side gates contribute to the potential of each side channel. The subthreshold swing of the TIS
transistor is smaller than that of the conventional planar transistor because of the
strong effect of the sidewall gates. Thus, a small gate voltage difference can rapidly
change the drain current from a small off-current to a large on-current. Moreover, the
constant threshold voltage characteristics without a back-gate bias effect contribute to
the large on-off-current ratio.
A more advanced array transistor is the vertical transistor, in which the source, gate
and drain are arranged vertically. There are two types of vertical transistors. One uses
the inner sidewall of the trench hole, while the other uses the outer sidewall of a
silicon pillar for the channel. The former is suitable for trench capacitor cells [13],
while the latter is known as a surrounding gate transistor (SGT) [14, 15]. The gate
electrode of the SGT surrounds a pillar of silicon, and the gate length of the SGT is
Figure 12.12 The dependence of the minimum gate length (Lg) on the fin width (Tfin).
adjusted by the pillar height, as shown in Figure 12.13. Therefore, the SGT has the
merits of short-channel-effect immunity and superior current drivability resulting
from the excellent gate controllability.
Planar array transistors cannot easily be scaled down (as noted above), and the TIS
has good on-off-current ratio characteristics. The vertical transistor is different from
the planar type in that the channel length is defined by the depth of the hole or the
height of the pillar. Thus, the gate length is free from the minimum design rule and
the cell area limitations, and can be selected to be sufciently large so as to avoid the
short-channel effect. Similar to the trench-type capacitor, the vertical transistor and
the capacitor are formed in the same hole, and this contributes to the small cell size of
less than 6 F2. In the SGT cell, it is more difcult to form the capacitor and array
transistor, although it has ideal array transistor characteristics. The SGT substrate is
fully depleted and the surrounding gate contributes to the potential of the pillar
surface channel. The subthreshold swing of the SGT cell is smaller than that of the
conventional planar transistor and the TIS because of the stronger effect of the
surrounding gate. Also, surrounding gate structures contribute to the large width of
the transistor by using the entire perimeter of the pillar. Thus, a large on-off-current
ratio can be attained without a back-gate bias effect, and the SGT can be used for 4 F^2 small-cell-size DRAMs.
DRAM scaling will continue to enable the integration of many advanced technologies in view of the huge size of the DRAM market. Thus, these advanced technologies
will be used in future-generation DRAMs.
For the array transistor, the TIS/fin-type structure is expected to be adopted, using p+ poly for obtaining a suitable threshold voltage with low channel doping. For the capacitor, a high-k dielectric (e.g., barium strontium titanate, BST) may be used in future DRAMs. For the peripheral transistor, mobility enhancement technologies such as the use of SiGe or a liner strain technique and high-k gate dielectrics will be adopted to achieve large drivability for high-speed operation. A 4 F^2 cell layout is also expected.
12.5
Capacitorless DRAM (Floating Body Cell)
The difficulties of DRAM integration are mainly attributable to the necessity for constant capacitance, even when the cell size is reduced. For this reason, the integration of capacitors is very complicated for trench or stacked capacitors. The floating body cell (FBC) is a new concept of a DRAM without a capacitor. Because the cell is composed of one transistor, the FBC has a simple and compact structure.
Figure 12.14 shows the principle of the FBC, which involves the storage of the signal charge in the body of the cell transistor. To write 1, VWL is biased to 1.5 V and VBL to 2 V, so that the body potential (Vbody) is increased by the holes that accumulate by impact ionization. To write 0, VWL is biased to 1.5 V and VBL to -1.5 V, so that Vbody is decreased by ejecting holes from the body. The body potential difference (ΔVbody) is stored by setting VWL to -1.5 V and VBL to 0 V. In order to read the stored data, VBL is biased to 0.2 V and VWL is swept up to a certain level, while the bit line current (Iread) is measured. The Iread-VWL characteristics are shown in Figure 12.15. The threshold voltage difference between a 0 cell and a 1 cell (ΔVT), which is an index of the data reading margin, is about 0.32 V. In order to increase ΔVT or ΔIread, CS
(the body capacitance for data storage) plays an important role, because the ΔVbody of the hold state is reduced by WL-body and BL-body capacitance coupling. A back gate is used to enable charge accumulation in the body [16-18], and also to increase CS, which stabilizes the body potential. The structure of the FBC has a modified double-gate configuration. A transmission electron microscopy cross-section of a fully depleted FBC, with thin SOI and BOX layers, is shown in Figure 12.16.
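The bias conditions described above can be collected compactly; the short sketch below merely restates the voltages from the text (the read wordline is swept rather than held at a fixed level), with the table layout itself being an illustration.

    # FBC operating conditions as described in the text.
    FBC_BIAS = {
        # operation   (V_WL,  V_BL)
        "write_1": (1.5, 2.0),    # impact ionization accumulates body holes
        "write_0": (1.5, -1.5),   # forward-biased junction ejects holes
        "hold":    (-1.5, 0.0),   # body potential difference is retained
        "read":    (None, 0.2),   # V_WL is swept while I_read is measured
    }

    for op, (v_wl, v_bl) in FBC_BIAS.items():
        wl = "swept" if v_wl is None else f"{v_wl:+.1f} V"
        print(f"{op:8s}  V_WL = {wl:7s}  V_BL = {v_bl:+.1f} V")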
The body capacitance is small compared to the standard DRAM capacitor, typically
by two orders of magnitude. However, the leakage current of the FBC storage node is
small because of the small p-n junction area, which is located only at the channel-side
Figure 12.14 The write operation of the floating body cell (FBC). See text for details.
j393
394
Figure 12.15 The read operation of the floating body cell (FBC). See text for details.
edges of the source and drain. Thus, the retention time of the FBC is reduced slightly
compared to the standard DRAM. As a consequence, and because of the short
retention time, the FBC is suitable for high-performance embedded DRAM applications rather than low-power applications.
The SOI structure has been widely used for high-performance applications,
particularly game processors, and is expected to be used in the embedded DRAM
for on-processor caches. An FBC using an SOI substrate can easily be used for these applications, with full compatibility with the SOI substrate. As the FBC is
composed of one transistor and has no capacitor, it is scalable down to the 32 nm
node. Details of this structure are provided in Ref. [17]. An image of an FBC with
128 Mb DRAM, along with the chip features, is shown in Figure 12.17. The FBC, which has dimensions of 7.6 × 8.5 mm, contains all of the necessary circuits
(including internal voltage generators) and operates using a single 3.3 V power
supply [18].
Figure 12.17 Floating body cell (FBC) 128 Mb DRAM and its features.
12.6
Summary
Today, while the demand for DRAM remains greater than for any other type of
memory, the capacity of DRAM is continually increasing such that variations are now
becoming available for both low-power and high-speed applications. Because of the
scaling limitations, the TIS/fin array-transistor is expected to be used in future-generation DRAMs, with more advanced structures such as the vertical surrounding gate transistor (SGT) most likely being used for DRAMs with a smaller cell size. In addition, the capacitorless DRAM (the FBC) shows great promise as a candidate for next-generation embedded DRAMs offering both high density and high speed. Clearly, the use of 3-D structures should help to overcome the scaling problems likely to be encountered in future-generation memories.
13
Ferroelectric Random Access Memory
Soon Oh Park, Byoung Jae Bae, Dong Chul Yoo, and U-In Chung
13.1
An Introduction to FRAM
13.1.1
1T1C and 2T2C-Type FRAM
13.1.2
Cell Operation and Sensing Scheme of Capacitor-Type FRAM
Figure 13.2 illustrates the writing operation of 1T1C-type FRAM. Figure 13.2a is the
schematic of 1T1C FRAM which is composed of the word line (WL), bit line (BL), and
plate line (PL). Figure 13.2b shows the charges preserved in the hysteresis curve,
while Figures 13.2c and d show the timing diagrams of writing data 1 and 0. To
write 1 into the memory cell, the BL is raised to Vpp and the PL is kept at ground (GND). The polarization direction is then from PL to BL, and the corresponding Pr value is preserved. To write 0, the BL is kept at GND and the PL is held high at Vpp. Thus, the opposite polarization direction is generated and its Pr value is preserved.
Figure 13.3 illustrates the reading operation of 1T1C-type FRAM [10]. A read access begins by precharging the BL to GND, after which the PL is raised to Vpp. This establishes two capacitors in series, Cs and CBL, between the PL and GND, where Cs is the capacitance of the ferroelectric storage element and CBL is the parasitic capacitance of the BL. Therefore, Vpp is divided into Vf and VBL between Cs and CBL according to their relative capacitances. Depending on the data stored, the voltages developed on the ferroelectric capacitor and the BL can be approximated as follows:
\[ V_f = \frac{C_{BL} V_{PP}}{C_S + C_{BL}} \tag{13.1} \]

\[ V_{BL} = \frac{C_S V_{PP}}{C_S + C_{BL}} \tag{13.2} \]

Since the effective capacitance C_S depends on whether or not the polarization switches during the read, the bit-line voltage differs between data 1 and 0, and the available sensing signal is

\[ \Delta V_{BL} = V_{BL}(1) - V_{BL}(0) \tag{13.3} \]
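A numerical sketch of this voltage divider is given below. The switching and non-switching cell capacitances and the bit-line capacitance are assumed values, chosen only to illustrate Equations 13.1 to 13.3.

    # FRAM read signal from the C_S / C_BL voltage divider.
    # All capacitance and voltage values are assumed for illustration.

    V_PP = 3.0       # plate pulse amplitude
    C_BL = 300e-15   # parasitic bit-line capacitance
    C_S1 = 200e-15   # effective cell capacitance, polarization switches ("1")
    C_S0 = 50e-15    # effective cell capacitance, no switching ("0")

    def v_bitline(c_s):
        """Equation 13.2: voltage developed on the bit line."""
        return c_s * V_PP / (c_s + C_BL)

    dv = v_bitline(C_S1) - v_bitline(C_S0)   # Equation 13.3
    print(f"V_BL('1') = {v_bitline(C_S1):.2f} V")
    print(f"V_BL('0') = {v_bitline(C_S0):.2f} V")
    print(f"sensing signal dV_BL = {dv * 1000:.0f} mV")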
In general, the voltage developed on the BL is too small to sense charge differences directly. Therefore, a sense amplifier should be used in order to drive the BL to full Vpp if the data is 1, or to 0 V if the data is 0. The sense amplifier of capacitor-type FRAM adopts the cross-coupled latch sense amplifier of DRAM. It can be classified into folded bit-line and open bit-line schemes according to the cell array, as shown in Table 13.1 and Figure 13.4 [11]. The open bit-line scheme is applicable to the 1T1C structure, and the folded bit-line scheme can be applied to both 1T1C and 2T2C structures.
Table 13.1 Folded versus open bit-line sensing schemes.

                   Folded bit-line     Open bit-line
Sense amplifier    1 ea / 2 BL         1 ea / 1 BL
Realization        Easy layout         Difficult layout
Noise immunity     Good                Not good
Sensibility
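To make the read scheme concrete, the following minimal Python sketch evaluates the capacitive divider of Eqs. (13.1) and (13.2) for a switching ("1") and a non-switching ("0") read. All component values here are illustrative assumptions, not values taken from this chapter.

def bitline_voltage(C_s, C_BL, V_pp):
    """Series-capacitor divider driven by the plate-line pulse.
    Returns (V_f, V_BL), the drops across the cell and the bit line."""
    V_f = C_BL * V_pp / (C_s + C_BL)    # Eq. (13.1)
    V_BL = C_s * V_pp / (C_s + C_BL)    # Eq. (13.2)
    return V_f, V_BL

C_BL = 300e-15   # parasitic bit-line capacitance [F] (assumed)
V_pp = 2.5       # plate-line pulse amplitude [V] (assumed)

# The effective cell capacitance is larger for a switching read, because
# domain reversal releases the extra 2*Pr charge (both values assumed).
for label, C_s in (("'1' (switching)", 60e-15), ("'0' (non-switching)", 25e-15)):
    V_f, V_BL = bitline_voltage(C_s, C_BL, V_pp)
    print(f"data {label}: V_BL = {V_BL * 1e3:.0f} mV")

The resulting difference of a few hundred millivolts is what the cross-coupled sense amplifier must resolve and then restore to full logic levels.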
13.2
Ferroelectric Capacitors
Representative ferroelectric materials for complementary metal oxide semiconductor (CMOS) integration can be divided into two groups: perovskite-structured materials (PZT and BiFeO3) and Bi-layer-structured materials (SBT, BLT, and BTO). Their characteristics are summarized in Table 13.2 [12, 13]. The crystal structure of PZT can be either tetragonal or rhombohedral below the Curie temperature, according to the Zr/Ti composition ratio. In SBT, two SrTaO3 perovskite blocks and one (Bi2O2)2+ layer constitute one unit cell, as shown in Figure 13.5b. In BLT, La atoms partially substitute for Bi atoms in the Bi4Ti3O12 (BTO) crystal, which is composed of three TiO6 octahedra and one (Bi2O2)2+ layer, leading to the desired crystal structure. These differences in unit-cell structure largely determine the characteristics of the corresponding ferroelectrics; the typical properties of PZT, SBT and BLT can therefore be compared from this point of view.
First, the remanent polarization (Pr) determines the sensing margin between data "0" and "1" (the larger 2Pr, the better the sensing margin), while the coercive voltage (Vc) or coercive field (Ec) determines the operating voltage of an FRAM device (the smaller Vc, the lower the operating voltage). PZT shows large Pr and Ec values because of strong interactions between neighboring perovskite unit cells. In contrast, SBT and BLT require anisotropic growth along the a–b axis to attain direct interactions between the neighboring perovskite unit cells along the electric-field direction. The smaller Pr and Ec values of SBT, compared to PZT, can be increased by Nb doping.
Table 13.2 Characteristics of representative ferroelectric materials [12, 13].

                                    PZT                SBT                BLT
                                    Pb(Zr,Ti)O3        SrBi2Ta2O9         (Bi,La)4Ti3O12
Pr [µC/cm²]                         10–40              5–10               10–15
Ec [kV/cm]                          50–70              30–50              30–50
Endurance                           Poor on Pt         Good on Pt         Good on Pt
                                    electrode          electrode          electrode
Crystallization temperature [°C]                       650–800            650–750
Curie temperature [°C]              400                400                400
Figure 13.5 The crystal structures of ferroelectric (a) PZT, (b) SBT, and (c) BLT.
Fatigue is a term describing the fact that the remanent polarization becomes smaller when a ferroelectric film experiences numerous polarization reversals. When PZT on Pt electrodes is subjected to more than about 1E5 read/write cycles, the Pr value shows a conspicuous reduction, which limits the repeated use of a memory. A few reports of non-fatigue behavior in Pt/PZT/Pt capacitors should be regarded with great care, because this type of behavior can be observed when the applied voltage is less than V(90%). Fatigue behavior is strongly related to the generation of oxygen vacancies by the repeated cycles, which induces dipole-pinning electrons for charge neutrality; this is because, on the basis of the defect chemistry model, oxygen vacancies are the only mobile ionic species in the lattice, even at room temperature. However, the fatigue problems of PZT have by now largely been solved by the use of conducting oxide electrodes (e.g., IrO2, RuO2, SrRuO3, CaRuO3, LaNiO3, and LSCO), which ensure no degradation of the Pr value even up to 1E12 cycles. Figure 13.6 shows a comparison of the fatigue properties of PZT capacitors with Pt and IrO2 electrodes [16]. This improvement can be explained by the fact that oxygen in the IrO2 electrodes replenishes oxygen vacancies, which suppresses the dipole-pinning effect and thus prevents fatigue degradation. As Ir is stably converted into IrO2 under an oxygen ambient, the fatigue behavior is remarkably improved when an IrO2 oxide electrode is used.
In contrast to PZT films, an SBT film does not show the fatigue phenomenon up to 1E13 switching cycles, even if Pt electrodes are used. It has been speculated that the (Bi2O2)2+ interlayer can compensate the oxygen vacancies produced. However, the similarly Bi-layer-structured BTO does show fatigue problems on a Pt electrode, which suggests that the simple charge-compensation role of the (Bi2O2)2+ layers is not sufficient to make films fatigue-free. This reduction in polarization can be greatly alleviated by using La-doped BTO (so-called BLT). Accordingly, the limited switching endurance of the dipoles is no longer a serious problem for these ferroelectric materials, as shown in Figure 13.7 [13, 14].
13.2.3
Retention
memories. Most commonly, the retention of ferroelectrics is classified into same-state and opposite-state retention.
Same-state retention, which is closely related to aging, represents the loss of polarizability when one first writes a datum of "0" or "1" into a capacitor with electrical pulses, and reads the datum again after a long period without changing the initial status. A same-state retention failure can therefore occur when the relaxation component in the opposite-polarity state increases at the expense of the relaxation component in the stored-polarity state. The stored polarity can be stabilized by the use of ferroelectrics with a high Curie temperature, whereby the same-state retention loss is improved from the viewpoint of thermodynamics. For instance, BaTiO3 is not applicable for non-volatile FRAM because of its low Curie temperature (140 °C), although this can be raised to 500 °C by imposing biaxial compressive strain.
Opposite-state retention, which is closely related to imprint, represents the loss of polarizability when one first writes a datum of "0" or "1" into a capacitor with electrical pulses and then reads the rewritten (opposite) datum again. In other words, same-state retention is the longstanding concern of a read-only memory (ROM), while opposite-state retention is that of a random-access memory (RAM), because the information must be modifiable (as shown in Figure 13.8) [17].
An opposite-state retention failure occurs when a capacitor which has aged considerably in one state is switched to the opposite state. In this case, the capacitor behaves as if it would prefer to remain in the original state: charge defects are activated by thermal energy and redistributed by the polarization field, and the resulting internal field lowers the energy barrier and invokes polarization back-switching during the delay time, as shown in Figure 13.9 [18]. Accordingly, opposite-state retention can be improved by minimizing the space charges which result from defects inside the ferroelectric, from domain-wall motion, or from defects near the electrode–ferroelectric interfaces.
The thickness scaling of ferroelectric films is indispensable when pursuing a low switching voltage, making them suitable for integrated electronics applications. To date, however, thinner ferroelectric films have shown serious degradation of
Figure 13.10 Improved opposite-state retention by the use of PbTiO3 seed layer.
and MOCVD methods is shown in Figure 13.11, where the MOCVD PZT film shows retention properties superior to those of CSD films, due to the lower defect density in the ferroelectric and/or at the interfaces. It may be speculated that an as-crystallized PZT film on an Ir electrode can be obtained by using the MOCVD process, such that the non-switching layer at the interface between the electrode and the ferroelectric is thinner, without the formation of Pt3Pb alloys.
13.2.3.3 Perovskite Oxide Electrode
Most ferroelectric materials have a perovskite crystal structure, as outlined in the previous section. Therefore, if a conducting oxide electrode that also has a perovskite structure is used, ferroelectric properties such as reliability can be greatly improved, owing to the reduction of any non-ferroelectric dead layer at the interface between the ferroelectric and the electrodes. The remarkable improvement of retention properties obtained by using an SrRuO3 electrode with a perovskite structure is illustrated in Figure 13.12 [22].
During recent years, although perovskite oxide electrodes such as SrRuO3, LaNiO3 and CaRuO3 have been investigated intensively, the problem of high leakage currents, which are inevitably induced by high defect densities in the oxide electrode, remains to be overcome.
Recently, the successful development of an ultrahighly reliable FRAM device has been reported; the retention properties of this fully integrated device at different temperatures are illustrated graphically in Figure 13.13 [23]. Based on these findings, the FRAM device can be expected to maintain >80% of its initial charge, even after 10 years at 175 °C.
13.3
Cell Structures
A vertical scanning electron microscopy image of the FRAM cell structure is shown in
Figure 13.14 [24]. The cell is composed of a cell transistor, capacitor, buried contact,
bit line, word line, and plate line. The cell structure can be divided into the CUB
(capacitor under bit line) and COB (capacitor over bit line) structures, the merits and
demerits of which are considered in the following section.
13.3.1
CUB Structure
In the CUB cell structure, the ferroelectric capacitor is formed beside the cell transistor, as shown in Figure 13.15 [25]. This requires a large cell area compared to the COB cell structure, in which the ferroelectric capacitor is formed over the cell transistor. The CUB scheme places no thermal-budget limitations on the ferroelectric film deposition and the subsequent crystallization anneal, because the ferroelectric capacitor formation processes (including stack deposition and dry etching) are completed before the metallization process is carried out.
Figure 13.14 A vertical scanning electron microscopy image of the FRAM cell.
13.3.2
COB Structure
In the COB cell structure, the ferroelectric capacitor is formed over the bit line. Thus, the realization of a COB cell structure requires both a new buried contact (BC) plug and new metal technologies for the oxidation barrier. A stable contact between the BC plug and the bottom electrode must be maintained while the ferroelectric capacitor is processed at a high temperature of 600 °C or above [26]. As shown in Figure 13.16a, a high-temperature process is essential to obtain a sufficient polarization value in an MOCVD PZT process [27]. In order to prevent oxidation of the BC plug, various oxidation-barrier metals have been investigated widely; among these, a TiAlN film has proved successful in preventing oxidation of the BC plug. The oxidation resistance of TiAlN and TiN thin films, as a function of temperature, is illustrated graphically in Figure 13.16b.
As mentioned above, because the COB structure is more beneficial with regard to high-density integration than the CUB structure, an increasing proportion of FRAM devices today adopt the COB structure.
13.4
High-Density FRAM
In this section, the current status of planar capacitor technology, together with the
technical issues involved in the development of 3-D capacitors for high-density
FRAM device application, will be discussed.
13.4.1
Area Scaling
In order to achieve a high-density FRAM, the cell size must be scaled down as much as possible. Unfortunately, however, there exists a scaling limit, because the polarization value (2Pr) decreases in proportion to the cell size and etching damage to the capacitor becomes increasingly critical. The data in Figure 13.17 show that the polarization decays as the drawn cell size decreases. With a planar capacitor structure, the polarization degradation is negligible down to the 150-nm technology node, but the polarization value decreases rapidly below that level. This effect is caused mainly by the difference between the drawn area and the effective area, and indicates that the etched slope is no longer steep enough to provide both the designed top-electrode area and sufficient spacing between adjacent bottom electrodes at the 130-nm technology node. Therefore, in order to increase the effective capacitor area below the critical cell size, both thickness scaling and a steep etched slope of the capacitor stack must be guaranteed; the geometry sketch below illustrates the trade-off.
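In the following rough sketch the capacitor profile is modeled as a simple trapezoid; the stack height and drawn width are assumed values, chosen only to show the trend.

import math

def top_width(bottom_w_um, height_um, slope_deg):
    """Width remaining at the top of a trapezoidal capacitor profile."""
    return bottom_w_um - 2 * height_um / math.tan(math.radians(slope_deg))

h = 0.3           # capacitor stack height [um] (assumed)
w_bottom = 0.50   # drawn bottom-electrode width [um] (assumed)

for slope in (60, 65, 80, 85):
    w_top = max(top_width(w_bottom, h, slope), 0.0)
    print(f"{slope:2d} deg slope: top width {w_top:.2f} um, "
          f"top/bottom area ratio {(w_top / w_bottom) ** 2:.2f}")

At 60–65° most of the drawn area is lost at the top electrode (or adjacent bottom electrodes merge), whereas at 80–85° the loss becomes minor, consistent with the etch development described next.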
In order to maximize the effective capacitor area, the most important technology is achieving a steep etched slope of the capacitors. This is difficult because both the top and bottom electrodes are usually noble metals, and the noble-metal etch process has remained an unanswered question since the initial stages of FRAM development. Until quite recently, the capacitor etched slope was limited to about 60–65°, owing mainly to the loss of hard mask under the sputtering conditions of the noble-metal etch. This lower capacitor slope can lead to a decrease in the top-electrode capacitor area, or to cap-to-cap short circuits at the bottom electrode. However, based on some experimental findings (see Figure 13.18), a new technology has been developed successfully to obtain a steep etched slope of about 80–85° [28]. This new etching scheme was tested at high temperature with chlorine and fluorine chemistry, and a dual hard mask (oxide and metal). As a result, the noble metal was successfully etched with a high slope by improving the reactivity between the noble metal and the etch gases, by increasing the process temperature, and by reinforcing the robustness of the hard mask.
13.4.2
Voltage Scaling
With the advent of the mobile era, low-voltage operation has become increasingly important in reducing power consumption. In the case of FRAM devices, the operating voltage is directly related to the thickness of the ferroelectric film; hence, the latter dimension should be minimized for low-voltage applications. As shown in Figure 13.19a, a PZT capacitor prepared by the CSD process shows a drastic degradation of ferroelectric properties below 100 nm thickness. This is clearly a critical problem which must be solved for high-density FRAM devices. As described above, both the ferroelectric properties and the reliability are greatly improved when the PZT films are prepared by the MOCVD process; thus, even an 80 nm-thick PZT film prepared in this way demonstrates highly reliable ferroelectric properties [29] (Figure 13.19b).
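The link between film thickness and operating voltage can be estimated from the coercive field alone, since the coercive voltage is roughly Vc = Ec · t. The sketch below uses an Ec of 50 kV/cm, of the order of the Table 13.2 values; the thicknesses are assumed.

E_c = 50e3 * 1e2   # coercive field: 50 kV/cm expressed in V/m

for t_nm in (180, 100, 80, 50):
    V_c = E_c * t_nm * 1e-9   # coercive voltage across the film [V]
    print(f"t = {t_nm:3d} nm -> V_c ~ {V_c:.2f} V")

Because full switching requires a field of a few times Ec, halving the film thickness roughly halves the operating voltage, which is why reliable sub-100 nm films are the key to low-voltage FRAM.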
It is difficult to prevent ferroelectric degradation at the interface between the electrodes and the ferroelectric material, even when the MOCVD process is employed. However, if perovskite oxide electrodes are used, the dead-layer effect at the interface may be remarkably reduced; recently, high reliability has been reported even for a 50 nm-thick PZT capacitor [30]. The charge-to-voltage (Q–V)
A prototype 3-D capacitor has recently been demonstrated, and a TEM image of a 3-D PZT capacitor is shown in Figure 13.24. Although some pyrochlore phase remained in the trench capacitor, the columnar grains were well established at the side-wall of the trench under optimized deposition conditions.
Figure 13.25 illustrates, graphically, the ferroelectric properties of different-sized trench structures. The polarization–voltage characteristics of a planar capacitor and trench capacitors are shown in Figure 13.25a. Under a 2.1 V external bias and an electric field of 350 kV/cm, these capacitors produced no current leakage and showed quite good hysteretic behavior compared to their planar counterpart. The remanent polarization (2Pr) plotted against the external maximum voltage is shown in Figure 13.25b; these data show that the 2Pr of a 0.32 µm trench-diameter 3-D capacitor is very similar to that of the planar capacitor. However, a 0.25 µm trench-diameter capacitor showed a 2Pr value of 19 µC/cm² under an external
maximum voltage of 2.1 V, which is 80% of the 2Pr value in either the planar or the 0.32 µm trench-diameter case. This difference may derive from an incomplete extension of the columnar grains on the 0.25 µm trench side-wall. Based on these findings, it is quite possible for the side-wall PZT film to attain the same ferroelectric properties as the planar PZT film.
In order to realize Gigabit FRAMs with a 3-D capacitor, it has been necessary to develop the atomic layer deposition (ALD) process for the PZT and electrode materials. As shown in Figure 13.23, the thickness of the ferroelectric material should be less than 50 nm, because the bottom/top electrodes and the ferroelectric film must all be formed inside a trench of 200 nm diameter. This means that the ferroelectric properties of sub-50 nm-thick PZT capacitors must be secured for 3-D capacitor research. In addition, the step coverage of the PZT film becomes important as the aspect ratio of the capacitor increases. Because the PZT film should have a uniform composition at the bottom and side-wall, the ALD method is regarded as the best choice among deposition methods such as PVD and CVD. Although ALD of PZT has been investigated by many research groups, process optimization is still required. Moreover, the noble electrode metals as well as the ferroelectric materials may be prepared using ALD; although iridium was recently deposited successfully by ALD, further improvements in film properties must still be investigated. In addition, a CMP technology for noble-metal electrodes may need to be introduced in order to separate the individual capacitors within this structure. Although noble-metal CMP has not yet been achieved, it is currently undergoing extensive investigation. A short arithmetic check of the trench geometry follows.
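The check below is a sketch under assumed electrode thicknesses and trench depth; only the 200 nm trench diameter is taken from the text.

import math

d = 200e-9            # trench diameter [m] (from the text)
depth = 400e-9        # trench depth [m] (assumed)
t_electrode = 20e-9   # bottom and top electrode thickness [m] (assumed)

# Radial budget: bottom electrode + PZT + top electrode must fit in d/2.
t_pzt_max = d / 2 - 2 * t_electrode
print(f"PZT thickness budget ~ {t_pzt_max * 1e9:.0f} nm")

# Area gain: trench side-wall plus bottom versus the planar footprint.
a_planar = math.pi * d ** 2 / 4
a_trench = math.pi * d * depth + a_planar
print(f"area gain ~ {a_trench / a_planar:.1f}x")

With thicker electrodes, or a margin left for the top-electrode fill, the budget shrinks to the sub-50 nm figure quoted above, while the trench still buys a several-fold increase in effective area.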
Unfortunately, as of today, several technical difficulties, including the reliability of the 3-D capacitor, have not been fully solved. Nonetheless, the activities of many research groups hold much promise for 3-D FRAM development. It follows that, if some of the above-mentioned problems are solved in the near future, the Gigabit FRAM era will be well and truly opened.
13.5
Summary and Conclusions
FRAM technology, which has been under continuous development since the early 1990s, targets a universal memory in the semiconductor industry. Although reliability, notably endurance and retention, was initially a major challenge, recent findings have shown that this is no longer a key issue for FRAM devices. Rather, it is scalability that has become the important issue, following the development of 64 Mb FRAM through material and cell-structure innovations. At this density, FRAM may be applied as a low-density embedded memory (e.g., in a smartcard), based on the demands of non-volatility, rapid access, high read/write endurance, low-power operation, and a high security level. For high-density FRAM devices aimed at major applications, conventional planar capacitor technology is insufficient for further cell-size scaling; rather, breakthrough technologies such as the 3-D capacitor must be developed in order for the FRAM device to serve as an ideal non-volatile memory in the future.
References
1 Kim, K.N. and Lee, S.Y. (2004) Integrated Ferroelectrics, 64, 3–14.
2 Kim, K.N. (1999) Integrated Ferroelectrics, 25, 149–167.
3 Jeong, G.T., Hwang, Y.N., Lee, S.H., Lee, S.Y., Ryoo, K.C., Park, J.H., Song, Y.J., Ahn, S.J., Jeong, C.W., Kim, Y.T., Horii, H., Ha, Y.H., Koh, G.H., Jeong, H.S. and Kim, K.N. (2005) IEEE International Conference on Integrated Circuit and Technology, pp. 19–22.
4 Kim, H.J., Oh, S.C., Bae, J.S., Nam, K.T., Lee, J.E., Park, S.O., Kim, H.S., Lee, N.I., Chung, U.I., Moon, J.T. and Kang, H.K. (2005) IEEE Transactions on Magnetics, 41, 2661–2663.
5 Baek, I.G., Lee, M.S., Seo, S., Lee, M.J., Seo, D.H., Suh, D.S., Park, J.C., Park, S.O., Kim, H.S., Yoo, I.K., Chung, U.I. and Moon, J.T. (2004) IEDM Technical Digest, pp. 587–590.
6 Ishiwara, H., Okuyama, M. and Arimoto, Y. (eds) (2004) Ferroelectric Random Access Memories: Fundamentals and Applications, Springer-Verlag.
14
Magnetoresistive Random Access Memory
Michael C. Gaidis
14.1
Magnetoresistive Random Access Memory (MRAM)
Through the merging of magnetics (spin) and electronics, the burgeoning field of spintronics has created MRAM, a memory with the characteristics of non-volatility, high density, high endurance, radiation hardness, high-speed operation, and inexpensive complementary metal oxide semiconductor (CMOS) integration. While MRAM is unique in combining all of the above qualities, it is not necessarily the best memory technology for any single characteristic; for example, SRAM is faster, Flash is denser, and DRAM is less expensive. Stand-alone memories are generally valued for one particular characteristic: speed, density, or economy. MRAM therefore faces difficult odds in competing against the aforementioned memories in a stand-alone application. However, embedded memory for application-specific integrated circuits or microprocessor caching often demands flexibility over narrow performance optimization. This is where MRAM excels: it can be called the "handyman" of memories for its ability to perform a variety of tasks flexibly at a relatively low cost [1]. Whilst one may hire a specialist to rewire the entire electrical circuitry of a house, or to install entirely new plumbing, a handyman with a flexible toolbox is a much more reasonable option for repairing a single electrical outlet or a leaky sink. Moreover, the handyman may be able to repair a defective electrical circuit discovered while in the process of repairing the leaky plumbing!
A semiconductor fabrication facility that has MRAM in its toolbox is more likely to be able to tailor circuit designs to a customer's individual needs for optimal performance at reasonable cost. The ways in which the characteristics of MRAM compare with those of other embedded memory technologies at the relatively conservative 180 nm node are listed in Table 14.1. In the remainder of this chapter, the state of the art in MRAM technology will be reviewed: how it works; how its memory circuits are designed; how it is fabricated; the potential pitfalls; and an outlook for the future use of MRAM as devices are scaled smaller.
Table 14.1 Embedded memory technologies compared at the 180 nm node.

Parameter             eSRAM        eDRAM          eFlash           eMRAM
Size (cell) [µm²]     3.7          0.6            0.5              1.2
Size (array eff.)     65%          40%            30%              40%
Cost (mask adder)     0            20% (4 msk)    25% (8 msk)      20% (3 msk)
Speed (read)          3.3 ns       13 ns          13 ns            15 ns
Speed (write)         3.4 ns       20 ns          5000 ns          15 ns
Power (standby)       400 µA       5000 µA        0                0
Power (read)          15 pC/b      5.4 pC/b       28 pC/b          6.3 pC/b
Power (write)         15 pC/b      5.4 pC/b       31 000 pC/b      44 pC/b
Endurance             Unlimited    Unlimited      1e5 cycles       Unlimited
Rad Hard              Average      Poor           Average          Excellent

The shaded cells indicate where MRAM has a distinct advantage. Relative comparisons should hold through scaling to the 65 nm node [2].
14.2
Basic MRAM
apparent. The different resistance values for the high-resistance state (Rhigh) and the low-resistance state (Rlow) can be used to define a magnetoresistance ratio (MR) as in Equation 14.1:

MR = (Rhigh − Rlow) / Rlow   (14.1)

Typically, MR values for GMR devices are in the range of 5–10% for room-temperature operation.
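A minimal numerical use of Eq. (14.1), with assumed resistance values, contrasts a GMR-like device with an MTJ-like one:

def mr_ratio(r_high, r_low):
    """Magnetoresistance ratio of Eq. (14.1)."""
    return (r_high - r_low) / r_low

print(f"GMR-like: {mr_ratio(1.07, 1.00):.0%}")    # ohm-scale device, ~5-10% MR
print(f"MTJ-like: {mr_ratio(14e3, 10e3):.0%}")    # kohm-scale device, tens of %

The much larger relative signal, together with the higher, CMOS-friendly absolute resistance, is what motivates the tunnel-junction devices discussed below.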
By choosing different coercive fields for the two ferromagnets, it is possible to create a so-called spin-valve MRAM structure with a configuration similar to that shown in Figure 14.1. For example, ferromagnet 1 can be chosen to have a high coercivity, thus fixing its magnetization in a certain direction, while ferromagnet 2 can be chosen with a lower coercivity, allowing its magnetization direction to fluctuate. For a magnetic field sensor, such as is used in disk-drive read heads, small changes in the magnetization angle of ferromagnet 2 induced by an external magnetic field can be sensed as changes in the resistance of the spin valve. Because the spin-valve sensitivity to external fields can be substantially better than that of inductive pickup, such devices have enabled a dramatic shrinkage of the bit size in modern hard drives. An alternative use for the spin-valve structure is found if it is designed to utilize just two well-defined magnetization states of ferromagnet 2 (e.g., parallel or antiparallel to ferromagnet 1). Such spin-valve designs serve as binary memory devices, and have found application in rad-hard non-volatile memories as large as 1 Mb [6]. The drawbacks of this type of memory are:

• a relatively low magnetoresistance, providing only low signal amplitudes and thus longer read times;
• a low device resistance, making for difficult integration with resistive CMOS transistor channels;
• in-plane device formation, which is more difficult to scale to small dimensions than devices formed perpendicular to the plane.
Solutions to these problems can all be found in the magnetic tunnel junction (MTJ) MRAM. The MTJ structure is similar to the GMR spin valve in that it uses the property of electron spins aligning with the magnetic moment inside a ferromagnet. However, instead of passing current in-plane through a normal metal between
14.3
MTJ MRAM
The structure illustrated in Figure 14.2 can store binary information in the direction of magnetization within ferromagnet 1 (the free layer), provided that the magneti-
applied from a permanent magnet incorporated into the chip packaging. This is somewhat impractical, however, due both to the packaging cost and to stringent requirements on across-chip uniformity.
Fortunately, clever manipulation of film properties has driven the evolution of several generations of MTJ structures, overcoming issues such as the offset field described above. Two such advances are illustrated in Figure 14.4. In Figure 14.4a, an antiferromagnet is exchange-coupled to the pinned layer, thus providing a much larger effective coercive field for the pinned, or reference, side of the tunnel junction [10]. With exceptional care to maintain a clean, smooth interface between the antiferromagnet and the pinned layer above it, one can obtain the strong exchange coupling between these films that is necessary to resist field switching. At least 1–1.5 nm of ferromagnetic pinned layer must still remain in the stack to act as an electron spin polarizer, but when coupled to the antiferromagnet it can be extremely well pinned even if the ferromagnet has a low Hc. By removing the need for a high-Hc ferromagnet in the pinned layer, this structure allows some additional flexibility in the choice of pinned-layer material. One can optimize for maximum electron spin polarization for the best magnetoresistance, and choose film qualities for low remanence and thus a smaller offset of the R versus H hysteresis curve. Correspondingly, Figure 14.4b illustrates a representative improvement in offset, for comparison with Figure 14.3b from the simpler stack structure.
Although there is much benefit in using the simple antiferromagnet (AF)-pinned structure of Figure 14.4a, the best device operation often calls for reducing the R versus H hysteresis offset to an even smaller value. In this case, the flux-closed AF-pinned structure shown in Figure 14.4c can be tailored to give arbitrarily small offset fields. Here, a synthetic antiferromagnet (SAF) is formed from two ferromagnets separated by a thin spacer layer. For common spacer layers of 0.6–1.0 nm of Ru, one can obtain a strong antiparallel coupling between the two ferromagnets [11]. For reasonable external fields, this coupling forces them to be antiparallel, and thus the thicknesses of the two ferromagnets can be balanced such that the external magnetic flux is negligible. The pinning of one of these ferromagnet layers with an antiferromagnet gives a high effective Hc while at the same time causing negligible offset to the R versus H hysteresis loop (Figure 14.4d).
Flux-closing the reference-layer ferromagnet works remarkably well in practice, particularly with recent advances in materials-deposition tooling which enable tight control over film thicknesses for multilayer film structures covering entire 200- to 300-mm wafers [13]. A cross-sectional transmission electron microscopy (TEM) image of such a flux-closed reference-layer MTJ stack is shown in Figure 14.5. Some interesting features of the magnetics-related elements can be discerned from the TEM image, and these are discussed below.
14.3.1
Antiferromagnet
The pinning strength must be large compared to the fields used to switch the free layer between its binary memory states. In addition, the components of the antiferromagnet should not dissociate and diffuse out of the layer at process temperatures below about 250 °C.
14.3.2
Reference Layer
The reference layer closest to the tunnel barrier must act as an effective spin polarizer, and so its thickness must be at least of the order of the electron spin-flip scattering length. This implies that 1–1.5 nm is the minimum thickness of the layer closest to the tunnel barrier. For the best flux closure and minimal offset to the free layer, the reference layer adjacent to the antiferromagnet will be of a similar thickness, although a perfectly zero free-layer offset may dictate small differences in the thicknesses. An upper limit to the thickness is set by the additional surface roughening, and the resultant Néel coupling, that thicker films will generate. Reference-layer materials are chosen for their best spin-polarization properties and their compatibility with device-processing techniques (e.g., minimal corrosion and thermal stability). Films of CoFe of the order of 2 nm thickness are typically used, separated by the 0.6- to 1.0-nm exchange-coupling Ru layer.
14.3.3
Tunnel Barrier
speed of operation. Today, even more highly transparent tunnel barriers are under
development for a class of devices using electron spin current to switch the device
state.
14.3.4
Free Layer
The free layer shown in the TEM image is reasonably thin, rather like the underlying pinned layers. However, it does have a minimum thickness limit set by its spin-filtering characteristics: for a thickness less than approximately the electron spin-flip scattering length, the magnetoresistance will begin to drop, and this again sets the thickness at around 1.5 nm or more. Thicker free layers require additional energy to switch, and so are undesirable for low-power operation. Of critical importance among the characteristics of the free layer is the need for well-defined magnetic states and well-behaved magnetic switching. As one cannot tailor the read or write circuitry to every individual device in megabit arrays of MRAM devices, it is critical that each device behaves very much like all the others in the array. Ill-defined magnetization states such as vortices, S-shapes, C-shapes, and multiple domains will add variability to the resistance measured by the circuitry, because the electron spin-polarization filtering may not be strictly parallel or antiparallel to the spin polarization imparted by the pinned layer. In addition, sensitivity of the film's switching behavior to the tunnel barrier and cap materials, or to device edge roughness or chemistry, can impart variability to the write operation of the individual bits in megabit arrays. NiFe alloys are preferred for their good magnetic behavior and reasonable corrosion resistance. The addition of Co or Fe to the NiFe, or a "dusting" of Co or Fe between the tunnel barrier and the NiFe layer, can help to adjust the magnetic anisotropy and improve the MR. Layer thicknesses are typically in the 2- to 6-nm range for the best low-power operation with good switching characteristics.
Several additional non-magnetic elements are visible in the TEM image, and these
are discussed below.
14.3.5
Substrate
An ultra-smooth substrate is required as the starting point for smooth, uniform, and reliable tunnel barriers; rough interfaces result in increased Néel coupling, which is detrimental to device performance. Representative materials for the substrate are thermally oxidized silicon, or chemical-mechanically planarized (CMP) dielectrics such as silicon nitride, silicon oxide, or silicon carbide.
14.3.6
Seed Layer
An appropriate seed layer is required to obtain good growth conditions for the antiferromagnet, both to ensure a smooth top surface and to ensure good magnetic pinning strength. Given the high stress in some of the films in the MTJ stack, this seed layer is also critical for ensuring good adhesion to the substrate. It may be formed from tantalum nitride or permalloy (NiFe), for example.
14.3.7
Cap Layer
The proper choice of a cap layer is necessary to protect the free layer during further
device fabrication processing. It is essential as a barrier or getter for contaminants,
keeping the free layer clean and magnetically well behaved. Often-used materials for
this layer include ruthenium, tantalum, and aluminum. The choice of this material
may also depend on its effect on the magnetic behavior of the free layer: certain cap
materials can discourage smooth switching between free layer states, and can result
in substantial dead layers which must be compensated for by a thicker free layer.
14.3.8
Hard Mask
A hard mask (as opposed to a soft photoresist mask) is used to enable patterning of
the MTJ with industry-standard etch techniques. It also eases integration with the
surrounding circuitry by providing a contact layer to connect the MTJ to wiring levels
above. The hard mask material is largely chosen for its compatibility with subsequent
processing in the fabrication route, and can be chosen from any number of metallic
or dielectric materials.
The processing of MTJ structures to integrate them with CMOS circuitry is
discussed in greater detail later in the chapter.
14.4
MRAM Cell Structure and Circuit Design
14.4.1
Writing the Bits
The mechanism for switching the state of the free layer in MRAM lends itself well to an array layout with conventional planar semiconductor design and fabrication. A typical rectangular MTJ array layout, with word lines (WLs) arrayed beneath the devices and bit lines (BLs) arrayed atop the devices, is illustrated in Figure 14.7. Current driven along the WLs or BLs generates a magnetic field which imparts a torque on the magnetization of the device. In normal operation, the superposition of properly sized write fields from both WL and BL will enable a switching event to occur in the free layer of the device at the intersection of the two lines. The write fields are chosen small enough so as not to exceed the coercivity of the pinned layer. Potential pitfalls of this scheme include write errors from half-selected devices (i.e., those subjected to only a WL or a BL field, but not both) and, worse, write errors
must be in terms of switching, described in equation form by the array quality factor (AQF):

AQF = Hsw / σHsw   (14.2)
Here, Hsw is the average switching field of the devices and σHsw is the standard deviation of the switching-field distribution over all elements in the array. In rough terms, the AQF must be larger than about 30 in order to ensure a lifetime of 10 years, although some relief can be gained through the use of error-correction techniques; the sketch below illustrates why.
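The sketch assumes a Gaussian switching-field distribution, so that a half-selected bit upsets if its individual switching field falls below the half-select field; the half-select field fraction used here is an assumption, not a value from the text.

import math

def normal_tail(z):
    """P(X < z) for a standard normal; erfc keeps precision deep in the tail."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

H_half = 0.6   # half-select field as a fraction of the mean H_sw (assumed)

for aqf in (10, 20, 30, 40):
    sigma = 1.0 / aqf                          # sigma_Hsw in units of mean H_sw
    p_upset = normal_tail((H_half - 1.0) / sigma)
    print(f"AQF {aqf:2d}: half-select upset probability ~ {p_upset:.1e}")

A megabit array accumulates enormous numbers of half-select events over 10 years, so the per-event probability must be infinitesimal; only around AQF ≈ 30 does the tail become negligible, with error correction buying only modest relief.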
Toggle MRAM was invented to circumvent the difficulties faced by SW MRAM in terms of the operating margin for half-selected bits [16]. As illustrated in Figure 14.9a, the structure takes the flux-closed, antiferromagnet-pinned reference-layer structure (see Figure 14.4c) a step further by also flux-closing the ferromagnetic free layer. This is achieved by depositing a spacer layer atop the free-layer ferromagnet, followed by a second ferromagnet. The spacer can be chosen (as in the pinned layer) to enhance antiparallel coupling, or it can be chosen with zero or even some parallel coupling to decrease the write field needed to switch the bit. The magnetizations of the two ferromagnets in the free layer will point in opposite directions, and their balance and proximity will flux-close the layers so that little field is seen emanating from the structure at a distance. The write operation of this toggle-mode structure is illustrated in Figure 14.9b. Noting the colors assigned to represent the magnetization of the free layers in Figure 14.9a (green for the top layer, red for the bottom layer), the plots at the top of Figure 14.9b show the relative orientation of the two magnetizations. Note that the initial state is such that the easy (preferred) axis of the MTJ magnetization lies at 45° to the word and bit lines, rather than being aligned parallel to one of them as in SW MRAM.
Figure 14.9 (a) Structure of the toggle-mode MTJ stack. (b) Time
evolution of the free layer switching. See text for details.
Figure 14.9b illustrates the need for staggered timing of the WL and BL write-field pulses. To switch the state of the free layers, a magnetic field is first applied from the WL along the positive y-direction. This magnetic field cants the magnetizations of both free layers as they try to align with the field. The antiparallel nature of the magnetic coupling between the free layers prevents the magnetizations from both fully lining up with the applied word field, as long as the field is not so large as to overwhelm this antiparallel state. When the magnetizations are canted sufficiently, there is a net magnetic moment of the free layers, and this moment can be grabbed like a handle by the field now imparted by the BL. The BL applies a field in the positive x-direction, and the net moment of the two free layers follows this BL field. The WL field is then shut off, and the net moment continues to rotate around towards the applied BL field. As the BL field is shut off, the free-layer magnetizations relax into their energetically favorable antiparallel configuration, but now with magnetizations exactly opposite to those at the start.
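The staggered pulse sequence can be summarized as a simple state walk; this is a descriptive cartoon in code form, not a micromagnetic model, and the initial layer orientations are assumed.

steps = [
    ("WL field on (+y)",     "both free-layer moments cant; a net moment appears"),
    ("WL and BL fields on",  "net moment rotates toward +x"),
    ("BL field on only",     "net moment continues rotating past +x"),
    ("all fields off",       "pair relaxes antiparallel, now reversed"),
]

state = {"top (green)": "+x", "bottom (red)": "-x"}   # assumed initial state
print("initial:", state)
for field, effect in steps:
    print(f"{field:22s} -> {effect}")
state = {"top (green)": "-x", "bottom (red)": "+x"}   # one full cycle toggles
print("final:  ", state)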
The name "toggle-mode" device derives from the characteristic that cycling the WLs and BLs in this manner will always switch the state of the device. To set a bit to a particular state, a read operation must first be performed to determine whether a write (toggle) operation is required. Aside from this drawback, and the additional complexity of the magnetic stack, the toggle-mode structure has several advantages:
• As alluded to above, the write operating margins can be substantially larger than for devices with SW switching. Rather than a SW astroid boundary, toggle-mode devices exhibit an L-shaped boundary that does not approach the WL or BL axes. The potential for half-select errors is dramatically reduced, and the requirement on the AQF is approximately halved.
• In principle, shape anisotropy is not required to ensure that the bit has only two preferred states for binary memory. One can utilize the intrinsic anisotropy of the ferromagnetic free layers to define two such states. This allows the use of circular MTJ devices for the smallest memory cell size.
• The flux-closed nature of the free layers greatly reduces the dipole fields emanating from the free layer. Such fields can affect the energetics of nearby devices, resulting in variability of switching characteristics depending on the states of those devices. Thus, with flux-closed free layers, nearby devices can be packed in closer proximity for improved scaling.
14.4.2
Reading the Bits
The array structure illustrated in Figure 14.7 is often termed a cross-point cell (XPC) structure. More specifically, XPC refers to the case where the MTJ devices are located at the cross-points of the BLs and WLs, and are directly connected to the BLs and WLs above and below the MTJ stack. This structure offers an extremely high packing density for the lowest-cost memory. The write mechanism is reasonably straightforward, as described above, as long as the MTJ resistance is not so low that it shunts the write currents. More troublesome is that the read mechanism suffers from a reduced signal-to-noise ratio in the XPC structure. In order to read the resistance state of an XPC bit, a bias is applied between the desired BL and WL, and the resistance is measured. However, due to the interconnected nature of the XPC structure, not only the resistance of the cross-point device is measured: there are parallel contributions of resistance from many other devices along "sneak paths" that traverse additional sections of BL and WL. Due to the resulting loss of signal, the device must be read much more slowly to allow integration to improve the signal-to-noise ratio. Device read times can be substantially longer for such XPC structures, making this type of memory far less desirable than one which can be read as fast as DRAM, for example. The lumped estimate below illustrates how quickly the sneak paths swamp the signal.
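This first-order estimate assumes every cell has the same resistance and treats all unselected word lines (and likewise all unselected bit lines) as single equipotential nodes; both are simplifying assumptions.

def xpc_read(R, n):
    """Apparent resistance when reading one cell of an n x n cross-point
    array: the selected cell in parallel with the lumped three-hop sneak path."""
    r_sneak = R / (n - 1) + R / (n - 1) ** 2 + R / (n - 1)
    r_apparent = 1.0 / (1.0 / R + 1.0 / r_sneak)
    return r_sneak, r_apparent

R = 10e3   # assumed MTJ resistance [ohm]
for n in (8, 64, 512):
    r_sneak, r_app = xpc_read(R, n)
    print(f"{n:3d} x {n:<3d}: sneak ~ {r_sneak:7.0f} ohm, "
          f"cell reads as {r_app:6.0f} ohm instead of {R:.0f}")

Even a modest array reduces the measured resistance by more than an order of magnitude, burying the 10–40% high/low difference; hence the isolation devices introduced next.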
The solution to the problem of sneak paths is to insert an isolation mechanism which ensures that read currents will traverse only a single MTJ device. For example, this can be achieved by placing a diode in series with each MTJ. Although this seems simple when drawn as a circuit schematic on paper, it is actually more straightforward to place a field-effect transistor (FET) in series with each MTJ, and to assign a second WL to control the read operation. The FET cell circuit structure is shown schematically in Figure 14.10, with separate WLs for the write and read operations; the BL is used for both read and write operations.
Figure 14.11 illustrates the implementation of the circuit structure shown in Figure 14.10, suitable for a densely packed array of MTJs. Structural additions to standard CMOS circuitry include:
• the via contact VJ between the bit line and the top of the MTJ stack;
• the MTJ device;
• the local metal strap (MA) between the bottom of the MTJ stack and the via to M2;
• the via VA between the MA strap and the M2 wiring, which serves to isolate the MTJ from the write WL while providing connection to the underlying FET structure for reading.
A slightly higher packing density may be achieved with a mirror-cell design, where adjacent bits mirror each other. However, the simple unmirrored design of Figure 14.11 is preferable to minimize across-array non-uniformity due to inter-level misalignment and inter-cell magnetic interference. Megabit and larger MRAM memories are formed from multiple subarrays, with a size determined largely by the resistance of the BLs and WLs. There is always a desire to keep the applied voltage low, for CMOS compatibility and best array efficiency. The current required to generate the necessary MTJ switching fields then sets a maximum length for the BL or WL, depending on the resistive voltage drop. Bootstrapped write drivers can be used to allow smaller write drivers with improved write-current control [18]. A 16 Mb MRAM under development at IBM utilizes 128 Kb subarrays (see Figure 14.12), with 512 WLs and 256 BLs of active memory elements.
The read operation is performed with sense amplifiers that compare the desired bit to a reference cell. The reference cell uses two adjacent MTJs fixed in opposite states, in a configuration that acts like an ideal mid-point reference between the Rhigh and Rlow states [18]. Four BLs are activated in a given cycle; they are uniformly spaced along the height of the array to reduce magnetic interference between activated BLs during the write operation, and to minimize the distance from the activated BLs to the sense amplifiers during a read operation. Additional reference BLs are located within the array, with one set shared by sense amplifiers 0 and 1, and one set shared by sense amplifiers 2 and 3.
The array-driving circuitry for MRAM memories is commonly standardized to an asynchronous SRAM-like interface for easy interchangeability in battery-backed SRAM applications. The IBM 16 Mb chip uses a ×16 architecture that is prevalent in mobile and handheld applications, with packaging intended for simple direct replacement of SRAM chips. As shown in Figure 14.13, the 16 Mb chip measures 79 mm² with individual memory cells of 1.42 µm², for an array efficiency of almost 30% (the short check below reproduces this figure). The array efficiency may be improved by using more metal layers and by eliminating some of the developmental test-mode structures used in this chip. A reduction of the standby current for power-critical applications is achieved through the extensive use of high-threshold, long-channel FET devices and careful grounding of inactive terminals in the arrays and in the write-driver devices [18].
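The quoted numbers are self-consistent, as this quick check shows:

cells = 16 * 2 ** 20        # 16 Mb of memory cells
cell_area_um2 = 1.42        # cell size from the text [um^2]
chip_area_mm2 = 79.0        # die size from the text [mm^2]

array_mm2 = cells * cell_area_um2 * 1e-6
print(f"array area {array_mm2:.1f} mm^2 -> "
      f"efficiency {array_mm2 / chip_area_mm2:.0%}")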
Redundant elements are included in the chip to allow the correction of defective array elements. Such redundancy is implemented with fuse latches and address comparators, in a manner consistent with industry-standard memory products. As the CMOS base technology is quite mature, the redundancy focuses on the MRAM features. Single-cell failures or partial WL failures (from MRAM reference-cell defects) are considered the most likely defects, and the redundancy architecture therefore favors the replacement of WLs to capture the partial WL fails arising from reference-cell defects. Redundancy domains are implemented at a high level in the block hierarchy, so as to span several blocks and be capable of effectively fixing any random defects [18].
14.4.3
MRAM Processing Technology and Integration
there remains the need to pattern the shallow vias, the MTJs, the local interconnects, and at least one level of wiring with contacts to the MTJs and the functional circuitry below. Even for simple functional circuits, five or more photomask levels are required to complete the MRAM-centric portion of the structure.
14.4.3.1 Process Steps
In conjunction with the steps outlined in Figure 14.14, the important considerations for the process steps in the fabrication of the MRAM-specific levels are discussed below.
1. VA contact via and ILD: The VA via provides a path for read current to flow from the local (MA) metal strap down through a via chain to the underlying read transistor. The most critical aspect of this module is that it must form a substrate which is sufficiently smooth for good magnetic-stack growth.
2. Magnetic film-stack deposition: Arguably the most essential technological advance in enabling MTJ MRAM was the development of tooling for the large-area deposition of extremely uniform films with well-controlled thickness. Such tooling has proven suitable for the deposition of magnetic, spacer, and tunnel-barrier films with sub-ångström uniformity across 200 mm and even 300 mm wafers [13]. The critical aluminum oxide tunnel barrier is generally formed by depositing a thin aluminum layer, followed by exposure to an oxidizing plasma [19].
3. Tunnel-junction patterning: A commonly used and straightforward approach to patterning the MTJs is the use of a conducting hard mask, which is later utilized as a self-aligned stud bridging the conductive MT wiring to the active magnetic films in the device. A thick hard mask, however, introduces additional difficulties in that it can shadow the etch being used to pattern the magnetic devices. Such shadowing can add an element of variability to the size of the devices, and may also result in metal redeposits on the sidewall of the device structure. As illustrated in Figure 14.15, sidewall redeposits are particularly troublesome for commonly used MRAM stack materials, because these materials do not readily form the volatile RIE byproducts that would give the etch some isotropic character; directional physical sputtering is the main mechanism for etching the stack materials [20]. Because the difficulties in etching the magnetic stack materials often outweigh the benefits of a simpler process-integration scheme, it is often preferable to use a thinner hard mask for less etch shadowing, together with an additional via level (VJ in Figure 14.14) to connect the top of the MTJ with the bit-line wiring.
4. MTJ encapsulation: Silicon nitride and similar compounds are desirable for their adhesion to the MA and MTJ metal surfaces, and for strong interfacial bonds that inhibit the migration of metal atoms along the dielectric/metal interfaces. Such metal migration is one well-documented cause of MTJ thermal degradation, and can limit processing temperatures for patterned MTJ devices to below 300 °C [21]. The use of tetraethylorthosilicate (TEOS) as a precursor in the deposition of silicon oxide films [22] is known to offer the benefit of a relatively inert depositing species which can readily diffuse into the spaces adjacent to high-aspect-ratio structures, even at temperatures below 250 °C.
5. MA patterning: For suitable thicknesses of the seed and reference layers, the series resistance of the layers remaining after the MTJ etch is small enough to impart negligible
14.5
MRAM Reliability
One of the strong selling points of MRAM is its reliability: the write endurance is expected to be essentially infinite, the magnetics are intrinsically rad-hard, and its non-volatile storage can eliminate soft errors in many applications. As with any new technology struggling for successful commercialization, there are certain aspects that are unproven and require a demonstration of reliability. Areas of potential reliability risk include the following [24].
14.5.1
Electromigration
Electromigration in the write WLs and BLs results from the high write current density. Current pulses of 10 mA are typical for conservative wire cross-sections of 0.2 µm², corresponding to a current density of 5 MA/cm² (recomputed in the short check below). This alone represents a serious challenge to reliability in the array, and it can potentially be worsened by local disruptions to the quality and thickness of the wire material. The VJ vias of Figure 14.14, or the direct connection between the BL and the MTJ hard mask in the thick-hard-mask integration scheme discussed above, can impact the BL wiring's electromigration resistance.
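Recomputing the quoted current density (values from the text; unit bookkeeping only):

I = 10e-3        # write pulse current [A]
A = 0.2e-12      # wire cross-section: 0.2 um^2 expressed in m^2
J = I / A        # current density [A/m^2]
print(f"J = {J / 1e10:.0f} MA/cm^2")   # 1 MA/cm^2 = 1e10 A/m^2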
Electromigration issues can potentially be improved through the use of bidirectional switching currents, which fit neatly into toggle-mode MRAM operation but cost in terms of array efficiency. One promising method for reducing electromigration stress is the use of ferromagnetic liners in a U-shape around the BLs and write WLs. These liners serve to focus the magnetic field onto the MTJs in the desired row or column, and can increase the effective field by as much as a factor of two for a given current [25]. The use of ferromagnetic liners around the BL is illustrated in Figure 14.16. Similar, but inverted, structures can be formed around the write WL (M2 in Figure 14.16) to enhance the field from that wire. The resulting reduction in the current needed to obtain a required switching field can dramatically reduce electromigration issues. Not only do ferromagnetic liners offer a potential reduction in current density, but they also improve the electromigration performance relative to conventional copper processes: by reducing the interface diffusion of copper atoms, ferromagnetic cladding on the top surface of the MT wire enhances electromigration reliability to an extent similar to that seen in the industry with advanced Ta/TaN or CoWP capping processes [26].
One added benefit of the ferromagnetic-liner field focusing is the reduction of near-neighbor disturb effects. Because the field is better focused on devices along the desired WL and BL, adjacent devices are less likely to be switched by near-neighbor fields, or by the combination of near-neighbor fields and thermal activation.
14.5.2
Tunnel Barrier Dielectrics
These are subject to reliability concerns because of the extremely thin nature of the barrier and the related susceptibility to pinholes or dielectric breakdown. Aluminum oxide tunnel barriers have so far proven quite robust: time-dependent dielectric breakdown (TDDB) and time-dependent resistance drift (TDRD) have been examined in 4 Mb arrays and found to exceed the requirements for a 10-year lifetime [27]. The voltage stresses on the tunnel barrier are relatively modest, as the read operation takes place at 100–300 mV, because the MR is higher at lower voltage. The write operation is performed with one side of the MTJ floating, so there is no significant voltage stress on the MTJ during the higher-power write pulse.
14.5.3
BEOL Thermal Budget
The BEOL thermal budget for MRAM devices (<250–300 °C) is significantly lower than for conventional semiconductor fabrication processes (~400 °C), in order to prevent degradation of the MTJs. This can affect the intrinsic quality of the dielectrics used in the BEOL, and can also worsen seam and void formation around the topographical features being encapsulated. A low thermal budget also prevents the use of certain post-processing passivation anneals, and of certain packaging materials and processes. The move to lead-free solder, with its increased solder-reflow temperatures, represents a further challenge for MRAM.
14.5.4
Film Adhesion
This is a serious concern, given the multiple new materials being introduced into the integrated process. The novel etch and passivation techniques being used may also leave behind poorly adherent layers which cannot be subjected to harsh wet cleans without MTJ exposure and degradation. Delamination risks must be mitigated through specially developed dry and wet cleans, the use of materials with tuned stress, and the choice of materials with compatible thermal expansion.
14.6
The Future of MRAM
As of July 2006, MRAM products such as the 4 Mb memory shown in Figure 14.17 have been available from Freescale Semiconductor [28]. The market space targeted by Freescale includes networking, security, data storage, gaming, and printer data logging and configuration storage. From a customer viewpoint, this product means a lower part count, a higher level of performance, higher reliability, greater environmental friendliness, and a lower-cost solution than current approaches such as battery-backed SRAM.
Progressing downwards from the available 180 nm technology, future generations of MRAM are expected to utilize the same magnetic infrastructure, with only evolutionary improvements, down to below the 90 nm node. However, the following concerns constrain the scaling:
Increased switching fields: As devices are scaled to smaller volumes, the anisotropy field must be increased to compensate, and to maintain an activation energy greater than 60 kBT [30] (see the sketch below). The write fields will scale to be of similar magnitude to the anisotropy field, and will increase superlinearly with inverse device size. As with the MTJs, the write wires must scale to a smaller footprint, making it more difficult to accommodate the increasing switching fields. In addition, ferromagnetic cladding of the wires becomes less effective, because of the bending energy of the flux inside the cladding as the wire corner radius sharpens. EPD device encapsulations will help in this regard.
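The superlinear growth follows from the stability criterion just quoted: the energy barrier E = Ku·V must stay above roughly 60 kBT, so the anisotropy energy density Ku (and with it the anisotropy and write fields) must rise as the bit volume shrinks. The free-layer dimensions below are assumed for illustration.

kT = 1.38e-23 * 300   # thermal energy at 300 K [J]
t = 3e-9              # free-layer thickness [m] (assumed)

for w_nm in (180, 90, 45):
    V = (w_nm * 1e-9) ** 2 * t          # square bit volume (assumed shape)
    Ku_min = 60 * kT / V                # minimum anisotropy energy density
    print(f"{w_nm:3d} nm bit: Ku >= {Ku_min / 1e3:5.1f} kJ/m^3")

Each halving of the lateral size quadruples the required Ku while the write wires themselves shrink, squeezing the field-write scheme from both sides.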
Acknowledgments
The author would like to thank: W.J. Gallagher for the figures and editorial assistance; IBM's Materials Research Laboratory (MRL) for process development and fabrication; P. Rice, T. Topuria, E. Delenia, and B. Herbst for the TEM imaging; J. DeBrosse, T. Maffitt, C. Jessen, R. Robertazzi, E. O'Sullivan, D.W. Abraham, E. Joseph, J. Nowak, Y. Lu, S. Kanakasabapathy, P. Trouilloud, D. Worledge, S. Assefa, G. Wright, B. Hughes, S.S.P. Parkin, C. Tyberg, S.L. Brown, J.J. Connolly, R. Allen, and E. Galligan for various contributions; and M. Lofaro for the advances in CMP. The studies summarized here were supported in part by the Defense Microelectronics Activity (DMEA), and built on prior investigations conducted with Infineon (now Qimonda) within the MRAM Development Alliance, as well as on earlier MRAM studies at IBM that were supported in part by DARPA.
References
1 DeBrosse, J., personal communication.
2 Gallagher, W.J. and Parkin, S.S.P. (2006) Development of the magnetic tunnel junction MRAM at IBM: From first junctions to a 16-Mb MRAM demonstrator chip. IBM Journal of Research and Development, 50, 5–23. Sincere thanks also to John DeBrosse, John Barth, Chung Lam, and Ron Piro.
3 Jones, J. (1976) Coincident current ferrite core memories. Byte, 1, 6–22.
4 Hanaway, J.F. and Moorehead, R.W. (1989) Space Shuttle Avionics System, NASA SP-504, available at http://klabs.org/DEI/Processor/shuttle/sp-504/sp-504.htm.
5 (a) Binasch, G. et al. (1989) Physical Review B: Condensed Matter, 39, 4828–4830; (b) Baibich, M.N. et al. (1988) Giant magnetoresistance of (001)Fe/(001)Cr magnetic superlattices. Physical Review Letters, 61, 2472–2475.
15
Phase-Change Memories
Andrea L. Lacaita and Dirk J. Wouters
15.1
Introduction
15.1.1
The Non-Volatile Memory Market, Flash Memory Scaling, and the Need for New
Memories
During the past decade, the impressive growth of the market for portable systems has been sustained by the availability of successful semiconductor non-volatile memory (NVM) technologies, the key driver being Flash memory. Over the past 15 years, the scaling trend of these charge-based memories has been straightforward. The cell density of NOR Flash, which is adopted for code storage, has doubled every one to two years, following Moore's law; the memory cell size is 10–12 F², where F is the technology feature size. NAND Flash, which is optimized for sequential data storage, has been scaled aggressively and nowadays has a cell size of about 4.5 F². However, further scaling of both NOR and NAND Flash is projected to slow down, due mainly to the tunnel oxide (NOR), which cannot be thinned further without impairing data retention, and to electrostatic interactions between adjacent cells (NAND).
Moreover, as the scaling proceeds, the number of electrons stored on the floating gate and present in the device channel decreases. With so few electrons involved, effects such as random telegraph noise arising from trapping processes are expected to cause threshold instabilities and reading errors [1], while the requirements on retention become even more challenging: at the 32 nm node, the maximum acceptable leakage over a 10-year period will be less than 10 electrons per cell [2] (see the estimate below). All of these difficulties, arising from the fundamental limitation of the charge-storage concept, call for novel approaches to non-volatile storage at the nanoscale.
In recent years, a number of different alternative memory concepts have been explored; most notably, memories based on switchable resistors are considered among the most promising candidates.
PCM-based memory devices were first proposed by J.F. Dewald and S.R. Ovshinsky who, during the 1960s, reported the observation of a reversible memory switching in chalcogenide materials [3, 4]. Chalcogenides are semiconducting glasses made from the elements of Group VI of the Periodic Table, such as sulfur, selenium and tellurium, and many of these demonstrate the desired material properties for possible use in PCM applications. Two different chalcogenide material systems may be discriminated, based on their switching properties [5]:

• Memory-switching in structure-reversible films that may form crystalline conductive paths. A typical composition is Te81Ge15X4, close to the Ge-Te binary eutectic, with X being an element from Group V or VI (e.g., Sb). The latter materials also show threshold switching to initiate the high conduction in the glass state, followed by an amorphous-to-crystalline phase transition which stabilizes the high-conductive state.
In rewritable optical storage applications, by contrast, the phase transition is induced by an external laser beam and not by Joule heating, while the binary information is read out by exploiting the change in optical reflectivity between the amorphous and the crystalline state, rather than the difference in electrical resistivity.
The advancements in the materials used for optical disks, coupled with significant technology scaling and a better understanding of the fundamental electrical device operation, eventually triggered the development of solid-state memory technology, which led initially to the Ovonic Unified Memory (OUM) concept based on the use of the Ge2Sb2Te5 chalcogenide compound [10, 11]. Since the early 2000s, several semiconductor companies have considered the exploitation of the same concept for large-sized, solid-state memories [12–14]. Phase-change memories are known by different names; for example, the former OUM name was superseded by the terms PCM and phase-change RAM (PRAM). Today, PCMs are considered promising candidates eventually to become the mainstream non-volatile technology, this being due to their large cycling endurance [15, 16], fast program and access times, and extended scalability [17, 18].
15.2
Basic Operation of the Phase-Change Memory Cell
15.2.1
Memory Element and Basic Switching Characteristics
The vertical OUM PCM memory element in the so-called Lance-like structure is
shown schematically in Figure 15.1. The active phase-change material (Ge2Sb2Te5;
GST) is sandwiched between a top metal contact and a resistive bottom electrode (also
called the heater). The programming current flows vertically from the bottom
electrode, through the heater and the GST layer, to the top electrode. The current concentration near the (narrow) heater/GST contact results in a local heating of the GST in a semi-spherical volume where the amorphous/crystalline phase change occurs. Amorphization of this area cuts off the low-resistance current path and results in an overall large resistance.
The thermal and electrical switching characteristics of a vertical OUM PCM memory element are shown in Figure 15.2, with the temperature evolution in the GST region above the heater contact in response to current pulses shown graphically in Figure 15.2a [12]. In order to form the amorphous phase, a 50- to 100-ns current pulse heats up the region until the GST reaches its melting temperature (620 °C). The subsequent swift cooling, along the falling edge of the current pulse, freezes the undercooled molten material into a disordered, amorphous phase below the glass transition temperature. In order to recover the crystalline phase, Joule heating from another current pulse, with a lower amplitude (resulting in temperatures above the crystallization temperature but below the melting temperature), is used to speed up the spontaneous amorphous-to-crystalline transition: the crystalline phase builds up in about 100 ns by a combination of nucleation and growth processes.
The typical current–voltage (I–V) curve of a cell for both states is shown in Figure 15.2b [19]. As the electrical resistivity of the two phases differs by orders of magnitude, at low bias the resistance of the two memory states ranges from a few kΩ (low-resistance ON or SET state) to some MΩ (high-resistance OFF or RESET state). Reading is accomplished by biasing the cell and sensing the current flowing through it; for example, a few hundred millivolts across the cell in the SET state generates 50–100 µA. This current is able to load the bit-line capacitances of a memory array, making a reading operation possible in 50 ns. The same bias across the cell in the RESET state is not able to generate enough current to trigger the sensing amplifier, thus resulting in the evaluation of a 0.
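As a plausibility check on these numbers, a minimal Python sketch of the read operation is given below; the bias and the two resistance values are illustrative assumptions taken from the ranges quoted above, not measured device data.

V_READ = 0.3        # read bias across the cell [V]: "a few hundred millivolts"
R_SET = 4e3         # assumed SET-state resistance [ohm]: "a few kOhm"
R_RESET = 2e6       # assumed RESET-state resistance [ohm]: "some MOhm"

i_set = V_READ / R_SET       # ~75 uA, inside the 50-100 uA window quoted above
i_reset = V_READ / R_RESET   # ~0.15 uA, too small to trigger the sense amplifier

print(f"SET read current:   {i_set * 1e6:.0f} uA")
print(f"RESET read current: {i_reset * 1e6:.2f} uA")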
It should be noted that the I–V curve in the high-resistance, amorphous state is quite peculiar. As the bias reaches a certain voltage (the threshold switching voltage), a snap-back takes place and the conductance abruptly switches to a highly conductive state (see Figure 15.2b). The I–V curve of the crystalline GST does not feature threshold switching, and approaches the I–V curve of the amorphous state in the high-current zone.
The occurrence of this threshold switching is a very important characteristic of PCM material. Indeed, without such a switching mechanism, which allows large currents to flow in the amorphous material at low voltages (a few volts), very high voltages (of the order of 100 V) would be required to switch the material to the ON state, thus making electronic programming effectively impractical.
The ratio of the threshold switching voltage to the thickness of the amorphous zone is usually referred to as the critical threshold switching field; for GST, this quantity ranges between 30 and 40 V µm⁻¹. The critical threshold switching field can be taken as a guideline to compare different materials; for example, the lower the switching field, the lower the switching voltage for the same thickness of the amorphous layer. However, as shown in Figure 15.3, even if the threshold voltage does scale with the memory resistance, which in turn depends on the amorphous layer thickness, the line does not cross the origin [20]. The concept of a threshold switching field should, therefore, be handled with some care.
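The observation that the line does not cross the origin can be captured by a simple affine model, VTH = V0 + Ec·ta. The short Python sketch below evaluates this model for a few amorphous-layer thicknesses; the 0.5-V offset V0 is an assumption (suggested by the holding voltage discussed in Section 15.4.2), so the numbers are illustrative rather than fitted device data.

E_C = 35.0       # critical switching field [V/um], mid-range of the 30-40 V/um quoted
V_OFFSET = 0.5   # assumed non-zero intercept [V] (cf. the holding voltage, Section 15.4.2)

for t_a_nm in (10, 20, 50, 100):
    v_th = V_OFFSET + E_C * (t_a_nm * 1e-3)   # thickness converted from nm to um
    print(f"t_a = {t_a_nm:3d} nm -> V_TH ~ {v_th:.2f} V")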
15.2.2
SET and RESET Programming Characteristics
The programming characteristic of a PCM cell [20], that is, the dependence of the cell resistance R on the programming current, is shown in Figure 15.4. The open symbols in Figure 15.4 refer to the resistance obtained when driving a cell from the RESET state. During the measurement procedure, a 100-ns programming pulse is applied and the cell resistance after programming is read at 0.2 V. Before the subsequent measurement, the cell is brought back into the initial reference RESET state by using a proper current pulse. The measurement cycle is then restarted, driving the cell with a new 100-ns programming current pulse of a different amplitude.
During this procedure, three distinct regions can be recognized:

• For programming pulses below 100 µA, the ON-state conduction is not activated and the very small current does not produce any phase change.
• In the 100 to 450 µA range, the resistance decreases following the crystallization of the amorphous GST, reaching the minimum resistance in the SET state, denoted by Rset.
• Above 450 µA, the programming pulse melts some GST close to the interface with the bottom electrode, leaving it in the amorphous phase.
The solid symbols in Figure 15.4 show the R–I characteristics obtained for the same cell, but starting from the SET state. The resistance value changes only when the current exceeds 450 µA and the chalcogenide begins to melt; this current is therefore denoted as the melting current, Imelt. From there on, the curve overlaps the R–I curve of the RESET state. For programming pulses above 700 µA, the resistance of the cell reaches an almost constant value. It transpires that the PCM cell can be switched between the SET and RESET states using current pulses of 400 and 700 µA, respectively, these pulses being independent of the initial cell state (resistance). Therefore, the cell can be rewritten with no need for any intermediate erase. The minimum current capable of bringing the cell into the full RESET state (700 µA in Figure 15.4) is denoted as the reset current, Ireset.
The orders-of-magnitude difference between the cell resistance in the SET and RESET states makes the PCM memory ideally suited to multibit operation. In this scheme, the resistance of the cell may be set between the two extreme values, thus storing more than two levels per cell. This approach may become a viable option to further reduce the cost per bit of PCM devices.
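The three programming regions can be summarized in a toy piecewise model of the R–I curve of Figure 15.4, sketched in Python below. The current thresholds are the values quoted above; the resistance levels and the interpolations between them are purely illustrative assumptions, not device data.

def programmed_resistance(i_prog_uA, r_initial=1e6):
    """Resistance read back after a 100-ns pulse of amplitude i_prog_uA [uA],
    starting from the RESET state (toy model of Figure 15.4)."""
    R_SET, R_RESET = 5e3, 1e6            # assumed SET/RESET resistance levels [ohm]
    if i_prog_uA < 100:                  # ON-state conduction not activated: no change
        return r_initial
    if i_prog_uA < 450:                  # progressive crystallization towards R_SET
        frac = (i_prog_uA - 100) / 350.0
        return r_initial * (R_SET / r_initial) ** frac   # log-linear drop to R_SET
    if i_prog_uA < 700:                  # partial melting: resistance rises again
        return R_SET + (R_RESET - R_SET) * (i_prog_uA - 450) / 250
    return R_RESET                       # full RESET above ~700 uA

for i in (50, 200, 400, 500, 700):
    print(f"I = {i:3d} uA -> R ~ {programmed_resistance(i):.1e} ohm")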
15.3
Phase-Change Memory Materials
15.3.1
The Chalcogenide Phase-Change Materials: General Characteristics
The requirements for phase-change materials include easy glass formation during quenching from the melt, as well as congruently crystallizing compositions to avoid phase segregation during crystallization. Melting temperatures should be low to limit the switching power, whereas for non-volatility a good stability of the amorphous phase at application temperatures is required. It follows that the activation energy for crystallization of the amorphous state should be high enough to enable long data retention times. On the other hand, crystallization rates, at least at elevated temperatures, should be high enough to allow for a rapid amorphous-to-crystal transition, preferably in the range of a few tens of nanoseconds.
Such materials have now been under investigation for many years for their applications in DVD-RAM and DVD-R/W optical disk storage systems. Typically, metal alloys containing chalcogenide elements [by definition, elements of Group VI of the Periodic Table (O, S, Se, Te, Po)], often referred to as chalcogenide materials, are used. Chalcogenide elements are of interest as Se and Te compounds are easy glass-formers, because of their relatively high melt viscosities [22]. The compositions searched for are those that form a stable state in the solid phase through polymorphic transformations, i.e., where long-range diffusion is not required [23].
The two typical chalcogenide material families used in PCM are both based on
compositions of Ge, Sb and Te: (i) the pseudo-binary GeTe-Sb2Te3 compositions; and
(ii) compositions based on the Sb70Te30 eutectic compound (see the Ge-Sb-Te
ternary phase diagram in Figure 15.5) [24, 25].
not a eutectic but an azeotropic minimum [27]; that is, it fulfills the basic requirement of a congruently crystallizing material.
15.3.3
Specific Properties Relevant to PCM
The majority of PCM materials investigated to date were developed for (re)writable optical disks, and the existing knowledge of them, as well as the experience of their reliability in products, should allow their rapid introduction into CMOS integration technology. However, their use in PC-RAM applications may involve various pitfalls:
• There exist certain important operational differences between DVD and PCM applications that require the tuning/optimization of a number of specific material parameters for PCM that are not important for optical applications (e.g., resistivity in the ON and OFF states, rather than reflectivity changes). In addition, the amorphous phase should possess the particular property of threshold switching. The main material parameters for PCM applications are listed in Table 15.1.
• More importantly, however, the operating differences may have a major influence on the operation stability and repeatability. Indeed, in DVD applications, programming is achieved by laser pulse power coupling, and reading by reflectivity change. Data programming and storage rely on average material properties, such as reflectivity and absorption, that are not very sensitive to local variations as may be caused, for example, by small crystalline particles embedded in the amorphous region (or vice versa) due to incomplete/inhomogeneous nucleation or amorphization. On the other hand, PCM relies on programming by Joule heating, that is, by current conduction through the device. Furthermore, the SET operation requires threshold current switching in the amorphous phase, which is a filamentary process, and the reading is based on a resistance (i.e., current) measurement. As the current conduction is greatly affected by the existence of local inhomogeneities, programming and reading may become highly sensitive to non-uniform/incomplete crystallization or amorphization. For example,
low-resistance current paths in an incompletely amorphized state, or amorphous residues in an incompletely crystallized state, may strongly distort the programmed and read-out values.
Table 15.1 The main material parameters for PCM applications.

Symbol | Parameter                           | Optimization  | Reference(s)
Tm     | Melting temperature                 | 621 °C        | [37]
Tc     | Crystallization temperature         | 155 °C        | [37]
Ea     | Activation energy                   | 2.6–2.9 eV    | [15, 38]
ρc     | Resistivity, crystalline state      | 350 Ω µm      | [18]
ρa     | Resistivity, amorphous state        | 0.3 MΩ µm     | [18]
Ec     | Critical threshold switching field  | 30–40 V µm⁻¹  | [24]
15.4
Physics and Modeling of PCM
From the basic device operation described in Section 15.2, it transpires that material phase transitions (amorphization and crystallization dynamics) and conductance (threshold) switching mechanisms in the amorphous phase are the key processes involved in PCM. The physics of these mechanisms is outlined in greater detail below. The results of modeling studies implementing detailed microscopic descriptions of these effects have contributed greatly to an understanding of the subject, and have also supported the design optimization of these devices.
15.4.1
Amorphization and Crystallization Processes
The different phase transitions of the PCM material during programming are illustrated graphically in Figure 15.8 [39]. Here, the crystalline material serves as an ideal starting point. During the programming transition from a SET to a RESET (high-resistance) state, the material is heated, begins to melt at the solidus temperature, Tsol, and is completely molten at the melting temperature, Tm. When the cooling rate is higher than 10⁹ K s⁻¹ [40], the material does not begin to solidify at Tm, but remains an undercooled liquid. Below the glass transition temperature, TG, the material freezes into the amorphous state.

The reverse situation is that, during the RESET to SET transition, the current pulse heats the material above TG but below Tm, where it will begin to crystallize. There is no uniquely defined crystallization temperature, and even at relatively low temperatures (100–200 °C) crystallization may occur over long time scales (perhaps up to years). (These processes govern the basic long-term temperature retention of the RESET state, and will be discussed in Section 15.6.) For the SET programming, high-temperature, rapid (<100 ns) crystallization processes are required, an understanding of which is based mainly on the general physical models for nucleation and growth. Different models have been proposed, for example by Peng et al. [41] and by Kelton [42], by which the temperature-dependent nucleation and growth rates can be calculated. Calculations based on these models, using the different material properties of, for example, GST and AIST, indeed confirm the nucleation-dominated and growth-dominated crystallization mechanisms, respectively, that have been observed experimentally in these materials [43].
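The qualitative behavior of these models can be illustrated with a schematic classical rate expression: an Arrhenius kinetic term (frozen at low temperature) multiplied by a nucleation-barrier term that diverges as the melting point is approached, so that the crystallization rate peaks between TG and Tm. The Python sketch below uses arbitrary parameter values chosen only to reproduce this bell shape; it is not a calibrated GST model.

import math

K_B = 8.617e-5   # Boltzmann constant [eV/K]
T_MELT = 893.0   # melting point of GST, ~620 C, in kelvin
E_KIN = 1.0      # assumed activation energy of the kinetic (diffusion) term [eV]
W0 = 0.05        # assumed prefactor of the nucleation barrier [eV]

def crystallization_rate(t_kelvin):
    """Kinetic term x barrier term (classical nucleation picture, arb. units):
    vanishes at low T (slow kinetics) and at T_MELT (no driving force)."""
    if t_kelvin >= T_MELT:
        return 0.0
    barrier = W0 * (T_MELT / (T_MELT - t_kelvin)) ** 2   # diverges as T -> T_MELT
    return math.exp(-(E_KIN + barrier) / (K_B * t_kelvin))

temps_c = [150, 250, 350, 450, 550]
rates = [crystallization_rate(t + 273.15) for t in temps_c]
peak = max(rates)
for t, r in zip(temps_c, rates):
    print(f"T = {t:3d} C -> normalized crystallization rate {r / peak:.2e}")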
15.4.2
Band-Structure and Transport Model
The parameters, such as the energy gap, trap densities and densities of states for both phases, adopted in the numerical simulations are listed in Table 15.2. It should be noted that the crystalline GST is p-type, due to a large density of vacancies (10% of the lattice sites), while the amorphous GST is characterized by a large density of donor/acceptor-like defects: the so-called valence alternation pairs. This semiconductor-like picture is able to account for the peculiar conduction of the amorphous state and for the threshold switching [19].
The physics involved in the threshold switching remains a subject of debate. Since Ovshinsky first reported threshold switching [4], different models have been proposed, with many groups supporting the idea that switching is essentially a thermal effect and that the current in an amorphous layer rises abruptly due to the creation of a hot filament [45, 46]. Later, Adler showed that the effect is not thermal (at least in thin chalcogenide films), in agreement with Ovshinsky's original picture. In their pioneering studies [47, 48], Adler and colleagues showed that a semiconductor resistor may feature switching without any thermal effect.
The condition for the threshold snap-back to occur is the presence of a carrier generation mechanism depending on field and carrier concentration (e.g., impact ionization) competing with Shockley–Hall–Read (SHR) recombination via localized states.

Table 15.2 Electronic parameters for both the crystal and amorphous phases. (From Ref. [33]).

Property              | GST crystalline | GST amorphous
Egap [eV]             | 0.5             | 0.7
NC [cm⁻³]             | 2.5 × 10¹⁹      | 2.5 × 10¹⁹
NV [cm⁻³]             | 2.5 × 10¹⁹      | 10²⁰
Vacancies [cm⁻³]      | 5 × 10²⁰        | —
C3 [cm⁻³]             | —               | 10¹⁷–10²⁰
C1 [cm⁻³]             | —               | 10¹⁷–10²⁰
µn–µp [cm² V⁻¹ s⁻¹]   | 0.12–3.5        | 5–200
FC [V cm⁻¹]           | 3 × 10⁵         | 3 × 10⁵
The numerical model reported in Ref. [33] implements Adler's picture, accounting for avalanche impact ionization in the amorphous phase and SHR recombination via the localized defects.
The schematic dependence of the band structure along a cross-section of a PCM device is shown in Figure 15.10, where the wide-gap region corresponds to the amorphous GST. At low bias, the quasi-Fermi levels in the amorphous GST are close to their equilibrium position. As both the carrier density and the carrier mobility are low (the average hole mobility is about 0.15 cm² V⁻¹ s⁻¹ [33]), the conduction regime is ohmic. By increasing the voltage, the applied field approaches the avalanche critical field of 3 × 10⁵ V cm⁻¹ [10], significantly increasing the carrier generation. The quasi-Fermi levels thus split and move close to the band edges (Figure 15.10, lower diagram). Carrier recombination mainly takes place in the region, close to the anode, where the electron Fermi level approaches the conduction band. At large bias, all defects available for recombination are filled, and recombination may no longer be able to balance the exponentially rising generation rate. The system reacts by reducing the voltage drop in order to maintain the balance between recombination and generation, leading to the electronic switching. Hence, the snap-back takes place and, after switching, the GST is still amorphous but highly conductive, with generation sustained by the large density of free carriers. According to this picture, the minimum voltage required for the switching to occur is of the order of the split between the quasi-Fermi levels (i.e., the
Figure 15.10 Band diagrams along the cross-section of a PCM cell in the RESET state (according to Ref. [33]). (a) At low bias, quasi-Fermi levels are close to the equilibrium value. (b) Close to threshold switching, with carrier generation by impact ionization.
energy gap). This argument may justify why both the holding voltage, VH, and the asymptotic voltage at low R in Figure 15.2b approach approximately 0.5 V.
Although this picture has so far been successful in accounting for the experimental findings, it should be accepted with a degree of caution, and further investigations are needed to better assess the material properties. The quantitative description of impact ionization in these materials, as well as the role of the interfaces or of Poole–Frenkel mechanisms, deserves further investigation, as many of the details still lack direct experimental verification. However, the recent industrial interest in PCMs may lead to new experimental efforts and to the fabrication of devices purposely designed to test the validity of these key assumptions.
15.4.3
Modeling of the SET and RESET Switching Phenomena
The above conduction model was then coupled to the heat equation and to the phase-transition dynamics (nucleation and growth). When implemented in a three-dimensional (3-D) semiconductor device solver, it highlighted substantial differences between the two phase transitions [20]. The temperature maps during the SET–RESET transition, and the resulting phase distributions obtained with increasing current pulses with a plateau of 150 ns, are illustrated in Figure 15.11 [20]. In this figure, all of the pictures refer to a Lance PCM device in which a cylindrical metallic heater is in contact with the GST layer. The current flows almost uniformly across the polycrystalline GST, thus resulting in a roughly hemispherical shape of the final a-GST volume. As the programming current increases above melting, the volume left in the amorphous state increases.
On the other hand, Figure 15.12 [20] shows the calculated final phase distribution for a RESET to SET transition with programming currents of 130 µA (a) and 160 µA (b), respectively. Figure 15.12a shows that the current first sparks, by electronic threshold switching, in the weakest a-GST region, locally triggering Joule heating and consequent crystallization processes. In case (a), the pulse amplitude/time is not sufficient to provide a complete crystalline path, and thus the active region features a residual a-GST layer causing the large measured R. In case (b), a further increase in the applied voltage eventually extends the hot filament over the whole of the active area. The initial amorphous volume has been almost completely crystallized by the programming pulse, resulting in a sufficiently low resistance. It should be noted that the localized phase transition is directly related to the details of the electronic switching mechanism, and thus can be reproduced only by a self-consistent model describing both the electrothermal and the phase-change dynamics.
Figures 15.11 and 15.12 also suggest that, by changing the programming pulse, the value of the cell resistance can be reliably placed between the largest RESET value and the minimum SET value, thus opening the way to multi-level operation. For example, four levels, each with a different resistance value, might be programmed per cell, thus reducing the cost per bit.
15.4.4
Transient Behavior
VTH and R are two key parameters of the memory cell. Figure 15.13 illustrates the time dependence of these parameters as measured soon after the current pulse programming the cell into the RESET state [19, 49]. The first, fast component of the transient is referred to as recovery. On the longer time scale, in the so-called drift regime, the VTH and R transients follow a slower power law. The recovery sets the minimum time needed between programming and reading (if the cell is read too soon after being programmed in the RESET state, the read value might erroneously be a 1).
Figure 15.13 Low-field resistance and threshold voltage of the amorphous phase as a function of time after the RESET programming operation. In about 50 ns, both the low-field resistance and the threshold voltage are recovered, after which they continue to increase due to the drift phenomenon. (Figure from Ref. [19]).
15.5
PCM Integration and Cell Structures
15.5.1
PCM Cell Components
In principle, a simple cross-point array of phase-change resistors without select devices could be conceived [52], but this would suffer read errors due to leakage through
neighboring ON elements. Moreover, as important program disturbs may occur in half-select regimes, each PCM memory cell should contain both the phase-change element and a selection device. The choice and dimensions of the selection element are determined by cell size constraints and the RESET program current. A MOSFET is the most evident choice for memory integration in CMOS technology (Figure 15.14) [53]. Owing to the need for source and drain contacts, such a one-transistor/one-resistor (1T1R) cell would require a minimum area of 8–10 F² (where F is the minimum feature size of the technology). However, MOSFETs have a limited current drivability, so that minimum-size transistors cannot be used, and cell sizes are much larger, up to 40 F² in 0.18 µm technology with a 0.6 mA reset current [13].
An alternative is the use of a bipolar p-n-p select device. In that case, there are still two contacts (base and emitter) per cell, while the collector is a collective substrate contact. As the bipolar current drivability is much higher, the overall cell size is smaller (only 10 F² in 0.18 µm technology with a 0.6 mA reset current [13]). The trade-off is, of course, a more complex integration scheme for fabricating these bipolar devices in a CMOS technology, restricting this solution to stand-alone memory applications.
In order to obtain the smallest cell area, a diode selector may be used [17], as a diode would only need one contact and can handle large currents (a self-rectifying device would be the ideal case, so that a bare cross-point array could have the same functionality with minimum cell size). However, in the diode selection regime, both bit-lines and word-lines have to carry relatively high currents, leading to a partitioning of the memory as well as to larger X-decoders. Moreover, the isolation is also less perfect, and parasitic resistances in the full signal path are important. Both series resistances and leakage currents contribute to read errors, while diode integration may also result in the formation of a parasitic bipolar transistor.
Figure 15.14 Schematic of 1T1R cell structure and memory array matrix. (Adapted from Ref. [53]).
15.5.2
Integration Aspects
Besides the formation of the selection transistor, PCM fabrication requires the integration of a phase-change element in a CMOS technology. The memory element may be fabricated after the transistor processing (the so-called front-end-of-line processing; FEOL), either before the interconnects (e.g., in-between the Si contacts and the first metal interconnect layer) or after the first steps of the interconnect (e.g., on top of the Metal 0 or Metal 1 interconnect levels). The latter scheme is the back-end-of-line or BEOL processing. A schematic state-of-the-art process flow is shown in Figure 15.15 [54, 55].
The PCM material is typically deposited by sputtering (physical vapor deposition; PVD) from a multi-element target with the desired composition. The as-deposited phase is either amorphous (for room- or low-temperature deposition), or crystalline if deposited above the crystallization temperature. In either case, due to the temperature budget of the subsequent BEOL processing (up to 400–500 °C), the material will be fully crystallized after the integration. In a number of cell concepts, the conformal deposition of the PCM material and/or the ability to fill small pores is important [14], and these requirements would ideally call for a conformal deposition technique such as metal-organic chemical vapor deposition (MOCVD), rather than PVD. Critical elements for the integration are a good adhesion of the PCM material to the underlying substrate structure (typically a patterned metal electrode in a SiO2 matrix), the material out-diffusion/inter-reaction/oxidation during high-temperature steps [56], and the dry-etch patterning of the PCM [57, 58].
Furthermore, suitable electrode materials are needed for both the heater contact (where the metal will be in contact with the hot/molten PCM) and the (cold) top electrode. Material stability, low contact resistance, and good adhesion are important parameters. Poor electrode contact properties have indeed been identified as being responsible, for example, for first-fire effects (see Section 15.6). Typical electrode materials are standard conductive barrier materials available in Si processing, such as TiN (e.g., Ref. [31]), although W is also often used [59]. The top electrode typically defines the PCM area and can be used as (part of) a hard mask during the PCM patterning.
The stability and controllability of the different process steps of the integration technology are crucial for the preparation of large-density memory arrays. The most important array characterization metric is therefore the distribution of the ON and OFF program-state resistances: tight distributions are needed to maximize the sensing window and to avoid bit errors. By process optimization, excellent distributions have been obtained on 256 Mb PCM after full integration [60].
15.5.3
PCM Cell Optimization
In the basic OUM PCM cell, the PCM and top electrode are planar layers deposited on a plug-type bottom heater contact. The part of the PCM material effectively involved in the phase switching is basically a hemispherical volume on top of the heater. To reduce the heating power (or program current), it is important to try to confine the dissipated heat as much as possible. While many different cell structures have been proposed in the literature (see Figure 15.16), the optimization of the heat confinement is in fact based on two simple principles: (i) concentrating the volume where effective Joule heating takes place; and/or (ii) improving the thermal resistance to reduce the heat loss to the surroundings.
15.5.3.1 Concentrating the Volume of Joule Heating
The Joule heating volume can be confined by pushing the current through a small cross-section with a high current density. One obvious way to do this is to reduce the contact area of the heater contact with the PCM material, for example by minimizing the heater plug diameter (as in the small sub-litho contact heater cell [31]), by using only a conductive liner as the heater (as in the edge-contact cell [61]), or by filling the plug with an isolating dielectric material (as in the µ-trench cell [13, 62] and the ring-contact heater cell [63]). The main advantage of using only a conductive liner is that at least one dimension of the heater area is controlled by the liner thickness, and not by the lithography.
Another way of confining the Joule heating volume is by structuring the PCM material to a narrow cross-section (see the bottle-neck cell [64], the line heater cell [24], and the self-heating pillar cell [65]). Finally, a further improvement can be made by increasing the resistivity of the PCM material, for example by N or O doping [31, 32, 53]. In a different approach, a highly resistive TiON layer is formed between the TiN electrode and the PCM material; the Joule heating is then concentrated in this layer, resulting in a lower program power [66]. However, such an approach may negatively influence the contact resistance to the PCM.
15.5.3.2 Improving the Thermal Resistance
In the so-called confined cell structure [14], the PCM is deposited in a pore etched back into the heater. This not only concentrates the Joule heating region but also surrounds a
large part of this volume with a dielectric layer of reduced thermal conductivity. The drawback here is the topography which, ideally, would require a conformal deposition of the PCM. The alternative, based on the same principle, is to structure the PCM material, rather than the heater, leading to a plug or pillar [18].

Figure 15.16 Different PCM cell structures for reducing the program power.
Another way to improve the thermal resistance is to increase the PCM thickness (as this limits the heat flow to the top-electrode heat sink) [14]. However, this option must be traded off against the threshold voltage required for the electronic switching during SET programming. One of the benefits of the horizontal line cell [24] is also the separation of the heated zone from the metal heater contact, together with a capping of thermally insulating dielectric.
Apart from changing the cell structure, it is clear that a correct material selection (the use of different dielectric materials, and especially of porous dielectric materials) can also improve the thermal heat confinement [53]. It should be noted, however, that the improved thermal isolation must not prevent the rapid quenching from the melt during RESET, as otherwise the device cannot be programmed to the OFF state.
15.6
Reliability
15.6.1
Introduction
As for any other non-volatile memory technology, reliability is one of the major concerns. The main specific reliability issues of PCM are: (i) data retention of the RESET state, affected by the (limited) stability of the amorphous phase; (ii) endurance, limited by the occurrence of stuck-at-RESET (open) or stuck-at-SET (short) type defects; and (iii) program and read disturbs, that is, the stability of the amorphous phase under repeated, though limited, thermal cycling caused by reading or programming neighboring cells.
While this section is based on reliability tests at the cell level (probing the intrinsic reliability), for large memories the effect of reliability tests (e.g., of a temperature bake or of a large number of SET/RESET programming cycles) on the resistance distributions should also be evaluated, to screen out possible extrinsic failures. However, until now only a few (preliminary) results on array reliability statistics have been reported [15, 67].
15.6.2
Retention for PCM: Thermal Stability
The most important requirement for a non-volatile memory is the ability to retain the stored information for a long time, the typical specification being 10 years (at a minimum of 85 °C). As the SET state is stable from a thermodynamic point of view, it poses no problem of data retention. The RESET state, corresponding to the amorphous phase, is instead metastable and may crystallize following a dynamics that depends heavily on temperature. The retention of the amorphous state is, therefore, critical.
The retention performance of PCM technology is addressed by performing accelerated measurements at high temperatures. Figure 15.17 shows the typical failure time under isothermal conditions and with no applied bias, measured at several temperatures ranging from 150 to 200 °C [15]. The failure time is defined as the time required by a fully amorphized cell to lose the stored information, the resistance value at failure being defined as the geometric average of the SET and RESET resistances. The data clearly show an Arrhenius behavior
with an activation energy of 2.6 eV, which extrapolates to a data retention capability of 10 years at 110 °C.
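The extrapolation works as a standard Arrhenius scaling of the failure time, t_fail(T) ∝ exp(Ea/kT). In the Python sketch below, Ea = 2.6 eV is the activation energy quoted above, while the anchor point (a fail time of 100 s at 200 °C) is an assumed illustrative value chosen so that the extrapolation reproduces the quoted 10 years at 110 °C.

import math

K_B = 8.617e-5                       # Boltzmann constant [eV/K]
E_A = 2.6                            # activation energy [eV], as quoted above
T_REF_K, T_FAIL_REF = 473.15, 100.0  # assumed anchor: 100 s to fail at 200 C

def fail_time_s(t_celsius):
    t_k = t_celsius + 273.15
    return T_FAIL_REF * math.exp((E_A / K_B) * (1.0 / t_k - 1.0 / T_REF_K))

for t_c in (180, 150, 110, 85):
    years = fail_time_s(t_c) / (3600 * 24 * 365)
    print(f"T = {t_c:3d} C -> extrapolated fail time ~ {years:.2g} years")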
A wide range of activation energies has been reported. In general, quite high activation energies have been obtained, from Ea ≈ 2.9 eV for recent fully integrated cells [38] up to 3.5 eV for the intrinsic material characterization of GST [17]. Even if such high activation energies are favorable for long retention, the physics underlying these experimental values is still not completely understood. The details of the material and integration processes apparently have a large effect, as the activation energy was found to depend on the presence of capping layers (from a low 2.4 eV for uncapped GST to 2.7 eV for ZnS-SiO2-capped GST [68]). Furthermore, material doping increases the activation energy (up to 4.4 eV has been reported for O-doped GST [32]).
In principle, the crystallization process during the accelerated retention tests should be described by the same theoretical models for crystal nucleation and growth as used to account for crystallization at much higher temperatures (but on much shorter time scales) during the SET programming pulse (see Section 15.4). Although the time windows vary over many orders of magnitude (<100 ns during SET, but >10²–10³ s during retention tests), it has been shown [69] that, remarkably, the models are indeed able to describe accurately the crystallization processes under both conditions. Moreover, the crystallization statistics also significantly impact the data retention measurements in high-temperature accelerated tests.
15.6.3
Cycling and Failure Modes
PCM cells have been shown to have an intrinsically long programming endurance, that is, up to 10¹² SET–RESET program cycles [13, 17] (Figure 15.18), which is much superior to that of Flash technology. In fact, the cell endurance has been shown to depend heavily on the interface quality of the heater-GST system and on the possible interdiffusion between the GST and adjacent materials. A non-optimized fabrication technology results in devices showing the so-called first-fire effect [47], namely a
higher initial programming pulse required for the first cycle of virgin devices. The same devices usually feature poor characteristics in terms of stability during cycling characterization, usually ending in a physical separation of the chalcogenide alloy from the heater (stuck-at-RESET).
A second failure mode has also been observed (usually called short-mode failure or stuck-at-SET), where the devices remain permanently in the highly conductive condition. This phenomenon requires an auxiliary physical mechanism that either prevents the phase-change transition of the GST or creates a conductive parallel path that shunts the cell electrodes. Both cases require a chemical modification of the chalcogenide alloy, suggesting that the interdiffusion of chemical species from adjacent materials plays a role. A careful definition of the materials belonging to the device active region is mandatory in order to achieve a good reliability performance.
Finally, the current density, as well as the high temperatures (>600 °C) reached in the active region during programming, must be considered as accelerating factors for the previously proposed failure mechanisms (i.e., poor-quality interfaces and contamination). This explains the strong decrease in endurance as a function of the overcurrent or, equivalently, of the energy per pulse in excess of that required for switching [17]. The data in Figure 15.19 show that the endurance at a constant programming current (700 µA) scales inversely with the reset pulse width [70]. The longer the pulse,
the larger the energy released per pulse and the faster the interface degradation, while the overall energy released up to the bit failure remains almost constant.
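This constant-energy-to-failure picture can be written down in one line: if each pulse at fixed current releases an energy Epulse ≈ P·tpulse, the endurance is N ≈ Etotal/Epulse and thus scales inversely with the pulse width. In the Python sketch below, both the total energy to failure and the dissipated power are assumed placeholder values, chosen only to land in the endurance range quoted above.

E_TOTAL = 7.0             # assumed total energy to interface failure [J]
P_PULSE = 0.7e-3 * 1.0    # ~700 uA at an assumed ~1 V across the cell [W]

for t_pulse_ns in (50, 100, 500, 1000):
    e_pulse = P_PULSE * t_pulse_ns * 1e-9   # energy released per RESET pulse [J]
    print(f"pulse = {t_pulse_ns:4d} ns -> endurance ~ {E_TOTAL / e_pulse:.1e} cycles")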
15.6.4
Read and Program Disturbs
For most memory technologies, one important concern is the ability of the cell to retain data in the face of spurious voltage transients caused by reading and programming in the memory matrix. For PCM cells, such disturbs are not directly induced by the voltage pulses but rather by thermal spikes that can trigger the crystallization of the metastable amorphous state.

A first failure mechanism can be caused by multiple read accesses to a cell: the small current flowing through the device can induce a localized heating able to accelerate the spontaneous amorphous-to-crystalline transition. In a second failure mechanism, repeated programming operations on a cell can induce an unwanted heating of the adjacent bits (thermal cross-talk), which may then lose the stored data.
Read disturb tests have been described in the literature [15, 71], indicating that the repetitive reading of a cell in the high-resistance state with a current below 1 µA allows for a 10-year bit preservation. Such a current is one order of magnitude larger than the current flowing through the cell under standard reading conditions, thereby confirming the robustness of PCM to read disturbs even in continuous-reading, worst-case conditions. Results from program disturb tests on demonstrators have shown that cross-talk is not an issue down to the 90-nm technology node [15]. Thermal simulations, furthermore, confirm the program disturb immunity up to at least the 45-nm technology node [18].
15.7
Scaling of Phase-Change Memories
A new memory technology, to be competitive with the existing Flash, must feature a small cell size combined with an excellent scalability beyond the 45-nm technology node. In this section, the scaling potential of the PCM memory is addressed. The main aspects are: (i) the scaling of the thermal profiles, in order to avoid thermal disturbs at shrinking cell separations; (ii) the scaling of the program (RESET) current (and voltage), not only to reduce the program energy per cell, but also because it affects the cell size through the dimensions of the select transistor; and (iii) the conservation of the basic material characteristics down to very small dimensions. These aspects will have a crucial impact on the scalability perspectives of PCM technology, and are still to be verified.
15.7.1
Temperature Profile Distributions
Since the amorphous RESET state can become unstable at much lower temperatures (<200 °C), it is crucial for the correct operation of a PCM memory that the heating remains very localized and does not affect neighboring cells. Whilst this has been proven for current technologies down to 90 nm (see Section 15.6), it may be less evident how to keep the high temperatures localized in much further scaled technologies, beyond the 45-nm node.
As far as all the linear dimensions are reduced isotropically, the temperature distribution profile does indeed scale. This property transpires from the heat equation:

−k ∇²T = g = ρJ²    (15.1)

where k is the thermal conductivity and g is the heat generation per unit volume. The latter is proportional, via the electrical resistivity, ρ, to the square of the current density, J.
Let us now assume that a new device is fabricated by shrinking all the linear dimensions by a factor a, but keeping the same boundary conditions (e.g., T(0) = T0) and material properties (e.g., k, ρ). In the new device it is:

−k ∇′²T′ = g′ = ρJ′²

that is, in the original (unscaled) coordinates:

−k ∇²T′ = ρJ′²/a²    (15.2)

Provided that J′ = aJ, Equations 15.1 and 15.2 coincide, leading to the same temperature profile, but on a spatial scale uniformly compressed by a factor a.
It follows that, if two cells in the original device at a distance d do not suffer from cross-talk, then in the isotropically scaled device two cells at a distance d/a will be immune to thermal disturbs. The argument holds as far as J′ = aJ, which will be demonstrated in the following section.
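The scaling argument can be checked on a one-dimensional toy version of Equation 15.1, −k T″ = ρJ², with the boundaries held at T0; its analytic peak overheating is ΔT = ρJ²L²/(8k), so shrinking the length by a while raising the current density to J′ = aJ leaves ΔT unchanged. The Python sketch below verifies this; the material numbers are arbitrary placeholders.

RHO = 1e-5   # assumed electrical resistivity [ohm m]
K = 0.5      # assumed thermal conductivity [W/(m K)]

def peak_overheating(length_m, j_a_per_m2):
    """Analytic peak dT of -k T'' = rho J^2 on [0, L] with T(0) = T(L) = T0."""
    return RHO * j_a_per_m2**2 * length_m**2 / (8 * K)

L0, J0 = 100e-9, 5e10   # reference device: 100 nm, with current density J0
for a in (1, 2, 4):
    dT = peak_overheating(L0 / a, a * J0)   # shrink lengths, boost J by a
    print(f"scale factor a = {a} -> peak overheating = {dT:.1f} K")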
In a more aggressive scaling scheme, only the contact area of the cell is scaled down, without changing the thicknesses of the different layers very much. In such an anisotropic approach, which aims at a more drastic reduction of the programming current, cross-talk immunity is no longer granted. However, simulation results show that, without any specific care or special materials, thermal disturbs are not expected to slow down cell scaling until the 45-nm node [18].
15.7.2
Scaling of the Dissipated Power and Reset Current
The highest power and programming current are required during the RESET operation, when the PCM material must be locally heated above the melting point.
By using simplified models, the RESET power can be calculated as follows. Even if the programming current pulses last only some tens of nanoseconds, the temperature rise ΔT of the hot spot, where the PCM material eventually melts, may be computed at steady state. This assumption holds true since the thermal transients in such a small region are characterized by a nanosecond time constant, much faster than the typical current pulse width.

At steady state, the power dissipated by Joule heating (PJ) balances the heat loss (PHL = ΔT/RTH), where RTH is the thermal resistance towards the thermal sinks at room temperature (i.e., the top and bottom metal layers). On the other hand, the power PJ is proportional to the square of the current, I², via the electrical resistance: PJ = R·I². Using this simple model, the temperature rise ΔTM needed to reach the melting temperature can be written as ΔTM = PJ,M·RTH = RTH·R·IM², where IM is the melting current. It follows:
IM² = ΔTM/(RTH·R)    (15.3)
In the frame of the isotropic scaling rule,4) as the technology scales the cell surface area decreases as F², but the distances to the heat sinks also decrease as F. The thermal resistance will therefore increase linearly with the scaling factor: RTH ∝ a or, equivalently, RTH ∝ F⁻¹. As with the thermal resistance, the electrical resistance of the PCM cell also increases linearly with geometry scaling (R = ρ·length/area ∝ F⁻¹). It follows from Equation 15.3 that the melting current, and the programming current which is proportional to it, scale as F. The smaller the feature size, the smaller the programming current: Ireset ∝ F (or Ireset ∝ 1/a). Note that the current density, J = I/A, will scale as a, as assumed above in deriving Equation 15.2 while discussing the immunity to thermal disturbs.
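Equation 15.3, combined with the isotropic scaling of R and RTH, gives this linear reduction of the melting current directly, as in the Python sketch below; the reference resistance values at the 90-nm node are assumed placeholders, not measured data.

import math

DT_MELT = 600.0                  # required temperature rise to melting [K]
R0, RTH0, F0 = 3e3, 2e6, 90.0    # assumed R [ohm] and R_TH [K/W] at F0 = 90 nm

def melting_current_uA(f_nm):
    a = F0 / f_nm                 # shrink factor relative to the reference node
    r, r_th = R0 * a, RTH0 * a    # both resistances grow as F^-1
    return math.sqrt(DT_MELT / (r_th * r)) * 1e6

for f in (90, 65, 45, 32):
    print(f"F = {f:2d} nm -> I_melt ~ {melting_current_uA(f):.0f} uA")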
Although this scaling result is independent of the adopted cell architecture, this does not mean that the cell architecture is not important. At a fixed technology node, the cell architecture should be optimized, by an accurate design of the geometry and by material engineering, to minimize the programming current and the dissipated power (examples of different cell optimizations were provided in Section 15.5).
A more aggressive reduction of the programming current may be obtained by scaling the contact area but not the other dimensions (e.g., the PCM thickness). This choice will mainly affect the thermal and electrical resistances (R and RTH scale faster, i.e., as F⁻² instead of F⁻¹). It follows that Ireset ∝ F² (or 1/a²). The scaling properties of the PCM cell are summarized in Table 15.3 [18]. The more aggressive scaling will, however, pose some manufacturing problems, as the aspect ratio of some cell features (e.g., the thickness-to-cell-size ratio of the PCM material) will increase. Moreover, as discussed above, at this point thermal disturbs may begin to come into play.
The scalability of the reset current has been addressed experimentally by measuring several test devices with different contact areas [17, 18]. An example of the resulting values is given in Figure 15.20 [18]. From this figure it is clear that the reset current follows the reduction of the contact area, and values as low as 50 µA have been achieved with complete device functionality. The reset current reduction shown in Figure 15.20 can be mainly ascribed to the increase of the heater thermal resistance, RTH, caused by the reduction of the contact area.
4) Scaling can be indicated either by a dependence on the technology feature size, F, or by a dependence on a scaling factor a, with a = 1/F.
Table 15.3 Scaling properties of the PCM cell [18].

Parameter                          | Isotropic scaling factor | Aggressive scaling factor
Heater contact area, Acell         | 1/a²                     | 1/a²
Vertical dimensions, d             | 1/a                      | 1
Electrical/thermal resistances, R  | a                        | a²
Power dissipation, Pcell           | 1/a                      | 1/a²
Current, I                         | 1/a                      | 1/a²
Voltage, Vcell                     | 1                        | 1
Current density, J                 | a                        | 1
Recently published data show that Ireset indeed scales between F and F² (Figure 15.21) [19]. A scaling behavior stronger than F² has indeed been reported by Cho et al. [72]; such a steep dependence, which is well beyond the F² theoretical limit, is a strong indication that other cell parameters related to the cell structure and/or the material characteristics have been changed.
15.7.3
Voltage Scaling
The required program voltage is determined by the threshold for the electronic switching of the amorphous phase (SET). This voltage value scales with the PCM thickness, but is also strongly material dependent; for example, the threshold field is reported to be larger for GST (225) material (threshold field 30–40 V µm⁻¹) than for the fast-growth doped SbTe materials (threshold field 14 V µm⁻¹) [24].
Figure 15.20 Reset current versus contact area (from Ref. [18]): experimental values (dots), together with the scaling trend line (dashed line).
In order to assess the scaling of the PCM cell size, let us consider a typical MOS-select transistor PCM cell layout (Figure 15.22). The cell size can be calculated as (W + 1F) × (4.5F), where W is the width of the access transistor. For a minimum-size device, W = 1F, the cell size would be 9 F². However, this is the ideal case, as in practice W/L > 1 is needed to obtain the drive current required to reset the cell through the access transistor. For W/L > 1, splitting the gate (the so-called dual-gate concept) is favorable to minimize the cell area [13], resulting in a cell area of 6F × (1/2·W + 1F).
Taking the Ireset values from the ITRS roadmap (which predicts, in agreement with the scaling laws derived in Section 15.7.2 above, a scaling of Ireset as a⁻¹·⁵) [73], as well as the predicted scaling of the drive current with the technology node, the minimum access transistor size and the corresponding PCM cell size can be predicted as a function of the technology node (Figure 15.23). From these values it transpires that, at the 45-nm node, a cell size reduction to <15 F² would become possible (with Ireset ≈ 100 µA and W ≈ 2.7F).
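The cell-size estimate can be reproduced with a few lines of Python: the access-transistor width follows from the reset current and an assumed drive current per unit gate width, and the area follows from the two layout formulas given above. The drive-current figure is an assumption for illustration only.

def cell_area_f2(i_reset_uA, drive_uA_per_um, f_nm):
    """PCM cell area in units of F^2 for a MOS-selected cell."""
    w_f = max(1.0, (i_reset_uA / drive_uA_per_um) / (f_nm * 1e-3))  # width in F
    if w_f <= 1.0:
        return (w_f + 1.0) * 4.5          # straight layout: (W + 1F) x 4.5F
    return 6.0 * (w_f / 2.0 + 1.0)        # dual-gate layout: 6F x (W/2 + 1F)

# assumed 45-nm node numbers: I_reset ~ 100 uA, NMOS drive ~ 800 uA per um of width
print(f"predicted cell size ~ {cell_area_f2(100, 800, 45):.1f} F^2")   # ~14 F^2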
Further room may be obtained by using alternative solutions for ultra-scaled MOS transistors, such as ultra-thin-body fully depleted SOI or double-gate (FinFET) devices. These devices are expected to have an improved driving current capability of up to 2 mA µm⁻¹ (forecast for 2010) [73]. Long-term solutions may also involve the adoption of vertical MOS structures, which take advantage of having only a single contact (and hence a reduced area on silicon) and an improved W (by about a factor of 3) with respect to planar solutions.
Figure 15.24 shows the projection of the PCM scaling trend. The PCM cell size will take full advantage of technology scaling, as no intrinsic limitations are expected to halt further scaling. On the other hand, both NOR and NAND scaling are expected to slow down due to scaling limitations (the high program drain voltage for NOR, and the electrostatic field coupling for NAND). As a consequence, the PCM cell size will reach the NOR cell size at about the 45-nm node, and may even reach the NAND cell size at about the 32-nm node. Moreover, the memory is ideally suited to store more than two
Figure 15.24 Scaling trends for NAND, NOR and PCM (with
bipolar select transistor). The phase-change memory technology
is expected to reach the same Flash-NAND size at the 32-nm
technology feature size. (Figure from Ref. [19]).
levels per cell. This approach may become a viable option to further reduce the cost
per bit of PCM devices, if reliable multi-level programming becomes feasible.
15.7.5
Scaling and Cell Performance: Figure of Merit for PCM
A low programming current can indeed be reached by tightly confining the current flow and the corresponding heat generation. However, this usually results in a high cell resistance, and there is an upper limit to the acceptable SET resistance value. The larger the resistance, the lower the cell current during the read operation, and the longer the time needed to charge the bit-line capacitance. In practice, for a read operation to occur within 50 ns, the SET resistance should be kept below 50 kΩ (corresponding to a minimum read current in the SET state of a few µA). This constraint leads to a trade-off with the programming current, and to the introduction of a new figure of merit, the product Rset·Imelt, to compare different devices [19]. Figure 15.25 shows Imelt versus Rset as derived from published results, where the constant Rset·Imelt lines are highlighted by the dashed lines. Different cell architectures and material systems correspond to different Rset·Imelt values. The data clearly show the potential of PCM cells to reach programming currents of a few tens of µA.
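The 50-kΩ bound follows from a simple RC estimate: the read time is set by charging the bit-line capacitance through the SET resistance. In the Python sketch below, the bit-line capacitance and the number of time constants required for settling are assumed illustrative values.

C_BL = 2e-13   # assumed bit-line capacitance [F] (0.2 pF)
N_TAU = 5      # assumed settling requirement: ~5 RC time constants per read

for r_set in (10e3, 50e3, 200e3):
    t_read_ns = N_TAU * r_set * C_BL * 1e9
    print(f"R_set = {r_set / 1e3:5.0f} kOhm -> read settling ~ {t_read_ns:.0f} ns")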
15.7.6
Physical Limits of Scaling
To date, very few data are available regarding size effects on PCM behavior. However, from optical memory measurements, the phase-change mechanisms appear to be scalable to at least 5 nm [74]. On the other hand, for both nanosized elements and films thinner than 20 nm, shifts in the crystallization behavior of GST (225) have
been observed [75]. Yet, it is unclear whether these are intrinsic size effects or whether they are related to the influence of the surrounding interfaces and capping layers.
15.8
Conclusions
During recent years, PCM has evolved from an interesting new concept into a viable memory technology, based on the use of improved, faster-switching materials (<100 ns) and on an improved understanding of the phase transformation and electronic switching processes in chalcogenide materials. PCM also shows excellent reliability properties, such as a good data retention (10 years at 110 °C) and a very high endurance (up to 10¹² cycles, compared to <10⁶ for Flash), while the optimization of the cell design has resulted in a drastic reduction of the program current (down to a few hundred µA). In addition, the integration technology has matured to a point where large demonstrator circuits (up to 256 Mb, in 100-nm technology) have already been built. Perhaps more importantly, the scaling potential of PCM has been assessed, and is expected to result in small cell sizes (in the range of 10 F² or smaller) for technologies of 45 nm and below, this being concomitant with a further reduction in program currents. For these reasons, PCM is expected eventually to replace the no-longer-scaling Flash technologies.
References
1 Kurata, H., Otsuga, K., Kotabe, A., Kajiyama, S., Osabe, T., Sasago, Y., Nerumi, S., Tosami, K., Kamohara, S. and Tsuchiya, O. (2006) The impact of random telegraph signals on the scaling of multilevel Flash memories. IEEE Symp. VLSI Circuits, Tech. Dig., pp. 140–141.
2 Shin, Y. (2005) Non-volatile memory technologies for beyond 2005. IEEE Symp. VLSI Circuits, Tech. Dig., pp. 156–159.
3 Dewald, J.F., Pearson, A.D., Northover, W.R. and Peck, W.F. (1962) Journal of the Electrochemical Society, 109, 243.
4 Ovshinsky, S.R. (1968) Reversible electrical switching phenomena in disordered structures. Physical Review Letters, 21, 1450–1453.
5 Ovshinsky, S.R. and Fritzsche, H. (1973) Amorphous semiconductors for switching, memory, and imaging applications. IEEE Transactions on Electron Devices, ED-20 (2), 91–105.
6 Neale, R., Nelson, D. and Moore, G. (1970) Nonvolatile and reprogrammable, the read-mostly memory is here. Electronics, 43, 56–60.
7 Maimon, J., Hunt, K., Rodgers, J., Burcin, L. and Knowles, K. Circuit demonstration of radiation hardened chalcogenide non-volatile memory. Proceedings of the Aerospace Conference, Vol. 5, pp. 5_2373–5_2379.
8 Yamada, N., Ohno, E., Nishiuchi, K., Akahira, N. and Takao, M. (1991) Rapid-phase transitions of GeTe-Sb2Te3 pseudobinary amorphous thin films for an optical disk memory. Journal of Applied Physics, 69, 2849–2857.
9 Kolobov, A.V., Fons, P., Frenkel, A.I., Ankudinov, A.L., Tominaga, J. and Uruga, T. (2004) Understanding the phase-change mechanism of rewritable optical media. Nature Materials, 3, 703–708.
10 Wicker, G. (1996) A comprehensive model of submicron chalcogenide switching devices, Ph.D. Dissertation, Wayne State University, Detroit, MI.
18 Pirovano, A., Lacaita, A.L., Benvenuti, A., Pellizzer, F., Hudgens, S. and Bez, R. (2003) Scaling analysis of phase-change memory technology. IEDM Technical Digest, 699–702.
19 Lacaita, A.L. Progress of phase-change non-volatile memory devices, presented at the European Phase Change Ovonic Science symposium (Joint E PCOS-IMST Workshop), Grenoble, France, May 29–31, 2006; http://www.epcos.org.
20 Lacaita, A.L., Redaelli, A., Ielmini, D., Pellizzer, F., Pirovano, A., Benvenuti, A. and Bez, R. (2004) Electrothermal and phase-change dynamics in chalcogenide-based memories, IEEE International Electron Devices Meeting Technical Digest, pp. 911–914.
21 Wuttig, M., Klein, M., Kalb, J., Lecner, D. and Spaepen, F. Ultrafast data storage with phase change media: from crystal structures to kinetics, presented at the 5th European Phase Change Ovonic Science symposium (Joint E PCOS-IMST Workshop), Grenoble, France, May 29–31, 2006; http://www.epcos.org.
22 Hudgens, S. and Johnson, B. (2004) Overview of phase-change chalcogenide nonvolatile memory technology. MRS Bulletin, 29, 829–832.
23 Libera, M. and Chen, M. (1990) Multilayered thin-film materials for phase-change erasable storage. MRS Bulletin, 15, 40–45.
24 Lankhorst, M.H.R., Ketelaars, B.W.S.M.M. and Wolters, R.A.M. (2005) Low-cost and nanoscale non-volatile memory concept for future silicon chips. Nature Materials, 4, 347–352.
25 Borg, H., Lankhorst, M., Meinders, E. and Leibbrandt, W. (2001) Phase-change media for high-density optical recording. Materials Research Society Symposium Proceedings, Materials Research Society, Vol. 674, V1.2.1–V1.2.10.
26 Miao, S.S., Shi, L.P., Zhao, R., Tan, P.K., Lim, K.G., Li, J.M. and Chong, T.C. Temperature dependence of phase change random access memory cell. Extended
52 Chen, Y., Chen, C.F., Chen, C.T., Yu, J.Y., Wu, S., Lung, S.L., Liu, R. and Lu, C. (2003) An access-transistor-free (0T/1R) non-volatile resistance random access memory (RRAM) using a novel threshold switching, self-rectifying chalcogenide device, IEEE International Electron Devices Meeting Technical Digest, pp. 905–908.
53 Kim, K., Jeong, G., Jeong, H. and Lee, S. Emerging memory technologies, Proceedings of the IEEE 2005 Custom Integrated Circuits Symposium, 18–21 September 2005, pp. 423–426.
54 Kim, Y.T. et al. (2004) (Samsung) Extended Abstracts of the 2004 International Conference on Solid State Devices and Materials, Tokyo, D-3-2, pp. 244–245.
55 Yi, H., Ha, Y.H., Park, J.H., Kuh, B.J., Horii, H., Kim, Y.T., Park, S.O., Hwang, Y.N., Lee, S.H., Ahn, S.J., Lee, S.Y., Hong, J.S., Lee, K.H., Lee, N.I., Kang, H.K., Chung, U. and Moon, J.T. (2003) Novel cell structure of PRAM with thin metal layer inserted GeSbTe, IEEE International Electron Devices Meeting Technical Digest, pp. 901–904.
56 Alberici, S.G., Zonca, R. and Pashmakov, B. (2004) Ti diffusion in chalcogenides: a ToF-SIMS depth profile characterization approach. Applied Surface Science, 231/232, 821–825.
57 Yoon, S.-M., Lee, N.-Y., Ryu, S.-O., Park, Y.-S., Lee, S.-Y., Choi, K.-J. and Yu, B.-G. (2005) Etching characteristics of Ge2Sb2Te5 using high-density helicon plasma for the nonvolatile phase-change memory applications. Japanese Journal of Applied Physics, 44, L869–L872.
58 Pellizzer, F., Spandre, A., Alba, S. and Pirovano, A. (2004) Analysis of plasma damage on phase change memory cells, 2004 IEEE International Conference on Integrated Circuit Design and Technology, p. 227.
59 Takaura, N., Terao, M., Kurotsuchi, K., Yamauchi, T., Tonomura, O., Hanaoka, Y., Takemura, R., Osada, K., Kawahara, T. and Matsuoka, H. (2003) A GeSbTe phase-change memory cell featuring a tungsten
60
61
62
63
64
j483
j 15 Phase-Change Memories
484
65
66
67
68
69
j485
16
Memory Devices Based on Mass Transport in Solid Electrolytes
Michael N. Kozicki and Maria Mitkova
16.1
Introduction
The term nanoionics [1] is applied when electrochemical effects occur in materials and devices with interfaces,
for example electrodes or electrochemically different material phases, that are closely
spaced, perhaps by a few tens of nanometers or less. In this size regime, the
functionality of ionic systems is quite different from the macro-scale versions, but in
a highly useful manner. For example, internal electric fields and ion mobilities are
relatively high in nanoionic structures and this, combined with the short length
scales, results in fast response times. In addition, whereas deposition electrochemistry and many batteries use liquids as ion transport media, nanoionics can take
advantage of the fact that a variety of solid materials are excellent electrolytes, largely
due to effects which dominate at the nanoscale. This allows nanoionic devices based
on solid electrolytes to be more readily fabricated using techniques common to the
integrated circuit industry, and also facilitates the marriage of such devices with
mainstream integrated electronics. Indeed, in-situ changes may be controlled by the
integrated electronics, leading to electronic-ionic system-on-chip (SoC) hybrids.
In this chapter, we describe the basic electrochemistry, materials science and
potential applications in information technology of mass-transport devices based on
solid electrolytes and nanoionic principles. The electrodeposition of even nanoscale
quantities of a noble metal such as silver can produce localized persistent but
reversible changes to macroscopic physical or chemical characteristics; such changes
can be used to control behavior in applications that go well beyond purely electronic
systems. Of course, electrical resistance will change radically when a low-resistivity
electrodeposit (e.g., in the tens of μΩ cm or lower) is deposited on a solid electrolyte
surface which has a resistivity many orders of magnitude higher. This resistance
change effect has a variety of applications in memory and logic. Here, emphasis will
be placed on low-energy, non-volatile memory devices which utilize such resistance
changes to store information.
16.2
Solid Electrolytes
16.2.1
Transport in Solid Electrolytes
The origins of solid-state electrochemistry can be traced back to Michael Faraday, who
performed the first electrochemical experiments with Ag2S and discovered that this
material was a good ion conductor [2]. Subsequently, greater emphasis was placed on
liquid electrolytes and their use in plating systems and battery cells, until the 1960s
and 1970s, when a significant rise in interest was noted in solid-state electrochemistry.
This renewed attention was spurred in part by the development of novel batteries
which had a particularly high power-to-weight ratio due to the use of solid electrolyte,
mainly beta-alumina, which is an excellent conductor of sodium ions [3]. Even though
these solid materials were clearly different from their liquid counterparts, many of
the well-known principles developed in the eld of liquid electrochemistry were
found to be applicable to the solid-state systems. One major difference between most
solid and liquid ion conductors is that in solids, the moving ions are of only one
polarity (cations or anions) and the opposite polarity species is fixed in the supporting
medium. This has a profound effect on the types of structure that can be used for
mass transport (this subject will be outlined later in the chapter). The solid electrolyte
family currently includes crystalline and amorphous inorganic solids, as well as
ionically conducting polymers. In general, the best solid electrolytes have high ionic
but low electronic conductivity, chemical and physical compatibility with the electrodes used, thermodynamic stability over a wide temperature range, and the ability
to be processed to form continuous, mechanically stable thin-film structures.
The mobile ions in a solid electrolyte sit in potential wells separated by low
potential barriers, typically in the order of a few tenths of an electron-volt (eV), or less.
The ions possess kinetic energy, governed by Boltzmann statistics, and so at finite
temperature will constantly try (with around 10¹² attempts per second) to leave their
low-energy sites to occupy energetically similar sites within the structure. Thermal
diffusion will result from this kinetic energy, driving ions down any existing
concentration gradient until a uniform concentration is achieved. Subsequent
movement of the ions produces no net flux in any particular direction. The application of an electric field to the electrolyte effectively reduces the height of the barriers
along the direction of the field, and this increases the probability that an ion will hop
from its current potential well to a lower energy site (see Figure 16.1) [4]. An ion
current therefore results, driven by the field. It should be noted that, unlike electrons,
ions in a solid are constrained to move through a confining network of narrow
channels. These pathways may be a natural consequence of order in the material, as
in the case of the interstitial channels present along certain directions in crystalline
materials, or they may be a result of long-range disorder, as in amorphous (glassy)
and/or nanoscopically porous materials. Glassy electrolytes, typically metal oxides,
sulfides or selenides, are of particular interest as they can contain a wider variety of
routes for cation transport than purely crystalline materials. This is a major reason for
the interest in these materials, and in Group VI glasses in particular, and why they
feature heavily in this chapter.
Considering the above dependence of ion hopping on ion energy and barrier
height, it should be no surprise that the expression for ion conductivity in the
electrolyte is

σ = σ₀ exp(−Ea/kT)    (16.1)

where σ₀ is a pre-exponential factor and Ea is the activation energy for ion transport. In terms of the mobile carriers, the conductivity may also be written

σ = ni q μ    (16.2)

where ni is the number of mobile ions per unit volume, q is the ionic charge
(1.6 × 10⁻¹⁹ C for singly charged ions), and μ is the ion mobility. Interestingly, it is
thought that ni in high ion concentration solid electrolytes such as the Ag-Ge-S
ternaries is fairly constant around 10¹⁹ ions cm⁻³ [6]. This means that the ion
mobilities in the solid electrolytes that are of interest to us are in the order of
10⁻²–10⁻⁴ cm² V⁻¹ s⁻¹.
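As a quick consistency check, the mobility implied by Equation 16.2 can be evaluated directly. The short Python sketch below is illustrative only, using the representative values quoted above:

    # Ion mobility implied by Equation 16.2: mu = sigma / (n_i * q).
    q = 1.602e-19                  # charge of a singly ionized Ag ion, C
    n_i = 1e19                     # mobile ion density, ions per cm^3

    for sigma in (1e-2, 1e-4):     # representative conductivity range, S/cm
        mu = sigma / (n_i * q)     # mobility, cm^2 V^-1 s^-1
        print(f"sigma = {sigma:.0e} S/cm -> mu = {mu:.1e} cm^2/(V s)")
    # Output spans ~6e-3 to ~6e-5 cm^2/(V s), i.e. the 10^-2 to 10^-4
    # order of magnitude quoted in the text.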
16.2.2
Major Inorganic Solid Electrolytes
As mentioned above, a variety of materials can act as solid electrolytes. Anion (oxide)
conductors exist, such as ZrO2, layered La2NiO4/La2CuO4 [7], or Bi10V4MeO26, where
Me is a divalent metal such as Co, Ni, Cu, or Zn [8]. However, for mass-transport
devices there is a greater interest in electrolytes that conduct metallic cations as these
can be used to form solid metal electrodeposits. In general, the smaller an ion is, the
more mobile it should be as it will be able to slip more easily through the pathways in
the solid electrolyte. This should be especially true for small-ionic radius elements
such as the alkali metals (Li, Na, K). For example, Na has been successfully used in
beta-alumina and, to a lesser extent, in non-stoichiometric zirconophosphosilicate [9]
to produce good solid ion conductors. The high conductivity in the beta-alumina
compounds is a consequence of the structure which has open conduction pathways
and a large number of partially occupied sites where cations can reside. Of course, Li
conductors in general are of great interest because of their use in high-voltage/high-power density lithium ion batteries, but highly stable Li electrolytes are not easy to
produce and there are not many examples of lithium solid electrolyte batteries (Li/LiI/I2
is one of the few commercially available cells). Of course, the high chemical
reactivity of these mobile elements makes them unsuitable for most mass transport/memory device applications.
In binary silver chalcogenides such as Ag2S, ion transport is accompanied by electronic conduction:

σ = σAg+ + σe−    (16.3)

Both charge carriers (Ag⁺ and e⁻) contribute to the total conductivity, so that the
material may be regarded as a mixed conductor [11]. Analogous effects are expected to
occur in Cu-containing electrolytes (e.g., Cu2S), although the situation will be more
complex as participation of the electrons from the Cu d-orbital will result in a variety
of bonding configurations. Extensive information on transport in superionic conductors is provided in a review [12]. The main issue with these binary materials is that,
whereas they have been widely studied as superionic conductors, it is only their high-temperature phases that are of use in this respect, and this leads to severe practical
limitations for electronic device applications.
16.2.3
Chalcogenide Glasses as Electrolytes
Chalcogenide glasses exhibit a wide range of useful physical and chemical
characteristics and as such have found a multitude of uses. They lend themselves to a
variety of processing techniques, including physical vapor deposition (evaporation
and sputtering), chemical vapor deposition, spin casting, as well as melt-quenching.
Stable binary glasses typically involve a Group IVB or Group VB atom, such as Ge–Se
or As–S, with a wide range of atomic ratios possible. The bandgap of the Group VIB
glasses rises from around 1–3 eV for the tellurides, selenides and sulfides, to 5–10 eV
for the oxides. The tellurides exhibit the most metallic character in their bonding, and
are the weakest glasses as they can crystallize very readily (hence their use in so-called phase-change technologies as used in re-writable CDs and DVDs), and the
others exhibit an increasing glass transition temperature on moving further up the
Periodic Table column, with oxides having the highest thermal stability. The non-oxide glasses usually are more rigid than organic polymers but more flexible than a
typical oxide glass, and other physical properties follow the same trend. The structural
flexibility of these materials offers the possibility of the formation of voids through
which the ions can readily move from one equilibrium position to another and, as will
be seen later, allows the formation of electrodeposits within the electrolyte.
The addition of Group IB elements such as Ag or Cu transforms the chalcogenide
glass into an electrolyte as these metals form mobile cations within the material. The
ions are associated with the non-bridging chalcogen atoms, but the bonds formed are
relatively long (0.27 nm in Ag-Ge-Se and 0.25 nm in Ag-Ge-S ternaries [14]). As with
any coulombic attraction, the coulombic energy is proportional to the inverse of the
cation-anion distance, so long bonds lead to reduced attractive forces between the
charged species. The Ge-chalcogenide glasses are therefore among the electrolytes
with the lowest coulombic energies [14]. The slightly shorter Ag–S bond length
leads to a higher coulombic attraction, which is a factor contributing to the observed
lower mobility of Ag in germanium sulfides versus selenides of the same stoichiometry. Thermal vibrations will allow partial dissociation, which results in a two-step
process of defect formation followed by ion migration. The activation energy for this
process depends heavily on the distance between the hopping cation and the anion
located at the next nearest neighbor, as well as the height of the intervening barrier.
(A discussion of the relationship between coulombic and activation energies is provided
in Ref. [14] but, in addition to having low coulombic energies, the Ge-chalcogenides
also have relatively low activation energies for ion transport.) In this respect, the
existence of channels due to the structure of the electrolyte is critical in the ion
transport process. As an example of this effect, the Ag conductivity in glassy AgAsS2
is 100-fold larger than that in the crystalline counterpart due to the more open
structure of the non-crystalline material [15].
The conductivity and activation energy for ion conduction of the ternary glasses are a
strong function of the mobile ion concentration. For example, in the Ag concentration range between 0.01 and 3 atom%, the room temperature conductivity of Ag-Ge-S
glass changes from 10⁻¹⁴ to about 5 × 10⁻¹⁰ Ω⁻¹ cm⁻¹, accompanied by a decline in
activation energy from 0.9 to 0.65 eV. However, above a small atomic percent, both
conductivity and activation energy change more rapidly as a function of Ag concentration (see Figures 16.4a and b, respectively [16]). This change in the slopes of the
conductivity and activation energy curves with Ag content in both Ag-Ge-S and
Ag-Ge-Se glasses signals a transformation in the material itself.
The transformation that occurs in ternary electrolytes above a small atomic percent of
metal is by no means subtle. Indeed, the material undergoes considerable changes
in its nanostructure that have a profound effect on its macroscopic characteristics.
These changes are a result of phase separation caused by the reaction of silver with the
available chalcogen in the host to form Ag2Se in Ag-Ge-Se and Ag2S in Ag-Ge-S
ternaries. For example, if it is assumed that the Ag has a mean coordination of 3, the
composition of ternary Ag-Ge-Se glasses may be represented as

(GexSe1−x)1−yAgy → (3y/2)Ag2Se + (1 − 3y/2)GetSe1−t    (16.4)

where t = x(1 − y)/(1 − 3y/2) is the amount of Ge in the Ge-Se backbone [17]. For a Se-rich glass such as Ge0.30Se0.70, x = 0.30 and y = 0.333 at saturation in bulk glass;
hence, t = 0.40. This means that the material consists of Ag2Se and Ge0.40Se0.60
(Ge2Se3) in the combination

16.7 Ag2Se + 10 Ge2Se3 → Ag0.33Ge0.20Se0.47    (16.5)
This electrolyte has a Ag2Se molar fraction of 0.63 (16.7/26.7) and a Ag concentration of 33 atom% [18]. It has been determined that the dissolution of Ag into a
Ge-Se glass produces dispersed Ag-rich crystallites, whose average edge-to-edge spacing, s, may be estimated from

s = dc[(1/Fv)^(1/3) − 1]    (16.6)
where dc is the average measured diameter of the crystalline Ag-rich phase and Fv is
the volume fraction of this phase [20]. The volume fraction in the case of Ag2Se in
Ag0.33Ge0.20Se0.47 is 0.57 (for a molar fraction of 0.63), so the average spacing
between the Ag-rich regions is approximately 0.2 times their diameter. The average
diameter of the Ag2Se crystallites in Ag-diffused Ge0.30Se0.70 thin films was determined, using X-ray diffraction (XRD) techniques, to be 7.5 nm [20], which means that
by Equation 16.6, they should be separated by approximately 1.5 nm of glassy Ge-rich
material. This general structure has been confirmed using high-resolution transmission electron microscopy (TEM). XRD analysis was also performed on a sulfide-based ternary thin film with similar stoichiometry, Ag0.31Ge0.21S0.48, with much the
same results. In this case, the Ag2S crystallites are in the order of 6.0 nm in
diameter [21]. Even though the detected Ag-rich phases mainly correspond to the
room-temperature polymorphs, which are not particularly good ion conductors at
room temperature, the ternary is superionic at room temperature. This is not
surprising as defects, interfaces, and surfaces play a considerable role in ion transport
and the large surface-to-volume ratio of the crystallites within the ternary is likely to
greatly enhance ion transport. In addition, it has been noted that the Ag2Se phases
that form following the solid-state diffusion of Ag into Ge-Se may be distorted by
the effective pressure of the medium to produce high ion mobility phases [19]. The
nano-inhomogeneous ternary is ideal for devices such as resistance change memory
cells, as the relatively high resistivity leads to a high off-resistance in small diameter
devices, although the availability of mobile ions via the dispersed Ag-rich phases
means that the effective ion mobility is high.
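The arithmetic behind Equations 16.4 to 16.6 is easy to verify. The Python sketch below is illustrative; it uses the compositions and XRD crystallite diameter quoted above, and the spacing expression is the reconstructed form of Equation 16.6:

    # Phase separation of Ag-saturated Ge0.30Se0.70 (Equations 16.4 and 16.5).
    x, y = 0.30, 0.333                    # Ge fraction; Ag fraction at saturation

    t = x * (1 - y) / (1 - 3 * y / 2)     # Ge fraction of the residual backbone
    print(f"backbone: Ge{t:.2f}Se{1 - t:.2f}")         # -> Ge0.40Se0.60

    print(f"Ag2Se molar fraction: {16.7 / 26.7:.2f}")  # -> 0.63 (Eq. 16.5)

    # Average spacing of the Ag-rich crystallites (Equation 16.6, taken
    # here to be s = d_c * ((1/F_v)**(1/3) - 1), which reproduces the
    # quoted numbers).
    F_v, d_c = 0.57, 7.5                  # volume fraction; diameter, nm
    s = d_c * ((1 / F_v) ** (1 / 3) - 1)
    print(f"spacing: {s:.1f} nm (~{s / d_c:.2f} x diameter)")  # -> ~1.5 nm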
The addition of Ag (or Cu) to the chalcogenide base glass can be achieved by
diffusing the mobile metal from a thin surface film via photodissolution. This process
utilizes light energy greater than the optical gap of the chalcogenide glass to create
charged defects near the interface between the reacted and unreacted chalcogenide
layers [22]. The holes created are trapped by the metal, while the electrons move into
the chalcogenide film. The electric field formed by the negatively charged chalcogen
atoms and positively charged metal ions is sufficient to allow the ions to overcome the
energy barrier at the interface, and so the metal moves into the chalcogenide [23].
Prior to introduction of the metal, the glass consists of GeS4 (GeSe4) tetrahedra and,
in the case of chalcogen-rich material, S (Se) chains. The introduced metal will readily
react with the chain chalcogen and some of the tetrahedral material to form the
ternary. This Ag–chalcogen reaction, which essentially nucleates on the chalcogen-rich regions within the base glass, results in the nanoscale phase-separated ternary
described above [24, 25].
16.3
Electrochemistry and Mass Transport
16.3.1
Electrochemical Cells for Mass Transport
In order to move mass, it is clear that an ion current must be generated. Regardless of
ion mobility in the electrolyte, a sustainable ion current will only flow if there is a
source of ions at one point and a sink of ions at another; otherwise, the movement of
ions away from their oppositely charged fixed partners would create an internal field
(polarization) which would prevent current flow. The process of electrodeposition, in
which cations in the electrolyte are reduced by electrons from a negative electrode
(cathode), is essentially an ion sink as ions are removed from the electrolyte to
become atoms. However, in the absence of an ion source, the reduction of ions at the
cathode will occur at the expense of the electrolyte. The concentration of ions in the
solid electrolyte will therefore decrease during electrodeposition until the electrode
potential equals the applied potential and reduction will cease. Further reduction
requires greater applied voltage (governed by the Nernst equation), so that the
deposition process is effectively self-limiting for a moderate applied potential. It
should be noted also that a depleted electrolyte could allow the subsequent thermal
dissolution of an electrodeposit, which would not occur if the glass was maintained at
the chemical saturation point. This has important consequences for the stability of
any electrodeposit formed. It is therefore necessary to have an oxidizable positive
electrode (anode), one which can supply ions into the electrolyte to maintain ion
concentration and overall charge neutrality. In the case of a silver ion-containing
electrolyte, this oxidizable anode is merely silver or a compound or alloy containing
free silver. So, the most basic mass-transport device consists of a solid electrolyte
between an electron-supplying cathode and an oxidizable anode (see Figure 16.5).
These devices can have both electrodes in a coplanar configuration (as in Figure 16.5a),
or on opposite faces of the electrolyte (Figure 16.5b).
In such a device, the anode will oxidize when a bias is applied if the oxidation
potential of the metal is greater than that of the solution. Under steady-state
conditions, as current flows in the cell, the metal ions will be reduced at the cathode.
Anode: Ag → Ag⁺ + e⁻    (16.7)

Cathode: Ag⁺ + e⁻ → Ag    (16.8)
with the electrons being supplied by the external power source. The deposition of Ag
metal at the cathode and partial dissolution of the Ag at the anode indicates that device
operation is analogous to the reduction-oxidation electrolysis of metal from an
aqueous solution and much the same rules apply, except that in this case the anions
are fixed. When a bias is applied across the electrodes, silver ions migrate by the
coordinated hopping mechanism (as described above) towards the cathode, under
the driving force of the applied field and the concentration gradient. At the boundary
layer between the electrolyte and the electrodes, a potential difference exists due to the
transfer of charge and change of state associated with the electrode reactions. This
potential difference leads to polarization in the region close to the phase boundary,
known as the double layer [26]. The inner part of the double layer, consisting of ions
adsorbed on the electrode, is referred to as a Helmholtz layer, while the outer part,
which extends into the electrolyte and is known to have a steep concentration gradient
(over a few tens of nanometers in these systems), is called the diffuse layer. Electrically,
the double layer very much resembles a charged capacitor, with a capacitance in the
order of 10⁻¹⁴ F μm⁻² and resistance around 10¹⁰ Ω μm² for a typical solid electrolyte
under small applied bias [27]. An important consequence of the electric double layer is
that, for the reduction-oxidation reaction to proceed, the applied potential must
overcome the potential associated with the double layer. This means that no ion
current will flow and no sustained electrodeposition will occur until the concentration
overpotential is exceeded. Below this threshold voltage, the small observed steady-state
current is essentially electron leakage by tunneling through the narrow double layer.
Above the threshold, the ion current flows and the ions are reduced and join the
cathode, effectively becoming part of its structure, both mechanically and electrically.
The nature of the electrodeposits will be discussed in greater detail later in the chapter.
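Returning to the capacitor analogy: because one figure is quoted per unit area and the other as a resistance-area product, the device area cancels and the double layer has a characteristic RC time constant independent of electrode size. A minimal Python sketch using the values quoted above:

    # Double-layer RC product from the quoted per-unit-area values.
    C_area = 1e-14           # capacitance per unit area, F/um^2
    R_area = 1e10            # resistance-area product, ohm um^2

    tau = C_area * R_area    # the um^2 factors cancel, leaving seconds
    print(f"tau ~ {tau:.0e} s")    # -> ~1e-4 s under small applied bias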
The intrinsic threshold is typically in the order of a few hundred millivolts. As the
overpotential is governed by the ease of transfer of electrons from the cathode to the
ions in the electrolyte, its precise value depends on factors such as the barrier height
between the cathode material (including surface/interface states) and the electrolyte,
and the bandgap/dielectric constant of the electrolyte. For example, the threshold
voltage of a Ni/Ag-Ge-Se/Ag structure is in the order of 0.18 V, whereas the threshold
of W/Ag-Ge-Se/Ag is around 0.25 V. Switching to a larger bandgap material, the
threshold of a W/Ag-Ge-S/Ag device becomes closer to 0.45 V. The threshold has an
Arrhenius dependence on temperature, with an activation energy that is in the order
of a few tenths of an electron volt or less, which means that for the W/Ag-Ge-S/Ag
structure, the threshold is still 0.25 V even at an operating temperature of 135 °C.
Once a silver electrodeposit has formed on the cathode, the Ag metal becomes the
new cathode and the threshold for further deposition of Ag is much less, typically less
than half the original threshold. This reduced threshold for electrodeposition
underlies the refresh behavior of previously written devices discussed in Section 16.4.3.
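The two thresholds quoted for the W/Ag-Ge-S/Ag structure also allow a rough activation energy to be extracted. The Python sketch below assumes the simple form Vth(T) = V0 exp(Ea/kT); this functional form is our assumption, as the text states only that the dependence is Arrhenius:

    # Activation energy from the W/Ag-Ge-S/Ag thresholds, assuming
    # V_th(T) = V0 * exp(E_a / kT) (assumed functional form).
    from math import log

    k = 8.617e-5                     # Boltzmann constant, eV/K
    T1, V1 = 300.0, 0.45             # room-temperature threshold, V
    T2, V2 = 273.15 + 135.0, 0.25    # threshold at 135 C, V

    E_a = k * log(V1 / V2) / (1.0 / T1 - 1.0 / T2)
    print(f"E_a ~ {E_a:.2f} eV")     # -> ~0.06 eV, under a few tenths of an eV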
16.3.2
Electrodeposit Morphology
It is clear that the reduction of the ions results in the formation of neutral metal
atoms. However, what is not so obvious is the form that the electrodeposits take, as
the process depends on a number of factors and involves not only the basic principles
of electrochemistry but also transport phenomena, surface science, and metallurgy [28–30]. In this section, some of the more important issues of electrodeposition
and deposit morphology with the ternary solid electrolytes will be considered. This
process is best illustrated in structures which support electrodeposit growth between
coplanar electrodes on an electrolyte layer [31], although growth through an electrolyte film will also be considered.
In the most general case, the process of deposit formation starts with the
nucleation of the new metal atom phase on the cathode, and the deposits develop
with a structure that generally follows a Volmer-Weber 3-D island growth mechanism [32]. The addition of new atoms to the growing deposit occurs due to a diffusion-limited aggregation (DLA) mechanism [33, 34]. In this growth process, an immobile
seed is fixed on a plane in which particles are randomly moving around. Those
particles that move close enough to the seed in order to be attracted to it attach and
form the aggregate. When the aggregate consists of multiple particles, growth
proceeds outwards and with greater speed as the new deposits extend to capture
more moving particles. Thus, the branches of the core clusters grow faster than the
interior regions. The precise morphology of these elongated features depends on
parameters such as the potential difference and the concentration of ions in the
electrolyte [35]. At low ion concentrations and low fields, the deposition process is
determined by the (non-directional) diffusion of metal ions in the electrolyte and the
resulting pattern is fractal in nature; that is, it exhibits the same structure at all
magnifications. For high ion concentrations and high fields, conditions common in
the solid electrolyte devices, the moving ions have a stronger directional component,
and dendrite formation occurs. Dendrites have a branched nature but tend to be more
ordered than fractal aggregates and grow in a preferred axis that is largely defined by
the electric field. An example of dendritic growth is shown in Figure 16.6 for a Ag
electrodeposit on a Ag-saturated Ge0.30Se0.70 electrolyte between Ag electrodes.
Figure 16.6a is an optical micrograph of the electrodeposit, showing its dendritic
character; while Figure 16.6b is an electron micrograph of a similar deposit, showing
the extreme roughness of its surface at the nanoscale.
The above model for electrodeposit evolution assumes a homogeneous electrolyte.
However, since electrodeposit growth is obviously related to the presence of available
Ag ions in the electrolyte surface, the content and consistency of the electrolyte will
have a profound effect on electrodeposit morphology. In the case of an electrolyte based
on a Ge0.30Se0.70 glass, the growth of low (about 20 nm high) continuous dendritic
deposits is observed on the surface of the films (see Figure 16.7a). In the case of the Ge-rich glasses (Ge0.40Se0.60), the growth of isolated, tall (>100 nm) electrodeposits can
be seen (Figure 16.7b) [36]. The Ge0.30Se0.70 material has the higher chalcogen
content of the two, and therefore will possess greater and more uniform quantities of
ion-supplying Ag2Se following the addition of Ag. This leads to dendritic growth that
is closer to that expected with a homogeneous material. The isolated growth on the
Ge0.40Se0.60 electrolyte is a direct consequence of the greater degree of separation of
the Ag-containing phases.
The alternative device conguration has the electrodes on opposite sides of a thin
electrolyte lm, so that the growth of the electrodeposit is forced to occur through
rather than on the electrolyte. Even though the capture and reduction of ions will
essentially be by the same mechanism, it is unlikely that growth inside an electrolyte
film will follow the same type of evolution as surface electrodeposition. At this point
in time, although our understanding of the exact mechanism of growth within these
electrolytes is incomplete, it is clear that the role of the nanoscale morphology should
be considered. The confining nature of the medium, with its somewhat flexible
channels and voids, will distort the shape of the electrodeposit, and its nano-inhomogeneity (as discussed above) will have a profound effect on local potential
and ion supply. The net result is that the electrodeposit will not necessarily appear to
be fractal or dendritic in nature, instead taking a form that is governed by the shape of
the glassy voids and crystalline regions in the electrolyte. An example of such
distorted morphology is shown in Figure 16.8; this is an electron micrograph of
an electrodeposit within a 60 nm-thick Ag0.33Ge0.20Se0.47 electrolyte between a
tungsten bottom electrode and a silver top electrode. This was captured by overwriting a large (5 × 5 μm) device to produce multiple internal electrodeposits and then
using a focused ion beam (FIB) system to ion mill a hole through the electrolyte [37].
The filament appears to be around 20 nm across, but this is misleading as the feature
continues to grow through ion reduction by the electron beam.
As a final comment on morphology, reversing the bias dissolves the electrodeposit
as it becomes the oxidizable element in the electrochemical cell. Macroscopically, this
appears to be the reverse of the growth process, with the electrodeposit dissolving
backwards from its tip (or tips in the case of a more two-dimensional dendrite). On
closer inspection, the deposit actually dissolves near the tip region into a string of
metal islands which then disappear into the electrolyte. This is a consequence of the
uneven nature of the electrodeposit, created in part by the nano-morphology of the
electrolyte, which allows some regions (perhaps associated with grain boundaries in
the metal) to dissolve slightly faster than others.
16.3.3
Growth Rate
As in any deposition process, the growth rate of the electrodeposit, V, will depend on
the ion flux per unit area, F, which corresponds to the current density, J, and the
atomic density of the material being deposited, N, by

V = F/N = J/(qN)    (16.9)

with the ion current density driven by the applied field according to

J = σE    (16.10)
where E is the electric field. In large devices, the field will be relatively low, as will the
current density. For example, in a 100 μm-long lateral (coplanar electrode) structure
with 10 V applied, the field is 10³ V cm⁻¹. Ag-saturated ternary electrolytes such as
Ag-Ge-Se have a conductivity around 10⁻² S cm⁻¹, and so the ion current density for
this field will be 10 A cm⁻². Dividing current density by the charge on each ion
(1.60 × 10⁻¹⁹ C for Ag⁺) gives the ion flux density, in this case 6.25 × 10¹⁹ ions cm⁻² s⁻¹.
Using Equation 16.9 above with the atomic density of Ag (5.86 × 10²² atoms cm⁻³)
gives a growth rate of approximately 10⁻³ cm s⁻¹, or 10 μm s⁻¹. This is a gross
simplification, as the complex morphology of the electrodeposits and the moving
boundary condition of the advancing electrodeposit will complicate the deposition
process. However, this is the approximate average velocity that is measured in a real
device for the above conditions.
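The worked example above maps directly onto Equations 16.9 and 16.10; a minimal Python sketch repeating the arithmetic:

    # Macroscale electrodeposit growth rate, V = J/(q*N) (Eqs. 16.9/16.10).
    q = 1.602e-19          # charge per Ag+ ion, C
    N = 5.86e22            # atomic density of Ag, atoms/cm^3

    E = 10.0 / 100e-4      # 10 V across a 100 um gap -> 1e3 V/cm
    sigma = 1e-2           # Ag-saturated ternary electrolyte conductivity, S/cm
    J = sigma * E          # ion current density, A/cm^2 (-> 10)

    V = (J / q) / N        # growth rate, cm/s
    print(f"J = {J:.0f} A/cm^2, V = {V:.1e} cm/s")   # -> ~1e-3 cm/s, ~10 um/s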
The electrodeposit growth rates are much more difficult to model in devices that
have a thin electrolyte layer sandwiched between two electrodes, and at this point any
knowledge of the nano-morphology of the material must be invoked. The average
fields in this case range from 10⁵ to 10⁶ V cm⁻¹, for applied voltages of a few hundred
millivolts to a few volts across an electrolyte that is a few tens of nanometers thick.
Local fields may be higher still, as most of the applied bias will be dropped across the
high-resistance glassy areas between the lower resistivity, metal-rich nanoclusters.
Taking a field of 10⁶ V cm⁻¹ with the conductivity given above suggests that the
growth rate will be in the order of 1 cm s⁻¹ or 10 nm μs⁻¹. This is much slower than
the rate suggested by measured switching speeds observed in actual devices (as will
be shown in the next section), so the simple approach that was appropriate for
macroscale devices apparently fails at the nanoscale. This may be due to a number of
factors. For example, at fields of 10⁶ V cm⁻¹ or more, the linear conduction equation
no longer holds [38] and the mobility will be higher than in the macroscopic case. It is
also likely that ni is larger than the previously assumed 10¹⁹ cm⁻³, as the overall silver
concentration can be as high as 10²² cm⁻³ in Ag-saturated chalcogenide glass
electrolytes, and more of this is likely to be mobile due to barrier lowering in
materials subjected to such high fields. The overall effect is that the current densities
could be sufficiently high to make the electrodeposit growth rates several orders of
magnitude higher in nanoscale devices.
16.3.4
Charge, Mass, Volume, and Resistance
The atomic weight of Ag is 107.9 g mol⁻¹, and the metal has a bulk density of
10.5 g cm⁻³. This means that each atom weighs approximately 1.79 × 10⁻²² g (107.9
divided by Avogadro's number, 6.022 × 10²³ atoms per mol) and this, of course, is the
smallest amount of mass that can be transferred using silver as the mobile ion. Each
cm³ of Ag contains 5.86 × 10²² atoms, but a more useful unit in these nanoscale
systems is the nm³; such a volume contains 58.6 atoms on average, and will weigh
approximately 10⁻²⁰ g. If this is the electrodeposited mass, then each Ag ion
requires one electron from the external circuit to become reduced to form the
deposited atom. So, each nm³ of Ag will require 58.6 times the charge on each
electron (1.60 × 10⁻¹⁹ C), which is 9.37 × 10⁻¹⁸ C of Faradaic charge (we can also use
Faraday's constant, 9.65 × 10⁴ C mol⁻¹, to perform this calculation). This charge is
merely the integral of the current over time, and so a constant current of 1 μA would
supply sufficient charge in 1 μs to deposit 10⁵ nm³, or around 1 fg (10⁻¹⁵ g), of Ag. So,
in this mass-transfer scheme, current and time are the control parameters for the
amount of mass deposited.
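This bookkeeping is easily reproduced; the following Python sketch simply restates the arithmetic of this paragraph:

    # Faradaic charge and mass for Ag electrodeposition.
    N_A = 6.022e23          # Avogadro's number, atoms/mol
    M_Ag = 107.9            # atomic weight of Ag, g/mol
    q = 1.602e-19           # elementary charge, C

    m_atom = M_Ag / N_A                # ~1.79e-22 g per atom
    atoms_per_nm3 = 5.86e22 * 1e-21    # ~58.6 atoms in 1 nm^3 of bulk Ag
    Q_per_nm3 = atoms_per_nm3 * q      # ~9.4e-18 C per nm^3

    Q = 1e-6 * 1e-6                    # 1 uA for 1 us -> 1 pC
    volume = Q / Q_per_nm3             # ~1e5 nm^3
    mass = volume * atoms_per_nm3 * m_atom
    print(f"{volume:.1e} nm^3, {mass:.1e} g")   # -> ~1e5 nm^3, ~1e-15 g (1 fg)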
The amount of charge supplied will also determine the volume of the electrodeposit. The increase in metal volume at a point on the surface of the electrolyte (or
decrease in volume at the anode) could be useful in a variety of microelectromechanical applications, but it is the electrical resistance of this volume that is perhaps of
most interest. The resistance, R, of an electrodeposit is given by

R = L/(σm A)    (16.11)

where σm is the conductivity of the metal, and L and A are its length and cross-sectional area, respectively (volume is L × A). If the electrodeposited material is silver,
σm will range from a value close to 5 × 10⁵ S cm⁻¹ (a resistivity slightly higher than
that of bulk silver) for features with thickness and width that are greater than a few tens of
nanometers, down to considerably lower conductivities for sub-10 nm features where surface scattering
will play a considerable role. For a silver electrodeposit that is 20 μm long, 2 μm wide,
and 20 nm thick (not unlike the example shown in Figure 16.7a), the resistance is
about 10 Ω. This volume would take a total of 7.50 × 10⁻⁹ C to form. Note that if the
underlying Ag-Ge-Se electrolyte (with conductivity 10⁻² S cm⁻¹) was 20 μm long,
20 μm wide, and 50 nm thick, its resistance would be 2 × 10⁷ Ω (or closer to 4 × 10⁸ Ω
for Ag-Ge-S), which is many orders of magnitude higher than that of the
electrodeposit. Hence, the overall resistance of the newly formed structure is
dominated by the electrodeposit. Figure 16.9 shows, graphically, the measured
on-state data from a 50 nm-thick silver-doped arsenic disulfide electrolyte on a thick
oxide layer on silicon substrates, patterned into channels with large silver contacts
(100 × 100 μm) at the ends [39]. The off-resistance, Roff, is a geometric function of
the channel dimensions, following Roff = L/(σdW) + Rc, where σ is the conductivity
of the electrolyte layer (in the 10⁻³ S cm⁻¹ range), d is its thickness, and W and L are
width and length, respectively. Rc is the contact resistance (at zero channel length),
mainly due to electrode polarization and tunneling at the measurement voltage
through the polarization barrier. Rc is in the range of 10⁸ to low 10⁹ Ω for the electrode
configuration used. A 10 × 10 μm (W × L) device therefore exhibits an Roff around
1.5 GΩ. The figure shows the results from a number of 10 μm-wide devices
programmed using a 5-s voltage sweep from 0.5 to 1.8 V with a 25 mA current limit.
This produces a substantial surface electrodeposit with a resistance of around
1 Ω μm⁻¹ of device length. The average electrodeposit contact resistance in this case
is around 9 Ω.
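Equation 16.11 and the off-resistance expression above can be checked against the quoted geometries; a short Python sketch using the dimensions and conductivities given in the text:

    # Electrodeposit versus electrolyte resistance (Equation 16.11).
    def resistance(sigma, length_cm, area_cm2):
        """R = L / (sigma * A), with sigma in S/cm."""
        return length_cm / (sigma * area_cm2)

    # Silver deposit: 20 um long, 2 um wide, 20 nm thick.
    R_dep = resistance(5e5, 20e-4, (2e-4) * (20e-7))
    # Ag-Ge-Se electrolyte film: 20 um long, 20 um wide, 50 nm thick.
    R_el = resistance(1e-2, 20e-4, (20e-4) * (50e-7))
    print(f"deposit ~{R_dep:.0f} ohm, electrolyte ~{R_el:.0e} ohm")
    # -> ~10 ohm versus ~2e7 ohm: the on state is dominated by the deposit.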
The resistance of the electrodeposit that forms within the nanostructured
electrolyte is also determined by its volume, but in this case the influence of the
different phases present at the nanoscale must also be considered. As discussed
above, electrodeposition in its early stages is likely to occur on the metal-rich
clusters, through the glassy high-resistance regions between them. This means that
the initial connection through the electrolyte will essentially consist of metallic
bridges between the relatively low-resistivity clusters. In the case of a link that is
dominated by the conductivity of the clusters rather than that of the metal, an on-resistance in the order of 20 kΩ in a 50 nm-thick Ag-Ge-Se electrolyte would
require a conducting region less than 10 nm in diameter (assuming that the
conductivity of the Ag2Se material is close to the bulk value of 10³ S cm⁻¹ [40]). In
the case where the electrodeposit dominates the pathway, that is, when the
electrodeposited metal volume is greater than that of the superionic crystallites in
the pathway, the electrodeposit resistivity will determine the on-resistance. In this
case, a 10 nm-diameter pathway will have a resistance in the order of 100 Ω. This
means that the diameter of the conducting pathway will not exceed 10 nm for
typical programming conditions, which require on-state resistances in the order of
a few kΩ to a few tens of kΩ. The small size of the conducting pathway in
comparison to the device area explains why on-resistance has been observed to be
independent of device diameter, whereas off-resistance increases with decreasing
area [41]. An electrodeposit this small means that the entire device can be shrunk
to nanoscale dimensions without compromising its operating characteristics. This
has been demonstrated by the fabrication of nanoscale devices as small as 20 nm
that behave much like their larger counterparts [20, 42]. The other benefit of
forming a small-volume electrodeposit is that it takes little charge to do so; in an
extreme case, if half the volume of a sub-10 nm-diameter, 50 nm-long conducting
region was pure Ag, only a few fC of Faradaic charge would be required to form a
low-resistance pathway.
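Inverting Equation 16.11 gives the pathway diameter consistent with a target on-resistance. The Python sketch below assumes, as in the text, a 50 nm-thick electrolyte and a link limited by the Ag2Se cluster conductivity:

    # Pathway diameter for a target on-resistance through a thin electrolyte.
    from math import pi, sqrt

    def diameter_nm(R_on, sigma, length_cm=50e-7):
        area = length_cm / (sigma * R_on)    # required cross-section, cm^2
        return 2 * sqrt(area / pi) * 1e7     # convert cm to nm

    # 20 kohm link limited by the Ag2Se cluster conductivity (~1e3 S/cm):
    print(f"{diameter_nm(20e3, 1e3):.1f} nm")   # -> ~5.6 nm, i.e. under 10 nm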
As discussed above, reversing the applied bias reverses the electrodeposition
process so that the electrodeposit itself becomes the oxidizable anode and is thereby
dissolved. The amount of charge necessary to do this is essentially the same as that
required to grow the link in the first place. However, it is not just the oxidation of the
electrodeposited metal that is responsible for breaking of the link, especially in the
case of a connection between the electrodes that is mostly metal rather than a chain
of metallic and Ag-rich electrolytic clusters. The very narrow (and uneven) link is
16.4
Memory Devices
16.4.1
Device Layout and Operation
As shown in Figure 16.5, the basic elements of a resistance change device (the solid
electrolyte, the oxidizable electrode, and the inert electrode) may be configured
either laterally or vertically. Whereas lateral devices may have utility in a variety of
applications (e.g., microelectromechanical systems; MEMS), it is the vertical configuration that is of most interest in the context of memory devices. Vertical structures
occupy the smallest possible area, which is critical for high-density memory arrays. In
addition, the distance that the electrodeposit must bridge in order to switch a vertical
device to its low-resistance state, a key factor in determining switching speed, is
defined by the electrolyte thickness rather than by a lithographically defined gap. As
the film thickness can typically be made much smaller than a lateral gap using
conventional manufacturing technology, vertical structures switch faster than their
lateral counterparts.
A schematic representation of how vertical solid electrolyte memory devices may
be integrated in a complementary metal oxide-semiconductor (CMOS) circuit is
shown in Figure 16.10. In this case, the inert electrodes are the tungsten plugs that are
normally used to connect one layer of interconnect metal to another. The solid
electrolyte layer is placed on top of these individual tungsten electrodes, and a
common oxidizable electrode (or a bilayer of oxidizable metal and another electrode
material) caps the device structures. The individual devices are defined by each
tungsten plug. It should be noted that the storage elements are built in the
interconnect layers above the silicon devices in a back-end-of-line (BEOL) process,
which means that the CMOS fabrication scheme need not be changed. A further
advantage is that only one extra mask is required to define which tungsten plugs are
covered with the device stack, and which are through-connections to the upper layers
of the interconnect. This helps to reduce the cost of integration and also facilitates
embedding the memory with standard logic. In order to obtain the maximum
performance from the devices, each storage cell is connected through the underlying
interconnect to a select transistor in a one transistor-one resistor (1T1R) cell array
(Figure 16.11a). In this scheme, the transistor is used to select the cell, and an
appropriate programming voltage is then applied across the device. Passive arrays, in
which sneak current paths through the cells are avoided using diode elements in the
array itself rather than transistors, are also possible using row and column electrodes
with device structures at their intersections (Figure 16.11b). This latter approach does
not allow high-speed operation but does lead to the densest array possible as there are
no transistors to enlarge the total cell area.
The programming of the solid electrolyte memory devices is relatively straightforward. A forward bias (oxidizable electrode positive with respect to the inert electrode)
in excess of the threshold required to initiate electrodeposition is used to write the
device. A negative bias is used to erase the device. Reading the state of the device
involves the application of a bias that will not disturb or destroy the current state.
This typically means that the devices are read using a forward bias that is below the
minimum required to write under normal operating conditions. This is shown
schematically in Figure 16.12, which shows a current-voltage plot of a solid electrolyte
memory device. Only leakage current flows in the off state, but when the conducting
pathway forms at the write voltage (Vwrite), the current quickly rises to the programming current limit (Iprog). It should be noted that the electrodeposition continues after
switching, albeit more slowly than the initial transition, until the voltage across the
device reaches the minimum threshold for electrodeposition. The lower on-state
resistance is preserved until the erase initiation voltage is reached, at which point the
conducting pathway breaks and the device resistance goes high. Further negative bias
is required to fully remove the electrodeposited material and return the device to its
original off state. The device is read using a positive voltage below Vwrite and the
current measured to determine the state. Note that in Figure 16.12, Vread has been
chosen to be between the minimum voltage for electrodeposition and Vwrite
(the consequences of this are discussed in Section 16.4.3). The following section
provides results from a variety of fabricated devices.
16.4.2
Device Examples
To date, electrochemical switches have been fabricated using thin Cu2S [43] and
Ag2S [44] binary chalcogenide electrolytes. The Cu2S devices have been demonstrated in small memory arrays [45] and reconfigurable logic [46], and although the
applications show promise, there is room for improvement in device performance
factors such as retention and endurance with this particular electrolyte. The studies
on Ag2S devices have concentrated on switching by the deposition and removal of
small numbers of silver atoms in a nanoscale gap between electrodes. This is of major
significance as it demonstrates that the electrochemical switching technique has the
potential to be scaled to single atom dimensions. Various oxide-based devices have
also been demonstrated [47, 48], and these show great promise as easily integrated
elements. However, the lower ion mobility in these materials tends to make the
devices slower than their chalcogenide counterparts. Devices based on ternary
chalcogenide electrolytes, including Ag-Ge-Se, Ag-Ge-S, and Cu-Ge-S have been the
most successful to date, with the silver-doped variants having been applied in
sophisticated high-density memory arrays [49] and post-CMOS logic devices [50].
Ag-Ge-Te devices have also been explored [51], but these materials have a tendency to
crystallize at low temperatures and so may not be the best choice for devices that must
be integrated with CMOS using elevated processing temperatures.
To illustrate the operation of devices based on ternary electrolytes, the discussion
will be confined here to those utilizing Ag-Ge-S and Ag-Ge-Se materials. Of these, the
Ag-Ge-S electrolyte is the most compatible with BEOL processing in CMOS fabrication, as it can withstand thermal steps in excess of 400 °C without any degradation of
device characteristics. The sulfides possess better thermal stability as there is less
change in the nanostructure at elevated temperature than in the case of selenide
electrolytes [21, 52]. A typical device operation is shown in Figure 16.13a and b, which
provide current-voltage and resistance-voltage plots, respectively, for a 240 nm-diameter W/Ag-Ge-S/Ag device with a 60 nm-thick electrolyte [53]. The voltage sweep
runs from −1.0 to +1.0 and back to −1.0 V, and the current limit is 10 μA. As mentioned above,
the write threshold for this material combination is 0.45 V, at which voltage the device
The switching speed of a 500 nm-diameter W/Ag-Ge-Se/Ag device with a 50 nm-thick electrolyte is illustrated in Figure 16.15, which shows both measured and
simulated device results for write (Figure 16.15a) and erase (Figure 16.15b) operations [55]. For the write, a 150-ns pulse of 600 mV was applied to the device, and the
output of the transimpedance measurement amplifier shows that the device initially
switches in less than 20 ns, while the resistance continues to fall more slowly,
ultimately reaching an on-resistance of 1.7 kΩ at the end of the write pulse. For the
erase, a 150-ns pulse of −800 mV was applied, whereupon the output of the
transimpedance measurement amplifier shows that the device transitions to a
high-resistance state (the start of the voltage decay in the output signal) in around
20 ns. The electrodeposit is essentially metal that has been added to a chemically
saturated electrolyte, and this local supersaturation leads to high stability of the
electrodeposit and excellent device retention characteristics. The results of a retention assessment test on a 2.5 μm-diameter W/Ag-Ge-S/Ag device with a 60 nm-thick
electrolyte, annealed at 300 °C and programmed using a 0 to 1.0 V sweep, are shown
in Figure 16.16. The plot shows the off and on resistances measured using a 200 mV
read voltage. The off state was in excess of 10¹¹ Ω (above the limit of the measurement
instrument) and remained undisturbed by the read voltage at this level for the
duration of the test. The on resistance remained below 30 kΩ during the test.
Following this, the device was erased using a 0 to −1.0 V sweep, and the off-state
resistance measured using a 200 mV sensing voltage as before. The device remained
above 10¹⁰ Ω beyond 10⁵ s, demonstrating that the erased state was also stable. Other
studies have shown that both on and off states are also stable at elevated temperature,
with a margin of several orders of magnitude being maintained even after 10 years at
70 °C [42]. Figure 16.17 provides an example of cycling for a 75-nm Ag-Ge-Se
electrolyte device [20]. Trains of positive (write) pulses of 1.2 V in magnitude and
1.6 μs duration, followed by −1.3 V negative (erase) pulses of 8.7 μs duration, were
used to cycle the devices. A 10 kΩ series resistor was used to limit current flow in the
on state. The results are shown in the 10⁹ and 10¹¹ cycle ranges. The data in Figure 16.17
show that there might be a slight decrease in on current, but this is gradual enough to
allow the devices to be taken well beyond 10¹¹ write-erase cycles (if this decrease is
maintained, there will only be a 20% decrease in on current at 10¹⁶ cycles).
Figure 16.17: Current in the on (upper plot) and off (lower plot) state at various
numbers of cycles for a 75-nm Ag-Ge-Se device. The device was cycled using trains
of positive (write) pulses of 1.2 V in magnitude and 1.6 μs duration, followed by
−1.3 V negative (erase) pulses of 8.7 μs duration. The solid line is a logarithmic fit
to the on current data. (From Ref. [20].)

As mentioned above in this section, oxide-based electrolytes may also be used in
memory devices. Of these, Cu-WO3 [47] and Cu-SiO2 [48] are of particular interest as
they utilize materials that are already in common use in the semiconductor industry,
namely Cu and W for metallization and SiO2 as a dielectric, and this will help to
reduce the costs of integration. In general, the switching characteristics for both
systems are very similar to those observed in metal-doped chalcogenide glasses, and
that is why the same switching mechanism is assumed for the oxide-based cells, even
though the material nanostructure is quite different from that found in the ternary
chalcogenide electrolytes. For example, in the case of Cu-WO3, the Cu must exist
within oxide in unbound form for successful device operation [47]. For Cu-SiO2, the
best results are attained via the use of porous oxide, formed by physical vapor
deposition, into which the metallic copper is introduced by thermal diffusion so that
it exists in free form in the nano-voids in the base glass. In the case of W-(Cu/SiO2)-Cu
devices with a 12 nm-thick electrolyte, both unipolar (positive voltage for both
write and erase) as well as bipolar switching has been observed [48]. Unipolar
switching requires high programming currents (several hundred μA to several mA)
to thermally break the electrodeposited copper connection in forward bias. Bipolar
switching with a resistance ratio of 10³ is achieved with switching voltages below 1 V
and currents down to the sub-μA range. Highly stable retention characteristics
beyond 10⁵ s and switching speeds in the microsecond regime have been demonstrated, and the possibility of multi-bit storage exists due to the relationship between
on-state resistance and programming current. These results, combined with the initial
endurance testing, which showed that more than 10⁷ cycles were possible with these
structures, indicate that this technology shows promise as a low-cost, low-energy Flash
memory replacement technology.
16.4.3
Technological Challenges and Future Directions
The above results indicate that memory devices based on electrodeposition in solid
electrolytes show great promise. However, although several substantial development
efforts are under way, many questions remain unanswered with regards to the
physics and long-term operation of this technology. The most pressing issues relate to
the reliability of such devices. In any memory technology, the storage array is only as
good as its weakest cell. Reduced endurance (cycling between written and erased
states), poor retention, and stuck bits plague even the most mature memory
technologies. It may be many years before the issues concerning the solid electrolyte
approach are fully understood, but considerable optimism exists regarding reliability
which may set this technology apart from others. For example, many technologies
suffer from reduced endurance due to changes in the material system with time. In
this respect, solid electrolyte devices can exhibit diminishing off-to-on-resistance
ratio with cycle number if incorrect programming (overwriting and/or incomplete
erase) leads to a build-up of electrodeposited metal within the device structure. The
convergence of the off and on states eventually leads to an inability to discriminate
between them. However, it is possible electrically to reset the solid electrolyte using
an extended or hard erase; this will then plate the excess material back on to the
oxidizable electrode and return the electrolyte to its original composition. This ability
to change material properties using electrical signals allows such corrections to be
performed in the field, and this may have a profound effect on device reliability.
Another issue that can occur in written devices is the upward drift in programmed on-state resistance with time at elevated temperature. This is thought to be due to
thermal diffusion of the electrodeposited metal, but it may also be a consequence of
electromigration during repeated read operations. However, a read voltage that lies
between the write voltage and the minimum voltage for electrodeposition will
essentially repair or refresh a high-resistance/open on state. To illustrate this, the
device characteristics shown in Figure 16.13 are revisited. If a read voltage between
0.22 and 0.45 V is used, an off/erased device will not be written, but a device that has
been previously programmed will actually have its on state strengthened. This auto-refresh above the minimum threshold for electrodeposition is unique to electrochemical devices. It should be noted that although this effect is extremely useful, it
can also lead to problems in incorrectly erased devices (those which are open circuit
but still have electrodeposited material on the cathode), as these can also be written at
read voltages. Clearly, under-erasing must be avoided in order to maintain high
device reliability.
Attention is now turned to the scaling of solid electrolyte memory devices. This
involves two points of consideration: physical scaling and electrical scaling. Physical
scaling of the types of device described in the previous section has already been
demonstrated to below 22 nm, with good operational characteristics [42]. In addition,
studies on the bridging of nanometer-sized gaps between a solid electrolyte and a top
electrode seem to suggest that atomic-scale electrodeposits could be used to change the
resistance of the device, and this may represent the ultimate scaling of the technology [44]. What is not known is how the high-performance phase-separated chalcogenide electrolytes will scale, as these contain crystallites that approach 10 nm in diameter.
Clearly, further investigations are required in this area, although some are already
under way. The other aspect of scaling is electrical scalability. For example, the supply
voltage for highly scaled systems around the 22 nm node of the ITRS is on the order of
0.4–0.6 V. This means that, in order to avoid the use of area-, speed-, and energy-sapping
charge pumps, the memory cells must be able to operate at the very low voltages at which
solid electrolyte devices can function. In addition, the critical current for 22 nm
interconnect is only a few tens of μA, and the devices must also be able to operate at these
current levels, which, once again, is achievable by solid electrolyte devices.
The nal consideration for the future relates to memory density in the Tb (1012
bits)/chip regime. Such high storage densities will eventually be required for highend consumer and business electronics to replace mechanical hard drives in smallform factor, portable systems. If it is assumed that a 20 20 mm2 chip has an
extremely compact periphery such that most of the area is storage array, and a
compact cell at 4 F2 (where F is the half-pitch), then Tb storage would require F to be
10 nm at most. Such small wires cannot be produced using standard semiconductor
fabrication technologies without signicant variations, and their current carrying
capacity is very small. Backing-off to F 22 nm means that multi-level cell (MLC)
storage more than one bit per physical storage cell will be necessary to achieve Tb
storage. The ability in solid electrolyte devices to control the on-resistance using the
programming current allows multiple resistance levels to be stored in each cell. For
References
example, four discrete resistance levels leads to 2 bits of information in each cell (00,
01, 10, 11). Such MLC storage has already been demonstrated in a solid electrolyte
memory array that was integrated with CMOS circuitry [56]. Given the combination of the
above characteristics, namely demonstrated physical scalability together with low-voltage, low-current/low-power and MLC operation, it would appear that solid electrolyte memory devices
are a strong contender for future solid-state memory and storage.
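The density arithmetic above is easily checked. The short Python sketch below uses only the numbers quoted in the text (a 20 × 20 mm² chip that is mostly storage array, a 4F² cell, 10¹² bits); it is an illustration, not a design calculation.

    # Back-of-envelope check of the Tb-per-chip arithmetic quoted above.
    chip_area_nm2 = (20e6) ** 2          # 20 mm = 20e6 nm, so 4e14 nm^2
    bits = 1e12                          # Tb-scale storage
    cell_area_nm2 = chip_area_nm2 / bits # area available per cell
    F_nm = (cell_area_nm2 / 4) ** 0.5    # cell area = 4 F^2  =>  F = sqrt(A/4)

    print(f"available cell area: {cell_area_nm2:.0f} nm^2")  # 400 nm^2
    print(f"required half-pitch F: {F_nm:.0f} nm")           # 10 nm

    # Backing off to F = 22 nm: how many bits per cell does Tb storage need?
    F2 = 22.0
    cells = chip_area_nm2 / (4 * F2 ** 2)
    print(f"bits per cell needed at F = 22 nm: {bits / cells:.1f}")  # ~4.8

The last figure makes the text's point directly: at F = 22 nm, single-bit cells fall well short of Tb capacity, so multi-level storage is unavoidable.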
16.5
Conclusions
References
1 Maier, J. (2005) Nature Materials, 4, 805.
2 Faraday, M. (1838) Philosophical
Transactions of the Royal Society of London.
3 Kummer, J.T. and Weber, N. (1966) US
Patent 3,458,356.
4 Kirby, P.L. (1950) British Journal of Applied
Physics, 1, 193.
40 Miyatani, S.-Y. (1960) Journal of the Physical
Society of Japan, 15, 1586.
41 Symanczyk, R., Balakrishnan, M.,
Gopalan, C., Grüning, U., Happ, T.,
Kozicki, M., Kund, M., Mikolajick, T.,
Mitkova, M., Park, M., Pinnow, C.,
Robertson, J. and Ufert, K. (November
2003) Proceedings of the 2003 Non-Volatile Memory Technology Symposium,
San Diego, California, p. 17-1.
42 Kund, M., Beitel, G., Pinnow, C., Rohr, T.,
Schumann, J., Symanczyk, R., Ufert, K.
and Müller, G. (2005) IEDM Technical
Digest, 31.5.
43 Sakamoto, T., Sunamura, H., Kawaura, H.,
Hasegawa, T., Nakayama, T. and Aono, M.
(2003) Applied Physics Letters, 82, 3032.
44 Terabe, K., Hasegawa, T., Nakayama, T. and
Aono, M. (2005) Nature, 433, 47.
45 Kaeriyama, S., Sakamoto, T., Sunamura,
H., Mizuno, M., Kawaura, H., Hasegawa,
T., Terabe, K., Nakayama, T. and Aono, M.
(2005) IEEE Journal of Solid State Circuits,
40, 168.
46 Sakamoto, T., Banno, N., Iguchi, N.,
Kawaura, H., Kaeriyama, S., Mizuno, M.,
Terabe, K., Hasegawa, T. and Aono, M.
(2005) IEDM Technical Digest, 19.5.
47 Kozicki, M.N., Gopalan, C., Balakrishnan,
M. and Mitkova, M. (2006) IEEE
Transactions on Nanotechnology, 5, 535.
48 Schindler, C., Thermadam, S.C.P., Waser,
R. and Kozicki, M.N. (2007) IEEE Transactions on
Electron Devices, 54, 2762.
I
Logic Devices and Concepts
1
Non-Conventional Complementary Metal-Oxide-Semiconductor
(CMOS) Devices
Lothar Risch
1.1
Nano-Size CMOS and Challenges
The scaling of complementary metal-oxide-semiconductor (CMOS) is key to following Moore's law for higher integration densities, faster switching times, and reduced
power consumption at reduced costs. In today's research laboratories MOSFETs with
minimum gate lengths below 15 nm have already been demonstrated. An example of
such a small transistor is shown in Figure 1.1a, where the transmission electron
microscopy (TEM) cross-section shows a functional, fully depleted silicon-on-insulator (SOI) transistor with 14 nm gate length and 20 nm spacers, using a 17 nm thin
silicon layer and a 1.5-nm gate dielectric. The gate has been defined with electron-beam (e-beam) lithography. For the contacts, elevated source/drain regions were
grown with selective Si epitaxy to lower the parasitic resistance, and a high dose of
dopants was implanted into the epi layer for source and drain. In Figure 1.1b, a TEM
cross-section through the fin of a SONOS memory FinFET is shown with a diameter
of 8 nm, surrounded by the ONO charge-trapping dielectric. As can be seen, many
critical features in Si-MOSFETs are already in the range of 1 to 20 nm.
However, achieving the desired performance gain in electrical parameters from
scaling will in time become very challenging, as indicated by the many 'red brick walls' in the International Technology Roadmap for Semiconductors (ITRS) [1] (see Figure 1.2).
The three main limiting factors for a performance increase are related to physical
laws. Gate leakage stops SiO2 scaling (see Figure 1.3), while source/drain leakage
reduction needs higher channel doping and shallower junctions. However, this
increases junction capacitance, junction leakage and gate-induced drain leakage, reduces carrier mobility, and increases parasitic resistance. Even so, transistors
with astoundingly small gate lengths down to 5 nm have been realized [2]; although
these are the smallest MOSFETs produced to date, their performance is worse than
that of a 20-nm device.
When considering memories, the situation is not much different, and for
mainstream DRAM and Floating Gate Flash several constraints can be foreseen.
For DRAM, the storage capacitance at small cell size and a low-leakage cell transistor
become critical issues. For Floating Gate, the high drain voltages and scaling of the
gate dielectric, as well as coupling to neighboring cells, are critical.
Therefore, on the way to better devices, two strategies are proposed by ITRS. The
first strategy is to implement new materials as performance boosters. Among these
are high-k dielectrics and metal gates, high-mobility channels, and low-resistivity or
metal source/drain junctions. This will lead to a remarkable improvement in the
performance of transistors.
Figure 1.2 ITRS 2004 roadmap: gate lengths and currents for high
performance, low operation power, and low standby power.
The second strategy is to develop new device structures
with better electrostatic control, such as fully depleted SOI and multi-gate devices.
These can also be utilized in DRAMs as low leakage cell transistors, as well as in
nanoscale non-volatile Flash memories.
1.2
Mobility Enhancement: SiGe, Strained Layers, Crystal Orientation
Carrier mobility enhancement for electrons and holes provides the key to increasing the
on-currents without higher gate capacitance and without degrading the off-currents.
Several methods have been developed, including SiGe heterostructures [3] with a
higher hole mobility for the p-channel transistor. This is achieved by growing a thin
epitaxial Si1-xGex layer, where x is the Ge concentration, with a thickness of 5–10 nm
for the channel region directly on Si (see Figure 1.4). On top of the SiGe layer a thin Si
cap layer is deposited with a thickness of 3–5 nm, which is also used for the growth of
the gate oxide. This forms a quantum well for the holes due to a step in the valence
band of the Si/SiGe/Si heterostructure, with a depth of about 150 meV for a Ge content
of 20%. The SiGe layer is under biaxial compressive strain due to the smaller lattice
constant of Si compared to SiGe (see Figure 1.4a). The mobility is enhanced because
of the lower effective mass of the holes in SiGe and a splitting of the degenerate
valence bands, which reduces intervalley scattering. Compared to pure Si, with a peak
hole mobility of about 110 cm² V⁻¹ s⁻¹, a value of 210 cm² V⁻¹ s⁻¹ has been achieved at a Ge content of x = 0.25 [4].
1.3
High-k Gate Dielectrics and Metal Gate
As indicated in the ITRS roadmap, scaling of the classical SiO2 gate dielectric to
increase the gate capacitance and thereby achieve higher drive currents reached
its limit at about 2 nm for low standby power, due to Fowler-Nordheim tunneling
currents through the gate dielectric. By using nitrided oxides, the minimum thickness
could be extended to about 1 nm for high-performance applications with a gate leakage
current of about 10³ A cm⁻² [9]. The introduction of high-k dielectrics allows the use of
thicker dielectric layers in order to reduce the tunneling currents at the same equivalent
oxide thickness, or to provide thinner dielectrics for continuous scaling. Unfortunately,
all known high-k materials have a smaller bandgap than SiO2. In Figure 1.8 the
conduction band offset as a function of the dielectric constant is shown for different
materials [10]. For the highest-k materials such as Ta2O5 (k ≈ 30) or TiO2 (k ≈ 90), the
bandgap becomes too small and leads to increased gate leakage. Another critical issue is
the growth of an interfacial layer during processing. Today, the most mature high-k
dielectrics are based on Hf. Among these are HfO2 (k ≈ 17–25), HfSiO (k ≈ 11) and
HfSiON (k ≈ 9–11); the latter two are the more temperature-stable. An equivalent oxide
thickness of below 1 nm has been demonstrated for these high-k materials [10]. Other
candidates are ZrO2 and La2O3, with dielectric constants between 20 and 30; however, the
former is incompatible with a poly-silicon gate and requires a metal gate.
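The benefit described above is usually expressed through the equivalent oxide thickness (EOT). The minimal Python sketch below illustrates the relation EOT = t_phys · (k_SiO2/k); the film thicknesses are illustrative assumptions, while the k values are taken from the approximate ranges quoted above.

    # EOT illustration: a physically thicker high-k film gives the same
    # capacitance per unit area as a much thinner SiO2 layer.
    K_SIO2 = 3.9

    def eot(t_phys_nm: float, k: float) -> float:
        """EOT = t_phys * k_SiO2 / k (equal capacitance per unit area)."""
        return t_phys_nm * K_SIO2 / k

    # k and t_phys values are illustrative, not measured device data.
    for name, k, t in [("HfO2", 20.0, 4.0), ("HfSiON", 10.0, 2.5), ("SiO2", 3.9, 1.0)]:
        print(f"{name:7s} k={k:4.1f}  t_phys={t:3.1f} nm  ->  EOT={eot(t, k):.2f} nm")
    # A 4 nm HfO2 film has an EOT of ~0.8 nm while being four times thicker
    # than the equivalent SiO2 layer, which suppresses the tunneling current.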
For most high-k dielectrics a degradation of mobility is observed due to an increased
scattering by phonons or a high fixed charge density at the interface. Especially for
Al2O3, the hole mobility reduction is not acceptable. For the best Hf-based high-k
dielectrics, a mobility only 20% lower than that with SiO2 has been achieved to date.
Closely related to the high-k dielectric is a new gate material which avoids the
depletion layer of poly silicon gates and the reaction of the high-k material with silicon
at higher process temperatures. Moreover, metal gates offer the possibility of
adjusting the threshold voltage via the workfunction of the gate material instead
of by doping the channel, which would decrease the mobility at higher doping concentrations. The desired workfunctions for bulk devices with n+ poly and p+ poly-silicon
gates for low-power/high-performance applications with low-doped transistor channels are shown in Figure 1.9.
Midgap-like materials such as TiN, TiSiN and W are suitable for n- and p-channel
transistors with threshold voltages in the range of 300 to 400 mV, especially for fully
depleted SOI or multi-gate transistors with lower channel doping concentrations. For
optimized logic processes with low Vt transistors for high performance, in the range
of 100 to 200 mV, dual metals with n+- and p+-poly-silicon-like workfunctions must
be integrated. For n-channel transistors Ru is a candidate, and for p-channel Ta or
RuTa alloys.
Figure 1.9 Desired workfunctions for bulk and FD MOSFETs [24]; Pacha, ISSCC 2006.
Another gate material option is a tunable workfunction, such as fully silicided NiSi
implanted with As and B, or Mo implanted with N. Until now, a shift of the
workfunction towards the conduction band of 200 to 300 mV has been reported [11].
A cross-section of a 50-nm transistor with a fully silicided NiSi gate is shown in
Figure 1.10. Here, two approaches have been pursued: the first approach, with a thin
poly layer, allows the simultaneous silicidation of the source/drain (S/D) and gate, while
the second approach, using CMP, offers independent silicidation of the S/D and
gate, and also avoids the formation of thick silicides in the S/D [10].
1.4
Ultra-Thin SOI
Many of the device problems due to short channel effects are related to the silicon
bulk. SOI technology [12] uses only a thin silicon layer for the channel, which is isolated from
the bulk by a buried oxide. Several companies producing semiconductors have already
switched to SOI for high-performance microprocessors or low-power applications.
Typically, the thickness of the Si layer is in the range of 50 to 100 nm, and the doping
concentrations are comparable to those of bulk devices. This situation, which is
referred to as partially depleted SOI, has several advantages, most notably a 10–20%
higher switching speed. However, further down-scaling faces issues similar to those of the bulk,
and here thinner Si layers [13], which lead to fully depleted channels, are of interest.
A schematic representation and a TEM cross-section of a thin-body SOI transistor
with 12-nm gate length and 16-nm Si thickness on 100 nm buried oxide are shown in
Figure 1.11. The gate has been defined with e-beam lithography while, for the
contacts, raised source/drain regions were grown with selective Si epitaxy and a high
dose of dopants was implanted into the epi layer.
The experimental current-voltage (I-V) characteristics of n-channel SOI transistors with gate lengths down to 12 nm are shown in Figure 1.12. For gate lengths
>32 nm, subthreshold slopes of 65 mV dec⁻¹ have been reached but, due to the still
relatively thick Si body of 16 nm, short channel effects begin to increase below
30 nm gate length, and the transistors with 12 nm gate length cannot easily be
turned off.
A two-dimensional (2-D) device simulation of the electrostatic potential of an SOI
transistor with undoped channel and a thinner silicon body of 10 nm is shown in
Figure 1.13 at a drain voltage of 1.1 V and a gate voltage of 0 V. For a gate length of
30 nm the gate potential controls the channel quite well. However, even with 10 nm Si
thickness the potential barrier is slightly lowered at the bottom of the channel.
This gives rise to an increase in the subthreshold slope as a function of gate length,
even for 5 nm Si thickness and 1 nm gate oxide (see the device simulation in Figure
1.16). A single-gate SOI exhibits the ideal subthreshold slope of 60 mV dec⁻¹ down to
about 50-nm gate lengths. In the gate length range of 50 to 20 nm, the turn-off
characteristics are still good, and therefore ultra-thin SOI can provide a device
architecture which is superior to that of bulk and suitable for the 32-nm node.
A simple scaling rule for fully depleted SOI devices proposes a Si thickness of about
one-fourth of the gate length in order to achieve good turn-off characteristics.
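The rule of thumb quoted above (Si thickness of about one-quarter of the gate length) can be motivated by the electrostatic "natural length" of a fully depleted SOI device, a standard textbook estimate that is not given in this chapter. A minimal sketch, applied to the 5 nm body / 1 nm oxide case simulated above:

    # Natural-length estimate for a single-gate, fully depleted SOI MOSFET:
    # lambda = sqrt((eps_Si/eps_ox) * t_Si * t_ox); good turn-off roughly
    # requires a gate length of several times lambda (textbook criterion,
    # not taken from this chapter).
    EPS_SI_OVER_OX = 11.7 / 3.9  # ~3 for a Si body with SiO2 gate oxide

    def natural_length(t_si_nm: float, t_ox_nm: float) -> float:
        return (EPS_SI_OVER_OX * t_si_nm * t_ox_nm) ** 0.5

    lam = natural_length(5.0, 1.0)  # 5 nm Si body, 1 nm oxide, as in the text
    print(f"lambda = {lam:.1f} nm -> Lg of ~5-6*lambda = {5*lam:.0f}-{6*lam:.0f} nm")
    # ~3.9 nm -> ~20-23 nm, consistent with the good turn-off reported here
    # for single-gate SOI down to the 20-50 nm gate-length range.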
Whilst in these devices the channel was either lightly doped or undoped, this is not feasible
in bulk devices because of the punch-through from source to drain. The mobility of
the charge carriers, and hence the on-current, is higher due to the lower electric fields; this is
shown graphically in Figure 1.14 for different channel doping concentrations. At a
gate voltage overdrive of 1 V the saturation current of the undoped transistor is twice
that of the channel doped at 4 × 10¹⁸ cm⁻³ [14].
Moreover, without channel doping the Zener tunneling currents are reduced, as are
electrical parameter variations due to statistical fluctuations of the dopant
atoms.
1.5
Multi-Gate Devices
Further reduction of the gate length will require two or more gates for control of the
channel, together with thin Si layers. The advantage of a multi-gate is to suppress the
drain field much more effectively.
This is illustrated in Figure 1.15, by using the same simulation conditions as in
Figure 1.13 and adding a bottom gate to the 30-nm SOI transistor. As shown in
Figure 1.15, the electrostatic potential barrier is much higher than in the single-gate
device. The better electrostatic control results in a steeper subthreshold slope; this can
be seen in Figure 1.16, with a drift diffusion simulation of single- and double-gate
transistors. A very thin Si thickness of 5 nm and an equivalent gate oxide thickness
of 1 nm have been assumed, with a drain voltage of 1 V.
Figure 1.16 Simulated subthreshold slopes of single- and double-gate SOI transistors.
Compared to the single gate, a 10-nm gate length and a subthreshold slope of 65 mV dec⁻¹ are predicted for a double
gate, and even 5 nm seems feasible with a reasonable subthreshold slope.
The challenge for multi-gate transistors will be to develop a manufacturable
process with gates self-aligned to the S/D regions. Three promising concepts have been
investigated within the EC project NESTOR [15]: the first was a planar double-gate
SOI transistor, which uses wafer bonding [16]; the second was a gate all-around
device, based on silicon-on-nothing (SON) [17]; and the third was a FinFET type [18]
(see Figure 1.17).
1.5.1
Wafer-Bonded Planar Double Gate
1.5.2
Silicon-On-Nothing Gate All Around
fabricated [17]. Within the EC project NESTOR, devices with gate lengths of 25 nm
have been achieved (see Figure 1.22a). These exhibit excellent short-channel
characteristics, with S = 70 mV dec⁻¹, DIBL = 11.8 mV, and high on-currents of
1540 µA µm⁻¹ (Ioff = 2 µA µm⁻¹, tox = 2 nm) at 1.2 V (see Figure 1.22b). As shown in
Figure 1.22a, the bottom gate is still larger than the top gate. Ongoing studies have
focused on a reduced bottom gate capacitance and a self-aligned approach.
Recently, multi-bridge transistors [21] have been reported using a similar type of
SiGe layer etch technique for the fabrication of two or more channels stacked above
each other, and with an on-current of up to 4.2 mA µm⁻¹ at 1.2 V.
1.5.3
FinFET
The FinFET [18, 22] can provide a double- or triple-gate structure with relatively
simple processing (see Figure 1.23). First, the fin on SOI is structured with a
tetraethylorthosilicate (TEOS) hardmask (Figure 1.23, left). A Si3N4 capping layer
shields the top of the fin for a double-gate FinFET, and the same process flow can be
used for triple-gate devices, without the capping layer. Next, a gate dielectric and the
poly-Si gate are deposited and structured with lithography and etching (Figure 1.23, center).
The buried oxide provides an etch stop for the definition of the fin height. After this, a
gate spacer is formed, raised source/drain regions are grown with epitaxy, and highly
doped n or p regions are implanted (Figure 1.23, right). The source/drain regions are
enhanced using selective Si epitaxy to lower the sheet resistance. The facet of the
silicon epitaxy has been optimized to reduce the drain-to-gate capacitance.
A TEM cross-section of a 20-nm tri-gate FinFET [23] is shown in Figure 1.24. Here,
the top of the Si fin is also used for the channel, and no corner effects are observed at
low fin doping concentrations. The fin and the gate layer have been processed with
e-beam lithography. The smallest fin widths are in the range of 10 nm (see also
Figure 1.30).
TEM cross-sections of a tri-gate device with larger fins of about 36 nm are also
shown in Figure 1.24. The fin height is in the range of 35 nm, the corners are rounded
by sacrificial oxidation, the gate dielectric is 2–3 nm SiO2, and the poly gate surrounds
the fin with a slight under-etch of the buried oxide.
The measured I-Vg characteristics of n- and p-channel FinFETs with 20-nm and
30-nm gate length, respectively, are depicted in Figure 1.25. For the n-channel
transistor a saturation current of 1.3 mA µm⁻¹ (normalized by fin height) at an off-current of 100 nA µm⁻¹ has been achieved at a gate voltage of 1.2 V, despite a relaxed
gate oxide thickness of 3 nm. For the p-channel, a high on-current of 500 µA µm⁻¹ and
an off-current in the range of 5 nA µm⁻¹ are measured at 30-nm gate length. The
FinFET has the advantage of self-aligned source and drain regions.
In Figure 1.25 the current was normalized to the height of a single fin. The
electrical width of the device would be 2.2 times larger. For circuit applications, multiple
fins are often needed in order to achieve higher drive currents (in Figure 1.26 the
device has four fins) [24]. For a comparison with planar transistors, it is important how
many fins of a given height, width and pitch can be integrated on the same area as the
conventional device.
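The fin-count comparison described above can be made concrete with a short sketch. It uses the usual tri-gate convention W_eff = 2·H_fin + W_fin; the fin width and pitch are those of Figure 1.26, while the fin height is an assumed value for illustration.

    # Effective electrical width of a tri-gate FinFET, and the electrical
    # width available per micrometer of layout pitch.
    def w_eff_trigate(h_nm: float, w_nm: float) -> float:
        return 2 * h_nm + w_nm   # both sidewalls plus the fin top

    h_fin, w_fin, pitch = 60.0, 30.0, 200.0  # h_fin assumed; w/pitch from Fig. 1.26
    per_fin = w_eff_trigate(h_fin, w_fin)
    print(f"W_eff per fin: {per_fin:.0f} nm")

    fins_per_um = 1000.0 / pitch
    print(f"electrical width per um of layout: {fins_per_um * per_fin:.0f} nm")
    # With these numbers, 5 fins give 750 nm of electrical width per um of
    # layout - the multi-gate only wins on drive per area when fin height
    # and pitch are favorable, which is exactly the comparison the text
    # says must be made against a planar device.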
With respect to the switching time of multi-gate devices, the drive current together
with the gate capacitance must be considered. Here, it was shown by simulation that
multi-gate devices can achieve 10–20% faster delay times compared to single-gate
devices, mainly due to the better Ion/Ioff ratio [25]. This was confirmed experimentally
in Ref. [24] for inverter FO2 ring oscillators, where tri-gate FinFETs with TiSiN gate,
Figure 1.26 Scanning electron microscopy image of a multichannel FinFET [24] with four fins on SOI. The gate length is
60 nm, fin width 30 nm, and pitch 200 nm.
55 nm gate length, and a low-doped channel achieved a delay of 21 ps, a much better speed
performance than that of comparable planar MOSFETs in a 65-nm low-power CMOS
technology, especially for sub-1 V power supply voltages.
1.5.4
Limits of Multi-Gate MOSFETs
The physical limit for the minimum channel length of multi-gate transistors has been
investigated with 3-D quantum mechanical simulations using the tight binding
method [26]. The device is composed of atoms in the silicon crystal lattice; the current
can ow either by thermionic emission across the potential barrier of the channel, or
directly via tunneling through the barrier from source to drain (see Figure 1.27).
In Figure 1.28, the simulated source-drain current as a function of gate voltage is
given with and without band-to-band tunneling for different gate lengths. An
aggressive Si thickness of 2 nm and an equivalent oxide thickness of 1 nm have been
assumed. For gate lengths of 8 nm the tunneling contribution is on the order of the
current over the potential barrier. At 4 nm the current is increased by two orders of
magnitude by tunneling, but even 2-nm gates seem possible with off-currents in the
range of µA µm⁻¹, corresponding to ITRS HP specifications. Gate control is still
effective and would achieve a subthreshold slope of about 140 mV dec⁻¹.
1.6
Multi-Gate Flash Cell
Multi-gate transistors are also very suitable for highly integrated memories with
small gate lengths. Flash memory cells require thicker gate dielectrics than in logic
applications, and therefore exhibit enhanced short channel effects.
Figure 1.28 Thermionic and total current (tunneling) of double-gate FinFETs simulated with the tight binding method [26].
Currently, the most widely used Flash cell consists of a transistor with a floating gate [27] or a
charge-trapping dielectric [28] sandwiched between the gate electrode and the channel
region. A small amount of charge is transferred into the storage region either by
tunneling or hot-electron injection. The charge can be stored persistently and read out via a
shift in the I-Vg characteristics. A schematic cross-section of a tri-gate FinFET
memory transistor with improved electrostatic channel control compared to a planar
device is shown in Figure 1.29, where a multilayer ONO gate dielectric around the fin
serves as the storage element.
An experimentally realized memory structure [29] with a very small Si fin of 12 nm
width and a height of 38 nm is shown in Figure 1.30. The multilayer dielectric
consisted of 3 nm SiO2, 4 nm Si3N4 and 6.5 nm SiO2.
The charge is uniformly injected into the nitride trapping layer by Fowler-Nordheim tunneling. The electrical function has been verified experimentally down to
20 nm gate length [29].
Figure 1.30 TEM cross-section of a tri-gate memory cell with 12 nm fin width and ONO dielectric.
With an applied gate voltage of 12.5 V for 2 ms, electrons are
injected and shift the I-Vg curves to positive voltages (write) (see Figure 1.31).
A Vt shift of about 4 V (write) was obtained using a fin width of 12 nm at gate
lengths of 80 to 20 nm. The application of a negative gate voltage of -11 V for 2 ms (erase)
injects holes into the nitride layer or detraps electrons and shifts the I-Vg curves back
to low Vt.
Due to the large Vt shift, multi-level storage also becomes feasible. Four levels with
about 1 V separation have been programmed in the 40-nm memory transistor. The
charge of one level corresponds to about 100 electrons.
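The "about 100 electrons" figure can be checked at the order-of-magnitude level from the geometry given above (40 nm gate length, 12 nm fin width, 38 nm fin height, 3/4/6.5 nm ONO stack, ~1 V level separation). A rough sketch, in which the nitride k value and the parallel-plate treatment of the gate stack are simplifying assumptions:

    # Order-of-magnitude check of "one level ~ 100 electrons".
    EPS0 = 8.854e-12          # F/m
    K_OX, K_NIT = 3.9, 7.5    # SiO2 and (assumed) Si3N4 dielectric constants
    Q_E = 1.602e-19           # C

    eot_nm = 3.0 + 4.0 * K_OX / K_NIT + 6.5       # ONO stack, SiO2-equivalent
    area_m2 = 40e-9 * (2 * 38e-9 + 12e-9)         # gate length x fin perimeter
    c_gate = EPS0 * K_OX * area_m2 / (eot_nm * 1e-9)

    electrons_per_level = c_gate * 1.0 / Q_E      # Q = C * dVt with dVt ~ 1 V
    print(f"C ~ {c_gate * 1e18:.1f} aF, electrons per 1 V level ~ {electrons_per_level:.0f}")
    # Gives roughly 10 aF and a few tens of electrons - the same order of
    # magnitude as the ~100 electrons per level quoted in the text.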
The retention time for the charge-trapping dielectric has been tested, and a
programming window of 3.6 V for a single level was extrapolated after 10 years.
Excellent retention properties between all levels are observed (see Figure 1.32).
If operated in a 4–5 F² high-density array such as NAND, the tri-gate cell would
enable memory densities of up to 32 Gbit at a die size of 130 mm² for the 25-nm node. A
schematic NAND layout is shown in Figure 1.33. Ultimately, scaling is limited by the
thickness of the two oxide-nitride-oxide layers, plus the minimal gate electrode
thickness between the fins.
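The quoted density is consistent with simple cell-area arithmetic, as the sketch below shows; the numbers are those of the text, and the step from pure array area to die size assumes some periphery overhead.

    # Check of the quoted NAND density: 32 Gbit at ~130 mm^2, 25 nm node,
    # 4-5 F^2 cell (numbers from the text).
    F_nm = 25.0
    bits = 32 * 2**30                 # 32 Gbit
    for cell_f2 in (4.0, 5.0):
        # (F in mm)^2 converts the cell area from nm^2 to mm^2
        array_mm2 = bits * cell_f2 * (F_nm * 1e-6) ** 2
        print(f"{cell_f2:.0f} F^2 cell: array alone = {array_mm2:.0f} mm^2")
    # ~86-107 mm^2 of pure array; adding periphery and spare cells brings
    # the die to roughly the 130 mm^2 quoted in the text.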
Figure 1.32 Retention time for the 40-nm tri-gate memory cell
with oxide-nitride-oxide (ONO) dielectric at room temperature.
The different symbols indicate the different write voltages of
Figure 1.31.
1.7
3D DRAM Array Devices: RCAT, FinFET
For DRAMs, extremely low leakage current array devices below 1 fA per cell are
required in order to avoid too-high charge losses during the refresh time interval. A
contribution to the leakage current originates from the sub-Vt current of the cell
transistor, while others are junction leakage and tunneling currents through the
dielectric of the storage capacitor. With respect to the cell transistor, the channel
doping cannot be increased in order to improve the turn-off characteristics, because
of the electric field, which will initiate trap-assisted tunneling leakage currents [30] at
E > 0.5 MV cm⁻¹. Therefore, the planar DRAM cell transistors can be scaled down
only to about 70 nm [30]. A schematic and a SEM cross-section of the 70-nm trench
DRAM cell are shown in Figure 1.34.
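The sub-fA requirement follows from a simple retention budget. The sketch below uses typical assumed values (storage capacitance, cell voltage, refresh interval, tolerated charge loss), none of which are given in the text:

    # Why the cell leakage budget is so tight (all numbers assumed):
    c_cell = 25e-15       # F, assumed storage capacitance
    v_cell = 1.5          # V, assumed stored voltage
    t_ret = 64e-3         # s, assumed refresh interval
    max_loss = 0.3        # assumed tolerable fraction of charge lost per interval

    q_stored = c_cell * v_cell
    i_max = max_loss * q_stored / t_ret
    print(f"stored charge: {q_stored * 1e15:.1f} fC")
    print(f"allowed total leakage: {i_max * 1e15:.0f} fA")
    # ~180 fA total across all paths (junction, capacitor dielectric,
    # sub-Vt); with margin for tail-bit cells and elevated temperature,
    # the per-mechanism budget shrinks to the ~1 fA regime quoted above.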
For future applications, new cell transistor structures must be implemented. For
stack DRAM cells, the transition to a recessed channel array transistor (RCAT)
has already been reported for the 90- to 80-nm generation [31]. Such a device, with a
U-shaped groove etched into silicon with a depth of about 200 nm, is shown in
Figure 1.35 [32]. After gate dielectric growth, the groove is filled with the poly-silicon
gate material. Bitline and storage node contacts are on the planar silicon. Such a
structure is suitable for sub-70-nm generations because it provides a longer channel
for lower Ioff currents. In this extended U-shape device, a gate wrap-around of the Si
sidewalls with a depth of 6–10 nm increases the on-current and improves the
subthreshold slope. The 3D device has been integrated into a 90-nm DRAM test
array [32]. Simulation and measurement are shown in Figure 1.35b, with and without
a corner device of about 6 nm. The subthreshold slope is in the range of 95 to
130 mV dec⁻¹ at 85 °C, and the side gates enhance the on-current by 30%.
Reducing the width of the cell transistor to sublithographic dimensions and
utilizing deeper vertical sidewalls leads to a fully depleted FinFET device with
improved electrostatic control and increased on-currents [32]. A schematic cross-section in the bitline direction of a trench cell with a FinFET array transistor, together
with a SEM cross-section of a realized structure in 90 nm technology, are shown in
Figure 1.36. The fin has a width of about 20 nm and a height of 50 nm. The transistor
has been implemented using a local Damascene technique for fin and gate. The local
gates are connected with a WSi wordline, which is also used for the gate layer of the
planar transistors in the periphery circuits. The body is connected to the substrate and
isolated from the neighboring fin by shallow trench isolation.
1.8
Prospects
Assuming that lithography tools such as Extreme Ultra-Violet will be available for
the sub-45-nm technology nodes, it seems very likely that the scaling of Si CMOS
will continue down to the 22-nm node, with the start of production in the year
2016, according to the ITRS roadmap. In this scenario, which is known as 'More
Moore', technology costs per chip must be reduced from generation to generation,
and performance must be increased. This is expected especially for memories
and microprocessors, and in order to fulfill these requirements challenging
new process modules, such as metal gates, high-k dielectrics, and strain, will need to
be integrated with high yield and in good time. On the other hand, conventional
bulk CMOS may run into performance constraints below the 45-nm generation.
Multi-gate devices with thin silicon channels and better electrostatic control may
take over and will allow further downscaling, but with more complex processing.
For DRAMs and Flash, the integration of such 3d transistors with very low leakage
currents has already been started. Ultimately, beyond 10 nm the process tolerances
and variability of the electrical parameters will become the most limiting factors.
In addition, given the consistently good scaling potential of Si MOSFETs, many
applications, such as low-frequency RF, analogue and power FETs, displays and
sensors, do not require extremely small feature sizes. Therefore, additional
functionality on the chip, referred to as 'More than Moore', will be another key
trend.
Another important issue is the increasing research into new logic and memory
devices. Among these are 1-D wire structures of Si, Ge or carbon with source, drain
and gate, analogous to Si MOSFETs. These devices show I-V characteristics similar to
(or even better than) those of Si, depending on how the current is normalized to the small width of
the devices. However, the manufacturability and integration on a large scale have still
to be proven, and the key to success will be the capability of integration with Si
CMOS.
With regard to memories, many promising new concepts have appeared, based
on new materials as the storage element. Among these are non-volatile
memories with a large change in resistance, such as phase-change or conductive-bridging memories. These can be combined very well with a Si access transistor and
CMOS circuitry. With these evolutionary elements, non-conventional CMOS represents the most realistic approach for high-density logic and memories, and will
undoubtedly represent the dominant technology of the nanoelectronics era.
Acknowledgments
The studies on SOI MOSFETs have been partly supported within the BMBF project
HSOI, and Multi-Gate Devices within Extended CMOS and the EC Project NESTOR,
IST-2001-37114. The author thanks the NESTOR partners for their courtesy,
especially S. Deleonibus, T. Poiroux, P. Coronel, S. Harrison, N. Collaert, and
Y. Ponomarev. Thanks are also expressed to the author's colleagues at Infineon/
Qimonda for their contributions, notably M. Alba, L. Dreeskornfeld, J. Hartwich, F.
Hofmann, G. Ilicali, J. Kretz, E. Landgraf, T. Lutz, H. Luyken, W. Rösner, M. Specht,
M. Staedele, C. Pacha, and W. Mueller.
References
1 ITRS Roadmap 2004 edition, http://public.
itrs.net.
2 H. Wakabayashi, S. Yamagami, N.
Ikezawa, A. Ogura, M. Narihiro, K.-I. Arai,
Y. Ochiai, K. Takeuchi, T. Yamamoto, T.
Mogami, Sub-10 nm planar-bulk-CMOS
devices using lateral junction control
(5 nm CMOS), IEDM Technical Digest
2003, 989.
3 D. K. Nayak, J. C. S. Park, K. Wang, K. P.
MacWilliams, Enhancement-Mode
Quantum-Well GexSi1-x PMOS, IEEE Electron Device Letters 1991, 12, 154.
4 L. Risch, et al., Fabrication and electrical
characterization of Si/SiGe p-channel
MOSFETs with a delta doped boron layer,
Proceedings of ESSDERC, p. 465, 1996.
5 K. Rim, S. Koester, M. Hargrove, J. Chu,
P. M. Mooney, J. Ott, T. Kanarsky, P.
Ronsheim, M. Ieong, A. Grill, H.-S. P.
Wong, Strained Si CMOS (SS CMOS)
Further Reading
Short Course on Silicon: Augmented Silicon
Technology, Organizer T.-J. King, IEDM,
December 7, 2003.
Emerging Nano-Electronics: Scaling
MOSFETs to the Ultimate Limits and
Beyond-MOSFET Approaches, Organizers
P. Zeitzoff, T. Mogami, VLSI Technology
Short Course, June 14, 2004.
2
Indium Arsenide (InAs) Nanowire Wrapped-Insulator-Gate
Field-Effect Transistor
Lars-Erik Wernersson, Tomas Bryllert, Linus Fröberg, Erik Lind, Claes Thelander, and
Lars Samuelson
2.1
Introduction
Semiconductor nanowires [1-9] offer the possibility of forming a new class of semiconductor device. Nanowire technology enables new material combinations and also the
possibility to enhance the potential control in down-scaled channels using wrap-around gates. As the lateral dimensions of semiconductor materials are scaled down
towards 100 nm and below (which can be easily achieved with the nanowire
technology), fewer constraints become apparent in terms of lattice matching between
materials. This opens the path to a heterogeneous materials integration that cannot
be accomplished with conventional bulk semiconductor technology. For example, it
has been shown that segments of InP can be incorporated in indium arsenide (InAs)
nanowires [10], and that InAs nanowires can be grown on Si substrates [11], in spite of
about 3.5% and 7% lattice mismatch, respectively. These material combinations
cannot be synthesized in the bulk, nor with planar epitaxial techniques. The second
advantage is related to the challenges that the technology is facing as the planar
transistors are scaled down towards the 22 nm node and beyond. At this length scale,
the transistors are more sensitive to short-channel effects related to the reduced
potential control in the channel and the body of the devices. This is reected in an
increased output conductance and sub-threshold swing of the transistors that
degrade the transistor performance. Dual gates, trigates and FinFETs have been
demonstrated to reduce these issues. Taking the technology one step further is to
completely surround the channel with a wrapped gate, and for this technology vertical
nanowires are ideal.
Several groups have reported on the successful fabrication of vertical nanowire
transistors [12-22]. Various implementations of Si transistors have been reported
and, in particular, it has been shown that the wire geometry may be used to fabricate
different advanced transistors with benets in sub-threshold characteristics and
switching behavior. The present authors' efforts have been focused on the vertical
implementation of III-V transistors, and the following sections describe the
processing and characteristics of both long- and short-channel transistors. The
importance of introducing a high-k dielectric, its influence on
the device characteristics, and the benefits of heterostructure design are also demonstrated.
2.2
Nanowire Materials
In these studies, InAs has been the primary choice of material in the transistors. For
various reasons, wrap-gate transistors based on silicon will naturally have a very
strong standing, due primarily to the compatibility with silicon technology in general
and also to the fact that Si nanowires can be made with diameters even <5 nm and yet
still be conducting. InAs, on the other hand, shows very strong lateral connement
effects already for diameters around 30 nm, making very narrow uncapped InAs
transistors depleted of charge-carriers and, in that sense, less promising. In contrast,
n-type InAs has highly attractive material properties, with a reproducible Fermi-level
pinning in the conduction band, a very high room-temperature mobility, and
ideal ohmic-contact properties. The remainder of this chapter focuses on the use of
InAs as the active transistor channel material and the use of P-containing InAs1-xPx alloys for enhancement of the transistor functionality and performance.
2.3
Processing
The nanowires used here are grown with chemical beam epitaxy (CBE), using
patterned Au discs to locate the nanowire growth and to set the diameter of the
nanowires (Figure 2.1). The ability to form well-defined matrices of nanowires is a key
feature both for the post-growth device processing and for the transistor design, in
that the number of wires determines the drive current and the transconductance of
the nanowire transistor. The uniformity in length provides good starting conditions
for uniform top contact formation. Typically, nanowire matrices ranging from 1 × 1
to 10 × 10 are used to form the vertical transistors in order to reach drive currents
approaching 10 mA, but a nanowire transistor may be defined by anything from a
single wire to, say, 10⁴ wires. Outside the active transistor region, smaller arrays of
nanowires are formed to create alignment markers for optical lithography in the post-growth processing described below. The seeds for these wires are formed in the same
seed and growth steps as the actual transistor wires.
After the growth, either long-channel or short-channel transistors may be formed
by processing the transistors in two different ways, as shown in Figures 2.2 and 2.4,
respectively [1522]. For the long-channel transistors, the vertical nanowire matrix is
rst covered by a SiNx gate-dielectric layer, followed by a sputtered Ti/Au gate metal
that is covering the nanowires uniformly. The sample is spin-coated by a photoresist,
which is back-etched to the desired gate length, after which the gate metal is
selectively removed by wet-etching. After removal of the resist, the sample is covered
by a second resist layer and the SiNx is etched to open for formation of the drain Ti/Au
top contact by evaporation. Finally, an airbridge is created by electroplating from
the drain contact, and the resist is dissolved. With this technology, the transistor
structure shown in Figure 2.3 is formed. While this technology provides good long-channel devices, in which fluctuations in the gate-length are less important due to the
smaller relative change, it seems difficult to reproducibly scale the definition of the
Figure 2.2 Processing scheme for long-channel transistors with sputtering and back-etching.
gate-length below 100 nm when using this back-etch process. Instead, a direct
evaporation method is used to form gates with a length below 100 nm, as described
next.
For the processing of short-channel transistors, a direct evaporation of the gate
metal has been developed. In this process, the metal gate is evaporated onto the SiNx-covered nanowires, the main benefit of this approach being that the gate-length is
determined by the thickness of the evaporated layer. This is in contrast to the
previously described long-channel process, where it is set by the thickness of the
back-etched polymer lm. A scanning electron microscopy (SEM) image of a formed
80 nm-thick gate is shown in Figure 2.5. As can be seen in the image, an intimate
contact is formed between the gate layer and the nanowire. Following gate formation,
the drain contact is formed by spin-coating the sample with a resist and wet etching of
Figure 2.5 InAs nanowires coated with SiNx penetrating an evaporated Ti/Au gate electrode [15].
the tips of the nanowires that penetrate the organic film. Finally, an evaporated drain
top contact is formed over the wires.
2.4
Long-Channel Transistors
using the Atlas device simulator created by Silvaco [16]. As these devices have a long
channel length and a wide diameter, effects related to lateral quantization, doping
fluctuations and ballistic transport may be omitted, and the transistors may be
modeled within the drift-and-diffusion formalism. The simulated data in Figure 2.9
are obtained for Nd = 2 × 10¹⁷ cm⁻³ and µe = 10 000 cm² V⁻¹ s⁻¹, and reproduce
the measured data both in the on-state and in the off-state. Thus, the measured
data may be reproduced both by analytical modeling and by simulation in a
MISFET model.
2.5
Short-Channel Transistors
Scaling is of importance to any FET-technology, and for the nanowire FET the scaling
of both the gate length and nanowire diameter must be considered. The processing
outlined above has been used to fabricate transistors with 80 nm nominal gate
length [19, 20]. During the growth of these nanowires, matrices with different
nanowire diameters (70 and 55 nm) have been included on the same sample. Both
types of nanowire transistor were processed in the same batch, and the transistor
characteristics compared (see Figure 2.10). In both cases, good transistor characteristics were observed, with both transistors showing a limited current saturation, even
Figure 2.9 Measured and simulated sub-threshold characteristics for a long-channel transistor [15].
Figure 2.10 Measured transistor characteristics for 80 nm gatelength NW-transistor with 70 nm (left) and 55 nm (right)
diameters [20].
at comparably large drain voltages. The increased saturation voltage arises from a
series resistance in the 1 µm separation between the gate and the drain. For biases
above 1 V, the larger-diameter transistors show a punch-through in the characteristics, a feature that was not observed in the narrower transistors that have a better
potential control. When the drain current was normalized with the circumference of
the nanowire, only a minor drive current reduction per gate width was observed as the
diameter was reduced; this demonstrates good scalability in the technology.
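The circumference normalization used in this comparison is straightforward; a minimal sketch follows, in which the function name, wire count and current values are illustrative inventions, while the 70 nm and 55 nm diameters are those of the text.

    import math

    def current_per_width(i_d_amps: float, n_wires: int, d_nm: float) -> float:
        """Drain current divided by the total gated periphery, in uA/um."""
        periphery_um = n_wires * math.pi * d_nm * 1e-3
        return i_d_amps * 1e6 / periphery_um

    # Hypothetical array currents for the two diameters (same layout/batch):
    for d, i_d in [(70.0, 40e-6), (55.0, 30e-6)]:
        print(f"d = {d:.0f} nm: {current_per_width(i_d, 8, d):.0f} uA/um")
    # If the normalized values stay close as d shrinks, the drive per gate
    # width is preserved - the scalability argument made in the text.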
In order to scale the gate length further, the relatively thick SiNx dielectric was
replaced with 10 nm HfO2, a material with a higher dielectric constant (k ≈ 15)
(Figure 2.11) [21, 22]. The HfO2 was deposited using atomic layer deposition (ALD),
which gives a uniform dielectric coverage and a very accurate thickness. A 100-nm
layer of silicon oxide was also deposited, to act as a lower-k spacer layer between the
InAs substrate and the wrap-gate. Next, a 50-nm Cr gate layer was formed by metal
evaporation. Finally, a 100- to 200-nm-thick polymer layer was deposited on top of the
gate to provide insulation between the gate and the drain contact. Despite a very short
gate length (50 nm), considerably improved DC characteristics were observed compared to previous device designs. Transconductance values up to 0.8 S mm⁻¹ were
obtained (Vsd = 1 V), with an inverse sub-threshold slope around 100 mV dec⁻¹. The
transconductance values were in this case normalized to the total nanowire circumference of the array. For a gate swing of 0.5 V, an Ion/Ioff ratio >1000 at
Vsd = 0.5 V, following the conventional definition [23], was observed, whereas a
maximum Ion/Ioff ratio above 10⁴ was measured.
2.6
Heterostructure WIGFETs
The wrap-gate transistors show good on-state characteristics, but even the long-gate
transistor characteristics suffer from a comparably large inverse sub-threshold slope
(100 mV dec⁻¹) and a non-negligible off-state current. This is worse for the devices
with a short gate and a comparably thick (40 nm) gate dielectric. The transistors with
high-k gate oxides also show effects related to the narrow InAs band gap, which allows
for impact ionization processes and thus creates only a limited potential barrier in the off-state. The nanowire technology offers alternative transistor designs in that heterostructure segments may be incorporated into the transistor channel to alter the band
gap in critical regions. A segment of InAsP was introduced into the InAs channel of
a nanowire transistor and the role of the barrier in transistor performance subsequently investigated [24]. A 150 nm-long segment of InAsP was introduced into a
4 µm-long, 50 nm-diameter InAs nanowire grown by CBE. The nominal P content in
the InAsP segment was 30%, and the conduction band barrier about 180 meV. The
nanowire was placed in a lateral geometry with a Si/SiO2 back gate, where two
drain contacts and one source contact were used in order to fabricate and evaluate
transistors with the same geometry differing only in the InAsP barrier (Figure 2.12).
Room-temperature data for the two types of transistor are shown graphically in
Figure 2.13.
Both transistors showed good characteristics with current saturation and a decent
drive current level. For a given bias condition (Vsd = 0.3 V and Vg = 2.0 V) the InAs
transistor had a factor of two higher drive current than the InAsP transistor. This was
expected due to the introduction of the barrier. From the transfer characteristics,
however, it should be noted that the current reduction was not related to a degradation
in the transconductance, but rather to a shift in the threshold voltage. In fact, the
measured transconductance remained constant, and for a fixed gate-overdrive the
drive current was the same. When turning to the sub-threshold characteristics, major
improvements were noted in both the inverse sub-threshold slope and the maximum
Ion/Ioff ratio as the barrier was introduced. Finally, temperature-dependent measurement of the current level was used to verify the presence of the barrier and to evaluate
its height (Figure 2.14). The role of the barrier in this geometry is not only to block the
off-current in the body of the wire, but also (and even more in this lateral geometry) to
block the leakage current along the edges of the wires.
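The barrier-height extraction mentioned above can be sketched with a simplified thermionic-emission model, I ∝ exp(−Eb/kT), neglecting the weaker power-law temperature prefactor; the two (T, I) points below are invented for illustration and chosen to land near the ~180 meV barrier quoted in the text.

    import math

    K_B_EV = 8.617e-5  # Boltzmann constant, eV/K

    def barrier_height_ev(t1_k, i1, t2_k, i2):
        """Eb from two points on an Arrhenius plot (thermionic model)."""
        return -K_B_EV * (math.log(i2) - math.log(i1)) / (1 / t2_k - 1 / t1_k)

    # Hypothetical sub-threshold currents at two temperatures:
    eb = barrier_height_ev(250.0, 1.0e-9, 300.0, 4.0e-9)
    print(f"extracted barrier: {eb * 1000:.0f} meV")   # ~179 meV

In practice many temperature points are fitted, but the slope of ln(I) versus 1/T carries the barrier height in exactly this way.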
2.7
Benchmarking
It is of great value to perform an early evaluation of the potential in this new wrap-gate
transistor technology. Hence, the performance of 100 nm gate-length transistors
(structure shown in Figure 2.15) has been simulated and the characteristics evaluated
according to the metrics of high-performance logic devices, including the gate delay
Figure 2.15 Schematic of nanowire geometry used for the bench-marking [25].
Figure 2.17 Simulated transfer characteristics (left), subthreshold characteristics, and deduced inverse sub-threshold
slope (right) for InAsP nanowire transistors with varying
P content [24].
2.8
Outlook
Based on the experimental results obtained to date, the question might be asked as to
how far the nanowire FET technology may be developed. Critical issues for scaled
devices are related to the growth of narrow nanowires with diameters of 10 to 30 nm
and the processing of vertical gates on the 20 nm length scale. Based on experimental
results, devices processed at these dimensions seem feasible in the near future.
However, in order for these devices to be competitive it will be necessary for the drive
current to be increased and the parasitics reduced. Likewise, good control of the
carrier concentration in the channel and in the source and drain regions will be
needed, as will an understanding and control of the interface properties in capped
wires. The main benefit of the wire geometry, namely the possibility of heterostructure
design in the axial and radial directions, may well prove to be the key when
addressing these issues.
Acknowledgments
References
1 C. P. Auth, J. D. Plummer, Scaling theory
for cylindrical, fully-depleted,
surrounding-gate MOSFETs, IEEE
Electron Device Lett. 1997, 18 (2), 74-76.
2 H. Takato, K. Sunouchi, N. Okabe, A.
Nitayama, K. Hieda, F. Horiguchi, F.
Masuoka, Impact of Surrounding Gate
Transistor (SGT) for ultra-high-density LSIs,
IEEE Trans. Electron. Dev. 1991, 38 (3), 573.
3 S. D. Suk, S.-Y. Lee, S.-M. Kim, E.-J. Yoon,
M.-S. Kim, M. Li, C. W. Oh, K. H. Yeo, S. H.
Kim, D.-S. Shin, K.-H. Lee, H. S. Park, J. N.
Han, C. J. Park, J.-B. Park, D.-W. Kim, D.
Park, B.-I. Ryu, High performance 5 nm
radius twin silicon nanowire MOSFET
(TSNWFET): fabrication on bulk Si wafer,
characteristics, and reliability, Int. Electron
Devices Meeting Tech. Dig. 2005, 735-738.
4 S. C. Rustagi, N. Singh, W. W. Fang, K. D.
Buddharaju, S. R. Omampuliyur, S. H. G.
Teo, C. H. Tung, G. Q. Lo, N.
Balasubramanian, D. L. Kwong, CMOS
inverter based on gate-all-around silicon-nanowire MOSFETs fabricated using a top-down approach, IEEE Electron Device Lett.
2007, 28 (11), 1021-1024.
3
Single-Electron Transistor and its Logic Application
Yukinori Ono, Hiroshi Inokawa, Yasuo Takahashi, Katsuhiko Nishiguchi, and Akira Fujiwara
3.1
Introduction
3.2
SET Operation Principle
simply the island or a quantum dot. The Coulomb island must be conductive so that
electrons can travel from the source to the drain via it. The role of this island is to
capture/donate one electron from/to the source/drain, and otherwise to hold
captured electrons. The region between the island and source (and also drain) must
not be a good conductor; it must basically be insulating, so that electrons move to/
from the island only by tunneling. This region is called the tunnel capacitor or the
tunnel junction. On the other hand, the region between the island and the gate should
be insulating so as not to allow electrons to flow between them, as in a conventional
transistor. In the equivalent circuit, the double box symbolizes a tunnel capacitor,
which is a special capacitor that allows quantum mechanical tunneling of electrons,
as mentioned above. The region sandwiched by the tunnel capacitors corresponds to
the island, which is designated by an oval for visualization. The region between the
island and the gate can be expressed as a normal capacitor.
Figure 3.2 explains what happens when the gate voltage is varied with a fixed small
source/drain voltage. When a positive voltage Vg is applied to the gate, positive
charges are induced there, whose number is given by CgVg/e, where Cg is the gate
capacitance. Then, in order to minimize the free energy of the system, the SET tries to
induce the same number of negative charges (i.e. electrons) in the island, and these
electrons are conveyed from the source or drain through the tunnel junctions. If
CgVg/e is some integer N, the island obtains N electrons. After reaching this number,
no more electron movement occurs. This is the Coulomb blockade state (the left
equivalent circuit in Figure 3.2). When CgVg/e is not an integer, for example, a half
integer N + 1/2, the number of electrons in the island changes with time so that it
becomes N + 1/2 on average. What actually occurs in the SET is as follows. First, one
electron enters the island from the source and the number of electrons in the island
becomes N + 1. Next, one electron is emitted to the drain from the island, returning the number
to N. This one-by-one electron transfer is repeated so that there is a net current
between the source and drain. This is the single-electron-tunneling state (the right
equivalent circuit in Figure 3.2). As a result, when the gate voltage is swept, the
Coulomb-blockade state and the single-electron-tunneling state appear in turn, and
the drain current versus gate voltage characteristics exhibit a repetition of sharp
peaks, as shown in Figure 3.2. This is known as Coulomb-blockade oscillation. This
ON-OFF characteristic indicates that the SET can function as a switching device.
For the complete description of the SET operation, consider the stability chart in
the gate-voltage/drain-voltage plane in Figure 3.3. The rhombic-shaped regions
colored in red are the region for the Coulomb-blockade state, and are known as
Coulomb diamonds. Outside the Coulomb diamonds, the number of electrons in the
island fluctuates between certain numbers. The degree of the fluctuation is determined by how far the voltage conditions are from the Coulomb diamonds. In the blue
regions, the fluctuation is minimal; that is, the electron number changes only
between two adjacent integers. These regions are for the single-electron-tunneling
states. The shape and size of the Coulomb diamonds are determined only by the gate
and junction capacitances. For example, the maximum drain voltage for the Coulomb
blockade is given by e/CΣ, where CΣ = Cg + Cd + Cs is the total capacitance of the
island and Cd and Cs are the junction capacitances at the drain and source. The slopes
of the diamond edges are given by Cg/Cd and -Cg/(Cg + Cs).
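These relations fix the whole diamond geometry once the three capacitances are known. A minimal sketch, using invented capacitances at the few-attofarad scale typical for small SETs:

    # Coulomb-diamond geometry from the capacitances, using the relations
    # quoted above (example values are illustrative, not measured data).
    Q_E = 1.602e-19  # C

    cg, cd, cs = 2e-18, 1e-18, 1e-18   # F
    c_sigma = cg + cd + cs

    print(f"oscillation peak spacing dVg = e/Cg      = {Q_E / cg * 1e3:.0f} mV")
    print(f"max blockade drain voltage   = e/C_sigma = {Q_E / c_sigma * 1e3:.0f} mV")
    print(f"diamond edge slopes: +Cg/Cd = {cg / cd:.1f}, -Cg/(Cg+Cs) = {-cg / (cg + cs):.2f}")

The peak spacing e/Cg follows from the condition CgVg/e = integer given earlier: successive integers are reached every e/Cg of gate voltage.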
A more detailed explanation of the SET operation can be found in textbooks [6, 7]
and review articles [8, 9]. At this point, mention should be made of only one more
item, namely what the SET cannot do. As explained above, the SET can convey
electrons one by one, but the time interval of each transfer event is uncontrollable.
Figure 3.3 Stability chart of the SET in the gate/drain voltage plane.
3.3
SET Fabrication
When SETs are fabricated, two criteria should be borne in mind. First, the resistance
of the tunnel junction must be sufficiently larger than the quantum resistance
Rq = h/e² (≈25.8 kΩ), where h is the Planck constant. Otherwise, the number of
electrons in the island fluctuates because of the Heisenberg uncertainty principle.
Because of this requirement, the current drivability is low, which is one major
demerit of the SET as a logic element. Second, the energy for adding one electron to
the island must be larger than the thermal energy; otherwise, thermally excited electrons tunnel
through the barriers and the Coulomb blockade does not function. For example, as
the temperature rises, each peak in Figure 3.2 broadens, and finally smears out. This
requirement is expressed as E ≫ kT, where E is the addition energy. The addition
energy can be expressed in the form e²/CΣ, and thus a SET with a smaller island can
operate at a higher temperature.
When the de Broglie wavelength of electrons is much smaller than the island size,
which is the case for metal islands that are not too small (≳1 nm), charges are
induced right at the island surface and E is determined only by the island size and the
spatial configuration of the electrodes. However, when the de Broglie wavelength is
comparable to the island size, typically the case for semiconductor islands of
nanometer size, the quantum size effect causes the kinetic energy of an electron to
increase and hence E to increase. Thus, semiconductor islands can have a larger
addition energy than a metal island of the same size.
In order to explain how small the island must be made, Figure 3.4 shows the
relationship between the island size (dot size) and the addition energy for a Si island
embedded in SiO2 dielectrics [13]. Both the quantized level spacing and the charging
energy (which can be defined as the addition energy for ideal metals) increase as the
island size decreases, and this leads to an increase in the resultant addition energy. If
an addition energy 10 times larger than the room-temperature energy (25.9 meV) is
required for the proper operation of a SET circuit, then an island as small as 4 nm
is needed. The creation of such a small island and the attachment of tunnel junctions to it
represent a technological challenge in SET fabrication. However, any conducting
material can be used as long as the above criteria are satisfied, and an addition energy
much larger than the room-temperature energy has already been demonstrated.
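A rough estimate of the island size behind Figure 3.4 can be made with the simplest possible model, a metallic sphere embedded in SiO2; this ignores the quantized level spacing that, as explained above, raises the addition energy further in small Si islands, so it is a lower bound.

    import math

    # Charging energy of a sphere of diameter d in SiO2:
    # self-capacitance C = 2*pi*eps0*k*d, Ec = e^2/C (simplified model).
    EPS0, K_SIO2, Q_E = 8.854e-12, 3.9, 1.602e-19
    KT_300K_EV = 0.0259

    def charging_energy_ev(d_nm: float) -> float:
        c = 2 * math.pi * EPS0 * K_SIO2 * d_nm * 1e-9
        return Q_E / c  # e^2/C expressed in eV equals e/C in volts

    for d in (10.0, 4.0, 2.0):
        ec = charging_energy_ev(d)
        print(f"d = {d:4.1f} nm: Ec ~ {ec * 1e3:.0f} meV = {ec / KT_300K_EV:.0f} kT(300 K)")
    # A ~4 nm island already gives Ec ~ 7 kT(300 K) from charging alone;
    # adding the level-spacing contribution reaches the ~10 kT target
    # quoted in the text.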
Historically, research into single-electron devices began with metals [4] and then
expanded to semiconductors [14-18] and other materials, such as carbon nanotubes [19-22] and some molecules [23-28].
Among metals, a major material is aluminum, because its oxide functions as a
good dielectric for tunnel junctions. Tunnel junctions are commonly made using
Dolan's shadow evaporation technique [29]. The junction capacitance can be
controlled in such structures to within 10% if the Al-AlOx junctions have a
relatively large capacitance of several hundred attofarads. Making smaller junctions
is less easy, and thus electrical measurements with this material are commonly
carried out below 1 K. However, it has been shown that making an extremely small
SET, the island of which has an addition energy much larger than kT of room
temperature, is possible [30].
Among semiconductors, Si is the most widely used material in research aimed at
practical applications. Si SETs are commonly made on a certain type of Si substrate
called silicon-on-insulator (SOI) [31]. In an SOI substrate, a thin Si layer (typically
100-400 nm) is formed on a buried SiO2 layer. Thinning the Si layer and reducing its
size in the lateral direction by lithography enables small Si structures to be produced.
It is possible to further miniaturize these structures by using thermal oxidation: Si
is consumed during the oxidation, and thus the volume of the Si structures is
reduced. With Si, it is relatively easy to make smaller islands compared with Al.
Common measurement temperatures in Si SET research are 1 K to 100 K, and several
groups have observed the Coulomb-blockade oscillation at higher temperatures
(100-300 K) [13, 32-48].
Figure 3.5 Basic device structure of PADOX SETs. The 1-D wire is
converted into a Coulomb island with attached tunnel junctions
after thermal oxidation of the Si.
Fabrication methods that use molecules as building blocks are anticipated. In such
methods, functions and characteristics are determined by chemical synthesis,
without relying on lithographic techniques. Research into the charging effect or
Coulomb blockade in a molecule began during the mid-1990s [23, 24], and more
recently persuasive data showing conductance modulation by gate potential have
been obtained [25-28]. Although the present understanding of transport in a
molecule is improving, many issues of circuit integration, including architectural
design, synthesis, and interfacing with external circuits, remain.
At this point, mention should be made of an infamous problem in SETs, known
as the background charge problem [54]. Due to randomly distributed mobile and
immobile charges in the dielectrics, the device characteristics may change over
time and differ from one device to another. This is because SETs have a high
sensitivity to charges due to their small size, which makes the integration of SETs
difficult. A typical case is seen in SETs made from metals and GaAs/AlGaAs
heterostructures. For example, the characteristics of SEDs with Al-AlOx junctions
change at least once a day. In order to stabilize such behavior, it may be necessary to
wait for a long time after cooling down before measurements can be made [55].
The situation is similar for carbon nanotubes and some molecules, as these suffer
from a large noise superimposed on the current characteristics, the origin of which
is unknown.
The background charge problem is not specific to SETs, however, and may occur in
any nanoscale field-effect device due to the high sensitivity to charges. In addition, the
amount, location, and stability of the background charges are highly material- and
process-dependent. In fact, it has already been shown that PADOX SETs have
excellent long-term stability. The drift of the characteristics is less than 0.01e over
a period of a week at cryogenic temperatures [56]. More practically, no noticeable
change in the characteristics has been observed for more than eight years, during
which time thermal cycling between room temperature and 20 K has occurred
several times [57]. It has also been shown that the voltage at which the first Coulomb-blockade-oscillation peak appears is controllable [58]. These results demonstrate that
PADOX SETs are not significantly influenced by slowly moving or immovable
background charges, which indicates that the background charge problem is not
intrinsic but rather can be solved. At present, no clear answers have been identified
as to how seriously fixed charges and/or the faster motion of charges, which causes
1/f noise, will obstruct integration. However, it is believed that a circuit design with
some degree of defect tolerance would relax the effects.
In summary, for room-temperature operation, an island smaller than 10 nm is necessary. At present, Si is preferable for SET fabrication from the viewpoints of operation stability and temperature. Some experimental data are available showing the control of the peak positions and the peak intervals in the current characteristics. However, these parameters are still difficult to control in room-temperature-operating SETs. Also, there are no data showing the control of the resistance of room-temperature-operating SETs, and these points remain the subjects of future studies. A more complete description of the fabrication process for Si SETs can be found in Ref. [59].
3.4
Single-Electron Logic
Many logic styles have been proposed and analyzed for single-electron devices. Most of them can be categorized into two groups: charge-state logic and voltage-state logic. Charge-state logic [8, 12], which uses one electron to represent one bit, is highly specific to single-electron devices and might be in some sense the ultimate logic. Devices other than SETs, such as single-electron pumps, are its building blocks. However, very few experimental studies have been reported regarding circuit operation based on this scheme because of the difficulty of the fabrication. Voltage-state logic [60–88], which will be described here in detail, uses the SET as a substitute for the conventional MOS field-effect transistor (FET); hence it is referred to as SET logic. Although the circuit characteristics are dominated by the Coulomb blockade and single-electron tunneling, these phenomena are not directly employed for computation. Instead, the current produced by the sequence of single-electron tunneling events is used, and the bit is represented by the voltage generated by the accumulation of plural electrons. Strictly speaking, this is not a genuine single-electron logic because 10^1 to 10^3 electrons will be used in the operation. In many aspects, this logic is analogous to CMOS logic. The major advantage is that the accumulated technologies for CMOS circuit design can be employed. However, the logic is not merely a copy of CMOS logic because the SET has completely different current characteristics from the MOSFET; that is, the current oscillates as a function of gate voltage. Some important elemental circuits, such as an inverter, an exclusive-OR (XOR) gate, a partial-sum/carry-out circuit, and an analog-to-digital converter, have been experimentally demonstrated [43, 46, 89–102]. Some of these will be introduced in the following three subsections.
3.4.1
Basic SET Logic
The voltage gain of the SET can be defined as for conventional transistors [5]. It is known from Figure 3.3 that when the drain voltage increases with a fixed gate voltage, a current begins to flow at the edge of the Coulomb diamond. As a result, the output drain voltage Vd for a fixed input drain current Id exhibits a Coulomb diamond as a function of the gate voltage Vg. The measured characteristics for a Si SET are shown in Figure 3.8 (upper panel). The two slopes in the figure correspond to the inverting and non-inverting voltage gains GI and GNI. As shown in Figure 3.3, their values are determined by the capacitances as

GI = Cg/Cd    (3.1)

GNI = Cg/(Cg + Cs)    (3.2)

Although GNI is always smaller than unity, GI exceeds unity if Cg > Cd. Therefore, CMOS-like logic circuits can be prepared using SETs as substitutes for MOSFETs. From Eq. (3.1) it is clear that the SET must have a large Cg in order to obtain a high inverting gain, which in turn means that the total capacitance of the SET island will tend to increase. Therefore, the voltage gain and operating temperature are in a trade-off relationship, and it is not easy to produce SETs with a larger-than-unity gain that operate at high temperatures. A GI value larger than unity has been achieved in metal-based [103, 104], GaAs-based [105], and Si-based [48, 50, 106] SETs.
The current cut-off characteristics are determined by the subthreshold slope S in the Id-Vg characteristics, which rise and fall almost exponentially at the tails of the peaks. Figure 3.8 (lower panel) shows the output drain current Id for a fixed input drain voltage Vd plotted as a function of Vg on a logarithmic scale. At a sufficiently low temperature and high tunnel resistance, S is given by

S = [d(log10 Id)/dVg]^-1 = ln10 (CΣ/Cg) kT/e    (3.3)

This equation is similar to that for a MOSFET. It also indicates that a high inverting voltage gain GI is needed to obtain a steep subthreshold slope. When Cs = Cd, a GI of 4 corresponds to a CΣ/Cg of 1.5, or S = 90 mV dec^-1 at room temperature.
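As a quick numerical check of Eqs. (3.1)-(3.3), the short sketch below evaluates the gains and the subthreshold slope for the capacitance ratios used in the worked example above (the absolute capacitance value is a hypothetical choice; only the ratios matter here):

```python
import math

# Hypothetical SET capacitances (farads), chosen so that C_s = C_d and G_I = 4,
# matching the worked example in the text.
C_g = 4e-19                  # gate capacitance, 0.4 aF
C_d = C_g / 4                # drain junction capacitance
C_s = C_d                    # source junction capacitance
C_sigma = C_g + C_s + C_d    # total island capacitance

G_I = C_g / C_d              # inverting gain, Eq. (3.1)
G_NI = C_g / (C_g + C_s)     # non-inverting gain, Eq. (3.2)

kT_over_e = 0.02585          # thermal voltage at 300 K (volts)
S = math.log(10) * (C_sigma / C_g) * kT_over_e   # subthreshold slope, Eq. (3.3)

print(f"G_I = {G_I:.1f}, G_NI = {G_NI:.2f}")
print(f"C_sigma/C_g = {C_sigma / C_g:.2f}, S = {S * 1e3:.0f} mV/dec")
# Expected: G_I = 4.0, C_sigma/C_g = 1.50, S ~ 89 mV/dec (about 90 mV/dec at 300 K)
```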
A logic circuit can be made by employing the above-mentioned voltage gain. The complementary inverter, which is the most fundamental logic element, was fabricated using Si SETs [89]. Figure 3.9 (left) shows an atomic force microscopy (AFM) image of an SET inverter made from Si. Two SETs with a voltage gain of about 2 were packed in a small (100 × 200 nm) area. As shown in Figure 3.9 (right), the input-output transfer curve attains both a larger-than-unity gain and a full logic swing at 27 K. Other complementary SET inverters, made from Al [90] and carbon nanotubes [91], have been reported, and resistive-load inverters have also been fabricated [92, 93].
3.4.2
Multiple-Gate SET and Pass-Transistor Logic
An important feature of SETs is that they can have plural gates. Such a multi-gate configuration enables the sum-of-products function to be implemented at the gate input level. That is, the total charge induced by the gates is expressed as Σ CgiVgi, where Cgi and Vgi are the gate capacitance and input voltage of the i-th gate. Provided that the gate input voltage Vgi is set to e/(2Cgi), the SET is ON when the number of ON gates is odd, and OFF when the number of ON gates is even (Figure 3.10). This function is XOR. The SET XOR gate [65, 68] is a powerful tool for constructing arithmetic units such as adders and multipliers because XOR is nothing other than what is termed the half-sum, which is the lower-order bit calculated by adding two one-bit binary numbers. The XOR gate has also been demonstrated experimentally using a Si dual-gate SET [94]. A scanning electron microscopy (SEM) image of the dual-gate SET is shown in Figure 3.11. The XOR function was confirmed in the output drain current at 40 K, as shown in the figure.
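The parity behavior underlying the XOR function can be captured in a few lines. The following is a minimal idealized sketch, assuming each ON input is biased at e/(2Cgi) so that it induces e/2 of island charge, and using a crude ON window around the half-integer induced-charge points; the capacitance values are hypothetical:

```python
E = 1.602e-19  # elementary charge (coulombs)

def set_is_on(gate_capacitances, gate_voltages):
    """Idealized multi-gate SET: the island sits at a conductance peak (ON) when
    the total induced gate charge is half-way between integer multiples of e."""
    q_induced = sum(c * v for c, v in zip(gate_capacitances, gate_voltages))
    frac = (q_induced / E) % 1.0          # fractional part of induced charge in units of e
    return abs(frac - 0.5) < 0.25         # crude ON window around the half-integer point

# Two-gate XOR demonstration: an ON input applies e/(2*C_gi), an OFF input applies 0 V.
C = [1e-19, 1e-19]
for a in (0, 1):
    for b in (0, 1):
        V = [a * E / (2 * C[0]), b * E / (2 * C[1])]
        print(a, b, "->", int(set_is_on(C, V)))
# Prints the XOR truth table: 0 0 -> 0, 0 1 -> 1, 1 0 -> 1, 1 1 -> 0
```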
In the multi-gate configuration, the gate capacitance of each gate is inherently smaller than that of the single-gate version. Therefore, it is more difficult to attain a larger-than-unity gain as the number of gates increases. The CMOS-domino-type logic was proposed as a way of using SETs without a voltage gain [73]. A combinational logic circuit is built in an SET logic tree, where SETs are used as pull-down transistors. The point is that the tree is operated with a sufficiently small drain voltage in order to make the Coulomb blockade effective. The output signal is then amplified by using MOSFETs before being transferred to the next logic segment.
A single-electron pass-transistor logic, where SETs are used both as pull-up and pull-down transistors, has also been studied. The fundamental circuit of the single-electron pass-transistor logic was fabricated using PADOX SETs, and half-sum and carry-out operations for the half adder have been experimentally demonstrated [95, 96]. Figure 3.12 shows the equivalent circuits and the measurement data. Both half-sum and carry-out are correctly output at 25 K. What is significant here is that the gate and total capacitances, and even the peak positions of the SETs used, were well controlled for these operations. This is the first arithmetic operation ever performed by SET-based circuits. There have been attempts to construct logic elements [97–102] that operate based on the above-mentioned domino-type logic, pass-transistor logic, or the so-called binary-decision-diagram logic.
3.4.3
Combined SET-MOSFET Configuration and Multiple-Valued Logic
In SETs, the applicable drain voltage is limited to a value smaller than e/CΣ in order to maintain the Coulomb blockade. This may be an obstacle to driving a series of SETs and external circuits that require a high input voltage. A combined SET-MOSFET configuration has been proposed as a way to overcome this drawback [83, 85]. Figure 3.13 (left) shows the equivalent circuit of the inverter based on this configuration. A MOSFET with a fixed gate bias Vgg is connected to the drain of an SET, and the inverter is driven by a constant current load, I0. The MOSFET keeps the SET drain voltage sufficiently low, which helps to maintain the Coulomb blockade. As the drain voltage is almost independent of the output voltage, Vout, a large output voltage and voltage gain can be obtained.
The output voltage Vout and output resistance of the combined SET-MOSFET inverter are given by [86]:

Vout = -Gm(SET) Rd(SET) (1 + Gm(MOS) Rd(MOS)) Vin    (3.4)
Rout = Rd(SET) (1 + Gm(MOS) Rd(MOS)) + Rd(MOS)    (3.5)
where Gm(SET) is the transconductance of the SET, and Rd(SET) and Rd(MOS) are the drain resistances of the SET and MOSFET, respectively. The voltage gain of the SET is multiplied by that of the MOSFET, which means that the voltage gain of the SET-MOSFET inverter becomes very large because of the large voltage gain of the
MOSFET (see Figure 3.13, right). In fact, the measured voltage gain of the SET-MOSFET inverter was about 40 [86].

Figure 3.14 Measurement set-up for the single-electron quantizer [85]. CLK = clock.
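Taking the reconstructed Eq. (3.4) at face value, a rough estimate shows how a modest SET gain is boosted by the MOSFET stage. The small-signal values below are illustrative assumptions only; they are not taken from Ref. [86]:

```python
# Illustrative small-signal estimate of the SET-MOSFET inverter gain, Eq. (3.4).
# All values are hypothetical, chosen only to show how an SET gain of order
# unity can be multiplied up to a gain of order 40 by the MOSFET stage.
Gm_set, Rd_set = 1e-6, 2e6    # SET transconductance (S) and drain resistance (ohm)
Gm_mos, Rd_mos = 2e-4, 1e5    # MOSFET transconductance (S) and drain resistance (ohm)

gain = Gm_set * Rd_set * (1 + Gm_mos * Rd_mos)
print(f"|Vout/Vin| ~ {gain:.0f}")  # SET-alone gain of 2, boosted by a factor of 21 -> 42
```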
In this configuration, another important point is that the inverter's transfer characteristics reflect the oscillatory Id-Vg characteristics of the SET (Figure 3.13, right). This characteristic is referred to as the universal literal, which is a basic unit for multiple-valued logic. Multiple-valued logics have potential advantages over binary logics with respect to the number of elements per function and operating speed. They are also expected to relax the interconnection complexity inside and outside of LSIs. These are advantageous, as they allow a further reduction in the power dissipation in LSIs and in the chip sizes. However, success has been limited, partially because the devices that have been used (MOSFETs and negative-differential-resistance devices, such as resonant tunneling diodes) are inherently single-threshold or single-peak, and are not fully suited for multiple-valued logic. The oscillatory behavior seen in Figure 3.13 shows that the SET is suitable for implementing multiple-valued logic. By exploiting this behavior, a quantizer was fabricated. Figures 3.14 and 3.15 show the measurement set-up for the quantizer and the measured data, respectively. The triangular input Vin was successfully quantized into six levels.
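The quantizing action follows from the periodic literal: each Coulomb-oscillation period of the input maps onto one output level. A minimal staircase sketch of this idea, with a hypothetical gate capacitance, is:

```python
import math

E = 1.602e-19
C_g = 4e-19   # hypothetical gate capacitance; one oscillation period = e/C_g ~ 0.4 V

def quantize(v_in):
    """Ideal staircase: the output level advances by one for every Coulomb
    oscillation period e/C_g traversed by the input voltage."""
    return math.floor(v_in / (E / C_g))

for v in [0.1, 0.5, 0.9, 1.3, 1.7, 2.1]:
    print(f"Vin = {v:.1f} V -> level {quantize(v)}")
# Six distinct levels over 0 to ~2.4 V, analogous to the six-level
# quantization of the triangular input in Figure 3.15.
```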
3.4.4
Considerations on SET Logic
Many research groups have claimed that SETs could provide low-power circuits. In order to clarify the meaning of this claim, two parameters must first be discussed, namely the information throughput I and the power density P. These parameters can be written in the following forms:

I = a n f    (3.6)

P = Ebit n f    (3.7)

where n is the density of the binary switches, f the operating frequency, and Ebit the bit energy. A dimensionless parameter, a, was introduced, which is referred to as the functionality.
Figure 3.15 Experimental data for the single-electron quantizer [85]. CLK = clock.

For charge-state logic, the addition energy should be sufficiently large so as to avoid bit errors caused by thermal
noise. As charge-state logic uses single electrons to represent bit information, the bit energy is given by (1/2)CΣV^2, where V = e/CΣ. This is in effect the addition energy. If the bit-error requirements for CMOS LSIs are assumed, then the bit energy will have to be 10^2 times larger than the thermal noise energy. This means that an addition energy 10^2 times larger than the room-temperature thermal energy is needed. From Figure 3.4 it is clear that an island as small as 1 nm is needed. SET logic, on the other hand, uses the voltage generated at output terminals to represent bits, as do CMOS circuits. The bit energy is therefore given by (1/2)CLV^2, where CL is the load capacitance. If the term V = e/CΣ is adopted for the power supply voltage of SET logic circuits, then the bit energy is CL/CΣ times larger than in the case of charge-state logic. Therefore, for SET logic, the addition-energy requirement does not come from the bit-error requirement but rather from the static power loss, because a small addition energy causes the valley current (i.e. the OFF-current) to increase. There is no clear guideline as to how small the OFF-current should be, because the acceptable static power loss depends on the degree of the power-saving ability of the system. If the requirement for low-operating-power (LOP) applications stated in the International Technology Roadmap for Semiconductors (ITRS) is adopted, the source/drain OFF-state leakage current should be on the order of 10^-9 A μm^-1, which corresponds to 10 pA for 10-nm SETs. This will be achievable. It is also necessary to have a large ON/OFF current ratio and, again, considering the requirement for LOP applications, a ratio of 10^5 is needed. From this ratio, the addition energy should be about 16 kT or larger, which was derived based on the standard theory of single-electron tunneling. This estimation does not consider the quantum leakage current, which becomes significant as the junction resistance approaches the quantum resistance. However, as long as the junction resistance is not very close to the quantum resistance, the above criterion for the addition energy will be a reasonable basis for later discussions. With this requirement, an addition energy E as large as 0.4 eV is required for room-temperature operation. This addition energy corresponds to a total island capacitance (CΣ = e^2/E) of 0.4 aF, an excitation voltage (e/CΣ) of 0.4 V and, from Figure 3.4, an island size of about 3 nm.
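These figures follow directly from e and kT; a quick check:

```python
# Back-of-the-envelope check of the room-temperature requirements quoted above.
E = 1.602e-19          # elementary charge (C)
kT_300K_eV = 0.02585   # thermal energy at 300 K (eV)

E_add_eV = 16 * kT_300K_eV     # addition energy of ~16 kT
C_sigma = E / E_add_eV         # C_sigma = e^2/E_add; eV units cancel against one e
V_exc = E_add_eV               # excitation voltage e/C_sigma = E_add/e (volts)

print(f"E_add ~ {E_add_eV:.2f} eV")           # ~0.41 eV
print(f"C_sigma ~ {C_sigma * 1e18:.2f} aF")   # ~0.39 aF
print(f"e/C_sigma ~ {V_exc:.2f} V")           # ~0.41 V
```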
There are two time scales for evaluating SET speed: one is the intrinsic switching time, and the other is the circuit speed. The intrinsic switching time defines how fast the SET changes its states, and is determined by the RC time constant of the tunneling, CΣRj, where Rj is the junction resistance. If it is assumed that CΣ = 0.4 aF and Rj = 1 MΩ (about 40 Rq), the switching time will be 0.4 ps, and thus the SET is a fairly fast switching device. The problem is that only one electron is moved by the switching event, and it thus takes a much longer time to change the state of an output terminal with a larger capacitance. The time for changing the state of the output terminal is determined by CLRSET, where RSET is the resistance of the SET (about 4Rj). If it is assumed that CL = 100 aF and RSET = 4 MΩ, the time is 0.4 ns, corresponding to 2.5 GHz. It will also be helpful to compare the SET current density with that of the present nMOS transistor (for LOP applications), which is about 600 μA μm^-1. Assuming that the size, resistance, and drain voltage of the SET are 3 nm, 4 MΩ, and 0.4 V, respectively, the SET current density will be 33 μA μm^-1. Although this is not fatally bad, it implies that the use of SETs is restricted to local communication with relatively small load capacitances. Crudely speaking, the SET is inherently slower by a factor of at least 10^1 than FETs, because the SET cannot operate with a resistance smaller than Rq, whereas FETs can.
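The time constants and the current density quoted above can be reproduced as follows:

```python
# Reproducing the speed and current-density estimates quoted above.
C_sigma = 0.4e-18   # total island capacitance (F)
R_j = 1e6           # junction resistance (ohm), ~40 x the quantum resistance
C_L = 100e-18       # load capacitance (F)
R_set = 4e6         # SET resistance, ~4 R_j (ohm)

tau_intrinsic = C_sigma * R_j   # intrinsic switching time
tau_circuit = C_L * R_set       # time to charge the output node

I = 0.4 / R_set                 # drain current at 0.4 V
J = I / 0.003                   # current density for a 3-nm-wide device (A per um)

print(f"intrinsic: {tau_intrinsic * 1e12:.1f} ps")   # 0.4 ps
print(f"circuit:   {tau_circuit * 1e9:.1f} ns "
      f"(~{1e-9 / tau_circuit:.1f} GHz)")            # 0.4 ns, 2.5 GHz
print(f"current density: {J * 1e6:.0f} uA/um")       # ~33 uA/um
```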
The SET can be made very small, but this does not necessarily mean it will be the smallest. Ideally, a molecular-sized FET can be imagined, and could be made as small as the SET. Therefore, small size is not a major merit of the SET; rather, the main merit is that its operation is guaranteed even at the molecular level, and some parameters such as the switching speed and the current peak-to-valley ratio can be improved owing to the reduced capacitance. At this point, it might be safe to say that SETs have no definite advantage over ultimately scaled-down MOSFETs from the viewpoint of the physical size itself.
Now, a return should be made to Eqs. (3.6) and (3.7). Considering the above arguments that the SET size is comparable to that of ultimate future FETs, and that the circuit speed is 10^1 to 10^2 times worse than its CMOS counterparts, the functionality a will need to be increased by 10^1 or more in order to make a drastic improvement in I/P while keeping I^2/P comparable to that of CMOS circuits.
Several ideas for improving the functionality have been reported. One is to use the SET XOR gates introduced in Section 3.4.2. Figure 3.16 shows the equivalent circuit of a full adder based on the SET XOR gate. A full adder can be constructed using six SETs, whereas 28 MOSFETs are required in CMOS logic [106]. This can be interpreted as a = 4.7. It has also been reported that, by integrating SET full adders, multi-bit adders can be constructed in a very area-efficient manner: there are no long wires in the carry-propagation path, which leads to fast operation in spite of the low drivability of the SET [80]. Another idea is based on the SET-MOSFET configuration introduced in Section 3.4.3. Based on this configuration, an SET logic gate family has been proposed [87, 88].
Figure 3.16 The SET full adder. A and B are addends and C is the carry. CON is the control signal, which controls the phase of the Coulomb-blockade oscillation [80].
These SET logic gates are useful for implementing binary logic circuits, multiple-valued logic circuits, and binary/multiple-valued mixed logic circuits in a highly flexible manner. As an example, a 7-3 counter is shown in Figure 3.17. This is a member of the M-N counters, which are generalized counters defined in the framework of Counter Tree Diagrams. Most adders, including those for redundant number systems, can be represented in this framework. The 7-3 counter can be constructed using four SETs and 10 FETs with some passive components, whereas 198 FETs are required in CMOS logic. The functionality a is of the order of 10^1 in this case.
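If the functionality a is read simply as the ratio of device counts (an assumption; the text defines a only loosely), the two examples give:

```python
# Device-count ratios behind the functionality estimates quoted above.
a_full_adder = 28 / 6          # 28 CMOS FETs vs. 6 SETs          -> ~4.7
a_counter_73 = 198 / (4 + 10)  # 198 CMOS FETs vs. 4 SETs + 10 FETs -> ~14
print(f"full adder:  a ~ {a_full_adder:.1f}")
print(f"7-3 counter: a ~ {a_counter_73:.1f} (order 10^1)")
```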
The increase in a in the two examples is due to the application of the periodic SET function for implementing the operation 'add'. This is because parity and the periodic function are fundamental to arithmetic. These examples strongly suggest that the best use of SETs will be in arithmetic units such as adders, while other arithmetic units such as multipliers are made from adders. Adders can be built by the repetition of relatively simple layouts, and require smaller amounts of long wiring, which will compensate for the low drivability of SETs. It is believed that, by pursuing this direction, a more efficient way to increase the functionality will be found. For this purpose, much larger-scale circuits should be investigated than have been studied to date.
In summary, the SET can function as a fairly fast switching device and, although the SET has low drivability, this will not prove fatal. The periodic nature of the SET current characteristics suggests that it should be applied to arithmetic units such as adders and multipliers, which might reduce their dynamic power consumption. The most suitable applications for SET-based voltage-state circuits will be LOP arithmetic units. However, there may be no merit in using SETs in terms of static power consumption.

3.5
Conclusions

References
1 J. D. Meindl (Ed.), Proc. IEEE 2001, 89.
2 D. V. Averin, K. K. Likharev, J. Low Temp. Phys. 1986, 62, 345.
3 L. S. Kuzmin, K. K. Likharev, J. Exp. Theoret. Physics Lett. 1987, 45, 495.
4 T. A. Fulton, G. J. Dolan, Phys. Rev. Lett. 1987, 59, 109.
5 K. K. Likharev, IEEE Trans. Magn. 1987, 23, 1142.
6 D. V. Averin, K. K. Likharev, Mesoscopic Phenomena in Solids, Chapter 6, B. L. Altshuler, P. A. Lee, R. A. Webb (Eds.), Elsevier, Amsterdam, 1991.
47 H. Harata, M. Saitoh, T. Hiramoto, Jpn. J.
Appl. Phys. 2005, 44, L640.
48 K. Miyaji, M. Saitoh, T. Hiramoto, Appl.
Phys. Lett. 2006, 88, 143505.
49 Y. Ono, Y. Takahashi, K. Yamazaki, M.
Nagase, H. Namatsu, K. Kurihara, K.
Murase, IEEE Trans. Electron Devices 2000,
47, 147.
50 Y. Ono, K. Yamazaki, Y. Takahashi, IEICE
Trans. Electron. 2001, E-84C, 1061.
51 K. Shiraishi, M. Nagase, S. Horiguchi, H. Kageshima, M. Uematsu, Y. Takahashi, K. Murase, Physica E 2000, 7, 337.
52 S. Horiguchi, M. Nagase, K. Shiraishi, H.
Kageshima, Y. Takahashi, K. Murase, Jpn.
J. Appl. Phys. 2001, 40, L29.
53 A. Fujiwara, Y. Takahashi, K. Murase, M.
Tabe, Appl. Phys. Lett. 1995, 67, 2957.
54 A. B. Zorin, F.-J. Ahlers, J. Niemeyer,
T. Weimann, H. Wolf, V. A. Krupenin,
S. V. Lotkhov, Phys. Rev. B 1996, 53,
13682.
55 W. H. Huber, S. B. Martin, N. M.
Zimmerman, Proceedings of Experimental
Implementation of Quantum Computation,
p. 176, Rinton Press, Princeton, 2001.
56 N. M. Zimmerman, W. H. Huber, A.
Fujiwara, Y. Takahashi, Appl. Phys. Lett.
2001, 79, 3188.
57 Y. Takahashi, Y. Ono, A. Fujiwara, K.
Shiraishi, M. Nagase, S. Horiguchi,
K. Murase, Proceedings of Experimental
Implementation of Quantum Computation,
p. 183, Rinton Press, Princeton, 2001.
58 A. Fujiwara, M. Nagase, S. Horiguchi, Y.
Takahashi, Jpn. J. Appl. Phys. 2003, 42,
2429.
59 Y. Takahashi, Y. Ono, A. Fujiwara, H.
Inokawa, J. Phys.: Condens. Matter 2002,
14, R995.
60 J. R. Tucker, J. Appl. Phys. 1992, 72, 4399.
61 M. I. Lutwyche, Y. Wada, J. Appl. Phys.
1994, 75, 3654.
62 M. Kirihara, N. Kuwamura, K. Taniguchi,
C. Hamaguchi, Ext. Abstracts 1994
International Conference on Solid State
Devices and Materials, p. 328, Business
Center for Academic Societies Japan,
Tokyo, 1994.
4
Magnetic Domain Wall Logic
Dan A. Allwood and Russell P. Cowburn
4.1
Introduction
The integrated circuit, which during recent years has become the basis of modern digital electronics, functions by making use of electron charge. However, electrons also possess the quantum mechanical property of spin, which is responsible for magnetism. New spintronic technologies seek to make use of this electron spin, sometimes in conjunction with electron charge, in order to achieve new types of device. Several spintronic devices are currently being developed that outperform traditional electronics. Often, this results from an increased functionality, which means that a single spintronic element performs an operation that requires several electronic elements.

Different approaches to spintronics have been developed by the semiconductor and magnetism communities. Although there have been some very impressive demonstrations of spin-polarized charge transport and ferromagnetism in cooled semiconductors [1–3], the lack of a reliable room-temperature semiconductor ferromagnet has hampered their application. Within the magnetism community, however, considerable success has been achieved at room temperature by using common ferromagnetic materials such as Ni81Fe19 (Permalloy). This approach offers the benefits of low-power operation, non-volatile data storage (no power required), and a high tolerance of both impurities and radiation.
A great success of electronics has been the ability to use groups of transistors for performing Boolean logic operations. Each type of operation has a particular relationship between its input and output states, each of which can take the value '1' or '0'. These relationships are shown in the truth tables in Table 4.1 for the Boolean NOT, AND, and OR logic operations. Logical NOT has a single input and a single output, with the output having the opposite state of the input. Logical AND has two independent inputs and a single output that is '1' for an input combination of '11' and '0' otherwise. Conversely, a logical OR output is '0' for an input combination of '00' and '1' otherwise. Importantly, a suitable combination of NOT and AND (or NOT and OR) gates can implement any Boolean logic function.
Table 4.1 Truth tables for the Boolean NOT, AND, and OR logic operations.

NOT
Input A   Output B
0         1
1         0

AND
Input A   Input B   Output C
0         0         0
0         1         0
1         0         0
1         1         1

OR
Input A   Input B   Output C
0         0         0
0         1         1
1         0         1
1         1         1
Table 4.2 Common electronic circuit symbols and the equivalent
CMOS and domain wall logic devices.
Alternatively, circuits made from planar magnetic nanowires can be used, with wires typically 100 to 250 nm wide and 5 to 10 nm thick. The shape anisotropy (geometry) of these wires creates a magnetic easy axis along the wire's long axis that defines the stable orientations of magnetization (Figure 4.1). This system with two opposite stable magnetization orientations is ideal for representing logical '1' and '0' (see Table 4.2). Where opposite magnetizations meet, they are separated by a transition region through which the magnetization rotates by 180° (Figure 4.1). This is another form of magnetic soliton, and is called a domain wall. For the wire dimensions relevant here, domain walls are typically approximately 100 nm wide. Domain walls can be moved by applying magnetic fields, and it is this ability which is exploited in magnetic domain wall logic. Domain walls travel down sections of nanowire between nanowire junctions where logic operations are performed. Crucially, the influence of nanowire imperfections on domain wall propagation is very significantly reduced compared with the propagation of solitons in interacting dots. Furthermore, very little power is required either to propagate a domain wall or to perform a logic operation, compared to the lowest-power CMOS equivalents or magnetic alternatives. This combination makes magnetic domain wall logic a robust, low-power logic technology. The remainder of this chapter is devoted to explaining how magnetic domain wall logic functions, and what the future prospects of the technology might be.
4.2
Experimental
All of the magnetic structures shown here were fabricated by focused ion beam (FIB) milling [18] of 5 nm-thick Permalloy films. The films were thermally evaporated onto Si(0 0 1) substrates with a native oxide present, in a chamber with a base pressure <10^-7 torr. FIB milling used 30 keV Ga ions which were focused to a diameter of 7 nm at the substrate. The Ga ions sputter the magnetic material and scatter within the film and substrate to implant at lateral distances up to 40 nm from the spot center [19]. Both of these processes lead to a loss of ferromagnetic order in the Permalloy film and allow nanostructures to be defined. A 150 × 150 μm square of magnetic material is cleared around each nanowire circuit to allow optical analysis using the magneto-optical Kerr effect (MOKE).
4.3
Propagating Data
Figure 4.2 Atomic force microscopy images and magneto-optically measured magnetization hysteresis curves from: (a) a 100 nm-wide wire and (b) a 100 nm-wide wire with a 1 μm × 1 μm square domain wall injection pad [22]. Measuring on either side of the kink does not change the observed switching field. The direction of the magnetic field Hx is indicated in both images.
For this wire, magnetization reversal occurs at a coercive field Hc = 180 Oe. Unwanted domain wall nucleation from wire ends, junctions, and corners must generally be avoided in logic circuits, as this will corrupt any existing data. It is imperative, therefore, that domain walls can be introduced and propagated at magnetic fields lower than, in this case, 180 Oe. Figure 4.2b shows a structure with a similar wire to that in Figure 4.2a, but now with a 1 μm × 1 μm square pad attached to one end. MOKE measurement of the wire shows that Hc = 39 Oe. This reduction in Hc is a result of the square pad undergoing magnetization reversal at Hc = 26 Oe (not shown) before a domain wall is injected into the wire at Hx = 39 Oe. Different regions of the wire all have the same coercive field, even beyond the 30° kink, showing that domain walls can propagate in wires and through changes of wire direction at fields significantly lower than nucleation fields. To quantify this low-field propagation more precisely, the domain wall velocity was measured in a 200 nm-wide Permalloy wire (Figure 4.3), in an experiment described elsewhere [23]. Measured domain wall velocities exceeded 1500 m s^-1 for certain fields applied along the wire's long axis. Other studies [24–27] have shown that domain wall velocity does not increase continually with field, but rather reaches a maximum value before reducing at higher fields. Interestingly, domain wall propagation is still observed at fields as low as 11 Oe [23], albeit with very low velocities of 0.01 m s^-1. The data in Figures 4.2 and 4.3 indicate that nanowire devices operating by domain wall propagation will require 11 Oe < Hx < 180 Oe, although these field values will change once wire junctions are introduced.
Simply using a unidirectional field will not allow domain walls to be separated reliably and, hence, normal data streams containing both '1's and '0's cannot be propagated. Instead, use is made of the orthogonal fields Hx and Hy to create a magnetic field vector that rotates in the plane of the sample to control domain wall propagation around smooth 90° wire corners. An important rule for understanding the nanowire circuits is that domain walls will propagate around corners of the same sense of rotation as the applied field; that is, a clockwise-rotating field will lead to domain walls traveling around corners clockwise. In a correctly designed nanowire circuit, the sense of field rotation will define a unique direction of domain wall propagation and, hence, of data flow. This is an essential feature of a Boolean logic system. Interestingly, the direction of data flow in magnetic domain wall logic can, in principle, be reversed simply by reversing the sense of field rotation.
At this point it must be considered how binary data are represented in magnetic domain wall logic. It was mentioned in Section 4.1 how the two opposite magnetization directions supported in magnetic nanowires can be used to represent the logic states '1' and '0'. However, care must be taken here with the definition chosen for magnetic nanowire circuits, as the wires can change direction. Figure 4.4a and b show, schematically, two similar magnetic wires with opposite magnetizations, each containing a single domain wall. A simple approach would be to say that magnetization pointing to the right represents logical '1', and that pointing to the left represents logical '0'. However, Figure 4.4c–e shows the magnetization following a domain wall that propagates around a 180° wire corner. In the final situation (Figure 4.4e), the magnetization is continuous up to the domain wall, meaning that there are no changes in logic state up to this point. However, the absolute directions of magnetization are opposite on either side of the turn, and so the simple definition cannot then be valid. Instead, the choice is made to define data representation in terms of the direction of magnetization relative to the direction of domain wall motion. In Figure 4.4c–e, the magnetization following the domain wall is always oriented in the direction of domain wall motion, so the logic state represented remains unchanged. This robust definition allows measurements from different parts of logic circuits to be interpreted correctly.
4.4
Data Processing
The NOT-gate was the first domain wall logic device to be introduced [28–30], and is foundational to the development of all other logic elements. Figure 4.5 shows the geometry of a NOT-gate, and illustrates its principle of operation. The NOT-gate is a junction formed by two wires. For a given field rotation, one wire will act as the input
and the other wire as the output. A small central stub which emerges from the wire junction is an important part of the device, as it ensures there is sufficient shape anisotropy to maintain a magnetization component in the direction in which the stub points. The dimensions for an optimized NOT-gate design are given in Table 4.2. Under a rotating magnetic field, H, a domain wall enters the NOT-gate input wire (Figure 4.5a) before reaching the wire junction (Figure 4.5b). The magnetization following the domain wall points in the direction of domain wall propagation. Provided that there is sufficient field, the domain wall expands over the junction and splits in two, with one domain wall traveling along the central stub, leaving the stub magnetization reversed, and the other free to propagate along the NOT-gate output wire (Figure 4.5c). As the field continues to rotate, the domain wall in the output wire leaves the NOT-gate. The magnetization following the domain wall is now pointing away from the direction of domain wall motion. The magnetization on either side of the wire junction is reversed, and the device has inverted the input logical state. The reversal in magnetization means that the inversion process would be expected to require one half-cycle of field.
Figure 4.6a shows a structure containing a NOT-gate fabricated in a square loop of magnetic wire that has been used to test the operation of the logic device [28–30]. The loop provides feedback to the NOT-gate by joining the output wire to the input wire. Having a single inverter in the loop guarantees that at least one domain wall will be present [28–30] and removes the need, at this stage, for explicit data input. If the principle of operation for a NOT-gate described above is correct, it should be possible to predict the switching period of the NOT-gate/loop structure in terms of field cycles. As mentioned above, a domain wall will take one half field cycle to propagate through the NOT-gate junction. Traveling around the 360° wire loop will then add another field cycle. So, magnetization reversal would be expected to be seen at any position in the loop every 3/2 field cycles, giving a switching period of three field cycles.
Figure 4.6b shows a time-averaged MOKE trace obtained from position I on the structure (Figure 4.6a) under anti-clockwise field conditions [29]. As expected, a
three-field-cycle switching period was observed, indicating that the wire junction is performing as a NOT-gate. To validate this further, Figure 4.6c shows a MOKE trace obtained from position II, on the other side of the NOT-gate (Figure 4.6a), under the same field conditions. This trace is inverted compared to that in Figure 4.6b, except for a one-half field cycle delay, consistent with the operation of a NOT-gate outlined above. As the NOT-gate has an equal number of input and output wires that have identical geometry, it may be operated reversibly. Figure 4.6d and e show measurements from positions I and II, respectively, but now under a clockwise-rotating field. The observed phase relationship indicates that position II has now become the input and position I the output. Figure 4.6f shows a MOKE trace that is obtained with the laser spot positioned over the NOT-gate (position III, Figure 4.6a). A domain wall is observed to enter and leave the NOT-gate in a manner that correlates with the traces shown in Figure 4.6d and e.

An important aspect of this initial demonstration is that the applied field acts as both power supply and clock to the magnetic circuit. The structure in Figure 4.6a can be thought of as having the equivalent electronic circuit shown in Figure 4.7. Electronic inverters do not have a delay in terms of a clock cycle, so a buffer must be introduced with a delay of T/2, where T represents a clock period. Another buffer with a delay of T is then introduced within a feedback loop to represent the time spent by a domain wall propagating around the wire loop. The signal from this circuit will replicate those observed in Figure 4.6.
One of the advantages of having input and output wires of identical form is that logic gates can be directly connected together. Figure 4.8a shows a magnetic shift register circuit made of 11 NOT-gates within a wire loop [28]. The expected switching period can again be calculated: 11 × 1/2 field cycles for domain wall propagation through the NOT-gates, plus one field cycle for the loop, gives a magnetization reversal every 6.5 field cycles, or a switching period of 13 field cycles. Figure 4.8b shows the MOKE trace obtained from the structure, in which the 13-field-cycle period is clearly observed. The measurement was obtained over 30 min of averaging, indicating that almost 10^5 logical NOT operations were successfully performed. If there had been a problem with a domain wall propagating through a NOT-gate on just one occasion, the resultant phase difference introduced would be clearly visible in the time-averaged trace (Figure 4.8b). Although in this case the topology of the shift register structure has been used to ensure the presence of a single domain wall, it will be seen later (in Section 4.5) how similar shift registers can support complex data sequences. The one-half cycle delay for domain wall propagation creates a natural buffer between data bits, removing the need for any complex circuitry, such as the flip-flop circuits that are commonly used in electronic memories.
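The period arithmetic used for both loop structures generalizes to any number of NOT-gates in a loop; a small sketch:

```python
def switching_period_field_cycles(n_not_gates):
    """Field cycles per full magnetization period of a NOT-gate loop:
    each NOT-gate adds a half-cycle delay, the loop itself adds one cycle,
    and a full period requires two magnetization reversals."""
    per_reversal = n_not_gates * 0.5 + 1.0
    return 2 * per_reversal

print(switching_period_field_cycles(1))   # 3.0  -> single NOT-gate loop (Figure 4.6)
print(switching_period_field_cycles(11))  # 13.0 -> 11-gate shift register (Figure 4.8)
```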
Characterizing the performance of magnetic domain wall NOT-gates is essential for design optimization [29] and for integrating them with other types of nanowire junction. Here, structures similar to that in Figure 4.6a were used to assess a NOT-gate's operation as a function of the in-plane rotating field amplitudes Hx0 and Hy0. Three types of operation are observed:

. When the field amplitudes are too low, domain walls experience pinning at the NOT-gate junction, and this leads either to no switching for very low fields, or else to de-phasing of the time-averaged MOKE trace when domain walls are pinned even once.

. At intermediate field amplitudes, domain walls propagate through the junction and correct NOT-gate operation is observed.

. At high fields, additional domain wall pairs are nucleated in the structure and the magnetization reversal has a single field cycle period.
It should be noted that, after domain wall nucleation is observed at high fields, it is necessary to reduce the number of domain walls back to one by applying field conditions for occasional de-pinning [29]. This allows domain wall pairs to meet and annihilate. Figure 4.9 shows the resulting phase diagram describing NOT-gate operation as a function of field. There are two phase boundaries present, one separating domain wall pinning from correct operation, and the other separating correct operation from domain wall nucleation. The two boundaries meet to define an area of field phase space in which the NOT-gate operates correctly. Figure 4.9 was taken for a NOT-gate with the optimized dimensions given in Table 4.2.
The other circuit elements that are required for a realistic logic system are a majority gate, signal fan-out, and signal cross-over. The NOT-gate operating phase diagram (Figure 4.9) provides a useful and necessary reference for comparing the performance of these additional elements to ensure compatibility. Figure 4.10a–c shows three structures used for testing the operating fields of majority gate junctions [31]. Each junction has two input wires and one output wire, with the structures having (a) no, (b) one, and (c) two input wires terminated by a domain wall injection pad. The low field at which domain walls are introduced from an injection pad means that they provide a means of testing majority gate junction operation
when 0, 1, or 2 domain walls are present in the input wires. Clearly, the output arm switching fields (Figure 4.10d–f) reduce as the number of domain walls present at the junction increases. In terms of magnetization dynamics, it is interesting that a single domain wall appears capable of expanding across a junction before propagating through the output wire, and that two domain walls are able to interact to enable very low output wire switching fields. Figure 4.10g shows a more detailed analysis of the operation of optimized majority gate junctions (dimensions given in Table 4.2) as a function of the in-plane rotating field amplitudes. The different input conditions now lead to two field-space regions of operation, depending on the number of domain walls present before switching. Crucially, comparison with Figure 4.9 shows that there is overlap between the NOT-gate operating region and both operating regions of the majority gate. The question remains, however, whether to use field amplitudes from the lower-field operating region of the majority gate, or the higher. The answer is to use both. For a majority gate aligned parallel to Hx, field conditions Hx0 = 120 Oe and Hy0 = 50 Oe will mean that the output wire will switch whenever there is just one domain wall present. This corresponds to an input condition of '01' or '10'. Clearly, this should not happen for either an AND-gate or an OR-gate. However, by examining the truth tables for each (see Table 4.1), it becomes obvious that a '10' input should always lead to a '0' output for an AND-gate and a '1' output for an OR-gate. However, Hx need not be symmetric; instead, a dc field HxDC can additionally be applied to bias Hx, so that for one sense of Hx the majority gate reverses with a domain wall in either input, while in the other sense of Hx the majority gate requires domain walls in both input wires. The AND/OR function of the gate is then selected by the polarity of HxDC.
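Logically, this field-bias selection is equivalent to a three-input majority vote in which the dc bias acts as the third input, a standard construction sketched below:

```python
def majority_gate(a, b, bias):
    """Three-input majority vote; the bias input plays the role of H_x^DC,
    selecting AND (bias=0) or OR (bias=1) behavior for the two data inputs."""
    return int(a + b + bias >= 2)

for a in (0, 1):
    for b in (0, 1):
        print(a, b, "AND:", majority_gate(a, b, 0), "OR:", majority_gate(a, b, 1))
# AND: output is 1 only for inputs 1,1.  OR: output is 0 only for inputs 0,0.
```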
Signal fan-out and signal cross-over junctions were developed in a similar manner [32], with optimized geometries shown in Table 4.2. Figure 4.11a shows a circuit that integrates all of the structures necessary for performing logic operations: a NOT-gate, a majority gate, two signal fan-out junctions, and a signal cross-over element [33]. An anti-clockwise rotating field with amplitudes of Hx0 = 75 Oe and Hy0 = 88 Oe was used with HxDC = 5 Oe (Figure 4.11b) in order to circulate domain walls in an anti-clockwise direction and select logical AND operation for the majority gate. The NOT-gate/loop is similar to those discussed above, and will contain a single domain wall and have a magnetization switching period of three field cycles, as before. The difference from Figure 4.6a, however, is that a fan-out structure is incorporated within the loop to split a domain wall each time it is incident on the junction. Part of the domain wall will continue propagating around the loop, while the other part exits the loop to the rest of the circuit. When used in this manner, the NOT-gate/loop acts as a three-field-cycle-period signal generator for testing the circuit. A domain wall that exits the NOT-gate/loop is then split again at a second fan-out junction. MOKE measurement at position I in Figure 4.11a shows that the three-field-cycle period from the NOT-gate/loop is preserved through both fan-out junctions (Figure 4.11b, trace I). The domain walls from the second fan-out junction now follow separate paths before reaching the AND-gate inputs. The domain wall that passes position I simply has to propagate through two 90° corners and some straight wire sections. The resulting half-field-cycle delay between domain walls passing position I and arriving
at the AND-gate is indicated by trace II in Figure 4.11b. The other domain wall from the fan-out junction has to negotiate a cross-over junction and an additional 360° loop before arriving at an AND-gate input at position III (Figure 4.11a). The inclusion of the loop tests the operation of the cross-over element and will create a one-field-cycle delay between domain walls arriving at positions II and III, as indicated in the inferred trace III in Figure 4.11b. Measurement at position IV in Figure 4.11a shows that the output is high only when both inputs are high, showing that the majority gate is operating correctly as an AND-gate. Furthermore, this demonstrates that all four of the element types can operate under identical field conditions in a single circuit.

Figure 4.11 Operation of the magnetic circuit within an anti-clockwise rotating field with Hx0 = 75 Oe, Hy0 = 88 Oe and HxDC = 5 Oe. Experimental MOKE measurements from positions I and IV of the circuit are shown. Traces II and III are inferred from trace I, and show the magnetization state of the AND-gate's input wires.
4.5
Data Writing and Erasing
In the previous section, domain walls were introduced to nanowire junctions either by using topological constraints of a nanowire circuit or by domain wall injection from a large-area pad. These are both valid methods for developing logic elements, although a method of entering user-defined data is still required to create a viable logic system. Here, an element for data input is presented that is integrated with a domain wall shift register [33]. Furthermore, it is shown how data can be deleted from the shift register.
The design of the optimized data input element is shown in Table 4.2. Figure 4.12 shows the operating phase diagram of this element, obtained from simple test structures, overlaid with that of a NOT-gate. A single phase boundary for the data input element bisects the NOT-gate field operating area. Above the phase boundary, a domain wall is nucleated from the data input element, whereas below the phase boundary no magnetic reversal occurs. Two sets of field amplitudes can then be identified for operating both NOT-gates and the data input element. Below the data input element phase boundary are the read or no-write field conditions of Hxno-write = 90 Oe and Hy0 = 50 Oe, and above the phase boundary are the write field conditions of Hxwrite = 138 Oe and Hy0 = 50 Oe (Figure 4.12).

Figure 4.12 Operating field phase diagram of optimized NOT-gate and data input elements. Symbols represent the limits of the NOT-gate operating region, the maximum field for no domain wall injection and the minimum field for reliable domain wall injection from the data input element, and the selected write, read/no-write, and erase field conditions. The lines are provided only as guides to the eye.

Figure 4.13 Image of the shift register structure; cells are separated from their neighbors either by a wire junction (NOT or fan-out) or a straight horizontal wire (indicated by dotted line). The field directions Hx and Hy are shown; the position of magneto-optical measurement is indicated by the dotted ellipse.
Figure 4.13 shows an image of a shift register containing eight NOT-gates and one fan-out junction [33]. In addition, one NOT-gate has a data input element attached to its central stub. The fan-out element provides a monitor arm for MOKE measurement, as used in Section 4.4 above. The shift register can be divided into ten cells, each capable of holding a single domain wall and separated from its neighbors by a total of 180° of wire turn. Due to topological restrictions, domain walls can only be introduced or removed in pairs. Therefore, a data bit is represented by the presence or absence of a domain wall pair, so the shift register in Figure 4.13 contains five data bits.
Figure 4.14a–d shows, schematically, the operating principle of the data input element connected to the NOT-gate [33]. Initially, no domain walls are present and the two connecting wires to the NOT-gate have opposite magnetizations (Figure 4.14a). As the field rotates, the write field amplitude is used (Figure 4.14b) so that a domain wall is nucleated at the end of the data input element. This domain wall will propagate to the NOT-gate junction, where it will split into domain walls DW 1 and DW 2, one in each of the input/output wires (Figure 4.14b). As the field rotates further (Figure 4.14c), both domain walls follow the field rotation and propagate clockwise around corners. DW 1 propagates away from the NOT-gate, while DW 2 returns to the junction (Figure 4.14c). Finally, the field rotates to be oriented 180° from when nucleation occurred, but now with no-write conditions (Figure 4.14d). DW 1 has
propagated out of the section shown in Figure 4.14, while DW 2 has propagated through the NOT-gate, leaving the NOT-gate and data input element magnetizations back in their initial configuration (Figure 4.14d). A single half-cycle of write field conditions has created a pair of domain walls; that is, a single data bit has been written.
Figure 4.14e shows a field sequence that is used to write data to the shift register in Figure 4.13. A combination of write and no-write field conditions is used to write the five-bit data sequence '11010'. Time-averaged MOKE measurements were performed during continual application of the read field conditions. Trace I in Figure 4.14f shows the MOKE signal obtained prior to the application of the write field sequence in Figure 4.14e. No transitions are observed, meaning that no domain walls are present. After a single application of the write field sequence, the MOKE signal changes to show that pairs of domain walls are propagating continuously around the shift register (Figure 4.14f, trace II). Crucially, the pattern of domain wall pairs matches the original input data sequence of '11010', although the phase of the measurement is such that the MOKE trace starts part-way through this sequence. Note that in this case logical '1' is represented by a low MOKE signal, due to the 180° wire turn between the data input element and the measurement position. This observation confirms the principle of operation for the data input element outlined above. Delays of an hour between writing and successfully reading data have been seen, demonstrating the intrinsic non-volatility of the data storage. The whole shift register can be filled with domain walls, destroying any data present, by applying an over-write half-sinusoid field pulse of amplitude Hx0 = 243 Oe and 1.85 ms pulse length (Figure 4.14f, trace III).
Individual domain wall pairs can also be removed from the shift register in Figure 4.13. This represents a selective bitwise delete operation. Almost all of the ten cells shown in Figure 4.13 are separated from their neighbors by a nanowire junction. The exceptions are cells 1 and 2, which are separated by a straight section of wire. Domain walls require read field conditions to propagate successfully through the shift register. However, when erase field conditions of Hxerase = 24 Oe and Hy0 = 50 Oe (see Figure 4.12) are used, domain walls cannot overcome the pinning potentials associated with the nanowire junctions. The only possible domain wall motion will be between cells 1 and 2, where there are no wire junctions. Figure 4.15a shows the field sequence for erasing a single pair of domain walls. The first half-cycle has erase field conditions, so the only domain wall propagation will be from cell 1 to cell 2. All other domain walls will remain pinned at the junctions between cells. The next half-cycle has read field amplitudes, so all domain walls will propagate forward by one cell, with the exception of the pair of domain walls in cell 2, which will meet and annihilate. The second full field cycle continues with read field conditions to move all domain walls on by two cells, allowing the field sequence to be repeated on the next domain wall pair. Figure 4.15b shows MOKE traces obtained from the shift register following an over-write half-sinusoid pulse and between 0 and 5 erase field sequences (Figure 4.15a). The MOKE traces have a five-cycle period and each erase sequence removes a pair of domain walls, validating the operating principle described above.
4.6
Outlook and Conclusions
two nanowires together. Similarly, the other high-level properties that have been highlighted in this chapter, such as input-output isolation and signal/power gain, are all intrinsic to the nanowire and do not have to be explicitly created.
The power dissipation per logic gate is extremely low. Microelectronic engineers usually measure dissipation from a gate by the power-delay product; that is to say, the product of how much power is dissipated multiplied by how long the gate takes to process a single function. The units of this quantity are energy, corresponding to the energy dissipated during the evaluation of the function performed by the gate. The power-delay product of CMOS depends on the size of the devices. Hence, in order to compare like with like, a 200 nm minimum-feature-size CMOS value of 10^-2 pJ is considered [34]. On very general magnetic grounds, it can be said that an upper bound for the power-delay product for domain wall logic is 2MsVH, where Ms is the saturation magnetization of the magnetic material, V is the volume of magnetic material in a gate, and H is the amplitude of the applied field. Applying the parameters for a typical 200 nm domain wall logic gate gives 10^-5 pJ; that is, 1000 times lower than the equivalent CMOS device. Because of the inefficiencies inherent in the generation of high-speed magnetic fields (see above), this does not necessarily mean that domain wall logic chips will not consume much power. What it does mean, however, is that the waste heat will be generated by the global field generator and not by the logic devices themselves. This is of particular relevance if the devices are to be stacked into three-dimensional (3-D), neural-like circuits. The two key technical difficulties in doing this in CMOS are: (i) distributing the power and clock to everywhere inside the volume of the network; and (ii) extracting the waste heat from the center of the network so that the device does not melt. It is believed that domain wall logic is an excellent choice of primitive for 3-D architectures.
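The 2MsVH bound can be evaluated directly; the gate volume below is an illustrative assumption for a 200 nm-wide, 5 nm-thick Permalloy element, not a value taken from the text:

```python
# Upper bound on the power-delay product, E = 2*Ms*V*H (CGS units), for a
# hypothetical 200-nm Permalloy gate; the dimensions and field are illustrative.
Ms = 800.0    # Permalloy saturation magnetization (emu/cm^3)
H = 100.0     # applied field amplitude (Oe)
volume_cm3 = (200e-7) * (600e-7) * (5e-7)   # 200 nm x 600 nm x 5 nm wire segment

E_erg = 2 * Ms * volume_cm3 * H             # Zeeman-scale energy bound in erg
E_pJ = E_erg * 1e-7 * 1e12                  # 1 erg = 1e-7 J; then J -> pJ
print(f"power-delay bound ~ {E_pJ:.1e} pJ") # ~1e-5 pJ, ~1000x below 200-nm CMOS
```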
Non-volatility comes as standard. In a world of mobile computing and portable (or even wearable) devices, the concept of 'instant-on' is becoming increasingly important. Users accept that devices cannot be expected to operate when there is no power, but as soon as the power becomes available they want the device to be ready, and not to have to undergo a long boot process, or to have forgotten what it was doing when the power last failed. As there are currently very few non-volatile memory technologies available which can be embedded directly into CMOS, a data transfer process is usually required between a high-speed, volatile memory register in the heart of the CMOS logic and an off-chip, low-speed, non-volatile store where the state variables of the system are stored. With domain wall logic, all of this becomes redundant. Provided that the rotating field is properly controlled so that it stops gracefully as power fails, and does not apply intermediate levels of field leading to data corruption, the domain wall logic circuit should simply stop and retain all of its state variables. Then, as soon as the power returns, the logic continues from where it left off.
Domain wall logic can make use of redundant space on top of CMOS. Because no complex heterostructures are required, the logic elements can sit in a single layer fabricated as a Back End Of Line process after the CMOS has been laid down. This can improve the efficiency of the underlying CMOS by farming out some space-consuming task to the domain wall logic on top. As this space was never accessible to CMOS itself anyway, it all counts as a gain.
Being metals, the basic computational elements of domain wall logic are automatically radiation-hard, and so are suitable for use in either space or military applications.

Domain wall logic is very good at forming high-density shift registers that could be used as non-volatile serial memory for storing entire files, and so would not require high-speed random access. The hard disk drive and NAND Flash devices, for example, as used to store photographs in a digital camera, are examples of non-volatile serial memory. At present, both of these devices are 2-D in form, but registers made from domain wall logic elements have the potential to be stacked into three dimensions without incurring extra wiring complexity, as the data and power can be transmitted remotely through magnetic fields (see above). In a hard disk drive the data are stored as rows of magnetic domains, and this would remain the same in a domain wall logic serial memory. What would differ is that, in a hard disk, the domains are mechanically rotated on their disk underneath a static sensor, whereas in domain wall logic the domains themselves would move under the action of an externally applied magnetic field along static domain wall conduits, potentially stacked into an ultrahigh-density, 3-D array.
Acknowledgments
The research studies described in this chapter were funded by the European Community under the Sixth Framework Programme, Contract Number 510993: MAGLOG. The views expressed are solely those of the authors, and the other Contractors and/or the European Community cannot be held liable for any use that may be made of the information contained herein. D.A.A. acknowledges the support of an EPSRC Advanced Research Fellowship (GR/T02942/01).
5
Monolithic and Hybrid Spintronics
Supriyo Bandyopadhyay
5.1
Introduction
An electron has three attributes: mass, charge, and spin. An electron's mass is too
small to be useful for practical applications, but the charge is an enormously useful
quantity that is utilized universally in every electronic device extant. The third
attribute, spin, has played mostly a passive role in such gadgets as magnetic
disks and magneto-electronic devices, where its role has been to affect the magnetic
or the electrical properties in useful ways (for example, in the giant magnetoresistance devices used to read data stored in the magnetic disks of laptops and Apple
iPods). Only recently has a conscious effort been made to utilize spin, either singly or
in conjunction with the charge degree of freedom, to store, process, and transmit
information. This field is referred to as modern spintronics.
There are two distinct branches of spintronics:

Hybrid spintronics: these devices are very much conventional electronic devices,
as information is still encoded in the charge (ultimately detected as voltage or
current), but spin augments the functionality of the device and may improve device
performance. Examples of hybrid spintronic devices are spin field-effect transistors
(SPINFETs) [1] and spin bipolar junction transistors (SBJTs) [2], where information
is still processed by modulating the charge current flowing between two terminals
via the application of either a voltage or a current to the third terminal. The process
by which the third terminal controls the voltage or current is spin-mediated, hence
the term spin transistors.

Monolithic spintronics: here, charge has no direct role whatsoever. Rather, the
information is encoded entirely in the spin polarization of an electron, which may
be made to have only two stable values, upspin and downspin, by placing the
electron in a static magnetic field. Upspin will correspond to polarizations
anti-parallel to the magnetic field, while downspin will be parallel. These two
polarizations can encode the binary bits 0 and 1 for digital applications. Toggling a bit
merely requires flipping the spin, without any physical movement of charge.
It has recently been argued that, as no charge motion (or current flow) is required,
there can be tremendous energy savings during switching [3]. As a result,
monolithic spintronic devices are far more likely to yield low-power signal
processing units than are hybrid spintronic devices. An example of monolithic
spintronic devices is the Single Spin Logic (SSL) paradigm that is described in
Section 5.3.
In this chapter, the two most popular hybrid spintronic devices, the SPINFET
and the SBJT, will be described, and evidence provided that neither device is likely
to produce significant advantages in terms of speed or power dissipation over
conventional charge-based transistors. The concept of single spin logic (SSL) will
then be discussed, and its significant advantages in power dissipation over the
SPINFET or SBJT outlined. SSL also has significant advantages over any charge-based
paradigm where charge, rather than spin, is used as the state variable to
encode information. Finally, it will be shown that the maximum energy dissipation
in switching a bit in SSL is the Landauer-Shannon limit of kT ln(p) per bit
operation, where 1/p is the bit error probability. Some gate operations dissipate
less energy than this because of interactions between spins, which may reduce
dissipation [4], because many spins function collectively, as a single unit, to effect
the gate operation. This collective, cooperative dynamics is conducive to energy efficiency. Any discussion of adiabatic switching [5], which can reduce energy dissipation
even further, is avoided, as it is very slow, error-prone, and therefore impractical.
The discussion of devices in non-equilibrium statistical distributions, where energy
dissipation can be reduced below the Landauer-Shannon limit [6], is also avoided,
simply because energy is required to maintain the non-equilibrium distributions
over time, and that energy must be dissipated in the long term. The final section
includes a very brief engineer's perspective on spin-based quantum computing
(included at the request of the editor).
5.2
Hybrid Spintronics
Hybrid spintronic devices are those where spin is used to enhance the performance
of charge but does not itself play a direct role in storing, processing or communicating information. The two most popular hybrid spintronic devices are the SPINFET
and the SBJT.
5.2.1
The Spin Field Effect Transistor (SPINFET)
from the source contact, and absolutely no minority spin (i.e. −x-polarized spin) is
injected. Immediately after injection into the channel, all spins are polarized along
the +x direction. When the gate voltage is switched on, it induces an electric field
in the y-direction that causes a Rashba spin-orbit interaction [7] in the channel. This
spin-orbit interaction acts like an effective magnetic field in the z-direction (which is
the direction mutually perpendicular to the electron's velocity in the channel and
the gate electric field). This pseudo-magnetic field B_Rashba causes the spins to precess
in the x-y plane as they travel towards the drain. The angular frequency of spin
precession (which is essentially the Larmor frequency) is given by Ω = eB_Rashba/m*,
where e is the electronic charge and m* is the effective mass of the carrier. The
pseudo-magnetic field B_Rashba depends on the magnitude of the gate voltage and the
carrier velocity along the channel according to
B_{\mathrm{Rashba}} = \frac{2(m^*)^2 a_{46}}{e \hbar^2} E_y v_x    (5.1)

where a_46 is a material constant, E_y is the gate electric field, and v_x is the electron
velocity.1)
The spatial rate of spin precession is

\frac{d\phi}{dx} = \frac{d\phi}{dt}\left(\frac{dx}{dt}\right)^{-1} = \Omega / v_x = \frac{2 m^* a_{46}}{\hbar^2} E_y    (5.2)
which is independent of the carrier velocity and depends only on the gate voltage (or
gate electric field). The total angle by which the spins precess in the x-y plane as they
travel through the channel from source to drain is

\Phi = \frac{2 m^* a_{46}}{\hbar^2} E_y L    (5.3)
where L is the channel length. This angle is independent of the carrier velocity and
therefore is the same for every electron, regardless of its initial velocity or scattering
history in the channel. If the gate voltage (and E_y) is of such magnitude that Φ is an
odd multiple of π, then every electron has its spin polarization anti-parallel to the
drain's magnetization when it arrives at the drain. These electrons are blocked by the
drain, and therefore the source-to-drain current falls to zero. Here, it has been
assumed that the drain is a perfect spin filter that allows only majority spins to
transmit, while completely blocking the minority spins. Without a gate voltage, the
spins do not precess2) and the source-to-drain current is non-zero. Thus, the gate
voltage causes current modulation via spin precession, and this realizes transistor
action.3) This device is also briefly discussed in Chapter 3 in Volume III of this series.
It should be clear that (in this device) although spin plays the central role in
current modulation, it plays no direct role in information handling. Information is
still encoded in charge which carries the current from the source to the drain.
The transistor is switched between the on and off states by changing the current
with the gate potential, or by controlling the motion of charges in space. The role of
spin is only to provide an alternate means of changing the current with the gate
voltage. Thus, this device is a quintessential hybrid spintronic device.
5.2.1.1 The Effect of Non-Idealities
The operation of the SPINFET described above is an idealized description. In a real
device, there will be many non-idealities. First, there will be a magnetic field along
the channel because of the magnetized ferromagnetic contacts. This will cause
problems, as it will add to B_Rashba and the total effective magnetic field will be
|B_Rashba + B_channel|, which is no longer linearly proportional to the carrier velocity v_x. As a
result, the precession rate in space will no longer be given by Eq. (5.2)4) and will not be
independent of the carrier velocity (or energy). Therefore, at a finite temperature,
different electrons, having different velocities due to the thermal spread in carrier
energy or because of different scattering histories, will suffer different amounts of
precession Φ. As a result, when the current drops to a minimum, not all spins at
the drain end will have their polarizations exactly anti-parallel to the drain's
magnetization. Those that do not will be transmitted by the drain and contribute
to a leakage current in the off state [8]. This is extremely undesirable, as it decreases
the ratio of on- to off-current and will lead to standby power dissipation when the
device is off.
A more serious problem is that the magnetic field changes the energy dispersion
relations in the channel. In Figure 5.2, the energy dispersion relation (energy versus
wavevector) is shown schematically with and without the channel magnetic field [9].
Without the magnetic field, the Rashba interaction lifts the spin degeneracy at any
non-zero wavevector, but each spin-split band still has a fixed spin quantization axis
(meaning that the spin polarization in each band is always the same and independent
of wavevector) (Figure 5.2a). The spin polarizations in the two bands are anti-parallel
and the eigenspinors in the two bands are orthogonal. Because of this orthogonality,
there can be no scattering between the two bands. Electrons can scatter elastically
2) Even without the gate voltage, there is obviously some Rashba interaction in the
channel due to the electric field associated with the hetero-interface. This field
exists because the structure lacks inversion symmetry along the direction
perpendicular to the hetero-interface. This vestigial interaction will cause some
spin precession even at zero gate voltage, but this effect is simply equivalent to
causing a fixed threshold shift.
3) The device described here is a normally on device. If the magnetizations of the
source and drain are anti-parallel instead of parallel, or if the spin polarizations
in the source and drain contacts have opposite signs (e.g. iron and cobalt), then
the device will be a normally off device.
4) In Eq. (5.2), B_Rashba must be replaced by |B_Rashba + B_channel| in Ω.
or inelastically only within the same band, but this does not alter the spin polarization,
since every state in the same band has exactly the same spin polarization. However,
if a magnetic field is present in the channel, then the spin polarizations in both
bands become wavevector-dependent and neither subband has a fixed spin polarization. Two states in two subbands with different wavevectors5) will have different
spin polarizations that are not completely anti-parallel (orthogonal). Therefore, the
matrix element for scattering between them is non-zero, which means that there is a
finite probability that an electron can scatter between them. Therefore, any momentum-randomizing scattering event (due to interactions with impurities or phonons)
will rotate the spin, as the initial and final states have different spin polarizations. This
rotation is random in time or space, as the scattering event is random; therefore, it
will cause spin relaxation. This is a new type of spin relaxation, and it is introduced
solely by the channel magnetic field [10]. It is similar to the Elliott-Yafet spin
relaxation mechanism [11] in the sense that it is associated with momentum
relaxation. Any such spin relaxation in the channel will randomize the spin
polarizations of electrons arriving at the drain and thus give rise to a significant
leakage current. Therefore, the channel magnetic field causes leakage current in
two different ways, both of which are harmful.
In Ref. [1], where the ideal SPINFET was analyzed, it was assumed that there is no
spin relaxation in the channel. The transfer characteristic shown in Figure 5.1b, which
shows zero leakage drain current in the off-state, is predicated on this assumption.
A question might arise as to whether the usual spin relaxation mechanisms are
operative in the channel even without a channel magnetic field. For the ideal
SPINFET, the answer is in the negative. The two spin relaxation mechanisms of
concern are the Elliott-Yafet mode [11] and the Dyakonov-Perel mode [12]. The
5) Eigenspinors in the two bands having the same wavevector are still orthogonal.
former is absent if the eigenspinors are wavevector-independent (as is the case with
the ideal SPINFET), and the Dyakonov-Perel mode is absent if carriers occupy only
a single subband [13]. Thus, in the ideal 1-D SPINFET, there can be no significant
spin relaxation (even if there is scattering due to interactions with non-magnetic
impurities and phonons). If any spin relaxation occurs, it will be due to hyperfine
interactions with nuclear spins. Since such interactions are very weak, they can be
ignored for the most part. However, if there is an axial magnetic field in the channel,
then all this changes, and scattering with non-magnetic impurities or phonons will
cause spin relaxation (and therefore a large leakage current). Consequently, it is
extremely important to eliminate the channel magnetic field.
There are two ways to eliminate (or reduce) the channel magnetic field. One way is to
magnetize the contacts in the y-direction instead of the x-direction. Since B_Rashba is in
the z-direction, it makes no difference whether the spins are initially polarized in
the x- or y-direction, as they precess in the x-y plane. The SPINFET works just as well if
the source and drain contacts are magnetized in the y-direction instead of the x-direction. The advantage is that the magnetic field lines emanating from one contact,
and sinking in the other, are no longer directed along the channel. Consequently, the
channel magnetic field will be a fringing field, which is much weaker.
A more sophisticated approach is to play off the Dresselhaus spin-orbit interaction [14] against the channel magnetic field. This spin-orbit interaction is present
in any zinc-blende semiconductor that lacks crystallographic inversion symmetry.
In a 1-D channel oriented along the [100] crystallographic direction, the Dresselhaus spin-orbit interaction gives rise to a pseudo-magnetic field along the channel
(x-axis), just as the Rashba spin-orbit interaction gives rise to a pseudo-magnetic
field perpendicular to the channel (in the z-direction). The Dresselhaus field
B_Dresselhaus can be used to offset the channel magnetic field due to the contacts.
Since B_Dresselhaus depends on the carrier velocity v_x, the ensemble average velocity
<v_x> (which is the Fermi velocity for a degenerate carrier concentration) can be
tuned with a backgate to make B_Dresselhaus equal and opposite to the channel
magnetic field, thereby offsetting the effect of the channel field. This was the
approach proposed in Ref. [8].
In Figure 5.1b, the transfer characteristic (drain current versus gate voltage) is
shown for an ideal SPINFET and a non-ideal SPINFET (with a channel magnetic
field), ignoring any fixed threshold shift caused by a non-zero Rashba interaction in
the channel at zero gate voltage. It should be noted that the transfer characteristic is
oscillatory and therefore non-monotonic. As a result, the transconductance (∂I_D/∂V_G),
where I_D is the drain current and V_G is the gate voltage, can be either positive or
negative depending on the value of V_G (gate bias). A fixed threshold shift can be
caused in any SPINFET by implanting charges in the gate insulator. Imagine now that
there are two ideal SPINFETs with transfer characteristics as shown in Figure 5.1c.
These two can be connected in series to behave as a complementary metal oxide
semiconductor (CMOS)-like inverter, where an appreciable current flows only during
switching. This would be a tremendous advantage, as CMOS-like operation can be
achieved with just n-type SPINFETs where the majority carriers are electrons. In
conventional CMOS technology, both an n-type and a p-type device would
normally have been needed. Here, we need only n-type devices. However, all this
advantage is defeated if there is significant leakage current flowing through the
SPINFET when it is off. Thus, the leakage current is a rather serious issue, and care
must be taken to eliminate it as much as possible.
5.2.1.2 The SPINFET Based on the Dresselhaus Spin-Orbit Interaction
The Dresselhaus spin-orbit interaction can also be gainfully employed to realize a
different kind of SPINFET [15]. In a 1-D channel, the strength of the Dresselhaus
interaction depends on the physical width of the channel. If the 1-D channel is
defined by a split-gate, then the voltages on the split gate can be varied to change
the channel width, and therefore the strength of the Dresselhaus interaction. As with
the Rashba interaction, the Dresselhaus interaction also causes spins to precess in
space at a rate independent of the carrier velocity, because it gives rise to a pseudo-magnetic field B_Dresselhaus along the x-axis. The spins precess in the y-z plane. By
varying the split gate voltage it is possible to change B_Dresselhaus and the precession
rate, and therefore the angle Φ by which the spins precess as they traverse the
channel from source to drain. As the split gate voltage can be used to modulate the
source-to-drain current, transistor action can be realized. A schematic representation
of a Dresselhaus-type SPINFET is shown in Figure 5.3.
The advantage of the Dresselhaus-type SPINFET over the Rashba-type is that, in
the former, there is never a strong magnetic field in the channel due to the contacts [15],
as B_Dresselhaus is in the x-direction and therefore the ferromagnetic contacts must be
magnetized in the y-z plane (see Figure 5.3). By contrast, B_Rashba is in the z-direction,
and therefore in the Rashba-type device the ferromagnetic contacts must be magnetized in the x-y plane. As mentioned above, the Rashba-type device could avoid a
strong channel magnetic field if the contacts were to be magnetized in the y-direction,
but this is difficult to do, as the ferromagnetic layer thickness in the y-direction is
much smaller than that in the x- or z-directions. Thus, the Rashba-type device will
typically have some magnetic field in the channel, while the Dresselhaus-type device
will not, except for the fringing fields. This feature eliminates many of the problems
associated with the channel magnetic field in the Dresselhaus-type SPINFET, and
could lead to a reduced leakage current. A comparison between the Rashba-type and
Dresselhaus-type SPINFETs is provided in Ref. [15].
5.2.2
Device Performance of SPINFETs
In the world of electronics, the universally accepted benchmark for the transistor
is the celebrated metal-oxide-semiconductor field-effect transistor (MOSFET),
which has been, and still is, the workhorse of all circuits. Therefore, the SPINFET
must be compared with an equivalent MOSFET to determine whether there are any
advantages to utilizing spin. Surprisingly, in spite of many papers extolling the
perceived merits of SPINFETs, this elementary exercise was not carried out until
recently. When an ideal SPINFET was compared with an equivalent ideal MOSFET
at low temperatures [16], the results were quite illuminating.
According to Ref. [1], the switching voltage necessary to turn a 1-D Rashba-type
SPINFET from on to off, or vice versa, is given by

V^{\mathrm{SPINFET}}_{\mathrm{switching}} = \frac{\pi \hbar^2}{2 m^* L \xi}    (5.4)
where m* is the effective mass, L is the channel length (or source-to-drain separation),
and ξ is the rate of change of the Rashba interaction strength in the channel per unit
change of the gate voltage. This is given by [16]:

\xi = \frac{\hbar^2}{2 m^*} \frac{2 \pi e}{d} \frac{\Delta (2 E_g + \Delta)}{E_g (E_g + \Delta)(3 E_g + 2 \Delta)}    (5.5)

where E_g is the bandgap of the channel semiconductor, Δ is the spin-orbit splitting in
the valence band of the semiconductor, e is the electronic charge, and d is the gate
insulator thickness. If d = 10 nm,6) then it can be calculated that in an InAs channel,
ξ ≈ 10^-28 C m. This is the theoretical value, but an actual measured value is much
less than this [17]. The compound InAs has a strong spin-orbit interaction and
therefore is an ideal material for SPINFETs.
Now, imagine that the same structure is used as a MOSFET. Then, the switching
voltage that turns the MOSFET device off (depletes the channel of mobile carriers) is
E_F/e, where E_F is the Fermi energy in the channel.7) Thus, the ratio of the switching
6) The ideal semiconductor is a narrow-gap semiconductor such as InAs, which has
a strong Rashba spin-orbit interaction. In that case, the gate insulator will probably
be AlAs, which is reasonably lattice-matched to InAs. Because the conduction band
offset between these two materials is not too large, a minimum of 10 nm gate
insulator thickness may be necessary to prevent too much gate leakage.
7) An accumulation-mode MOSFET has been assumed that is normally on. It has
also been assumed that the normal channel carrier concentration is large enough
that E_F >> kT, where E_F is the Fermi energy and kT is the thermal energy.
Therefore, the comparison is strictly valid at very low temperatures. It is believed
that, at higher temperatures, the fundamental conclusions from this comparison
will not be significantly altered.
voltages is

\frac{V^{\mathrm{SPINFET}}_{\mathrm{switching}}}{V^{\mathrm{MOSFET}}_{\mathrm{switching}}} = \frac{\pi \hbar^2 e}{2 m^* L \xi E_F}    (5.6)
Slightly different types of SPINFET ideas have also been reported in the literature,
with names such as the non-ballistic SPINFET [19] or the spin relaxation transistor [20-22].
5.2.3.1 The Non-Ballistic SPINFET
The channel of the so-called non-ballistic SPINFET has a two-dimensional (2-D)
electron gas, like an ordinary MOSFET. Unlike in a 1-D structure (quantum wire), the
spin-split bands in a 2-D structure (quantum well or 2-D electron gas) do not have a
fixed spin quantization axis (meaning that the spin eigenstates are wavevector-dependent; recall Figure 5.2b), even if there is no magnetic field. The only exception
to this situation is when the Rashba and Dresselhaus interactions in the channel have
exactly the same strength. In that case, each band has a fixed spin quantization axis,
and the spin eigenstate in either band is wavevector-independent.
In the non-ballistic SPINFET, the Rashba interaction is first tuned with the
gate voltage to make it exactly equal to the Dresselhaus interaction, which is gate-voltage-independent in a 2-D electron gas. This makes the spin eigenstates wavevector-independent. Electrons are then injected into the channel of the transistor
from a ferromagnetic source with a polarization that corresponds to the spin
eigenstate in one of the bands. All carriers enter this band. As the spin eigenstate
is wavevector-independent, any momentum-relaxing scattering in the channel,
which will change the electron's wavevector, will not alter the spin polarization
(recall the discussion in Section 5.2.1.1). Scattering can couple two states within the
same band, but not in two different subbands, as the eigenspinors in two different
subbands are orthogonal. Therefore, regardless of how much momentum-relaxing
scattering takes place in the channel, there will be no spin relaxation via the
Elliott-Yafet mode. There will also be no Dyakonov-Perel relaxation, as it can be
shown that the pseudo-magnetic field due to the Rashba and Dresselhaus interactions will be aligned exactly along the direction of the eigenspin. As all spins are
initially injected in an eigenstate, they will always be aligned along the pseudo-magnetic field. Consequently, there will be no spin precession, which would have
caused Dyakonov-Perel relaxation. As no major spin relaxation mechanism is
operative, the carriers from the source will arrive at the drain with their spin polarization
intact. If the drain ferromagnetic contact is magnetized parallel to the source
magnetization, then all these carriers will exit the device and contribute to current.
In order to change the current, the gate voltage is detuned; this makes the Rashba and
Dresselhaus interaction strengths unequal, thereby making the spin eigenstates in the
channel wavevector-dependent. In that case, if the electrons suffer momentum-relaxing collisions due to impurities, defects, phonons, and surface roughness, their
spin polarizations will also rotate, and this will result in spin relaxation. Thus, the
carriers that arrive at the drain no longer have all their spins aligned along the drain's
magnetization. Consequently, the overall transmission probability of the electrons
decreases, and the current drops. This is how the gate voltage changes the source-to-drain current and produces transistor action.
One simple way of viewing the transistor action is that when the Rashba and
Dresselhaus interactions are balanced, the channel current is 100% spin-polarized
(all carriers arriving at the drain have exactly the same spin polarization). However,
when the two interactions are unbalanced, the spin polarization of the current
decreases owing to spin relaxation. The spin polarization can decrease to zero, but
no less than zero, which means that, on average, 50% of the spins will be aligned
and 50% anti-aligned with the drain's magnetization when the minimum spin
polarization is reached. The aligned component of the current will transmit and the
anti-aligned component will be blocked. Thus, the minimum current (off-current)
of this transistor is fully one-half of the maximum current (on-current). As the
maximum ratio of the on-to-off conductance is only 2, this device is clearly unsuitable
for most, if not all, mainstream applications. A recent simulation has shown
that the on-to-off conductance ratio is only about 1.2 [23], which precludes use in any
fault-tolerant circuit.
The situation can be improved dramatically if the source and drain contacts have
anti-parallel magnetizations instead of parallel ones. In that case, when the Rashba and
Dresselhaus interactions are balanced, the transmitted current will be exactly zero,
but when they are unbalanced, it is non-zero. Therefore, the on-to-off conductance ratio
becomes infinity. However, there is a caveat. This argument pre-supposes that the
ferromagnetic contacts can inject and detect spins with 100% efficiency, meaning
that only the majority spins are injected and transmitted by the ferromagnetic source
and drain contacts, respectively. That has never been achieved, and even after more
than a decade of research the maximum spin injection efficiency demonstrated
to date at room temperature is only about 70% [24]. That means

\frac{I_{maj} - I_{min}}{I_{maj} + I_{min}} = 0.7    (5.7)
where I_maj(min) is the majority (minority) spin component of the current. If the spin
injection efficiency is only 70%, then 15% of the injected current is due to minority
spins. These minority spins will transmit through the drain even when the Rashba
and Dresselhaus interactions are balanced. Thus, the off-current is 15% of the total
injected current (I_maj + I_min), whereas the on-current is still at best 50% of the total
injected current. Therefore, the on-to-off ratio of the conductance is 0.5/0.15 = 3.3,
which is not much better than 2.8) Consequently, achieving a large conductance ratio
is very difficult, particularly at room temperature, when the spin injection efficiency
tends to be small. In order to make the conductance ratio 10^5, which is what today's
transistors have, the spin injection efficiency must be 99.999% at room temperature.
This is indeed a tall order, and may not be possible in the near term. If the off (leakage)
current is nearly one-third of the on-current (which is what it will be with present-day
technology), then the standby power dissipation will be intolerable and the noise
margin unacceptable.
The device described in Ref. [20] is identical to that in Ref. [19], and therefore the
same considerations apply.
5.2.3.2 The Spin Relaxation Transistor
The proposed spin relaxation transistor [21, 22] is very similar to the non-ballistic
SPINFET. With zero gate voltage, the spin-orbit interaction in the channel is either
weak or non-existent, which makes the spin relaxation time very long. When the gate
voltage is turned on, the spin-orbit interaction strength increases, which makes the
spin relaxation time short. Thus, with zero gate voltage, the spin polarization in the
drain current is large (maximum 100%), while with a non-zero gate voltage it is small
(minimum 0%). This device cannot be any better than the non-ballistic SPINFET. If
the drain and source magnetizations are parallel, then it is easy to see that in the on-state
the transmitted current is at best 100% of the injected current, while in the off-state
it is no less than 50% of the injected current (the same arguments as in
Section 5.2.3.1 apply). Therefore, the on-to-off conductance ratio is less than 2. With
anti-parallel magnetizations in the source and drain contacts, the off-current
approaches 0 and the conductance ratio approaches infinity, but only if the contacts
inject and detect spin with 100% efficiency. If the injection efficiency is only 70%,
8) Here it has been assumed that the drain is a perfect spin filter or, equivalently,
that the spin detection efficiency at the drain is 100%.
then, following the previous argument, the conductance ratio is no more than
3.3 [25]. Therefore, this device, too, is unsuitable for mainstream applications.
5.2.4
The Importance of the Spin Injection Efficiency
In every proposal for SPINFETs discussed here [1, 15, 19-22], there has always been
the tacit assumption that the spin injection and detection efficiencies at the source/
channel and drain/channel interfaces are 100%. This is of course unrealistic. It can
be shown easily that the ratio of on- to off-conductance of SPINFETs of the type
proposed in Refs. [1, 15] is
r = \frac{1 + \xi_S \xi_D}{1 - \xi_S \xi_D}    (5.8)
where ξ_S is the spin injection efficiency at the source/channel interface and ξ_D is the
spin detection efficiency at the drain/channel interface. For SPINFETs of the types
proposed in Refs. [19-22],
r = \frac{1}{1 - \xi_S \xi_D}    (5.9)
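The impact of imperfect injection and detection on the on-to-off conductance ratio follows directly from Eqs. (5.8) and (5.9); a minimal sketch in Python:

```python
def r_ballistic(xi_S, xi_D):
    # Eq. (5.8): SPINFETs of the type proposed in Refs. [1, 15]
    return (1 + xi_S * xi_D) / (1 - xi_S * xi_D)

def r_nonballistic(xi_S, xi_D):
    # Eq. (5.9): SPINFETs of the type proposed in Refs. [19-22]
    return 1 / (1 - xi_S * xi_D)

for eff in (0.70, 0.90, 0.99, 0.99999):
    print(f"xi_S = xi_D = {eff:7.5f}: "
          f"r(5.8) = {r_ballistic(eff, eff):9.1f}, "
          f"r(5.9) = {r_nonballistic(eff, eff):9.1f}")
```

With ξ_S = ξ_D = 0.99999, the ratio of Eq. (5.8) reaches roughly 10^5, reproducing the 99.999% injection efficiency quoted above as the requirement for competitiveness with today's transistors.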
temperature exceeding 0 K, the thermal spread in the electron energy will cause
injected electrons to tunnel through both levels if the level separation is less than kT
(which it usually is). Therefore, both spins will be transmitted. This happens even if the
spin levels themselves are not broadened by kT because of weak spin-phonon
coupling. As long as both spins are transmitted, the spin selection suffers, and that
makes the spin injection efficiency considerably less than 100%.
To summarize, as yet there is no known method to suggest even theoretically
the possibility of 100% spin injection efficiency. As a result, all SPINFETs discussed
in this chapter suffer from the malady of a low on-to-off conductance ratio, and this
alone may make them non-competitive with MOSFETs.
5.2.5
Spin Bipolar Junction Transistors (SBJTs)
The SBJT is identical to the normal bipolar junction transistor, except that the base
is ferromagnetic and has a non-zero spin polarization. The conduction energy band
diagram of a heterojunction n-p-n SBJT is shown in Figure 5.4. Assuming that the
carrier concentration in the base is non-degenerate, so that Boltzmann statistics
apply, the spin polarization in the base is
\alpha_b = \tanh(X / 2kT)    (5.10)
where 2X is the energy splitting between the majority and minority spin bands in the
base. Based on a small-signal analysis, it was shown that the voltage and current gains
afforded by the SBJT are about the same as those of a conventional BJT [29], but the short-circuit
current gain β has a dependence on the degree of spin polarization in the base, which
can be altered with an external magnetic field using the Zeeman effect. Thus, the
external magnetic field can act as a fourth terminal, and this can lead to non-linear
circuits such as mixers/modulators. For example, if the ac base current is a sinusoid
with a frequency ω1 and the magnetic field is an ac field which is another sinusoid
with frequency ω2, then the collector current will contain frequency components
ω1 ± ω2. This is one example where spin augments the role of charge, making the
SBJT another hybrid spintronic device.
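As a numerical illustration of Eq. (5.10), the sketch below evaluates the base spin polarization at a few temperatures; the spin splitting 2X = 20 meV is an assumed value, not one given in the chapter.

```python
import math

k_eV = 8.617333262e-5          # Boltzmann constant (eV/K)
X = 0.010                      # half of the assumed spin splitting 2X = 20 meV (eV)

def base_polarization(X, T):
    # Eq. (5.10): equilibrium spin polarization of a non-degenerate base
    return math.tanh(X / (2 * k_eV * T))

for T in (4.2, 77, 300):
    print(f"T = {T:6.1f} K: alpha_b = {base_polarization(X, T):.3f}")
```

The polarization is nearly complete at liquid-helium temperature but falls off strongly at room temperature, which is why the Zeeman splitting must be large for the magnetic fourth terminal to be effective.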
5.2.6
The Switching Speed
The switching delay of any of the SPINFETs discussed above is limited by the transit
time of carriers through the channel (or base). This is entirely due to the fact that
information is encoded by charge (or current), and therefore the charge transit time is
the bottleneck that ultimately limits the switching speed. Thus, hybrid spintronic
devices do not promise any better speed than their charge-based counterparts.
5.3
Monolithic Spintronics: Single Spin Logic
At this point the discussion centers on monolithic spintronics, where charge plays no
role whatsoever and spin polarization is used to store, process and transmit information. In 1994, the idea was proposed of single spin logic (SSL), where a single electron
acts as a binary switch and its two orthogonal (anti-parallel) spin polarizations encode
the binary bits 0 and 1 [30]. Switching between bits is accomplished by simply flipping the
spin, without physically moving charges. To the author's knowledge, this is the first
known logic family (classical or quantum) based on single electron spins.
5.3.1
Spin Polarization as a Bistable Entity
The first step in SSL is to make the spin polarization of an electron a bistable quantity
that has only two stable values, which will encode the bits 0 and 1. In charge-based
electronics, the state variables representing digital bits (voltage, current or charge)
are not bistable but are continuous variables. So, why is the spin polarization required
to be bistable? The reason is that in the world of electronics there are analog-to-digital
converters that can convert a continuous variable (analog signal) to a discrete variable
(digital signal). More importantly, logic gates act as amplifiers with power gain and
can automatically restore the digital signal at logic nodes [31] if the signal is corrupted by
noise. There are no equivalent analog-to-digital converters for spin polarization and
no spin amplifiers, and therefore Nature must be relied upon to digitize the spin
polarization and make signal degeneration impossible. This can happen if Nature
permits only two values of spin polarization, that is, it inherently makes it bistable.
That can be accomplished by placing an electron in a static magnetic field. The
Hamiltonian describing a single electron in a magnetic field is

H = (\vec{p} - q\vec{A})^2 / 2m^* + (g/2)\, \mu_B \vec{B} \cdot \vec{\sigma}    (5.11)

where \vec{A} is the vector potential due to the magnetic flux density \vec{B}, μ_B is the Bohr
magneton, g is the Landé g-factor, and \vec{\sigma} is the Pauli spin matrix.
Although the binary bits 0 and 1 can be encoded in the two anti-parallel spin
polarizations, there remains a problem in that random bit flips, caused by coupling of
the spins to the environment, will cause bit errors and corrupt the data. The probability
of a bit flip within a clock cycle is 1 - e^{-T/<t>}, where T is the clock period and <t> is
the mean time between random spin flips. In order to make this probability small, it
must be ensured that <t> >> T.
If the host for the spin is a quantum dot, then indeed <t> can be quite long. In
InP quantum dots, the single electron spin flip time has been reported to exceed
100 ms at 2 K [32]. More recently, several experiments have been reported claiming
spin flip times (or so-called longitudinal relaxation times, T1) of several milliseconds,
culminating in a recent report of 170 ms in a GaAs quantum dot at low temperature
(see the last reference in Ref. [33]). An extremely surprising result is that the spin
relaxation time <t> in organic semiconductors can be incredibly long. It was found
that the spin relaxation time in tris(8-hydroxyquinolinolate) aluminum, popularly
known as Alq3, can be as long as 1 s at 100 K [34]. If the clock frequency is 5 GHz, then
the clock period T is 200 ps, which is 5 × 10^9 times smaller than the spin flip time.
Therefore, the probability that an unintentional spin flip will occur between two
successive clock pulses is 1 - e^{-1/(5 × 10^9)} ≈ 2 × 10^-10, which can be handled by
modern error correction algorithms [35].
Typical error probabilities encountered in today's integrated circuits range from
10^-10 to 10^-9. If a 5-GHz clock is used and an error probability 1/p of 10^-9 is
required, the spin flip time needs to be only 200 ms, which is fairly easy to achieve
today at low temperatures (77 K).
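The error-probability arithmetic above is easy to reproduce; a minimal sketch using the numbers quoted in the text (a 5 GHz clock, and spin flip times of 1 s and 200 ms):

```python
import math

f_clock = 5e9                 # clock frequency (Hz)
T = 1 / f_clock               # clock period: 200 ps

for tau in (1.0, 0.2):        # <t> = 1 s (Alq3 at 100 K) and 200 ms
    p_flip = 1 - math.exp(-T / tau)   # probability of a bit flip per clock cycle
    print(f"<t> = {tau:4.1f} s: bit-flip probability per cycle = {p_flip:.1e}")
```

This reproduces the 2 × 10^-10 and 10^-9 error probabilities quoted above.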
5.3.3
Reading and Writing Spin
The basic idea behind implementing logic gates in SSL is to engineer the interactions between input and output spin bits in such a way that the input-output
relationship represents the truth table of the desired logic gate. This approach can
be illustrated by showing how a NAND gate can be realized. The NAND gate is a
universal gate with which any arbitrary combinational or sequential logic circuit
may be implemented, and it is realized with a linear chain of three electrons in
three quantum dots (see Figure 5.6). It will be assumed that only nearest-neighbor
electrons interact via exchange, as their wavefunctions overlap. Second nearest-neighbor interactions are negligible, as the exchange interaction decays exponentially
with distance.
For the NAND gate implementation, the leftmost and rightmost spins in Figure 5.5
must be regarded as the two input bits, and the center spin as the corresponding
output bit. Assume that the downspin state (↓) represents bit 1, and the upspin state
(↑) is bit 0. The global magnetic field, defining the spin quantization axes, is in the
direction of downspin. It has been shown recently, using a Heisenberg Hamiltonian to describe the 3-spin array, that as long as the Zeeman splitting energies caused
by the local magnetic fields writing bits in the input dots are much larger than the
Figure 5.5 Single spin realization of the NAND gate. (a) When the two inputs are [1 1];
(b) when the two inputs are [0 0]; (c) when the two inputs are [1 0]; and (d) when the two
inputs are [0 1]. Reproduced from Ref. [3] with permission from American Scientific
Publishers: http://www.aspbs.com.
exchange coupling strength between neighboring dots, the ground-state spin configurations (determined by the directions of the local magnetic fields) are precisely
those shown in Figure 5.5 [39]. In other words, if the input bits are written with local
magnetic fields and the array is allowed to relax to the ground state in the presence of
the local magnetic fields, then the output bit conforms to the diagrams in
Figure 5.5a-d. It is evident that these configurations represent the truth table of
the NAND gate:
Input 1    Input 2    Output
1          1          0
0          0          1
1          0          1
0          1          1
The Zeeman splitting in the input dots caused by the local magnetic fields writing the
input data is much larger than the exchange coupling strength, which is roughly the
energy difference between the triplet and singlet states of two neighboring dots.
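The relaxation-to-ground-state logic can be illustrated with a toy model. The sketch below replaces the full Heisenberg Hamiltonian of Ref. [39] with a simple Ising-type energy (an assumption made purely for illustration): an antiferromagnetic nearest-neighbor exchange J, a weak global field Z along the downspin (bit 1) direction, and strong local writing fields h on the two input spins. Exhaustive minimization over the eight spin configurations then reproduces the NAND truth table.

```python
from itertools import product

J, Z, h = 1.0, 0.2, 10.0    # exchange, global Zeeman, local writing field (h >> J >> Z)

def energy(spins, b1, b3):
    """Toy Ising energy of the 3-spin chain; s = +1 encodes 'down' (bit 1)."""
    s1, s2, s3 = spins
    E  = J * (s1 * s2 + s2 * s3)     # antiferromagnetic nearest-neighbor exchange
    E -= Z * (s1 + s2 + s3)          # weak global field along the bit-1 direction
    E -= h * (b1 * s1 + b3 * s3)     # strong local fields writing the two input bits
    return E

bit = {+1: 1, -1: 0}
for b1, b3 in [(+1, +1), (-1, -1), (+1, -1), (-1, +1)]:
    ground = min(product((+1, -1), repeat=3), key=lambda s: energy(s, b1, b3))
    print(f"inputs [{bit[b1]} {bit[b3]}] -> output {bit[ground[1]]}")
```

The printed outputs, [1 1] → 0 and 1 in all other cases, match the truth table above; the weak global field Z breaks the tie in the center dot when the two inputs disagree.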
The NAND gate operates by relaxation to the thermodynamic ground state. It is the
natural tendency of any physical system to gravitate towards the minimum energy
state (ground state), this being the law of thermodynamics. However, when a system
achieves the ground state it need not stay there forever, as noise and fluctuations
can take it to an excited state. If that happens and the NAND gate strays from the
ground state, the results will be in error. This error probability is calculated next.
The NAND gate reaches the ground state by exchanging phonons with the thermal
environment (phonon bath). This brings it into thermodynamic equilibrium with the
surrounding thermal bath. In that case, the occupation probability of any eigenstate
of the gate will be given by Fermi-Dirac statistics. The ground-state occupation
probability is 1/[exp((E_ground - E_F)/kT) + 1] (where E_F is the Fermi energy) and the
excited-state occupation probability is 1/[exp((E_excited - E_F)/kT) + 1]. As the occupation probability of the ground state is not unity, the gate does not always work
correctly with 100% certainty; in other words, the error probability 1/p is never zero.
However, it does decrease with increasing energy separation between the excited
and ground states.
The error probability 1/p is the sum of the ratios of the probabilities of being in the
excited and ground states, summed over all excited states. This quantity is approximately \sum_{excited\ states} e^{-(E_{excited} - E_{ground})/kT}, if the Fermi-Dirac statistics are approximated with Boltzmann statistics. It transpires that, in the case of the NAND gate, the
second and higher excited states are far above the first excited state in energy [39].
Therefore, only the first excited state E_1 can be retained in the sum above,
and hence E_1 - E_ground = kT ln(p). It was also shown rigorously in Ref. [39] that
E_1 - E_ground is: (i) 4J + 2Z when the input bits are [1 1]; (ii) 4J - 2Z when the input
bits are [0 0]; and (iii) 2Z when the input bits are [0 1] or [1 0]. Here, J is the exchange
coupling strength between two neighboring dots, and 2Z is the Zeeman splitting
energy in any dot due to the global magnetic field. Therefore, in order to attain an
error probability of 1/p at a temperature T, it must be ensured that: (a) 2Z >= kT ln(p);
and (b) 4J - 2Z >= kT ln(p), or 2J >= kT ln(p). The maximum values of J or Z are usually
limited by technological constraints; for example, J is usually limited to 1 meV in gate-defined quantum dots [45]. These limits will then determine the maximum temperature of operation if a certain error probability 1/p is insisted on (this issue will be
revisited in Section 5.3.10).
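Conditions (a) and (b) translate directly into a maximum operating temperature once J is capped. A minimal sketch, assuming the stated J ≈ 1 meV limit for gate-defined dots and a target error probability 1/p = 10^-9:

```python
import math

k_eV = 8.617333262e-5      # Boltzmann constant (eV/K)
J = 1e-3                   # exchange coupling cap in gate-defined dots (eV), from the text
p = 1e9                    # target error probability 1/p = 1e-9

# Condition (b), 2J >= kT ln(p), bounds the operating temperature
# (condition (a) can always be met by choosing Z large enough up to 2Z = 2J):
T_max = 2 * J / (k_eV * math.log(p))
print(f"maximum operating temperature: {T_max:.1f} K")   # roughly 1 K
```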
5.3.6
Related Charge-Based Paradigms
A similar idea for implementing logic gates using single electron charges confined in
quantum dashes was proposed by Pradeep Bakshi and coworkers in 1991 [46].
There, logic bits were encoded in the bistable charge polarizations of elongated quantum
dots known as quantum dashes. Coulomb interaction between nearest-neighbor
quantum dashes pushes the electrons in neighboring dashes into antipodal positions, making the ground-state charge configuration anti-ferroelectric, just as
the exchange interaction in the present case tends to make the ground-state spin
configuration almost anti-ferromagnetic. Three Coulomb-coupled quantum dashes
would realize a NAND gate in a way very similar to what was described here.
Bakshi's idea inspired a closely related idea known as quantum cellular automata [47], which uses a slightly different host, namely a four- or five-quantum-dot
cell instead of a quantum dash, to store a bit. Here too, the charge polarization of
the cell is bistable and encodes the two logic bits. The only difference from Ref. [46]
is that the Coulomb interaction makes the ground-state charge configuration ferroelectric, instead of anti-ferroelectric.
In the schemes of Refs. [46, 47] it is difficult to implement only nearest-neighbor
interactions, to the exclusion of second-nearest-neighbor interactions, mainly because the Coulomb interaction is long-range. The interaction in Refs. [46, 47] drops
off as a polynomial of distance, but never exponentially with distance, unless strong
screening can be implemented. If the second-nearest-neighbor interactions are not
much weaker than their nearest-neighbor counterparts, the ground-state charge
configuration is weakly stable and not sufficiently robust against noise. In this
respect, the spin-based approach has an advantage. As the exchange interaction is short-range (it always drops off exponentially with distance), it is much easier to make
the second-nearest-neighbor interactions considerably weaker than the nearest-neighbor interactions.
A second issue is that in Refs. [46, 47] there is internal charge movement within
each cell during switching, causing eddy currents. This is a source of dissipation
that is absent in the spin-based paradigm, as there is never any charge movement.
5.3.7
The Issue of Unidirectionality
The unidirectional propagation of the logic signal was briefly mentioned in Section 5.3.5.
This is a vital issue, as the input signal should always influence (and determine) the
output signal, but not the other way around. Unidirectionality is an important
requirement for logic circuits [31]. A transistor is inherently unidirectional, as there
is isolation between its input and output terminals that guarantees unidirectionality.
Therefore, it is easy to make logic circuits with transistors. In SSL, there is unfortunately no isolation between the input and output of the logic gate, as the exchange interaction
is bidirectional. Consider just two exchange-coupled spins in two neighboring
quantum dots; they will form a singlet state and therefore act as a natural NOT gate if
one spin is the input and the other is the output (the output is always the logic
complement of the input) [42]. However, the exchange interaction, being bidirectional,
cannot distinguish between which spin is the input bit and which is the output.
The input will influence the output just as much as the output influences the input,
and the master-slave relationship between input and output is lost. As the input and
output are indistinguishable, it becomes ultimately impossible for the logic signal to
flow unidirectionally from an input stage to an output stage, and not the other way
around. This issue has been discussed at length elsewhere [30, 48].
It is of course possible to forcibly impose unidirectionality in some (but not
all) cases by holding the input cell in a fixed state until the desired output state
is produced in the output cell. In that case, the input signal itself enforces
unidirectionality because it is a symmetry-breaking influence. This approach was
(5.13)

where ω_RC = 1/RC. The above formula holds if the clock circuit is modeled as a
capacitor C in series with a resistor R representing the resistance in the charging
path. Assuming that R = 1 kΩ and C = 1 aF, ω_RC = 10^15 rad s^-1. If the clock
frequency is 5 GHz, then ω = 3.45 × 10^10 rad s^-1. Therefore, E_diss ≈ 2.5 × 10^-3 kT,
which is negligible.
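The numbers quoted for Eq. (5.13) can be checked quickly. The sketch below computes ω_RC and ω from the stated R, C and clock frequency; the dissipation prefactor of Eq. (5.13) itself is not reproduced here, only the frequency ratio that drives it.

```python
import math

R = 1e3          # charging-path resistance (ohms), from the text
C = 1e-18        # clock pad capacitance (1 aF), from the text
f = 5e9          # clock frequency (Hz)

w_RC = 1 / (R * C)         # = 1e15 rad/s, as quoted in the text
w = 2 * math.pi * f        # ~3.1e10 rad/s (the text quotes 3.45e10)
print(f"w_RC = {w_RC:.2e} rad/s, w = {w:.2e} rad/s, w/w_RC = {w/w_RC:.1e}")
```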
When clock pads are used, the most attractive feature of SSL is removed, namely
the absence of interconnects (or wires) between successive devices. Wireless
exchange interaction plays the role of physical wires to transmit signals between
neighboring devices, but in order to transmit unidirectionally, each stage will
need to be clocked and this requires a physical interconnect. A clock pad must be
placed between pairs of quantum dots and wires must be attached to them to ferry
the clock signal. Of course, a clock signal is also needed in traditional digital circuits
involving CMOS, so this is not an additional burden. Nevertheless, it still detracts
from the appeal of a wireless architecture, or the so-called quantum-coupled
architecture.
5.3.9
Energy and Power Dissipation
Most likely, by merely examining Figure 5.5, the reader can understand that the
maximum energy dissipated when the gate switches between any two states is 2Z,
which is the Zeeman splitting in the output dot caused by the global magnetic field
(this result was proved rigorously in Ref. [39]). Since it was shown in Section 5.3.5
that 2Z = kT ln(p), the maximum energy dissipated during switching is kT ln(p),
which was expected from the Landauer-Shannon result. The interesting point,
however, is that the energy dissipated can be less than kT ln(p), depending on the
initial and final states, that is, depending on the old and new input bit strings. If the
gate switches from the state in Figure 5.5c to that in Figure 5.5a, the energy dissipated
is actually (2/3) kT ln(p), while if it switches from the state in Figure 5.5b to that in
Figure 5.5c, the energy dissipated is (1/3) kT ln(p) [39]. The reductions by factors of 1/3
and 2/3 are due to interactions between spins. Some implications of interactions
were discussed in Ref. [4] in the context of reducing energy dissipation. What really
happens here is that the three spins act collectively in unison as a single unit, because
of the exchange coupling between them, which reduces the total energy dissipation.
The maximum energy dissipation occurs when the gate switches from the state
in Figure 5.5b to that in Figure 5.5a. That energy is kT ln(p). With p = 10^9, this
energy is about 21 kT. By contrast, modern-day logic gates dissipate more than 50 000 kT
when they switch [54].
5.3.10
Operating Temperature
Clocking: 2.48 × 10^-3 kT (sinusoidal) = 4 × 10^-26 Joules; 0 (adiabatic)
Bit flip: kT ln(p) [1/p = 10^-9] = 3 × 10^-22 Joules
With a bit density of 10^11 cm^-2, the dissipation per unit area is 3 × 10^-22 Joules
× 5 GHz × 10^11 cm^-2 = 0.15 W cm^-2. In comparison, the Pentium IV chip, with a
bit density three orders of magnitude smaller, dissipates about 50 W cm^-2 [57]. SSL
dissipates about 300 times less power with a bit density three orders of magnitude larger.
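The power-density estimate is a one-line calculation; a minimal sketch using the figures quoted above:

```python
E_bit   = 3e-22    # energy per bit flip (J): kT ln(p) at the operating temperature
f_clock = 5e9      # clock frequency (Hz)
density = 1e11     # bit density (bits per cm^2)

P = E_bit * f_clock * density                          # W per cm^2
print(f"SSL power density: {P:.2f} W/cm^2")            # 0.15 W/cm^2
print(f"Pentium IV / SSL power ratio: {50 / P:.0f}")   # roughly 300
```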
5.3.12
Other Issues
In charge-based devices such as MOSFETs, the logic bits 0 and 1 are encoded by the
presence and absence of charge in a given region of space. This region of space could be
the channel of a MOSFET. When the channel is filled with electrons, the device is on
and stores the binary bit 0. When the channel is depleted of electrons, the device is
off and stores the binary bit 1. Switching between bits is accomplished by the
physical motion of charges in and out of the channel, and this physical motion
consumes energy.10) There is no easy way out of this problem, since charge is a scalar
quantity and therefore has only a magnitude. Bits can be demarcated only by a
difference in the magnitude of charge or, in other words, by the relative presence
and absence of charge. Therefore, to switch bits the magnitude of charge must be
changed, and this invariably causes motion of the charges with an associated current
flow. Spin, on the other hand, has a polarization which can be thought of as a pseudo-vector with two directions: up and down. Switching between bits is accomplished
by simply flipping the direction of the pseudo-vector, with no change in the magnitude of
anything. As switching can be accomplished without physically moving charges (and
causing a current to flow), spin-based devices could be inherently much more energy-efficient than charge-based devices, a point which was highlighted in Ref. [3].
It is for this reason that SSL is much more energy efficient than the Pentium
IV chip.
5.4
Spin-Based Quantum Computing: An Engineer's Perspective
qubit = a|↑⟩ + b|↓⟩    (5.14)

The coefficients a and b are complex numbers, and the phase relationship between
them is important. The qubit is the essential ingredient in the quantum Turing
machine first postulated by Deutsch to elucidate the nuances of quantum
computing [58].
The power of quantum computing accrues from two attributes: quantum parallelism
and entanglement. Consider two electrons whose spin polarizations are made bistable
by placing them in a magnetic field. If the system is classical, these two electrons can
encode only two bits of information: the first one can encode either 0 or 1 (downspin or
upspin), and the second one can do the same. By analogy, N electrons can
encode N bits of information, as long as the system is classical.
But now consider the situation when the spin states of the two electrons are
quantum mechanically entangled, so that the two electrons can no longer be
considered separately, but rather as one coherent unit. In that case, there are four
possible states that this two-electron system can be in: both spins up; the first spin up
and the second spin down; the first spin down and the second spin up; and
both spins down. The corresponding qubit can be written as:
qubit = a|↑↑⟩ + b|↑↓⟩ + c|↓↑⟩ + d|↓↓⟩,  with |a|^2 + |b|^2 + |c|^2 + |d|^2 = 1    (5.15)
Obviously, this system can encode four information states, as opposed to two.
By analogy, if N qubits can be quantum mechanically entangled, then the system
can encode 2^N bits of information, as opposed to simply N. This becomes a major
advantage if N is large. Consider the situation when N = 300. There is no way in
which a classical computer can be built that can handle 2^300 bits of information, as
that number is larger than the number of electrons in the known universe. However,
if just 300 electrons could be taken and their spins entangled (a very tall order),
then 2^300 bits of information could be encoded. Thus, entanglement bestows on a
quantum computer tremendous information-handling capability.
The above must not be misconstrued to imply that a quantum computer is a
super-memory that can store 2^N bits of information in N physical objects such as
electrons. When a bit is read, it always collapses to either 0 or 1. In Eq. (5.14), the
probability that the qubit will be read as 0 is |a|^2, and the probability that it will be
read as 1 is |b|^2. As either a 0 or a 1 is always read, a quantum computer allows access
to no more than N bits of information. Thus, it is no better than a classical memory;
in fact, it is worse! Because of the probabilistic nature of quantum mechanics, a
stored bit will sometimes be read as 0 (with probability |a|^2) and sometimes as 1
(with probability |b|^2 = 1 - |a|^2). If repeated measurements are made of the stored
bits, the exact same result will never be achieved every time, and thus the quantum
system is not even a reliable memory.
The power of entanglement does not result in a super-memory; rather, it is utilized in a
different way, and is exploited in solving certain types of problems super-efficiently.
Two well-known examples are Shor's quantum algorithm for factorization [59] and
Grover's quantum algorithm for sorting [60]. No efficient classical algorithm is known for factorization,
but using Shor's algorithm the problem can be solved in polynomial time. Grover's algorithm
for sorting has a similar advantage. Suppose that there is a need to sort an N-body
ensemble to find one odd object. By using the best classical algorithm, this will take
about N/2 tries, but using Grover's algorithm it will take only about √N tries. Thus, entanglement
yields super-efficient algorithms that can be executed in a quantum processor where
qubits are entangled. That is the advantage of quantum computing.
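The scaling advantage of Grover's search is easy to tabulate; a minimal sketch comparing the roughly N/2 classical tries with the roughly √N quantum queries:

```python
import math

# Classical exhaustive search vs. Grover's algorithm for finding
# one marked item among N unsorted items.
for N in (10**3, 10**6, 10**12):
    print(f"N = {N:.0e}: classical ~ {N/2:.1e} tries, Grover ~ {math.sqrt(N):.1e} queries")
```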
5.4.1
Quantum Parallelism

(5.17)

The output |O⟩ has been obtained in the time required to perform a single
computation. Now, if the functional F(f(x1), f(x2), . . . f(xM)) can be computed from
|O⟩, then a quantum computer will be extremely efficient. This is an example where
quantum parallelism can speed up the computation tremendously.
There are two questions, however. First, can the functional be computed from a
knowledge of the superposition of the various f(xi) and not the individual f(xi)'s? The
answer is yes, but only for a small class of problems, the so-called Deutsch-Jozsa class of
problems, which can benefit from quantum parallelism. Second, can the functional
be computed correctly with unit probability? The answer is no. However, if the
answer obtained in the first trial is wrong (hopefully, the computing entity can
The all-important question that stirs physical scientists and engineers is: Which
system is most appropriate to implement a qubit? It must be one where the phase
relationship between the coefficients a and b in Eq. (5.14) is preserved for the
longest time. Charge has a small phase-coherence time, which saturates to about 1 ns
as the temperature is lowered towards 0 K because of coupling to the zero-point motion of
phonons [62]. Spin has a much longer phase-coherence time, as it does not couple to
phonons efficiently. As a result, the phase-coherence time may not rapidly decrease
with increasing temperature. The phase-coherence time (also called
the transverse relaxation time, or T2 time) in an ensemble of CdS quantum dots has
recently been shown actually to increase with increasing temperature [63]. It is
believed that this is because the primary phase relaxation mechanism for electron
spins in these quantum dots is hyperfine interactions with nuclear spins. The nuclear
spins are increasingly depolarized with increasing temperature, and that leads to an
actual increase in the electron's spin coherence time with increasing temperature.
Therefore, it is natural to encode a qubit in the coherent superposition of the two spin
polarizations of an electron.
In 1996, the idea was proposed of encoding a qubit in the spin of an electron
in a quantum dot [42]. A simple spin-based quantum inverter was designed
which utilized two exchange-coupled quantum dots. This was not a universal
quantum gate, but relied on quantum mechanics to elicit the Boolean logic NOT
function. The spin of the electron was used as a qubit. To the present author's
knowledge, this was the first instance where the spin of an electron in a quantum
dot was used to implement a qubit in a gate. This idea was inspired by single spin
logic, and so in many ways SSL could be regarded as the progenitor of spin-based
quantum computing.
Unlike SSL, which is purely classical and does not require phase coherence,
quantum computing relies intrinsically on phase coherence and is therefore much
more delicate. While the phase coherence of single spins can be quite long, it is doubtful
that several entangled spins will have sufficiently long-lived phase coherence to allow a
useful computation to be completed.
5.5
Conclusions
In this chapter, an attempt has been made to distinguish between the two
approaches to spintronics, namely hybrid spintronics and monolithic
spintronics. It is unlikely that the hybrid approach will bring about significant
advances in terms of energy dissipation, speed, or any other metric. The monolithic
approach, on the other hand, is more difficult, but also more likely to produce major
advances. The SSL idea revisited here is a paradigm that may begin to bear fruit with
the most recent advances in manipulating the spins of single electrons in quantum
dots [64–71]. This is a classical model and does not require the phase coherence that
is difficult to maintain in solid-state circuits. There is also no requirement to
entangle several bits, but rather a need to exchange-couple two bits pairwise.
Thus, SSL is much easier to implement than quantum processors based on single
electron spins.
Acknowledgments
The author acknowledges useful discussions with Profs. Marc Cahay and Supriyo
Datta. These studies were supported by the National Science Foundation under grant
ECCS-0608854, and by the Air Force Office of Scientific Research under grant
FA9550-04-1-0261.
References
1 S. Datta, B. Das, Appl. Phys. Lett. 1990, 56,
665.
2 (a) J. Fabian, I. Zutic, S. Das Sarma, Appl. Phys. Lett. 2004, 84, 85; (b) M. E.
Flatte, Z. G. Yu, E. Johnston-Halperin, D. D. Awschalom, Appl. Phys. Lett. 2003,
82, 4740.
3 S. Bandyopadhyay, J. Nanosci. Nanotechnol.
2007, 7, 168.
4 S. Salahuddin, S. Datta, Appl. Phys. Lett.
2007, 90, 093503.
5 (a) V. V. Zhirnov, R. K. Cavin, J. A. Hutchby,
G. I. Bourianoff, Proc. IEEE 2003, 91, 1934;
(b) R. K. Cavin, V. V. Zhirnov, J. A. Hutchby,
12 (a) M. I. Dyakonov, V. I. Perel, Sov. Phys.
JETP 1971, 33, 1053; (b) M. I. Dyakonov, V.
I. Perel, Sov. Phys. Solid State 1972, 13,
3023.
13 S. Pramanik, S. Bandyopadhyay, M. Cahay,
IEEE Trans. Nanotech. 2005, 4, 2.
14 G. Dresselhaus, Phys. Rev. 1955, 100, 580.
15 S. Bandyopadhyay, M. Cahay, Appl. Phys.
Lett. 2004, 85, 1814.
16 S. Bandyopadhyay, M. Cahay, Appl. Phys.
Lett. 2004, 85, 1433.
17 J. Nitta, T. Takazaki, H. Takayanagi, T.
Enoki, Phys. Rev. Lett. 1997, 78, 1335.
18 Suman Datta, Intel Corporation, private
communication.
19 J. Schliemann, J. C. Egues, D. Loss, Phys.
Rev. Lett. 2003, 90, 146801.
20 X. Cartoixa, D. Z.-Y. Ting, Y.-C. Chang,
Appl. Phys. Lett. 2003, 83, 1462.
21 (a) K. C. Hall, W. H. Lau, K. Gundogdu, M.
E. Flatte, T. F. Boggess, Appl. Phys. Lett.
2003, 83, 2937; (b) K. C. Hall, K.
Gundogdu, J. L. Hicks, A. N. Kocbay, M. E.
Flatte, T. F. Boggess, K. Holabird, A.
Hunter, D. H. Chow, J. J. Zink, Appl. Phys.
Lett. 2005, 86, 202114.
22 K. C. Hall, M. E. Flatte, Appl. Phys. Lett.
2006, 88, 162503.
23 E. Safir, M. Shen, S. Saikin, Phys. Rev. B 2004, 70, 241302(R).
24 G. Salis, R. Wang, X. Jiang, R. M. Shelby, S.
S. P. Parkin, S. R. Bank, J. S. Harris, Appl.
Phys. Lett. 2005, 87, 262503.
25 S. Bandyopadhyay, M. Cahay,
www.arXiv.org/cond-mat/0604532.
26 M. E. Flatte, K. C. Hall, www.arXiv.org/
cond-mat/0607432.
27 P. A. Dowben, R. Skomski, J. Appl. Phys.
2004, 95, 7453.
28 T. Koga, J. Nitta, H. Takayanagi, S. Datta,
Phys. Rev. Lett. 2002, 88, 126601.
29 S. Bandyopadhyay, M. Cahay, Appl. Phys.
Lett. 2005, 86, 133502.
30 S. Bandyopadhyay, B. Das, A. E. Miller,
Nanotechnology 1994, 5, 113.
31 D. A. Hodges, H. G. Jackson, Analysis and
Design of Digital Integrated Circuits, 2nd
edition, McGraw-Hill, New York, 1988,
p. 2.
6
Organic Transistors
Hagen Klauk
6.1
Introduction
6.2
Materials
Figure 6.5 Schematic representation of regiorandom P3HT (left) and regioregular P3HT (right).
known as PQT-12 (see Figure 6.7). PQT-12 has shown air-stable carrier mobilities as
large as 0.2 cm² V⁻¹ s⁻¹, and has been employed successfully in the fabrication of
functional organic circuits and displays.
To further improve the performance and stability of alkyl-substituted polythiophenes,
researchers at Merck Chemicals incorporated thieno[3,2-b]thiophene moieties into the
polymer backbone [10] (see Figure 6.8). The effect of this is two-fold:
• The delocalization of carriers from the fused aromatic unit is less favorable than from
a single thiophene unit, so the effective π-conjugation length is further reduced and
the ionization potential becomes even larger than for polyquaterthiophene.
Figure 6.7 Left: Chemical structure of poly(3,3‴-didodecylquaterthiophene) (PQT-12).
Right: A schematic representation of the lamellar π-stacking arrangement.
(Reproduced with permission from Ref. [9].)
Figure 6.12 Left: Triisopropylsilyl (TIPS) pentacene. Right: Triethylsilyl (TES) anthradithiophene.
transistors are almost exclusively due to positively charged carriers. Currents due to
negatively charged carriers are almost always extremely small in these materials, and
often too small to be measurable, even when a large positive gate potential is applied
to induce a channel of negatively charged carriers. Several explanations for the highly
unbalanced currents have been suggested. One explanation is that the difference in
mobility between the two carrier types is very large, due either to different scattering
probabilities or perhaps to different probabilities for charge trapping, either at grain
boundaries within the film or at defects at the dielectric interface [29, 30]. Another
explanation is that charge injection from the contacts is highly unbalanced due to
different energy barriers for positively and negatively charged carriers.
Interestingly, there are a number of organic semiconductors that show
usefully large mobilities for negatively charged carriers. These materials include
perfluorinated copper phthalocyanine (F16CuPc), a variety of naphthalene and
perylene tetracarboxylic diimide derivatives, fluoroalkylated oligothiophenes, and the
fullerene C60. The carrier mobilities measured in n-channel FETs based on these
materials are in the range of 0.01 to 1 cm² V⁻¹ s⁻¹. Some of these materials are very
susceptible to redox reactions and thus have poor environmental stability. For
example, C60 shows mobilities as large as 5 cm² V⁻¹ s⁻¹ when measured in ultra-high
vacuum [31, 32], but when exposed to air the mobility drops rapidly by as much
as four or five orders of magnitude. Similar degradation has been reported for some
naphthalene and perylene tetracarboxylic diimide derivatives [33, 34]. Other materials,
in particular F16CuPc [35], have been found to be very stable in air, although the
exact mechanisms that determine the degree of air stability are still unclear. Air-stable
n-channel organic FETs with carrier mobilities as large as 0.6 cm² V⁻¹ s⁻¹ [36] have
been reported.
6.3
Device Structures and Manufacturing
From a technological perspective, the most useful organic transistor implementation
is the thin-film transistor (TFT). The TFT concept was initially proposed and
developed by Weimer during the 1960s for transistors based on polycrystalline
inorganic semiconductors, such as evaporated cadmium sulfide [37]. The idea was
later extended to TFTs based on plasma-enhanced chemical-vapor-deposited
(PECVD) hydrogenated amorphous silicon (a-Si:H) [38]. Today, a-Si:H TFTs
are widely employed as the pixel drive devices in active-matrix liquid-crystal
displays (AMLCDs), which accounted for $50 billion in global sales in 2005. Organic
TFTs were first reported during the 1980s [4, 39]. To produce an organic TFT, the
organic semiconductor and the other materials required (gate electrode, gate dielectric,
source and drain contacts) are deposited as thin layers on the surface of an
electrically insulating substrate, such as glass or plastic foil. Depending on the
sequence in which the materials are deposited, three organic TFT architectures can
be distinguished, namely inverted staggered, inverted coplanar, and top-gate (see
Figure 6.13).
For the top-gate structure the source and drain contacts can be patterned with high
resolution prior to the deposition of the semiconductor, and the contact resistance
can be as low as for the inverted staggered architecture. However, the deposition of
a gate dielectric and a gate electrode on top of the semiconductor layer means that
great care must be exercised to avoid process-induced degradation of the organic
semiconductor, and the possibility of material mixing at the semiconductor/
dielectric interface must be taken into account.
A variety of methods exists for the deposition and patterning of the individual
layers of the TFT. For example, gate electrodes and source and drain contacts are often
made using vacuum-deposited inorganic metals. Non-noble metals, such as aluminum or chromium, are suitable for the gate electrodes in the inverted device
structures, as these metals have excellent adhesion on glass substrates. Noble metals
(most notably gold) are a popular choice for the source and drain contacts, as they tend
to give lower contact resistance than other metals. The metals are conveniently
deposited by evaporation in vacuum and can be patterned either by photolithography
and etching or lift-off, by lithography using an inkjet-printed wax-based etch
resist [42], or simply by deposition through a shadow mask [43].
An alternative to inorganic metals is conducting polymers, such as polyaniline and
poly(3,4-ethylenedioxythiophene):poly(styrene sulfonic acid) (PEDOT:PSS; see
Figure 6.14). These are chemically doped conjugated polymers that have electrical
conductivities in the range between 0.1 and 1000 S cm⁻¹. The way in which continuous
advances in synthesis and material processing have improved the conductivity of
Baytron PEDOT:PSS over the past decade is shown in Figure 6.15. Unlike inorganic
metals, conducting polymers can be processed either from organic solutions or from
aqueous dispersions, so gate electrodes and source and drain contacts for organic TFTs
can be prepared by spin-coating and photolithography [44], or by inkjet printing [45].
One important aspect in organic TFT manufacturing is the choice of the gate
dielectric. Depending on the device architecture (inverted or top-gate), the dielectric
material and the processing conditions (temperature, plasma, organic solvents, etc.)
must be compatible with the previously deposited device layers and with the
substrate. For example, chemical-vapor-deposited (CVD) silicon oxide and silicon
nitride, which are popular gate dielectric materials for inorganic (amorphous or
polycrystalline silicon) TFTs, may not be suitable for use on flexible polymeric
substrates, as the high-quality growth of these films often requires temperatures that
exceed the glass transition temperature of many polymeric substrate materials.
The thickness of the gate dielectric layer is usually a compromise between the
requirements for large gate coupling, low operating voltages, and small leakage
currents. Large gate coupling (i.e., a large dielectric capacitance) means that
the transistors can be operated with low voltages, which is important when the
TFTs are used in portable or handheld devices that are powered by small batteries or
by near-field radio-frequency coupling, for example. Also, a large dielectric capacitance
ensures that the carrier density in the channel is controlled by the gate and not
by the drain potential, which is especially critical for short-channel TFTs. One way to
increase the gate dielectric capacitance is to employ a dielectric with larger permittivity
ε (C = ε/t). However, as Veres [46], Stassen [47], and Hulea [48] have pointed out,
the carrier mobility in organic field-effect transistors is systematically reduced as
the permittivity of the gate dielectric is increased, presumably due to enhanced
localization of carriers by local polarization effects (see Figure 6.16).
Alternatively, low-permittivity dielectrics with reduced thickness or thin multilayer
dielectrics with specifically tailored properties may be employed. The greatest
concern with thin dielectrics is the inevitable increase in gate leakage due to defects
and quantum-mechanical tunneling as the dielectric thickness is reduced. A number
of promising paths towards high-quality thin dielectrics with low gate leakage for
low-voltage organic TFTs have recently emerged. One such approach is the anodization
of aluminum, which has resulted in high-quality aluminum oxide films thinner than
10 nm providing a capacitance around 0.4 μF cm⁻². Combined with an ultra-thin
molecular self-assembled monolayer (SAM), such dielectrics can provide sufficiently
low leakage currents to allow the fabrication of functional low-voltage organic TFTs
with large carrier
mobility [49]. Another path is the use of very thin crosslinked polymer films prepared
by spin-coating [50, 51]. With a thickness of about 10 nm, these dielectrics provide
capacitances as large as 0.3 μF cm⁻² and excellent low-voltage TFT characteristics.
Finally, the use of high-quality insulating organic SAMs or multilayers provides a
promising alternative [52–54]. Such molecular dielectrics typically have a thickness of
2 to 5 nm and a capacitance between 0.3 and 0.7 μF cm⁻², depending on the number
and structure of the molecular layers employed, and they allow organic TFTs to
operate with voltages between 1 and 3 V.
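The capacitance values quoted here follow directly from the parallel-plate formula C = ε₀ε_r/t. The short Python sketch below (illustrative only; the relative permittivity ε_r ≈ 2 assumed for an aliphatic SAM is not taken from the text) reproduces the order of magnitude:

EPS0 = 8.854e-12  # vacuum permittivity, F/m

def capacitance_uf_per_cm2(eps_r: float, t_nm: float) -> float:
    """Parallel-plate capacitance per unit area, C = eps0 * eps_r / t, returned in uF/cm^2."""
    c_si = EPS0 * eps_r / (t_nm * 1e-9)  # F/m^2
    return c_si * 100.0                  # 1 F/m^2 = 100 uF/cm^2

# Assumed eps_r ~ 2 for a ~2.5-nm aliphatic SAM gives ~0.7 uF/cm^2,
# within the 0.3-0.7 uF/cm^2 range quoted above for molecular dielectrics.
print(round(capacitance_uf_per_cm2(2.0, 2.5), 2))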
6.4
Electrical Characteristics
I_D = μ C_diel (W/L) [(V_GS − V_th) − V_DS/2] V_DS    (6.1)

I_D = (μ C_diel W / 2L) (V_GS − V_th)²    (6.2)
Equation (6.1) describes the relationship between the drain current I_D, the
gate-source voltage V_GS and the drain-source voltage V_DS in the linear regime, while
Eq. (6.2) relates I_D, V_GS and V_DS in the saturation regime. C_diel is the gate dielectric
capacitance per unit area, μ is the carrier mobility, W is the channel width, and L is the
channel length of the transistor. For silicon MOSFETs, the threshold voltage V_th is
defined as the minimum gate-source voltage required to induce strong inversion.
Although this definition cannot strictly be applied to organic TFTs, the concept is
nonetheless useful, as the threshold voltage conveniently marks the transition
between the different regions of operation.
Figure 6.17 shows the current–voltage characteristics of an organic TFT that
was manufactured on a glass substrate using the inverted staggered device structure
(see Figure 6.13a) with a thin layer of vacuum-evaporated pentacene as the
semiconductor, a self-assembled monolayer gate dielectric, and source/drain contacts
prepared by evaporating gold through a shadow mask [54]. The device operates
as a p-channel transistor with a threshold voltage of −1.2 V. By fitting the
current–voltage characteristics to Eqs. (6.1) or (6.2), the carrier mobility μ can be
estimated; for this particular device it is about 0.6 cm² V⁻¹ s⁻¹ (C_diel = 0.7 μF cm⁻²,
W = 100 μm, L = 30 μm).
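Inverting Eq. (6.2) gives a simple recipe for extracting the mobility from a measured saturation current. In the sketch below the device parameters are those quoted for Figure 6.17, but the drain-current value is a hypothetical one, chosen only so that the extraction reproduces the ~0.6 cm² V⁻¹ s⁻¹ stated in the text:

def mobility_from_saturation(i_d: float, v_ov: float, c_diel: float, w: float, l: float) -> float:
    """Invert Eq. (6.2): mu = 2 * L * I_D / (C_diel * W * V_ov^2); all quantities in SI units."""
    return 2 * l * i_d / (c_diel * w * v_ov ** 2)

c_diel = 0.7e-6 * 1e4   # 0.7 uF/cm^2 converted to F/m^2
w, l = 100e-6, 30e-6    # channel width and length, m
v_ov = 1.3              # overdrive |V_GS - V_th| at V_GS = -2.5 V with V_th = -1.2 V
i_d = 1.2e-6            # hypothetical saturation drain current, A (assumed for illustration)

mu = mobility_from_saturation(i_d, v_ov, c_diel, w, l)
print(round(mu * 1e4, 2), "cm^2/Vs")  # m^2/Vs -> cm^2/Vs; prints ~0.61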
Equations (6.1) and (6.2) describe the drain current for gate-source voltages above
the threshold voltage. Below the threshold voltage there is a region in which the drain
current depends exponentially on the gate-source voltage. This is the subthreshold
region; for the TFT in Figure 6.17 it extends between about −0.5 V and about −1 V.
Within this voltage range the drain current is due to carriers that have sufficient
thermal energy to overcome the gate-controlled barrier near the source, and these
carriers mainly diffuse, rather than drift, to the drain:
I_D = I_0 exp[−q |V_GS − V_th| / (nkT)]    for V_GS between V_th and V_SO    (6.3)
The slope of the log(ID) versus VGS curve in the subthreshold region is determined
by the ideality factor n and the temperature T (q is the electronic charge, k is
Boltzmann's constant, and V_SO is the switch-on voltage, which marks the gate-source
voltage at which the drain current reaches a minimum [55]). It is usually quantified as
the inverse subthreshold slope S (also called the subthreshold swing):
S = ∂V_GS / ∂(log₁₀ I_D) = (nkT/q) ln 10    (6.4)
The ideality factor n is determined by the density of trap states at the semiconductor/
dielectric interface, N_it, and the gate dielectric capacitance, C_diel:

n = 1 + q N_it / C_diel    (6.5)

S = (kT/q) ln 10 (1 + q N_it / C_diel)    (6.6)
When N_it/C_diel is small, the ideality factor n approaches unity. Silicon MOSFETs
often come close to the ideal room-temperature subthreshold swing of 60 mV per
decade, as the quality of the Si/SiO₂ interface is very high. In organic TFTs the
semiconductor/dielectric interface is typically of lower quality, and thus the
subthreshold swing is usually larger. The TFT in Figure 6.17 has a subthreshold
swing of 100 mV dec⁻¹, from which an interface trap density of 3 × 10¹² cm⁻² V⁻¹
is calculated.
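The trap density follows from Eq. (6.6) by solving for N_it. A minimal sketch, using the S = 100 mV dec⁻¹ and C_diel = 0.7 μF cm⁻² quoted above, recovers the ~3 × 10¹² cm⁻² V⁻¹ figure:

import math

K_B = 1.381e-23  # Boltzmann constant, J/K
Q = 1.602e-19    # elementary charge, C

def interface_trap_density(s: float, c_diel: float, t: float = 300.0) -> float:
    """Invert Eq. (6.6): N_it = (C_diel / q) * (S / ((kT/q) * ln 10) - 1).
    s in V per decade, c_diel in F/cm^2; result in cm^-2 (per eV)."""
    s_ideal = (K_B * t / Q) * math.log(10)  # ~60 mV/dec at room temperature
    return (c_diel / Q) * (s / s_ideal - 1.0)

print(f"{interface_trap_density(0.100, 0.7e-6):.1e}")  # ~3.0e12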
The subthreshold region extends between the threshold voltage V_th and the switch-on
voltage V_SO. Below the switch-on voltage (−0.5 V for the TFT in Figure 6.17) the
drain current is limited by leakage through the semiconductor, through the gate
dielectric, or across the substrate surface. This off-state current should be as small as
possible. The TFT in Figure 6.17 has an off-state current of 0.5 pA, which corresponds
to an off-state resistance of 3 TΩ.
To predict an upper limit for the dynamic performance of the transistor, it is useful
to calculate the cut-off frequency [56]. This is the frequency at which the current gain
is unity, and is determined by the transconductance and the gate capacitance:
f_T = g_m / (2π C_gate)    (6.7)

where the transconductance g_m is defined as:

g_m = ∂I_D / ∂V_GS    (6.8)
Thus, the transconductance can be extracted from the current–voltage characteristics;
for the pentacene TFT in Figure 6.17 the transconductance is about 2 μS at
V_GS = −2.5 V. The transconductance is related to the other transistor parameters as
follows:
g_m = (μ C_diel W / L) V_DS    (6.9)

g_m = (μ C_diel W / L) (V_GS − V_th)    (6.10)

with Eq. (6.9) applying in the linear regime and Eq. (6.10) in the saturation regime.
The gate capacitance Cgate is the sum of the intrinsic gate capacitance (representing
the interaction between the gate and the channel charge) and the parasitic gate
capacitances (including the gate/source overlap capacitances and any fringing
capacitances). For the transistor in Figure 6.17 the total gate capacitance is estimated
to be about 30 pF, so the cut-off frequency will be on the order of 10 kHz.
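Both cut-off frequency estimates quoted in this section follow directly from Eq. (6.7); a one-line check (illustrative only) is:

import math

def cutoff_frequency(g_m: float, c_gate: float) -> float:
    """Eq. (6.7): f_T = g_m / (2 * pi * C_gate)."""
    return g_m / (2 * math.pi * c_gate)

print(f"{cutoff_frequency(2e-6, 30e-12):.0f} Hz")     # pentacene TFT: ~10.6 kHz
print(f"{cutoff_frequency(0.8e-6, 300e-12):.0f} Hz")  # F16CuPc TFT: ~420 Hz, i.e. ~500 Hz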
The current–voltage characteristics of an n-channel TFT based on perfluorinated
copper phthalocyanine (F16CuPc) are shown in Figure 6.18 [54]. It has a threshold
voltage of 0.2 V, a switch-on voltage of −0.8 V, an off-state current of about 5 pA,
and a carrier mobility of about 0.02 cm² V⁻¹ s⁻¹ (C_diel = 0.7 μF cm⁻², W = 1000 μm,
L = 30 μm). The transconductance is about 0.8 μS at V_GS = 1.5 V and the gate
capacitance is about 300 pF, so the TFT will have a cut-off frequency of about 500 Hz.
The availability of both p-channel and n-channel organic TFTs makes the
implementation of organic complementary circuits possible. From a circuit design
perspective, complementary circuits are more desirable than circuits based on only
one type of transistor, as complementary circuits have smaller static power dissipation
and greater noise margins [54]. The schematic and electrical characteristics of an
organic complementary inverter with a p-channel pentacene TFT and an n-channel
F16CuPc TFT are shown in Figure 6.19.
In a complementary inverter the p-channel FET is conducting only when the input
is low (V_in = 0 V), while the n-channel FET is conducting only when the input is high
(V_in = V_DD). Consequently, the static current in a complementary circuit is essentially
determined by the leakage currents of the transistors and can be very small (less than
100 pA for the inverter in Figure 6.19). As a result, the output signals in the steady
states are essentially equal to the rail voltages V_DD and ground. During switching
there is a brief period when both transistors are simultaneously in the low-resistance
on-state and a significant current flows between the V_DD and ground rails. Thus, most
of the power consumption of a complementary circuit is due to switching, while the
static power dissipation is very small.
Figure 6.19 (a) Schematic, (b) actual photographic image and (c)
transfer characteristics of an organic complementary inverter
based on a p-channel pentacene TFT and an n-channel F16CuPc
TFT.
The dynamic performance of the inverter is limited by the slower of the two
transistors, in this case the n-channel F16CuPc TFT. Figure 6.20 shows, in graphical
form, the inverter's response to a square-wave input signal with an amplitude of 2 V
and a frequency of 500 Hz, that is, the cut-off frequency of the F16CuPc TFT.
To allow organic circuits to operate at higher frequencies, it is necessary to increase
the transconductance and reduce the parasitic capacitances. From a materials point of
view, this can be done by developing new organic semiconductors that provide larger
carrier mobilities [36]. Ideally, the carrier mobilities of the p- and n-channel TFTs
should be similar. From a manufacturing point of view, the critical dimensions of the
devices must be reduced, that is, the channel length and overlap capacitances must
be made smaller. However, as the channel length of organic TFTs is reduced, the
transconductance does not necessarily scale as predicted by Eqs. (6.9) and (6.10). The
main reason for this is that the contact resistance in organic TFTs can be very large, as
the contacts are typically not doped. Consequently, as the channel length is reduced,
the drain current becomes increasingly limited by the contact resistance (which is
independent of channel length), rather than by the channel resistance. This is shown
in Figure 6.21 for pentacene TFTs in the inverted staggered configuration and in the
inverted coplanar configuration, both with channel lengths ranging from 50 μm to
5 μm and all with a channel width of 100 μm.
For long channels (L = 50 μm), where the effect of the contact resistance on the
TFT characteristics is small, the transconductance is similar for both technologies
(g_m ≈ 1 μS; μ ≈ 0.5 cm² V⁻¹ s⁻¹). The difference between the two device configurations
becomes evident when the channel length is reduced. For the coplanar TFTs the
potential benefit of channel length scaling is largely lost due to the significant contact
resistance (5 × 10⁴ Ω cm). The staggered configuration offers significantly smaller
contact resistance (10³ Ω cm), as the area available for charge injection from the
metal into the carrier channel is larger (given by the gate/contact overlap area), and as
a result the transconductance for short channels (5 μm) is significantly larger in the
case of the staggered TFTs (7 μS versus 2 μS). The staggered TFT with a channel
length of 5 μm and a transconductance of 7 μS has a total gate capacitance of about
5 pF, so the cut-off frequency is estimated to be on the order of 200 kHz (at an
operating voltage in the range of 2–3 V).
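The scaling argument can be made quantitative with a simple series-resistance picture: the channel resistance of Eq. (6.13) shrinks with L, while the contact contribution R_c/W stays fixed. The sketch below uses the width-normalized contact resistances quoted above and the quoted 0.5 cm² V⁻¹ s⁻¹ mobility; the 2-V overdrive is an assumed value for illustration:

def channel_resistance(l_cm: float, w_cm: float, mu: float = 0.5,
                       c_diel: float = 0.7e-6, v_ov: float = 2.0) -> float:
    """Eq. (6.13): R_ch = L / (mu * C_diel * W * V_ov); cm, cm^2/Vs, F/cm^2, V -> ohm."""
    return l_cm / (mu * c_diel * w_cm * v_ov)

W = 100e-4  # 100-um channel width, in cm
contacts = {"coplanar": 5e4, "staggered": 1e3}  # width-normalized contact resistance, ohm*cm

for l_um in (50, 5):
    for name, r_c in contacts.items():
        r_ch = channel_resistance(l_um * 1e-4, W)
        r_total = r_ch + r_c / W  # the contact term is independent of channel length
        print(f"L = {l_um:>2} um, {name:9s}: R_ch = {r_ch:.1e} ohm, total = {r_total:.1e} ohm")

In this picture the fixed ~5 MΩ contact term of the coplanar geometry dominates long before the channel length reaches 5 μm, which is why its transconductance barely improves with scaling.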
6.5
Applications
If it is specified that no more than 1% of the stored charge is allowed to leak from the
pixel during the frame time t_frame, then the minimum required TFT off-state
resistance can be estimated:
R_off ≥ t_frame / (0.01 C_pixel)    (6.11)
Thus, for a t_frame of 20 ms and a pixel capacitance of 1 pF, the TFTs must have an
off-state resistance of 2 TΩ, or greater.
For an extended graphics array (XGA) display with M = 768 rows and a t_frame of 20 ms,
the time available to charge the capacitors in one row is t_select = t_frame/M = 26 μs. If a
combined (pixel plus data line) capacitance of 2 pF is assumed, and it is specified that
the capacitors be charged to within 1% of the target data voltage, then the maximum
allowed TFT on-state resistance can be estimated:
R_on ≤ t_select / (4.6 C_pixel)    (6.12)

The on-state resistance of the TFT is essentially the resistance of the gated channel:

R_on ≈ L / (μ C_diel W (V_GS − V_th))    (6.13)
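Plugging the numbers quoted above into Eqs. (6.11) and (6.12) is straightforward; a minimal sketch using the stated frame time, row count and capacitances:

def r_off_min(t_frame: float, c_pixel: float) -> float:
    """Eq. (6.11): R_off >= t_frame / (0.01 * C_pixel), so that <1% of the charge leaks per frame."""
    return t_frame / (0.01 * c_pixel)

def r_on_max(t_select: float, c_pixel: float) -> float:
    """Eq. (6.12): R_on <= t_select / (4.6 * C_pixel); 4.6 ~ ln(100) for charging to within 1%."""
    return t_select / (4.6 * c_pixel)

t_frame = 20e-3                    # s
print(f"R_off >= {r_off_min(t_frame, 1e-12):.1e} ohm")  # 2e12 ohm = 2 Tohm
t_select = t_frame / 768           # XGA: 768 rows -> ~26 us
print(f"R_on  <= {r_on_max(t_select, 2e-12):.1e} ohm")  # ~2.8e6 ohm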
Figure 6.25 A flexible 48 × 48 pixel active-matrix organic light-emitting diode (OLED)
display with pentacene organic TFTs developed at Penn State University. (Reproduced
with permission from Ref. [60].)
with a TFT geometry of W/L = 5. Thus, the static transistor performance requirements
for active-matrix OLED displays can be met by organic TFTs T1 and T2
occupying only a small fraction of the pixel area. A photograph of an active-matrix
OLED display with two pentacene TFTs and a bottom-emitting OLED in each
pixel [60] is shown in Figure 6.25.
Compared with liquid-crystal and electrophoretic displays, active-matrix OLED
displays are far more demanding as far as the uniformity and stability of the TFT
parameters are concerned. For example, if the TFT threshold voltage in a liquid-crystal
display changes over time, or is not uniform across the display, the image
quality is not immediately affected, as the select voltage is usually large. In an OLED
display, however, the threshold voltage of transistor T2 directly determines the drive
current and thus the OLED brightness. Consequently, even small differences in
threshold voltage have a dramatic impact on image quality and color fidelity. In order
to reduce or eliminate the effects of non-uniformities or time-dependent changes of
the TFT parameters, more complex pixel circuit designs have been proposed [61]. In
these designs, additional TFTs are implemented to make the OLED current independent
of the threshold voltages of the TFTs. A pixel circuit with a larger number of
TFTs is likely to occupy a greater portion of the total pixel area, but may significantly
improve the performance of the display.
A second potential application for organic TFTs is in large-area sensors for the
spatially resolved detection of physical or chemical quantities, such as temperature,
pressure, radiation, or pH. As an example, Figure 6.26 shows the schematic of an
active-matrix pressure sensor array. Mechanical pressure exerted on a sensor element
leads to a reversible and reproducible change in the resistance of the sensor element.
To allow external circuitry to access the resistance of each individual sensor it is
necessary to integrate a transistor with each sensor element. During operation,
the rows of the array are selected one by one to switch the TFTs in the selected row to
the low-resistance state (similar to the row-select procedure in an active-matrix
display), and the resistance of each sensor element in the selected row is measured
through the data lines by external circuitry. This is repeated for each row until the
entire array has been read out. The result is a map of the 2-D distribution of
the desired physical quantity (in this case, the pressure) over the array. By reading the
array continuously a dynamic image can be created (again, similar to an active-matrix
display). One application of a 2-D pressure sensor array is a fingerprint sensor for
personal identification purposes. Another interesting application is the combination
of spatially resolved pressure and temperature sensing over large conformable
surfaces to create the equivalent of sensitive skin for human-like robots capable of
navigating in unstructured environments [62].
6.6
Outlook
Organic transistors are potentially useful for applications that require electronic
functionality with low or medium complexity distributed over large areas on
unconventional substrates, such as glass or flexible plastic film. Generally, these
are applications in which the use of single-crystal silicon devices and circuits is either
technically or economically not feasible. Examples include flexible displays and
sensors. However, organic transistors are unlikely to replace silicon in applications
characterized by large transistor counts, small chip size, large integration densities,
or high-frequency operation. The reason is that, in these applications, the use of
silicon MOSFETs is very economical. For example, the manufacturing cost of a
silicon MOSFET in a 1-Gbit memory chip is on the order of $10⁻⁹, which is less than
the cost of printing a single letter in a newspaper.
The static and dynamic performance of state-of-the-art organic TFTs is already
sufficient for certain applications, most notably small or medium-sized flexible
displays in which the TFTs operate with critical frequencies in the range of a few tens
of kilohertz. Strategies for increasing the performance of organic TFTs include
further improvements in the carrier mobility of the organic semiconductor (either
through the synthesis of new materials, through improved purification, or by
enhancing the molecular order in the semiconductor layer) and more aggressive
scaling of the lateral transistor dimensions (channel length and contact overlap). For
example, an increase in cut-off frequency from 200 kHz to about 2 MHz can be
achieved either by improving the mobility from 0.5 cm² V⁻¹ s⁻¹ to about
5 cm² V⁻¹ s⁻¹ (assuming critical dimensions of 5 μm and an operating voltage of 3 V), or
by reducing the critical dimensions from 5 μm to about 1.6 μm (assuming a mobility
of 0.5 cm² V⁻¹ s⁻¹ and an operating voltage of 3 V). A cut-off frequency of about
20 MHz is projected for TFTs with a mobility of 5 cm² V⁻¹ s⁻¹ and critical dimensions
of 1.6 μm (again assuming an operating voltage of 3 V).
However, these improvements in performance must be implemented without
sacrificing the general manufacturability of the devices, circuits, and systems. This
important requirement has fueled the development of a whole range of large-area,
high-resolution printing methods for organic electronics. Functional printed organic
devices and circuits have indeed been demonstrated using various printing techniques,
but further studies are required to address issues such as process yield and
parameter uniformity.
One of the most critical problems that must be solved before organic electronics
can begin to find use in commercial applications is the stability of the devices and
circuits during continuous operation, and while exposed to ambient oxygen and
humidity. Early product demonstrators have often suffered from short lifetimes due
to a rapid degradation of the organic semiconductor layers. However, recent advances
in synthesis, purification, and processing, together with economically viable
encapsulation techniques, have raised the hope that the degradation of organic
semiconductors is not an insurmountable problem and that organic thin-film
transistors may soon be commercially utilized.
References
1 W. Warta, N. Karl, Hot holes in naphthalene: High, electric-field-dependent
mobilities, Phys. Rev. B 1985, 32, 1172.
2 M. C. J. M. Vissenberg, M. Matters, Theory of field-effect mobility in amorphous
organic transistors, Phys. Rev. B 1998, 57, 12964.
22 S. Lee, B. Koo, J. Shin, E. Lee, H. Park, H. Kim, Effects of hydroxyl groups in
polymeric dielectrics on organic transistor performance, Appl. Phys. Lett. 2006,
88, 162109.
23 P. Herwig, K. Müllen, A soluble pentacene precursor: Synthesis, solid-state
conversion into pentacene and application in a field-effect transistor, Adv. Mater.
1999, 11, 480.
24 A. Afzali, C. D. Dimitrakopoulos, T. L. Breen, High-performance, solution-processed
organic thin film transistors from a novel pentacene precursor, J. Am.
Chem. Soc. 2002, 124, 8812.
25 M. M. Payne, S. R. Parkin, J. E. Anthony, C. C. Kuo, T. N. Jackson, Organic
field-effect transistors from solution-deposited functionalized acenes with
mobilities as high as 1 cm²/Vs, J. Am. Chem. Soc. 2005, 127, 4986.
26 C. C. Kuo, M. M. Payne, J. E. Anthony, T. N. Jackson, TES anthradithiophene
solution-processed OTFTs with 1 cm²/Vs mobility, 2004 International Electron
Devices Meeting Technical Digest, 2004, p. 373.
27 S. K. Park, C. C. Kuo, J. E. Anthony, T. N. Jackson, High mobility
solution-processed OTFTs, 2005 International Electron Devices Meeting
Technical Digest, 2005, p. 113.
28 K. C. Dickey, J. E. Anthony, Y. L. Loo, Improving organic thin-film transistor
performance through solvent-vapor annealing of solution-processable
triethylsilylethynyl anthradithiophene, Adv. Mater. 2006, 18, 1721.
29 L. L. Chua, J. Zaumseil, J. F. Chang, E. C. W. Ou, P. K. H. Ho, H. Sirringhaus,
R. H. Friend, General observation of n-type field-effect behaviour in organic
semiconductors, Nature 2005, 434, 194.
30 R. Schmechel, M. Ahles, H. von Seggern, A pentacene ambipolar transistor:
Experiment and theory, J. Appl. Phys. 2005, 98, 084511.
31 J. Yamaguchi, S. Yaginuma, M. Haemori, K. Itaka, H. Koinuma, An in-situ
59 P. Wellmann, M. Hofmann, O. Zeika, A. Werner, J. Birnstock, R. Meerheim,
G. He, K. Walzer, M. Pfeiffer, K. Leo, High-efficiency p-i-n organic light-emitting
diodes with long lifetime, J. Soc. Information Display 2005, 13, 393.
60 L. Zhou, A. Wang, S. C. Wu, J. Sun, S. Park, T. N. Jackson, All-organic active
matrix flexible display, Appl. Phys. Lett. 2006, 88, 083502.
7
Carbon Nanotubes in Electronics
M. Meyyappan
7.1
Introduction
Since the discovery of carbon nanotubes (CNTs) in 1991 [1] by Sumio Iijima of the
NEC Corporation, research activities exploring their structure, properties and
applications have exploded across the world. This interesting nanostructure exhibits
unique electronic properties and extraordinary mechanical properties, and this has
prompted the research community to investigate the potential of CNTs in numerous
areas including, among others, nanoelectronics, sensors, actuators, field emission
devices, and high-strength composites [2]. Although recent progress in all of these
areas has been significant, the routine commercial production of CNT-based
products is still years away. This chapter focuses on one specific application field
of CNTs, namely electronics, and describes the current status of developments in this
area. This description is complemented with a brief discussion of the properties and
growth methods of CNTs, further details of which are available in Ref. [2].
7.2
Structure and Properties
Figure 7.1 A strip of graphene sheet rolled into a carbon nanotube; n and m are the chiral indices.
structures are simply known as chiral nanotubes. It is important to note that, at the
time of this writing, exquisite control over the values of m and n is not possible.
Transmission electron microscopy (TEM) images of a SWNT and a MWNT are shown
in Figure 7.2, where the individual SWNTs are seen to have a diameter of about 1 nm.
The MWNT has a central core with several walls, and a spacing close to 0.34 nm
between two neighboring walls (Figure 7.2b).
A SWNT can be either metallic or semiconducting, depending on its chirality, that
is, the values of n and m. When (n − m)/3 is an integer, the nanotube is metallic;
otherwise it is semiconducting. The diameter of the nanotube is given by
d = (a_g/π)(n² + mn + m²)^0.5, where a_g is the lattice constant of graphite. The strain energy
caused in the SWNT formation from the graphene sheet is inversely proportional
to its diameter. There is a minimum diameter that can afford this strain energy,
which is about 0.4 nm. On the other hand, the maximum diameter is about 3 nm,
beyond which the SWNT may not retain its tubular structure and ultimately will
collapse [3].
In the case of MWNTs, the smallest inner diameter found experimentally is about
0.4 nm, but typically it is around 2 nm. The outer diameter of MWNTs can be as large as
100 nm. Both SWNTs and MWNTs, while preferentially being defect-free, have been
observed experimentally in various defective forms such as bent, branched, helical,
and even toroidal nanotubes.
The bandgap of a semiconducting nanotube is given by E_g = 2 d_cc γ / d, where d_cc is
the carbon–carbon bond length (0.142 nm), and γ is the nearest-neighbor hopping
parameter (≈2.5 eV). Thus, the bandgap of semiconducting nanotubes of diameters
between 0.5 and 1.5 nm may be in the range of 1.5 to 0.5 eV. The minimum resistance of a
metallic SWNT is h/(4e²) ≈ 6.5 kΩ, where h is Planck's constant. However, experimental
measurements typically show higher resistance due to the presence of
defects, impurities, structural distortions and the effects of coupling to the substrate
and/or contacts.
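The chirality rules above are easy to apply in practice. The short sketch below (illustrative; a_g = 0.246 nm is the in-plane lattice constant of graphite) classifies a few (n, m) tubes and evaluates the diameter and bandgap formulas just given:

import math

A_G = 0.246    # nm, in-plane lattice constant of graphite
D_CC = 0.142   # nm, carbon-carbon bond length
GAMMA = 2.5    # eV, nearest-neighbor hopping parameter

def diameter(n: int, m: int) -> float:
    """d = (a_g / pi) * sqrt(n^2 + n*m + m^2), in nm."""
    return A_G / math.pi * math.sqrt(n * n + n * m + m * m)

def is_metallic(n: int, m: int) -> bool:
    """An (n, m) tube is metallic when (n - m) is divisible by 3."""
    return (n - m) % 3 == 0

def bandgap(n: int, m: int) -> float:
    """E_g = 2 * d_cc * gamma / d for semiconducting tubes, in eV."""
    return 0.0 if is_metallic(n, m) else 2 * D_CC * GAMMA / diameter(n, m)

for n, m in [(10, 10), (9, 0), (10, 0)]:
    kind = "metallic" if is_metallic(n, m) else f"semiconducting, Eg = {bandgap(n, m):.2f} eV"
    print(f"({n},{m}): d = {diameter(n, m):.2f} nm, {kind}")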
In addition to their interesting electronic properties, SWNTs exhibit extraordinary
mechanical properties. For example, the Young's modulus of a (10,10) SWNT is over
1 TPa, with a tensile strength of 75 GPa. The corresponding values for graphite
(in-plane) are 350 and 2.5 GPa, whereas the values for steel are 208 and 0.4 GPa [3].
Nanotubes can also sustain a tensile strain of 10% before fracturing, which is
remarkably higher than for other materials. The thermal conductivity of nanotubes is
substantially high [3, 4], with measured values being in the range of 1800 to
6000 W m⁻¹ K⁻¹ [5].
7.3
Growth
The oldest process for preparing SWNTs and MWNTs is that of arc synthesis [6], with
laser ablation subsequently being introduced during the 1990s to produce CNTs [7].
These bulk production techniques and large quantities are necessary when using
CNTs in composites, gas storage, and similar applications. For electronics applications,
however, it may be difficult to adopt pick-and-place strategies using bulk-produced
material. Assuming that in-situ growth approaches, akin to those currently used to
produce devices in silicon-based electronics, will be needed, it is important at this
point to describe the techniques of chemical vapor deposition (CVD) and
plasma-enhanced chemical vapor deposition (PECVD), both of which allow CNT growth
on patterned substrates [8].
Chemical vapor deposition is a frequently used technique in silicon integrated circuit
manufacture for depositing thin films of metals, semiconductors and dielectric
materials. The CVD of CNTs typically involves a carbon-bearing feedstock such as CO
or hydrocarbons including methane, ethylene, and acetylene. It is important to
In silicon integrated circuit manufacture, PECVD has emerged as a lower-temperature
alternative to thermal CVD for the deposition of thin films of silicon,
or its nitride or oxide. This strategy is not entirely successful in CNT growth, primarily
because the growth temperature is tied to catalyst effectiveness as opposed to
precursor dissociation [10]. Nevertheless, some reports have been made concerning
nanotube growth at low temperature, or even at room temperature. However, these
results are not reliable, as they do not explicitly measure the growth temperature (i.e.
the wafer temperature), but instead report only the temperatures on the bottom side
of the substrate holder. Neither did any of these studies appreciate the fact that the
plasma (particularly the dc plasma used in most studies) heats the wafer
substantially, especially at the very high bias voltages commonly used. In such a
case, even external heating via a heater may not be needed, and in most cases the
temperature difference between the wafer and the bottom of the substrate holder may
be several hundred degrees or more, depending on the input power [11]. Even if any
degree of growth temperature reduction is achieved using PECVD, the material
quality is relatively poor. Most of these structures are conical in configuration,
with a continuously tapering diameter from the bottom to the top.
Regardless of such issues, PECVD has one clear advantage over CVD, in that it
enables the production of individual, freestanding, vertically aligned MWNT structures
as opposed to individual, wavy nanotubes. These freestanding structures are
invariably disordered, with a bamboo-like inner core and, for that reason, are referred
to as multi-walled carbon nanofibers (MWNFs) or simply carbon nanofibers
(CNFs) [10]. PECVD is also capable of producing wavy MWNTs which are very
similar to the thermally grown MWNTs.
To date, a variety of plasma sources have been used in CNT growth, including
dc [12, 13], microwave [14], and inductive power sources [15]. The plasma efficiently
breaks down the hydrocarbon feedstock, thus creating a variety of reactive radicals
which are also the source of amorphous carbon. For this reason the feedstock is
typically diluted with hydrogen, ammonia, argon or nitrogen to maintain the
hydrocarbon fraction at less than about 20%. PECVD is performed at low pressures,
typically in the range of 1 to 20 Torr. A scanning electron microscopy (SEM) image of
PECVD-grown MWNFs is shown in Figure 7.5, wherein the individual structures are
well separated and vertical. However, the TEM image reveals a disordered inner core
and also the catalyst particle at the top. In contrast, in most cases of MWNT growth
by thermal and plasma CVD, the catalyst particle is typically at the base of the
nanotubes.
7.4
Nanoelectronics
When the need for a transition to alternatives to silicon CMOS emerges, there are a few
expectations of a viable alternative:

• The new technology must be easier and cheaper to manufacture than Si CMOS.
• A high current drive is needed, with the ability to drive capacitances of interconnects
of any length.
• The reliability level enjoyed to date must be available (i.e. operating time > 10 years).
• A high level of integration must be possible (>10¹⁰ transistors per circuit).
• A very high reproducibility is expected.
• The technology should not be handicapped by the high heat dissipation problems
currently forecast for future-generation silicon devices, or attractive solutions
must be available to tackle the anticipated heat loads.
Of course, the present status of CNT electronics has not yet reached the point where its
performance in terms of the above goals can be evaluated. This is due to the fact that most
efforts to date relate to the fabrication of single devices such as diodes and transistors, and
little has been targeted at circuits and integration (as will be seen in the next section). In
summary, the present status of CNT electronics evolution is similar to that of silicon
technology between the invention of the transistor (during the late 1940s) and the
development of the integrated circuit in the 1960s. It would take at least a decade or more to
demonstrate the technological progress required to meet the above-listed expectations.
7.4.1
Field Effect Transistors
The early attempts to investigate CNTs in electronics consisted of fabricating field-effect
transistors (FETs) with a SWNT as the conducting channel [17]. Tans et al. [18] first
reported a CNT-FET, in which a semiconducting SWNT from a bulk-grown sample was
transplanted to bridge the source and drain contacts, separated by about a micron or
more (see Figure 7.6). The contact electrodes were defined on a thick 300-nm SiO₂ layer
grown on a silicon wafer acting as the back gate. The 1.4-nm tube, with a corresponding
bandgap of about 0.6 eV, showed I–V characteristics indicating gate control of current
flow through the nanotube. In the FET, the holes were the majority carriers and the
conductance was shown to vary by at least six orders of magnitude. The gain of this
early device was below 1, due primarily to the thick gate oxide and high contact resistance.
At almost the same time, Martel et al. [19] presented their CNT-FET results using
a similar back-gated structure. The oxide thickness was 140 nm, and 30-nm-thick
gold electrodes were defined using electron-beam lithography. The room-temperature
I–V_SD characteristics showed that the drain current decreased strongly with
increasing gate voltage, thus demonstrating CNT-FET operation through hole
transport. The conductance modulation in this case spanned five orders of
magnitude. The transconductance of this device was 1.7 nS at V_SD = 10 mV, with
a corresponding hole mobility of 20 cm² V⁻¹ s⁻¹. The authors concluded that the
transport was diffusive rather than ballistic and, in addition, that the high hole
concentration was inherent to the nanotubes as a result of processing. This
unipolar p-type device behavior suggested a Schottky barrier at the tube–contact
metal interface. Later, the same IBM group [20] showed that n-type transistors could
be produced simply by annealing the above p-type device in a vacuum, or by
intentionally doping the nanotube with an alkali metal such as potassium.
All of the early CNT-FETs used the silicon substrate as the back gate. This
approach has several disadvantages. First, the resulting thick gate oxide
required high gate voltages to turn the device on. Second, the use of the substrate for
gating influences all devices simultaneously, whereas for integrated circuit applications
each CNT-FET needs its own gate control. Wind et al. [21] reported the first top-gate
CNT-FET, which also featured embedding the SWNT within the insulator rather
than exposing it to ambient, as had been done in the early devices. It was considered
that such ambient exposure would lead to p-type characteristics and, as expected, the
top-gate device showed significantly better performance. A p-type CNT-FET with a
gate length of 300 nm and a gate oxide thickness of 15 nm showed a threshold voltage
of −0.5 V and a transconductance of 2321 μS μm⁻¹. These results were better than
those of a silicon p-MOSFET [22] with a much smaller gate length of 15 nm and an
oxide thickness of 1.4 nm performing at a transconductance of 975 μS μm⁻¹. The
CNT-FET also showed a three- to four-fold higher current drive per unit width
compared to the above silicon device. Nihey et al. [23] also reported a top-gated device,
albeit with a thinner (6-nm) TiO₂ gate oxide with a higher dielectric constant. This
device showed a 320 nS transconductance at a 100-mV drain voltage.
Most recently, Seidel et al. [24] reported CNT-FETs with short channels (18 nm), in
contrast to all previous studies with micron-long channels. This group used nanotubes
with diameters of 0.7 to 1.1 nm, and bandgaps in the range of 0.8 to 1.3 eV. The
impressive performance of these devices included an on/off current ratio of 10⁶ and a
transconductance of 12 000 μS μm⁻¹. The current-carrying capacity was also very
high, with a maximum current of 15 μA, corresponding to 10⁹ A cm⁻². Another
recent innovation involved a nanotube-on-insulator (NOI) approach [25], similar to
the adoption of silicon-on-insulator (SOI) by the semiconductor industry, which
minimizes parasitic capacitance.
As noted above, many of the CNT-FET studies conducted to date have used SWNTs,
this being due to their superior properties compared to other types of nanotubes such as
MWNTs and CNFs. As the bandgap is inversely proportional to the diameter, large-diameter
MWNTs are invariably metallic. Martel et al. [19] fabricated the first MWNT
FETs showing a significant gate effect. The real advantage of MWNTs is that they can be
grown vertically up to reasonable lengths for a given diameter. Choi et al. [26, 27] took
advantage of this point to fabricate vertical transistors using MWNTs grown using an
anodic alumina template, which essentially contained nanopores of various diameters
such that they were able to control both the diameter and the pore density. The vertically
aligned MWNTs grown using nanopores are shown in Figure 7.7. Following this, the
device fabrication consisted of depositing SiO₂ on top of the aligned nanotubes. The
electrode was then attached to the nanotubes through electron-beam-patterned holes,
and finally the top metal electrode was attached. A schematic of the CNT-FET array and a
SEM image of a 10 × 10 array are shown in Figure 7.8. For these devices, the authors
claimed a tera-level transistor density of 2 × 10¹¹ cm⁻².
Beyond single CNT-FETs, early attempts to fabricate circuit components have also
been reported [28–30]. Liu et al. [28] fabricated both PMOS and CMOS inverters based
on CNT-FETs. Their CMOS inverter connected two CNT-FETs in series using an
electric lead of 2 μm length. One of the FETs was a p-type device, while the other, an
n-type device, was obtained using potassium doping (see Figure 7.9). The devices
used the silicon substrate as a back gate. The inverter was constructed and biased in
the configuration shown in Figure 7.9b, after which a drain bias of 2.9 V was applied
and the gate electrode was swept from 0 to 2.5 V, defining logic 0 and logic 1, respectively.
As seen in the transfer curve in Figure 7.9b, when the input voltage is low (logic 0) the
p-type and n-type devices are on and off, respectively (corresponding to their
respective high- and low-conductance states). The output is then close to V_DD,
producing a logic output of 1. When the input voltage is high (logic 1), the reverse is
the case: the p-type transistor is off and the n-type device is on, with a combined
output close to 0, producing a logic 0. The transfer curve in Figure 7.9b would show
an abrupt step (a stepwise V_out versus V_in behavior) if this inverter were functioning
ideally. However, this first demonstration had a leaky p-device, and control of the
threshold voltages of both devices was not perfect, thus leading to the finite slope
seen in Figure 7.9b. Derycke et al. [29] also demonstrated an inverter using p- and
n-type CNT-FETs. Beyond inverters, Bachtold et al. [30] fabricated circuits to perform
logic operations such as NOR, and also constructed an ac ring oscillator.
Figure 7.10 (a) A neural tree constructed using numerous Y-junctions of carbon
nanotubes. (b) A network of interconnected SWNTs showing a few Y-junctions [36].
Figure 7.10b shows a very rudimentary attempt [36] to create such a CNT tree by
utilizing self-assembled porous, collapsible polystyrene/divinylbenzene microspheres
to hold the catalyst. A controlled collapse of the spheres leads to the creation and release
of the catalyst on the substrate for the CVD of SWNTs. A few Y-junctions showing an
interconnected three-dimensional network of nanotubes are visible in Figure 7.10b.
7.4.2
Device Physics
The physics describing the operation of CNT-FETs has been described in several
theory papers [39–42] and summarized and reviewed elsewhere [17]. Here, the
available information is used to predict upper limits on CNT-FET performance.
Yamada argues [17] that in nanoFETs the properties of the bulk material do
not influence the device performance. In micro- and macro-FETs, the drain current is
proportional to the carrier mobility, which varies from material to material through its
dependence on material-related properties such as effective mass and phonon
scattering. In nanoFETs with ideal contacts, the drain current is determined by the
transmission coefficient of an electron flux from the source to the drain. When the carrier
transport is ballistic, this coefficient is 1 and the material properties do not enter the
picture directly. However, the material properties in practice enter indirectly, as
practical contacts (and hence the transmission coefficient at the contact) depend on
the channel material and the interaction at the metal–channel semiconductor
interface. The same applies to the preparation of the insulating material that
determines the gate voltage characteristics. One-dimensional nanomaterials such
as SWNTs can inherently suppress the short-channel effects arising from a deeper,
broader distribution of carriers away from the gate, which occurs in reduced-size,
two-dimensional silicon devices.
By assuming such an ideal device with ballistic transport and ideal contacts, Guo
et al. [43] evaluated the performance of CNT-FETs. These authors considered a 1-nm
SWNT with an insulator thickness of 1 nm and a dielectric constant of 4. The geometry
was also idealized to be a coaxial structure, with contacts at either end of the
nanotube and the gate wrapped around it. The computed on/off current
ratio of 1120 was far higher than that of planar silicon CMOS devices with the same
insulator parameters and power supply. The transconductance of this structure was also
very high, at 63 μS at 0.4 V, which is two orders of magnitude higher than any CNT-FET
device discussed in Section 7.4.1. When a planar CNT-FET is compared with a planar
Si-MOSFET with similar insulator parameters, the CNT-FET shows an on-current of
790 μA μm⁻¹ at V_DD = 0.4 V, in contrast to the 1100 μA μm⁻¹ value for silicon. In a
following study, the same authors [44] computed the high-frequency performance of
this ideal device and projected a unity-gain cut-off frequency (f_T) of 1.8 THz. Their
analysis also showed that the parasitic capacitance dominates the intrinsic gate
capacitance by three orders of magnitude. In a similar investigation, Hasan et al. [45]
computed f_T to be a maximum of 130/L GHz, where L is the channel length in
microns. As there is a desire to increase the current drive and to reduce the
parasitic capacitance per tube, a parallel array of nanotubes as the channel warrants
investigation.
In relative terms, very few studies have been conducted on the use of nanotubes as
memory devices. Rueckes et al. [47] proposed a crossbar architecture for constructing
non-volatile random access memory with a density of 10¹² elements per cm² and an
operation frequency of over 100 GHz. In this architecture, nanotubes suspended in an
n × m array act like electromechanical switches with distinct on and off states.
A carbon nanotube-based flash memory was fabricated by Choi et al. [48], in which the
source-drain gap was bridged with a SWNT as the conducting channel and the structure
had a floating gate and a control gate. By grounding the source and applying 5 V and
12 V at the drain and control gate, respectively, a writing of 1 was achieved. This
corresponds to the charging of the floating gate. To write 0, the source was biased at
12 V, the control gate fixed at 0 V, and the drain allowed to float. Now, the electrons on
the floating gate tunneled to the source and the floating gate was discharged. In
order to read, a voltage V_R was applied to the control gate and, depending on the state
of the floating gate (1 or 0), the drain current was either negligible or finite,
respectively. Choi et al. [48] reported an appreciable threshold modulation for their
SWNT flash memory operation.
7.5
Carbon Nanotubes in Silicon CMOS Fabrication
One of the anticipated problems in the next few generations of silicon devices is that
the copper interconnects will suffer from electromigration at current densities of
10⁶ A cm⁻² and above. The resistivity of copper increases significantly for wiring line
widths below 0.5 μm. In addition, the etching of deep vias and trenches and the
void-free filling of copper in high-aspect-ratio structures may pose technological
challenges as progress continues along the Moore's law curve. All of these issues
together demand the investigation of alternatives to the current copper damascene
process. In this regard, it is important to note that CNTs do not break down even
at current densities of 10⁸ to 10⁹ A cm⁻² [49], and hence can be a viable alternative to
copper. Kreupl et al. [50] were the first to explore CVD-produced MWNTs in vias and
contact holes. They measured a resistance of about 1 Ω for a 150-μm² via which
contained about 10 000 MWNTs, thus yielding a resistance of 10 kΩ per nanotube.
Further studies by this group demonstrated current densities of 5 × 10⁸ A cm⁻²,
which exceeds the best results for metals, although the individual resistance of the
MWNTs was still high at 7.8 kΩ. Srivastava et al. [51] provided a systematic evaluation
of CNT and Cu interconnects and showed that, for local interconnects, nanotubes
may not offer any advantages, partly due to the fact that practical implementations of
nanotube interconnects have an unacceptably high contact resistance. On the other
hand, their studies showed an 80% performance improvement with CNTs for long
global interconnects. It is important to note that, even in the case of local interconnects,
very few studies have been conducted on contact and interface engineering;
with further investigation, the situation may well improve beyond these
early expectations.
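The per-tube figure quoted by Kreupl et al. [50] follows from treating the filled via as N identical resistors in parallel; a trivial check:

def via_resistance(r_tube: float, n_tubes: int) -> float:
    """N identical nanotubes in parallel: R_via = R_tube / N."""
    return r_tube / n_tubes

# ~10 kOhm per MWNT and ~10 000 tubes in the 150-um^2 via of Ref. [50] -> ~1 Ohm
print(via_resistance(10e3, 10000), "ohm")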
Given the potential of CNTs as interconnects, it is necessary to devise a processing
scheme that is compatible with the silicon integrated circuit fabrication scheme.
In the via and contact hole schemes, Kreupl et al. [50] followed a traditional approach
by simply replacing the copper-filling step with a MWNT CVD step. If this proves to
be reliable, and specifically if the MWNTs do not become unraveled during the
chemical mechanical polishing (CMP) step, then it would be a viable approach,
provided that a dense filling of vertical nanotubes can be achieved. Even then, the
conventional challenges in deep-aspect-ratio etching and void-free filling of features,
which arise due to shrinking feature sizes, remain. The dry etching of high-aspect-ratio
vias with vertical sidewalls will increasingly become a problem, and further
processing studies must be performed to establish the viability of this approach.
In the meantime, Li et al. [52] described an alternative bottom-up scheme (see Figure 7.11) wherein the CNT interconnect is first deposited using PECVD at prespecified locations. This is followed by tetraethylorthosilicate (TEOS) CVD of SiO2 in the space between CNTs, and then by CMP to yield a smooth top surface of SiO2 with embedded CNT interconnects. Top metallization completes the fabrication. The interconnects grown using PECVD in Ref. [52] are CNFs with a bamboo-like morphology. Whilst they are truly vertical and freestanding compared to MWNTs, thus allowing ease of fabrication, their resistance is higher. This, combined with a high contact resistance, resulted in a value of about 6 kΩ for a single 50-nm CNF. Further annealing to obtain higher-quality CNFs and, more importantly, interface engineering to reduce contact resistance, can prove this approach valuable. It would also be useful for future three-dimensional architectures. A detailed theoretical study conducted by Svizhenko et al. [53] showed that almost 90% of the voltage drop occurs at the metal–nanotube interface, while only 10% is due to transport in the nanotube, thus emphasizing the need for contact interface engineering.
7.5.2
Thermal Interface Material for Chip Cooling
Atomic force microscopy (AFM) is a versatile technique for imaging a wide variety of materials with high resolution. In addition to imaging metallic, semiconducting and dielectric thin films in integrated circuit manufacture, AFM has been advocated for critical dimension metrology. Currently, the conventional probes of either silicon or silicon nitride which are sited at the end of an AFM cantilever have a tip radius of curvature of about 20–30 nm, which is obtained by micromachining or reactive ion etching. These probes exhibit significant wear during continuous use, and the worn probes can also break during tapping-mode or contact-mode operation. Carbon nanotube probes can overcome the above limitations due to their small size, high aspect ratio and ability to buckle reversibly. Their use in AFM was first demonstrated by Dai et al. [57], while a detailed discussion of CNT probes and their construction and applications is also available [58].
A SWNT tip, attached to the end of an AFM cantilever, is capable of functioning as an AFM probe and provides better resolution than conventional probes. This SWNT probe can be grown directly using thermal CVD at the end of a cantilever [59]. An image of an iridium thin film collected using a SWNT probe is shown in Figure 7.12. The nanoscale resolution is remarkable but, more importantly, the tip has been shown to be very robust and significantly slow-wearing compared to conventional probes [59]. Due to thermal vibration problems, SWNTs with a typical diameter of 1 to 1.5 nm cannot be longer than about 75 nm for probe construction. In contrast, MWNTs with their larger diameter can form 2- to 3-μm-long probes. It is also possible to sharpen the tip of MWNTs to reach the same size as SWNTs, thus allowing the construction of long probes without thermal stability issues, but with the resolution of SWNTs [60]. Both SWNT and sharpened MWNT probes have been used to image the semiconductor, metallic, and dielectric thin films commonly encountered in integrated circuit manufacture [58–60].
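The 75-nm length limit stems from thermal vibration of the free-standing tube. As a rough scaling sketch, the tube can be treated as a clamped elastic beam, with equipartition setting the tip amplitude; the modulus and tube dimensions below are illustrative assumptions, not values from Refs. [59, 60].

```python
import math

# Thermal vibration of a cantilevered nanotube, modeled as a clamped elastic
# beam: spring constant k = 3*E*I / L^3, rms tip amplitude from equipartition
# <x^2> = kB*T / k. Dimensions and modulus are illustrative assumptions.

KB = 1.38e-23      # Boltzmann constant (J/K)
E_MOD = 1.0e12     # Young's modulus, ~1 TPa for a carbon nanotube (Pa)

def rms_tip_amplitude(length_m, r_outer_m, r_inner_m, temp_k=300.0):
    """RMS thermal vibration amplitude of a cantilevered tube tip."""
    moment = math.pi * (r_outer_m**4 - r_inner_m**4) / 4.0   # area moment
    k_spring = 3.0 * E_MOD * moment / length_m**3
    return math.sqrt(KB * temp_k / k_spring)

# A ~1.4-nm-diameter SWNT: the amplitude grows as L^(3/2) with tube length,
# which is why short tips are required despite the loss of reach.
for length_nm in (50, 75, 150):
    amp = rms_tip_amplitude(length_nm * 1e-9, 0.7e-9, 0.35e-9)
    print(f"L = {length_nm:4d} nm -> rms amplitude = {amp*1e9:.2f} nm")
```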
In addition to imaging, MWNT probes find another important application in the profilometry associated with integrated circuit manufacture. As via and other feature sizes continue to decrease, it will become increasingly difficult to use conventional profilometers to obtain sidewall profiles and monitor the depth of features. Although AFM is advocated as a replacement in this respect, the pyramidal shape of standard AFM probes would lead to artifacts when constructing the sidewall profiles of trenches. Hence, a 7- to 10-nm MWNT probe might be a natural choice for this task. An image of a MWNT probe for this purpose, and the results of profiling a photoresist pattern generated by interferometric lithography, are shown in Figure 7.13. While early attempts consisted of manually attaching a SWNT to a cantilever [57], followed by direct CVD of a nanoprobe on a cantilever [58, 59], Ye et al. [61] reported the first batch fabrication of CNT probes on a 100-mm wafer using PECVD. Unfortunately, the yield obtained was only modest, due mainly to difficulties encountered in controlling the angle of the nanotube to the plane.
7.6
Summary
In this chapter, the current status of CNT-based electronics for logic and memory devices has been discussed. Single-walled CNTs exhibit intriguing electronic properties that make them very attractive for future nanoelectronics devices, and early studies have confirmed this potential. Even with substantially longer channel lengths and thicker gate oxides, the performance of CNT-FETs is better than that of current silicon devices, although of course the design and performance of the former are far from being optimized. While all of this is impressive, the real challenge is in the integration of a large number of devices at reasonable cost to compete with and exceed the performance status quo of silicon technology at the end of the Moore's law paradigm. In addition, all of the studies conducted to date have been along the lines of following silicon processing schemes, with one-to-one replacement of a silicon channel with a CNT channel while maintaining the circuit and architectural schemes. Thus, aside from changing the channel material, there is no novelty in this approach. The structure and unique properties of SWNTs may be ideal for bold, novel architectures and processing schemes, for example in neural or biomimetic architectures, although very few investigations have been carried out in such non-traditional directions. Clearly, CNTs in active devices are a long-term prospect, at least a decade or more away. In the meantime, opportunities exist to include this extraordinary material in silicon CMOS fabrication, not only as a high-current-carrying, robust interconnect but also as an effective heat-dissipating thermal interface material.
Acknowledgments
The author is grateful to his colleagues at NASA Ames Center for Nanotechnology for
providing much of the material described in this chapter.
References
1 S. Iijima, Nature 1991, 354, 56.
2 M. Meyyappan (Ed.), Carbon Nanotubes: Science and Applications, CRC Press, Boca Raton, FL, 2004.
3 J. Han, in: M. Meyyappan (Ed.), Carbon Nanotubes: Science and Applications, CRC Press, Boca Raton, FL, 2004, Chapter 1.
4 M. A. Osman, D. Srivastava, Nanotechnology 2001, 12, 21.
8
Concepts in Single-Molecule Electronics
Björn Lüssem and Thomas Bjørnholm
8.1
Introduction
of molecular electronics experienced its first drawback when it was reported that some of the early results might be due to artifacts [6].
These reports on possible artifacts helped to settle the expectations placed on molecular electronics to a reasonable level, and in the current phase of development more emphasis has been placed on proving that the observed effects are in fact molecular, and on identifying experimental set-ups that avoid the possible introduction of artifacts.
In this chapter, a brief overview is provided of the field of single-molecule electronics, beginning with a short theoretical introduction that aims to define the concepts and terminology used. (A more extensive explanation of the theory can be found in Part A of Volume III of this series.) In the following sections the text is more factual, and relates to how single molecules can actually be contacted and which functionalities they can provide. The means by which these molecules may be assembled to implement complex logical functions are then described, followed by a brief summary highlighting the main challenges of molecular electronics.
8.2
The General Set-Up of a Molecular Device
In this section, the basic concepts used in subsequent sections will be explained, and the presence of two domains of current transport, strong coupling and weak coupling, will be outlined.
Electrical transport across single molecules is remarkably different from conduction in macroscopic wires. In a large conductor, charge carriers move with a mean drift velocity v_d, which is proportional to the electric field, E. Together with the density of free charge carriers, this proportionality gives rise to Ohm's law.
For a single molecule this model is not applicable. Instead of considering drift velocities and resistances, which are only defined as averages over a large number of charge carriers, concern is centered on the transmission of electrons across the molecule.
The general set-up of a molecular device is shown in Figure 8.1. The molecule is connected by two electrodes, labeled source and drain, while the electrostatic potential of the molecule can also be varied by using a gate electrode.
If a voltage is applied between the source and drain (i.e. a negative voltage at the source with respect to the drain), the electrochemical potential of the source, μS, shifts up and the potential of the drain, μD, moves down. An energy window is opened between these two potentials; in this energy window, filled states in the source oppose empty states at the same energy in the drain.
However, as the two electrodes are isolated from each other, electrons cannot easily flow from the source to the drain. Only if a molecular level enters the energy window between μS and μD can electron transport be mediated by these molecular levels (see Figure 8.1b). Therefore, each time the electrochemical potential of the source aligns with a molecular level, the current rises sharply. In Figure 8.1b, for example, the source potential exceeds the lowest unoccupied molecular orbital (LUMO), and electrons can be transmitted across this level. Similarly, the drain potential can drop below the highest occupied molecular orbital (HOMO), which would also initiate electron flow.
8.2.1
The Strong Coupling Regime
Depending on the strength of the coupling of the electrodes with the molecule, there are two domains of electron transport: the weak coupling limit and the strong coupling limit. To distinguish between these two limits, coupling strengths ΓS and ΓD can be defined that describe how strongly the electronic states of the source or drain, |i⟩, |j⟩, interact with the molecular eigenstates |m⟩. ΓS and ΓD have the dimension of energy; a high energy means that the electrode states can couple strongly with the molecular states.
High coupling energies therefore result in electronic wavefunctions of the electrodes that extend into the molecule, so that charge can be easily transmitted from the source, across the molecule, towards the drain. Thus, the current across the
molecule can be expressed in terms of a transmission coefficient T(E):

$$ I = \frac{2e}{h} \int_{\mu_D}^{\mu_S} T(E)\,\mathrm{d}E \qquad (8.1) $$

For a fully transmitting channel, the corresponding conductance is quantized in units of the conductance quantum

$$ G_0 = \frac{2e^2}{h} \qquad (8.2) $$
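To make Eq. (8.1) concrete, the sketch below integrates a Breit–Wigner (Lorentzian) transmission resonance, a common single-level model, over the bias window: the current switches on sharply once the level enters the window, as described for Figure 8.1b. The level position and coupling widths are illustrative assumptions.

```python
import numpy as np

# Landauer current of Eq. (8.1): I = (2e/h) * integral of T(E) over the bias
# window [mu_D, mu_S]. T(E) is modeled as a Breit-Wigner resonance of a single
# molecular level broadened by the contacts; all parameters are illustrative.

E_CHARGE = 1.602e-19   # C
H_PLANCK = 6.626e-34   # J s

def transmission(e_ev, level_ev, gamma_s_ev, gamma_d_ev):
    """Breit-Wigner transmission through one broadened molecular level."""
    half_width = (gamma_s_ev + gamma_d_ev) / 2.0
    return gamma_s_ev * gamma_d_ev / ((e_ev - level_ev) ** 2 + half_width ** 2)

def current(mu_s_ev, mu_d_ev, level_ev, gamma_s_ev, gamma_d_ev, n=2001):
    """Numerically integrate Eq. (8.1); energies in eV, current in amperes."""
    e_grid = np.linspace(mu_d_ev, mu_s_ev, n)
    t_vals = transmission(e_grid, level_ev, gamma_s_ev, gamma_d_ev)
    integral_j = np.trapz(t_vals, e_grid) * E_CHARGE   # eV -> J
    return 2 * E_CHARGE / H_PLANCK * integral_j

# Level 0.3 eV above the equilibrium Fermi energy, 10 meV coupling per
# contact: the current switches on once mu_S crosses the level.
for bias in (0.2, 0.4, 0.8):
    i = current(bias / 2, -bias / 2, level_ev=0.3,
                gamma_s_ev=0.01, gamma_d_ev=0.01)
    print(f"V = {bias:.1f} V -> I = {i * 1e6:6.3f} uA")
```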
The energy cost of adding an electron to the molecule, the charging energy, is set by the capacitances between the molecule and the three electrodes (see Figure 8.2a):

$$ E_C(N+1) = \frac{e^2}{2 C_\Sigma}, \qquad C_\Sigma = C_S + C_D + C_G \qquad (8.3) $$

To include this energy, the energy diagram shown in Figure 8.1b may be refined (see Figure 8.2b). Two levels are included in this diagram, which correspond to the HOMO (lower) and LUMO states shown in Figure 8.1b. The LUMO is floated upwards by $E_C(+1) = e^2/2C_\Sigma$, while the HOMO is moved down by the same amount.
Figure 8.2 (a) The molecule is coupled to source, gate and drain by capacitances. (b) The energy levels in the weak coupling limit. Additional charging energy must be provided by the source voltage.
Thereby, a large energy gap is opened within which no electrons can flow and hence the current is blocked; this effect is known as Coulomb blockade.1)
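For a feel of the magnitudes in Eq. (8.3), the following sketch evaluates the charging energy for attofarad-scale capacitances and compares it with the thermal energy; Coulomb blockade requires E_C to clearly exceed k_BT. The capacitance values are illustrative assumptions.

```python
# Charging energy of Eq. (8.3): E_C = e^2 / (2 * C_sigma), where C_sigma is
# the sum of the source, drain and gate capacitances. Coulomb blockade is
# observable only when E_C clearly exceeds the thermal energy kB*T.

E_CHARGE = 1.602e-19   # C
KB = 1.38e-23          # J/K

def charging_energy_ev(c_s: float, c_d: float, c_g: float) -> float:
    c_sigma = c_s + c_d + c_g
    return E_CHARGE**2 / (2 * c_sigma) / E_CHARGE   # result in eV

# Illustrative attofarad-scale capacitances for a molecular junction.
e_c = charging_energy_ev(c_s=0.1e-18, c_d=0.1e-18, c_g=0.05e-18)
kt_300 = KB * 300 / E_CHARGE    # ~26 meV at room temperature
kt_4 = KB * 4.2 / E_CHARGE      # ~0.4 meV at liquid-helium temperature
print(f"E_C = {e_c*1e3:.0f} meV; kT(300 K) = {kt_300*1e3:.1f} meV; "
      f"kT(4.2 K) = {kt_4*1e3:.2f} meV")
```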
8.3
Realizations of Molecular Devices
In the preceding section, the coupling of the molecule to the electrodes was described by the coupling strengths ΓS and ΓD. However, this theoretical description of contact between molecule and electrodes hides the complexity and difficulties that must be overcome in order to contact a single molecule. The main strategy followed to contact single molecules is to use specifically designed molecular anchoring groups that bind and self-organize on the contacts. In the following section, some key examples are provided of anchoring groups and self-organization strategies.
8.3.1
Molecular Contacts
Molecular end groups must provide a chemical bond to the contacting metal; that is, they must offer a self-organizing functionality. Furthermore, the nature of the contact determines the coupling strength Γ and, therefore, how strongly the molecular states couple with the electronic states of the electrodes. In the case of strong coupling, electrons can be easily transmitted across the molecule and the resistance of the molecule should be low; conversely, new effects such as Coulomb blockade can occur for weak coupling, and this may be exploited for new devices. Thus, the suitable choice of molecular contact is one of the main issues in the design of a molecular device.
Various molecular contacts have been proposed. Besides the most common gold–sulfur bond, sulfur also binds to other metals such as silver [9] or palladium [10]. Sulfur may be replaced by selenium [11], which yields higher electronic coupling. A further increase in coupling strength is provided by dithiocarbamates [12, 13], which is explained by resonant coupling of the binding group to gold. Other binding groups include CN [14], silanes [15], and molecules bound directly to either carbon [16] or silicon [17].
Using these binding groups, it is possible to contact single molecules, or at least a low number of molecules. In the past, several experimental set-ups have been developed which differ in the number of molecules contacted. Whereas some set-ups allow the contacting of single molecules (i.e. the method of mechanically controlled break junctions, nanogaps or scanning probe methods), in other arrangements the demand for a single molecule is relaxed and a small number of molecules is contacted (e.g. in the crossed-wire set-up or in a crossbar structure). One further distinction between these set-ups is the number of electrodes that can contact each molecule; that is, whether, besides the source and drain, a gate electrode is also present.
1) More details on single-electron effects can be
found in Chapter 2 of Volume III in this series.
8.3.2
Mechanically Controlled Break Junctions
The concept of mechanically controlled break junctions dates back to 1985, when the method was used to obtain superconducting tunnel junctions [18]. In 1997, it was applied to contact a single molecule between two gold electrodes [19]. By comparing the current versus voltage characteristics of symmetric and asymmetric molecules, it was shown that only single molecules are contacted [20]. Since then, a variety of different molecules have been studied using this technique, and in particular the use of a low-temperature set-up was seen to provide a significant improvement in data quality [21–29].
The general set-up of a mechanically controlled break junction is shown in Figure 8.3. A metallic wire, which is thinned in the middle, is glued onto a flexible
Figure 8.3 The mechanically controlled break junction. (From Ref. [30].)
substrate. Often, the wire is under-etched so that a freestanding bridge is formed. Underneath the substrate, a piezo element can press the sample against two countersupports, which causes the substrate to bend upwards such that a strain is induced in the wire. If the strain becomes too large, the wire breaks and a small tunneling gap opens between the two parts of the wire. The length of the tunneling gap can be precisely controlled by the position of the piezo element.
To contact a single molecule, either a solution of the molecule is applied to the broken wire, or the molecules have already been preassembled onto the wire before it is broken. As described above, these molecules have chemical binding groups at both ends that easily bind to the material of the wire. As the molecule has binding groups at both ends, it can bridge the tunneling gap if the length of the latter is properly adjusted. In this way a single-molecule device is formed.
Mechanically controlled break junctions represent a stable and reliable method for contacting single molecules. Most importantly, the correlation between molecular structure and current versus voltage characteristics can be studied, which will stimulate the understanding of conduction through single molecules. At present, however, there is no way of integrating these devices; that is, it is not possible to contact a larger number of molecules in parallel and to combine these molecules into a logic device.
8.3.3
Scanning Probe Set-Ups
Due to their high spatial resolution, scanning probe methods (SPM) are well suited to contacting single molecules, and several strategies have evolved during recent years.
One strategy is to contact a so-called self-assembled monolayer (SAM) of the molecule of interest with an atomic force microscope, using a conductive tip [31–34] (see Figure 8.4b). SAMs are formed by immersing a metallic bottom electrode into a solution of molecules (see Figure 8.4a), which must possess an end group that covalently binds to the metal layer. In the first layer, the molecules attach covalently to the metal; all following layers are then physisorbed onto this first chemisorbed layer. The physisorbed layers can easily be washed off using an additional rinsing step, such that only the first, chemisorbed, layer remains on the metal. A famous example of such a molecular end group/metal surface combination is sulfur on gold. Thiolates, and especially alkanethiols, are known to organize perfectly on a gold surface and to build a SAM that covers the gold electrode [35].2)
These SAMs can be contacted by a conductive atomic force microscopy (AFM) tip (see Figure 8.4b). Depending on the tip geometry, a low number of molecules (ca. 75) can be contacted [31]. Using this method, it has been shown that the current through alkanethiolates and oligophenylene thiolates decreases exponentially with the length of the molecule, and that the resistance of a molecule depends on the metal used to contact the molecule [36].
2) See also Chapter 9, which provides a broader introduction to self-organization and SAMs.
the filament will break and a small tunneling gap is opened between the substrate and the tip. The whole set-up is immersed in a solution of molecules that have functional binding groups at both ends. If the tunneling gap is approximately the size of the molecule, there is a probability that one molecule will bind to the tip and the substrate and will therefore bridge the gap.
As a third strategy for contacting single molecules, the molecules of interest can be embedded into a SAM of insulating alkanethiols (see Figure 8.4d–f). In this way it is possible to obtain single, isolated molecules which protrude from the surrounding alkanethiol SAM (see Figure 8.4f). The conductance of the molecule can be measured either by placing the STM tip above the molecule [40, 41] or by measuring the height difference between the embedded molecule and the surrounding alkanethiol SAM [37, 42, 43]. This height difference depends not only on the difference in length of the alkanethiol and the molecule, but also reflects differences in the conductivities of the molecules. Therefore, the conductivity of the embedded molecule can be calculated from the height difference.
8.3.4
Crossed Wire Set-Up
This set-up consists of two crossed wires, which almost touch at their crossing point. One of these wires is modified with a SAM of the molecule of interest. A magnetic field is applied perpendicular to one wire, and a dc current is passed through this wire. This causes the wire to be deflected due to the Lorentz force, and consequently the separation of the two wires can be adjusted by setting the dc current [44–47].
It has been shown that the number of contacted molecules depends on the wire separation, and that the current versus voltage characteristics measured at different separations are all integer multiples of a fundamental characteristic. Thus, it is proposed that this fundamental curve represents the characteristic of only a single molecule [45].
These measurements can also be carried out at cryogenic temperatures [46]. In this case, the vibronic states of the molecule can be identified, which provide a molecular fingerprint and prove that the molecule is actively involved in the conduction process (see also Section 8.4.6) [46].
8.3.5
Nanogaps
Similar to the break junction method, in the nanogap set-up a small gap is formed in a thin metal wire. However, this gap is not formed by bending a flexible substrate and mechanically breaking the wire; rather, it is prepared on a rigid substrate by using various methods.
One such method is electromigration. The preparation of the nanogap starts with the definition of a thin metallic wire on an insulating substrate. A SAM is then deposited on top of this wire by immersing the sample into a molecular solution. Subsequently, a voltage ramp is applied to the wire. If the current that flows through the wire becomes too large, the wire breaks due to electromigration, which is reflected by a drop in current. Such gaps are approximately 1 nm in width [48] and, with a certain degree of probability, are bridged by a molecule that was deposited onto the wire in advance; thus, a single-molecule device has been built.
Alternatively, nanogaps can be formed by preparing an electrode pair with a gap of about 30 nm [49, 50] using electron-beam lithography. This gap can be shrunk to molecular dimensions by electrochemically depositing metal atoms [51, 52]. Combined with electrochemical etching, this method allows precise control over the wire separation. A combination of lithography and low-temperature evaporation has also been used to fabricate 1- to 2-nm gaps directly on a gate oxide [53].
Although all of these techniques define the gaps laterally, the precise control over the vertical thickness of thin films can be exploited to define a vertical nanogap [54]. The preparation of these gaps starts with the deposition of a thin SiO2 layer on top of a highly p-doped silicon bottom electrode (see Figure 8.5). A gold electrode is then deposited on top of the SiO2 layer, and subsequently the oxide can be etched in hydrofluoric acid, thus yielding a thin gap between the Si bottom and Au top electrodes.
A rich variety of molecules has been measured using nanogaps, including a coordination complex containing a Co atom [55], a divanadium molecule [56], C60 [57, 58], C140 [59], and phenylenevinylene oligomers [53, 60].
8.3.6
Crossbar Structure
The major technological problem in the crossbar set-up lies in the deposition of the top electrode. The metal/molecule interface may be unstable, and metal ions can migrate through the molecular layer [61–63], thus shorting the device. The probability of metal ions penetrating the molecular layer depends on the molecular top group. A group which binds the metal at the top is more resistant, and many metal/molecular end group combinations have been examined, including Al on CO2H [64], OH and OCH3 [65], Cu, Au, Ag, Ca and Ti on OCH3 [66, 67], or Au, Al and Ti on disulfides [68]. Ti has been shown to be critical for metallization, because it reacts strongly with the molecule and partially destroys the SAM [69].
An alternative to these molecular end groups is to use aromatic end groups and to crosslink them with electron irradiation [70, 71]. This method yields stable Ni films on top of a molecular layer. Similarly, the molecular layer can be protected by a spun-on film of a highly conducting polymer (e.g. PEDOT) [72].
In most devices the top electrode is deposited by evaporation techniques, so that the metal atoms arrive at the molecular layer with a high energy, and the probability of the atoms punching through the layer is high. Attempts have been made to reduce the energy of the atoms by indirect evaporation and cooling of the substrate [73], or by so-called printing methods in which the metal film is gently deposited on the molecular layer from a polymeric stamp [74, 75].
8.3.7
Three-Terminal Devices
The incorporation of a third electrode (the gate) opens up new possibilities. First, the molecular levels can be shifted upwards and downwards by the gate relative to the levels of the electrodes, which can be used to analyze the electronic structure of the molecule. The gate is also necessary for building a molecular transistor. As will be shown later, these transistors can be used to build logic circuits.
The basic working principle of a molecular single-electron transistor is illustrated in Figure 8.6. Without a gate voltage applied, no molecular states lie in the energy window between the source and drain potentials, and thus the current is blocked. However, the molecular level can be moved down into the energy window if the gate voltage is increased. Therefore, by applying a gate voltage it is possible to switch the transistor on.
In order to understand how the gate can shift the molecular levels, the capacitive network shown in Figure 8.2a should be considered. The voltages between the source and the molecule, V_SM, and between the molecule and the drain, V_MD, are related to the source voltage V_S and the gate voltage V_G as follows:

$$ V_{SM} = \frac{C_D + C_G}{C_\Sigma} V_S - \frac{C_G}{C_\Sigma} V_G \qquad (8.4) $$

$$ V_{MD} = \frac{C_S}{C_\Sigma} V_S + \frac{C_G}{C_\Sigma} V_G \qquad (8.5) $$

with $C_\Sigma = C_S + C_D + C_G$.
Therefore, the molecular level shifts up or down relative to the source and drain energies. The amount of the shift is proportional to the term $(C_G/C_\Sigma)\,V_G$. In order to obtain good gate control, $C_G/C_\Sigma$ (the gate-coupling parameter) must be large and, ideally, close to unity; hence, the gate must be placed very close to the molecule.
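A small numerical sketch of the reconstructed Eqs. (8.4) and (8.5), showing how the gate-coupling parameter C_G/C_Σ controls the level shift per volt of gate voltage; the capacitance values are illustrative assumptions.

```python
# Electrostatics of the capacitive network of Figure 8.2a, following the
# reconstructed Eqs. (8.4) and (8.5). The attofarad capacitances are
# illustrative assumptions, chosen only to contrast weak and strong gating.

def junction_voltages(v_s, v_g, c_s, c_d, c_g):
    """Return (V_SM, V_MD) for source voltage v_s and gate voltage v_g."""
    c_sigma = c_s + c_d + c_g
    v_sm = (c_d + c_g) / c_sigma * v_s - c_g / c_sigma * v_g
    v_md = c_s / c_sigma * v_s + c_g / c_sigma * v_g
    return v_sm, v_md

for c_g in (0.01, 0.1, 1.0):                  # aF; larger = gate closer
    alpha = c_g / (0.1 + 0.1 + c_g)           # gate-coupling parameter
    _, v_md = junction_voltages(v_s=0.0, v_g=1.0, c_s=0.1, c_d=0.1, c_g=c_g)
    print(f"C_G = {c_g:5.2f} aF: alpha = {alpha:.2f}, "
          f"level shift per volt of gate = {v_md:.2f} V")
```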
One elegant method of obtaining high gate control is to use an electrochemical gate (cf. Figure 8.6b). The molecular device (e.g. the nanogap or the STM set-up) is immersed in an electrolyte, and the source and drain voltages are varied relative to a reference electrode which is also immersed in the solution [38, 76, 78] and takes on the function of the gate. The effective gate distance is given by the thickness of the double layer of ions at the electrodes [38], which allows the application of high electric fields. Several molecules, including perylene tetracarboxylic diimide [76], a molecule containing a viologen group [79], oligo(phenylene ethynylene)s [80] and different transition metal complexes [77], have been studied using this type of gate.
Champagne et al. succeeded in including a gate in a break junction set-up [81], which consists of a freestanding, under-etched gold bridge deposited onto a silicon wafer. Underneath the bridge, the silicon is degenerately doped and serves as the gate electrode. The bridge is broken by the electromigration technique, and the size of the so-formed gap is adjusted by bending the silicon substrate. A C60 molecule is immobilized in this gap, so that a molecular transistor with a gate–molecule spacing of about 40 nm is realized.
The most straightforward method of including a gate is provided by the nanogap set-up. Here, the source and drain are formed on an insulating substrate; however, the insulating layer (e.g. SiO2 or Al2O3) can be very thin, and the underlying (conductive) substrate may be used as the gate [55–58, 82] (cf. Figure 8.6c). Compared to an electrochemical gate, this set-up has the advantage that the measurements can be conducted at cryogenic temperatures, which makes the observation of Coulomb blockade effects easier and also allows the use of inelastic electron tunneling spectroscopy (see Section 8.4.6) to study the molecules.
8.3.8
Nanogaps Prepared by Chemical Bottom-Up Methods
Several strategies for nanogap preparation have been based on the pioneering studies of Brust et al. [83, 84] on the chemical preparation of metal nanoparticles protected by a ligand shell. Two-dimensional arrays of such particles constitute a test bed that may be used to interconnect metal particles separated by a few nanometers by various organic molecules (see Figure 8.7) [85, 86].
By mixing hydrophobic nanoparticles with surfactants, more one-dimensional structures may be formed in which molecules interconnect segments of gold nanowires separated by a few nanometers (see Figure 8.8a) [87, 88].
By using a single metal particle inserted in a metal gap prepared top-down, two well-defined nanogaps may also be realized at the gap–particle interface (cf. Figure 8.8b) [90, 91].
Although all of these systems are easily prepared and are stable at room temperature, as yet it has not been possible to control the gap formation accurately enough to prepare a single gap bridged by a single molecule. Neither have individual gates been reported.
8.3.9
Conclusion
through a shadow mask at shallow angles [53], the magnetic bead junction [90, 93], or the nanopore concept [94–97]. Each set-up has its own strengths and weaknesses: some allow the characterization of single molecules (e.g. break junction experiments and SPM set-ups), whereas others are interesting in terms of later applications (e.g. crossbar set-ups) or allow the inclusion of a gate electrode (nanogaps). It appears, however, that there is no ideal set-up, and often the intrinsic molecular behavior may be determined only by a combination of different experimental methods.
8.4
Molecular Functions
In this section, it is shown which molecules have been measured and which functionalities these molecules can provide. As the ultimate goal of molecular electronics is to provide a universal logic, each technology which aims to achieve this must fulfill several basic requirements [30].
The most basic requirement is that a complete set of Boolean operators can be built out of the molecular devices, such that every Boolean function can be obtained. One complete set of operators is, for example, a disjunction (OR) and an inversion (NOT) or, alternatively, a conjunction (AND) and an inversion. However, all complete sets have to include inverting gates; that is, an inversion.
Disjunction and conjunction can be built relatively easily out of a resistor and a diode (for a description, see Section 8.4.2.4), and so it is vital to identify molecules that conduct current in one direction only.
Similarly, complete fields of disjunctions and conjunctions can be implemented in crossbar structures in the form of so-called programmable logic arrays (PLAs) (see Section 8.5.1). At the crossing points of these PLAs, molecules that can be switched on or off are needed; that is, the molecules must possess two conduction states, one isolating and one conducting. Molecules which demonstrate this behaviour include hysteretic switches, examples of which are described in Section 8.4.4.
These set-ups do not provide inverting logic, as negation is still missing. One set-up which provides inversion is the molecular single-electron transistor (see Section 8.4.5). Even with only two-terminal devices it is possible to construct an inversion using a so-called crossbar latch (see Section 8.4.4.1). Again, hysteretic switches are required for this. Inverting logic (e.g. an exclusive disjunction, XOR)3) can also be built from a variant of a simple diode which displays a negative differential resistance (NDR) region; that is, a region in which the current drops with increasing voltage. These gates are described in Section 8.4.3.1.
A second requirement for the implementation of logic is that the gates must provide a means of signal restoration. At each stage of the circuit the signal voltage, which represents a logical 0 or 1, will be degraded. To be able to concatenate several logic gates, a means of restoring the original levels must be found; this requirement can be relaxed only for small circuits, as long as the degradation of the signal voltage is tolerable.
In conventional CMOS logic, such restoration is provided by the non-linear transfer characteristic of the gates [30]. This strategy can also be followed with molecular single-electron transistors. Similarly, signal restoration can also be obtained with two-terminal devices, using hysteretic switches in the form of the crossbar latch or using NDR diodes in the form of the molecular latch, as proposed by Goldstein et al. [98].
Another requirement for the technology is that there must be elements that transmit signals across longer distances; that is, a type of molecular wire (see Section 8.4.1) must be available.
3) The XOR gate can be converted into a negation if one input is fixed to 1.
8.4.1
Molecular Wires
The most basic electronic device is a simple wire. However, in molecular electronics it is less easy than might be thought to construct a suitable molecule which transmits current with low resistance across longer distances. So, what makes a molecule a good conductor? Starting from the short theoretical introduction provided in Section 8.2, certain conclusions can be drawn regarding the properties of an ideal molecular wire.
First, in order to obtain a low resistance, a strong electronic coupling of the molecule and the electrodes is preferable. As discussed in Section 8.3.1, such a coupling can be obtained by choosing a suitable molecular binding group, for example a group that provides resonant coupling to the electrodes, such as dithiocarbamates.
Given this choice of molecular binding group, the limit of strong coupling is valid and electrons are transmitted across the molecular levels. However, a prerequisite for such transmission is that this molecular level, extending from source to drain, actually exists. Extended molecular levels can be formed by delocalized π-systems, where, in the tight-binding approximation, the p_z orbitals of isolated carbon atoms add and form an extended, delocalized orbital. Therefore, aromatic groups (e.g. polyphenylene) are often used as the building blocks for molecular wires.
Another important property of a molecular wire is its ability to conduct current at a low bias, and therefore the molecular level used for transport should be close to the Fermi level of the contacts. Often, this requirement is expressed in terms of a low HOMO–LUMO gap.
Many molecules have been proposed as molecular wires, including polyene, polythiophene, polyphenylenevinylene and polyphenylene-ethynylene [99] (see Figure 8.9), oligomeric linear porphyrin arrays [100], and carbon nanotubes (CNTs) [[101] and references therein]. CNTs constitute a special category among molecular wire candidates, as they may be either metallic or semiconducting, depending on their chirality. However, it is difficult to selectively prepare or isolate only one type of CNT, namely the metallic form. Furthermore, it is still challenging to organize and orient CNTs, although some techniques are available to arrange CNTs in a crossbar structure (e.g. see [102]). A more detailed discussion of CNTs is provided in Chapter 10.
8.4.2
Molecular Diodes
Molecular diodes are the next step towards higher complexity. Indeed, when combined with resistors, diodes are already sufficient to build AND and OR gates. The first molecular electronic device to be proposed, by Aviram and Ratner, was just such a molecular diode [103], and consisted of a donor and an acceptor group
Figure 8.9 Building blocks for molecular wires: (a) polyene; (b) polythiophene; (c) polyphenylenevinylene; and (d) polyphenylene-ethynylene.
separated by a tunneling barrier. This set-up is often compared to the p- and n-layers in a conventional diode. As an alternative to the Aviram–Ratner approach, a molecular diode can also be formed by asymmetric tunnel barriers at the source and drain electrodes [104]. This concept is based on the different electrostatic couplings of the electrodes. However, as will be seen later, it is very difficult to couple the molecule symmetrically to both electrodes, which in turn makes it difficult to distinguish between rectification due to the Aviram–Ratner mechanism and rectification due to asymmetric coupling.
8.4.2.1 The Aviram–Ratner Concept
The Aviram–Ratner concept is illustrated schematically in Figure 8.10. A molecule, which consists of an acceptor and a donor group, is connected to the source and drain. The acceptor and donor are isolated by a tunneling barrier, which ensures that the
Figure 8.10 The diode proposed by Aviram and Ratner, and the suggested rectifying mechanism. For further details, refer to Section 8.4.2.1.
molecular levels of the two parts do not couple. The HOMO of the donor lies close to the Fermi level, in contrast to the acceptor, where the LUMO is adjacent to the Fermi level.
If a negative voltage VS is applied to the diode (see Figure 8.2a for polarity), the potential of the source is raised with respect to the drain. Electrons can flow relatively easily from the source, across the acceptor and donor, towards the drain. However, at the opposite polarity a much higher voltage is needed to allow electrons to flow from drain to source. Thus, the molecule is considered to rectify the current.
8.4.2.2 Rectification Due to Asymmetric Tunneling Barriers
In contrast to the Aviram–Ratner mechanism, rectification due to asymmetric tunneling barriers is based on a difference in the source and drain capacitances. This difference can be obtained by attaching two insulating alkane chains to a conjugated part (see Figure 8.11). The alkane chains are functionalized with an end group which provides binding functionality (e.g. sulfur for gold electrodes). The capacitance between the conjugated part of the molecule and the electrode is inversely proportional to the length of the alkane chain; varying these lengths is therefore a suitable way of adjusting the source and drain capacitances.
The rectifying mechanism can be explained by the energy diagram shown in Figure 8.11. The HOMO and LUMO levels of the conjugated part of the molecule are included in the figure. These energy levels correspond to the molecular levels of the (unbound) molecule plus the charging energy, as explained in Section 8.2. As can be seen in Figure 8.11, the levels lie asymmetrically with respect to the Fermi level of the electrodes.
Current can only flow when the electrochemical potential of the source or the drain aligns with, or even exceeds, the LUMO; that is, when $eV_{SM} \geq \Delta$ for electrons flowing from source to drain, or $eV_{MD} \geq \Delta$ for the reverse bias. Here, Δ is the difference between the Fermi level of source and drain at zero bias and the LUMO.
V_SM and V_MD are given by Eqs. (8.4) and (8.5) (with the gate capacitance set to zero). It follows for the voltage $V_{D \to S}$, at which electrons start to flow from drain to source:

$$ V_{D \to S} = \frac{1 + C_S/C_D}{C_S/C_D}\,\frac{\Delta}{e} = \frac{1 + \eta}{\eta}\,\frac{\Delta}{e}, \qquad \eta \equiv \frac{C_S}{C_D} \qquad (8.6) $$

and, correspondingly, for the onset of electron flow from source to drain:

$$ V_{S \to D} = (1 + \eta)\,\frac{\Delta}{e} \qquad (8.7) $$
If the source capacitance is smaller than the drain capacitance (η < 1), $|V_{S \to D}|$ is smaller than $|V_{D \to S}|$, and electrons can flow from source to drain at a lower absolute bias than in the opposite direction; that is, the molecule shows rectification behavior.
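A quick numerical check of Eqs. (8.6) and (8.7) as reconstructed above: for η < 1 the two onset voltages separate, and their ratio is a simple measure of the rectification. The values of Δ and η below are illustrative assumptions.

```python
# Onset voltages of the asymmetric-barrier diode, Eqs. (8.6) and (8.7):
# V(S->D) = (1 + eta) * Delta/e and V(D->S) = (1 + eta)/eta * Delta/e,
# with eta = C_S / C_D. Delta (level offset) and eta are illustrative.

def onset_voltages(delta_ev: float, eta: float):
    """Return (V_SD, V_DS) in volts for a level offset delta_ev (in eV)."""
    v_sd = (1.0 + eta) * delta_ev
    v_ds = (1.0 + eta) / eta * delta_ev
    return v_sd, v_ds

for eta in (1.0, 0.5, 0.1):
    v_sd, v_ds = onset_voltages(delta_ev=0.5, eta=eta)
    print(f"eta = {eta:4.1f}: V(S->D) = {v_sd:.2f} V, "
          f"V(D->S) = {v_ds:.2f} V, ratio = {v_ds/v_sd:.1f}")
# eta = 1 gives no rectification; the smaller eta, the stronger the asymmetry.
```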
8.4.2.3 Examples
Starting from the molecule proposed by Aviram and Ratner (see Figure 8.10) [105], many other molecules containing acceptor and donor groups have been proposed [106]. One of the most extensively studied is γ-hexadecylquinolinium tricyanoquinodimethanide (C16H33 Q-3CNQ; cf. Figure 8.12) [107–110]. Although this molecule consists of a donor and an acceptor group, it deviates from the normal Aviram–Ratner diode in that the two parts are not coupled by an insulating σ-group but rather by a delocalized π-group, which makes an analysis of the rectification behavior more difficult [109]. Furthermore, due to the alkane chain at one side of the molecule, it is often coupled asymmetrically to the electrodes. Thus, it is difficult to distinguish between the Aviram–Ratner mechanism and rectification due to asymmetric tunneling barriers. To circumvent this problem, decanethiol-coated gold STM tips were used as a second contact to a SAM of C10H21 Q-3CNQ, which places the same length of alkane groups at both ends of the donor and acceptor groups [111].
8.4.2.4 Diode–Diode Logic
Even with these rather simple diodes it is possible to build AND and OR gates in the form of so-called diode–diode logic [112]. In the AND gate (see Figure 8.13a), both
Figure 8.12 The molecular diode C16H33 Q-3CNQ in its neutral state.
inputs, A and B, are connected via reversely biased diodes and a resistor to the operating voltage. Only if both inputs are high (i.e. "1") is the output C high, and this results in an AND function.
The OR gate is shown in Figure 8.13b. In contrast to the AND gate, one high input is already sufficient to pull the output to a high voltage, and thus this gate implements a disjunction.
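The two gates can be reproduced with an idealized diode model (a constant 0.7-V forward drop); this logic-level sketch is illustrative, not a circuit-accurate simulation. It also makes visible the signal degradation (0 V becomes 0.7 V, 5 V becomes 4.3 V) that motivates the signal-restoration requirement discussed above.

```python
# Idealized diode-diode logic (cf. Figure 8.13), with a constant 0.7-V diode
# drop. Note how each stage degrades the levels (0 V -> 0.7 V, 5 V -> 4.3 V):
# exactly the kind of degradation that makes signal restoration necessary.

def diode_and(v_a: float, v_b: float, v_cc: float = 5.0) -> float:
    # Output pulled to V_CC through a resistor; a diode to any low input
    # clamps the output down to (input + 0.7 V).
    return min(v_cc, v_a + 0.7, v_b + 0.7)

def diode_or(v_a: float, v_b: float) -> float:
    # Output pulled to ground through a resistor; a diode from any high
    # input lifts the output up to (input - 0.7 V).
    return max(0.0, v_a - 0.7, v_b - 0.7)

for v_a in (0.0, 5.0):
    for v_b in (0.0, 5.0):
        print(f"A={v_a:.1f} V, B={v_b:.1f} V -> "
              f"AND={diode_and(v_a, v_b):.1f} V, OR={diode_or(v_a, v_b):.1f} V")
```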
8.4.3
Negative Differential Resistance Diodes
Diode–diode logic yields AND and OR gates; in order to obtain a complete set of Boolean operators, these gates can be combined with diodes that show a negative differential resistance (NDR). By using these modified gates it is possible to obtain inversion.
The NDR effect is illustrated schematically in Figure 8.14a. The current of the diode rises with increasing voltage up to a certain threshold voltage, above which the current drops. This rather odd behavior is already known from conventional semiconductors in the form of resonant tunneling diodes.
The concept of resonant tunneling can also be used for molecular NDR devices, as shown in Figure 8.14b and c [113]. The molecule consists of two conjugated molecular leads and an isolated benzene ring in the middle. In the absence of any inelastic processes, current can only flow if the molecular levels of the left lead, of the isolated benzene ring, and of the right lead align. Such alignment occurs only at certain voltages, at which the levels are said to be in resonance. Such a resonance is illustrated graphically in Figure 8.14c. If the voltage is detuned from this resonant value, for example if it is further increased, then the current will drop and the device will show an NDR effect.
Another molecule, which shows a prominent NDR effect, is shown in Figure 8.15b [4, 94]. It exhibits peak-to-valley ratios as high as 1030:1, although the exact nature of the NDR effect observed in this molecule is currently a matter of intense research [114].
Figure 8.14 (a) The NDR effect. (b, c) The concept of a resonant tunneling diode consisting of a single molecule.
For the XOR gate of Figure 8.15a, the voltage V_R dropping across the resistor R follows from the load-line relations:

$$ V_R = (V_{tot} - V_{NDR})\,\frac{R}{R + R_0} \quad \text{if either A or B is high} \qquad (8.8) $$

$$ V_R = (V_{tot} - V_{NDR})\,\frac{R}{R + R_0/2} \quad \text{if both A and B are high} \qquad (8.9) $$
Figure 8.15 (a) XOR gate using NDR diodes. (b) A well-known molecule that shows NDR.
These two characteristics, called load lines, are included in Figure 8.14a. The crossing points of the NDR characteristic with these load lines are the operating points for only one input high (point A), or for both inputs high (point B). It transpires that, if both inputs are high, the NDR diode is forced into its valley region, and therefore only a low current flows and only a low voltage drops across R. Thereby, the output signal C goes low. From this XOR gate it is easy to obtain an inversion; the only change to be made is to fix one input (e.g. A) to "1". The output C is then simply the negation of B.
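The operating-point construction can be reproduced numerically by intersecting a piecewise-linear NDR characteristic with the load lines of Eqs. (8.8) and (8.9). All component values in the sketch below are illustrative assumptions, chosen only so that the both-inputs-high case lands in the valley region.

```python
import numpy as np

# Operating-point analysis of the NDR-based XOR gate: intersect a piecewise-
# linear NDR characteristic with the load line I = (V_tot - V_NDR)/R_series,
# where R_series = R + R0 (one input high) or R + R0/2 (both inputs high),
# per Eqs. (8.8)/(8.9). All values are illustrative assumptions.

def i_ndr(v):
    """Piecewise-linear NDR current (A): peak 10 uA at 0.5 V, then a valley."""
    return np.where(v <= 0.5, 20e-6 * v,
           np.where(v <= 1.0, 10e-6 - 18e-6 * (v - 0.5),
                    1e-6 + 0.5e-6 * (v - 1.0)))

def operating_point(r_series, v_tot=2.0):
    """First intersection reached as the supply ramps up from zero."""
    v = np.linspace(0.0, v_tot, 20001)
    diff = i_ndr(v) - (v_tot - v) / r_series
    idx = int(np.argmax(diff >= 0))          # first point on/above load line
    return v[idx], (v_tot - v[idx]) / r_series

R, R0 = 50e3, 160e3                           # ohms, illustrative
for label, r_s in (("one input high  ", R + R0),
                   ("both inputs high", R + R0 / 2)):
    v_op, i_op = operating_point(r_s)
    print(f"{label}: V_NDR = {v_op:.2f} V, I = {i_op*1e6:.2f} uA, "
          f"V_R = {i_op*R:.2f} V")
# One input high -> ~0.38 V output (logic 1); both high -> the diode is
# forced into its valley and the output collapses to ~0.07 V (logic 0).
```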
8.4.4
Hysteretic Switches
As noted above, hysteretic switches can be used to build PLAs and to provide signal restoration and inversion. The general current versus voltage characteristic of a hysteretic switch is shown in Figure 8.16a.
A hysteretic switch displays two conduction states: one insulating, and one conducting. It is possible to toggle between the two by applying a voltage which exceeds a certain threshold value. For example, in Figure 8.16 a positive voltage is needed to switch the device on (i.e. from the insulating to the conducting state), and a negative voltage to switch it off.
Such bistability can be obtained when the molecule possesses two different states that are almost equal in energy, that are separated by an energy barrier, and that show different conduction behaviors. Different origins of these two states are conceivable [99]; for example, they may result from redox processes, from a change in configuration of the molecule, a change in conformation, or a change in electronic excitation.
The molecule shown in Figure 8.16b is an example of a molecule which has been proposed, in theory, to show hysteretic switching [115]. It consists of a fixed molecular backbone (the stator) and a side group with a high dipole moment (the rotor). On the application of an electric field, the rotor orients its dipole moment in the direction of the field. Bistability is obtained by the formation of hydrogen bonds between the stator and rotor that fix the latter in one of two stable positions relative to the stator. The two conduction states are due to the different conformations of the molecule (stabilized by hydrogen bonds). Switching is initiated by the interaction of the dipole moment of the rotor with the electric field.
Bipyridyl-dinitro oligophenylene-ethynylene dithiols (BPDN-DT) are other examples of switching molecules. These are a variation of the molecule shown in Figure 8.15b, and their bistability has recently been confirmed by using various measurement techniques [28, 93].
Rotaxanes and catenanes are, even if controversially discussed, additional candidates for molecular switches. These molecules consist of two interlocked rings such that, by reducing or oxidizing the molecule, one ring rotates within the other. Two stable, neutral states which differ in the position of the inner ring are thus realized. The switching is therefore initiated by a redox process. The two different states are provided by the different conformations of the molecule, similar to the molecule shown in Figure 8.16b.
Although the preliminary results showing a switching effect of this molecule were questioned (see Section 8.4.6), recent results have confirmed the originally proposed switching mechanism and indicate that, in the case of the earlier results, this mechanism was occasionally hidden by artifacts [75, 116, 117].
8.4.4.1 The Crossbar Latch: Signal Restoration and Inversion
As noted in Section 8.4.2.4, AND and OR gates can be built using simple diodes. However, for a complete set of Boolean operators, negation is also required. Signal inversion can be obtained by NDR diodes (cf. Section 8.4.3.1) or, alternatively, by the so-called crossbar latch, which also provides a means of signal restoration [118].
The crossbar latch (see Figure 8.17a) consists of two hysteretic switches which are connected to the signal line (at voltage VL) and one control line each (at voltage VCA or VCB). The two switches are oppositely oriented.
The idealized current versus voltage characteristic of a hysteretic switch shown in Figure 8.16 is assumed. By application of a positive voltage, the switch opens; an opposite voltage is then needed to close the switch. These voltages are always those
that are applied across the molecules, depicted in Figure 8.17a as VSA and VSB. However, in the circuit shown in Figure 8.17a, only the voltages of the control lines VCA and VCB are set. Therefore, the voltage that is applied across each molecule depends on the voltage on the signal line, VL:

$$ V_{SA} = V_{CA} - V_L \qquad (8.10) $$

$$ V_{SB} = V_L - V_{CB} \qquad (8.11) $$

(note the opposite orientation of switch B, which reverses the sign of the voltage it experiences).
The voltage on the signal line in Figure 8.17 represents the logical state. Voltage intervals are defined that represent a logical "1" or "0". In general, the signal is degraded, so that the signal level will be at the lower end of these intervals.
To achieve signal restoration, that is, to pull the signal level up to the upper end of the defined interval, the following procedure can be followed:

• First, a large positive voltage is applied to control line A, and a large negative voltage to control line B. This voltage is large enough to always open switches A and B, regardless of the voltage level of the signal line. This is shown in Figure 8.17b (unconditional open).

• Second, a small negative voltage is applied to control line A, and a small positive voltage to control line B. These voltages are so small that they close switch A only if the signal line carries a "1", and they close switch B only if there is a "0" on the signal line (see conditional close in Figure 8.17b).

Therefore, depending on the voltage on the signal line, switch A or B is closed and the opposite switch is open. To yield signal restoration, VCA is connected to a full "1" signal and VCB to "0". Inversion can also be obtained; the only modification is that a logical "1" must be connected to VCB and "0" to VCA (the sketch below walks through both cases).
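The open/close sequence can be followed in a logic-level simulation. The sketch below encodes the two switch orientations and the two-step procedure; all voltage levels and thresholds are illustrative assumptions, not values from Ref. [118].

```python
# Logic-level sketch of the crossbar latch (Figure 8.17): two oppositely
# oriented hysteretic switches connect the signal line to two control lines.
# A positive voltage across a molecule opens its switch (non-conducting), a
# negative voltage closes it (conducting). The +/-2 thresholds and all
# control voltages are illustrative assumptions.

V_TH = 2.0   # |voltage| across the molecule needed to open or close it

class HystereticSwitch:
    def __init__(self, sign: int):
        self.sign = sign          # +1: switch A, -1: switch B (orientation)
        self.closed = False
    def apply(self, v_line: float, v_ctrl: float) -> None:
        v_mol = self.sign * (v_ctrl - v_line)   # cf. Eqs. (8.10)/(8.11)
        if v_mol >= V_TH:
            self.closed = False                  # opened
        elif v_mol <= -V_TH:
            self.closed = True                   # closed

def latch(v_sig: float, invert: bool = False) -> float:
    a, b = HystereticSwitch(+1), HystereticSwitch(-1)
    a.apply(v_sig, +5.0); b.apply(v_sig, -5.0)   # unconditional open
    a.apply(v_sig, -1.5); b.apply(v_sig, +2.5)   # conditional close
    v_ca, v_cb = (0.0, 1.0) if invert else (1.0, 0.0)  # full logic levels
    if a.closed:                  # the closed switch drives the signal line
        return v_ca
    if b.closed:
        return v_cb
    return v_sig

for v in (0.1, 0.9):              # degraded "0" and degraded "1"
    print(f"in = {v:.1f} -> restored = {latch(v):.1f}, "
          f"inverted = {latch(v, invert=True):.1f}")
```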
This scheme shows that it is possible to build a complete set of Boolean operators with only two-terminal devices. Hysteretic switches therefore represent a very valuable element, which explains the high level of activity in this field of research.
8.4.5
Single-Molecule Single-Electron Transistors
In Section 8.3.7 it was shown how a third electrode, the gate, could be included in the device set-up. The most convenient method of preparing such three-terminal devices is a nanogap that is deposited onto a gate, which is isolated from the device by a thin insulator. A three-terminal device essentially forms a transistor. If the molecule is only weakly coupled to the source and drain, the transistor is termed a single-molecule single-electron transistor [119–121]. The basic working principles of such a transistor were described in Section 8.3.7, and a more extensive explanation is provided in Figure 8.18.
In Figure 8.18a, the Nth and (N+1)th states of the molecule are shown. These two levels correspond, for example, to the HOMO and LUMO in Figure 8.2b. In Figure 8.18a, electrons can hop or tunnel onto or off the molecule by the processes (a) to (d); that is, from the source or drain onto the molecule (processes a and b), or from the molecule to the source or drain (processes c and d).
Electrons can only flow from the source or the drain onto the molecule if the potential of the electrode aligns with (or exceeds) the molecular level. Below these potentials, no electrons can flow; rather, the current is blocked, which is known as Coulomb blockade. This behavior is often visualized in a plot as in Figure 8.18b, in which the source voltage VS is plotted against the gate voltage VG.
In Figure 8.18b, four lines build a diamond, the color of these lines corresponding to processes (a) to (d) in Figure 8.18a. In the interior of the diamond, the source and gate voltages are too low to overcome the injection barriers, and the current is blocked. By increasing the source and gate voltages, the working point of the transistor can be moved outside the diamond. As the working point crosses one line in the VS/VG plane, the current sets in (e.g. if it crosses the dark yellow line, electrons can hop from the drain onto the molecule).
Depending on the process, there are different barriers that must be surmounted by the electron. For electrons hopping from the source onto the (N+1)th level (process a), the voltage between source and molecule, VSM, must exceed the barrier Δ1 (see Figure 8.18 for a definition of Δ1); that is, eVSM ≥ Δ1.
VSM and VMD are governed by Eqs. (8.4) and (8.5), respectively. By combining Eqs. (8.4) and (8.5) with the conditions for current flow (e.g. eVSM ≥ Δ1 for process a), linear relationships are obtained between VS and VG that represent the equations of the four lines in Figure 8.18b.
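Following that recipe, the sketch below computes the diamond edge for process (a) explicitly: setting eV_SM = Δ1 in Eq. (8.4) and solving for V_S as a function of V_G yields one straight line; the other three edges follow analogously. The capacitances and Δ1 are illustrative assumptions.

```python
# One edge of the Coulomb diamond in the V_S/V_G plane: combine the onset
# condition for process (a), e*V_SM = Delta_1, with the reconstructed
# Eq. (8.4), V_SM = (C_D + C_G)/C_sigma * V_S - C_G/C_sigma * V_G, and
# solve for V_S. Capacitances (aF) and Delta_1 (eV) are illustrative.

C_S, C_D, C_G = 0.10, 0.10, 0.05     # attofarads (units cancel below)
C_SIGMA = C_S + C_D + C_G
DELTA_1 = 0.10                        # eV, so Delta_1/e = 0.10 V

def vs_edge_process_a(v_g: float) -> float:
    """Source voltage at which process (a) becomes allowed, vs. gate voltage."""
    return (C_SIGMA / (C_D + C_G)) * DELTA_1 + (C_G / (C_D + C_G)) * v_g

print(f"slope of this diamond edge: {C_G / (C_D + C_G):.3f}")
for v_g in (-1.0, 0.0, 1.0):
    print(f"V_G = {v_g:+.1f} V -> onset V_S = {vs_edge_process_a(v_g):.3f} V")
```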
Single-molecule single-electron transistors resemble conventional MOSFETs in one important respect: the resistance between source and drain can be controlled by the gate voltage. That is, these transistors represent electronic switches and can be used to build logical circuits. As an example, an inverter based on a single-electron transistor is shown in Figure 8.19. In fact, logic circuits consisting of conventional single-electron transistors have already been presented [121, 123]. Single-electron transistors consisting of single molecules have also been realized [53, 55, 57, 59, 60, 82, 122], and an inverter consisting of a multiwall carbon nanotube has also been built [124]. For further information on single-electron devices, the reader is referred to Chapter 2 in Volume I of this Handbook, and to Chapter 6 in the present volume.
8.4.6
Artifacts in Molecular Electronic Devices
8.5
Building Logical Circuits: Assembly of a Large Number of Molecular Devices
In the previous sections it has been shown how single (or at least a low number of) molecules can be contacted, and which functionalities these molecules can provide. Furthermore, it has been described how these single molecules can be combined into small logic gates, for example as AND, OR and XOR gates, or as the crossbar latch. In this section, the discussion proceeds one step further to determine how a large number of devices can be assembled. And what implications does the use of single molecules have for the architecture of future logic circuits? As already discussed (in Section 8.3), for single-molecule devices there is at present no method available to deterministically place a single molecule on a chip. Thus, reliance must be placed on statistical processes and the ability of the molecules to self-organize, for example in the form of SAMs.
This dependence on self-organization bears the first implication for the architecture of molecular devices. Self-organization will always result in very regular structures, which have only a low information content [133]. In comparison to current CMOS circuits, in which a huge number of transistors is connected statically to other transistors, and which therefore include a high information content, this missing information must be fed into the molecular circuit by an additional, post-fabrication training step. In other words, the technology and architecture have to be re-configurable.
A second implication of the self-organization process is given by the fact that these circuits will always contain defective parts, and the high yields necessary for CMOS architectures are not feasible. Again, the defective sites on the molecular chip must be identified and isolated in a post-fabrication step.
Several architectures have been proposed for future molecular circuits, and these differ in how strongly they rely on self-organization. PLAs based on crossbars, for example, use self-organization only for the preparation of the SAMs. The electrodes that contact the SAM are commonly defined by lithography (although techniques are available to prepare crossbars completely by self-organization, e.g. [102]). In contrast to PLAs, the NanoCell architecture (see Section 8.5.2) relies completely on self-organization.
8.5.1
Programmable Logic Arrays Based on Crossbars
In Section 8.4.2.4 it was described how simple AND and OR gates can be implemented using only diodes and resistors. In the following, it will be shown how large arrays of these gates can be implemented using crossbars, as described in Section 8.3.6.
The equivalent circuit of a crossbar is shown in Figure 8.21a and b. A SAM of rectifying diodes is contacted by orthogonal top and bottom electrodes. These arrays can easily be converted into AND and OR gates. For an AND circuit, the vertical electrodes must be connected to the high voltage, and the horizontal lines to the input variables (see Figure 8.21a). Similarly, for OR, the horizontal lines are connected to the low voltage and the vertical lines are the signal lines.
However, these circuits have one drawback, in that only a single disjunction or conjunction can be implemented. If, for example, a simple AND connection of two input variables is built, all other horizontal input lines must be set to the high voltage. Therefore, at the end of each vertical line, only the conjunction of A and B is computed.
This problem can be circumvented if some diodes can be switched off; that is, if some input lines can be isolated from the output lines. Thus, a combination of a hysteretic switch (as described in Section 8.4.4) and a diode, obtained for example by asymmetrically coupling a bistable molecule to the electrodes, would be highly beneficial. By using these switches, individual crossing points could be switched off and on by the application of a high-voltage pulse. The state of the molecules at the crossing points will therefore determine the logical function that is computed.
The AND and OR gates can be combined into a PLA, as shown in Figure 8.22. The output of the AND plane is fed into the OR plane. Most diodes at the crossing points are switched off, so that certain Boolean functions are realized. The Boolean functions of the first two horizontal lines in the OR plane are given in Figure 8.22.
Such a PLA can compute every Boolean function if not only the input variables but
also the negation of them is supplied to the PLA. Therefore, the negation of each
variable must be computed, which can be done by using NDR-diodes as described in
Section 8.4.3.1. Again, these NDR diodes can be implemented in a crossbar, so that all
negations of all input signals can be realized simultaneously. As an alternative, the
negation can be supplied by the surrounding CMOS circuitry, which would result in
hybrid molecular/CMOS circuits.
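To make the programming step concrete, the following minimal Python sketch (invented for illustration; it is not from the chapter) treats the two planes of such a PLA as Boolean connection matrices: the state of the molecular switch at each crossing point decides whether that junction participates, product lines compute wired-ANDs, and output lines compute wired-ORs. The negated input is supplied on its own line, as discussed above.

    import itertools

    # and_plane[j][i] == True means the (hysteretic) diode at the crossing of
    # product line j and input line i is switched ON; or_plane[k][j] likewise
    # for the OR plane. These matrices play the role of post-fabrication
    # programming of the crossbar.
    def pla(inputs, and_plane, or_plane):
        # Wired-AND: a product line is high only if every connected input is high.
        products = [all(inputs[i] for i, on in enumerate(row) if on)
                    for row in and_plane]
        # Wired-OR: an output line is high if any connected product line is high.
        return [any(products[j] for j, on in enumerate(row) if on)
                for row in or_plane]

    # Program the PLA to compute f = (A AND B) OR (NOT A AND C); the negated
    # input NOT A is supplied on a separate line, e.g. by an NDR crossbar or by
    # CMOS circuitry. Input line order: A, notA, B, C.
    and_plane = [[True, False, True, False],   # product 1: A AND B
                 [False, True, False, True]]   # product 2: NOT A AND C
    or_plane = [[True, True]]                  # f = product 1 OR product 2

    for a, b, c in itertools.product([0, 1], repeat=3):
        out = pla([a, 1 - a, b, c], and_plane, or_plane)
        print(f"A={a} B={b} C={c} -> f={int(out[0])}")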
Based on hysteretic switching diodes, the PLA is reconfigurable. Therefore, the logic functions are programmed in a post-fabrication step, and defective junctions can be identified and disregarded in the circuit. This makes the PLA a promising architecture for molecular electronics. A more detailed explanation of architectures based on crossbars is provided in Chapter 11 of this volume.
8.5.2
NanoCell
Figure 8.23 NanoCell trained as a NAND gate, as proposed by Tour et al. (From Ref. [134].)

The NanoCell is a disordered network of metallic islands interconnected by molecular switches, contacted only at its perimeter by a small number of lithographically defined electrodes, so that the positions of the individual molecules are not known in advance. However, for a proof of concept, Tour et al. assumed the case of omnipotent programming [134], which means that the position of each molecule is known and that it can be individually programmed to its low or high state.
Based on this assumption, the network can be trained for a certain task, for example to perform a NAND operation, as shown in Figure 8.23. The state of the network can be described by a list of the switching states of all molecules, typically which molecule is open and which is closed. A function can be defined which evaluates by how much the output (1 in Figure 8.23) resembles a NAND combination of the inputs A and B. The task of training is then reduced to finding a network state which sufficiently minimizes (or, depending on the definition, maximizes) this evaluation function.
Tour and coworkers have used a genetic algorithm to identify such a network state, that is, to determine which molecules must be switched on, and which off. The output signal is determined by a SPICE simulation and compared to the desired functionality. Using this genetic algorithm, Tour and colleagues were able to train NanoCells as inverters, NAND gates, or complete 1-bit adders [134].
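The structure of this training loop is easy to sketch. In the Python toy model below (invented for illustration), the SPICE transport simulation of Ref. [134] is replaced by a crude random threshold network; what is retained is the genetic-algorithm search over switch states under the omnipotent-programming assumption, with the evaluation function counting how many rows of the NAND truth table the network reproduces.

    import random

    random.seed(1)

    N_SWITCHES = 16
    # Random "transport" coefficients of a frozen toy network: each switch i
    # that is ON contributes a bias term and a term from each input pad. This
    # crude threshold model merely stands in for the SPICE evaluation.
    W_BIAS = [random.uniform(-1, 1) for _ in range(N_SWITCHES)]
    W_A = [random.uniform(-1, 1) for _ in range(N_SWITCHES)]
    W_B = [random.uniform(-1, 1) for _ in range(N_SWITCHES)]

    def network_output(state, a, b):
        signal = sum(s * (W_BIAS[i] + W_A[i] * a + W_B[i] * b)
                     for i, s in enumerate(state))
        return 1 if signal > 0.5 else 0

    def fitness(state):
        # Number of rows of the NAND truth table reproduced by this setting.
        return sum(network_output(state, a, b) == 1 - (a & b)
                   for a in (0, 1) for b in (0, 1))

    def mutate(state):
        i = random.randrange(N_SWITCHES)
        return state[:i] + [1 - state[i]] + state[i + 1:]

    population = [[random.randint(0, 1) for _ in range(N_SWITCHES)]
                  for _ in range(40)]
    for generation in range(500):
        population.sort(key=fitness, reverse=True)
        if fitness(population[0]) == 4:
            break
        # Keep the 10 fittest switch configurations, refill with mutated copies.
        population = population[:10] + [mutate(random.choice(population[:10]))
                                        for _ in range(30)]
    print("best fitness:", fitness(population[0]), "of 4,",
          "after", generation, "generations")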
8.6
Challenges and Perspectives
References
1 International Technology Roadmap for Semiconductors, 2005 Edition, public.itrs.net, 2006.
2 A. Aviram, M. Ratner, Chem. Phys. Lett. 1974, 29, 277–283.
3 C. P. Collier, G. Mattersteig, E. W. Wong, Y. Luo, K. Beverly, J. Sampaio, F. M. Raymo, J. F. Stoddart, J. R. Heath, Science 2000, 289, 1172–1175.
4 J. Chen, M. A. Reed, A. M. Rawlett, J. M. Tour, Science 1999, 286, 1550–1552.
5 R. F. Service, Science 2001, 294, 2442–2443.
6 R. F. Service, Science 2003, 302, 556.
7 S. Datta, Nanotechnology 2004, 15, S433–S451.
8 S. Datta, Quantum Transport: Atom to Transistor, Cambridge University Press, 2005.
9 M. Zharnikov, S. Frey, H. Rong, Y. J. Yang, K. Heister, M. Buck, M. Grunze, Phys. Chem. Chem. Phys. 2000, 2, 3359–3362.
10 J. C. Love, D. B. Wolfe, R. Haasch, M. L. Chabinyc, K. E. Paul, G. M. Whitesides, R. G. Nuzzo, J. Am. Chem. Soc. 2003, 125, 2597–2609.
11 L. Patrone, S. Palacin, J. P. Bourgoin, J. Lagoute, T. Zambelli, S. Gauthier, Chem. Phys. 2002, 281, 325–332.
12 P. Morf, F. Raimondi, H. G. Nothofer, B. Schnyder, A. Yasuda, J. M. Wessels, T. A. Jung, Langmuir 2006, 22, 658–663.
13 J. M. Wessels, H. G. Nothofer, W. E. Ford, F. von Wrochem, F. Scholz, T. Vossmeyer, A. Schroedter, H. Weller, A. Yasuda, J. Am. Chem. Soc. 2004, 126, 3349–3356.
14 J. Chen, L. C. Calvet, M. A. Reed, D. W. Carr, D. S. Grubisha, D. W. Bennett, Chem. Phys. Lett. 1999, 313, 741–748.
15 A. Ulman, Chem. Rev. 1996, 96, 1533–1554.
16 S. Ranganathan, I. Steidel, F. Anariba, R. L. McCreery, Nano Lett. 2001, 1, 491–494.
17 M. R. Kosuri, H. Gerung, Q. M. Li, S. M. Han, P. E. Herrera-Morales, J. F. Weaver, Surface Sci. 2005, 596, 21–38.
33 D. J. Wold, R. Haag, M. A. Rampi, C. D. Frisbie, J. Phys. Chem. B 2002, 106, 2813–2816.
34 A. M. Rawlett, T. J. Hopson, L. A. Nagahara, R. K. Tsui, G. K. Ramachandran, S. M. Lindsay, Appl. Phys. Lett. 2002, 81, 3043–3045.
35 F. Schreiber, Prog. Surface Sci. 2000, 65, 151–256.
36 J. M. Beebe, V. B. Engelkes, L. L. Miller, C. D. Frisbie, J. Am. Chem. Soc. 2002, 124, 11268–11269.
37 K. Moth-Poulsen, L. Patrone, N. Stuhr-Hansen, J. B. Christensen, J. P. Bourgoin, T. Bjornholm, Nano Lett. 2005, 5, 783–785.
38 X. L. Li, B. Q. Xu, X. Y. Xiao, X. M. Yang, L. Zang, N. J. Tao, Faraday Disc. 2006, 131, 111–120.
39 B. Q. Xu, X. Y. Xiao, N. J. Tao, J. Am. Chem. Soc. 2003, 125, 16164–16165.
40 D. I. Gittins, D. Bethell, D. J. Schiffrin, R. J. Nichols, Nature 2000, 408, 67–69.
41 Y. Yasutake, Z. J. Shi, T. Okazaki, H. Shinohara, Y. Majima, Nano Lett. 2005, 5, 1057–1060.
42 L. A. Bumm, J. J. Arnold, M. T. Cygan, T. D. Dunbar, T. P. Burgin, L. Jones, D. L. Allara, J. M. Tour, P. S. Weiss, Science 1996, 271, 1705–1707.
43 B. Lussem, L. Muller-Meskamp, S. Karthauser, R. Waser, M. Homberger, U. Simon, Langmuir 2006, 22, 3021–3027.
44 A. S. Blum, J. G. Kushmerick, S. K. Pollack, J. C. Yang, M. Moore, J. Naciri, R. Shashidhar, B. R. Ratna, J. Phys. Chem. B 2004, 108, 18124–18128.
45 J. G. Kushmerick, J. Naciri, J. C. Yang, R. Shashidhar, Nano Lett. 2003, 3, 897–900.
46 J. G. Kushmerick, J. Lazorcik, C. H. Patterson, R. Shashidhar, D. S. Seferos, G. C. Bazan, Nano Lett. 2004, 4, 639–642.
47 J. G. Kushmerick, D. B. Holt, J. C. Yang, J. Naciri, M. H. Moore, R. Shashidhar, Phys. Rev. Lett. 2002, 89.
48 H. Park, A. K. L. Lim, A. P. Alivisatos, J. Park, P. L. McEuen, Appl. Phys. Lett. 1999, 75, 301–303.
9
Intermolecular- and Intramolecular-Level Logic Devices
Françoise Remacle and Raphael D. Levine
9.1
Introduction and Background
Today, there is intense research activity in the field of nanoscale logic devices, directed towards miniaturization and a qualitative improvement in the performance of logic circuits [1–16]. A radical and potentially very promising approach is the search for quantum computing [17–24], and this is reviewed as an emerging technology in Ref. [25]. Other alternatives are based on neural networks [26], on DNA-based computing [27–31], or on molecular quantum cellular automata [32–37]. Single-electron devices should also be mentioned because, if they use chemically synthesized quantum dots (QDs), they are molecular in nature [38–40]. Devices that have been implemented rely on the ability to use molecules as switches and/or as wires, an approach known as molecular electronics [5, 12, 41–44]. This approach is currently being extended in several interesting directions, including the modification of the electronic response of the molecule through changing its Hamiltonian [45–47]. In this chapter, these topics are first reviewed, after which ongoing studies on an alternative computational model, where the molecule acts not as a switch but as an entire logic circuit, are discussed. Both electrical and optical inputs and outputs are considered. Advantage is then taken of the discrete quantum states of molecules to endow the circuits with memory, such that a molecule acts as a finite-state logic machine. Speculation is also made as to how such machines can be programmed. Finally, the potential concatenation of molecular logic circuits, either by self-assembly or by directed synthesis, so as to produce an entire logic array, is discussed. In this regard, directed deposition is also a possible option [48–52].
9.1.1
Quantum Computing
Quantum computing can be traced to Feynman, who advocated [53] the use of a
quantum computer instead of a classical computer to simulate quantum systems.
The rationale is that quantum systems, when simulated classically, are very demanding in computing resources. A quantum state is described by two numbers, its amplitude and its phase, and the number of quantum states, N, grows exponentially with the number of degrees of freedom of the system. The sizes of the matrices necessary to describe a system quantum mechanically scale as N², and for large systems this number rapidly becomes prohibitively large. A computer that operates quantum mechanically
will require far fewer resources, because the computations can be massively parallel due to the superposition principle of quantum mechanics. Conceptually, computing quantum mechanically required the extension of classical Boolean logic to quantum logic [19, 54] and the setting up of quantum logic gates [55–61] that operate reversibly [62, 63].
In quantum computing, the logic is processed via the coupling structure between
the levels of the Hamiltonian. Typically, this coupling is induced by external electrical
and magnetic fields. Nuclear magnetic resonance (NMR) is a particularly promising direction for both pump and probe [20, 64]. Quantum implementations are very encouraging for search algorithms [65–68], where a power-law reduction in the number of queries can be obtained. For operations where the answer is more complex than a YES or NO, such as Fourier transform operations, the read-out remains a key problem because of the collapse of the wave function when one of its components is read. One very successful outcome of quantum computing and
quantum information is cryptography [69], where the very effective factorization
algorithms (public key cryptography, Shor algorithm [70, 71]) show the potential that
is available. Quantum computing has very much caught the popular imagination,
and several excellent introductory books (e.g. Ref. [72]) are now available.
9.1.2
Quasiclassical Computing
The essential difference between quantum computing and the approach discussed here is that a quantum gate operates on both the amplitude and the phase of the quantum state. The phase is very sensitive to noise, and quantum computing theorists have devised various ways to protect the phase from external unwanted perturbations [73–79] or to seek to correct a corrupted phase. Because the authors' background is in molecular dynamics and coherent control, they are aware that the phase of quantum states is extremely difficult to protect, and in this chapter they adopt a quasi-classical approach [80] where, while the time evolution of the molecular system is quantal, what matters in terms of inputs and outputs are the populations of the states, that is, the square moduli of the amplitudes. This approach relies on classical logic and does not require reversible gates. There are two special characteristics of quantum computing: parallelism and entanglement. Currently, the authors' investigations center on understanding the potential of the quasi-classical approach in terms of parallelism.
9.1.3
A Molecule as a Bistable Element
In the following sections it is explained that the essential difference with respect to what is advocated in this chapter is that the molecule can do, and has been shown to do, much more than act as a switch.
Quantum cellular automata (QCA) represent another promising approach to molecular computing, in which the molecules are used not as switches but as structured charge containers, and information is transmitted via coupled electric fields [32–34]. The charge configuration of a cell composed of a few QDs or, more recently, of molecular complexes is the support for encoding the binary information. Most studies on QCA consist of theoretical designs, with very few experimental implementations. While the QD-based implementations [33, 34] operate at very low temperatures, theoretical modeling predicts that molecular quantum cellular automata could be operated at room temperature [36, 37, 83].
9.1.4
Chemical Logic Gates
9.1.5
Molecular Combinational Circuits
Combinational circuits are the simplest logic units, being built of logic gates (the most common are NOT, AND, OR, XOR, NOR, NAND) and providing a specific set of outputs, given a specific set of inputs. These circuits have no memory. In transistor-based computer circuits, even the simplest such gates must be built as a network of switches. Studies on molecular combinational circuits form part of an intense research effort aimed at recasting into logic functions the fact that molecules can respond selectively to stimuli (inputs) of different forms (chemical, optical, electrical) and produce outputs (chemical, optical, electrical). The advantage is that a molecule which implements a combinational circuit acts not as a single switch but as an entire gate. However, most molecular gates proposed until now have been based on chemistry in solution, and use at one stage or another a chemical stimulus (e.g. the concentration of ions such as H+ or Na+) coupled to optical or electrical stimuli. This leads to rather slow rates of processing of the information and of concatenation. In contrast, the authors' studies do not involve chemical inputs, and this allows faster rates to be reached. It has already been shown that, within less than 1 ps (10^-12 s), it is possible to implement combinational circuits on a single molecule, using selective photo-excitation to provide the inputs and the intra- and inter-molecular dynamics to process the information. (See Refs. [100–102] for an example of sub-ps logic gates using femtosecond (10^-15 s) pulses as input, and Refs. [94, 103] for an example of concatenation by intermolecular coupling.)
One advantage of using molecules in the gas phase is that far fewer molecules are required than in solution. In the preliminary gas-phase schemes, the reading of the outputs is achieved by detecting fragments of the molecule used to implement the logic, which means that this particular molecule is not available for a new computation. This is not intrinsically a problem, because about 10 molecules are needed to obtain a good signal-to-noise ratio; hence, the computation can be continued with other molecules present in the sample. This represents a problem for cycling, however, and is why the target is to explore the possibilities offered by non-destructive optical and electrical reading. Another route to explore is to increase the number of operations (more than 32 bits) performed on a single molecule by using the fast intramolecular dynamics for concatenation, which decreases the need for cycling and the need for I/O. As discussed above and further elaborated below, this will have major implications for the miniaturization of logic devices by implementing compound logic at the hardware level.
In this way it was possible to implement logic functions on a single molecule, up to and including the ability to program, that is, to use the same physics to realize different computations.
In terms of technology transfer it is clear, however, that what the industry would very much prefer is some form of extension of CMOS technology. So, the need is to combine the many advantages of working with molecules in the gas phase with the need to anchor molecules on a surface. In this connection, studies with self-assembled small arrays of QDs are of central interest.
9.1.6
Concatenation, Fan-Out and Other Aspects of Integration
It is an essential characteristic of modern logic circuits that the gates are integrated. Specifically, the output of one gate can be accepted as the input of the next gate; the two gates are thereby concatenated. Very simple examples of concatenation are the NotAND (NAND) or NotOR (NOR) gates. It should also be possible to deliver the output of one gate as input to several gates. In 2001, it was shown [94] how to concatenate the logic performed by two molecules using electronic energy transfer as the vehicle for forwarding the information. It is clear that electron transfer, including electron transfer between QDs, proton transfer, and vibrational energy transfer, can all be used for this purpose. However, this remains an as-yet poorly traveled course, and further studies are clearly called for, as it is a possible key to high-density circuits.
9.1.7
Finite-State Machines
So far, only combinational logic has been discussed, where combinational gates combine the inputs to produce an output. A finite-state machine does more, and this 'more' is essential; it is also something well suited to what a molecule can do. A finite-state machine accepts inputs to produce an output that depends both on the inputs and on the current state of the machine. In addition to producing an output, such a machine will update its state so that it is ready for the next operation. Technically, what the finite-state machine has is a memory, and the circuit has as many different states of memory as states of the machine (see Figure 9.1). It has been shown possible to build very simple finite-state and Turing machines at the molecular level [101, 104]. Molecules have a high capacity for memory because of their many quantum states; for example, in conformers, after a radiationless transition, the molecule can undergo relaxation and remain in a stable state for a long time. Retrieval of the information may then be effected by either optical or electrical pulses (see below).
A finite-state molecular machine has been proposed that can be cycled using laser pulses [101, 104]. In recent investigations, the same optical finite-state machine has been used to implement a cyclable full adder and subtractor [96]. In a different direction, a molecular finite-state machine in the gas phase has recently been proposed, by showing that a linear sequential machine can be designed using the vibrational relaxation of diatomic molecules in a buffer gas. An alternative to optical pulses for providing inputs is to apply voltage pulses. In a recent study on QD assemblies (e.g. Refs. [105, 106]), it was shown that the application of a gate voltage allows the molecular orbitals to be tuned into resonance with the nanoelectrodes so that a current flows, whereas there is no current in the absence of the gate voltage (see Figures 9.2 and 9.3). A similar control is also possible on a single molecule [107–109] and for single QDs or several coupled dots.
9.1.8
Multi-Valued Logic
So far, it has been taken as given that an input or an output can have one of two possible values. A laser beam can be on or off, a molecule can fluoresce or not, a charge can transfer or not, and so on. Therefore, two-valued or Boolean logic is being implemented. Ever since Shannon showed, in 1938, the equivalence between a Boolean logic gate and a network of switches, computer circuits have been assembled from switches. First, the switches were electromechanical in nature, followed by vacuum tubes and then the transistor. Now that the discrete nature of the carriers of electrical charge has made it unclear how the size of the transistor might be further reduced, a major and very serious effort has been made to use a molecule as a switch.
Previously, several ways have been discussed of using a molecule as an entire logic gate (that is, a connected network of switches), or even as a finite-state machine, rather than simply as a switch. There is, however, another possible generalization, which is also well suited to what molecules are and what they can do, namely, to go to multi-valued logic [110–112]. What this means is that there are more than two allowed, mutually exclusive answers to every question. This makes the numbers to be dealt with shorter (e.g. 1001 is the binary number that is written as 9 in base 10). In their studies, the present authors took three as a compromise between shorter numbers (e.g. 100 is the ternary form of 9) and the errors that can result from making a choice between too many alternatives. The molecular physics is straightforward: take zero, one or two electrons in the valence orbital [113]. But clearly, there are other choices, such as a Raman transition to several different final vibrational states. Molecules are willing and able to be pumped and to be probed to multiple states. It is not clear, however, if the industry is willing to learn to go beyond two.
9.2
Combinational Circuits by Molecular Photophysics
Table 9.1 Truth tables of the binary AND, OR, XOR and INH gates.

x   y   x AND y   x OR y   x XOR y   x INH y
0   0   0         0        0         0
1   0   0         1        1         1
0   1   0         1        1         0
1   1   1         1        0         0
Diodes connected in parallel correspond to the OR gate. Another binary gate which is often used in combinational circuits is the XOR gate (see Table 9.1), which corresponds to the addition of the binary inputs modulo 2, x ⊕ y.
The addition modulo 2 of two binary numbers works column by column, exactly as addition does in base 10. When the two binary inputs are both 1, the result of their sum modulo 2 is 0, but a 1 (the carry digit) must be reported in the next column of binary digits [114] (see Table 9.2).
Using the XOR and the AND gates, it is possible to build a half adder. This has two inputs, the two numbers to be added modulo 2, and two outputs: the sum modulo 2 of the two inputs, realized by a XOR gate, and the carry, realized by an AND gate; the truth tables for both are shown in Table 9.1. A half adder is an important building block for molecular combinational circuits because, by concatenating two half adders, a full adder can be built. Usually, as shown by the example in Table 9.2, binary numbers of length more than one digit must be added. A full adder is more complex than a half adder, in that it takes into account the carry digit from the addition of the previous two digits (called the carry in). A full adder therefore has three inputs, the two digits to be added and the carry digit from the previous addition, and two outputs, the sum modulo 2 of the three inputs and the carry digit for the next addition, called the carry out (see Table 9.2).
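In programming terms, the constructions just described amount to a few lines of Boolean code. The Python sketch below (invented for illustration; it is not from the chapter) builds a half adder from XOR and AND, concatenates two half adders into a full adder, and ripples the full adder over the bits of the example of Table 9.2.

    def half_adder(x, y):
        # Sum modulo 2 (XOR) and carry (AND), as in Table 9.1.
        return x ^ y, x & y

    def full_adder(x, y, carry_in):
        # Two concatenated half adders; the carry out is 1 if either half
        # adder produced a carry.
        midway_sum, carry1 = half_adder(x, y)
        sum_out, carry2 = half_adder(midway_sum, carry_in)
        return sum_out, carry1 | carry2

    def add_binary(x_bits, y_bits):
        # Ripple addition, least significant bit first (cf. Table 9.2).
        carry, out = 0, []
        for x, y in zip(x_bits, y_bits):
            s, carry = full_adder(x, y, carry)
            out.append(s)
        return out + [carry]

    # x = 010 (2) plus y = 111 (7), bits given LSB-first:
    print(add_binary([0, 1, 0], [1, 1, 1]))   # [1, 0, 0, 1], i.e. 1001 = 9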
In photophysics logic implementations, the inputs are provided by laser pulses,
their physical action being to excite the molecule, typically to electronic excited states.
The logical value 1 is encoded as the laser being on, and the logical value 0 as the laser
being off. The molecules in the sample act independently of one another, and there
is uncertainty as to whether every molecule absorbs light. It is therefore not the case
that a single molecule suffices to provide an output. When working with an ensemble
of molecules, there is no need to read a strict yes or a strict no from each molecule.
Table 9.2 Addition of the two binary numbers x = 010 and y = 111.

Carry    1 1 0
x          0 1 0
y          1 1 1
Sum      1 0 0 1
What is needed is to excite enough molecules to be above the threshold for the detection of light absorption. The detection of the outputs is by fluorescence and/or by detecting ions. The detection of ions is relatively easy, so by monitoring the absorption by photoionization of the molecule, it is sufficient if only a small number of molecules respond to the input. An excited molecular ion typically fragments, and the detection of ionic fragments can also be used to encode outputs. Although ionic fragments can be detected very efficiently, the price is that the molecule self-destructs at the end of the computation, and so the gate cannot be cycled. The details of an optically addressed half and full adder that can be cycled are provided in Section 9.3.1 [96].
First, however, the implementation of a half adder will be discussed, and two approaches for doing this will be compared. It will then be shown how to implement a fault-tolerant full adder on a single molecule [115] using photophysics in the gas phase of the 2-phenylethyl-N,N-dimethylamine (PENNA) molecule [116–118]. This implementation follows the lines of the 2001 implementation of a full adder on the NO molecule [95]. Finally, the realization of a full adder by concatenation of two half adders is discussed, where the logic variables are transmitted between the two half adders by energy transfer between two aromatic molecules that are photoexcited [94] in solution.
9.2.1
Molecular Logic Implementations of a Half Adder by Photophysics
As discussed above, a half adder has two outputs: addition modulo 2 is implemented by the XOR gate, and the carry digit is the result of an AND operation. A molecular realization of the logical XOR operation is challenging because the output must be 0 when both inputs are applied. For a photophysics implementation of the inputs, this means that when both lasers are on, the output observed when only one laser is on must be absent. The realization of the AND gate is comparatively easier, because an output is produced only if both inputs are on, and so any reproducible experiment which requires two inputs for generating an output can implement an AND gate [103]. The truth table for the implementation of a half adder by optical excitation is given in Table 9.3.
Two-photon resonance-enhanced absorption by aromatic chromophores has been used as an effective way to implement an XOR operation on optical inputs at the molecular level [94, 95, 104]. The two photons (each of a somewhat different color and
Table 9.3 Truth table for the implementation of a half adder by optical excitation.

x (laser 1)   y (laser 2)   Carry (AND)   Sum (XOR)   (carry, sum)
0             0             0             0           (0,0)
1             0             0             1           (0,1)
0             1             0             1           (0,1)
1             1             1             0           (1,0)
Figure 9.4 Two-photon ionization with the two inputs being lasers of unequal frequency can be made fault-tolerant to stimulated emission. This is particularly so if one laser operates on the 0,0 transition while the other pumps a vibrationally excited level of S1. For details, see the text.
One way of implementing a fault-tolerant half adder is to combine the carry and the sum digit into one word (carry, sum). This is shown in the fifth column of Table 9.3. As can be seen, the four distinct pairs of inputs of the half adder, that is, (0,0), (1,0), (0,1) and (1,1), correspond to only three distinct outputs, (0,0), (0,1) and (1,0). This is because in addition modulo 2, if the carry is 1, the value of the sum is necessarily 0. Therefore, instead of assigning a separate physical probe for the sum and the carry, it is possible to assign a physical probe to each of the three different logical values of the word (carry, sum). It should be noted that the binary meaning of the word (carry, sum) is the number of inputs with value 1. In the case of PENNA, the choice was made to assign the word (0,1) to the presence above threshold
of a fluorescence signal from S1, and the word (1,0) to the detection above a threshold value of N-end fragments. The fault tolerance of this scheme arises from the fact that in the case of inputs (1,1), N-end fragment ion detection will be reported, irrespective of the intensity of the fluorescence signal from S1. The scheme is fault-tolerant with respect to the extent of fluorescence from S1 in the case of two-photon excitation. Detecting the (0,0) output is straightforward, as no excitation is provided. The two ways of implementing a half adder on PENNA are summarized in Table 9.4.

Table 9.4 Truth table and experimental probes used for the two ways of implementing a half adder on PENNA.

x (UV(1))   y (UV(2))   Carry (AND)   Sum (XOR)   Probe for carry (AND)   Probe for XOR
This half adder self-destructs at the end of the computation, because the local ionization at the chromophore end causes the PENNA ion to fragment. However, this scheme allows for a remarkable sensitivity in the detection of the outputs, because very few ions can already be detected with a good signal-to-noise ratio. Although this involves an ensemble of molecules, not all of which provide an answer, a response is needed from only 100 molecules to obtain acceptable statistics, and this response occurs quite rapidly.
In a full adder, the word (1,1) is allowed as an output, so that a full adder has four distinct binary words as outputs, namely (0,0), (0,1), (1,0) and (1,1). As discussed above, it also has one more input than the half adder, the carry digit. It is shown in the next section how a full adder can be implemented on the PENNA molecule using the same fault-tolerant scheme for probing the outputs. This manner of implementing a full adder is contrasted with the implementation based on the concatenation of two half adders, in which the sum and the carry are detected separately.
9.2.2
Two Manners of Optically Implementing a Full Adder
A full adder has three inputs and produces two outputs: the sum, which is the addition modulo 2 of the two inputs and the carry in digit, and the carry out, which for the next cycle of computation becomes the carry in:

    sum out = x ⊕ y ⊕ carry in    (9.1)

    carry out = (x · y) + ((x ⊕ y) · carry in)    (9.2)

where ⊕ means addition modulo 2 (XOR), · means binary product (AND) and + means OR. The carry out logic equation can be simplified to

    carry out = x · y + x · carry in + y · carry in    (9.3)
In the implementation on PENNA proposed above, the two inputs x and y are encoded, as for the half adder, by two UV photons with slightly different wavelengths. The carry in digit is encoded as a laser pulse of green light which is intense enough that two photons can be absorbed, allowing the transition to the S1 state by a non-resonance-enhanced two-green-photon transition. The four output words (carry out, sum), (0,0), (0,1), (1,0) and (1,1), are each detected by a distinct experimental probe. As in the half adder implementation on PENNA, the output word (0,1) is detected as fluorescence from the S1 state, while the output (1,0) is detected as the presence of N-end fragment ions. The detection of the output (0,0) is straightforward, as it corresponds to no inputs. The output (1,1) corresponds to the three inputs having the value 1; that is, the PENNA molecule is excited by two UV (x and y) and two green photons (carry in). Experimentally, this amount of energy causes fragmentation at the C-end (instead of at the N-end, which occurs when only two inputs are 1, in which case only the equivalent in energy of two UV photons is deposited). The presence of C-end fragment ions above a given threshold is therefore used to detect the (1,1) output. The excitation scheme and experimental probes of the outputs are shown in Figure 9.6 and the corresponding truth table in Table 9.5. Note that, as in the case of the half adder, the output word (carry, sum) counts in binary how many inputs are 1.
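The logical content of this detection scheme is compactly summarized in the following Python sketch (for illustration only; the probe names follow the text, while the function name is invented): the output word (carry, sum) is simply the two-bit binary count of the inputs that are 1, and each word maps onto one experimental probe.

    PROBE = {
        (0, 0): "no signal",
        (0, 1): "fluorescence from S1",
        (1, 0): "N-end fragment ions",
        (1, 1): "C-end fragment ions",
    }

    def penna_full_adder(x, y, carry_in):
        n = x + y + carry_in          # number of inputs with value 1
        word = (n >> 1, n & 1)        # (carry out, sum out) = n in binary
        return word, PROBE[word]

    for x in (0, 1):
        for y in (0, 1):
            for cin in (0, 1):
                word, probe = penna_full_adder(x, y, cin)
                print(f"x={x} y={y} carry_in={cin} -> {word}  [{probe}]")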
Another way to implement a full adder is by concatenation of two half adders. The corresponding combinational logic circuit is shown in Figure 9.7.
The physical implementation, by photoexcitation of a donor–acceptor complex in solution [94], is an example of intermolecular concatenation of two half adders by energy transfer. It demonstrates that one molecule is able to communicate its logical output to another molecule. The implementation is on a specific pair (rhodamine 6G–azulene) for which considerable data are available, but the scheme is general enough to allow a wide choice of donor and acceptor pairs. The first half adder is realized on rhodamine 6G, and the second half adder on azulene. The midway sum is transmitted from the first to the second half adder by electronic energy transfer between rhodamine 6G and azulene.
Table 9.5 Truth table and detection scheme for the outputs for the optical implementation of a full adder on PENNA.

x (UV(1))   y (UV(2))   Carry in (vis, two-photon)   Carry out   Sum out   Output word (carry, sum)   Probe
0           0           0                            0           0         (0,0)                      No signal
1           0           0                            0           1         (0,1)                      Fluorescence from S1
0           1           0                            0           1         (0,1)                      Fluorescence from S1
1           1           0                            1           0         (1,0)                      N-end fragment
0           0           1                            0           1         (0,1)                      Fluorescence from S1
1           0           1                            1           0         (1,0)                      N-end fragment
0           1           1                            1           0         (1,0)                      N-end fragment
1           1           1                            1           1         (1,1)                      C-end fragment
The physical realization of the two half adders relies (as in the case of the half adder implementation on PENNA discussed above) on the fact that the absorption of one or two UV photons (inputs x and y) by the donor (acceptor) molecule leads to distinct outputs. However, here, unlike the case of PENNA, the absorption of the second photon does not lead to ionization, but rather to absorption to a second excited electronic state, S2. In general, a molecular half adder will be available for molecules which have a detectable one-photon and a detectable two-photon absorption. This seems to go against Kasha's rule [121], but in fact there are enough exceptions. Azulene and many of its derivatives provide one class. The emission from the second electronically excited state, S2, is often as strong as, or stronger than, the fluorescence from S1 [122, 123]. More generally, emission from S2 is not forbidden; rather, due to competing non-radiative processes it often has a low quantum yield, but it is definitely detectable, particularly as it is much to the blue compared to the emission from S1. If necessary, the emission from S2 can be detected by photon counting. There is, therefore, a case where the outputs of the XOR gate and the AND gate that constitute the half adder can be probed separately, with sufficient fidelity. The output of the XOR gate of the first half adder is encoded as populating the S1 state of rhodamine 6G, while the output of the XOR gate of the second half adder (the sum out) is encoded as detecting fluorescence of the S1 state of azulene. The output of the first AND gate
Figure 9.7 Combinational circuit of a full adder implemented by concatenation of two half adders.
the midway sum is 0, meaning that even if the carry in is 1, carry 2 cannot be 1. In other words:

    carry out = carry 1 + carry 2    (9.4)

The carry out is therefore physically probed by monitoring the fluorescence from the S2 states of rhodamine 6G and azulene, which logically corresponds to Eq. (9.2).
The advantage of an all-optical scheme for the full adder implementation, compared to the implementation on PENNA discussed above, is that the adder does not self-destruct at the end of the computation. Another advantage is that it operates relatively rapidly. The energy transfer rate for a solution of 10^-3 M azulene, estimated using the S1 fluorescence spectrum of rhodamine 6G and the absorption spectrum of azulene, is about 10^10 s^-1. This rate is sufficient for present needs, but it can be increased [130] if the two chromophores are incorporated within a single molecular unit using a short bridge to connect them [124]. The increase in the rate will be particularly significant (five orders of magnitude) if the bridge is rigid [130, 131]. It should be emphasized that a rigid bridge is required to achieve a very high rate. Many other couples based on commonly used laser dyes as donors and azulene derivatives [128, 132] may also be utilized for implementation of the logic gate [133, 134].
9.3
Finite-State Machines
Finite-state (also called sequential) machines are combinational circuits with a memory capability. The memory registers are the internal state(s) of the machine [135, 136]. As in a combinational circuit, the outputs of the machine depend on the inputs, but in addition the output also depends on the current state of the machine. It is this dependence of the output on the state of the machine that endows finite-state machines with a memory. The memory of the machine corresponds to the state of the experimental system, and this state can be changed by applying suitable perturbations, such as optical or voltage pulses. As in the other logic schemes, the logic part is an encoding of the subsequent dynamics of the system.
The finite-state machine computational model takes advantage of two aspects that are natural for quantum systems:

• A physical quantum system has discrete internal states, and its response to a perturbation will in general depend on what state it is in.
• Perturbations can be applied sequentially, so that the machine can be cycled.
By taking advantage of the two points above, the implementation of several forms of finite-state machine has been proposed: a simple set-reset that can be either optically [101] or electrically addressed [106]; an optical flip-flop [104] and full adder and subtractor [96]; an electrically addressed full adder [106]; and an electro-optically addressed counter [137]. Beyond that, it has been shown, using optical addressing, that a molecule can be programmed and behaves (almost) like a Turing machine [101]. The caveat 'almost' is introduced because a molecule can have only a finite number of quantum states, whereas a Turing machine has an unlimited memory. Possibly this is not a true limitation since, if indeed the number of quantum states of the universe is finite (sometimes known as the holographic bound), then no physical system can strictly act as a Turing machine.
In this section, a review is conducted of optically addressed finite-state machines, up to a full adder (Section 9.3.1), and of an electrically addressed machine (Section 9.3.2). If molecules and/or supramolecular assemblies are to offer an inherent advantage over the paradigm of switching networks, it will likely be through each molecule acting as a finite-state unit.
9.3.1
Optically Addressed Finite-State Machines
Laser pulses are used to optically address atomic or molecular discrete quantum states. All of the schemes discussed here are based on the Stimulated Raman Adiabatic Passage (STIRAP) pump–probe control scheme, which allows the populations of the quantum states of atoms or molecules to be manipulated. The advantages of the STIRAP control scheme for implementing finite-state machines are that the external perturbation can induce a change of state with a very high efficiency (close to 100%), and that the residual noise which accumulates when the machine is cycled can be erased by resetting it. Moreover, the perturbation has a distinctly different effect on the system depending on the initial state. These advantages are supported by experimental results for atomic (i.e. Ne [138]) and molecular systems (i.e. SO2 [139], NO [140]), and the dynamics is well described by solving the quantum mechanical time-dependent Schrödinger equation [141–143].
Here, the operation of finite-state machines is described using quantum simulations on a three-level system with a Λ-level scheme (see Figure 9.9). The pump pulse, with photons of frequency ω_P, is nearly resonant (up to a detuning Δ_P) with the 1 → 2 transition, while the Stokes pulse, with photons of frequency ω_S, is nearly resonant (up to a detuning Δ_S) with the 2 → 3 transition. Levels 1 and 3 are long-lived, but level 2 is metastable because it can fluoresce. The spontaneous emission from level 2 provides a readable output. The important feature of this level structure is that there are two routes for going from level 1 to level 3. The first route is a kinetic or
the system is initially in level 1, the SP pulse transfers it to level 3 via the STIRAP route, without any significant population in level 2. On the other hand, if the system is initially in level 3, the SP pulse will pump it down to level 1 via the kinetic route, which can be detected by spontaneous emission from level 2. The time profile of the SP pulse is shown in Figure 9.10.
The quantum simulations illustrating the two routes that are possible using an SP pulse as an input are shown in Figure 9.11. The purpose of the simulation is to show that, by either route, the SP pulse achieves an essentially 100% population transfer between levels 1 and 3. The main source of noise is the spontaneous emission from level 2, which can end up either in level 1 or 3, or in yet another level, in which case the molecule is lost from the ensemble. To achieve a population transfer close to 100% between levels 1 and 3, rather intense pulses are needed, with the result that in the kinetic route the population in level 2 remains low (see Figure 9.10). Only a few photons are necessary to detect the output, so that detecting the output does not introduce too much noise. After a few cycles, the noise accumulation can be corrected for by resetting the machine (see Ref. [104]).
For the three-level structure shown in Figure 9.9, the Hamiltonian in the rotating-wave approximation takes the form [141, 144, 145]:

    H = \frac{\hbar}{2} \begin{pmatrix} 2\omega_1 & \Omega_P(t)\exp(i\omega_P t) & 0 \\ \Omega_P(t)\exp(-i\omega_P t) & 2\omega_2 & \Omega_S(t)\exp(-i\omega_S t) \\ 0 & \Omega_S(t)\exp(i\omega_S t) & 2\omega_3 \end{pmatrix}    (9.5)

where the two pairs of levels are coupled by nearly resonant transient laser pulses. The Rabi frequency [141, 146] is denoted as Ω(t); it is given by the product of the amplitude of the laser pulse, E(t), and the transition dipole, μ: Ω(t) = μE(t)/ħ. The central frequencies of the Pump and Stokes lasers are almost resonant with the 1 → 2 and the 2 → 3 transitions; that is, ω_P = ω_2 − ω_1 − Δ_P and ω_S = ω_2 − ω_3 − Δ_S, and the detunings are small and taken to be equal in the simulation, Δ_P = Δ_S = Δ. Therefore, the two lasers are off resonance for the transitions for which they are not intended. In the rotating-wave approximation, the Hamiltonian couples two levels using only the component of the oscillating electric field that is in resonance, or nearly so, with those two levels. The Hamiltonian [Eq. (9.5)] is that used in earlier studies of STIRAP [141, 147–149].
The Hamiltonian [Eq. (9.5)] can be recast in the interaction picture, where it takes the form:

    \tilde{H} = \frac{\hbar}{2} \begin{pmatrix} 0 & \Omega_P(t)\exp(-i\Delta_P t) & 0 \\ \Omega_P(t)\exp(i\Delta_P t) & 0 & \Omega_S(t)\exp(i\Delta_S t) \\ 0 & \Omega_S(t)\exp(-i\Delta_S t) & 0 \end{pmatrix}    (9.6)
The wavefunction of the system, ψ(t), is a linear combination of the three levels, with time-dependent coefficients:

    \psi(t) = \sum_{i=1}^{3} \tilde{c}_i(t)\,|i\rangle, \qquad \tilde{c}_i(t) = c_i(t)\exp(i\omega_i t)    (9.7)

where the \tilde{c}_i(t) are the coefficients in the interaction picture. These satisfy the matrix form of the time-dependent Schrödinger equation, i\,d\tilde{c}/dt = \tilde{H}\tilde{c}, which is solved numerically without invoking the adiabatic approximation [150]. The total probability, c^\dagger c = \tilde{c}^\dagger \tilde{c}, is conserved because the Hamiltonian is Hermitian.
Figure 9.11 shows the effect of acting with two SP pulses successively, for the system being initially in level 1 (panel a) and in level 3 (panel b). The time profile of the sequence of two SP pulses is shown in Figure 9.10.
The simulations start with the molecule either in level 1 (Figure 9.11, panel a) or in level 3 (panel b). The sequence of pulses shown in Figure 9.10 returns the system to the level it started from. In a single cycle of the machine the SP pulse is applied only once. The parameters of the simulation, given in reduced time units (t/σ), are: Ω_P(t/σ) = Ω_S(t/σ) = 20.05 exp(−(t/σ − t_i)²/2), with t_s1 = 8, t_p1 = 9.25, t_s2 = 18.75, t_p2 = 20. The detuning is Δ = Δ_S = Δ_P = 4 (in reduced units). The area of the pulse, A = ∫Ω(t/σ)d(t/σ), is 6.38π. These details are quoted since the achievement of an essentially complete population transfer by the kinetic route (as shown in Figure 9.11) is sensitive to the intensity of the pulse and also to the detuning.
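As a check on this description, the Λ-system dynamics are easy to integrate directly. The short Python sketch below (not part of the original chapter; it assumes the interaction-picture Hamiltonian of Eq. (9.6) with ħ = 1 and the detuning sign convention adopted above) propagates the three amplitudes through a single SP pulse with the quoted parameters, starting either in level 1 or in level 3; the final populations and the transient population of level 2 distinguish the STIRAP route from the kinetic route.

    import numpy as np
    from scipy.integrate import solve_ivp

    # Reduced-unit parameters quoted in the text (time in units of the pulse
    # width sigma): peak Rabi frequency 20.05, common detuning 4, and one SP
    # pulse with the Stokes maximum (t = 8) preceding the pump maximum (9.25).
    OMEGA0, DELTA, T_S, T_P = 20.05, 4.0, 8.0, 9.25

    def pulse(t, t0):
        return OMEGA0 * np.exp(-0.5 * (t - t0) ** 2)

    def rhs(t, c):
        # i dc/dt = H~ c; the factors of 1/2 of Eq. (9.6) are folded in here.
        wp = 0.5 * pulse(t, T_P) * np.exp(-1j * DELTA * t)
        ws = 0.5 * pulse(t, T_S) * np.exp(+1j * DELTA * t)
        H = np.array([[0, wp, 0],
                      [np.conj(wp), 0, ws],
                      [0, np.conj(ws), 0]])
        return -1j * (H @ c)

    for level in (0, 2):   # start in level 1, then in level 3
        c0 = np.zeros(3, dtype=complex)
        c0[level] = 1.0
        sol = solve_ivp(rhs, (0.0, 16.0), c0, max_step=0.01)
        pops = np.abs(sol.y[:, -1]) ** 2
        peak2 = np.max(np.abs(sol.y[1]) ** 2)
        print(f"start level {level + 1}: final populations {np.round(pops, 3)},"
              f" peak level-2 population {peak2:.3f}")

For these parameters the transfer between levels 1 and 3 should be essentially complete in both directions, with the level-2 transient appreciably larger for the kinetic route, in line with the discussion above.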
The physics shown in Figure 9.11 is all that is required to implement finite-state machines. The implementation of a full adder and a full subtractor is discussed below; these are implemented in a cyclable manner, with each full addition or subtraction requiring two steps. The inputs x and y are both encoded as SP pulses. The duration of a computer time step is taken to be somewhat longer than the duration of the input SP pulse. Here (unlike Section 9.2.2, where combinational circuits were used to implement a full adder) there is no need for concatenation, because the carry (borrow) is encoded in the state of the machine for the first step, and the midway sum (the midway difference) is encoded in the state of the machine for the second step. This is a major advantage of finite-state machines: the state of the machine encodes the intermediate values needed for the computation.
A logic value of 0 for the carry in or borrow digit is encoded as the molecule being in level 1, and a logic value of 1 as the molecule being in level 3. During the course of the discussion, it will also be shown (see Table 9.8) how the first cycle of the operation can also be logically interpreted as a T flip-flop [136] (T for toggle) machine.
In order to cycle an adder after two optical inputs, the machine should be in a state that corresponds to the carry for the next addition, so that it is ready for the next operation. At present, a scheme which does exactly that cannot be devised, as two more operations are required in order for the machine to be ready for the next cycle. The reason for this is that, as shown below, at the end of the two cycles the sum out is encoded in the state of the machine. So, the first requirement is to read the state of the machine in order to obtain the sum out as an output. This can readily be done by applying an SP pulse (as explained in Ref. [104] and shown in Figure 9.11). If the machine is in logical state 1 (level 3), an output from level 2 will be obtained, whereas if it is in logical state 0 (level 1) there will be no output. The machine is then restored to state 0 (level 1) by applying a second SP pulse if needed. Next, the carry out must be encoded in the internal state of the machine. If fluorescence was observed either in the first step or in the second step of the addition, it means that the carry is 1 and an SP pulse must be input in order to bring the machine to internal state 1 (level 3). Depending on the value of the sum and the carry out, the preparation of the machine for the next cycle may be automatic, in the sense that reading the sum out can coincide with encoding the carry in.
In a full addition, the order in which the three inputs, x, y and carry in, are added does not matter. This is unlike the case of a full subtractor, where the order does matter; that is, x − y and y − x differ by a sign. In order that the first step is the same for the full addition and the full subtraction discussed below, the process is started by adding the carry in and the y input digit. The finite-state machine implementation of a full adder goes along lines similar to the combinational circuit implementation by concatenation of two half adders. It is simply the order of adding the three inputs that differs, in order to take advantage of the memory provided by the internal state of the machine. The first step can be summarized by the Boolean equations:

    state(t+1) = carry in ⊕ y, \qquad carry 1 = carry in · y    (9.8)
Table 9.6 Truth table for the first step of the full addition [Eq. (9.8)].

state(t): carry in   y(t): SP pulse   state(t+1) (XOR): midway sum   output(t+1) (AND): carry 1
0 (level 1)          0                0 (level 1)                    0
0 (level 1)          1                1 (level 3)                    0
1 (level 3)          0                1 (level 3)                    0
1 (level 3)          1                0 (level 1)                    1
Fluorescence occurs if the input is (1,1), that is, the carry in was 1 (system in level 3) and the input SP pulse is 1, so that it induces a transition from level 3 to level 1 via the kinetic route. At the next interval the second digit, x, is input as an SP pulse. The truth table is given in Table 9.7, and corresponds to the following logic equations. State(t+2) is the XOR sum of the three inputs (x, y and the carry in):

    state(t+2) = state(t+1) ⊕ x = state(t) ⊕ y ⊕ x = carry in ⊕ y ⊕ x    (9.9)

and corresponds to the sum out given by Eq. (9.1) above. Using a bar to denote negation:

    carry 2 = state(t+1) · x = (carry in ⊕ y) · x = x · y · \overline{carry in} + x · \bar{y} · carry in    (9.10)

The carry out is obtained by reading the fluorescence from level 2, either at time t+1 or at t+2: carry out = carry 1 + carry 2, which corresponds to Eq. (9.4) above.
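The two-step bookkeeping of Eqs. (9.8)–(9.10) can be checked with a few lines of Python (an illustrative sketch, not from the chapter): the machine state carries first the carry in and then the midway sum, an SP pulse toggles the state, and fluorescence (the output) accompanies only the 1 → 0 toggle via the kinetic route.

    def sp_pulse(state, bit):
        """Return (next_state, fluorescence) for one machine interval."""
        if bit == 0:
            return state, 0
        # An SP input toggles the state; only the 1 -> 0 change (the kinetic
        # route through level 2) gives fluorescence.
        return 1 - state, 1 if state == 1 else 0

    def fsm_full_adder(x, y, carry_in):
        state = carry_in                       # carry in pre-encoded in state
        state, carry1 = sp_pulse(state, y)     # first step: add y, Eq. (9.8)
        state, carry2 = sp_pulse(state, x)     # second step: add x, Eq. (9.9)
        sum_out = state                        # sum out left in the state
        return sum_out, carry1 | carry2        # carry out, Eq. (9.4)

    for x in (0, 1):
        for y in (0, 1):
            for cin in (0, 1):
                print((x, y, cin), "-> (sum, carry):", fsm_full_adder(x, y, cin))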
It can now be shown how encoding level 1 as the logical value 1 of the internal state of the machine and level 3 as the logical value 0, while still using an SP pulse as the input, denoted y, leads to different state equations and different machines. With this convention, the following logic equations are obtained:

    state(t+1) = \bar{y}(t) · state(t) + y(t) · \overline{state(t)}    (9.11)

    output(t) = y(t) · \overline{state(t)}    (9.12)
Table 9.7 Truth table for the second step of the full addition [Eqs. (9.9) and (9.10)].

state(t+1): midway sum   x(t+1): SP pulse   state(t+2) (XOR): sum   output(t+2) (AND): carry 2
0 (level 1)              0                  0 (level 1)             0
0 (level 1)              1                  1 (level 3)             0
1 (level 3)              0                  1 (level 3)             0
1 (level 3)              1                  0 (level 1)             1
Table 9.8 Truth table of the machine defined by Eqs. (9.11) and (9.12), with level 3 encoding logical 0 and level 1 encoding logical 1.

state(t)      y(t): SP pulse   state(t+1)    output(t)
0 (level 3)   0                0 (level 3)   0
0 (level 3)   1                1 (level 1)   1 (kinetic)
1 (level 1)   0                1 (level 1)   0
1 (level 1)   1                0 (level 3)   0 (STIRAP)
The equation for the next state corresponds to an XOR operation identical to Eq. (9.8), while the logical equation for the output corresponds to an INH gate (see Table 9.1). The truth table corresponding to the logic Eqs. (9.11) and (9.12) is given in Table 9.8.
This machine can be logically interpreted in two different ways. The first approach is to note that the machine's output monitors the direction of the change of state induced by the input. The output is 1 if the pulse induces the logical change of state 0 → 1. For the change 1 → 0 there is no output. Viewed in this manner [104], the machine is a flip-flop because it maintains a binary state until directed by the input to switch state. Specifically, the machine is similar to a T flip-flop [136] because a single input toggles the state. Flip-flops are key components, as they provide a memory element for storing one bit. The data in Table 9.8 show that the state indeed flips, but the machine has no provision for knowing what its present state is. As discussed above, knowledge of the state of the machine can be readily implemented by applying two SP pulses, one to interrogate the state and one to restore the machine to its initial state. It is in this sense that the machine is endowed with memory.
Another way in which to view the machine represented by Eqs. (9.11) and (9.12) and Table 9.8 is to see it as a half subtractor, where the minuend digit x is encoded in the state of the machine and the subtrahend y as an SP pulse, so that the machine computes x − y. In a half subtractor, the difference is given by the XOR of the two digits (so that it is equivalent to the sum), but instead of a carry, a borrow is needed, which is given by the INH function (see Table 9.1):

    diff = x ⊕ y    (9.13)

    borrow = \bar{x} · y    (9.14)

Therefore, the state at t+1 [Eq. (9.11)] gives the difference, while the output [Eq. (9.12)] gives the borrow. Another way of implementing a half subtractor is discussed in Ref. [96], where the initial convention of level 1 = 0 and level 3 = 1 is maintained but the input is reversed and is now a PS pulse. It can readily be checked that, by encoding x in the state and y as a PS pulse, Eqs. (9.11) and (9.12) are obtained for the next state and for the output.
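Both readings of the machine, as a T flip-flop and as a half subtractor, follow from the same two Boolean equations, as the short Python sketch below illustrates (invented for illustration; it simply encodes Eqs. (9.11)–(9.14)).

    def t_flip_flop(state, y):
        # Eq. (9.11): the next state is state XOR y (a pulse toggles the state);
        # Eq. (9.12): the output is the INH function y AND (NOT state), i.e.
        # only the 0 -> 1 toggle (the kinetic route) produces fluorescence.
        return state ^ y, y & (1 - state)

    def half_subtractor(x, y):
        # Same machine read as computing x - y: the state encodes the minuend
        # x, the next state gives the difference, Eq. (9.13), and the output
        # gives the borrow, Eq. (9.14).
        difference, borrow = t_flip_flop(x, y)
        return difference, borrow

    for x in (0, 1):
        for y in (0, 1):
            print(f"x={x} y={y} -> (difference, borrow) = {half_subtractor(x, y)}")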
There are two ways to implement a full subtractor (see Ref. [96] for details). The first method is by combining two half subtractors, along the lines used for the full adder discussed above. The other method is more interesting because it closely mimics the implementation of the full adder, which means that the same logic device can be used either to add or to subtract. This is what is meant by the ability to program a molecule: the same set of levels and of inputs can be used to implement different logic operations.
9.3.2
Finite-State Machines by Electrical Addressing
A set-reset machine has two inputs. The set input brings the machine to state 1 irrespective of its present state, while the reset input brings it to state 0, and has no effect if it is already in 0. The case where the two inputs are both 1 is not defined. The operation of the set-reset machine is summarized in Table 9.9, and a state diagram is shown in Figure 9.12.

Table 9.9 Operation of the set-reset machine.

Present state   Set input   Reset input   Name of action   Next state
0               0           0             No change        0
0               1           0             set              1
1               1           0             set              1
1               0           0             No change        1
1               0           1             reset            0
0               0           1             reset            0
Here, a single QD tethered in a three-terminal device is considered, which is subjected to a source-drain and to a gate bias. Its discrete level structure is described using the orthodox theory [163–165], which assumes that the discrete level structure of the QD is due solely to the quantization of charge on the dot. The one-electron level spacing of the dot is assumed to be continuous, because it is much smaller than the change in electrostatic energy of the QD that occurs when an electron is added to or removed from it by varying the source-drain or the gate bias. The electrostatic energy of a QD with N electrons in a three-terminal device is given by

    U(N) = \frac{Q^2}{2C_T} = \frac{N^2 e^2}{2C_T} - \frac{Ne}{C_T}\sum_i C_i V_i + \frac{1}{2C_T}\Bigl(\sum_i C_i V_i\Bigr)^2    (9.15)

    \Phi = \frac{1}{C_T}\sum_{i=l,r,g} C_i V_i    (9.16)
Figure 9.13 Electron transfer to/from the left and right electrodes possible for a QD with N electrons in a three-terminal device.

In Eqs. (9.15) and (9.16), Φ is the electrostatic potential and C_T is the total capacitance of the system (C_T = C_l + C_r + C_g, where C_{l,r} are the capacitances of the junctions to the left and right electrodes, and C_g is the capacitance of the gate electrode). V_{l,r} are the source and drain voltages. For a given gate voltage, an electron will be transferred to the dot, or will leave the dot to the left or the right electrode, when one of its discrete levels falls within the energy window opened by the source-drain bias, V_sd, which is the difference between the bias on the right and on the left electrode. As shown in Figure 9.13, if the dot initially possesses N electrons, there are therefore four resonance conditions for electron transfer to/from the left and the right electrodes.
The four resonance conditions are:

    \Delta E_{l \to QD} = -\frac{e}{C_T}\Bigl(\frac{e}{2} + Ne - C_g V_g\Bigr) + \frac{eV}{2}

    \Delta E_{QD \to l} = \frac{e}{C_T}\Bigl(-\frac{e}{2} + Ne - C_g V_g\Bigr) - \frac{eV}{2}

    \Delta E_{r \to QD} = -\frac{e}{C_T}\Bigl(\frac{e}{2} + Ne - C_g V_g\Bigr) - \frac{eV}{2}    (9.17)

    \Delta E_{QD \to r} = \frac{e}{C_T}\Bigl(-\frac{e}{2} + Ne - C_g V_g\Bigr) + \frac{eV}{2}
where V = V_sd and a symmetric junction is assumed, so that C_l = C_r and V_l = −V_r = V/2. For the process to be allowed, ΔE, the free energy difference for adding an electron to, or removing an electron from, the QD, must be ≥ 0. Note that when only the charge on the dot is quantized, U(N) − U(N±1) varies linearly with the applied source-drain and gate bias. The threshold for transferring an electron is given by the resonance condition, ΔE = 0, which allows stability maps of the charged QD to be drawn as a function of the gate and the source-drain bias. A stability map for N = 0, 1 electrons on the dot is shown in Figure 9.14. The areas in gray are the zones where the number of electrons on the QD is stable.
In the orthodox theory [164] the rates of transfer from the QD to the source and the drain electrodes are given by

    \Gamma = \frac{1}{e^2 R}\,\frac{\Delta E}{1 - \exp(-\Delta E/kT)} \;\xrightarrow{\;T \to 0\,\mathrm{K}\;}\; \frac{|\Delta E|}{e^2 R}\,\theta(\Delta E)    (9.18)

where R is the resistance of the junction through which the electron passes (R is inversely proportional to the coupling between the QD and the electrode), and θ is the Heaviside step function.
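A minimal numerical illustration of Eq. (9.18), as reconstructed above, is given by the Python sketch below (units are arbitrary; the e²R prefactor is lumped into a single constant). It shows the characteristic behavior: at T = 0 only energetically allowed transfers (ΔE > 0) have a nonzero rate, while at finite temperature the step is thermally smeared.

    import math

    E2R = 1.0   # e^2 * R, sets the overall rate scale (arbitrary units)

    def gamma(dE, kT):
        if kT == 0.0:
            return max(dE, 0.0) / E2R          # |dE| * theta(dE) / (e^2 R)
        if abs(dE) < 1e-12:
            return kT / E2R                    # smooth dE -> 0 limit
        return (dE / E2R) / (1.0 - math.exp(-dE / kT))

    for dE in (-2.0, -0.5, 0.0, 0.5, 2.0):
        print(f"dE={dE:+.1f}:  T=0 -> {gamma(dE, 0.0):.3f},"
              f"  kT=0.5 -> {gamma(dE, 0.5):.3f}")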
For the implementation of the set-reset, two charge states of the QD are used, namely N+1 and N, where N is the number of extra electrons on the QD. For the simulation shown below, N = 0 and N = 1 were utilized. The logical state 0 of the set-reset machine was encoded as the QD with N = 0 extra electrons, and the logical state 1 of the machine was encoded as the QD with N = 1 extra electrons. From Figure 9.13, it can be seen that there are two rates for adding an electron to a QD with N = 0, Γ_{r→QD} and Γ_{l→QD}, and two rates for removing an electron from a N = 1 QD, Γ_{QD→r} and Γ_{QD→l}. Their analytical forms at T = 0 K follow from Eqs. (9.17) and (9.18):

    \Gamma_{QD \to l},\ \Gamma_{l \to QD} \propto |\Delta E_{QD \to l,\, l \to QD}|\,\theta(\Delta E), \qquad \Gamma_{QD \to r},\ \Gamma_{r \to QD} \propto |\Delta E_{QD \to r,\, r \to QD}|\,\theta(\Delta E)    (9.19)

with the proportionality constants set by the junction resistances R_l and R_r.
These four rates are plotted in Figure 9.15 for a fixed gate voltage, as a function of the source-drain bias, V.
In order for the set-reset machine to operate properly, a set voltage must be chosen such that the rate of transfer of an electron from the left electrode to the QD with N = 0 is much larger than the rate for leaving the dot with N = 1 to the right electrode, so that an electron is added to the dot and stays on the dot for a finite time. For the reset voltage, it is sufficient that the rate of leaving the dot with N = 1 to the left electrode is significant. The rate Γ_{r→QD} corresponds to adding an electron to the QD with N = 0. The operation of the set-reset device is more robust if the resistance of the right junction is much larger than that of the left one.
To check that the set-reset machine operates properly, the probability of having an extra electron on the QD is monitored as a function of time while applying a time-dependent source-drain bias. Defining Q as the probability of having N = 1 extra electrons on the dot, and P_l and P_r as the probabilities for this extra electron to be on the left and on the right electrode, respectively, the following kinetic scheme is obtained:

    \frac{dP_l}{dt} = \Gamma_{QD \to l}\,Q(t) - \Gamma_{l \to QD}\,P_l(t)

    \frac{dQ}{dt} = \Gamma_{l \to QD}\,P_l(t) + \Gamma_{r \to QD}\,P_r(t) - (\Gamma_{QD \to l} + \Gamma_{QD \to r})\,Q(t)    (9.20)

    \frac{dP_r}{dt} = \Gamma_{QD \to r}\,Q(t) - \Gamma_{r \to QD}\,P_r(t)
The time profile of the applied source-drain bias and the result of the integration of the kinetic scheme are shown in Figure 9.16a and b, respectively. The logical state 0 of the device is defined as P_l ≫ Q, while state 1 is defined as Q ≫ P_l. It can be seen that the effect of the set voltage is to fill the dot with one extra electron, whilst applying the reset pulse empties the dot of that extra electron. This shows that a single QD with two electrically addressable discrete levels can operate as a set-reset machine. It has also been shown that varying not only the source-drain but also the gate bias allows a full adder to be implemented on this system [106].
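The kinetic scheme of Eq. (9.20) is straightforward to integrate numerically. The Python sketch below is a toy version of the simulation behind Figure 9.16 (the rate values and pulse timings are invented for illustration and are not those of Ref. [106]): threshold-like rates stand in for Eq. (9.19), a set pulse is applied, then a reset pulse, and the dot occupation Q tracks the logical state.

    # Three-state kinetic scheme of Eq. (9.20): the extra electron is on the
    # left electrode (P_l), on the dot (Q), or on the right electrode (P_r).
    def rates(V):
        # Hypothetical threshold behavior standing in for Eq. (9.19) at T = 0:
        # a positive set bias lets an electron hop from the left electrode onto
        # the dot; a negative reset bias lets it leave the dot to the left. The
        # right junction is taken as much more resistive (small leak rate).
        g_l_to_qd = 5.0 if V > 0.5 else 0.0
        g_qd_to_l = 5.0 if V < -0.5 else 0.0
        g_r_to_qd = 0.0
        g_qd_to_r = 0.1 if V > 0.5 else 0.0
        return g_l_to_qd, g_qd_to_l, g_r_to_qd, g_qd_to_r

    def bias(t):
        """Set pulse for 1 < t < 2, reset pulse for 5 < t < 6, zero otherwise."""
        if 1.0 < t < 2.0:
            return 1.0
        if 5.0 < t < 6.0:
            return -1.0
        return 0.0

    P_l, Q, P_r = 1.0, 0.0, 0.0   # the extra electron starts on the left electrode
    dt = 1e-3
    for step in range(8000):
        t = step * dt
        gl_in, gl_out, gr_in, gr_out = rates(bias(t))
        dPl = gl_out * Q - gl_in * P_l
        dQ = gl_in * P_l + gr_in * P_r - (gl_out + gr_out) * Q
        dPr = gr_out * Q - gr_in * P_r
        P_l, Q, P_r = P_l + dt * dPl, Q + dt * dQ, P_r + dt * dPr
        if step % 2000 == 0:
            print(f"t={t:4.1f}  Q={Q:.3f}  logical state {1 if Q > P_l else 0}")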
This subsection is concluded with a description of the implementation of another form of finite-state machine, a counter, on an array of two QDs anchored on a surface. This implementation is based on a scheme for addressing and/or reading the states of the dots electrically or optically that has been experimentally realized and characterized [162] (see Figure 9.17).
A counter [136] is a machine that is able to accept N inputs and to provide an output for every N inputs. The states of the counter are S_i, i = 0, 1, 2, ..., N−1, and a transition can occur between successive states only when an input is received. After the count of N inputs, the state S_{N−1} is reached, an output is produced, and the next input resets the counter to its initial state, S_0.
In the implementation based on the device shown in Figure 9.17 the index of the
state is determined by the number of extra electrons on the Au QD. This number can
be controlled optically or electrically (further details may be found in Refs. [137, 162]).
The counter functions as follows. First, the CdS QD is irradiated in a solution of
TEA (10^-2 M) so as to charge the Au QD. The irradiation is then stopped. The initial
state, S0, corresponds physically to the Au QD charged with four extra electrons. This
number can be determined using surface plasmon resonance spectroscopy, by
measuring the shift of the plasmon resonance of the gold surface due to the charging
of the Au QD. On each occasion that an input is to be provided, the index of the state must be incremented; this is done by decreasing the potential applied to the Au surface by a step sufficient to discharge an electron onto the surface. The magnitude of the required voltage drop is determined by the capacitance of the Au QD, that is, by the energy needed to charge or discharge the dot by one electron [137]. In the experiment the Au QD is passivated by a ligand, tiopronin, that has a high dielectric constant (16; see Ref. [162]), so that the charging energy is exceptionally low (about 30 meV for an Au QD of 2.3 ± 0.5 nm diameter). When the dot is
fully discharged, after four voltage drops, S4 is reached; this is the last state of the counter, for which there are no extra electrons on the dot. At this point, the counter must be returned to the state S0, so that it is available for the next counting cycle, and an output signal must then be provided. Unlike the usual scheme for counters, here the last input does not reset the counter to the state S0. For this reason, even though for a dot with four extra electrons there are five states, S0, S1, S2, S3, and S4, and modulo 4 is counted rather than modulo 5, the last step, S3 → S4, is used to produce the output and reset the counter. The output is produced by monitoring the disappearance of the plasmon angle shift or by measuring the value of the surface potential. To reset the counter, the CdS dot is irradiated again. It should be noted that, in principle, the maximal number of four extra electrons on the dot is not a limitation, since up to 15 oxidation states of monolayer-protected Au QDs have been reported [166]. It is possible, therefore, to implement counters with a higher value of N.
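The counting logic just described can be summarized in a few lines of code. In the sketch below the physical machinery (plasmon read-out, voltage steps, re-irradiation) is abstracted into a single state variable holding the number of extra electrons on the Au QD; the one-input-per-discharge-step mapping follows the text, while everything else is an illustrative assumption.

```python
class QDCounter:
    """Modulo-4 counter: state S_i corresponds to (4 - i) extra electrons."""

    def __init__(self):
        self.electrons = 4        # state S0: the dot holds four extra electrons

    def pulse(self):
        """One input: one voltage step discharges one electron."""
        self.electrons -= 1       # transition S_i -> S_{i+1}
        if self.electrons == 0:   # the step S3 -> S4 produces the output...
            self.electrons = 4    # ...and re-irradiation resets the dot to S0
            return 1
        return 0

c = QDCounter()
print([c.pulse() for _ in range(12)])   # -> [0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1]
```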
9.4
Perspectives
The entire discussion in this chapter is based on the premise that there is a desire to design molecule-based logic circuits and not only switches. Results to date encourage the following ongoing directions:
. Technical studies in progress exploit these results towards increasing the logical capacity and depth (= number of switches) that can be implemented on a single molecule, or on a supramolecular assembly, by application to multifunctional-group molecules where the intramolecular dynamics are used to concatenate the logical operations carried out separately by the different groups. Next in the order of integration is the assembly and concatenation of an array of molecules or an array of quantum dots.
. Further studies are also needed to take even greater advantage of the large number of quantum states available, in a hierarchical order (electronic, backbone vibrations, torsions, rotations), which allows the processing in one cycle of far more information than a binary (classical or quantum) gate and, in the same direction, the use of more sophisticated optical and electrical inputs and readouts.
The first results to reach the level of technological implementation will most likely be the use of a single molecule not as a switch but rather as a combinational circuit. This will likely happen in the context of the architecture of a 2-D cross-bar array, which is the favored device geometry as foreseen by Hewlett-Packard and others. However, even this progress will take time before it becomes a technology. The essential difference to be advocated is that at each node is placed not a switch but a single molecule acting as the equivalent of an entire network of switches. The very fast logic is conducted within the node, but the slower, wire-mediated communication between the nodes will remain. In the second round, communication between the nodes will be carried out by concatenation through self-assembly of the array using molecular recognition. Part of this endeavor is to achieve realistic programming abilities with special reference to selective intramolecular dynamics.
The key further breakthroughs that are currently required include:
. The design of molecular logic circuits that can be cycled reliably many times, and exploration of whether this can be done using all-optical schemes.
. Input/output operations that reduce dissipation and allow fan-out and a macroscopic interface, with special reference to the use of pulse shaping, electrical read/write, and integrated storage within the logic unit.
Beyond what is already available, it will be necessary to improve concatenation in order to reduce not only the need for cycling but also the need for interfacing with the macroscopic world. This will in turn lead to a need for molecular systems designed with special reference to devices on surfaces and their application as logic units.
References
1 C. Joachim, J. K. Gimzewski, A. Aviram,
Nature 2000, 408, 541.
2 C. P. Collier, E. W. Wong, M. Belohradský,
F. M. Raymo, J. F. Stoddart, P. J. Kuekes,
R. S. Williams, J. R. Heath, Science 1999,
285, 391.
3 C. P. Collier, G. Mattersteig, E. W. Wong,
Y. Luo, K. Beverly, J. Sampaio, F. M.
Raymo, J. F. Stoddart, J. R. Heath, Science
2000, 289, 1172.
4 P. R. Ashton, R. Ballardini, V. Balzani,
A. Credi, K. R. Dress, E. Ishow, C. J.
Kleverlaan, O. Kocian, J. A. Preece,
N. Spencer, J. F. Stoddart, M. Venturi,
S. Wenger, Chem. Eur. J. 2000, 6, 3558.
5 M. A. Reed, J. M. Tour, Sci. Am. 2000, 282,
86.
6 R. M. Metzger, Acc. Chem. Res. 1999, 32,
950.
7 R. M. Metzger, J. Mater. Chem. 2000, 10,
55.
8 A. P. de Silva, N. D. McClenaghan, J. Am.
Chem. Soc. 2000, 122, 3965.
9 A. P. de Silva, Y. Leydet, C. Lincheneau,
N. D. McClenaghan, J. Phys.: Condens. Matter
2006, 18, S1847.
10 Y. Luo, C. P. Collier, J. O. Jeppesen, K. A.
Nielsen, E. DeIonno, G. Ho, J. Perkins,
H.-R. Tseng, T. Yamamoto, J. F. Stoddart,
J. R. Heath, ChemPhysChem 2002, 3, 519.
11 V. Balzani, A. Credi, M. Venturi,
ChemPhysChem 2003, 4, 49.
12 J. M. Tour, Molecular Electronics, World
Scientific, River Edge, USA, 2003.
13 T. Nakamura, Chemistry of Nanomolecular
Systems: Towards the Realization of
Molecular Devices, Volume 70, Springer,
Berlin, 2003.
14 F. M. Raymo, Adv. Mater. 2002, 14, 401.
31 A. Okamoto, K. Tanaka, I. Saito, J. Am.
Chem. Soc. 2004, 126, 9458.
32 P. D. Tougaw, C. S. Lent, J. Appl. Phys.
1994, 75, 1818.
33 A. O. Orlov, I. Amlani, G. H. Bernstein,
C. S. Lent, G. L. Snider, Science 1997, 277,
928.
34 I. Amlani, A. O. Orlov, G. Toth, G. H.
Bernstein, C. S. Lent, G. L. Snider, Science
1999, 284, 289.
35 L. Boni, M. Gattobigio, G. Iannaccone,
M. Macucci, J. Appl. Phys. 2002, 96, 3169.
36 H. Qi, S. Sharma, Z. H. Li, G. L. Snider,
A. O. Orlov, C. S. Lent, T. P. Fehlner, J. Am.
Chem. Soc. 2003, 125, 15250.
37 J. Twamley, Phys. Rev. A 2003, 67, 052328.
38 D. L. Klein, R. Roth, A. K. L. Lim, A. P.
Alivisatos, P. L. McEuen, Nature 1997,
389, 699.
39 W. Liang, M. P. Shores, J. R. Long, M. Bockrath, H. Park, Nature 2002, 417, 725.
40 D. N. Weiss, X. Brokmann, L. E. Calvet,
M. A. Kastner, M. G. Bawendi, Appl. Phys.
Lett. 2006, 88, 143507.
41 A. Nitzan, M. A. Ratner, Science 2003, 300,
1384.
42 J. R. Heath, M. A. Ratner, Physics Today
2003, 56, 43.
43 A. H. Flood, J. F. Stoddart, D. W.
Steuerman, J. R. Heath, Science 2004, 306,
2055.
44 C. Joachim, M. A. Ratner, Nanotechnology
2004, 15, 1065.
45 J. Fiurasek, N. J. Cerf, I. Duchemin,
C. Joachim, Physica E 2004, 24, 161.
46 S. Ami, M. Hliwa, C. Joachim, Chem.
Phys. Lett. 2003, 367, 662.
47 R. Stadler, S. Ami, M. Forshaw, C.
Joachim, Nanotechnology 2002, 13,
424.
48 A. Bezryadin, C. Dekker, Appl. Phys. Lett.
1997, 71, 1273.
49 S. Karthauser, E. Vasco, R. Dittmann,
R. Waser, Nanotechnology 2004, 15, S122.
50 C. R. Barry, J. Gu, H. O. Jacobs, Nano Lett.
2005, 5, 2078.
51 B. Lussem, L. Muller-Meskamp, S.
Karthauser, R. Waser, Langmuir 2005, 21,
5256.
52 J. J. Urban, D. V. Talapin, E. V.
Shevchenko, C. B. Murray, J. Am. Chem.
Soc. 2006, 128, 3248.
53 R. P. Feynman, Feynman Lectures on
Computations, reprint with corrections,
Perseus Publishing, Cambridge, MA,
1999.
54 D. Deutsch, Proc. R. Soc. Lond. A 1989,
425, 73.
55 D. P. DiVincenzo, Proc. R. Soc. Lond. A
1998, 454, 261.
56 R. Cleve, A. Ekert, C. Macchiavello, M.
Mosca, Proc. R. Soc. Lond. A 1998, 454,
339.
57 A. Ekert, R. Jozsa, Proc. R. Soc. Lond. A
1998, 356, 1769.
58 R. Jozsa, Proc. R. Soc. Lond. A 1998, 454,
323.
59 D. Loss, D. P. DiVincenzo, Phys. Rev. A
1998, 57, 120.
60 G. Burkard, D. Loss, D. P. DiVincenzo,
Phys. Rev. B 1999, 59, 2070.
61 K. R. Brown, D. A. Lidar, K. B. Whaley,
Phys. Rev. A 2001, 65, 012307.
62 C. H. Bennett, IBM J. Res. 1973, 17,
525.
63 C. H. Bennett, Int. J. Theoret. Phys. 1982,
21, 905.
64 D. Cory, A. Fahmy, T. Havel, Proc. Natl.
Acad. Sci. USA 1997, 94, 1634.
65 L. K. Grover, in Proceedings 28th ACM
Symposium on the Theory of Computing,
1996.
66 D. Deutsch, R. Jozsa, Proc. R. Soc. Lond. A
1992, 439, 553.
67 L. K. Grover, Phys. Rev. Lett. 1997, 79,
4709.
68 T. Tulsi, L. K. Grover, A. Patel, Quant. Inf.
Comp. 2006, 6, 483.
69 C. H. Bennett, F. Bessette, G. Brassard,
L. Salvail, J. Smolin, J. Cryptol. 1992,
5, 3.
70 P. W. Shor, in: S. Goldwasser (Ed.),
Proceedings, 35th Annual Symposium on
the Foundations of Computer Science,
IEEE Computer Society Press, Los
Alamitos, CA, 1994.
71 A. Ekert, R. Jozsa, Rev. Mod. Phys. 1996,
68, 733.
113 G. C. Schatz, M. A. Ratner, Quantum
Mechanics in Chemistry, Prentice-Hall,
New York, 1993.
114 M. M. Mano, C. R. Kime, Logic and
Computer Design Fundamentals,
Prentice-Hall, Upper Saddle River, NJ,
2000.
115 F. Remacle, R. Weinkauf, R. D. Levine, J.
Phys. Chem. A 2006, 110, 177.
116 W. Cheng, N. Kuthirummal, J. Gosselin,
T. I. Solling, R. Weinkauf, P. Weber, J.
Phys. Chem. A 2005, 109, 1920.
117 L. Lehr, T. Horneff, R. Weinkauf, E. W.
Schlag, J. Phys. Chem. A 2005, 109,
8074.
118 R. Weinkauf, L. Lehr, A. Metsala, J. Phys.
Chem. A 2003, 107, 2787.
119 R. B. Bernstein, J. Phys. Chem. 1982, 86,
1178.
120 R. B. Bernstein, Chemical Dynamics via
Molecular Beam and Laser Techniques,
Oxford University Press, New York, 1982.
121 G. Viswanath, M. Kasha, J. Chem. Phys.
1956, 24, 574.
122 M. Beer, H. C. Longuet-Higgins, J. Chem.
Phys. 1955, 23, 1390.
123 J. W. Sidman, D. S. McClure, J. Chem.
Phys. 1955, 24, 757.
124 S. Speiser, Chem. Rev. 1996, 96, 1953.
125 I. Kaplan, J. Jortner, Chem. Phys. Lett.
1977, 52, 202.
126 I. Kaplan, J. Jortner, Chem. Phys. 1978, 32,
381.
127 S. Speiser, Appl. Phys. B 1989, 49, 109.
128 S. Speiser, N. Shakkour, Appl. Phys. B
1985, 38, 191.
129 M. Orenstein, S. Kimel, S. Speiser, Chem.
Phys. Lett. 1978, 58, 582.
130 N. Lokan, M. N. Paddon-Row, T. A. Smith,
M. LaRosa, K. P. Ghiggino, S. Speiser, J.
Am. Chem. Soc. 1999, 121, 2917.
131 S. Speiser, F. Schael, J. Mol. Liq. 2000,
86, 25.
132 S. Speiser, Opt. Commun. 1983, 45, 84.
133 U. Peskin, M. Abu-Hilu, S. Speiser,
Optical Mater. 2003, 24, 23.
134 S. Speiser, J. Luminescence 2003, 102, 267.
135 T. L. Booth, Sequential Machines and
Automata Theory, Wiley, New York, 1968.
II
Architectures and Computational Concepts
10
A Survey of Bio-Inspired and Other Alternative Architectures
Dan Hammerstrom
10.1
Introduction
Since the earliest days of the electronic computer, there has always been a small group
of people who have seen the computer as an extension of biology, and have
endeavored to build computing models and even hardware that are inspired by,
and in some cases are direct copies of, biological systems. Although biology spans a
wide range of systems, the primary model for these early efforts has been neural
circuits. Likewise, in this chapter the discussion will be limited to neural
computation.
Several examples of these early investigations include McCulloch and Pitts' logical calculus of nervous system activity [2], Steinbuch's "Die Lernmatrix" [3], and Rosenblatt's Perceptron [4]. At the same time, an alternative approach to intelligent computing, Artificial Intelligence (AI), which relied on higher-order symbolic functions, such as structured and rule-based representations of knowledge, began to demonstrate significantly greater success than the neural approach. In 1969, Minsky and Papert [5] of the Massachusetts Institute of Technology published a book that was critical of the then-current bio-inspired algorithms, and which succeeded in eventually ending most research funding for that approach. Consequently, significant research funding was directed towards AI, and the field subsequently flourished.
The AI approach, which relied on symbolic reasoning often represented by a first-order calculus and sets of rules, began to exhibit real intelligence, at least on toy problems. One reasonably successful application was the expert system, and there was even the development of a complete language, Prolog, dedicated to logical rule-based inference.
A few expert system successes were also enjoyed in actually fielded systems, such as Soar [6], the development of which was started by Alan Newell's group at Carnegie Mellon University. However, by the 1980s AI in general was beginning to lose its luster after 40 years of funding with ever-diminishing returns.
Since the 1960s, however, there have always been groups that continued to study biologically inspired algorithms, and two such projects (mostly as a result of their
being in the right place at the right time) had a huge impact which re-energized the field and led to an explosion of research and funding. The first project incorporated the investigations [7] of John Hopfield, a physicist at Caltech, who proposed a model of auto-associative memory based on physical principles such as the Ising theory of spin-glasses. Although Hopfield nets were limited in capability and size, and others had proposed similar algorithms previously, Hopfield's formulation was both clean and elegant. It also succeeded in bringing many physicists, armed with sophisticated mathematical tools, into the field. The second project was the invention of the back-propagation algorithm by Rumelhart, Hinton, and Williams [8]. Although there, too, similar studies had been conducted previously [9], the difference with Rumelhart and colleagues was that they were cognitive scientists creating a set of techniques called parallel distributed processing (PDP) models of cognitive phenomena, where back-propagation was a part of a larger whole.
At this point, it would be useful to present some basic neuroscience, followed by
details of some of the simpler algorithms inspired by this biology. This information
will provide a strong foundation for discussing various biologically inspired hardware
efforts.
10.1.1
Basic Neuroscience
One of the first biologically inspired models was the perceptron which, although very simple, was still based on biological neurons. The primary goal of a perceptron is to separate its input vectors into two classes; its output is
$$O = f\,(W^{T} X)$$

where f(·) is an activation function, here a step function (if sum > 0, then f(sum) = 1); however, f(·) can also be a smooth function (see below). A layer has some number (two or more) of perceptrons, each with its own weight vector and individual output value, leading to a weight matrix and an output vector. In a single layer of perceptrons, each one sees the same input vector.
The Delta Rule, which is used during learning, is

$$\Delta w_{ij} = \alpha\,(d_j - o_j)\,x_i$$

where $d_j$ is the desired output, and $o_j$ is the actual output.
The delta rule is fundamental to most adaptive neural network algorithms. Rosenblatt proved that if a data set is linearly separable, the perceptron will eventually find a plane that separates the set. Figure 10.3 shows the two-dimensional (2-D) layout of data for, first, a linearly separable set of data and, second, for a non-linearly separable set. Unfortunately, if the data are not linearly separable the perceptron fails miserably, and this was the point of "...the book that killed neural networks", Perceptrons by Marvin Minsky and Seymour Papert (1969). Perceptrons cannot solve non-linearly separable problems; neither do they function in the kind of multiple-layer structures that may be able to solve non-linear problems, as the algorithm is such that the output layer cannot tell the middle layer what its desired output should be. Attention is now turned to a description of the multi-layer perceptron.
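A compact sketch of a perceptron trained with the delta rule given above may make this concrete; the learning rate, the random initialization, and the choice of the (linearly separable) AND problem are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
X = np.hstack([X, np.ones((4, 1))])      # fold the bias into the input vector
d = np.array([0, 0, 0, 1], dtype=float)  # logical AND: linearly separable
w = rng.normal(size=3)                   # one weight vector = one perceptron
alpha = 0.2                              # learning rate

for _ in range(50):                      # a few epochs suffice for this set
    for x, target in zip(X, d):
        o = 1.0 if w @ x > 0 else 0.0    # step activation: O = f(w^T x)
        w += alpha * (target - o) * x    # the delta rule

print([1.0 if w @ x > 0 else 0.0 for x in X])   # -> [0.0, 0.0, 0.0, 1.0]
```

By Rosenblatt's convergence result, the loop above is guaranteed to find a separating plane for any linearly separable data set; for a non-separable case such as XOR it would cycle forever.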
10.1.3
A Slightly More Complex Neural Model: The Multiple Layer Perceptron
Even two-level networks can approximate complex, non-linear functions. Moreover, this technique generally finds good solutions, which are compact, leading to fast, feed-forward (non-learning) execution time. Although it has been shown to approximate Bayesian decisions (i.e. it results in a generally good estimate of where Bayes techniques would put the decision surface), it can have convergence problems due to many local minima. It is also computationally intensive, often taking days to train with complex, large feature sets.
10.1.4
Auto-Association
operation plays the role of competitive lateral inhibition, which is a major component
in all cortical circuits. In the BCPNN model of Lansner and his group [11], the nodes are divided into hypercolumns, typically with $\sqrt{N}$ nodes in each of the $\sqrt{N}$ columns, and with 1-WTA being performed in each column.
An auto-associative network starts with the associative model just presented and feeds the output back to the input, so that x and y are in the same vector space and l = k. This auto-associative model is called an attractor model, in that its state space creates an energy surface with most minima (attractor basins) occurring when the state is equal to a training vector. Under certain conditions, given an input vector x′, the output vector y that has the largest conditional probability P(x′|y) is the most likely training vector in a Bayesian sense. It is possible to define a more complex version with variable weights, as would be found during dynamic learning, which also allows the incorporation of prior probabilities [12].
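A minimal sketch of such an attractor network is shown below, using a Hopfield-style outer-product weight matrix as one concrete (assumed) choice of associative memory; feeding the thresholded output back as the next input relaxes a corrupted state toward the nearest stored training vector.

```python
import numpy as np

rng = np.random.default_rng(2)
patterns = rng.choice([-1.0, 1.0], size=(3, 64))   # stored training vectors
W = patterns.T @ patterns                          # outer-product learning
np.fill_diagonal(W, 0.0)                           # no self-connections

x = patterns[0].copy()
x[:10] *= -1                     # corrupt the first pattern (flip 10 bits)
for _ in range(10):              # feed the output back until the state settles
    x = np.sign(W @ x)
print(np.array_equal(x, patterns[0]))   # usually True: the attractor is reached
```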
10.1.5
The Development of Biologically Inspired Hardware
With BP and other non-linear techniques in hand, research groups began to solve more complex problems. Concurrent with this there was an explosion in neuroscience, enabled by high-performance computing and sophisticated experimental technologies, coupled with an increasing willingness in the neuroscience community to begin to speculate about the function of the neural circuits being studied. As a result, research into artificial neural networks (ANNs) of all types gained considerable momentum during the late 1980s, continuing until the mid-1990s when the research results began to slow down. However, like AI before it (and fuzzy logic, which occurred concurrently), ANNs had trouble in scaling to solve the difficult problems in intelligent computing. Nevertheless, ANNs still constitute an important area of research, and ANN technologies play a key role in a number of real-world applications [13, 14]. In addition, they are responsible for a number of important breakthroughs.
During the heady years of the late 1980s and early 1990s, while many research
groups were investigating theory, algorithms, and applications, others began to
examine hardware implementation. As a consequence, there quickly evolved three
schools of thought, though with imprecise dividing lines between them:
. The first concept was to build very specialized analog chips where, for the most part, the algorithms were hard-wired into silicon. Perhaps the best known was the aVLSI (low-power analog VLSI) technology developed by Carver Mead and his students at Caltech.
. The second concept was to build more general, highly parallel digital, but still fairly specialized chips. Many of the ANN algorithms were very compute-intensive, and it seemed that simply speeding up algorithm execution, and especially the learning phase, would be a big help in solving the more difficult problems and in the commercialization of ANN technology. During the late 1980s and early 1990s these chips were also significantly faster than mainstream desktop technology; however, this second group of chips incorporated less biological realism than the analog chips.
. The third option was to use off-the-shelf hardware, digital signal processing (DSP) and media chips, and this ultimately was the winning strategy. This approach was successful because the chips were used in a broader set of applications and had manufacturing volume and software inertia in their favor. Their success was also assisted by Amdahl's law (see Section 10.2.1).
The aim of this chapter is to review examples of these biologically inspired chips in
each of the main categories, and to provide detailed discussions of the motivation for
these chips, the algorithms they were emulating, and architecture issues. Each of the
general categories presented is discussed in greater detail as appropriate. Finally, with
the realm of nano- and molecular-scale technology rapidly approaching, the chapter
concludes with a preview of the future of biologically inspired hardware.
10.2
Early Studies in Biologically Inspired Hardware
The hardware discussed in this chapter is based on neural structures similar to those
presented above, and, as such, is designed to solve a particular class of problems that
are sometimes referred to as "intelligent computing". These problems generally involve the transformation of data across the boundary between the real world and the digital world, in essence from sensor readings to symbolic representations usable by a computer; indeed, this boundary has been called the "digital seashore".1) Such transformations are found wherever a computer is sampling and/or acting on real-world data. Examples include the computer recognition of human speech, computer vision, textual and image content recognition, robot control, optical character recognition (OCR), automatic target recognition, and so on. These are difficult problems to solve on a computer, as they require the computer to find complex structures and relationships in massive quantities of low-precision, ambiguous, and noisy data. These problems are also very important, and an inability to solve them adequately constitutes a significant barrier to computer usage. Moreover, the list of ideas has been exhausted, as neither AI, ANNs, fuzzy logic, nor Bayesian networks2) have yet enabled robust solutions.
At the risk of oversimplifying a complex family of problems, the solution to these problems will, somewhat arbitrarily, be partitioned into two domains: the front end and the back end (see Figure 10.4):
. Front-end operations involve more direct access to a signal, and include filtering and feature extraction.
. Back-end operations are more "intelligent", and include storing abstract views of objects or inter-word relationships.
In moving from front end to back end, the computation becomes increasingly interconnect-driven, leveraging ever-larger amounts of diffuse data at the synapses for the connections. Much has been learned about the front end, where the data are input to the system and where there are developments in traditional as well as neural implementations. Whilst these studies have led to a useful set of tools and techniques, they have not solved the whole problem, and consequently more groups are beginning to examine the back end (the realm of the cerebral cortex) as a source of inspiration for solving the remainder of the problem. Moreover, as difficult as the front-end problems are, the back-end problems are even more so. One manifestation of this difficulty is the "perception gap" discussed by Lazzaro and Wawrzynek [15], where the feature representations produced by more biologically inspired front-end processing are incompatible with existing back-end algorithms.
A number of research groups are beginning to refer to this back end as intelligent signal processing (ISP), which augments and enhances existing DSP by incorporating contextual and higher-level knowledge of the application domain into the data transformation process. Simon Haykin (McMaster University) and Bart Kosko (USC) were editors of a special issue of the Proceedings of the IEEE [16] on ISP, and in their introduction stated:
ISP uses learning and other smart techniques to extract as much
information as possible from signal and noise data.
If you are classifying at Bayes optimal rates and you are still not solving the problem, what do you do next? The solution is to add more knowledge of the process being classified to your classification procedures, which is the goal of ISP. One way to do this is to increase the contextual information (e.g. higher-order relationships such as sentence structure and word meaning in a text-based application) available to the algorithm. It is these complex, higher-order relationships that are so difficult for us to communicate to existing computers and, subsequently, for them to utilize efficiently when processing signal data.
They grow very large if the probability space is complex, such as when pairs of
symbols are modeled rather than single symbols; yet most human language has
very high order structure.
The key point here is that most neuro-inspired silicon (and in particular the analog-based components) is primarily focused on the front-end DSP part of the problem, since robust, generic back-end algorithms (and subsequently hardware) have eluded identification. It has been argued by some that if the front end were performed correctly, then the back end would be easier; but whilst it is always easier to do better in front-end processing, the room for improvement is smaller there. Without robust back-end capabilities, general solutions will be more limited.
10.2.1
Flexibility Trade-Offs and Amdahl's Law
During the 1980s and early 1990s, when most of the hardware surveyed in this chapter was created, there was a perception that the algorithms were large and complex and therefore slow to emulate on the computers of that time. Consequently, specialized hardware to accelerate these algorithms was required for successful applications. What was not fully appreciated by many was that the performance of general-purpose hardware was increasing faster than Moore's law, and that the existing neural algorithms did not scale well to the large sizes that would have fully benefited from special-purpose hardware. The other problem was Amdahl's law.
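Although the chapter does not spell the formula out, a standard statement of Amdahl's law is useful here: if a fraction $p$ of an application's work is accelerated by a factor $s$, the overall speed-up is bounded by

$$S = \frac{1}{(1-p) + p/s},$$

so that accelerating even 80% of an application by a factor of 100 yields an overall speed-up of only about 4.8; the unaccelerated remainder dominates. This is precisely the trap awaiting hardware that speeds up only one stage of a pattern-recognition pipeline.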
As discussed above, these models have extensive concurrency which naturally leads to massively parallel implementations. The basic computation in these models is the multiply-accumulate operation that forms the core of almost all DSP, and which can be performed with minimal, fixed-point, precision. Also during the early 1990s, when many of the studies on neural-inspired silicon were carried out, microprocessor technology was actually not fast enough for many applications.
The problem is that neural network silicon is highly specialized, and there are specific risks involved in its development. One way to conceptualize the trade-offs involved in designing custom hardware is shown in Figure 10.5. Although cost-performance3) can be measured, flexibility cannot be assessed as easily, and so the graph in Figure 10.5 is more conceptual than quantitative. The general idea is that the more a designer hard-wires an algorithm into silicon, the better the cost-performance of the device, but the less flexible it is. The line, which is moving slowly to the right according to Moore's law, shows these basic trade-offs and is, incidentally, not likely to be linear in the real world.
It should not be concluded from the discussions so far that specialized chips are never economically viable. Rather, the continued success of graphics processors and DSPs provides examples of specialized high-volume chips, and some neural network chips4) have found very successful niches. Nonetheless, the story does illustrate some of the problems involved in architecting a successful niche chip. An example is the commercial DSP chips used for signal processing and related applications; these provide unique cost-performance, efficient power utilization, and just the right amount of specialization in their niche to hold their own in volume applications against general-purpose processors. In addition, they have enough volume and history to justify a significant software infrastructure.
In light of what is now known about Amdahl's law and ISP, the history and state of the art of neuro-inspired silicon can now be surveyed.
10.2.2
Analog Very-Large-Scale Integration (VLSI)
There is no question that the most elegant implementation technique developed for neural emulation is the sub-threshold CMOS technology pioneered by Carver Mead and his students at Caltech [21]. Most MOS field-effect transistors (MOSFETs) used in digital logic are operated in two modes, either off (0) or on (1). For the off state the gate voltage is more or less zero and the channel is completely closed. For the on state, the gate voltage is significantly above the transistor threshold and the channel is saturated. A saturated on state works fine for digital logic, and is generally desired to maximize current drive. However, the limited gain in that regime restricts the effectiveness of the device in analog computation. This is because the more gain the device has, the easier it is to leverage this gain to create circuits that perform useful computation and which are also insensitive to temperature and device variability.
However, if the gate voltage is positive (for the nMOS gate) but below the point where the channel saturates, the FET is still on, though with a much lower current. In this mode, which is sometimes referred to as weak inversion, there is useful gain, and the small currents significantly lower the power requirements, though FETs operating in this mode tend to be slower. Carver Mead's great insight was that when modeling biologically inspired circuits, significant computation could be carried out using simple FETs operating in the sub-threshold regime where, like real neurons, performance resulted from parallelism and not the speed of the switching devices. Moreover, as Carver and colleagues have shown, these circuits do a very good job of approximating a number of neuroscience functions.
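For context, a standard first-order expression for the weak-inversion drain current (textbook device physics, not taken from the original chapter) is

$$I_D \approx I_0 \exp\!\left(\frac{V_{GS}}{n V_T}\right), \qquad V_T = \frac{kT}{q} \approx 26\ \mathrm{mV\ at\ room\ temperature},$$

where $I_0$ is a process-dependent current scale and $n$ (typically between 1 and 2) is the slope factor. The exponential dependence on the gate voltage is what provides useful gain at very small currents, and it is closely analogous to the exponential current-voltage characteristics of biological membrane channels that Mead sought to emulate.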
By using analog voltage and currents to represent signals, the considerable expense
of converting signal data into digital, computing the various functions in digital, and
then converting the signal data back to analog, was eliminated. Neurons operate
slowly and are not particularly precise, yet when combined appropriately they
perform complex and remarkably precise computations. The goal of the aVLSI
4) One example is General Vision; http://www.general-vision.com/.
research community has been to create elegant VLSI sub-threshold circuits that
approximate biological computation.
One of the first chips developed by Carver and colleagues was the silicon retina [22]. This was
an image sensor that performed localized adaptive gain control and temporal/spatial
edge detection using simple local neighborhood functional extensions to the basic
photosensitive cell. There subsequently followed a silicon cochlea and numerous
other simulations of biological circuits.
Two examples of these circuits are shown in Figure 10.6. A transconductance amplifier (voltage input, current output) and an integrate-and-fire neuron are two of the most basic building blocks for this technology. The current state of aVLSI research is very well described by Douglas [23], of the Neuroinformatics Institute, ETH Zurich:
Fifteen years of Neuromorphic Engineering: progress, problems, and prospects. Neuromorphic engineers currently design and fabricate artificial neural systems: from adaptive single-chip sensors, through reflexive sensorimotor systems, to behaving mobile robots. Typically, knowledge of biological architecture and principles of operation are used to construct a physical emulation of the target neuronal system in an electronic medium such as CMOS analog very large scale integrated (aVLSI) technology.
Initial successes of neuromorphic engineering have included
smart sensors for vision and audition; circuits for non-linear
adaptive control; non-volatile analog memory; circuits that
provide rapid solutions of constraint-satisfaction problems such
as coherent motion and stereo-correspondence; and methods
for asynchronous event-based communication between analog
computational nodes distributed across multiple chips.
These working chips and systems have provided insights into the
general principles by which large arrays of imprecise processing
elements could cooperate to provide robust real-time computation of
sophisticated problems. However, progress is retarded by the small
size of the development community, a lack of appropriate high-level
conguration languages, and a lack of practical concepts of neuronal
computation.
Although still a modest-sized community, research continues in this area, the largest group being that at ETH in Zurich. The commercialization of this technology has been limited, however, with the most notable success to date being that of Synaptics, Inc. This company created several products which used the basic aVLSI technology, the most successful being the first laptop touch pads.
10.2.3
Intel's Analog Neural Network Chip and Digital Neural Network Chip
During the heyday of neural network silicon, between 1986 and 1996, a major semiconductor vendor, Intel, produced two neural network chips. The first, the ETANN [24] (Intel part number 80170NX), was completely analog, but was designed as a general-purpose chip for non-linear feed-forward ANN operation. There were two grids of analog inner-product networks, each with 80 inputs and 64 outputs, and a total of 10 K (5 K for each grid) weights. The chip computed the two inner products simultaneously, taking about 5 µs for the entire operation. This resulted in a total performance (feed-forward only, no learning) of over two billion connections computed per second, where a connection is a single multiply-accumulate of an input-weight pair. All inputs and outputs were analog. The weights were analog voltages stored on floating gates, with the chip being developed and manufactured by the flash memory group at Intel. Complementary signals for each input provided positive and negative inputs. An analog multiplier was used to multiply each input by a weight, current summation of the multiplier outputs provided the accumulation, and the output was sent through a non-linear amplifier (giving roughly a sigmoid function) to the output pins.
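As a quick consistency check (the arithmetic is added here), the quoted throughput follows directly from the array dimensions and the evaluation time, and shows why that time must be of the order of microseconds:

$$\frac{2 \times 80 \times 64\ \text{connections}}{5\ \mu\mathrm{s}} = \frac{10\,240\ \text{connections}}{5 \times 10^{-6}\ \mathrm{s}} \approx 2 \times 10^{9}\ \text{connections per second}.$$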
Although not designed specifically for real-time learning, it was possible to carry out "chip-in-the-loop" learning, where incremental modification of the weights was performed in an approximately stochastic fashion. Learning could also be done off-line and the weights then downloaded to the chip.
The ETANN chip had very impressive computational density, although the awkward learning and totally analog design made it somewhat difficult to use. The multipliers were non-linear, which made the computation sensitive to temperature and voltage fluctuations. Ultimately, Intel retired the chip and moved to a significantly more powerful and robust all-digital chip, the Ni1000.
The Ni1000 [25, 26] implemented a family of algorithms based on radial basis
function networks (RBF [26]). This family included a variation of a proprietary
algorithm created by Nestor, Inc., a neural network algorithm and software company.
Rather than performing incremental gradient-descent learning, as in the BP algorithm, the Ni1000 used more of a template approach, where each node represented a basis vector in the input space. The width of these regions, which was controlled by varying the node threshold, was reduced incrementally when errors were made, allowing the chip to start with crude over-generalizations of an input-to-output space mapping, and then fine-tune the mapping to capture more complex variations as more data are input. An input vector would then be compared to all the basis vectors, with the closest basis vector being the winner. The chip performed a number of basis computations concurrently, and then, also concurrently, determined the classification of the winning output; both functions were performed by specialized hardware.
The Ni1000 was a two-layer architecture. All arithmetic was digital, and the network parameters/weights were stored in Flash EEPROM. The first, or hidden, layer had 256 inputs of 16 bits each, with 16-bit weights. The hidden layer had 1024 nodes and the second, or output, layer 64 nodes (classes). The hidden layer used 5- to 8-bit precision for inputs and weights. The output layer used a special 16-bit floating-point format. One usage model was that of Bayesian classification, where the hidden layer learns an estimate of a probability density function (PDF) and the output layer classifies certain regions of that PDF into up to 64 different classes. At 40 MHz the chip was capable of over 10 billion connection computations per second, evaluating the entire network 40 K times per second with roughly 4 W peak power dissipation.
The Ni1000 used a powerful, compact, specialized architecture (unfortunately, space limitations prevent a more detailed description here, but the interested reader is referred to Refs. [25, 26]). The Ni1000 was much easier to use than the ETANN and provided very fast dedicated functionality. However, referring back to Figure 10.5, this chip was a specific family of algorithms wired into silicon. Having a narrower functionality, it was much more at risk from Amdahl's law, as it was speeding up an even smaller part of the total problem. Like CNAPS (the Connected Network of Adaptive Processors), it too was ultimately run over by the silicon steam roller.
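The template style of classification described above is easy to sketch in software. In the toy version below, each hidden node stores a prototype (basis vector) and a width; an input activates the nodes whose regions cover it, and the class of the closest activated prototype wins. The shapes match the figures quoted for the Ni1000 (256 inputs, 1024 hidden nodes, 64 classes), but the Euclidean metric, the width values, and the random data are illustrative assumptions rather than the chip's actual design.

```python
import numpy as np

rng = np.random.default_rng(0)
prototypes = rng.normal(size=(1024, 256))    # one basis vector per hidden node
widths = np.full(1024, 24.0)                 # per-node region width (threshold)
classes = rng.integers(0, 64, size=1024)     # each node maps to one of 64 classes

def classify(x):
    d = np.linalg.norm(prototypes - x, axis=1)   # distance to every prototype
    firing = d < widths                          # nodes whose region covers x
    if not firing.any():
        return None                              # no region covers this input
    return int(classes[np.argmin(np.where(firing, d, np.inf))])

print(classify(rng.normal(size=256)))
```

Shrinking `widths` for nodes that misclassify would mimic the incremental narrowing of regions that the chip performed during learning.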
10.2.4
Cellular Neural Networks
Cellular neural networks (CNN) constitute another family of analog VLSI neural networks. The concept was proposed by Leon Chua in 1988 [27], who called it the Cellular Neural Network, although now it is known as the Cellular Non-Linear Network. Like aVLSI, CNN has a dedicated following, the best-known group being the Analogic and Neural Computing Laboratory of the Computer and Automation Research Institute of the Hungarian Academy of Sciences under the leadership of Tamas Roska. CNN-based chips have been used to implement vision systems, and complex image processing similar to that of the retina has been investigated by a number of groups [28].
Although there are variations, the basic architecture is a 2-D rectangular grid of processing nodes. Although the model allows arbitrary inter-node connectivity, most CNN implementations have only nearest-neighbor connections. Each cell computes its state based on the values of its four immediate neighbors, where the neighbor's voltage and the derivative of this voltage are each multiplied by constants and summed. Each node then takes its new value, and the process continues for another clock. This computation is generally specified as a type of filter, and is done entirely in the analog domain. However, the algorithm steps are programmable; one of the real strengths of CNN is that the inter-node functions and data transfers are programmable, with the entire array appearing as a digitally programmed array of analog-based processing elements. This is an example of a Single Instruction, Multiple Data (SIMD) architecture, which consists of an array of computation units, where each unit performs the same operation, but each on its own data. CNN programming can be complex and requires an intimate understanding of the basic analog circuits involved. The limited inter-node connectivity also restricts the chip to mostly front-end types of processing, primarily of images. A schematic of the basic CNN cell is shown in Figure 10.7.
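A toy synchronous version of this nearest-neighbor computation is sketched below; the template coefficients, the saturating activation, and the discrete-time update are illustrative assumptions standing in for the programmable analog filter of a real CNN cell.

```python
import numpy as np

A, B = 0.2, 0.1    # weights on neighbor values and neighbor derivatives

def neighbor_sum(u):
    """Sum of the four immediate neighbors (edges padded with zeros)."""
    p = np.pad(u, 1)
    return p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:]

def cnn_step(x, x_prev, dt=1.0):
    """One clock: every cell combines its 4-neighborhood values and derivatives."""
    dxdt = (x - x_prev) / dt
    x_new = np.tanh(x + A * neighbor_sum(x) + B * neighbor_sum(dxdt))
    return x_new, x

x_prev = np.zeros((8, 8))
x = np.zeros((8, 8)); x[4, 4] = 1.0      # a single excited cell
for _ in range(5):                       # watch the activity diffuse outwards
    x, x_prev = cnn_step(x, x_prev)
```

Because every cell executes the same update on its own neighborhood, this is exactly the SIMD pattern described in the text.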
While research and development continue, the technology has had only limited commercial success. As with aVLSI, it is a fascinating and technically challenging system, but in real applications it tends to be used for front-end problems and consequently is subject to Amdahl's law.
10.2.5
Other Analog/Mixed Signal Work
It is difficult to do justice to the large and rich area of biologically inspired analog design that has developed over the years. Other investigations include those of Murray [29], the former neural networks group at AT&T Bell Labs [30], Etienne-Cummings [31], Principe [32], and many more that cannot be mentioned due to limited space. And today, some workers, such as Boahen, are beginning to move the processing further into the back end [33] by looking at cortical structures for early vision.
On returning to Figure 10.4, it can be seen that the first few boxes of processing require the type of massively parallel, locally connected feature extraction that CNN, aVLSI, and other analog techniques provide. With regard to sensors, these can perform enhanced signal processing and demonstrate better signal-to-noise ratios than more traditional implementations, providing such capabilities in compact, low-power implementations.
Although further studies are needed, there is a concern that the limited connectivity and computational flexibility make it difficult to apply these technologies to the back end. Although not a universally held opinion, the author feels that these higher-level association areas require a different approach to implementation. This general idea will be presented in more detail below, but first, it is important to examine another major family of neural network chips, the massively parallel digital processors.
10.2.6
Digital SIMD Parallel Processing
Concurrent with the development of analog neural chips, a parallel effort was devoted to architecting and building digital neural chips. Although these could have dealt with a larger subset of pattern-recognition solutions, they, like the analog chips, were mostly focused on neural network solutions to simple classification. A common design style that was well matched to the basic ANN algorithms was that of SIMD processor arrays. One chip that embodied that architecture was CNAPS, developed by Adaptive Solutions [34, 35].
The world of digital silicon has always flirted with specialized processors. During the early days of microprocessors, silicon limitations restricted the available functionality, and as a result many specialized computations were provided by coprocessor chips. Early examples of this were specialized floating-point chips, as well as graphics and signal processing. Following Moore's law, the chip vendors found that they could add increasing amounts of functionality and so began to pull some of these capabilities into the processor.
Interestingly, graphics and signal processing have managed to maintain some independence, and remain as external coprocessors in many systems. Some of the reasons for this were the significant complexity of the tasks performed, the software inertia that had built up around these functions, and the potential for very low power dissipation, which is required for embedded signal processing applications such as cell phones, PDAs, and MP3 players.
During the early 1990s it was clear that there was an opportunity to provide a significant speed-up of basic neural network algorithms because of their natural parallelism. This was particularly true in situations involving complex, incremental, gradient-descent adaptation, as can be seen in many learning models. As a result, a number of digital chips were produced that aimed squarely at supporting both learning and non-learning network emulation.
It was also clear from Moore's law that performance improvements and enhanced functionality would continue for mainline microprocessors. This relentless march of the desktop processors was referred to as the "silicon steam roller", which, as the 1990s continued, became increasingly difficult for the developers of specialized silicon to stay ahead of. At Adaptive Solutions, the goal was to avoid the steam roller by steering between having enough flexibility to solve most of the problem, to avoid Amdahl's law, and yet having a sufficiently specialized function to allow enough performance to make the chip cost-effective (essentially sitting somewhere in the middle of the line in Figure 10.5). This balancing act became increasingly difficult, until eventually the chip did not offer enough cost-performance improvement in its target applications to justify the expense of a specialized coprocessor chip and board.
The CNAPS architecture consisted of a one-dimensional (1-D) processor node (PN) array in an SIMD parallel architecture [36]. To allow as much performance-price as possible, modest fixed-point precision was used to keep the PNs simple. With small PNs the chip could leverage a specialized redundancy technique developed by Adaptive Solutions' silicon partner, Inova Corporation. During chip testing, each PN could be added to the 1-D processor chain, or bypassed. In addition, each PN had a large power transistor (with a width of 20 000 λ) connecting the PN to ground. Laser fuses on the 1-D interconnect and the power transistor were used to disconnect and power down defective PNs. The testing of the individual PNs was done at wafer sort, after which an additional lasing stage (before packaging and assembly) would configure the dice, fusing in the good PNs and fusing out and powering down the bad PNs. The first CNAPS chip had an array of 8 × 10 (80) PNs fabricated, of which only 64 needed to work to form a fully functional die. The system architecture and internal PN architecture are shown in Figure 10.8.
Simulation and analysis were used to determine approximately the optimal PN size (the unit of redundancy) and the optimal number of PNs. Ultimately, the die was almost exactly 2.5 cm (1 inch) on a side with 14 million transistors fabricated; this led to 12 dice per 15-cm (6-inch) wafer which, until recently, made it physically the largest processor chip ever made (see Figure 10.9).
The large version of the chip, called the CNAPS-1064, had 64 operational PNs and operated at 25 MHz with 6 W worst-case power consumption. Each PN was a complete 16-bit DSP with its own memory. Neural network algorithms tend to be vector/matrix-based and map fairly cleanly to a 1-D grid, so it was easy to have all PNs performing useful work simultaneously. The maximum compute rate then was 1.2 billion multiply-accumulates per second per chip, which was about 1000-fold faster than the fastest workstation at that time. Part of this speed-up was due to the fact that each PN did several things in one clock to realize a single-clock multiply-accumulate: input data to the PN, perform a multiply, perform an accumulate, perform a memory fetch, and compute the next memory address. During the late 1980s, DSP chips were able to perform such a single multiply-accumulate in one clock, but it was not until the Pentium Pro that desktop microprocessors reached a point where they performed most of these operations simultaneously.
When developing the CNAPS architecture, a number of key decisions were made, including the use of limited precision and local memories, architecture support for the C programming language, and I/O bandwidth.
At a time when the computing community was moving to floating-point computation, and the microprocessor vendors were pulling floating-point processing onto the processor chips and optimizing its performance, the CNAPS used limited-precision, fixed-point arithmetic. The primary reason for this decision was based on yield calculations, which indicated that a floating-point PN was too large to take advantage of PN redundancy. This redundancy bought an approximately twofold cost-performance improvement. Since a floating-point PN would have been two to three times larger than a fixed-point PN, the use of modest-precision fixed-point arithmetic meant an almost sixfold difference in cost-performance. Likewise, simulation showed that most of the intended algorithms could get by with limited-precision fixed-point arithmetic, and this proved to be, in general, a good decision, as problems were rarely encountered with the limited precision. In fact, the major disadvantage was that it made programming more difficult, although DSP programmers had been effectively using fixed-point precision for many years.
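The flavor of limited-precision arithmetic is easy to demonstrate. The sketch below shows a saturating 16-bit multiply-accumulate in a Q15-like format; the scaling convention and saturation behavior are generic fixed-point idioms, assumed here rather than taken from the CNAPS documentation.

```python
def sat16(v):
    """Clamp to the signed 16-bit range instead of wrapping on overflow."""
    return max(-32768, min(32767, v))

def mac_q15(acc, x, w):
    """acc += x * w with Q15 rescaling, saturating rather than overflowing."""
    return sat16(acc + ((x * w) >> 15))

acc = 0
samples = [1000, -2000, 3000]        # integers standing in for Q15 inputs
weights = [20000, 15000, -10000]
for x, w in zip(samples, weights):
    acc = mac_q15(acc, x, w)
print(acc)                           # a 16-bit result, whatever the input length
```

Programming against such arithmetic is what made fixed-point development harder than floating point: the programmer, not the hardware, manages scaling.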
The second decision was to use local, per-PN memory (4 KB of SRAM per PN). Although this significantly constrained the set of applications that could leverage the chip, it was absolutely necessary for achieving the performance goals. The reality was that the CNAPS chip would probably never have been built had performance been reduced enough to allow the use of off-chip memory. As with DSP applications, almost all memory access was in the form of long arrays that can benefit from some pre-fetching, but not much from caching.
The last two decisions (architecture and I/O bandwidth limitations) were driven by performance-price and design-time limitations. One objective of the architecture was that it be possible for two integrated circuit (IC) engineers to create the circuits, logic schematics, and layout (with some additional layout help) in one year. As a result, the architecture was very simple, which in turn made the design simpler and the PNs smaller, but programming was more difficult. One result of this strategy was that the architecture did not support the C language efficiently. Although there were some fabrication delays, the architecture, board, and software rolled out simultaneously and worked very well; the first system was shipped in December 1991 and quickly ramped to a modest volume. One of the biggest-selling products was an accelerator card for Adobe Photoshop which, in spite of Amdahl problems (poor I/O bandwidth), offered unprecedented performance.
By 1996, desktop processors had increased their performance significantly, and Intel was on the verge of providing the MMX SIMD coprocessor instructions. Although this first version of an SIMD coprocessor was not complete and was not particularly easy to use, the performance of a desktop processor with MMX reduced the advantages of the CNAPS chipset even further in the eyes of its customers, and people stopped buying.
Everybody knew that the silicon steam roller was coming, but it was moving much faster (and perhaps even accelerating, as some had suggested) than expected. In addition, Intel quickly enhanced the MMX coprocessor into the now-current SSE3, which is a complete and highly functional capability. DSPs were also vulnerable to the microprocessor steam roller but managed, primarily through software inertia and very low power dissipation, to hold their own.
Although there were other digital neural network processors, none of them achieved any significant level of success, and basically for the same reasons. Although space limits the discussion of all but a few of these others, two in particular deserve mention.
10.2.7
Other Digital Architectures
One important digital neural network architecture was the Siemens SYNAPSE-1 processor developed by Ramacher and colleagues at Siemens Research in Munich [37]. The chip was similar to CNAPS in terms of precision, fixed-point arithmetic, and basic processor-node architecture, but differed by using off-chip memory to store the weight matrices.
A SYNAPSE-1 chip contained a 2 × 4 array of MA16 neural signal processors, each with a 16-bit multiplier and 48-bit accumulator. The chip frequency was 40 MHz, and one chip could compute about five billion connections per second in feedforward (non-learning) execution.
Recall that, in architecting the CNAPS, one of the most important decisions was whether to use on-chip, per-PN memory, or off-chip shared memory for storing the primary weight matrices. For a number of reasons, including the targeted problem space and the availability of a state-of-the-art SRAM process, Adaptive Solutions chose to use on-chip memory for CNAPS. However, for performance reasons this decision limited the algorithm and application space to those whose parameters fit into the on-chip memories. Although optimized for matrix-vector operations, CNAPS was designed to perform efficiently over a fairly wide range of computations.
The SYNAPSE-1 processor was much more of a matrix-matrix multiplication algorithm mapped into silicon. In particular, Ramacher and colleagues were able to take advantage of a very clever insight: the fact that in any matrix multiplication, the individual elements of the matrix are used multiple times. The SYNAPSE-1 broke all matrices into 4 × 4 chunks. Then, while the elements of one matrix were broadcast to the array, 4 × 4 chunks of the other matrix would be read from external memory into the array. In a 4 × 4-matrix by 4 × 4-matrix multiplication, each element in the matrix was actually used four times, which allowed the processor chip to support four times as many processor units for a given memory bandwidth than a processor not using this optimization.
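The data-reuse argument can be illustrated with an ordinary blocked matrix multiplication: once a 4 × 4 block is resident, each of its elements participates in four multiply-accumulates before being discarded, quadrupling the arithmetic done per element fetched. The NumPy sketch below is purely illustrative of the blocking idea, not of the SYNAPSE-1 datapath itself.

```python
import numpy as np

def blocked_matmul(A, B, blk=4):
    """C = A @ B computed in blk x blk tiles to expose element reuse."""
    n = A.shape[0]
    C = np.zeros((n, n))
    for i in range(0, n, blk):
        for j in range(0, n, blk):
            for k in range(0, n, blk):
                # every element of these two tiles is reused blk times here
                C[i:i+blk, j:j+blk] += A[i:i+blk, k:k+blk] @ B[k:k+blk, j:j+blk]
    return C

A = np.arange(64.0).reshape(8, 8)
B = np.ones((8, 8))
assert np.allclose(blocked_matmul(A, B), A @ B)
```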
On returning to Figure 10.5, it can be seen that the SYNAPSE-1 architecture increased performance by specializing the architecture to matrix-matrix multiplications.
A chip similar to the Ni1000 was the ZISC (Zero Instruction Set Computer) developed by Paillet and colleagues at IBM in Paris. The ZISC chip was digital, employed basically a vector-template approach, and was simpler and cheaper than the Ni1000, but implemented approximately the same algorithms. Today, the ZISC chip survives as the primary product of General Vision, Petaluma, California.
In addition to the CNAPS, SYNAPSE-1, ZISC, and Ni1000, several other digital chips have been developed either specifically or in part to emulate neural networks. HNC developed the SNAP, a floating-point SIMD standard-cell-based architecture [38]. One excellent architecture is the SPERT [39], which was developed by groups at the University of California, Berkeley and the International Computer Science Institute (ICSI) in Berkeley. SPERT was designed to perform efficient integer vector arithmetic and to be configured into large parallel arrays. A similar parallel processor array that was created from field-programmable gate arrays (FPGAs) and suited to neural network emulation was REMAP [40].
10.3
Current Directions in Neuro-Inspired Hardware
One limitation of traditional ANN algorithms was that they did not scale particularly well to very large configurations. As a result, commercial silicon was generally fast enough to emulate these models, thus reducing the need for specialized hardware. Consequently, with the exception of on-going studies in aVLSI and CNN, general research in neural-inspired hardware has languished.
Today, however, activity in this area is picking up again, for two main reasons. The first reason is that computational neuroscience is beginning to yield algorithms that can scale to large configurations and have the potential for solving large, very complex problems. The second reason is the excitement of using molecular-scale electronics, which makes possible comparably scalable hardware. As will be seen, at least one of the projected nanoelectronic technologies is a complementary match to biologically inspired algorithms.
Today, a number of challenges face the semiconductor industry, including power density, interconnect reverse scaling, device defects and variability, memory bandwidth limitations, performance overkill, density overkill, and increasing design complexity. Performance overkill is where the highest-volume segments of the market are no longer performance/clock-frequency driven. Density overkill is where it is difficult for a design team to effectively design and verify all the transistors available to them on a single die. Although neither of these is a potential show-stopper, taken together they do create some significant challenges.
Another challenge is the growing reliance on parallelism for performance improvements. In general-purpose applications, the primary source of parallelism has been within a single instruction stream, where many instructions can be executed simultaneously, sometimes even out of order. However, this instruction-level parallelism (ILP) has its limits and becomes exponentially expensive to capture. Microprocessor manufacturers are now developing multiple-core architectures, the goal of which is to execute multiple threads efficiently. As multiple-core machines become more commonplace, software and application vendors will struggle to create parallel variations of their software.
Due to very small, high-resistance wires, many nanoscale circuits will be slow, and power density will be a problem because of high electric fields. Consequently, performance improvements at the nanoscale will also need to come almost exclusively from parallelism, and to an even greater extent than in traditional architectures.
When considering these various challenges, it is unclear which ones are addressed by nanoelectronics. In fact, nanoelectronics only addresses the end of Moore's law, and perhaps also the memory bandwidth problem. However, it also aggravates most other existing problems, notably signal/clock delay, device variability, manufacturing defects, and design complexity.
In proceeding down the path of creating nanoscale electronics, by far the biggest question is: how exactly will this technology be used? Can it be assumed that computation, algorithms, and applications will continue more or less as they have in the past? What should the research agenda be? Will the nanoscale processor of the future consist of thousands of x86 cores with a handful of application-specific coprocessors? The effective use of nanoelectronics will require solutions to more than just an increased density; rather, total system solutions will need to be considered. Today, computing structures cannot be created in the absence of some sense of how they will be used and what applications they will enable. Any paradigm shift in applications and architecture will have a profound impact on the entire design process and the tools required, as well as on the requirements placed on the circuits and devices themselves.
As discussed above, algorithms inspired by neuroscience have a number of interesting and potentially useful properties, including fine-grain and massive parallelism. These are constructed from slow, low-power, unreliable components, are tolerant of manufacturing defects, and are robust in the presence of faulty and failing hardware. They adapt rather than being programmed, they are asynchronous, compute with low precision, and degrade gracefully in the presence of faults. Most importantly, they are adaptive, self-organizing structures which promise some degree of design-error tolerance, and solve problems dealing with the interaction of an organism/system with the real world. The functional characteristics of neurons, such as analog operation, fault tolerance, slowness, and massive parallelism, are radically different from those of typical digital electronics. Yet, some of these characteristics match very well the basic characteristics, such as large numbers of faults and defects, low speed, and massive parallelism, that many research groups feel will characterize nanoelectronics systems.
Self-organization involves a system adapting (usually increasing in complexity) in
response to an external stimulus. In this context, a system will learn about its
environment and adjust itself accordingly, without any additional intervention. In
order to achieve some level of self-organization, a few fundamental operating
principles are required. Self-organizing systems are those that have been built with
these principles in mind.
Recently, Professor Christoph von der Malsburg has defined a new form of computing science, "organic computing", which deals with a variety of computations that are performed by biology. Organic computations are massively parallel, low-precision, distributed, adaptive, and self-organizing. The neural algorithms discussed in this chapter form an important subset of this area (the interested reader is referred to the web site: www.organic-computing.org).
Several very important points should be made about biologically inspired models. The first point concerns the computational models and the applications they support. Biologically inspired computing uses a very different set of computational models than have traditionally been used, and consequently these models are aimed at a fairly specialized set of applications. For the most part, therefore, biological models are not a replacement for existing computation, but rather an enhancement to what is available now. Specialized hardware for implementing these models needs to be evaluated accordingly, and in the next few sections some of these models will be explored at different levels.
10.3.1
Moving to a More Sophisticated Neuro-Inspired Hardware
As mentioned above, it is the back end where the struggle with algorithms and implementation continues, and it is also the back end where potential strategic inflection points lie. Hence, the remainder of the chapter will focus on back-end algorithms and hardware.
The ultimate cognitive processor is the cerebral cortex. The cortex is remarkably
uniform, not only across its different parts, but also across almost all mammalian
species. Although current knowledge is far from providing an understanding of
how the cerebral cortex does what it does, some of the basic computations are
beginning to take shape. Nature has, it appears, produced a general-purpose
computational device that is a fundamental component of higher level intelligence
in mammals.
Some generally accepted notions about the cerebral cortex are that it represents knowledge in a sparse, distributed, hierarchical manner, and that it performs a type of Bayesian inference over this knowledge base, which it does with remarkable efficiency. This knowledge is added to the cortical database by a complex process of adaptive learning.
One of the fundamental requirements of intelligent computing is the need to capture higher-order relationships. The problem with Bayesian inference is that it
these algorithms will require networks with a million or more nodes. Back-end
processing, because of a need to store large amounts of unique synaptic information,
will most likely have simpler processing than is seen at the front end, albeit on a much
larger scale.
Hecht-Nielsen [45] bases the inter-column (which he calls "regions") connections on conditional probabilities, which capture higher-order relationships. He also uses abstraction columns to represent groups of lower-level columns. He has demonstrated networks that perform a remarkable job of capturing aspects of English; these networks consist of several billion connections and require a large computer cluster to execute.
Granger [46, 47] leverages nested distributed representations in a way that adds the temporal dimension, creating hierarchical networks that learn sequences of sequences. George and Hawkins [48] use model likelihood information ascending a hierarchy, with model confidence information being fed back. Other researchers contributing to these ideas include Grossberg [49], Lansner [50, 51], Arbib [52], Roudi and Treves [53], Renart et al. [54], Levy et al. [55], and Anderson [56]. Clearly, this remains a dynamic area of research, and at this point there is no clear winning approach.
Another key feature of some of these algorithms is an oscillatory sliding threshold that causes the more confident columns to converge to their attractors more quickly and the less-confident more slowly, while those of low confidence do not converge at all, taking a NULL state. This process is remarkably similar to the electromagnetic waves that flow through the cortex when it is processing data.
Connectivity remains one of the most important problems when first considering scaling to very large models. Axons are very fine and can be packed very densely in a three-dimensional (3-D) mesh. Interconnect in silicon generally operates in a 2-D plane, albeit with several levels, nine or more with today's semiconductor technologies. Most importantly, silicon communication is not continuously amplifying, unlike axonal and some dendritic processes. The following result [57-59] demonstrates this particular problem.
Theorem: Assume an unbounded or large rectangular array of silicon neurons where each neuron receives input from its N nearest neighbors; that is, the fan-out (divergence) and fan-in (convergence) are both N. Each such connection consists of a single metal line, and the number of 2-D metal layers is much less than N. Then, the area required by the metal interconnect is O(N^3).
So, if the fan-in per node is doubled from 100 to 200, the silicon area required for the metal interconnect increases by a factor of 8. This result means that, even for modest local connectivity, the portion of silicon area devoted to the metal interconnect will dominate. It has been shown that for some models even moderate multiplexing of interconnect would significantly decrease the silicon area requirements, without any real loss in performance [60]. Carver Mead's group at Caltech, and others, developed the address event representation (AER), a technique for multiplexing a number of pulse streams on a single metal wire [61, 62]. When analog computation is
Figure 10.10 CMOL [1]. (a) A schematic side view. (b) Top view showing that any nanodevice may be addressed via the appropriate pin pair (e.g. pins 1 and 2 for the leftmost of the two shown devices, and pins 1 and 2' for the rightmost device). Panel (b) shows only two devices; in reality, similar nanodevices are formed at all nanowire crosspoints. Also not seen on panel (b) are the CMOS cells and wiring.
state of the molecule. Another required characteristic of these devices is rectification, whereby current flow is allowed in only one direction.
One of the most important characteristics of CMOL is the unique way in which the grids are laid at an angle with respect to the CMOS grid. Each nanowire is connected to the upper metal layers of the CMOS circuitry through a pin. In order for the CMOS to address each nanowire in the denser nanowire crossbar arrays, the nanowire crossbar is turned, when it is fabricated, by an angle α whose tangent is set by the ratio of the CMOS inter-pin distance to the nanogrid inter-pin distance. This technique allows the grid to be aligned arbitrarily with the CMOS and still have most nanowires addressable by some selection of CMOS cells. A nanowire is contacted by two CMOS cells, both of which are required to input or read a signal. This basic connectivity structure is shown in Figure 10.10.
Although CMOL is not necessarily biologically inspired, it represents a promising technology for implementing such algorithms, as will be seen in the next section. CMOL uses charge accumulation as its basic computational paradigm, which is also used by neural structures. Other nanoscale devices, such as spin technologies, do not implement a charge-accumulation model, so such structures would have to emulate one, probably in digital form.
10.3.3
An Example: CMOL Nano-Cortex
Table 10.1 Typical values of parameters used for a cortical column analysis.

Range                 Typical Value I    Typical Value II
128 to 128 K          1 K                16 K
2^14 to 2^34 bits     2^20 bits          2^28 bits
2^18 to 2^51 bits     2^30 bits          2^42 bits
7 to 17 bits          10 bits            14 bits
7 to 17               16                 16
3 to 5 bits           5 bits             5 bits
11 to 21 bits         14 bits            18 bits
parameters used in the cortical column architectural analysis are listed in Table 10.1. These values represent typical numbers used by several different simulation models, in particular by Lansner and his group at KTH [69]. Related investigations have been conducted at IBM [70], where a mouse-cortex-sized model has been simulated on a 32K-processor IBM BlueGene/L.
Four basic designs have been analyzed, as shown in Figure 10.11:

. All-digital CMOS
. Mixed-signal CMOS
. All-digital hybrid CMOS/CMOL
. Mixed-signal hybrid CMOS/CMOL.
For the CMOS designs and the CMOS portion of CMOL, a 22-nm process was assumed as a maximally scaled CMOS. To approximate the features for this process, a simple linear scaling of a 250-nm process was made. The results of this analysis, giving the cost-performance for the four systems under the assumption of an 858 mm2 die size (the maximum lithographic field size expected for a 22-nm process), are presented in Table 10.2.

Table 10.2 Cost-performance of the four designs, assuming an 858 mm2 die at 22 nm.

Design               Weights   Weight memory   No. of column processors   Power (W)   Update rate (G nodes s^-1)   Memory (%)
CMOS All-Digital     1-bit     eDRAM           6600                       528         3072                         2.9
CMOS Mixed-Signal    1-bit     eDRAM           19 500                     487         22 187                       9.0
CMOL All-Digital     1-bit     CMOL Mm         4 042 752                  317         4492                         40
CMOL Mixed-Signal    1-bit     CoNets          10 093 568                 165         11 216                       100
With regard to Table 10.2, with the mixed-signal CMOL it was possible to implement approximately 10 M columns, each having 1 K nodes with 1 K connections each, for a total of 10 tera-connections. In addition, this entire network can be updated once every millisecond, which approaches biological densities and speeds, although of course with less functionality. Such technology could be built into a portable platform, with the biggest constraint being the high power requirements. Current studies include investigations into spike-based models [71] that should allow a significant lowering of the duty cycle and the power consumed.
Although real-time learning/adaptation was not included in the circuits analyzed here, deployed systems will need to be capable of real-time adaptation. It is expected that additional learning circuitry will reduce density by about two- to threefold. Neither has the issue of fault tolerance been addressed, although the Palm model has been found to tolerate errors (single 1-bits set to 0) in the weight matrix of up to 10%. For this reason, and given the excellent results of Likharev and Strukov [66] on the fault tolerance of CMOL arrays used as memory, it is expected that some additional hardware will be required to complement algorithmic fault tolerance, although this should not reduce the density in any significant way.
10.4
Summary and Conclusions
In this chapter, a brief and somewhat superficial survey has been provided of the specialized hardware developed over the past 20 years to support neurobiological models of computation. A brief examination was made of current efforts, and speculation was offered on how such hardware, especially when implemented in nanoscale electronics, could offer unprecedented compute density, possibly leading to new capabilities in computational intelligence. Biologically inspired models seem to be a better match to nanoscale circuits.
The mix of continued Moore's law scaling, models from computational neuroscience, and molecular-scale technology portends a potential paradigm shift in how computing is carried out. Among other points, the future of computing is most likely not about discrete logic but rather about encoding, learning, and performing inference over stochastic variables. There may be a wide range of applications for such devices in robotics, in the reduction and compression of widely distributed sensor data, and in power management.
One of the leading lights of the first computer revolution saw this clearly. At the IEEE Centenary in 1984 (The Next 100 Years; IEEE Technical Convocation), Dr. Robert Noyce, the co-founder of Intel and co-inventor of the integrated circuit, noted that:

"Until now we have been going the other way; that is, in order to understand the brain we have used the computer as a model for it. Perhaps it is time to reverse this reasoning: to understand where we should go with the computer, we should look to the brain for some clues."
References

1 K. Likharev, CMOL: Freeing advanced lithography from the alignment accuracy burden, in: The International Conference on Electron, Ion, and Photon Beam Technology and Nanofabrication '07, Denver, 2007.
2 W. S. McCulloch, W. H. Pitts, A logical calculus of the ideas immanent in nervous activity, Bull. Mathematical Biophys. 1943, 5, 115-133.
3 K. Steinbuch, Die Lernmatrix, Kybernetik 1961, 1.
4 F. Rosenblatt, Principles of Neurodynamics: Perceptrons and the Theory of Brain Mechanisms, Spartan, New York, 1962.
5 M. L. Minsky, S. A. Papert, Perceptrons: An Introduction to Computational Geometry, MIT Press, Cambridge MA, 1988.
6 SOAR. Web Page. http://sitemaker.umich.edu/soar.
7 J. Hopfield, Neural networks and physical systems with emergent collective computational abilities. Proc. Natl. Acad. Sci. USA 1982, 79, 2554-2558.
8 D. Rumelhart, G. Hinton, R. Williams, Learning internal representations by error propagation, Nature 1986, 323, 533-536.
9 P. J. Werbos, The Roots of Backpropagation: From Ordered Derivatives to Neural Networks and Political Forecasting, Wiley-Interscience, 1994.
18 H. Bourlard, N. Morgan, Connectionist Speech Recognition: A Hybrid Approach, Kluwer Academic Publishers, Boston, MA, 1994.
19 J. L. Hennessy, D. A. Patterson, Computer Architecture: A Quantitative Approach, Morgan Kaufmann, Palo Alto, CA, 1991.
20 Intel. IA-32 Intel Architecture Software Developer's Manual, Volume 1: Basic Architecture. 2001 (cited 2001; available from: http://developer.intel.com/design/pentium4/manuals/245470.htm)
21 C. Mead, Analog VLSI and Neural Systems, Addison-Wesley, Reading, Massachusetts, 1989.
22 M. A. Mahowald, Computation and Neural Systems, California Institute of Technology, 1992.
23 R. Douglas, Fifteen years of neuromorphic engineering: Progress, problems, and prospects, in: Proceedings, Brain Inspired Cognitive Systems (BICS 2004), University of Stirling, Scotland, UK, 2006.
24 M. Holler, et al., An electrically trainable artificial neural network (ETANN) with 10240 floating-gate synapses, in: Proceedings, International Joint Conference on Neural Networks, IEEE, Washington DC, 1989.
25 Nestor Inc., Ni1000 Recognition Accelerator Data Sheet, 1996, 1-7. Available at: http://www.warthman.com/projects-intel-ni1000-TS.htm.
26 M. J. L. Orr, Introduction to Radial Basis Function Networks, Centre for Cognitive Science, University of Edinburgh, Edinburgh, 1996.
27 L. O. Chua, T. Roska, Cellular Neural Networks and Visual Computing, Cambridge University Press, 2002.
28 D. Balya, B. Roska, T. Roska, F. S. Werblin, A CNN framework for modeling parallel processing in a mammalian retina, Int. J. Circuit Theory Applications 2002, 30, 363-393.
29 A. F. Murray, The future of analogue neural VLSI, in: Proceedings, Second International ICSC Symposium on
11
Nanowire-Based Programmable Architectures
Andre DeHon
11.1
Introduction
modest-sized, interconnected crossbar arrays (Section 11.6.3). A reliable, lithographic-scale support structure provides power, clocking, control, and bootstrap testing for the nanowire crossbar arrays. Each nanowire is coded so that it can be uniquely addressed from the lithographic support wires (Section 11.4.2). With the ability to address individual nanowires, individual crosspoints can be programmed (Section 11.8) to personalize the logic function and routing of each array and to avoid defective nanowires and switches (Section 11.7).
As specific nanowires cannot, deterministically, be placed in precise locations using these bottom-up techniques, stochastic assembly is exploited to achieve unique addressability (Section 11.4.2). Stochastic assembly is further exploited to provide signal restoration and inversion at the nanoscale (Section 11.4.3). Remarkably, starting from regular arrays of programmable diode switches and stochastic assembly of non-programmable field-effect controlled nanowires, it is possible to build fully programmable architectures with all logic and restoration occurring at the nanoscale.
The resulting architectures (Section 11.6) provide a high-level view similar to island-style field-programmable gate arrays (FPGAs), and conventional logic mapping tools can be adapted to compile logic to these arrays. Owing to the high defect rates likely to be associated with any atomic-scale manufacturing technology, all viable architectures at this scale are likely to be post-fabrication configurable (Section 11.7). That is, while nanowire architectures can be customized for various application domains by tuning their gross architecture (e.g. the ratio of logic and memory), there will be no separate notion of custom atomic-scale logic.

Even after accounting for the required regular structure, high defect rates, stochastic assembly, and the lithographic support structure, a net benefit is seen from being able to build with nanowires which are just a few atoms in diameter and programmable crosspoints that fit in the space of a nanowire junction. Mapping conventional FPGA benchmarks from the Toronto20 benchmark set [1], the designs presented here should achieve one to two orders of magnitude greater density than FPGAs in 22 nm CMOS lithography, even if the 22 nm lithography delivers defect-free components (Section 11.10).
The design approach taken here represents a significant shift in design styles compared to conventional lithographic fabrication. In the past, reliance has been placed on virtually perfect and deterministic construction and complete control of features down to a minimum technology feature size. Here, it is possible to exploit very small feature sizes, although there is no complete control of device location in all dimensions. Instead, it is necessary to rely on the statistical properties of large ensembles of wires and devices to achieve the desired, aggregate component features. Further, post-fabrication configuration becomes essential to device yield and personalization.

This chapter describes a complete assembly of a set of complementary technologies and architectural building blocks. The particular ensemble presented is one of several architectural proposals with a similar flavor (Section 11.11) based on these types of technologies and building blocks.
11.2
Technology
11.2.1
Nanowires
Figure 11.1 Axial doping profile places selective gateable regions in a nanowire.
11.2.2
Assembly
11.2.3
Crosspoints
11.2.4
Technology Roundup
It is possible to create wires which are nanometers in diameter and which can be arranged into crossbar arrays with nanometer pitch. Crosspoints which both switch conduction between the crossed wires and store their own state can be placed at every wire crossing without increasing the pitch of the crossbar array. NWs can be controlled in a FET-like manner, and can be designed with selectively gateable regions. This can all be done without relying on ultrafine lithography to create the nanoscale feature sizes. Consequently, these techniques promise smaller feature sizes and an alternate, perhaps more economical, path to atomic-scale computing structures than top-down lithography. Each of the capabilities previously described has been demonstrated in a laboratory setting, as detailed in the reports cited. It is assumed that, in future, it will be possible to combine these capabilities and to scale them into a repeatable manufacturing process.
11.3
Challenges
11.3.1
Regular Assembly
The assembly techniques described above (see Sections 11.2.2 and 11.2.3) suggest that regular arrays can be built at tight pitch, with both NW trace width and trace spacing set by controlled NW diameters. While this provides nanometer pitches and crosspoints that are tens of nanometers in area, it is impossible to differentiate deterministically between features at this scale; that is, one particular crosspoint cannot be made different in some way from the other crosspoints in the array.
11.3.2
Nanowire Lengths
Nanowires can be grown to hundreds of microns [17] or perhaps millimeters [18] in length. However, at this high length-to-diameter ratio, they become highly susceptible to bending and ultimately breaking. Assembly puts stresses along the NW axis which can break excessively long NWs. Consequently, a modest limit must be placed on the NW lengths (tens of microns) in order to yield a large fraction of the NWs in a given array. Gudiksen et al. [19] reported the reliable growth of Si NWs which are over 9 μm long, while Whang et al. [11, 20] demonstrated collections of arrays of NWs of size 10 μm × 10 μm. Even if it were possible physically to build longer NWs, the high resistivity of small-diameter NWs would force the lengths to be kept down to the tens of microns range.
11.3.3
Defective Wires and Crosspoints
At this scale, wires and crosspoints are expected to be defective in the 1 to 10% range:

. NWs may break along their axis during assembly, as suggested earlier, and the integrity of each NW depends on the roughly 100 atoms in each radial cross-section.
. NW-to-microwire junctions depend on a small number of atomic-scale bonds, which are statistical in nature and subject to variation in NW properties.
. Junctions between crossed NWs will be composed of only tens of atoms or molecules, and individual bond formation is statistical in nature.
. Statistical doping of NWs may lead to high variation among NWs.

For example, Huang et al. [13] reported that 95% of the wires measured had good contacts, while Chen et al. [21] reported that 85% of the crosspoint junctions measured were usable. Both of these were early experiments, however, and the yield rates would be expected to improve. Nonetheless, based on the physical phenomena involved, it is anticipated that the defect rates will be closer to the few percent range than the minuscule rates frequently seen with conventional lithography.
Wire defects: a wire is either functional or defective. A functional wire has good contacts on both ends, conducts current with a resistance within a designated range, and is not shorted to any other NWs. Broken wires will not conduct current. Poor contacts will increase the resistance of the wire, leaving it outside of the designated resistance range. Excessive variation in NW doping from the engineered target can also leave the wire out of the specified resistance range. It can be determined whether a wire is in the appropriate resistance range during testing (see Section 11.8.1), and arranged not to use those which are defective (see Section 11.7.1).

Non-programmable crosspoint defects: a crosspoint is programmable, non-programmable, or shorted into the on-state. A programmable junction can be switched between the resistance range associated with the on-state and the resistance range associated with the off-state. A non-programmable junction can be turned off, but cannot be programmed into the on-state; a non-programmable junction could result from the statistical assembly of too few molecules in the junction, or from poor contacts between some of the molecules in the junction and either of the attached conductors. A shorted junction cannot be programmed into the off-state. Based on the physical phenomena involved, non-programmable junctions are considered to be much more common than shorted junctions. Further, it is expected that fabrication can be tuned to guarantee that this is the case. Consequently, shorted junctions will be treated like a pair of defective wires, and both wires associated with the short will be avoided.
11.4
Building Blocks
By working from the technological capabilities and within the regular assembly
requirements, it is possible to construct a few building blocks which enable the
creation of a wide range of interesting programmable architectures.
11.4.1
Crosspoint Arrays
As suggested in Section 11.2.2, and demonstrated by Chen et al. [21] and Wu et al. [22], assembly processes allow the creation of tight-pitch arrays of crossed NWs with switchable diodes at the crosspoints (see Figure 11.3). Assuming for the moment that contact can be made to individual NWs in these tight-pitch arrays (see Section 11.4.2), these arrays can serve as:

. memory cores
. programmable, wired-OR planes
. programmable crossbar interconnect arrays.
the row NW will remain low (see O2 and O6 in Figure 11.4). Consequently, the row
NW effectively computes the OR of its programmed inputs.
The output NWs do pull their current directly off the inputs and may not be
driven as high as the input voltage. Hence, these outputs will need restoration
(see Section 11.4.3).
11.4.1.3 Programmable Crossbar Interconnect Arrays
A special use of the wired-OR programmable array is for interconnect. That is, if the programming is restricted so that each column wire connects to only a single row wire, the crosspoint array can serve as a crossbar switch. This allows any input (column) to be routed to any output (row) (e.g. see Figure 11.5). This structure is useful for
A key challenge is bridging the length scale between the lithographic-scale wires that can be created using conventional top-down lithography and the small-diameter NWs that can be grown and assembled into tight-pitch arrays. As noted above, it must be possible to establish a voltage differential across a single row and column NW to write a bit in the tight-pitch NW array. It must also be possible to drive and sense individual NWs to read back the memory bit. By building a decoder between the coarse-pitch lithographic wires and the tight-pitch NWs, it is possible to bridge this length scale and to address a single NW at this tight pitch [23-26].
11.4.2.1 NW Coding
One way to build such a decoder is to place an address on each NW using the axial doping or material composition profile described previously. In order to interface with lithographic-scale wires, address bit regions are marked off at the lithographic pitch. Each such region is then either doped heavily, so that it is oblivious to the field applied by a crossed lithographic-scale wire, or doped lightly, so that it can be controlled by a crossed lithographic-scale wire. In this way, the NW will only conduct if all of the lithographic-scale wires crossing its lightly doped, controllable regions have a suitable voltage to allow conduction. If any of the lithographic-scale wires crossing controllable regions provides a suitable voltage to turn off conduction, then the NW will not be able to conduct.

It should be noted that each bit position can only be made controllable or non-controllable with respect to the lithographic-scale control wire; different bit positions cannot be made sensitive to different polarities of the input. Consequently, the addresses must be encoded differently from the dense, binary address normally used for memories. One simple way to generate addresses is to use a dual-rail binary code. That is, for each logical input bit, the value and its complement are provided. This results in two bit positions on the NW for each logical input address bit, one for the true sense and one for the false sense. To code a NW with an address, the NW is simply made sensitive to exactly one sense of each of the logical address bits (see Figure 11.6). This results in a decoder which requires 2 log2(N) address bits to address N NWs.
Denser addressing can be achieved by using Na/2-hot codes; that is, rather than forcing one bit of each pair of address lines to be off and one to be on, it is simply required that half of the Na address bits be set to a voltage which allows conduction, and half be set to a voltage that prevents conduction. This scheme requires only about 1.1 log2(N) + 3 address bits [24].
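As a quick check on these code sizes, the following sketch (Python, illustrative only; the function names are ours) counts the address bits each scheme needs:

    from math import ceil, comb, log2

    def dual_rail_bits(n):
        # Two physical bit positions (true and complement) per logical address bit.
        return 2 * ceil(log2(n))

    def n_hot_bits(n):
        # Smallest even Na such that the number of distinct Na/2-hot codes,
        # C(Na, Na/2), covers n distinct nanowire addresses.
        na = 2
        while comb(na, na // 2) < n:
            na += 2
        return na

    for n in (100, 1000, 3432):
        print(n, dual_rail_bits(n), n_hot_bits(n))
    # Na = 14 gives comb(14, 7) = 3432 distinct 7-hot codes, the figure
    # quoted later for nanoPLA address discovery (Section 11.8.1), and the
    # n-hot counts track the 1.1 log2(N) + 3 estimate.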
voltage driven on the common line, and all other NWs are held at the nominal voltage (see Figure 11.7).

It should be noted that there is no directionality to the decoder, and consequently this same unit can serve equally well as a multiplexer. That is, when an address is applied to the lithographic-scale wires, it allows conduction through the addressing region for only one of the NWs. Consequently, the voltage on the common line can be sensed rather than driven. Now, the one line which is allowed to conduct through the array can potentially pull the common line high or low. All other lines have a high-resistance path across the lithographic-scale address wires and will not be able to strongly affect the common line. This allows a single NW to be sensed at a time (see Figure 11.8), as is needed to read out the crosspoint state, as described in Section 11.4.1.1.
11.4.3
Restoration and Inversion
As noted in Section 11.4.1.2, the programmable, wired-OR logic is passive and non-restoring, drawing current from the input. Further, OR logic is not universal, and to build a good composable logic family an ability will be required to isolate inputs from output loads, restore signal strength and current drive, and invert signals. Fortunately, NWs can be field-effect controlled, and this provides the potential to build FET-like gates for restoration. However, in order to realize these, ways must be found to create the appropriate gate topology within the regular assembly constraints (see Section 11.3.1).
Vout = Vhigh · Rpd / (Rfet(Input) + Rpd)    (11.1)
For the sake of illustration, the vertical NW has a lightly doped p-type depletion-mode region at the input crossing, forming a FET controlled by the input voltage (Rfet(Input)). Consequently, a low voltage on the input NW will allow conduction through the vertical NW (Rfet = Ron-fet is small), and a high input will deplete the carriers from the vertical NW and prevent conduction (Rfet = Roff-fet is large). As a result, a low input allows the NW to conduct and pull the output region of the vertical NW up to a high voltage. A high input prevents conduction, and the output region remains low. A second crossed region on the NW is used for the pull-down (Rpd); this region can be used as a gate for predischarge, so that the inverter output is pulled low before the input is applied, the gate then being left high to disconnect the pull-down voltage during evaluation. Alternately, it can be used as a static load for PMOS-like ratioed logic. By swapping the locations of the high- and low-power supplies, this same arrangement can be used to buffer rather than invert the input.
Note that the gate only loads the input capacitively, and consequently current isolation is achieved at this inverter or buffer. Further, NW field-effect gating has sufficient non-linearity that this gate provides gain to restore logic signal levels [27].
11.4.3.2 Ideal Restoration Array
In many scenarios, there is a need to restore a set of tight-pitch NWs, such as the outputs of a programmable, wired-OR array. To do this, the approach would be to build a restoration array as shown in Figure 11.10a. This array is a set of crossed NWs which can be assembled using NW assembly techniques. If each of the NWs were sensitive to all of the crossed inputs, the result would be that all of the outputs would actually compute the NOR of the same set of inputs. To avoid computing a redundant set of NORs, and instead simply to invert each of the inputs independently, these NWs are coded using an axial doping or material composition profile. In this way, each NW is field-effect sensitive to only a single input NW, and hence provides the inversion described above for a single one of the crossed NWs while remaining oblivious to the other inputs.
The only problem here is that there is no way to align and place axially doped NWs so that they provide exactly this pattern, as the assembly treats all NWs as identical.

11.4.3.3 Restoration Array Construction
Although the region for active FETs is a nanoscale feature, it does not require small pitch or tight alignment. As such, there may be ways to mask and provide material differentiation along a diagonal, as required to build this decoder. Nonetheless, it is also possible to construct this restoration array stochastically, in a manner similar to the construction of the address decoder. That is, the assembly is provided with a set of NWs with their restoration regions in various locations. The restoration array is built by randomly selecting a set of restoration NWs for each array (see Figure 11.10b).
Two points differ compared to the address decoder case:

. The code space will be the same size as the desired restoration population.
. Duplication is allowed.
The question then is, how large a fraction of the inputs will be successfully restored for a given number of randomly selected restoration NWs? This is an instance of the Coupon Collector Problem [28]. If the restoration array is populated with the same number of NWs as inputs, the array will typically contain restoration wires for 50-60% of the NW inputs. One way to consider this is that the array must be populated with 1.7- to 2-fold as many wires as would be hoped to yield, due to these stochastic population effects.
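A small Monte Carlo sketch of this coupon-collector effect (illustrative; it models each restoration NW as covering exactly one input drawn uniformly at random, which is a simplification of the scheme above):

    import random

    def restored_fraction(num_inputs, num_restoration_nws, trials=2000):
        # Average fraction of inputs covered when restoration NWs are drawn
        # uniformly, with duplication allowed, from a code space the size
        # of the input set.
        total = 0
        for _ in range(trials):
            covered = {random.randrange(num_inputs)
                       for _ in range(num_restoration_nws)}
            total += len(covered)
        return total / (trials * num_inputs)

    print(restored_fraction(100, 100))  # ~0.63, i.e. 1 - 1/e in this model
    print(restored_fraction(100, 170))  # ~0.82
    print(restored_fraction(100, 200))  # ~0.87

The 50-60% figure quoted above is somewhat lower than this single-polarity model, presumably because usable restoration also depends on polarity and alignment; the 1.7- to 2-fold overpopulation rule follows the same curve.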
11.5
Memory Array
Figure 11.12 Memory array built from coded NW decoder and crosspoint memory core.
Limitations on reliable NW length, and the capacitance and resistance of long NWs, prevent the building of arbitrarily large memory arrays. Instead, large NW memories are broken up into banks, similar to the banking used in conventional DRAMs (see Figure 11.13). Reliable, lithographic-scale wires provide address and control inputs and data inputs and outputs to each of the NW-based memory banks. The expected yield would be only a fraction of the NWs in the array, due to wire defects. Error-correcting codes (ECC) can be used to tolerate non-programmable crosspoint defects. After accounting for defects, ECC overhead, and lithographic control overhead, net densities on the order of 10^11 bits cm^-2 appear achievable, using NW pitches of about 10 nm [29].
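The density figure can be sanity-checked with simple arithmetic (the order-of-magnitude overhead factor below is an assumption chosen only to illustrate the quoted result):

    pitch_nm = 10.0
    raw_bits_per_cm2 = (1e7 / pitch_nm) ** 2   # 1 cm = 1e7 nm -> 1e12 crosspoints/cm^2
    overhead = 10.0                            # defects + ECC + litho control (assumed)
    print(raw_bits_per_cm2 / overhead)         # ~1e11 bits/cm^2, as quoted above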
11.6
Logic Architecture
provide gain and signal inversion, and the NWs themselves provide interconnect among arrays. Lithographic-scale wires provide a reliable support infrastructure which allows device testing and programming (see Section 11.8), addressing individual NWs using the decoders introduced in Section 11.4.2. Lithographic-scale wires also provide power and control for logic evaluation.
11.6.1
Logic
Figure 11.14 shows a simple PLA created using the building blocks from Section 11.4 and first introduced by DeHon and Wilson [31]. The design includes two interconnected logic planes, each of which is composed of a programmable wired-OR array followed by a restoration array. It should be noted here that two restoration arrays are actually used, one providing the inverted sense of the OR-term logic and one providing the non-inverted, buffered sense. This arrangement is similar to conventional PLAs, where the true and complement sense of each input is provided in each PLA plane. Since wired-OR logic NWs can be inverted in this nanoPLA, each plane effectively serves as a programmable NOR plane. The combination of the two coupled NOR-NOR planes can be viewed as an AND-OR PLA with suitable application of DeMorgan's laws and signal complementation.
11.6.1.1 Construction
The entire construction is simply a set of crossed NWs, as allowed by the regular assembly constraints (see Section 11.3.1). Lithographic-scale etches are used to differentiate regions (e.g. programmable-diode regions for the wired-OR). The topology allows the same NWs that perform logic or restoration to carry their outputs across as inputs to the array that follows.
With slight modification as to how the control signals on the identified logic stages are driven, this can be turned into a clocked logic scheme. An immediate benefit is the ability to create a finite-state machine out of a single pair of PLA planes. A second benefit is the ability to use precharge logic evaluation for inverting restoration stages.

11.6.2.1 Basic Clocking
The basic nanoPLA cycle shown in Figure 11.14 is simply two restoring logic stages back-to-back (see Figure 11.16). For the present clocking scheme, the two stages are evaluated at alternating times.
First, it should be noted that if all three of the control transistors in the restoring stages (restoring precharge and evaluate, and diode precharge; e.g. evalA and prechargeA in Figure 11.16) are turned off, there is no current path from the input to the diode output stage. Hence, the input is effectively isolated from the output. As the output stage is capacitively loaded, the output will hold its value. As with any dynamic scheme, leakage on the output will eventually be an issue, which sets a lower bound on the clock frequency.

With a stage isolated and holding its output, the following stage can be evaluated. It computes its value from its input, the output of the previous stage, and produces its result by suitably charging its output line. When this is done, this stage can be isolated and the succeeding stage (which in this simple case is also its predecessor) can be evaluated. This is the same strategy as two-phase clocking in conventional VLSI (e.g. Refs. [32, 33]).

In this manner, there is never an open current path all the way around the PLA (see Figures 11.16 and 11.17). In the two phases of operation, there is effectively a single register on any PLA outputs which feed back to PLA inputs.
11.6.2.2 Precharge Evaluation
For the inverting stage, the pull-down gate is driven hard during precharge and turned off during evaluation. In this manner, the line (Ainv) is precharged low and pulled up only if the input (Ainput) is low. This works conveniently in this case because the output will also be precharged low. If the input is high, then there is no need to pull up the output and it is simply left low. If the input is low, the current path is allowed to pull up the output. The net benefit is that inverter pull-down and pull-up are both controlled by strongly driven gates and can be fast, whereas in a static logic scheme the pull-down transistor must be weak, making pull-down slow compared to pull-up. Typically, the weak pull-down transistor would be set to have an order of magnitude higher resistance than the pull-up transistor, so this can be a significant reduction in worst-case gate evaluation latency.

Unfortunately, in the buffer case the weak pull-up resistor can neither be precharged high nor turned off, and so there are no comparable benefits there. It is possible that new devices or circuit organizations will eventually allow precharge buffer stages to be built.
11.6.3
Interconnect
It is known from VLSI that large PLAs do not always allow the structure which exists in logic to be exploited. For example, an n-input XOR requires an exponential number of product terms to construct in the two-level logic of a single PLA. Further, the limitation on NW length (see Section 11.3.2) bounds the size of the PLAs that can reasonably be built. Consequently, in order to scale up to large-capacity logic devices, modest-size nanoPLA blocks must be interconnected; these nanoPLA blocks are extended to include input and output to other nanoPLA blocks and then assembled into a large array (see Figure 11.18), as first introduced by DeHon [34].
11.6.3.1 Basic Idea
The key idea for interconnecting nanoPLA blocks is to overlap the restored output NWs from each such block with the wired-OR input region of adjacent nanoPLA blocks (see Figure 11.18). In turn, this means that each nanoPLA block receives inputs from a number of different nanoPLA blocks. With multiple input sources and outputs routed in multiple directions, this allows the nanoPLA block also to serve as a
allow those inputs to be selected which participate in each logical product term (PTERM), building a wired-OR array as in the base nanoPLA (see Section 11.6.1).

. Internal inversion and restoration array. The NW outputs from the input block are restored by a restoration array. The restoration logic at this stage is arranged to be inverting, thus providing the logical NOR of the selected input signals into the second plane of the nanoPLA.
. Output OR plane. The restored outputs from the internal inversion plane become inputs to a second programmable crosspoint region. Physically, this region is the same as the input plane. Each NW in this plane computes the wired-OR of one or more of the restored PTERMs computed by the input plane.
. Selective output inversion. The outputs of the output OR plane are then restored in the same way as the internal restoration plane. On this output, however, the selective inversion scheme introduced in Section 11.6.1 is used. This provides both polarities of each output, and these can then be provided to the succeeding input planes. This selective inversion plays the same role as a local inverter on the inputs of a conventional VLSI PLA; here it is placed with the output to avoid introducing an additional logic plane into the design. As with the nanoPLA block, these two planes provide NOR-NOR logic. With suitable application of DeMorgan's laws, these can be viewed as a conventional AND-OR PLA.
. Feedback. As shown in Figures 11.18 and 11.19, one set of outputs from each nanoPLA block feeds back to its own input region. This completes a PLA cycle similar to the nanoPLA design (see Section 11.6.1). These feedback paths serve the role of intra-cluster routing, similar to internal feedback in conventional island-style [35] FPGAs. The nanoPLA block implements registers by routing signals around the feedback path (Section 11.6.2.1). The signals can be routed around this feedback path multiple times to form long register delay chains for data retiming.
11.6.3.3 Interconnect
. Block outputs. In addition to self feedback, output groups are placed on either side of the nanoPLA block and can be arranged so that they cross the input blocks of nanoPLA blocks above or below the source nanoPLA block (see Figure 11.18). Like segmented FPGAs [36, 37], output groups can run across multiple nanoPLA block inputs (i.e. connection boxes) in a given direction. The nanoPLA block shown in Figure 11.19 has a single output group on each side, one routing up and the other routing down. It will be seen that the design shown is sufficient to construct a minimally complete topology.
Since the output NWs are driven directly through field-effect gated regions: (i) an output wire can be driven from only one source; and (ii) it can only drive in one direction. Consequently, unlike segmented FPGA wire runs, directional wires must be present that are dedicated to a single producer. If multiple control regions were coded into the NW runs, conduction would be the AND of the producers crossing the coded regions. Single-direction drive arises from the fact that one side of the
gate must be the source logic signal being gated, so the logical output is only available on the opposite side of the controllable region. Interestingly, the results of recent studies have suggested that conventional, VLSI-based FPGA designs also benefit from directional wires [38].
. Y route channels. With each nanoPLA block producing output groups which run one or more nanoPLA block heights above or below the array, the result is vertical (Y) routing channels between the logic cores of the nanoPLA blocks (see Figure 11.18). The segmented NW output groups allow a signal to pass a number of nanoPLA blocks. For longer routes, the signal may be switched and rebuffered through a nanoPLA block (see Figure 11.20). Because of the output directionality, the result is separate sets of wires for routing up and routing down in each channel.
. X routing. While Y route channels are immediately obvious in Figure 11.18, the X route channels are less apparent. All X routing occurs through the nanoPLA block. As shown in Figure 11.19, one output group is placed on the opposite side of the nanoPLA block from the input. In this way, it is possible to route in the X direction by going through a logic block and configuring the signal to drive a NW in the output group on the opposite side of the input. If all X routing blocks had their inputs on the left, then it would be possible only to route from left to right. To allow both left-to-right and right-to-left routing, the orientation of the inputs is alternated in alternate rows of the nanoPLA array (see Figures 11.18 and 11.20). In this manner, even rows provide left-to-right routing, while odd rows allow right-to-left routing.
11.6.4
CMOS IO
loaded capacitively by the lithographic-scale output, and only for a short distance. The NW thresholds and lithographic FET thresholds can be tuned into comparable voltage regions so that the NWs can drive the lithographic FET at adequate voltages for switching. As shown, multiple NWs will cross the lithographic-scale gate. The OR-terms driving these outputs are all programmed identically, allowing the multiple-gate configuration to provide strong switching for the lithographic-scale FET.
11.6.5
Parameters
The key parameters in the design of the nanoPLA block are shown in Figure 11.22. The number of physical PTERMs required is

Pp = P + Op    (11.2)

That is, in addition to the P logical PTERMs, one physical wire may be needed for each signal that routes through the array for buffering; there will be at most Op of these. Additionally, the number and distribution of inputs [e.g. one side (as shown in Figure 11.22), from both sides, or subsets of PTERMs from each side], the output topology (e.g. route both up and down on each side of the array), and segment length distributions could be parameterized. However, in this chapter attention is focused on this simple topology, with Lseg = 2. Consequently, the main physical parameters determining nanoPLA array size are Wseg and Pp.
11.7
Defect Tolerance
As noted in Section 11.3.3, it is likely that a small percentage of wires will be defective and crosspoints non-programmable. Furthermore, stochastic assembly (see Sections 11.4.2.2 and 11.4.3.3) and misalignment will also result in a percentage of NWs which are unusable. Fortunately, NWs are interchangeable and the crosspoints are small. Consequently, spare NWs can be provisioned into an array (e.g. overpopulating compared to the desired Pp and Wseg), NWs can be tested for usability (see Section 11.8.1), and the array configured using only the non-defective NWs. Further, a NW need not have a perfect set of junctions to be usable (see Section 11.7.4).
11.7.1
NW Sparing
There are C(N, i) ways to select i functional OR-terms from N total wires, and the yield probability of each case is (POR)^i (1 - POR)^(N - i). An ensemble with M items is yielded whenever M or more items yield, so the system yield is actually the cumulative distribution function:

PM of N = Σ_{M ≤ i ≤ N} C(N, i) (POR)^i (1 - POR)^(N - i)    (11.5)
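Figure 11.23 plots the physical population N needed to yield M = 100 restored OR-terms; the same calculation takes a few lines (a sketch; the POR value and confidence level below are assumptions for illustration):

    from math import comb

    def p_m_of_n(m, n, p_or):
        # Eq. (11.5): probability that at least m of n wires yield.
        return sum(comb(n, i) * p_or**i * (1 - p_or)**(n - i)
                   for i in range(m, n + 1))

    def population_needed(m, p_or, confidence=0.999):
        # Smallest raw population n giving m yielded wires at the stated confidence.
        n = m
        while p_m_of_n(m, n, p_or) < confidence:
            n += 1
        return n

    print(population_needed(100, 0.8))  # raw wires to guarantee 100 good OR-terms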
The probability that an individual NW yields depends on several failure modes:

. The NW may make poor electrical contact to the microwires on either end (let Pc be the probability that the NW makes a good connection on one end).
. The NW may be broken along its length (let Pj be the probability that there is no break in a NW in a segment of length Lunit).
. The NW may be poorly aligned with the address region (wired-OR NWs) or restoration region (restoration NWs) (let Pctrl be the probability that the NW is aligned adequately for use).
Figure 11.23 Physical population (N) of wires to achieve 100 restored OR-terms (M).
Combining these factors over a wire of length Lwire gives

Pwire = (Pc)^2 (Pj)^(Lwire/Lunit) Pctrl    (11.6)

Typically, Pc ≈ 0.95 (after Ref. [5]) and Pj ≈ 0.9999 with Lunit = 10 nm (after Ref. [19]; see also Refs. [27, 34]). Pctrl can be calculated from the geometry of the doped regions [24]. Pwire is typically about 0.8.
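Under this form of Eq. (11.6), the quoted typical values are mutually consistent; a sketch with an assumed wire length and Pctrl:

    def p_wire(p_c=0.95, p_j=0.9999, l_wire_nm=10_000, l_unit_nm=10, p_ctrl=0.95):
        # Net NW yield: two good contacts, no break in any of the
        # l_wire/l_unit segments, and adequate alignment. p_c, p_j, and
        # l_unit are the typical values quoted above; l_wire (10 um) and
        # p_ctrl are assumptions.
        return (p_c ** 2) * (p_j ** (l_wire_nm / l_unit_nm)) * p_ctrl

    print(p_wire())  # ~0.78, in line with "Pwire is typically about 0.8"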
11.7.3
Net NW Yield Calculation
A detailed calculation for NW population includes both wire-defect effects and stochastic population effects. Starting with a raw population number for the NWs in each piece of the array, it is possible to:

. calculate the number of non-defective wired-OR wires within the confidence bound [Eqs. (11.6) and (11.5)];
. calculate the number of those which can be uniquely addressed using the following recurrence:

Pdifferent(T, N, u) = ((T - (u - 1))/T) Pdifferent(T, N - 1, u - 1) + (u/T) Pdifferent(T, N - 1, u)    (11.7)

where T is the number of different wire types (i.e. the size of the address space), N is the raw number of nanowires populated in the array, and u is the number of unique NWs in the array;
. calculate the number of net non-defective restored wire pairs within the confidence bound [Eqs. (11.3), (11.6), and (11.5)];
. calculate the number of uniquely restored OR-terms using Eq. (11.7); in this case, T is the number of possible restoration wires rather than the number of different NW addresses.
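The recurrence of Eq. (11.7) memoizes directly (a sketch; the base cases are ours but follow from the definition):

    from functools import lru_cache

    @lru_cache(maxsize=None)
    def p_different(t, n, u):
        # Eq. (11.7): probability that n wires drawn uniformly from t types
        # contain exactly u distinct types.
        if n == 0:
            return 1.0 if u == 0 else 0.0
        if u < 0 or u > n or u > t:
            return 0.0
        return ((t - (u - 1)) / t) * p_different(t, n - 1, u - 1) \
             + (u / t) * p_different(t, n - 1, u)

    # Expected number of unique NWs among 100 drawn from 3432 address codes:
    print(sum(u * p_different(3432, 100, u) for u in range(101)))  # ~98.6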
As will be seen in Table 11.1, PLA crosspoint arrays are typically built with approximately 100 net junctions. If it were demanded that all 100 crosspoint junctions on a NW be programmable in order for the NW to yield, then an unreasonably high yield rate per crosspoint would be required. That is, assuming a crosspoint is programmable with probability Ppgm, and a NW has Njunc input NWs and hence Njunc crosspoint junctions, the probability that every junction on the wire is programmable is

Ppgmwire = (Ppgm)^Njunc    (11.8)
Table 11.1 Pp, Wseg, and area ratio (lithographic FPGA area relative to nanoPLA area) for the Toronto20 benchmark set.

Benchmark    Pp     Wseg    Area ratio
alu4         60     8       340
apex2        54     15      39
apex4        62     7       210
bigkey       44     13      69
clma         104    28      30
des          78     25      26
diffeq       86     21      32
dsip         58     18      59
elliptic     78     27      27
ex1010       66     9       290
ex5p         67     18      390
frisc        92     34      17
misex3       64     8       150
pdc          74     13      360
s298         79     15      110
s38417       76     22      32
seq          72     18      69
spla         68     12      630
tseng        78     25      20
To have Ppgmwire ≥ 0.5, Ppgm would need to be >0.993. However, as was noted in Section 11.3.3, the non-programmable crosspoint defect rates are expected to be in the range of 1 to 10% (0.9 ≤ Ppgm ≤ 0.99).
As introduced by Naeimi and DeHon [39], a NW with non-programmable crosspoints can still be used to implement a particular OR-term, as long as it has programmable crosspoints at the positions where the OR-term needs junctions programmed on. Furthermore, as the array has a large number of otherwise interchangeable NWs (e.g. 100), it is possible to search through the array for NWs that can implement each particular OR-term. For example, if a logic array (AND or OR plane) of a nanoPLA has defective junctions (as marked in Figure 11.24), the OR-term f = A + B + C + E can be assigned to NW W3, despite the fact that W3 has a defective (non-programmable) junction at (W3, D); that is, the OR-term f is compatible with the defect pattern of NW W3.
As the number of programmed junctions needed for a given OR-term is usually small (e.g. 8-20) compared to the number of inputs in an array (e.g. 100), the probability that a NW can support a given OR-term is much larger than the probability that it has no junction defects. Assuming that C is the fan-in of the OR-term, and assuming random junction defects, the probability that the NW can support the OR-term is

Psupport(C) = (Ppgm)^C    (11.9)

For example, in a 100-NW array, if Ppgm = 0.95, then Psupport(13) ≈ 0.51, whereas Ppgmwire ≈ 0.006. Furthermore, as multiple NWs in the array can be tried in order to find a compatible match, failure to map an OR-term will only occur if there are no compatible NWs in the array:

Pmatch(C, Nwire) = 1 - (1 - Psupport(C))^Nwire    (11.10)
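Eqs. (11.9) and (11.10) in executable form (the function names are ours):

    def p_support(c, p_pgm):
        # Eq. (11.9): a NW with random junction defects can realize a fan-in-c OR-term.
        return p_pgm ** c

    def p_match(c, n_wire, p_pgm):
        # Eq. (11.10): at least one of n_wire interchangeable NWs is compatible.
        return 1 - (1 - p_support(c, p_pgm)) ** n_wire

    print(p_support(13, 0.95))     # ~0.51, as quoted above
    print(0.95 ** 100)             # ~0.006 = Ppgmwire for a fully perfect 100-junction NW
    print(p_match(13, 100, 0.95))  # ~1.0: mapping failure is vanishingly rare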
11.8
Bootstrap Testing
11.8.1
Discovery
Since addressing and restoration are stochastic, there is a need to discover the live addresses and their restoration polarity. Further, as some NWs will be defective, it is vital to identify those NWs which are usable and those which are not. Here, the restoration columns (see Figures 11.14 and 11.19) are used to help identify useful addresses.
The gate-side supply (e.g. the top set of lithographic wire contacts in Figure 11.10) can be driven to a high value, after which a voltage is sought on the opposite supply line (e.g. the bottom set of lithographic wire contacts in Figure 11.10; these contacts are marked Vhigh and Gnd in Figure 11.10, but are controlled independently as described here during discovery). There will be current flow into the bottom supply only if the control associated with the p-type restoration wire can be driven to a sufficiently low voltage. The process is started by driving all the row lines high, using the row precharge path. A test address is then applied and the supply (Vrow in Figure 11.14) is driven low. If a NW with the test address is present, only that line will now be strongly pulled low. If the associated row line can control one or more wires in the restoration plane, the selected wires will now see a low voltage on their field-effect control regions and enable conduction from the top supply to the bottom supply. By sensing the voltage change on the bottom supply, the presence of a restored address can be deduced. Broken NWs will not be able to affect the bottom supply. NWs with excessively high resistance, due to doping variations or poor contacts, will not be able to pull the bottom supply contact up quickly enough. As the buffering and inverting column supplies are sensed separately, it will be known whether the line is buffering, inverting, or binate.
No more than O((Pp)^2) unique addresses are needed to achieve virtually unique row addressing [24], so the search will require at most O((Pp)^2) such probes. A typical address width for the nanoPLA blocks is Na = 14, which provides 3432 distinct 7-hot codes, and a typical number of OR-terms might be 90 (see Table 11.1). Hence, 3432 addresses may need to be probed to find 90 live row wires.
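The probe set is just the enumeration of all Na/2-hot patterns (a sketch):

    from itertools import combinations

    na = 14  # typical nanoPLA address width quoted above
    probes = list(combinations(range(na), na // 2))  # positions of the "hot" bits
    print(len(probes))  # 3432 candidate addresses to probe for ~90 live rows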
When all the addresses present in an array, and the restoration status associated with each address, are known, logic can be assigned to logical addresses within each plane, based on the required restoration for the output. With logic assigned to live addresses in each row, the addresses of the producing and consuming row wires can then be used to select and program a single junction in a diode-programmable OR plane.
11.8.2
Programming
In order to program any diode crosspoint in the OR planes (e.g. Figure 11.14),
one address is driven into the top address decoder, and the second address into
the bottom. The stochastic restoration performs the corner turn, so that the desired
programming voltage differential is effectively placed across a single crosspoint.
The voltages and control gating on the restoration columns are then set to define
which programmable diode array is actually programmed during a programming
operation. For example, in Figure 11.14 the ohmic supply contacts at the top and
bottom are the control voltages; the signals used for control gating are labeled with
precharge and eval signal names. To illustrate the discovery and programming
process, DeHon [29] presents the steps involved in discovering and programming an
exemplary PLA.
11.8.3
Scaling
It should be noted that each nanoPLA array is addressed separately from its set of
microscale wires (A0, A1, . . . and Vrow, Vbot, and Vtop; see Figure 11.14). Consequently, the programming task is localized to each nanoPLA plane, and the work
required to program a collection of planes (e.g. Figure 11.18) only scales linearly with
the number of planes.
11.9
Area, Delay, and Energy
11.9.1
Area
From Figures 11.19 and 11.22 the basic area composition of each tile can be seen. For
this, the following feature size parameters are used:
- Wlitho is the lithographic interconnect pitch; for example, for the 45-nm node,
  Wlitho = 105 nm [40].
- Wdnano is the NW pitch for NWs which are inputs to diodes (i.e. Y route channel
  segments and restored PTERM outputs).
- Wfnano is the NW pitch for NWs which are inputs to field-effect gated NWs; this may
  be larger than Wdnano in order to prevent inputs from activating adjacent gates and
  to avoid short-channel FET limitations.
The tile area (Eq. (11.14)) is computed by first determining the tile width, TW, and the
tile height, TH (Eqs. (11.11) and (11.12)):

TW = (3 + 4(Lseg + 1)) Wlitho + (Por + 4(Lseg + 1) Wsegr) Wdnano    (11.11)

and the address width

AW = (Na + 2) Wlitho    (11.13)
where Por, Pir, Or, and Wsegr (shown in Figure 11.19) are the raw numbers of wires
needed to populate the array in order to yield Pp restored inputs, Op restored
outputs, and Wseg routing channels (see Section 11.7.3). The two 4s in TW arise from
the fact that there are Lseg + 1 wire groups on each side of the array (x2), and each of
those is composed of a buffer/inverter selective inversion pair (x2). A lithographic
spacing is charged for each of these groups, as they must be etched for isolation and
controlled independently by lithographic-scale wires. The 12 lithographic pitches in
TH account for the three lithographic pitches needed on each side of a group of wires
for the restoration supply and enable gating. As segmented wire runs end and begin
between the input and output horizontal wire runs, these three lithographic pitches
are paid for four-fold in the height of a single nanoPLA block: once each at the bottom
and top of the block, and twice in the middle (see Figure 11.19).
Na is the number of microscale address wires needed to address individual,
horizontal nanoscale wires [24]; for the nanoPLA blocks in these studies, Na is
typically 14 to 20. Two extra wire pitches in the AddressWidth (AW) account for the two
power supply contacts at either end of an address run.
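A minimal sketch of the tile-geometry arithmetic follows, using Eqs. (11.11) and (11.13). Only Wlitho = 105 nm and Na = 14 take values quoted in the text; Wdnano, Lseg, Por, and Wsegr are placeholder assumptions chosen for illustration.

```python
# Tile width and address width under assumed array parameters.
W_LITHO = 105.0   # nm, lithographic pitch at the 45-nm node (from the text)
W_DNANO = 10.0    # nm, assumed diode-input NW pitch
L_SEG = 2         # assumed number of routing segment lengths
P_OR = 90         # raw OR-plane wire count (assumed)
W_SEGR = 20       # raw wires per routing channel (assumed)
N_A = 14          # microscale address wires (from the text)

tw = (3 + 4 * (L_SEG + 1)) * W_LITHO + (P_OR + 4 * (L_SEG + 1) * W_SEGR) * W_DNANO
aw = (N_A + 2) * W_LITHO
print("TW = %.0f nm, AW = %.0f nm" % (tw, aw))
```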
11.9.2
Delay
Figures 11.16 and 11.17 show the basic nanoPLA clock cycle, Tplacycle. The component
delays shown in Figure 11.17 (e.g. nanowire precharge and evaluation times) are
calculated based on the NW resistances and capacitances, the crosspoint resistances,
and the nanowire FET resistances [29]. NW resistance and capacitance can be
calculated based on geometry and material properties using the NW lengths, which
are roughly multiples of the tile width, TW, and tile height, TH, identified in the
previous section. If simple, heavily doped silicon nanowires are used, the NW
resistances can be close to 10 MΩ, and this results in nanoPLA clock cycle times
in the tens of nanoseconds. However, if the regions of the NW which do not need to
be semiconducting (everything except the diode crosspoint region and the field-effect
restoration region) are converted selectively into a nickel silicide (NiSi) [12], the NW
resistances can be reduced to the 1 MΩ range. As a result, the nanoPLA clock cycle is
brought down to the nanosecond region. This selective conversion can be performed
as a lithographic-scale masking step and, with careful engineering, subnanosecond
nanoPLA cycle times may be possible. As long as the NW resistance is in the 1 MΩ
range, it will dominate the on-resistance of both the field-effect gating in the
restoration NW (Ronfet) and the diode on-resistance (Rondiode), which are in the
100 kΩ range.
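The order of magnitude of these cycle times can be checked with a one-line RC estimate. The resistances are the values quoted above; the total nanowire capacitance of roughly 1 fF is an assumption made here (about 100 aF per micron over some 10 µm of wire), not a figure from the text.

```python
# tau = R_NW * C_NW for the two resistance regimes discussed above.
C_NW = 1e-15                      # F, assumed total nanowire capacitance
for r_nw in (10e6, 1e6):          # doped-Si NW vs. NiSi-converted NW (Ohm)
    print("R = %.0e Ohm -> tau = %.0e s" % (r_nw, r_nw * C_NW))
# -> 1e-08 s (tens-of-nanoseconds regime) and 1e-09 s (nanosecond regime)
```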
11.9.3
Energy and Power
The nanoPLAs will dissipate active energy in charging and discharging the functional
and configured NWs:

ENW = (1/2) Cwire V²    (11.15)

As noted in the previous section, Cwire can be computed from the material properties
and geometry. To tolerate variations in NW doping, it is likely that the operating voltage
will need to be 0.5 to 1 V.
The raw ENW can be discounted by the fraction of NWs typically used in a routing
channel or a logic array, F; this tends to be 70-80% with the current tools and
designs. When using the selective inversion scheme, both polarities of most signals
will typically be driven, which guarantees an activity factor, A, close to 50%. Together
with the clock rate, these factors determine the total array power, Parray (Eq. (11.16)),
from which the power density follows:

Pdensity = Parray / Area    (11.17)
Here, Area is the area for the tile as calculated in Eq. (11.14).
Figure 11.25 shows the power density associated with interconnected nanoPLAs,
and suggests that the designs may dissipate a few hundred watts per cm². In typical
designs, compute arrays would be interleaved with memory banks (see Section 11.5),
which have much lower power densities. Nonetheless, this suggests that power
management is as much an issue in these designs as it is in traditional lithographic
designs.
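A rough sketch of how such power densities arise from Eq. (11.15) is given below. F, A, and the 0.5-1 V supply range come from the text; the capacitance, wire count, clock rate, and tile area are illustrative assumptions only.

```python
# Active power density estimate: E_NW = C V^2 / 2, discounted by F and A.
C_WIRE = 1e-15            # F per nanowire (same assumption as above)
V = 0.7                   # V, inside the quoted 0.5-1 V range
N_WIRES = 200             # nanowires per tile (assumed)
F, A = 0.75, 0.5          # utilization and activity factors from the text
FREQ = 1e9                # Hz, assumed nanosecond clock cycle
TILE_AREA_CM2 = (2e-4) ** 2   # assumed 2 um x 2 um tile, in cm^2

e_nw = 0.5 * C_WIRE * V ** 2                     # Eq. (11.15)
p_tile = N_WIRES * F * A * e_nw * FREQ           # discounted array power
print("%.0f W/cm^2" % (p_tile / TILE_AREA_CM2))  # a few hundred, cf. Fig. 11.25
```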
11.10
Net Area Density
The benchmark evaluation uses the yield statistics (see Section 11.7) to compute the
physical nanowire population, and the area equations in Section 11.9.1 to compute
the composite area. Subsequently, the resultant minimum-area nanoPLA designs are
compared with lithographic 4-LUT FPGAs at the 22 nm node [40]. As shown in
Table 11.1, and further detailed in Ref. [29], the
routed nanoPLA designs are one to two orders of magnitude smaller than 22 nm
lithographic FPGAs, even after accounting for lithographic addressing overhead,
defects, and statistical addressing.
The datapoints in Table 11.1 are based on a number of assumptions about
lithographic and nanowire pitches and statistical assembly. DeHon [29] also examined
the sensitivity to these various parameters, and showed that the statistical
restoration assembly costs a factor of three in density for large arrays, while the cost of
statistical addressing is negligible. If the diode pitch (Wdnano) could be reduced to
5 nm, another factor of almost two in area could be saved. Moreover, if the
lithographic support were also reduced to the 22 nm node (Wlitho = 45 nm), a further
three-fold density advantage would be gained compared to the data in Table 11.1.
11.11
Alternate Approaches
During recent years, several groups have been studying variants of these nanowire-based
architectures (see Table 11.2). Heath et al. [41] articulated the first vision for
constructing defect-tolerant architectures based on molecular switching and bottom-up
construction. Luo et al. [42] elaborated the molecular details and diode-logic
structure, while Williams and Kuekes [23] introduced a random-particle decoder
scheme for addressing individual NWs from lithographic-scale wires. These early
designs assumed that diode logic was restored and inverted using lithographic-scale
CMOS buffers and inverters.
Goldstein and Budiu [43] described an interconnected set of these chemically
assembled diode-based devices, while Goldstein and Rosewater [44] used only
two-terminal non-restoring devices in the array, but added latches based on
resonant-tunneling diodes (RTDs) for clocking and restoration. Snider et al. [45] suggested
nanoFET-based logic and also tolerated non-programmable crosspoint defects by
matching logic to the programmability of the device.
Strukov and Likharev [46] also explored crosspoint-programmable nanowire-based
programmable logic and used lithographic-scale buffers with an angled topology and
nanovias so that each long NW could be directly attached to a CMOS-scale buffer.
Later, Snider and Williams [47] built on the Strukov and Likharev interfacing concept
and introduced a more modest design which used NWs and molecular-scale switches
only for interconnect, performing all logic in CMOS.
These designs all share many high-level goals and strategies, as have been
described in this chapter. They suggest a variety of solutions to the individual
technical components including the crosspoint technologies, NW formation, lithographic-scale interfacing, and restoration (see Table 11.2). The wealth of technologies
and construction alternatives identified by these and other research groups has
increased the general confidence that there are options to bypass any of the challenges
that might arise when realizing any single technique or feature in these designs.

Table 11.2 Component technologies in related nanowire-based architecture proposals.

Design source    NW                           Logic                Litho <-> NW       Restoration   Reference(s)
HP/UCLA          Imprint lithography          Nanoscale wired-OR   Random particles   CMOS          22, 41
CMU nanoFabric   NanoPore templates           Nanoscale wired-OR                      RTD latch     43, 44
SUNY CMOL        Interferometric lithography  Nanoscale wired-OR   Offset angles      CMOS          46
HP FPNI          Imprint lithography          CMOS (N)AND          Offset angles      CMOS          47
This chapter     Catalyst NWs                 Nanoscale wired-OR   Coded NWs          NW FET
11.12
Research Issues
While the key building blocks have been demonstrated, as previously cited, considerable
research and development remains in device synthesis, assembly, integration,
and process development. At present, no complete fundamental understanding of
the device physics at these scales is available, and a detailed and broader characterization
of the devices, junctions, interconnects, and assemblies is necessary to refine
the models, to better predict the system properties, and to drive architectural designs
and optimization.
The mapping results outlined in Section 11.10 were both area- and defect-tolerance
driven. For high-performance designs, additional techniques, design transformations,
and optimizations will be needed, including interconnect pipelining (e.g.
Ref. [48]) and fan-out management (e.g. Ref. [49]).
In Section 11.7 it was noted that high defect rates could be tolerated when the
defects occurred before operation. However, new defects are likely to arise during
operation, and additional techniques and mechanisms will be necessary to detect
their occurrence, to guard the computation against corruption when they do occur,
and to rapidly reconfigure around the new defects.
Further, it is expected that these small-feature devices will encounter transient
faults during operation. Although the exact fault rates are at present unknown, they
are certainly expected to exceed those rates traditionally seen in lithographic silicon.
This suggests the need for new lightweight techniques and architectures for fault
identification and correction.
11.13
Conclusions
Acknowledgments
The architectural studies into devices and construction techniques which emerge
from scientific research do so only after close and meaningful collaboration with the
physical scientists. These studies have been enabled by collaboration with Charles M.
Lieber and his students. The suite of solutions summarized here includes joint
investigations with Helia Naeimi, Michael Wilson, John E. Savage, and Patrick
Lincoln.
These research investigations were funded in part by National Science Foundation
Grant CCF-0403674 and the Defense Advanced Research Projects Agency under
ONR contracts N00014-01-0651 and N00014-04-1-0591.
Any opinions, findings, and conclusions or recommendations expressed in this
material are those of the authors, and do not necessarily reflect the views of the
National Science Foundation or the Office of Naval Research.
Christian Nauenheim and Rainer Waser helped to produce this brief chapter as a
digested version of Ref. [29].
References
1 V. Betz, J. Rose, FPGA place-and-route
challenge, 1999. Available at http://
www.eecg.toronto.edu/vaughn/
challenge/challenge.html.
2 A. M. Morales, C. M. Lieber, A laser
ablation method for synthesis of crystalline
semiconductor nanowires. Science 1998,
279, 208–211.
3 (a) Y. Cui, L. J. Lauhon, M. S. Gudiksen, J.
Wang, C. M. Lieber, Diameter-controlled
synthesis of single crystal silicon
nanowires. Appl. Phys. Lett. 2001, 78 (15),
2214–2216. (b) Y. Cui, Z. Zhong, D. Wang,
W. U. Wang, C. M. Lieber, High
performance silicon nanowire field effect
transistors. Nano Lett. 2003, 3 (2), 149–152.
10
11
12
13
14
15
16
27 A. DeHon, Array-based architecture for
FET-based, nanoscale electronics. IEEE
Trans. Nanotech. 2003, 2 (1), 23–32.
28 F. G. Maunsell, A problem in cartophily.
The Math. Gazette 1937, 22, 328–331.
29 A. DeHon, Nanowire-based
programmable architecture. ACM J.
Emerging Technol. Comput. Systems 2005, 1
(2), 109–162.
30 A. DeHon, H. Naeimi, Seven strategies for
tolerating highly defective fabrication.
IEEE Design Test Comput. 2005, 22 (4),
306–315.
31 A. DeHon, M. J. Wilson, Nanowire-based
sublithographic programmable logic
arrays, in: Proceedings International
Symposium on Field-Programmable Gate
Arrays, Napa Valley, CA, IEEE Publishers,
pp. 123–132, 2004.
32 C. Mead, L. Conway, Introduction to VLSI
Systems, Addison-Wesley, 1980.
33 N. H. E. Weste, D. Harris, CMOS VLSI
Design: A Circuits and Systems Perspective,
3rd edn., Addison-Wesley, 2005.
34 A. DeHon, Design of programmable
interconnect for sublithographic
programmable logic arrays, in:
Proceedings International Symposium on
Field-Programmable Gate Arrays,
Monterey, CA, ACM Publishers, pp.
127–137, 2005.
35 V. Betz, J. Rose, A. Marquardt, Architecture
and CAD for Deep-Submicron FPGAs,
Kluwer Academic Publishers, Norwell,
MA, 1999.
36 S. Brown, M. Khellah, Z. Vranesic,
Minimizing FPGA interconnect delays.
IEEE Des. Test Comput. 1996, 13 (4), 16–23.
37 V. Betz, J. Rose, FPGA routing
architecture: Segmentation and buffering
to optimize speed and density, in:
Proceedings International Symposium on
Field-Programmable Gate Arrays,
Monterey, CA, ACM Publishers, pp.
59–68, 1999.
38 G. Lemieux, E. Lee, M. Tom, A. Yu,
Directional and single-driver wires in
FPGA interconnect, in: Proceedings
International Conference on Field-
39
40
41
42
43
44
45
46
47
48
12
Quantum Cellular Automata
Massimo Macucci
12.1
Introduction
The concept of quantum cellular automata (QCA) was first proposed by Craig Lent
and coworkers [1] at the University of Notre Dame in 1993, as an alternative to
traditional architectures for computation, based on bistable, electrostatically coupled
cells. Overall, the QCA architecture probably represents the most fully developed
proposal for an alternative computing paradigm, carried up to the experimental proof
of principle [2]. As will be discussed in the following sections, its strengths are
represented by the reduced complexity (in particular for the implementation based
on ground-state relaxation), extremely low power consumption, and potential for
ultimate miniaturization; its drawbacks are the extreme sensitivity to fabrication
tolerances and stray charges, the difficulty in achieving operating temperatures
reasonably close to room temperature, the undesired interaction among electrodes
operating on different cells, and the very challenging control of dot occupancy.
The initial formulation of the QCA architecture relied on the relaxation to the
ground state of an array of cells: computation was performed by enforcing the
polarization state of a number of input cells and then reading the state of a number of
output cells, once the array had relaxed down to the ground state consistent with the
input data. Such an approach is characterized (as will be discussed in the following)
by a layout that is simple, at least in principle, but it suffers from the presence of many
states very close in energy to the actual ground state. This leads to an extremely slow
and stochastic relaxation, which may result in unacceptable computation times.
The slow and unreliable evolution of the ground-state relaxation approach was
addressed with the introduction of a modified QCA architecture based on clocked
cells [3], which can exist in three different conditions, depending on the value of a
clock signal:
- The null condition corresponds to having no electrons in the cell and therefore no
  polarization.
- The active condition is the one in which the cell adiabatically reaches the
  polarization condition resulting from that of the nearby cells.
- The locked condition is the one in which the electrons are frozen in the dots they
  currently occupy, so that the cell polarization is held regardless of subsequent
  changes in the nearby cells.
The clocked QCA architecture solves the problem of unreliable evolution and
allows data pipelining, but introduces a remarkable complication: the clock signal
must be distributed, with proper phases, to all the cells in the array. Unless a
wireless technique for clock distribution can be devised (some proposals have
been made in this direction, but a definite solution is yet to be found), one of the most
attractive features of QCA circuits, the lack of interconnections, would be lost.
Current research is focusing on the possibility of implementing QCA cells with
molecules [4] or with nanomagnets [5], in order to explore the opportunities for
further miniaturization (molecular cells) and for overcoming the limitations imposed
by Coulomb coupling (nanomagnetic cells). However, these technologies do
not seem to be suitable for fast operation: highly parallel approaches could make up
for the reduced speed, but this would further complicate system design and the
definition of the algorithms.
Although the basic principle of operation is sound, the above-mentioned technological
difficulties and the reliability problems make practical application of QCA
technology unlikely, at least in the near future. Nevertheless, the QCA concept
remains of interest and the subject of lively research, because of its innovation
potential and because it opens up a perspective beyond the so far unchallenged
three-terminal device paradigm for computation.
In this chapter, an overview of the QCA architecture will be provided, with a
discussion of its two main formulations: the ground-state relaxation approach and
the clocked QCA approach. In Section 12.2 the issue of operation with cells with more
than two electrons will also be addressed, as well as the details of intercell interaction.
Section 12.3 will focus on the various techniques that have been developed to model
QCA devices and circuits, while Section 12.4 will be devoted to the challenges facing
the implementation of QCA technology. Actual physical implementations of QCA
arrays will be addressed in Section 12.5, and an outlook on the future will be
presented in Section 12.6.
12.2
The Quantum Cellular Automaton Concept
12.2.1
A New Architectural Paradigm for Computation
The original proposal considered a chain of quantum dashes, each containing a
single electron: if the electron in the first dash was confined into one end of the dash,
the electron in the next dash would be confined into the opposite end, as a result of
electrostatic repulsion. This configuration would propagate along the line of dashes,
leading to a sort of antiferroelectric ordering that could then be exploited for the
implementation of more complex functions. This initial concept, however, had a
serious problem, consisting in the fact that the localization of electrons along the
chain of dashes would soon decay (see Figure 12.2), because the electrostatic
repulsion due to an electron localized at the end of a dash is not sufficient to
significantly localize the electron in the nearby dash, the probability density of which
would just be slightly displaced.
The Notre Dame group realized that this problem could be solved with the insertion
of a barrier in the middle of the dash: in this way, the electron wave function must be
localized on either side of the barrier and the electrostatic interaction from the
electron in the nearby dash is sufficient to push the electron into the opposite half of
the dash (Figure 12.3). This concept can be easily understood considering a two-level
system subject to an external perturbing potential V [7]. The Hamiltonian of such a
system in second quantization reads:
Ĥ = Σ_{i=1,2} E_i n_i − t (b†₁ b₂ + b†₂ b₁) + Σ_{i=1,2} q V_i n_i ,    (12.1)
where n₁ and n₂ are the occupation numbers of levels 1 and 2, respectively, b†_i and b_j
are the creation and annihilation operators for levels i and j, t is the coupling between
the two levels, and V_i is the external perturbing potential at the location of the i-th level.

Figure 12.2 Sketch of the actual electron density within a chain of dashes.
The creation operator b†_i applied to a state with (n − 1) electrons yields a state with n
electrons, thereby creating an electron in state i, while the annihilation operator b_j
applied to a state with n electrons yields a state with (n − 1) electrons, thereby
destroying an electron in state j. For example, the application of b†₁ b₂ transfers
an electron from level 2 to level 1. Each level can be associated with one of the sides into
which the dash is divided by a potential barrier: if the barrier is exactly in the middle of
the dash, E₁ = E₂, and the value of t depends on the height and thickness of the barrier;
the higher and the thicker the barrier, the smaller t will be. A sketch of the potential
profile is provided in Figure 12.4, where the dots represent the locations of the two levels
1 and 2. If E₁ + qV₁ is chosen as the energy reference and ε is defined as
ε = (E₂ + qV₂) − (E₁ + qV₁), the Hamiltonian can be represented in the basis
(|0⟩ ≡ (n₁ = 1, n₂ = 0), |1⟩ ≡ (n₁ = 0, n₂ = 1)) simply as (computing the matrix
elements between the basis states)
H = |  0   −t |
    | −t    ε |    (12.2)
The state |0⟩ corresponds to having an electron in level 1 and no electron in level 2,
while |1⟩ corresponds to the situation with level 1 empty and an electron in level 2. The
eigenvalues of this representation can be easily computed:
e₁ = [ε − √(ε² + 4t²)] / 2 ,    e₂ = [ε + √(ε² + 4t²)] / 2    (12.3)
Figure 12.4 Sketch of the potential landscape defining the two levels, 1 and 2, separated by a barrier.
The unbalance term ε in this case depends only on the external potential produced
by the electron in the nearby dash: it will vary between a negative and a positive value,
depending on the position of the electron.
The occupancy of the first level (that is, of the dash side labeled with 1) will be
given by the square modulus of the corresponding coefficient of the ground-state
eigenvector, which can be computed along with the eigenvalues. Such a quantity is
plotted in Figure 12.5 as a function of the unbalance ε for different values of the
coupling energy t. It is apparent that, for low values of t (and therefore for an opaque
barrier), the electron moves abruptly from one level to the other as the external field is
varied; it is therefore strongly localized even for very small values of such a field. For
high values of t (and therefore for a transparent or inexistent barrier), the transfer of
the probability density from one level to the other is very smooth, and large values of
the perturbing field are required to achieve some degree of localization. Thus, the
introduction of a barrier in the middle of the dash creates a sort of bistability, that is, a
strongly non-linear response to external perturbations, which is at the heart of QCA
operation and allows the regeneration of logic values.
From the wire of dashes with barriers of Figure 12.3, the next step is represented by
joining two adjacent dashes to form a square cell, which is the basis of the Notre Dame
QCA architecture. The square cell allows the creation of two-dimensional (2-D) arrays,
as shown in Figure 12.6, which can implement any combinatorial logic function. In an
isolated cell the two configurations, or polarization states, with the electrons along one
or the other diagonal are equally likely. However, if an external perturbation is
applied, or in the presence of a nearby cell with a fixed polarization, one of them will
be energetically favored. The two logic states, 1 and 0, can be associated with the two
possible polarization configurations, as shown in Figure 12.7, where the solid dots
represent the electrons. If a linear array of cells is created, enforcing the polarization
corresponding to a given logic state on the first cell will lead to the propagation of the
same polarization state along the whole chain, in a domino fashion. Such a linear
array is usually defined as a binary wire, and can be used to propagate a logic variable
across a circuit: here, both the strength and, at the same time, the weakness of the QCA
architecture can be seen. Indeed, signal regeneration occurs during propagation along
the chain, as a result of the non-linear response of the cells, but the transfer of a logic
variable from one location in the circuit to a different location may require a relatively
large number of cells. In other words, there are no interconnects, but the number of
elementary devices needed to implement a given logic function may become much
larger than in a traditional architecture.
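The bistable cell response behind this regeneration is easy to reproduce numerically from the two-level model of Eqs. (12.2) and (12.3). The minimal sketch below (an illustration written for this purpose, with energies in arbitrary units) computes the ground-state occupancy of level 1 versus the unbalance ε for an opaque and a transparent barrier, mirroring the behavior discussed for Figure 12.5.

```python
import numpy as np

# Ground-state occupancy of level 1 for the two-level Hamiltonian of
# Eq. (12.2), as a function of the unbalance eps and the coupling t.

def occupancy_level1(eps, t):
    vals, vecs = np.linalg.eigh(np.array([[0.0, -t], [-t, eps]]))
    return vecs[0, 0] ** 2        # |<0|psi_0>|^2; eigh sorts eigenvalues

for t in (0.01, 1.0):             # small t: opaque barrier; large t: transparent
    occ = [occupancy_level1(eps, t) for eps in (-0.5, -0.05, 0.05, 0.5)]
    print("t = %.2f:" % t, " ".join("%.3f" % o for o in occ))
# small t -> the occupancy jumps abruptly from ~0 to ~1 around eps = 0;
# large t -> the transfer is smooth and full localization needs large eps.
```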
The basic gate in QCA logic is represented by the majority voting gate, which is
shown in Figure 12.8. Cells A, B and C are input cells, whose polarization state is
enforced from the outside (here and in the following such driver cells are
represented with a double boundary), while cell Y is the output cell, the polarization
of which represents the result of the calculation. On the basis of a full quantum
calculation or of simple considerations based on a classical electrostatic model, it is
possible to show that the logic value at the output will correspond to the majority
of the input values. Thus, for example, if A = 1, B = 0, C = 0, then Y = 0, or, if A = 1,
B = 0, C = 1, the output will be Y = 1. From the majority voting gate it is possible to
obtain two-input AND and OR gates, by permanently enforcing a logic 0 or a logic 1,
respectively, on one of the three inputs. In the ground-state relaxation scheme,
computation is performed by selecting which cells of an array act as inputs and which
cells will act as outputs. Once the input values have been enforced, the array is allowed
to evolve into the ground state and, when this has been reached, the state of the output
cells corresponds to the result of the computation.
As the number of cells in the array increases, its energy spectrum (that is, the set of
energies corresponding to all the possible configurations) becomes more complex,
with a large number of configurations that have energies very close to the actual
ground state. As a result, the evolution of the array may become stuck in one of these
configurations for a long time, thus leading to a very slow computation. Furthermore,
due to the appearance of states that are very close in energy to the ground state, the
maximum operating temperature decreases as the number of cells is increased. In
particular, it has been shown with entropy considerations [8], or by means of an
analytical model [9], that for the specific case of a binary wire the maximum operating
temperature falls logarithmically with the number of cells.
12.2.2
From the Ground-State Approach to the Clocked QCA Architecture
The clocked cell comprises, in addition to the active dots, a central island, as shown in
Figure 12.10. Tunneling is possible only between the upper and the lower dot of each
half of the cell (it can be shown that this does not limit in any way the logic operation of
the cell) and the barrier height between the active islands is controlled by means of the
potential applied to the central island. If the potential on the central island is low, then
the electron will be trapped there (null state). As the potential on the central island is
raised, a condition will be reached in which the electron can tunnel into one of the
active dots, the one that is favored at the time by the potential created by the nearby
cells (active state). Finally, as the potential on the central dot is further raised, the
electron will be trapped in the dot into which it has previously tunneled, even if the
polarization of the other cells is reversed (locked state).
Ideally, the computation should evolve with a cell in the locked state driving
the next cell that moves from the null state to the locked state, going through the
active state. When the state of a cell must be the result of that of more than one
neighboring cell (as in the case of the central cell of a majority voting gate), all the
cells acting on it should be at the same time in the locked state. The sequence of
clock phases would allow the information to travel along the circuit in a controlled
way, thus achieving a deterministic evolution and eliminating the uncertainty about
the time when the calculation is actually completed that plagues the ground-state
relaxation scheme. Furthermore, since the flux of data is steered by the clock, it
would also be possible to have data pipelining: new input data could be fed into the
inputs of the array as soon as the previous data have left the first level of logic gates
and moved to the second. Ideally, within this scheme each cell should be fed a
different clock phase with respect to its nearest neighbors, which would imply an
extremely complex wiring structure. Such a solution has been adopted in the
experiments performed so far to demonstrate the principle of operation of clocked
QCA logic [12]. However, in large-scale circuits it would forfeit one of the main
advantages of the QCA architecture that is, the simplicity deriving from the lack of
interconnections. In order to address this problem, it has been proposed to divide
the overall QCA array into clocking zones: such regions consist of a number of
QCA cells and would be subject to the same clock phase and evolve together while
in the active state (similarly to a small array operating according to the ground-state
relaxation principle). They would then be locked all at the same time, in order to
drive the following clocking zone. This would reduce the complexity of the required
wiring, and has been proposed in particular for the implementation of QCA circuits
at the molecular scale, where it is impossible to provide different clock phases to
each molecule, as the wires needed for clocking would be much larger than the
molecules themselves! There are many difficulties involved, however, because each
clocking zone is affected by the problems typical of ground-state relaxation
(although on a smaller scale), and the clock distribution is still extremely challenging. For example, conducting nanowires have been suggested as a possible solution
to bring the clock signal to regions of a molecular QCA circuit, but achieving
uniformity in the clocking action of a nanowire on many molecular cells is certainly
a very challenging task.
Notwithstanding all of these difficulties, the clocked scheme appears to be the only
one capable of yielding a reasonably reliable QCA operation in realistic circuits, as
will be discussed in the following sections.
12.2.3
Cell Polarization
The polarization of a cell is commonly defined as

P = (ρ₁ + ρ₃ − ρ₂ − ρ₄) / (ρ₁ + ρ₂ + ρ₃ + ρ₄) ,    (12.4)

where ρ_i is the charge in the i-th dot (dots are numbered counterclockwise, starting
from the one in the upper right quadrant). With such an expression, as soon as
the number of electrons increases above two, full polarization can no longer be
achieved, as the maximum possible value for the numerator is 2. There can be at most
a difference of two electrons between the occupancy of one diagonal and that of the
other, since configurations with a larger difference would require an extremely large
external electric field (to overcome the electrostatic repulsion between electrons).
Starting from the observation that a QCA cell is overall electrically neutral, because
of the presence of ionized donors, of the positive charge induced on the metal
electrodes, and of the screening from surface charges, Girlanda et al. [14] proposed a
different expression for the polarization of a cell, which is more representative of its
action upon the neighboring cells. Indeed, neutralization occurs over an extended
region of space; thus, although the global monopole component of the eld due to a
cell is zero, there can be some effect on the neighboring cells associated with the total
number of electrons. However, in practical cases this turns out to be negligible
compared to the dipole component associated with the charge unbalance between the
two diagonals. The alternative expression for cell polarization introduced in Ref. [14]
reads

P = (ρ₁ + ρ₃ − ρ₂ − ρ₄) / (2q) ,    (12.5)

where q is the electron charge.
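The difference between the two definitions is easy to see numerically. In this small sketch (the function names are placeholders), a two-electron cell and a six-electron cell with the same diagonal unbalance give full polarization under Eq. (12.5) but not under Eq. (12.4).

```python
# rho holds the electron count in each dot, in units of q, numbered
# counterclockwise from the upper-right dot.

def p_lent(rho):                        # Eq. (12.4)
    r1, r2, r3, r4 = rho
    return (r1 + r3 - r2 - r4) / (r1 + r2 + r3 + r4)

def p_girlanda(rho):                    # Eq. (12.5), unbalance in units of 2q
    r1, r2, r3, r4 = rho
    return (r1 + r3 - r2 - r4) / 2.0

for rho in ((1, 0, 1, 0), (2, 1, 2, 1)):    # two and six electrons
    print(rho, p_lent(rho), p_girlanda(rho))
# (1, 0, 1, 0): both give 1.0; (2, 1, 2, 1): Eq. (12.4) saturates at 1/3,
# while Eq. (12.5) still reports the full two-electron unbalance, 1.0.
```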
12.3
Approaches to QCA Modeling
12.3.1
Hubbard-Like Hamiltonian
The first approach to QCA simulation was developed by the Notre Dame group [15],
based on an occupation number, Hubbard-like formalism. Within such an approach
the details of the electronic structure of each quantum dot are neglected, and a few
parameters are used to provide a description of the dots and their interaction.
Although based on a few phenomenological parameters, this technique has been
successful in providing a good basic understanding of the operation of QCA cells.
The occupation number Hamiltonian for a single, isolated cell reads
H₀ = Σ_{i,σ} E_{0,i} n_{i,σ} + Σ_{i>j,σ} t (b†_{i,σ} b_{j,σ} + b†_{j,σ} b_{i,σ}) + Σ_i E_{Q,i} n_{i,↑} n_{i,↓} + Σ_{i>j,σ,σ′} V_Q n_{i,σ} n_{j,σ′} / |R_i − R_j| ,    (12.6)
where E_{0,i} is the ground-state energy of the i-th dot (assumed to be isolated), b†_{j,σ} and
b_{j,σ} are the creation and annihilation operators, respectively, for an electron in the j-th
dot with spin σ, n_{j,σ} is the number operator for electrons in the j-th dot with spin σ, t is
the tunneling energy between neighboring dots, V_Q is equal to e²/(4πε) (e is the
electron charge and ε is the dielectric permittivity), and E_{Q,i} is the on-site charging energy
for the i-th dot [16], and R_i is the position of the center of the i-th dot. The tunneling energy t
cannot be computed directly, and must be evaluated with some approximation. A
commonly used approximation consists in assuming t to be equal to half of the level
splitting resulting from the presence of a barrier of finite height between the
dots. In the presence of a driver cell in a given polarization state, the above-written
Hamiltonian must be augmented with a term that expresses the electrostatic
contribution from such a cell:
H_int = Σ_{i∈cell 1} Σ_{j∈cell 2} V_Q (ρ_{j,2} − ρ̄) n_{i,1} / |R_{j,2} − R_{i,1}| ,    (12.7)
where ρ_{j,2} is the number of electrons in the j-th dot of the driver cell, ρ̄ is the average
number of positive neutralizing charges per dot, and R_{j,2} and R_{i,1} are the positions of the
j-th dot of the driver cell (cell 2) and of the i-th dot of the driven cell (cell 1), respectively.
Diagonalization of the total Hamiltonian can be performed easily using an
occupation-number representation, |n_{1,↑}, n_{1,↓}; n_{2,↑}, n_{2,↓}; n_{3,↑}, n_{3,↓}; n_{4,↑}, n_{4,↓}⟩.
The dimension of the full basis would be 256 but, considering only two-electron states
and on the basis of spin considerations [17], the number of basis vectors required for the
determination of the ground state is just 16.
A representation of the complete Hamiltonian on such a basis is a sparse
matrix with only four non-zero off-diagonal elements in each row. Eigenvalues
and eigenvectors can be obtained numerically, and the ground state of the driven cell
will be

|ψ₀⟩ = Σ_{i=1…16} a_i |i⟩ ,    (12.8)

from which the dot occupancies (Eq. (12.9)), and hence the cell polarization, can then
be computed. In Figure 12.11 the
polarization of the driven cell, computed as a function of the polarization of the
driver cell, is reported; that is, the cell-to-cell response function. Calculations have
been performed for a cell with four dots located at the vertices of a square with a 24-nm
side. The dots have a diameter of 16 nm, except for one, the diameter of which varies
between 15.94 and 16.06 nm, and the separation between the centers of the driver and
driven cells is 32 nm. Material parameters for GaAs have been considered, with an
effective mass m* = 0.067 m₀ and a relative permittivity ε_r = 12.9; furthermore, the
tunneling energy t has been assumed to be 0.1 × 10⁻³ eV.
In the case of identical dots (all with the same 16-nm diameter), the response
function is symmetric, while just an extremely small variation in the diameter of a dot
leads to strong asymmetry, and eventually to failure of operation, with the driven cell
always stuck in the same state for any value of the polarization of the driver cell.
It appears that such a sensitivity to geometric tolerances is a very serious practical
problem.

12.3.2
The Configuration-Interaction Approach

Within the configuration-interaction approach, the complete two-electron Hamiltonian
of the cell is considered:

H = −(ħ²/2m)(∇₁² + ∇₂²) + Σ_{i=1,2} [V_con(r_i) + V_driv(r_i)] + e²/(4πε|r₁ − r₂|) − e²/(4πε√(|r₁ − r₂|² + (2z)²)) − e²/(4πε·2z) ,    (12.10)
where ħ is the reduced Planck constant, m is the electron effective mass, V_con is the
bare confinement potential (due to the electrodes, the ionized donors, the charged
impurities, and the bandgap discontinuities), V_driv is the Coulomb potential due to
the charge distribution in the neighboring driver cell, e is the electron charge, the last
two terms include the effects of the image charges (since, for simplicity, a Dirichlet
boundary condition is assumed at the surface and at an infinitely far away conducting
substrate), and z is the distance of the 2DEG from the surface of the heterostructure
where the boundary condition is enforced.
A matrix representation of this Hamiltonian is derived by computing the matrix
elements of Eq. (12.10) between the elements of the basis of Slater determinants, and
is then diagonalized, obtaining the ground-state energy as the lowest eigenvalue and
the ground-state wave function as a linear combination of the basis elements with
coefficients corresponding to the elements of the associated eigenvector.
This technique does not have convergence problems, as it is intrinsically a one-shot
method and allows the consideration of a realistic confinement potential, obtained
from a detailed numerical calculation. However, if the intention was to introduce
more realistic boundary conditions for the semiconductor surface, or in general to
provide a more refined treatment of the electrostatic problem, going beyond the
method of images, the problem would, computationally, be very intensive. This is
because, in order to compute the matrix elements of the Hamiltonian, the complete
Green's function of the Poisson equation between each pair of points in the domain
would be needed.
If an occupancy of only two electrons per cell is considered, the actual two-electron
wave function is very close to the Slater determinant constructed from the
one-electron wave functions of the isolated dots. Therefore, the size of the basis
needed to obtain a good configuration-interaction solution is small, of the order of
100 determinants. Instead, if there are more than four electrons per cell (and thus
more than one electron per dot), the strong electron-electron interaction determines
a significant deformation of the wave functions, and therefore a large number of basis
elements constructed from the single-electron orbitals is needed. For example, in the
case of a six-electron cell, more than 1000 determinants are necessary. As the number
of electrons is raised, there is a combinatorial increase in the size of the basis, and
consequently the problem soon becomes intractable from a computational point of
view.

Figure 12.12 Gate layout for the definition of a working QCA cell (all measures are in nanometers).
Notwithstanding the above-mentioned limitations on the way that the Coulomb
interaction can be included, and on the maximum number of electrons that can be
considered, configuration-interaction has been applied very successfully to the
simulation of QCA systems. In fact, it has allowed the demonstration that, for a
semiconductor implementation, an array of holes (defining the quantum dots) in a
depletion gate held at constant potential cannot possibly be fabricated with the required
precision. Alternative gate arrangements, such as those shown in Figure 12.12, are
possible [21], and have been used in the experimental demonstration of QCA action in
GaAs/AlGaAs heterostructure-based devices [22]. However, they imply severe
technological difficulties and the need for adjustment of individual gate voltages to correct
for geometrical imperfections and for the unintentional asymmetries introduced by
the presence of nearby cells [23].
12.3.3
Semi-Classical Models
Quantum models of QCA cells are needed to describe the bistable behavior of the
single cell, and also to provide information on the technological requirements for
successful QCA operation. They are, however, too complex (from a
computational point of view) to be applied to the analysis of complete QCA circuits
consisting of a large number of cells: the time required to complete a simulation of a
circuit made up of just a few tens of cells would be prohibitive. Therefore, a multiscale
approach is needed, structured in a way similar to that of traditional
microelectronics, where circuit portions of increasing complexity are treated with
progressively more simplified physical models.
It should be noted that the effect at the core of QCA action, namely the Coulomb
interaction between electrons, is purely classical. As long as electrons are strongly
localized, they behave substantially as classical particles, and a semi-classical model,
based on the minimization of the electrostatic energy, can capture most of the
behavior of a QCA circuit.
If the only points of interest are determining the ground-state configuration of an
array of QCA cells and computing the energy separation ΔE between the first
excited state and the ground state, then a simple electrostatic model can be used.
The quantity ΔE is essential for determining the maximum operating temperature of
the circuit: it must be at least a few tens of kT (where k is the Boltzmann constant
and T is the absolute temperature); otherwise, the system will not remain stably in
the ground state. The basic electrostatic model developed in Ref. [9] relies on a cell
model in which the charge of the two electrons is neutralized either by positive
charges of value e/2 located in each dot, or by image charges located at a distance
from the dots and representing the effect of metal electrodes or of Fermi-level
pinning at the semiconductor surface. Besides the two configurations with the electrons
aligned along one or the other diagonal, other configurations are also possible,
although in most cases they are not energetically favored. A more
complete model must also include such configurations, corresponding to the two
electrons occupying the dots along one of the four sides of the cell. While the
configurations with the electrons on the diagonals are associated with the logical
values 1 and 0, the other configurations do not correspond to any logical value, and
are thus indicated with X in Figure 12.13, where all possible configurations are
represented.
The total electrostatic energy is given by [24]:

E = Σ_{i<j} q_i q_j / (4πε₀ ε_r r_ij)    (12.11)
If, for the sake of simplicity, the neutralizing charge is considered to be located
directly in each dot (in an amount e/2), the total charge in each dot can assume only
two values, either +e/2 or −e/2, thereby leading to

q_i q_j = (1/4) e² sgn(q_i q_j)    (12.12)
If the distance between the dots is expressed in terms of the ratio R = d/a and of the
electron positions, the following can be written:

E = [e² / (4a · 4πε₀ ε_r)] Σ_{ij} s_ij / √((n_ij R + l_ij)² + m_ij²) ,    (12.13)
where n_ij ∈ {0, . . ., N_cell − 1} is the separation, in terms of number of cells, between
the cell with dot i and the cell with dot j, s_ij ∈ {−1, 1} is the sign of q_i q_j, and
l_ij ∈ {−1, 0, 1} and m_ij ∈ {0, 1} describe the positions of dots i and j within their
respective cells. The quantity l_ij is equal to 0 if dots i and j are both on the left side or
both on the right side of their cells, to 1 if dot i is on the right side and dot j is on the left
side, and to −1 if dot i is on the left side and dot j is on the right. Analogously, m_ij is
equal to 0 if dots i and j are both on the top or both on the bottom of a cell, and to 1 if
one dot is on the top and the other is on the bottom.
The most direct approach consists of computing the energy associated with each
possible configuration by means of the direct evaluation of Eq. (12.13), and choosing
the configuration that corresponds to the minimum energy. With this procedure the
complete energy spectrum for the circuit is also obtained; that is, the energy values
corresponding to all possible configurations. However, such a method soon becomes
prohibitively expensive from a computational point of view, as the number of
configurations to be considered is 6^N, where N is the number of active cells (i.e.
of cells whose polarization is not enforced from the outside, as it is for the driver
cells). As the number of cells is increased, a simplified model can be used in which
only the two basic states of a cell are considered, thus reducing the total number of
configurations down to 2^N. This does not introduce significant errors as long as the X
states are unlikely (which is in general true), except for the case of intercell separation
equal to or smaller than the cell size, when X states with both electrons vertically
aligned may occur.
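The following sketch implements this exhaustive two-state search for a short binary wire driven by a fixed cell, evaluating the electrostatic energy of Eq. (12.13) as an equivalent direct pairwise sum. The interdot distance (40 nm), the 100 nm intercell separation, the GaAs permittivity, and the ±e/2 dot charges follow the text; the chain length and all names are illustrative choices made here.

```python
import itertools
import math

E_CHARGE = 1.602e-19   # C
EPS0 = 8.854e-12       # F/m
EPS_R = 12.9           # GaAs relative permittivity
A = 40e-9              # interdot distance (m)
PITCH = (40e-9 + 100e-9) / A   # cell pitch in units of a (100 nm separation)

def cell_dots(index, state):
    """(x, y) positions (units of a) and net charges (units of e) of one cell.
    state 0/1 picks the occupied diagonal; occupied dots carry -e/2 net
    charge, empty ones +e/2 (the neutralizing background of the text)."""
    x0 = index * PITCH
    corners = [(x0, 0.0), (x0 + 1.0, 0.0), (x0, 1.0), (x0 + 1.0, 1.0)]
    diag = {0: ((x0, 0.0), (x0 + 1.0, 1.0)),
            1: ((x0 + 1.0, 0.0), (x0, 1.0))}[state]
    return [(p, -0.5 if p in diag else 0.5) for p in corners]

def energy(states):
    """Total electrostatic energy (J) of the chain, direct pairwise sum."""
    dots = [d for i, s in enumerate(states) for d in cell_dots(i, s)]
    pref = E_CHARGE ** 2 / (4.0 * math.pi * EPS0 * EPS_R * A)
    return sum(pref * q1 * q2 / math.hypot(p1[0] - p2[0], p1[1] - p2[1])
               for (p1, q1), (p2, q2) in itertools.combinations(dots, 2))

# Driver cell fixed in state 0; exhaustive search over five free cells (2^5).
spectrum = sorted((energy((0,) + s), s)
                  for s in itertools.product((0, 1), repeat=5))
(e0, ground), (e1, _) = spectrum[0], spectrum[1]
print("ground state:", ground)     # expect (0, 0, 0, 0, 0): a binary wire
k_B = 1.381e-23
print("Delta E = %.2e J -> T_max ~ Delta E / (40 k) = %.3f K"
      % (e1 - e0, (e1 - e0) / (40.0 * k_B)))
```

The last line applies the "a few tens of kT" criterion stated above to turn the computed ΔE into a rough maximum operating temperature.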
An example of application of the semi-classical simulation technique with six
states per cell is shown in Figure 12.14, where the maximum operating temperature
of a binary wire is reported as a function of the number of cells, for a 60%, 90% and
99% probability of obtaining the correct logical output, assuming an interdot distance
of 40 nm, an intercell separation of 100 nm and GaAs material parameters. It should
be noted that the probability of obtaining the correct logical output is in general larger
than the probability of being in the ground state, as there are also a number of excited
states in which the polarization of the output cell has the correct value.
It is apparent that, even with the simplification down to just two states per cell, large
circuits cannot be simulated with the semi-classical approach just described, because
of the exponential increase in the time required to perform a complete exploration of
the configuration space. This has led to the development of techniques based on an
incomplete, targeted exploration of the configuration space, such as that described in
the following subsection.
12.3.4
Simulated Annealing
The concept of simulated annealing derives from that of thermal annealing, whereby
a material is brought into a more crystalline and regular phase by heating it and
allowing it to cool slowly. Analogously, in simulated annealing the aim is to reach the
ground state of the system, starting from a generic state at a relatively high
temperature, and then to perform a Monte Carlo simulation in which at each step
an elementary transition within a cell (chosen at random) is accepted with a
probability P depending on the energy Eold of the system before the transition, and
on the energy Enew after the transition:
P = 1                              if E_new ≤ E_old
P = exp[−(E_new − E_old)/kT]       if E_new > E_old    (12.14)
It is apparent that, in this way, the evolution of the system is steered along a path of
decreasing energy, whilst at the same time trapping in a local minimum is prevented
in most cases by the non-zero probability of climbing to a higher-energy configuration.
This procedure is iterated many times, gradually decreasing the temperature,
until convergence to a stable configuration is achieved [17].
The application of simulated annealing to QCA circuits was originally proposed for
their operation [25], and has since been applied to their modeling [26]. This has
allowed the analysis of circuits with a number of cells of the order of 100 with limited
computational resources and with just a few hours of CPU time. With large circuits,
the simulated evolution of the circuit may occasionally become stuck in a local energy
minimum, which would then be erroneously assumed to be the ground state. The
probability of this happening can be minimized by performing the equivalent of
thermal cycling. Once a stable state has been reached, the temperature is raised
again, driving the circuit into an excited state, and a new annealing run is performed,
reaching a new stable state. If the whole procedure is repeated several times, there is a
better chance of reaching the ground state. It is possible to show that the probability
P of the computational procedure stopping in the ground state is given by
P = 1 − (1 − P₀)^m, where P₀ is the probability of reaching the ground state without
cycling, and m is the number of cycles. With this technique it is possible to reliably
simulate QCA circuits with a few hundred cells.
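A minimal self-contained sketch of the procedure follows. For brevity it replaces the full electrostatic sum of Eq. (12.13) with a nearest-neighbor coupling of assumed strength (a crude surrogate for a binary wire), but the acceptance rule is exactly that of Eq. (12.14), and the repeated restarts play the role of the cycling just described.

```python
import math
import random

K_B = 1.381e-23
J = 1e-22        # assumed nearest-neighbor coupling energy (J), order of
                 # magnitude of the cell-cell electrostatic interaction

def wire_energy(states, driver=0):
    """Surrogate energy: each unlike neighbor pair costs J (driver included)."""
    chain = [driver] + states
    return J * sum(a != b for a, b in zip(chain, chain[1:]))

def anneal(n_cells, t_start=1.0, t_end=0.01, steps=5000, rng=random):
    states = [rng.randint(0, 1) for _ in range(n_cells)]
    e = wire_energy(states)
    for step in range(steps):
        temp = t_start * (t_end / t_start) ** (step / steps)  # cooling schedule
        i = rng.randrange(n_cells)        # random elementary transition
        states[i] ^= 1
        e_new = wire_energy(states)
        # acceptance probability of Eq. (12.14)
        if e_new <= e or rng.random() < math.exp(-(e_new - e) / (K_B * temp)):
            e = e_new
        else:
            states[i] ^= 1                # reject: undo the flip
    return states, e

# several independent runs, akin to the thermal cycling described above
best = min((anneal(10) for _ in range(5)), key=lambda r: r[1])
print(best)   # expect all cells aligned with the driver: ([0]*10, 0.0)
```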
12.3.5
Existing Simulators
A number of simulators have been developed to study both the static and dynamic
behaviors of QCA circuits. One of the first available was AQUINAS (A Quantum
Interconnected Network Array Simulator, from the Notre Dame group), where cells
are modeled within a Hartree-Fock approximation and the time-dependent
Schrödinger equation is solved with the Crank-Nicolson algorithm. Relatively large
systems can be handled, as a result of an approximation consisting in the
representation of the state of a single cell with a simplified two-dimensional basis [27]. NASA
researchers have added to AQUINAS capabilities for the statistical analysis of data,
thus creating TOMAS (Topology Optimization Methodology using Applied Statistics)
AQUINAS [28].
A static simulator for the determination of the ground state of a QCA circuit on the
basis of a classical electrostatic model has been developed by the group in Pisa, and is
currently available on the Phantoms Computational Hub (http://vonbiber.iet.unipi.it).
The simulator has been named QCAsim, and operates according to the approach
described in Section 12.3.3. In general, it can compute the ground-state configuration
of a generic array of cells via a complete exploration of the configuration space,
assuming for each cell six possible configurations for the two electrons. It is possible
to specify whether neutralization charges should be included and in which positions
(on the same plane as the electrons, on a different plane, as image charges, etc.).
The group in Pisa has also developed a dynamic simulator, MCDot (also available
on the Phantoms HUB). This was conceived specifically for the QCA implementation
based on metal tunnel junctions, and is therefore based on the Orthodox Theory of
the Coulomb Blockade [29] with the addition of cotunneling effects treated to first
order in perturbation theory [30]. The operation of such a code will be described in
more detail in Section 12.4.3 while discussing limitations for the operating speed.
Although the code was originally developed for circuits with metallic tunnel junctions, its range of applicability can be easily extended to different technologies,
extracting appropriate circuit parameters and defining an equivalent circuit. For
example, it has been successfully applied to the simulation of silicon-based QCA
cells [17]. To this purpose, linearized circuit parameters can be determined from
three-dimensional simulations around a bias point and then used in MCDot. The
most challenging part of the parameter extraction procedure is represented by the
capacitances and resistances of the tunneling junctions obtained by defining a
lithographic constriction in silicon wires [31]: the detailed geometry and the actual
distribution of dopants cannot be known exactly, and resort to experimental data is
often necessary.
Recently, another simulator has been developed at the University of Calgary,
QCADesigner (http://www.qcadesigner.ca). This uses a two-state model for the
representation of each cell, derived from the theory developed by the Notre Dame
group. QCADesigner is meant to be an actual CAD (computer-aided design) tool,
applicable to the design of generic QCA circuits and with the capability for testing
their operation with a targeted or exhaustive choice of input vectors.
12.4
Challenges and Characteristics of QCA Technology
12.4.1
Operating Temperature
The stronger electrostatic interaction in silicon dots makes them suitable for
relatively higher operating temperatures.
From Figure 12.15 it is however clear that an extremely small feature size would be
needed to achieve operation at temperatures that are easily attainable.
An interaction that could allow QCA operation at room temperature is the one
between nanomagnets characterized by a bistable behavior [17] (this will be discussed
further in the next section). The magnetic interaction can be made strong enough to
allow proper behavior of the circuit up to room temperature, but the achievable data
processing speed is probably very low, of the order of a few kilohertz. On the other
hand, a magnetic QCA circuit could exhibit an extremely reduced power dissipation,
which could make it of interest for specic applications where speed is not a major
issue, while keeping power consumption low is essential.
12.4.2
Fabrication Tolerances
The issue of fabrication tolerances has been introduced previously, and is probably
the major limitation of the QCA concept, particularly for its implementation with
semiconductor technologies. Detailed simulations [21] have shown that an approach
based on the creation of an array of cells with dots dened by means of openings in a
depletion gate cannot possibly lead to a working circuit. This is due to the fact that
even extremely small errors in the geometry of such holes are sufcient to perturb the
value of the connement energy for the corresponding quantum dot to make it
permanently empty or permanently occupied, no matter what the occupancy of the
nearby dots is. Although shrinking the size of the cell the electrostatic interaction
energy is increased, the above-mentioned problem becomes even more serious, due
to the larger increase of the quantum connement energy. An evaluation was made of
the precision that could be achieved with state-of-the-art lithographic techniques and
compared with the requirements for proper operation of a QCA circuit [32]. An array
of square holes was obtained on a metal gate by means of electron beam lithography
(Figure 12.16), after which the contour of the holes was extracted from a scanning
12.4.3
Operating Speed
The maximum operating speed of QCA circuits is ultimately limited by the dynamics
of the evolution toward the ground state (if the ground-state relaxation paradigm is
used), or by the tunneling rate between quantum dots. First, consider a non-clocked
circuit, such as that represented for the case of a binary wire in Figure 12.18. The
polarization state of the first cell is switched by inverting the bias voltages, and the
cells of the wire will follow, although according to a time evolution that may be rather
complex and involved. In particular, the presence of states that are very close in energy
to the ground state, although corresponding to different configurations, will increase
the time required for settling.
It is possible to obtain estimates of the time required for completion of the
computation in a QCA array by performing simulations with a Monte Carlo
approach. A Monte Carlo simulator specifically devised for QCA circuits was
presented in Ref. [33]. This is based on the Orthodox Coulomb Blockade theory:
the transition rate between two configurations differing by the position of one
electron (which has tunneled through one of the junctions), and by a free energy
variation ΔE, can be expressed as

Γ = ΔE / {e² R_T [1 − exp(−ΔE/kT)]} ,    (12.15)

where R_T is the tunneling resistance of the junction.
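Plugging the junction parameters quoted below into Eq. (12.15) reproduces the roughly 20 ps tunneling time; the short sketch only evaluates the formula (the zero-ΔE limit is handled explicitly).

```python
import math

E = 1.602e-19     # electron charge (C)
K_B = 1.381e-23   # Boltzmann constant (J/K)

def gamma(delta_e, r_t, temp):
    """Orthodox-theory tunneling rate (1/s); delta_e is the free-energy gain."""
    if delta_e == 0.0:
        return K_B * temp / (E ** 2 * r_t)       # limiting value
    return delta_e / (E ** 2 * r_t * (1.0 - math.exp(-delta_e / (K_B * temp))))

# R_T = 200 kOhm and a ~1.5 mV free-energy drop, at T = 1 K:
g = gamma(E * 1.5e-3, 200e3, 1.0)
print("Gamma = %.2e 1/s, mean tunneling time = %.1e s" % (g, 1.0 / g))
# -> about 2e-11 s, i.e. the ~20 ps discussed below
```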
The resulting switching behavior can be compared for the first (circles) and the last
(squares) cells in the chain. As the tunneling resistances are
assumed to be 200 kΩ, the resulting RC constant is of the order of 10⁻¹² s. There is
therefore a difference of about five orders of magnitude between the RC time
constant of the circuit and the minimum clock period. This is due to a series of
reasons [17]: the average time an electron takes to tunnel through a 200-kΩ junction
with a 1.5-mV voltage is around 20 ps; furthermore, the time during which the cell is
active is only about one-tenth of the actual ramp duration; the active time, to be
reasonably sure of regular operation, must be at least ten times the tunneling time; a
clock period is made up of four ramps; and the intercell coupling is about five times
smaller than the intracell coupling, which involves a further slow-down. Taken
together, all of these effects lead to the above-mentioned reduction of the speed by five
orders of magnitude with respect to the RC time constant, and make QCA technology
not very suitable for high-speed applications.
On the other hand, the relatively slow operation of QCA circuits further limits their
power dissipation, and in particular makes the power dissipated in capacitor
charging and discharging negligible, as will be discussed in the next subsection.
12.4.4
Power Dissipation
One of the most attractive features of QCA circuits is represented by the limited
power dissipation, which results mainly from the fact that there is no net transfer of
charge across the whole circuit: electrons move only within the corresponding cell.
The energy dissipated can be computed by integrating over the V_i-Q plane [34]
(Figure 12.21), where Q is the charge transferred from the source, for each external
voltage source V_i, and taking the algebraic sum of the results. The voltage V₂ is varied
linearly until the unbalance is reversed: up to this point the charge variation
corresponds to charging of the equivalent capacitance seen by V2; then, some time
after the new bias condition has been established, the electrons in the cell will tunnel,
thus leading to a charge variation at constant voltage, which is represented by the
horizontal segment. It is this tunneling event that makes the switch operation
irreversible: without it, the area comprised between the two curves would be zero, as
the voltage would simply be reversed across an equivalent capacitor, without
changing its magnitude.
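As a toy illustration of this procedure, the area enclosed by a closed path in the Vi-Q plane can be computed with the shoelace formula; the path coordinates below are hypothetical and merely mimic the ramp/tunnel/ramp cycle of Figure 12.21:

```python
import numpy as np

def loop_area(V, Q):
    """Area enclosed by a closed (V, Q) path (shoelace formula); this is
    the magnitude of the energy oint V dQ exchanged by one source."""
    V, Q = np.asarray(V), np.asarray(Q)
    return 0.5 * abs(np.sum(V * np.roll(Q, -1) - Q * np.roll(V, -1)))

# Toy cycle: linear charging ramp, tunneling at constant voltage
# (the horizontal segment), then the return ramp.
V = np.array([-2.0, 2.0, 2.0, -2.0]) * 1e-3    # source voltage [V]
Q = np.array([-1.0, 1.0, 3.0, 1.0]) * 1e-18    # charge from the source [C]
print(f"dissipated energy ~ {loop_area(V, Q):.1e} J")  # ~8e-21 J for these toy values
```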
The energy dissipation depends on the voltage unbalance that is applied to the input cells, and that is reversed when the input data change: the larger the unbalance, the faster the switch, but the larger the dissipation, too. In the case of a single driven cell, for a typical unbalance of a few millivolts the energy dissipation for a single switching event is about 10⁻²² J [35]. When considering a binary wire, the energy dissipated when the polarization of the driver cell is reversed, followed by that of all the other cells, increases very slowly as the number of cells is increased: the value for a six-cell wire is just 1% larger than that for a three-cell wire. This is due to the fact that the external voltage sources that provide energy are directly connected only to the driver cell, and the electrostatic action of the electrodes of the driver cell decays rapidly when moving along the chain. This leads to a very marginal contribution to the dissipated energy from the cells further away but, at the same
time, is the fundamental reason for the above-mentioned slow and irregular
switching of a long chain.
So far, the energy loss associated with the capacitor charging process has not been included. If the unbalance reversal were abrupt, such an energy loss would be (as is well known) equal to the electrostatic energy stored in the capacitor, and therefore it would represent the main contribution to dissipation. However, due to the other limitations in switching speed, there is no reason to perform such a switching with very steep ramps. It transpires from calculations applying the expressions typically used in adiabatic logic [36] that the energy loss in capacitor charging performed with a speed compatible with the response of the circuit is negligible with respect to that due to electron tunneling in all practical cases [35].
For the case of clocked circuits the simulation is more complex and must be performed over a complete clock cycle; the conclusion is, however, similar as far as a single cell is concerned: about 6 × 10⁻²² J is dissipated in a clock cycle for the above-mentioned typical circuit parameters. In the clocked case, however, the dissipated energy is supplied by the clock electrodes directly to each cell (or clocking zone, in the case of groups of cells sharing the same clock phase), and therefore there is a linear increase in the energy dissipation as the number of cells is increased, contrary to what happens with the unclocked circuits. Indeed, with the clocked architecture there is an improvement in terms of speed and regularity of operation, but the power consumption is increased. Also in this case, it is possible to show that the contribution from the energy loss associated with capacitor charging is negligible, because the clock ramps can be much slower than the relevant time constants in the circuit.
Overall, the dissipation for a switching event of a single QCA cell is four orders of magnitude smaller than that projected for an end-of-the-roadmap MOS transistor by the ITRS Roadmap. However, the transistor operates at 300 K, while the simulations have been performed, for the clocked case, at 2.5 K. Cooling down to such a temperature requires energy, which can be estimated on the basis of the efficiency of a Carnot-cycle refrigerator [37]. Inclusion of the energy lost for cooling reduces the ratio of the energy dissipated in a transistor to that dissipated in a QCA cell by two orders of magnitude. The advantage of the QCA cell thus remains two orders of magnitude, but a fair comparison would require a relatively large effort, as in general a larger number of QCA cells than transistors is needed to obtain the same logic function. Furthermore, the energy savings that can be obtained in CMOS adiabatic logic should also be taken into consideration.
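The size of this cooling correction is easy to reproduce; a minimal sketch, assuming an ideal Carnot refrigerator working between the 2.5 K operating point and a 300 K environment:

```python
T_hot, T_cold = 300.0, 2.5                 # environment vs. QCA operating point [K]
overhead = (T_hot - T_cold) / T_cold       # work per unit of heat extracted at T_cold
print(f"Carnot overhead factor: {overhead:.0f}")   # ~119, i.e. ~2 orders of magnitude

E_switch = 6e-22                           # energy per clocked switching event [J]
print(f"effective energy incl. cooling: {E_switch * (1 + overhead):.1e} J")
```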
12.5
Physical Implementations of the QCA Architecture
12.5.1
Implementation with Metallic Junctions
Tunnel junctions with a very small area (and therefore very small capacitance) can be fabricated between slightly overlapping electrodes, on top of a silicon oxide substrate, using the shadow-mask evaporation technique. The QCA array in this case corresponds to a single-electron circuit, with tunnel junctions, capacitors, and voltage sources. From a technological point of view, a circuit with metallic junctions is relatively simple to implement, but it has the major drawback, with currently available fabrication capabilities, of yielding capacitances no smaller than a few hundred attofarads [17], thus making operation possible only at temperatures of 100 mK or lower.
With such a technique, several QCA circuits have been demonstrated by the Notre
Dame group: the basic driver cell-driven cell interaction [2], the operation of a clocked
cell [12], a clocked QCA latch [38], and power gain in a QCA chain [34].
The problems connected with the undesired influence of each electrode on the
proper balance of the cells via stray capacitances have been solved with a clever
experimental scheme, based on an initial evaluation of the capacitance matrix among
all electrodes. Once this is known, when the voltage of one electrode is varied, the
voltages applied to all the other electrodes can be corrected in such a way as to
compensate for the effects deriving from undesired capacitive couplings.
Although this technology has been very successful in the experimental demonstration of QCA operation, it appears very difficult to scale it down in order to increase the operating temperature so that it can be applied to large-scale circuits.
12.5.2
Semiconductor-Based Implementation
There are two main semiconductor implementations of QCA technology that have
been attempted: one based on the Si/SiO2 material system, and the other on the
GaAs/AlGaAs material system. As previously stated, silicon has the advantage of the reduced permittivity of SiO2, which allows the operating temperature to be raised, but the fabrication of nanostructures in GaAs (defined by means of depletion gates) is more developed and tested.
For the approach based on GaAs/AlGaAs, a high-mobility, two-dimensional
electron gas (2DEG) is formed at the heterointerface, and the quantum dots forming
a cell are obtained by means of electrostatic depletion performed with properly
shaped metal gates (Figure 12.22). In the experiments conducted to date, there is a
hint of a QCA effect, but it has not been possible to obtain full cell operation due to the
too-small value of the capacitance coupling the upper with the lower dots across the
barrier created by the horizontal electrode.
As the 2DEG is a few tens of nanometers below the surface, it is not possible to effectively define (at the 2DEG level) features that are significantly smaller; this also implies that dots cannot be made very close to each other, which represents a limitation on the maximum achievable interdot capacitance.
The advantage of GaAs technology is that tunnel barriers between dots can be finely tuned (contrary to what happens with the silicon-silicon oxide material system; see Section 12.5.3) by adjusting the bias voltage applied to the split gates defining them. In
the cell represented in Figure 12.22, tunneling can occur between the top dots and
between the bottom dots, but not between one of the top dots and one of the bottom
dots. This is not a problem, however, as the two configurations, with the electrons
aligned along either diagonal, can still be achieved, and thus cell operation is unaltered.
A series of experiments has been performed on the prototype GaAs cell, operating, for example, the bottom part, while using one of the split gates in the top part as a non-invasive charge detector [39, 40] to monitor the motion of electrons between the two bottom dots. The outer quantum point contacts (QPCs) in the bottom part are pinched off, to guarantee that the total number of electrons remains constant. Therefore, as a result of a variation of the voltage applied to the plunger gate of the dot at the bottom left (the shorter gate in the middle of the dot region), it is possible to observe the motion of an electron from one dot to the other: as the plunger gate becomes more negative, an electron is moved from left to right. It has been observed [39] that the motion of an electron between the two bottom dots causes a shift of the Coulomb blockade peaks relative to one of the upper dots by about 20% of the separation between two consecutive peaks: this coupling is estimated to be sufficient to determine a reverse motion of an electron between the upper dots (i.e. the basic QCA effect).
The gate layout used for this experiment can also be applied to the implementation of
a binary wire, but not for general logic circuits, because lateral branching is not possible
due to the presence of the leads reaching each gate. It should also be pointed out that,
even for the implementation of a simple binary wire, a careful balancing procedure
would be needed, because even the finite length of the wire may be sufficient to create a
fatal unbalance for all the cells, except for the one in the middle [23].
The other semiconductor implementation that has been attempted is based on
silicon dots embedded in silicon oxide. As mentioned above, this material system has
the advantage of the lower permittivity of silicon oxide with respect to gallium
arsenide. However, although smaller feature sizes are achievable, control of the tunnel barriers is quite difficult as they are obtained by lithographically defining a narrower silicon region between two adjacent dots [31]. A prototype silicon QCA cell was fabricated at the University of Tübingen starting from a SOI (silicon-on-insulator) substrate, defining the structure by means of electron-beam lithography and reactive ion etching. The lithographically defined features are then further shrunk by means of successive oxidations [41]. It can be seen in Figure 12.23 that the tunneling junctions (between the two upper and the two lower dots) have been obtained by creating a narrower region, with a cross-section small enough that it does not allow propagation of electrons at the Fermi energy.
Such tunnel junctions are not easily controllable and, depending on the value of the
Fermi energy and on the distribution of charged impurities, they may contain
multiple barriers. However, it has been shown [17, 42] that, by properly tuning the
back-gate voltage and the bias voltages applied to the gates, it is possible to achieve a
condition in which both junctions contain a single barrier. Clear control of dot
occupancy by means of the external gates has been demonstrated, by monitoring the
conductance of the upper and lower double-dot systems. A clear demonstration of the
QCA effect has not yet been possible due to the limited capacitive coupling between
the upper and the lower double dots. However, simulations have shown [17] that a modified layout, with reduced spacing between the upper and the lower parts, should allow the observation of cell operation at a temperature of 0.3 K (probably up to 1 K), which is definitely higher than that required for metal dots and for GaAs. Unfortunately, also in this case, the basic layout used for the experiments would have to be significantly modified to make it suitable for complex circuits.
12.5.3
Molecular QCA
A molecular implementation would drastically reduce the precision constraints, exploiting the fact that molecules are identical by construction. Furthermore, approaches to fabrication based on self-assembly could be envisioned, which would significantly decrease fabrication costs.
The molecular QCA concept relies on molecules containing four (or possibly two)
sites where excess electrons can be located and which are separated by potential
barriers. It has been demonstrated that potential barriers do exist at the molecular
level and that they do lead to bistability effects [7]: a simple example is represented by a
methylene group (CH2) placed between two phenyl rings.
For the implementation of a complete cell, several candidate molecular structures
have been proposed, such as metallorganic mixed-valence compounds containing
four ruthenium atoms that represent the four dots. However, investigations are
continuing to determine whether sufficient coupling is achievable between cells,
because the screening action of the electronic clouds of the ligand atoms may
determine too large a suppression of the electrostatic interaction. Furthermore, the
problem of attaching the molecules to a substrate, in order to create properly
structured arrays, is still only partially solved. In particular, the presence of imperfections or unavoidable stray charges at the surface of the substrate may create
asymmetries preventing correct QCA operation, notwithstanding the identity of all
molecules.
A simple molecule that has been proposed by the Notre Dame group as a model
system for half of a cell is the so-called Aviram molecule [43], in which the two dots are
represented by allyl groups at the ends of an alkane chain. Quantum chemistry
calculations have shown that some bistability effects can be obtained, as well as
sufficient electrostatic interaction between neighboring molecules, although, due to the reactivity of the allyl groups and the difficulty of attaching this molecule to a substrate, it does not seem a likely candidate for experiments.
Whilst overall the molecular approach seems the most appropriate solution for the implementation of the QCA concept in the long term, many problems, some of which are fundamental in nature, remain unsolved, such as finding a reliable way to assemble molecular arrays, managing the effect of stray charges, and determining whether the interplay of molecular sizes and screening effects will allow reasonably high operating temperatures.
12.5.4
Nanomagnetic QCA
To date, implementations of the QCA concept have been considered that rely on an
electrostatic interaction between dots within a cell and between neighboring cells. It
is also possible, as mentioned in the introduction, to exploit other forms of interaction, which may be less susceptible to the effects of temperature and of imperfections.
One such alternative solution is represented by nanomagnetic QCA circuits. The concept is rather simple: an array of elongated single-domain dots made of a properly chosen magnetic material will relax into an antiferromagnetic ordering, and it will be possible to drive the evolution of the system with an external clock consisting of an oscillating magnetic field, which also supplies the energy needed for power gain along the circuit. The first experimental investigation into the possibility of propagating the magnetic polarization along a chain of nanomagnets was performed by Cowburn and Welland [44], who managed to show the operation of a chain of magnetic nanodots.
The specific nanomagnetic approach to QCA circuits has been investigated mainly by Csaba and Porod, who have determined that an energy difference between the ground state and the first excited state of 150 kT at room temperature can be achieved with elongated nanomagnets that are manufacturable with existing technologies. However, there is also a relatively high barrier (100 kT) between the two states, and therefore at room temperature the system would be stable in both configurations. Thus, a pure ground-state relaxation scheme is not applicable, and resort must be made to the above-mentioned oscillating clock field. Such a field is used to turn the magnetic moments of all nanomagnets into a neutral state, from which they can relax into the ground state (as long as they remain in the instantaneous ground state).
A chain with an even number of cells (including the driver cell) will act as an
inverter, as antiferromagnetic ordering is present. Thus, implementation of the NOT
gate is straightforward; the majority voting gate can be implemented [17] in a way
similar to that for electrostatically coupled QCA systems, with three binary wires
converging on a single cell, from which another binary wire representing the output
departs. Therefore, a generic combinatorial network can be realized in a way quite
similar to what has been seen for other QCA technologies.
Simulators for nanomagnetic QCA circuits have been developed by Csaba and Porod [45], in which the complete micromagnetic equations are solved numerically. It has also been noticed that, for dot sizes below 100 nm, a significant simplification can be used, the single-domain approximation, in which the magnetization condition of the dots can be represented by means of single vectors instead of vector fields. Such an approximation is valid because magnetic dots below a size of 100 nm operate as single domains. The equations governing the evolution of single domains can be written as a system of ordinary differential equations and may then be recast into the form of a standard SPICE model. This allows efficient and easy simulation of relatively complex architectures of nanomagnetic QCAs, and has made it possible to show that logic signal restoration and power gain can be achieved, at the expense of the external oscillating magnetic field.
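A minimal macrospin sketch along these lines (not the SPICE formulation of Ref. [45]; all material parameters are illustrative) integrates the Landau-Lifshitz-Gilbert equation for one single-domain dot that relaxes from the neutral state under the bias of an already-settled neighbor:

```python
import numpy as np

gamma_e = 1.76e11   # gyromagnetic ratio [rad/(s*T)]
alpha = 0.1         # Gilbert damping (illustrative)

def llg_rhs(m, H):
    """dm/dt of the Landau-Lifshitz-Gilbert equation for a unit vector m."""
    pre = -gamma_e / (1.0 + alpha**2)
    return pre * (np.cross(m, H) + alpha * np.cross(m, np.cross(m, H)))

# Elongated dot with easy axis y (shape anisotropy ~0.3 T), released near
# the neutral hard-axis state; the neighbor's stray field favors +y.
m = np.array([0.01, 0.02, 1.0]); m /= np.linalg.norm(m)
H_bias = np.array([0.0, 0.01, 0.0])          # dipolar bias from neighbor [T]

dt = 2e-13                                   # explicit-Euler time step [s]
for _ in range(100_000):                     # ~20 ns of evolution
    H = np.array([0.0, 0.3 * m[1], 0.0]) + H_bias
    m = m + dt * llg_rhs(m, H)
    m /= np.linalg.norm(m)                   # keep |m| = 1 (single domain)
print("relaxed moment:", np.round(m, 3))     # -> close to (0, 1, 0)
```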
12.5.5
Split-Current QCA
The current flows mainly through these four dots, and the actual value of the current through each dot is strongly dependent on the alignment of the energy levels in the upper and lower GaAs wells. Starting from a situation where the upper and lower levels are aligned, the flow of current will create a charge density that, in turn, will perturb the position of the resonant levels in the nearby dots: the larger the current, the greater the induced level shift. Therefore, for an isolated cell, the rest condition will be with current flowing mainly through either pair of antipodal dots (i.e. the dots that are furthest from each other), so that the resonant level shift is minimized. If another, driver, cell is placed next to it, with a well-defined polarization, the same polarization state will be obtained, as a result of Coulomb interaction between dots. Thus, operation similar to that of the previously discussed electrostatically coupled cells will be achieved. The authors of Ref. [46] suggest that clocking is also possible, by controlling the voltage applied in the vertical direction across the resonant tunneling structures.
Although interesting in principle, this approach forfeits one of the main advantages of the QCA architecture, namely the potentially extremely low power consumption, since non-zero currents flowing in the vertical direction through the resonant tunneling diodes are always present, except for the null phase of the clock.
12.6
Outlook
The QCA concept has been the subject of significant research activity throughout the past decade, leading to results of general interest in the field of nanoelectronics. The practical implementation of QCA circuits is, however, still elusive, because of a few major problems connected with the weakness of the proposed cell-to-cell interaction mechanisms and with the extreme sensitivity to fabrication tolerances. Novel concepts are being explored, aimed in particular at the ultimate miniaturization, with cells consisting of single molecules, or at an increase of inter-cell interaction, with cells made up of single-domain nanomagnets.
References
1 C. S. Lent, P. D. Tougaw, W. Porod, Appl.
Phys. Lett. 1993, 62, 714.
2 I. Amlani, A. O. Orlov, G. L. Snider, C. S.
Lent, G. H. Bernstein, Appl. Phys. Lett.
1998, 72, 2179.
3 C. S. Lent, P. D. Tougaw, Proc. IEEE 1997,
85, 541.
4 C. S. Lent, Science 2000, 288, 1597.
5 G. Csaba, A. Imre, G. H. Bernstein, W.
Porod, V. Metlushko, IEEE Trans.
Nanotechnol. 2002, 1, 209.
6 P. Bakshi, D. A. Broido, K. Kempa, J. Appl.
Phys. 1991, 70, 5150.
7 M. Girlanda, M. Macucci, J. Phys. Chem. A
2003, 107, 706.
8 C. S. Lent, P. D. Tougaw, W. Porod, in:
Proceedings of the Workshop on Physics
and Computation - Physcomp, Dallas,
Texas, November 17-20, IEEE Computer
Press, pp. 113, 1994.
9 C. Ungarelli, S. Francaviglia, M. Macucci,
G. Iannaccone, J. Appl. Phys. 2000, 87, 7320.
10 A. N. Korotkov, Appl. Phys. Lett. 1995, 67,
2412.
11 D. J. Griffiths, Introduction to Quantum
Mechanics, Prentice-Hall, Englewood
Cliffs, NJ, 1994.
12 A. O. Orlov, I. Amlani, R. K. Kummamuru,
R. Ramasubramaniam, G. Toth, C. S. Lent,
13
14
15
16
17
18
19
20
21
22
23
35
36
37
38
39
40
41
42
43
44
45
46
13
Quantum Computation: Principles and Solid-State Concepts
Martin Weides and Edward Goldobin
13.1
Introduction to Quantum Computing
value of temperature, the device has to use a so-called reversible logic (see Section 13.2). The implementation of reversible computing demands a precise control of the physical dynamics of the computation machine to prevent a partial dissipation of the input information (i.e. energy) into the form of heat. One type of reversible computer is the quantum computer which, by definition, relies on time-reversible quantum mechanics.
A quantum computer is a device for information processing that is based on distinctively quantum-mechanical phenomena, such as the quantization of physical quantities and the superposition and entanglement of states, to perform operations on data. The amount of data in a quantum computer is measured in quantum bits, or qubits, whereas a conventional digital computer uses binary digits, in short: bits. Quantum computation relies on quantum information carriers, which are single quantities, whereas the conventional computer uses a huge number of information carriers. In addition, quantum devices may be much more powerful than conventional devices, as a different class of (quantum) algorithms exploiting quantum parallelism can be implemented. Some specific problems cannot be solved efficiently on classical computers because the computation time would be astronomically large. However, they could be solved in reasonable time by quantum computers. This emerging technology attracts much attention and effort, both from the scientific community and from industry, although the question of whether such a device can be built with a large number (>10) of qubits is still open.
For a more detailed introduction to classical and quantum computation, the
interested reader is referred to a great selection of excellent textbooks such as, for
example, Feynman Lectures on Computation [2] and others [4, 5].
13.1.1
The Power of Quantum Computers
In recent years there has been a growing interest in quantum computation, as some
problems, which are practically intractable with classical algorithms based on digital
logic, can be solved much faster by massive parallelism provided by the superposition
principle of quantum mechanics.
In theoretical computer science, all problems can be divided into several classes of complexity, which represent the number of steps of the most efficient algorithm needed to solve the problem. The class P consists of all problems that can be solved on a Turing machine in a time which is a polynomial function of the input size. The class BPP (bounded-error, probabilistic, polynomial time) is solvable by a probabilistic Turing machine in polynomial time, with a given small (but non-zero) error probability for all instances. It contains all problems that could be solved by a conventional computer within a certain probability, and it comprises all problems of P. The class NP consists of all problems whose solutions can be verified in polynomial time, or equivalently, whose solution can be found in polynomial time on a non-deterministic machine. Interestingly, no proof of P ≠ NP (that is, of whether NP is the same set as P) has yet been found. Thus, the possibility that P = NP remains, although this is not believed by many computer scientists.
The class of problems that can be efficiently solved by quantum computers, called BQP (bounded-error, quantum, polynomial time), is a strict superset of BPP (see Figure 13.1). For details on complexity classes, see Nielsen and Chuang [4].
The most important problems from the BQP class that cannot be solved efficiently by conventional computation (i.e. they do not belong to the P or BPP class), but can be by quantum computation, are summarized below.
13.1.1.1 Sorting and Searching of Databases (Grover's Algorithm)
Quantum computers should be able to search unsorted databases of N elements in about √N queries, as shown by Grover [6], rather than the N steps taken by the linear classical search algorithm on a conventional machine (see Figure 13.2). This speedup is considerable when N becomes large.
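The square-root scaling is easy to verify with a toy statevector simulation of the standard Grover iteration (oracle phase flip plus inversion about the mean); the database size and marked index below are arbitrary:

```python
import numpy as np

N, marked = 1024, 123
psi = np.full(N, 1.0 / np.sqrt(N))                 # uniform superposition

iterations = int(round(np.pi / 4 * np.sqrt(N)))    # ~25 for N = 1024
for _ in range(iterations):
    psi[marked] *= -1.0                            # oracle: phase-flip the marked item
    psi = 2.0 * psi.mean() - psi                   # diffusion: inversion about the mean

print(f"{iterations} iterations, P(marked) = {abs(psi[marked])**2:.4f}")
# -> P close to 1 after ~sqrt(N) queries, vs. ~N classical lookups
```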
13.1.1.2 Factorizing of Large Numbers (Shor's Algorithm)
A quantum algorithm for the factorization of large numbers was proposed by Shor [7], who showed that quantum computers could factor large numbers into their prime factors in a polynomial number of steps, compared to an exponential number of steps on classical computers (see Figure 13.2). The difficulty of prime factorization is a cornerstone of modern public-key cryptographic protocols. The successful implementation of Shor's algorithm may lead to a revolution in cryptography.
13.1.1.3 Cryptography and Quantum Communication
Contrary to the classical bit, an arbitrary quantum state cannot be copied (no-cloning theorem; see Section 13.3.4) and may be used for secure communication by means of quantum key distribution or quantum teleportation. By sharing an entangled pair of qubits (a so-called EPR pair, after a famous paper by Einstein, Podolsky and Rosen [8]), signals can be transmitted that cannot be eavesdropped upon without leaving a trace, since eavesdropping requires performing a measurement, thereby destroying the entangled state.
All three of these important challenges for quantum computation have been implemented on NMR qubits or photons; that is, their feasibility has been proven using a small set of qubits.
13.2
Types of Computation
The entropy associated with one bit of information, which can take one of W = 2 states, is

S = k_B \ln W = k_B \ln 2.

If the number of bits m is reduced during the computation process, for example by logic gates with fewer output than input bits, the energy dissipation per lost bit is given by

\Delta W = T\,\Delta S = k_B T \ln 2.

This is the so-called Landauer principle [10], which states that erasure is not thermodynamically reversible. Any loss of information inherently leads to a (minimum) energy dissipation of this amount per erased bit in the case of an isothermal operation, as the entropy of the system changes. Conversely, to prevent the thermal energy from destroying the stored information, a minimum energy of k_B T ln 2 is required for the storage of one bit. The energy consumption in computation is thus closely linked to the reversibility of the computation.
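A quick numerical illustration of the Landauer bound:

```python
import math

kB = 1.381e-23                     # Boltzmann constant [J/K]
for T in (300.0, 77.0, 4.2):
    print(f"T = {T:5.1f} K  ->  kB*T*ln2 = {kB * T * math.log(2):.2e} J")
# At room temperature the bound is ~2.9e-21 J per erased bit, far below
# the switching energy of present-day logic gates.
```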
13.2.2
Irreversible Computation
In all switching elements with more input than output bits, a loss of information occurs upon information processing. For example, the Boolean function AND is defined by the mapping

(0, 0) → 0;   (0, 1) → 0;   (1, 0) → 0;   (1, 1) → 1.

The 0 output of this conventional two-valued gate can be caused by three possible input signal configurations. This type of computing is called irreversible.
13.2.3
Reversible Computation
A reversible gate maps each input configuration onto a unique output configuration; an example is the mapping

(0, 0) → (0, 0);   (0, 1) → (0, 1);   (1, 0) → (1, 1);   (1, 1) → (1, 0).
or to switch the system back into its original state. Thus, both forward- and backward-switching processes have the same probabilities and the net speed of the computational process is zero. The flow of information must be determined by the gates, and an adequate feedback prevention must be built into the logic gates.
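The distinction can be made concrete by checking whether a gate's truth table is a bijection; a minimal sketch, using the irreversible AND and a controlled-NOT consistent with the mapping tabulated above:

```python
# Truth tables as dictionaries from input tuple to output tuple.
AND = {(a, b): (a & b,) for a in (0, 1) for b in (0, 1)}
CNOT = {(a, b): (a, a ^ b) for a in (0, 1) for b in (0, 1)}   # flips b when a = 1

for name, gate in (("AND", AND), ("CNOT", CNOT)):
    reversible = len(set(gate.values())) == len(gate)   # bijection <=> no merged inputs
    print(f"{name}: reversible = {reversible}")
# AND maps three different inputs to output 0, so the input cannot be
# recovered; CNOT permutes the four input states and loses no information.
```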
13.2.4
Information Carriers
13.3
Quantum Mechanics and Qubits
13.3.1
Bit versus Qubit
described by a quantum-mechanical wave function and can tunnel under the barrier which separates the two wells (i.e. the two states). As a consequence, a quantum system in a double-well potential has the two lowest energy states |0⟩ = (1/√2)(|↑⟩ + |↓⟩) and |1⟩ = (1/√2)(|↑⟩ − |↓⟩). Here ↑ and ↓ are the two classical states, like the states 0 and 1 in the double-well potential (Figure 13.3a). In the ground state |0⟩ the wave function is symmetric, whereas for the first excited state |1⟩ it is antisymmetric (Figure 13.3b). A system prepared in a superposition state exhibits coherent oscillations between the two wells, and the probability of finding the particle in each well oscillates with frequency ω = Δħ⁻¹, where Δ is the splitting of the two lowest energy levels (see Figure 13.3b).
13.3.2
Qubit States
A pure qubit state is a linear superposition of the two eigenstates. This means that the qubit can be represented as a linear combination of |0⟩ and |1⟩:

|\psi\rangle = a\,|0\rangle + b\,|1\rangle, \qquad (13.1)

with complex amplitudes a and b obeying the unit norm |a|² + |b|² = 1. The common phase factor resulting from the complex nature of a and b can be neglected. The state of a single qubit can then be represented geometrically by a point on the surface of the Bloch sphere (see Figure 13.4), so a single qubit has two degrees of freedom, the angles φ and θ. A classical bit, in contrast, can only be represented by the two discrete values 0 or 1. Note that the two complex numbers a, b in Eq. (13.1) in fact correspond to four real numbers: Re(a), Im(a), Re(b), Im(b). However, these numbers are not independent: they are linked by the unit norm, and the physically irrelevant common phase factor can be neglected. Any two-level quantum physical system can be used as a qubit; for example, the electron charge, the polarization of photons, the spin of electrons or atoms, and the charge, flux or phase of Josephson junctions could all be used for the implementation of qubits.
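As a small worked example, the amplitudes (a, b) of Eq. (13.1) can be converted into the Bloch-sphere angles (θ, φ) of Figure 13.4; the helper function below is an illustration, not part of the chapter:

```python
import numpy as np

def bloch_angles(a, b):
    """Map a qubit state a|0> + b|1> to Bloch angles (theta, phi)."""
    a, b = complex(a), complex(b)
    norm = np.sqrt(abs(a)**2 + abs(b)**2)     # enforce the unit norm
    a, b = a / norm, b / norm
    b *= np.exp(-1j * np.angle(a))            # remove the irrelevant global phase
    return 2.0 * np.arccos(abs(a)), np.angle(b)

theta, phi = bloch_angles(1.0, 1.0)           # the state (|0> + |1>)/sqrt(2)
print(f"theta = {theta:.3f}, phi = {phi:.3f}")   # pi/2, 0: a point on the equator
```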
13.3.3
Entanglement
The stationary Schrödinger equation

\hat{H}\,|x\rangle = E_x\,|x\rangle

yields the spectrum of allowed energy levels of the system, given by the set of eigenvalues E_x. Since Ĥ is a Hermitian operator, the energy is always a real number.
The time evolution |ψ(t)⟩ of quantum states is given by the time-dependent Schrödinger equation:

\hat{H}\,|\psi(t)\rangle = i\hbar\,\frac{\partial}{\partial t}\,|\psi(t)\rangle.

If Ĥ is independent of time, this equation can be integrated to obtain the state at any time:

|\psi(t)\rangle = \exp\!\left(-\frac{i\hat{H}t}{\hbar}\right)|\psi(0)\rangle,

where |ψ(0)⟩ is the state at the initial time (t = 0).
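A minimal sketch of this time evolution for the double-well qubit of Section 13.3.1, with an illustrative level splitting Δ, reproduces the coherent oscillation directly:

```python
import numpy as np
from scipy.linalg import expm

hbar = 1.055e-34                   # reduced Planck constant [J*s]
Delta = 1e-24                      # level splitting [J] (illustrative)
H = np.array([[0.0, -Delta / 2],   # two-level Hamiltonian coupling the
              [-Delta / 2, 0.0]])  # classical well states |up>, |down>

psi0 = np.array([1.0, 0.0])        # particle prepared in the left well
for t in np.linspace(0.0, 2 * np.pi * hbar / Delta, 5):   # one period
    psi_t = expm(-1j * H * t / hbar) @ psi0
    print(f"t = {t:.2e} s   P(left) = {abs(psi_t[0])**2:.2f}")
# P(left) runs through 1, 0.5, 0, 0.5, 1: oscillation with omega = Delta/hbar.
```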
13.3.4.1 Measurement
A measurement of a quantum state inevitably alters the system, as it projects the state onto the basis states of the measuring operator. Only if the state is already an eigenstate of the measuring operator does the state not change. Thus, a superposition of states collapses into one or the other eigenstate of the operator, as determined by the probability amplitudes. The precise amplitudes (a and b of a single qubit) can be found by repeatedly recreating the superposition and performing subsequent measurements.
13.3.4.2 No-Cloning Theorem
Unlike the classical bit, of which multiple copies can be made, it is not possible to make a copy (clone) of an unknown quantum state [12, 13]. This so-called no-cloning theorem has profound implications for error correction in quantum computing as well as for quantum cryptography. For example, it prevents the use of classical error-correction techniques on quantum states: no backup copy of a state can be made to correct subsequent errors. However, error correction in quantum computation is still possible (see Section 13.5 for details). The no-cloning theorem also protects the uncertainty principle in quantum mechanics, as the availability of several copies of an unknown system, on each of which a dynamical variable could be measured separately with arbitrary precision, would bypass the Heisenberg uncertainty principle ΔxΔp ≥ ħ/2, ΔEΔt ≥ ħ/2.
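The theorem follows directly from the linearity of quantum evolution; a sketch of the standard argument [12, 13] for a hypothetical cloner U with U(|ψ⟩|0⟩) = |ψ⟩|ψ⟩ reads:

```latex
% Linearity applied to |psi> = a|0> + b|1>:
\begin{align*}
U\bigl[(a|0\rangle + b|1\rangle)\,|0\rangle\bigr]
  &= a\,U(|0\rangle|0\rangle) + b\,U(|1\rangle|0\rangle)
   = a\,|0\rangle|0\rangle + b\,|1\rangle|1\rangle, \\
\text{whereas cloning requires}\quad
  &(a|0\rangle + b|1\rangle)\,(a|0\rangle + b|1\rangle)
   = a^2|0\rangle|0\rangle + ab\,|0\rangle|1\rangle
   + ab\,|1\rangle|0\rangle + b^2|1\rangle|1\rangle.
\end{align*}
% The two expressions agree only for a = 0 or b = 0: basis states can be
% copied, but arbitrary superpositions cannot.
```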
13.4
Operation Scheme
Quantum bits must be coupled and controlled by gates in order to process the
information. At the same time, they must be completely decoupled from external
influences such as thermal noise. It is only during well-defined periods that the control, write and readout operations take place, to prevent an untimely readout.
13.4.1
Quantum Algorithms: Initialization, Execution and Termination
vector by projection onto the eigenstate of the corresponding observable, and only a probability-distributed n-bit vector is obtained. Even when neglecting the decoherence sources during the unitary transformations, the experimental readout schemes can never be perfectly efficient. Thus, it should be possible to repeat the measurement to enhance the probability of the obtained results by a majority-polled output.
13.4.2
Quantum Gates
13.5
Quantum Decoherence and Error Correction
will increase, and the need for decoherence control becomes predominant. This could be done by implementing quantum error-correcting gates. In general, the sources of error can be: (i) non-ideal gate operations; (ii) interaction with the environment causing relaxation or decay of phase coherence; and (iii) deviations of the quantum system from an idealized model system.
In classical computers every bit of information can be re-adjusted after every logical step by using non-linear devices to re-set the information bit to the 0 or 1 state. In contrast, in a quantum system no copy can be made of a qubit state without projecting it onto one of its eigenstates and thus destroying the superposition state.
Quantum information processing attracted much attention after Shor's surprising discovery that quantum information stored in a superposition state can be protected against decoherence. The single qubit is encoded in multiple qubits, followed by a measurement yielding the type of error, if any, which happened to the quantum state. With this information the original state is recovered by applying a proper unitary transformation to the system. This stimulated much research on quantum error correction, and led to the demonstration that arbitrarily complicated computations could be performed, provided that the error per operation is reduced below a certain threshold.
By repeated runs of the quantum algorithm and measurement of the output, the correct value can be determined with high probability. In brief, quantum computations are probabilistic.
13.6
Qubit Requirements
DiVincenzo [14] listed criteria that any implementation of a quantum computer should fulfill to be considered useful:
. The ability to initialize the state of all qubits to a simple basis state.
. Relatively long decoherence times t_dec, much longer than the gate-operating time. To observe quantum-coherent oscillations the requirement t_dec Δ ≫ ħ must be fulfilled, and the fidelity loss per single quantum-gate operation should be below a certain threshold.
13.7
Candidates for Qubits
Atomic nuclei are relatively isolated from the environment and thus well protected against decoherence. Their spins can be manipulated by properly chosen radio-frequency irradiation. Elementary quantum logic operations have been carried out on spin-1/2 nuclei as qubits by using nuclear magnetic resonance (NMR) spectroscopy [16]. This system is restricted by the low polarization and the limited number of resolvable qubits. The nuclei (see Figure 13.6) were in molecules in a solution, and a magnetic field defined the two energy-separated states of the nuclei. These macroscopic samples with many nuclear qubits provide massive redundancy, as each molecule serves as a single quantum computer. The qubits interact through the electronic wave function of the molecule. Quantum calculations with up to seven qubits have been demonstrated. For example, the Shor algorithm was implemented to factorize the number 15 [17].
13.7.2
Advantages of Solid-State-Based Qubits
In 1998, Kane [19] suggested embedding a large number of nuclear-spin qubits, made up of isotopically pure ³¹P (nuclear spin 1/2) donor atoms, at a moderate depth in a ²⁸Si (nuclear spin 0) crystal to build a quantum computer (see Figure 13.7). When placed in a magnetic field, two distinguishable states for the spin-1/2 nuclei of the P atoms appear. By a controllable overlap of the wave functions of electrons bound to these atoms in the deliberately doped semiconductor, some interaction may take place between adjacent nuclear qubits. Voltages applied to electrodes (gate A) placed above the phosphorus atoms can modify the strength of the electron-nuclear coupling. Individual qubits are addressed by the A-gates to shift their energies in and out of resonance with an external radiofrequency. The strength of the qubit-qubit coupling by overlapping wavefunctions is controlled by electrodes placed midway between the P atoms (gate B). The operation principle is similar to that of the NMR-based qubits in a macroscopic molecule solution, except that the nuclei are addressed by potentials rather than by radiofrequency pulses.
13.7.4
Quantum Dot
Besides the nucleus, the spin and charge of electrons may also be used to construct a doubly degenerate system. Electron-spin-based qubits have the advantage that the Hilbert space consists of only two spin states, which strongly suppresses the leakage of quantum information into other states. In addition, the spin is less coupled to the environment than the charge, which results in longer decoherence times.
Quantum dots of 5 to 50 nm are formed in thin, semiconducting multilayers on the substrate surface, where confined electrons with discrete energy spectra appear. By using Group III-V compound semiconductor materials such as GaAs and InAs of different lattice sizes, two-dimensional electron gas systems with high electron mobility can be constructed. To use these as qubits, their quantized electron number or spin is utilized [20]. The switching of the quantum state can be achieved either by optical or electrical means. In the case of the spin-based quantum dot the electrons are localized in electrostatically defined dots (see Figure 13.8). The coupling between the electron spins is mediated by the exchange interaction, and the electron tunneling between the dots is controlled by a gate voltage (gate B in Figure 13.8).
13.7.5
Superconducting Qubits
crosses the weak link is the Josephson supercurrent. For details on superconductivity and the Josephson effect, see Refs. [21, 22].
To date, several possible systems have been described for constructing a superconducting qubit. In the charge qubit a coherent state with a well-defined charge of individual Cooper pairs (charge 2e) is used, while the flux qubit employs two degenerate magnetic-flux states, and the phase qubit is based on the phase difference of the superconducting wavefunctions in two electrodes [23, 24].
13.7.5.1 Charge Qubits
The basis states of a charge qubit are two charge states. The qubit is formed by a tiny (~100 nm) superconducting island, a so-called Cooper pair box. Thus the electrostatic charging energy of a Cooper pair dominates in comparison with all other energies. An external gate voltage controls the charge, and operations can be performed by controlling the applied gate voltages Vg and magnetic fields. A superconducting reservoir, coupled by a Josephson junction to the Cooper pair box, supplies the neutral reference charge level (see Figure 13.9). The state of the qubit is determined by the number of Cooper pairs which have tunneled across the junction.
The readout is performed by a single-electron transistor attached to the island (not shown). By applying a voltage to this transistor, the dissipative current through the probe destroys the coherence of the charge states and the charge can be measured [25].
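An order-of-magnitude sketch (the island capacitance below is an assumed, illustrative value) shows why charge qubits require very low temperatures:

```python
e, kB = 1.602e-19, 1.381e-23
C = 0.5e-15                              # island capacitance [F] (assumed)
E_C = (2 * e)**2 / (2 * C)               # charging energy of one Cooper pair
print(f"E_C = {E_C:.1e} J = {E_C / e * 1e6:.0f} ueV")
for T in (1.0, 0.1, 0.05):
    print(f"T = {T:5.2f} K: E_C / (kB*T) = {E_C / (kB * T):.0f}")
# Charge quantization requires E_C >> kB*T, hence sub-kelvin operation.
```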
13.7.5.2 Flux Qubits
A flux qubit is formed by a superconducting loop (~1 µm in size) interrupted by several Josephson junctions with well-chosen parameters (cf. Figure 13.10) [26]. To obtain a double-well potential, as in Figure 13.10, either an external flux Φ0/2 or a π junction is needed. By including a π junction [27-29] in the loop, the persistent current,
Figure 13.10 The basis states |0⟩ and |1⟩ of the flux qubit are determined by the direction of a persistent current circulating in the three-junction qubit [26]. The basis states |0⟩ and |1⟩ differ by the direction of the current in the superconducting loop. The π Josephson junction (JJ) self-biases the qubit to the working point and thus substitutes an external magnetic flux.
generating the magnetic flux, may spontaneously appear and flow continuously, even in the absence of an applied magnetic field [30]. The basis states of the qubit are defined by the direction of the circulating current (clockwise or counter-clockwise). The currents screen the intrinsic phase shift of π of the loop, such that the total flux through the loop is equal to Φ0/2, i.e. half a magnetic flux quantum. The two energy levels corresponding to the two directions of circulating supercurrent are degenerate. If the system is in the quantum-mechanical regime (low temperature, to suppress thermal contributions) and the coupling between the two states (clockwise/counter-clockwise current flow) is strong enough (viz. the barriers are low), the system can be in a superposition of the clockwise and counter-clockwise states. This quiet qubit [31] is expected to be robust against decoherence by the environment because it is self-biased by a π Josephson junction. Note that this flux-qubit device with a π junction is an optimization of the earlier scheme, where the phase shift of π was generated by an individual external magnetic field of Φ0/2 for each qubit [26].
The readout could be made by an additional superconducting loop with one or two Josephson junctions (i.e. a SQUID loop) that is inductively coupled to the qubit.
To process the input and output of flux qubits, an interface hardware based on rapid single flux quantum (RSFQ) circuits could be used. These well-developed superconducting digital logics work by manipulating and transmitting single flux quanta. In fact, this logic overcomes many problems of conventional CMOS logic, as it has a very low power consumption, an operating frequency of several hundred GHz, and is compatible with the flux-qubit technology [32].
13.7.5.3 Fractional Flux Qubits
One variation of the π flux qubit is the fractional flux qubit. At the boundary between a 0- and a π-coupled Josephson junction (i.e. a 0-π JJ) a spontaneous vortex of supercurrent may appear under certain circumstances. Depending on the length L of the junction, the supercurrent carries a half-integer flux quantum Φ0/2 (called a semifluxon) or fractions of it. Figure 13.11 depicts the cross-section of a symmetric 0-π long JJ. Classically, the semifluxon has a degenerate ground state of either positive or negative polarity, corresponding to clockwise and counter-clockwise circulation of supercurrent around the 0-π boundary. The magnetic flux localized at the 0-π boundary is ±Φ0/2 and represents two degenerate classical states [33].
0-π Josephson junctions with a spontaneous flux in the ground state have been realized with various technologies. The presence of spontaneous flux has been demonstrated experimentally in d-wave superconductor-based ramp zigzag junctions [34], in long Josephson 0-π junctions fabricated using the conventional Nb/Al-Al2O3/Nb technology with a pair of current injectors [35], in so-called tricrystal grain-boundary long junctions [36-38], and in SIFS Josephson junctions [39] with a step-like ferromagnetic barrier. In the latter system the Josephson phase is set to 0 or π by choosing a proper F-layer thickness dF. The advantages of this system are that it can be prepared in a multilayer geometry (allowing topological flexibility) and it can easily be combined with the well-developed Nb/Al-Al2O3/Nb technology.
The ground state of a single semifluxon is doubly degenerate, with basis flux states |↑⟩, |↓⟩. It transpires that the energy barrier scales proportionally to the junction length L, and the probability of tunneling between |↑⟩ and |↓⟩ decreases exponentially with increasing L [40]. Hence, for long junctions a single semifluxon will always be in the classical regime, with thermally activated tunneling. As a modification, a junction of finite, rather small length L may be considered. In this case, the barrier height is finite and approaches zero as the junction length L → 0. In this limit, the situation is not really a semifluxon, as the flux Φ present in the junction is much smaller than Φ0/2.
A 0-π-0 Josephson junction (see Figure 13.12) contains two antiparallel coupled semifluxons when their distance a is larger than the critical distance ac [40]. The ground state of this system is either |↑↓⟩ or |↓↑⟩. For symmetry reasons both states are degenerate. The tunnel barrier can be made rather small, which results in a rather strong coupling with appreciable energy-level splitting due to the overlap of the wave functions. Estimations show that this system can be mapped onto a particle in a double-well potential, and thus can be used as a qubit like other Josephson-junction-based
Figure 13.12 The two basis states |↑↓⟩ and |↓↑⟩ of two coupled fractional vortices in a long linear 0-π-0 Josephson junction.
qubits. Thus, the 0-π-0 junctions are expected to show the motion of a point-like particle in a double-well potential, and may be used as the basic cell of a fractional flux qubit.
13.8
Perspectives
References
1 R. P. Feynman, Int. J. Theoret. Physics 1982,
21, 467.
2 R. P. Feynman, Feynman Lectures on
Computation, Addison-Wesley, Reading,
MA, 1996.
3 International Technology Roadmap for
Semiconductors, http://www.itrs.net/.
4 M. A. Nielsen, I. L. Chuang, Quantum
Computation and Quantum Information,
University Press, Cambridge, 2000.
5 J. Stolze, D. Suter, Quantum Computation,
Wiley-VCH, 2004.
6 L. K. Grover, Proceedings, 28th Annual ACM Symposium on the Theory of Computing, 1996.
8
9
10
11
12 W. K. Wootters, W. H. Zurek, Nature 1982,
299, 802.
13 D. Dieks, Phys. Lett. A 1982, 92, 271.
14 D. P. DiVincenzo, Fortschr. Phys. 2000, 48, 771.
15 GDEST, EU/US Workshop on Quantum Information and Coherence, December 8-9, Munich, Germany, 2005.
16 L. M. K. Vandersypen, I. L. Chuang, Rev.
Mod. Phys. 2004, 76, 1037.
17 L. M. K. Vandersypen, M. Steffen, G.
Breyta, C. S. Yannoni, M. H. Sherwood, I.
L. Chuang, Nature 2001, 414, 883.
18 R. W. Keyes, Appl. Phys. A 2003, 76, 737.
19 B. E. Kane, Nature 1998, 393, 133.
20 D. Loss, D. P. DiVincenzo, Phys. Rev. A
1998, 57, 120.
21 W. Buckel, R. Kleiner, Superconductivity: Fundamentals and Applications, Wiley-VCH, 2004.
22 A. Barone, G. Paterno, Physics and
Applications of the Josephson Effect, John
Wiley & Sons, 1982.
23 A. Ustinov, in: R. Waser (Ed.), Nanoelectronics and Information Technology - Advanced Electronic Materials and Novel Devices, Wiley-VCH, 2005.
24 Y. Makhlin, G. Schön, A. Shnirman, Rev.
Mod. Phys. 2001, 73, 357.
25 Y. Nakamura, Y. A. Pashkin, J. S. Tsai,
Nature 1999, 398, 786.
26 J. E. Mooij, T. P. Orlando, L. Levitov, L.
Tian, C. H. van der Wal, S. Lloyd, Science
1999, 285, 1036.
27 L. Bulaevskii, V. Kuzii, A. Sobyanin, J. Exp.
Theoret. Physics Lett. 1977, 25, 290.
28 V. V. Ryazanov, V. A. Oboznov, A. Y.
Rusanov, A. V. Veretennikov, A. A.
29
30
31
32
33
34
35
36
37
38
39
40
Part One:
Nanomedicine: The Next Waves of Medical Innovations
1
Introduction
Viola Vogel
1.1
Great Hopes and Expectations are Colliding with Wild Hype and Some Fantasies
What is nanomedicine? Will nanomedicine indeed help to cure major diseases and
live up to the great hopes and expectations? What innovations are on the horizon
and how can sound predictions be distinguished from wild hype and plain fantasy?
What are realistic timescales in which the public might benefit from their ongoing
investments?
When first exploring whether nanotechnology might reshape the future ways of
diagnosing and treating diseases, the National Institutes of Health stated in the report
of their very rst nanoscience and nanotechnology workshop in 2000 (http://www.
becon.nih.gov/nanotechsympreport.pdf. Bioengineering Consortium):
Every once in a while, a new field of science and technology emerges that enables the development of a new generation of scientific and technological approaches. Nanotechnology holds such promise.
Our macroscopic bodies and tissues are highly structured at smaller and smaller length scales, with each length scale having its own secrets as to how life-supporting tasks are mastered. While we can still touch and feel our organs, they are all composed of cells, which are a little less than one million times smaller and only visible under the light microscope (microscopic). Zooming further into the cell, by about one thousand times, we find the nanoscale molecular machineries that drive and control the cellular biochemistry, and thereby distinguish living systems from dead matter. Faced with a host of new technologies that have enabled researchers to visualize and manipulate atoms and molecules, as well as to engineer new materials and devices at this tiny length scale [1], major think tanks have, since the late 1990s, begun to evaluate the future potential of nanotechnology [2], and later at its interface with medicine [3-11]. These efforts were paralleled by a rapid worldwide increase in funding and research activities since 2000. The onset of a gold rush into the nano, by which the world of the very small is currently being discovered, will surely also lead to splendid new entrepreneurial opportunities. Progress impacting on human health came much faster than expected.
1.2
The First Medical Applications are Coming to the Patient's Bedside
1.3
Major Advances in Medicine Have Always been Driven by New Technologies
During the past few decades, the deciphering of the molecular origins of many diseases has had a most profound impact on improving human health. One historical step was the deciphering of the first protein structure in 1958 [23]. This opened a new era in medicinal chemistry, as drugs could from then on be designed in a rational manner; that is, drugs that fit tightly into essential binding pockets, thereby regulating protein and DNA functions. The invention of how to harness DNA polymerase in order to amplify genetic material in the test tube, which we now know as the polymerase chain reaction (PCR) [24], then opened the field of molecular biology during the 1980s. PCR also enabled targeted genetic alterations of cells to identify the functional roles of many proteins, and this in turn led to the discovery that cell signaling pathways of many interacting proteins existed, and could be altered by diseases. The explosion of knowledge into how cell behavior is controlled by biochemical factors opened the door to targeting drugs to very specific players in cell signaling pathways. This also led to a host of new biotechnology start-up companies, the first of which became profitable only around 2000.
The next major breakthrough came with the sequencing of the human genome in 2001 [25-29]. Access to a complete genetic inventory of the more than 30 000 proteins in our body, combined with high-resolution structures for many of them, enables a far more systematic search for correlations between genetic abnormalities and diseases. Finally, various diseases could for the first time be traced to inherited point mutations of proteins. In achieving this, much insight was gained into the regulatory roles of these proteins in cell signaling and disease development [30]. This includes recognizing genetic predispositions to various cancers [31], as well as to inherited syndromes where larger sets of seemingly uncorrelated symptoms could finally be explained by the malfunctioning of particular proteins or cell signaling pathways [32-38], including ion channel diseases [39-42].
1.4
Nanotechnologies Foster an Explosion of New Quantitative Information on How Biological Nanosystems Work
Far less noticed by the general public are the next approaching waves of medical innovations, made possible by an explosion of new quantitative information on how biological systems work.
The ultimate goal is to achieve an understanding of all the structural and dynamic processes by which the molecular players work with each other in living cells and coregulate cellular functions. Driven by the many technologies that have emerged from the physical, chemical, biological and engineering sciences to visualize (see elsewhere in this volume [43, 44]) and manipulate the nanoworld, numerous discoveries are currently being made (as highlighted in later chapters [43-51]). These findings result from the new capabilities to create, analyze and manipulate nanostructures, as well as to probe their nanochemistry, nanomechanics and other properties within living and man-made systems. New technologies will continuously be developed that can interrogate biological samples with unprecedented temporal and spatial resolution [52]. Novel computational technologies have, furthermore, been developed to simulate cellular machineries in action with atomic precision [53]. New engineering design principles and technologies will be derived from deciphering how natural systems are engineered and how they master all the complex tasks that enable life. Take the natural machineries apart, and then learn how to reassemble their components (as exemplified here in Chapter 8 for molecular motors [48]).
How effectively will these novel insights into the biological nanoworld be recognized for their clinical significance, and translated into addressing grand medical challenges? This defines the time that it takes for the emergence of the next generation of diagnostic and therapeutic tools. As these insights change the way we think about the inner workings of cells and cell-made materials, totally new ways of treating diseases will emerge. As described in detail elsewhere in this volume, new developments are already under way of how to probe and control cellular activities [45, 47, 49-51, 54, 55]. This implies the emergence of new methodologies of how to correct tissue and organ malfunctions. Clearly, we need to know exactly how each disease is associated with defects in the cellular machinery before medication can be rationally designed to effectively cure it.
Since every one of the new (nano)analytical techniques has the potential to reveal something never seen before, a plethora of opportunities can be envisioned. Their realization, however, hinges on the scientists' ability to recognize the physiological and medical relevance of their discoveries. This can best be accomplished in the framework of interdisciplinary efforts aimed at learning from each other what the new technologies can provide, and how this knowledge can be effectively translated to address major clinical challenges. Exploring exactly how these novel insights into the nanoworld will impact medicine has been the goal of many recent workshops [3-11, 56-58]. This stimulated the creation of the NIH Roadmap Initiative in Nanomedicine [57, 58], and is the major focus of this volume.
1.5
Insights Gained from Quantifying How the Cellular Machinery Works Will Lead to Totally New Ways of Diagnosing and Treating Disease
Which are some of the central medical fields that will be impacted? Despite
stunning scientific advances, and the successful suppression or even eradication of a
variety of infectious diseases during the past 100 years, we have not yet reached the goal
of having medications at hand that truly cure many of the diseases which currently kill the
largest fraction of humans each year, including cancers, cardiovascular diseases and
AIDS. Much of the current medication against these diseases fights symptoms or
inhibits disease progression, often inflicting considerable side effects. Unfortunately,
much of this medication can slow disease progression but cannot reverse it in any major way, all of which contributes to healthcare becoming unaffordable,
even in the richest nations of the world. For instance, intense cancer research over the
past decades has revealed that the malignancy of cancer cells progresses with the
gradual accumulation of genetic alterations [12, 50, 59–65]. Yet, little is known as
to how cancer stem cells form in the first place, or about the basic mechanisms that
trigger the initiation of their differentiation into more malignant cancer cells after
having remained dormant, sometimes for decades [66–69]. While much has been
learned in the past about the molecular players and their interactions, the abovementioned shortcomings in translating certain advances in molecular and cell biology
into more effective medication reflect substantial gaps in our knowledge of how all these
components within the cells work in the framework of an integrated system. How can so
many molecular players be tightly coordinated in a crowded cellular environment to
generate predictable cell and tissue responses? Whilst lipid membranes create barriers
that enclose the inner volumes of cells and control which molecules enter and leave
(among other tasks), it is the proteins that are the workhorses that enable most cell
functions. In fact, some proteins function as motors that ultimately allow cells to move,
as discussed in different contexts in the following chapters [46, 48, 50]. Other proteins
transcribe and translate genetic information, and efforts to visualize these in cells have
been summarized in Chapter 6 [45]. Yet other sets of proteins are responsible for the cell
signaling through which all metabolic functions are enabled, orchestrated and regulated. But what are the underlying rules by which they play and interact together to
regulate diverse cell functions? How do cells sense their environments, integrate that
information, and translate it to ultimately adjust their metabolic functions if needed?
Can this knowledge help to develop interventions which could possibly revert
pathogenic cells such that they perform their normal tasks again? Deciphering how
all of these processes are regulated by the physical and biochemical microenvironment
of cells is key to addressing various biomedical challenges with new perceptions, and is
described as one of the major foci in this volume. But, how can this be accomplished?
1.6
Engineering Cell Functions with Nanoscale Precision
1.7
Advancing Regenerative Medicine Therapies
Virtually any disease that results from malfunctioning, damaged or failing tissues may
potentially be cured through regenerative medicine therapies, as was recently stated in
the first NIH report on Regenerative Medicine [4]. But, how will nanotechnology make
a difference? The repair or, ultimately, replacement of diseased organs, from larger
bone segments to the spinal cord, or from the kidneys to the heart, still poses major
challenges, as discussed in Chapters 9 to 14 [47, 49, 50, 54, 55, 77]. The current need for
organ transplants surpasses the availability of suitable donor organs by at least an order
of magnitude, and the patients who finally receive an organ transplant must take
immunosuppressant drugs for the rest of their lives. Thus, one goal will be to apply the
mounting insights into how cells work, and how their functions are controlled by
matrix interactions, to design alternative therapies that stimulate regenerative healing.
1.8
Many More Relevant Medical Fields Will Be Transformed by Nanotechnologies
Whilst the major focus of this volume is to outline the biomedical implications
derived from revealing the underpinning mechanisms of how human cells function,
it should also be mentioned that fascinating developments that are poised to alter
medicine are being made in other, equally relevant biomedical sectors. Future ways to
treat infection will change when the underpinning mechanisms of how microbial
systems function are deciphered, and how they interact with our cells and tissues.
Many beautiful discoveries have already been made that will help us, for example, to
interfere more effectively with the sophisticated machinery that bacteria have evolved
to target, adhere to and infect cells and tissues. Nanotechnology tools have revealed
much about the function of the nano-engines that bacteria and other microbes have
evolved for their movement [78–80], how bacteria adhere to surfaces [81–83], and
how microbes infect other organisms [81–85]. Equally important when combating
infection is an ability to exploit micro- and nanofabricated tools in order to understand the language by which microorganisms communicate with each other [86, 87]
and how their inner machineries function [88–90]. A satisfying understanding of how
a machine works can only be reached when we are capable of reassembling it from its
components. It is thus crucial to learn how these machineries can be reassembled ex
vivo, potentially even in nonbiological environments, as this should open the door to
many technical and medical applications [84, 91–93]. Today, we have only just started
along the route to combining the natural and synthetic worlds, with the community
seeking how bacteria might be used as delivery men for nanocargoes [94], or in
manmade devices to move fluids and objects [95, 96].
Finally, microfabricated devices with integrated nanosensors, nanomonitors and
nanoreporters, all of which are intrinsic to a technology sector enabled by (micro/
nano)biotechnology, will surely also lead to changes in medical practice. In the case of
chemotherapies and many other drugs, it is well known that they may function well in
some patients, but fail in others. It is feasible that this one-size-fits-all approach might
soon be replaced by a more patient-specific system. Personalized medicine refers to the
use of genetic and other screening methods to determine the unique predisposition of
a patient to a disease, and their likelihood of responding to particular drugs and
treatments [30, 97–100]. Cheap diagnostic systems that can automatically conduct
measurements on small gas or fluid volumes, such as human breath or blood, will
furthermore enable patients to be tested rapidly, without the need to send samples to
costly medical laboratories. Needless to say, portable integrated technologies that will
allow the testing and treatment of patients on the spot (point-of-care) will save many
lives, and are urgently needed to improve human health in the Third World [101–104].
Faced with major challenges in human healthcare, an understanding of what each of
the many nanotechnologies can do, and of how each can best contribute to addressing
the major challenges ahead, is crucial to driving innovation forwards. An improved
awareness of how new technologies will help to unravel the underpinning mechanisms
of disease is crucial to setting realistic expectations and timescales, as well as to
preparing for the innovations to come. Ultimately, striving to provide access to
efficient and affordable healthcare by improving technology is not just an intellectual
luxury, but our responsibility.
Acknowledgments
Many of the thoughts about the future of nanomedicine were seeded during exciting
discussions with Drs Eleni Kousvelari (NIH) and Jeff Schloss (NIH), as well as with
Dr Mike Roco (NSF), when preparing for the interagency workshop on Nanobiotechnology [9]. The many inspirations and contributions from colleagues and friends
worldwide, several of whom have contributed chapters here, are gratefully acknowledged,
as are those from my students and collaborators and from the many authors whose
articles and conference talks have left long-lasting impressions.
References
1 Nalwa, H.S. (2004) Encyclopedia of
Nanoscience and Nanotechnology,
American Scientic Publishers.
2 Roco, M.C. (2003) National
Nanotechnology Initiative: From Vision
to Implementation. http://www.nano.
gov/nni11600/sld001.htm.
3 National Institutes of Health (2000)
Nanoscience and Nanotechnology:
Shaping Biomedical Research.
Part Two:
Imaging, Diagnostics and Disease Treatment by Using Engineered
Nanoparticles
2
From In Vivo Ultrasound and MRI Imaging to Therapy: Contrast
Agents Based on Target-Specific Nanoparticles
Kirk D. Wallace, Michael S. Hughes, Jon N. Marsh, Shelton D. Caruthers, Gregory M. Lanza,
and Samuel A. Wickline
2.1
Introduction
Advances and recent developments in the scientific areas of genomics and molecular
biology have created an unprecedented opportunity to identify clinical pathology
in pre-disease states. Building on these advances, the field of molecular imaging
has emerged, leveraging the sensitivity and specificity of molecular markers
together with advanced noninvasive imaging modalities to enable and expand the
role of noninvasive diagnostic imaging. However, the detection of small aggregates of
precancerous cells and their biochemical signatures remains an elusive target that is
often beyond the resolution and sensitivity of conventional magnetic resonance and
acoustic imaging techniques. The identification of these molecular markers requires
target-specific probes, a robust signal amplification strategy, and sensitive high-resolution imaging modalities.
Currently, several nanoparticle or microparticle systems are under development
for targeted diagnostic imaging and drug delivery [1]. Perfluorocarbon (PFC)
nanoparticles represent a unique platform technology, which may be applied to
multiple clinically relevant imaging modalities. They exploit many of the key
principles employed by other imaging agents. Ligand-directed, lipid-encapsulated
PFC nanoparticles (with a nominal diameter of 250 nm) have inherent physicochemical properties which provide acoustic contrast when the agent is bound to a
surface layer. The high surface area of the nanoparticle accommodates 100 to 500
targeting molecules (or ligands), which impart high avidity and provide the agent
with a robust 'stick and stay' quality (Figure 2.1). The incorporation of large payloads
of lipid-anchored gadolinium chelate conjugates further extends the utility of the
agent to detect sparse concentrations of cell-surface biochemical markers with magnetic resonance imaging (MRI) [2]. Moreover, for MRI the high fluorine signal
from the nanoparticle core allows the noninvasive quantification of ligand-bound
particles, which enables clinicians to confirm tissue concentrations of drugs when
the particles are used for targeted drug delivery.
2.2
Active versus Passive Approaches to Contrast Agent Targeting
The passive targeting of a contrast agent is achieved by exploiting the body's inherent
defense mechanisms for the clearance of foreign particles. Macrophages of the
macrophage phagocytic system are responsible for the removal of most of these
contrast agents from the circulation; this occurs, in a size-dependent fashion,
in the lung, spleen, liver and bone marrow. Phagocytosis and accumulation within
specific sites can be enhanced by biologic tagging (i.e. opsonization) with blood
proteins such as immunoglobulins, complement proteins or nonimmune serum
factors. In general, sequestration in the liver appears to be complement-mediated,
while the spleen removes foreign particulate matter via antibody Fc receptors [8].
This natural process of nondirected and nonspecific uptake of particles is generally
referred to as passive targeting (e.g. Feridex in the liver, or iron oxide in the sentinel
lymph nodes [9]).
Distinguished from passive contrast agents, targeted (i.e. ligand-directed) contrast agents are designed to enhance specific pathological tissue that otherwise might
be difficult to distinguish from the surrounding normal tissues. Here, an extensive
array of ligands can be utilized, including monoclonal antibodies and fragments,
peptides, polysaccharides, aptamers and drugs. These ligands may be attached either
covalently (i.e. by direct conjugation) or noncovalently (i.e. by indirect conjugation) to
the contrast agent. Engineered surface modifications, such as the incorporation of
polyethylene glycol (PEG), are used to delay or avoid the rapid systemic removal of the
agents, such that ligand-to-target binding is allowed to occur.
The effectiveness of this concept of contrast agent targeting is demonstrated
by the application of paramagnetic MRI contrast agents. Paramagnetic agents
influence only those protons in their immediate vicinity, and removal of these
contrast agents by the macrophage phagocytic system during passive targeting may
decrease their effectiveness via two mechanisms: (i) an accumulation of contrast
agent in specific organs that are distal to the region of interest; and (ii) endocytosis,
which further decreases their exposure to free water protons. By targeting the
contrast agent, the paramagnetic ions can be brought into close proximity to the region
of interest with sufficient accumulation to overcome the partial dilution effect that
plagues some MRI contrast agents. Efficacy is further enhanced with some
targeting platforms by delivering multiple contrast ions per particle [2].
2.3
Principles of Magnetic Resonance Contrast Agents
The fundamental physics underpinning MRI is grounded in the quantum mechanical magnetic properties of the atomic nucleus. All atomic nuclei have a fundamental
property known as the nuclear magnetic momentum or spin quantum number. Individual protons and neutrons are fermions that possess an intrinsic angular momentum,
or spin, quantized with a value of 1/2 [10, 11].
The overall spin of a nucleus (a composite fermion) is determined by the numbers
of neutrons and protons. In nuclei with an even number of protons and an even
number of neutrons, these nucleons pair up to result in a net spin of zero. Nuclei
with an odd number of protons or neutrons will have a nonzero net spin which, when
placed in a strong magnetic field (with magnitude B0), will have an associated net
magnetic moment, μ, that will orient either with (parallel) or against (anti-parallel)
the direction of B0. For a nucleus with a net spin of 1/2 (e.g. ¹H), this results in two
possible energy states.
To elucidate the source of image contrast, let us assume that two adjacent tissue
types (A and B) manifest identical longitudinal (T1) and transverse (T2) relaxation
times prior to nanoparticle binding, but only one tissue (say, type B) expresses the
molecular epitope of interest that binds the targeted paramagnetic nanoparticles.
The bound paramagnetic nanoparticles affect the relaxation times in the targeted
tissue according to the following equations [14]:
$$\frac{1}{T_{1B}} = \frac{1}{T_{1A}} + r_{1P}\,\langle NP \rangle \qquad (2.1)$$

$$\frac{1}{T_{2B}} = \frac{1}{T_{2A}} + r_{2P}\,\langle NP \rangle \qquad (2.2)$$

where T1B and T2B are the observed relaxation times after nanoparticle binding,
T1A and T2A are the original relaxation times, r1P and r2P are the particle-based
relaxivities, and ⟨NP⟩ represents the average nanoparticle concentration within the
imaging voxel. For the purpose of this example, the assumption is made that targeted
binding does not affect particle relaxivity (i.e. r1P and r2P are constant).
The contrast-to-noise ratio (CNR) between the two tissues for a given sequence is
calculated from the absolute difference between their signal intensities. If IA and IB
represent the signal intensities of tissues A and B, respectively, and N is the expected
level of noise in the resulting image, the CNR is given by:
$$\mathrm{CNR} = \frac{|I_A - I_B|}{N} \qquad (2.3)$$
For a spin echo pulse sequence, the signal intensity of each tissue is related to the
chosen scan parameters (echo time, TE, and repetition time, TR) as well as its
magnetic properties (T1 and T2), which change due to binding of the contrast agent,
and is described with the following relationships for tissues A and B [15]:
$$I_A = k_A\left(1 - 2e^{-(TR - TE/2)/T_{1A}} + e^{-TR/T_{1A}}\right)e^{-TE/T_{2A}} \qquad (2.4)$$

$$I_B = k_B\left(1 - 2e^{-(TR - TE/2)/T_{1B}} + e^{-TR/T_{1B}}\right)e^{-TE/T_{2B}} \qquad (2.5)$$
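These relationships can be combined into a short numerical sketch. The following Python snippet, a minimal illustration with assumed values (the relaxation times, relaxivities, concentration and scan parameters below are generic assumptions, not data from this chapter), evaluates Eqs (2.1)–(2.2) and then the spin-echo CNR of Eqs (2.3)–(2.5):

```python
import numpy as np

# Illustrative values only (assumptions, not measurements from this chapter).
T1A, T2A = 1.0, 0.10      # relaxation times of both tissues before binding (s)
r1P, r2P = 1.0e6, 2.0e6   # assumed particle-based relaxivities (per mM of particles, per s)
NP = 1.0e-7               # average particle concentration in the voxel (mM), i.e. ~100 pM
TR, TE = 0.5, 0.02        # spin-echo repetition and echo times (s)
noise = 0.01              # expected image noise level N

# Eqs (2.1)-(2.2): relaxation times of the targeted tissue B after binding.
T1B = 1.0 / (1.0 / T1A + r1P * NP)
T2B = 1.0 / (1.0 / T2A + r2P * NP)

def spin_echo_intensity(k, T1, T2):
    """Spin-echo signal intensity, following Eqs (2.4)-(2.5)."""
    return k * (1 - 2 * np.exp(-(TR - TE / 2) / T1) + np.exp(-TR / T1)) * np.exp(-TE / T2)

IA = spin_echo_intensity(1.0, T1A, T2A)   # tissue A: no bound particles
IB = spin_echo_intensity(1.0, T1B, T2B)   # tissue B: bound particles

CNR = abs(IA - IB) / noise                # Eq. (2.3)
print(f"T1B = {T1B:.3f} s, T2B = {T2B:.3f} s, CNR = {CNR:.1f}")
```

With these assumed inputs the sketch returns a CNR of roughly 4–5, which is of the same order as the 5 : 1 CNR at ~100 picomolar quoted below.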
The ionic relaxivity of Gd³⁺ in saline (4.5 mM⁻¹ s⁻¹) [20] is lower when compared to Gd³⁺ bound to the
surface of a PFC nanoparticle (33.7 mM⁻¹ s⁻¹) [18] at a field strength of 1.5 Tesla.
Considering that each nanoparticle carries approximately 50 000 to 100 000 Gd³⁺ ions,
the particle relaxivity has been measured at over 2 000 000 mM⁻¹ s⁻¹ [18]. The high
level of relaxivity achieved using this paramagnetic liquid PFC nanoparticle
allows for the detection and quantification of nanoparticle concentrations as low as
100 picomolar, with a CNR of 5 : 1 [16].
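The particle relaxivity quoted above is essentially the per-ion relaxivity multiplied by the gadolinium payload; a one-line check (the payload of 60 000 ions per particle is an assumed mid-range value, not a figure from the study):

```python
# Ionic surface relaxivity of Gd3+ (mM^-1 s^-1, from the text) times an
# assumed mid-range payload of Gd3+ ions per particle (text: 50 000-100 000).
ionic_r1 = 33.7
ions_per_particle = 60_000
particle_r1 = ionic_r1 * ions_per_particle
print(f"particle relaxivity ~ {particle_r1:,.0f} mM^-1 s^-1")  # ~2,000,000
```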
2.3.3
Perfluorocarbon Nanoparticles for Fluorine (¹⁹F) Imaging and Spectroscopy
The intensity of a magnetic resonance signal is directly proportional to the gyromagnetic ratio (γ) and the number of nuclei in the volume of interest [21]. Although
there are seven medically relevant nuclei, the ¹H proton is the most commonly
imaged nucleus in clinical practice because of its high γ and natural abundance.
The isotopes, their γ-values, natural abundance and relative sensitivity compared to
¹H at a constant field are listed in Table 2.1. With a gyromagnetic ratio second
only to ¹H and a natural abundance of 100%, ¹⁹F is an attractive nucleus for
MRI [22]. Its sensitivity is 83% (when compared to ¹H) at a constant field strength
and with an equivalent number of nuclei. In biological tissue, low ¹⁹F concentrations
(in the range of micromoles) make MRI impractical at clinically relevant field
strengths without ¹⁹F-specific contrast agents [23]. Perfluorocarbon nanoparticles
are 98% perfluorocarbon by volume, which for perfluoro-octylbromide (1.98 g ml⁻¹,
498.97 g mol⁻¹) equates to an approximately 100 M concentration of fluorine within a
nanoparticle. The paucity of endogenous fluorine in biological tissue allows the use
of exogenous PFC nanoparticles as an effective ¹⁹F MR contrast agent, without
any interference from significant background signal. When combined with local
drug delivery, detection of the ¹⁹F signal serves as a highly specific marker for the
presence of nanoparticles, permitting a quantitative assessment of drug
dosing.
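The fluorine molarity inside a PFOB droplet follows from the density and molar mass quoted above; a rough order-of-magnitude check (the assumption that perfluoro-octylbromide is C8F17Br, with 17 fluorine atoms per molecule, is supplied here and is not stated in the text):

```python
# Order-of-magnitude check of the fluorine molarity inside a PFOB nanoparticle.
# Assumes perfluoro-octylbromide is C8F17Br (17 fluorine atoms per molecule).
density_g_per_L = 1.98 * 1000                         # 1.98 g ml^-1
molar_mass = 498.97                                   # g mol^-1
pfob_molarity = density_g_per_L / molar_mass          # ~4 M PFOB
fluorine_molarity = pfob_molarity * 17                # ~68 M fluorine
print(f"~{fluorine_molarity:.0f} M fluorine")         # tens of molar, i.e. the
# same order as the ~100 M figure quoted in the text.
```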
¹⁹F has seven outer-shell electrons rather than a single electron (as is the case for
hydrogen); as a result, the range of chemical shifts, and their sensitivity to the details of
the local environment, are much higher for fluorine than for hydrogen.
Table 2.1 Medically relevant MRI nuclei.

Isotope   γ (MHz T⁻¹)   Natural abundance (%)   Relative sensitivity
¹H        42.58         99.98                   1.00
¹⁹F       40.05         100                     0.83
²³Na      11.26         100                     0.093
³¹P       17.24         100                     0.066
¹³C       10.71         1.11                    0.015
²H        6.54          0.015                   0.0097
¹⁵N       4.31          0.37                    0.0010
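The relative sensitivity column follows from the standard NMR result that, at constant field and for equal numbers of nuclei, sensitivity scales as γ³I(I + 1). A quick verification (the spin quantum numbers I are standard physical constants supplied here; they are not listed in the table):

```python
# Relative NMR sensitivity at constant field: S ~ gamma^3 * I*(I+1), normalized to 1H.
nuclei = {  # name: (gamma in MHz/T, spin quantum number I)
    "1H": (42.58, 0.5), "19F": (40.05, 0.5), "23Na": (11.26, 1.5),
    "31P": (17.24, 0.5), "13C": (10.71, 0.5), "2H": (6.54, 1.0),
    "15N": (4.31, 0.5),
}
g_h, i_h = nuclei["1H"]
ref = g_h**3 * i_h * (i_h + 1)
for name, (g, i) in nuclei.items():
    s = g**3 * i * (i + 1) / ref
    print(f"{name:>4}: {s:.4f}")   # reproduces the table column (e.g. 23Na -> 0.093)
```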
Consequently, distinct spectra from different PFC species can be obtained and utilized for the
simultaneous targeting of multiple biochemical markers.
For use as a clinically applicable contrast agent, the biocompatibility of PFC
nanoparticles must be considered. Liquid PFCs were first developed for use as a
blood substitute [24], and no toxicity, carcinogenicity, mutagenicity or teratogenic
effects have been reported for pure fluorocarbons within the 460 to 520 molecular-weight range. Perfluorocarbons, which are biologically inert, are removed via the
macrophage phagocytic system and excreted primarily through the lungs and in
small amounts through the skin, as a consequence of their high vapor pressure
relative to their mass [25]. The tissue half-lives of PFCs range from 4 days for
perfluoro-octylbromide up to 65 days for perfluorotripropylamine. The prolonged
systemic half-life of PFC nanoparticles, in conjunction with the local concentrating
effect produced by ligand-directed binding, permits ¹⁹F spectroscopy and imaging
studies to be conducted at clinically relevant magnetic field strengths.
2.3.4
Fibrin-Imaging for the Detection of Unstable Plaque and Thrombus
Of the over 720 000 cardiac-related deaths that occur each year in the United States,
approximately 63% are classified as sudden cardiac death [26]. Unfortunately, for the
majority of patients, this is the first and only symptom of their atherosclerotic
heart disease [27]. Atherosclerosis manifests initially as a fatty streak but, without
proper treatment, it can progress to a vulnerable plaque that is characterized by a large
lipid core, a thin fibrous cap and macrophage infiltrates [28]. These vulnerable
plaques are prone to rupture, which can lead to thrombosis, vascular occlusion and
subsequent myocardial infarction [29] or stroke. Routine angiography is the most
common method of diagnosing atherosclerotic heart disease, with patients showing
high-grade lesions (>70% stenosis) being referred for immediate therapeutic
intervention. Ironically, most ruptured plaques originate from coronary lesions
classified as nonstenotic [28]. Even nuclear and ultrasound-based stress tests are
only designed to detect flow-limiting lesions. Because the most common source of
thromboembolism is atherosclerotic plaques with 50–60% stenosis [30],
diagnosis by traditional techniques remains elusive. In addition, there appears to be a
window of opportunity between the detection of a vulnerable or ruptured
plaque and acute myocardial infarction (measured in a few days to months) [31],
when intervention could prove to be beneficial.
The acoustic enhancement of thrombi using fibrin-targeted nanoparticles was
first demonstrated in vitro as well as in vivo in a canine model at frequencies typically
used in clinical transcutaneous scanning [32]. The detection of thrombi was later
expanded to MRI in a study by Flacke et al. [7]. Fibrin clots were targeted in vitro with
paramagnetic nanoparticles and imaged using typical low-resolution T1-weighted
proton imaging protocols at a field strength of 1.5 Tesla. Low-resolution images
show the effect of increasing the amount of Gd³⁺ incorporated in the nanoparticles: a
higher gadolinium loading results in brighter T1 signals from the fibrin-bound PFC
nanoparticles (Figure 2.3). In the same study, in vivo MR images were obtained of
Figure 2.3 Low-resolution images (three-dimensional, T1-weighted) of control and fibrin-targeted clots with paramagnetic
nanoparticles, presenting a homogeneous, T1-weighted
enhancement.
fibrin clots in the external jugular vein of dogs. Enhancement with fibrin-targeted
PFC nanoparticles produced a high signal intensity in treated clots (1780 ± 327),
whereas the control clot exhibited a signal intensity (815 ± 41) similar to that of the
adjacent muscle (768 ± 47).
This method was extended to the detection of ruptured plaque in human carotid
artery endarterectomy specimens resected from a symptomatic patient (Figure 2.4).
Fibrin depositions ('hot spots') were localized to microfissures in the shoulders of the
ruptured plaque in the targeted vessel (where fibrin was deposited), but this was not
appreciated in the control. Further investigation towards the molecular imaging of
small quantities of fibrin in ruptured plaque may someday detect this silent pathology
sooner, in order to pre-empt stroke or myocardial infarction.
Figure 2.4 Color-enhanced magnetic resonance imaging of fibrin-targeted and control carotid endarterectomy specimens, revealing
contrast enhancement (white) of a small fibrin deposit on a
symptomatic ruptured plaque. The black area shows a calcium
deposit. Three-dimensional, fat-suppressed, T1-weighted fast
gradient echo.
The high fluorine content of fibrin-targeted PFC nanoparticles, as well as the lack
of background signal, can also be exploited for ¹⁹F MRI and spectroscopy. In a recent
study conducted by Morawski et al., several methods were described for quantifying
the number of nanoparticles bound to a fibrin clot using the ¹⁹F signal [16]. First,
¹⁹F spectra of fibrin-targeted paramagnetic perfluoro-crown-ether nanoparticles and of trichlorofluoromethane were obtained (Figure 2.5a). The relative crown ether signal
intensity (with respect to the trichlorofluoromethane peak) from known emulsion
volumes provided a calibration curve for nanoparticle quantification (Figure 2.5b).
The perfluorocarbon (crown ether) nanoparticles were then mixed in titrated
ratios with fibrin-targeted nanoparticles containing safflower oil and bound to
plasma clots in vitro. As the competing amount of nonsignaling safflower-oil agent
was increased, there was a linear decrease in the ¹⁹F and Gd³⁺ signals. The number
of bound nanoparticles was calculated from the ¹⁹F signal and the calibration curve
described above, and compared with the mass of Gd³⁺ as determined by neutron
activation analysis. As expected, there was excellent agreement between the measured
Gd³⁺ mass and the number of bound nanoparticles (calculated from the ¹⁹F signal)
(Figure 2.5c).
In addition, clots were treated with fibrin-targeted nanoparticles containing either
of two distinct PFC cores: crown ether and perfluoro-octyl bromide (PFOB) [33].
These exhibited two distinct ¹⁹F spectra at a field strength of 4.7 Tesla, and the signal
from the sample was strongly correlated with the ratio of PFOB and crown ether emulsion
applied. These findings demonstrated the possibility of the simultaneous imaging
and quantification of two separate populations of nanoparticles, and hence two
distinct biomarkers.
These quantification techniques were applied to the analysis of human carotid
endarterectomy samples (see Figure 2.6). An optical image of the carotid reveals
extensive plaques, wall thickening and luminal irregularities. Multislice ¹⁹F images
showed high levels of signal enhancement along the luminal surface due to the
binding of targeted paramagnetic nanoparticles to fibrin deposits (not shown in
Figure 2.6). The ¹⁹F projection images of the artery, taken over approximately 5 min,
showed an asymmetric distribution of fibrin-targeted nanoparticles around the vessel
wall, corroborating the signal enhancement observed with ¹H MRI. Concomitant
visualization of ¹H and ¹⁹F images would combine anatomical and
pathological information in a single image. In theory, the atherosclerotic plaque
burden could be visualized with paramagnetic PFC contrast-enhanced ¹H images,
while ¹⁹F could be used to localize and identify plaques with high levels of fibrin that are
thus prone to rupture.
2.3.5
Detection of Angiogenesis and Vascular Injury
Figure 2.7 Schematic depicting the αvβ3 targeting aspect and the
Gd³⁺ component that are incorporated into the lipid shell of the
liquid perfluorocarbon nanoparticle agent.
2.4
Perfluorocarbon Nanoparticles as an Ultrasound Contrast Agent
Using the bubbles produced by agitating saline, Gramiak and Shah introduced the
concept of an ultrasonic contrast agent in 1968 [40]. Today, commercially available
ultrasound contrast agents are based on gas-filled encapsulated microbubbles
(average diameter 2–5 µm) that transiently enhance the blood pool signal, which
is otherwise weakly echogenic. When insonified by an ultrasound wave, microbubbles improve the grayscale images and Doppler signal via three distinct mechanisms [41–43]. First, at lower acoustic power, microbubbles are highly efficient
scatterers due to their large differences in acoustic impedance (Z = ρc, where ρ is
the mass density and c is the speed of sound) compared to the surrounding tissue
or blood [44]. With increasing acoustic energies, microbubbles begin nonlinear
oscillations and emit harmonics of the fundamental (incident) frequency, thus
behaving as a source of sound, rather than as a passive reflector [44, 46]. As biological
tissue does not display this degree of harmonics, the contrast signal can be exploited
to preferentially image microbubbles and improve signal-to-noise ratios (SNRs).
At even higher acoustic power, the destruction of microbubbles occurs, allowing
the release of free gas bubbles. Although not desirable for most forms of imaging,
this results in a strong but transient scattering effect and provides the most sensitive
detection of microbubbles. Emphasizing these strong echogenic properties, it has
been shown that even a single microbubble can be detected with medical ultrasound
systems [47]. Interestingly, the destruction and cavitation of microbubbles by ultrasound waves have been shown to facilitate drug delivery by sonoporating membranes and allowing drugs and gene therapy to enter the cell [48, 49]. When this
process occurs in capillary beds, the permeability increases, allowing a subset of
particles access to the surrounding tissue for further drug deposition [50].
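The impedance-mismatch mechanism can be made concrete with the pressure reflection coefficient R = (Z2 − Z1)/(Z2 + Z1) for a plane interface; a minimal sketch with generic textbook values (the densities and sound speeds below are illustrative assumptions, not data from this chapter):

```python
# Pressure reflection coefficient at a plane interface, R = (Z2 - Z1) / (Z2 + Z1),
# with acoustic impedance Z = rho * c. Illustrative textbook values only.
def impedance(rho, c):               # rho in kg/m^3, c in m/s
    return rho * c

z_blood = impedance(1060, 1570)      # ~1.66 MRayl
z_gas   = impedance(1.2, 343)        # ~0.0004 MRayl (microbubble gas core)

r = (z_gas - z_blood) / (z_gas + z_blood)
print(f"|R| at a blood/gas interface ~ {abs(r):.3f}")   # ~1: nearly total reflection
```

The near-unity reflection coefficient at a gas interface is what makes microbubbles such efficient scatterers relative to soft tissue, where adjacent impedances differ by only a few percent.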
The wide use of microbubbles in everyday clinical applications highlights their
effectiveness as a blood pool agent [45]. For example, microbubbles enhance the
blood–tissue boundary of the left ventricular cavity, allowing for a better diagnostic
yield in resting as well as stress echocardiograms [51]. Improved Doppler signals are
beneficial in the diagnosis of valvular stenosis and regurgitation [52]. Additionally,
microbubbles are removed from the circulation via the macrophage phagocytic
system and accumulate in the liver and spleen; that is, passive targeting. This
mechanism can be employed for the detection of focal liver lesions and malignancies [48, 53]. When used as targeted contrast agents, microbubbles have been
conjugated with ligands for a variety of vascular biomarkers including integrins
expressed during angiogenesis, the glycoprotein IIb/IIIa receptor on activated
platelets in clots, and L-selectin for the selective enhancement of peripheral lymph
nodes, in vivo [54–56]. One disadvantage associated with the targeting of microbubbles is the tethering of these particles to a surface. This interaction with a solid
structure limits the ability of insonified microbubbles to oscillate, and dampens their
echogenicity.
Unlike microbubble formulations that are naturally echogenic, liquid PFC nanoparticles have a weaker inherent acoustic reflectivity, and suspensions of them
have been shown to exhibit backscattering levels 30 dB below that of whole blood [57].
However, when collective deposition occurs on the surface of a tissue or cell in
a layering effect, these particles create a local acoustic impedance mismatch
that produces a strong ultrasound signal, without any concomitant increase in the
background level [58]. The echogenicity of the nanoparticles does not depend upon the
generation of harmonics, and therefore is not affected by binding to molecular
epitopes. Due to their small size and inherent in vivo stability, PFC nanoparticle
emulsions have a long circulatory half-life compared to microbubble contrast agents.
This is accomplished without modification of their outer lipid surfaces with PEG or
the incorporation of polymerized lipids, which may detract from the targeting
efficacy. Acquired data have suggested that the PFC nanoparticles remain bound
to the tissues for up to 24 h. In addition, nongaseous PFOB-filled nanoparticles
neither deform easily nor cavitate during ultrasound imaging.
The successful detection of cancer in vivo depends on a variety of factors when
using molecularly targeted contrast agents. The number of epitopes to which the
ligand can bind must be sufficient to allow enough of the contrast agent to
accumulate for detection, while the ligand specificity must be maintained to ensure
that nonspecific binding remains negligible. As stated above, the background signal
from unbound, circulating contrast agent is low enough (or even absent) so as to not
interfere with the assessment of bound, targeted agent. Previous studies have already
demonstrated the use of high-frequency ultrasound in epitope-rich pathologies,
such as fibrin in thrombus, where targeted PFC nanoparticles can act as a suitable
molecular imaging agent by modifying the acoustic impedance of the surface to
which they bind in a configuration that is well approximated by a reflective layer [59].
However, at lower frequencies and for sparse molecular epitopes, in the typically
tortuous vascular bed associated with the advancing front of a growing tumor, the
clear delineation between nontargeted normal tissue and angiogenic vessels
remains a challenge. The imaging technology itself must be highly sensitive and
capable of detecting and/or quantifying the level of contrast agent bound to the
pathological tissue. In clinical ultrasonic imaging, the sensitivity of detection
depends on a physical difference in the way sound interacts with a surface covered
by targeted contrast agent versus one that is not. The data presented below show that,
in many cases, the sensitivity of this determination can be improved by applying
novel and specific signal-processing techniques based on thermodynamic or information-theoretic analogues.
Site-targeted nanoparticle contrast agents, when bound to the appropriate receptor,
must be detected in the presence of bright echoes returned from the surrounding
tissue. One approach to the challenge of detecting the acoustic signature of site-specific contrast is through the use of novel signal receivers (i.e. mathematical
operations that reduce an entire RF waveform, or a portion of it, to a single number)
based on information-theoretic quantities, such as the Shannon entropy (H), or its
counterpart for continuous signals (Hf). These receivers have been shown to be
sensitive to diffuse, low-amplitude features of the signal that often are obscured by
noise, or else are lost in large specular echoes and, hence, not usually perceivable
by a human observer [60–64].
For a signal digitized to 256 levels occurring with probabilities pk, the Shannon entropy is

$$H_S = -\sum_{k=0}^{255} p_k \log p_k. \qquad (2.7)$$
While this quantity has demonstrated utility for signal characterization [60], it also
has the undesirable feature that it depends critically on the attributes of the digitizer
used to acquire the data. This dependence may be removed by taking the limit
where the sampling rate and dynamic range are taken to infinity [61, 62]. In that case,
the probabilities, pk, are replaced by the density function, wf(y), of the signal f(t). While the
Shannon entropy HS becomes infinite in this limit, we may extract a finite portion of
it, called Hf, that is also useful for signal characterization.
This well-behaved quantity can be expressed as
fmax
wf y log wf ydy:
2:8
Hf
f min
This quantity has been shown to be very sensitive to local changes in backscattered
ultrasound that arise from the accumulation of targeted nanoparticles in the acoustic
field of view [46, 57, 65]. In contrast to most methods used to construct medical
images, the waveform f(t) does not directly enter the expression used to compute pixel
values. Instead, the density function of the waveform is used.
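As a concrete illustration of such an entropy receiver, the sketch below estimates wf(y) from a sampled waveform with a histogram and then evaluates Eq. (2.8) numerically. A simple histogram estimator is one of several possible choices; it is an assumption made here for illustration, not the authors' exact implementation:

```python
import numpy as np

def entropy_receiver_Hf(waveform: np.ndarray, bins: int = 128) -> float:
    """Estimate Hf = -integral of w_f(y) * log(w_f(y)) dy for one RF waveform,
    using a histogram approximation of the signal's density function w_f."""
    w, edges = np.histogram(waveform, bins=bins, density=True)
    dy = np.diff(edges)
    mask = w > 0                      # the log is only evaluated where w_f > 0
    return float(-np.sum(w[mask] * np.log(w[mask]) * dy[mask]))

# Example: a synthetic RF trace (noise plus a weak, localized echo).
t = np.linspace(0.0, 1.0, 2048)
rf = (0.05 * np.random.randn(t.size)
      + 0.2 * np.sin(2 * np.pi * 40 * t) * np.exp(-((t - 0.5) / 0.05) ** 2))
print(f"Hf = {entropy_receiver_Hf(rf):.3f}")
```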
2.4.2
The Density Function wf(y)
The density function wf(y) corresponds to the density functions that are the primary
mathematical objects in statistical signal processing, and the description from which
other mathematical quantities are subsequently derived (e.g. mean values, variances,
covariances) [66–68]. In that setting, the density function constitutes the most
fundamental description of the signal; it is defined implicitly by the requirement that

$$\int_0^1 \phi(f(t))\,dt = \int_{f_{\min}}^{f_{\max}} \phi(y)\,w_f(y)\,dy \qquad (2.9)$$

for any continuous function φ(y). This should be compared with the expression
for the expectation value of a function φ of a random variable X with density pX(x),
which is given by

$$E[\phi(X)] = \int \phi(x)\,p_X(x)\,dx,$$

which explains why wf(y) is referred to as the density function for f(t) [69]. If we choose
φ(x) = x², then

$$\int_0^1 f(t)^2\,dt = \int_{f_{\min}}^{f_{\max}} y^2\,w_f(y)\,dy. \qquad (2.10)$$
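The defining property can be checked numerically for the choice φ(x) = x² in Eq. (2.10); both sides agree to within discretization error (a small self-contained check, not code from the original study):

```python
import numpy as np

# Check Eq. (2.10): integral of f(t)^2 dt equals integral of y^2 * w_f(y) dy.
t = np.linspace(0.0, 1.0, 200_000)
f = np.sin(2 * np.pi * 3 * t)                        # any bounded test signal on [0, 1]

lhs = np.trapz(f**2, t)

w, edges = np.histogram(f, bins=400, density=True)   # histogram estimate of w_f(y)
y = 0.5 * (edges[:-1] + edges[1:])
rhs = np.sum(y**2 * w * np.diff(edges))

print(f"lhs = {lhs:.4f}, rhs = {rhs:.4f}")           # both ~0.5 for a unit-amplitude sine
```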
transducer, with an F-number of 2 (diameter 3 mm, focal length 6 mm). Radiofrequency ultrasonic backscatter waveforms, corresponding to a region 80 mm
wide × 30 mm deep, were digitized at time points 0, 15, 30 and 60 min after
administration of the nanoparticle contrast agent. All of these RF data were
processed off-line to reconstruct images using information-theoretic (entropy-based)
and conventional (energy-based) receivers. Image segmentation was performed
automatically using a threshold which excluded 93% of the area under the
composite histogram for all data sets. The mean value of the segmented pixels was
computed at each time point post injection.
A diagram depicting the placement of the transducer, gel standoff and mouse ear is
shown in the left side of Figure 2.9, together with a representative B-mode grayscale
image (i.e. the logarithm of the analytic signal magnitude). The labels indicate the
location of the skin (top of image insert), the structural cartilage in the middle of
the ear, and, a short distance below this, the echo from the skin at the bottom of the
ear. To the right of this ultrasound image is a histological view of an HPV mouse
ear that has been magnified 20-fold to permit a better assessment of the thickness
and architecture of the sites where the αvβ3 integrin-targeted nanoparticles might
attach (red by β3 staining). Both the skin and the tumor are visible in the image.
On either side of the cartilage (center band in image), extending to the dermal–epidermal junction, is the stroma, which is filled with neoangiogenic microvessels.
These microvessels are also decorated with targeted αvβ3 nanoparticles, as indicated
by the fluorescent image (labeled, upper right of Figure 2.9) of a bisected ear from
an αvβ3-injected K14-HPV16 transgenic mouse. It is in this region that the
αvβ3-targeted nanoparticles are expected to accumulate, as indicated by the presence
of red β3 stain in the magnified image of an immunohistological specimen also
shown in the image.
selected using NIH ImageJ (http://rsb.info.nih.gov/ij/), and the mean value of the
pixels lying below the threshold was computed for each of the images acquired at 0,
15, 30, 45 and 60 min post injection. The mean value at zero minutes was subtracted
from the values obtained for all subsequent times, to obtain a sequence of changes
in receiver output as a function of time post injection. This was done for all four
animals injected with targeted nanoparticles, and also for the four control animals.
The sequences of relative changes were then averaged over the targeted and control
groups to obtain a sequence of time points for the change in receiver output for both
groups of animals. The threshold of 93% was finally chosen as it produced the
smallest p-value (0.00043) for a t-test comparing the mean values of the ROI at 15 min
versus 60 min. The corresponding p-value for the control group was 0.27.
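The segmentation-and-averaging procedure just described can be summarized in a few lines of code; the sketch below (function and variable names are illustrative inventions, not the study's code) thresholds each receiver image on the composite histogram and tracks the baseline-subtracted mean of the retained pixels over time:

```python
import numpy as np

def enhancement_time_series(images: dict[float, np.ndarray]) -> dict[float, float]:
    """images: time post injection (min) -> receiver image (e.g. Hf values).
    A threshold is chosen on the composite histogram so that 93% of its area
    is excluded; the mean of the retained (low-valued) pixels is tracked over
    time, with the 0-min value subtracted as baseline."""
    composite = np.concatenate([im.ravel() for im in images.values()])
    threshold = np.percentile(composite, 7)          # keeps ~7% of the area

    means = {t: im[im <= threshold].mean() for t, im in sorted(images.items())}
    baseline = means[0.0]
    return {t: m - baseline for t, m in means.items()}

# Usage with synthetic stand-in images (a slow drift mimics enhancement):
rng = np.random.default_rng(0)
frames = {t: rng.normal(5.0 - 0.002 * t, 1.0, (128, 128))
          for t in (0.0, 15.0, 30.0, 45.0, 60.0)}
print(enhancement_time_series(frames))
```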
The average change, with time after injection, of the mean value of the enhanced
regions of Hf images obtained from all eight of the animals reported in the study is
shown in Figure 2.11a [6]. As these data show, the mean value, or enhancement,
obtained in the targeted group increased steadily with time. After 30 min the mean
value of enhancement was measurably different from baseline values (p < 0.005).
Moreover, the values at 15 and 60 min were also statistically different (p < 0.005).
The corresponding results obtained from control animals that were injected with
nontargeted nanoparticles are shown in Figure 2.11b. There was no discernible
trend in the group, and the last three time points were not statistically different
from zero. A comparison of the enhancement measured at 15 and 60 min yielded
a p-value 0.27. The enhancement in Hf observed after 60 min for representative
instances of targeted and nontargeted nanoparticles is shown in Figure 2.12.
These images were generated by overlaying a 93% thresholded version of the Hf
using a look-up table (LUT) on top of the conventional grayscale B-mode image.
For comparison, and to illustrate the potential value of the entropy-based analysis,
the corresponding results obtained using the log[Ef ] analysis are shown in
Figure 2.13a for the same data used to generate Figure 2.11. These data were obtained
by computing the mean value of pixels lying below the 93% threshold at each time
point (0, 15, 30, 45 and 60 min) for each animal (four injected with targeted
nanoparticles, and four with nontargeted nanoparticles), as discussed above. Unlike
the entropy case, the values at 15 and 60 min were not statistically different (p = 0.10).
Figure 2.13b shows the corresponding result obtained from the control group
of animals that were injected with nontargeted nanoparticles. There was no discernible trend in the group, and the last three time points were not statistically different
from zero.
2.4.4
Targeting of MDA-435 Tumors
abundance (red regions), although not exclusively, between the skin and the tumor
capsule. The close proximity of these binding sites to the skin–transducer interface is
one of the primary obstacles that must be overcome by any quantitative detection
scheme intended to determine the extent of this region. Accordingly, the acoustic
portion of the experiment was designed to maximize system sensitivity near this
interface. This was carried out in order to maximize the opportunity to detect
nanoparticles targeted towards the angiogenic neovasculature. It also provided a stringent test of the ability of the Hf entropy-based metric to separate signals near the
confounding skin–tissue interface, which was one of the primary goals of this study.
Nine animals were injected with targeted nanoparticles, and seven with nontargeted nanoparticles to serve as controls. Each mouse was preanesthetized
with ketamine, after which an intravenous catheter was inserted into the right
jugular vein to permit the injection of nanoparticles (either αvβ3-targeted or untargeted). The mouse was then placed on a heated platform maintained at 37 °C, and
anesthesia was administered continually with isoflurane gas through a nose cone.
the mean values increased in an approximately linear fashion versus time. The plot
of control experiments showed there was no significant change in enhancement
with time in these animals. However, a careful visual inspection of the image
sequence revealed measurable changes in tumor shape and position that most
likely were induced by respiration and relaxation of the animal over the 2-h
experiment.
Figure 2.16 Quantitative comparison of enhancement and B-mode images obtained using targeted nanoparticles. No
significant changes were observed with either untargeted or
targeted nanoparticles. All plots have vertical units of nepers, as
the image analysis was performed on noise-scaled or normalized
images (which are unitless). As explained in the text, this does not
alter the quantitative conclusion presented in these plots.
2.4.5
In Vivo Tumor Imaging at Clinical Frequencies
ultrasound in the frequency range between 7 and 15 MHz [79]. These investigators
employed a liquid PFC nanoparticle conjugated to an αvβ3 peptidomimetic to target
the expression of αvβ3 in Vx-2 tumors implanted in the hindquarters of New Zealand
White rabbits (n = 9). Anesthesia was administered continually with isoflurane
gas, the animals were injected with a whole-body dose of 0.66 ml kg⁻¹ of nanoparticle
emulsion, after which the ultrasound data were acquired at 0, 15, 30, 60 and 120 min.
Six control rabbits were also imaged using the same methodology, but were not
injected with nanoparticles. Beam-formed RF data were acquired using a modified
research version of a clinical ultrasound system (Philips HDI-5000). Data were
analyzed for all rabbits at all times post injection, using three different techniques:
(i) conventional grayscale; (ii) Hf (an entropy-based quantity); and (iii) log[Ef ] (i.e. the
logarithm of the signal energy, Ef). Representative image data are shown in
Figure 2.18, depicting the tumor in cross-section. A paired t-test comparing Hf
image enhancement obtained at 0 and 120 min for the rabbits injected with targeted
nanoparticles indicated a significant difference (p < 0.005). For control rabbits there
was no significant difference between 0 and 120 min (p = 0.54). Conventional
grayscale imaging at the fundamental frequency and log[Ef ] imaging failed to detect
a coherent signal, and did not show any systematic pattern of signal change.
2.5
Contact-Facilitated Drug Delivery and Radiation Forces
2.5.1
Primary and Secondary Radiation Forces
$$f\!\left(\frac{\rho}{\rho_0},\frac{c}{c_0}\right) = \frac{5\rho - 2\rho_0}{2\rho + \rho_0} - \frac{\rho_0 c_0^2}{\rho c^2} \qquad (2.12)$$
Here, V0 is the sphere volume, PA is the acoustic pressure amplitude, ρ and ρ0 are
the sphere and fluid densities, respectively, and c and c0 are the corresponding sound
velocities in the sphere and the fluid.
Figure 2.18 Images produced from beam-formed RF acquired from a rabbit injected with
αvβ3-targeted nanoparticles. The three rows
show composite images formed by the
application of three different signal-processing
techniques: Hf and the log[Ef ] signal receiver (both
applied with a moving window), and conventional grayscale.
$$F(r) = -\frac{x}{r^4}, \qquad (2.13)$$

where

$$x = \frac{2\pi(\rho - \rho_0)^2}{3\rho_0}\,a^3 b^3\,u_0^2, \qquad (2.14)$$

and a, b are the sphere radii, r is their separation distance, and u0 is the velocity
amplitude of the suspending medium.
The action of these forces in vivo is to concentrate the suspended nanoparticles and
push them away from the acoustic source (i.e. away from the center of arterial flow
and onto the capillary wall), as shown in Figure 2.19. This effect has the
potential to increase their therapeutic efficacy.
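To give a feeling for the magnitudes involved, the sketch below evaluates the secondary-force prefactor x and the resulting interparticle force for two identical nanoparticles. All physical values are generic assumptions, and the 1/r⁴ form of Eq. (2.13) is taken exactly as reconstructed above:

```python
import numpy as np

# Secondary radiation force between two spheres, as reconstructed in
# Eqs (2.13)-(2.14): F(r) = -x / r^4, x = 2*pi*(rho - rho0)^2 * a^3 * b^3 * u0^2 / (3*rho0).
# All numbers below are illustrative assumptions, not values from the chapter.
rho, rho0 = 1980.0, 1000.0       # PFC particle and water densities (kg/m^3)
a = b = 125e-9                   # sphere radii (250 nm nominal diameter)
u0 = 0.01                        # velocity amplitude of the medium (m/s)

x = 2 * np.pi * (rho - rho0) ** 2 * a**3 * b**3 * u0**2 / (3 * rho0)
for r in (0.5e-6, 1e-6, 5e-6):   # separation distances (m)
    print(f"r = {r * 1e6:4.1f} um  ->  |F| = {x / r**4:.2e} N")
```

The steep 1/r⁴ dependence means the mutual attraction is negligible for dilute suspensions but grows rapidly as the insonified particles are concentrated near a vessel wall.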
2.5.2
In Vitro Results
Besides detecting sparse epitopes for noninvasive imaging, PFC nanoparticles are
capable of specifically and locally delivering drugs and other therapeutic agents
through a novel process known as contact-facilitated drug delivery [12]. The direct
transfer of lipids and drugs from the nanoparticle's surfactant layer to the cell
membrane of the targeted cell is usually a slow and inefficient process. However,
through ligand-directed targeting this process can be accelerated by minimizing the
separation of the lipids and surfaces, and increasing the frequency and duration
of the lipid–surface interactions (see Figures 2.20 and 2.21). Spatial localization
(via high-resolution ¹⁹F-enhanced MRI) and quantification of the nanoparticles
(via ¹⁹F spectroscopy) permit the local therapeutic concentrations to be estimated.
Thus, PFC nanoparticles can be used for detection, therapy and treatment
monitoring.
As an example, in vitro vascular smooth muscle cells were treated with tissue
factor-targeted PFC nanoparticles containing 0, 0.2 or 2.0 mol% doxorubicin or
paclitaxel, or an equivalent amount of drug in buffer solution alone [83]. After
targeting for only 30 min, proliferation was inhibited for three days, while in vitro
dissolution studies revealed that the nanoparticles' drug release persisted for
more than one week. High-resolution MRI at a 4.7 Tesla field strength showed
that the image intensity of the targeted vascular smooth muscle cells was twofold
higher compared to nontargeted cells. In addition, the fluorine signal amplitude
at 4.7 Tesla was unaffected by the presence of surface gadolinium, and was linearly
related to the nanoparticle concentration.
Figure 2.20 Schematic representation illustrating contact-facilitated drug delivery. The phospholipids and drugs within the
nanoparticle's surface exchange with the lipids of the target
membrane through a convection process, rather than diffusion, as
is common among other targeted systems.
2.6
Conclusions
References
1 Cyrus, T. (2006) Nanomaterials for Cancer
Therapy and Diagnosis (eds S. Challa and
S. Kumar), Wiley-VCH, Weinheim, p. 121.
2 Lanza, G.M., Lorenz, C.H., Fischer, S.E.,
Scott, M.J., Cacheris, W.P., Kaufmann,
R.J., Gaffney, P.J. and Wickline, S.A. (1998)
Academic Radiology, 5, S173.
3 Marsh, J.N., Partlow, K.C.,
Abendschein, D.R., Scott, M.J.,
Lanza, G.M. and Wickline, S.A. (2007)
Ultrasound in Medicine and Biology, 33, 950.
4 Hughes, M.S., Marsh, J.N., Hall, C.S.,
Savery, D., Scott, M.J., Allen, J.S.,
Lacy, E.K., Carradine, C., Lanza, G.M. and
Wickline, S.A. (2004) Proceedings IEEE
Ultrasonics Symposium, 04CH37553,
p. 1106.
5 Hughes, M.S., Marsh, J.N., Zhang, H.,
Woodson, A.K., Allen, J.S., Lacy, E.K.,
Carradine, C., Lanza, G.M. and
Wickline, S.A. (2006) IEEE Transactions
on Ultrasonics, Ferroelectrics, and Frequency
Control, 53, 1609.
6 Hughes, M.S., McCarthy, J., Marsh, J.N.,
Arbeit, J., Neumann, R., Fuhrhop, R.,
Wallace, K., Znidersic, D., Maurizi, B.,
Baldwin, S., Lanza, G.M. and Wickline,
S.A. (2007) Journal of the Acoustical Society
of America, 121, 3542.
7 Flacke, S., Fischer, S., Scott, M.J.,
Fuhrhop, R.J., Allen, J.S., McLean, M.,
Winter, P., Sicard, G.A., Gaffney, P.J.,
Wickline, S.A. and Lanza, G.M. (2001)
Circulation, 104, 1280.
44 Szabo, T.L. (2004) Diagnostic Ultrasound
Imaging: Inside Out, Elsevier Academic
Press, Burlington, MA.
45 Goldberg, B.B., Raichlen, J.S. and
Forsberg, F. (eds) (2001) Ultrasound
Contrast Agents: Basic Principles and
Clinical Applications, Martin Dunitz,
London.
46 Leighton, T.G. (1997) The Acoustic Bubble,
Academic Press, San Diego.
47 Klibanov, A.L., Rasche, P.T., Hughes, M.S.,
Wojdyla, J.K., Galen, K.P., Wible, J.H. and
Brandenburger, G.H. (2004) Investigative
Radiology, 39, 187.
48 Blomley, M., Cooke, J., Unger, E.,
Monaghan, M. and Cosgrove, D. (2001)
British Medical Journal, 322, 1222.
49 Shohet, R., Chen, S., Zhou, Y., Wang, Z.,
Meidell, R., Unger, R. and Grayburn, P.
(2000) Circulation, 101, 2554.
50 Price, R., Skyba, D.M.P., Kaul, S.M. and
Skalak, T.C.P. (1998) Circulation, 98, 1264.
51 Cheng, T.D., S.C. and Feinstein, S. (1998)
Contrast echocardiography: review and
future directions. American Journal of
Cardiology, 81, 41.
52 Terasawa, A., Miyatake, K., Nakatani, S.,
Yamagishi, M., Matsuda, H. and Beppu, S.
(1993) Journal of the American College of
Cardiology, 21, 737.
53 Harvey, C., Blomley, M., Eckersley, R.,
Cosgrove, D., Patel, N., Heckemann, R.
and Butler-Barnes, J. (2000) Radiology, 216,
903.
54 Leong-Poi, H., Christiansen, J.,
Klibanov, A., Kaul, S. and Lindner, J. (2003)
Circulation, 107, 455.
55 Schumann, P., Christiansen, J., Quigley,
R., McCreery, T., Sweitzer, R., Unger, E.,
Lindner, J. and Matsunaga, T. (2002)
Investigative Radiology, 37, 587.
56 Hauff, P., Reinhardt, M., Briel, A.,
Debus, N. and Schirner, M. (2004)
Radiology, 231, 667.
57 Hughes, M.S., Marsh, J.N., Arbeit, J.,
Neumann, R., Fuhrhop, R.W., Lanza, G.M.
and Wickline, S.A. (2005) Proceedings
IEEE Ultrasonics Symposium,
05CH37716, p. 617.
3
Nanoparticles for Cancer Detection and Therapy
Biana Godin, Rita E. Serda, Jason Sakamoto, Paolo Decuzzi, and Mauro Ferrari
3.1
Introduction
3.1.1
Cancer Physiology and Associated Biological Barriers
The tumor vasculature is extremely heterogeneous, with necrotic and hemorrhagic areas neighboring regions with a dense vascular network, formed as a result
of angiogenesis triggered to sustain a sufficient supply of the oxygen and nutrients
necessary for tumor growth and progression [5, 9]. Tumor blood vessels are
architecturally and structurally different from their normal counterparts. The
vascular networks that are formed in response to tumor growth are not organized
into definitive venules, arterioles and capillaries as in the normal circulation,
but rather share chaotic features of all of them. Furthermore, the blood flow
in tumor vessels is irregular, sluggish, and sometimes oscillating. Angiogenic
vessels possess several abnormal features, such as a comparatively high percentage
of proliferating endothelial cells, an insufficient number of pericytes, an enhanced
tortuosity, and the formation of an atypical basal membrane. As a result, the tumor
vasculature is more permeable, with the pore cut-off size ranging from 380 to
780 nm in different tumor models [10, 11]. The hemoglobin in the erythrocytes
is oxygen-starved, which makes the microenvironment profoundly hypoxic.
The tumor environment is also nutrient-deficient (e.g. in glucose), acidic (owing
to lactate production from anaerobic glycolysis), and under oxidative stress [3, 9].
Although the molecular controls of the above abnormalities are not fully elucidated, they may be attributed to the imbalanced expression and function of
angiogenic factors. Various mediators can affect angiogenesis as well as vascular
permeability. Among these are vascular endothelial growth factor (VEGF), nitric
oxide, prostaglandins and bradykinin. Macromolecules can traverse through
neoplastic vessels using one of the following pathways: vasculature fenestrations;
interendothelial junctions; transendothelial channels (open gaps); and vesicular
vacuolar organelles [9]. The tumor vasculature, in being formed de novo during the
angiogenic process, possesses a number of characteristic markers which are not
seen on the surface of normal blood vessels, and can serve as therapeutic targets
(these will be discussed later).
The interstitial compartment of solid tumors is mainly composed of a crosslinked
structure of collagen and elastic fibers. Interstitial fluid and high-molecular-weight
gelling constituents, such as hyaluronate and proteoglycans, are interspersed
within the above network. The characteristic feature of the interstitium, which
distinguishes it from the majority of normal tissues, is the intrinsically high pressure
resulting from the absence of an anatomically well-defined and operating lymphatic
network, as well as an apparent convective interstitial fluid flow. These parameters
present additional biobarriers towards the penetration of a therapeutic agent into the
cancer cells, as the transport of an anticancer molecule or nanovector in this tumor
subcompartment will be governed by physiological (pressure) and physico-chemical
(charge, lipophilicity, composition, structure) properties of the interstitium and the
agent itself [4, 5].
The cellular subcompartment accounts for the actual cancerous cell mass. The
barriers directly related to the cellular compartment are generally categorized in
terms of alterations in the biochemical mechanisms within the malignant cells
that make them resistant to anticancer medications. Among these biochemical shifts
are the P-glycoprotein efflux system, which is responsible for multidrug resistance,
and the impaired structure of specific enzymes (e.g. topoisomerase). Moreover, in
order to treat the disease efficiently, a cytotoxic agent should be able to cross the
cytoplasmic and nuclear membranes, a far from trivial deed for basic drugs that are
ionizable within an acidic tumor environment [12, 13].
As mentioned above, following their administration, therapeutic agents encounter
a multiplicity of biological barriers that adversely impact their ability to reach the
intended target at the desired concentrations [5–8, 14]. This problem is largely
decoupled from the ability of agents to recognize and selectively bind to the target,
that is, by the use of antibodies, aptamers or ligands. In other words, despite their
high specificity these agents invariably present with concentrations at target sites that
are vastly inferior to what is expected on the basis of molecular recognition alone. The
biodistribution profiles for conventional chemotherapeutic agents are equally adverse, if not worse, leading to a plethora of unwanted toxicities and collateral effects at
the expense of the therapeutic action (i.e. a decreased therapeutic index). The
reticuloendothelial system (RES), which comprises immune cells and organs such as
the liver and spleen, presents an important physiological biobarrier, causing an
efficient clearance of the agent from the bloodstream. Other barriers of an epithelial and
endothelial nature, for example the blood–brain barrier, are based on tight junctions,
which significantly limit the paracellular transport of agents and owe their molecular
discrimination to several mechanisms and proteins (occludin, claudin, desmosomes,
zonula occludens).
To summarize, some of the most challenging biobarriers, and the main causes of
tumor resistance to therapeutic intervention, include physiological noncellular and
cellular barriers, such as the RES, epithelial/endothelial membranes and drug
extrusion mechanisms, and biophysical barriers, which include interstitial pressure
gradients, transport across the extracellular matrix (ECM), and the expression and
density of specific tumor receptors.
3.1.2
Currently Used Anticancer Agents
Since the pathology of cancer involves the dysregulation of endogenous and frequently essential cellular processes, the treatment of malignancies is extremely
challenging. The vast majority of presently used therapeutics exploit the fact that
cancer cells replicate faster than most healthy cells. Thus, most of these agents do not
differentiate greatly between normal and tumor cells, thereby causing systemic
toxicity and adverse side effects. More selective agents, which include monoclonal
antibodies and anti-angiogenic agents, are now available, and the efficacy of
these medications is still under evaluation in various types of tumor. Since cancers
arising from certain tissues, including the mammary and prostate glands, may
be inhibited or stimulated by appropriate changes in hormone balance, several
malignancies may also respond to hormonal therapy. Various groups of anticancer
therapeutics are exemplified below.
3.1.2.1 Chemotherapy
Chemotherapy, or the use of chemical agents to destroy cancer cells, is a mainstay
in the treatment of malignancies. The modern era of cancer chemotherapy was
launched during the 1940s, with the discovery by Louis S. Goodman and Alfred
Gilman of nitrogen mustard, a chemical warfare agent, as an effective treatment for
blood malignancies [15, 16].
Through a variety of mechanisms, chemotherapy affects cell division, DNA
synthesis, or induces apoptosis. Consequently, more aggressive tumors with high
growth fractions are more sensitive to chemotherapy, as a larger proportion of the
targeted cells are undergoing cell division at any one time. A chemotherapy agent may
function in only one phase (G1, S, G2 and M) of the cell cycle (when it is called cell
cycle-specic), or be active in all phases (cell cycle-nonspecic). The majority of
chemotherapeutic drugs can be categorized as alkylating agents (e.g. cisplatin,
carboplatin, mechlorethamine, cyclophosphamide, chlorambucil), antimetabolites
(e.g. azathioprine, mercaptopurine), anthracyclines (daunorubicin, doxorubicin,
epirubicin, idarubicin, valrubicin), plant alkaloids (vinca alkaloids and taxanes) and
topoisomerase inhibitors (irinotecan, topotecan, amsacrine, etoposide) [1719].
The lack of any great selectivity by chemotherapeutic agents between cancer and
normal cells is apparent when considering the adverse effect proles of most
chemotherapy drugs [18, 19]. Hair follicles, skin and the cells that line the gastrointestinal tract are some of the fastest growing cells in the human body, and therefore
are most sensitive to the effects of chemotherapy. It is for this reason that patients
may experience hair loss, rashes and diarrhea, respectively. As these agents do not
possess favorable pharmacokinetic proles to localize specically into the tumor
tissue, they become evenly distributed throughout the body, with resultant adverse
side effects and other toxic reactions that greatly limit their dosage.
3.1.2.2 Anti-Angiogenic Therapeutics
The publication of Judah Folkman's imaginative hypothesis in 1971 launched the current research area of anti-angiogenic therapy for cancer [20], although more than three decades elapsed before the Food and Drug Administration (FDA) approved the first anti-angiogenic drug, bevacizumab (a humanized monoclonal antibody directed against VEGF) [21, 22]. The first clinical trials with this agent, when used in combination with standard chemotherapy, showed enhanced survival in metastatic colorectal cancer and advanced non-small-cell lung cancer [23, 24]. Another group of anti-angiogenic therapeutics, also approved by the FDA, is based on small-molecule receptor tyrosine kinase inhibitors (RTKIs) which target VEGF receptors, platelet-derived growth factor (PDGF) receptors and other tyrosine kinase-dependent receptors [25]. Examples of agents in this group are sorafenib and sunitinib; these orally administered medications have been shown to be effective as monotherapy in the treatment of metastatic renal cell cancer and hepatocellular carcinoma [26–28]. However, the survival benefits of these treatments are relatively modest (usually measured in months). Additionally, the treatments are costly [29] and have toxic side effects [30–34].
3.1.2.3 Immunotherapy
While tumor cells are ultimately derived from normal progenitor cells, transformation to a malignant phenotype is often accompanied by changes in antigenicity. Antibodies are amazingly selective, possessing the natural ability to produce a cytotoxic effect on target cells. The immune system was first appreciated over 50 years ago for its ability to recognize malignant cells and defend against cancer, when Pressman and Korngold [35] showed that antibodies could distinguish efficiently between normal and tumor tissues. These results were confirmed during the 1960s by Burnet [36], who also showed that neoplasms are actually formed only when lymphocytes lose the capability of differentiating between normal and malignant cells. These studies laid the foundation for modern monoclonal antibody (mAb)-based cancer therapy. The expression of tumor-associated antigens can arise through a variety of mechanisms, including alterations in glycosylation patterns [37], the expression of virally encoded genes [38], chromosomal translocations [39], or an overexpression of cellular oncogenes [40, 41]. The first challenge in the development of efficient mAb-based therapeutics is the detection of an appropriate and specific tumor-associated antigen. Some examples of mAbs used for cancer therapy are given below.
Hematologic malignancies, which possess fewer barriers capable of preventing mAbs from accessing their target antigens, are well suited for mAb therapy. Following intravenous injection and distribution throughout the vascular space, therapeutic antibodies may easily access their targets on the surface of malignant blood cells. Many of these B- and T-cell surface antigens, such as CD20, CD22, CD25, CD33 or CD52, are expressed only on a particular family of hematopoietic cells [42, 43]. These antigens are also expressed at high levels on the surface of various populations of malignant cells, but not on normal tissues or hematopoietic progenitor cells. Rituxan (rituximab; Genentech), the chimeric antibody which binds to the CD20 B-lymphocyte surface antigen, was among the first of the mAbs to receive FDA approval for the treatment of non-Hodgkin's lymphoma [44]. Alemtuzumab (Campath-1; Ilex Oncology), which recognizes CD52 antigens present on normal B and T lymphocytes, has also received FDA approval for the treatment of patients suffering from chronic lymphocytic leukemia.
The successful treatment of solid tumors with mAb therapeutics has proved to be more elusive than that of hematological malignancies, although some significant therapeutic benefits have been achieved. The failure of mAbs in the treatment of these malignancies is primarily attributable to an insufficient level of injected mAb actually reaching its target within a tumor mass. The results of several studies using radiolabeled mAbs have suggested that only a very small percentage of the original injected antibody dose (0.01–0.1% per gram of tumor tissue) is ever able to reach target antigens within a solid tumor [45–47]. These low in vivo concentrations are due to the series of biobarriers (see above) that an intravenously administered mAb encounters en route to its specific antigens on the surface of cancer cells. Herceptin (trastuzumab; Genentech) is a humanized antibody marketed for the treatment of metastatic breast cancer. This mAb recognizes an extracellular epitope of the HER-2 protein, which is highly overexpressed in approximately 25–30% of invasive
breast tumors [40, 41]. It is noteworthy that HER-2 expression on breast cancer cells can be as much as 100-fold higher than on normal breast epithelial cells. Clinical trials with Herceptin have shown it to be well tolerated, both as a single agent for second- or third-line therapy and in combination with chemotherapeutic agents as a first line of therapy. Combination therapy resulted in a 25% improvement in overall survival among patients with HER-2-overexpressing tumors that are refractory to other forms of treatment [48, 49].
The levels of prostate-specific membrane antigen (PSMA), a transmembrane protein expressed primarily on the plasma membrane of prostatic epithelial cells [50], are elevated in virtually all cases of prostatic adenocarcinoma, with maximum expression levels observed in metastatic disease and androgen-independent tumors [50–53]. Due to this behavior, PSMA has become an important biomarker for prostate cancer, and antibodies to PSMA are currently being developed for the diagnosis and imaging of recurrent and metastatic prostate cancer, as well as for the therapeutic management of malignant disease [53–56].
Another mAb, used for the treatment of colorectal cancer, is edrecolomab (Panorex; GlaxoSmithKline), directed against the epithelial cell adhesion molecule. Today, many other immunotherapeutics are being used in the clinic or are undergoing various stages of clinical trials. Beyond their pronounced therapeutic potential, these agents can be efficiently combined with nanovectors to enhance targeting of the latter to cancer tissues.
3.1.2.4 Issues and Challenges
As mentioned above, currently used conventional cancer therapies have several drawbacks that result in pronounced toxicity and poor treatment efficacy. At the same time, current diagnostic techniques do not allow for the competent detection of various malignancies, and do not reflect the vast clinical heterogeneity of the condition. Targeted approaches will ultimately increase treatment efficiency while decreasing toxicity to normal cells and tissues; thus, specific drug delivery in cancer treatment is of prime importance. As opposed to cancers of the blood, solid malignancies possess several unique characteristics, such as extensive blood vessel growth (angiogenesis), damaged vascular architecture and enhanced permeability, and impaired lymphatic flow and drainage. All of the above can serve as effective therapeutic targeting mechanisms, as well as for the passive homing of agents into the tumor tissue by means of various delivery systems.
To summarize, current issues and unmet needs in translational oncology include:
. the pronounced toxicity and limited efficacy of conventional cancer therapies;
. diagnostic techniques that detect malignancies too late and do not reflect their clinical heterogeneity;
. the absence of specific, targeted delivery of therapeutic agents to the tumor tissue.
Progress in the above-listed fields will lay the major cornerstones for a yet-to-come personalized tumor therapy and an early and predictive diagnosis of the disease.
Later in this chapter we will describe the currently available and under-development carriers and vectors from the nano-toolbox, and critically discuss the benefits and weaknesses of these systems for the design of specific, personalized and targeted medications. The benefits of rationally designing nanovectors to efficiently negotiate biobarriers, and various aspects of the preclinical characterization of nanoscale systems, will also be discussed.
3.2
Nanotechnology for Cancer Applications: Basic Definitions and Rationale for Use
The first generation (Figure 3.1a) comprises a delivery system that homes to the action site governed by passive mechanisms. In the case of liposomes as a nanovector, the mode of tumor localization is based on the enhanced permeation and retention (EPR) effect, which drives the system to localize in the tumor through fenestrations in the adjacent neovasculature. Some of these carriers are surface-modified with a stealth layer [e.g. polyethylene glycol (PEG)] which prevents their uptake by the RES, and thus substantially prolongs the particles' circulation time [63].
Later in this chapter we will focus on each of the three generations of nanovectors, discussing their pros and cons, and presenting various examples of these technologies.
3.3
First-Generation Nanovectors and their History of Clinical Use
Today, the first-generation nanovectors that passively localize in tumor sites represent the only generation of nanomedicines broadly represented in the clinical situation. These systems are generally designed to achieve long circulation times for therapeutics and an enhanced accumulation of the drug in the target tissue. This is achieved through a pronounced extravasation of the carrier-associated therapeutic agent into the interstitial fluid at the tumor site, exploiting the locally increased vascular permeability, i.e. the EPR effect (Figure 3.2). An additional physiological factor which contributes to the EPR effect is impaired lymphatic function, which impedes the clearance of the nanocarriers from their site of action [69–71]. The localization in this case is driven only by the particle's nanodimensions, and is not related to any specific recognition of the tumor or neovascular targets.
In order to prolong their circulation time, these systems are generally decorated on their surface with a stealth layer (e.g. PEG) which prevents their uptake by phagocytic blood cells and the organs of the RES [63, 72, 73]. The most pronounced
Figure 3.2 Mechanism of passive tumor targeting by enhanced permeation and retention (EPR).
representatives of this generation in clinical use are liposomes, which are the leaders among the nanocarriers used in clinics. These self-assembling structures, first described by Bangham in 1965 [74], are composed of one or several lipid bilayers surrounding an aqueous core. This structure imparts an ability to encapsulate molecules that possess different degrees of lipophilicity; lipophilic and amphiphilic drugs will localize in the bilayers, while water-soluble molecules will concentrate in the hydrophilic core. The first drug to benefit from being encapsulated within this delivery system was doxorubicin. Today, various companies market liposomal doxorubicin formulations, but Myocet (non-PEGylated liposomes) and Doxil (PEGylated liposomes) were among the first systems in clinical use [71, 75]. The pronounced advantages of liposomally encapsulated doxorubicin are illustrated by its pharmacokinetic performance: the elimination half-life of the free drug is only 0.2 h, but this increases to 2.5 and 55 h, respectively, when non-PEGylated and PEGylated liposomal formulations are administered. Moreover, the area under the plasma concentration-time profile (the AUC), which indicates the bioavailability of an agent following its administration, is increased 11- and 200-fold for Myocet and Doxil, respectively, compared to the free drug [76]. Encapsulation in the liposomal carrier also causes a significant reduction in the most serious adverse side effect of doxorubicin, namely cardiotoxicity, as demonstrated in clinical trials [71, 75–77]. Liposomal doxorubicin is currently approved for the treatment of various malignancies, including Kaposi's sarcoma, metastatic breast cancer, advanced ovarian cancer and multiple myeloma. Other liposomal drugs which are either currently in use or are being evaluated in clinical trials include non-PEGylated liposomal daunorubicin (DaunoXome) and vincristine (Onco-TCS), and PEGylated liposomal cisplatin (SPI-77) and lurtotecan (OSI-211) [78, 79].
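The reported pharmacokinetic gains can be loosely sanity-checked with a minimal sketch, assuming a one-compartment model with first-order elimination and an unchanged volume of distribution; the model and its assumptions are ours, not data from the cited trials.

```python
# One-compartment sanity check of the reported PK gains (an assumption,
# not trial data): for a fixed dose D and volume of distribution V,
# clearance CL = ln(2) * V / t_half and AUC = D / CL, so the AUC scales
# linearly with the elimination half-life.

t_half_free = 0.2    # h, free doxorubicin (value from the text)

for name, t_half in [("Myocet", 2.5), ("Doxil", 55.0)]:
    fold = t_half / t_half_free   # predicted AUC fold-change
    print(f"{name}: predicted AUC increase ~{fold:.0f}-fold")

# Prints ~12-fold (Myocet) and ~275-fold (Doxil) -- the same order as the
# reported 11- and 200-fold increases; the residual discrepancy reflects
# the multi-compartment kinetics of the real formulations.
```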
Other systems in this category include metal nanoparticles for use in diagnostics, albumin-paclitaxel nanoparticles approved for use in metastatic breast cancer, and drug-polymer constructs.
Nanoscale particles can act as contrast agents for all radiological imaging approaches. Iron oxide particles provide a T2-mode negative contrast for magnetic resonance imaging (MRI), while gold nanoparticles can be used to enhance the contrast in X-ray and computed tomography (CT) imaging, in a manner which is essentially proportional to their atomic number. A disparity in mechanical impedance is the origin of the contrast in ultrasound imaging provided by materials that are either more rigid (metals, ceramics) or much softer (microbubbles) than the surrounding tissue. The very existence of better contrast agents can drive the development of new imaging modalities. The emergence of nanocrystalline quantum dots has generated great interest in novel optical imaging technologies. The architecture and composition of quantum dots provide tunable emission properties that resist photobleaching. By concentrating preferentially at tumor sites through an EPR mechanism, nanoparticles which comprise a contrast material can provide an enhanced definition of anatomical contours and location, as well as of the extent of disease. In addition, if coupled with a biological recognition moiety they can further offer molecular distribution information for the diagnostician [78].
Albumin-bound paclitaxel (Abraxane) was granted FDA approval in 2005. Paclitaxel is a highly lipophilic molecule that was previously formulated for injection with Cremophor, a toxic surfactant, under the trade name Taxol. In a multicenter Phase II clinical trial involving 4400 women with metastatic breast cancer, Abraxane (30-min infusion, 260 mg m⁻²) proved more beneficial in terms of treatment efficiency and reduction in side effects than the free drug (3-h infusion, 175 mg m⁻²) [80]. Albumin-bound methotrexate is currently being evaluated in the clinical situation.
Although the next group to be discussed does not have a particulate nature, these agents, drug-polymer cleavable constructs, have also been considered as nanoengineered objects. In 1975, Ringsdorf proposed a new concept of drug-polymer constructs that could be conjugated by using a linker with a certain degree of selectivity, and which would be stable in blood but cleaved in the acidic or enzymatic environment of a tumor site, or within an acidic intracellular compartment (e.g. endosomes) [81]. Some 20 years later, in 1994, doxorubicin conjugated to poly(N-(2-hydroxypropyl)methacrylamide) (PHPMA) through an enzymatically cleavable tetrapeptide spacer (GFLG) became the first polymeric construct to enter clinical trials [70, 78]. This system significantly improved the therapeutic index of the drug, as indicated by a 4–5-fold higher maximum tolerated dose of the drug-polymer conjugate when compared to doxorubicin alone [82].
3.4
Second-Generation Nanovectors: Achieving Multiple Functionality at the Single
Particle Level
fluorescent intensity than quantum dots, with light emission in the near-infrared window, which is very appropriate for in vivo imaging. When conjugated to tumor-targeting ligands such as single-chain variable fragment (scFv) antibodies, the conjugated nanoparticles were able to target tumor biomarkers such as the epidermal
growth factor (EGF) receptor on human cancer cells and in xenograft tumor models,
with a 10-fold higher accumulation for targeted particles (see Figure 3.4b).
The nanovectors in the second subclass of this generation include responsive systems, such as pH-sensitive polymers or those activated by disease site-specific enzymes, as well as a diverse group of remotely activated vectors. Among the most interesting examples here are gold nanoshells that are activated by near-infrared light, or iron oxide nanoparticles triggered by oscillating magnetic fields [92, 93]. Other techniques used to remotely activate second-generation particulates include ultrasound and radiofrequency (RF) [94–96]. Linking nanoshells to antibodies that recognize cancer cells enables these novel systems to seek out their cancerous targets before near-infrared light is applied to heat them up. For example, in a mouse model of prostate cancer, nanoparticles functionalized with 2'-fluoropyrimidine RNA aptamers that recognize the extracellular domain of PSMA, and loaded with docetaxel as a cytostatic drug, were used for targeting and destroying cancer cells [97, 98]. Another new approach is based on the coupling of nanoparticles to siRNA, used to silence specific genes responsible for malignancies. By using targeted nanoparticles, it was shown that the delivered siRNA can slow the growth of tumors in mice, without eliciting the side effects often associated with cancer therapies. Although representatives of the second generation have not yet been approved by the FDA, there are today numerous ongoing clinical trials involving targeted nanovectors, especially in cancer applications.
3.5
Third-Generation Nanoparticles: Achieving Collaborative Interactions Among
Different Nanoparticle Families
reaches the intended site, permitting the overwhelming majority of the highly toxic, nondiscriminating, systemically dispersed poison to manifest in a number of undesirable side effects associated with cancer chemotherapy. This familiar scenario was quantitated in a study of Kaposi's sarcoma, which showed the percentage concentration of doxorubicin in Kaposi's sarcoma lesions to be 0.001% [99]. This therapeutic phenomenon does not appear to be a tumor-specific challenge, however, and is therefore applicable to the lion's share of malignancies and tumor types [5, 100–102].
Some of the above-mentioned and most notable challenges include physiological barriers (i.e. the RES, epithelial/endothelial membranes, cellular drug extrusion mechanisms) and biophysical barriers (i.e. interstitial pressure gradients, transport across the ECM, the expression and density of specific tumor receptors, and ionic molecular pumps). Biobarriers are sequential in nature, and therefore the probability of reaching the therapeutic objective is the product of the individual probabilities of overcoming each and every one of them [8]. The requirement for a therapeutic agent to be provided with a sufficient collection of weaponry to conquer all barriers, yet still be small enough for safe vascular injection, is the major challenge faced by nanotechnology [14]. Once injected, nanoscale drug delivery systems (or nanovectors) are ideal candidates for the time-honored problem of optimizing the therapeutic index of treatment, that is, maximizing efficacy while reducing adverse side effects.
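The multiplicative argument of [8] is worth a short illustration; the per-barrier probabilities below are invented purely for illustration, and only the product structure is the point.

```python
from math import prod

# Hypothetical per-barrier success probabilities for an intravenously
# injected agent. The individual values are assumptions; biobarriers are
# sequential, so the overall probability is their product (see [8]).
p_barriers = {
    "RES evasion": 0.5,
    "endothelial crossing": 0.3,
    "ECM transport": 0.4,
    "cell entry": 0.5,
    "efflux evasion": 0.6,
}

p_overall = prod(p_barriers.values())
print(f"overall probability of reaching the target: {p_overall:.3f}")
# -> 0.018: under 2% overall, even though no single barrier is catastrophic.
```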
The ideal injected chemotherapeutic strategy is envisioned to be capable of navigating through the vasculature after intravenous administration, reaching the desired tumor site at full concentration, and selectively killing cancer cells with a cocktail of agents with minimal harmful side effects. Third-generation nanoparticle strategies represent the first wave of next-generation nanotherapeutics that are specifically equipped to address biological barriers so as to improve payload delivery at the tumor site. By definition, third-generation nanoparticles have the ability to perform a time sequence of functions involving the cooperative coordination of multiple nanoparticles and/or nanocomponents. This novel generation of nanotherapeutics is exemplified by the employment of multiple nano-based products that synergistically provide distinct functionalities. In this chapter, the nanocomponents will include any engineered or artificially synthesized nanoproducts, including peptides, oligonucleotides (e.g. thioaptamers, siRNA) and phage with targeting peptides. Naturally existing biological molecules, such as antibodies, will be excluded from this designation, despite their ability to be synthesized.
Third-generation approaches have been developed to address the numerous challenges responsible for reducing the chemotherapeutic efficacy of earlier strategies. For example, modification of the exterior surface of nanoparticles with PEG has proven effective in increasing the circulation time within the bloodstream; however, this preservation tactic proves detrimental to the biological recognition and targeting ability of the nanovector [103]. To avoid such paradoxical approaches, in which an improvement in one property debilitates another aspect of the therapeutic delivery system, many research groups are combining multiple nanotechnologies to exploit the additive contributions of the constituent components. One example of third-generation nanoparticles is the biologically active molecular networks known as
favorable physical characteristics, the Stage 1 particle can be surface-treated with modifications such as PEG for RES avoidance, and also equipped with biologically active targeting moieties (e.g. aptamers, peptides, phage, antibodies) to enhance tumor-targeting specificity. This approach decouples the challenges of: (i) transporting therapeutic agents to the tumor-associated vasculature; and (ii) delivering therapeutic agents to cancer cells. The Stage 1 particles shoulder the burden of efficiently transporting a nanoparticle payload to the tumor site within the nanoporous structures of their interior (Figure 3.5). The nanoparticles, called Stage 2 particles, generically represent any nanovector construct within the approximate diameter range of 5 to 100 nm. The Stage 1 particles have demonstrated the ability to rapidly load (within seconds) and gradually release (within hours) multiple nanoparticles (i.e. single-walled carbon nanotubes and quantum dots) during in vitro experiments, with complete biodegradation within 24–48 h, depending on the pore density [106]. Furthermore, unpublished preliminary data have demonstrated the ability to deliver liposomes and other nanovectors, as well as indications of the successful in vivo delivery of Stage 2 nanoparticles to tumor masses in xenograft murine models.
The multistage drug delivery system is emblematic of third-generation nanoparticle technology, since the strategy combines numerous nanocomponents to deliver multiple nanovectors to a tumor lesion. The Stage 1 particle is rationally designed to have a hemispherical shape to enhance particle margination within blood vessels, and to increase particle/endothelium interaction so as to maximize the probability of active tumor targeting and adhesion [107]. In addition to improved hemodynamic physical properties and active biological targeting through nanocomponents such as aptamers and phage, the Stage 1 particle can also present specific surface modifications in order to avoid RES uptake, and exhibits degradation rates predetermined by nanopore density. Upon tumor recognition and vascular adhesion, a series of nanoparticle payloads may be released in a sequential order predicated upon Stage 1 particle degradation rates and payload conjugation strategies (e.g. environmentally sensitive crosslinking techniques, pH, temperature, enzymatic triggers). The versatility of this platform nanovector, the multistage delivery particle, allows for a multiplicity of applications. Depending upon the nanoparticle cocktail loaded within the Stage 1 particle, this third-generation nanoparticle system can provide for the delivery not only of cytotoxic drugs but also of remotely activated hyperthermic nanoparticles, contrast agents and future nanoparticle technologies.
3.6
Nanovector Mathematics and Engineering
Third-generation particles are transported by the blood flow and interact with the blood vessel walls, both specifically, through the formation of stable ligand-receptor bonds, and nonspecifically, by means of short-ranged van der Waals, electrostatic and steric interactions. If suitable conditions are met in terms of a sufficiently high expression of vascular receptors and sufficiently low hydrodynamic shear stresses at the wall, particles may adhere firmly to the blood vessel walls and control cell uptake, either avoiding or favoring it depending on their final objective. Such an intravascular journey can be broken down into three fundamental events which form the cornerstone of the rational design, namely: the margination dynamics; the firm adhesion; and the control of internalization. The rational design of particles aims at identifying the dominant governing parameters in each of the above-cited events, in order to propose the optimal design strategy as a function of the biological target (diseased cell or environment).
In physiology, the term margination is conventionally used to describe the lateral drift of leukocytes and platelets from the core of the blood vessels towards the endothelial walls. This event is of fundamental importance as it allows an intimate contact between the circulating cells and the vessel walls, and in the case of leukocytes it is required for diapedesis. Similarly, rational particle design should aim at generating a marginating particle that spontaneously moves preferentially in close proximity to the blood vessel walls. Accumulating the particles in close proximity to the blood vessel walls is highly desirable both in vascular targeting and when the delivery strategy relies on the EPR approach. This occurs for two main reasons:
. The particles can sense the vessel walls for biological and biophysical diversities, such as the overexpression of specific vascular markers (vascular targeting) or the presence of sufficiently large fenestrations through which they extravasate (EPR-based strategy).
. The particles can more easily leave the larger blood vessels in favor of the smaller ones, thus accumulating in larger numbers within the microcirculation (Figure 3.6) [108].
Figure 3.6 Marginating particles are more likely to sense the vessel walls for biological and biophysical diversity, and more easily leave the vascular compartments through openings along the endothelium.
While leukocyte and platelet margination is an active process requiring an interaction with red blood cells (RBCs) and the dilatation of inflamed vessels with blood flow reduction [109], particle margination can only be achieved by proper rational design. It should be noted that the RBCs, the most abundant blood-borne cell population, have a behavior opposite to margination, accumulating preferentially within the core of the vessels. This has long been described by Fahraeus and Lindqvist [110], and is referred to as the plasma skimming effect. An immediate consequence of this phenomenon is the formation of a cell-free layer in the proximity of the wall, which varies in thickness with the size of the channel and the mean blood velocity. For example, it may be as large as a few tens of microns in arterioles (100 µm in diameter) and a few microns in capillaries (10 µm in diameter) [111]. Particles designed to marginate should accumulate and move in the cell-free layer, which is also characterized by an almost linear laminar flow.
The motion of spherical particles in a linear laminar flow has been described by Goldman et al. [112], who showed that the exerted hydrodynamic forces grow with the particle radius, and that no lateral drift would be observed unless an external force, such as a gravitational or magnetic force, or short-ranged van der Waals and electrostatic interactions, were applied [113]. In other words, a neutrally buoyant spherical particle moving in close proximity to a wall can drift laterally only if an external force is applied. Here, it is important to recall that the gravitational force has been shown to be relevant even for submicrometer polystyrene beads (relative density to water of 0.05 g cm⁻³), and that margination dynamics can be effectively controlled in horizontal channels by changing the size of the nonbuoyant nanoparticles [114].
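The magnitude of this gravity-driven drift can be estimated with the Stokes settling velocity; the sketch below uses parameter values chosen to match the text (a bead 0.05 g cm⁻³ denser than water, in water at room temperature), and is an order-of-magnitude estimate only.

```python
# Stokes settling velocity v = 2 * d_rho * g * r**2 / (9 * eta), applied
# to the nonbuoyant polystyrene beads discussed above. Parameter values
# are assumptions consistent with the text.

g = 9.81        # m s^-2
eta = 1.0e-3    # Pa s, viscosity of water
d_rho = 50.0    # kg m^-3, i.e. 0.05 g cm^-3 relative density

for r_nm in (100, 500, 1000):
    r = r_nm * 1e-9                              # radius in meters
    v = 2.0 * d_rho * g * r**2 / (9.0 * eta)
    print(f"r = {r_nm:4d} nm: settling velocity ~ {v*1e9:6.1f} nm/s")

# The r**2 scaling is why margination in horizontal channels can be tuned
# simply by changing the size of the nonbuoyant particles [114].
```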
On the other hand, nonspherical particles exhibit more complex motions, with tumbling and rolling that can be exploited to control their margination dynamics without any need for lateral external forces. The longitudinal (drag) and lateral (lift) forces, as well as the torque exerted by the flowing blood, depend on the size, shape and orientation of the particle relative to the stream direction, and change over time as the particle is transported. Considering an ellipsoidal particle with an aspect ratio of 2 (Figure 3.7a) in a linear laminar flow, the particle trajectory and its separation distance from the wall are shown in Figure 3.7b. Clearly, the particle motion is very complex, with periodic oscillations towards and away from the wall. Overall, however, the particle approaches the wall and interacts with its surface.
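The periodic character of this motion is captured by Jeffery's classical result for a spheroid tumbling in simple shear flow; the shear rate used below is an assumed, physiologically plausible value, not a number from the text.

```python
import math

# Jeffery orbit: a spheroid of aspect ratio AR in a linear shear flow of
# rate gamma tumbles with period T = (2*pi/gamma) * (AR + 1/AR).

def jeffery_period(aspect_ratio: float, shear_rate: float) -> float:
    """Tumbling period (s) of a spheroid in simple shear flow."""
    return (2.0 * math.pi / shear_rate) * (aspect_ratio + 1.0 / aspect_ratio)

shear_rate = 100.0   # s^-1, assumed value typical of small vessels
for ar in (1.0, 2.0, 5.0):
    T = jeffery_period(ar, shear_rate)
    print(f"AR = {ar:3.1f}: tumbling period = {T*1e3:5.0f} ms")

# A sphere (AR = 1) rotates steadily; elongated particles spend most of
# each orbit aligned with the flow, punctuated by rapid flips.
```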
For nonspherical particles, it has been shown that the lateral drift velocity is directly related to the aspect ratio [115, 116], with a maximum between the two extremes: a sphere, with aspect ratio unity, and a disk, with aspect ratio infinity. More recently, in vitro experiments have been conducted using spherical, discoidal and quasi-hemispherical particles of the same weight, injected into a parallel-plate flow chamber under controlled hydrodynamic conditions [117]. The experiments showed that, in a gravitational field, discoidal particles tend to marginate more than quasi-hemispherical particles, which in turn marginate more than spherical particles. Notably, these observations neglect the interaction of the particles with blood cells, in particular RBCs. However, this is a reasonable assumption as long as the particles are sufficiently smaller than RBCs and tend to accumulate within the cell-free layer.
Therefore, in the design of marginating particles, the geometric properties of size and shape are of fundamental importance.
A marginating particle moving in close proximity to the blood vessel walls can interact both specifically and nonspecifically with the endothelial cells, and eventually adhere firmly to them. Firm and stable adhesion is ensured as long as the dislodging forces (hydrodynamic forces and any other force acting to release the particle from
Figure 3.8 The longitudinal (drag) force (F) and the torque (T)
exerted over a particle adhering to a cell layer under flow.
the target cell) are balanced by specific ligand-receptor interactions and nonspecific adhesion forces arising at the cell-particle interface (Figure 3.8).
The strength of adhesion can be expressed in terms of an adhesion probability factor, Pa, defined as the probability of having at least one ligand-receptor bond formed under the action of the dislodging forces. The probability of adhesion decreases as the shear stress µS at the blood vessel wall and the characteristic size of the particle increase, and grows as the surface density ml of ligand molecules distributed over the particle surface and the surface density mr of receptor molecules expressed at the cell membrane increase. However, for a fixed-volume particle, that is to say, for a fixed payload, oblate particles with an aspect ratio γ larger than unity have a larger strength of adhesion, all other parameters being fixed [118]. Interestingly, for each particle shape, a characteristic size can be identified for which the probability of adhesion has a maximum, as shown in Figure 3.9. For small particles, the hydrodynamic forces are small, but the area of interaction at the particle-cell interface is also smaller; consequently, only a small number of ligand-receptor bonds is involved, which cannot withstand even a small dislodging force. For large particles, the number of ligand-receptor bonds that can be formed grows, but the hydrodynamic forces grow even more. The optimal size for adhesion, that is, the size for which Pa has a maximum, falls between these two limiting conditions.
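This trade-off can be illustrated with a deliberately minimal toy model; both the functional form and the length scale below are our assumptions, not the full model of [118].

```python
import math

# Toy model of the adhesion trade-off: the number of ligand-receptor
# bonds grows with the contact zone (~r), while the dislodging
# hydrodynamic force per bond also grows with r, penalizing adhesion
# exponentially. With P_a(r) ~ r * exp(-r / r0), the maximum sits at
# r = r0. Both the form and r0 are assumptions for illustration.

r0 = 0.5  # microns; scale of the hydrodynamic penalty (assumed)

def p_adhesion(r_um: float) -> float:
    return r_um * math.exp(-r_um / r0)

radii = [0.1 * k for k in range(1, 31)]        # 0.1 to 3.0 microns
best = max(radii, key=p_adhesion)
print(f"P_a is maximal near r = {best:.1f} um")  # -> 0.5 um
```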
As an example, when considering a capillary with a shear stress at the wall of µS = 1 Pa and a surface density of receptors mr = 100 µm⁻², the optimal radius for a spherical particle would be about 500 nm, with a total volume of 0.05 µm³, whereas the optimal volume for an oblate spheroidal particle with an aspect ratio γ = 2 would be more than 50 times larger (3.5 µm³) [118].
In particle adhesion, rational design should focus on the shape of the particle and on the type and surface density of the ligand molecules decorating the particle surface.
Once the particle has adhered to the target cell, it should be internalized if the aim is to release drugs or therapeutic agents within the cytosol or at the nuclear level (gene delivery). Alternatively, it should resist internalization if the target cell is used just as a docking site (vascular targeting) from which second-stage particles are released. The internalization rate is affected by the geometry of the particle and by the ligand-receptor bonds involved.
Freund and colleagues [119] developed a mathematical model for receptor-mediated endocytosis based on an energetic analysis. This showed that a threshold particle radius Rth exists, below which endocytosis can never occur, and that an optimal particle radius Ropt exists, slightly larger than Rth, for which internalization is favored with the maximum internalization rate, thus confirming (in theory) the above-cited experimental observations. This analysis was then generalized to account for the contribution of the surface physico-chemical properties, which may dramatically affect the internalization process, changing both Ropt and Rth significantly (Figure 3.10), as shown by Decuzzi and Ferrari [114].
Figure 3.10 The optimal radius Ropt and the wrapping time tw as a function of the nonspecific parameter F, which grows with the repulsive nonspecific interaction at the cell-particle interface.
A more recent theoretical model has been developed by Decuzzi and Ferrari for the receptor-mediated endocytosis of nonspherical particles [120]. This shows that elongated particles lying parallel to the cell membrane are less prone to internalization than spherical particles or particles lying normal to the cell membrane. The results show clearly that particle size and shape can be used to control the internalization process effectively, and that particles deviating slightly from the spherical shape are more easily internalized than elongated particles deviating severely from the classical spherical shape.
Even in the case of particle internalization, a judicious combination of surface physico-chemical properties and particle geometry would lead to a particle with optimized internalization rates, depending on the final biological application.
Finally, a mathematical model has recently been developed [121] that allows one to predict the adhesive and endocytotic performance of particulate systems based on three different categories of governing parameters: (i) geometric (the radius of the particle); (ii) biophysical (the ligand-to-receptor surface density ratio, the nonspecific interaction parameter, and the hydrodynamic force parameter); and (iii) biological (the ligand-receptor binding affinity). This finding has led to the definition of Design Maps through which three different states of the particulate system can be predicted: (i) no adhesion at the blood vessel walls; (ii) firm adhesion with no internalization by the endothelial cells; or (iii) firm adhesion and internalization (Figure 3.11) [121].
3.7
The Biology, Chemistry and Physics of Nanovector Characterization
3.7.1
Physical Characterization
Physical characterization includes assays for particle size, size distribution, molecular weight, density, surface area, porosity, solubility, surface charge density, purity, sterility, surface chemistry and stability. The mean particle size, that is, the hydrodynamic diameter, is determined by batch-mode dynamic light scattering (DLS) in aqueous suspensions. Care must be taken with these measurements, because they can be affected by other parameters. Precautions include cleaning the cuvettes with filtered, demineralized water; filtering the media with 0.1 µm pore-size membranes and pre-rinsing the cuvettes multiple times; measuring the scattering contribution of the media in the absence of analyte; optimizing the sample concentration; and filtering samples while loading them into the cuvette. The sample concentration should be optimized to avoid deterioration of the signal-to-noise ratio (SNR) at low concentrations, and particle interactions and multiple-scattering effects at high concentrations. Another precaution is to add only small amounts of monovalent electrolyte, in order to avoid salt effects on the electrical double layer surrounding the particles in the media. Again, concentration optimization is necessary for optimal measurements. In order to evaluate instrument performance, latex size standards are commercially available. When analyzing the data, the absolute viscosity and refractive index of the suspending medium are required to calculate the hydrodynamic diameter.
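The role of the medium's viscosity becomes clear from the Stokes-Einstein relation used to convert a DLS-measured diffusion coefficient into a hydrodynamic diameter; the diffusion coefficient in the sketch below is an assumed example value.

```python
import math

# Stokes-Einstein relation: d_H = k_B * T / (3 * pi * eta * D).
# This is why the absolute viscosity (and, for the scattering analysis,
# the refractive index) of the suspending medium must be supplied.

k_B = 1.380649e-23   # J K^-1, Boltzmann constant
T = 298.15           # K, measurement temperature
eta = 0.89e-3        # Pa s, viscosity of water at 25 C
D = 4.4e-12          # m^2 s^-1, example measured diffusion coefficient

d_H = k_B * T / (3.0 * math.pi * eta * D)
print(f"hydrodynamic diameter ~ {d_H*1e9:.0f} nm")   # ~ 110 nm here
```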
3.7.2
In Vitro Testing
assay to assess PLGA uptake [128]). Both fluorescence microscopy and flow cytometry rely on the attachment of fluorescent probes to the nanoparticle; alternatively, the latter technique may rely on changes in light scattering caused by the presence of internalized nanoparticles [127]. Controls are always essential to ensure that any intracellular fluorescence is not due to the uptake of dye that might have been released from the particles [128]. Factors affecting nanoparticle uptake include nanoparticle concentration, incubation time, nanoparticle size and shape, and culture media.
For multicomponent systems, targeting may be difficult to assess in vitro, especially for systems composed of nested particles, where each particle targets a specific and discrete population. Additional problems arise from the modification of particles with imaging agents. For example, the conjugation of fluorescent probes to the surface of particles alters the surface charge density of the particle, and may also mask the binding of ligands on the particle surface. This in turn alters the ability of particles to bind to the cell-surface receptors that are responsible for their uptake. In vitro targeting assays also need to consider the impact of serum opsonization on particle uptake [127]. Serum components are signals for immune cells, and may either activate cells or serve as bridges attaching particles to cells. For example, antibodies found in serum may bind to particles and mediate their uptake via Fc receptors found on specific cell populations. The end result can be dramatic, and may even completely alter which cell populations are able to internalize the particles.
To date, research investigations have shown that nanoparticles can stimulate and/or suppress the immune response [129]. Compatibility with the immune system is affected to a large degree by the surface chemistry. The cellular interaction of nanoparticles depends either on their direct binding to surface receptors or on the adsorption of serum components to the particles and their subsequent interaction with cell receptors [127]. The blood-contact properties of the nanomaterial and cell-based assays are used to determine the immunological compatibility of the device.
The blood-contact properties of nanoparticles are characterized by plasma protein binding, hemolysis, platelet aggregation, coagulation and complement activation.
Multiple tests are used to assess the effect of nanoparticles on plasma coagulation, including prothrombin time.
In vitro immunology assays also include cell-based assays such as colony-forming unit granulocyte-macrophage (CFU-GM), leukocyte proliferation, macrophage/neutrophil function and the cytotoxic activity of natural killer (NK) cells. The effect of nanoparticles on the proliferation and differentiation of murine bone marrow hematopoietic stem cells (HSC) is monitored by measuring the number of colony-forming units (CFU) in the presence and absence of nanoparticles. The effect of nanoparticles on lymphocyte proliferation is determined in a similar manner. The ability of nanoparticles to either induce or suppress proliferation is measured and compared to control induction by phytohemagglutinin. Macrophage/neutrophil function is measured by the analysis of phagocytosis, cytokine induction, chemotaxis and oxidative burst. As in the earlier targeting studies, nanoparticle internalization is measured, but with respect to classical phagocytic cells rather than to the target populations. A current phagocytosis assay utilizes luminol-dependent chemiluminescence, although alternative detection dyes must be used for nanoparticles that interfere with the measurements. Cytokine production induced by nanoparticles is measured using white blood cells isolated from human blood. Following particle incubation, the cell culture supernatants are collected and analyzed for the presence of cytokines using cytometric beads. The chemoattractant capacity of nanoparticles is measured using a cell migration assay; here, cell migration through a 3 µm filter towards the test nanoparticles is quantitated using a fluorescent dye. The final measure of macrophage activation is nitric oxide production, measured using the Griess reagent. NK-mediated cytotoxicity can be measured by radioactive release assays, in which labeled target cells release radioactivity upon cytolysis by NK cells. A newer label-free assay known as xCELLigence (available from Roche) measures the electrical impedance of cells attached to the bottom of a microtiter plate containing cell sensor arrays. In this system, any changes in cell morphology, number, size or attachment are detected in real time.
The final category of assays relies on in vivo animal testing. Under this umbrella are included disposition studies, immunotoxicity, dose-range-dependent toxicity and efficacy. The initial disposition of nanoparticles is characterized by tissue distribution, clearance, half-life and systemic exposure. In the NCL regime, immunotoxicity is measured as a 28-day screen and by immunogenicity testing (repeat dosing). Dose-dependent toxicity can be evaluated by monitoring blood chemistry, hematology, histopathology and gross pathology. Depending on the nature of the delivery system, the efficacy is measured either by imaging or by therapeutic impact.
One possible route of nanoparticle exposure within the work environment is inhalation, which in turn creates a need for additional studies that include animal inhalation and intratracheal instillation assays [130, 131]. These additional studies also elicit the need for even more characterization studies, such as determining the dispersion properties of nanoparticles. Hence, new methods to determine not only hazard and risk assessments but also therapeutic efficacy continue to be developed as new areas of concern arise. The careful characterization and optimized bioengineering of both nanoparticles and microparticles represent key contributors to the generation of nanomedical devices with optimal delivery and cellular interaction features.
3.8
A Compendium of Unresolved Issues
Unresolved issues and opportunities live in symbiosis. Programmatically, we welcome even the most daunting challenges, as their mere identification as such (not a simple task in most cases, and invariably one that requires the right timing and knowledge maturation) frequently happens when solutions are conceivable, or well within the reach of the scientific community. With this essentially positive outlook, we will list in this section some questions that appear daunting at this time, but are starting to present themselves with finer detail and resolution, indicating in our mind that readers in a few years, if any, will find them to be essentially resolved, and the
value of this section, if any, to be basically that of a message in a bottle across the seas
of time and reading patience.
1. The key issue for all systemically administered drugs (nanotechnological, biological and chemotherapeutical alike) is the management of biological barriers. Biological targeting is always helpful, under the assumption that a sufficient amount of the bioactive agent successfully navigates the sequential presentation of biological barriers. This is a very stringent and daunting assumption: essentially, the success stories of the pharmaceutical world correspond to the largely serendipitous negotiation of a subset of biological barriers, for a given indication, and in a sufficiently large subset of the population. The third-generation nanosystems described above are but a first step toward the development of a general, modular system that can systematically address the biological barriers in their sequential totality. We certainly expect that novel generations, and refinements of nanovector generations 1-3, will be developed to provide a general solution to the chief problem of biobarriers and biodistribution-by-design.
2. There is no expectation that any single, present or future biobarrier-management vectoring system will be applicable to all indications, or even to most. Personalization of treatment is the focus of great emphasis worldwide, with an overwhelming bias toward personalization by way of molecular biology. The vectoring problem, on the other hand, is a combination of biology, physics, engineering, mathematics and chemistry, with a substantial prevalence of the non-biological components of vector design. The evolution of nanotechnologies makes it conceivable that personalization of treatment will develop as a combination of biological methods and vector design based on the non-biological sciences. Foundational elements of the mathematics-based rational design of vectors were presented in the preceding sections. The missing link to personalized therapy at this time is the refinement of imaging technologies that can be used to identify the characteristics of the target pathologies lesion-by-lesion, at any given time, and with the expectation that a time evolution will occur, providing the basis for the synthesis of personalized vectors, which may then carry bioactive agents that may be further personalized for added therapeutic optimization. The word personalization does not begin to capture the substance of this proposition; perhaps individualization is a better term, with the understanding that treatment would individualize at the lesion level (or deeper) in a time-dynamic fashion, rather than at the much coarser level of an individual patient at a given time.
3. Hippocrates left no doubt that safety is first and foremost. The conjoined twin of personalized treatment by biodistribution design is the adverse collateral event caused by drug concentration at unintended locations. Safety, and the regulatory approval pathways that are intended to ensure it in a more advanced sense than the current observation of macroscopic damage, requires an accurate determination of the biodistribution of the administered agents. This would be an ideal objective for all drugs, to be sure, but arguably an impossible one in general. Here is where a challenge turns into an opportunity for nanomedicine: the ability of nanovectors to be, or to carry, active agents of therapy, while at the same time being, or carrying, contrast agents that allow the tracking and monitoring of their biodistribution in real time, provides the nanopharmaceutical world with a unique advantage. Alas, reality at this time smiles much less than this vision would entail: in general it proves very difficult at this time to comprise or conjugate nanovectors with contrast agents or nuclear tracers in a manner that is stable in vivo. Forming a construct that will not separate into its components once systemically administered is a difficult general proposition of conjugation chemistry. Anything less than total success means that what is being tracked may be the label rather than the vector or the drug. The problem becomes combinatorially more complex with increasing numbers of nanovector components. Another facet of the same problem is that the conjugation of labels, contrast agents or tracers frequently alters the biodistribution dramatically with respect to the construct that is intended for therapeutic applications. One strategic recommendation that naturally emerges for the immediate future is to prioritize nanovectors that are themselves easily traceable with current radiological imaging modalities.
4. Again in common with all drugs, the development and clinical deployment of nanomedicines would greatly benefit from methods for determining toxicity and efficacy indicators through non-invasive or minimally invasive procedures such as blood draws. Serum or plasma proteomics and peptidomics are a promising direction toward this elusive goal. The challenge-turned-competitive-advantage for nanomedicine is the ability of nanovectors to carry reporters of location and interaction, which can be released into the blood stream and collected therefrom, to provide indications of toxicity and therapeutic effect.
5. Individualization by the rational design of carriers, together with the biological optimization of drugs, both informed by imaging and biological profiling, is one dimension of progress toward an optimal therapeutic index for all. Another dimension is the time dynamics of release: the right drug at the right place at the right time is the final objective. With their exquisite control of size, shape, surface chemistry and overall design parameters, nanovectors are outstanding candidates for controlled release by implanted (nano)devices or (nano)materials; yet another case of a challenge turned opportunity, and of the synergistic application of multiple nanotechnologies to form a higher-generation nanosystem.
6. The last, and perhaps most important, challenge ahead (and a wide, extraordinarily exciting prairie of opportunities for rides of discovery) is the generation of novel biological hypotheses. With the higher-order nanotechnologies in development, it is possible to reach subcellular target A with nanoparticle species X at time T, and in the same cell, from the same platform, then reach subcellular target B with nanoparticle species Y at a subsequent time T'. What therapeutic advantages that may bring, for the possible combinations of A, B, X, Y, T and T' (and extensions-by-induction of the concept), is impossible to fathom at this time. There is basically little if any science on it, and of course that is the case, since the technology that permits the validating experiments is in its infancy at this time. The
References
1 American Cancer Society website,
Statistics 08 (2008) http://www.cancer.
org/docroot/STT/stt_0.asp (accessed 21
August 2008).
2 Jemal, A., Siegel, R., Ward, E., Murray, T.,
Xu, J. and Thun, M.J. (2007) Cancer statistics, 2007. CA: A Cancer Journal for Clinicians, 57, 43.
3 Brigger, I., Dubernet, C. and Couvreur, P.
(2002) Nanoparticles in cancer therapy
and diagnosis. Advanced Drug Delivery
Reviews, 54, 631.
4 Jain, R.K. (1987) Transport of molecules
in the tumor interstitium: a review. Cancer
Research, 47, 3039–3051.
5 Jain, R.K. (1999) Transport of molecules,
particles, and cells in solid tumors.
Annual Review of Biomedical Engineering,
1, 241.
6 Sanhai, W.R., Sakamoto, J.H., Canady, R.
and Ferrari, M. (2008) Seven challenges
for nanomedicine. Nature Nanotechnology,
3, 242.
7 Sakamoto, J., Annapragada, A., Decuzzi,
P. and Ferrari, M. (2007) Antibiological
barrier nanovector technology for cancer
applications. Expert Opinion on Drug
Delivery, 4, 359.
8 Ferrari, M. (2005) Nanovector
therapeutics. Current Opinion in Chemical
Biology, 9, 343.
9 Kerbel, R.S. (2008) Tumor angiogenesis.
The New England Journal of Medicine, 358,
2039.
10 Hobbs, S.K., Monsky, W.L., Yuan, F.,
Roberts, W.G., Griffith, L., Torchilin, V.P.
and Jain, R.K. (1998) Regulation of
transport pathways in tumor vessels: role
of tumor type and microenvironment.
Proceedings of the National Academy of
Sciences of the United States of America, 95,
4607.
18 Zitvogel, L., Apetoh, L., Ghiringhelli, F.
and Kroemer, G. (2008) Immunological
aspects of cancer chemotherapy. Nature
Reviews Immunology, 8, 59.
19 National Cancer Institute website (2008)
http://www.cancer.gov/ (accessed 12
October 2008).
20 Folkman, J. (1971) Tumor angiogenesis:
therapeutic implications. The New
England Journal of Medicine, 285, 1182.
21 Folkman, J. (2007) Angiogenesis: an
organizing principle for drug discovery?
Nature Reviews Drug Discovery, 6, 273.
22 Ferrara, N., Hillan, K.J., Gerber, H.P. and
Novotny, W. (2004) Discovery and
development of bevacizumab, an anti-VEGF antibody for treating cancer. Nature
Reviews Drug Discovery, 3, 391.
23 Hurwitz, H., Fehrenbacher, L., Novotny,
W. et al. (2004) Bevacizumab plus
irinotecan, fluorouracil, and leucovorin
for metastatic colorectal cancer. The New
England Journal of Medicine, 350, 2335.
24 Sandler, A., Gray, R., Perry, M.C. et al.
(2006) Paclitaxel-carboplatin alone or with
bevacizumab for non-small-cell lung
cancer. The New England Journal of
Medicine, 355, 2542.
25 Faivre, S., Demetri, G., Sargent, W. and
Raymond, E. (2007) Molecular basis for
sunitinib efficacy and future clinical
development. Nature Reviews Drug
Discovery, 6, 734.
26 Motzer, R.J., Michaelson, M.D.,
Redman, B.G. et al. (2006) Activity of
SU11248, a multitargeted inhibitor of
vascular endothelial growth factor
receptor and platelet-derived growth
factor receptor, in patients with metastatic
renal cell carcinoma. Journal of Clinical
Oncology, 24, 16.
27 Escudier, B., Eisen, T., Stadler, W.M. et al.
(2007) Sorafenib in advanced clear-cell
renal-cell carcinoma. The New England
Journal of Medicine, 356, 125.
28 Llovet, J., Ricci, S., Mazzaferro, V. et al.
(2008) Sorafenib in advanced
hepatocellular carcinoma. The New
England Journal of Medicine, 359, 378.
58 National Nanotechnology Initiative program (2008) http://www.nano.gov/
NNI_FY09_budget_summary.pdf
(accessed 12 October 2008).
59 Theis, T., Parr, D., Binks, P., Ying, J.,
Drexler, K.E., Schepers, E., Mullis, K.,
Bai, C., Boland, J.J., Langer, R.,
Dobson, P., Rao, C.N. and Ferrari, M.
(2006) nan·o·tech·nol·o·gy n. Nature
Nanotechnology, 1, 8.
60 Heath, J.R. and Davis, M.E. (2008)
Nanotechnology and cancer. Annual
Review of Medicine, 59, 251.
61 Nie, S., Kim, G.J., Xing, Y. and Simons,
J.W. (2007) Nanotechnology applications
in cancer. Annual Review of Biomedical
Engineering, 9, 257.
62 Riehemann, K., Schneider, S.W.,
Luger, T.A., Godin, B., Ferrari, M. and
Fuchs, H. (2008) Nanomedicine - developments and perspectives. Angewandte Chemie - International Edition, in press, DOI: 10.1002/ange.200802585.
63 Harris, J.M. and Chess, R.B. (2003) Effect
of pegylation on pharmaceuticals. Nature
Reviews Drug Discovery, 2, 214.
64 Brannon-Peppas, L. and Blanchette, J.O.
(2004) Nanoparticle and targeted systems
for cancer therapy. Advanced Drug Delivery
Reviews, 56, 1649.
65 Torchilin, V.P. (2007) Targeted
pharmaceutical nanocarriers for cancer
therapy and imaging. The APS Journal, 9,
E128.
66 Saul, J.M., Annapragada, A.V. and
Bellamkonda, R.V. (2006) A dual-ligand
approach for enhancing targeting
selectivity of therapeutic nanocarriers.
Journal of Controlled Release, 114, 277.
67 Yang, X., Wang, H., Beasley, D.W. et al.
(2006) Selection of thioaptamers for
diagnostics and therapeutics. Annals
of the New York Academy of Sciences, 1082, 116.
68 Souza, G.R., Christianson, D.R.,
Staquicini, F.I. et al. (2006) Networks of
gold nanoparticles and bacteriophage as
biological sensors and cell-targeting
agents. Proceedings of the National
96 Monsky, W.L., Kruskal, J.B.,
Lukyanov, A.N., Girnun, G.D.,
Ahmed, M., Gazelle, G.S., Huertas, J.C.,
Stuart, K.E., Torchilin, V.P. and
Goldberg, S.N. (2002) Radio-frequency
ablation increases intratumoral liposomal
doxorubicin accumulation in a rat breast
tumor model. Radiology, 224, 823.
97 Farokhzad, O.C., Cheng, J., Teply, B.A.,
Sherifi, I., Jon, S., Kantoff, P.W.,
Richie, J.P. and Langer, R. (2006) Targeted
nanoparticle-aptamer bioconjugates for
cancer chemotherapy in vivo. Proceedings
of the National Academy of Sciences of the
United States of America, 103, 6315.
98 Farokhzad, O.C., Karp, J.M. and
Langer, R. (2006) Nanoparticle-aptamer
bioconjugates for cancer targeting. Expert
Opinion on Drug Delivery, 3, 311.
99 Northfelt, D.W., Martin, F.J., Working,
P., Volberding, P.A., Russell, J.,
Newman, M., Amantea, M.A. and
Kaplan, L.D. (1996) Doxorubicin
encapsulated in liposomes containing
surface-bound polyethylene glycol:
pharmacokinetics, tumor localization,
and safety in patients with AIDS-related
Kaposi's sarcoma. Journal of Clinical
Pharmacology, 36, 55.
100 Jang, S.H., Wientjes, M.G., Lu, D. and Au,
J.L. (2003) Drug delivery and transport to
solid tumors. Pharmaceutical Research, 20,
1337.
101 Lankelma, J., Dekker, H., Luque, F.R.,
Luykx, S., Hoekman, K., van der Valk, P.,
van Diest, P.J. and Pinedo, H.M. (1999)
Doxorubicin gradients in human breast
cancer. Clinical Cancer Research, 5, 1703.
102 Tannock, I.F., Lee, C.M., Tunggal, J.K.,
Cowan, D.S. and Egorin, M.J. (2002)
Limited penetration of anticancer drugs
through tumor tissue: a potential cause of
resistance of solid tumors to
chemotherapy. Clinical Cancer Research, 8,
878.
103 Klibanov, A.L., Maruyama, K., Beckerleg,
A.M., Torchilin, V.P., and Huang, L.
(1991) Activity of amphipathic poly(ethylene glycol) 5000 to prolong the circulation time of liposomes depends on the liposome size and is unfavorable for immunoliposome binding to target. Biochimica et Biophysica Acta, 1062, 142.
Part Three:
Imaging and Probing the Inner World of Cells
4
Electron Cryomicroscopy of Molecular Nanomachines and Cells
Matthew L. Baker, Michael P. Marsh, and Wah Chiu
4.1
Introduction
Electron cryomicroscopy (cryo-EM) is an emerging methodology that is particularly well suited for studying molecular nanomachines under near-native or chemically defined conditions. Cryo-EM can be used to study nanomachines of various sizes, shapes and symmetries, including two-dimensional (2-D) arrays, helical arrays and single particles [1, 2]. With recent advances, cryo-EM can now not only reveal the gross morphology of these nanomachines but also provide highly detailed models of protein folds approaching atomic resolutions [13-17]. In this chapter, we will present the methodology of single-particle cryo-EM, as well as its potential biomedical applications and future prospects.
Complementary to structural studies of nanomachines with cryo-EM, the application of cryo-tomography (cryo-ET) can depict the locations and low-resolution structures of nanomachines in a 3-D cellular environment. The power of cryo-ET comes from its unique ability to directly observe biological nanomachines in situ, without the need for isolation and purification. This approach has the potential to capture the structural diversity of nanomachines in their milieu of interacting partners and surrounding cellular context.
4.2
Structure Determination of Nanomachines and Cells
Figure 4.1 shows a series of typical steps in imaging nanomachines using cryo-EM or cryo-ET. The first steps are common to both techniques: biochemical preparation; specimen preservation via rapid freezing; and imaging the frozen, hydrated specimens by low-dose electron microscopy. Although the subsequent steps differ for the two techniques, they both include image processing to generate a 3-D reconstruction, interpreting the 3-D volume density together with other biological data, and archiving the density maps and models. In this chapter we will not address how to perform each of the aforementioned steps, as numerous technical reports and books exist that describe them in detail [1, 2, 18]. Rather, we will briefly summarize these steps and their applications to a few examples of molecular nanomachines and cells.
4.2.1
Experimental Procedures in Cryo-EM and Cryo-ET
In principle, most of these steps are rather straightforward, and the length of time taken to go from a highly purified nanomachine to a complete structure can range from a few days to months. However, as with any experimental method, various hurdles may be encountered that require further optimization before a reliable structure can be determined.
4.2.1.1 Specimen Preparation for Nanomachines and Cells
Specimen preparation is a critical step in single-particle cryo-EM, which necessarily
requires high conformational uniformity while preserving functional activities.
In X-ray crystallography, crystallization is a selective process through which only
Figure 4.1 The experimental pipeline for cryo-imaging experiments. (a) The first steps of the experiment (specimen preparation, specimen freezing and microscopy) are common to both cryo-EM and cryo-ET; (b) The subsequent steps
uniform particles are merged to yield a 3-D model, cryo-ET merges many images of the same specimen target, collected at different angles. With this approach, a reconstruction can be computed from the images of a single cell or nanomachine, and so conformational uniformity is not an issue in the most general case. The merging of whole cells or organelles as single particles is not a reasonable goal, as uniformity can never realistically be expected; however, some subcellular structures may be sufficiently uniform in conformation to warrant merging and averaging from the 3-D tomogram. Below, we consider such an example when discussing the bacterial flagellar motor.
4.2.1.2 Cryo-Specimen Preservation
Following biochemical isolation and purification, the first step in a cryo-imaging experiment is to embed a biochemically purified nanomachine or cell under well-defined chemical conditions in ice on a cryo-EM grid [19]. This freezing process is extremely quick in order to prevent the formation of crystalline ice, and thus produces a matrix of vitreous ice in which the water molecules remain relatively unordered. The spread of the nanomachines on the grid should be neither so crowded that they contact each other, nor so dilute that only a few nanomachines are recorded in each micrograph. For cryo-EM, it is preferable to have the nanomachines situated in random orientations to allow the sufficient angular sampling needed for the subsequent 3-D reconstruction procedure. The ideal thickness of the embedding ice is slightly greater than the size of the nanomachine or cell. Excessive ice thickness is detrimental because it diminishes the signal-to-noise ratio (SNR) of the images that can be acquired. Ice that is too shallow can be a problem for cryo-ET experiments, whereby flattening of the specimen can occur. The capillary forces of the solvent, in the fluid phase just prior to vitrification, can compress the sample; this has been reported in vesicles [20] as well as in real cells, where a 1 μm-thick cell can be reduced to 600 nm [21, 22].
Some specimens are very easy to prepare, while others are more difficult, which means that optimization of the specimen preparation is necessarily a trial-and-error process. In general, this step, the preparation of frozen, hydrated specimens preserved in vitreous ice with an optimal ice thickness, is often a bottleneck. Analogous to the crystallization process in X-ray crystallography, there is no foolproof recipe for optimal specimen preservation. However, computer-driven freezing apparatus has made this step more reproducible and tractable in finding optimal conditions for freezing a given specimen [23].
In principle, the frozen, hydrated specimens represent native conformations as they are maintained in an aqueous buffer. Fixation of the nanomachines in a specific orientation can occur prior to freezing. Specimen freezing can also be coordinated with a time-resolved chemical mixing reaction; prototype apparatuses have been built to perform such time-resolved reactions [24, 25]. It is conceivable that a more sophisticated instrument could be built to allow all sorts of chemical reactions, including those that can be light-activated. Such an approach would allow cryo-EM to follow the structural variations in a chemical process with a temporal resolution of milliseconds [25].
The result of a cryo-EM experiment is typically a 3-D density map with multiple domains and/or models used to annotate the structure and function of the molecular nanomachine. In reaching this model, multiple intermediate data sets and image-processing workflows are produced. Databases such as EMEN [79] and others [80, 81] function on a laboratory scale and can house the final 3-D density maps and models, as well as the original specimen images and all of the intermediate data and processes. The final density map, models and associated metadata can also be deposited in public repositories such as the Electron Microscopy Data Bank (EMDB) and the Protein Data Bank (PDB) [82]. Individual cryo-EM structures are easily retrieved through accession numbers or IDs directly from publicly accessible websites.
4.3
Biological Examples
4.3.1
Skeletal Muscle Calcium Release Channel
Ryanodine receptor (RyR1) is a 2.3 MDa homotetramer that regulates the release of Ca²⁺ from the sarcoplasmic reticulum to initiate muscle contraction (for a review, see Ref. [49]). Figure 4.2a shows the 9.6 Å resolution cryo-EM density map of RyR1 reconstructed from 28 000 particle images (Figure 4.2b) [83]. In this map, the structural organization, including the transmembrane and cytoplasmic regions for each monomer, as well as domains within individual monomers, can be clearly seen. A structural analysis of the RyR1 map using SSEHunter [71] revealed 41 α-helices, 36 in the cytoplasmic region and five in the transmembrane region, as well as seven β-sheets in the cytoplasmic region of an RyR1 monomeric subunit (Figure 4.2c). Interestingly, a kinked inner, pore-lining helix and a pore helix in the transmembrane region bear a remarkable similarity to those of the MthK channel [84]. β-Sheets located in the constricted part that connects the transmembrane and cytoplasmic regions have been seen in the crystal structures of inward rectifier K⁺ channels (Kir channels) [85, 86] and a cyclic nucleotide-modulated (HCN2) channel [87]. In Kir channels, this β-sheet has been proposed to form part of the cytoplasmic pore, which is connected to the inner pore. Therefore, this region in RyR1 may play a role in regulating ion flow by interacting with cellular regulators that are yet to be determined.
While there is no crystal structure for any domain or region of RyR1, a homologous domain from the IP3 receptor is known. Using the aforementioned cryo-EM-constrained homology modeling approach [69], it was possible to derive three protein folds, based on the ligand-binding suppressor and IP3-binding core domains from the type 1 IP3 receptor, for the N-terminal portion (residues 1-565) of the RyR1 primary sequence [88] (Figure 4.2d). Interestingly, these models were localized to a region at the four corners of the RyR1 tetramer, a region that has also been implicated to interact with the dihydropyridine receptor (DHPR) during the
Figure 4.3 Bacteriophage epsilon15. (a) A 300 kV electron image of the phage particles embedded in vitreous ice; (b) A cut-away view of the 20 Å asymmetric reconstruction of epsilon15 [92] shows the molecular components of the portal vertex, the capsid shell protein and the viral DNA. The different molecular
Traditional cellular biology studies are frequently limited by carrying out experiments in vitro, or by investigating only fractions of cells. It is an obvious and tremendous advantage to integrate the structures and processes of all of the cellular space, enabling investigators to comprehend cells in toto. Today, Baumeister and colleagues continue to make strides towards this goal of visualizing a complete cell with all of its major nanomachines. Early proof-of-concept studies have shown that it is possible to identify and differentiate large complexes in the tomograms of synthetic cells [20]. Moreover, recent advances in data processing suggest that even similar assemblies with subtle differences in mass, such as GroEL and GroEL-GroES, can be differentiated [102]. The first application of mapping nanomachines in a cell showed that the total spatial distribution of ribosomes through an entire cell could be directly observed in the spirochete Spiroplasma melliferum (Figure 4.5) [103].
The archaebacterium Thermoplasma acidophilum is a relatively simple cell, with only approximately 1507 open reading frames (ORFs), comprising considerably fewer subcellular assemblies [104]. As such, it is an attractive cryo-ET target for mapping the 3-D positions of all major nanomachines, the proteomic atlas, which, ultimately, will reveal unprecedented detail about the 3-D organization of protein-protein networks [105].
4.4
Future Prospects
Today, single-particle cryo-EM has reached the turning point where it is now possible to resolve relatively high-resolution structures of molecular nanomachines under conditions not generally accessible to other high-resolution structure determination techniques. Due to the intrinsic nature of the cryo-EM experiment, it can also produce unique and biologically important information, even when a high-resolution structure is already known. Cryo-EM structures of both the ribosome [106] and GroEL [14, 107, 108] have provided significant insight into structural and functional mechanisms, despite these machines having been studied extensively using X-ray crystallography.
One obvious challenge for cryo-EM is the pursuit of higher resolution (i.e. close to or better than 3.0 Å), at which point full, all-atom models could be constructed. On the other hand, cryo-EM is not aimed solely at high resolution. Rather, it offers the ability to resolve domains and/or components that are highly flexible at lower resolutions, as well as samples with multiple conformational states [108]. With further developments in image-processing routines, both high-resolution structure determination and the computational purification of samples [108] will allow for the exploration of complex molecular nanomachines in even greater detail.
As with cryo-EM, improvements in data collection and image processing will allow cryo-ET to achieve more accurate and higher-resolution reconstructions of large nanomachines and cells. However, as alluded to in the proteomic atlas, the real strength of cryo-ET is its power to integrate known atomic structures and cryo-EM reconstructions to provide a complete model of in vivo protein function [105, 109]. Such integration will ultimately establish a true spatial and temporal view of functional nanomachines within the cell, which can be investigated systematically in either healthy or diseased states. In addition, there is a trend towards the integration of live-cell observations made by light microscopy, followed by cryo-ET observations of the same specimens (e.g. [110]). Such hybrid approaches require not only new instrumentation to make sequential observations practical, but also the computational tools to integrate the data. These integrated cellular views promise to enhance our understanding of cell structure and function relationships in normal and diseased states at higher spatial and temporal resolutions.
Acknowledgments
References
1 Celniker, S.E. and Rubin, G.M. (2003) The
Drosophila melanogaster genome.
Annual Review of Genomics and Human
Genetics, 4, 89.
Denys, M.E., Dominguez, R., Fang, N.Y.,
Foster, B.D., Freudenberg, R.W., Hadley,
D., Hamilton, L.R., Jeffrey, T.J., Kelly, L.,
Lazzeroni, L., Levy, M.R., Lewis, S.C., Liu,
X., Lopez, F.J., Louie, B., Marquis, J.P.,
Martinez, R.A., Matsuura, M.K.,
Misherghi, N.S., Norton, J.A., Olshen, A.,
Perkins, S.M., Perou, A.J., Piercy, C.,
Piercy, M., Qin, F., Reif, T., Sheppard, K.,
Shokoohi, V., Smick, G.A., Sun, W.L.,
Stewart, E.A., Fernando, J., Tejeda, Tran,
N.M., Trejo, T., Vo, N.T., Yan, S.C.,
Zierten, D.L., Zhao, S., Sachidanandam,
R., Trask, B.J., Myers, R.M. and Cox, D.R.
(2001) A high-resolution radiation hybrid
map of the human genome draft
sequence. Science, 291, 1298.
3 Venter, J.C., Adams, M.D., Myers, E.W.,
Li, P.W., Mural, R.J., Sutton, G.G., Smith,
H.O., Yandell, M., Evans, C.A., Holt, R.A.,
Gocayne, J.D., Amanatides, P., Ballew,
R.M., Huson, D.H., Wortman, J.R.,
Zhang, Q., Kodira, C.D., Zheng, X.H.,
Chen, L., Skupski, M., Subramanian, G.,
Thomas, P.D., Zhang, J., Gabor Miklos,
G.L., Nelson, C., Broder, S., Clark, A.G.,
Nadeau, J., McKusick, V.A., Zinder, N.,
Levine, A.J., Roberts, R.J., Simon, M.,
Slayman, C., Hunkapiller, M., Bolanos,
R., Delcher, A., Dew, I., Fasulo, D.,
Flanigan, M., Florea, L., Halpern, A.,
Hannenhalli, S., Kravitz, S., Levy, S.,
Mobarry, C., Reinert, K., Remington, K.,
Abu-Threideh, J., Beasley, E., Biddick, K.,
Bonazzi, V., Brandon, R., Cargill, M.,
Chandramouliswaran, I., Charlab, R.,
Chaturvedi, K., Deng, Z., Di Francesco,
V., Dunn, P., Eilbeck, K., Evangelista, C.,
Gabrielian, A.E., Gan, W., Ge, W., Gong,
F., Gu, Z., Guan, P., Heiman, T.J.,
Higgins, M.E., Ji, R.R., Ke, Z., Ketchum,
K.A., Lai, Z., Lei, Y., Li, Z., Li, J., Liang, Y.,
Lin, X., Lu, F., Merkulov, G.V., Milshina,
N., Moore, H.M., Naik, A.K., Narayan,
V.A., Neelam, B., Nusskern, D., Rusch,
D.B., Salzberg, S., Shao, W., Shue, B.,
Sun, J., Wang, Z., Wang, A., Wang, X.,
Wang, J., Wei, M., Wides, R., Xiao, C.,
Yan, C. et al. (2001) The sequence of the human genome. Science, 291, 1304.
GroEL-GroES complex encapsulates an 86 kDa substrate. Structure, 14, 1711.
109 Nickell, S., Kofler, C., Leis, A.P. and Baumeister, W. (2006) A visual approach to proteomics. Nature Reviews Molecular Cell Biology, 7, 225.
5
Pushing Optical Microscopy to the Limit:
From Single-Molecule Fluorescence Microscopy to Label-Free
Detection and Tracking of Biological Nano-Objects
Philipp Kukura, Alois Renn, and Vahid Sandoghdar
5.1
Introduction
slow, with acquisition rates rarely exceeding a few hertz. Electron microscopy requires a vacuum, and sometimes metal coating or cryogenic conditions, for high-resolution images, thus making it difficult to perform studies under biologically relevant conditions.
As a consequence, the method of choice for real-time, in vivo biological imaging has remained optical microscopy, despite its comparatively low resolution. The optical microscope was invented about 400 years ago, and improvements in resolution to about 200 nm had already taken place by the end of the nineteenth century, through advances in lens design. However, this is a factor of 100 larger than the size of the single molecules or proteins that define the ultimately desired resolution, and represents the diffraction limit established by Abbe during the 1880s. Throughout most of the twentieth century, this fundamental limit prevented the optical microscope from opening our eyes to the molecular structure of matter. Nevertheless, numerous recent advances have improved the resolution and, in particular, the contrast of images. In general, these contrast techniques can be divided into two categories, namely linear and nonlinear.
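For reference, the diffraction limit follows directly from the wavelength and the numerical aperture (NA = n sinθ) of the objective. With illustrative values of λ = 550 nm and NA = 1.4 (a high-quality oil-immersion objective), one recovers the figure of about 200 nm quoted above:

d_min = λ/(2 NA) = 550 nm/(2 × 1.4) ≈ 196 nm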
5.1.1
Linear Contrast Mechanisms
The central challenge in visualizing small objects is to distinguish their signals from background scattering, or from reflections that may be caused by other objects or the optics inside the microscope. One possible solution to this problem is provided by dark-field microscopy, which was first reported by Lister in 1830. Here, the design is optimized to reduce all unwanted light to a minimum by preventing the illumination light from reaching the detector [10, 11]. To achieve this, the illumination light is shaped such that it can be removed before detection through the use of an aperture.
In general, there are two approaches to achieving dark-field illumination; these are illustrated schematically in Figure 5.1a and b. In Figure 5.1a, the illumination light is modified in such a way as to provide an intense ring of light, for example through the use of a coin-like object blocking all but the outer ring of the incident light. The illumination light can then be removed by an aperture after the sample. In this way, only the light that is scattered by objects of interest, and whose path is thereby deviated, can reach the detector. The other approach, shown in Figure 5.1b, is referred to as total internal reflection microscopy (TIRM) [12]. Here, the illumination is incident on the sample at a steep angle, and is fully reflected at the interface between the glass coverslip and the sample (e.g. water). There, an evanescent region is created, where the light decays exponentially as it enters into the sample. When a particle is placed into this region, it scatters energy out of the evanescent part of the beam into the objective. Thus, light will only reach the detector from scatterers that are located within about 100 nm of the sample-substrate interface. This technique has been used mostly to detect very weak signals such as those from metallic nanoparticles, or to obtain surface-specific images [13, 14].
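The extent of this evanescent region follows from the standard expression for the intensity penetration depth at a totally reflecting interface,

I(z) = I(0) exp(−z/d),  with d = λ/[4π (n₁² sin²θ − n₂²)^(1/2)]

With illustrative values (λ = 532 nm, glass with n₁ = 1.52, water with n₂ = 1.33, angle of incidence θ = 70°), this gives d ≈ 81 nm, consistent with the roughly 100 nm depth quoted above.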
Another approach to minimizing unwanted light is provided by confocal microscopy, which first appeared during the early 1960s (Figure 5.1c) [15, 16]. In contrast to the previous techniques, where the light from a relatively large sample area is collected (20 × 20 μm), only light from the focal region of the objective is allowed to reach the detector, thereby considerably reducing the background light. This is achieved through the use of a pinhole that is placed in the confocal plane of the objective. The size of the pinhole is usually chosen so as to match the size of the image of the focal spot. Collecting light in this fashion leads to optical sectioning, that is, the ability to provide three-dimensional (3-D) information by moving the focus in the z-direction through the sample. The major disadvantage of this approach is the need to scan the sample with respect to the illumination, which leads to relatively slow acquisition times.
An alternative for improving the contrast is to exploit the phase of the illumination light. Differential interference contrast (DIC) microscopy, which was introduced during the mid-1950s, takes advantage of slight differences in the refractive index of the specimen to generate additional information about the composition of the sample [17, 18]. For instance, the refractive index of water differs from that of lipids, so areas with a high water content will generate a different signal than those consisting mostly of organic material. The approach splits the illumination light into two slightly displaced beams (<1 μm apart) at the focus that are recombined before detection. If the two beams travel through material with different indices of refraction, the phase of one relative to the other will change, and this will lead to a small degree of destructive interference after recombination of the two beams. Such areas will thus appear dark, while areas where the sample is homogeneous will appear bright.
So far, we have been concerned only with improving the contrast using scattered light. A different but powerful method is to use fluorescence as a contrast mechanism. The first such reports emerged during the early twentieth century, when ultraviolet light was used for the first time in microscopes. However, the breakthrough occurred during the 1950s, when the application of fluorescence labeling for the detection of antigens was demonstrated [19]. Rather than detecting the scattering of incident light, the illumination light is used to excite molecules, causing them to fluoresce. The advantage of this approach is that the excitation light can be reduced by many orders of magnitude by the use of appropriate filters, as the fluorescence is usually red-shifted in energy (Figure 5.1d). One can then observe the species of interest against virtually zero background, because the only photons that can reach the detector must be due to fluorescence emission. Such contrast is virtually unachievable with scattering techniques, even when dark-field illumination is employed, because the background scattering cannot be extinguished to such a high degree. The major sources of contrast are biological autofluorescence, specific labeling of the objects of interest with fluorescent dye molecules, or use of the cellular expression system itself to produce fluorescent proteins [20]. Fluorescence can also be used to introduce specificity by spectrally coding the sample. Examples are fluorescence recovery after photobleaching (FRAP) for studying fluidity [21], fluorescence lifetime imaging [22], or simply simultaneous labeling with different fluorophores.
The spectral resolution of fluorescence is, however, rather poor, because the emission is usually very broad in energy. Techniques based on vibrational spectroscopy, on the other hand, where the resonances are orders of magnitude sharper, provide a much more unique molecular fingerprint, but at the expense of much-reduced signal intensities. In particular, Raman and infrared microscopy yield information about the composition of the sample without the need for any label [23]. The experimental set-up is very similar to that used for fluorescence microscopy, but is focused on detecting vibrational resonances rather than fluorescence emission. As a matter of fact, Raman experiments can only be successful when the background fluorescence is reduced to a minimum, because the cross-sections for Raman scattering are orders of magnitude smaller than those of fluorescence. On the upside, every species has a unique Raman spectrum and can thus be identified without the need for any label. Because these experiments must be performed in a confocal arrangement to achieve a sufficiently high photon flux, they provide highly specific and spatially resolved information, even inside cells. The downside is that the intrinsically low Raman cross-sections require large illumination powers, and this might be problematic for live cell imaging, for reasons of phototoxicity.
5.1.2
Nonlinear Contrast Mechanisms
The recent availability of high-power laser sources producing ultrashort pulses on the order of a hundred femtoseconds (10⁻¹³ s) or less at high pulse energies (>mJ) has opened up completely new areas in microscopy. When such short and intense pulses are focused to a diffraction-limited spot, peak powers of terawatts per cm² and above can be achieved. At such high peak powers, nonlinear effects that involve the simultaneous interaction of multiple photons with the sample become observable. The main microscopy-related application that has emerged from this technological jump is two-photon imaging [24]. Here, rather than exciting a fluorophore with a single resonant photon, for example at 400 nm, two off-resonant photons at 800 nm are used to produce the same excitation. Although nonresonant two-photon cross-sections are negligible compared to their one-photon counterparts (10⁻⁵⁰ versus 10⁻¹⁶ cm²), the high peak powers coupled with high pulse repetition rates on the order of 100 MHz can compensate for these dramatically lowered cross-sections. The major advantage of this technique is that optical sectioning comes for free, because the high peak intensities are only produced at the focus of the illuminating beam, making confocal pinholes superfluous. Although biological tissue is generally transparent in the near-infrared region, sample heating and the rapid destruction of two-photon fluorophores cannot be avoided, due to the necessarily high peak intensities used.
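To make these peak intensities concrete, a back-of-envelope calculation is helpful. In the Python sketch below, the 100 fs duration and 100 MHz repetition rate are the values quoted above, while the 1 nJ pulse energy and 0.5 μm focal spot are assumed, illustrative oscillator parameters; together they reproduce the terawatts-per-cm² regime at a modest average power.

# Peak intensity at the focus of a femtosecond source (illustrative values).
import math

pulse_energy = 1e-9       # J, assumed ~1 nJ oscillator pulse
pulse_duration = 100e-15  # s, ~100 fs (from the text)
rep_rate = 100e6          # Hz, ~100 MHz repetition rate (from the text)
focus_diameter = 0.5e-4   # cm, assumed ~0.5 um diffraction-limited spot

peak_power = pulse_energy / pulse_duration        # W
focal_area = math.pi * (focus_diameter / 2) ** 2  # cm^2
peak_intensity = peak_power / focal_area          # W/cm^2
average_power = pulse_energy * rep_rate           # W

print(f"peak power     : {peak_power:.2e} W")
print(f"peak intensity : {peak_intensity:.2e} W/cm^2")  # ~5e12 W/cm^2
print(f"average power  : {average_power:.2f} W")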
Another prominent example of nonlinear microscopy is based on coherent anti-Stokes Raman scattering (CARS) [25], which is the nonlinear equivalent of Raman microscopy. Here, three incident photons are required to produce a single Raman-shifted photon. Additional techniques based on second- and third-harmonic generation microscopies have also appeared [26, 27]. Finally, a very interesting recent development has been discussed in the context of RESOLFT (reversible saturable optically linear fluorescence transitions). Here, a nonlinear process such as stimulated emission is used to deplete the fluorescence to a subdiffraction-limited spot. As a result, the actual volume from which fluorescent photons are emitted is considerably reduced, beyond λ/50 [28].
In this chapter, we will focus our attention on pushing both the sensitivity and resolution limits of state-of-the-art linear microscopic techniques. In particular, we will discuss the capabilities and limitations of single-molecule detection in the light of biological applications. After covering the fundamental aspects of resolution, we will outline recent advances in the detection and tracking of nonfluorescent nano-objects. Scattering-based labels show much promise in eliminating many of the limitations of fluorescence microscopy. Yet, by going a step further, we will show how these techniques can be used to detect and follow the motion of unlabeled biological nanoparticles.
5.2
Single-Molecule Fluorescence Detection: Techniques and Applications
The fundamental question that arises from the previous discussion of current optical imaging methods is: why is it so difficult, first, to see single molecules and, second, to
To understand the intricacies of detecting single molecules in biological environments, it is useful to ask why it is so easy to observe single ions trapped in a vacuum. The answer is simply the absence of any background. In a vacuum, the emitter is alone, with no other objects or molecules nearby that can either scatter the excitation light or fluoresce upon excitation. In this scenario, single-molecule detection simply becomes an issue of having a good enough detector (curiously, the human eye is one of the best light detectors available, being able to detect single photons with almost 70% efficiency [30]). However, the number of photons that any single molecule can emit via fluorescence is strictly limited by its intrinsic photophysics, and cannot be increased at will simply by raising the illumination power.
To understand this concept, it is useful to consider the dynamics that follow the absorption of a single photon by a single molecule. Population of the first excited electronic state is followed by an excited-state decay, which can take place via two major pathways: nonradiative and radiative decay. The former refers, in this simple case, to a transition from the excited to the ground state without the emission of a photon; the excess energy is usually deposited in vibrational degrees of freedom, either of the molecule itself or of the surroundings. In the latter case, the energy is lost through the emission of a photon, which is typically lower in energy (red-shifted) than the excitation photon due to the Stokes shift. To generate as many detectable photons per unit time as possible for a given excitation power, one requires: (i) a large absorption cross-section; and (ii) a high fluorescence quantum yield, that is, an efficient conversion of absorbed into emitted photons.
The former requirement brings with it a radiative lifetime on the order of nanoseconds, which can be related to the fundamental considerations of absorption and emission. This limits the total number of emitted photons, irrespective of the total incoming photon flux, because a single absorption-emission cycle takes about 10 ns. Thus, even in ideal circumstances no single molecule can emit more than 10⁸ photons per second. The restricted collection properties of objectives, the imperfect transmission and reflectivity of optics, and the limited quantum efficiencies of detectors result in typical effective collection efficiencies of <10%. The corresponding count rates, on the order of a few million counts per second, can indeed be observed in single-molecule experiments at cryogenic temperatures [31]. Under ambient conditions, which are of relevance for biological investigations, photobleaching puts a limit on the applied excitation intensities. The cause of this bleaching is often the generation and further excitation of triplet states that are accessed through intersystem crossing from the first excited singlet state. Despite the low quantum efficiency of the process (<1%), excitation powers must be kept at the kW cm⁻² level in order to avoid rapid photobleaching. At these incident light levels, the observed count rates are below 10⁵ photons per second [32].
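The arithmetic behind these limits is short enough to verify directly; the sketch below simply restates the numbers given above (a ~10 ns absorption-emission cycle and a <10% effective collection efficiency).

# Photon budget of a single molecule (numbers from the text).
cycle_time = 10e-9                    # s, one absorption-emission cycle
max_emission_rate = 1 / cycle_time    # ~1e8 photons/s, saturation limit
collection_efficiency = 0.10          # <10% effective collection

max_count_rate = max_emission_rate * collection_efficiency
print(f"max emitted  : {max_emission_rate:.1e} photons/s")
print(f"max detected : {max_count_rate:.1e} counts/s")  # "few million"/s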
Photobleaching is also the reason why single molecules usually emit a total of only 10⁵-10⁷ photons before turning dark. The most likely cause of this sudden death is triplet-triplet annihilation with molecular oxygen, which is a particular problem in biological environments. The highly reactive singlet oxygen that is generated attacks the dye and oxidizes it, greatly altering its electronic properties and thereby rendering it dark to the excitation photons [33, 34]. To make matters worse, triplet-state formation is thought to be the main cause of phototoxicity [35, 36]. This situation can be improved somewhat by deoxygenating the system, or by using oxygen scavengers for in vitro experiments in solution [37].
5.2.2
The Signal-to-Noise Ratio Challenge
single-molecule methods for biological studies. It was quickly realized that far-field methods are capable of generating much larger single-molecule signals than SNOM, with much-reduced experimental demands. Therefore, far-field detection has become the method of choice for detecting single molecules in biological environments [43, 44]. This advance was facilitated by the development of low-autofluorescence microscope objectives and immersion oils, as well as by improved excitation-light rejection through the use of dielectric filters and highly efficient single-photon detectors such as avalanche photodiodes. Today, single-molecule detection has become an almost standard technique in biology, chemistry and physics [45]. Single-molecule techniques have been used to directly observe the motion and function of single biological nano-objects such as enzymes, viruses or motor proteins in real time [37, 46, 47].
5.2.3
High-Precision Localization and Tracking of Single Emitters
Despite the fact that a single molecule smaller than 1 nm yields an image with a diameter of several hundred nanometers, it is possible to determine its location to within a few nanometers. This can be achieved by fitting the data to the theoretical PSF. Thus, the precision of this fit is only limited by the signal-to-noise ratio (SNR) of the acquired Gaussian profile, while the fit tolerance provides the uncertainty in x and y of the emitter's center [49]. To illustrate this, we have acquired confocal images of single rhodamine-labeled GM1 receptors adhered onto an acid-cleaned coverslip (Figure 5.3a). The observed single-step bleaching in Figure 5.3b for the spot highlighted in Figure 5.3a confirms that the image stems from a single molecule. The emission pattern (Figure 5.3c) and the corresponding Gaussian fit (Figure 5.3d) to the highlighted molecule result in a lateral localization accuracy of 10 nm. The high accuracy is due to the fact that all the information in two dimensions can be used for the fit. To illustrate this, three slices and the corresponding fits along the x axis of the spot are shown in Figure 5.3e-g. The center position fluctuates by >20 nm for these three fits due to the limited SNR. A two-dimensional fit, however, provides much higher accuracy, because one effectively fits all slices in every possible direction simultaneously. This approach has been employed in several recent investigations, including the study of lipid diffusion inside supported membrane bilayers [50], and of the mechanism of the molecular motor kinesin stepping along microtubules [51]. While the former study showed a maximum localization accuracy of 40 nm, the latter state-of-the-art measurements succeeded in realizing molecular resolution (1.5 nm) with sub-second time resolution.
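A minimal sketch of this fitting procedure is given below: a diffraction-limited spot is simulated as a 2-D Gaussian with Poisson (shot) noise, and its center is recovered by least-squares fitting. All numbers (spot width, photon counts, true center) are illustrative; the point is that the fitted center is determined far more precisely than the width of the spot itself.

# Localization by PSF fitting: simulate a noisy spot, fit a 2-D Gaussian.
import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(1)

def gauss2d(coords, x0, y0, sigma, amp, offset):
    x, y = coords
    return amp * np.exp(-((x - x0)**2 + (y - y0)**2) / (2 * sigma**2)) + offset

# Synthetic image: ~250 nm-wide spot, true center at (12, -35) nm.
px = np.linspace(-500, 500, 41)   # pixel coordinates in nm
xx, yy = np.meshgrid(px, px)
expected = gauss2d((xx, yy), 12.0, -35.0, 125.0, 100.0, 10.0)
image = rng.poisson(expected).astype(float)   # shot noise

popt, pcov = curve_fit(
    lambda c, *p: gauss2d(c, *p).ravel(),
    (xx, yy), image.ravel(),
    p0=(0.0, 0.0, 100.0, 80.0, 5.0))
err = np.sqrt(np.diag(pcov))
print(f"fitted center: ({popt[0]:.1f}, {popt[1]:.1f}) nm "
      f"+/- ({err[0]:.1f}, {err[1]:.1f}) nm")   # few-nm precision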
As can be seen from the previous discussion, the localization accuracy is critically dependent on the SNR with which the PSF can be measured. The limitation arises from the finite number of detectable photons per molecule. The longer the integration time, the higher the accuracy, but also the lower the time resolution. As a rule of thumb, an SNR of 10 is required for a localization accuracy of 10 nm, which translates into roughly a time resolution on the order of several to tens of milliseconds [49, 52]. This makes tracking beyond video rates difficult if the object is to remain visible against the background, especially for in vivo imaging [47]. In addition, the total tracking time is limited to a few seconds because of photobleaching. These issues can only be addressed by using labels with no limitations on the number of emitted photons and on photostability. Nevertheless, single molecules have been used successfully to track individual biological nano-objects in real time. An excellent example is given in Figure 5.4, where single adenoviruses were labeled with single dye molecules and then observed before, during and after cell entry [47].
Inorganic quantum dots have become popular as labels in fluorescence microscopy because of their brightness and extreme photostability [53]. Their inherent toxicity is commonly mitigated through the use of protecting layers, and they have been used successfully in intracellular and in vivo studies [54]. A major disadvantage of these labels is that, so far, their emission switches off intermittently at unpredictable times and for unknown durations, a process known as photoblinking. In addition, once passivated, they can become as large as 15-20 nm.
5.2.4
Getting Around the Rayleigh Limit: Colocalization of Multiple Emitters
photobleaching, such as that shown in Figure 5.5a. After the first bleaching event, only m2 is visible, and this can now be localized with high precision (as described previously). Subtracting the contribution of m2 from the initial image, where both emitters are present, results in an image representative of m1 (Figure 5.5b), which can again be localized with high precision. Thus, it is possible to determine the positions of several emitters with near-molecular resolution (down to 1.5 nm). This technique is extremely useful and precise for imaging fairly simple samples containing few fluorophores. However, for general applications such as those required for in vivo imaging, where many labels are present, it quickly reaches its limit.
This barrier has recently been lifted by approaches proposed by Betzig and coworkers [58] and by Zhuang and colleagues [59], based on photoswitchable fluorophores. These methods have been named PALM (photoactivated localization microscopy) and STORM (stochastic optical reconstruction microscopy), respectively. Here, the problem of multiple fluorophores emitting simultaneously is eliminated by initially illuminating the sample in the near-UV (405 nm) at low light levels. This causes a small and stochastically distributed fraction of the total molecules to convert photochemically into an active state. Illumination of the sample in the yellow region (561 nm) then causes the photoactivated molecules to fluoresce. By observing and fitting the emission pattern from each of these molecules, they can be localized with high precision. Those molecules are subsequently bleached, and the process is repeated until the entire sample has been imaged. The resulting images are of spectacular clarity and resolution, especially when compared to standard confocal images, such as in Figure 5.6a-d. The main disadvantage of this approach is currently the low time resolution imposed by the stochastic activation of a small number of emitters and by the reliance on the destruction of the activated signal.
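The activate-localize-bleach cycle just described can be summarized in a few lines of Python. The simulated structure (labels along a line), the activation statistics and the 15 nm localization precision below are illustrative assumptions, not parameters from Refs [58, 59]; the sketch only captures the logic of accumulating sparse localizations into a super-resolved data set.

# Conceptual PALM/STORM loop: activate a sparse subset, localize, bleach.
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical ground truth: 500 labels along a 1 um line (positions in nm).
truth = np.column_stack([np.linspace(0, 1000, 500), np.zeros(500)])

remaining = list(range(len(truth)))
localizations = []
loc_precision = 15.0   # nm, assumed per-emitter fitting precision

while remaining:
    # Photoactivate a small stochastic fraction so that spots do not overlap.
    k = min(len(remaining), rng.poisson(5) + 1)
    for i in rng.choice(remaining, size=k, replace=False):
        # Each isolated emitter is fitted; add the fit error, then bleach it.
        localizations.append(truth[i] + rng.normal(0.0, loc_precision, 2))
        remaining.remove(i)

locs = np.array(localizations)
print(f"{len(locs)} localizations, transverse spread {locs[:, 1].std():.1f} nm")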
The highest spatial resolution in three dimensions has been achieved at cryogenic temperatures, taking advantage of spectral selectivity [60]. When cooled to a few kelvin, the absorption and emission lines of single molecules become extremely narrow and highly sensitive to their local environment. Single molecules can then be excited individually, and therefore localized with nanometer precision. While, in principle, this approach should be extendable to biological studies, none has so far been reported, due to the high degree of experimental complexity compared to room-temperature studies.
One of the most successful and widespread approaches to achieving spatial information far below the Rayleigh limit involves taking advantage of fluorescence resonance energy transfer (FRET). The principle of this approach is depicted schematically in Figure 5.7. It uses two chromophores, a donor (D) and an acceptor (A), with overlapping absorption and emission bands (Figure 5.7b). When the two emitters are well separated, excitation of the donor will lead to observable emission only from the donor. However, when the two fluorophores are brought into close proximity of each other, the excitation energy originally placed in the donor is efficiently transferred to the acceptor, due to the overlapping absorption and emission bands, through Förster energy transfer (Figure 5.7c). This indirect excitation of the acceptor leads to fluorescence emission of the acceptor (i.e. far red-shifted compared to that of the donor). Commonly, the emission channels of both the donor and the acceptor are monitored simultaneously by the use of appropriate beam splitters. Thus, a dynamic system where the distance of the two labels changes with time, for example due to conformational changes of a protein, will exhibit alternating emission from donor and acceptor [61]. The strong distance dependence (∝ R⁻⁶) of the energy transfer efficiency makes FRET an excellent molecular ruler on the sub-10 nm length scale (Figure 5.7d).
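Quantitatively, the transfer efficiency E falls with the sixth power of the donor-acceptor distance R,

E = 1/[1 + (R/R₀)⁶]

where R₀, the distance of 50% transfer (typically a few nanometers), is set by the spectral overlap and the relative dipole orientations. Taking an illustrative R₀ = 5 nm, E(2.5 nm) ≈ 0.98, E(5 nm) = 0.50 and E(10 nm) ≈ 0.015, which makes clear why FRET reports distances so sharply on the sub-10 nm scale.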
5.3
Detection of Non-Fluorescent Single Nano-Objects
absorption cross-section. For scattering detection, the situation is even worse, because every object in the focal volume scatters light, and the spectral selectivity available in fluorescence detection is lost, making low-background measurements a true challenge. There are two possible solutions to this problem: (i) the number of background scatterers in the focal volume is minimized; or (ii) the scattering signal from the particles of interest is somehow increased. The former has been the traditional approach to detecting small gold nanoparticles through dark-field or total internal reflection microscopy [14]. In this case, the scattered light intensity depends on the square of the polarizability of the particle which, in the electrostatic approximation, can be written as [64]:

α(λ) = (πd³/2) × (εp − εm)/(εp + 2εm)

where d is the particle diameter and εp and εm denote the dielectric functions of the particle and the embedding medium, respectively.
Since detection occurs in the epi-direction, we are interested in the light returning through the microscope objective. The detector will therefore see the incident light (Ei) reflected at the interface between the glass and the medium. The reflected field reads Er = rEi exp(iπ/2), where r is the field reflectivity and π/2 denotes the Gouy phase of the reflected focused beam. In addition, the light scattered by the particle can be written at the detector as Es = sEi = |s| exp(iφ)Ei, where s is proportional to the particle polarizability and therefore to d³. Here, φ signifies the phase change on scattering. The intensity, ID, measured at the detector is thus

ID = |Er + Es|² = |Ei|² [r² + |s|² + 2|r||s| sin φ]
We can use this equation to illustrate some of the factors discussed above. Dark-field or total internal reflection detection schemes are designed in such a way that r → 0, and therefore only the scattering term, |s|², is detected. This signal drops very rapidly below the background level (|rEi|²) for particles <30 nm. In this case, the nature of the observed signal depends on the relative magnitudes of the three terms in the above equation. For large particles, the scattering term, |s|², dominates, and the particles appear bright against the background (Figure 5.9). As the particle size decreases, |s|² becomes negligible compared to the other two terms, and only the reflection and interference terms contribute to the detected intensity. The particles appear dark against the background, due to the destructive interference between the scattered and the reflected beams caused by the π/2 Gouy phase shift of the reflected incident beam (Figure 5.10a and b) [71]. The change-over from bright to dark occurs according to the relative magnitudes of the scattering and interference terms, and therefore occurs
Figure 5.10 Interferometric images of gold nanoparticles spin-coated onto glass coverslips. (a) 20 nm; (b) 10 nm; (c) 5 nm diameter. Representative particle cross-sections, as well as intensity histograms, are provided in each case. Total acquisition time in each case 10 s; incident power 2 mW.
earlier for larger r: for example, at 40 nm for air as the surrounding medium, 30 nm
for water, and 15 nm for oil that is index-matched fairly well to the glass coverslip.
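This crossover can be reproduced with a short numerical sketch of the competing terms. The field reflectivity below corresponds to a glass-water interface; the scattering prefactor is an arbitrary choice that pins the crossover near d = 30 nm for water, as quoted above, so the absolute numbers are illustrative while the d³ scaling of s (and hence d⁶ scaling of the pure scattering term) is generic.

# Detected terms in ID = |Ei|^2 [r^2 + |s|^2 + 2|r||s| sin(phi)], s ~ d^3.
r = 0.067            # field reflectivity of a glass-water interface
k = 2 * r / 30.0**3  # assumed prefactor: makes s**2 equal 2*r*s at d = 30 nm

for d in (10, 20, 30, 50, 100):    # particle diameter in nm
    s = k * d**3                   # scattered field amplitude
    scattering = s**2              # pure scattering ("bright") term
    interference = 2 * r * s       # cross term (sign set by the Gouy phase)
    regime = ("scattering-dominated" if scattering > interference
              else "interference-dominated")
    print(f"d = {d:3d} nm: s^2 = {scattering:.2e}, "
          f"2r|s| = {interference:.2e} -> {regime}")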
5.3.2.1 Is it Possible to Detect Molecule-Sized Labels?
To explore the theoretical sensitivity limitations, it is useful to consider the origin of the true noise background that limits the detection sensitivity. This is governed mostly by the noise of the light source itself and the noise of the detector. Both will result in fluctuations in the detected reflected intensity, r, which is the major contributor to the overall signal at the detector. Other potential noise sources, such as mechanical instabilities of the microscope or beam-pointing instability of the light source, are comparatively small and easily corrected for in post-acquisition image processing.
An incident power of 1 mW on the sample will yield about 3 μW of light reaching the detector, taking into account the reflectivity of the glass-water interface and losses due to the limited transmission of optics such as the microscope objective. The ideal detectors for such light intensities are photodiodes, which produce a corresponding photocurrent of about 1 μA. The shot-noise limit for this photocurrent is on the order of 500 fA Hz⁻¹/², which is about an order of magnitude above the noise of available amplifiers with 10⁷ V/A gain, suggesting that shot-noise-limited detection is possible. At this amplification, and at a realistic detection bandwidth for mechanical scanning of 1 kHz, the electronic shot noise amounts to 1.5 × 10⁻⁵ rms, which is a factor of 300 below the magnitude of the signal observed for 5 nm gold particles. A factor of three reduction in size, on the other hand, which would lead to molecular-sized labels on the order of 1.3 nm, brings with it a factor of 27 reduction in signal intensity. Thus, such molecular-sized labels should be observable with an SNR of 10 at kHz bandwidths, with localization accuracies down to 10 nm!
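These noise figures can be checked with a few lines of arithmetic; the photodiode responsivity is an assumed typical value (~0.4 A/W), while the remaining numbers are those quoted above.

# Shot-noise check for the reflected-light photocurrent.
import math

power_on_detector = 3e-6   # W, reflected light reaching the photodiode
responsivity = 0.4         # A/W, assumed typical silicon photodiode
q = 1.602e-19              # C, elementary charge
bandwidth = 1e3            # Hz, mechanical-scanning detection bandwidth

photocurrent = power_on_detector * responsivity   # ~1 uA
noise_density = math.sqrt(2 * q * photocurrent)   # A per sqrt(Hz)
relative_rms = math.sqrt(2 * q * bandwidth / photocurrent)

print(f"photocurrent        : {photocurrent:.2e} A")
print(f"shot-noise density  : {noise_density * 1e15:.0f} fA/sqrt(Hz)")  # ~550
print(f"relative rms @ 1 kHz: {relative_rms:.1e}")                      # ~1.6e-5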
The previous discussion has shown that neither shot noise nor detector noise limits the detectability of such small labels. One other critical noise source remains: the light source itself. Lasers used in confocal microscopes show optical noise on the order of a small percentage over a wide frequency range. Even state-of-the-art, solid-state, diode-pumped lasers rarely perform better than 0.1%. However, external stabilization using optical fibers, acousto-optic modulators and feedback loops has been shown to reduce laser noise from a small percentage to 5 × 10⁻⁵ rms at kHz bandwidth [72], which is comparable to the electronic shot noise calculated above. In addition, the use of single-mode fibers in this stabilization scheme significantly reduces the effects of beam-pointing and mode instabilities, further contributing to the overall stability of the system. Given these simple calculations, it becomes evident that the rapid detection of molecular-sized gold scatterers should be possible. We are currently pursuing such experiments.
All of the images presented in Figure 5.10 were obtained by scanning the sample across the focus using a piezo translation stage that requires 1-10 s per image. By using scanning mirrors rather than a piezo stage, we have shown previously that it is possible to detect 20 nm particles with up to MHz bandwidths, three orders of magnitude above what is possible using single molecules as labels [73]. In addition, rather than scanning the focus across the surface, the use of a feedback loop enables one to lock the focus onto a particle and follow its movements rapidly. The feedback loop is fed by the signal recorded on a four-quadrant detector, where the movement of the particle inside the focus leads to changes in the measured differential voltages. Using the same detector and amplifier combination as above, which provides MHz detection bandwidths, the shot noise increases to 5 × 10⁻⁴ with mW incident powers. In this way, the tracking of labels as small as 5 nm with MHz bandwidths should be possible.
The sensitivity limitations of the technique are illustrated in Figure 5.10c, which shows a confocal image of 5 nm particles spin-coated onto a glass coverslip and covered by water. As can be seen in the image, the particles are visible against the background with a signal contrast on the order of 3 × 10⁻³. The distribution width of the signal intensities is in agreement with the manufacturer's specifications with regard to the size of the gold particles, confirming that we are indeed observing single particles. A close inspection of Figure 5.10b and c reveals the presence of a rather noisy background as the size of the particles, and thus their signal magnitude, decreases. However, these features are not noise, as they are perfectly reproducible in sequential images. Rather, these patterns are due to the surface roughness of the glass coverslips used. Indeed, AFM measurements have shown that the surface roughness amounts to a few nanometers over a few microns. The reproducibility of such nanometer-sized surface roughness demonstrates the excellent sensitivity of this technique to nonmetallic species.
5.3.2.2 The Needle in the Haystack: Finding and Identifying Gold
So far, we have been concerned mostly with detecting and tracking the smallest possible gold nanoparticles. An interesting point to address is how such small labels can be detected in the presence of much larger scatterers, for example in intracellular imaging. Fortunately, gold nanoparticles have a type of built-in identification card in the form of a plasmon resonance in the visible region of the electromagnetic spectrum (Figure 5.11a). As a result, one obtains roughly twice the scattering intensity in the green (532 nm) compared to the blue (488 nm) or red (>560 nm) regions. This wavelength-dependent scattering intensity is in contrast to the constituents of typical biological samples, where the scattering should be roughly identical for both wavelengths.
To demonstrate the possibility of using this interesting feature of gold nanoparticles, we have labeled microtubules with 40 nm gold particles and obtained scattering images simultaneously in the blue and green (Figure 5.11b and c). In the two images, both the nanoparticles and the microtubule are clearly visible. However, when the two images are subtracted from each other (Figure 5.11d), the microtubule disappears while the particles remain. One can thus use this form of differential spectral contrast to ensure that the observed particles are indeed the gold labels of interest and not other scatterers [69].
5.3.3
Combining Scattering and Fluorescence Detection: A Long-Range Nanoscopic Ruler
example where the combination of the two leads to a potentially useful technique. It has been shown previously that nanostructures brought into close vicinity of single emitters can cause enhancement of luminescence and Raman scattering [75]. We have shown recently that a single gold nanoparticle can enhance the fluorescence of a single molecule and the decay rate of its excited state by a factor of 20 [76]. Furthermore, we demonstrated that this strong fluorescence modification is a function of the particle-emitter separation, with nanometer sensitivity.
The mechanism of this effect can be intuitively explained as the near-field interaction of the molecular dipole moment with its image dipole induced in the gold nanoparticle. The dipole-dipole character of this interaction gives rise to a strong distance dependence, much in the same way as in FRET (see also Figure 5.10). However, in this case the interaction drops off much more softly than the 1/r⁶ dependence observed in FRET. Thus, the modification of the fluorescence lifetime close to a nanoparticle can be used as a nanoscopic ruler for distances larger than those accessible to FRET (>10 nm).
To demonstrate this, we have performed studies of systems where a single molecule is linked to a gold nanoparticle by DNA double strands of differing lengths [77]. The techniques of single-molecule detection and microscopy of gold nanoparticles were combined to locate such molecule-particle pairs. The corresponding confocal scans of single functionalized gold nanoparticles of 15 nm diameter are shown in Figure 5.12a (scattering) and b (fluorescence). As can be seen in the figure, only a fraction of the gold particles contains a fluorescent marker. Figure 5.12c demonstrates the dependence of the fluorescence lifetime on linker length. In the absence of a gold nanoparticle, the fluorescence lifetime of the molecule was 3 ns, but this was reduced to about 0.6 ns for a 15 nm-long DNA linker consisting of 44 base pairs. The interaction length and its slope can be tuned by choosing the particle size and the emission wavelength of the dye molecule. The precision of such a nanoscopic ruler is on the order of 1 nm, and is limited by the accuracy with which the fluorescence lifetime can be determined [77].
5.3.4
Label-Free Detection of Biological Nano-Objects
5.4
Summary and Outlook
this chapter will soon become somewhat of an antique. In fact, since the first conception of this chapter, video-rate fluorescence imaging with a focal spot of approximately 60 nm in living cells has been achieved using RESOLFT, and this has resulted in some impressive images of synaptic vesicle movement [79]. Furthermore, the initial 2-D studies with PALM and STORM have now been extended to 3-D imaging, with a lateral resolution of 20-30 nm and an axial resolution of 50-60 nm [80]. Finally, single-molecule detection has been successfully extended to the investigation of single labels, such as semiconductor quantum dots, in absorption at room temperature. This will surely open the way to optical nanoscopy without a need for efficient fluorescent labels [74].
Acknowledgments
References
1 Preston, G.M. and Agre, P. (1991) Isolation of the cDNA for erythrocyte integral membrane protein of 28 kilodaltons: member of an ancient channel family. Proceedings of the National Academy of Sciences of the United States of America, 88, 11110.
2 Doyle, D.A., Cabral, J.M., Pfuetzner, R.A.,
Kuo, L., Gulbis, J.M., Cohen, S.L., Chait,
B.T. and MacKinnon, R. (1998) The
structure of the potassium channel: molecular basis of K⁺ conduction and selectivity. Science, 280, 69.
3 Palczewski, K., Kumasaka, T., Hori, T.,
Behnke, C.A., Motoshima, H., Fox, B.A., Le
Trong, I., Teller, D.C., Okada, T., Stenkamp,
R.E., Yamamoto, M. and Miyano, M. (2000)
Crystal structure of rhodopsin: a G protein-coupled receptor. Science, 289, 739.
4 Cramer, P., Bushnell, D.A. and Kornberg, R.D. (2001) Structural basis of transcription: RNA polymerase II at 2.8 Å resolution. Science, 292, 1863.
11 Horio, T. and Hotani, H. (1986) Visualization of the dynamic instability of individual microtubules by dark-field microscopy. Nature, 321, 605.
12 Prieve, D.C., Luo, F. and Lanni, F. (1987) Brownian motion of a hydrosol particle in a colloidal force field. Faraday Discussions, 83, 297.
13 Joos, U., Biskup, T., Ernst, O., Westphal, I.,
Gherasim, C., Schmidt, R., Edinger, K.,
Pilarczyk, G. and Duschl, C. (2006)
Investigation of cell adhesion to structured surfaces using total internal reflection fluorescence and confocal laser scanning microscopy. European Journal of Cell Biology, 85, 225.
14 Sonnichsen, C., Geier, S., Hecker, N.E.,
von Plessen, G., Feldmann, J., Ditlbacher,
H., Lamprecht, B., Krenn, J.R., Aussenegg,
F.R., Chan, V.Z.H., Spatz, J.P. and Moller,
M. (2000) Spectroscopy of single metallic
nanoparticles using total internal
reflection microscopy. Applied Physics
Letters, 77, 2949.
15 Minsky, M. (1988) Memoir on inventing
the confocal scanning microscope.
Scanning, 10, 128.
16 Wilson, T. (1990) Confocal Microscopy,
Academic Press, London.
17 Nomarski, G. (1955) Nouveau dispositif pour l'observation en contraste de phase différentiel. Journal de Physique et le Radium, 16, S88.
18 Murphy, D.B. (2001) Fundamentals of Light
Microscopy and Electronic Imaging, Wiley-Liss, New York.
19 Coons, A.H. and Kaplan, M.H. (1950)
Localization of antigen in tissue cells 2.
Improvements in a method for the
detection of antigen by means of
fluorescent antibody. The Journal of
Experimental Medicine, 91, 1.
20 Chalfie, M., Tu, Y., Euskirchen, G.,
Ward, W.W. and Prasher, D.C. (1994)
Green fluorescent protein as a marker
for gene-expression. Science, 263, 802.
21 Axelrod, D., Koppel, D.E., Schlessinger, J.,
Elson, E. and Webb, W.W. (1976) Mobility
measurement by analysis of fluorescence
photobleaching recovery kinetics.
Biophysical Journal, 16, 1055.
56 Lacoste, T.D., Michalet, X., Pinaud, F.,
Chemla, D.S., Alivisatos, A.P. and Weiss,
S. (2000) Ultrahigh-resolution multicolor
colocalization of single fluorescent probes.
Proceedings of the National Academy of
Sciences of the United States of America, 97,
9461.
57 Gordon, M.P., Ha, T. and Selvin, P.R.
(2004) Single-molecule high-resolution
imaging with photobleaching. Proceedings
of the National Academy of Sciences of the
United States of America, 101, 6462.
58 Betzig, E., Patterson, G.H., Sougrat, R.,
Lindwasser, O.W., Olenych, S.,
Bonifacino, J.S., Davidson, M.W.,
Lippincott-Schwartz, J. and Hess, H.F.
(2006) Imaging intracellular fluorescent
proteins at nanometer resolution. Science,
313, 1642.
59 Rust, M.J., Bates, M. and Zhuang, X.W.
(2006) Sub-diffraction-limit imaging by
stochastic optical reconstruction
microscopy (STORM). Nature Methods, 3,
793.
60 Hettich, C., Schmitt, C., Zitzmann, J.,
Kühn, S., Gerhardt, I. and Sandoghdar, V.
(2002) Nanometer resolution and coherent
optical dipole coupling of two individual
molecules. Science, 298, 385.
61 Ha, T., Enderle, T., Ogletree, D.F., Chemla,
D.S., Selvin, P.R. and Weiss, S. (1996)
Probing the interaction between two single
molecules: fluorescence resonance energy
transfer between a single donor and a
single acceptor. Proceedings of the National
Academy of Sciences of the United States of
America, 93, 6264.
62 Kusumi, A., Nakada, C., Ritchie, K.,
Murase, K., Suzuki, K., Murakoshi, H.,
Kasai, R.S., Kondo, J. and Fujiwara, T.
(2005) Paradigm shift of the plasma
membrane concept from the two-dimensional continuum fluid to the
partitioned fluid: high-speed single-molecule tracking of membrane
molecules. Annual Review of Biophysics and
Biomolecular Structure, 34, 351.
63 Schultz, S., Smith, D.R., Mock, J.J. and
Schultz, D.A. (2000) Single-target
molecule detection with nonbleaching
multicolor optical immunolabels.
Proceedings of the National Academy of
Sciences of the United States of America, 97,
996.
6
Nanostructured Probes for In Vivo Gene Detection
Gang Bao, Phillip Santangelo, Nitin Nitin, and Won Jong Rhee
6.1
Introduction
The ability to image specific RNAs in living cells in real time can provide essential
information on RNA synthesis, processing, transport and localization, as well as on
the dynamics of RNA expression and localization in response to external stimuli.
Such an ability will also offer unprecedented opportunities for advancement in
molecular biology, disease pathophysiology, drug discovery and medical diagnostics.
Over the past decade or so, an increasing amount of evidence has come to light
suggesting that RNA molecules have a wide range of functions in living cells, from
physically conveying and interpreting genetic information, to essential catalytic roles,
to providing structural support for molecular machines, and to gene silencing.
These functions are realized through control of the expression level and stability,
both temporally and spatially, of specific RNAs in a cell. Therefore, determining the
dynamics and localization of RNA molecules in living cells will significantly impact
molecular biology and medicine.
Many in vitro methods have been developed to provide a relative (mostly semi-quantitative) measure of gene expression level within a cell population, by using
purified DNA or RNA obtained from cell lysates. These methods include the
polymerase chain reaction (PCR) [1], Northern hybridization (or Northern blotting) [2], expressed sequence tags (EST) [3], serial analysis of gene expression
(SAGE) [4], differential display [5] and DNA microarrays [6]. These technologies,
combined with the rapidly increasing availability of genomic data for numerous
biological entities, present exciting possibilities for the understanding of human
health and disease. For example, pathogenic and carcinogenic sequences are
increasingly being used as clinical markers for diseased states. However, the use
of in vitro methods to detect and identify foreign or mutated nucleic acids is often
difficult in a clinical setting, due to the low abundance of diseased cells in blood,
sputum and stool samples. Further, these methods cannot reveal the spatial and
temporal variation of RNA within a single cell.
Labeled linear oligonucleotide (ODN) probes have been used to study intracellular
mRNA via in situ hybridization (ISH) [7], in which cells are fixed and permeabilized to
increase the probe delivery efficiency. Unbound probes are removed by washing to
reduce the background and achieve specificity [8]. In order to enhance the signal level,
multiple probes targeting the same mRNA can be used [7], although fixation agents
and other supporting chemicals can have a considerable effect on the signal level [9]
and possibly also on the integrity of certain organelles, such as mitochondria.
Thus, the fixation of cells (by using either crosslinking or denaturing agents) and
the use of proteases in ISH assays may prevent an accurate description of intracellular
mRNA localization from being obtained. It is also difficult to obtain a dynamic picture
of gene expression in cells using ISH methods.
Of particular interest is the fluorescence imaging of specific messenger RNAs
(mRNAs) in terms of both their expression level and subcellular localization in
living cells. As shown schematically in Figure 6.1, for eukaryotic cells a pre-mRNA
molecule is synthesized in the cell nucleus. After processing (including splicing and
polyadenylation), the mature mRNAs are transported from the cell nucleus to the
cytoplasm, and often are localized at specific sites. The mRNAs are then translated by
ribosomes to produce specific proteins, and then degraded by RNases after a certain
period of time. The limited lifetime of mRNA enables a cell to alter its protein
synthesis rapidly, and in response to its changing needs. During the entire life cycle of
an mRNA, it is always complexed with RNA-binding proteins to form a ribonucleoprotein (RNP). This has significant implications for the live-cell imaging of mRNAs
(as discussed below).
To detect RNA molecules in living cells, with not only high specificity but also high
sensitivity and signal-to-background ratio, it is important that the probes recognize
RNA targets with high specificity, convert target recognition directly into a measurable
signal, and differentiate between true and false-positive signals. This is especially
important for low-abundance genes and clinical samples containing only a small
number of diseased cells. It is also important for the probes to quantify low gene
expression levels with great accuracy, and have fast kinetics in tracking alterations
in gene expression in real time. For detecting genetic alterations such as mutations,
insertions and deletions, the ability to recognize single nucleotide polymorphisms
(SNPs) is essential. In order to achieve this optimal performance, it is necessary to
have a good understanding of the structure–function relationship of the probes,
the probe stability and the RNA target accessibility in living cells. It is also necessary
to achieve an efficient cellular delivery of probes, with minimal probe degradation.
In the following sections we will review the fluorescent probes that are most often
used for RNA detection, and discuss the critical issues in live-cell RNA detection,
including probe design, target accessibility, the cellular delivery of probes, as well as
detection sensitivity, specificity and signal-to-background ratio. Emphasis is placed
on the design and application of molecular beacons, although some of the issues are
common to other oligonucleotide probes.
6.2
Fluorescent Probes for Live-Cell RNA Detection
Several classes of molecular probes have been developed for RNA detection in living
cells, including: (i) tagged linear ODN probes; (ii) oligonucleotide hairpin probes; and
(iii) probes using fluorescent proteins as reporters. Although probes composed of
full-length RNAs (mRNA or nuclear RNA) tagged with a fluorescent or radioactive
reporter have been used to study the intracellular localization of RNA [10–12], they are
not discussed here as they cannot be used to measure the expression level of specific
RNAs in living cells.
6.2.1
Tagged Linear ODN Probes
Single fluorescently labeled linear oligonucleotide probes have been developed for
RNA tracking and localization studies in living cells [13–15]. Although these probes
may recognize specific endogenous RNA transcripts in living cells via Watson–Crick
base pairing, and thus reveal subcellular RNA localization, this approach lacks the
ability to distinguish background from true signal, as both bound probes (i.e. those
hybridized to RNA target) and unbound probes give a fluorescence signal. Such an
approach might also lack detection specificity, as a partial match between the probe
and target sequences could induce probe hybridization to RNA molecules of multiple
genes. A novel way to increase the signal-to-noise ratio (SNR) and improve detection
specificity is to use two linear probes with a fluorescence resonance energy transfer
(FRET) pair of (donor and acceptor) fluorophores [13]. However, the dual-linear
probe approach may still have a high background signal due to direct excitation of
the acceptor and emission detection of the donor fluorescence. Further, it is
difficult for linear probes to distinguish targets that differ by a few bases as the
difference in free energy of the two hybrids (with and without mismatch) is
typically rather small. This limits the application of linear ODN probes in biological
and disease studies.
6.2.2
ODN Hairpin Probes
Hairpin nucleic acid probes have the potential to be highly sensitive and specific in
live-cell RNA detection. As shown in Figure 6.2a and b, one class of such probes is that
of molecular beacons; these are dual-labeled oligonucleotide probes with a fluorophore at one end and a quencher at the other end [16]. They are designed to form a
stem–loop hairpin structure in the absence of a complementary target, so that the
fluorescence of the fluorophore is quenched. Hybridization with the target nucleic
acid opens the hairpin and physically separates the fluorophore from quencher,
allowing a fluorescence signal to be emitted upon excitation (Figure 6.2b). Under
optimal conditions, the fluorescence intensity of molecular beacons can increase
more than 200-fold upon binding to their targets [16], and this enables them to
function as sensitive probes with a high signal-to-background ratio. The stem–loop
hairpin structure provides an adjustable energy penalty for hairpin opening which
improves probe specificity [17, 18]. The ability to transduce target recognition directly
into a fluorescence signal with a high signal-to-background ratio, coupled with an
improved specificity, has allowed molecular beacons to enjoy a wide range of
biological and biomedical applications. These include multiple analyte detection,
real-time enzymatic cleavage assaying, cancer cell detection, real-time monitoring
of PCR, genotyping and mutation detection, viral infection studies and mRNA
detection in living cells [14, 19–32].
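As a rough illustration (our notation, not the chapter's), the signal-to-background ratio usually quoted for molecular beacons compares the fluorescence of the hybridized and closed probe, each corrected for the buffer background:

\[ \mathrm{S/B} = \frac{F_{\text{hybridized}} - F_{\text{buffer}}}{F_{\text{closed}} - F_{\text{buffer}}} \]

On this definition, the >200-fold intensity increase cited above corresponds to a signal-to-background ratio on the order of 200.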
As illustrated in Figure 6.2a, a conventional molecular beacon has four essential
components: loop, stem, fluorophore (dye) and quencher. The loop usually consists
of 15–25 nucleotides and is selected to have a unique target sequence and proper
melting temperature. The stem, which is formed by two complementary short-arm
sequences, is typically four to six bases long and chosen to be independent of the
target sequence (Figure 6.2a).
A novel design of hairpin probes is the wavelength-shifting molecular beacon, which
can fluoresce in a variety of different colors [33]. As shown in Figure 6.2c, in this design,
a molecular beacon contains two fluorophores (dyes): a first fluorophore that absorbs
strongly in the wavelength range of the monochromatic light source, and a second
fluorophore that emits at the desired emission wavelength due to fluorescence
resonance energy transfer from the first fluorophore to the second fluorophore. It
has been shown that wavelength-shifting molecular beacons are substantially brighter
than conventional molecular beacons, which contain a fluorophore that cannot
efficiently absorb energy from the available monochromatic light source.
One major advantage of the stem–loop hairpin probes is that they can recognize
their targets with higher specificity than can linear ODN probes. The results of
solution studies [17, 18] have suggested that, by using molecular beacons it is possible
to discriminate between targets that differ by a single nucleotide. In contrast to
current techniques for detecting SNPs, which are often labor-intensive and time-consuming, molecular beacons may provide a simple and promising tool for
detecting SNPs in disease diagnosis.
The basic features of molecular beacons versus fluorescence in situ hybridization
(FISH) are compared in Figure 6.3. Specifically, molecular beacons are dual-labeled
hairpin probes of 15–25 nt, while FISH probes are dye-labeled linear oligonucleotides
of 40–50 nt. The molecular beacon-based approach has the advantage of detecting
RNA in live cells, without the need for cell fixation and washing. However, it does
require the cellular delivery of probes and has a low target accessibility (this is
discussed below). The advantage of FISH assays is the ease of probe design due to
a better target accessibility. Although FISH assays can be used to image the
localization of mRNA in fixed cells, they rely on stringent washing to achieve signal
specificity, and do not have the ability to image the dynamics of gene expression in
living cells.
In the conventional molecular beacon design, the stem sequence is typically
independent of the target sequence (see Figure 6.2b), although sometimes two end
bases of the probe sequence, each adjacent to one arm sequence of the stem, could
be complementary with each other, thus forming part of the stem (the light blue
base of the stem shown in Figure 6.2a). Molecular beacons can also be designed
such that all the bases of one arm of the stem (to which a fluorophore is conjugated)
are complementary to the target sequence, thus participating in both stem formation and target hybridization (shared-stem molecular beacons) [34] (Figure 6.2d).
The advantage of this shared-stem design is to help fix the position of the
fluorophore that is attached to the stem arm, limiting its degree-of-freedom of
motion, and increasing the FRET in the dual-FRET molecular beacon design (as
discussed below).
A dual-FRET molecular beacon approach was developed [26–28] to overcome the
difficulty that, in live-cell RNA detection, molecular beacons are often degraded by
nucleases or opened due to nonspecific interactions with hairpin-binding proteins,
causing a significant amount of false-positive signal. In this dual-probe design, a
pair of molecular beacons, labeled with a donor and an acceptor fluorophore,
respectively, are employed (Figure 6.4). The probe sequences are chosen such that
this pair of molecular beacons hybridizes to adjacent regions on a single RNA target
(Figure 6.4). As FRET is very sensitive to the distance between donor and acceptor
fluorophores, and typically occurs when the donor and acceptor fluorophores are
within 10 nm, the FRET signal is generated by the donor and acceptor beacons
only if both probes are bound to the same RNA target. Thus, the sensitized
emission of the acceptor fluorophore upon donor excitation serves as a positive
signal in the FRET-based detection assay; this can be differentiated from non-FRET
false-positive signals due to probe degradation and nonspecific probe opening. This
approach combines the low background signal and high specificity of molecular
beacons with the ability of FRET assays to differentiate between true target
recognition and false-positive signals, leading to an enhanced ability to quantify
RNA expression in living cells [28].
6.2.3
Fluorescent Protein-Based Probes
tagged with the split fragments of a fluorescent protein, such that their binding to
the target mRNA results in the restoration of fluorescence [39]. The advantage of
this novel approach is that the background signal is low; there is no fluorescence
signal unless the RNA-binding proteins (or protein fragments) are bound to the
target mRNA. The split-GFP method, however, may have difficulties in tracking the
dynamics of RNA expression in real time, as reconstitution of the fluorescent
protein from the split fragments typically takes 2–4 h, during which time the RNA
expression level may change. Transfection efficiency may also be a major concern in
the GFP-based approaches, in that usually only a small percentage of the cells
express the fluorescent proteins following transfection. This limits the application
of the split-GFP methods in detecting diseased cells using mRNA as a biomarker
for the disease.
6.3
Probe Design and Structure–Function Relationships
6.3.1
Target Specificity
There are three major design issues of molecular beacons: probe sequence; hairpin
structure; and fluorophore/quencher selection. In general, the probe sequence is
selected to ensure specificity, and to have good target accessibility. The hairpin
structure, as well as the probe and stem sequences, are determined to have the proper
melting temperature, while the fluorophore–quencher pair should produce a high
signal-to-background ratio. To ensure specificity, for each gene to target, it is possible
to use the NCBI BLAST [41] or similar software to select multiple target sequences
of 15–25 bases that are unique to the target RNA. As the melting temperature of the
molecular beacons affects both the signal-to-background ratio and detection specificity (especially for mutation detection), it is often necessary to select the target
sequence with a balanced G-C content, and to adjust the loop and stem lengths and
the stem sequence of the molecular beacon to realize the optimal melting temperature. In particular, it is necessary to understand the effect of molecular beacon design
on melting temperature so that, at 37 °C, single-base mismatches in target mRNAs
can be differentiated. This is also a general issue for detection specificity in that, for
any specific probe sequence selected, there might be multiple genes in the mammalian genome that have sequences which differ from the probe sequence by only a few
bases. Therefore, it is important to design the molecular beacons so that only the
specific target RNA would produce a strong signal.
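As a concrete (and purely illustrative) sketch of such a pre-screen, the Python fragment below slides a window along an mRNA sequence and keeps candidates with a balanced G-C content and a nearest-neighbor melting temperature inside a chosen band. The window length and both bands are placeholder values of ours, Biopython's Tm_NN routine stands in for any melting-temperature model, and each surviving window would still be checked for uniqueness with NCBI BLAST as described above.

    from Bio.SeqUtils import MeltingTemp as mt

    def candidate_targets(mrna, length=20, gc_band=(0.40, 0.60), tm_band=(55.0, 65.0)):
        """Yield (position, sequence, Tm) for windows passing the GC and Tm filters."""
        for i in range(len(mrna) - length + 1):
            window = mrna[i:i + length]
            gc = (window.count("G") + window.count("C")) / length  # G-C fraction
            if not gc_band[0] <= gc <= gc_band[1]:
                continue
            tm = mt.Tm_NN(window)  # nearest-neighbor melting temperature, defaults
            if tm_band[0] <= tm <= tm_band[1]:
                yield i, window, round(tm, 1)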
Several approaches can be taken to validate the signal specificity. For example,
one could either upregulate or downregulate the expression level of a specific RNA,
quantify the level using RT-PCR, and then compare the PCR result with that of
molecular beacon-based imaging of the same RNA in living cells. However, complications may arise when the approach used to change the RNA expression level in
living cells has an effect on multiple genes, as this would lead to some ambiguity, even
when the PCR and beacon results match. It is possible that the best way to downregulate the level of a specific mRNA in live cells is to use small interfering RNA
(siRNA) treatment, which typically leads to a >80% reduction of the specific mRNA
level. As the effect of siRNA treatment varies depending on the specific probe used,
the siRNA delivery method and the cell type, optimization of the protocol (i.e. probe
design and delivery method/conditions) is often needed.
6.3.2
Molecular Beacon Structure–Function Relationships
The loop and stem lengths and their sequences are critical design parameters for molecular
beacons, since at any given temperature they largely control the fraction of molecular
beacons that are bound to the target [17, 18]. In many applications, the choices of the
probe sequence are limited by target-specific considerations, such as the sequence
surrounding a single nucleotide polymorphism (SNP) of interest. However, the
probe and stem lengths, and stem sequence, can be adjusted to optimize the
performance (i.e. specificity, hybridization rate and signal-to-background ratio) of
a molecular beacon for a specific application [17, 34].
In order to demonstrate the effect of molecular beacon structure on its melting
behavior, the melting temperature for molecular beacons with various stem–loop
structures is displayed in Figure 6.5a. In general, the melting temperature was found
to increase with probe length, but appeared to plateau at a length of 20 nucleotides.
The stem length of the molecular beacon was also found to have a major influence
on the melting temperature of the molecular beacon–target duplexes.
While both the stability of the hairpin probe and its ability to discriminate targets
over a wider range of temperatures increase with increasing stem length, this is
accompanied by a decrease in the hybridization on-rate constant (see Figure 6.5b).
For example, molecular beacons with a four-base stem had an on-rate constant up to
100-fold higher than did molecular beacons with a six-base stem. Changing the probe
length of a molecular beacon may also influence the rate of hybridization, as shown in
Figure 6.5b.
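The melting behavior summarized in Figure 6.5a can be rationalized with standard two-state hybridization thermodynamics (a textbook relation, not a formula given in this chapter). For a beacon–target duplex with hybridization enthalpy ΔH° and entropy ΔS°, the dissociation constant and the fraction of beacons bound at target concentration [T] are

\[ K_d(T) = \exp\!\left(\frac{\Delta H^\circ - T\,\Delta S^\circ}{RT}\right), \qquad \theta(T) = \frac{[T]}{[T] + K_d(T)}, \]

so a longer probe (more negative ΔH°) raises the melting temperature, while a more stable stem adds a competing closed state that shifts the effective equilibrium away from the bound form.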
The results of thermodynamic and kinetic studies showed that, if the stem length
was too large then it would be difficult for the beacon to open on hybridization. But, if
the stem length was too small, then a large fraction of beacons might open due to
thermal fluctuations. Likewise, and relative to the stem length, whilst a longer probe might
lead to a lower dissociation constant, it might also reduce the specificity, as the relative
free energy change due to a one-base mismatch would be smaller. A long probe length
may also lead to coiled conformations of the beacons, resulting in reduced kinetic
rates. Consequently, the stem and probe lengths must be carefully chosen in order
to optimize both hybridization kinetics and molecular beacon specificity [17, 34].
In general, molecular beacons with longer stem lengths have an improved ability to
discriminate between wild-type and mutant targets in solution, over a broader range
of temperatures. This effect can be attributed to the enhanced stability of the
molecular beacon stem–loop structure and the resulting smaller free energy difference between closed (unbound) molecular beacons and molecular beacon–target
duplexes, which generates a condition where a single-base mismatch reduces the
energetic preference of probe–target binding. Longer stem lengths, however, are
accompanied by a reduced probe–target hybridization kinetic rate. On a similar note,
molecular beacons with short stems have faster hybridization kinetics but suffer from
lower signal-to-background ratios compared to molecular beacons with longer stems.
6.3.3
Target Accessibility
One critical issue in molecular beacon design is target accessibility, as is the case
for most oligonucleotide probes for live-cell RNA detection. It is well known that a
functional mRNA molecule in a living cell is always associated with RNA-binding
proteins, thus forming an RNP. An mRNA molecule also often has double-stranded
portions and forms secondary (folded) structures (Figure 6.6). Therefore, when
designing a molecular beacon it is necessary to avoid targeting mRNA sequences
that are double-stranded, or occupied by RNA-binding proteins, for otherwise the
probe will have to penetrate into the RNA double strand or compete with the RNA-binding protein in order to hybridize to the target. In fact, molecular beacons
designed to target a specific mRNA often show no signal when delivered to living
cells. One difficulty in molecular beacon design is that, although predictions of
mRNA secondary structure can be made using software such as Beacon Designer
(www.premierbiosoft.com) and mfold (http://www.bioinfo.rpi.edu/applications/
mfold/old/dna/), they may be inaccurate due to limitations of the biophysical models
used, and the limited understanding of protein–RNA interactions. Therefore, for each
gene to be targeted it may be necessary to select multiple unique sequences along the
target RNA, and then to design, synthesize and test the corresponding molecular
beacons in living cells in order to select the best target sequence.
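As a rough sketch of how such a secondary-structure pre-screen can be scripted, the fragment below uses the ViennaRNA Python bindings (our substitution for the tools named above, purely for illustration) to flag long predicted single-stranded stretches as candidate accessible sites. As the text stresses, these predictions ignore bound proteins, so the candidates must still be validated in living cells.

    import re
    import RNA  # ViennaRNA package Python bindings

    def accessible_stretches(seq, min_len=15):
        """Return (position, subsequence) for predicted unpaired runs >= min_len bases."""
        structure, mfe = RNA.fold(seq)  # dot-bracket structure and minimum free energy
        # Runs of '.' in the dot-bracket string are predicted unpaired bases.
        return [(m.start(), seq[m.start():m.end()])
                for m in re.finditer(r"\.{%d,}" % min_len, structure)]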
In aiming to reveal the possible molecular beacon design rules, the accessibility of
BMP-4 mRNA was studied using different beacon designs [42]. Specifically, molecular
beacons were designed to target the start codon and termination codon regions, the
siRNA and anti-sense oligonucleotide probe sites (which were identified previously)
and also sites that were chosen at random. All of the target sequences are unique
to BMP-4 mRNA. Of the eight molecular beacons designed to target BMP-4 mRNA,
only two were found to produce a strong signal: one which targeted the start codon
region, and one which targeted the termination codon region. It was also found that,
even for a molecular beacon which functioned well, shifting its targeting sequence
by only a few bases towards the 3′ or 5′ ends caused a significant reduction in the
fluorescence signal from beacons in a live-cell assay. This indicated that the target
accessibility was quite sensitive to the location of the targeting sequence. These
results, together with molecular beacons validated previously, suggest that the start
and termination codon regions and the exon–exon junctions are more accessible than
other locations in an mRNA.
6.3.4
Fluorophores and Quenchers
With a correct backbone synthesis and fluorophore/quencher conjugation, a molecular beacon can in theory be labeled with any desired reporter–quencher pair.
The selection of a fluorophore label for a molecular beacon as reporter is normally less critical than the
hairpin probe design, as many conventional dyes can yield satisfactory results.
However, the correct selection of the reporter and quencher may yield additional benefits, such as an improved
signal-to-background ratio and multiplexing capabilities. As each molecular beacon
utilizes only one fluorophore, it is possible to use multiple molecular beacons in the
same assay, assuming that the fluorophores are chosen with minimal emission
overlap [19]. Molecular beacons can even be labeled simultaneously with two
fluorophores, that is, with wavelength-shifting reporter dyes (see Figure 6.2c),
allowing multiple reporter dye sets to be excited by the same monochromatic light
source but to fluoresce in a variety of colors [33]. Clearly, multicolor fluorescence
detection of different beacon/target duplexes may in time become a powerful tool for
the simultaneous detection of multiple genes.
For dual-FRET molecular beacons (see Figure 6.4), the donor fluorophores typically
emit at shorter wavelengths compared with the acceptor. Energy transfer then occurs
as a result of long-range dipole–dipole interactions between the donor and acceptor.
The efficiency of such energy transfer depends on the extent of the spectral overlap
of the emission spectrum of the donor with the absorption spectrum of the acceptor,
the quantum yield of the donor, the relative orientation of the donor and acceptor
transition dipoles [43], and the distance between the donor and acceptor molecules
(usually four to five bases). In selecting the donor and acceptor fluorophores so as to
create a high signal-to-background ratio, it is important to optimize the above
parameters, and to avoid direct excitation of the acceptor fluorophore at the donor
excitation wavelength. It is also important to minimize donor emission detection at
the acceptor emission detection wavelength. Examples of FRET dye pairs include Cy3
(donor) with Cy5 (acceptor), TMR (donor) with Texas Red (acceptor), and fluorescein
(FAM) (donor) with Cy3 (acceptor).
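The distance dependence invoked here is the standard Förster relation (not written out in the chapter): for a donor–acceptor separation r and Förster radius R0,

\[ E = \frac{R_0^6}{R_0^6 + r^6}. \]

At a spacing of four to five bases (roughly 1.4–1.7 nm, taking about 0.34 nm per base, a ballpark estimate on our part) and typical R0 values of 5–6 nm for dye pairs such as those listed, E is close to unity, which is why the sensitized acceptor emission provides such a robust positive signal.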
By contrast, it is relatively straightforward to select the quencher molecules.
Organic quencher molecules such as dabcyl, BHQ-2 (Black Hole Quencher 2)
(Biosearch Tech), BHQ-3 (Biosearch Tech) and Iowa Black (IDT) can all effectively
quench a wide range of fluorophores by both FRET and the formation of an exciton
complex between the fluorophore and the quencher [44].
6.4
Cellular Delivery of Nanoprobes
One of the most critical aspects of measuring the intracellular level of RNA molecules
using synthetic probes is the ability to deliver the probes into cells via the plasma
membrane, which itself is quite lipophilic and restricts the transport of large, charged
molecules. Thus, the plasma membrane serves as a very robust barrier to polyanionic
molecules such as hairpin oligonucleotides. Further, even if the probes enter the cells
successfully, the efficiency of delivery in an imaging assay should be defined not
only by how many probes enter the cell, or how many cells have probes internalized,
but also by how many probes remain functioning inside the cells. This is a different
situation from both antisense and gene delivery applications, where the reduction
in level of protein expression is the final metric used to define efficiency or success.
For measuring RNA molecules (including mRNA and rRNA) in the cytoplasm,
a large amount of the probe should remain in the cytoplasm.
Existing cellular delivery techniques can be divided into two categories, namely
endocytic and nonendocytic. Endocytic delivery typically employs cationic and polycationic molecules such as liposomes and dendrimers, whereas nonendocytic
methods include microinjection and the use of cell-penetrating peptides (CPPs) or
streptolysin O (SLO). Probe delivery via the endocytic pathway typically takes 2–4 h.
It has been reported that ODN probes internalized via endocytosis are predominantly
trapped inside endosomes and lysosomes, where they are degraded by the action of
cytoplasmic nucleases [45]. Consequently, only 0.01% to 10% of the probes remain
functioning after having escaped from endosomes and lysosomes [46].
Oligonucleotide probes (including molecular beacons) have been delivered into
cells via microinjection [47]. In most cases, the ODNs accumulated rapidly in
the cell nucleus, which prevented the probes from targeting mRNAs in the cell
cytoplasm. The depletion of intracellular ATP or lowering the temperature from
37 to 4 °C did not have any significant effect on ODN nuclear accumulation, thus
ruling out any active, motor protein-driven transport [47]. It is unclear if the rapid
transport of ODN probes to the nucleus is due to electrostatic interaction, or is driven
by a microinjection-induced flow, or the triggering of a signaling pathway. There is no
fundamental biological reason why ODN probes should accumulate in the cell
nucleus, but to prevent such accumulation streptavidin (60 kDa) molecules were
conjugated to linear ODN probes via biotin [13]. After being microinjected into the
cells, the dual-FRET linear probes could hybridize to the same mRNA target in the
cytoplasm, resulting in a FRET signal. More recently, it was shown that when transfer
RNA (tRNA) transcripts were attached to molecular beacons with a 2′-O-methyl
backbone and injected into the nucleus of HeLa cells, the probes were exported into
the cytoplasm. Yet, when these constructs were introduced into the cytoplasm,
they remained cytoplasmic [48]. However, even without the problem of unwanted
nuclear accumulation, microinjection is an inefficient process for delivering probes
into a large number of cells.
Another nonendocytic delivery method is that of toxin-based cell membrane
permeabilization. For example, SLO is a pore-forming bacterial toxin that has been
used as a simple and rapid means of introducing oligonucleotides into eukaryotic
cells [49–51]. SLO binds as a monomer to cholesterol and oligomerizes into a ring-shaped structure to form pores of approximately 25–30 nm in diameter, allowing the
influx of both ions and macromolecules. It was found that SLO-based permeabilization could achieve an intracellular concentration of ODNs which was approximately
10-fold that achieved with electroporation or liposomal-based delivery. As cholesterol
composition varies with cell type, however, the permeabilization protocol must be
optimized for each cell type by varying the temperature, incubation time, cell number
and SLO concentration. One essential feature of toxin-based permeabilization is that
it is reversible. This can be achieved by introducing oligonucleotides with SLO under
serum-free conditions and then removing the mixture and adding normal media
with the serum [50, 52].
Cell-penetrating peptides have also been used to introduce proteins, nucleic acids
and other biomolecules into living cells [53–55]. Included among the family of
peptides with membrane-translocating activity are antennapedia, HSV-1 VP22 and
the HIV-1 Tat peptide. To date, the most widely used peptides are the HIV-1 Tat peptide
and its derivatives, due to their small sizes and high delivery efficiencies. The Tat
peptide is rich in cationic amino acids (especially arginine, which is very common in
many CPPs); however, the exact mechanism of CPP-induced membrane translocation remains elusive.
A wide variety of cargos have been delivered to living cells, both in cell culture
and in tissues, using CPPs [56, 57]. For example, Allinquant et al. [58] linked the
antennapedia peptide to the 5′ end of DNA oligonucleotides (with biotin on the
3′ end) and incubated both peptide-linked ODNs and ODNs alone (as control) with
cells. By detecting biotin via a streptavidin–alkaline phosphatase amplification,
the peptide-linked ODNs were shown to be internalized very efficiently into all cell
compartments compared to control ODNs. Moreover, no indication of endocytosis
was found. Similar results were obtained by Troy et al. [59], with a 100-fold increase in
antisense delivery efficiency when the ODNs were linked to antennapedia peptides.
Recently, Tat peptides were conjugated to molecular beacons using different linkages
(Figure 6.7); the resultant peptide-linked molecular beacons were delivered into
living cells to target glyceraldehyde phosphate dehydrogenase (GAPDH) and survivin
mRNAs [29]. It was shown that, at relatively low concentrations, peptide-linked
molecular beacons were internalized into living cells within 30 min, with near-100%
efficiency. Further, peptide-based delivery did not interfere with either specific
targeting by, or hybridization-induced fluorescence of, the probes. In addition, the
peptide-linked molecular beacons were seen to possess self-delivery, targeting and
reporting functions. In contrast, the liposome-based (Oligofectamine) or dendrimer-based (SuperFect) delivery of molecular beacons required 3–4 h and resulted in a
punctate fluorescence signal in the cytoplasmic vesicles and a high background in
both the cytoplasm and nucleus of cells [29]. These results showed clearly that the
cellular delivery of molecular beacons using a peptide-based approach is far more
effective than conventional transfection methods.
6.5
Living Cell RNA Detection Using Nanostructured Probes
6.5.1
Biological Significance
6.6
Engineering Challenges in New Probe Development
<1000) are observed. Therefore, the average copy number per cell may change with
the total number of cells observed due to the (often large) cell-to-cell variation of
mRNA expression.
Another issue in living cell gene detection using hairpin ODN probes is the
possible effect of probes on normal cell function, including protein expression.
As has been revealed in antisense therapy research, the complementary pairing of
a short segment of an exogenous oligonucleotide to mRNA can have a profound
impact on protein expression levels, and even cell fate. For example, tight binding of
the probe to the translation start site may block mRNA translation. Binding of a DNA
probe to mRNA can also trigger RNase H-mediated mRNA degradation. However,
the probability of eliciting antisense effects with hairpin probes may be very low when
low concentrations of probes (<200 nM) are used for mRNA detection, in contrast to
the high concentrations (typically 20 µM [51]) employed in antisense experiments.
Further, it generally takes 4 h before any noticeable antisense effect occurs, whereas
the visualization of mRNA with hairpin probes requires less than 2 h after delivery.
However, it is important to carry out a systematic study of the possible antisense
effects, especially for molecular beacons with a 2′-O-methyl backbone, which may
also trigger unwanted RNA interference.
As a new approach for in vivo gene detection, nanostructured probes can be further
developed to have an enhanced sensitivity and a wider range of applications. For
example, it is likely that hairpin ODN probes with a quantum dot as the fluorophore
will have a better ability to track the transport of individual mRNAs from the cell
nucleus to the cytoplasm. Hairpin ODN probes with a near-infrared (NIR) dye as the
reporter, combined with peptide-based delivery, have the potential to detect specific
RNAs in tissue samples, animals or even humans. It is also possible to use a lanthanide
chelate as the donor in a dual-FRET probe assay and to perform time-resolved
measurements to dramatically increase the SNR, thus achieving high sensitivity
while detecting low-abundance genes. Although very challenging, the development
of these and other nanostructured ODN probes will significantly enhance our ability
to image, track and quantify gene expression in vivo, and provide a powerful tool for
basic and clinical studies of human health and disease.
There are many possibilities for nanostructured probes to become clinical tools for
disease detection and diagnosis. For example, molecular beacons could be used to
perform cell-based early cancer detection using clinical samples such as blood, saliva
and other body fluids. In this case, cells in the clinical sample are separated, while the
molecular beacons designed to target specific cancer genes are delivered to the cell
cytoplasm for detecting mRNAs of the cancer biomarker genes. Cancer cells having
a high level of the target mRNAs (e.g. survivin) or mRNAs with specific mutations
that cause cancer (e.g. K-ras codon 12 mutations) would show high levels of
fluorescence signal, whereas normal cells would show just a low background signal.
This would allow cancer cells to be distinguished from normal cells. When using
this approach, the target mRNAs would not be diluted compared to approaches
using a cell lysate, such as PCR. Thus, molecular beacon-based assays have the
potential for the positive identification of cancer cells in a clinical sample, with high
specificity and sensitivity. It might also be possible to detect cancer cells in vivo
Acknowledgments
These studies were supported by the National Heart Lung and Blood Institute of
the NIH as a Program of Excellence in Nanotechnology (HL80711), by the National
Cancer Institute of the NIH as a Center of Cancer Nanotechnology Excellence
(CA119338), and by the NIH Roadmap Initiative in Nanomedicine through a
Nanomedicine Development Center award (PN2EY018244).
References
1 Saiki, R.K., Scharf, S., Faloona, F.,
Mullis, K.B., Horn, G.T., Erlich, H.A. and
Arnheim, N. (1985) Science, 230, 1350.
2 Alwine, J.C., Kemp, D.J., Parker, B.A.,
Reiser, J., Renart, J., Stark, G.R. and
Wahl, G.M. (1979) Methods in Enzymology,
68, 220.
3 Adams, M.D., Dubnick, M., Kerlavage, A.R.,
Moreno, R., Kelley, J.M., Utterback, T.R.,
Nagle, J.W., Fields, C. and Venter, J.C.
(1992) Nature, 355, 632.
4 Velculescu, V.E., Zhang, L., Vogelstein, B.
and Kinzler, K.W. (1995) Science, 270, 484.
5 Liang, P. and Pardee, A.B. (1992) Science,
257, 967.
6 Schena, M., Shalon, D., Davis, R.W. and
Brown, P.O. (1995) Science, 270, 467.
7 Bassell, G.J., Powers, C.M., Taneja, K.L.
and Singer, R.H. (1994) The Journal of Cell
Biology, 126, 863.
8 Buongiorno-Nardelli, M. and Amaldi, F.
(1970) Nature, 225, 946.
9 Behrens, S., Fuchs, B.M., Mueller, F. and
Amann, R. (2003) Applied and
Environmental Microbiology, 69, 4935.
50 Barry, M.A. and Eastman, A. (1993)
Archives of Biochemistry and Biophysics,
300, 440.
51 Giles, R.V., Spiller, D.G., Grzybowski, J.,
Clark, R.E., Nicklin, P., Tidd, D.M. (1998)
Nucleic Acids Research, 26, 1567.
52 Walev, I., Bhakdi, S.C., Hofmann, F.,
Djonder, N., Valeva, A., Aktories, K. and
Bhakdi, S. (2001) Proceedings of the National
Academy of Sciences of the United States of
America, 98, 3185.
53 Snyder, E.L. and Dowdy, S.F. (2001)
Current Opinion in Molecular Therapeutics,
3, 147.
54 Wadia, J.S. and Dowdy, S.F. (2002) Current
Opinion in Biotechnology, 13, 52.
55 Becker-Hapak, M., McAllister, S.S. and
Dowdy, S.F. (2001) Methods (San Diego,
Calif.), 24, 247.
56 Wadia, J.S. and Dowdy, S.F. (2005)
Advanced Drug Delivery Reviews, 57, 579.
57 Brooks, H., Lebleu, B. and Vives, E.
(2005) Advanced Drug Delivery Reviews,
57, 559.
58 Allinquant, B., Hantraye, P., Mailleux, P.,
Moya, K., Bouillot, C. and Prochiantz, A.
(1995) The Journal of Cell Biology,
128, 919.
59 Troy, C.M., Derossi, D., Prochiantz, A.,
Greene, L.A. and Shelanski, M.L. (1996)
The Journal of Neuroscience, 16, 253.
7
High-Content Analysis of Cytoskeleton Functions
by Fluorescent Speckle Microscopy
Kathryn T. Applegate, Ge Yang, and Gaudenz Danuser
7.1
Introduction
In 1949, Linus Pauling observed that hemoglobin in patients with sickle cell anemia
is structurally different from that in healthy individuals [1]. This seminal discovery of
a molecular disease overturned a century-old notion that all diseases were caused by
structural problems at the cellular level. Today, we know that disease can arise from
aberrations in the expression, regulation or structure of a single molecule. Frequently,
such aberrations interfere with one or more of the cell's basic morphological activities, including cell division, morphogenesis and maintenance in different tissue
environments, or cell migration.
The advent of molecular pathology precipitated the rise of molecular biology and
genomics, which in turn jump-started other large-scale -omics fields. In parallel,
sophisticated imaging, quantitative image analysis and bioinformatics approaches
were developed. These methods have enabled a quantum leap in our knowledge base
about the molecular underpinnings of life, and what goes wrong during disease.
Much has already been translated to the clinic. For example, mutation and gene
expression profiles can be used to prescribe targeted drugs to breast cancer
patients [2], and in the US many states have adopted metabolic screening programs
to test newborns for a growing number of disorders [3].
Yet on the whole, the genomic era has failed to yield the goldmine of personalized
interventions that it first promised. Drug development pipelines rely heavily on
high-throughput screens to identify compounds that have a desired effect on the
biochemical activity of a particular drug target. These screens, however, cannot
resolve whether a hit will be active in living cells and specic to the pathway of
interest. To avoid this limitation, high-content screens (also called phenotypic or
imaging screens) use automated image analysis methods to detect desired changes
in cells photographed under the light microscope [4]. Changes in gross cell morphology or the spatial activity of a protein of interest become apparent when analyzing
a large population of cells. Although the identification of a molecular target causing a
phenotypic change can be rate-limiting, assays can often be designed with the drug
target already in mind. The larger challenge is to derive meaningful, quantitative
phenotypic information from images [4].
The problem of extracting phenotypic information is compounded by the fact that
many phenotypic differences can only be resolved in time; the dynamics of molecules,
not just their concentrations and localization, are important in disease development.
In other words, a cell may look healthy in a still image, but an analysis of the
underlying dynamics of cell-adhesion proteins, for example, may reveal that the cell
has metastatic potential. Current phenotypic screens can only distinguish between
coarse, spatially oriented phenotypes [5], while many diseases exhibit extremely
subtle, yet significant, phenotypes. New methods are needed to extract and correlate
dynamic descriptors if we are to design drugs and other nanomedical intervention
strategies with minimal side effects.
In this chapter, we review quantitative fluorescent speckle microscopy (qFSM),
a relatively young imaging technology that has been used to characterize the
dynamic infrastructure of the cell. qFSM has the potential to become a unique
assay for live-cell phenotypic screening that will guide the development of drugs
and other nanomedical strategies based on the dynamics of subcellular structures.
We begin by summarizing how regulation of the cytoskeleton contributes to
important cell morphological processes that go awry in disease. We then describe
how fluorescent speckles form to mark the dynamics of subcellular structures.
Next, we illustrate critical biological insights that have been gleaned from qFSM
experiments. We conclude with new applications and an outlook on the future
of qFSM.
7.2
Cell Morphological Activities and Disease
The filamentous actin (F-actin), intermediate filament (IF) and microtubule (MT)
cytoskeleton systems are key mediators of cell morphology (Figure 7.1). Each filament system is unique in its physical properties and extensive subset of associated
proteins [6]. Endogenous and exogenous chemical and mechanical signals control
the precise arrangement of these dynamic polymers, and defects in their regulation
are seen in a wide variety of diseases [7].
7.2.1
Cell Migration
One of the most fundamental cell morphological functions is migration. Many cell
types in the body are motile, including fibroblasts, epithelial cells, neurons, leukocytes and stem cells. Failure to migrate, or migration to the wrong location in the
body, can lead to congenital heart or brain defects, atherosclerosis, chronic inflammation, neurodegenerative disease, compromised immune response, defects in wound
healing and tumor metastasis [8].
Figure 7.1 The F-actin, microtubule and intermediate filament cytoskeletons and adhesions in a migrating cell. A mesh-like F-actin network (red) at the leading edge drives the plasma membrane forward. Contractile F-actin bundles (red) linked to strong adhesions (yellow) in the ...
7.2.2
Cell Division
Division is another cellular function that depends on a complex and highly regulated series of molecular events. Division is essential during embryogenesis and
development, and also occurs constantly in tissues of the adult body. In the intestines
alone, approximately 10¹⁰ old cells are shed and replaced every day [14]. When DNA
replication is complete, the pairs of chromosomes must be pulled apart symmetrically and segregated to opposite ends of the cell. Segregation is accomplished by a
dynamic structure called the spindle, which is composed of MTs and motor proteins.
In the final step of cell division, cytokinesis, the spindle elongates and a contractile
actin structure develops to pinch off the membrane, partitioning organelles and
cytoplasm into two daughter cells. While these processes progress with remarkable
fidelity in healthy individuals, unchecked and faulty cell division are hallmarks of
oncogenesis [15] and age-related disorders [16]. Cell cycling of adult neurons may be
implicated in Alzheimer's disease [17]. Analysis of the architectural dynamics of the
F-actin and MT structures involved in these steps is critical if we are to intervene with
abnormal events in cell division associated with disease development.
7.2.3
Response to Environmental Changes
Cells also must be able to respond to physical changes in the environment. Almost 15
years ago, Wang et al. reported that applying a mechanical stimulus to integrin transmembrane receptors in adhesions caused the cytoskeleton to stiffen in proportion to
the load [18]. The increased stiffness required the presence of intact F-actin, MTs and
IFs. Such adaptation is central to the formation of multicellular tissues and functional
organelles [19]. Recent studies have also provided evidence that mechanical cues
relayed through the cytoskeleton systems dictate stem cell fate. Nave mesenchymal
stem cells cultured on a soft, brain-like matrix differentiate into neurons, while those
cultured on a more rigid, bone-like matrix develop into osteoblasts [20]. When F-actin
contraction by myosin motors is inhibited, lineage specication by elasticity is
blocked. Mechanistic analyses of these differentiation-dening processes will require
a quantitative analysis of the underlying cytoskeleton dynamics.
7.2.4
Cell–Cell Communication
devices into defective cells requires the specific activation of one of these pathways in
the right place, at the right time. Thus, our ability to understand disease and to
precisely manipulate cells depends on our ability to analyze the cytoskeleton structure
and dynamics in situ in living cells. We will now introduce qFSM as one of the
emerging tools to achieve this goal.
7.3
Principles of Fluorescent Speckle Microscopy (FSM)
the structure, and the association and dissociation of subunits. Thus, FSM provides
the same information as the aforementioned photomarking techniques, but does
so across much or all of the cell simultaneously. As FSM does not require active
marking, it allows the continuous detection of nonsteady-state dynamics within
protein assemblies, and reveals spatial and temporal relationships between these
dynamic events at submicron and second resolution. FSM also reduces out-of-focus
fluorescence and improves the visibility of fluorescently labeled structures and
their dynamics in three-dimensional (3-D) polymer arrays, such as the mitotic
spindle [37–39].
In the early years of its development, FSM used wide-field epifluorescence light
microscopy and digital imaging with a sensitive, low-noise, cooled charge-coupled-device (CCD) camera [24]. Since then, FSM has been transferred to confocal and total
internal reflection fluorescence (TIRF) microscopes [37, 40–42]. The development of
fully automated, computer-based tracking and the statistical analysis of speckle
behavior proved to be critical steps in establishing FSM as a routine method for
measuring cytoskeleton architectural dynamics. Thus, FSM is an integrated technology in which sample preparation, imaging and image analysis are optimized to
achieve detailed information about polymer dynamics.
7.4
Speckle Image Formation
7.4.1
Speckle Formation in Microtubules (MTs): Stochastic Clustering of Labeled Tubulin
Dimers in the MT Lattice
MTs exhibit a variation in fluorescence intensity along their lattices when cells are
injected with a small amount of labeled tubulin dimers, leading to a speckled appearance (Figure 7.2a and b). Several possibilities exist for how speckles arise in this
situation:
. Labeled tubulin dimers could aggregate before or upon incorporation into the MT lattice.
. MAPs or organelles associated with the MT could locally modulate the fluorescence.
. The number of labeled tubulin dimers incorporated into each resolution-limited image region along the MT could vary stochastically.
The first hypothesis was discounted by showing that labeled tubulin dimers
sediment similarly to unlabeled dimers in an analytical ultracentrifugation assay.
Next, it was shown that MTs assembled from purified tubulin in vitro exhibited
similar speckle patterns to MTs in cells, where MAPs and organelles are present [43].
Thus, the most plausible explanation for MT speckle formation is that variations exist
in the number of fluorescent tubulin subunits in each resolution-limited image
region along the MT.
c = √(440·f·(1 − f))/(440·f), where f is the fraction of labeled tubulin dimers and 440 is
the approximate number of dimers within an Airy-disk-sized region of the MT lattice.
Accordingly, the contrast c can be increased by decreasing f (Figure 7.2d) or by making the
Airy disk smaller, that is, effectively lowering the number of tubulin subunits per Airy
disk. The latter is accomplished by using optics with the highest NA possible. Experiments
have shown that fractions in the range of 0.5 to 2%, where speckles consist of three to
eight fluorophores, are optimal for the speckle imaging of individual microtubules [46].
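A quick numeric check of this relation (our sketch; N = 440 dimers per Airy disk is taken from the formula above) shows how decreasing f trades the mean number of fluorophores per Airy disk against contrast:

    import math

    N = 440  # approximate tubulin dimers per Airy-disk-sized segment of MT lattice
    for f in (0.20, 0.05, 0.02, 0.01, 0.005):
        c = math.sqrt(N * f * (1 - f)) / (N * f)  # speckle contrast
        print(f"f = {f:5.3f}   fluorophores/Airy disk = {N * f:5.1f}   contrast c = {c:.2f}")

For f between 0.005 and 0.02 the expected count falls in the three-to-eight-fluorophore range quoted above, with a contrast far higher than that of a densely labeled lattice.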
7.4.2
Speckle Formation in Other Systems: The Platform Model
unlabeled proteins give the adhesions a speckled appearance (Figure 7.2i and j).
As with F-actin networks, the speckles represent randomly distributed fluorescent
adhesion proteins that are temporarily clustered in the adhesion complex within the
volume of one PSF.
In summary, a speckle is defined as a diffraction-limited region that is significantly
higher in fluorophore concentration (i.e. higher in fluorescence intensity) than its
local neighborhood. For a speckle signal to be detected in an image, the contributing
fluorescent molecules must be associated with a molecular scaffold, or speckle
platform, during the 0.1–2.0 s exposure time required by most digital cameras to
acquire the dim FSM image. Conversely, unbound, diffusible fluorescent molecules
visit many pixels during the exposure and yield an evenly distributed background
signal, instead of speckles [48]. The same idea was illustrated for the MT MAP
ensconsin [58] and for the MT kinesin motor Eg5 [59]. Thus, association with the
platform can occur when labeled subunits either become part of the platform, as with
tubulin or actin, or simply bind to it, as in the case of cytoskeleton-associated proteins
or adhesion molecules.
7.5
Interpretation of Speckle Appearance and Disappearance
7.5.1
Naïve Interpretation of Speckle Dynamics
Following the platform model, one would expect the appearance of a speckle to
correspond to the local association of subunits with the platform. Conversely, the
disappearance of a speckle would mark the local dissociation of subunits. In other
words, FSM allows in principle the direct kinetic measurement of subunit
turnover in space and time via speckle lifetime analysis. In addition, once a speckle is
formed, it may undergo motion that indicates the coordinated movement of labeled
subunits on the platform and/or the movement of the platform itself.
7.5.2
Computational Models of Speckle Dynamics
Each of the two slopes is tested for statistical significance. Insignificant intensity
changes are discarded.
If both foreground and background slopes are significant, the one with the higher
significance (lower p-value) is selected as the cause of the event. In the example
in Figure 7.3c the foreground slope has the higher significance. The magnitude
of the more significant slope is recorded as the score of the birth/death event.
If neither foreground nor background slope is statistically significant, no score
is generated.
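A schematic rendering of this scoring logic might look as follows; scipy's linregress is our stand-in for whatever regression test the published software uses, and the 5% significance threshold is illustrative:

    from scipy import stats

    def score_event(t, fg, bg, alpha=0.05):
        """t: frame times; fg, bg: foreground/background intensities around the event."""
        fits = [stats.linregress(t, fg), stats.linregress(t, bg)]
        significant = [(fit.pvalue, abs(fit.slope)) for fit in fits if fit.pvalue < alpha]
        if not significant:
            return None                       # neither slope significant: no score
        pvalue, magnitude = min(significant)  # the slope with the lower p-value wins
        return magnitude                      # magnitude of that slope = event score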
turnover. Thus, the statistical model described in this section provides a robust
method for calculating spatiotemporal maps of assembly and disassembly of subcellular structures such as F-actin networks.
7.5.4
Single- and Multi-Fluorophore Speckles Reveal Different Aspects of the Architectural
Dynamics of Cytoskeleton Structures
Intuitively, it seems that FSM would be most powerful if implemented as a single-molecule imaging method, where speckle appearances and disappearances unambiguously signal association and dissociation of fluorescent subunits to the
platform [48]. However, the much simpler signal analysis of single-molecule
images is counterweighed by several disadvantages not encountered when using
multi-fluorophore speckles. First, establishing that an image contains only single-fluorophore speckles can be challenging, especially when the signal of one
fluorophore is close to the noise floor of the imaging system. Especially in 3-D
structures a large number of speckles have residual contributions from at least one
other fluorophore, and those mixtures must be eliminated from the statistics.
Second, the imaging of single-fluorophore speckles is practically more demanding
than multi-fluorophore FSM and requires longer camera exposures to capture the
very dim signals, thus reducing the temporal resolution. Third, in addition to the
lower temporal resolution, single-fluorophore FSM offers lower spatial resolution
because the density of speckles drops significantly with the extremely low labeling
ratio required for single-molecule-imaging conditions. In dense, crosslinked
structures such as the F-actin network the intensity variation during appearance
and disappearance events of multi-fluorophore speckles can distinguish between
fast and slow turnover. In contrast, single-fluorophore speckles deliver on/off
information only. Thus, in order to measure rates of the turnover of molecular
structures on a continuous scale, single-fluorophore speckle analysis must rely on
spatial and temporal averaging, which further decreases the resolution, while
multi-fluorophore speckles provide this information at a finer spatial and shorter
temporal scale.
On the other hand, multi-fluorophore speckles cannot resolve the dynamics of
closely apposed individual units within subcellular structures. If the dynamics of the
individual building blocks of a structure is of interest, and the lower density of spatial
and temporal samples can be afforded, single-fluorophore speckles are adequate
probes. If information about individual units and the ensemble of units is needed,
single-fluorophore and multi-fluorophore speckle imaging can be combined in two
spectrally distinct channels. This has been demonstrated in an analysis of the Xenopus
spindle [62], where combined single- and multi-fluorophore qFSM revealed the
overall dynamics of MTs in the spindle, as well as the dynamics and length
distribution of individual MTs within densely packed bundles inside the spindle
(see Section 7.9.2).
In summary, FSM can probe different aspects of the architectural dynamics of subcellular structures at different spatial and temporal scales via modulation of the ratio of labeled to unlabeled subunits. Currently, the exact labeling ratio is difficult to control in a given experiment. Statistical clustering analysis of the resulting speckle intensity is required to identify the distribution of the numbers of fluorophores within speckles. In the near future, these mathematical methods will be complemented with sophisticated molecular biology that will allow relatively precise titration of the labeled subunits. Together, these approaches will be invaluable to a systematic mapping of the heterogeneous dynamics of complex subcellular structures such as the cytoskeleton.
7.6
Imaging Requirements for FSM
low-noise, high dynamic range CCDs used with spinning-disk confocal microscope
systems.
7.7
Analysis of Speckle Motion
7.7.1
Tracking Speckle Flow: Early and Recent Developments
recognized in the target frame. On the other hand, larger windows increased the
averaging of distinct speckle motions within the window.
Underlying the method of cross-correlation tracking is the assumption that the
signal of a probing window, although translocated in space, does not change between
source and target frame. In practice, this assumption is always violated because
of noise. However, the cross-correlation of two image signals appears to be tolerant
toward spatially uncorrelated noise, making it a prime objective function in computer
vision tracking [71–73]. The many speckle appearances and disappearances in F-actin networks, however, introduce signal perturbations that cannot be tackled by the cross-correlation function [74]. Instead, we have developed a particle flow method, in which the movement of each speckle was tracked individually [74]. Speckles were linked between consecutive frames by nearest-neighbor assignment in a distance graph, in which conflicts between multiple speckles in the source frame competing for the same speckle in the target frame were resolved by global optimization [75]. An extension of the graph to linking speckles in three consecutive frames allowed enforcement of smooth and unidirectional trajectories, so that speckles moving in antiparallel flow fields could be tracked [74].
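A minimal sketch of the frame-to-frame linking step, assuming speckle coordinates have already been detected in both frames; scipy's linear_sum_assignment stands in for the global optimization on the distance graph (the published method additionally links over three consecutive frames, which is omitted here), and the cutoff value is hypothetical.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def link_speckles(src, dst, max_disp=5.0):
    """Globally optimal nearest-neighbor linking of speckles between two
    consecutive frames.

    src, dst : (N, 2) and (M, 2) arrays of speckle positions (pixels)
    Returns (i, j) index pairs whose displacement does not exceed max_disp.
    """
    # Pairwise distances form the cost of the assignment (the distance graph).
    cost = np.linalg.norm(src[:, None, :] - dst[None, :, :], axis=2)
    cost = np.where(cost > max_disp, 1e6, cost)   # forbid implausible jumps
    rows, cols = linear_sum_assignment(cost)      # resolve competing speckles
    return [(i, j) for i, j in zip(rows, cols) if cost[i, j] <= max_disp]
```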
Surprisingly, cross-correlation-based tracking was successful in measuring average tubulin flux in meiotic spindles [76]. Simulated time-lapse sequences showed that if a significant subpopulation of speckles in the probing window moves jointly, then the coherent component of the flow can be estimated even when the rest of the speckles move randomly or, as in the case of the spindle apparatus, a smaller population moves coherently in opposite directions. However, the tracking result will be ambiguous if the window contains multiple, coherently moving speckle subpopulations of equal size. Miyamoto et al. [76] carefully chose windows in the central region of a half-spindle, where the motion of speckles towards the nearer of the two poles dominated speckle motion in the opposite direction and random components. The approach was aided further by several features of the spindle system: tubulin flux in a spindle is quasi-stationary; speckle appearances and disappearances are concentrated at the spindle midzone and in the pole regions, which were both excluded from the probing window; and the flow fields were approximately parallel inside the probing window.
Encouraged by these results, we returned to cross-correlation tracking of speckle flow in F-actin networks [77]. The advantage of cross-correlation tracking over particle flow tracking is that there is no requirement to detect the same speckle in at least two consecutive frames. Hence, speckle flows can be tracked in movies with high noise levels and weak speckle contrast [77]. In order to avoid trading correlation stability for spatial resolution, we capitalized on the fact that cytoskeleton transport is often stationary on the timescale of minutes. Thus, although the correlation of a single pair of probing windows in source and target frames is ambiguous (Figure 7.5b-i), rendering the tracking of speckle flow impossible (Figure 7.5c-i), time-integration of the correlation function over multiple frame pairs yields robust displacement estimates for probing windows as small as the Airy disk area (Figure 7.5b-ii, c-ii). Figure 7.5d presents a complete high-resolution speckle flow map extracted by integration over 20 frames (3 min).
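The time-integration idea can be sketched as follows, assuming a quasi-stationary flow and a sequence of registered frames; the window size, pair count and simple peak read-out are hypothetical choices, not the published implementation.

```python
import numpy as np
from scipy.signal import correlate2d

def integrated_window_flow(frames, y, x, half=8, n_pairs=20):
    """Displacement of a small probing window, estimated by summing the
    window cross-correlation over many consecutive frame pairs.

    frames : sequence of 2-D images; (y, x) is the window center.
    Returns (dy, dx) maximizing the time-integrated correlation.
    """
    acc = None
    for k in range(min(n_pairs, len(frames) - 1)):
        w0 = frames[k][y - half:y + half, x - half:x + half].astype(float)
        w1 = frames[k + 1][y - half:y + half, x - half:x + half].astype(float)
        # Mean-subtracted cross-correlation of source and target windows.
        c = correlate2d(w1 - w1.mean(), w0 - w0.mean(), mode='same')
        acc = c if acc is None else acc + c       # time-integration step
    dy, dx = np.unravel_index(np.argmax(acc), acc.shape)
    return dy - half, dx - half   # shift relative to zero displacement
```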
7.7.2
Tracking Single-Speckle Trajectories
The extraction of kinetic data according to Figure 7.3 requires the accurate localization of speckle birth and death events. For this, it was necessary to devise methods capable of tracking full trajectories at the single-speckle level. The large number (>100 000) of dense speckles poses a significant challenge. Details of the current implementation of single-particle tracking of speckles are described by Ponti et al. [78].
Our approach follows the framework of most particle-tracking methods, that is, the
detection of speckles as particles on a frame-by-frame basis, and the subsequent
assignment of corresponding particles in consecutive frames. Assignment is iterated
to close gaps in the trajectories created by the short-term instability of the speckle
signal. Our implementation of this framework included two algorithms that address
particularities of the speckle signal:
. Speckle flow fields are extracted iteratively from previous solutions of single-speckle trajectories [78], or by initial correlation-based tracking [77].
. The fields are then employed to propagate speckle motion from the source to the target frame, prior to establishing the correspondence between the projected speckle position and the effective speckle position in the target frame by global nearest-neighbor assignment [79, 80].
Motion propagation allows us to cope with two problems of FSM data. First, in many cases the magnitude of speckle displacements between two frames significantly exceeds half the distance between speckles. Hence, no solution to the correspondence problem exists without prediction of future speckle locations. Second, speckles undergo sharp spatial gradients in speed and direction of motion. A global propagation scheme discarding regional variations will thus fail, whereas an iterative extraction of the flow field permits a gradually refined trajectory reconstruction in these areas.
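A sketch of motion propagation under the stated assumptions: speckle displacements are predicted from a previously estimated flow field before the nearest-neighbor assignment is made. All names are hypothetical, and the iterative refinement loop of the published method is omitted.

```python
import numpy as np
from scipy.interpolate import griddata
from scipy.optimize import linear_sum_assignment

def propagate_and_link(src, dst, flow_pts, flow_vec, max_disp=5.0):
    """Propagate source-frame speckles along an estimated flow field, then
    link the projected positions to target-frame speckles.

    flow_pts, flow_vec : sample positions and velocities of the flow field
    """
    # Interpolate each flow component at the source speckle positions.
    vx = griddata(flow_pts, flow_vec[:, 0], src, method='linear', fill_value=0.0)
    vy = griddata(flow_pts, flow_vec[:, 1], src, method='linear', fill_value=0.0)
    projected = src + np.column_stack([vx, vy])   # predicted target positions
    cost = np.linalg.norm(projected[:, None, :] - dst[None, :, :], axis=2)
    cost = np.where(cost > max_disp, 1e6, cost)   # forbid implausible links
    rows, cols = linear_sum_assignment(cost)      # global nearest-neighbor step
    return [(i, j) for i, j in zip(rows, cols) if cost[i, j] <= max_disp]
```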
Figure 7.6a displays the single-speckle trajectories for speckles initiated in the first 20 frames of the same movie for which speckle flow computation is demonstrated in Figure 7.5.
7.7.3
Mapping Polymer Turnover Without Speckle Trajectories
It frequently occurs that a lower speckle contrast or a high image noise does not allow the precise identification of single-speckle trajectory endpoints. However, the trackable subsections of the trajectories are usually sufficient to extract the overall structure of speckle flow. In this case, an alternative scheme relying on the continuity of the optical density of the speckle field permits the mapping of turnover at lower resolution [81], as indicated in Figure 7.7 for the example of a crawling fish keratocyte, a cell system where the generation of clear fluorescent speckle patterns has proven difficult [52].
7.8
Applications of FSM for Studying Protein Dynamics In Vitro and In Vivo
Applications of FSM have thus far focused mostly on the study of F-actin and MT cytoskeleton systems, although other systems have also been analyzed in this way. A summary of the FSM literature can be found in Table 7.1, where the major biological findings and technical advances in FSM made to date are listed. Most of the FSM data analysis has been limited to kymograph measurements of average speckle flow (see Section 7.7) and to the manual tracking of a few hundred speckles to extract lifetime information [48] and selected trajectories of cytoskeleton structures [26, 27, 69]. A systematic analysis of the full spatiotemporal information offered by FSM regarding transport and turnover in molecular assemblies is far from complete, although significant progress has already been made. In the following section, we describe comprehensive qFSM analyses both of F-actin cytoskeleton dynamics in migrating epithelial cells, and of the architectural dynamics of the spindle. Some of the most interesting results of these studies, showcasing the technical possibilities of qFSM, are also summarized.
Table 7.1 Selection of fluorescent speckle microscopy (FSM)
general references and biological findings by subject.
Other MAPs:
Co-imaging of full-length ensconsin or its MT-binding domain (EMTB) conjugated to GFP with fluorescent MTs suggests that dynamics of MAP:MT interactions is at least as rapid as tubulin:MT dynamics in the polymerization reactiontt
Binding and unbinding of ensconsin generates a speckle pattern along the MT, the dynamics of which can be evaluated to study the phosphorylation-dependent regulation of the turnoveruu
Multi-GFP tandems on MAPs significantly increase the speckle contrast and stabilityuu
FSM Analysis of F-actin Dynamics in Migrating and Non-migrating Cells
Actin network dynamics in migrating tissue cells:
F-actin in polarized cells is organized in four distinct zones: a lamellipodium with rapid retrograde flow and constant polymerization; a lamella with slower retrograde flow; a contraction zone with no flow; and a zone of anterograde flow. The spatial transition from retrograde to anterograde flow suggests the presence of a contractile belt powered by myosin II which may drive cell migrationvv
Single-fluorophore speckles can reveal F-actin turnover, as known from speckle analysis in MTs, despite the complex filamentous structure of the F-actin meshworkww
Statistical clustering analysis of single speckle dynamics reveals two kinetically and kinematically distinct, yet spatially overlapping, actin networks that mediate cell protrusionxx
Spatiotemporal correlation of F-actin assembly maps and GFP-Arp2/3 clustering indicates that, in the lamellipodium, actin assembly is mediated by Arp2/3, while lamellar assembly is independent of Arp2/3 activityyy
Arp2/3- and cofilin-regulated assembly of the lamellipodium is not required for epithelial cell protrusionzz
Molecular kinetics of Arp2/3 and capping protein can be measured by single-molecule FSMaaa
The dynamics of actin-binding proteins (capping protein, Arp2/3, tropomyosin) exhibit spatial differentiation in the lamellipodium and lamella of Drosophila S2 cellsbbb
Actin network dynamics in neurons:
Steady-state retrograde flow in neuronal growth cones depends on both myosin II contractility and actin-network treadmillingccc
Actin networks in keratocytes and keratocyte fragments:
Mechanical stimulation of keratocyte fragments activates acto-myosin contraction and causes directional motilityddd
Keratocytes exhibit F-actin retrograde flow relative to the substrateeee in a biphasic relationship between flow magnitude and adhesivenessfff
Actin and myosin undergo polarized assembly, suggesting force generation occurs at the lamellipodium/cell body transition zoneggg
Directed motility is initiated by symmetry-breaking actin-myosin network reorganization and contractility at the cell rearhhh
Actin in contact-inhibited epithelial cells:
Unlike migrating cells, cortical actin in contact-inhibited cells is spatially stationary but undergoes rapid turnoveriii,jjj
The spatiotemporal mapping of F-actin network turnover from speckle signal analysis during appearance and disappearance events can be carried out at high resolution in contact-inhibited cellsjjj
Actin dynamics in S. cerevisiae, using GFP-tubulin:
FSM can be used to visualize bud-associated assembly and motion of F-actin cables in budding yeastkkk
Multi-Spectral FSM Analysis of F-actin and Other Macromolecular Structures
Co-motion of the F-actin cytoskeleton and other structures, using spectrally distinct fluorescent analogues:
FSM of MTs and F-actin in cytoplasmic extracts of Xenopus eggs confirms two basic types of interaction between the polymers: a cross-linking activity and a motor-mediated interactionlll
Dynamic interactions between MTs and actin filaments are required for axon branching and directed axon outgrowthmmm
Direction of MT growth is guided by the tight association of MTs with F-actin bundlesnnn
F-actin contraction may be involved in the breaking of MTsvv,ooo
Rho and Rho effectors have differential effects on F-actin and MT dynamics during growth cone motilityppp
In migrating epithelial cells, the dynamics of MTs and F-actin is coordinated by signaling pathways downstream of Rac1qqq
FSM of F-actin and several focal adhesion (FA) proteins reveals a differential transmission of F-actin network motion through the adhesion structure to the extracellular matrixrrr
FSM Analysis of Protein Turnover in Focal Adhesions (FAs)
FSM of low-level GFP-fusion protein expression, in combination with TIRF microscopy, allows quantification of molecular dynamics within FA protein assemblies at the ventral surface of living cellsf,rrr
a Waterman-Storer, C.M., Desai, A., Bulinski, J.C. and Salmon, E.D. (1998) Curr. Biol., 8, 1227.
b Keating, T.J. and Borisy, G.G. (2000) Curr. Biol., 10, R22.
c Waterman-Storer, C.M. and Danuser, G. (2002) Curr. Biol., 12, R633.
d Danuser, G. and Waterman-Storer, C.M. (2003) J. Microsc., 211, 191.
e Dent, E.W. and Kalil, K. (2003) in Methods in Enzymology, Vol. 361, Academic Press, San Diego, pp. 390.
f Adams, M., Matov, A., Yarar, D., Gupton, S., Danuser, G. and Waterman-Storer, C.M. (2004) J. Microsc., 216, 138.
g Danuser, G. and Waterman-Storer, C.M. (2006) Annu. Rev. Biophys. Biomol. Struct., 35, 361.
h Adams, M.C., Salmon, W.C., Gupton, S.L., Cohan, C.S., Wittmann, T., Prigozhina, N. and Waterman-Storer, C.M. (2003) Methods, 29, 29.
i Waterman-Storer, C.M. (2002) in Current Protocols in Cell Biology (eds J.S. Bonifacino, M. Dasso, J.B. Harford, J. Lippincott-Schwartz and K.M. Yamada), Wiley, New York.
j Maddox, P.S., Moree, B., Canman, J.C. and Salmon, E.D. (2003) Methods in Enzymology, 360, 597.
k Gupton, S.L. and Waterman-Storer, C.M. (2006) in Cell Biology: A Laboratory Handbook, Vol. 3, 3rd edn (eds J. Celis, N. Carter, K. Simons, J.V. Small, T. Hunter and D. Shotton), Academic Press, San Diego, pp. 137.
l Waterman-Storer, C., Desai, A. and Salmon, E.D. (1999) in Methods in Cell Biology, Vol. 61, p. 155.
m Ji, L., Loerke, D., Gardel, M. and Danuser, G. (2007) in Methods in Cell Biology, Vol. 83.
n Waterman-Storer, C.M. and Salmon, E.D. (1998) Biophys. J., 75, 2059.
o Waterman-Storer, C.M. and Salmon, E.D. (1999) FASEB J., 13, 225.
p Grego, S., Cantillana, V. and Salmon, E.D. (2001) Biophys. J., 81, 66.
q Hunter, A.W., Caplow, M., Coy, D.L., Hancock, W.O., Diez, S., Wordeman, L. and Howard, J. (2003) Molecular Cell, 11, 445.
r Maddox, P., Straight, A., Coughlin, P., Mitchison, T.J. and Salmon, E.D. (2003) J. Cell Biol., 162, 377.
s Vallotton, P., Ponti, A., Waterman-Storer, C.M., Salmon, E.D. and Danuser, G. (2003) Biophys. J., 85, 1289.
t Mitchison, T.J., Maddox, P., Groen, A., Cameron, L., Perlman, Z., Ohi, R., Desai, A., Salmon, E.D. and Kapoor, T.M. (2004) Mol. Biol. Cell, 15, 5603.
u Shirasu-Hiza, M., Perlman, Z.E., Wittmann, T., Karsenti, E. and Mitchison, T.J. (2004) Curr. Biol., 14, 1941.
v Gaetz, J. and Kapoor, T.M. (2004) J. Cell Biol., 166, 465.
w Miyamoto, D.T., Perlman, Z.E., Burbank, K.S., Groen, A.C. and Mitchison, T.J. (2004) J. Cell Biol., 167, 813.
x Burbank, K.S., Groen, A.C., Perlman, Z.E., Fisher, D.D. and Mitchison, T.J. (2006) J. Cell Biol., 175, 369.
y Yang, G., Houghtaling, B.R., Gaetz, J., Liu, J.Z., Danuser, G. and Kapoor, T.M. (2007) Nat. Cell Biol., 9, 1233.
z Cameron, L.A., Yang, G., Cimini, D., Canman, J.C., Evgenieva, O.K., Khodjakov, A., Danuser, G. and Salmon, E.D. (2006) J. Cell Biol., 173, 173.
aa Maddox, P., Desai, A., Oegema, K., Mitchison, T.J. and Salmon, E.D. (2002) Curr. Biol., 12, 1670.
bb Brust-Mascher, I. and Scholey, J.M. (2002) Mol. Biol. Cell, 13, 3967.
cc Brust-Mascher, I., Civelekoglu-Scholey, G., Kwon, M., Mogilner, A. and Scholey, J.M. (2004) Proc. Natl Acad. Sci. USA, 101, 15938.
dd Rogers, G.C., Rogers, S.L., Schwimmer, T.A., Ems-McClung, S.C., Walczak, C., Vale, R.D., Scholey, J.M. and Sharp, D.J. (2004) Nature, 427, 364.
ee LaFountain, J.R. Jr., Cohan, C.S., Siegel, A.J. and LaFountain, D.J. (2004) Mol. Biol. Cell, 15, 5724.
ff Vorobiev, I., Malikov, V. and Rodionov, V. (2001) Proc. Natl Acad. Sci. USA, 98, 10160.
gg Oladipo, A., Cowan, A. and Rodionov, V. (2007) Mol. Biol. Cell, 18, 3601.
hh Chang, S., Svitkina, T.M., Borisy, G.G. and Popov, S.V. (1999) Nat. Cell Biol., 1, 399.
ii Dent, E.W., Callaway, J.L., Szebenyi, G., Baas, P.W. and Kalil, K. (1999) J. Neurosci., 19, 8894.
jj Kabir, N., Schaefer, A.W., Nakhost, A., Sossin, W.S. and Forscher, P. (2001) J. Cell Biol., 5, 1033.
kk Zhou, F.-Q., Waterman-Storer, C.M. and Cohan, C.S. (2002) J. Cell Biol., 157, 839.
ll Maddox, P., Chin, E., Mallavarapu, A., Yeh, E., Salmon, E.D. and Bloom, K. (1999) J. Cell Biol., 144, 977.
mm Maddox, P.S., Bloom, K.S. and Salmon, E.D. (2000) Nat. Cell Biol., 2, 36.
nn Tran, P.T., Marsh, L., Doye, V., Inoue, S. and Chang, F. (2001) J. Cell Biol., 153, 397.
oo Steinberg, G., Wedlich-Soldner, R., Brill, M. and Schulz, I. (2001) J. Cell Sci., 114, 609.
pp Finley, K.R. and Berman, J. (2005) Eukaryotic Cell, 4, 1697.
qq Perez, F., Diamantopoulos, G.S., Stalder, R. and Kreis, T.E. (1999) Cell, 96, 517.
rr Tirnauer, J.S., Salmon, E.D. and Mitchison, T.J. (2004) Mol. Biol. Cell, 15, 1776.
ss Kapoor, T.M. and Mitchison, T.J. (2001) J. Cell Biol., 154, 1125.
tt Faire, K., Waterman-Storer, C.M., Gruber, D., Masson, D., Salmon, E.D. and Bulinski, J.C. (1999) J. Cell Sci., 112, 4243.
uu Bulinski, J.C., Odde, D.J., Howell, B.J., Salmon, T.D. and Waterman-Storer, C.M. (2001) J. Cell Sci., 114, 3885.
vv Salmon, W.C., Adams, M.C. and Waterman-Storer, C.M. (2002) J. Cell Biol., 158, 31.
ww Watanabe, Y. and Mitchison, T.J. (2002) Science, 295, 1083.
xx Ponti, A., Machacek, M., Gupton, S.L., Waterman-Storer, C.M. and Danuser, G. (2004) Science, 305, 1782.
yy Ponti, A., Matov, A., Adams, M., Gupton, S., Waterman-Storer, C.M. and Danuser, G. (2005) Biophys. J., 89, 3456.
zz Gupton, S.L., Anderson, K.L., Kole, T.P., Fischer, R.S., Ponti, A., Hitchcock-DeGregori, S.E., Danuser, G., Fowler, V.M., Wirtz, D., Hanein, D. and Waterman-Storer, C.M. (2005) J. Cell Biol., 168, 619.
aaa Miyoshi, T., Tsuji, T., Higashida, C., Hertzog, M., Fujita, A., Narumiya, S., Scita, G. and Watanabe, N. (2006) J. Cell Biol., 175, 947.
bbb Iwasa, J.H. and Mullins, R.D. (2007) Curr. Biol., 17, 395.
ccc Medeiros, N.A., Burnette, D.T. and Forscher, P. (2006) Nat. Cell Biol., 8, 215.
ddd Verkhovsky, A.B., Svitkina, T.M. and Borisy, G.G. (1999) Curr. Biol., 9, 11.
eee Vallotton, P., Danuser, G., Bohnet, S., Meister, J.J. and Verkhovsky, A. (2005) Mol. Biol. Cell, 16, 1223.
fff Jurado, C., Haserick, J.R. and Lee, J. (2005) Mol. Biol. Cell, 16, 507.
ggg Schaub, S., Bohnet, S., Laurent, V.M., Meister, J.-J. and Verkhovsky, A.B. (2007) Mol. Biol. Cell, E06.
hhh Yam, P.T., Wilson, C.A., Ji, L., Hebert, B., Barnhart, E.L., Dye, N.A., Wiseman, P.W., Danuser, G. and Theriot, J.A. (2007) J. Cell Biol., 178, 1207.
iii Waterman-Storer, C.M., Salmon, W.C. and Salmon, E.D. (2000) Mol. Biol. Cell, 11, 2471.
jjj Ponti, A., Vallotton, P., Salmon, W.C., Waterman-Storer, C.M. and Danuser, G. (2003) Biophys. J., 84, 3336.
kkk Yang, H.-C. and Pon, L.A. (2002) Proc. Natl Acad. Sci. USA, 99, 751.
lll Waterman-Storer, C., Duey, D.Y., Weber, K.L., Keech, J., Cheney, R.E., Salmon, E.D. and Bement, W.M. (2000) J. Cell Biol., 150, 361.
mmm Dent, E.W. and Kalil, K. (2001) J. Neurosci., 15, 9757.
nnn Schaefer, A.W., Kabir, N. and Forscher, P. (2002) J. Cell Biol., 158, 139.
ooo Gupton, S.L., Salmon, W.C. and Waterman-Storer, C.M. (2002) Curr. Biol., 12, 1891.
ppp Zhang, X.-F., Schaefer, A.W., Burnette, D.T., Schoonderwoert, V.T. and Forscher, P. (2003) Neuron, 40, 931.
qqq Wittmann, T., Bokoch, G.M. and Waterman-Storer, C.M. (2003) J. Cell Biol., 161, 845.
rrr Hu, K., Ji, L., Applegate, K., Danuser, G. and Waterman-Storer, C.M. (2007) Science, 315, 111.
7.9
Results from Studying Cytoskeleton Dynamics
7.9.1
F-Actin in Cell Migration
Over the past few years, qFSM has critically driven our understanding of actin cytoskeleton dynamics in cell migration. It has provided unprecedented details of the spatial organization of F-actin turnover and the transport and deformation of F-actin networks in vivo. In the following sections, we review a few key discoveries, enabled by qFSM, that have defined a new paradigm for the functional linkage between actin cytoskeleton regulation and epithelial cell migration.
7.9.1.1 F-Actin in Epithelial Cells is Organized Into Four Dynamically Distinct Regions
Figures 7.5 and 7.6 indicate the steady-state organization of the F-actin cytoskeleton
in four kinematically and kinetically distinct zones:
. a lamellipodium with rapid retrograde flow and constant polymerization;
. a lamella with slower retrograde flow;
. a contraction zone with no flow; and
. a zone of anterograde flow.
In summary, these data demonstrate how qFSM can be used to quantify spatiotemporal modulations of the kinetics and kinematics of molecular assemblies and to
identify dynamically distinct structural modules, even when they are composed of the
same base protein.
7.9.1.2 Actin Disassembly and Contraction are Coupled in the Convergence Zone
A similar spatiotemporal correlation analysis was performed to examine the relationship of F-actin network depolymerization and contraction in the convergence zone [81]. It was first established that transient increases in speckle flow convergence are coupled to transient increases in disassembly. This raised the question of whether the rate of speckle flow convergence increases because disassembly boosts the efficiency of myosin II motors in contracting a more compliant network, or because motor contraction mediates network disassembly. To address this question, we transiently perfused cells with Calyculin A, a type II phosphatase inhibitor that increases myosin II activity. Unexpectedly, we reproducibly measured a strong burst of disassembly long before flow convergence was affected. This evidence suggested that myosin II contraction can actively promote the depolymerization of F-actin, for example, by breaking filaments. The link between F-actin contractility and turnover has since been confirmed by fluorescence recovery after photobleaching measurements in the contractile ring required for cytokinesis [86]. In summary, these data demonstrate the correlation of two qFSM parameters to decipher the relationship between deformation and plasticity of polymer networks inside cells.
7.9.1.3 Two Distinct F-Actin Structures Overlap at the Leading Edge
The transition between the lamellipodium and the lamella is characterized by a narrow band of strong disassembly adjacent to a region of mixed assembly and disassembly, and a sharp decrease in retrograde flow velocity (see Figure 7.6). Together, these features defined a unique mathematical signature for tracking the boundary between the two regions over time (Figure 7.8a). In view of the different speckle velocities and lifetimes between the two regions, it was speculated that the same boundary could be tracked by spatial clustering of speckle properties. It was predicted that fast, short-living speckles (class 1) would preferentially localize in the lamellipodium, whereas slow, longer-living speckles (class 2) would be dominant in the lamella. To test this hypothesis, we solved a multiobjective optimization problem in which the thresholds of velocity $\nu_{th}$ and lifetime $t_{th}$ separating the two classes, as well as the boundary $q_{Lp}$ between lamellipodium and lamella, were determined subject to the rule

$$\{q_{Lp}, t_{th}, \nu_{th}\} \rightarrow \max\left(\frac{N_1}{N_1 + N_2} \in \mathrm{Lp}\right) \wedge \min\left(\frac{N_1}{N_1 + N_2} \in \mathrm{La}\right)$$

(Figure 7.8b and c), where $N_1$ and $N_2$ denote the number of speckles in classes 1 and 2, respectively. The prediction was confirmed in the lamella, where class 1 speckles occupied a statistically insignificant fraction. However, class 2 speckles made up 30–40% of the lamellipodium, indicating that in this region speckles with different kinetic and kinematic behavior colocalize. This information was previously lost in the averaged analysis of single-speckle trajectories. When mapping the scores of class 1 and class 2 speckles separately (Figure 7.8d-i and d-ii), it was discovered that class 1 speckles define the bands of polymerization and depolymerization characteristic
of the lamellipodium, and that class 2 speckles define the puncta of assembly and disassembly characteristic of the lamella, which reaches all the way to the leading edge. Subsequent experiments specifically disrupting actin treadmilling in the lamellipodium confirmed the finding that the lamellipodium and lamella form two spatially overlapping, yet kinetically, kinematically and molecularly different, F-actin networks [83, 84].
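A brute-force illustration of this selection rule, assuming per-speckle speed and lifetime measurements and a fixed lamellipodium mask; the published method optimizes the Lp/La boundary jointly with the thresholds, which this hypothetical helper does not attempt.

```python
import numpy as np

def optimize_thresholds(speed, lifetime, in_lp, v_grid, t_grid):
    """Choose (v_th, t_th) so that fast, short-lived (class 1) speckles are
    maximally enriched in the lamellipodium and depleted in the lamella.

    speed, lifetime : per-speckle measurements
    in_lp           : boolean mask, True for speckles inside the lamellipodium
    """
    best, best_score = None, -np.inf
    for v_th in v_grid:
        for t_th in t_grid:
            class1 = (speed > v_th) & (lifetime < t_th)
            f_lp = class1[in_lp].mean()      # class-1 fraction in Lp
            f_la = class1[~in_lp].mean()     # class-1 fraction in La
            score = f_lp - f_la              # maximize enrichment contrast
            if score > best_score:
                best, best_score = (v_th, t_th), score
    return best
```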
7.9.2
Architecture of Xenopus laevis Egg Extract Meiotic Spindles
During cell division, MTs form a spindle, which maintains stable bipolar attachment to chromosomes over tens of minutes. A sophisticated checkpoint system senses the status of attachment and generates a signal to progress with symmetric segregation of the replicated sister chromatids into the newly forming mother and daughter cells [6]. The minus ends of polar MTs are preferentially located at the spindle poles, whereas the plus ends continually switch between growth and shrinkage, a process known as MT dynamic instability [87]. Strikingly, dynamic instability in vertebrate spindles occurs within a few tens of seconds, a time scale at least an order of magnitude shorter than the existence of the spindle [88]. In addition to MT dynamic instability at the MT plus end, individual MTs are transported toward the spindle poles, a behavior known as poleward flux. Poleward flux has only been observed in higher eukaryotic spindles, including the Xenopus laevis extract spindle system [89]. How the overall stability of spindle architecture is maintained under the much faster dynamics of its building blocks is largely unknown. qFSM has made several critical contributions to the mechanistic analysis of spindle architecture (Table 7.1). For example, it has revealed detailed maps of the organization of heterogeneous MT poleward flux (Figure 7.9a–c) [74, 90], and it was also used to show that MTs form distinct types of bundles with different flux dynamics, depending on whether they are attached to chromosomes (kinetochore fibers) or form a scaffold of overlapping fibers emanating from opposite poles (interpolar MT fibers) [38]. Together, these data have indicated an enormous architectural complexity, which requires fine regulation of the dynamics of each MT. However, the high MT density in the spindle has precluded measurement of the dynamics of individual MTs within bundles. Speckles generally consist of multiple fluorophores distributed over many different MTs.
Recently, this difficulty has been overcome by single-fluorophore speckle imaging of Xenopus laevis extract spindles [62]. As a cell-free spindle model, the extract spindle allows convenient control over fluorescent tubulin levels to achieve sparse labeling of MTs (Figure 7.9d). For low labeling ratios f, speckle intensities cluster in multiples of ~500 AU, indicating that speckles are composed of a discrete low number (e.g. one, two, three or four) of fluorophores (Figure 7.9e and f) [62]. At the lowest concentrations of labeled tubulin, only one intensity cluster with a mean value of ~500 AU was found. Furthermore, the average intensity of a detectable speckle remained constant over time at ~500 AU (Figure 7.9g), although the speckle number decreased due to photobleaching. Together, the cluster analysis of speckle intensities and the photobleaching analysis confirmed that >98% of the speckles reflected the image of a single fluorophore.
7.9.2.1 Individual MTs within the Same Bundle Move at Different Speeds
Single-fluorophore speckles were then used to investigate how individual MTs in close proximity move relative to one another. In order to avoid any a priori assumptions, the spatial organization of spindle MTs was mapped using the dense flow field measured in a spectrally distinct channel displaying multi-fluorophore speckles. Path integration of the flow field allowed the construction of equally spaced bands of uniform width (~480 nm), which reflect the average positions of MT bundles within the spindle (Figure 7.9h). The band width was chosen to match the diffraction limit of the microscope, and is consistent with electron microscopy studies which showed that MTs form bundles typically a few hundred nanometers wide, with individual MTs 20–50 nm apart [91, 92]. Next, the pairwise difference between the velocities of speckles located in the same band showed that MTs spaced at a distance comparable to the width of MT bundles exhibit remarkably heterogeneous movement (Figure 7.9i). Thus, individual MTs appear to slide past one another over very short distances, suggesting that the spindle is a MT scaffold that is continuously restructured at the scale of tens of seconds.
7.9.2.2 The Mean Length of Spindle MTs is ~40% of the Total Spindle Length
Despite the heterogeneous movement of the majority of speckles within one band, a small percentage (~1% of all speckles) moved in synchronized pairs: not only did they stay within the same band and move in the same direction at the same time (Figure 7.9j and k), but they also concurrently changed velocities (Figure 7.9k, 1–3). Clearly, these single-fluorophore speckle pairs must reside on the same MT. By applying stringent detection criteria on spatial colocalization, temporal overlap, relative distance change and relative velocity change, a total of 328 synchronous speckle pairs was identified from 13 spindles. Interestingly, 90% of the speckle distances were less than half the length of the spindle (Figure 7.9l). To estimate the lengths of spindle MTs from the measured distances between synchronously moving speckle pairs, a mathematical model was developed of the stochastic incorporation of labeled tubulin into a population of MTs with an a priori unknown length distribution f(l) [62]. Assuming a hypothetical function f(l), the model defined the expected cumulative distribution P of distances d between two speckles, given the event A that the speckles reside on the same MT:
$$P(D < d \mid A) = \frac{\displaystyle \int_0^{d} l^2\, e^{-clr} f(l)\, \mathrm{d}l + \int_d^{\infty} \left(2dl - d^2\right) e^{-clr} f(l)\, \mathrm{d}l}{\displaystyle \int_0^{\infty} u^2\, e^{-cur} f(u)\, \mathrm{d}u}$$
Here, l denotes the steady-state length of individual MTs in microns, c is the number of tubulin dimers per micron (1625), and r is the fraction of labeled tubulin (2.8–6 × 10⁻⁶ for 0.066–0.033 nM labeled tubulin). Fitting the above formula for the expected cumulative distribution to the measured cumulative histogram of distances between speckle pairs made it possible to estimate parameters of f(l) (Figure 7.9m). For instance, it was estimated that the ratio between the mean length of MTs and the spindle length is ~0.4.
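The reconstructed distribution can be evaluated numerically. The sketch below assumes an exponential f(l) (one of the candidate models named in the Figure 7.9m caption) and uses c = 1625 dimers per micron from the text; the value of r and the function name are illustrative.

```python
import numpy as np
from scipy.integrate import quad

C = 1625.0    # tubulin dimers per micron of MT (from the text)
R = 2.8e-6    # fraction of labeled tubulin; illustrative value

def cdf_same_mt(d, mean_len):
    """P(D < d | A): cumulative distribution of distances between two
    speckles on the same MT, for an exponential length distribution f(l)."""
    f = lambda l: np.exp(-l / mean_len) / mean_len        # exponential f(l)
    w = lambda l: np.exp(-C * R * l) * f(l)               # labeling weight
    num1, _ = quad(lambda l: l**2 * w(l), 0, d)           # MTs shorter than d
    num2, _ = quad(lambda l: (2*d*l - d**2) * w(l), d, np.inf)
    den, _ = quad(lambda u: u**2 * w(u), 0, np.inf)
    return (num1 + num2) / den

# Expected fraction of same-MT speckle pairs closer than 10 um,
# for a hypothetical 11.75 um mean MT length:
print(cdf_same_mt(10.0, 11.75))
```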
In summary, by integrating single-fluorophore imaging with computational image analysis, it was found that spindle MTs in close proximity move at highly heterogeneous velocities, and that the majority have a length shorter than the spindle pole-to-metaphase plate distance. These results, along with molecular perturbation data (not shown), suggest that MTs in the vertebrate meiotic spindle are dynamically organized as a crosslinked tiled-array, in a way similar to how the actin network is organized in motile cells (Figure 7.9n) [62]. This model challenges longstanding textbook models, which assume that the majority of MTs emanate from the two poles. The mechanical stability of the tiled-array is maintained by dynamic crosslinks. Thus, a structure can be formed where the stability of the ensemble is much higher than the stability of its individual building blocks. It is speculated that the design of cytoskeleton structures follows the general principle of coupling many short and dynamic components into larger, long-lasting ensembles to achieve both the flexibility and stability needed for cellular life under constantly changing conditions.

[Figure 7.9, caption fragment: (j) speckle pairs stayed within the same band, coexisted over a time interval of at least 10 s, and varied synchronously in flux velocity; scale bar 10 µm. (k) Kymograph representation of the synchronous movement of the speckle pairs shown in (j). (l) Histogram of the measured distances between speckle pairs (328 pairs from n = 13 spindles). (m) Estimated length distributions under different models: exponential distribution (mean ± SD: 11.75 ± 11.75 µm, light blue), Rayleigh ...]
7.9.3
Hierarchical Transmission of F-Actin Motion Through Focal Adhesions
Cell migration requires a delicate spatial balance of cell adherence to the substrate.
Dynamic structures called focal complexes assemble next to the leading edge and
mature over time into FAs, macromolecular assemblies of more than 100 different
proteins. Focal adhesions tether the F-actin network to integrin receptors, which
in turn bind to the substrate. Forces generated by F-actin polymerization and/or
contraction are transmitted to the extracellular matrix (ECM) via the coupling of
F-actin to FAs. It has long been known that F-actin and FA proteins are coupled; many
FA proteins bind directly or indirectly to F-actin [93–95] or integrin receptors [96–98], and the ends of contractile actin bundles often appear to be embedded in FAs [57, 99]. However, despite many years of intensive research aimed at identifying the molecular parts list of FAs, it has been impossible to determine the hierarchy of interactions between specific FA proteins and the F-actin cytoskeleton in living cells.
Hu et al. used two-color total internal reflection fluorescent speckle microscopy (TIR-FSM) to simultaneously image X-rhodamine actin and various GFP-tagged FA proteins (Figure 7.10a) [25]. As expected, F-actin retrograde flow slowed down directly over the FAs, suggesting that the latter may dampen flow by engaging F-actin to the ECM. Furthermore, when the motions of three classes of FA proteins were compared with F-actin, major differences in the speeds and visual coherence of the flow fields were observed. The ECM-binding αV integrin exhibited slow, incoherent retrograde flow compared to actin, while the FA core proteins paxillin, zyxin and focal-adhesion kinase (FAK), which do not bind F-actin or the ECM directly but have structural or signaling roles, moved slightly faster and more coherently. The third class, composed of the actin-binding proteins α-actinin, vinculin and talin, moved significantly faster (close to the speed of actin) and with the highest coherence.

[Figure 7.10 Measuring the coupling between F-actin and FA proteins. (a) F-actin (red) and FA protein vinculin (green) in a live cell. (b) Simulated F-actin (red) and FA protein vinculin (green) from a Monte Carlo simulation. (c–e) Varying the association and dissociation rate constants of FA proteins in Monte Carlo simulations affects the speed of FA speckle motion and the coupling to F-actin. In these simulations, FA proteins switched between unbound (1), FA platform-bound (2) and F-actin-bound (3) states to allow coupling to F-actin flow. The VMCS and DCS were calculated using a simulation of F-actin flowing from left to right at v = 0.25 µm min⁻¹. (c) Tracked motion of FA speckles (yellow vectors) for three representative FSM movies out of 25; VMCS increased from left to right as FA protein speckle flow became more aligned with F-actin speckle flow and more FA proteins were bound to the F-actin network. (d) Surface plot of VMCS determined by tracking 25 simulated FSM movies with the same dissociation rate constant (k_off = k_off,2→1 = k_off,3→1 = 0.005) and variable association rate constants to the FA platform ...]
The next step was to estimate coupling by quantifying the degree of correlated motion between FA and F-actin speckles. Correlated motion would strongly indicate that FA proteins help to transmit the force generated during actin polymerization and myosin II-mediated contraction to the ECM. To quantify this coupling, both speckle flow maps were interpolated to a common grid and two parameters were calculated: the direction coupling score (DCS) and the velocity magnitude coupling score (VMCS). The DCS measures the level of directional similarity between F-actin and FA speckle motion, while the VMCS measures the component of FA motion in the direction of actin, thus taking into account both direction and speed. The quantitative interpretation of these parameters, in terms of the kinetics of molecular interactions between F-actin and FA components, required mathematical modeling. Monte Carlo simulations were used to generate synthetic two-color movies of speckles associated with two transiently coupled protein structures (Figure 7.10b–d). Relating the binding/unbinding events at the molecular level to the relative movement of speckles in the two structures, along with an analysis of the noise characteristics of such movies, revealed that the VMCS is a linear reporter of the ratio between the time that the FA component is bound to F-actin alone and the time it is bound to both F-actin and the substrate (Figure 7.10e). Thus, the relative movement of two speckle fields directly reflects the degree of interaction between two protein structures in living cells.
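Under the definitions above, both scores reduce to simple vector operations once the two flow maps share a grid. A minimal sketch with a hypothetical function name; the exact averaging and noise handling of the published analysis are not reproduced.

```python
import numpy as np

def coupling_scores(v_fa, v_actin, eps=1e-12):
    """Direction coupling score (DCS) and velocity magnitude coupling
    score (VMCS) for FA and F-actin speckle velocities interpolated to a
    common grid.

    v_fa, v_actin : (N, 2) velocity vectors at matching grid points
    """
    dot = np.sum(v_fa * v_actin, axis=1)
    n_fa = np.linalg.norm(v_fa, axis=1)
    n_actin = np.linalg.norm(v_actin, axis=1)
    dcs = np.mean(dot / (n_fa * n_actin + eps))   # directional similarity only
    vmcs = np.mean(dot / (n_actin + eps))         # FA speed along the actin flow
    return dcs, vmcs
```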
When the DCS and VMCS were calculated for the three classes of FA proteins,
integrin had the lowest coupling to actin, and F-actin-binding proteins the highest.
The core proteins showed intermediate coupling. Such quantitative analysis provides
even more insight when scores are compared over time. For example, coupling
scores stayed constant for stationary FAs in a protrusive area of the cell edge, but
F-actin–vinculin coupling increased in FAs that slid backwards in a retracting area of the cell edge (Figure 7.10f and g). Multicolor qFSM analysis therefore suggests a hierarchical molecular clutch model of force transmission, in which the efficiency of force transmission depends on the make-up of the FAs [25].
7.10
Outlook: Speckle Fluctuation Analysis to Probe Material Properties
measured in vitro. The 1/r decay of the fluctuation correlation is known from two-point microrheology, in which embedded beads instead of speckles are used to track thermal fluctuations in polymer networks [100, 101]. Thus, spatially correlated yet undirected components of speckle motion could be used to probe material properties of polymer networks inside a cell at the scale of the interspeckle distance. Figure 7.11b compares the stiffness between an in vitro network of entangled actin filaments and a cortical F-actin network in an epithelial cell. The marked difference originates in the dense crosslinking of in vivo networks, both intracellularly and extracellularly.
The stiffness of cellular networks is so high that correlations above noise are measurable only for speckles at a distance <1 µm (Figure 7.11b, gray zone). The possibility to extract meaningful information from speckle fluctuations over these short distances relies on recent enhancements of speckle tracking to an accuracy of approximately one-tenth of a pixel, even when speckles overlap. A module was also implemented that performs correlation analysis in small windows to map out the spatial modulation of material properties. Figure 7.11c and d compare the stiffness maps of a control epithelial cell and a cell expressing constitutively active cofilin(S3A). While the control cell has minimal spatial variation, the cell with cofilin(S3A) has a much softer lamellipodium (Lp) network but an unchanged lamella (La). These data show that cofilin, a severing factor and promoter of F-actin depolymerization, acts selectively in the lamellipodium network. High cofilin activity eventually eliminates the crosslinking between the lamellipodium and lamella, resulting in a substantial softening of the lamellipodium network structure [85]. This example illustrates the potential of qFSM to derive spatial maps of the mechanical properties of cytoskeleton structures from speckle fluctuations.
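A sketch of the fluctuation-correlation measurement, assuming flow-subtracted speckle displacements: correlate the undirected motion components of all speckle pairs and bin by interspeckle distance, as in two-point microrheology. Names and binning are hypothetical.

```python
import numpy as np

def fluctuation_correlation(pos, disp, bins):
    """Correlate undirected components of speckle motion as a function of
    interspeckle distance.

    pos  : (N, 2) speckle positions
    disp : (N, 2) flow-subtracted displacement fluctuations at those positions
    bins : distance bin edges, in the same units as pos
    """
    i, j = np.triu_indices(len(pos), k=1)
    r = np.linalg.norm(pos[i] - pos[j], axis=1)          # pair distances
    corr = np.sum(disp[i] * disp[j], axis=1)             # fluctuation product
    idx = np.digitize(r, bins)
    return np.array([corr[idx == b].mean() if np.any(idx == b) else np.nan
                     for b in range(1, len(bins))])
```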
7.11
Conclusions
Over the past few years, FSM has become a versatile tool for simultaneously probing the motion, deformation, turnover and material properties of macromolecular
assemblies. Despite the many exciting discoveries already made using FSM (see
Table 7.1), it is a technology still in its infancy. In a next step, FSM measurements will
be combined with correlational analyses to establish how assemblies operate as
dynamic and plastic structures, enabling a broad variety of cell functions. In parallel,
FSM will continue to go multispectral, so that these parameters can be correlated
among different macromolecular structures. This requires major modifications to
the current qFSM software to cope with the explosion of combinatorial data in two or
more simultaneously imaged speckle channels.
With regard to future applications, FSM has the potential to uncover new biology outside the cytoskeleton field, and the analyses of FA dynamics have made
some initial steps in this direction. Projects are also under way to apply qFSM to
studies of the dynamic interaction of clathrin, dynamin and actin structures during
endocytosis; of individual interphase MTs, MT-associated proteins and F-actin; and of
DNA repair [102].
Acknowledgments
These studies were supported by NIH through grants R01 GM67230 and R01 GM60678 to the Danuser laboratory. Fellowship support from the Burroughs-Wellcome LJIS program (G.Y.) and the National Science Foundation (K.A.) is also acknowledged. We thank our collaborators, Clare Waterman-Storer, Edward Salmon, Tarun Kapoor, Julie Theriot, Paul Forscher and their laboratory members, for image data and countless discussions, without which the development of qFSM would not have been possible. We also thank James Lim and Dinah Loerke for sharing their unpublished data.
References
1 Pauling, L., Itano, H., Singer, S.J. and
Wells, I. (1949) Science, 110, 543.
2 Nahta, R. and Esteva, F.J. (2003) Clinical
Cancer Research, 9, 5078.
3 Garg, U. and Dasouki, M. (2006) Clinical
Biochemistry, 39, 315.
4 Eggert, U.S. and Mitchison, T.J. (2006)
Current Opinion in Chemical Biology, 10,
232.
5 Dorn, J.F., Danuser, G. and Yang, G.
(2008) in Fluorescent Proteins, 2nd edn,
Academic Press, Elsevier, Vol. 85,
pp. 497.
48 Watanabe, Y. and Mitchison, T.J. (2002)
Science, 295, 1083.
49 Waterman-Storer, C.M., Desai, A.,
Bulinski, J.C. and Salmon, E.D. (1998)
Current Biology, 8, 1227.
50 Waterman-Storer, C.M., Salmon, W.C. and Salmon, E.D. (2000) Molecular Biology of the Cell, 11, 2471.
51 Jurado, C., Haserick, J.R. and Lee, J.
(2005) Molecular Biology of the Cell, 16,
507.
52 Vallotton, P., Danuser, G., Bohnet, S.,
Meister, J.J. and Verkhovsky, A. (2005)
Molecular Biology of the Cell, 16, 1223.
53 Zhang, X.-F., Schaefer, A.W., Burnette,
D.T., Schoonderwoert, V.T. and Forscher,
P. (2003) Neuron, 40, 931.
54 Pollard, T.D., Blanchoin, L. and Mullins,
R.D. (2000) Annual Review of Biophysics
and Biomolecular Structure, 29, 545.
55 Small, V. (1981) The Journal of Cell Biology,
91, 695.
56 Svitkina, T.M., Verkhovsky, A.B.,
McQuade, K.M. and Borisy, G.G. (1997)
The Journal of Cell Biology, 139, 397.
57 Geiger, B., Bershadsky, A., Pankov, R. and
Yamada, K.M. (2001) Nature Reviews
Molecular Cell Biology, 2, 793.
58 Bulinski, J.C., Odde, D.J., Howell, B.J.,
Salmon, T.D. and Waterman-Storer,
C.M. (2001) Journal of Cell Science, 114,
3885.
59 Kapoor, T.M. and Mitchison, T.J. (2001)
The Journal of Cell Biology, 154, 1125.
60 Ponti, A., Vallotton, P., Salmon, W.C.,
Waterman-Storer, C.M. and Danuser, G.
(2003) Biophysical Journal, 84, 3336.
61 Ponti, A. (2004) High-resolution analysis of F-actin meshwork kinetics and kinematics using computational fluorescent speckle microscopy. Dissertation No. 15286, ETH Zurich (Zurich).
62 Yang, G., Houghtaling, B.R., Gaetz, J.,
Liu, J.Z., Danuser, G. and Kapoor, T.M.
(2007) Nature Cell Biology, 9, 1233.
63 Mikhailov, A.V. and Gundersen, G.G.
(1995) Cell Motility and the Cytoskeleton,
32, 173.
8
Harnessing Biological Motors to Engineer Systems
for Nanoscale Transport and Assembly
Anita Goel and Viola Vogel
By considering how the biological machinery of our cells carries out many different
functions with a high level of specificity, we can identify a number of engineering
principles that can be used to harness these sophisticated molecular machines for
applications outside their usual environments. Here, we focus on two broad classes of
nanomotors that burn chemical energy to move along linear tracks: assembly
nanomotors and transport nanomotors.
8.1
Sequential Assembly and Polymerization
The molecular machinery found in our cells is responsible for the sequential assembly of complex biopolymers from their component building blocks (monomers): polymerases make DNA and RNA from nucleotides, and ribosomes construct proteins from amino acids. These assembly nanomotors operate in conjunction with a master DNA or RNA template that defines the order in which individual building blocks must be incorporated into a new biopolymer. In addition to recognizing and binding the correct substrates (from a pool of many different ones), the motors must also catalyze the chemical reaction that joins them into a growing polymer chain. Moreover, both types of motors have evolved highly sophisticated mechanisms so that they are able not only to discriminate the correct monomers from the wrong ones, but also to detect and repair mistakes as they occur [1].
Molecular assembly machines or nanomotors (Figure 8.1a) must effectively discriminate between substrate monomers that are structurally very similar. Polymerases must be able to distinguish between different nucleosides, and ribosomes need to recognize particular transfer RNAs (tRNAs) that carry a specific amino acid. These well-engineered biological nanomotors achieve this by pairing complementary Watson-Crick base pairs and comparing the geometrical fit of the monomers to their respective polymeric templates. This molecular discrimination makes use of
the precision of monomer selection and the inbuilt proofreading machinery for
monomer repair that nanomotors have. Building such copolymers with polymerase
nanomotors ex vivo would yield much more homogeneous products of the correct
sequence and precise length. Natural (e.g., nanomotor-enabled) designs could
inspire new technologies to synthesize custom biopolymers precisely from a given
blueprint.
Ribosome motors have likewise been harnessed ex vivo to drive the assembly of new bioinorganic heterostructures [27] and peptide nanowires [28, 29] with gold-modified amino acids inserted into a polypeptide chain. These ribosomes are forced to use inorganically modified tRNAs to sequentially assemble a hybrid protein containing gold nanoparticles wherever the amino acid cysteine was specified by the messenger RNA template. Such hybrid gold-containing proteins can then attach themselves selectively to materials used in electronics, such as gallium arsenide [28]. This application illustrates how biomotors could be harnessed to synthesize and assemble even nonbiological constructs such as nanoelectronic components (see www.cambrios.com).
Assembly nanomotors achieve such high precision in sequential assembly by making use of three key features: (i) geometric shape-fitting selection of their building blocks (e.g., nucleotides); (ii) motion along a polymeric template coupled to consumption of an energy source (e.g., hydrolysis of ATP molecules); and (iii) intricate proofreading machinery to correct errors as they occur. Furthermore, nanomotor-driven assembly processes allow much more stable, precise and complex nanostructures to be engineered than can be achieved by thermally driven self-assembly techniques alone [30–32].
We should also ask whether some of these principles, which work so well at the
nanoscale, could be realized at the micrometer scale as well. Whitesides and
coworkers, for example, have used simple molecular self-assembly strategies, driven
by the interplay of hydrophobic and hydrophilic interactions, to assemble microfabricated objects at the mesoscale [33, 34]. Perhaps the design principles used by
nanomotors to improve precision and correct errors could also be harnessed to
engineer future ex vivo systems at the nanoscale, as well as on other length scales.
Learning how to engineer systems that mimic the precision and control of nanomotor-driven assembly processes may ultimately lead to efficient fabrication of
complex nanoscopic and mesoscopic structures.
8.2
Cargo Transport
Cells routinely use another set of nanomotors (i.e., transport nanomotors) to recognize, sort, shuttle and deliver intracellular cargo along filamentous freeways to well-defined destinations, allowing molecules and organelles to become highly organized (for reviews, see Refs. [35–44]). This is essential for many life processes. Motor proteins transport cargo along cytoskeletal filaments to precise targets, concentrating molecules in desired locations. In intracellular transport, myosin motors are guided by actin filaments, whereas dynein and kinesin motors move along rodlike microtubules. Figure 8.2a illustrates how conventional kinesins transport molecular cargo along nerve axons towards the periphery, efficiently transporting material from the cell body to the synaptic region [45]. Dyneins, in contrast, move cargo in the opposite direction, so that there is active communication and recycling between both ends (see reviews [42, 46]). In fact, the blockage of such bidirectional cargo transport along nerve axons can give rise to substantial neural disorders [47–50].
The long-range guidance of cargo is made possible by motors pulling their cargo along filamentous rods. Microtubules, for example, are polymerized from the dimeric
tubulin into protofilaments that assemble into rigid rods around 30 nm in diameter [36]. These polymeric rods are inherently unstable: they polymerize at one end (plus) while depolymerizing from the other (minus) end, giving rise to a structural polarity. The biological advantage of using transient tracks is that they can be rapidly reconfigured on demand and in response to changing cellular needs, or to various external stimuli. Highly efficient unidirectional cargo transport is realized in cells by bundling microtubules into transport highways where all microtubules are oriented in the same direction. Excessively tight bundling of microtubules, however, can greatly impair the efficiency of cargo transport, by blocking the access of motors and cargo to the microtubules in the bundle interior. Instead, microtubule-associated proteins are thought to act as repulsive polymer brushes, thereby regulating the proximity and interactions between neighboring microtubules [51].
Traffic control is an issue when using the filaments as tracks on which kinesin and dynein motors move in opposite directions. Although different cargoes can be selectively recognized by different members of the motor protein families and shuttled to different destinations, what happens if motors moving in opposite directions encounter each other on the same protofilament (Figure 8.2b)? If two of these motors happen to run into each other, kinesin seems to have the right of way. As kinesin binds the microtubule much more strongly, it is thought to force dynein to step sideways to a neighboring protofilament [52]. Dynein shows greater lateral movement between protofilaments than kinesin [52–54], as there is a strong diffusional component to its steps [55]. When a microtubule becomes overcrowded with only kinesins, the runs of individual kinesin motors are minimally affected. But when a microtubule becomes overloaded with a mutant kinesin that is unable to step efficiently, the average speed of wild-type kinesin is reduced, whereas its processivity is hardly changed. This suggests that kinesin remains tightly bound to the microtubule when encountering an obstacle and waits until the obstacle unbinds and frees the binding site for kinesin's next step [56].
8.2.1
Engineering Principle No. 2: Various Track Designs
[Figure 8.3 Track designs to guide nanomotor-driven filaments ex vivo. A variety of track designs have been used. (a) A chemical edge (adhesive stripes coated with kinesin surrounded by nonadhesive areas); the filament crosses the chemical edge and ultimately falls off, as it does not find kinesins on the nonadhesive areas [61]. (b) Steep channel walls keep the microtubule on the desired path as it is forced to bend [61, 65]. (c) Overhanging walls have been shown to have the highest guidance efficiency [64]. (d) Electron micrograph of a microfabricated open channel with overhanging walls [64]. (e) ...]
be collectively propelled forward [45]. The head domains of the kinesin and myosin motors can rotate and swivel with respect to their feet domains, which are typically bound in random orientations to the surface. These motor heads detect the structural anisotropy of the microtubules and coherently work together to propel a filament forward [59, 60].
Various examples of such inverted designs for motor tracks have been engineered to guide filaments efficiently. Some of these are illustrated in Figure 8.3. Inverted motility assays can be created, for example, by laying down tracks of motor proteins in microscopic stripes of chemical adhesive on an otherwise flat, protein-repellent surface, surrounded by nonadhesive surface areas. Such chemical patterns (Figure 8.3a) have been explored to guide actin filaments or microtubules. The loss rate of guided filaments increases exponentially with the angle at which they approach an adhesive/nonadhesive contact line [61]. The passage of the contact line by filaments at nongrazing angles, followed by their drop-off, can be prevented by using much narrower lanes whose size is of the order of the diameter of the moving object. Such nanoscale kinesin tracks provide good guidance and have been fabricated by nanotemplating [62].
Alternatively, considerably improved guidance has been accomplished by topographic surface features (Figure 8.3b). Microtubules hitting a wall are forced to bend along this obstacle and will continue to move along the wall [63–66]. The rigidity of the polymeric filaments used as shuttles thus greatly affects how tracks should be designed for optimal guidance. Whereas microtubules with a persistence length of a few millimeters can be effectively guided in channels a few micrometers wide as they are too stiff to turn around [61], the much more flexible actin filaments require channel widths in the submicrometer range [67, 68]. Finally, the best long-distance guidance of microtubules has been obtained so far with overhanging walls [64, 69] (Figure 8.3c). The concept of topographic guidance in fact works so well that swarms of kinesin-driven microtubules have been used as independently moving probes to image unknown surface topographies. After averaging all their trajectories in the focal plane for an extended time period, the image grayscale is determined by the probability of a surface pixel being visited by a microtubule in a given time frame [70].
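The probability read-out lends itself to a very small sketch: accumulate all microtubule trajectories into a visit-count image and normalize. The helper below is hypothetical and assumes trajectories already expressed in pixel coordinates.

```python
import numpy as np

def visit_probability_map(trajectories, shape):
    """Accumulate gliding-microtubule trajectories into a visit-probability
    image, as in topography imaging with MT probes.

    trajectories : iterable of (row, col) integer pixel positions over time
    shape        : (rows, cols) of the output image
    """
    counts = np.zeros(shape, dtype=float)
    for traj in trajectories:
        for r, c in traj:
            counts[r, c] += 1          # count every visit of every MT
    return counts / counts.max()       # grayscale ~ visit probability
```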
But how can tracks be engineered to produce unidirectional cargo transport? All the motor-propelled filaments must move in the same direction to achieve effective long-distance transport. When polar filaments land from solution onto a motor-covered surface, however, their orientations and initial directions of movement are often randomly distributed. Initially, various physical means, such as flow fields [71], have been introduced to promote their alignment. Strong flows eventually either force gliding microtubules to move along with the flow, or force microtubules, if either their plus or minus end is immobilized on a surface [72], to rotate around the anchoring point and along with the flow. The most universal way to control the local direction in which the filamentous shuttles are guided is to make use of asymmetric channel features. Figure 8.3d–f illustrates how filaments can be actively sorted according to their direction of motion by breaking the symmetry of the engineered tracks. This local directional sorting has been demonstrated on surfaces patterned with open-channel geometries, where asymmetric intersections are followed by dead-ended channels (that is, reflector arms), or where channels are broadened into arrow heads. Both of these topographical features not only selectively pass filaments moving in the desired direction, but can also force filaments moving in the opposite direction to turn around [65, 69, 73, 74]. Once directional sorting has been accomplished, electric fields have been used to steer the movement of individual microtubules as they pass through engineered intersections [75, 76].
In addition to using isolated nanomotors, hybrid biodevices and systems that harness self-propelling microbes could be used to drive transport processes along engineered tracks. Flagellated bacteria, for example, have been used to generate both translational and rotational motion of microscopic objects [77]. These bacteria can be attached head-on to solid surfaces, either via polystyrene beads or polydimethylsiloxane, thereby enabling the cell bodies to form a densely packed monolayer while their flagella continue to rotate freely. In fact, a microrotary motor, fuelled by glucose and comprising a 20 µm-diameter silicon dioxide rotor, can be driven along a silicon track by the gliding bacterium Mycoplasma [78]. Depending on the specific application and the length scale on which transport needs to be achieved, integrating bacteria into such biohybrid devices (which work under physiological conditions) might ultimately prove more robust than relying solely on individual nanomotors.
8.3
Cargo Selection
A common approach for attaching cargo is to functionalize the filament shuttles with antibodies, or to biotinylate microtubules and coat the cargo with avidin or streptavidin (Figure 8.4) (for reviews, see Refs [74, 79]), as done for polymeric and magnetic beads [84, 85] (Figure 8.4a), gold nanoparticles [86–88], DNA [87, 89, 90] and viruses [79, 81] (Figure 8.4b), and finally mobile bioprobes and sensors [80, 81, 91] (Figure 8.4c). However, if too much cargo is loaded onto the moving filaments and access of the propelling motors is even partially blocked, the transport velocity can be significantly impaired [92]. Finally, the binding of cargo to a moving shuttle can be used to regulate its performance. In fact, microtubules have recently been furnished with a backpack that self-supplies the energy source ATP: cargo particles bearing pyruvate kinase have been tethered to the microtubules to provide a local ATP source [93] (Figure 8.4c). The coupling of multiple motors to cargo or other scaffold materials can affect the motor performance. If single-headed instead of double-headed kinesins are used, cooperative interactions between the monomeric motors attached to protein scaffolds increase hydrolysis activity and microtubule gliding velocity [59].
At the next level of complexity, successful cargo tagging, sorting and delivery will depend on the engineering of integrated networks of cargo loading, cargo transport and cargo delivery zones. Although the construction of integrated transport circuits is still in its infancy, microfabricated loading stations have been built [88] (Figure 8.5). The challenge here is to immobilize cargo on loading stations such that it is not easily detached by thermal motion, yet to allow rapid cargo transfer to passing microtubules. By properly tuning bond strength and multivalency, and most importantly by taking advantage of the fact that mechanical strain weakens bonds, cargo can be efficiently stored on micropatches and transferred after colliding with a microtubule [88].
Oriented filament networks have been assembled, for example, by capturing short microtubule seeds in kinesin-coated channels that contain reflector arms. Once oriented by self-propelled motion, the seedlings were polymerized into mature microtubules that were confined to grow in the open channels until the channels were filled with dense networks of microtubules, all oriented in the same direction [97]. Single kinesins take only a few hundred steps before they fall off, but the walking distance can be greatly increased if the cargo is pulled by more than one motor [98]. Such approaches to fabricating networks of microtubule bundles could be further expanded to engineer future devices that use either the full toolbox of native scaffolding proteins or new scaffolding proteins that target both biological and synthetic cargo.
Nanoengineers would not be the first to harness biological motors to transport their cargo. Various pathogens are known to hijack microtubule- or actin-based transport systems within host cells (reviewed in Ref. [99]). Listeria monocytogenes, for example, propels itself through the host cell cytoplasm by means of a fast-polymerizing actin filament tail [100]. Likewise, the vaccinia virus, a close relative of smallpox, uses actin polymerization to enhance its cell-to-cell spreading [101], and the alpha herpesvirus hijacks kinesins to achieve long-distance transport along the microtubules of neuronal axons [102]. Signaling molecules and pathogens that cannot alter cell function and behavior by simply passing the outer cell membrane can thus hijack the cytoskeletal highways to get transported from the cell periphery to the nucleus.
8.3.2
Engineering Principle No. 4: active transport of tailored drugs and gene carriers
8.4
Quality Control
A hallmark of living systems is the ability to recognize and repair defects. Living systems use numerous quality-control procedures to detect and repair defects occurring during the synthesis and assembly of biological nanostructures; as yet, this has not been possible in synthetic nanosystems. Many cellular mechanisms for damage surveillance and error correction rely on nanomotors. Such damage control can occur at two different levels, as follows.
8.4.1
Engineering Principle No. 5: error recognition and repair at the molecular level
8.4.2
Engineering Principle No. 6: error recognition and repair at the system level
8.5
External Control
8.5.1
Engineering Principle No. 7: performance regulation on demand
Nanomotor performance can be regulated by several external handles, including the stretching of substrate molecules such as DNA [13]. Although this external control over nanomotors has been demonstrated in a few different contexts ex vivo, a rich, detailed mechanistic understanding of how such external control knobs can modulate the dynamics of a molecular motor is emerging from recent work on the DNA polymerase motor [9, 107, 116, 121, 127, 128, 131].
Remote-controlling the local ATP concentration by the photoactivated release of caged ATP allows a nanomotor-driven transport system to be accelerated or stopped on demand [84]. External control knobs or regulators can also be engineered into the motors. For instance, point mutations can be introduced into the gene encoding the motor protein, such that it is engineered to respond to light, temperature, pH or other stimuli [43, 85]. Engineering light-sensitive switches into nanomotors enables the ATPase rate [43, 132] to be regulated, thereby providing an alternate handle for tuning the motor's speed, even while the ATP concentration is kept constant and high. When additional ATP-consuming enzymes are present in solution, the rate of ATP depletion regulates the distance the shuttles move after being activated by a light pulse and before again coming to a halt [84].
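The run length after a light pulse can be estimated by coupling Michaelis–Menten gliding kinetics to first-order ATP depletion. The rate constants in the following sketch are assumptions for illustration, not values from Ref. [84]:

    import numpy as np

    v_max = 0.8     # maximal gliding speed (um/s); assumed
    Km = 50.0       # Michaelis constant of the motor for ATP (uM); assumed
    k_dep = 0.01    # first-order ATP depletion rate by consuming enzymes (1/s)
    atp = 200.0     # ATP concentration released by the light pulse (uM)

    dt, t_max = 0.1, 1200.0
    distance = 0.0
    for _ in np.arange(0.0, t_max, dt):
        v = v_max * atp / (Km + atp)    # Michaelis-Menten gliding velocity
        distance += v * dt
        atp *= np.exp(-k_dep * dt)      # exponential ATP depletion

    print(f"approximate run length before halting: {distance:.0f} um")

With these numbers the shuttle covers on the order of a hundred micrometers before the ATP level, and with it the velocity, collapses.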
Future applications could require that, instead of all the shuttles being moved at the same time, only those in precisely defined locations be activated on demand. Some of the highly conserved residues within motors help to determine the motor's ATPase rate [43]. Introducing chemical switches near those locations might provide a handle for chemical manipulation of the motor's speed. In fact, this has already been realized for a rotary motor [132], as well as for a linear kinesin motor, where the insertion of a Ca2+-dependent chemical switch makes the ATPase activity steeply dependent on the Ca2+ concentration [133]. In addition to caged ATP, caged peptides that block binding sites could be used to regulate the motility of such systems. Caged peptides derived from the kinesin C-terminus domain have already been used to achieve photocontrol of kinesin–microtubule motility [134]. Instead of modulating the rate of ATP hydrolysis, the access of microtubules to the motor's head domain can also be blocked in an environmentally controlled manner. In fact, temperature has already been shown to regulate the number of kinesins that are accessible while embedded in a surface-bound film of thermoresponsive polymers [135].
The nanomotor-driven assembly of DNA by the DNA polymerase motor provides an excellent example of how precision control over a nanomotor can be achieved by various external knobs in the motor's environment [107, 116, 127, 128]. The DNAp motor moves along the DNA template by cycling through a given sequence of geometric shape changes. The sequence of shapes, or internal states, of the nanomachine can be denoted by nodes on a simple network [107, 116, 127, 128]. As illustrated in Figure 8.8, this approach elucidates how mechanical tension on a DNA molecule can precisely control (or tune) the nanoscale dynamics of the polymerase motor along the DNA track by coupling into key conformational changes of the motor [107]. Macroscopic knobs to precision-control the motor's movement along DNA tracks can be identified by probing how the motor's dynamics vary with each external control knob (varied one at a time). Efforts are currently under way to control even more precisely the movement of these nanomotors along DNA tracks by tightly controlling the parameters in the motor's environment (see www.nanobiosym.com). Concepts of fine-tuning and robustness could also be extended to describe the sensitivity of other nanomotors (modeled as simple biochemical networks) to various external control parameters [107]. Furthermore, such a network approach [107] provides experimentally testable predictions that could aid the design of future molecular-scale manufacturing methods that integrate nanomotor-driven assembly schemes. External control of these nanomotors will be critical in harnessing them for nanoscale manufacturing applications.
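The network picture can be made concrete with a toy kinetic cycle: the nodes are internal states of the motor, the edges carry transition rates, and an external knob such as template tension rescales a force-sensitive rate through a Bell-type factor exp(-F·d/kBT). The three-state scheme and all parameters below are hypothetical illustrations, not the published DNAp model:

    import numpy as np

    kBT = 4.1e-21   # thermal energy (J)
    d = 1e-9        # assumed distance to the transition state (m)

    def cycle_rate(force):
        """Steady-state cycling rate of a toy three-state motor network.

        Only the first step is taken to be slowed by tension on the
        template (Bell factor); the other two steps are force-insensitive.
        """
        k1 = 100.0 * np.exp(-force * d / kBT)   # force-sensitive step (1/s)
        k2, k3 = 300.0, 500.0                   # force-insensitive steps (1/s)
        # For an irreversible cycle, the mean cycle time is the sum of dwells
        return 1.0 / (1.0 / k1 + 1.0 / k2 + 1.0 / k3)

    for F_pN in (0, 2, 5, 10):
        print(f"tension {F_pN:2d} pN -> {cycle_rate(F_pN * 1e-12):6.1f} cycles/s")

Probing how the cycling rate responds as each knob is varied in turn, as described above, identifies which internal transition that knob couples into.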
8.6
Concluding Remarks
We have reviewed several key engineering design principles that enable nanomotors
moving along linear templates to perform a myriad of tasks. Equally complex
biomimetic tasks have not yet been mastered ex vivo, either by harnessing biological motors or via synthetic analogues. Engineering insights into how such tasks are carried out by biological nanosystems will inspire new technologies that harness nanomotor-driven processes to build new systems for nanoscale transport and assembly.
Sequential assembly and nanoscale transport, combined with features currently
attributed only to biological materials, such as self-repair and healing, might one day
become an integral part of future materials and biohybrid devices. In the near term,
molecular biology techniques could be used to synthesize and assemble nanoelectronic components with more control (www.cambrios.com; see also Ref. [29]).
Numerous proof-of-concept experiments using nanomotors integrated into synthetic
microdevices have already been demonstrated (for reviews, see Refs. [74, 136]).
Among many others, these applications include stretching surface-bound molecules
by moving microtubules [87, 90]; probing the lifetime of a single receptor–ligand
interaction via a cantilevered microtubule that acts as a piconewton force sensor [85];
topographic surface imaging by self-propelled probes [70]; and cargo pick-up from
loading stations [88] as illustrated in Figure 8.5.
Although much progress is being made in the synthesis of artificial motors (see Ref. [137]), it has been difficult, in practice, to synthesize artificial motors that come even close in performance to their natural counterparts (see Ref. [39]). Harnessing biological motors to perform nanoscale manufacturing tasks might thus be the best near-term strategy. Although many individual nanoparts can be easily manufactured, the high-throughput assembly of these nanocomponents into complex structures is still nontrivial. At present, no ex vivo technology exists that can actively guide such nanoscale assembly processes. Despite advances in deciphering the underlying engineering design principles of nanomotors, many hurdles still impede harnessing them for ex vivo transport and sequential assembly in nanosystems. Although the use of biological nanomotors puts intrinsic constraints on the conditions under which they can be assembled and used in biohybrid devices, many of their sophisticated tasks are still poorly mimicked by synthetic analogues. Understanding the details of how these little nanomachines convert chemical energy into controlled movements will nevertheless inspire new approaches to engineering synthetic counterparts that might some day be used under harsher conditions, operate at more extreme temperatures, or simply have longer shelf lives.
Certain stages of the materials production process might one day be replaced by nanomotor-driven sequential self-assembly, allowing much more control at the molecular level. Biological motors are already being used to drive the efficient fabrication of complex nanoscopic and mesoscopic structures, such as nanowires [31] and supramolecular assemblies. Techniques for the precision control of nanomotors that read DNA are also being used to engineer integrated systems for rapid DNA detection and analysis (www.nanobiosym.com). The specificity and control of assembly and transport shown by biological systems offer many opportunities to those interested in the assembly of complex nanosystems. Most importantly, the intricate schemes of proofreading and damage repair (features that have not yet been realized in any man-made nanosystems) should provide inspiration for those interested in producing synthetic systems capable of similarly complex tasks.
Acknowledgments
We thank Sheila Luna, Christian Brunner and Jennifer Wilson for the artwork, and
all of our collaborators who contributed thoughts and experiments. At the same
time, we apologize to all authors whose work we could not cite owing to space
limitations.
Correspondence and requests for materials should be addressed to A.G. or V.V.
References
1 Rodnina, M.V. and Wintermeyer, W. (2001) Fidelity of aminoacyl-tRNA selection on the ribosome: kinetic and structural mechanisms. Annual Review of Biochemistry, 70, 415–435.
2 Kunkel, T.A. (2004) DNA replication fidelity. The Journal of Biological Chemistry, 279, 16895–16898.
3 Erie, D.A., Hajiseyedjavadi, O., Young, M.C. and von Hippel, P.H. (1993) Multiple RNA polymerase conformations and GreA: control of the fidelity of transcription. Science, 262, 867.
4 Liu, D.R., Magliery, T.J., Pastrnak, M. and Schultz, P.G. (1997) Engineering a tRNA and aminoacyl-tRNA synthetase for the site-specific incorporation of unnatural amino acids into proteins in vivo. Proceedings of the National Academy of Sciences of the United States of America, 94, 10092–10097.
5 Bustamante, C., Smith, S.B., Liphardt, J. and Smith, D. (2000) Single-molecule studies of DNA mechanics. Current Opinion in Structural Biology, 10, 279–285.
6 Davenport, R.J., Wuite, G.J.L., Landick, R. and Bustamante, C. (2000) Single-molecule study of transcriptional pausing and arrest by E. coli RNA polymerase. Science, 287, 2497–2500.
7 Greulich, K.O. (2005) Single-molecule studies on DNA and RNA. ChemPhysChem, 6, 2459–2471.
8 Wang, M.D. et al. (1998) Force and velocity measured for single molecules of RNA polymerase. Science, 282, 902–907.
82 Muthukrishnan, G., Hutchins, B.M., Williams, M.E. and Hancock, W.O. (2006) Transport of semiconductor nanocrystals by kinesin molecular motors. Small, 2, 626–630.
83 Taira, S. et al. (2006) Selective detection and transport of fully matched DNA by DNA-loaded microtubule and kinesin motor protein. Biotechnology and Bioengineering, 95, 533–538.
84 Hess, H., Clemmens, J., Qin, D., Howard, J. and Vogel, V. (2001) Light-controlled molecular shuttles made from motor proteins carrying cargo on engineered surfaces. Nano Letters, 1, 235–239.
85 Hess, H., Howard, J. and Vogel, V. (2002) A piconewton forcemeter assembled from microtubules and kinesins. Nano Letters, 2, 1113–1115.
86 Boal, A.K., Bachand, G.D., Rivera, S.B. and Bunker, B.C. (2006) Interactions between cargo-carrying biomolecular shuttles. Nanotechnology, 17, 349–354.
87 Diez, S. et al. (2003) Stretching and transporting DNA molecules using motor proteins. Nano Letters, 3, 1251–1254.
88 Brunner, C., Wahnes, C. and Vogel, V. (2007) Cargo pick-up from engineered loading stations by kinesin driven molecular shuttles. Lab on a Chip, 7, 1263–1271.
89 Ramachandran, S., Ernst, K.H., Bachand, G.D., Vogel, V. and Hess, H. (2006) Selective loading of kinesin-powered molecular shuttles with protein cargo and its application to biosensing. Small, 2, 330.
90 Dinu, C.Z. et al. (2006) Parallel manipulation of bifunctional DNA molecules on structured surfaces using kinesin-driven microtubules. Small, 2, 1090–1098.
91 Soldati, T. and Schliwa, M. (2006) Powering membrane traffic in endocytosis and recycling. Nature Reviews. Molecular Cell Biology, 7, 897–908.
92 Bachand, M., Trent, A.M., Bunker, B.C. and Bachand, G.D. (2005) Physical factors affecting kinesin-based transport of
Part Four:
Innovative Disease Treatments and Regenerative Medicine
9
Mechanical Forces Matter in Health and Disease:
From Cancer to Tissue Engineering
Viola Vogel and Michael P. Sheetz
9.1
Introduction: Mechanical Forces and Medical Indications
One of our earliest experiences showing that mechanical forces matter goes back to when we got our first blisters. Excessive friction causes a tear between the upper layer of the skin (the epidermis) and the layers beneath. When these skin layers, which in healthy skin are held together by cell–cell adhesion complexes, begin to separate, the resultant pocket fills with serum or blood. In some people with inherited skin diseases, blistering occurs much more easily, and studies of the point mutations that cause easy blistering have provided considerable insights into the underlying molecular mechanisms. Molecular defects can exist in different intracellular and extracellular proteins that are responsible for weakening the mechanical strength of cell–cell adhesions. The proteins implicated by genetic analysis include keratins, laminins, collagens and integrins [1–3]. Unfortunately, exactly how mutations in these proteins regulate the mechanical stability of the linkages that cells form with their environment remains unknown.
Mechanical forces acting on cells also affect our lives in many other, often unexpected, ways. Regular exercise, for example, not only strengthens our body tone but also offers protection against mortality by delaying the onset of various diseases. It is thought that physical training reduces the chance of chronic heart diseases, atherosclerosis and also type 2 diabetes [4]. But how can exercise have such a profound impact on so many diseases? Chronic low-grade systemic inflammation is a feature of these and many other chronic diseases, and has been correlated with elevated levels of several cytokines [5–7]. By as-yet unknown mechanisms, regular exercise is suggested to induce anti-inflammatory processes, thus suppressing the production of pro-inflammatory signaling proteins [5, 8].
Many more severe diseases for which we do not have cures also have a mechanical origin, or show abnormalities in cellular mechanoresponses. These range from cancer to cardiovascular disorders, and from osteoporosis to other aging-related diseases. In the case of many cancers, the cells grow inappropriately and with the wrong mechanoresponse, which in turn destroys normal tissue mechanics and often also tissue function [9–11]. While cardiovascular diseases have many forms, cardiac hypertrophy, plaque formation and heart repair are obvious cases where mechanosensory functions are important [12–14]. Abnormal mechanical forces can trigger an aberrant proliferation of endothelial and smooth muscle cells, as observed in the progression of vascular diseases such as atherosclerosis [15]. There is, furthermore, emerging evidence that immune synapse formation is a mechanically driven process [16]. Finally, damaged tissue is often repaired by new cells that differentiate from pluripotent cells to finally replace and regenerate the damaged regions. Successful healing includes re-establishing the proper mechanical tissue characteristics; even bioscaffolds that are used in reconstructive surgery heal best if they are mechanically exercised [14, 17]. Thus, from molecules to tissue, although the mechanical aspects are recognized as being critical, relatively little has yet been done to correlate mechanical effects with biochemical signal changes, and to determine how these impact clinical outcomes.
There is, therefore, overwhelming evidence that physical, and not just biochemical, stimuli matter in tissue growth and repair in health and disease. But how do cells sense mechanical forces? A complete answer cannot yet be given, as too few techniques have been available in the past to explore this question. However, the broad availability of nanoanalytical and nanomanipulation tools is beginning to have an impact. This tool chest provides novel opportunities to decipher how physical and biochemical factors, in combination, can orchestrate the hierarchical control of cell and tissue functions (we will illustrate this point with some concrete examples later in the chapter). The diversity of biological forms in different organisms most likely belies a wide range of mechanosensing mechanisms that are specifically engineered to provide the desired morphology.
To summarize, based on the progress that has been made recently in the field of cell biomechanics, it is now clear that individual cells are dramatically affected in their functions, from growth to differentiation, by the mechanical properties of their environments and by externally applied forces (for reviews, see Refs [18–21]). But the question remains: how do cells sense mechanical forces, and how are mechanical stimuli translated at the nanoscale into biochemical signal changes that ultimately regulate cell function? A few examples are illustrated here of how physical junctions are formed between cells and their environment, how mechanical forces acting on molecules associated with junctions regulate their functional states, and what the downstream implications might be for cell signaling events. Considering the complexity of the puzzle, this chapter cannot provide a comprehensive review; rather, we will focus on describing a few selected molecular players involved in mechanochemical signal conversion, followed by a discussion of the associated signaling pathways and subsequent cellular responses, concluding with the role of physical stimuli in various diseases. Once the molecular pathways are identified, and the mechanisms deciphered by which force regulates diverse cell functions, the development of new drugs and therapies will surely follow. In particular, it is expected that in the future a number of diseases associated with altered mechanoresponses will be resolved more efficiently by treating the source of the problem, rather than the symptoms.
9.2
Force-Bearing Protein Networks Hold the Tissue Together
The search for proteins that are structurally altered if mechanically stretched, and which could thus serve as force sensors for cells, should start in the junctions between cells and their environments. The focus should first be on the junctions that experience the highest tensile forces. The force-bearing elements in tissues are typically the cytoskeleton and extracellular matrix (ECM) fibers, together with all the proteins that physically link the cell interior to the exterior. For different tissues, the major force-bearing elements can differ.
9.2.1
Cell–Cell Junctions
Some tissues, such as epithelial and endothelial cell layers, have barrier functions (Figure 9.1), where the majority of the force is borne by the tight cell–cell junctions. These junctions couple the cell–cell adhesion molecules (cadherins) that hold the cells together to the cytoplasmic proteins that ultimately link cadherins to the actin, myosin and intermediate filaments of the cytoskeleton [22].
Epithelial tissue lines both the outside of the body (skin) and the cavities that are connected to the outside, such as the lungs and the gastrointestinal tract. Epithelial cells assume packing geometries in junctional networks that are characterized by different cell shapes, numbers of neighboring cells and contact areas. The development of specific packing geometries is tightly controlled [23].
Endothelial cells line the tight barrier between the circulating blood and the
surrounding vessel wall. A synchronized migration of endothelial cells is required
in order to grow blood vessels (the process of angiogenesis). When a new blood vessel is
forming, for example in response to a lack of oxygen, the endothelial cells must
maintain their cell–cell contacts, remain anchored to the basement membrane, and
form curved continuous surfaces [24]; otherwise, the walls of the growing vessels
would become leaky. Blood vessel formation is thus a tightly regulated process.
9.2.2
Cell–Matrix Junctions
In contrast to the tight cell–cell junctions, cells can also form junctions with surrounding extracellular fibers. The ECM, which is abundant in connective tissue, includes the interstitial matrix and basement membranes. The ECM provides structural support to the cells (Figure 9.1), in addition to performing many other important functions that regulate cell behavior. Cell–matrix contacts are formed by integrins; these molecules can link various ECM proteins, including fibronectin, vitronectin, laminins and collagens, via cytoplasmic adaptor proteins to the cytoskeleton. During the formation or regeneration of tissue, major cell movements occur on or through the ECM, such that the cell–matrix junctions enable and facilitate integrin-mediated tissue growth, remodeling and repair processes. Integrins are also required for the assembly of the ECM (for reviews, see Ref. [2]). The fibronectin matrix, the assembly of which is upregulated in embryogenesis and wound-healing processes, often serves as an early provisional matrix that is reinforced at later stages, for example by collagen deposition [27]. Integrins thus mediate the regulatory functions of the ECM on cell migration, growth and differentiation. During wound healing, angiogenesis and tumor invasion, cells often change their expression profiles of fibronectin-binding integrins [28, 29]. Integrin–matrix interactions thus play central roles in regulating cell migration, invasion and extra- and intravasation (i.e. moving from the vasculature to the tissue, or vice versa), as well as in platelet interactions and wound healing [24, 29–36]. The functional roles of these interactions in health and disease will be discussed in much more detail below.
The forces acting on cell–cell or cell–matrix junctions can either be applied externally, or be generated by the contractile cytoskeleton. Shear stresses due to the flow of blood, urine and other body fluids impart forces on the endothelial blood vessel linings, the epithelial linings of the urinary tract and bone cells, respectively, and are known to actively influence cell morphology, function and tissue remodeling (for reviews, see Refs [12, 37–41]). Lung expansion and contraction imposes great strain on lung tissue, and the mechanical forces exerted on the lung epithelium are a major regulator of fetal lung development, as well as of overall pulmonary physiology [42]. Mechanical exercising of the lung also triggers the release of surfactants onto the epithelial surface [43]. Consequently, the levels of force generated and transmitted through cell–cell and cell–matrix junctions can change drastically with time, and between different organ tissues.
The forces that cells apply to their neighbors and matrices are furthermore dependent on the rigidity of their environment. Cell-generated traction forces are lowest in soft tissues and increase with the rigidity of the organ. The brain is one of the softest tissues, whereas bone cells find themselves in one of the stiffest microenvironments of the body [18]. Yet, in all of these tissues the cells generate forces that provide the basis of active mechanosensing and mechanochemical signal conversion processes. The formation of force-bearing protein networks that connect the contractile cytoskeleton of cells with their surroundings is essential to prevent apoptosis in most normal cells.
What is missing is a mechanistic understanding of how the forces that are applied to cells are locally sensed and finally regulate a collective response of many cells to produce the proper tissue morphology and morphological transformations (Figure 9.2). Although this may be general for all tissues, in endothelia there is a
polarity and bending that must be controlled over many cell lengths. Studies of the development of fly wings and of convergent extension in frogs have provided some important clues about mechanisms that can establish an axis in a tissue that would then result in axial contractions. In the fly wing, there are gradients of proteins that affect wing organization by influencing its physical properties [23, 44, 45], and some of those proteins are asymmetrically distributed in the hexagonal wing cells. In many tissues, however, the cells can move and change partners while they change tissue morphology in a stereotypical way, indicating that the multicellular coordination does not rely solely upon stationary protein complexes but rather is sensitive to intercellular forces or curvature.
To better understand the cellular nanomachinery by which cells sense mechanical
stimuli, and how forces might synchronize cellular responses, it should be noted that
the cellular nanomachinery is subjected both to exogenously applied forces and to
cell-generated forces that the cells apply locally to the ECM and neighboring cells.
As compressive forces on cells are primarily counterbalanced by the hydrostatic
pressure of the cell volume that is contained by the plasma membrane, we will focus
here entirely on the impact of tensile forces on proteins and protein networks and the
subsequent changes in cell signaling.
When cells stretch their proteins, the protein structural changes may represent one
important motif by which mechanical factors can be translated into biochemical
signal changes in a variety of tissues and cell types (Figure 9.3). Many proteins
are involved in force-bearing networks that connect the cell interior with the exterior,
and they all are potential candidate proteins for mechanosensors (for reviews, see Refs [21, 26, 46–54]).
As cells actively bind, stretch and remodel their surroundings, they use a variety of specialized adhesion structures [25, 56], the molecular composition of which will be discussed below (see Section 9.4). Once formed, the first contacts either mature rapidly or break (see Sections 9.5 and 9.7). These structures mechanically link the cell cytoskeleton and the force-generating machinery within the cell to the ECM. Intracellular traction can thus generate large forces on the adhesive junctions, forces which are easily visualized as the strain applied by cells to stretchable substrates [57–59], as discussed in Section 9.8. In addition, focal contacts are not passively resistant to force; rather, force actively induces focal contact strengthening through the recruitment of additional focal adhesion proteins, and finally initiates intracellular signaling events [60–64] (Section 9.6). Cell-generated forces allow for rigidity sensing (Section 9.9), and cause matrix assembly and remodeling (Section 9.10). The matrix in turn regulates cell motility (Section 9.11). Ultimately, the structure and composition of the adhesions play regulatory roles in tissue formation and remodeling, and also control whether cells derail and evolve into cancer cells or cause other disease conditions (Section 9.12).
9.3
Nanotechnology has Opened a New Era in Protein Research
The advent of nanotech tools, particularly atomic force microscopy (AFM) and optical tweezers [65–67], followed by atomistic simulations of force-induced unfolding pathways [68], was a major milestone in recognizing the unique mechanical properties of proteins and other biopolymers. The first force measurements on single multimodular proteins were performed on titin, and revealed that the modules cannot be deformed continuously but rather rupture sequentially. But do cells take advantage of switching protein function mechanically? The first functional significance of proteins unfolding upon rapid tissue extension, for example when a muscle is overstretched, was seen in their serving as mechanical shock absorbers. Beyond muscle tissue, protein unfolding might be a much more common theme by which cells sense and transduce a broad range of mechanical forces into distinct sets of biochemical signals that ultimately regulate cellular processes, including adhesion, migration, proliferation, differentiation and apoptosis. The results of recent studies have shown that force-induced protein unfolding does indeed occur in cells and in their surrounding matrices [51–55, 69–71].
9.3.1
Mechanochemical Signal Conversion and Mechanotransduction
How, then, is force translated at the molecular level into biochemical signal changes (mechanochemical signal conversion) that have the potential to alter cellular behavior (mechanotransduction)? Despite all the experimental indications, only limited information is available on how mechanical forces alter the structure–function relationship of proteins and thus coregulate cell-signaling events. After a decade of new insights into single-molecule mechanics, a new field is beginning to emerge:
How can the force-induced mechanical unfolding of proteins and other biomolecules
switch their functions?
Through careful investigations of the conformational changes of isolated proteins that are mechanically stretched in vitro, and through computational simulations that have provided high-resolution structural information on the unfolding pathways of proteins, key design principles are beginning to emerge that describe how intracellular, extracellular and transmembrane proteins might sense mechanical forces and convert them into biochemical signal changes, as discussed below (for reviews, see Refs [21, 26, 72, 73]). Stretch has been shown experimentally to expose cryptic phosphorylation sites, resulting in the onset of a major signaling cascade [51], to increase the reactivity of cysteines [52], and also to induce fibronectin fibrillogenesis (for a review, see Ref. [26]). Yeast two-hybrid measurements, crystallographic analyses and high-performance steered molecular dynamics (SMD) calculations all indicate that the exposure of amphipathic helices (e.g. in talin, α-actinin) will cause binding to unstrained proteins (vinculin) or to the membrane, as detailed below. Thus, it seems that no single mechanism can account for all the mechanical activities sensed by cells. Consequently, there is a need to develop a detailed understanding of the mechanical steps in each function of interest, in order to elucidate which of these mechanisms is responsible, or whether a new one must be formulated.
Design principles are also emerging by which such mechanosensory elements are integrated into structural motifs of various proteins, the conformations of which can be switched mechanically (for reviews, see Refs [26, 47, 74–79]). Multidomain proteins that are large and have many interaction sites constitute a major class of potentially force-transducing proteins [26, 80]. Both matrix and cytoskeletal proteins fall into this class; for example, the cytoskeletal (titin, α-actinin, filamin, etc.) and membrane skeletal molecules (spectrin, dystrophin, ankyrin) have series of between four and 100 repeat domains that can be stretched over a range of forces. An important feature here is that the repeats are often structurally homologous, but differ in their mechanical stability. Indeed, the differences in the mechanical stability of the individual domains determine the time-dependent order in which their structure is altered by force, and consequently the sequence in which the molecular recognition sites are switched by force. Multimodularity thus provides a mechanism not only for sensing but also for transducing a broad range of strains into a graded alteration of biochemical functionalities. Matrix molecules also have multiple domains and presumably exhibit similar characteristics. In both cases, the stretching of molecules can either reveal sites which bind to and activate other proteins that could start a signaling cascade, or it can destroy recognition sites that are exposed only under equilibrium conditions [26].
9.3.2
Mechanical Forces and Structure–Function Relationships
As tensile force can stabilize proteins in otherwise short-lived structural intermediates, deciphering how the structure–function relationship of proteins is altered by mechanical forces may well open totally new avenues in biotechnology, systems biology, pharmaceutics and medicine. In order to summarize our current understanding and future opportunities, we will first identify the critical molecules that are involved in linking the cell outside to the inside, and then discuss current knowledge on the effect of force on protein structure, the associated force-regulated changes of protein function, and the downstream consequences. Cellular mechanotransduction systems can then transduce these primary physical signals into biochemical responses. More complex physical factors, such as matrix rigidity and the microscale and nanoscale textures of their environments, can be measured by cells through integrated force- and geometry-dependent transduction processes. Thus, it is important to differentiate between the primary sensory processes, the transduction processes, and the downstream mechanoresponsive pathways that integrate multiple biochemical signals from sensing and transduction events over space and time, as shown schematically in Figure 9.2. It has also been postulated that cytoskeletal filaments can directly transmit stresses to distant cytoskeletal transduction sites [81, 82], which would involve additional distant mechanosensory and transductional components. Even in those cases, the forces would be focused on sites where primary transduction would occur.
Beyond the unfolding of stretched proteins, there are also other mechanisms in place by which force can alter biochemical activities (see Box 9.1). Specific force-induced changes in motor protein velocity can lead to stalling of their movement, or to buckling of their respective filaments [74, 83, 84]. Stretch-sensitive ion channels exist where the membrane pressure can regulate the ion current [20, 85–87]. Finally, even the strongest noncovalent bonds, which last for days under equilibrium, break down within seconds under the tensile force generated by a single kinesin [88, 89]. Not surprisingly, some adhesive bonds have evolved that are not weakened but strengthened by force; these are also referred to as catch bonds (as reviewed in Refs [90–93]). However, most of these force-regulated processes do not have an evident link to changes in cellular-level functions, or the links are currently not understood. For example, motor protein velocity is not generally linked to mechanically induced changes in cellular function, and neither are the ion currents that accompany the stretch activation of ion channels. Thus, it is unclear whether observed mechanochemical responses are products of the primary transduction of mechanical stimuli.
Box 9.1
Activities altered by force-induced structural alterations.
9.4
Making the Very First Contacts
9.4.1
Molecular Players of Cell–Extracellular Matrix Junctions
Cell motility is regulated by the polymerization of actin, which drives the protrusion of the leading edge of the cell. Cells use lamellipodia and filopodia to feel their environment and to identify locations to which they can adhere. Lamellipodia are flat, thin extensions of the cell edge that are supported by branched actin networks, while filopodia are finger-like extensions of the cell surface supported by parallel bundles of actin filaments [98]. Both are involved in sensing the environment through cycles of extension and retraction, in the attachment of particles for phagocytosis, in the anchorage of cells on a substratum, and in the response to chemoattractants or other guidance cues [99, 100]. When cells encounter a ligand bound to an extracellular surface, the ligand might bind to a transmembrane protein and ultimately induce coupling of the transmembrane protein to the cytoskeleton. Integrins are the key transmembrane proteins that mediate cell–matrix interactions. Some integrins can recognize the tripeptide RGD, which is found, for example, in fibronectin, vitronectin and other matrix molecules, while other integrins bind specifically to collagens and laminins. Once a first bond (or set of bonds) is formed, a competition sets in between the time taken for a bond to break again and the cellular processes that can stabilize an early adhesion. The bond lifetime, however, is significantly decreased if a high tensile force is applied to it [101]. For example, without force, fibronectin can bind to α5β1 integrins.
Fibronectin contains three types of module, each of which has a different structural fold: 12 Fn type I domains, two Fn type II domains, and 15–17 Fn type III domains per Fn monomer. Both FnI and FnII domains contain two intrachain disulfide bonds, while FnIII domains are not stabilized by disulfides and are hence more susceptible to force-dependent unfolding. Fibronectin displays a number of surface-exposed molecular recognition sites for cells, including integrin binding sites such as the RGD loop, the PHSRN synergy site and the LDV sequence, as well as binding sites for other ECM components, including collagen, heparin and fibrin. A number of cryptic binding sites and surface-exposed binding sites have been proposed to be exposed or deactivated, respectively, as a result of force-dependent conformational changes (as reviewed in Ref. [26]). Interestingly, it is not only fibronectin that contains these modules; in fact, approximately 1% of all mammalian proteins contain FnIII domains that adopt a similar structural fold to the FnIII domains in fibronectin [80].
9.4.1.2 Integrins
Integrins, the major cell–matrix adhesins, are transmembrane dimers composed of noncovalently bound α and β subunits, which associate to form the extracellular, ligand-binding head, followed by two multidomain legs, two single-pass transmembrane helices and two short cytoplasmic tails (Figure 9.5). Although integrins are not constitutively active, their activation is required to form a firm connection with RGD ligands. Conformational alterations at the ligand binding site of the extracellular integrin head domains propagate all the way to the cytoplasmic integrin tails, and vice versa, by not-yet understood mechanisms (for reviews, see Refs [31, 32, 35, 106–108]). When a ligand binds to the integrin head, the integrin becomes activated. The activation involves a conformational change that propagates through the extracellular integrin domains, finally forcing the crossed transmembrane helices of the integrin α- and β-subunits to separate, thereby opening up binding sites on their cytoplasmic tails. Conversely, if intracellular events force the crossed integrin tails to separate, then a conformational change will propagate to the extracellular headpiece, thereby priming the integrin head into the high-binding state, even in the absence of an RGD ligand. This bidirectional conformational coupling between the outside and inside is remarkable, as the integrin molecule is approximately 28 nm long [35, 106–108]. Integrins, however, can also be constitutively activated, for example in the presence of Mn2+ ions, by point mutations, and via activating monoclonal antibodies [48, 49, 109–111]. Integrin-mediated adhesion often occurs under tensile forces, such as fluid flow or the myosin-mediated contractions that cells exert to sample the rigidity of their surroundings. In fact, a dynamic mechanism has recently been proposed for how mechanical forces can accelerate the activation of the RGD–integrin complex [112].
9.4.1.3 Talin
Talin is a cytoplasmic protein that can not only activate integrins [113], but also physically links integrins to the contractile cytoskeleton [114, 115], as depicted in Figure 9.6. The talin head has binding sites for integrin β-tails [116], PIP kinase type 1γ [117], focal adhesion kinase (FAK) [118], layilin [118] and actin [119] (see also Figure 9.8 below). The 60 nm-long talin rod is composed of bundles of amphipathic α-helices [120, 121]. The talin rod contains up to 11 vinculin binding sites (VBSs) [122], including five located within the helices H1–H12, residues 486–889. All of these five binding sites are buried inside helix bundles (native talin shows a considerably lower affinity for vinculin compared to peptide fragments isolated from talin). In addition to the VBSs, the talin rod has binding sites for actin [123] and for integrins [124].
9.5
Force-Upregulated Maturation of Early Cell–Matrix Adhesions
9.5.1
Protein Stretching Plays a Central Role
When the integrins latch on to their binding sites in the ECM, the cells apply force to these newly formed adhesion sites, ultimately promoting a rapid bond reinforcement through molecular recruitment. Such recruitment must occur within the lifetime of the initially labile adhesion bond. Key to the reinforcement is integrin clustering, followed or paralleled by protein recruitment [125, 137–139]. At least three integrins are needed to form an adhesion [140], and cells show delayed spreading if the integrins are not sufficiently close [141]. The maturation of adhesion sites seems to involve the stretching and unfolding of proteins, since proteins that are part of such force-bearing linkages might change their structure and, therefore, also their function. One protein which is stretched early in the adhesion process is talin, which links integrins to the cytoskeleton. One of the many proteins that are recruited to newly formed adhesions is vinculin.
Once a vinculin-binding helix has left the larger bundle, it might be stabilized by insertion into either the hydrophilic pockets of other proteins or even into the lipid bilayer [148]. Alternatively, other proteins that form helix bundles might also bind vinculin in a force-regulated manner. For example, α-actinin also has a vinculin-binding helix that can form a similar structural complex with vinculin [152, 153, 155, 156]. Similarly to talin, the VBS in α-actinin is buried in the native structure. Identifying the repertoire of mechanisms by which forces can upregulate adhesive interactions has led to the recent discovery of catch bonds, where a receptor–ligand interaction is enhanced when tensile mechanical force is applied between a receptor and its ligand (for reviews, see Refs [90, 93]). In contrast, the force-activated helix-swapping mechanism proposed here requires that the force is applied to just one of the binding partners, thereby activating bond formation with a free ligand. Also in contrast to catch bonds, the ligand need not necessarily form part of the force-bearing protein network at the time the switch is initiated. Thus, while force-induced helix swapping primarily upregulates the bond-formation rate, the catch-bond mechanism primarily extends the lifetime of an already existing complex under tension.
9.6
Cell Signaling by Force-Upregulated Phosphorylation of Stretched Proteins
9.6.1
Phosphorylation is Central to Regulating Cell Phenotypes
While bond reinforcement is crucial for the cell to develop a stable adhesion site, the subsequent transformation of mechanical stimuli into biochemical signals is needed to alter cell behavior. But which molecules act as the major mechanochemical signal converters? Although an experimental demonstration of the stretch-dependence of binding to the cytoskeleton had long been missing, there had always been some concern that the opening of stretch-activated ion channels was the cause of mechanosensation. By using matrix-attached, detergent-extracted cell cytoskeletons, it could be shown that the stretching of cytoskeletons activates the binding of adhesion proteins and their tyrosine phosphorylation (Figure 9.9).
"
Figure 9.9 Stretch of cytoskeletons activates
adhesion protein binding and tyrosine
phosphorylation. (a) Diagram of protocol for
stretch-dependent binding of cytoplasmic
proteins to Triton X-100-insoluble
cytoskeletons. L-929 cells were cultured on a
collagen-coated silicone substrate, and
cytoskeletons prepared by treating with 0.25%
Triton X-100/ISO buffer for 2 min. Triton X-100insoluble cytoskeletons were either left
unstretched or stretched (or relaxed from
prestretch) with ISO after washing three times.
The ISO buffer was replaced with the
cytoplasmic lysate solution, incubated for 2 min
at room temperature, and washed four times
j253
254
The phosphorylation of Cas requires both an active Src- or Abl-family kinase and the mechanical unfolding of Cas. The phosphotyrosine sites recruit other signaling molecules, such as Crk, that initiate signaling cascades. The primary transduction event is complicated, however, because the kinase activation step may occur through a force-activated pathway, such as a receptor-like protein tyrosine phosphatase, or through a hormone receptor. Consequently, primary and secondary force-sensing distinctions can become blurred and involve extrapolation from only a couple of incomplete examples (these are mentioned here only to stimulate further thought on these important mechanotransduction pathways).
9.6.1.2 Tyrosine Phosphorylation as a General Mechanism of Force Sensing
The reversible phosphorylation of intracellular proteins, catalyzed by a multitude of protein kinases and phosphatases, is central to cell signaling. The recently described phenomenon of substrate priming, or stretch-activation of a tyrosine kinase substrate, appears to be a major mechanism of force transduction [51]. Anti-phosphotyrosine immunostaining of individual fibroblasts has revealed that tyrosine-phosphorylated proteins are predominantly located at focal adhesions [159–161], where cell-generated forces are concentrated. Furthermore, an adhesion- or stretch-dependent enhancement of tyrosine phosphorylation was observed at many tyrosine phosphorylation sites in T cells [162], fibroblasts (Figure 9.9b) [160, 161] and epithelial cells (Y. Sawada, unpublished observations). In addition, receptor tyrosine kinases have been reported to be tyrosine phosphorylated (i.e. activated) by mechanical stimulation in a ligand-independent manner [163, 164]. These findings indicate that tyrosine phosphorylation plays a general role in adhesion and force-sensing [126, 127]. Due to their hydrophobic character, phosphorylatable tyrosines are typically buried by intramolecular interactions under equilibrium conditions. When such proteins are subjected to stress, however, the buried tyrosines may become exposed, thus enabling them to be phosphorylated. Tyrosine-phosphorylatable proteins also very often carry more than one tyrosine that can be phosphorylated, as does Cas. Multiple repeats of structurally homologous domains are characteristic of many proteins with mechanical functions [26]. Progressive stretching of the molecule can then affect one domain after the other, thus gradually upregulating the response [51]. These observations indicate that substrate priming is a common mechanism for the regulation of tyrosine phosphorylation. As tyrosine phosphorylation appears to be generally involved in the force response (as mentioned above), substrate priming is most likely a universal mechanism of force sensing.
With regard to the force available for stretching molecules in the adhesion complex, the force exerted on one integrin molecule in the adhesion site is estimated to be on the order of 1 pN [58]; this is lower than the forces that allow the refolding of many proteins in atomic force microscopy (AFM) experiments [165]. Consistently, the stretching of CasSD (the p130Cas substrate domain, the central portion that contains 15 YXXP motifs and is phosphorylated upon stretch) by AFM gave the appearance of stretching a random coil without any distinct barriers to unfolding (Y. Sawada, J.M. Fernandez and M.P. Sheetz, unpublished observations), suggesting that the Cas substrate domain could be extended by forces below
the detection limit of AFM (10 pN). Further, it was observed that CasSD was significantly phosphorylated by Src kinase with longer incubation times in vitro, and that the stretch-dependent enhancement of in vitro CasSD phosphorylation (i.e. the fold phosphorylation of stretched versus unstretched) is attenuated in longer incubations with Src kinase (Y. Sawada and M.P. Sheetz, unpublished observations). This indicated that thermal fluctuations of the Cas substrate domain were sufficient to expose tyrosine phosphorylation sites buried in the domain, and raises the possibility that proteins that bind to the native substrate domain, such as zyxin, could stabilize it and inhibit phosphorylation. Thus, the unfolding of p130Cas and its phosphorylation appear to occur at very low applied forces, although the phosphatases that bind to p130Cas could rapidly remove the phosphates.
Finally, tyrosine phosphorylation of the adaptor protein paxillin functions as a major switch regulating the adhesive phenotype of cells [126, 127]. Paxillin, which has binding sites for vinculin and p130Cas, can be phosphorylated by tyrosine kinases (including FAK and ABL) and dephosphorylated by the phosphatase Shp2 [126, 127]. Whilst phosphorylated paxillin enhances lamellipodial protrusions, nonphosphorylated paxillin is essential for fibrillar adhesion formation and for fibronectin fibrillogenesis. The modulation of tyrosine phosphorylation of paxillin thus regulates both the assembly and turnover of adhesion sites. The method by which force regulates paxillin recruitment and phosphorylation remains unknown.
9.7
Dynamic Interplay between the Assembly and Disassembly of Adhesion Sites
9.7.1
Molecular Players of the Adhesome
It is essential that cells are able constantly to sense their environment and respond
to alterations in mechanical parameters. Thus, while one set of cues and molecular
players is required that will promote and drive the assembly of an adhesion
complex, another set is responsible for regulating their disassembly. Cell adhesion
to the ECM triggers the formation of integrin-mediated contacts and ultimately the
reorganization of the actin cytoskeleton. The formation of matrix adhesions is a
hierarchical process, consisting of several sequential molecular events during
maturation (for reviews, see Refs [24, 125, 166–170]). The very first contacts are
formed by matrix-specific integrins, and this leads to the immediate recruitment of
talin and of phosphorylated paxillin [171]. This event of building the first integrin
connection with actin filaments is followed by FAK activation and the force-sensitive
recruitment of vinculin [58, 172–174] and the recruitment of α-actinin
(α-actinin crosslinks actin filaments). Vinculin has binding sites for vasodilator-stimulated
phosphoprotein (VASP) [175] and FAK [176], both of which coregulate
actin assembly (via recruitment of the profilin/G-actin complex to talin as well as Arp2/
3, respectively). pp125FAK also functions as a key regulator of fibronectin receptor-stimulated
cell migration events through the recruitment of both SH2 and SH3
domain-containing signaling proteins to sites of integrin receptor clustering [177,
178]. While the adhesions mature further, zyxin and tensin are recruited [167],
and zyxin upregulates actin polymerization [179, 180]. The transition from
paxillin-rich focal complexes to definitive, zyxin-containing focal adhesions, takes
place only after the leading edge stops advancing or retracts [181]. A decrease in
cellular traction forces on focal adhesions then leads to an increased off-rate for
zyxin [182].
Tensin plays a central role in fibronectin fibrillogenesis, which is upregulated
by enhanced cell contractility [183]. As with talin, tensin binds to the NPxY
motif of β-integrin cytoplasmic tails.
9.8
Forces that Cells Apply to Mature Cell–Matrix and Cell–Cell Junctions
9.8.1
Insights Obtained from Micro- and Nanofabricated Tools
Major experimental tools were developed to probe, with high spatial and temporal
resolution, the forces that cells apply to two-dimensional (2-D) surfaces [58, 59, 206–212].
For example, the deflection of microfabricated pillars makes it possible to
observe the complete spatial pattern of actin–myosin-driven traction forces applied to
the substrate, as shown in Figure 9.13 [59]. This and other tools enable research
groups to determine how the linkage between the ECM and the cytoskeleton is
stabilized by mechanical force [62, 115, 173, 174, 213–221].
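To illustrate how such pillar deflections are converted into forces, linear beam theory for a cylindrical post gives a bending stiffness k = 3EI/L^3 with second moment of area I = πr^4/4, so the traction force is F = k·δ for a measured tip deflection δ. The following Python sketch uses illustrative PDMS post dimensions (assumed values for this example, not parameters taken from Ref. [59]):

    import math

    def pillar_spring_constant(E, r, L):
        """Bending stiffness k = 3*E*I/L**3 of a cylindrical cantilever,
        where I = pi*r**4/4 is the second moment of area (linear beam theory)."""
        I = math.pi * r**4 / 4.0
        return 3.0 * E * I / L**3

    E = 2.5e6    # Young's modulus of PDMS, Pa (illustrative)
    r = 1.5e-6   # post radius, m (illustrative)
    L = 11e-6    # post height, m (illustrative)

    k = pillar_spring_constant(E, r, L)          # ~0.022 N/m for these values
    deflection = 0.5e-6                          # observed tip deflection, m
    print(f"F = {k * deflection * 1e9:.1f} nN")  # ~11 nN

With these numbers, a half-micrometer deflection corresponds to roughly 10 nN, which is why micropost arrays resolve cellular traction forces on the nanonewton scale.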
If the force that cells apply to a newly formed junction is too high, the junction will
break instantaneously. Thus, the generation and sensing of force is critical for the
correct formation of the organism and functions of its tissues. First, however, we
should define what is meant by the basic physical parameters of stress and strain in
the cellular context, since the anchoring of a cell to an environment is critical for its
survival. If firmly anchored to the ECM, the integrins couple the matrix to the
contractile machinery of the cell; in this way the major cellular forces are generated by
myosin II filaments pulling on actin in early spreading cells [222]. While the level of
force orthogonal to the membrane plane that can be supported by the fluid lipid
membrane alone is quite small (typically 10–20 pN for a circular area of 100 nm
diameter; i.e. a membrane tether) for mammalian cells [223], the plasma membrane
is supported by internal and external filamentous proteins that have links to the
cytoskeleton. As discussed above, mechanical reinforcement of the very first contacts
that a cell forms with the ECM is thus essential. When the adhesions have matured,
they can typically withstand forces of a few nN per square micrometer (see
Figure 9.13). Tensile forces are then transmitted to the cytoskeleton network in the
cell, which can disseminate the force to many or a few sites, even on the opposite side
of the cell [224]. Major cellular forces are generated by myosin II filaments pulling on
actin throughout the cytoskeleton in early spreading cells, and in the periphery of
epithelial cells, particularly around damaged areas in the tissue [160]. In mature
epithelial cells, networks of intermediate filaments are generated that bear much of
the force when the tissues are stretched.
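A quick unit check helps to place these numbers (a hedged sketch; the arithmetic, not new data, is the point): a traction of 1 nN per square micrometer is exactly 1 kPa, so mature adhesions sustaining a few nN per square micrometer operate at stresses of a few kPa. Combined with the ~1 pN per integrin estimate cited earlier [58], this would imply on the order of a thousand engaged integrins per square micrometer:

    pN, nN, um = 1e-12, 1e-9, 1e-6

    stress = 3 * nN / um**2          # Pa; note that 1 nN/um^2 == 1 kPa
    force_per_integrin = 1 * pN      # estimate cited earlier in the text [58]
    n_integrins = (3 * nN) / force_per_integrin
    print(f"stress ~ {stress / 1e3:.0f} kPa, ~{n_integrins:.0f} integrins per um^2")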
Far more is currently known about the mechanical characteristics of cell–ECM
contacts than about the mechanical properties of cell–cell contacts [21, 126, 127, 225].
The formation of tight cell–cell junctions, however, is critical for many developmental
processes such as formation of the gut, kidney, breast and many other epithelial
tissues, and is mediated by homologous cadherin–cadherin bonds [22]. These bonds
must be dynamic because there are movements of cells relative to each other in
epithelial monolayers [23, 226]. Further, when a cell dies, its neighbors rapidly close
the gap by first forming an actomyosin collar around the hole; the collar then
contracts to cover the hole [160]. Similarly, in the process of convergent extension
during embryogenesis, cells converge along one axis while being able to move relative
to one another. This is the major morphological movement responsible for
organizing the spinal cord axis [227]. Many of the important morphological changes in
development involve the contraction of epithelial cells, from the early formation of
the gut to the later formation of kidney tubules. In all cases, there is evidence that
although the individual cells can move independently, the whole tissue still acts as a
unit to undergo a coordinated morphological change. The molecular basis of the
mechanical coordination in epithelial or endothelial cell monolayers is not known,
but the precision of movement implies that a rapid feedback mechanism is present. It
is thus very interesting to note that when force-sensing micropillars were coated with
cadherins rather than with fibronectin, the mechanical stresses transduced through
cadherin adhesions were of the same order of magnitude as those previously
characterized for focal adhesions on fibronectin [225]. The question, then, is what is
the relative importance of cell–cell and cell–matrix contacts in different tissues for
transducing mechanical stimuli into altered downstream behavior? In both
tissue types, cell–cell interactions predominate. Endothelial cells require cell–cell
contacts, and vascular endothelial cells utilize cadherin engagement to transduce
stretch into proliferative signals [15]. Hence, stretch stimulated Rac1 activity in
endothelial cells, whereas RhoA was activated by stretch in smooth muscle cells.
Finally, tissue remodeling often reflects alterations in local mechanical conditions
that result in an integrated response among the different cell types that share and
thus cooperatively manage the surrounding ECM and its remodeling. The question
therefore is whether mechanical stresses can be communicated between different
cell types to synergize a matrix remodeling response. When normal stresses were
imposed on bronchial epithelial cells in the presence of unstimulated lung broblasts, it could be shown that mechanical stresses can be communicated from stressed
to unstressed cells to elicit a remodeling response. Thus, the integrated response of
two cocultured cell types to mechanical stress mimics the key features of airways
remodeling as seen in asthma: namely, an increase in production of bronectin,
collagen types III and V and matrix metalloproteinase type 9 (MMP-9) [228].
9.9
Sensing Matrix Rigidity
9.9.1
Reciprocity of the Physical Aspects of the Extracellular Matrix and Intracellular Events
As long as the cell–matrix and cell–cell linkages hold tight, intracellular motile activity
will interrogate the matrix, and the subsequent cellular activity will depend on the
physical properties of the extracellular fibrillar network, and vice versa. The
interrogation involves a mechanical testing of the rigidity as well as the geometry of the
environment through the normal cell motility processes (Box 9.2). When the rigidity
is determined, the cell will respond appropriately. For example, the extracellular
network structure is remodeled if it is of the same or a softer compliance than the
intracellular network [18]. Rigidity is an important part of the environment of a cell,
and different tissues have different rigidities as well as different levels of activity
(this point will be discussed later). There is considerable interest in learning how this
is achieved at a molecular level.
A number of reports have indicated that matrix rigidity is a critical factor
regulating fundamental cell processes, including differentiation and growth.
Discher's group recently showed that the differentiation of mesenchymal stem
cells is heavily dependent on the rigidity of the matrix to which cells adhere [19]. The
group reported that mesenchymal stem cells preferentially become neurogenic on
soft substrates, while they preferentially commit to myogenic and osteogenic
differentiation on intermediate and rigid substrates, respectively.
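This trend from Ref. [19] can be caricatured as a simple lookup from substrate elasticity to the favored lineage. The cutoff values in the sketch below are illustrative assumptions chosen to bracket brain-like (around 1 kPa), muscle-like (around 10 kPa) and osteoid-like (tens of kPa) stiffness; they are not thresholds reported in this chapter:

    def predicted_msc_lineage(E_kPa: float) -> str:
        """Toy classifier for the soft -> neurogenic, intermediate -> myogenic,
        rigid -> osteogenic trend of Ref. [19]; cutoffs are illustrative."""
        if E_kPa < 1.0:
            return "neurogenic"
        if E_kPa < 20.0:
            return "myogenic"
        return "osteogenic"

    for E in (0.5, 10, 35):   # kPa: soft, intermediate, rigid substrates
        print(f"E = {E} kPa -> {predicted_msc_lineage(E)}")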
The cellular response to rigidity has been seen at the time scale of seconds for submicron
latex beads [213] and during cell spreading [229]. Fibroblast migration toward rigid
substrates indicates further that the process has important ramifications for in vivo
motile activities [230]. From the signal transduction point of view, these
observations indicate that the sensing of different rigidities can be very rapid and may have
profound effects on cell function at a variety of levels. Rigidity is a rather
complicated parameter for the cell to sense because measurements of both force
and displacement must be combined. In order for a cell to sense matrix rigidity, the
cell must actively pull on the matrix; thus, the cell must actively test the rigidity of its
environment. As a corollary, the cell in an inactive state will not be able to develop a
rigidity response. Although these statements apply for single cells in vitro, the
situation is more complicated in a tissue environment where neighboring cells and
Box 9.2 Cell Forces

• Tissue Rigidity

• Resting Rigidity: A tissue at rest has a rigidity that is defined by the Young's
modulus, E, which can be determined from the slope of the tensile stress versus
the tensile strain curve:

E = tensile stress / tensile strain = dσ/dε = (dF/A0)/(dL/L0)

• Rigidity Sensing: To measure rigidity, cells develop force on the matrix over time.
This means that the mechanism of sensing rigidity must compare force and
displacement during a given time period. If a cell pulls on an object, the total
displacement is the sum of the matrix and of the intracellular displacements.
Since the cell directly probes the relative molecular displacements in the
adhesion site, it should not matter whether the force is applied from the outside
or the inside. This statement, however, is true for soft surfaces only if
the external force is applied rapidly, before the cell has displaced the substrate.
The time window is important since the cell does not pull with a constant force
but gradually increases the force after making a first contact. The physiologically
relevant time window in this context typically lasts for just 1–2 seconds. The real
question is whether the cell ramps up the force until a maximal force applied to the
adhesion site is reached, or alternatively a maximal stress, or until a certain
relative displacement of the intracellular rigidity sensors within adhesion sites is
reached, which might be on the order of 130 nm. If the latter is the case, the cells
pull with larger forces on rigid objects, attempting to reach the critical
displacement of the intracellular rigidity sensors.
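To see why a displacement threshold implies larger forces on stiffer substrates, the matrix can be treated as a linear spring: if the adhesion pulls until its anchor has moved the critical distance x_c, the required force is F = k·x_c. A minimal sketch with illustrative spring constants (assumed values, not measurements):

    x_c = 130e-9                                  # critical displacement, m (~130 nm)
    substrates = {"soft": 1e-4, "stiff": 1e-2}    # spring constants k in N/m (assumed)
    for name, k in substrates.items():
        F = k * x_c                               # force needed to reach x_c
        print(f"{name} substrate: F = {F * 1e12:.0f} pN")

The same 130 nm displacement costs ~13 pN on the soft spring but ~1300 pN on the stiff one, which is the sense in which cells would pull with larger forces on rigid objects.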
Higher rigidities will cause higher forces that will in turn enhance the intensity of upstream
signals in mechanotransduction (e.g. tyrosine phosphorylation) [51], and may result in a
greater activation of downstream signals on rigid substrates.
Another, possibly important, factor which affects the rigidity response is the
distance between the cytoskeleton and cell–substrate anchor sites. As elasticity is
defined by the magnitude of deformation (e.g. change of dimension) per unit force,
cell–substrate anchor sites will be displaced by a larger distance on soft substrates for
the same force (see Figure 9.14). If that is the case, the distance between the cytoskeleton
and cell–substrate anchor sites (the length of the matrix plus the integrin and the
actin-binding molecules) would be larger on rigid substrates than on soft
substrates [233, 234]. However, a larger displacement of cell–substrate anchor sites has
not yet been demonstrated, and the uniform cellular deformation of elastic substrates
of vastly different rigidities [231] implies that the displacement of anchor sites is not
necessarily larger on soft substrates.
9.9.1.1 Time Dependence and Rigidity Responses
When cells pull on the substrate to sense rigidity, they use the rearward movement of
actin to generate the force. Actin moves rearward at a rate of 30–120 nm/s, which
means that displacements of 130 nm will take more than 1 s. In the models of rigidity
sensing that have been discussed, a rapid rise in force that is sustained could elicit a
rigid substrate response. In vivo, if the cell experiences tensile forces from the
neighboring cells or the matrix during the period where it is pulling on the matrix,
then the matrix can appear rigid because the force will rise rapidly. In experiments
where the force was increased rapidly on fibronectin beads with soft laser tweezers,
the bead appeared to be held in rigid laser tweezers, as was predicted. Similarly, in vivo
many tissues experience external forces on roughly a second time-scale that could
develop a rigidity response. Thus, rigidity-dependent growth could be stimulated in
vivo by tissue contractions.
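The timing argument above is simple arithmetic: at rearward-flow speeds of 30–120 nm/s, accumulating the ~130 nm displacement discussed in Box 9.2 takes roughly 1–4 s, so an externally imposed force that rises within a second outruns the cell's own pulling:

    displacement = 130.0                  # nm, from Box 9.2
    for v in (30.0, 120.0):               # actin rearward flow, nm per second
        print(f"v = {v:5.0f} nm/s -> t = {displacement / v:.1f} s")

This reproduces the statement that self-generated displacements of 130 nm take more than 1 s.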
The time dependence of the assembly of components in the integrin–cytoskeleton
complex might affect the rigidity response, since different components bind
and detach during the life cycle of an adhesion site [184, 235]. Primary connections
between integrins and the cytoskeleton, and their reinforcement, depend on talin
(which is probably one of the first proteins in adhesion sites) [64, 115], on α-actinin
(which crosslinks actin filaments) and on zyxin (which enters the adhesion site
during its maturation) [181, 236]. The transition from mature focal adhesion to
fibrillar adhesion is characterized by the segregation of tensin and specific
integrins [56]. Because the ECM–integrin–cytoskeleton connection is a viscoelastic
material (i.e. it is not purely elastic) [237], the time required to reach the threshold force for
rigidity responses probably differs depending on the stiffness of the ECM. Accordingly,
a soft optical trap could mimic the effects of a rigid trap on the stabilization of
the integrin–cytoskeleton linkages if externally applied forces rise rapidly [233]. In
lamellipodia, the cytoskeletal-dependent radial transport of a contractile signal
directs the timing of contraction and, probably, adhesion site initiation to stabilize
protrusive events [229] (Figure 9.15). Consequently, the formation of cell contacts
with the ECM is not a continuous process but rather involves cycles of contraction and
relaxation.
9.9.1.2 Position and Spacing Dependence of the Rigidity Responses
The position dependence of rigidity responses is exemplified by the fact that
structural and signaling proteins that are necessary for rigidity responses are placed
at strategic locations, for example at the cell edge during protrusive events and at
early adhesion sites. Many proteins involved in rigidity responses and/or
phosphotyrosine signaling, including talin [64, 115], integrins (αvβ3) [115, 238],
paxillin [173, 174], α-actinin [173, 174, 229], RPTPα [173, 174], Rap1 [239] and
p130Cas [240], are localized at the leading edge of the cell, ready to respond to
any contraction generated by the cell or by the ECM. There is a position-dependent
binding-and-release cycle of fibronectin–integrin–cytoskeleton interactions, with
preferential binding occurring at the active edges of motile fibroblasts and release at
0.5–3 µm back from the edge [241]. Interestingly, reinforcement occurs preferentially
at the edge in rigid tweezers [233], whereas weak connections that break easily
are favored by nonrigid tweezers and at sites >1 µm back from the leading edge
[115, 233]. At the molecular level, the reinforcement of integrin–cytoskeleton
interactions is limited to linkages that have experienced force, and not those
nearby (<1 µm) [213].
Finally, many tissues experience periodic stretch in vivo during normal activity.
When the tissue is inactive, it often experiences atrophy; this is obviously true for
muscle, bone and connective tissue. The greatest problem for space travelers is that
astronauts typically lose 1–2% of their bone mass for each month in space, even
though they may exercise regularly [242]. Similarly, the skin on the feet or hands
thickens with use or labor, and thins with disuse. Thus, force from activity or rigidity
appears to be a global regulator of tissue function, and an understanding of the
mechanisms whereby force is transduced into biochemical signals is an important
area of future research.
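For a rough sense of scale (an illustrative calculation assuming the monthly rate simply compounds, which the text does not claim), a 1–2% loss per month over a six-month mission:

    for monthly_loss in (0.01, 0.02):
        remaining = (1.0 - monthly_loss) ** 6     # six months in space
        print(f"{monthly_loss:.0%}/month -> {remaining:.1%} of bone mass remains")

That amounts to a cumulative loss of roughly 6–11% of bone mass, which makes clear why mechanical loading, and not exercise alone, is thought to matter.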
At the subcellular level, there are many forces that must be regulated to
produce normal cell morphology and the proper distribution of organelles. Although
the protein composition of many genomes and even individual cell types is
known, relatively little knowledge exists of how those proteins are assembled
into functional complexes. Individual proteins, typically 5–20 nm in diameter, are
assembled into larger functional complexes that can be considered as subcellular
machines, controlling and regulating complex cellular functions, from reading and
translating the genetic blueprint to the synthesis and transport of proteins, from cell
migration to cell division, from cell differentiation to cell death. Those subcellular
complexes range in size from ribosomes to the lamellipodial machines that
drive ECM assembly and remodeling [243], including collagen fiber rearrangements [244]. It is important to be able to dissect the steps in these subcellular
processes to enable greater understanding of the sequence of coherent molecular
events [245] (Figure 9.16).
9.10
Cellular Response to Initial Matrix Conditions
9.10.1
Assembly, Stretching and Remodeling of the Extracellular Matrix
Once the first contacts have been made by the cell with its surroundings, it will often
soon begin to assemble its own matrix. Initially, cells build a provisional matrix which
to fibronectin to exposure of sites with enzymatic functions [21, 26, 53–55, 69, 70].
These cellular studies are important as they demonstrate that the nonequilibrium
conformations of proteins can be stabilized by force and are thus physiologically
relevant.
monitored optically [250, 251]. To probe how the alterations of the structure of
stretched Fn impact biochemical interactions and cellular behavior, such a strain
assay needs to be amenable to cell culture environments. To tune the conformation of
fibronectin, fibers were drawn manually from a concentrated fibronectin
solution [266–268]. When adding trace amounts of FRET-labeled fibronectin into the
solution, followed by a step in which the freshly drawn fibers are deposited onto
polymeric sheets that are mounted into stretch devices, fibers can be generated
that have a far narrower conformational distribution than is found in native
matrix [250, 251]. Furthermore, the mechanical strain can be externally adjusted,
which enables protein-binding studies to be conducted as a function of the strain of
fibrillar fibronectin [250, 251]. An image of conformationally tuned fibronectin fibers
can be seen in Figure 9.17. These assays further open the door to the question of
whether and how cell phenotypes are regulated by force-induced alterations of
fibronectin conformation.
9.10.1.2 Cell Responses to Initial Biomaterial Properties and Later to Self-Made
Extracellular Matrix
While the differentiation of mesenchymal stem cells has been correlated with the
rigidity of the substrate on which they were initially plated [19], does the rigidity
response of the cell change as it assembles its own matrix hours or days after the cells
have been seeded on a substrate with defined rigidity? For example, the initial rigidity
of a substrate has been proposed to determine whether or not mammary epithelial
cells upregulate integrin expression and differentiate into a malignant phenotype [145],
and also to dictate whether mesenchymal stem cells differentiate into
bone, muscle or neuronal tissue [19]. In those experiments, the macroscopic
material properties of the substrate were typically correlated with the outcome after
four to ten days of cell culturing. While the mechanical properties of a substrate or
engineered scaffold have indeed been correlated with various aspects of cell behavior,
the underlying mechanisms by which substrate rigidity ultimately regulates the long-term
responses have not yet been defined.
Once cells have been seeded onto synthetic matrices they rapidly begin to assemble
their own matrix, and will ultimately touch, feel and respond to their self-made ECM.
A few hours after cells have attached to surfaces and begun to assemble their matrix,
the physical characteristics of the newly assembled matrix do indeed depend on the
rigidity of the substrate. After 4 h of cell culture, the FRET data showed that the
fibronectin matrix was indeed more unfolded on a rigid (33 kPa) than on a soft
substrate (7 kPa) (Figure 9.18a) [269]. Surprisingly, however, after only one day the
fibroblasts that were initially seeded onto glass had produced sufficient matrix so as to
sit on a much softer biopolymer cushion. The cells then assembled a matrix that was
far less unfolded than the matrix they made during the first 4 h on glass, as probed by
adding FRET-labeled fibronectin only during the last 23–24 h after seeding. This
newly made matrix is comparable to the cells feeling a 7 kPa substrate [269]. Most
interestingly, the aging matrix changes its physical properties. When the cells were
seeded onto glass and allowed to assemble matrix for three days, the matrix deposited
during the first 24 h was highly unfolded, while the younger matrix was far less
unfolded (Figure 9.18b). Thus, the matrix was progressively more unfolded as it
aged, while the newly deposited matrix showed little unfolding. These data provided
the first evidence that matrix maturation occurs, and that aging is associated with
an increased stretching of fibronectin fibrils. Matrix assembly and remodeling
involves at least partial unfolding of the secondary structure of fibronectin modules.
Consequently, matured and aged matrix may display different physical and
biochemical characteristics, and is structurally distinguishable from newly deposited matrix.
A comparison of the conformation of Fn in these three-dimensional (3-D) matrices
with those constructed by cells on rigid and flexible polyacrylamide surfaces suggests
that cells in maturing matrices experience a microenvironment of gradually increased
rigidity [269]. A future goal must be to understand the physiological
consequences of matrix unfolding on cell function, including cancer and stem cell
differentiation.
9.11
Cell Motility in Response to Force Generation and Matrix Properties
The relationship between force generation and motility is not simple. Fibroblasts that
develop high forces on substrates do not move rapidly [222], whereas neutrophils that
move at the highest rates reported for cells (about 40 µm/min, or 700 nm/s)
generate very low forces on substrates [53–55]. There is considerable interest in the
extravasation of cancer cells moving out of the bloodstream into tissues in the process
of metastasis, although many of the steps in that process involve proteolysis of the
matrix and deformation of the cell to pass through small gaps in the
endothelium [270–272]. Although force generation is needed for motility, it is not the most
important factor; indeed, cell polarization, matrix proteases, directional signals and
the deformation of the cytoplasm are often as important.
Cell motility depends on substrate density and rigidity, and therefore also on
the processes that respond to rigidity [273]. Many of the proteins involved in
rigidity response have been linked to motility disorders, including cancer as well
as malformations in development and neuronal connectivity. Src family kinases
(SFKs) [274, 275], FAK [173, 174, 276], the SH2 domain-containing phosphatase
SHP-2 [173, 174] and RPTPs [173, 174] are important components of the force-dependent
signal transduction pathways that lead to the assembly of adhesion sites.
The force-dependent initiation of adhesion sites and their rapid reinforcement
occurs in protruding portions of cells, where adhesion sites can transmit cell
propulsive forces [179, 277]. In extending regions of the cell, forces are generated
on integrins by actin rearward flow rather than stress fibers. At the trailing
end of the cell, mature focal adhesions create passive resistance during cell
migration. To overcome this resistance, high forces must be generated by nascent
adhesion sites [179]. However, in some static cells, higher forces are correlated
with mature focal adhesions [58]. At the cell rear, traction stresses induce the
disassembly instead of the reinforcement of focal adhesions and linked stress
fibers [189, 278–280]; this is dependent on mechanosensitive ion channels and
calcium signaling in keratocytes and astrocytoma cells. SFKs, FAK and PEST
domain-enriched tyrosine phosphatase (PTP-PEST) are also crucial factors in
adhesion site disassembly [229, 274, 281]. Recent studies have shown that the
rigidity of 3-D matrices affects the migration rate differently from 2-D matrices, in
that the less-rigid matrices cause an increase in migration rate [271]. At a biochemical
level, the actin-depolymerizing protein cofilin has been implicated in the motility
of fibroblasts in vitro, and it is downstream in the biochemical signaling pathway of
the integrins that have been activated at the leading edge [282–285]. These varying
results indicate that different modalities of force generation and rigidity response at
the cell front versus rear of the cell or in 3-D versus 2-D matrices can correlate
with position-dependent regulation of phosphotyrosine signaling, and that
different mechanisms of rigidity responses based on phosphotyrosine signaling
can independently direct cell morphology as well as motility. Many observations
point to tight links between morphology, migration, rigidity responses and tyrosine
kinase activity.
9.12
Mechanical Forces and Force Sensing in Development and Disease
During the development and regeneration of tissues, forces act on and are propagated
throughout most tissues. Such forces provide a local and global mechanism to shape
cells and tissues, and to maintain homeostasis. Forces play a critical role by which cells
interact with their environment and gain environmental feedback that regulates cell
behavior. The signal for wound healing is often the loss of tissue integrity and the
concomitant loss of force. Further, use of tissue and the periodic generation of force are
often tied to the growth of that tissue, whereas inactivity is tied to the atrophy of the
tissue. Bedridden patients suffer from a loss of muscle tissue and other aspects of
atrophy. Similarly, with aging, osteoporosis, as well as many other cardiovascular
diseases, mechanical changes and inappropriate responses of cells to mechanical
changes, are critical and give rise to many symptoms. The size of the organism and its
form are also set, at least in part, through a physical feedback between individual cells
and their neighbors that is dynamic. As the cells grow and divide in development, they
are constantly moving and even changing neighbors on occasion. The force-bearing
cytoskeleton is actively remodeling and must clearly be responsive to changes in the
level of force, or else the tissue would relax or contract too much. Contractile activity in
individual cells can change the turgor of the tissue, and that parameter is under control
of the signaling pathways that activate myosin contraction.
Consequently, forces play a critical role in health and disease in controlling the
outcome of many biological processes (Figure 9.19). One obvious case is in cancer,
where the cells ignore normal environmental cues and grow aberrantly, although
there are many other examples (including problems with angiogenesis and tissue
repair). Thus, it is speculated that in the future there will be an increasing emphasis
on the mechanical aspects of disease and therapy.
9.12.1
Cancer
Many cancer biologists have realized that cancer is inherently associated with a
diseased mechanosensing and mechanoresponsive system. Many cancer cells ignore
the normal environmental signals that regulate growth, and many of those cues are
mechanical. For example, one of the early hallmarks of cancer cells is that they are
often transformed, which was defined as the ability of those cells to grow on soft agar
whereas normal cells required a rigid substrate [9–11]. Early observations linked
transformation to uncontrolled cellular growth and to profound alterations in cell
shape, as well as to the deregulation of tyrosine kinase and phosphatase activity. The
first defined oncogene, v-Src, encodes an altered form of an important cellular
tyrosine kinase, c-Src [286, 287]. In most studies on tumor cells, changes in
morphology but not cytoskeletal dynamics have been reported.
The progression of cells from normal to a cancerous or even metastatic state is
reflected in an increased softening of the cells, as probed by laser traps on suspended
single cells [288]. Malignant fibroblasts, for example, have 30–35% less actin than
normal cells. Transformed cells in culture are often rounder in morphology than
primary cells. Tumor cells are also generally less adhesive than normal cells, and
deposit less ECM [289], and the resulting loosened matrix adhesions may contribute
to the ability of tumor cells to leave their original position in the tissue. In
transformed cells, many aspects of nuclear and cell morphology as well as migration
are altered. Focal adhesions can be replaced by podosomes and, in addition, stress
fibers can be absent [290]. Some transformed cells acquire anchorage independence,
that is, they can grow without attachment to a substrate, suggesting a deregulated
rigidity response. For example, transformed cells generate weak, poorly coordinated
traction forces [291] but increased contractility. Thus, the one generality is
that transformed cells can grow inappropriately, ignoring the mechanical cues of the
environment that neighboring normal cells will follow to maintain appropriate tissue
morphology. Although other factors, such as hormonal signals, form part of many
cancers, the inappropriate mechanical responsiveness of cancer cells must also be
considered as an important part of the process.
Cancer progression leads to a loss of tissue differentiation due to abnormal cell
proliferation rates. Even if isolated malignant cells are associated with an increased
softness of their overall cytoskeleton, it is equally significant that tumors have a stiffer
ECM [145, 292, 293]. Malignant transformations of the breast, for example, are
associated with dramatic changes in gland tension that include an increased ECM
stiffness of the surrounding stroma [293]. Chronically increased mammary gland
tension may influence tumor growth, perturb tissue morphogenesis, facilitate tumor
invasion, and alter tumor survival and treatment responsiveness. However, changes
in environmental factors (i.e. changes in ECM rigidity) and internal force generation
(i.e. inappropriate rigidity responses) might be key factors in determining a
transformed cell morphology and malignant phenotype [145]. For example, tumors are
stiffer than normal tissue because they have a stiff stroma and elevated Rho-dependent
cytoskeletal tension that drives focal adhesions, disrupts adherens junctions,
perturbs tissue polarity, enhances growth, and hinders lumen formation [145].
Matrix stiffness thereby perturbs epithelial morphogenesis by clustering integrins to
enhance ERK activation and increase ROCK-generated contractility and focal
adhesions, thereby promoting malignant behavior [145].
Metastatic cells escape the tumor by invading the surrounding tissue, entering the
circulation and finally attaching to previously unaffected tissues in often remote
locations. In 1889, Stephen Paget published an article in The Lancet that described the
propensity of various types of cancer to form metastases in specific organs, and
proposed that these patterns were due to the "dependence of the seed (the cancer
cell) on the soil (the secondary organ)" [294]. This has often been linked to the chemical
environment of the secondary organ, although recent results have indicated that it
could also be a result of the mechanical environment in the secondary organ [295]. It
was found that lung metastases from human breast cancer cells would grow better on
soft fibronectin substrates than on hard, whereas bone metastases would grow better
on hard than on soft (A. Kostic and M.P. Sheetz, unpublished results). Metastasis is an
inefficient process, and many cancer cells are shed but few actually grow into a tumor
at a new site. One reason for this is that the new site might not have the appropriate
mechanical properties. At another level, tumor cells are generally less adhesive than
normal cells and deposit less ECM [289]. The resulting loosened matrix adhesions,
combined with the softened cytoskeleton, may contribute to the ability of tumor cells to
leave their original position in the tissue and squeeze through tiny holes.
Many of the molecules discussed above play key roles in cancer progression, and
also metastasis. Integrin-mediated cell adhesion leads to the activation of FAK and
c-Src, after which the FAK–Src complex binds to and can phosphorylate various
adaptor proteins such as p130Cas and paxillin. The results of recent studies have
shown that the FAK–Src complex is activated in many tumor cells, and generates
signals leading to tumor growth and metastasis (as reviewed in Ref. [296]). Tyrosine
phosphorylation of paxillin regulates both the assembly and turnover of adhesion
sites. Phosphorylated paxillin enhanced lamellipodial protrusions and thus promoted
cell migration [126, 127]; the migration of tumor cells in 3-D matrices is then
governed by matrix stiffness, along with cellmatrix adhesion and proteolysis [271].
As discussed above, the phosphorylation of p130Cas is upregulated when cells are
located on a more rigid substrate.
The overall survival of breast cancer patients is inversely correlated with the levels of
p130Cas (BCAR1) in the tumors [297], indicating that increased levels of p130Cas in
tumor cells contributed to patient death. It was subsequently found that cell migration
was activated by p130Cas and the associated GEF (AND-34 or BCAR3), which
indicated that metastasis was favored by elevated p130Cas [298]. Both p130Cas and
a p130Cas-binding protein, AND-34 (BCAR3), will increase the epithelial-to-mesenchymal
transition when overexpressed [298]. p130Cas appears to have a central role in
cell growth and motility; in many cases, it is dramatically altered in its phosphotyrosine
levels in correlation with transformation [299–302] and metastasis [303]. In the specific
case of lung tumors, metastasis was increased following removal of the primary tumor,
and required p130Cas expression. Further, the substrate domain YxxP tyrosines were
needed for both invasive and metastatic properties of the cells [304]. Even the invasion
of Matrigel and the formation of large podosome structures required the YxxP
tyrosines. Thus, it is suggested that the inappropriate growth of cancer cells may be
partly due to changes in the normal force and rigidity-sensing pathways that can alter
the cellular program. This means, in turn, that the protein mechanisms involved in
controlling mechanical responses can be good targets for therapies in cancer. In
addition, mechanical treatments can possibly alter the course of cancers. Several levels
can be identified in the process of mechanosensing, transduction and response where
alterations in cancer cells could result in abnormal growth control. For example, c-Src,
Fyn and Yes triple-knockout cells are missing three important Src family kinases, and
will grow on soft agar while not sensing any difference between soft and hard agar.
However, the restoration of Fyn activity will enable the cells to sense rigidity, and they
will no longer grow on soft agar [198]. Thus, an understanding of the mechanisms of
force and rigidity sensing can provide an important perspective on cancer.
9.12.2
Angiogenesis
The growth of new blood vessels, that is, angiogenesis, is crucial not only in tissue
growth and remodeling but also in wound healing and cancer. Vascular development
requires correct interactions among endothelial cells, pericytes and surrounding
cells [24]. Thus, the formation of new blood vessels might be compromised if any of
these interactions, including cell–matrix interactions both with basement membranes
and with surrounding ECMs, are perturbed. Equally important, the injury-mediated
degradation of the ECM can lead to changes in matrix–integrin interactions,
causing an impaired reactivity of the endothelial cells that will lead to vascular wall
remodeling. Consequently, alterations in integrin signaling, growth factor signaling,
and even of the architecture and composition of the ECM, might all affect vascular
development. As in other motility processes, angiogenesis involves a very stereotypical
set of movements of the endothelial cells that result in the formation of capillary tubes.
The role and mechanisms by which mechanical forces promote angiogenesis
remain unclear. It is notable, however, that angiogenesis is regulated by integrin
signaling [305308]. Angiogenesis is furthermore promoted by vascular endothelial
growth factor (VEGF). As tumor neovascularization plays critical roles for the
development, progression and metastasis of cancers, new therapeutic approaches
to treat malignancies have been aimed at controlling angiogenesis by monoclonal
antibodies targeting VEGF, as well as with several tyrosine kinase inhibitors targeting
VEGF-related pathways (for a review, see Ref. [309]).
VEGF binding to its transmembrane receptor is stimulated by complex formation
between VEGF receptor-2 and β3 integrin. Prior studies have suggested, for example,
that αv-integrins (αvβ3 and αvβ5) could act as negative regulators of angiogenesis (as
discussed in Refs [31, 32]). Neovascularization is impaired in mutant mice in which the β3
integrins were unable to undergo tyrosine phosphorylation [310]. The lack of integrin
phosphorylation suppressed the complex formation with VEGF. Furthermore, the
phosphorylation of VEGF receptor-2 was significantly reduced in cells expressing
mutant β3 compared to wild-type, leading to an impaired integrin activation in these
cells. With its binding locations at both the N and C termini, VEGF also binds to
fibronectin fibers [311, 312] and, when bound, has been shown to increase cell
migration, proliferation and differentiation [311, 313, 314]. A reduced extracellular
pH is one of the key signals that can induce angiogenesis. By demonstrating that VEGF
binding to fibronectin is dependent on pH, and that released VEGF sustained
biological activity, Goerges et al. [315] suggested that cells may use a lowered pH as
a localized mechanism of controlled VEGF release [316]. Goerges and colleagues also
suggested that VEGF might be stored in the ECM via interactions with fibronectin and
heparan sulfate in tissues that are in need of vascularization, so that it can aid in
directing the dynamic process of growth and migration of new blood vessels. If and
how VEGF signaling is regulated by mechanical force, however, remains unclear.
Tumor blood vessels have an altered integrin expression profile, and both blood and
lymphatic vessels have pathological lesions [28]. In contrast to healthy tissue, integrin
β4 signaling in tumor blood vessels promotes the onset of the invasive phase of
pathological angiogenesis [317], while loss of the β4 integrin subunit reduces
tumorigenicity [318]. Integrin β4 binds to laminin (Figure 9.5), which is enriched in basement
membranes, but not to the RGD ligand as exposed in fibronectin. Another difference
from the RGD-binding integrins is that integrin β4 connects to the cytoskeleton via
plectin (not talin, as illustrated in Figure 9.7) [319], and little is known about the
mechanoresponsivity of that linkage.
Another open question is whether degradation of the ECM is regulated by force.
Exploring this question is of particular relevance since, in addition to serving as an
anchoring scaffold and storage for growth factors, a group of angiogenesis regulators
are derived from fragments of ECM or blood proteins. Endostatin, antithrombin and
anastellin are members of this group of substances. Some of these compounds are
currently undergoing clinical trials as inhibitors of tumor angiogenesis [320], as are
synthetic peptides modeled after these anti-angiogenic proteins, such as Anginex
[321]. RGD-containing breakdown products of the ECM may also cause sustained
vasodilation [87].
9.12.3
Tissue Engineering
Deciphering the mechanisms by which cells might sense and transduce mechanical stimulation into functional alterations of cell behavior and fate is also a critical
[Figure caption fragment (bioartificial heart bioreactor): ... the left atrium and exits through the aortic valve. Pulsatile distention of the left ventricle and a compliance loop attached to the ascending aorta provide physiological coronary perfusion and afterload. Coronary perfusate (effluent) exits through the right atrium. (c) Formation of a working perfused bioartificial heart-like construct by recellularization of decellularized cardiac extracellular matrix. Top: recellularized whole rat heart at day 4 of perfusion culture in a working heart bioreactor. Upper inset: cross-sectional ring harvested for functional analysis (day 8). Lower inset: Masson's trichrome staining of a ring thin section, showing cells throughout the thickness of the wall. Scale bar = 100 µm. Bottom: force generation in left ventricular rings after 1 Hz (left) and 2 Hz (right) electrical stimulation.]
stem cells directly injected into scarred tissue, for example into an infarcted heart, are
not properly directed by the matrix to form new myocardium [325]. Rather than
injecting cells into rigidified scar tissue, tissue regeneration was far more effective
when transplanting an entire monolayer sheet of mesenchymal stem cells [326]. The
engrafted cell sheet gradually grew to form a thick stratum that included newly
formed vessels, undifferentiated cells and a few cardiomyocytes, which might be
promoted by the more favorable microenvironment that the cells find when
transplanted as sheets rather than being injected individually.
Early studies have already shown that matrix crosslinking compromises the
suitability of tissue-derived matrices for use in functional reconstruction [327]. One of
several reasons for this might be that crosslinking alters the rigidity of the matrix, and
an upregulated rigidity response of the reseeded cells might interfere with regaining
tissue function. Another, not necessarily exclusive, possibility is that crosslinking
would inhibit the protein conformation changes caused by stretching the matrix
fibers. It has been found that fibronectin fibers, for example, can be stretched on
average to more than five times their equilibrium length before they break [250, 251],
whereas crosslinked fibers show a markedly decreased extensibility. Crosslinked
cell-derived matrices cause an upregulated cellular rigidity response and alter the
biophysical properties of the matrix that the newly seeded cells are generating [328].
Thus, force-induced protein unfolding in a newly deposited matrix is, at least in part,
upregulated by crosslinking, with all the functional implications discussed above.
Another aspect of native matrix that might be compromised by crosslinking is its
ability to serve as a scaffold for storing cytokines and growth factors and to release
them upon demand. Integrins were also shown recently to play a central role in
activating the matrix-bound cytokine transforming growth factor-beta 1 (TGF-β1) by
cell-generated tension acting on the matrix [329]. TGF-β1 controls tissue homeostasis
in embryonic and normal adult tissues, and also contributes to the development of
fibrosis, cancer, autoimmune and vascular diseases. In most of these conditions,
active TGF-β1 is generated by dissociation from a large latent protein complex that
sequesters latent TGF-β1 in the fibronectin-containing ECM [330]. The studies of
Wipff and colleagues might suggest that matrix stiffness could regulate the
equilibrium between storage and release of a host of matrix-bound growth factors [331].
Finally, the fact that not only the intact ECM but also its breakdown products have
regulatory functions can be actively exploited in tissue engineering. Low-molecular-weight
peptides derived from the ECM, for example, can act as chemo-attractants for
primary endothelial cells [138]. ECM extracts were found to have antimicrobial
activity [332], and fragments of ECM or blood proteins, including endostatin,
antithrombin and anastellin, may serve as inhibitors of angiogenesis [321].
Moreover, these angiostatic peptides use plasma fibronectin to home to the angiogenic
vasculature [321]. Finally, uncharacterized digestive products of the ECM seem to act
as strong inflammatory mediators [333]. Extensive future investigations are required
in order to provide a full comprehension of the multifaceted regulatory roles of the
ECM and its constituents, and how forces might coregulate many of these functions.
Consequently, learning how to switch the structure–function relationship of
proteins by force has far-reaching potential not only in tissue engineering but also
in biotechnology, and for the development of new drugs that might target proteins
stretched into nonequilibrium states.
Acknowledgments
We gratefully acknowledge the many discussions with colleagues and our students,
and thank in particular Sheila Luna for the graphics. Financial support was provided
by the Nanotechnology Center for Mechanics in Regenerative Medicine (an NIH
Roadmap Nanomedicine Development Center), the Volkswagen Stiftung, and various grants from NIH and ETH Zurich.
References
1 Korge, B.P. and Krieg, T. (1996) The molecular basis for inherited bullous diseases. Journal of Molecular Medicine (Berlin, Germany), 74 (2), 59–70.
2 McGrath, J.A. (1999) Hereditary diseases of desmosomes. Journal of Dermatological Science, 20 (2), 85–91.
3 Jonkman, M.F., Pas, H.H., Nijenhuis, M., Kloosterhuis, G. and Steege, G. (2002) Deletion of a cytoplasmic domain of integrin beta4 causes epidermolysis bullosa simplex. The Journal of Investigative Dermatology, 119 (6), 1275–1281.
4 Pedersen, B.K. (2006) The anti-inflammatory effect of exercise: its role in diabetes and cardiovascular disease control. Essays in Biochemistry, 42, 105–117.
5 Petersen, A.M. and Pedersen, B.K. (2005) The anti-inflammatory effect of exercise. Journal of Applied Physiology (Bethesda, Md: 1985), 98 (4), 1154–1162.
6 Fries, R.S., Mahboubi, P., Mahapatra, N.R., Mahata, S.K., Schork, N.J., Schmid-Schoenbein, G.W. and O'Connor, D.T. (2004) Neuroendocrine transcriptome in genetic hypertension: multiple changes in diverse adrenal physiological systems. Hypertension, 43 (6), 1301–1311.
7 Harrison, D.G., Widder, J., Grumbach, I., Chen, W., Weber, M. and Searles, C. (2006) Endothelial mechanotransduction,
54 Smith, M.L., Gourdon, D., Little, W.C., Kubow, K.E., Eguiluz, R.A., Luna-Morris, S. and Vogel, V. (2007) Force-induced unfolding of fibronectin in the extracellular matrix of living cells. PLoS Biology, 5 (10), e268.
55 Smith, M.L., Gourdon, D., Little, W.C., Kubow, K.E., Eguiluz, R.A., Luna-Morris, S. and Vogel, V. (2007) Force-induced unfolding of fibronectin in the extracellular matrix of living cells. PLoS Biology, 5 (10), e268.
56 Vogel, V. and Sheetz, M.P. (2009) Cell fate regulation by coupling mechanical cycles to biochemical signaling pathways. Current Opinion in Cell Biology, 21 (1), in press.
57 Harris, A.K., Wild, P. and Stopak, D. (1980) Silicone rubber substrata: a new wrinkle in the study of cell locomotion. Science, 208 (4440), 177–179.
58 Balaban, N.Q., Schwarz, U.S., Riveline, D., Goichberg, P., Tzur, G., Sabanay, I., Mahalu, D., Safran, S., Bershadsky, A., Addadi, L. and Geiger, B. (2001) Force and focal adhesion assembly: a close relationship studied using elastic micropatterned substrates. Nature Cell Biology, 3 (5), 466–472.
59 Tan, J.L., Tien, J., Pirone, D.M., Gray, D.S., Bhadriraju, K. and Chen, C.S. (2003) Cells lying on a bed of microneedles: an approach to isolate mechanical force. Proceedings of the National Academy of Sciences of the United States of America, 100 (4), 1484–1489.
60 Chrzanowska-Wodnicka, M. and Burridge, K. (1996) Rho-stimulated contractility drives the formation of stress fibers and focal adhesions. The Journal of Cell Biology, 133 (6), 1403–1415.
61 Helfman, D.M., Levy, E.T., Berthier, C., Shtutman, M., Riveline, D., Grosheva, I., Lachish-Zalait, A., Elbaum, M. and Bershadsky, A.D. (1999) Caldesmon inhibits nonmuscle cell contractility and interferes with the formation of focal adhesions. Molecular Biology of the Cell, 10 (10), 3097–3112.
192 Barry, S.T., Flinn, H.M., Humphries, M.J., Critchley, D.R. and Ridley, A.J. (1997) Requirement for Rho in integrin signalling. Cell Adhesion and Communication, 4 (6), 387–398.
193 Zhong, C., Chrzanowska-Wodnicka, M., Brown, J., Shaub, A., Belkin, A.M. and Burridge, K. (1998) Rho-mediated contractility exposes a cryptic site in fibronectin and induces fibronectin matrix assembly. The Journal of Cell Biology, 141 (2), 539–551.
194 Danen, E.H., Sonneveld, P., Brakebusch, C., Fassler, R. and Sonnenberg, A. (2002) The fibronectin-binding integrins alpha5beta1 and alphavbeta3 differentially modulate RhoA-GTP loading, organization of cell matrix adhesions, and fibronectin fibrillogenesis. The Journal of Cell Biology, 159 (6), 1071–1086.
195 Miao, H., Li, S., Hu, Y.L., Yuan, S., Zhao, Y., Chen, B.P., Puzon-McLaughlin, W., Tarui, T., Shyy, J.Y., Takada, Y., Usami, S. and Chien, S. (2002) Differential regulation of Rho GTPases by beta1 and beta3 integrins: the role of an extracellular domain of integrin in intracellular signaling. Journal of Cell Science, 115 (Pt 10), 2199–2206.
196 Giannone, G. and Sheetz, M.P. (2006) Substrate rigidity and force define form through tyrosine phosphatase and kinase pathways. Trends in Cell Biology, 16 (4), 213–223.
197 Kostic, A. and Sheetz, M.P. (2006) Fibronectin rigidity response through Fyn and p130Cas recruitment to the leading edge. Molecular Biology of the Cell, 17 (6), 2684–2695.
198 Kostic, A., Sap, J. and Sheetz, M.P. (2007) RPTPalpha is required for rigidity-dependent inhibition of extension and differentiation of hippocampal neurons. Journal of Cell Science, 120 (Pt 21), 3895–3904.
199 Geiger, R.C., Taylor, W., Glucksberg, M.R. and Dean, D.A. (2006) Cyclic stretch-induced reorganization of the
303 Gotoh, T., Cai, D., Tian, X., Feig, L.A. and Lerner, A. (2000) p130Cas regulates the activity of AND-34, a novel Ral, Rap1, and R-Ras guanine nucleotide exchange factor. Journal of Biological Chemistry, 275, 30118.
304 Brabek, J., Constancio, S.S., Siesser, P.F., Shin, N.Y., Pozzi, A. and Hanks, S.K. (2005) Crk-associated substrate tyrosine phosphorylation sites are critical for invasion and metastasis of SRC-transformed cells. Molecular Cancer Research, 3, 307–315.
305 Ingber, D.E. and Folkman, J. (1989) Mechanochemical switching between growth and differentiation during fibroblast growth factor-stimulated angiogenesis in vitro: role of extracellular matrix. The Journal of Cell Biology, 109 (1), 317–330.
306 Larsen, M., Wei, C. and Yamada, K.M. (2006) Cell and fibronectin dynamics during branching morphogenesis. Journal of Cell Science, 119 (Pt 16), 3376–3384.
307 Heil, M. and Schaper, W. (2007) Insights into pathways of arteriogenesis. Current Pharmaceutical Biotechnology, 8 (1), 35–42.
308 Mammoto, A., Mammoto, T. and Ingber, D.E. (2008) Rho signaling and mechanical control of vascular development. Current Opinion in Hematology, 15 (3), 228–234.
309 Furuya, M. and Yonemitsu, Y. (2008) Cancer neovascularization and proinflammatory microenvironments. Current Cancer Drug Targets, 8 (4), 253–265.
310 Mahabeleshwar, G.H., Feng, W., Phillips, D.R. and Byzova, T.V. (2006) Integrin signaling is critical for pathological angiogenesis. The Journal of Experimental Medicine, 203 (11), 2495–2507.
311 Wijelath, E.S., Murray, J., Rahman, S., Patel, Y., Ishida, A., Strand, K., Aziz, S., Cardona, C., Hammond, W.P., Savidge, G.F., Rafii, S. and Sobel, M. (2002) Novel vascular endothelial growth factor binding domains of fibronectin enhance
10
Stem Cells and Nanomedicine: Nanomechanics of the
Microenvironment
Florian Rehfeldt, Adam J. Engler, and Dennis E. Discher
10.1
Introduction
Tissue cells in our body adhere to other cells and matrix and have evolved to require
such attachment. While it has been known for some time that adhesion is needed
for viability and normal function of most tissue cells, it has only recently been
appreciated that adhesive substrates in vivo, namely other cells and the extracellular
matrix (ECM), are generally compliant. The only rigid substrate in most mammals is
calcified bone. While the biochemical milieu for a given cell generally contains a wide
range of important and distinctive soluble factors (e.g. neuronal growth factor,
epidermal growth factor, broblast growth factor, erythropoietin), the physical
environment may also possess very different elasticity from one tissue to another.
It is well accepted that cells can "smell" or "taste" the soluble factors and respond via
specific receptor pathways; however, it is also increasingly clear that cells "feel" the
mechanical properties of their surroundings. Regardless of the adhesion mechanism,
that is, cadherins binding to adjacent cells or integrins binding to the ECM, cells
engage their contractile actin/myosin cytoskeleton to exert forces on the environment,
and this drives a feedback with responses that range from structural remodeling
to differentiation. In this chapter, we aim to provide a brief overview of the diversity of
in vivo micro/nano-environments in the human body, and also seek to describe some
mechanosensitive phenomena, particularly with regard to adult stem cells cultured in
in vitro systems intended to mimic the elastic properties of native tissues.
10.2
Stem Cells in Microenvironment
10.2.1
Adult Stem Cells
Adult stem cells are distinct from embryonic stem cells (ESCs). Two properties
are required for a stem cell: self-renewal and pluripotency. Stem cells must be able to
divide indefinitely (or at least many times compared to other cells) and also maintain
their undifferentiated state. They must also be able to differentiate into various
lineages. The fertilized egg cell is totipotent because all possible cell types in the
body are derived from it. Adult stem cells or somatic stem cells are multipotent and are
not derived directly from eggs, sperm or early embryos, as are ESCs. Among the adult
stem cells found in fully developed organisms, two classes are of paramount
importance for both basic scientic inquiry and possible medical application:
mesenchymal stem cells (MSCs) and hematopoietic stem cells (HSCs), both of
which can be obtained from adult bone marrow (Figure 10.1).
For more than 50 years, it has been known that HSCs are present in the bone
marrow and can differentiate into all of the different blood cell types. Becker and
coworkers identified a certain type of cell in mouse bone marrow that, when
transplanted into mice which had been heavily irradiated to kill any endogenous
cells, would reconstitute the various blood cell types: red blood cells, white blood cells,
platelets, and so on [1]. Since then, the transplantation of HSCs has become a
routine medical treatment for many blood diseases, such as leukemia. However,
controversies persist regarding the differentiation and de-differentiation of HSCs,
particularly whether or not they can become pluripotent, being able to differentiate
into neurons, muscle or bone [25].
MSCs also reside in the bone marrow, and can certainly differentiate into various types of solid tissue cells such as muscle, bone, cartilage and fat. From human bone marrow, Pittenger and coworkers successfully isolated truly multipotent MSCs and also demonstrated in vitro differentiation into various lineages [6]. Using media cocktails, often composed of glucocorticoids such as dexamethasone, many other groups have since standardized their differentiation into different tissue lineages [7–13].
Unlike the adult stem cells, ESCs are derived from a newly fertilized egg that has divided sufficiently to form a blastocyst. Once isolated from the inner mass of the blastocyst, the ESCs are then cultured and expanded in vitro to produce a sufficient number of cells for study. These ESCs are pluripotent in that all organismal cell types in the developing embryo emerge from them. This cell type is therefore considered to be the most promising for cell therapy and regenerative medicine. On the other hand, major problems such as immune rejection are significant obstacles, as ESCs will generally not be from the same organism.
"
Figure 10.1 In vivo microenvironments of adult
stem cells: the physiological range of stiffness.
(a) Range of physiological elasticity of native cells
and tissue: mesenchymal stem cells (MSCs)
reside in the bone marrow (see panel (b)), but can
egress from their niche into the bloodstream and
travel to different tissues and organs, facing new
environments with a wide range of stiffness.
Nerve cells and brain tissue (around and below
1 kPa) are softest, adipocytes are assumed to be
slightly stiffer, while muscle has an intermediate
stiffness (10 kPa). The elasticity of chondrocytes
Regardless of the type of stem cell, it is essential to understand how the stem cell differentiates and in which ways it interacts with its environment, both within the niche and in potential target tissues. Biochemical factors are key but not the only factors: mechanical cues in the microenvironment have only recently been recognized as contributing to the fate of MSCs.
In this chapter we focus on the nanomechanics of adult stem cells, as these cells interact with a substrate of well-defined stiffness. Stiffness, or more formally Young's elastic modulus E (measured in pascals, Pa), is a potential issue because of its physiological variation. That is, the only tissue in the body that is rigid is bone; all other solid tissues are soft, with elastic moduli in the range of 0.1 to 100 kPa (Figure 10.1a). Over the past decade, it has become particularly evident that the ability of an adherent cell to exert forces and build up tension reflects this elastic resistance of the surrounding microenvironment. This mechanosensitivity is based on the tension generated by the ubiquitously expressed non-muscle myosins II (NMM II), which are the molecular motors that drive transduction. As with other class II myosins, such as skeletal muscle myosin (which moves all of our limbs against gravity every day), NMM II assembles into filaments. These NMM II minifilaments of about 300 nm length are bipolar, with heads on either side that bind and actively traverse actin filaments [14]. In many situations involving moderately stiff substrates, actin–myosin assemblies are visible in cells as stress fibers, which are prototypical contractile units seen at least within cells grown on glass coverslips and other rigid substrates.
Of the three isoforms of NMM II (a–c), only non-muscle myosin IIa (NMM IIa) appears prominent near the leading edge of crawling cells, where cells probe their microenvironment, and it is responsible for the bulk of force generation in non-muscle cells [15]. NMM IIa is also the only isoform that is essential in the developing embryo [16]: NMM IIa knockout mice fail to exhibit any functional differentiation beyond germ layers, in that they do not undergo gastrulation, and the null embryos die by day 7 in utero. Embryoid bodies grown in suspension culture also appear flat and flaccid, rather than spherical, which suggests weak and unstable cell–cell contacts, even though cadherins are clearly expressed. Such results highlight the fact that cell motors, as an active part of cell mechanics, are important even in the earliest stages of differentiation and development.
10.2.2
Probing the Nanoelasticity of Cell Microenvironments
Cells probe the elasticity of their microenvironment on the micro- and nanometer scale, whether the surroundings are tissue, ECM or an artificial culture substrate. Since this micro/nano length scale is the range over which cells can 'feel' their surroundings, an appropriate experimental tool is needed to measure the elasticity on the same length scale. Perhaps the most suitable and pervasive technique for this is atomic force microscopy (AFM), which has been most widely used for imaging but can also make accurate force and elasticity measurements. The atomic force microscope exploits a microfabricated cantilever with a probe tip of well-defined geometry. This tip is pressed into the sample, and the indentation depth as well as the required force is measured. The Young's elastic modulus E of the material surface can then be calculated from classical expressions and compared to bulk measurements, such as results from classic tensile tests.
AFM was developed during the 1980s by Binnig and coworkers [17], and was originally designed to investigate surfaces at the atomic scale with nanometer resolution. The instrument's ability to operate in a fluid environment has made it increasingly important for biological samples, and today it is used frequently to measure the nanomechanical properties of fresh tissue samples [18], hydrogels for cell culture [18], and even living cells [19] as well as single proteins [20]. Many commercial AFM instruments allow for a precise measurement of forces, and have the ability to raster-scan the sample at nanometer resolution, which permits mapping of a sample's elasticity.
The basic principle of AFM for determining the elasticity of a sample is sketched in Figure 10.2a. The cantilever's tip is pressed into the surface and the deflection of …
$$ E = \frac{\pi\,(1-\nu^{2})\,F}{2\,d^{2}\,\tan\alpha} $$
Here, F is the applied force of the tip, d is the indentation depth of the probe into the sample, α is the half-opening angle of the conical tip, and ν is the Poisson ratio of the sample, which is separately measured or can be estimated, typically as 0.3–0.5.
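To make the use of this relation concrete, the following minimal Python sketch fits the Sneddon cone model above to a synthetic force–indentation curve; the tip half-angle, Poisson ratio, noise level and the 5.5 kPa target modulus are illustrative assumptions rather than values taken from the chapter, and numpy/scipy are assumed to be available.

```python
# Minimal sketch: least-squares fit of the Sneddon cone model
#   F = (2/pi) * tan(alpha) * E / (1 - nu^2) * d^2
# to a synthetic AFM force-indentation curve. All numerical values
# (tip half-angle, Poisson ratio, noise, target modulus) are illustrative.
import numpy as np
from scipy.optimize import curve_fit

ALPHA = np.deg2rad(35.0)  # assumed half-opening angle of the conical tip
NU = 0.45                 # assumed Poisson ratio of the gel (typical range 0.3-0.5)

def sneddon_force(d, E):
    """Force (N) for an indentation depth d (m) into a sample of Young's modulus E (Pa)."""
    return (2.0 / np.pi) * np.tan(ALPHA) * E / (1.0 - NU**2) * d**2

# Synthetic "measured" curve: a ~5.5 kPa gel indented 0-2000 nm, with force noise.
depth = np.linspace(0.0, 2000e-9, 200)
rng = np.random.default_rng(seed=1)
force = sneddon_force(depth, 5.5e3) + rng.normal(0.0, 0.05e-9, depth.size)

(E_fit,), _ = curve_fit(sneddon_force, depth, force, p0=[1e3])
print(f"fitted Young's modulus: {E_fit / 1e3:.2f} kPa")  # recovers ~5.5 kPa
```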
Figure 10.2b shows a representative force–indentation curve for a polyacrylamide hydrogel with a modulus of E ≈ 5.5 kPa, an elasticity typical of soft tissues. The measured data points (black thick line) span a range of surface indentation (0–2000 nm) and surface forces (0–10 nN) that are typical of matrix displacements and cell-generated forces at focal adhesions [23]. AFM experiments involve measurements of the same surface that a cell would engage, and the results fit very well to a modified Hertz model (thin red line). The elastic modulus determined by this type of experiment only reflects the low-frequency (0.1–10 Hz) or quasi-static elasticity of the material, which is relevant to studies such as cardiac myocyte beating [23].
Additional techniques can address frequency-dependent viscoelasticity, which can be
important for the interactions of cells and their surroundings. Several studies have
shown a differential behavior of cells subject to an external static or oscillating
force eld, and a simple theoretical model seems to describe the cell response [24].
Though it seems unlikely that frequencies in the MHz range or higher will have an
effect, timescales from milliseconds to hours are likely relevant when compared to processes of assembly and disassembly of actin filaments, microtubules, focal contacts and focal adhesions [25].
Frequency-dependent rheology measurements, which not only encompass the static elasticity as a low-frequency limit but also measure dynamic viscous properties, are commonly performed with bulk samples using rheometers. Here, the hydrogel sample is placed between two parallel plates, or a plate and a cone with a very small angle (around 1°), and a well-defined stress is applied while the strain within the sample is measured. Using such a rheometer, Storm et al. investigated the complex rheology of several biopolymers (collagen, fibrin, neurofilaments, actin, vimentin) and polyacrylamide (PA) hydrogels, and found a nonlinear increase of the complex shear modulus with higher strain (so-called strain-hardening behavior) [26]. In contrast to the biological gels, PA hydrogels exhibit a constant shear modulus over a large strain range. The nonlinear behavior found for biopolymers has clear, albeit unproven, implications for cell–matrix interactions. With these instruments the bulk storage and loss moduli of the material can be determined, but they do not access the rheology on the cellular scale. AFM can be similarly used if, after the probe is indented into the sample (cells, gels, etc.), a sinusoidal signal is superimposed to measure the frequency-dependent viscoelasticity. Mahaffy et al. used this technique to determine the viscoelastic parameters for cells and PA gels, and found good agreement with bulk measurements, at least for PA gels [27]. Additional particle-based techniques include magnetic tweezers or magnetic twisting rheology, optical tweezers and two-particle passive rheology. Although beyond the scope of this chapter, Hoffman et al. have shown that the use of all these different microrheology tools converges towards a "consensus mechanics" of cultured mammalian cells [28].
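The projection a rheometer performs on such oscillatory data can be sketched in a few lines: given an applied sinusoidal strain and the measured stress trace, the in-phase component yields the storage modulus G' and the quadrature component the loss modulus G''. The frequency, strain amplitude and simulated loss angle below are illustrative assumptions, not values from the studies cited above.

```python
# Minimal sketch: recovering storage (G') and loss (G'') moduli from an
# oscillatory strain/stress record, as in parallel-plate rheometry.
# Frequency, strain amplitude and the simulated loss angle are illustrative.
import numpy as np

f_hz, gamma0 = 1.0, 0.01                         # oscillation frequency (Hz), strain amplitude
t = np.linspace(0.0, 5.0, 5000, endpoint=False)  # five full cycles
omega = 2.0 * np.pi * f_hz

# Simulated response of a nearly elastic gel: |G*| = 1 kPa, 10-degree loss angle
# (the stress leads the applied strain by this phase angle).
stress = 1e3 * gamma0 * np.sin(omega * t + np.deg2rad(10.0))  # Pa

# Project the stress onto the in-phase and quadrature references of the strain.
G_storage = 2.0 * np.mean(stress * np.sin(omega * t)) / gamma0  # G', ~985 Pa
G_loss = 2.0 * np.mean(stress * np.cos(omega * t)) / gamma0     # G'', ~174 Pa
print(f"G' = {G_storage:.0f} Pa, G'' = {G_loss:.0f} Pa")
```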
10.2.3
Physical Properties of Ex-Vivo Microenvironments
Type of tissue                            Elasticity E (kPa)    Reference(s)
Brain (macroscopic)                       0.1–1                 [29–31]
Mammary gland tissue                      0.2                   [32]
Muscle (passive, lateral)                 10                    [33, 34]
Osteoid (secreted film in culture)        30–50                 [36–38]
Mammary gland tumor tissue                4                     [32]
Dystrophic muscle                         18                    [39]
Infarcted myocardium (surface)            55                    [42]
Granulation tissue                        50                    [41]
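Read as data, the table suggests a simple rule of thumb for choosing the stiffness of a culture gel intended to mimic a given tissue. The sketch below collapses the ranges to midpoints and compares moduli on a log scale; both choices are illustrative, not part of the cited measurements.

```python
# Minimal sketch: the tissue elasticities tabulated above as a lookup, plus a
# helper returning which tissue a culture gel of given stiffness best mimics.
# Ranges are collapsed to midpoints; comparison is on a log scale (illustrative).
import math

TISSUE_E_KPA = {
    "brain (macroscopic)": 0.55,                 # 0.1-1 kPa
    "mammary gland tissue": 0.2,
    "muscle (passive, lateral)": 10.0,
    "osteoid (secreted film in culture)": 40.0,  # 30-50 kPa
    "mammary gland tumor tissue": 4.0,
    "dystrophic muscle": 18.0,
    "infarcted myocardium (surface)": 55.0,
    "granulation tissue": 50.0,
}

def closest_tissue(gel_E_kpa: float) -> str:
    """Tissue whose reported modulus is nearest (log scale) to the gel modulus."""
    return min(TISSUE_E_KPA,
               key=lambda tissue: abs(math.log(TISSUE_E_KPA[tissue]) - math.log(gel_E_kpa)))

print(closest_tissue(1.0))   # -> brain (macroscopic)
print(closest_tissue(11.0))  # -> muscle (passive, lateral)
```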
10.3
In Vitro Microenvironments
In 1980, Harris and coworkers demonstrated that most cell types actively exert forces on their substrates [43]. The culture of 3T3 fibroblasts on thin silicone rubber films showed that these cells actively deform these films, generating a wrinkling pattern (Figure 10.3a). Opas demonstrated a decade later that chick retinal pigmented epithelial (RPE) cells exhibit a differential response to substrates that are rigid or viscoelastic, despite a similar surface composition [44]. On a thick compliant Matrigel substrate, the cells did not spread and remained heavily pigmented. In contrast, on the rigid glass substrate that was covalently coated with soluble basement membrane (Matrigel), the cells spread, developed stress fibers and vinculin- and talin-rich focal contacts, and expressed the dedifferentiated phenotype. These results were perhaps the first to suggest the mechanosensitivity of cells to substrate flexibility, although the study was far from conclusive: rather, it was limited to only two conditions with …
… matrix than was exerted under more typical focal adhesions formed on 11 kPa gels. It was proposed that this is a means by which the matrix influences the tension that the cells apply, and therefore helps to steer the wound-healing process.
Similar findings were recently reported for liver-derived portal fibroblasts and their differentiation to myofibroblasts in vitro on PA substrates in the presence or absence of transforming growth factor-β (TGF-β) [47]. When these fibroblasts differentiate towards myofibroblasts, as occurs in response to an acute liver injury, they start to express α-smooth muscle actin (α-SMA) and form stress fibers on rigid surfaces (>3 kPa), but not on very soft (400 Pa) gels that resemble the elasticity of native rat or human liver tissue. When treated with 100 pM TGF-β, the portal fibroblasts began to express α-SMA even on the soft matrix, although they did not develop organized stress fibers; stiffer matrices were required for cell spreading and stress fiber organization.
Cells treated with 5 µM TGF-β receptor kinase inhibitor did not differentiate on any of the substrates, which suggests that TGF-β functions as an essential contractile inducer in these cells (opposite to myosin inhibitors), leading to higher α-SMA expression and stress fiber organization on stiffer substrates. Both biochemical and biophysical stimuli are thus part of the complex interplay of mechanosensing.
10.3.2
Cells React to External Forces
… oscillating electromagnetic field. The model not only agrees with experimental evidence but also demonstrates the applicability of basic physical concepts to cell mechanics.
10.3.3
Adult Stem Cell Differentiation
The impact of substrate elasticity on cell behavior is now evident in many studies. One last but central example for this chapter perhaps best highlights the potent influence of elastic matrix effects, namely the differentiation of adult stem cells (MSCs) [53]. The usual method for inducing the differentiation of MSCs towards any particular lineage (e.g. adipocytes, chondrocytes, myocytes, osteocytes) is to use media cocktails based on steroids and/or growth factors [6–13]. Our approach has been to use a single 10% serum-containing medium and to vary only the stiffness of the culture substrate in sparsely plated cultures. These cells are exposed to serum in vivo, but during natural processes of emigration from the marrow to repair and maintain tissue, they also encounter different micromechanical environments. It is this latter aspect of environment that we sought to mimic.
MSCs were plated on collagen-I-coated PA hydrogels of different elasticity E (Figure 10.3b) and found to exhibit, after just 4 h, a significantly different morphology that becomes even more pronounced over the next 24 to 96 h (Figure 10.4a). The cells spread more with increasing substrate stiffness, as found with other cells [45], but they also take on different morphologies. As the cells are multipotent, it was of further interest to assess whether substrate mechanics could also influence gene transcription, and therefore differentiation. Immunostaining for early lineage-specific proteins indeed revealed that the neurogenic marker β3-tubulin was only present on the softest 1 kPa gels, the myogenic transcription factor MyoD was most prominent on the 11 kPa gels, and an early osteogenic transcription factor, CBFα1, was detectable only on the stiffest 34 kPa substrates. Remarkably, the stiffnesses that induced differentiation correspond to the elasticities that the various lineages would experience in their respective native tissues: quantitative analyses of differentiation markers emphasize the finding that adult stem cells adjust their lineage towards the stiffness of their surrounding tissues (Figure 10.4b).
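The reported correspondence (β3-tubulin on the ~1 kPa gels, MyoD on the ~11 kPa gels, CBFα1 on the ~34 kPa gels) can be caricatured as a lookup. In the sketch below the decision boundaries are illustrative midpoints chosen for the example, not thresholds measured in the study.

```python
# Minimal sketch of the stiffness-to-lineage correspondence observed for MSCs on
# collagen-I-coated PA gels. The cut-off values are illustrative midpoints only.
def dominant_marker(E_kpa: float) -> str:
    """Early lineage marker expected to dominate at substrate stiffness E (kPa)."""
    if E_kpa < 6.0:    # soft, brain-like gels
        return "beta3-tubulin (neurogenic)"
    if E_kpa < 22.0:   # intermediate, muscle-like gels
        return "MyoD (myogenic)"
    return "CBFalpha1 (osteogenic)"  # stiff, osteoid-like gels

for E in (1.0, 11.0, 34.0):
    print(f"{E:>4} kPa -> {dominant_marker(E)}")
```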
"
Figure 10.4 (Continued) increase of any of the
three proteins. Dashed green and orange curves
depict the substrate-dependent upregulation of
the markers for already committed cells [C2C12
(muscle) and hFOB (bone)] exhibiting the same
qualitative behavior at a higher intensity due to
their committed nature; (c) Transcription
profiling array shows selective upregulation of
several lineage-specific genes due to matrix
elasticity. Values for MSCs cultured on PA gels for
one week were normalized by b-actin and then
MSCs are believed to have considerable potential for cell therapies and regenerative medicine. Taking into account the impact of the microenvironment (as described …
10.4
Future Perspectives
This chapter could only highlight some of the many studies on the implications of the mechano-chemical environment of cells; even this small selection underscores the importance of a better understanding of the interactions between cells and their environment to improve the design of therapeutic applications. Adult stem cells are probably among the most promising candidates for successful tissue regeneration, given their multipotency, availability and limited ethical considerations, although their interactions with the microenvironment must be taken into account. Further studies must elucidate the complex interplay of biochemistry and biophysics, and should identify ways to influence either side with stimuli from the other. As a prime example, approaches to repair the infarcted heart reveal how new strategies are needed to overcome the physical limitations of a fibrotic tissue. Perhaps it is possible to alter the cells' perception of the surrounding stiffness, so that adult stem cells could develop towards a suitable myogenic lineage (instead of an osteogenic one)? This is clearly a large playground for future studies of what are ultimately diseases that couple to cell mechanics.
Acknowledgments
F.R. gratefully acknowledges the Feodor Lynen fellowship from the Alexander von Humboldt Foundation, and thanks André E.X. Brown for critical reading of the manuscript and Andrea Rehfeldt for help with the illustrations. A.J.E. and D.E.D. acknowledge the NIH and NSF for support via NRSA and R01 funding, respectively.
References
1 Becker, A.J., Till, J.E. and McCulloch, E.A.
(1963) Nature, 197, 452.
2 Corbel, S.Y., Lee, A., Yi, L., Duenas, J.,
Brazelton, T.R., Blau, H.M. and Rossi,
F.M.V. (2003) Nature Medicine, 9, 1528.
3 Hess, D.C., Abe, T., Hill, W.D.,
Studdard, A.M., Carothers, J., Masuya, M.,
Fleming, P.A., Drake, C.J. and Ogawa, M.
(2004) Experimental Neurology, 186, 134.
4 Roybon, L., Ma, Z., Asztely, F., Fosum, A.,
Jacobsen, S.E.W., Brundin, P. and Li, J.Y.
(2006) Stem Cells (Dayton, Ohio), 24, 1594.
5 Deten, A., Volz, H.C., Clamors, S.,
Leiblein, S., Briest, W., Marx, G. and
Zimmer, H.G. (2005) Cardiovascular
Research, 65, 52.
6 Pittenger, M.F., Mackay, A.M., Beck, S.C.,
Jaiswal, R.K., Douglas, R., Mosca, J.D.,
Moorman, M.A., Simonetti, D.W., Craig,
S. and Marshak, D.R. (1999) Science,
284, 143.
7 Caplan, A.I. (1991) Journal of Orthopaedic
Research, 9, 641.
8 Hofstetter, C.P., Schwarz, E.J., Hess, D.,
Widenfalk, J., El Manira, A., Prockop, D.J.
and Olson, L. (2002) Proceedings of the
National Academy of Sciences of the United
States of America, 99, 2199.
9 Kondo, T., Johnson, S.A., Yoder, M.C.,
Romand, R. and Hashino, E. (2005)
Proceedings of the National Academy of
Sciences of the United States of America,
102, 4789.
10 McBeath, R., Pirone, D.M., Nelson, C.M.,
Bhadriraju, K. and Chen, C.S. (2004)
Developmental Cell, 6, 483.
11 Kuznetsov, S.A., Krebsbach, P.H.,
Satomura, K., Kerr, J., Riminucci, M.,
12
13
14
15
16
17
18
19
20
21
22 Sneddon, I.N. (1965) International Journal
of Engineering Science, 3, 47.
23 Balaban, N.Q., Schwarz, U.S., Riveline, D.,
Goichberg, P., Tzur, G., Sabanay, I.,
Mahalu, D., Safran, S., Bershadsky, A.,
Addadi, L. and Geiger, B. (2001) Nature Cell
Biology, 3, 466.
24 De, R., Zemel, A. and Safran, S.A. (2007)
Nature Physics, 3, 655.
25 von Wichert, G., Haimovich, B., Feng, G.S.
and Sheetz, M.P. (2003) EMBO Journal,
22, 5023.
26 Storm, C., Pastore, J.J., MacKintosh, F.C.,
Lubensky, T.C. and Janmey, P.A. (2005)
Nature, 435, 191.
27 Mahaffy, R.E., Shih, C.K., MacKintosh,
F.C. and Kas, J. (2000) Physical Review
Letters, 85, 880.
28 Hoffman, B.D., Massiera, G., Van
Citters, K.M. and Crocker, J.C. (2006)
Proceedings of the National Academy of
Sciences of the United States of America,
103, 10259.
29 Gefen, A. and Margulies, S.S. (2004)
Journal of Biomechanics, 37, 1339.
30 Lu, Y.B., Franze, K., Seifert, G.,
Steinhauser, C., Kirchhoff, F., Wolburg,
H., Guck, J., Janmey, P., Wei, E.Q., Kas, J.
and Reichenbach, A. (2006) Proceedings of
the National Academy of Sciences of the
United States of America, 103, 17759.
31 Georges, P.C., Miller, W.J., Meaney, D.F.,
Sawyer, E.S. and Janmey, P.A. (2006)
Biophysical Journal, 90, 3012.
32 Paszek, M.J., Zahir, N., Johnson, K.R.,
Lakins, J.N., Rozenberg, G.I., Gefen, A.,
Reinhart-King, C.A., Margulies, S.S.,
Dembo, M., Boettiger, D., Hammer, D.A.
and Weaver, V.M. (2005) Cancer Cell, 8, 241.
33 Yoshikawa, Y., Yasuike, T., Yagi, A. and
Yamada, T. (1999) Biochemical and
Biophysical Research Communications, 256,
13.
34 Bosboom, E.M.H., Hesselink, M.K.C.,
Oomens, C.W.J., Bouten, C.V.C.,
Drost, M.R. and Baaijens, F.P.T. (2001)
Journal of Biomechanics, 34, 1365.
35 Collinsworth, A.M., Torgan, C.E.,
Nagda, S.N., Rajalingam, R.J., Kraus, W.E.
36
37
38
39
40
41
42
43
44
45
46
47
11
The Micro- and Nanoscale Architecture of the Immunological
Synapse
Iain E. Dunlop, Michael L. Dustin, and Joachim P. Spatz
11.1
Introduction
In vivo, biological cells come into direct physical contact with other cells, and with extracellular matrices, in a wide variety of contexts. These contact events are in turn used to pass an enormous variety of cell signals, often by bringing ligand–receptor pairs on adjacent cells into contact with each other. Whereas some traditional outlooks on cell signaling arguably focused strongly on these individual ligation events as triggers for signaling cascades, it is now becoming clear that this is insufficient. Rather, in some cases where signal-activating ligands are found on cell or matrix surfaces in vivo, the properties of each surface as a whole need to be considered if the events leading to signaling are to be fully understood. That is, the strength of signaling, and whether signaling occurs at all, can depend on factors such as the spatial distribution of signaling-inducing ligands that are presented on a surface, the mobility of these ligands, the stiffness of the substrate, and the force and contact time between the surface and the cell being stimulated [1]. The effects of such surface properties on the activation of cell signaling pathways can often be studied by bringing the cells into contact with artificial surfaces, the properties of which can be controlled and systematically varied, so that the effects of such properties on signaling pathway activation can be observed. These studies have been successfully conducted in the context of signaling pathways associated with cell behaviors such as fibroblast adhesion to the extracellular matrix (ECM) [2, 3] and the rolling adhesion of leukocytes [4, 5]. One important system in which cell–cell communication has been studied is the immunological synapse formed between T lymphocytes and tissue cells at multiple stages of the immune response.
We first introduce the role of T cells in the immune response and the concept of an immunological synapse (for an introduction to immunological concepts, see Ref. [6]). T cells are an important component of the mammalian adaptive immune system, and each of the billions of T cells in a mammal expresses a unique receptor generated by the recombination of variable genomic segments. This can then serve as a substrate
for the selection of pathogen-specific T cells suitable for combating only infections by identical or similar pathogens, the proteomes of which share a particular short peptide sequence, known as the T cell's agonist peptide. There are three main subclasses of T cell, classified according to their effector functions: helper, killer and regulatory T cells. Broadly speaking, helper T cells act to stimulate and maintain the immune response in the vicinity of an infection, whereas killer T cells are responsible for detecting and destroying virus-infected cells. Regulatory T cells play a role in the prevention of autoimmune disease. In this chapter we will concern ourselves almost entirely with the activation of helper T cells. As the number of possible pathogens is enormous, the body does not maintain large stocks of T cells of a wide variety of specificities, but rather maintains small numbers of inactive T cells of each possible specificity in locations such as the lymph nodes and the spleen. When a pathogen is detected in the body, specialist antigen-presenting cells (APCs) travel to these locations and locate the correct T cells to combat the infection. This causes the T cells to become activated, whereupon they proliferate, producing a large number of T cells that travel to the infected tissues to carry out their antipathogen roles. Activation of the T cell occurs during direct physical contact between the T cell and the APC, and proceeds via the formation of a stable contact region between the T cell and the APC, known as the immunological synapse. (The term 'synapse' was applied due to a number of shared features with neurological synapses, such as stability, a role for adhesion molecules and directed transfer of information between cells [7].) It is known that one of the central requirements for activation is the ligation of T-cell receptors (TCRs) on the T-cell surface by peptide–major histocompatibility complex protein (p-MHC) complexes on the APC surface. The MHC may be thought of as a molecule in which the APC mounts short peptides made by breaking down all proteins in the cytoplasm (MHC class I) or in the endocytic pathway (MHC class II), including both pathogenic and self proteins. As MHC class II molecules are relevant for helper T cells, we will focus on these from here on. Each TCR molecule strongly binds MHC molecules that mount the agonist peptide, and weakly binds MHC molecules that mount a subset of self-peptides. These strong and weak interactions synergize to generate sustained signals in the T cell. Thus, the APC activates only the correct T cells to combat the particular infection that is under way, due to the necessary role of the agonist peptide in T-cell activation, but does so with great sensitivity (early in infection) due to the contribution of the self-peptides [8].
In addition to the initial activation process, helper T cells may encounter other agonist p-MHC-bearing APCs later in the infection process, with which they can also form immunological synapses. This is particularly important in the infected tissues, where helper T cells coordinate responses by many immune cell types. The signaling from these synapses effectively informs the T cells that the infection is still in progress, encouraging them to continue countering it locally. Although there are differences between the initial activation process and these subsequent restimulations, similar signaling methods may underlie them both, and we will usually not be concerned with this distinction in this chapter. Although most of the agonist peptide-specific T cells will die when the infectious agent has been eliminated from the body, a small subset will live on as memory T cells and can facilitate the mounting of a response to reinfection with the same or closely related agents at a later time [9–11]. In fact, memory T cells are the basis of vaccination, and the process by which they are reactivated is likely to be similar in its requirements for immunological synapse formation.
Although TCR–p-MHC ligation is necessary for T-cell activation, there is evidence that the structure of the T cell–APC contact zone on a wide variety of length scales, from tens of micrometers down to one to a few nanometers, plays a role in determining the strength of activation signaling. Artificial surfaces functionalized with p-MHC and other immune cell proteins have been used to study structures that arise in the contact zone, and their effect on the activation process.
In this chapter, we will review the emerging body of work in which surface functionalization and lithography techniques have been used to produce artificial surfaces that have shed light on the nature and dynamics of the immunological synapse. We will first consider the structure of the immunological synapse on the micrometer scale, including the so-called supramolecular activation clusters (SMACs). These are essentially segregated areas in which different ligated receptor species predominate. Although SMACs have been widely studied, it now seems unlikely that they are the critical structures in providing activation signaling, with smaller-scale microclusters, consisting of around 5–20 TCR–p-MHC pairs bound closely together, being of greater significance [12]; these microclusters will also be discussed. We will then describe experiments that demonstrate the importance of the spatial distribution of molecules on the nanometer scale (that is, one to a few protein molecules), using materials such as soluble p-MHC oligomers to stimulate T cells. By illustrating the importance of the nanoscale, these results should motivate future studies in which T cells are brought into contact with surfaces that are patterned on a nanometer scale with p-MHC and other immunological synapse molecules. We will then discuss an emerging nanolithography technique that could plausibly be used to perform such studies, namely block copolymer micellar nanolithography. Finally, we will consider the possibility of making direct therapeutic use of micro- and nanopatterned T-cell-activating surfaces, and conclude that the most likely application is in adoptive cell therapy. In this method, T cells are removed from a patient, expanded in vitro, and returned to the patient to combat a disease, most commonly a cancerous tumor. It has been suggested that the success of adoptive cell therapy can depend heavily on the detailed phenotype of the returned T cells; the use of micro- and nanopatterned surfaces for in vitro T-cell activation could help to control their phenotype.
11.2
The Immunological Synapse
11.2.1
Large-Scale Structure and Supramolecular Activation Clusters (SMACs)
… interactions. The artificial substrates discussed here are based on a simplified model of the synapse that includes two of the most significant ligand–receptor pairs: TCR with p-MHC, and lymphocyte function-associated antigen 1 (LFA-1) with intercellular adhesion molecule 1 (ICAM-1). LFA-1 is an integrin-family protein, the function of which is to control T cell to APC adhesion. LFA-1 is expressed on T-cell surfaces and binds ICAM-1 on the APC surface.
A major contribution to the understanding of the immunological synapse has been derived from studies in which T cells are allowed to settle on glass substrates on which lipid bilayers have been deposited. These bilayers contain some lipid molecules that are bound to protein constructs containing the extracellular portions of p-MHC and ICAM-1, respectively. Due to the fluidity of the lipid bilayer, the p-MHC and ICAM-1 are mobile, creating a simplified model of the APC surface (see, for example, Refs [12–15]). Although a simplified system, this model reproduces features of immunological synapses observed in vivo with some types of APC, for example the so-called B cells; differences between these synapses and those observed between T cells and so-called dendritic cells (another type of APC) may be due to the dendritic cell cytoskeletons restricting and controlling p-MHC and/or ICAM-1 motion [16].
On the largest length scales, the evolution of the synapse can be seen to proceed in three stages, as illustrated in Figure 11.1 [13]. In the first stage, the T cell is migrating over the model bilayer surface; this corresponds to an in vivo T cell forming transient contacts with passing APCs. A central core of adhesive LFA-1–ICAM-1 contacts forms, around which the cytoskeleton deforms to produce an area of very close contact between the cell and the substrate, in which TCR–agonist p-MHC pairs can readily form. This cytoskeletal deformation is important, as TCR and p-MHC molecules are both rather short and consequently easily prevented from coming into contact by abundant larger glycosylated membrane proteins [13]. Signaling arising from the formation of TCR–p-MHC pairs causes LFA-1 molecules to change their shape so that they bind ICAM-1 strongly, which in turn causes the cell to stop migrating and thus stabilizes the synapse. In vivo, this mechanism enables the APCs to adhere strongly to T cells of the correct specificity, for the periods of up to several hours that may be necessary for full activation to take place. However, it simultaneously prevents the APCs from forming time-wasting, long-lasting contacts with other T cells.
In the second stage of immunological synapse evolution, p-MHC-ligated TCRs migrate to the middle of the contact zone, possibly due to actin-based transport, leading to the third stage, where a stable central region of bound TCR–p-MHC pairs, the central supramolecular activation cluster (cSMAC), is surrounded by a ring of ICAM-1 bound to LFA-1, the peripheral supramolecular activation cluster (pSMAC), which in turn is surrounded by an area of very close contact between the cell and the surface, suitable for the formation of new TCR–p-MHC pairs, the distal supramolecular activation cluster (dSMAC). As the primary purpose of the LFA-1–ICAM-1 bond is to bind the T cell to the APC, most of the lines of adhesive force between the cells pass through the dSMAC, where these molecules are abundant, as shown by the arrows in Figure 11.1. Close examination of the structure and dynamics of cytoskeletal actin in the cell, as well as the distribution of LFA-1–ICAM-1 pairs, has shown that the dSMAC and pSMAC may be thought of as respectively analogous to the lamellipodium and lamella structures that are exhibited by many motile cells [17], such as fibroblasts moving across the ECM. In the case of the fibroblast, αvβ3 integrin molecules (analogous to LFA-1) bind to the ECM surface in the lamellipodium, which is pushed out in the direction of desired motion. A characteristic feature is that the actin in the lamellipodium/dSMAC is organized into two stacked layers, whereas that in the lamella/pSMAC is organized into one layer only. By a combination of actin polymerization at the periphery and depolymerization at the cell center, the cell center effectively pulls on the anchored integrin molecules and thus moves towards them. In the case of the immunological synapse, the same actin polymerization and depolymerization occur, but because the dSMAC extends out in all directions, and because the ICAM-1 molecules are mobile in the APC lipid membrane (in contrast to the integrin-binding elements of the ECM), the center of the T cell remains stationary, although there is a constant motion of actin towards the center of the cell [18]. Some important implications of this effect will be described in Section 11.2.2.
In a recent study conducted by Doh and Irvine, photolithographic methods were used to produce a substrate that encouraged T cells to form a cSMAC/pSMAC-like structure, but without using mobile ligands [19]. Rather, a surface was patterned with shaped patches of anti-CD3, a type of antibody that binds TCR and can thus simulate the effect of p-MHC, against a background of ICAM-1. This study employed a novel method for patterning surfaces with two proteins using photolithography, by using a photoresist that can be processed using biological buffers [20]. The photoresist used in this method is a random terpolymer with a methacrylate backbone and methyl, o-nitrobenzyl and poly(ethylene glycol) (PEG; Mn ≈ 600 Da) side groups randomly distributed along the chain, where some of the PEG side chains are terminated with biotin. The PEG chains make the polymer somewhat hydrophilic, while the o-nitrobenzyl group can be cleaved to a carboxylic acid-bearing group by ultraviolet
(UV) light. The photoresist is spin-coated onto a cationic substrate (in this case aminosilane-functionalized glass) so that, if the resist is exposed to UV light and then rinsed with pH 7.4 buffer so that it contains negatively charged carboxylic acid groups, the negative charge of these groups causes the majority of the photoresist to be soluble and thus to be washed away. This will leave behind a thin layer of resist molecules, the negatively charged groups of which are ionically bound to positively charged amine groups on the glass surface. The sequence of events in preparing the patterned surface used in the T-cell studies [19] is shown in Figure 11.2. The photoresist layer was first exposed to UV through a photomask and then washed with pH 7.4 buffer to remove all but the thin residual polymer layer from the areas to be patterned with anti-CD3. After further UV irradiation of the whole surface, streptavidin followed by biotinylated anti-CD3 was deposited at pH 6.0 (at which the resist is stable), with the streptavidin binding the anti-CD3 to the biotin sites on the resist polymer surfaces. A second wash at pH 7.4 removed most of the resist from the non-anti-CD3-functionalized area, leaving a thin biotinylated polymer layer to which streptavidin followed by biotinylated ICAM-1 could be attached.
The T cells that were allowed to settle on surfaces bearing widely spaced circles of anti-CD3, 6 µm in diameter, against a background of ICAM-1 (prepared using this method) migrated until they encountered an anti-CD3 circle, and then formed a central cSMAC-like area of TCRs bound to anti-CD3, surrounded by a pSMAC-like ring of LFA-1 bound to ICAM-1 [19]. Molecules known to be associated with LFA-1 (talin) and TCR (protein kinase C, θ isoform (PKC-θ)) signaling localized in the pSMAC- and cSMAC-like regions, respectively, in accordance with some previous observations of T cell–APC contact zones [21]. T cells that formed these model cSMAC–pSMAC structures showed elevated levels of intracellular free calcium (an early sign of activation), and eventually proliferated and showed increased cytokine production with respect to cells on control surfaces, thus confirming that full activation had taken place.
Although the cSMAC–pSMAC–dSMAC model corresponds well to in vivo images of some T cell–APC contacts (notably where a so-called B cell is used as the APC [21]), when the important dendritic cell APC type is used, a different structure is seen, which may be conceptualized as a multifocal pattern with several smaller cSMAC-like zones. This type of pattern, which could conceivably arise from the dendritic cell cytoskeleton imposing constraints on the mobility of TCR [16], was reproduced by Doh and Irvine [19] by using their photolithographic technology to produce groups of four small anti-CD3 spots (spots 2 µm in diameter, with spot centers placed at the corners of a 5 µm square). The T cells that encountered such groups indeed formed multifocal contacts.
Similar multifocal contacts have also been produced by Mossman et al., who used electron-beam lithography (EBL) to produce chromium walls (about 100 nm wide and 5 nm high) on a silicon dioxide substrate, on which a lipid bilayer containing ICAM-1 and p-MHC was then deposited. In this way, the ICAM-1 and p-MHC were able to move freely laterally up to, but not through, the walls. The p-MHC molecules could thus be confined to several large regions, resulting in the formation of a multifocal pattern with several miniature cSMAC-like regions [15, 16], as shown in Figure 11.3.
The T-cell structures produced on the anti-CD3-patterned substrates of Doh and Irvine [19] might be thought of as a good model of an activating T cell, with the cSMAC as the principal source of activation signaling from ligated TCR. However, as will be seen below, evidence has emerged which suggests that the cSMAC is not an important source of activation signaling, which rather comes primarily from TCR–p-MHC microclusters, and the physiological relevance of the model substrate of Doh and Irvine [19] may be questioned in this respect. It is possible that the interfacial line between anti-CD3- and ICAM-1-coated areas in the studies of Doh and Irvine [19] served the same function as the dSMAC in immunological synapses formed on B cells or on supported planar bilayers. The important generalization is that T cells may be highly adaptable, as part of their evolution, to navigate a wide variety of anatomic sites and interact with essentially any cell in the body to combat continually evolving pathogens. Hence, one important role of nanotechnology may be to test the limits of this adaptability and to understand the fundamental recognition elements and how they may be manipulated.
11.2.2
TCR–p-MHC Microclusters as Important Signaling Centers
… central region of initial ICAM-1 adhesion, with the resulting TCR–p-MHC pairs then migrating to the center of the contact zone to finally form the cSMAC. A closer examination of this system in fact shows the TCR–p-MHC pairs combining to form microclusters throughout the contact zone, which then combine to form the cSMAC [22]. By using highly sensitive total internal reflection fluorescence microscopy (TIRFM), it has also proved possible to image a subsequent continuous 'rain' of microclusters, each consisting of between approximately five and 20 p-MHC–TCR pairs, that form in the peripheral dSMAC region and then move radially inwards, eventually joining the cSMAC [12, 23]. This motion is likely to occur because the TCRs are indirectly connected to actin filaments, which are moving continuously inwards in the dSMAC and pSMAC (as discussed in Section 11.2.1). Experiments in which an antibody that disrupts TCR binding to p-MHC was added after the T cells had formed a stable cSMAC on a lipid bilayer surface (such that at early times the formation of new microclusters was prevented but the cSMAC was not yet disrupted) suggest strongly that activation signaling arises from the microclusters rather than from the cSMAC, as signaling almost completely ceased at a time when the cSMAC was still intact [12]. It therefore seems likely that, rather than functioning as a signaling device, the cSMAC in fact plays other roles. In particular, it has been observed that significant numbers of TCR are endocytosed in the cSMAC. Some of these may be recycled through the cell for reincorporation into the dSMAC, ensuring a continuous supply of TCR and thus enabling signaling to continue for a long time, while others may be degraded [14, 24].
If TCR–p-MHC microclusters arising in the peripheral dSMAC give rise to activation signaling, which switches off as they join the cSMAC, there are two possible hypotheses: the initial signaling may decrease either with time, or with proximity to the center of the contact zone. It proved possible to distinguish between these hypotheses by using the chromium walls of Mossman et al. to divide the contact zone into many small areas, thus preventing microclusters from moving large distances. Using this approach, it was shown that, while the signaling from each microcluster has a finite lifetime, that lifetime decreased strongly with proximity to the center of the contact zone. This showed that spatial factors do play a role, and helped to confirm the picture of the cSMAC as a non-signaling region [15].
In contrast to the studies just mentioned, in which TCR–p-MHC microclusters formed spontaneously by the coming together or pulling together of mobile p-MHC molecules in a lipid bilayer [12, 15], Anikeeva et al. effectively created artificial TCR–p-MHC microclusters by exposing T cells to a solution of quantum dots. These are fluorescent semiconductor nanocrystals, to which p-MHC molecules have been bound, the binding mechanism being the ligation of zinc ions on the semiconductor surfaces by the imidazole groups of six histidine residues inserted at the base of the p-MHC molecule [25]. Approximately 12 p-MHC molecules were found
per quantum dot, as determined by the measurement of nonradiative energy transfer between the quantum dot and fluorophores bound to the p-MHC molecules. This suggested that the number of ligated TCRs in the artificial microclusters might have been about six, comparable to the size of the smallest signaling microclusters observed in one of the previously mentioned bilayer studies [12]. The stimulation of T cells with appropriate p-MHC-functionalized quantum dots caused activation signaling to occur. Although this study does not relate directly to our theme of T-cell activation by artificial substrates, we mention it here because it is indicative of the potential value of studies performed using p-MHC molecules bound to nanospheres. In fact, it will be seen below that surfaces functionalized with nanospheres may play an important role in future studies.
In addition to TCR–p-MHC microclusters, LFA-1–ICAM-1 microclusters have also been observed; the latter seem to form in the dSMAC and to move inwards before eventually joining thread-like LFA-1–ICAM-1 structures in the pSMAC [18].
Figure 11.4 summarizes schematically the localization of TCRs, p-MHCs, LFA-1 and
ICAM-1 in the three SMACs. The structure of the cytoskeletal actin in these regions,
as discussed in Section 11.2.1, is also shown.
11.3
The Smallest Activating Units? p-MHC Oligomers
… being enough for signaling. It transpires that activation signaling cannot in fact be initiated by a single ligated TCR, but that the coming together of at least two ligated TCRs is necessary for signaling. This was demonstrated by Boniface et al., who combined biotin-functionalized p-MHCs with naturally tetravalent streptavidin molecules to produce p-MHC monomers, dimers, trimers and tetramers. T cells exposed to the p-MHC monomer showed no activation signaling, whereas significant signaling was already present in the case of the dimer, and the strength increased when the trimer or tetramer was used [26]. This suggests that some degree of TCR clustering is necessary for T-cell activation signaling. This could conceivably indicate that part of the signaling mechanism requires the close proximity of the cytoplasmic parts of neighboring TCRs.
Interestingly, doubt was cast on the finding that TCR clustering is required for the activation signal when, in an experiment using APCs in which all of the agonist p-MHC molecules had been fluorescently labeled, activation signaling was observed to be initiated by a T cell where the contact zone with the APC surface contained only one agonist p-MHC molecule [27]. This apparent contradiction may have been resolved by Krogsgaard et al., who obtained a T-cell activation signal by stimulating cells with a synthetic heterodimer consisting of one MHC molecule with agonist peptide and one with self-peptide (i.e. peptide found within the proteome of the T-cell-producing mammal, in this case a mouse) [28]. Krogsgaard et al. argued that such heterodimers may play a role in in vivo activation. This controversy and its resolution underline the roles that molecules other than agonist p-MHC may play in in vivo T-cell activation; one of the principal advantages of experiments performed on artificial substrates is that 'clean' experiments can be performed, without the possible intrusion of unknown ligands. The ability of T cells to respond to mixed stimulations by agonist and self-peptide-loaded MHC molecules may be important for the functioning of the immune system, as it could increase the likelihood of T-cell activation by APC surfaces that present only small amounts of agonist peptide [8].
If TCR clustering is indeed required for T-cell activation, then it is interesting to ask how close together the TCRs need to be drawn in order for signaling to occur. A significant contribution towards answering this question was made by Cochran et al., who used p-MHC molecules genetically engineered to contain free cysteine residues to produce p-MHC dimers; the dimers were created by reacting the cysteine residues …
Figure 11.4 Summary of our current understanding of the structure of the immunological synapse. Schematic top and side views of the T cell only are provided here: it is assumed that all or most TCRs and LFA-1 molecules shown are ligated by p-MHCs or ICAM-1 on the opposing APC surface (not shown). In the top view, TCRs, LFA-1 and actin are shown in separate locations purely for visual clarity. In the dSMAC, which is analogous to a lamellipodium in a migrating cell, and thus contains two stacked layers of cytoskeletal actin, microclusters of TCR–p-MHC and LFA-1–ICAM-1 form and are transported towards the cell center by the inward motion of actin filaments, as indicated by arrows below the cell. The actin filament motion is caused by depolymerization at the edge of the cSMAC, and polymerization at the edge of the cell. The direction of actin filament growth is indicated by arrows within the cell. In the pSMAC, TCR–p-MHC microclusters …
11.4
Molecular-Scale Nanolithography
It can be seen from the above discussions that the clustering of p-MHC-ligated TCRs is critical to the initiation of T-cell activation signaling. In this section, we describe possible future experiments aimed at further examining these effects, using the technology of block copolymer micellar nanolithography. This has recently become available, and enables surfaces to be patterned on the nanometer scale with single protein molecules such as p-MHCs. Here, we will describe the technique in detail and review its previous uses in cell signaling studies. We will also discuss how the technique can be used, in combination with chemistry and protein engineering methods, to perform experiments to further our understanding of the immunological synapse.
11.4.1
Block Copolymer Micellar Nanolithography
… chemistry. The power of this technique is well illustrated by the studies of Arnold et al., who functionalized the gold nanoparticles with cyclic arginine-glycine-aspartate (RGD) peptide molecules that were bound to the gold via a thiol-functionalized linker [2]. These RGD peptides bind strongly to αvβ3 integrins, which are membrane-bound receptors that play an important role in the initiation of adhesion by fibroblasts to the ECM. The large size of αvβ3 integrins ensured that only one integrin could bind to each gold particle, so that the interparticle spacing could be used as a measure of the minimum separation between adjacent ligated integrin molecules. Experiments showed that fibroblasts adhered readily to substrates with an interparticle spacing of 58 ± 7 nm or less, but did not adhere to substrates with an interparticle spacing of 73 ± 8 nm or more. This suggested that some clustering of ligated integrins is necessary for the initiation of adhesion signaling in fibroblasts, and that the critical spacing below which integrins may be considered to be clustered lies between 58 nm and 73 nm. Additionally, actin-rich protein clusters known as focal adhesions, which form at sites of αvβ3 integrin-mediated adhesion and which may be considered as local indicators of adhesion signaling, were observed to form only when the interparticle spacing was below this critical value.
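Because the dots of an ideal block copolymer micellar pattern sit on a hexagonal lattice, the interparticle spacing translates directly into a ligand density (one ligand per dot, as in these experiments). A minimal sketch, with the lattice geometry as the only input:

```python
# Minimal sketch: converting the interparticle spacing of a hexagonal gold-dot
# array into a ligand density, assuming exactly one integrin ligand per dot.
import math

def ligands_per_um2(spacing_nm: float) -> float:
    """Dot density of an ideal hexagonal lattice with the given nearest-neighbor spacing."""
    cell_area_nm2 = (math.sqrt(3) / 2.0) * spacing_nm**2  # area per dot in the lattice
    return 1e6 / cell_area_nm2                            # nm^-2 -> um^-2

for d in (58.0, 73.0):
    print(f"{d:.0f} nm spacing -> {ligands_per_um2(d):.0f} ligands per um^2")
# ~343 per um^2 at 58 nm (adhesion) versus ~217 per um^2 at 73 nm (no adhesion)
```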
It is important to note that, while other lithographic techniques such as EBL [3] and dip-pen nanolithography [33–35] have been used to immobilize biological ligands on surfaces, to the best of our knowledge block copolymer micellar lithography is the only lithographic method that has thus far been used to spatially isolate individual ligand–receptor interactions. This is most likely due to its ability to reliably produce particle sizes as small as 3 nm. This limit compares favorably with, for example, the lower size limit for reliable structure production using EBL with conventional poly(methyl methacrylate) (PMMA) resists, which is about 10 nm [36], and the smallest protein feature that has been created to date using dip-pen nanolithography, which is about 25–40 nm [33].
In view of the apparent requirement for the clustering of ligated TCRs for T-cell activation signaling to occur, it would clearly be very interesting to perform an experiment analogous to that just described [2], but to study TCR rather than αvβ3 integrin clustering. In order to perform such an experiment, each gold nanoparticle would need to be functionalized with a single molecule of p-MHC, and the effect of the interparticle spacing on the activation signaling behavior of T cells brought into contact with such surfaces determined. If TCR clustering were indeed necessary for T-cell activation signaling, then one would expect to observe no signaling when the interparticle spacing was high, with signaling perhaps setting in as the interparticle spacing was reduced below a critical value. Indeed, the above-mentioned studies of Cochran et al. suggested that this spacing should range between 1 and 15 nm [29].
The binding of p-MHCs to the gold dots could be achieved by creating a recombinant MHC construct containing an appropriately located free cysteine residue that could react directly with the gold. Alternatively, protein constructs containing multiple consecutive histidine residues have been successfully bound to gold nanospheres on block copolymer micellar nanolithography-patterned surfaces by binding thiol-functionalized nitrilotriacetic acid (NTA) molecules, and allowing the carboxylate groups of the NTA and the imidazole groups of the histidines to simultaneously coordinate the same nickel cation. Functionalization of the silicon dioxide
surface between the gold nanospheres should also be considered. In the integrin-clustering experiments of Arnold et al., the area between the gold nanoparticles was functionalized with protein-repellent PEG molecules that were end-functionalized with trimethoxysilane groups (this enabled them to bind covalently to the silicon dioxide surface). This PEG functionalization ensured that the cells under study interacted with the surface only via receptor interactions with ligand-functionalized gold particles, and not via nonspecific attractions, as well as resisting the deposition of cell-secreted proteins onto the silicon dioxide surface [2]. In the context of experiments to study T-cell activation, it might be advantageous to functionalize the area between the gold nanoparticles with a combination of PEG molecules, to reduce the effect of nonspecific cell-surface attractions, and ICAM-1, to bring about the LFA-1-mediated cell adhesion that is a critical feature of the immunological synapse. This could be achieved by incorporating functional groups into the PEG layer that could be bound specifically to ICAM-1; an example would be to incorporate biotin groups into the PEG layer that bind via streptavidin to biotinylated ICAM-1 molecules. The incorporation of biotin into surface-grafted PEG layers has been achieved ([37, 38]; I.E. Dunlop et al., unpublished results), and its incorporation into the PEG layer between the gold nanoparticles should be readily achievable.
11.4.2
Micro-/Nanopatterning by Combining Block Copolymer Micellar Nanolithography
and Electron-Beam Lithography
Surfaces that are structured on both the micrometer and nanometer scales can be
produced using a method that combines block copolymer micellar nanolithography
with EBL.
The principle of the method is shown schematically in Figure 11.6 [39]. After
having deposited a close-packed monolayer of HAuCl4-loaded block copolymer
micelles onto the substrate, part of the layer is exposed to an electron beam, which
causes the polymer molecules to become highly crosslinked. The substrate is then
rinsed with acetone to remove all noncrosslinked polymer, leaving micelles behind
only in the area that was exposed to the electron beam. These micelles are then
exposed to a hydrogen plasma, which leads to hexagonally arranged gold nanoparticles in the normal manner. It is thus possible, by using a steerable electron beam (such as that in a scanning electron microscope), to pattern only parts of a surface using block copolymer micellar nanolithography, and thus to produce patches of patterning containing controlled numbers of gold nanoparticles (see Figure 11.6).
Surfaces prepared using this method could enable experiments that address questions about the number of p-MHC molecules or clusters required for T-cell activation signaling. For example, if p-MHC-ligated TCR dimers are sufficient to cause T-cell activation signaling, then it is interesting to ask whether one dimer would be sufficient to produce a detectable activation signal, as suggested by the cell–cell contact experiments of Irvine et al. [27], and how the signaling strength would depend on the number of dimers with which a cell interacts. Additionally, the suggestion of Varma et al., that signaling might arise primarily from microclusters of between roughly five and 20 p-MHC-ligated TCRs, makes it interesting to examine the effect of microcluster size, and of the number of microclusters per cell, on signaling intensity. Simulated microclusters containing precisely controlled numbers of molecules could be produced using patches of gold nanoparticles similar to those shown in Figure 11.6; here, each gold nanoparticle could bear one p-MHC molecule, and the interparticle spacing could be chosen sufficiently small that a T cell would see the resulting ligated TCRs as clustered. Alternatively, microclusters could be simulated by allowing several p-MHCs to bind to one larger nanosphere, as in the experiments of Anikeeva et al. mentioned above, in which p-MHCs were bound to soluble quantum dots [25].
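To give a feel for the numbers involved, the short Python sketch below estimates how many nanoparticles a circular patterned patch would present, assuming an ideal hexagonal lattice. The patch diameter and interparticle spacing used in the example are hypothetical values chosen only for illustration; they are not parameters taken from the studies cited above.

```python
import math

def particles_in_patch(patch_diameter_nm: float, spacing_nm: float) -> int:
    """Count the points of an ideal hexagonal lattice (nearest-neighbor
    distance = spacing_nm) that fall inside a circular patch."""
    r = patch_diameter_nm / 2.0
    a = spacing_nm
    row_height = a * math.sqrt(3) / 2  # vertical distance between lattice rows
    n_rows = int(r / row_height) + 2
    n_cols = int(r / a) + 2
    count = 0
    for i in range(-n_rows, n_rows + 1):
        y = i * row_height
        x_offset = a / 2 if i % 2 else 0.0  # odd rows are shifted by a/2
        for j in range(-n_cols, n_cols + 1):
            x = j * a + x_offset
            if x * x + y * y <= r * r:
                count += 1
    return count

# A hypothetical 200 nm patch with 50 nm spacing presents ~19 particles,
# i.e. within the 5-20 p-MHCs per microcluster suggested by Varma et al.
print(particles_in_patch(200, 50))
```

On this simple geometric picture, tuning either the patch diameter (via the electron-beam exposure) or the micellar spacing sets the number of ligated TCRs a cell could engage within one simulated microcluster.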
11.5
Therapeutic Possibilities of Immune Synapse Micro- and Nanolithography
The principle of the technique is shown schematically in Figure 11.7. Here, T cells are removed from a patient and a selected subpopulation is deliberately activated, causing it to expand in vitro, before being returned to the patient's body. The returned T cells should then produce a strong immune response to the disease being treated. Although adoptive T-cell transfer may prove useful in combating certain viral infections, much research has focused on the treatment of cancerous tumors, where the subpopulation of cells to be expanded should clearly be selected to be responsive to tumor-related antigens (for reviews, see Refs [40–42]). One approach is to use extracted tumor cells directly to selectively activate T cells of an appropriate specificity; this leads to a population of T cells that are specific for a variety of epitopes contained in the tumor [43]. Alternatively, epitopes that are known to be tumor-associated can be chosen, and T cells that are specific to those epitopes activated using artificial MHC–peptide constructs [40]. Here, we will focus on the second approach, as micro- and nanopatterned biomimetic surfaces functionalized with p-MHCs and costimulatory molecules might be of value in this context.
The identification of tumor-specific antigens is key if the adoptive cell therapy is to target a tumor without damaging healthy tissue. This approach is of particular value in tumors that are virus-induced, where antigens derived from viral proteins can be used [44]. Equally, many tumors express significantly mutated proteins that could be targeted, although the individual genetic analysis of a patient's tumor could prove expensive [41, 45]. Alternatively, antigens can be chosen from proteins that are known to be overexpressed in particular types of tumor, or even from healthy but tissue-specific proteins in tissues that are not necessary for survival, such as the prostate gland [41, 46].
In order to activate T cells using synthetic p-MHCs it is not necessary to use sophisticated spatially patterned substrates; the p-MHCs could simply be bound to a surface with no control over their spatial distribution. However, the results of recent studies of adoptive cell therapy have emphasized that T-cell activation is not a simple on–off event; depending on the details of the activation method, as well as the prior history of the T cell, a huge variety of subtly different phenotypes can be obtained. Moreover, the differences between these phenotypes can determine the outcome of treatment [40]. An important factor here is the strength of T-cell stimulation with p-MHCs. T cells that are fairly strongly stimulated tend to differentiate to an effector or effector memory T-cell phenotype, which will combat infection but not give rise to a long-lived population of T cells in vivo. In contrast, less strongly stimulated cells tend to differentiate to longer-lived central memory T cells, which are more likely to act as progenitors of a large, long-lived T-cell population [40, 47]. It has been suggested that adoptive cell therapy can be more effective if central memory, rather than effector or effector memory, T cells are used [48]. Micro- and nanopatterned p-MHC surfaces could clearly be used to control the activation dose delivered to each T cell by, for example, producing spatially separated activating patches, each of which contains a given number of p-MHCs, along with appropriate adhesion molecules and cofactors. As discussed above, the spatial distribution of p-MHCs on an activating surface can play a role in determining immunological synapse structure and also the degree of T-cell activation; spatially structured p-MHC-functionalized surfaces may therefore be of use in controlling the phenotype of T cells used for adoptive cell therapy.
A number of other factors, in addition to the nature of the p-MHC stimulus, have been identified as important in the preparation of T cells for adoptive cell therapy [40]. For example, it may be necessary to selectively activate either helper or killer T cells, and it is certainly important not to activate regulatory T cells, which act to suppress the immune response to the target epitope [42, 49]. Also, certain effects of in vitro culture may cause T cells to senesce in ways that resemble the weakening of the immune system on aging, thus reducing their therapeutic effectiveness [40, 50, 51]. Both of these issues have been addressed by activating T cells with costimulatory molecules simultaneously with p-MHCs. Given the spatial structuring of the immunological synapse, using lithographic methods to determine the relative positions of p-MHCs and costimulatory molecules (as described above for p-MHCs and ICAM-1) might well lead to better control of the final T-cell phenotype. Recently, when microcontact printing was used to generate patterns of costimulatory anti-CD28 and TCR-activating anti-CD3, it was shown that multiple peripheral anti-CD28 foci stimulated T-cell interleukin-2 production better than one large central spot containing the same amount of anti-CD28 [52].
To summarize, adoptive cell therapy based on the ex vivo activation of T cells shows promise as an anti-cancer therapy, but better control of the detailed phenotype of the activated T cells is desired. Lithographic patterning of activating surfaces with p-MHCs and costimulatory molecules may contribute to attaining this control.
11.6
Conclusions
Studies performed by bringing T cells into contact with artificial surfaces that mimic aspects of APC surfaces have contributed greatly to our understanding of the immunological synapse, and such surfaces may be of therapeutic value in the future. Among the most informative experiments have been those performed using substrates bearing lipid bilayers that contain mobile ICAM-1 and p-MHCs. Photolithographic methods have been used to control the mobility of molecules within such bilayers, simulating possible effects of the APC cytoskeleton and enabling the effects of reduced mobility on signaling by p-MHC-ligated TCR microclusters to be investigated. In separate studies, photolithographic methods that enable the patterning of surfaces with multiple proteins have been used to bring about artificial SMAC-like structures. The importance of studying p-MHC-ligated TCR clustering effects at the nanometer scale is attested to by evidence from several studies in which T cells were stimulated with soluble p-MHC oligomers, and substrates that are patterned with single p-MHC molecules on the nanometer scale will accordingly be required for the next generation of such studies. Block copolymer micellar nanolithography represents a suitable technique for generating such substrates and, when combined with EBL, will enable the production of surfaces patterned on both nanometer and micrometer length scales. T-cell activation experiments performed on such substrates are likely to play a role in extending our understanding of the immunological synapse.

Both micro- and nanopatterned substrates may also be used for ex vivo T-cell activation in the context of adoptive T-cell immunotherapy, where T cells removed from a patient are activated and expanded ex vivo before being returned to combat disease, notably cancer. The use of these substrates may also help to gain close control of the phenotypes of ex vivo-activated T cells, leading to more effective treatments.
Acknowledgments
The authors thank Thomas O. Cameron and Rajat Varma for useful discussions. This
chapter was partially supported by the National Institutes of Health through the NIH
Roadmap for Medical Research (PN2 EY016586) (I.E.D., M.L.D., J.P.S.) and by the
Max Planck Society (I.E.D., J.P.S.). I.E.D. acknowledges a Humboldt Research
Fellowship.
References
1 Vogel, V. and Sheetz, M. (2006) Nature
Reviews Molecular Cell Biology, 7, 265.
2 Arnold, M., Cavalcanti-Adam, E.A., Glass, R., Blümmel, J., Eck, W., Kantlehner, M., Kessler, H. and Spatz, J.P. (2004) ChemPhysChem, 5, 383.
3 Cherniavskaya, O., Chen, C.J., Heller, E., Sun, E., Provezano, J., Kam, L., Hone, J., Sheetz, M.P. and Wind, S.J. (2005) Journal of Vacuum Science & Technology B, 23, 2972.
36 Vieu, C., Carcenac, F., Pepin, A., Chen, Y.,
Mejias, M., Lebib, A., Manin-Ferlazzo, L.,
Couraud, L. and Launois, H. (2000) Applied
Surface Science, 164, 111.
37 Morgenthaler, S., Zink, C., Städler, B., Vörös, J., Lee, S., Spencer, N.D. and Tosatti, S.G.P. (2006) Biointerphases, 1, 156.
38 You, Y.-Z. and Oupicky, D. (2007) Biomacromolecules, 8, 98.
39 Glass, R., Arnold, M., Blümmel, J., Küller, A., Möller, M. and Spatz, J.P. (2003) Advanced Functional Materials, 13, 569.
40 June, C.H. (2007) Journal of Clinical
Investigation, 117, 1204.
41 June, C.H. (2007) Journal of Clinical
Investigation, 117, 1466.
42 Gattinoni, L., Powell, D.J., Rosenberg, S.A.
and Restifo, N.P. (2006) Nature Reviews
Immunology, 6, 383.
43 Milone, M.C. and June, C.H. (2005)
Clinical Immunology, 117, 101.
44 Straathof, K.C.M., Bollard, C.M.,
Popat, U., Huls, M.H., Lopez, T.,
Morriss, M.C., Gresik, M.V., Gee, A.P.,
Russell, H.V., Brenner, M.K., Rooney,
C.M. and Heslop, H.E. (2005) Blood,
105, 1898.
45 Sjöblom, T., Jones, S., Wood, L.D., Parsons, D.W., Lin, J., Barber, T.D., Mandelker, D., Leary, R.J., Ptak, J., Silliman, N., Szabo, S., Buckhaults, P., Farrell, C., Meeh, P., Markowitz, S.D., Willis, J., Dawson, D., Willson, J.K.V., Gazdar, A.F., Hartigan, J., Wu, L., Liu, C.S., Parmigiani, G., Park, B.H., Bachman, K.E., Papadopoulos, N., Vogelstein, B., Kinzler, K.W. and Velculescu, V.E. (2006) Science, 314, 268.
12
Bone Nanostructure and its Relevance for Mechanical
Performance, Disease and Treatment
Peter Fratzl, Himadri S. Gupta, Paul Roschger, and Klaus Klaushofer
12.1
Introduction
The human skeleton not only serves as an ion reservoir for calcium homeostasis but also has an obvious mechanical function in supporting and protecting the body. These functions place serious requirements on the mechanical properties of bone, which should be stiff enough to support the body's weight and tough enough to prevent easy fracturing. Such outstanding mechanical properties are achieved by a very complex hierarchical structure of bone tissue, which has been described in a number of reviews [1–3]. Starting from the macroscopic structural level, bones can have quite diverse shapes, depending on their respective function. Long bones, such as the femur or the tibia, are found in the body's extremities and provide stability against bending and buckling. In other cases, for example in the vertebra or the head of the femur, the applied load is mainly compressive, and in such cases the bone shell is filled with highly porous cancellous bone (see Figure 12.1). Several levels of hierarchy are visible in this figure, with trabeculae or osteons in the hundred-micron range (Figure 12.1b and c), a lamellar structure in the micron range (Figure 12.1d), collagen fibrils of 50–200 nm diameter (Figure 12.1e), and collagen molecules as well as bone mineral particles with a thickness of just a few nanometers.

This hierarchical structure is largely responsible for the outstanding mechanical properties of bone. At the nanoscale, both collagen and mineral, and also their structural arrangement, play a crucial role. In this chapter we review the structure of bone at the nanoscale, and describe some recent findings concerning its influence on the deformation and fracture of bone. We also outline some approaches to studying biopsy specimens in diseases and treatments that are known to influence bone at the nanoscale.
12.2
Nanoscale Structure of Bone
and later grow in thickness [15, 16]. Among bone tissues from several different mammalian and nonmammalian species, the bone mineral crystals have thicknesses ranging from 1.5 to 4.5 nm [2, 7, 16–20]. While bone mineral is based mainly on hydroxyapatite (Ca5(PO4)3OH), it also typically contains additional elements that replace either the calcium ions or the phosphate or hydroxyl groups; one of the most common such substitutions is replacement of the phosphate group by carbonate [1, 2].
12.3
Mechanical Behavior of Bone at the Nanoscale
The fracture resistance of bone results from the ability of its microstructure to dissipate deformation energy without the propagation of large cracks leading to eventual material failure [21–23]. Different mechanisms have been reported for the dissipation of energy [24], including: the formation of nonconnected microcracks ahead of the crack tip [25, 26]; crack deflection and crack blunting at interlamellar interfaces and cement lines [27]; and crack bridging in the wake zone of the crack [28–30], which has been attributed a dominant role [28].

One striking feature of the fracture properties of compact bone is the anisotropy of the fracture toughness, which differs by almost two orders of magnitude between a crack that propagates parallel and one that propagates perpendicular to the collagen fibrils [24]. This results in a zig-zag crack path when the crack must propagate perpendicular to the fibril direction (Figure 12.2a). This dependence of fracture properties on collagen orientation underlines the general importance of the organic matrix and
its organization for bone toughness. The organic matrix varies with genetic background, age and disease, and this will clearly influence bone strength and toughness [2, 31–39].
The dominant structural motif at the nanoscale is the mineralized collagen fibril. Important contributions to the fracture resistance and defect tolerance of bone composites are believed to arise from these nanometer-scale structural motifs. In recent studies [40], it has been shown that the tissue, the mineralized fibrils and the mineral nanoparticles all deform at first elastically, but to different degrees, in a ratio of 12 : 5 : 2 between tissue, fibrils and mineral particles. These different degrees of deformation of components arranged in parallel within the tissue can be explained by shear deformation between the components [41]. This means that there is shear deformation within the collagen matrix inside the fibril to accommodate the difference between the strain in mineral particles and fibrils. In addition, there must also be some shear deformation between adjacent collagen fibrils to accommodate the residual tissue strain. This shear deformation presumably occurs in a 'glue' layer between fibrils (Figure 12.2c), which may consist of proteoglycans and noncollagenous phosphorylated proteins [40, 42–45]. The existence of a glue layer was originally proposed as a consequence of investigations using scanning force microscopy [43]. Beyond the regime of elastic deformation, it is likely that the glue matrix is partially disrupted, and that neighboring fibrils move past each other, breaking and reforming the interfibrillar bonds. An alternative explanation could be debonding between the organic matrix and the hydroxyapatite particles (Figure 12.2c) and a modification of the frictional stress between fibril structures [46].
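As an illustrative reading of the reported ratio (a back-of-the-envelope interpretation, not an analysis taken from Ref. [40] itself), the strains at the three structural levels scale as

```latex
\varepsilon_{\mathrm{tissue}} : \varepsilon_{\mathrm{fibril}} : \varepsilon_{\mathrm{mineral}} = 12 : 5 : 2,
\qquad\text{so}\qquad
\varepsilon_{\mathrm{fibril}} \approx \tfrac{5}{12}\,\varepsilon_{\mathrm{tissue}},
\quad
\varepsilon_{\mathrm{mineral}} \approx \tfrac{1}{6}\,\varepsilon_{\mathrm{tissue}}.
```

For example, a tissue strain of 1.2% would correspond to a fibril strain of about 0.5% and a mineral strain of about 0.2%, consistent with the maximum mineral strains quoted below.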
The maximum strain seen in mineral nanoparticles (0.15–0.2%) can reach up to twice the fracture strain calculated for bulk apatite. The very high strength (about 200 MPa) of the mineral particles may result directly from their extremely small size [49]. The strength of brittle materials is known to be controlled by the size of their defects, and it can of course be argued that a defect in a mineral particle cannot be larger than the particle itself. Under such conditions, the strength approaches the theoretical value determined by the chemical bonds between atoms rather than by the defects [49]. Although the nanoparticles in bone are still some way off this value (E/10, or about 10 GPa), it is believed that the trend towards higher strengths is related to their small size.
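The numbers quoted are mutually consistent: taking a Young's modulus of E ≈ 100 GPa for apatite (as implied by the E/10 ≈ 10 GPa theoretical strength above; this modulus is an assumption, not a value stated in the text), the measured particle strains translate into stresses of

```latex
\sigma_{\mathrm{mineral}} = E\,\varepsilon \approx 100\,\mathrm{GPa} \times (0.0015\text{--}0.002)
\approx 150\text{--}200\,\mathrm{MPa},
\qquad
\sigma_{\mathrm{th}} \approx \frac{E}{10} \approx 10\,\mathrm{GPa}.
```

The particles thus operate at only a few percent of the theoretical strength, yet already at roughly twice the fracture strain of bulk apatite.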
As a consequence, it must be concluded that the mechanical properties of bone material are determined by a number of structural features, including:
- the mineral concentration inside the organic matrix, the bone mineral density distribution (BMDD)
- the size of the mineral particles
- the quality of the collagen, in terms of its amino-acid sequence, crosslinks and hydration
- the quality and composition of the extrafibrillar organic matrix between the collagen fibrils (consisting mostly of noncollagenous organic molecules)
- the orientation distribution of the mineralized collagen fibrils.
Assuming that these parameters are typically optimized in healthy bone material, it is likely that any deviation from normal might affect the mechanical performance of the bone. Although these material characteristics cannot typically be determined noninvasively in patients, they are accessible when studying biopsies using different, and in some cases well-established, technologies.
12.4
Bone Mineral Density Distribution in Osteoporosis and Treatments
The mineral concentration inside the organic bone matrix is a major determinant of bone stiffness and strength [2, 33, 50, 51]. However, the mineral content within both the trabecular and the cortical bone motifs is far from homogeneous (Figure 12.3). At least two processes that occur in bone over the whole lifetime of an adult individual are responsible for this situation [52]:
- Bone remodeling: The cortical and trabecular bone compartments are continuously remodeled. This means that, during a cycle of about 200 days, areas of bone are resorbed by specific bone cells (osteoclasts); this results in resorption lacunae, which are refilled with new bone matrix [53] produced by other bone cells (osteoblasts). Thus, the bone tissue of an adult individual is on average younger than that adult's chronological age, because the bone turnover time is about five years [54]. In addition, the more such remodeling sites act on the bone surface, the higher the bone turnover rate will be, and the more bone packets will be present at a younger stage (a minimal model of this is sketched below).
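A minimal steady-state model makes the 'younger tissue' statement quantitative. Assuming, as a simplification not made explicit in the text, that a fraction k of the bone volume is remodeled per year at spatially random locations, the tissue-age distribution is exponential:

```latex
p(a) = k\,e^{-ka},
\qquad
\langle a \rangle = \int_0^{\infty} a\,p(a)\,\mathrm{d}a = \frac{1}{k} \approx 5~\text{years},
```

independent of the individual's chronological age; a higher turnover rate k shifts the whole distribution towards younger bone packets, as stated above.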
12.4.1
Osteoporosis
12.5
Examples of Disorders Affecting the Structure of Bone Material
As discussed above, the mechanical performance of bone tissue depends on all levels of hierarchy, and several diseases are characterized by modifications at the nanostructural level. In this section we will detail three examples: (i) osteogenesis imperfecta, which is based on mutations of the collagen gene; (ii) pycnodysostosis, which originates from a mutation of the cathepsin K gene; and (iii) fluorosis, which is caused by high doses of fluoride. Whilst all three conditions are characterized by a modification at the nanoscale, either of the organic matrix or of the mineral particles, adaptation processes during bone remodeling [3] may lead to a partial compensation of the original defect, sometimes at higher hierarchical levels. This means that the modification of bone structure may spread over different hierarchical levels, making it more difficult to pin down the actual origin of the defect.
12.5.1
Osteogenesis Imperfecta
Osteogenesis imperfecta (OI) is a genetic disease that generally affects the collagen gene and leads to brittle bone with differing degrees of severity [68–70]. The origin of the brittleness of the tissue is not fully understood, but it must be linked to a mutation of the collagen molecule and the resultant changes in tissue quality. Generally, OI also leads to a reduced bone mass and cortical thickness [70], which additionally increases bone fragility. It has been shown that anti-resorptive treatment of affected children with the bisphosphonate pamidronate leads to an increase in cortical thickness and to a concomitant reduction of fracture incidence [70]. At the nanostructural level, an increased mineral content was found in the bone matrix [71, 72], which leads to increased stiffness and hardness of the bone tissue [73]. However, the significance of this increased mineralization for bone fragility is not yet fully clear, as it is not affected by bisphosphonate treatment [73].
More detailed information on the bone matrix nanostructure and the disease-related changes of its properties has been obtained in a mouse model of OI [74, 76–82]. This model, which is known as osteogenesis imperfecta murine (oim), is characterized by an absence of the α2 procollagen chain, leading to the formation of collagen α1 homotrimers. The mechanical properties of the bone tissue were found to be altered, with a reduced failure load [83] and toughness [82] in oim compared to controls. The mineral content was, however, increased in oim [75], leading to a stiffer matrix [75, 78] (see Figure 12.4). In agreement with observations in humans, this increased tissue mineralization was preserved during treatment with bisphosphonates [84]. The increased brittleness of the tissue is most likely due to a weakness of the collagen matrix, associated with the increase in mineralization. Indeed, the collagen fibrils seem to break at only half the load in oim homozygotes [74] (see Figure 12.4). The reason for this inherent weakness of the collagen might be a modified crosslink pattern in the fibrils [81], since the normal crosslinks between α1 and α2 chains cannot form in the absence of α2.
12.5.2
Pycnodysostosis
collagen left by the dysfunctional osteoclasts may also be possible through the action of matrix metalloproteinases synthesized by the bone lining cells, which are members of the osteoblastic lineage [90]. It appears that these two pathways are not equivalent, however, as the lack of cathepsin K activity leads not only to disturbed bone resorption [91] but also to a decreased bone formation activity [92, 93]. Bone tissue analyses of two affected patients also revealed defects at the nanostructural level, with mineral crystals being increased in size, reflecting a less-remodeled, older bone tissue. Moreover, the trabecular architecture appeared to be severely disturbed, with an unusually large variability in the orientation of the mineral particles and a highly disturbed lamellar organization, with the main collagen fibrils not oriented in the principal stress direction (see Figure 12.5). Thus, the absence of functional cathepsin K activity has a profound effect on bone quality at the nanoscale, and most likely leads to the observed increase in bone fragility.
12.5.3
Fluorosis
Fluoride has an anabolic effect on bone and is known to increase cancellous bone mass. During the 1980s and 1990s this led to fluoride being considered as a potential treatment for osteoporosis [94, 95], although clinical trials failed to confirm the anticipated anti-fracture efficacy [65, 96]. One reason for this is that fluoride clearly not only stimulates osteoblasts to form new bone but also has a direct effect on bone material quality. Studies involving small-angle X-ray scattering (SAXS) and backscattered electron imaging [66, 97, 98] revealed that bone formed under the influence of fluoride has a quite different microscopic structure (see Figure 12.6). Moreover, the collagen–mineral nanocomposite was seen to be massively disturbed. Indeed, the strongly modified SAXS signal from bone areas newly formed under the influence of fluoride revealed the presence of mineral crystals much larger than those in normal bone (Figure 12.6). This implied that the collagen and mineral in fluorotic bone did not form a well-organized nanocomposite, but that the large mineral crystals simply coexisted with the collagen fibrils. The result was a bone material of lower quality that would most likely be more brittle than usual. The images in Figure 12.6, which show a bone biopsy of a patient treated with sodium fluoride, also indicate that old bone with a normal structure coexists with newly formed fluorotic bone material. Owing to the constant bone turnover, the old normal bone is gradually replaced by new bone with a fluorotic structure. This gradually offsets the positive mechanical effect of the increased bone mass, and finally leads to a deterioration of bone stability against fracture [66, 97, 98].
12.6
Conclusions
New analyses of epidemiologic data provide strong evidence for the view that all (or, better, the overwhelming majority of) fractures, regardless of when they occur and of the level of trauma that precipitates them, may be based upon bone fragility [59], thus focusing all aspects of the pathophysiology, diagnosis and treatment of skeletal diseases on the central question of mechanical competence and bone fragility. Following Robert Marcus' thoughts on '. . . the nature of osteoporosis' [102], bone fragility might be defined most appropriately, from the pathophysiological point of view, as '. . . the consequence of a stochastic process, that is, multiple genetic, physical, hormonal and nutritional factors acting alone or in concert to diminish skeletal integrity.'
Based on the fact that skeletal integrity is determined by the outstanding mechanical properties of bone at all hierarchical levels of its structure and organization [2], it becomes increasingly evident that a simple diagnostic parameter such as lumbar spine or hip BMD [103–105], although frequently used as a noninvasive diagnostic tool in clinical routine, does not have the diagnostic power to reflect the complex pathophysiological mechanisms that determine bone fragility. Thus, the availability of new diagnostic tools developed by materials scientists, coupled with a possible combinatorial approach using different methods to define the material qualities of bone from the micrometer to the nanometer scale, should bring about a renaissance of bone biopsies as diagnostic tools in clinical osteology. For example, the BMDD of trabecular human bone (as described above) was shown to be evolutionarily optimized within relatively small variations (ca. 3%), independent of the skeletal region, for healthy adults aged between 25 and 95 years. Until now, no differences in BMDD-derived parameters have been identified with regard to gender or ethnicity. As shown in several examples, deviations from the normal BMDD seem to be associated with skeletal disorders, and in many cases indicate bone fragility [57]. The BMDD can be determined by using qBEI on a transiliac biopsy, as routinely obtained for histomorphometry, and combined with a variety of techniques based on spectroscopy, light scattering or biomechanical testing [2, 57].
When the treatment of post-menopausal osteoporosis with the anti-resorptives alendronate [62] or risedronate [60], and with anabolic intermittent PTH [55], was investigated, slight but significant deviations in BMDD indicated a lower mineralization for all placebo groups, and this was confirmed for idiopathic osteoporosis in pre-menopausal women. An example of learning from a materials science perspective was that of fluorosis and the fluoride treatment of post-menopausal osteoporosis [66, 97]. Despite sodium fluoride being used widely to treat post-menopausal osteoporosis, no anti-fracture efficacy was reported. Rather, the bone quality revealed extensive and pathologic mineralization at both the micro- and nanoscale, leading to a more brittle material with increased fragility.
Two classical genetic bone diseases, pycnodysostosis [92] and OI [68, 70, 71, 74–76, 78, 84], point to a genetically related diminution of skeletal integrity. In OI, which is often fatal, the primary pathology was shown to be brittle bones, inefficient repair mechanisms and a high bone turnover, whereas in pycnodysostosis the effects were caused by nonfunctioning osteoclasts due to mutations of the essential enzyme cathepsin K [85]. However, an inability to optimize structure by bone remodeling results in a sclerosing bone disease with high bone mass and fragility fractures due to a disorganized structure at several hierarchical levels.
In conclusion, a wealth of evidence has been accumulated in recent years supporting the concept that the study of bone micro- and nanostructures will not only improve our understanding of the mechanisms that underlie bone fragility but also help to identify the effects of treatments. Nanomedicine, and its application to bone research, will in time undoubtedly broaden our knowledge of pathophysiology and improve the diagnosis, prevention and treatment of bone diseases. The availability of new techniques to investigate bone biopsies will surely challenge clinical osteologists and bone pathologists in the near future.
References
1 Weiner, S. and Wagner, H.D. (1998)
Annual Review of Materials Science,
28, 271.
2 Fratzl, P., Gupta, H.S., Paschalis, E.P. and
Roschger, P. (2004) Journal of Materials
Chemistry, 14, 2115.
3 Fratzl, P. and Weinkamer, R. (2007)
Progress in Materials Science, 52, 1263.
4 Canty, E.G. and Kadler, K.E. (2002)
Comparative Biochemistry and Physiology.
Part A, Molecular & Integrative Physiology,
133, 979.
5 Kadler, K.E., Holmes, D.F., Trotter, J.A.
and Chapman, J.A. (1996)
The Biochemical Journal, 316, 1.
6 Hodge, A.J. and Petruska, J.A. (1963)
Aspects of Protein Structure
(ed G.N. Ramachandran), Academic
Press, New York, p. 289.
7 Rubin, M.A., Rubin, J. and Jasiuk, W.
(2004) Bone, 35, 11.
8 Fantner, G.E., Hassenkam, T., Kindt, J.H.,
Weaver, J.C., Birkedal, H., Pechenik, L.,
Cutroni, J.A., Cidade, G.A.G., Stucky, G.D.,
Morse, D.E. and Hansma, P.K. (2005)
Nature Materials, 4, 612.
9 Landis, W.J. (1996) Connective Tissue
Research, 35, 1.
10 Weiner, S. and Traub, W. (1992)
The FASEB Journal, 6, 879.
11 Hassenkam, T., Fantner, G.E.,
Cutroni, J.A., Weaver, J.C., Morse, D.E.
and Hansma, P.K. (2004) Bone, 35, 4.
56 Boivin, G. and Meunier, P.J. (2003)
Osteoporosis International, 14, S22.
57 Roschger, P., Paschalis, E.P., Fratzl, P.
and Klaushofer, K. (2008) Bone, 42, 456.
58 Roschger, P., Fratzl, P., Eschberger, J. and
Klaushofer, K. (1998) Bone, 23, 319.
59 Mackey, D.C., Lui, L.Y., Cawthon, P.M.,
Bauer, D.C., Nevitt, M.C., Cauley, J.A.,
Hillier, T.A., Lewis, C.E., Barrett-Connor,
E. and Cummings, S.R. (2007) The Journal
of the American Medical Association,
298, 2381.
60 Zoehrer, R., Roschger, P., Paschalis, E.P.,
Hofstaetter, J.G., Durchschlag, E.,
Fratzl, P., Phipps, R. and Klaushofer, K.
(2006) Journal of Bone and Mineral
Research, 21, 1106.
61 Fratzl, P., Roschger, P., Fratzl-Zelman, N., Paschalis, E.P., Phipps, R. and Klaushofer, K. (2007) Calcified Tissue International, 81, 73.
62 Roschger, P., Rinnerthaler, S., Yates, J.,
Rodan, G.A., Fratzl, P. and Klaushofer, K.
(2001) Bone, 29, 185.
63 Haas, M., Leko-Mohr, Z., Roschger, P.,
Kletzmayr, J., Schwarz, C.,
Mitterbauer, C., Steininger, R.,
Grampp, S., Klaushofer, K., Delling, G.
and Oberbauer, R. (2003) Kidney
International, 63, 1130.
64 Roschger, P., Mair, G., Fratzl-Zelman, N.,
Fratzl, P., Kimmel, D., Klaushofer, K.,
LaMotta, A. and Lombardi, A. (2007)
Journal of Bone and Mineral Research,
22, S129.
65 Riggs, B.L., Hodgson, S.F., O'Fallon, W.M.,
Chao, E.Y.S., Wahner, H.W., Muhs, J.M.,
Cedel, S.L. and Melton, L.J. (1990) The
New England Journal of Medicine, 322, 802.
66 Fratzl, P., Roschger, P., Eschberger, J.,
Abendroth, B. and Klaushofer, K. (1994)
Journal of Bone and Mineral Research,
9, 1541.
67 Finkelstein, J.S., Hayes, A.,
Hunzelman, J.L., Wyland, J.J., Lee, H. and
Neer, R.M. (2003) The New England
Journal of Medicine, 349, 1216.
68 Prockop, D.J. (1992) The New England
Journal of Medicine, 326, 540.
13
Nanoengineered Systems for Tissue Engineering
and Regeneration
Ali Khademhosseini, Bimal Rajalingam, Satoshi Jinno, and Robert Langer
13.1
Introduction
structure at the nanoscale is a potentially powerful approach for generating biomimetic tissues. These processes will be critical not only for generating tissue-engineered constructs but also for engineering in vitro systems that can be used for various drug discovery and diagnostics applications.

Nanotechnology is an emerging field that is concerned with the design, synthesis, characterization and application of materials and devices that have a functional organization, in at least one dimension, on the nanometer scale, ranging from a few to about 100 nm [1]. Owing to this ability to control features at small length scales, nanotechnology is becoming more commonly used in a number of biomedical endeavors, ranging from drug delivery [2–5] to in vivo imaging [6].
In this chapter we will discuss the application of nanotechnology to tissue engineering as an enabling tool. Specifically, we will provide an overview of two different types of nanoengineered system that are used in tissue engineering. First, we will focus on various approaches that are used to generate nanoscale modifications to existing polymers and materials. These nanoengineered systems, such as nanopatterned substrates and electrospun scaffolds, provide structures that influence cell behavior and subsequent tissue formation. Furthermore, we will discuss the use of other nanoscale structures, such as controlled-release nanoparticles, for tissue engineering. In the second part of the chapter we will discuss the use of nanotechnology for the synthesis of novel materials that behave differently in bulk compared to their nanoscale versions. Such materials include self-assembled materials, carbon nanotubes and quantum dots. Throughout the chapter we will discuss the use of nanomaterials for controlling the cellular microenvironment and for generating 3-D tissues. We will also detail the potential limitations, emerging topics of interest and challenges in this area of research. Clearly, the application of nanotechnology to tissue engineering and cell culture is an exploding field, and in this chapter we hope to provide a glimpse into its various applications. Throughout the chapter, when applicable, the reader is directed to more extensive reviews that provide further detail regarding specific topics.
13.2
Nanomaterials Synthesized Using Top-Down Approaches
cell–cell, cell–ECM and cell–soluble factor interactions in the resulting scaffolds. This has prompted the use of other techniques to fabricate nanostructured scaffolds. For example, nanofabricated PLLA scaffolds that supported the differentiation and outgrowth of NSCs have been fabricated by a liquid–liquid phase separation method [20]. Yet, despite these limitations, electrospinning is a powerful technology for the fabrication of 3-D nanoscaffolds.
13.2.2
Scaffolds with Nanogrooved Surfaces
Both micro- and nanotextured substrates can significantly influence cell behaviors such as adhesion, gene expression [15] and migration [16]. This is because the interaction of cells with a biomaterial results in the localization of focal adhesions, actin stress fibers and microtubules [21]. Focal contacts are involved in signal transduction pathways, which in turn can regulate a wide array of cell functions [22] (Figure 13.2). As nanofiber scaffolds have a larger surface area, they offer more potential binding sites for cell membrane receptors, thus affecting cell behavior in unique ways. It has also been observed that cells display more filopodia when they come into contact with nanoscale surfaces, presumably to sense the external topography [23]. Although the mechanism of the cellular response to nanotopography is not entirely understood, it has been suggested that an interaction of cellular processes and interfacial forces results in this peculiar cellular behavior [7]. Both micro- and nanotextured substrates can be engineered either on tissue culture substrates or within 3-D tissue scaffolds. Within tissue engineering scaffolds, nanotextures provide physical cues to seeded cells and regulate the interaction of host cells with the scaffold. For example, surfaces with a desired roughness have been shown to increase osteoblast adhesion for orthopedic replacement/augmentation applications [24].
Nanotextures can be generated using a variety of techniques, depending on the material as well as the dimensions and shapes of the desired structures. For example, features of less than 100 nm may be produced by a range of techniques, including chemical etching in metals [25], the embedding of carbon nanofibers in composite materials [26], the casting of polymer replicas from ECM [27], or the embedding of constituent nanoparticles in materials ranging from metals to ceramics to composites [28–31]. Electron-beam lithography (EBL) has been used to fabricate well-defined nanostructures at sub-50 nm length scales [21, 32]. These tools can be used to form structures at the same length scales as the native ECM, and thus enable the systematic study of cell behavior (for a review, see Ref. [24]).
The shape of the nanostructures influences cell behavior and phenotype. For example, nanogrooves [33–37] result in an alignment of cells parallel to the direction of the grooves [33, 38], as well as the alignment of actin, microtubules and other cytoskeletal elements [39–41]. Interestingly, both the pitch and the depth of the grooves influence cell behavior; for example, the degree of orientation typically increases with increasing depth of the nanogrooves [42]. Another feature that has been shown to influence cell behavior is the natural roughness of tissues. For example, endothelial cells that were cultured on ECM-textured replicas spread faster and had an appearance more like that of cells in their native arteries than did cells grown on nontextured surfaces [27, 43]. Fibroblasts cultured on nanopatterned e-PCL surfaces were also less spread compared to those on a planar substrate [44]. Furthermore, human mesenchymal stem cells (MSCs) and ESCs align on nanofabricated substrates and differentiate in a specific manner [45, 46]. Therefore, by controlling the nanotopography of tissue engineering scaffolds, inductive signals can be delivered to enhance tissue formation and function.
13.3
Nanomaterials Synthesized Using Bottom-Up Approaches
In this section we describe the synthesis and application of nanomaterials that are built by the nanoscale assembly of molecules, with properties that are often different from those of their individual components and of the bulk material. These materials include self-assembled peptide hydrogels, quantum dots, carbon nanotubes and layer-by-layer deposited films.
13.3.1
Self-Assembled Peptide Scaffolds
13.3.2
Layer-by-Layer Deposited Films

The ability to control the surface properties of biological interfaces is useful in various aspects of tissue engineering. One means of obtaining such controlled surfaces is the layer-by-layer (LBL) deposition of charged biopolymers. LBL deposition uses the electrostatic interaction between the surface and polyelectrolyte solutions to generate films with nanoscale dimensions. LBL has been used extensively to control the cellular microenvironment in vitro. For example, we have generated patterned cellular cocultures using the LBL deposition of the ionic biopolymers hyaluronic acid (HA), poly-L-lysine (PLL) [55] and collagen. In this approach, micropatterns of
toxicity, instead of PLL, other positively charged molecules such as ECM components (i.e. collagen) have also been used (Figure 13.4) [56]. In a related experiment, cultured human endothelial cells were patterned using LBL on a polyurethane surface. Here, it was observed that the cells did not spread on the negatively charged surface, owing to electrostatic repulsion, whereas inverting the surface charge by adding positively charged collagen increased cell spreading and proliferation. Thus, cell attachment on multilayer thin films may depend on the charge of the terminal polyion layer [57].
The number of layers in LBL films may also play a role in cell attachment behavior. It was reported that increasing the number of layers in titanium oxide nanoparticle thin films increased the surface roughness, cell attachment and the rate of cell spreading. Although this may be due to the increased surface roughness, it also demonstrates the potential of this technology for controlling cell–surface properties [58]. Furthermore, the LBL assembly of PLL and dextran sulfate could be used to increase the rate of fibronectin deposition, and the subsequent cell adhesion, relative to control substrates [59].
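The alternating-charge principle behind LBL assembly can be summarized in a few lines of code. The sketch below is purely illustrative: the function name and the nominal per-bilayer thickness are hypothetical, since real layer thicknesses depend on the polyelectrolytes, the ionic strength and the pH, and are usually measured (e.g. by ellipsometry) rather than assumed.

```python
def lbl_stack(polycation: str, polyanion: str, n_bilayers: int,
              bilayer_thickness_nm: float = 4.0):
    """Return the deposition sequence and a nominal thickness for an
    alternating layer-by-layer film. Each adsorption step overcompensates
    the surface charge, flipping its sign and allowing the next,
    oppositely charged layer to bind."""
    layers = []
    for _ in range(n_bilayers):
        layers.append(polycation)  # adsorbs onto the negatively charged surface
        layers.append(polyanion)   # adsorbs onto the now-positive surface
    return layers, n_bilayers * bilayer_thickness_nm

# e.g. ten PLL/HA bilayers, terminating in the polyanion (HA):
stack, thickness_nm = lbl_stack("PLL", "HA", 10)
print(len(stack), "layers, roughly", thickness_nm, "nm thick")
```

The point of this representation is the last layer: as noted above, cell attachment can depend on the charge of the terminal polyion, which here is simply stack[-1].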
The LBL deposition of materials has been used in a variety of tissue engineering applications. For example, the LBL assembly of HgTe has been used to fabricate a hybrid device in which the absorption of light by quantum nanoparticles stimulates neural cells through a sequence of photochemical and charge-transfer reactions [60]. These devices may be of potential use in tissue engineering applications in which it is desired to stimulate nerve cells using external cues. LBL assemblies of nanoparticles have also been explored as a means of protecting arteries damaged during revascularization procedures. It has been reported that the deposition of self-assembled nanocoatings, comprising alternating depositions of HA and chitosan, onto aortic porcine arteries led to a significant inhibition of thrombus growth on the damaged arterial surfaces. Clearly, this technique has the potential for clinical application to protect damaged arteries and to prevent subsequent restenosis [61]. Therefore, by properly choosing the LBL materials it is possible to modify the surface properties of materials for tissue engineering, as well as for biosensing [62, 63] and drug delivery applications [64].

Mironov and colleagues have used the LBL technique for organ printing, with precise control over the spatial position of the deposited cells [65]. LBL deposition can also be used for the fabrication of immunosensors [66], for islet cell encapsulation [67], and for polyelectrolyte capsules for drug release [68].
13.3.3
Carbon Nanotubes
Carbon nanotubes are nanomaterials with unique mechanical and chemical properties. They have been used for cell tracking, for the delivery of desired molecules to
cells, and as components of tissue engineering scaffolds [69]. Carbon nanotubes,
depending on the number of carbon walls, can range from 1.5 to 30 nm in diameter
and may be hundreds of nanometers in length.
Within tissue engineering scaffolds, carbon nanotubes can be used to modify the mechanical and chemical properties. Furthermore, carbon nanotubes can be functionalized with biomolecules to signal the surrounding cells, or, owing to their high electrical conductivity, they may be electrically stimulated to excite tissues such as muscle and nerve. One potentially powerful method of integrating carbon nanotubes into tissue engineering is to generate composite materials which combine a biocompatible material, such as collagen, with embedded single-walled carbon nanotubes. As an example, smooth muscle cells (SMCs) have been encapsulated within collagen–carbon nanotube composite matrices with high cell viability (>85%) for at least 7 days [70]. Single-walled carbon nanotubes can also be used for culturing excitable tissues such as neuronal and muscle cells [71]. It has also been suggested that the growth of neuronal circuits on carbon nanotubes might result in a significant increase in network activity and an increase in neural transmission, perhaps due to the high electrical conductivity of the carbon nanotubes [72]. Furthermore, the electrical stimulation of osteoblasts cultured in nanocomposites comprising PLA and carbon nanotubes increased their proliferation and the deposition of calcium after 21 days. These data show that the use of novel current-conducting nanocomposites would be valuable for enhancing the function of osteoblasts, and also open useful avenues in bone tissue engineering [73].
Carbon nanotubes have also been used to deliver pharmaceutical drugs [74–76], genetic material [77–79] and biomolecules such as proteins [80, 81] to various cell types. For example, carbon nanotubes have been used to deliver amphotericin B to fungal-infected cells. Here, the amphotericin B was bonded covalently to the carbon nanotubes and was taken up by mammalian cells without significant toxicity, while maintaining its antifungal activity [82]. Thus, carbon nanotubes may be used for the delivery of antibiotics to specific cells.
The influence of carbon nanotubes on cells varies depending on their type and surface properties. For example, it has been reported that rat osteosarcoma (ROS) 17/2.8 cells, when cultured on carbon nanotubes carrying a neutral electric charge, proliferated to a greater extent than control cells [83]. Chemical modification of the surface of carbon nanotubes can also be used to enhance their cytocompatibility. For example, carbon nanotubes coated with bioactive 4-hydroxynonenal promoted neuronal branching in cultured embryonic rat brain neurons compared to unmodified carbon nanotubes [84]. Despite such promise, however, the cytotoxicity of carbon nanotubes remains unclear. It is well known that various properties, such as surface modifications and size, greatly influence the potential toxicity of these structures. For example, long carbon nanotubes have been shown to generate a greater degree of inflammation in rats than shorter carbon nanotubes (200 nm), which suggests that the smaller particles may be engulfed more easily by macrophages [85]. Other studies have shown that carbon nanotubes may not only inhibit cell growth [86] but also induce pulmonary injury in the mouse model [87], where sequential exposure to carbon nanotubes and bacteria enhanced pulmonary inflammation and infectivity. Thus, more extensive and systematic studies must be conducted to ensure that the use of these nanomaterials in tissue engineering does not result in long-term toxicity.
13.3.4
MRI Contrast Agents
Although MRI contrast agents take many forms, nanoparticle systems have emerged in recent years as among the most promising, as nanoparticles not only provide an enormous surface area but can also be functionalized with targeting ligands and magnetic labels. Moreover, their small size allows them to permeate readily across blood capillaries.
Iron oxide nanoparticles have shown great promise for use in MRI to track cells, because they can be taken up without compromising cell viability and are relatively safe. A wide variety of iron oxide-based nanoparticles have been developed that differ in hydrodynamic particle size and surface-coating material (dextran, starch, albumin, silicones) [91]. In general, these particles are categorized by diameter into superparamagnetic iron oxides (SPIOs; 50–500 nm) and ultrasmall superparamagnetic iron oxides (USPIOs; <50 nm), with the size dictating their physico-chemical and pharmacokinetic properties. It has also been shown that the clearance of iron oxide nanoparticles in the rat liver depends on the outer coating [92].
Iron oxide nanoparticles have been used for imaging various organs, including the gastrointestinal tract, liver, spleen and lymph nodes. Furthermore, smaller-sized particles can also be used for angiography and perfusion imaging in myocardial and neurological diseases. Iron oxide particles can be coated with various molecules to increase their circulation and targeting. For example, dextran-coated iron oxide nanoparticles have been used for labeling cells, while anionic magnetic nanoparticles can be used to target positively charged tissues through electrostatic interactions [93] (Figure 13.5). In addition, iron oxide nanoparticles can be used to track cells in vivo after transplantation. For example, MSCs and other mammalian cells labeled with SPIO nanoparticles have been used to track cells in both experimental and clinical settings [94, 95]. Furthermore, green fluorescent protein (GFP)-expressing ESCs that were labeled with dextran-coated iron oxide nanoparticles and implanted into the brains of rats with brain stroke showed that the cells could be tracked for at least three weeks [96]. The in vivo tracking of iron oxide nanoparticle-labeled rat bone marrow MSCs, mouse ESCs and human CD34+ hematopoietic progenitor cells in rats with a cortical or spinal cord lesion has also shown that cells may remain visible in the lesion for at least 50 days [97, 98]. Taken together, these and other results [99] indicate that magnetic nanoparticles are well suited for the noninvasive analysis of cell migration, engraftment and morphological differentiation at high spatial and temporal resolution.
In order to target desired cells or to modify the rate of cellular uptake, nanoparticles may be engineered with specific molecules on their surfaces. For example, in order to increase their internalization, NSCs and CD34+ bone marrow cells were labeled with superparamagnetic nanoparticles that were conjugated with short HIV-Tat peptides. This increased the internalization of the particles by the cells without affecting their viability, differentiation or proliferation. The localization and retrieval of cell populations in vivo enabled a detailed analysis of specific stem cell and organ interactions that are critical for advancing the therapeutic use of stem cells [100].
In addition to iron oxide nanoparticles, other types of nanoparticle have also been used for tissue imaging, notably with applications in tissue engineering. As an example, fluorescein isothiocyanate (FITC)-conjugated mesoporous silica nanoparticles (MSNs) have been used to label human bone marrow MSCs and 3T3-L1 cells. The FITC-MSNs were efficiently internalized into MSCs and 3T3-L1 cells, even with short-term incubation (24 h), without affecting cell viability [101]. Thus, it seems that nanoparticles can potentially be used not only to track cells but also to image tissues, which may be useful for the noninvasive imaging of tissue-engineered constructs.
13.3.5
Quantum Dots
Nanoscale probes can also be used in tissue engineering applications for the study of various biological processes, as well as for real-time cell detection and tracking. Fluorescent dyes, which traditionally have been used to image cells and tissues, have several drawbacks, including photobleaching and a lack of long-term stability. Quantum dots (QDs) are nanoparticles that have several advantages over conventional fluorophores for imaging, including tunable optical properties and a resistance to photobleaching [6]. QDs are semiconductor nanostructures that confine the motion of conduction-band electrons, valence-band holes or excitons in all three spatial directions. The band gap energy of the QD is the energy difference between the valence band and the conduction band. For nanoscale semiconductor particles such as QDs, the band gap depends on the size of the nanocrystal, which results in a size-dependent variation in emission. A single light source can also be used for the simultaneous excitation of a spectrum of emission wavelengths, which makes the method useful for multicolor, multiplexed biological detection and imaging applications.
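The size dependence of the band gap is commonly rationalized with the effective-mass ('Brus') approximation; this equation is standard semiconductor-nanocrystal background rather than a result from this chapter:

```latex
E(R) \approx E_g^{\mathrm{bulk}}
+ \frac{\hbar^{2}\pi^{2}}{2R^{2}}\left(\frac{1}{m_e^{*}} + \frac{1}{m_h^{*}}\right)
- \frac{1.8\,e^{2}}{4\pi\varepsilon_0\varepsilon_r R},
```

where R is the nanocrystal radius, m_e* and m_h* are the effective masses of the electron and hole, and ε_r is the dielectric constant of the material. The confinement term scales as 1/R², so smaller dots have a wider gap and a blue-shifted emission, which is the origin of the size-tunable colors exploited for multiplexed imaging.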
QDs can be used for the ultrasensitive imaging of molecular targets in deep tissue and living animals (Figure 13.6). Here, they are used as specific markers for cellular structures [102, 103] and molecules [104], for monitoring physiological events in live cells [105–107], for measuring cell motility [108], and for monitoring RNA delivery and tracking cells [109] in vivo. As an example, QDs have been used for locating multiple distinct endogenous proteins within cells, thus determining the precise protein distribution in a high-throughput manner [110]. Peptide ligand-conjugated QDs have also been used for imaging G-protein-coupled receptors in both whole cells and as single molecules [111]. Cellular events such as the transport of lipids and proteins across membranes have also been tracked using QDs with molecular resolution in live cells [112]. Furthermore, QDs conjugated to immunoglobulin G (IgG) and streptavidin have been used to label the breast cancer marker Her2 on the surface of fixed and live cancer cells [113].
QDs have significant potential for analyzing the mechanisms of cell growth, apoptosis, cell–cell interactions, cell differentiation and inflammatory responses. For example, QDs have been used to study the signaling pathways of mast cells during an inflammatory response [114], as well as to quantify changes in organelle morphology during apoptosis [115]. In addition, the photostability and biocompatibility of QDs make them preferred agents for the long-term tracking of live cells [116]. QDs are internalized into cells by endocytosis [117], by receptor-mediated uptake [118], by peptide-mediated transport [119, 120] or by microinjection [121]. An example of this was recently demonstrated in studies in which ligand-conjugated QDs were used to monitor antigen binding, entry and trafficking in dendritic cells [122]. QDs conjugated to a transporter protein have also been used to label malignant and nonmalignant hematological cells and to track cell division, thus enabling lineage tracking [109].
Despite the remarkable potential for the application of QDs in clinical medicine, their toxicity and long-term adverse effects are still not clearly understood. The metabolism, excretion and toxicity of QDs may depend on multiple factors, such as size, charge, concentration and outer-coating bioactivity, as well as their oxidative, photolytic and mechanical stabilities and other, unknown factors [123]. Importantly, these issues must be addressed before QDs can be used for in vivo applications in humans.
13.4
Future Directions
critical barrier for their use in humans. Traditionally, tissue engineers have favored materials that have a long history of medical application (i.e. are FDA approved), although many such materials have limitations to be overcome, perhaps through rational design enabled by nanotechnology. Thus, there remains a clear need to develop nanoengineered materials capable of addressing the various challenges of tissue engineering. In addition, systematic toxicity studies must be conducted in order to fully optimize and characterize not only the function but also the long-term behavior of nanomaterials in vivo. Yet, many of the traditional methods used to analyze previous generations of biomaterials do not apply to nanoscale materials, and a clear paradigm shift is required in these analytical and standardization procedures. This will range from how we study material–cell interactions in vitro and in vivo, to the standardization requirements of regulatory bodies such as the FDA. Clearly, these modifications will require extensive discussion amongst scientists, patients, the general public, clinicians and regulatory officers.
13.5
Conclusions
Acknowledgments
The authors greatly appreciate the helpful discussions with Drs Hossein Hosseinkhani and Hossein Baharvand. They also acknowledge generous funding from the Draper Laboratory, CIMIT, the NIH and the Coulter Foundation.
References
1 Khademhosseini, A. and Langer, R.
(2006) Nanobiotechnology for Tissue
Engineering and Drug Delivery. Chemical
Engineering Progress, 102, 38–42.
2 Sengupta, S. and Sasisekharan, R. (2007)
Exploiting nanotechnology to target
40 Wojciak-Stothard, B., Madeja, Z., Korohoda, W., Curtis, A. and Wilkinson, C. (1995) Activation of macrophage-like cells by multiple grooved substrata. Topographical control of cell behaviour. Cell Biology International, 19, 485–490.
41 Oakley, C. and Brunette, D.M. (1995) Response of single, pairs, and clusters of epithelial cells to substratum topography. Biochemistry and Cell Biology, 73, 473–489.
42 Webb, A., Clark, P., Skepper, J., Compston, A. and Wood, A. (1995) Guidance of oligodendrocytes and their progenitors by substratum topography. Journal of Cell Science, 108 (Pt 8), 2747–2760.
43 Flemming, R.G., Murphy, C.J., Abrams, G.A., Goodman, S.L. and Nealey, P.F. (1999) Effects of synthetic micro- and nano-structured surfaces on cell behavior. Biomaterials, 20, 573–588.
44 Gallagher, J.O., McGhee, K.F., Wilkinson, C.D. and Riehle, M.O. (2002) Interaction of animal cells with ordered nanotopography. IEEE Transactions on NanoBioscience, 1, 24–28.
45 Yim, E.K. and Leong, K.W. (2005) Significance of synthetic nanostructures in dictating cellular response. Nanomedicine, 1, 10–21.
46 Yim, E.K., Wen, J. and Leong, K.W. (2006) Enhanced extracellular matrix production and differentiation of human embryonic germ cell derivatives in biodegradable poly(epsilon-caprolactone-co-ethyl ethylene phosphate) scaffold. Acta Biomaterialia, 2, 365–376.
47 Hartgerink, J.D., Beniash, E. and Stupp, S.I. (2001) Self-assembly and mineralization of peptide-amphiphile nanofibers. Science, 294, 1684–1688.
48 Zhang, S. (2002) Emerging biological materials through molecular self-assembly. Biotechnology Advances, 20, 321–339.
49 Zhang, S. (2003) Fabrication of novel biomaterials through molecular self-assembly. Nature Biotechnology, 21, 1171–1178.
14
Self-Assembling Peptide-Based Nanostructures for Regenerative
Medicine
Ramille M. Capito, Alvaro Mata, and Samuel I. Stupp
14.1
Introduction
The goal of regenerative medicine is to develop therapies that can promote the growth
of tissues and organs in need of repair as a result of trauma, disease or congenital
defects. For most of the patient population this means regeneration of our bodies in
adulthood, although there are also many critical pediatric needs in regenerative
medicine. One specific target that would deeply impact the human condition is
regeneration of the central nervous system (CNS). This would bring a higher quality
of life to individuals paralyzed as a result of spinal cord injury, brought into serious
dysfunction by stroke, afflicted with Parkinson's and Alzheimer's diseases, or those
blind as a result of macular degeneration or retinitis pigmentosa. Another area
that would benefit from regenerative medicine is heart disease, which continues to be
one of the most dominant sources of premature death in humans. Here, the potential
to regenerate myocardium would have a great impact on clinical outcomes. Many
additional important targets exist. The regeneration of insulin-producing pancreatic
β cells would bring a higher quality of life to individuals suffering from diabetes. Damage to cartilage, a critical tissue in correct joint function, is an enormous source of pain and compromised agility for many individuals, especially in societies that value a physically active lifestyle for as long as possible. Other musculoskeletal
tissues such as bone, intervertebral disc, tendon, meniscus and ligament all remain
major therapeutic challenges in regenerative medicine. Another emerging target that
could have an enormous impact is the regeneration of teeth, as this would prevent the
need for dentures and other dental implants. All of these important targets in
regenerative medicine would not only raise the quality of life for many individuals
worldwide, but they would also have, for obvious reasons, a significant economic
impact.
The development of effective regenerative medicine strategies generally
includes the use of cells, soluble regulators (e.g. growth factors or genes) and
scaffold technologies. In their natural environment, mammalian cells live surrounded by a form of solid or fluid matrix composed of structural protein fibers (i.e. collagen and elastin), adhesive proteins (i.e. fibronectin and laminin), soluble proteins (i.e. growth factors) and other biopolymers (i.e. polysaccharides), all of which have specific inter-related roles in the structure and normal function of the extracellular matrix (ECM). The creation of biomimetic artificial matrices represents a common theme in designing materials for regenerative medicine therapies, and stems from the idea that providing a more natural three-dimensional (3-D) environment can preserve cell viability and encourage cell differentiation and matrix synthesis. The nanoscale design of biomaterials, with particular attention to dimension, shape, internal structure and surface chemistry, may more effectively emulate the very sophisticated architecture and signaling machinery of the natural ECM for improved regeneration.
Strategies utilizing self-assembled supramolecular aggregates, macromolecules
and even inorganic particles could be used to design a signaling machinery de novo
that initiates regeneration events which do not occur naturally in mammalian
biology. Self-assembly, a bioinspired phenomenon which involves the spontaneous association of disordered components into well-defined and functionally organized structures [1], can play a major role in creating sophisticated and biomimetic biomaterials for regenerative therapies [2–7]. In molecular systems, self-assembly implies that molecules are programmed by design to organize spontaneously into supramolecular structures held together through noncovalent interactions, such as electrostatic or ionic interactions, hydrogen bonding, hydrophobic interactions and van der Waals interactions. Large collections of these bonds, which are relatively weak compared to covalent bonds, can result in very stable structures.
The first fundamental reason for a link between self-assembly and regenerative medicine is the potential to create multifunctional artificial forms of an ECM starting
with liquids. Such liquids could contain dissolved molecules or pre-assembled
nanostructures, and they could then be introduced by injection at a specic site or
targeted through the circulation. Following self-assembly, a solid matrix could
mechanically support cells and also signal them for survival, proliferation, differentiation or migration. Alternatively, the self-assembled solid matrix could be designed
to recruit specic types of cells in order to promote a regenerative biological event,
or serve as cell delivery vehicles by localizing them in 3-D environments within
tissues and organs. These self-assembling molecules could also be used to modify the
surfaces of solid implants in order to render them bioactive [1, 8, 9]. The bottom-up
approach that is possible using self-assembly can permit the creation of an architecture that multiplexes signals or tunes their concentration per unit area. This
versatility makes self-assembling systems ideal for creating optimal materials for
regenerative medicine therapies.
In this chapter, we focus on the use of self-assembling nanostructures, in particular peptide-based molecules, which are currently being developed for regenerative medicine applications. Although many of these technologies are relatively new, much very promising biological data, both in vitro and in vivo, demonstrating their potential have already been obtained.
14.2
Self-Assembling Synthetic Peptide Scaffolds
Peptides are among the most useful building blocks for creating self-assembled
structures at the nanoscale; they possess the biocompatibility and chemical versatility
that are found in proteins, yet they are more chemically and thermally stable [10].
They can also be easily synthesized on a large scale by using conventional chemical
techniques, and designed to contain functional bioactive sequences. A variety of short
peptide molecules have been shown to self-assemble into a wide range of supramolecular structures including nanofibers, nanotubes, nanospheres and nanotapes.
Some self-assembling nanostructures have been used successfully to generate
injectable scaffolds with an extremely high water content and architectural features
that mimic the natural structure of the ECM. These self-assembling scaffolds show
great potential as 3-D environments for cell culture and regenerative medicine
applications, and also as vehicles for drug, gene or protein delivery.
14.2.1
β-Sheet Peptides
Aggeli and colleagues demonstrated that the biological peptide β-sheet motif can be used to design oligopeptides that self-assemble into semi-flexible β-sheet nanotapes [11]. Depending on the intrinsic chirality of the peptides and on the concentration, these nanostructures can further assemble into twisted ribbons (double tapes), fibrils (twisted stacks of ribbons) and fibers (fibrils entwined edge-to-edge) (Figure 14.1a) [12]. The assembly process is principally driven by hydrogen bonding along the peptide backbone and by interactions between specific side chains [13]. At sufficiently high peptide concentrations, these structures can become entangled to form gels, the viscoelastic properties of which can be altered by controlling the pH, by applying a physical (shear) stress, or by altering the peptide concentration. As in other peptide self-assembling systems, the hierarchical assembly can be altered by the addition and position of charged amino acids within the peptide sequence, whose charge state is highly controlled by changes in pH [12] (Figure 14.1b and c). It has also been shown that mixing aqueous solutions of cationic and anionic peptides that have complementary charged side chains and a propensity to form antiparallel β-sheets results in the spontaneous self-assembly of fibrillar networks and hydrogels that are robust to variations in pH and peptide concentration [14].
These β-sheet peptide nanostructures have been studied for the treatment of enamel decay [13]. In vitro, extracted human premolar teeth (containing caries-like lesions) were exposed to several cycles of demineralizing (acidic conditions) and remineralizing (neutral pH conditions) solutions. Application of the self-assembling peptides to the defects significantly decreased demineralization during exposure to the demineralizing solution.
14.2.2
β-Hairpin Peptides
Another peptide design that exploits β-sheet nanostructure formation into a hydrogel network is composed of strands of alternating hydrophilic and hydrophobic residues flanking an intermittent tetrapeptide [16–21] (Figure 14.2a and b). These peptides are designed so that they are fully dissolved in aqueous solutions in random coil conformations. Under specific stimuli, the molecules can be triggered to fold into a β-hairpin conformation that undergoes rapid self-assembly into a β-sheet-rich, highly crosslinked hydrogel. The molecular folding event, in which one face of the β-hairpin structure is lined with hydrophobic valines and the other with hydrophilic lysines, is governed by the arrangement of polar and nonpolar residues within the sequence. Subsequent self-assembly of the individual hairpins occurs by hydrogen bonding between distinct hairpins and by hydrophobic association of the hydrophobic faces. One such peptide was designed to self-assemble under specific pH conditions [16]. Under basic conditions, the peptide intramolecularly folds into the hairpin structure and forms a hydrogel. Unfolding of the hairpins and dissociation of the hydrogel structure can be triggered when the pH is subsequently lowered below the pKa of the lysine side chains, where unfolding is a result of the intrastrand charge repulsion between the lysine residues [16]. Rheological studies indicate that these β-hairpin hydrogels are both rigid and shear-thinning; however, the mechanical strength is quickly regained after cessation of shear [16] (Figure 14.2a).
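The pH trigger described above can be made quantitative with the Henderson–Hasselbalch relation. The following minimal sketch is an added illustration; the pKa of 10.5 is a generic solution value for the lysine side chain, not a number reported in this chapter:

# Fraction of lysine side chains that are protonated (positively charged)
# as a function of pH, from the Henderson-Hasselbalch equation.
# pKa = 10.5 is a generic textbook value for the free lysine side chain.

def protonated_fraction(pH, pKa=10.5):
    # [BH+] / ([BH+] + [B]) = 1 / (1 + 10**(pH - pKa))
    return 1.0 / (1.0 + 10.0 ** (pH - pKa))

for pH in (7.0, 9.0, 10.5, 12.0):
    print(f"pH {pH:4.1f}: {protonated_fraction(pH):6.1%} of lysine side chains charged")

# Well below the pKa essentially all side chains are charged, so intrastrand
# repulsion unfolds the hairpins and the gel dissociates; near and above the
# pKa the charge is removed and folding/self-assembly is favored.

Note that in the assembled, crowded state the effective pKa can shift, so this is only a qualitative guide.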
These gels can also be triggered to self-assemble when the charged amino acid residues within the sequence are screened by ions [22]. If a positively charged side chain of lysine is replaced by a negatively charged side chain of glutamic acid, the overall peptide charge is decreased and the peptide can be more easily screened, resulting in a much faster self-assembly [21]. The kinetics of hydrogelation were found to be significant for the homogeneous distribution of encapsulated cells within these types of self-assembling gels [21] (Figure 14.2c). Thermally reversible, self-assembling peptides were also synthesized by replacing specific valine residues with threonines to render the peptides less hydrophobic [17]. At ambient temperature and slightly basic pH, the peptide is unfolded; however, upon heating, the peptide folds and self-assembles into a hydrogel.
Such gels have been used to encapsulate mesenchymal stem cells (MSCs) and hepatocytes [21]. Another study
also showed that these gels have an inherent antibacterial activity, with selective
toxicity to bacterial cells versus mammalian cells [23].
14.2.3
Block Copolypeptides
Deming and colleagues have developed diblock copolypeptide amphiphiles containing charged and hydrophobic segments that self-assemble into rigid hydrogels and can remain mechanically stable even at high temperatures (up to 90 °C) [24, 25] (Figure 14.3). These hydrogels were also found to recover rapidly after an applied stress, which was attributed to the relatively low molecular mass of the copolypeptides, enabling rapid molecular reorganization. Their amphiphilic characteristics, architecture (diblock versus triblock) and block secondary structure (e.g. α-helix, β-strand or random coil) were found to play important roles in the gelation, rheological and morphological properties of the hydrogel [24–26]. One type of block copolypeptide consists of a hydrophilic block of poly-L-lysine and a shorter hydrophobic block of poly-L-leucine [26]. The helical secondary structure of the poly-L-leucine blocks was shown to be instrumental for gelation, while the hydrophilic polyelectrolyte segments helped to stabilize the twisted fibril assemblies by forming a corona around the hydrophobic core [26] (Figure 14.3).
In vitro studies using mouse fibroblasts revealed that, at concentrations below gelation, lysine-containing diblocks were cytotoxic to the cells, whereas glutamic
acid-containing peptides were not cytotoxic [27]. In gel form, however, both lysine- and glutamic acid-based diblocks were noncytotoxic, although the scaffolds did not support cell attachment or proliferation. This demonstrates how molecular design and charge can significantly affect the cytotoxicity and biological activity of the resulting self-assembled material. Future research is directed towards covalently incorporating bioactive sites within these hydrogels in order to increase cellular attachment and enhance the biological response [27].
14.2.4
Ionic Self-Complementary Peptides
14.2.5
Fmoc Peptides
A more recently developed class of self-assembling peptides, based on fluorenylmethyloxycarbonyl (Fmoc)-protected di- and tripeptides, has been shown to form highly tunable hydrogel structures (Figure 14.5a). The formation of these gels can be achieved either by pH adjustment [48] (Figure 14.5b) or by the action of a reverse-hydrolysis enzyme [49] (Figure 14.5c). Assembly occurs via hydrogen bonding in a β-sheet arrangement and by π–π stacking of the fluorenyl rings, which also stabilizes the system [48] (Figure 14.5b). A number of sheets then twist together to form nanotubes (Figure 14.5d).
The results of in vitro studies indicated that these hydrogels can support chondrocyte survival and proliferation in both two and three dimensions [48]. It was also observed that cell morphology varied according to the nature of the molecular structure [48].
14.2.6
Peptide Amphiphiles
Peptide amphiphiles (PAs) are self-assembling molecules that also use hydrophobic and hydrophilic elements to drive self-assembly. There are different types of peptide amphiphile that can assemble into a variety of nanostructures such as spherical micelles, fibrils, tubes or ribbons [50]. One unique PA design, which forms high-aspect-ratio cylindrical nanofibers, has been exclusively studied during the past decade for regenerative medicine applications. These molecules are particularly distinguished from the other peptide systems described above in that their amphiphilic nature derives from the incorporation of a hydrophilic head group and a hydrophobic alkyl tail, as opposed to molecules consisting entirely of amino acid residues with resultant hydrophilic and hydrophobic faces.
Stupp and colleagues have developed a family of amphiphilic molecules that can self-assemble from aqueous media into 3-D matrices composed of supramolecular nanofibers [4–6, 9, 51–59]. These molecules consist of a hydrophilic peptide segment which is bound covalently to a highly hydrophobic alkyl tail of the kind found in ordinary lipid molecules. The alkyl tail can be located at either the C or the N terminus [51], and can also be constructed to contain branched structures [55]. In Stupp et al.'s specific design, the peptide region contains a β-sheet-forming peptide domain close to the hydrophobic segment and a bioactive peptide sequence (Figure 14.6a). Upon changes in pH or the addition of multivalent ions, the structure of these molecules drives their assembly into cylindrical nanofibers through hydrogen bonding into β-sheets and hydrophobic collapse of the alkyl tails away from the aqueous environment, creating nanofibers with a hydrophobic core (Figure 14.6b). This cylindrical nanostructure allows the presentation of high densities of bioactive epitopes at the surface of the nanofibers [6]; by contrast, if the peptides were assembled into twisted sheets or tubes, this type of orientational biological signaling would not be possible [7]. These systems can also be used to craft nanofibers containing two or more PA molecules that can effectively coassemble, thus offering the possibility of multiplexing different biological signals within a single nanofiber [56].
The presence of a net charge in the peptide sequence ensures that the molecules or small β-sheet aggregates remain dissolved in water, inhibiting self-assembly through Coulombic repulsion. Self-assembly and gelation are subsequently triggered when the charged amino acid residues are electrostatically screened or neutralized by pH adjustment, or by the addition of ions (Figure 14.6c–e). The growth of the nanofibers can therefore be controlled by changing the pH or by raising the concentration of screening electrolytes in the aqueous medium [7]. Growth and bundling of the nanofibers eventually lead to gelation of the PA solution. In vivo, the ion concentrations present in physiological fluids can be sufficient to induce the formation of PA nanostructures [54]. Thus, a minimally invasive procedure could be designed
with these systems through a simple injection of the PA solution that spontaneously
self-assembles into a bioactive scaffold at the desired site. Over time, the small
molecules composing the nanofibers should biodegrade into amino acids and lipids,
thus minimizing the potential problems of toxicity or immune response [54].
The results of both in vitro and in vivo studies have shown that these PA systems can
serve as an effective analogue of the ECM by successfully supporting cell survival and
attachment [60, 61], mediating cell differentiation [6] and promoting regeneration in
vivo [57]. In efforts to address neural tissue regeneration for the repair of a spinal cord
injury or treatment of extensive dysfunction as a result of stroke or Parkinson's
disease, Stupp and colleagues have designed PAs to display the pentapeptide epitope
isoleucine-lysine-valine-alanine-valine (IKVAV). This particular peptide sequence is
found in the protein laminin, and has been shown to promote neurite sprouting and
to direct neurite growth [6]. When neural progenitor cells were encapsulated within
this PA nanofiber network, the cells more rapidly differentiated into neurons
compared to using the protein laminin or the soluble peptide (Figure 14.7a
and b). The PA scaffold was also found to discourage the development of astrocytes,
a type of cell in the CNS which is responsible for the formation of glial scars that
prevent regeneration and recovery after spinal cord injury. In this same study, the
at the implant site, thus eliminating the need for exogenous growth factor supplementation altogether. Hartgerink et al. synthesized PAs with a combination of biofunctional groups, including a cell-mediated enzyme-sensitive site, a calcium-binding site and a cell-adhesive ligand [66]. The incorporation of an enzyme-specific cleavage site allows cell-mediated proteolytic degradation of the scaffold for cell-controlled migration and matrix remodeling. In vitro studies demonstrated that these PA scaffolds do degrade in the presence of proteases, and that the morphology of cells encapsulated within the nanofiber scaffolds was dependent on the density of the cell-adhesive ligand, with more elongated cells observed in gels with a higher adhesive ligand density [66].
To date, hundreds of peptide amphiphile nanofiber designs are known, including
those that nucleate hydroxyapatite with some of the crystallographic features found in
bone [4], increase the survival of cultured islets for the treatment of diabetes, bind to
various growth factors [51], contain integrin-binding sequences [61], incorporate
contrast agents for fate mapping of PA nanostructures [52], and have pro-apoptotic
sequences for cancer therapy, among many others. Research investigating the
development of hybrid materials using these versatile PA systems is also emerging.
For example, PA nanofibers were integrated within titanium foams as a means to
promote bone ingrowth or bone adhesion for improved orthopedic implant
fixation (Figure 14.9) [1]. Preliminary in vivo results implanting these PA–Ti hybrids
within bone defects in a rat femur demonstrated de novo bone formation around
and inside the implant, vascularization in the vicinity of the implant, and no cytotoxic
response [1]. Another type of hybrid system developed by Hartgerink et al. includes
hydrogels that contain a mixture of PA and phospholipid (Figure 14.10) [67].
The phospholipid inclusions within the PA nanostructure were found to modulate
Figure 14.10 (a) Chemical structure of the PA and (b) cross-section of a PA fiber and a PA fiber containing 6.25 mol% of lipid (yellow). Highlighted in pink are the PA molecules situated adjacent to the lipid molecules. (Reproduced with permission from Ref. [67]; © 2006, American Chemical Society.)
the peptide secondary structure as well as the mechanical properties of the hydrogel,
with little change in the nanostructure. This composite system enables the optimization of mechanical and chemical properties of the hydrogel by simple adjustment
of the PA to phospholipid ratios [67].
The ability to access new mechanisms to control self-assembly across the scales,
and not just at the nanostructure level, offers new possibilities for regenerative
therapies, as bioactive functions can be extended by design into microscopic and even macroscopic dimensions. One system involves the self-assembly of hierarchically ordered materials at an aqueous interface, resulting from the interaction between small, charged self-assembling PA molecules and oppositely charged high-molar-mass biopolymers [68]. A PA–polymer sac can be formed instantly by injecting the polymer directly into the PA solution (Figure 14.11a). The interfacial interaction between the two aqueous liquids allows the formation of relatively robust membranes with tailorable size and shape (Figure 14.11b), self-sealing and suturable sacs (Figure 14.11c–f), as well as continuous strings (Figure 14.11g). The membrane structure grows to macroscopic dimensions with a high degree of hierarchical order across the scales. Studies have demonstrated that the sac membrane is permeable to large proteins, and therefore can be successfully used to encapsulate cells (Figure 14.11h). In vitro studies of mouse pancreatic islets (Figure 14.11i) and human MSCs (Figure 14.11j) cultured within the sacs showed that these structures can support cell survival and can be effective 3-D environments for cell differentiation. The unique structural and physical characteristics of these novel systems offer significant potential in cell therapies, drug diagnostics and regenerative medicine applications.
14.3
Self-Assembling Systems for Surface Modification
Implantable materials are the essence of today's regenerative medicine. The ability to control these materials at the nanoscale has moved them from simple inert materials to biocompatible and bioactive materials [69]. The surface of a biomaterial is particularly important in regenerative medicine, as it is the first point of contact with the body. Whether it is presenting a biomimetic atmosphere, disguising a foreign body, or activating specific biological processes, the surface of an implant plays a crucial role and can determine its success or failure. One key advantage of self-assembly is the possibility to modify and tailor surfaces to elicit a specific biological response. In the following, we discuss the use of self-assembly to modify the properties of surfaces and 3-D structures.
14.3.1
Coatings on Surfaces
which offer a unique opportunity to recreate and study dynamic biological processes.
These types of surface can be achieved by controlling SAMs through different
switching mechanisms (i.e. electrical, electrochemical, photochemical, thermal, and
mechanical transduction [78, 79]) that organize specific ligands and peptides. By
using these techniques, SAMs of peptides such as EG3- and RGD-terminated
peptides have been used to study dynamic mechanisms controlling the adhesion
and migration of bovine capillary endothelial cells [80] and fibroblasts [81], respectively (Figure 14.12b). Another modification of traditional SAM patterning takes advantage of dip-pen nanolithography (DPN), which uses atomic force microscopy (AFM) tips dipped into alkanethiol inks to transfer molecules by capillary force onto the gold surface (Figure 14.12c) [82, 83]. A major advantage of this technique is that it can create patterns of SAMs down to 15 nm in lateral dimension, significantly surpassing the resolution of soft lithographic techniques [82]. These types of studies not only provide reproducible tools to engineer biomimetic cell microenvironments, but also offer great promise for a deeper understanding of cell behaviors that could then be applied to the design of materials and implants in regenerative medicine [69].
Figure 14.12 Approaches to create complex self-assembled monolayers (SAMs), including: (a) micro-contact printing to create adhesive patterns of SAMs to study cell mechanisms such as growth and apoptosis (reproduced with permission from Ref. [76]; © 1997, American Association for the Advancement of Science); (b) patterns of SAMs that can be
While SAMs rely on individual molecules or peptides to create single-layer coatings, more complex self-assembled structures such as tubes or fibers are also being used as surface modifiers. One such example is a class of organic self-assembled fibers, referred to as helical rosette nanotubes (HRNs), that have been used to coat and functionalize bone prosthetic biomaterials (Figure 14.13). This approach was recently used to coat titanium surfaces, and caused a significant enhancement of osteoblast adhesion in vitro [84]. These molecules self-assemble from a single bicyclic block resulting from the complementary hydrogen-bonding arrays of both guanine (G) and cytosine (C). This G/C motif serves as the building block that self-assembles in water to form a six-membered supermacrocycle (rosette) maintained by 18 H-bonds. Subsequent assembly of these rosettes forms hollow nanotubes that are 1.1 nm in diameter and up to millimeters in length [85]. The outer surface of the G/C motif can be further modified to present specific physical and chemical properties.
Figure 14.15 Peptide amphiphile (PA) molecules and self-assembling mechanism used to functionalize surfaces with improved molecular properties. A diacetylene photosensitive segment in the hydrophobic tail promotes PA polymerization and subsequent monolayer stability. (Reproduced with permission from Ref. [89]; © 2006, Elsevier Limited.)
14.4
Clinical Potential of Self-Assembling Systems
As discussed above, several preclinical studies have already shown great promise for
the use of self-assembling biomaterials in regenerative medicine. Particularly, in vivo
experiments using self-assembling peptide amphiphiles by Stupp and colleagues
have shown that these bioactive matrices can be specically designed to: (i) promote
angiogenesis (rat cornea model); (ii) promote regeneration of axons in a spinal cord
injury model (mouse and rat models), of cartilage (rabbit model), and of bone (rat
model); (iii) promote recovery of cardiac function after infarct (mouse model); and
(iv) show promise as treatments for Parkinson's disease (mouse models). It is expected that the self-assembly of supramolecular systems will, in time, lead to many effective regenerative medicine therapies, providing an excellent platform to design for bioactivity, harmless degradation with appropriate half-lives after providing a function, and noninvasive methods for clinical delivery:
• Design for bioactivity: it is possible to engineer these peptide-based, self-assembling systems to include various combinations of amino acid sequences that are bioactive and can enhance the regeneration process; that is, deliver growth factors, contain cell adhesion sequences, mimic the bioactivity of growth factors, and so on.
• Noninvasive methods for clinical delivery: the ability of these peptide-based molecules to self-assemble spontaneously allows for the administration of materials through noninvasive methods. For example, a solution of the self-assembling molecules could be injected into the defect site, after which gelation in vivo could be triggered by ions within the body.
14.5
Conclusions
Today, research into the development of self-assembling biomaterials for regenerative medicine continues to expand, the main aim being to achieve real improvements in the quality of life for mankind. Without strategies for regeneration, genomic data and personalized medicine will not have the significant impact that is being promised. It is important that therapies for regenerative medicine be not only highly effective and predictable, but also as noninvasive as possible, with the capacity to reach deep into problem areas of the heart, brain, skeleton, skin and other vital organs. It is for this reason that self-assembly at the nanoscale appears to be the most sensible technological strategy, whether to signal and recruit the organism's own cells or to manage the delivery of cell therapies to the correct sites after effective in vitro manipulation. The ability to design at both the nanoscale and the macroscale will open the door to vast possibilities for biomaterials and regenerative medicine, with materials that can be designed to multiplex the required signals, can be delivered in a practical and optimal manner, and can reach targets across barriers via the blood circulation. In addition, molecular self-assembly on the surfaces of implants may enhance the bioactivity and predictable biocompatibility of metals, ceramics, composites and synthetic polymers. Self-assembly is at the root of structure versus function in biology and, in the context of regenerative medicine technology, is the ultimate inspiration from Nature.
References
1 Sargeant, T.D., Guler, M.O.,
Oppenheimer, S.M., Mata, A., Satcher,
R.L., Dunand, D.C. and Stupp, S.I. (2008)
Biomaterials, 29, 161.
2 Hwang, J.J., Iyer, S.N., Li, L.S., Claussen,
R., Harrington, D.A. and Stupp, S.I. (2002)
Proceedings of the National Academy of
Sciences of the United States of America,
99, 9662.
3 Klok, H.A., Hwang, J.J., Iyer, S.N. and
Stupp, S.I. (2002) Macromolecules, 35, 746.
4 Hartgerink, J.D., Beniash, E. and
Stupp, S.I. (2001) Science, 294, 1684.
5 Hartgerink, J.D., Beniash, E. and
Stupp, S.I. (2002) Proceedings of the
National Academy of Sciences of the United
States of America, 99, 5133.
6 Silva, G.A., Czeisler, C., Niece, K.L.,
Beniash, E., Harrington, D.A., Kessler, J.A.
and Stupp, S.I. (2004) Science, 303, 1352.
7 Stupp, S.I. (2005) MRS Bulletin, 30, 546.
8 Stendahl, J.C., Li, L., Claussen, R.C. and
Stupp, S.I. (2004) Biomaterials, 25, 5847.
9 Harrington, D.A., Cheng, E.Y.,
Guler, M.O., Lee, L.K., Donovan, J.L.,
Claussen, R.C. and Stupp, S.I. (2006)
Journal of Biomedical Materials Research
Part A, 78, 157.
10 Gazit, E. (2007) Chemical Society Reviews,
36, 1263.
11 Aggeli, A., Bell, M., Boden, N., Keen, J.N.,
Knowles, P.F., McLeish, T.C.,
Pitkeathly, M. and Radford, S.E. (1997)
Nature, 386, 259.
12 Aggeli, A., Bell, M., Carrick, L.M.,
Fishwick, C.W., Harding, R., Mawer, P.J.,
Radford, S.E., Strong, A.E. and Boden, N.
(2003) Journal of the American Chemical
Society, 125, 9619.
13 Kirkham, J., Firth, A., Vernals, D.,
Boden, N., Robinson, C., Shore, R.C.,
Brookes, S.J. and Aggeli, A. (2007) Journal
of Dental Research, 86, 426.
14 Aggeli, A., Bell, M., Boden, N., Carrick,
L.M. and Strong, A.E. (2003) Angewandte
Chemie - International Edition in English, 42,
5603.
55 Guler, M.O., Soukasene, S., Hulvat, J.F.
and Stupp, S.I. (2005) Nano Letters, 5, 249.
56 Niece, K.L., Hartgerink, J.D., Donners, J.J.
and Stupp, S.I. (2003) Journal of the
American Chemical Society, 125, 7146.
57 Rajangam, K., Behanna, H.A., Hui, M.J.,
Han, X., Hulvat, J.F., Lomasney, J.W. and
Stupp, S.I. (2006) Nano Letters, 6, 2086.
58 Sone, E.D. and Stupp, S.I. (2004) Journal
of the American Chemical Society, 126,
12756.
59 Stendahl, J.C., Rao, M.S., Guler, M.O. and
Stupp, S.I. (2006) Advanced Functional
Materials, 16, 499.
60 Beniash, E., Hartgerink, J.D., Storrie, H.,
Stendahl, J.C. and Stupp, S.I. (2005) Acta
Biomaterialia, 1, 387.
61 Storrie, H., Guler, M.O., Abu-Amara, S.N.,
Volberg, T., Rao, M., Geiger, B. and
Stupp, S.I. (2007) Biomaterials, 28, 4608.
62 Tysseling-Mattiace, V.M., Sahni, V.,
Niece, K.L., Birch, D., Czeisler, C.,
Fehlings, M.G., Stupp, S.I. and Kessler,
J.A. (2008) The Journal of Neuroscience, 28,
3814.
63 Kapadia, M.R., Chow, L.W., Tsihlis, N.D.,
Ahanchi, S.S., Eng, J.W., Murar, J.,
Martinez, J., Popowich, D.A., Jiang, Q.,
Hrabie, J.A., Saavedra, J.E., Keefer, L.K.,
Hulvat, J.F., Stupp, S.I. and Kibbe, M.R.
(2008) Journal of Vascular Surgery, 47, 173.
64 Jiang, H., Guler, M.O. and Stupp, S.I.
(2007) Soft Matter, 3, 454.
65 Palmer, L.C., Velichko, Y.S., Olvera De La
Cruz, M. and Stupp, S.I. (2007)
Philosophical Transactions of the Royal
Society of London. Series A: Mathematical
and Physical Sciences, 365, 1417.
66 Jun, H., Yuwono, V., Paramonov, S.E. and
Hartgerink, J.D. (2005) Advanced
Materials, 17, 2612.
67 Paramonov, S.E., Jun, H.W. and
Hartgerink, J.D. (2006) Biomacromolecules,
7, 24.
68 Capito, R.M., Azevedo, H.S.,
Velichko, Y.S., Mata, A. and Stupp, S.I.
(2008) Science, 319, 1812.
69 Stupp, S.I., Donners, J.J.J.M., Li, L.S. and
Mata, A. (2005) MRS Bulletin, 30, 864.
1
Spin-Polarized Scanning Tunneling Microscopy
Mathias Getzlaff
1.1
Introduction and Historical Background
Until the 1980s, an idealized and rather unrealistic view prevailed in surface physics, for lack of techniques which allowed real-space imaging. During this time, surfaces were often assumed to be perfect; that is, imperfections such as step edges, dislocations or adsorbed atoms were neglected. Most of the important information was gained rather indirectly, by spatially averaging methods or by experimental techniques with insufficient resolution.
However, in 1982, with the invention of the scanning tunneling microscope by G.
Binnig and H. Rohrer [1, 2], the situation changed dramatically. This instrument
allowed, for the first time, the topography of surfaces to be imaged in real space with
both lateral and vertical atomic resolution.
Subsequently, a number of different spectroscopic modes were introduced which
provided additional access to electronic behavior, thus allowing the correlation of
topographic and electronic properties down to the atomic scale.
In 1988, Pierce considered the possibility of making the scanning tunneling
microscope sensitive towards the spin of tunneling electrons by using spin-sensitive
tip materials as a further development [3], and this was also predicted theoretically
by Minakov et al. [4]. As a step towards this aim, Allenspach et al. [5] replaced the
electron gun of a scanning electron microscope with a scanning tunneling microscope tip. Thus, in field emission mode the electrons impinged on a magnetic
surface, and the spin polarization of the emitted electrons was subsequently monitored; this, at least in principle, would allow magnetic imaging with nanometer
resolution.
However, it was the first direct realization by Wiesendanger et al. [6] that opened
the possibility of imaging the magnetic properties at atomic resolution. Moreover, the
importance of this proposal was not restricted only to basic studies but was also
applicable to research investigations. Meanwhile, a rapidly increasing interest
emerged from an industrial point of view, a concept which became even more
important when considering the need for dramatic increases in the storage density of
devices such as computer hard drives. Clearly, further developments in this area will
require tools that allow high spatial resolution magnetic imaging for an improved
understanding of nanoscaled objects such as magnetic domains and domain wall
structures.
In this chapter, we will describe the successful development and implementation
of spin-polarized scanning tunneling microscopy (SP-STM), and will also show by
means of selected examples how our understanding of surface magnetic behavior
has vastly increased in recent years.
1.2
Spin-Polarized Electron Tunneling: Considerations Regarding Planar Junctions
[Figure: schematic of a planar FM 1/insulator/FM 2 tunnel junction, and of ferromagnetic nanoparticles embedded in an insulating matrix.]
In this situation, the tunneling current for a parallel magnetization is given by:

I_{\uparrow\uparrow} \propto n_1^{\uparrow} n_2^{\uparrow} + n_1^{\downarrow} n_2^{\downarrow}    (1.1)

with n_i being the electron density of electrode i at the Fermi level E_F. For the antiparallel orientation, the tunneling current amounts to:

I_{\uparrow\downarrow} \propto n_1^{\uparrow} n_2^{\downarrow} + n_1^{\downarrow} n_2^{\uparrow}    (1.2)

With a_i = n_i^{\uparrow} / (n_i^{\uparrow} + n_i^{\downarrow}) being the part of majority electrons of electrode i, and 1 - a_i = n_i^{\downarrow} / (n_i^{\uparrow} + n_i^{\downarrow}) that of the minority electrons, the spin polarization P_i of electrode i is given by:

P_i = (n_i^{\uparrow} - n_i^{\downarrow}) / (n_i^{\uparrow} + n_i^{\downarrow}) = 2 a_i - 1    (1.3)

Rewritten in terms of the spin polarizations, the two tunneling currents become:

I_{\uparrow\uparrow} \propto 1 + P_1 P_2    (1.4)

I_{\uparrow\downarrow} \propto 1 - P_1 P_2    (1.5)
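As a quick numerical cross-check of Equations 1.1–1.5 (an added illustration; the densities of states below are arbitrary example values, not data from this chapter), the magnetoresistance computed directly from the currents coincides with the polarization expression derived below:

# Illustrative check of the Julliere relations, Equations 1.1-1.5 and 1.7.
# All densities of states are arbitrary example values.

def polarization(n_up, n_dn):
    # Spin polarization P = (n_up - n_dn) / (n_up + n_dn), Equation 1.3.
    return (n_up - n_dn) / (n_up + n_dn)

n1_up, n1_dn = 0.7, 0.3   # electrode 1 -> P1 = 0.4
n2_up, n2_dn = 0.6, 0.4   # electrode 2 -> P2 = 0.2

I_pp = n1_up * n2_up + n1_dn * n2_dn   # parallel magnetizations, Eq. 1.1
I_ap = n1_up * n2_dn + n1_dn * n2_up   # antiparallel magnetizations, Eq. 1.2

P1, P2 = polarization(n1_up, n1_dn), polarization(n2_up, n2_dn)

tmr_from_currents = (I_pp - I_ap) / I_ap              # = (R_ap - R_p)/R_p
tmr_from_polarizations = 2 * P1 * P2 / (1 - P1 * P2)  # Eq. 1.7 below

print(f"P1 = {P1:.2f}, P2 = {P2:.2f}")
print(f"TMR from currents:      {tmr_from_currents:.4f}")
print(f"TMR from polarizations: {tmr_from_polarizations:.4f}")  # identical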
Figure 1.3 Dependence of the tunneling conductance (inverse resistance) of a planar Fe–Al2O3–Fe junction on the angle Θ between the magnetization vectors of both electrodes. (Data taken from Ref. [9].)
The tunneling magnetoresistance of the junction is characterized by the relative resistance change between the antiparallel and parallel configurations:

\Delta R / R_p = (R_{ap} - R_p) / R_p = (G_{\uparrow\uparrow} - G_{\uparrow\downarrow}) / G_{\uparrow\downarrow}    (1.6)

which, expressed in terms of the spin polarizations of the two electrodes, yields:

\Delta R / R_p = 2 P_1 P_2 / (1 - P_1 P_2)    (1.7)
In the above considerations, it has been assumed that the magnetizations in both ferromagnetic electrodes are oriented parallel or antiparallel. However, the differential conductance also depends on the angle Θ between the two directions of magnetization. This behavior is shown in Figure 1.3 for an Fe–Al2O3–Fe junction. Thus, until now the situation has been discussed for Θ = 0° and Θ = 180°, respectively. For an arbitrary angle Θ, the differential conductance can be expressed as:

G = G_0 (1 + P_1 P_2 \cos\Theta)    (1.8)
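The angular formula can be checked against the limiting cases in the same way. A brief added sketch, again with assumed example polarizations:

# Angular dependence of the conductance, Eq. 1.8: G = G0*(1 + P1*P2*cos(theta)).
# G0, P1 and P2 are assumed example values.
import math

G0, P1, P2 = 1.0, 0.4, 0.4

for theta_deg in (0, 90, 180, 270, 360):
    G = G0 * (1 + P1 * P2 * math.cos(math.radians(theta_deg)))
    print(f"theta = {theta_deg:3d} deg  ->  G = {G:.3f} * G0")

# theta = 0 and 180 deg reproduce the parallel and antiparallel conductances,
# and (R_ap - R_p)/R_p recovers Eq. 1.7 = 2*P1*P2/(1 - P1*P2):
G_p, G_ap = G0 * (1 + P1 * P2), G0 * (1 - P1 * P2)
print(f"(R_ap - R_p)/R_p = {G_p / G_ap - 1:.4f}")

The cosine dependence is what the data shown in Figure 1.3 follow.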
1.3
Spin-Polarized Electron Tunneling in Scanning Tunneling Microscopy (STM):
Experimental Aspects
The substitution of a ferromagnetic electrode (as discussed above) with a ferromagnetic probe tip represents the situation in SP-STM. The insulating barrier is realized by the vacuum between the sample and the tip, which are separated by a distance of several ångströms, thus allowing the laterally resolved determination of magnetic properties. As a consequence, the zero-bias anomaly (that is, the decrease in the TMR with increasing bias voltage in planar junctions) is not present in SP-STM investigations, because the anomaly can be attributed to scattering of electrons at defects in amorphous barriers [10].
The following sections describe the two fundamental experimental aspects concerning spin-polarized electron tunneling in an STM experiment. Here, different probe tips and modes of operation are employed in order to obtain magnetic information from a sample.
1.3.1
Probe Tips for Spin-Polarized Electron Tunneling
In order to realize SP-STM, the probe tip should fulfill most of the following conditions:
• The higher the spin polarization of the apex atom, the more pronounced the magnetic information (cf. Equations 1.4–1.7) in comparison to the electronic and topographic information. Due to the typical reduction of this spin polarization by adsorbates from the residual gas, even under ultra-high vacuum (UHV) conditions, a clean environment or an inert tip material is certainly advantageous [11].
• The interaction between tip and sample should be as low as possible, because the stray field of a ferromagnetic tip may modify or destroy the sample's domain structure.
• Controlling the orientation of the magnetization axis of the tip, parallel or perpendicular to the sample surface, allows the domain structure of any sample to be imaged, independent of whether its easy axis is in-plane or out-of-plane.
tip surface and the magnetic overlayer. While the overall shape of the tip [as shown in the scanning electron microscopy (SEM) overview image of Figure 1.5a] remains almost unaffected by the high-temperature treatment, the high-resolution SEM image shown in Figure 1.5b reveals that the tip diameter is increased to about 1 μm, most likely due to melting of the tip apex. Following this high-temperature treatment, the tips are coated with a magnetic film exhibiting a thickness of several monolayers (MLs). In contrast to bulk tips, the magnetization direction is then no longer governed by the shape anisotropy of the tip; the magnetic sensitivity of thin-film tips can thus be adjusted by selecting an appropriate film material. For example, while 3–10 ML Fe [27] and >50 ML Cr [28] are almost always sensitive to the in-plane component of the magnetization, 7–9 ML Gd [29], 10–15 ML Gd90Fe10 [30] and 25–45 ML Cr [30] are usually perpendicularly magnetized at low temperature. The well-known spin reorientation transition of Co films on Au (see Refs [31, 32]), which occurs with increasing thickness of the magnetic material, allows tuning of the magnetically sensitive direction of the tip with the same set of coating materials. For thin Co coverages (<8 ML) on a Au-coated W tip, an out-of-plane magnetic sensitivity is achieved, whereas for thicker Co films the in-plane component of the sample magnetization can be probed [33].
At least qualitatively, this observation can easily be understood. Two anisotropy terms are relevant: (i) the shape anisotropy, which arises due to the pointed shape of the tip; and (ii) the surface and interface anisotropy of the film. The first term will always try to force the magnetization along the tip axis, that is, perpendicular to the sample surface. In contrast, the second term is material-dependent. Due to the rather large radius of curvature of the tip as compared to the thickness of the coating film (see Figure 1.5c), the effective surface and interface anisotropy of a thin film can be deduced from an equivalent film on a flat W(110) substrate. For instance, in the case of 10 ML Fe the two anisotropy terms favor different directions. While the shape anisotropy still tries to force the magnetization along the tip axis, the ferromagnetic film on W(110) exhibits a strong in-plane anisotropy [35] which obviously overcomes the shape anisotropy. An external field of 2 T is required to force the tip magnetization out of the easy (in-plane) into the hard (out-of-plane) direction [34]; this is consistent with the results of Elmers and Gradmann [35] concerning thin-film systems.
Even at room temperature, magnetic thin-film tips can be used for several days without losing their spin sensitivity. Initially, this is surprising, as any surface is continuously exposed to the residual gas in the vacuum chamber which, depending on the reactivity of the sample under investigation, leads to a more or less rapid but continuous degradation of the surface spin polarization [36]. However, the geometry of the tunnel junction must be taken into account, as it differs from that of an open surface. While residual gas particles may impinge onto a flat, uncovered surface from the entire half-space, the tip apex is almost completely shadowed by the sample, as shown schematically in Figure 1.5c. Thereby, gas transport onto the tip apex is dramatically reduced. Of course, the same argument can be applied to the sample surface; this is, however, only valid for the particular location of the sample surface that is right under the tip apex. As this location varies when scanning the tip across the sample, the shadowing is only temporarily effective for any particular site of the sample surface, whereas the tip is shadowed at all times.
1.3.1.2 Antiferromagnetic Probe Tips
Despite the fact that the magnetic dipole interaction between the sample and the tip is considerably reduced for ferromagnetic ultrathin-film coatings on a nonmagnetic tip, in comparison to thicker coatings or even bulk ferromagnetic tips, an additional influence cannot be ruled out. One straightforward and experimentally feasible solution, however, is to use an antiferromagnetically coated (see Ref. [30]) or a bulk antiferromagnetic tip consisting of, for example, a MnNi alloy [37–40]. The tip should exhibit no significant stray field, since opposite contributions compensate on an atomic scale, thus allowing the nondestructive imaging and investigation of spin structures even for magnetically soft samples. The spin sensitivity is determined solely by the orientation of the magnetic moment of the atom that forms the very end of the tip apex; the orientation of all other magnetic moments plays no role. Furthermore, the tip is insensitive to external fields, which allows direct access to intrinsic sample properties in field-dependent studies.
In order to demonstrate this insensitivity, we can refer to an investigation conducted by Kubetzka et al. [30]. Here, the response of an identically prepared system to an applied perpendicular magnetic field is shown using a ferromagnetic tip on the one hand, and an antiferromagnetic tip on the other hand. Figure 1.6a shows a series of dI/dU maps, that is, maps of the differential conductance, of 1.95 ML Fe on W(110) recorded with a ferromagnetic GdFe tip. Figure 1.6a(i) shows an overview of the initial state, while Figure 1.6a(ii) is taken at higher resolution, as indicated by the frame in Figure 1.6a(i). Because the coverage is slightly below 2 ML, narrow ML areas can be seen with a bright appearance at the chosen bias voltage. These ML areas efficiently decouple double-layer (DL) regions on adjacent terraces. At 350 mT [see Figure 1.6a(iii)] the domain distribution is asymmetric; the bright domains have grown and the dark domains have shrunk. In some places the magnetic contrast changes abruptly from one horizontal scan line to the next (see arrows), this being the result of a rearrangement of the sample's magnetic state during the imaging process. At 700 mT [see Figure 1.6a(iv)] the sample has almost reached saturation within the field of view. However, it becomes obvious in the overview image, subsequently recorded at the same location in zero applied field [see Figure 1.6a(v)], that this field value does not reflect intrinsic sample properties. A large fraction of the dark domains has survived outside the region which was scanned previously at 700 mT. Thus, the superposition of the applied field and the additional field emerging from the magnetic coating of the tip is much more efficient than the applied field alone.
Figure 1.6b shows an analogous series of images of a sample which was identically prepared but imaged with an antiferromagnetic Cr-covered tip. This tip exhibits an out-of-plane sensitivity, like the GdFe tips [30]. A dark domain is marked as an example, to be recognized in all five images. The domain structure in Figure 1.6b(i)–(iii) displays no significant difference to the corresponding structures in Figure 1.6a. Since a rearrangement of the domain structure during imaging is not observed throughout this series, the occurrence of such events in Figure 1.6a can be attributed to the GdFe tip's stray field. As in Figure 1.6a(iii), the dark domains are compressed at 350 mT, and this proceeds at 700 mT. At this field value, and in contrast to Figure 1.6a, a large fraction of the dark domains has survived. In the overview image of Figure 1.6b(v), which was again taken subsequently in zero applied field, the previously scanned area exhibits no significant difference in domain distribution in
(and vice versa for σ⁻ light). As a result, while the theoretical limit of the electron spin polarization in photoemission is 50%, values as high as 43% have been achieved experimentally [41].
The preliminary experiments on GaAs–insulator–ferromagnet tunnel junctions were reported in Ref. [42]. The specimen was prepared by cleaving a GaAs crystal in air along the (110) plane. A 20–40 Å layer of Al was then evaporated onto the GaAs(110) surface and subsequently oxidized. Onto this insulating barrier was then deposited a 150 Å-thick Co film, which was itself protected by a 50 Å Au cap layer. The chosen experimental set-up required that the light would traverse the ferromagnetic layer and the insulator before reaching the semiconductor. After magnetization of the Co film perpendicular to the plane, a dependence of the tunneling current on the helicity of the light was measured, which suggested the existence of spin-polarized transport. However, an even stronger signal was detected when there was no tunneling barrier between the semiconductor and the ferromagnet; this was explained by ". . . an intensity modulation of the circularly polarized light upon transmission through the magnetically ordered layer, determined by the polar magneto-optic coefficients" [42]. The reduction of the helical asymmetry when using a tunnel barrier, compared to a situation without any barrier, was explained in a later analysis [43, 44] by a negative tunneling conductance. As this method of creating spin-polarized electrons involves neither a magnetic material nor magnetic fields, it offers excellent conditions for application in SP-STM.
In spite of this favorable situation, GaAs tips have not yet been used successfully for the imaging of magnetic surfaces. Similar to the experiments performed with planar junctions, this may be caused by difficulties in separating spin-polarized tunneling from magneto-optical effects [45–49].
Compared to the proposal of Pierce [3] (as shown in Figure 1.7), the first successful observation of spin-polarized electron tunneling with the scanning tunneling microscope using a GaAs electrode [50] was obtained by exchanging the roles of tip and sample, that is, by using a Ni tip and a GaAs(110) surface. Moreover, instead of optically pumping the GaAs sample and thereby producing spin-polarized charge carriers in the GaAs, the reverse process was used, with spin-polarized electrons being injected from the Ni tip into the conduction band of GaAs. Upon transition of the injected electrons from this metastable state in the conduction band into the final state in the valence band, recombination luminescence was seen to occur, and the circular polarization of the emitted photons was analyzed.
Evidence for a second explanation for the failure of magnetic imaging with GaAs electrodes, namely an insufficient lifetime of the spin carriers at the tip apex, comes from analogous STM-excited luminescence experiments performed with single-crystalline Ni(110) tips and a stepped GaAs(110) sample [51]. With the tip positioned above flat terraces, a high spin-injection efficiency was measured. However, the intensity of the recombination luminescence on the upper terrace was found to have decreased by a factor of 1000. Simultaneously, the polarization decreased by a factor of 6. This observation was explained by a reduction of either the spin-injection efficiency or the spin relaxation lifetime, and was attributed to the metallic nature of the step edge caused by midgap states of the (111) surface. As a sharp tip must possess numerous step edges around the apex atom, it is a straightforward conclusion that the spin relaxation lifetime may be drastically reduced at the very end of the tip.
1.3.1.4 Nonmagnetic Probe Tips
Surprisingly, even nonmagnetic tips can be used to image certain magnetic sample properties, as demonstrated by Bode et al. [52, 53] and by Pietzsch et al. [54]. The images shown in Figure 1.9a and b [54] were taken for slightly less than 2 ML Fe on W(110), which allows a comparison with the results shown in Figure 1.6 (which were obtained using a magnetic tip). However, a tungsten tip without any magnetic coating was now used, whereupon the dark lines revealed the presence of domain walls. The main difference between the two measurements was that the nonmagnetic tip did not provide sensitivity to the spin orientation inside the walls. Instead, both walls of a pair were imaged equally, in contrast to the observation made with the ferromagnetic tip [55] (cf. Figure 1.26). This is a consequence of the fact that the measurement made with the W tip does not involve spin-polarized tunneling; rather, it is the spin-averaged electronic structure of the sample that gives rise to the signal. The electronic structure of the DL stripes is locally modified due to the presence of a domain wall. In other words, the electronic structure is sensitive to whether the magnetization lies along an easy or a hard direction. First-principles calculations have shown [52, 53] that the spin–orbit-induced mixing of different d-states depends on the magnetization direction, and changes the local density of those states that are detectable by non-spin-polarized STS.
As an important implication of this effect, the magnetic nanostructure of surfaces can be investigated with conventional nonmagnetic tips. This has the clear advantage that there is definitely no dipolar magnetic stray field from the tip that could interfere with the sample. In addition, the preparation of a magnetic tip is omitted.
1.3.2
Modes of Operation
1.3.2.1 Constant-Current Mode
In a spin-polarized STM experiment, the tunneling current can be written as the sum of a non-spin-polarized and a spin-polarized contribution:

I(\vec{r}_t, U, \theta) = I_0(\vec{r}_t, U) + I_{sp}(\vec{r}_t, U, \theta)    (1.9)

= \mathrm{const} \cdot \left[ n_t \, \tilde{n}_s(\vec{r}_t, U) + \vec{m}_t \cdot \tilde{\vec{m}}_s(\vec{r}_t, U) \right]    (1.10)

with n_t being the non-spin-polarized local density of states (LDOS) at the tip apex, \tilde{n}_s the energy-integrated LDOS of the sample, and \vec{m}_t and \tilde{\vec{m}}_s the vectors of the (energy-integrated) spin-polarized LDOS, with:

\tilde{\vec{m}}_s(\vec{r}_t, U) = \int_{E_F}^{E_F + eU} \vec{m}_s(\vec{r}_t, E) \, dE    (1.11)

and \theta the angle between the magnetization directions of tip and sample. For a non-spin-polarized STM experiment, that is, using either a nonmagnetic tip or a nonmagnetic sample, the second term, I_{sp}, vanishes.
The constant-current mode is restricted to some limited cases. This is partly due to the integral in Equation 1.11 being taken over all energies between the Fermi energy E_F and E_F + eU, with U being the applied bias voltage: I_{sp} becomes reduced if the spin polarization changes sign between E_F and E_F + eU. Furthermore, the magnetically induced corrugation is small compared to the topographic and electronic corrugation; this is due to the exponential dependence of the tunneling current on the distance between tip and sample. Nevertheless, it is still possible to obtain information on complex atomic-scale spin structures at ultimate magnetic resolution (as shown in Ref. [56]), although it is necessary to understand the influence of the tip [57, 58].
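The smallness of the magnetically induced corrugation can be estimated from the exponential distance dependence mentioned above. The sketch below is an added, order-of-magnitude illustration: it assumes a typical effective barrier height of 4.5 eV (not a value quoted in this chapter) and converts a given relative spin contribution to the current into the apparent height change that compensates it in constant-current mode:

# I ~ exp(-2*kappa*d): a relative current change x corresponds to an apparent
# height change delta_d = ln(1 + x) / (2*kappa). Barrier of 4.5 eV assumed.
import math

m_e, hbar, e = 9.109e-31, 1.055e-34, 1.602e-19
phi_eV = 4.5                                        # assumed barrier height
kappa = math.sqrt(2 * m_e * phi_eV * e) / hbar      # ~1.1e10 1/m

for rel_spin_signal in (0.01, 0.05, 0.20):          # assumed I_sp/I_0 ratios
    delta_d = math.log(1 + rel_spin_signal) / (2 * kappa)
    print(f"I_sp/I_0 = {rel_spin_signal:4.0%} -> apparent corrugation "
          f"{delta_d * 1e12:5.1f} pm")

Even a 20% spin contribution maps onto only about 8 pm of apparent corrugation, which is why the magnetic signal is easily buried under topographic and electronic features.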
1.3.2.2 Spectroscopy of the Differential Conductance
The difficulties of separating the topographic, electronic and magnetic information can be overcome by measuring the differential conductance, dI/dU, with a magnetic tip [56]:

dI/dU(\vec{r}_t, U) \propto n_t \, n_s(\vec{r}_t, E_F + eU) + \vec{m}_t \cdot \vec{m}_s(\vec{r}_t, E_F + eU)    (1.12)

In this situation, the measured quantity no longer depends on the energy-integrated spin polarization, but rather on the spin polarization in a narrow energy window \Delta E around E_F + eU.
The differential conductance is determined experimentally by adding a small ac voltage to the dc bias voltage at a frequency which is significantly above the cut-off frequency of the feedback loop that ensures a constant current. The amplitude of the ac voltage determines the width \Delta E. The current modulation is amplified by means of a lock-in technique.
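A minimal numerical sketch of this lock-in scheme (an added illustration; the I(U) curve and all parameters are assumed, not measured values) shows how the component of the current oscillating at the modulation frequency recovers dI/dU:

# Simulated lock-in detection of dI/dU: add a small ac modulation to the dc
# bias and average the current multiplied by the reference signal.
import numpy as np

def current(U):
    return 1e-9 * (U + 0.5 * U**3)        # assumed nonlinear I(U) model, in A

U_dc, U_ac, f_mod = 0.30, 0.01, 5e3       # V, V, Hz (assumed values)
t = np.linspace(0.0, 10.0 / f_mod, 20000, endpoint=False)  # ten periods
ref = np.sin(2 * np.pi * f_mod * t)       # lock-in reference

I_t = current(U_dc + U_ac * ref)          # modulated tunneling current

# The in-phase lock-in output is (U_ac/2)*dI/dU, hence the factor 2/U_ac.
dIdU_lockin = 2.0 * np.mean(I_t * ref) / U_ac
dIdU_exact = 1e-9 * (1 + 1.5 * U_dc**2)

print(f"lock-in estimate: {dIdU_lockin:.4e} A/V")
print(f"analytic dI/dU:   {dIdU_exact:.4e} A/V")

The modulation amplitude U_ac sets the energy window (roughly \Delta E = e U_ac) over which the spectrum is smeared, which is the trade-off mentioned above.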
For electronically homogeneous surfaces, maps of the differential conductance reflect the magnetic behavior, since any variation of the signal must be due to the second, spin-dependent term I_{sp}. The situation becomes more complicated for electronically heterogeneous surfaces; nevertheless, a careful comparison between spin-averaged and spin-resolved measurements often allows topographic and electronic contrast to be distinguished from the magnetically induced information. However, this type of experiment requires measurements to be made with both nonmagnetic and magnetic probe tips.
The determination of the differential conductance also provides access to the locally resolved spin polarization of a surface [59–61].
By recording inelastic tunneling spectra (i.e., the second derivative of the conductance, d²I/dU²) with a magnetic tip in an external magnetic field, it becomes possible to study magnon excitations in magnetic nanostructures directly, that is, without a separating insulating layer [62].
1.3.2.3 Local Tunneling Magnetoresistance
As an alternative method, the local tunneling magnetoresistance dI/dm_t between the magnetic tip and the magnetic sample can be determined by modulating the tip magnetization direction and detecting the resulting variation of the tunneling current using a lock-in technique. This type of measurement was first proposed by Johnson and Clarke [63] and later accomplished by Wulfhekel and Kirschner [17]. By taking the derivative of Equation 1.10 one obtains:

\[ \mathrm{d}I/\mathrm{d}\vec{m}_t \propto \tilde{\vec{m}}_s \tag{1.13} \]
1.4
Magnetic Arrangement of Ferromagnets
peak in the dI/dU spectra just above the Fermi level [73]. In contrast, the majority (minority) part of the Gd(0001) surface state has a binding energy of 220 meV (500 meV) at 20 K [71]; that is, the exchange splitting amounts to only 700 meV at a temperature far below the Curie temperature of 293 K.
In the following section we consider vacuum tunneling between a Gd(0001) surface and a tip material; for simplicity, we assume a constant spin polarization (see Figure 1.10a, lower part). If the magnetization direction of the tip remains constant, then two possible magnetic orientational relationships occur between tip and sample (parallel or antiparallel), under the assumption that the magnetizations of tip and sample are collinear. Since, however, both the majority and the minority component of the Gd(0001) surface state appear in the tunneling spectra, the spins of one component of the surface state will in any case be parallel to the tip, while the spins of the other component will be antiparallel. Therefore, the spin valve effect will act differently on the two spin components; due to the strong spin dependence of the density of states, the spin component of the surface state parallel to the tip magnetization is enhanced at the expense of its antiparallel counterpart.
The domain structure of the surface of a Co(0001) single crystal has been studied by Wulfhekel et al. [17–23]. The uniaxial magnetocrystalline anisotropy of hcp Co points along the c-axis, that is, perpendicular to the (0001) surface. However, the total energy of the sample is minimized by the formation of surface closure domains in which the magnetization locally tilts towards the surface plane, thereby reducing the dipolar energy. As the magnetocrystalline anisotropy energy and dipolar energy are similar in size, and the in-plane components of the magnetocrystalline anisotropy energy are almost degenerate, a complicated dendritic pattern is formed at the surface. Figure 1.11a shows the typical dendritic-like perpendicular domain pattern of Co(0001) as measured by Wulfhekel et al. [23]. Because the tip magnetization is intentionally modulated by a small coil, the bright and dark locations in Figure 1.11a can be assigned to specific magnetic orientations, namely the magnetization vector pointing out of or into the surface, respectively. A sharp contrast can be recognized in Figure 1.11b which is completely absent in the topographic image [21]. The absence of any correlation confirms that this contrast is not caused by different local structural or electronic properties; rather, it represents a domain wall separating two regions of different magnetization directions. This
domain wall is found to correspond to a wall between two canted surface domains (see Figure 1.11c).
Further information concerning the ferromagnetic transition metal Fe on W(110) and Mo(110) is provided below. The spin-resolved electronic properties of dislocation lines that occur during thin-film growth of Fe on W(110) are described in Ref. [75], while details of the complex magnetic structure of Fe on W(001) are reported in Refs [76, 77]. The easy magnetization axis was shown to be layer-dependent: the second and third Fe layers were magnetized along ⟨110⟩ or equivalent directions, whereas the easy axis of the fourth layer was rotated by 45°.
1.5
Spin Structures of Antiferromagnets
The lateral averaging of magnetically sensitive techniques often fails in the imaging of antiferromagnetic surfaces because the overall detected magnetization is equal to zero. Here, we will show how SP-STM can be used to overcome this difficulty. The first example is the topological antiferromagnetism of Cr(001); a second example, namely the atomic resolution of magnetic behavior, will be demonstrated using the antiferromagnetic Mn monolayer on W(110) and Mn3N2 films on MgO(001).
The antiferromagnetic nature was additionally reported for the first monolayer of Fe on W(001) [77]. The antiferromagnetic coupling between a whole atomic layer and a ferromagnetic substrate was investigated for Mn on Fe(001) [25, 26, 78]; a Co layer on a Cu-capped Co substrate exhibits a ferromagnetic or antiferromagnetic coupling, depending on the interlayer thickness [79]. Magnetite Fe3O4 represents a ferrimagnet with a high spin polarization near the Fermi level. SP-STM was used to investigate the corresponding (001) [39, 40, 80–83] and (111) surfaces [40, 84].
1.5.1
Topological Antiferromagnetism of Cr(001)
The Cr(001) surface, for which the topological step structure is directly linked to the magnetic structure, represents a topological antiferromagnet; that is, each terrace exhibits a ferromagnetic alignment of the magnetic moments, while between two adjacent terraces the magnetization possesses an antiparallel orientation. This behavior was predicted, on a theoretical basis, by Blügel et al. [85].
By using the scanning capabilities, the antiferromagnetic coupling between neighboring terraces of a Cr(001) surface can be imaged directly [70, 86–94]. The topography (see Figure 1.12a) presents a regular step structure with terrace widths of about 100 nm [88]. The line section at the bottom of Figure 1.12a shows that all step edges in the field of view are of single atomic height, that is, 1.4 Å. This topography should lead to a surface magnetization that periodically alternates between adjacent terraces and, indeed, this was observed experimentally (see Figure 1.12b). The line section of the differential conductance drawn along the same path as in Figure 1.12a indicates two discrete levels with sharp transitions at the positions of the step edges.
The typical domain wall width, as measured on a stepped Cr(001) surface, amounts to 120–170 nm [86]. In analogy to ferromagnetic domain walls (these are discussed in detail in Chapter 9), this value is determined by intrinsic material parameters, that is, the strength of the exchange coupling and the magnetocrystalline anisotropy. Clearly, the domain wall width cannot remain unchanged very close to a screw
Figure 1.12 (a) Topography and (b) spin-resolved map of the dI/dU signal of a clean and defect-free Cr(001) surface as measured with a ferromagnetic Fe-coated tip. The bottom panels show averaged sections drawn along the line. Adjacent terraces are separated by steps of monatomic height. (Reprinted with permission from Ref. [88]; copyright (2003) by the American Physical Society.)
dislocation, where the circumference becomes comparable with or smaller than the intrinsic domain wall width.
The dependence of the domain wall width on the distance from a screw dislocation on the Cr(001) surface is shown in Figure 1.13a [88]. Here, approximately 100 nm from the next step edge, a single screw dislocation can be recognized in the upper left corner of the image. The magnetic dI/dU map of Figure 1.13b reveals that this screw dislocation is the starting point of a domain wall which propagates towards the upper side of the image. Starting at the tail of the arrow (zero lateral displacement), eight circular line sections are drawn counterclockwise around the screw dislocation at different radii r_avg, from 75 nm down to 7.5 nm; these data are plotted in Figure 1.13c. The domain walls were fitted using the model provided in Chapter 9. The results are shown as gray lines in Figure 1.13c; except for the smallest average
Figure 1.14 STM images of (a) the topography and (b) dI/dU
magnetic signal obtained simultaneously from the same area of a
9 nm-thick Cr(001) film. (Reprinted with permission from
Ref. [93]; copyright (2007), Elsevier.)
radius (r_avg = 7.5 nm), an excellent agreement with the experimental data was found. At an average radius r_avg = 75 nm, the domain wall width amounted to 125 nm, in close agreement with the intrinsic domain wall width of Cr(001) as determined far away from screw dislocations. This was not surprising, as the circumference amounted to about 500 nm, much larger than the intrinsic domain wall width. However, as soon as r_avg was reduced below 60 nm, a significant reduction in domain wall width was observed, although the circumference still exceeded the intrinsic domain wall width. The results showed clearly that the domain wall width always remained considerably narrower than the circumference of the cross-section.
We can now discuss a more complex structure, namely the influence of the distance and chirality between two adjacent spiral terraces on the magnetic structure of Cr(001) films [90, 93]. Figure 1.14a shows a topographic STM image of a 9 nm-thick Cr(001) film [93]; the characteristic feature of the surface morphology is that the Cr layers form high-density, self-organized spiral terraces. Each terrace is displaced by a monatomic step height, and a screw dislocation is clearly visible in the center of each spiral pattern. The typical diameter of these spiral terraces is 50 nm. A complex spin frustration and characteristic magnetic ordering is present, being restricted by the topological asymmetry of the spiral terraces. Figure 1.14b shows the dI/dU magnetic image obtained simultaneously in the same area as Figure 1.14a, exhibiting a magnetic contrast. A comparison of the two images of Figure 1.14a and b reveals that most parts of the observed magnetic contrasts are consistent with a topological antiferromagnetic structure. The maximum magnetic contrast corresponds to the topological antiferromagnetic order, and a deviation from the maximum magnetic contrast can be recognized as the spin frustration which appears in the regions near the screw dislocations and between two spirals.
The magnetic structure can be deduced from the observed dI/dU magnetic signal intensity by assuming that the tip magnetization is oriented parallel to the bcc [100] direction. For example, the derived magnetic structures of two adjacent spirals (the regions A and B indicated in Figure 1.14b) are shown in Figure 1.15 by arrows. Although the two adjacent spirals have the same chirality, the sign is opposite
between regions A and B; the distance between the two screw dislocations, d_s, is 20 and 32 nm for regions A and B, respectively. Although the magnetization rotates gradually around the center of the spiral in the case of d_s = 75 nm [90], this does not occur for d_s = 20 nm and 32 nm, which suggests that the spin-frustrated region shrinks with decreasing d_s. The sign of the chirality of the spirals appears to make little difference.
In order to understand the origin of this behavior, the magnetic structure of these spiral terraces can be calculated [93]; the result is shown in the two lower panels of Figure 1.15. The white (black) contrast represents magnetization parallel (antiparallel) to the [100] direction. The topological antiferromagnetic order appears in a series of adjacent terraces, as well as in most parts of the spiral terraces; the frustrated regions (the gray regions in the simulated figures) are evident between the centers of adjacent spirals. It should be noted that the observed and calculated magnetic structures are clearly asymmetric with respect to the straight line connecting the two screw dislocations, in spite of the different d_s values. The simulated spin alignments are in good qualitative agreement with the observed results.
1.5.2
Magnetic Spin Structure of Mn with Atomic Resolution
The deposition of Mn on W(110) in the submonolayer regime results in pseudomorphic growth; that is, Mn mimics the bcc symmetry as well as the lattice constant of the underlying substrate [95]. By using a clean W tip, atomic resolution could be achieved on the Mn islands, as demonstrated by Heinze et al. [96] (see Figure 1.16a). Additional information is provided in Refs [70, 97]. The diamond-shaped unit cell of the (1×1)-grown Mn ML is clearly visible. The line section drawn along the dense-packed [11̄1] direction exhibits a periodicity of 0.27 nm, which almost perfectly fits the expected nearest-neighbor distance of 0.274 nm. The measured corrugation amplitude amounts to 15 pm. A calculated STM image for a conventional tip without spin polarization is given for comparison (see inset of Figure 1.16a). Clearly, the qualitative agreement between theory and experiment is excellent.
In a second set of experiments [96], different ferromagnetic tips were used. Since it is known from first-principles computations that the easy magnetization axis of the Mn ML on W(110) is in-plane [96], the experiment required a magnetic tip with a magnetization axis in the plane of the surface in order to maximize the effect. This condition is fulfilled by Fe-coated probe tips [27]. Figure 1.16b shows an STM image taken with such a tip, in which periodic parallel stripes along the [001] direction of the surface can be recognized. The periodicity along the [11̄0] direction amounts to 4.5 Å, which corresponds to the size of the magnetic c(2×2) unit cell. The inset in Figure 1.16b shows the calculated STM image for the magnetic ground state, that is, the c(2×2)-antiferromagnetic configuration. Thus, theory and experiment give a consistent picture. Even the predicted faint constrictions of the stripes along the [001] direction, related to the pair of second-smallest reciprocal lattice vectors, are visible in the measurement. Again, experimental and theoretical data can be compared more quantitatively by drawing line sections along the dense-packed [11̄1] direction (see Figure 1.16b). The result, which is plotted in Figure 1.16c, reveals that the periodicity, when measured with an Fe-coated probe, is twice the nearest-neighbor distance (i.e., 0.548 nm).
The pronounced dependence of the effect on the magnetization direction of the tip can be exploited to gain further information on the magnetization direction of the sample. This is done by using a tip that exhibits an easy magnetization axis almost perpendicular to that of the sample surface. This condition is fulfilled by a W tip coated with about 7 ML Gd [29]. The gray line in Figure 1.16c represents a typical line section as measured with such a Gd-coated probe tip. Indeed, the corrugation amplitude was always much smaller than that for Fe-coated tips and never exceeded 1 pm, thus supporting the theoretical result that the easy axis of the Mn atoms is in-plane.
In the following section it will be shown, by reference to the studies of Yang et al. [98] and Smith et al. [99], that both magnetic and nonmagnetic atomic-scale information can be obtained simultaneously in the constant current mode for another Mn-based system, consisting of Mn3N2 films grown on MgO(001) with the c-axis parallel to the growth surface, which is (010). The surface geometrical unit cell, containing six Mn atoms and four N atoms (3 : 2 ratio), can be denoted as c(1×1), whereas the surface magnetic unit cell is just (1×1).
The bulk structure of Mn3N2 exhibits a face-centered tetragonal (fct) rock salt-type structure. The bulk magnetic moments of the Mn atoms are ferromagnetic within (001) planes, lie along the [100] direction, and are layerwise antiferromagnetic along [001]. Besides the magnetic superstructure, every third (001) layer along the c direction has all N sites vacant, which results in a bulk unit cell with c = 12.13 Å (six atomic layers).
Figure 1.17a presents an SP-STM image [98] of the surface acquired using a Mn-coated W tip, which is thus sensitive to the in-plane magnetization direction. Although the row structure with period c/2 is observed, a modulation of the height of the rows with period c is additionally obvious. The modulation shown in Figure 1.17b is evident for both domains D1 and D2 from the area-averaged line profiles taken from inside the boxed regions on either side of the domain boundary. For the domain D1 (red line), the modulation amplitude is about a factor of 2 larger than for the domain D2 (blue line).
Figure 1.17 (partial caption): ... the magnetic moments of the tip (m_t) and the sample (m_s) for the two different domains and the angles between them. Each ball represents a magnetic atom. (Reprinted with permission from Ref. [98]; copyright (2002), American Physical Society.)
As the height modulation is proportional to m_t m_s cos θ, with m_t and m_s being the moments of the tip and sample, respectively, and θ the angle between them (cf. Equation 1.8), it is simple to show that θ = arctan(Δz_2/Δz_1), where Δz_1 and Δz_2 are the height modulations in domains D1 and D2, respectively. In the case shown here, with Δz_1 = 0.04 Å and Δz_2 = 0.02 Å, θ amounts to 27°.
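The quoted angle is quickly verified from the relation given above:

```python
import numpy as np

dz1, dz2 = 0.04, 0.02                      # height modulations in D1 and D2
theta = np.degrees(np.arctan(dz2 / dz1))   # theta = arctan(dz2/dz1)
print(f"theta = {theta:.1f} deg")          # 26.6 deg, i.e. about 27 deg
```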
A high peak (H) on one side of the domain boundary converts to a low peak (L) on the other side. This inversion is simulated in Figure 1.17c by a simple antiferromagnetic model configuration of spin moments and a tip spin at the angle θ = 27°. The gray scale for each magnetic atom is proportional to m_t m_s cos θ (white: θ = 0; black: θ = π). Clearly, the inversion occurs when the difference φ − θ = π/2, where θ and φ are the two different angles between tip and sample moments in domains D1 and D2, respectively.
The data can now be used to separate the magnetic and nonmagnetic components. Beginning with the SP-STM image shown in Figure 1.18a [98], the average height profile z(x), where x is along [001] (Figure 1.18b, dark blue line), and also z(x + c/2) (Figure 1.18b, light blue line) are plotted. Clearly, by taking the difference and sum of these two functions, the magnetic component with periodicity c and the nonmagnetic component with period c/2 can be extracted: m_t m_s cos[θ(x)] ∝ [z(x) − z(x + c/2)]/2.
This is further justified if it is assumed that the bulk magnetic symmetry is maintained at the surface. When using this procedure, the resulting magnetic profile for the data of Figure 1.18 has a period of c and a trapezoidal wave shape, as shown in Figure 1.18c (violet line). The nonmagnetic profile is also shown in
Figure 1.18c (red line), exhibiting a period of c/2 and a sinusoidal shape, the same as for the average line profile acquired with a nonmagnetic tip. The magnetic component amplitude is about 20% of the nonmagnetic component amplitude.
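The separation procedure can be sketched on a synthetic profile; the amplitudes below are illustrative, chosen to reproduce the roughly 20% ratio just mentioned:

```python
import numpy as np

# Separation of a line profile z(x) into a nonmagnetic component of
# period c/2 and a magnetic component of period c via the sum and
# difference z(x) +/- z(x + c/2).
c = 1.0                                   # row period (arbitrary units)
x = np.linspace(0.0, 4 * c, 800, endpoint=False)
dx = x[1] - x[0]
shift = int(round((c / 2) / dx))          # samples corresponding to c/2

# Synthetic profile (illustrative amplitudes): nonmagnetic + magnetic part
z = 1.0 * np.sin(2 * np.pi * x / (c / 2)) + 0.2 * np.sin(2 * np.pi * x / c)

z_shifted = np.roll(z, -shift)            # z(x + c/2), periodic boundary
magnetic = (z - z_shifted) / 2            # period c -> m_t*m_s*cos(theta(x))
nonmagnetic = (z + z_shifted) / 2         # period c/2

print(f"magnetic amplitude    ~ {magnetic.max():.2f}")     # ~0.2
print(f"nonmagnetic amplitude ~ {nonmagnetic.max():.2f}")  # ~1.0
```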
1.6
Magnetic Properties of Nanoscaled Wires
Figure 1.20 (a) STM topograph and (b) magnetic dI/dU image of
Fe nanowires on W(110). Both images were measured
simultaneously. The sample exhibits a demagnetized
antiferromagnetic ground state which is energetically favorable
due to flux closure between adjacent perpendicularly magnetized
Fe nanowires [106]. (Reprinted with permission from Ref. [104];
Elsevier.)
magnetic stray field of the perpendicularly magnetized Fe DL. At domain walls the magnetization vector may locally be oriented along the hard magnetic axis, that is, in-plane.
Tunneling spectroscopy was used to image the corresponding magnetic domain structure. Since it is known from full dI/dU spectroscopy curves how the contrast must be interpreted (see Section 1.3.2), it is no longer necessary to measure the entire spectra at every pixel of the image, which is very time-consuming (about 10–20 h per image for the investigation discussed here [104]). Instead, the dI/dU signal at a fixed sample bias already gives a good contrast. Figure 1.20 shows the simultaneously recorded topography (panel a) and the dI/dU signal (panel b) of 1.5 ML Fe/W(110). The measurement time for this image was about 30 min. Due to its different electronic properties, the Fe ML appears dark, but this is not related to the magnetic properties. Instead, the ML is known to exhibit an in-plane magnetization [107], which cannot be detected using Gd tips, as these are sensitive only to the out-of-plane magnetization [29]. Clearly, the magnetic domain structure is dominated by DL nanowires which are magnetized alternately up and down, although exceptions from this model can easily be recognized. Several domain walls within single Fe nanowires are visible; some of these are marked with arrows in Figure 1.20b. There are also numerous adjacent nanowires which couple ferromagnetically rather than antiferromagnetically. It is likely that these DL nanowires approach very close to each other, or may even touch, so that the exchange coupling dominates.
Imaging by SP-STM can also be used to deduce macroscopic magnetic properties, as demonstrated by Pietzsch et al. [104, 108] for the system of Fe nanowires just discussed. These authors showed that spin-resolved dI/dU maps in a varying external field could be used to obtain the magnetic hysteresis curve of the
surface area under investigation; that is, SP-STM enables the observation of magnetic
hysteresis down to the nanometer scale.
Replacing W(110) with Mo(110) provides the unique possibility of observing a modification of the magnetic properties of the Fe nanostructures while leaving the structure and morphology almost unaffected [33, 109]. The magnetic easy axis is directed along the [001] direction for Fe/Mo(110) [110], while the easy axis is [11̄0] for Fe/W(110) films [111]. The pseudomorphic ML (ps-ML) Fe/Mo(110) nanostructures are perpendicularly magnetized at low temperatures [112], whereas the ps-ML Fe/W(110) is magnetized in-plane along the [11̄0] direction [113].
Figure 1.21a shows the topography and Figure 1.21b the simultaneously recorded dI/dU map of 0.5 ps-ML Fe deposited onto the Mo(110) single crystal at 700 K [109]. Monatomic Mo terraces decorated with regular narrow Fe nanostripes, grown by step-flow growth at the step edges, are visible. The location of the Fe atoms on the Mo(110) surface is better visible in the dI/dU map (see Figure 1.21b) due to the element-specific contrast resulting from the differences in the spin-averaged dI/dU signal, which are connected with the local electronic surface properties that are different for Fe and Mo [112]. Uncovered Mo surfaces are indicated in Figure 1.21b by white arrows. The Fe nanowires show two different colors, representing two different values of the local dI/dU signal for equivalent surface regions (ML Fe/Mo(110)) for which the spin-averaged conductance signals should be the same [24]. All STM images and conductance maps shown in Figure 1.21 were measured using W tips covered by 10 ML Au and subsequently by 4 ML Co. It is known [114] that 4 ML Co prepared on W(110)/Au reveal an out-of-plane magnetic easy axis. Therefore, it may be expected that the magnetization of the tip points perpendicular to the front plane of the tip (that is, along the tip axis), leading to an out-of-plane magnetic sensitivity of the tip. The contrast observed for the Fe nanostructures results from the perpendicularly magnetized Fe nanostripes, in agreement with Ref. [112].
The perpendicularly magnetized ML Fe nanostripes shown in Figure 1.21b are not antiferromagnetically ordered; that is, only two of the stripes are magnetized up, whereas the magnetization of the remaining stripes points in the opposite direction (down). This means that the dipolar coupling between adjacent ML Fe nanowires is weak. The strength of the dipolar coupling between adjacent stripes increases with the stripe width and decreases with the distance between adjacent stripes [113]. The distance between adjacent ML Fe nanowires can be reduced to a minimum by increasing the Fe coverage up to 1 ML. The topography of 1 ML Fe deposited onto the vicinal surface of the Mo(110) crystal is presented in Figure 1.21c. Narrow ML Fe nanowires obtained on the vicinal surface are antiferromagnetically ordered, as demonstrated in the conductance map (see Figure 1.21d).
1.7
Nanoscale Elements with Magnetic Vortex Structures
varying between 3.5 and 8.5 nm. The lateral dimensions of the islands, irrespective of their height, are almost equal. The islands shown in Figure 1.22a exhibit an average height of approximately 3.5 nm (see line section). In the right-hand panel of Figure 1.22a the different magnetization states of the islands can be distinguished by means of different dI/dU intensities. This variation results from spin-polarized tunneling between the magnetic STM tip (due to a coating with more than 100 ML Cr, the tip is sensitive to the in-plane direction [28]) and the magnetic sample, and therefore represents different relative in-plane orientations of the magnetization in tip and sample. As no significant variation was found on top of the atomically flat island surfaces, it would appear that these Fe islands are single domain, as shown schematically in the right-hand panel of Figure 1.22a. Evidently, there exists a close correlation between the magnetization direction of individual Fe islands and the surrounding Fe ML; dark (bright) Fe islands are always surrounded by a dark (bright) ML. With increasing thickness the magnetic pattern of the Fe islands becomes more and more complex such that, at h ≈ 4.5 nm (see Figure 1.22b), a two-domain state is present. The island in Figure 1.22c exhibits a height h ≈ 7.5 nm, while the corresponding spin-resolved dI/dU map shows the typical pattern of a single vortex state. A diamond state is found on the even higher island shown in Figure 1.22d (h ≈ 8.5 nm).
Thus, the magnetic ground state is expected to be a vortex, as can be understood by the following consideration. If the dimensions of the particles are too small, domains such as those found in macroscopic pieces of magnetic material cannot be formed, because the additional cost of domain wall energy is not compensated by the reduction in stray field energy. On the other hand, a single-domain state in particles of sufficient size would require a relatively high stray field (or dipolar) energy. By exhibiting a vortex configuration, the magnetization curls continuously around the particle center, drastically reducing the stray field energy while avoiding domain wall energy. Yet, the question arises as to the diameter of the vortex core. An earlier investigation conducted by Shinjo et al. [132], using magnetic force microscopy, suggested an upper limit of about 50 nm, set by the intrinsic lateral resolution of that technique, which relies on detection of the stray field. In the following section, it will be seen that an enhanced lateral resolution can be obtained using the technique of SP-STM, as shown by Wachowiak et al. [28, 102].
In order to gain a detailed insight into the magnetic behavior of the vortex core, a zoom into the central region was carried out, where the rotation of the magnetization into the surface normal is expected. Maps of the dI/dU signal measured with different Cr-coated tips that are sensitive to the in-plane and out-of-plane components of the local sample magnetization are shown in Figure 1.23a and b, respectively [28]. The dI/dU signal as measured along a circular path around the vortex core (the circle in Figure 1.23a) exhibits a cosine-like modulation, indicating that the in-plane component of the local sample magnetization curls continuously around the vortex core. Figure 1.23b, which was measured with an out-of-plane sensitive tip on an identically prepared sample, exhibits a small bright area approximately in the center of the island. Therefore, the dI/dU map of Figure 1.23b confirms that the local magnetization in the vortex core is tilted normal to the surface (Figure 1.23c). The line section across the vortex core allows the width to be determined as about 9 nm, a value not restricted by the lateral resolution.
1.8
Individual Atoms on Magnetic Surfaces
Figure 1.23 Magnetic dI/dU maps as measured with (a) an in-plane and (b) an out-of-plane sensitive Cr tip. The curling in-plane magnetization around the vortex core is recognizable in (a), and the perpendicular magnetization of the vortex core is visible as a bright area in (b); (c) Schematic arrangement of a magnetic vortex core. Far from the vortex core the magnetization curls continuously around the center with the orientation in the surface plane. In the center of the core the magnetization is perpendicular to the plane (highlighted). (Reprinted with permission from Ref. [28]; AAAS.)
towards the creation of devices, the functionality of which can be engineered at the
level of individual atomic spins.
The direct observation of the spin polarization state of isolated adatoms remains challenging because isolated atoms have a low magnetic anisotropy energy, which causes their spin to fluctuate over time due to environmental interactions. In the following section, measurements made by Yayon et al. [133] concerning the spin polarization state of individual Fe and Cr adatoms on a metal surface are described. In order to fix the adatom spin in time, the adatoms were deposited onto ferromagnetic Co nanoislands, thereby coupling the adatom spin to the island magnetization through the direct exchange interaction.
Cobalt islands were chosen as a calibrated substrate on which different magnetization states (up ↑ and down ↓ with respect to the surface plane) are easily accessed [122]. The left-hand part of Figure 1.24 shows a representative topograph of Fe adatoms adsorbed onto triangular Co islands on the Cu(111) surface. The spatial oscillations seen on the Cu(111) surface are due to interference of surface state electrons scattered from the adatoms and Co islands [134].
In the right-hand part of Figure 1.24, panel (a) shows a color-scaled spin-polarized dI/dU map, together with topographic contour lines (measured simultaneously), for Fe and Cr atoms codeposited on two Co islands with opposite magnetization. The Fe and Cr atoms can easily be distinguished by their topographic signatures (Cr atoms
protrude 0.07 nm from the island surface, while Fe atoms protrude 0.04 nm). Spin contrast between adatoms sitting on the two islands is seen in line cuts through Fe and Cr atoms (see Figure 1.24b–e). Fe atoms sitting on the ↓ island exhibit a larger dI/dU signal than those on the ↑ island, while Cr atoms on the ↓ island show a smaller dI/dU signal than those on the ↑ island. This confirms the parallel nature of the Fe/Co island spin coupling and the antiparallel nature of the Cr/Co island spin coupling over this energy range. SP-STS thus clearly reveals single-adatom spin contrast: each type of adatom yields a distinct spectrum, and over the energy range of the Co island surface state Fe and Cr adatoms show opposite spin polarization. However, this measurement does not unambiguously determine the direction of the total spin of the adatom, because the total spin is an integral over all filled states, whereas the spectra shown here were recorded over a finite energy range.
For a better understanding of the magnetic coupling between adatoms and islands, a density functional theory (DFT) calculation was also carried out [133]. The adsorption energies of Fe and Cr atoms on a ferromagnetic 2 ML film of Co on Cu(111) were calculated with the adatom moment fixed parallel and antiparallel to the magnetization of the Co film. The resulting values (see Figure 1.25) showed that Fe adatoms preferred a ferromagnetic alignment to the Co film, while Cr adatoms preferred an antiferromagnetic alignment. Comparison with the spin-polarized measurements implied that the Fe and Cr adatoms exhibit a negative spin polarization over the energy range of the Co island surface state.
As discussed in Section 1.5.1, the Cr(001) surface exhibits a topological antiferromagnetic order. By increasing the number of adatoms, however, a small proportion of the Fe atoms on this surface will also exhibit an antiferromagnetic coupling to the underlying Cr(001) terraces [135, 136].
It is known from spin-polarized photoemission experiments that even nonmagnetic atoms such as oxygen (see Ref. [137] for O/Fe(110) and Ref. [138] for O/Co(0001)), sulfur [139] and iodine [140] become polarized upon chemisorption onto ferromagnetic surfaces. For each of these systems, SP-STM allows a deeper insight on the basis of its atomic resolution. For example, it was found that individual oxygen atoms on an Fe DL induce highly anisotropic scattering states of minority spin character only [141]. This spin-dependent electron scattering at the single-impurity level opens the possibility of understanding the origin of magnetoresistance phenomena on the atomic scale.
In the case of Fe islands, it has been reported that magnetic domains can be observed even after the deposition of a sulfur layer [142], which can act as a passivating species. These findings can be understood on the basis of the above discussion, also taking into account the fact that spin-polarized electrons from the interface with binding energies near the Fermi level are not fully damped, but rather exhibit an attenuation length of at least several monolayers. Additionally, this mean free path is spin-dependent [143], such that an appropriate adsorbate layer may allow SP-STM to be extended to operate even under ambient conditions.
1.9
Domain Walls
The motion of domain walls is often hindered by lattice defects, leading to Barkhausen jumps in magnetic hysteresis curves. By using a high lateral resolution, the effective pinning of domain walls by screw and edge dislocations was first presented by Krause et al. [144] for Dy films on W(110).
Here, we will describe the details of two further aspects concerning domain walls which require an effective lateral resolution of SP-STM. First, we report on the behavior of domain walls in external magnetic fields, as investigated by Kubetzka et al. [103]; second, we will outline details of the widths of domain walls in nanoscale systems, referring to the studies conducted by Pratzer et al. [107].
The formation and stability of 360° domain walls plays a crucial role in the remagnetization processes of thin ferromagnetic films, with possible implications for the performance and development of magnetoresistive and magnetic random access memory (MRAM) devices. These walls are formed in external fields applied along the easy direction of the magnetic material, when pairs of 180° walls with the same sense of rotation are forced together. Their stability against remagnetization into the uniform state is a manifestation of a hard-axis anisotropy perpendicular to the rotational plane of the wall. This anisotropy may be of crystalline origin or, in films with an in-plane magnetic easy direction, due to the shape anisotropy.
Figure 1.27 (a) Topographic and (b) spin-resolved dI/dU image showing the in-plane magnetic domain structure of 1.25 ML Fe/W(110). Several ML and DL domain walls can be recognized in the higher-magnified inset; (c) Line sections showing domain wall profiles of the ML
The inset of Figure 1.27b presents this location at higher magnification. Averaged line sections drawn along the white lines across domain walls in the ML and the DL are plotted in Figure 1.27c (lower and upper, respectively); clearly, the ML domain wall is much narrower than the DL wall. The inset of Figure 1.27c shows the data in the vicinity of the ML domain wall in greater detail, and reveals a domain wall width w < 1 nm. In order to allow a more quantitative discussion, the measured data have been fitted with a theoretical tanh function of a 180° wall profile [145]. This can be extended to an arbitrary angle φ between the magnetization axes of tip and sample [107, 146] by
\[ y(x) = y_0 + y_{\mathrm{sp}} \cos\!\left[ \arccos\!\left( \tanh\frac{x - x_0}{w/2} \right) + \varphi \right] \tag{1.14} \]
where y(x) is the dI/dU signal measured at position x, x_0 is the position of the domain wall, w is the wall width, and y_0 and y_sp are the spin-averaged and spin-polarized dI/dU signals, respectively. For the Fe-coated tip, which exhibits in-plane sensitivity, it is known that φ_DL = π/2 and φ_ML = 0. The best fit to the wall profile of the DL is achieved with w_DL = 3.8 ± 0.2 nm. The profile of the ML domain wall is much narrower: if the fit procedure is performed over the full length of the line section, w_ML = 0.50 ± 0.26 nm, whereas w_ML = 0.66 ± 0.18 nm is found if the fit is applied to the data in the inset of Figure 1.27c. This confirms the result of the analysis of the magnetization curves, that is, an almost atomically sharp domain wall.
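In practice, such a fit can be performed with any standard least-squares routine. The sketch below applies Equation 1.14 to synthetic data standing in for a measured line section (SciPy's curve_fit; all numerical values are illustrative):

```python
import numpy as np
from scipy.optimize import curve_fit

# Wall profile of Eq. 1.14: y(x) = y0 + ysp*cos(arccos(tanh((x-x0)/(w/2))) + phi)
def wall(x, x0, w, y0, ysp, phi):
    return y0 + ysp * np.cos(np.arccos(np.tanh((x - x0) / (w / 2))) + phi)

x = np.linspace(-15, 15, 301)                       # position (nm)
rng = np.random.default_rng(0)
# Synthetic "DL-like" data: w = 3.8 nm, phi = pi/2, plus noise
y = wall(x, 0.0, 3.8, 1.0, 0.2, np.pi / 2) + rng.normal(0, 0.005, x.size)

p0 = (0.0, 3.0, 1.0, 0.2, np.pi / 2)                # initial guess
popt, pcov = curve_fit(wall, x, y, p0=p0)
print(f"fitted wall width w = {popt[1]:.2f} nm")    # ~3.8 nm, cf. w_DL
```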
A domain wall width of only six to eight atomic rows was also observed for an antiferromagnetic Fe monolayer on W(001) [147]. Such a narrow domain wall width can, in theory, be understood to arise from band structure effects, also taking into account intra-atomic noncollinear magnetism [148].
With regard to noncollinear effects, it is important to distinguish between inter-atomic and intra-atomic magnetism. The first type is well known, and has been observed experimentally for small magnetic clusters [149] and in magnetic layers [150]; it has also been described on a theoretical basis [56, 151–157]. Inter-atomic noncollinear magnetism can be understood within the Heisenberg model, taking into account atomic magnetic moments which are nonparallel for different atoms. Intra-atomic noncollinear effects arise when the tunneling current flows through orbitals of the same atom, while the spin density directly at the tip apex possesses a noncollinear orientation [55].
1.10
Chiral Magnetic Order
Due to the inversion symmetry of bulk crystals, homochiral spin structures cannot exist there. However, as low-dimensional systems lack structural inversion symmetry, such single-handed spin arrangements may occur due to the Dzyaloshinskii-Moriya interaction [158, 159], arising from the spin-orbit scattering of electrons in an inversion-asymmetric crystal field. This is now discussed with reference to the studies of Bode et al. [160], which were carried out on the same system, Mn/W(110), for which atomic resolution of the magnetic properties had been demonstrated [96] (see Section 1.5.2).
In metallic itinerant magnets, spin-polarized electrons of the valence band hop across the lattice and mediate the Heisenberg exchange interaction between magnetic spin moments S located on atomic sites i and j:

\[ E_{\mathrm{exch}} = -\sum_{i,j} J_{ij}\, \vec{S}_i \cdot \vec{S}_j \tag{1.15} \]

In the presence of spin-orbit coupling and broken inversion symmetry, the Dzyaloshinskii-Moriya interaction adds an antisymmetric term:

\[ E_{\mathrm{DM}} = \sum_{i,j} \vec{D}_{ij} \cdot \left( \vec{S}_i \times \vec{S}_j \right) \tag{1.16} \]
(C < 0). In fact, J, D and the anisotropy constants K create a parameter space containing magnetic structures of unprecedented complexity [161], including 2-D and 3-D cycloidal, helicoidal or toroidal spin structures, or even vortices.
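The competition between Equations 1.15 and 1.16 can be made explicit for a one-dimensional chain in which neighboring moments are rotated by a constant angle φ: the exchange energy per bond is then −J cos φ and the DM energy −D sin φ, so any nonzero D moves the minimum away from the collinear state and selects one sense of rotation. A minimal sketch with illustrative J and D values:

```python
import numpy as np

# Energy per bond of a homogeneous spin spiral in which neighboring
# spins are rotated by a constant angle phi:
#   E(phi) = -J*cos(phi) - D*sin(phi)
# (Heisenberg term from Eq. 1.15, DM term from Eq. 1.16, with D the
# component of D_ij normal to the rotation plane). J, D are illustrative.
J, D = 1.0, 0.2
phi = np.linspace(-np.pi, np.pi, 100001)
E = -J * np.cos(phi) - D * np.sin(phi)

phi_min = phi[np.argmin(E)]
print(f"optimal rotation angle: {np.degrees(phi_min):.1f} deg")
# Analytic minimum: arctan(D/J) ~ 11.3 deg; the sign of D selects one
# sense of rotation (chirality), while D = 0 gives the collinear state.
```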
Figure 1.28a shows the topography of 0.77 atomic layers of Mn grown on a W(110) substrate [160]. The magnetic structure can be directly imaged with SP-STM using magnetically coated W tips. Figure 1.28b shows a high spatial resolution constant-current image measured on the atomically flat Mn layer using a Cr-coated probe tip which is sensitive to the in-plane magnetization [28]. The SP-STM data reveal periodic stripes running along the [001] direction, with an inter-stripe distance matching the surface lattice constant along the [11̄0] direction (this was discussed earlier in Section 1.5.2). The additional important observation is, however, that the line section in the lower panel of Figure 1.28b, representing the magnetic amplitude, is not constant but rather modulated, with a period of about 6 nm. Further, the magnetic corrugation is not simply a symmetric modulation superimposed on a
magnetic contrast at this lateral position in zero field is observed, indicating large in-plane components of the sample magnetization here. This is also corroborated by the line section which, in agreement with the in-plane-sensitive measurements of Figure 1.28b, shows a high magnetic corrugation at the maximum of the spin-averaged long-wave modulation. With an increasing external field, the position of high magnetic corrugation shifts to the left (see Figure 1.29b) until a node reaches the adsorbate at 2 T (Figure 1.29c). The line sections reveal that the magnetic field shifts the position of high magnetic corrugation, but leaves the long-wave spin-averaged modulation unaffected. At 2 T, that is, with an almost perfectly out-of-plane magnetized tip, a maximum magnetic contrast is achieved and the spin-averaged signal exhibits a minimum (see line section of Figure 1.29c). Although this observation rules out a spin density wave (SDW), it provides clear proof of a spin spiral with magnetic moments rotating from in-plane (imaged in Figure 1.29a) to out-of-plane (imaged in Figure 1.29c).
The islands exhibit a spin spiral of only one chirality, as would be expected for a Dzyaloshinskii-Moriya interaction-driven magnetic configuration. The azimuthal orientation of the tip magnetization, however, cannot be reliably controlled, and consequently it is not possible to test experimentally whether the observed spin spiral is helical or cycloidal.
References
1 Binnig, G. and Rohrer, H. (1982) Helvetica
Physica Acta, 55, 726.
2 Binnig, G. and Rohrer, H. (1987) Reviews
of Modern Physics, 59, 615.
3 Pierce, D.T. (1988) Physica Scripta, 38,
291.
4 Minakov, A.A. (1990) Surface Science, 236,
L377.
5 Allenspach, R. and Bischof, A. (1988)
Applied Physics Letters, 54, 587.
6 Wiesendanger, R., Güntherodt, H.-J., Güntherodt, G., Gambino, R.J. and Ruf, R. (1990) Physical Review Letters, 65, 247.
7 Getzlaff, M. (2007) Fundamentals of
Magnetism, Springer, Berlin.
8 Julliere, M. (1975) Physics Letters A, 54,
225.
9 Miyazaki, T. and Tezuka, N. (1995) Journal
of Magnetism and Magnetic Materials, 139,
L231.
10 Ding, H.F., Wulfhekel, W., Henk, J.,
Bruno, P. and Kirschner, J. (2003) Physical
Review Letters, 90, 116603.
11 Wiesendanger, R., Bürgler, D., Tarrach, G., Schaub, T., Hartmann, U., Güntherodt, H.-J., Shvets, I.V. and Coey, J.M.D. (1991) Applied Physics A: Materials Science and Processing, 53, 349.
12 Wiesendanger, R., Bürgler, D., Tarrach, G., Wadas, A., Brodbeck, D., Güntherodt, H.-J., Güntherodt, G., Gambino, R.J. and Ruf, R. (1991) Journal of Vacuum Science and Technology B, 9, 519.
13 Wiesendanger, R., Bode, M., Kleiber, M., Löhndorf, M., Pascal, R., Wadas, A. and Weiss, D. (1997) Journal of Vacuum Science and Technology B, 15, 1330.
14 Rastei, M.V. and Bucher, J.P. (2006)
Journal of Physics: Condensed Matter, 18,
L619.
15 Wadas, A. and Hug, H.J. (1992) Journal of
Applied Physics, 72, 203.
16 Kubetzka, A., Bode, M. and
Wiesendanger, R. (2007) Applied Physics
Letters, 91, 012508.
17 Wulfhekel, W. and Kirschner, J. (1999)
Applied Physics Letters, 75, 1944.
18 Wulfhekel, W., Ding, H.F. and
Kirschner, J. (2000) Journal of Applied
Physics, 87, 6475.
19 Wulfhekel, W., Ding, H.F., Lutzke, W.,
Steierl, G., Vazquez, M., Marin, P.,
Hernando, A. and Kirschner, J. (2001)
Applied Physics A: Materials Science and
Processing, 72, 463.
20 Ding, H.F., Wulfhekel, W., Chen, C.,
Barthel, J. and Kirschner, J. (2001)
Materials Science and Engineering B, 84, 96.
21 Ding, H.F., Wulfhekel, W. and
Kirschner, J. (2002) Europhysics Letters, 57,
100.
22 Wulfhekel, W., Hertel, R., Ding, H.F.,
Steierl, G. and Kirschner, J. (2002) Journal
of Magnetism and Magnetic Materials, 249,
368.
23 Ding, H.F., Wulfhekel, W., Schlickum, U.
and Kirschner, J. (2003) Europhysics
Letters, 63, 419.
24 Bode, M. (2003) Reports on Progress in
Physics, 66, 523.
25 Schlickum, U., Wulfhekel, W. and
Kirschner, J. (2003) Applied Physics Letters,
83, 2016.
26 Schlickum, U., Janke-Gilman, N., Wulfhekel, W. and Kirschner, J. (2004) Physical Review Letters, 92, 107203.
27 Bode, M., Getzlaff, M. and
Wiesendanger, R. (1998) Physical Review
Letters, 81, 4256.
28 Wachowiak, A., Wiebe, J., Bode, M.,
Pietzsch, O., Morgenstern, M. and
Wiesendanger, R. (2002) Science, 298,
577.
29 Pietzsch, O., Kubetzka, A., Bode, M. and
Wiesendanger, R. (2000) Physical Review
Letters, 84, 5212.
30 Kubetzka, A., Bode, M., Pietzsch, O. and
Wiesendanger, R. (2002) Physical Review
Letters, 88, 057201.
31 Allenspach, R., Stampanoni, M. and
Bischof, A. (1990) Physical Review Letters,
65, 3344.
32 Pütter, S., Ding, H.F., Millev, Y.T., Oepen, H.P. and Kirschner, J. (2001) Physical Review B - Condensed Matter, 64, 092409.
136 Bode, M., Ravlic, R., Kleiber, M. and
Wiesendanger, R. (2005) Applied Physics
A: Materials Science and Processing, 80,
907.
137 Getzlaff, M., Bansmann, J. and Schönhense, G. (1999) Journal of Magnetism and Magnetic Materials, 192, 458.
138 Getzlaff, M., Bansmann, J. and Schönhense, G. (1996) Journal of Electron Spectroscopy and Related Phenomena, 77, 197.
139 Getzlaff, M., Westphal, C., Bansmann, J. and Schönhense, G. (2000) Journal of Electron Spectroscopy and Related Phenomena, 107, 293.
140 Getzlaff, M., Bansmann, J. and Schönhense, G. (1993) Physics Letters A, 182, 153.
141 von Bergmann, K., Bode, M., Kubetzka, A., Heide, M., Blügel, S. and Wiesendanger, R. (2004) Physical Review Letters, 92, 046801.
142 Berbil-Bautista, L., Krause, S., Hanke, T.,
Bode, M. and Wiesendanger, R. (2006)
Surface Science, 600, L20.
143 Getzlaff, M., Bansmann, J. and Schönhense, G. (1993) Solid State Communications, 87, 467.
144 Krause, S., Berbil-Bautista, L., Hanke, T.,
Vonau, F., Bode, M. and Wiesendanger, R.
(2006) Europhysics Letters, 76, 637.
145 Hubert, A. and Schäfer, R. (1998) Magnetic Domains, Springer, Berlin.
146 Kubetzka, A., Pietzsch, O., Bode, M.,
Ravlic, R. and Wiesendanger, R. (2003)
Acta Physica Polonica B, 104,
259.
147 Bode, M., Vedmedenko, E.Y., von
Bergmann, K., Kubetzka, A., Ferriani, P.,
Heinze, S. and Wiesendanger, R. (2006)
Nature Materials, 5, 477.
2
Nanoscale Imaging and Force Analysis with
Atomic Force Microscopy
Hendrik Hölscher, André Schirmeisen, and Harald Fuchs
2.1
Principles of Atomic Force Microscopy
2.1.1
Basic Concept
The direct measurement of the force interaction between distinct molecules has been a challenge for scientists for many years. In fact, only very recently was it demonstrated that these forces can be determined for a single atomic bond, by using the powerful technique of atomic force microscopy (AFM). But how is it possible to measure interatomic forces, which may be as small as one billionth of a newton? The answer to this question is surprisingly simple: it is the same mechanical principle used by a pair of kitchen scales, where a spring with a defined elasticity is elongated or compressed due to the weight of the object to be measured. The compression Δz of the spring (with spring constant c_z) is a direct measure of the force F exerted, which in the regime of elastic deformation obeys Hooke's law:

\[ F = c_z\, \Delta z \tag{2.1} \]
The only difference from the kitchen scale is the sensitivity of the measurement. In AFM, the spring is a bendable cantilever with a stiffness of 0.01 N m⁻¹ to 10 N m⁻¹. As interatomic forces are in the range of some nN, the cantilever will be deflected by 0.01 nm to 100 nm. Consequently, the precise detection of the cantilever bending is the key feature of an atomic force microscope. If a sufficiently sharp tip is directly attached to the cantilever, it is possible to measure the interacting forces between the last atoms of the tip and the sample through the bending of the cantilever.
In 1986, Binnig, Quate and Gerber presented exactly this concept for the first atomic force microscope [1]. These authors measured the deflection of a cantilever with sub-ångström precision by a scanning tunneling microscope [2] and used a gold foil as the spring. The tip was a piece of diamond glued to this home-made cantilever
Figure 2.1 (a) The basic concept of the first atomic force
microscope built in 1986 by Binnig, Quate and Gerber. A sharp
diamond tip glued to a gold foil scanned the surface, while the
bending of the cantilever was detected with scanning tunneling
microscopy; (b) The ultimate goal was to measure the force
between the front atom of the tip and a specific sample atom.
(Reproduced from Ref. [1].)
(see Figure 2.1), and by using this set-up the group was able to image sample
topographies down to the nanometer scale.
2.1.2
Current Experimental Set-Ups
During the past few years the experimental set-up has been modied while AFM has
become a widespread research tool. Some 20 years after its invention, the commercial
atomic force microscope is available from a variety of manufacturers. Although most
of these instruments are designed for specic applications and environments, they
are typically based on the following types of sensors detection method and scanning
principles.
2.1.2.1 Sensors
Cantilevers are produced by standard microfabrication techniques, mostly from silicon and silicon nitride, as rectangular or V-shaped cantilevers. The spring constants and resonance frequencies of cantilevers depend on the actual mode of operation; for contact AFM measurements these are about 0.01 to 1 N m⁻¹ and 5–100 kHz, respectively. In a typical atomic force microscope, cantilever deflections ranging from 0.1 Å to a few micrometers are measured, which corresponds to a force sensitivity ranging from 10⁻¹³ N to 10⁻⁵ N.
Figure 2.2 shows two scanning electron microscopy (SEM) images of a typical rectangular silicon cantilever. When using this imaging technique, the length (l), width (w) and thickness (t) can be precisely measured. The spring constant c_z of the cantilever can then be determined from these values [3]:

\[ c_z = \frac{E_{\mathrm{Si}}}{4} \frac{w t^3}{l^3} \tag{2.2} \]

where E_Si = 1.69 × 10¹¹ N m⁻² is the Young's modulus of silicon. The typical dimensions of silicon cantilevers are as follows: lengths of 100–300 μm; widths of 10–30 μm; and thicknesses of 0.3–5 μm.
The torsion of the cantilever due to lateral forces between tip and surface depends also on the height of the tip, h. The torsional spring constant can be calculated from [3]

\[ c_{\mathrm{tor}} = \frac{G}{3} \frac{w t^3}{l h^2} \tag{2.3} \]

where G is the shear modulus.
The formulas presented above are only valid for rectangular cantilevers. Equations
for V-shaped cantilevers can be found in Refs. [6, 7].
2.1.2.2 Detection Methods
In addition to changes in the cantilevers, the detection methods used to measure their minute bending have also been improved. Today, commercial AFM instruments mostly use the so-called laser beam deflection scheme shown in Figure 2.3. The bending and torsion of the cantilever can be detected by a laser beam reflected from its reverse side [8, 9], with the reflected laser spot detected by a sectioned photo-diode whose different parts are read out separately. Normally, a four-quadrant diode is used, in order to detect the normal as well as the torsional movements of the cantilever. With the cantilever at equilibrium, the spot is adjusted such that the upper and lower sections show the same intensity. Then, if the cantilever bends up or down, the spot moves, and the difference signal between the upper and lower sections provides a measure of the bending.
The sensitivity can be improved by interferometer systems, as adapted by several research groups (see Refs [10–13]). It is also possible to use cantilevers with integrated deflection sensors based on piezoresistive films [14–16]. As no optical parts are required in the experimental set-up of an atomic force microscope when using these cantilevers, the design can be very compact [17]. An extensive comparison of the different detection methods available can be found in Ref. [4].
2.1.2.3 Scanning and Feedback System
As the surface is scanned, the deflection of the cantilever is kept constant by a feedback system that controls the vertical movement of the scanner (shown schematically in Figure 2.3). The system functions as follows: (i) the current signal of the photo-diode is compared with a preset value; (ii) the feedback system, which includes a proportional, integral and differential (PID) controller, then varies the z-movement of the scanner to minimize the difference. As a consequence, the tip-sample force is kept practically constant for an optimal set-up of the PID parameters.
While the cantilever is moved relative to the sample in the xy-plane of the surface by a piezoelectric scanner (see Figure 2.4), the current z-position of the scanner is recorded as a function of the lateral xy-position with (ideally) sub-ångström precision. The obtained data represent a map of equal forces, which is then analyzed and visualized by computer processing.
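The feedback principle can be sketched with a toy model in which a discrete PID controller adjusts the scanner z-position so that a simulated deflection follows the setpoint while the tip crosses a step edge (the gains and the deflection model are illustrative):

```python
# Toy model of the z-feedback loop: a PID controller keeps the
# cantilever deflection at the setpoint; the recorded topography is
# the scanner z-position at each pixel.
def deflection(z_scanner, surface_height):
    # crude contact-mode model: deflection grows as the tip is pushed in
    return max(0.0, surface_height - z_scanner)

setpoint, z = 1.0, 5.0             # target deflection (nm), scanner z (nm)
Kp, Ki, Kd = 0.4, 0.2, 0.05        # PID gains (assumed values)
integral, prev_err = 0.0, 0.0

for step in range(200):            # scan crossing a step edge at pixel 100
    surface = 6.0 if step < 100 else 8.0
    err = deflection(z, surface) - setpoint
    integral += err
    derivative = err - prev_err
    z += Kp * err + Ki * integral + Kd * derivative  # retract or approach
    prev_err = err

print(f"final z = {z:.2f} nm (expected ~ surface 8.0 - setpoint 1.0 = 7.0)")
```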
The principle is illustrated by a simple laboratory-class experiment, the imaging of a test grid, shown in Figure 2.5. The comparison between the topography and the error signal shows that the PID controller needs some time at the step edges to correct the actual deflection error.
2.1.3
Tip-Sample Forces in Atomic Force Microscopy
A large variety of sample properties related to tip-sample forces can be detected using the
atomic force microscope. The obtained contrast depends on the operational mode and
the actual tip-sample interactions. Before discussing details of the operational modes of AFM, however, we must first specify the most important tip-sample interactions.
Figure 2.6 shows the typical shape of the interaction force curve that the tip senses
during an approach towards the sample surface. Upon approach of the tip towards the
sample, the attractive (negative) forces, which represent, for example, van der Waals or electrostatic interaction forces, increase until a minimum of the force curve is reached. This turn-around point is due to the onset of repulsive forces, caused by Pauli repulsion, which start to dominate upon further approach. Eventually, the tip is pushed into the sample surface and elastic deformation will occur.
In general, the effective tipsample interaction force is a sum of different force
contributions, and these can be roughly divided into attractive and repulsive components. The most important forces are summarized as follows.
2.1.3.1 Van der Waals Forces
These forces are caused by fluctuating induced electric dipoles in atoms and molecules. The distance dependence of this force for two distinct molecules follows 1/z⁷. For simplicity, solid bodies are often assumed to consist of many independent, noninteracting molecules, and the van der Waals forces of these bodies are obtained by simple summation. For example, for a sphere over a flat surface the van der Waals force is given by

\[ F_{\mathrm{vdW}}(z) = -\frac{A_H R}{6 z^2} \tag{2.5} \]

where A_H is the Hamaker constant and R the radius of the sphere.
empirical potentials are mostly used to allow an easy and fast calculation. One well-known model is the Lennard-Jones potential, which combines short-range repulsive interactions with long-range attractive van der Waals interactions:

\[ V_{\mathrm{LJ}}(z) = E_0 \left[ \left( \frac{r_0}{z} \right)^{12} - 2 \left( \frac{r_0}{z} \right)^{6} \right] \tag{2.6} \]

where E_0 is the bonding energy and r_0 the equilibrium distance. In this case, the repulsion is described by an inverse power law with n = 12. The term with n = 6 describes the attractive van der Waals potential between two atoms/molecules.
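The corresponding tip-sample force follows by differentiating Equation 2.6; writing this out makes explicit that the force vanishes at the equilibrium distance r_0:

\[ F_{\mathrm{LJ}}(z) = -\frac{\partial V_{\mathrm{LJ}}}{\partial z} = \frac{12\, E_0}{r_0} \left[ \left( \frac{r_0}{z} \right)^{13} - \left( \frac{r_0}{z} \right)^{7} \right], \qquad F_{\mathrm{LJ}}(r_0) = 0 . \]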
2.1.3.4 Elastic Forces
Elastic forces and deformations can occur if the tip is in contact with the sample. As this deformation affects the effective contact area, knowledge of the elastic forces and the corresponding deformation mechanics of the contact is an important issue in AFM. The repulsive forces that occur during the elastic indentation of a sphere into a flat surface were analyzed as early as 1881 by H. Hertz (see Refs [20, 21]):

\[ F_{\mathrm{Hertz}}(z) = \frac{4}{3} E^* \sqrt{R}\, (z_0 - z)^{3/2} \quad \text{for } z \le z_0 \tag{2.7} \]

where the effective elastic modulus E*, defined by

\[ \frac{1}{E^*} = \frac{1 - \mu_t^2}{E_t} + \frac{1 - \mu_s^2}{E_s} \tag{2.8} \]

depends on the Young's moduli E_{t,s} and the Poisson ratios μ_{t,s} of the tip and surface, respectively. Here, R is the tip radius and z_0 is the point of contact.
This model does not include adhesion forces, however, which must be considered at the nanometer scale. Two extreme cases were analyzed by Johnson et al. [22] and Derjaguin et al. [23]. The model of Johnson, Kendall and Roberts (the JKR model) considers only the adhesion forces inside the contact area, whereas the model of Derjaguin, Muller and Toporov (the DMT model) includes only the adhesion outside the contact area. Various models analyzing the contact mechanics in the intermediate regime have been suggested by other authors (see Ref. [24] for a recent overview).
However, in many practical cases it is sufficient to assume that the geometric shape of the tip and sample does not change until contact has been established at z = z_0, and that afterwards the tip-sample forces are given by the DMT-M theory, denoting Maugis' approximation to the earlier DMT model [24]. In this approach, an offset F_vdW(z_0) is added to the well-known Hertz model, which accounts for the adhesion force between tip and sample surface. Therefore, the DMT-M model is often also referred to as the Hertz-plus-offset model [24]. The resulting overall force law is given by
8
AH R
>
>
for z z0 ;
< 2
6z
2:9
F DMT-M z 4 p
>
3=2 AH R
>
for z < z0 :
: E Rz0 z 2
3
6z0
Figure 2.6 shows the resulting tipsample force curve for the DMT-M model.
The following parameters were used, representing typical values for AFM measurements under ambient conditions: AH 0.2 aJ; R 10 nm; z0 0.3 nm; mt ms 0.3;
Et 130 GPa; and Es 1 GPa.
2.1.3.5 Frictional Forces
Frictional forces counteract the movement of the tip during the scan process, and
dissipate the kinetic energy of the moving tipsample contact into the surface or tip
material. This can be due to permanent changes in the surface itself, by scratching or
indenting, or also by the excitation of lattice vibration (i.e. phonons) in the material.
2.1.3.6 Chemical Binding Forces
Chemical binding forces arise from the overlap of molecular orbitals, due to specic
bonding states between the tip and the surface molecules. These forces are extremely
short-ranged, and can be exploited to achieve atomic resolution imaging of surfaces.
As these forces are also specic to the chemical identity of the molecules, it is
conceivable to identify the chemical character of the surface atoms with AFM scans.
2.1.3.7 Magnetic and Electrostatic Forces
These forces are of long-range character and might be either attractive or repulsive;
they are usually measured when the tip is not in contact with the surface (i.e.
noncontact mode). For magnetic forces, magnetic materials must be used for tip or
tip coating. Well-dened electrical potentials between tip and sample are necessary
for the measurement of electrostatic forces.
More detailed information on the intermolecular and surface forces relevant for
AFM measurements can be found in the monographs of Israelachvili [18] and
Sarid [25]. Details of the most important forces are summarized in Figure 2.7.
Although, in principle, every type of force can be measured using the atomic force
microscope, the actual sensitivity to a specic force depends on the mode of
operation. Hence, the most important modes are introduced in the next section.
j57
58
2.2
Modes of Operation
The contact mode, which historically is the oldest, is used frequently to obtain
nanometer-resolution images on a wide variety of surfaces. This technique also has
the advantage that not only the deection, but also the torsion of the cantilever, can be
measured. As shown by Mate et al. [26], the lateral force can be directly correlated to
the friction between tip and sample, thus extending AFM to friction force microscopy
(FFM).
Some typical applications of an atomic force microscope driven in contact-mode
are shown in Figure 2.8a and b. Here, the images represent a measurement of a L-a-
cz <
qF ts z
:
qz
2:10
repulsive
In this case the attractive forces can no longer be sustained by the cantilever and the
tip jumps towards the sample surface [29].
This effect has a strong inuence on static-mode AFM measurements, as exemplied by a typical force-versus-distance curve shown in Figure 2.9. Here, the force
snap-in
retraction
0
attractive
approach
snap-out
Fadh
piezo movement
Figure 2.9 A schematic of a typical force versus distance curve
obtained in static mode. The cantilever is approached towards the
sample surface. Due to strong attractive forces it jumps (snap-in)
towards the sample surface at a specific position. During
retraction, the tip is strongly attracted by the surface and the snapout point is considerably behind the snap-in point. This results
in an hysteresis between approach and retraction.
j59
60
acting on the tip recorded during an approach and retraction movement of the
cantilever is depicted. Upon approach of the cantilever towards the sample, the
attractive forces acting on the tip bend the cantilever towards the sample surface. At a
specic point close to the sample surface these forces can be no longer sustained by
the cantilever spring, and the tip jumps towards the sample surface. Now, the tip and
sample are in direct mechanical contact, and a further approach towards the sample
surface pushes the tip into the sample. As the spring constant of the cantilever usually
is much softer than the elasticity of the sample, the bending of the cantilever
increases almost linearly.
If the cantilever is now retracted from the surface, the tip stays in contact with
the sample because it is strongly attracted by the sample due to adhesive forces,
and the force Fadh is necessary to disconnect the tip from the surface. The snapout point is always at a larger distance from the surface than the snap-in, and
this results in an hysteresis between the approach and retraction of the cantilever.
This phenomenon of mechanical instability is often referred to as the jump-tocontact. Unfortunately, this sudden jump can lead to undesired changes of the tip
and/or sample.
2.2.2
Dynamic Modes
Despite the success of contact-mode AFM, the resolution was found to be limited in
many cases (in particular for soft samples) by lateral forces acting between tip and
sample. In order to avoid this effect, the cantilever can be oscillated in a vertical
direction near the sample surface. AFM imaging with vibrating cantilever is often
denoted as dynamic force microscopy (DFM).
The historically oldest scheme of cantilever excitation in DFM imaging is the
external driving of the cantilever at a xed excitation frequency exactly at or very close to
the cantilevers rst resonance [3032]. For this driving mechanism, different
detection schemes measuring either the change of the oscillation amplitude or the
phase shift were proposed. Over the years, the amplitude modulation (AM) or
tapping mode, where the oscillation amplitude is used as a measure of the
tipsample distance, has developed into the most widespread technique for imaging
under ambient conditions and liquids.
In a vacuum, any external oscillation of the cantilever is disadvantageous. Standard
AFM cantilevers constructed from silicon exhibit very high Q-values in vacuum,
which results in very long response times of the system. Consequently, in 1991
Albrecht et al. [33] introduced the frequency modulation (FM) mode, which works
well for high-Q systems and subsequently has developed into the dominant driving
scheme for high-resolution DFM experiments in ultra-high vacuum (UHV) [3437].
In contrast to the AM mode, this approach features a so-called self-driven oscillator [38, 39] which, when placed in a closed-loop set-up (active feedback), uses the
cantilever deection itself as the driving signal, thus ensuring that the cantilever
instantaneously adapts to changes in the resonance frequency. These two driving
mechanisms are discussed in more detail in the following section.
2.3
Amplitude Modulation (Tapping Mode)
2.3.1
Experimental Set-Up of AM-Atomic Force Microscopy
As an alternative to the contact mode, the cantilever can be excited to vibrate near its
resonant frequency close to the sample surface. Under the inuence of tipsample
forces the resonant frequency (and consequently also the amplitude and phase) of the
cantilever will change and serve as the measurement parameters. This is known as
the dynamic mode. If the tip is approached towards the surface, the oscillation
parameters of amplitude and phase are inuenced by the tipsurface interaction, and
can therefore be used as feedback channels. A certain set-point (e.g. the amplitude) is
given, whereby the feedback loop will adjust the tipsample distance so that the
amplitude remains constant. The controller parameter is recorded as a function of
the lateral position of the tip with respect to the sample, and the scanned image
essentially represents the surface topography.
The technical realization of dynamic-mode AFM is based on the same key
components as a static AFM set-up. A sketch of the experimental set-up of an atomic
force microscope driven in AM mode is shown in Figure 2.10.
electronics with
lockin amplifier
photo-diode
laser
phase
z
piezo
D+2A
aexc
function
generator
d
D
amplitude
amplitude
sample
PID
setpoint
xyzscanner
Figure 2.10 Set-up of a dynamic force
microscope operated in AM or tapping mode. A
laser beam is deflected by the reverse side of the
cantilever, with the deflection being detected by a
split photo-diode. The cantilever vibration is
caused by an external frequency generator
j61
62
The deection of the cantilever is typically measured with the laser beam deection
method, as indicated [8, 9], but other displacement sensors such as interferometric
sensors [12, 13, 30, 40] have also been applied. During operation in conventional
tapping mode, the cantilever is driven at a xed frequency with a constant excitation
amplitude from an external function generator, while the resulting oscillation
amplitude and/or the phase shift are detected by a lock-in amplier. The function
generator supplies not only the signal for the dither piezo; its signal serves
simultaneously as a reference for the lock-in amplier.
This set-up can be operated both in air and in liquids. A typical image obtained with
this experimental set-up in ambient conditions is shown in Figure 2.11. For a direct
comparison with the static mode, the sample is also DPPC-adsorbed onto a mica
substrate. In contact mode the frictional forces are measured simultaneously with the
topography, whereas in dynamic mode the phase between excitation and oscillation is
acquired as an additional channel. The phase image provides information about the
different material properties of DPPC and the mica substrate. It can be shown, that
the phase signal is closely related to the energy dissipated in the tipsample
contact [4143].
Due to its technical relevance the investigation of polymers has been the focal point
of many studies (see Ref. [44] for a recent review). High-resolution imaging has been
extensively performed in the area of materials science; for example, by using specic
tips with additionally grown sharp spikes, Klinov et al. [45] obtained true molecular
resolution on a polydiacetylene crystal.
Imaging in liquids opens up an avenue for the investigation of biological samples
in their natural environment. For example, Moller et al. [46] have obtained highresolution images of the topography of the hexagonally packed intermediate (HPI)
layer of Deinococcus radiodurans, using tapping-mode AFM. A typical example of the
imaging of DNA in liquid solution is shown in Figure 2.12.
2.3.1.1 Theory of AM-AFM
Based on the above description of the experimental set-up, it is possible to formulate
the basic equation of motion describing the cantilever dynamics of AM-AFM:
mzt
2pf 0 m
_ c z ztd ad c z cos2pf d t F ts zt; zt
_ :
zt
|{z}
|{z}
Q0
external driving force
tip-sample force
2:11
p
_ is the position of the tip at the time t; cz, m and f 0 c z =m=2p are
Here, zt
the spring constant, the effective mass, and the eigenfrequency of the cantilever,
respectively. As a small simplication, it is assumed that the quality factor Q0
combines the intrinsic damping of the cantilever and all inuences from surrounding media, such as air or liquid (if present) in a single value. The equilibrium position
of the tip is denoted as d. The rst term on the right-hand side of the equation
represents the external driving force of the cantilever by the frequency generator. It is
modulated with the constant excitation amplitude ad at a xed frequency fd. The
(nonlinear) tipsample interaction force Fts is introduced by the second term.
Before discussing the solutions of this equation, some words of caution should be
added with regards to the universality of the equation of motion and the various
solutions discussed below. Equation 2.11 disregards two effects, which might
become important under certain circumstances. First, we describe the cantilever
by a spring-mass-model and neglect in this way the higher modes of the cantilever.
j63
64
This is justied in most cases, as the rst eigenfrequency is by far the most dominant
in typical AM-AFM experiments (see Refs [41, 4750]). Second, we assume in our
model equation of motion that the dither piezo applies a sinusoidal force to the
spring, but do not consider that the movement of the dither piezo simultaneously
also changes the effective position of the tip at the cantilever end by aexc(t)
ad cos(2pfdt) [47, 51, 52]. This effect becomes important when ad is in the range of
the cantilever oscillation amplitude.
In a rst step, we assume that the cantilever vibrates far away from the sample
surface. Consequently, we can neglect tipsample forces (Fts 0), resulting in the
well-known equation of motion of a driven damped harmonic oscillator.
After some time the external driving amplitude forces the cantilever to oscillate
exactly at the driving frequency fd. Therefore, the steady-state solution is given by the
ansatz
zt 0 d A cos2pf d t f;
2:12
where f is the phase difference between the excitation and the oscillation of
the cantilever. With this, we obtain two functions for the amplitude and phase
curves:
ad
A r
;
2:13a
1 f d =f 0
:
Q 0 1f d2 =f 02
2:13b
f2
1 f d2
0
tanf
1 fd
Q0 f 0
The features of such an oscillator are well known from introductory physics
courses.
If the cantilever is brought closer towards the sample surface, the tip senses the
tipsample interaction force, Fts, which changes the oscillation behavior of the
cantilever. However, as the mathematical form of realistic tipsample forces is
highly nonlinear, this fact complicates the analytical solution of the equation of
motion Equation 2.11. For the analysis of DFM experiments we need to focus on
steady-state solutions of the equation of motion with sinusoidal cantilever oscillation. Therefore, it is advantageous to expand the tipsample force into a Fourier
series
1=f d
_
F ts zt; z_ tdt
F ts zt; zt f d
0
2f d
2f d
1=f d
0
1=f d
0
_
F ts zt; ztcos2pf
d t fdt cos2pf d t f
_
F ts zt; ztsin2pf
d t fdt sin2pf d t f
...;
where z(t) is given by Equation 2.12.
2:14
The rst term in the Fourier series reects the averaged tipsample force over one full
oscillation cycle, which shifts the equilibrium point of the oscillation by a small offset Dd
from d to d0. Actual values for Dd, however, are very small. For typical amplitudes used in
AM-AFM in air (some nm to some tens of nm), the averaged tipsample force is in the
range of some pN. The resultant offset Dd is less than 1 pm for typical sets of parameters.
As this is well beyond the resolution limit of an AM-AFM experiment in air, we neglect
this effect in the following and assume d d0 and D d A.
For further analysis, we now insert the rst harmonics of the Fourier series
Equation 2.14 into the equation of motion (Equation 2.11), thus obtaining two
coupled equations [53, 54]
f 20 f 2d
ad
I d; A cosf;
A
f 20
1 fd
ad
I d; A sinf;
Q0 f 0
A
2f 1=f d
_
I d; A d
F ts zt; ztcos2pf
d t fdt
cz A 0
dA
1
zd
F # F " q dz;
pc z A2 dA
2
A zd2
I d; A
2f d
cz A
1=f d
0
2:15b
2:16a
_
F ts zt; ztsin2pf
d t fdt
d A
1
F # F " dz
pc z A2 dA
1
DEd; A:
pc z A2
2:15a
2:16b
Both integrals are functions of the actual oscillation amplitude A and the
cantileversample distance d. Furthermore, they depend on the sum and the
difference of the tipsample forces during approach (E#) and retraction (F"), as
manifested by the labels and for easy distinction. The integral I is a
weighted average of the tipsample forces (F# F"). On the other hand, the integral
I is directly connected to DE, which reects the energy dissipated during an
individual oscillation cycle. Consequently, this integral vanishes for purely conservative tipsample forces, where F# and F" are identical. A more detailed discussion of
these integrals can be found in Refs. [55, 56].
By combining Equations 1.13b and 1.16b we obtain a direct correlation between
the phase and the energy dissipation.1)
1) The -sign on the right-hand side of the
equation is due to our denition of the phase f
in Equation 2.12.
j65
66
Q DE
A fd
sinf
0
:
A0 f 0 pc z A0 A
2:17
This relationship can be also obtained from the conservation of energy principle [4143].
Equation (2.15) can be used to calculate the resonance curves of a dynamic force
microscope, including tipsample forces. The results are
ad
A r
;
f2
1 f d2
0
tanf
1 fd
Q0 f 0
f
I d; A
I d; A
2
d
2
0
1 f I d; A
1 fd
Q0 f 0
I d; A
2:18a
2:18b
Equation 2.18a describes the shape of the resonance curve, but it is an implicit
function of the oscillation amplitude A, and cannot be plotted directly.
Figure 2.13 contrasts the solution of this equation (solid lines) with numerical
solution (symbols). As pointed out by various authors (see Refs [47, 5763]), the
amplitude versus frequency curves are multivalued within certain parameter ranges.
Moreover, as the gradient of the analytical curve increases to innity at specic
positions, some branches become unstable. The resulting instabilities are reected
by the jumps in the simulated curves (marked by arrows in Figure 2.13), where only
stable oscillation states are obtained. Obviously, they are different for increasing and
decreasing driving frequencies. This well-known effect is frequently observed in
nonlinear oscillators (see Refs [64, 65]).
Amplitude (nm)
11
10
tapping (Q = 300)
9
8
7
6
299.2
299.6
300.0
300.4
300.8
Amplitude (nm)
bistable regime
8
10
12
Distance d (nm)
Figure 2.14 Amplitude versus distance curve for
conventional (tapping mode) AM-AFM for
A0 10 nm, f0 300 kHz, and a tipsample
interaction force as given in Figure 2.6. The
dashed lines represent the analytical result, while
the symbols are obtained from a numerical
j67
68
Amplitude (nm)
10
8
6
4
2
0
6
tapping
regime
4
bistable
tapping
regime
2
0
-2
-4
-1.5
-1.0
-0.5
0.0
0.5
1.0
1.5
2.0
the tipsample force. After jumping to the higher branch, however, the tip senses also
the repulsive part of the tipsample interaction.
In Figure 2.15 the oscillation amplitude is plotted as a function of the nearest
tipsample distance. In addition, the lower graph depicts the corresponding tip
sample force (cf. Figure 2.6). The origin of the nearest tipsample position D is dened
by this force curve. As both the amplitude curves and the tipsample force curve are
plotted as a function of the nearest tipsample position, it is possible to identify the
resulting maximum tipsample interaction force for a given oscillation amplitude.
A closer look at the A(D)-curves helps to identify the different interaction regimes
in AM-AFM. During the approach of the vibrating cantilever towards the sample
surface, this curve shows a discontinuity for the nearest tipsample position D (the
point of closest approach during an individual oscillation) between 0 and 1 nm.
This gap corresponds to the bistability and the resulting jumps in the amplitude
versus distance curve. When the jump from the attractive to the repulsive regime has
occurred, the amplitude decreases continuously, but the nearest tipsample position
does not reduce accordingly, remaining roughly between 0.8 nm and 1.5 nm. As a
result, larger A/A0 ratios do not necessarily translate into lower tipsample interactions a point which is important to bear in mind while adjusting imaging
parameters in tapping mode AM-AFM imaging. In contrast, once the repulsive
regime has been reached, the users ability to inuence the tipsample interaction
strength by modifying the set-point for A is limited, thus also limiting the possibilities
of improving the image quality.
For practical applications, it is reasonable to assume that the set-point of the
amplitude used for imaging has been set to a value between 90% ( 9 nm) and 10%
( 1 nm) of the free oscillation amplitude. With this condition, we can identify the
accessible imaging regimes indicated by the horizontal (dashed) lines and the
corresponding vertical (dotted) lines in Figure 2.15. In tapping mode, two imaging
regimes are realized: the tapping regime (left) and the bistable tapping regime
(middle). The rst can be accessed by any amplitude set-point between 9 nm and
1 nm, and results in a maximum tipsample forces well within the repulsive regime.
The second regime, belonging to the bistable imaging state, is only accessible during
approach; here, the corresponding amplitude set-point is between 9 nm and 8 nm.
Imaging in this regime is possible with the limitation that the oscillating cantilever
might jump into the repulsive regime [68, 70].
2.3.1.2 Reconstruction of the TipSample Interaction
Previously, we have outlined the inuence of the tipsample interaction on the
cantilever oscillation, calculated the maximum tipsample interaction forces based
on the assumption of a specic model force, and subsequently discussed possible
routes for image optimization. However, during AFM imaging, the tipsample
interaction is not known a priori. However, several groups [52, 7375] have suggested
solutions to this inversion problem. Here, we present an approach which is based on
the analysis of the amplitude and phase versus distance curves which can easily be
measured with most AM-AFM set-ups.
Let us start by applying the transformation D d A to the integral I in
Equation 2.16a, where D corresponds to the nearest tipsample distance, as dened
in Figure 2.10. Next we note that, due to the cantilever oscillation, the current method
intrinsically recovers the values of the force that the tip experiences at its lower
turning point, where F# necessarily equals F". We thus dene Fts (F# F")/2, and
Equation 2.16a subsequently reads as
pc z A2
D
2A
zDA
F ts q dz:
2
A zDA2
2:20
The amplitudes commonly used in AM-AFM are considerably larger than the
interaction range of the tipsample force. Consequently, tipsample forces in the
integration range between D A and D 2A are insignicant. For this so-called
large-amplitude approximation [76, 77], the last term can be expanded at z ! D to
q
p
zDA= A2 zDA2 A=2zD, resulting in
I
p
2
pc z A
D
2A
3=2
D
F ts
p
dz:
zD
2:21
j69
70
By introducing this equation into Equation 2.15a, we obtain the following integral
equation:
"
#
D
2A
c z A3=2 ad cosf f 20 f 2d
1
F ts
p
p dz:
2
2:22
A
p
2
zD
f0
D
|{z}
k
The left-hand side of this equation contains only experimentally accessible data,
and we denote this term as k. The benet of these transformations is that the integral
equation can be inverted [65, 76] and, as a nal result, we nd
q
F ts D
qD
D
2A
kz
p dz:
zD
2:23
It is now straightforward to recover the tipsample force using Equation 2.23 from
a spectroscopy experiment that is, an experiment where the amplitude and the
phase are continuously measured as a function of the actual tipsample distance
D d A at a xed location. With this input, one rst calculates k as a function of D.
In a second step, the tipsample force is computed solving the integral in Equation 2.23 numerically.
Additional information about the tipsample interaction can be obtained, remembering that the integral I is directly connected to the energy dissipation DE. By
simply combining Equations 1.15b and 1.16b, we get
1 fd
ad
DE
sinf pc z A2 :
2:24
Q0 f 0
A
The same result was found earlier by Cleveland et al. [41], using the conservation of
energy principle. However, in a further development of Clevelands investigations we
suggest plotting the energy dissipation as a function of the nearest tipsample
distance D d A in order to have the same scaling as for the tipsample force.
An application of the method to experimental data obtained on a silicon wafer is
shown in Figure 2.16, where only the data points before the jump were used to
reconstruct the tipsample force and energy dissipation. As a consequence, the
experimental force curve showed only the attractive part of the force between tip and
sample, with a minimum of 1.8 nN. This result was in agreement with previous
studies which stated that the tip sensed only attractive forces before the
jump [66, 69, 78].
2.3.2
Frequency-Modulation or Noncontact Mode in Vacuum
vacuum utilize the FM detection scheme introduced by Albrecht et al. [33]. In this
mode, the cantilever is self-oscillated, in contrast to the AM- or tapping-mode
discussed in Section 2.2.3. The FM technique enables the imaging of single point
defects on clean sample surfaces in vacuum, and its resolution is comparable with
that of the scanning tunneling microscope, while not restricted to conducting
surfaces. During the years after the invention of the FM technique the term
noncontact atomic force microscopy (NC-AFM) was established, because it is
commonly believed that a repulsive, destructive contact between the tip and sample
j71
72
electronics with
frequency detector
photo-diode
laser
amplitude
frequency
amplifier
0
frequency
phase
shifter
z
piezo
aexc
D+2A
d
D
sample
PID
setpoint
xyzscanner
The detector signal is amplified and phaseshifted before being used to drive the piezo. The
measured quantity is the frequency shift due to
tipsample interaction, which serves as the
control signal for the cantileversample distance.
The key feature of the described set-up is the positive feedback loop which
oscillates the cantilever always at its resonance frequency, f [39]. The reason for this
behavior is that the cantilever serves as the frequency-determining element. This is in
contrast to an external driving of the cantilever by a frequency generator, where the
driving frequency fd is not necessarily the resonant frequency of the cantilever.
If the cantilever oscillates near the sample surface, the tipsample interaction
alters its resonant frequency, which is then different from the eigenfrequency f0 of the
free cantilever. The actual value of the resonant frequency depends on the nearest
tipsample distance and the oscillation amplitude. The measured quantity is the
frequency shift Df, which is dened as the difference between both frequencies
(Df f f0). The detection method received its name from the frequency demodulator (FM-detector). The cantilever driving mechanism, however, is independent of this
part of the set-up. Other set-ups use a phase-locked loop (PLL) to detect the frequency
and to oscillate the cantilever exactly with the frequency measured by the PLL [80].
For imaging, the frequency shift Df is used to control the cantilever sample
distance. Thus, the frequency shift is constant and the acquired data represents
planes of constant Df, which can be related to the surface topography in many cases.
The recording of the frequency shift as a function of the tipsample distance, or
alternatively the oscillation amplitude can be used to determine the tipsample force
with high resolution (see Section 2.2.4.5).
2.3.2.2 Origin of the Frequency Shift
Before presenting experimental results obtained in vacuum, we will analyze the origin
of the frequency shift. A good insight into the cantilever dynamics is provided by
examiningthetip potential displayedin Figure2.18. If thecantilever is far away from the
sample surface, the tip moves in a symmetric, parabolic potential (dotted line), and its
oscillation is harmonic. In such a case, the tip motion is sinusoidal and the resonance
V(z)
cantilever potential
tipsample potential
effective potential
E
z min
D + 2A
j73
74
frequency is given by the eigenfrequency f0 of the cantilever. If, however, the cantilever
approaches the sample surface, the potential which determines the tip oscillation is
modied to an effective potential Veff (solid line) given by the sum of the parabolic
potential and the tipsample interaction potential Vts (dashed line). This effective
potential differs from the original parabolic potential and shows an asymmetric shape.
As a result of this modication of the tip potential the oscillation becomes
anharmonic, and the resonance frequency of the cantilever depends now on the
oscillation amplitude A. Since the effective potential experienced by the tip changes
also with the nearest distance D, the frequency shift is a functional of both parameters
()Df : Df(D, A)).
Figure 2.19 displays some experimental frequency shift versus distance curves
for different oscillation amplitudes. These experiments were carried out with
0.10
x 10
0.0
180
126
90
72
54
-0.4
-0.8
-10
0.05
1/2
0.4
-15
0.8
-5
z0 5
10
15
Distance D []
(a)
0.00
180
126
90
72
54
-0.05
-0.10
-10
(b)
-5
10
15
Distance D []
0
180
126
90
72
54
-4
Fad
(c)
-8
-10
-5
z0 5
10
15
Distance D []
m zt
2pf 0 m
_
_ c z ztd gcz ztt0 d F ts zt; zt
zt
|{z}
Q
driving
2:25
where z : z(t) represents the position of the tip at the time t; and cz, m and Q are the
spring constant, the effective mass and the quality factor of the cantilever, respectively. Fts (qVts)/(qz) is the tipsample interaction force. The last term on the left
describes the active feedback of the system by the amplication of the displacement
signal by the gain factor g measured at the retarded time t t0.
The frequency shift can be calculated from the above equation of motion with the
ansatz
zt d Acos2pf t
2:26
f 2 f 20
I
f 20
2:27
j75
76
g sin2pf t0
1 f
I
Qf0
2:28
where the two integrals I Equation 2.16a and IEquation 2.16b were dened in
accordance to Section 2.2.3.2. These two coupled equations can be solved numerically, if one is interested in the exact dependency of the tipsample interaction force
Fts and the time delay t0 on the oscillation frequency f and the gain factor g.
Fortunately, a detailed analysis shows that the results of a FM-AFM experiment are
mainly determined by the tipsample force, and only very slightly by the time delay, if
t0 is set to an optimal value before approaching the tip towards the sample surface.
These values of the time delay are specic resonance values corresponding to 90
(i.e. t0 1/4f0), and can be easily found by minimizing the gain factor as a function
of the time delay. Therefore, it can be assumed that cos(2pft0) 0 and sin(2pft0) 1
and the two coupled Equations 1.27 and 1.28 can be decoupled. As a result, an
equation for the frequency shift is obtained:
d A
f
1
zd
F # F " q dz
Df 0 I
2:29
2
pcz A2 dA
2
A zd2
and the energy dissipation
1 f
pc z A2 :
DE g
Qf0
2:30
As no assumptions were made about the specic force law of the tipsample
interaction Fts, these equations are valid for any type of interaction as long as the
resulting cantilever oscillations remain nearly sinusoidal.
As the amplitudes in FM-AFM are often considerably larger than the distance
range of the tipsample interaction, we can again make the large amplitude
approximation [76, 77] and introduce the approximation Equation 2.21 for the
integral I. This yields the formula
f0
1
Df p
2p cz A3=2
D
2A
F ts z
p dz
zD
2:31
c z A3=2
Df z:
f0
2:32
j77
78
p c z A3=2 q Df z
p dz;
F ts D 2
f 0 qD
zD
2:33
which allows a direct calculation of the tipsample interaction force from the frequency
shift versus distance curves.
An application of this formula to the experimental frequency shift curves already
presented in Section 2.2.4.2 is shown in Figure 2.19c. The obtained force curves are
almost identical, despite being obtained with different oscillation amplitudes. As the
tipsample interactions can be measured with high resolution, DFS opens a direct
way to compare experiments with theoretical models and predictions.
Giessibl [77] suggested a description of the force between the tip and the sample by
combining a long-range (van der Waals) and a short-range (LennardJones) term (see
Section 2.1.3). Here, the long-range part describes the van der Waals interaction of the
tip, modeled as a sphere with a specic radius, with the surface. The short-range
LennardJones term is a superposition of the attractive van der Waals interaction of
the last tip apex atom with the surface and the coulombic repulsion. For a tip with
radius R, this assumption results in the tipsample force:
AH R 12E 0 r 0 13 r 0 7
:
2:34
F nc z 2
6z
r0
z
z
As this approach does not explicitly consider elastic contact forces between tip and
sample, we call this the noncontact force law in the following sections.
A t of this equation to the experimental tipsample force curve is shown in
Figure 2.19c by a solid line; the obtained parameters are AHR 2.4 1027 Jm,
r0 3.4 , and E0 3 eV [98]. The regime on the right from the minimum ts well to
the experimental data, but the deep and wide minimum of the experimental curves
cannot be described accurately with the noncontact force. This is caused by the steep
increase in the LennardJones force in the repulsive regime ()Fts / 1/r13 for z < r0).
The elastic contact behavior can be described with the assumption of the abovedescribed DMT-M model (see Section 2.1.3), that the overall shape of tip and sample
changes only slightly until point contact is reached and that, after the formation of
this point contact, the tipsample forces are described by the Hertz theory. A t of the
Hertz model to the experimental data is shown in Figure 2.19 by a solid line. The
experimental force curves agree quite well with the contact force law for distances
D < z0. This shows that the overall behavior of the experimentally obtained force
curves can be described by a combination of long-range (van der Waals), short-range
(LennardJones), and contact (Hertz/DMT) forces.
As Equation 2.31 was derived under the assumption that the resonance amplitude
is considerably larger than the decay length of the tipsample interaction, the same
restriction applies for Equation 2.33. However, by using more advanced algorithms it
is also possible to determine forces from DFS experiments without the large
amplitude restriction. The numerical approach of Gotsmann et al. [94], as well as
the semi-empirical methods of D
urig [92], Giessibl [93] and Sader and Jarvis [97], are
applicable in all regimes.
The resolution of DFS can be driven down to the atomic scale. Lantz et al. [99]
measured frequency shift versus distance curves at different lattice sites of the Si
(1 1 1)-(7 7) surface, and in this way were able to distinguish differences in the
bonding forces between inequivalent adatoms of the 7 7 surface reconstruction of
silicon.
The concept of DFS can be also extended to three-dimensional (3-D)-force
spectroscopy by mapping the complete force eld above the sample surface [100].
A schematic of the measurement principle is shown in Figure 2.21a. Frequency shift
versus distance curves are recorded on a matrix of points perpendicular to the sample
j79
80
surface. By using Equation 2.33, the complete 3-D force eld between the tip and
sample can be recovered with atomic resolution. Figure 2.21b shows a cut through
the force eld as a two-dimensional (2-D) map.
The 3D-force technique has been applied also to a NaCl(1 0 0) surface, where not
only conservative but also the dissipative tipsample interaction could be measured
in full space [101]. Initially, the forces were measured in the attractive as well as
repulsive regime, allowing for the determination of the local minima in the
corresponding potential energy curves (Figure 2.22). This information is directly
related to the atomic energy barriers responsible for a multitude of dynamic
phenomena in surface science, such as diffusion, faceting and crystalline growth.
The direct comparison of conservative with the simultaneously acquired dissipative
processes furthermore allowed determining atomic-scale mechanical relaxation
processes.
If the NC-AFM is capable of measuring forces between single atoms with sub-nN
precision, why should it not be possible to also exert forces with this technique? In
fact, the new and exciting eld of nanomanipulation would be driven to a whole new
dimension, if dened forces could be reliably applied to single atoms or molecules. In
this respect, Loppacher et al. [102] were able to exert pressure on different parts of an
isolated CuTBBP molecule, which is known to possess four rotatable legs. Here, the
forcedistance curves were measured while one of the legs was pushed by the AFM
tip and turned by 90
, and hence were able to measure the energy which was
dissipated during the switching of this molecule between different conformational
states. The manipulation of single silicon atoms with NC-AFM was demonstrated by
Oyabu et al. [103], who removed single atoms from a Si(1 1 1)-7 7 surface with the
AFM tip and were able subsequently to re-deposit atoms from the tip onto the surface.
This approach was driven to its limits by Sugimoto et al., who manipulated single Sn-
2.4 Summary
2.4
Summary
In summary, we have presented an overview over the basic principles and modern
applications of AFM. This versatile technique can be categorized into two operational
modes, static and dynamic. The static mode allows the simultaneous measurement
of normal and lateral forces, thus yielding direct information about friction mechanisms of nanoscale contacts. The main advantage of the dynamic mode is the
possibility to control tipsample distances while avoiding the undesirable and
destructive jump-to-contact phenomenon. Two different excitation schemes for
dynamic force microscopy were introduced, where the amplitude-modulation or
tapping mode are in particular well-suited to high-resolution imaging under ambient
or liquid conditions. The ultimate true atomic resolution, however, is limited to
vacuum conditions using FM or noncontact techniques. Nonetheless, the impact of
AFM reaches far beyond the high-resolution imaging of surface topography: DFS
allows the quantication of tipsample forces, through the systematic acquisition of
parameters such as amplitude, phase and oscillation frequency as a function of the
relative tipsample distance. Based on this approach, not only the bonding force of
single interatomic chemical bonds can be measured, but also the full 3-D force eld
can be determined, at atomic resolution. Finally, the nding that atomic forces can
j81
82
not only be measured but also exerted with atomic precision will open up the new and
exciting eld of nanomanipulation.
Acknowledgments
The authors would like to thank all colleagues who contributed to these studies with
their images and experimental results, including Boris Anczykowski, Daniel Ebeling,
Jan-Erik Schmutz, Domenique Weiner (University of M
unster), Wolf Allers, Shenja
Langkat, Alexander Schwarz (University of Hamburg) and Udo D. Schwarz (Yale
University).
References
1 Binnig, G., Quate, C.F. and Gerber, Ch.
(1986) Atomic force microscopy. Physical
Review Letters, 56, 930933.
2 Binnig, G., Rohrer, H., Gerber, C. and
Weibel, E. (1982) Surface studies by
scanning tunneling microscopy. Physical
Review Letters, 49, 5761.
3 Meyer, E., Hug, H.-J. and Bennewitz, R.
(2004) Scanning Probe Microscopy The
Lab on a Tip, Springer-Verlag.
4 Bhushan, B. and Marti, O.(eds) (2005)
Scanning probe microscopy principle of
operation, instrumentation, and probes,
in Nanotribology and Nanomechanics An
Introduction, (ed. B. Bhushan) SpringerVerlag, Berlin Heidelberg, pp. 41115.
5 L
uthi, R., Meyer, E., Haefke, H., Howald,
L., Gutmannsbauer, W., Guggisberg, M.,
Bammerlin, M. and G
untherodt, H.-J.
(1995) Nanotribology: an UHV-SFM study
on thin lms of C60 and AgBr. Surface
Science, 338, 247260.
6 Neumeister, J.M. and Ducker, W.A. (1994)
Lateral, normal, and longitudinal spring
constants of atomic force microscopy
cantilevers. Review of Scientic
Instruments, 65, 25272531.
7 Sader, J.E. (1995) Parallel beam
approximation for V-shaped atomic force
microscope cantilevers. Review of
Scientic Instruments, 66, 45834587.
References
14 Linnemann, R., Gotszalk, T., Rangelow,
I.W., Dumania, P. and Oesterschulze, E.
(1996) Atomic force microscopy and
lateral force microscopy using
piezoresistive cantilevers. Journal of
Vacuum Science & Technology B, 14 (2),
856860.
15 Tortonese, M., Barrett, R.C. and Quate,
C.F. (1993) Atomic resolution with an
atomic force microscope using
piezoresistive detection. Applied Physics
Letters, 62, 834836.
16 Yuan, C.W., Batalla, E., Zacher, M., de
Lozanne, A.L., Kirk, M.D. and Tortonese,
M. (1994) Low temperature magnetic
force microscope, utilizing a
piezoresistive cantilever. Applied Physics
Letters, 65, 13081310.
17 Stahl, U., Yuan, C.W., de Lozanne, A.L.
and Tortonese, M. (1994) Atomic force
microscope using piezoresistive,
cantilevers and combined with a scanning
electron microscope. Applied Physics
Letters, 65, 28782880.
18 Israelachvili, J.N. (1992) Intermolecular
and Surface Forces, Academic Press,
London.
19 Stifter, Th., Marti, O. and Bhushan, B.
(2000) Theoretical investigation of the
distance dependence of capillary and van
der Waals forces in scanning force
microscopy. Physical Review B - Condensed
Matter, 62, 1366713673.
20 Johnson, K.L. (1985) Contact Mechanics,
Cambridge University Press, Cambridge,
UK.
21 Landau, L.D. and Lifschitz, E.M. (1991)
Lehrbuch der theoretischen Physik VII:
Elastizit
atstheorie, Akademie-Verlag,
Berlin.
22 Johnson, K.L., Kendall, K. and Roberts,
A.D. (1971) Surface energy and contact of
elastic solids. Proceedings of the Royal
Society of London Series A - Mathematical,
Physical and Engineering Sciences, 324,
301.
23 Derjaguin, B.V., Muller, V.M. and
Toporov, Y.P. (1975) Effect of contact
deformations on the adhesion of particles.
24
25
26
27
28
29
30
31
32
j83
84
33 Albrecht, T.R., Gr
utter, P., Horne, D. and
Rugar, D. (1991) Frequency modulation
detection using high-Q cantilevers for
enhanced force microscope sensitivity.
Journal of Applied Physics, 69,
668673.
34 Garcia, R. and Perez, R. (2002) Dynamic
atomic force microscopy methods.
Surface Science Reports, 47, 197301.
35 Giessibl, F.-J. (2003) Advances in atomic
force microscopy. Reviews of Modern
Physics, 75, 949983.
36 Holscher, H. and Schirmeisen, A. (2005)
Dynamic force microscopy and
spectroscopy, in Advances in Imaging and
Electron Physics, (ed. P.W. Hawkes),
Academic Press Ltd, London, pp. 41101.
37 Morita, S., Wiesendanger, R. and Meyer,
E.(eds) (2002) Noncontact Atomic Force
Microscopy, Springer-Verlag, Berlin.
38 Holscher, H., Gotsmann, B., Allers, W.,
Schwarz, U.D., Fuchs, H. and
Wiesendanger, R. (2001) Measurement of
conservative and dissipative tip-sample
interaction forces with a dynamic force
microscope using the frequency
modulation technique. Physical Review
B - Condensed Matter, 64, 075402.
39 Holscher, H., Gotsmann, B., Allers, W.,
Schwarz, U.D., Fuchs, H. and
Wiesendanger, R. (2002a) Comment on
damping mechanism in dynamic force
microscopy. Physical Review Letters, 88,
019601.
40 Schonenberger, C. and Alvarado, S.F.
(1989) A differential interferometer for
force microscopy. Review of Scientic
Instruments, 60, 31313134.
41 Cleveland, J.P., Anczykowski, B., Schmid,
A.E. and Elings, V.B. (1998) Energy
dissipation in tapping-mode atomic force
microscopy. Applied Physics Letters, 72,
2613.
42 Garcia, R., Gmez, C.J., Martinez, N.F.,
Patil, S., Dietz, C. and Magerle, R. (2006)
Identication of nanoscale dissipation
processes by dynamic atomic force
microscopy. Physical Review Letters, 97,
016103.
References
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
j85
86
References
91
92
93
94
95
96
97
98
99
100
101
102
103
104
j87
j89
3
Probing Hydrodynamic Fluctuations with a Brownian Particle
Sylvia Jeney, Branimir Lukic, Camilo Guzman, and Laszl Frr
3.1
Introduction
The observation of Brownian motion has been a subject of interest since the invention
of optical microscopy during the seventeenth century [1]. From then on, the understanding of its origin remained a subject of debate until 1905, when Einstein described
a convincing model which, assumed that the uctuations of a small-sized particle
oating in a uid were caused by momentum transfer from thermally excited uid
molecules. Einstein identied the mean square displacement of the particle as the
characteristic experimental observable of Brownian motion, and showed that it grows
linearly with time as hDx2(t)i 2Dt, thereby introducing the diffusion coefcient D [2].
In 1908, Langevin reformulated Newtons force balance equation by adding to the
instantaneous Stokes friction [3] a stochastic force term, representing the random
impacts of surrounding medium molecules on the Brownian particle [4]. At the same
time, Henri pointed out the limited nature of Stokes formula for the friction force,
when applied to neutrally buoyant, micron-sized particles [5],which is the casefor most
Brownian particles used in experiments. In such cases, correlations between friction
and velocity are non-instantaneous, and the positions of the particle are expected to be
correlated up to longer times. The origin of this effect comes from the non-negligible
inertia of the uid, and this must be accounted for in the description of Brownian
motion. The expression for the mean square displacement using the noninstantaneous friction force was introduced by Vladimirsky and Terletzky in 1945 [6], but
remained largely ignored, as their contribution was published in Russian. In 1967,
Alder and Wainwright discovered, in numerical simulations, that the particles velocity
correlations (another characteristic observable of Brownian motion) display a powerlaw decay [7] instead of an exponential relaxation, as expected for instantaneous
friction. These simulations led theoreticians during the 1970s to reconsider the
contribution of uid mechanics to Brownian motion [814], and to address its
relevance in experiments. The idea of using a particle subjected to Brownian motion
as a reporter of its local environment was settled. With this approach, any deviation
90
from the normal diffusive behavior of the particle could be interpreted as a response to
the material properties of its complex environment [15, 16]. To measure such behavior,
a high spatial resolution down to the nanometer scale is needed. Experiments using
dynamic light scattering in colloidal suspensions conrmed that the diffusion of
colloidal particles is inuenced by uid mechanics, and hence is time-dependent [17
21]. However, in order to achieve a sufciently high resolution, averaging over an
ensemble of different particles was necessary. Nowadays, tracking a single particle in a
uid on time scales sufciently short to detect hydrodynamic contributions can be
realized by using optical tracking interferometry (OTI). This allows a direct measurement of Brownian motion at the same resolution as techniques averaging over many
particles, and an individual particle comes to be the local Brownian probe. OTI utilizes
a weak optical trap [22] and interferometric particle position detection. The trapping
laser provides a light source for the position detection of the particle, and at the same
time ensures that the particle remains within the detector range.
In this chapter we provide a complete picture of the measurements of a Brownian
particle immersed in a viscous uid and held by an optical trap. First, relevant
theoretical insights are exposed and the different timescales of Brownian motion are
summarized. Next, the technical aspects of OTI are described, and methodologies on
data acquisition, analysis andinterpretation provided. Theinuences of experimentally
relevant parameters, such as the trapping force constant, the uid properties and the
Brownian particle itself, are presented. Finally, the overlap of the different measurable
time scales of Brownian motion is quantied, and the consequences are discussed.
3.2
Theoretical Model of Brownian Motion in an Optical Trap
There are principally two, counteracting forces that govern the motion of a Brownian
particle. First, the particle is driven through the thermal force Fth(t), that arises from
random uctuations of the uid molecules excited by the thermal energy kBT.
Second, Fth(t) is resisted by the friction force Ffr(t), which (over-)damps the motion
of the uctuating particle. Ffr(t) is the force exerted on the Brownian sphere by the
surrounding viscous uid, when the uid is perturbed through the particles
uctuations. Following from Newtons second law, the equation of motion can be
written as the generalized Langevin equation:
ms xt F th t F f r t F ex t;
3:1
where ms is the inertia of the Brownian sphere (s referring to the spheres parameters), and Fex(t) represents the sum of any external forces, such as gravity or the
force of the optical trap. For simplicity, we will discuss only the one-dimensional case
for the axis x, even though OTI measurements give access to all three directions x, y
and z.
3.2.1.1 The Random Thermal Force Fth(t)
The random force Fth(t) represents the collisions of the uid molecules on the
particle. Its contribution in a homogeneous and isotropic medium varies so rapidly
compared to the observable time scales, that Fth(t) should be, on average, zero;
hFth(t)i 0.
Furthermore, as a very large number of collisions occurs during two successive
measurements at times t and t0 , the correlation time of Fth(t) is much shorter than the
time interval between the two measurements [10]:
hF th tF th t0 i 2kB Tghxtxt0 i
where kB is the Boltzmann constant, g is the viscous drag of the uid on the particle
and z(t) is a white noise term with no nite correlation time: hx(t)x(t0 )i d(t t0 ).
Fth(t) obeys the uctuationdissipation theorem, and has the expression:
F th t
p
2kB Tg xt
3:2
In real systems the correlation time will not typically vanish instantaneously,
because of the nite-size and nite-scale interactions which also exist between the
uid molecules themselves. Viscous and thermal forces will then become spatially
and temporally correlated, through a time-dependent viscous drag g (this is discussed
next). It is worthy of note that Fth(t) can only be described in terms of its statistical
properties, and as its effect has already vanished on experimentally accessible time
scales, Fth(t) has never been measured [23].
3.2.1.2 The Friction Force Ffr(t)
An incompressible isotropic uid with a viscosity hf(t) and density rf (f refers to uid
parameters) generates a viscous drag g(t) on the thermally excited Brownian particle
as it moves through the uid, giving rise to the friction force Ffr(t). A correct
j91
92
expression of g(t) and the resulting Ffr(t) is given by solving the NavierStokes
equation, describing the hydrodynamic properties of the uid [24]. Here, the
molecular character of the viscous uid is neglected, and the uid is treated as a
continuum, which is valid when the Brownian sphere with radius as and density rs is
much larger than the uid molecules. Then, the average free path length of the
molecules which compose the uid is small compared to the dimension of, for
example, the sphere immersed in it. Furthermore, we will only consider the sphere
moving far away from any boundary, like an obstacle placed in its trajectory, which
would make the uid anisotropic. Two experimentally relevant solutions of the
NavierStokes equation can then be distinguished, the rst being a commonly used
approximation of the second:
(i) rs rf : If the sphere has a high inertia ms, and hence a density much higher
than the uids density rf, it will move steadily and at very slow speeds through
the uid. The uids response to the particles presence can then be considered as
instantaneous, and the solution of the NavierStokes equation is simply the
constant Stokes drag [3]:
g 6phf as
3:3
and the friction force Ffr follows Stokes law, which states that it is instantaneously
linear with the spheres velocity x_ :
F f r 6phf as x_
3:4
It must be noted here that hf, the dynamic viscosity of the uid, is considered as
time-independent, thus implying that correlations between successive collisions
of the uid molecules on the Brownian particle have already vanished. Motion can
hence be observed as a Markovian process that is, a random walk.
(ii) rs rf : As noted by Lorentz [25], Equation 3.1, which includes Stokes law
(Equation 3.4), is only consistent with hydrodynamics when ms mf
4=3pa3s rf . When the sphere has a density similar to its surrounding
medium which is usually the case for the neutrally buoyant particles used in
optical trapping the spheres motion will be determined not only by its own
inertia but also by the inertia mf of the surrounding uid. Then, Brownian
motion theory needs to include frequency-dependent effects, and the time
dependence of x_ should be accounted for when computing the drag g.
As the particle receives momentum from the uctuating uid molecules, it
displaces the uid in its immediate vicinity. Although the uid can still be considered
as a continuum, and even with the conditions of a low Reynolds number, the uids
ow eld will be perturbed. The non-negligible inertia of entrained uid
mf 4=3pa3s rf will act back on the sphere. As a consequence, correlations in the
uids uctuations will persist up to time scales observable by OTI and become
experimentally relevant. The Brownian sphere will move with a non-constant velocity
and perform a non-random walk, which will depend heavily on the nature of the
surrounding medium. This phenomenon is commonly called hydrodynamic memory. Such perturbations give rise to the StokesBoussinesq friction force that is
derived from the NavierStokes equation accounting for the inertia of the uid, and
given as [9]:
p t
2 3
_
pas rf x6a2s phf rf tt0 1=2 xt0 dt0
3:5
F f r 6phf as x
3
0
The rst term is the ordinary Stokes friction from Equation 3.4, while the second
term is connected to the mass mf of the incompressible uid displaced by the
Brownian particle. In principle, this term denes an effective mass M ms mf/2
that should replace ms on the left-hand side of the Langevin equation (Equation 3.1).
The third term is time-dependent, stating that the friction force at time t is
determined by the penetration depth of the viscous, unsteady ow around the sphere
at all preceding times. Equation 3.5 conrms that, for a uid with a density similar to
the density of the Brownian particle, the terms containing rf cannot be neglected, as
they are of the same order of magnitude as the inertial term.
Consequently, an instantaneous disturbance of the uid from the thermally excited
Brownian sphere will spread, and its initial momentum will be shared with all of the
molecules in a small volume around the sphere. The velocity eld of this moving
volume then grows by vorticity diffusion. In an incompressible liquid, this enforces a
back ow at short times, which creates a vortex ring in three dimensions [12]. The
diffusive spreading of this vortex carries the momentum into the uid on a time scale
tf a2s rf =hf the time needed for vorticity to diffuse over the distance of one
particle radius. Figure 3.1 shows a simplied scheme of the characteristic doublevortex structure of the velocity eld caused by the initial displacement of the
Brownian sphere in a simple liquid.
.
x(0)
j93
94
(i) Fex(t) 0: When no additional force acts on the sphere and its motion is
considered as free.
(ii) Fex(t) kx(t): Corresponding to the harmonic trapping potential with a force
constant k created by the optical trap, which retains the sphere within the
observation volume of the detector. The spheres motion is then qualied as
optically conned.
3.2.2
Solutions to the Different Langevin Equations for Cases Observable by OTI
From the solution of the Langevin equation for a Brownian sphere, the following
measurable quantities of physical interest are derived for further studies in experi_ x0i;
_
ments (see Sections 3.3 and 3.4) the velocity autocorrelation function (VAF) hxt
the mean square displacement (MSD) hDx2(t)i, which is related to the velocity
autocorrelation function through:
t
0
_ 0 x0idt
_
;
hDx 2 ti 2 tt0 hxt
3:6
3:7
The following listing of all three measurables derived from the four discussed
Langevin equations is meant to provide a summarizing overview of the theoretical
models of Brownian motion derived in the literature by various authors. Each model
can be picked accordingly to t the data acquired by OTI, as discussed in Section 3.4.
The most accurate expressions are also the most complex; however, a good understanding of the problem of Brownian motion in a viscous uid is already gained by
only considering the characteristic limiting behaviors in each situation.
3.2.2.1 Free Brownian Motion
Solving the Langevin Equation using Stokes Friction Solving the Langevin equation
using the Stokes friction of Equation 3.4 and Fex(t) 0 results in a VAF:
_ x0i
_
hxt
free
and in a MSD:
kB T t=ts
e
ms
3:8
h
i
ts
hDx 2 tifree 2Dt 1 et=ts 1
t
3:9
D
p2 f 2 gf =2pms 2 1
D
p2 f 2 f =fs 2 1
3:10
3:12
hj~
x f j2 ifree Df2s =p2 f 4
3:13
and:
At long times (t ! ), and respectively, low frequencies ( f ! 0), velocity correlations vanish exponentially with ts, the relaxation time of the particles initial
momentum. The particle has then lost all information about its initial velocity, and
diffuses randomly with
hDx 2 tifree 2Dt
3:14
and respectively
hj~
x f j2 ifree D=p2 f 2
3:15
Solving the Langevin Equation using the Stokes-Boussinesq Friction Force  Solving the Langevin equation using the Stokes-Boussinesq friction force of Equation 3.5 and F_ex(t) = 0 results in more complex expressions [6, 11]:

⟨ẋ(t)ẋ(0)⟩_free = [k_B T √τ_f / (2π a_s³ ρ_f √(5 − 8ρ_s/ρ_f))] · [α₊ e^{α₊²t} erfc(α₊√t) − α₋ e^{α₋²t} erfc(α₋√t)]        (3.16)

with

α_± = 3 (3 ± √(5 − 36τ_s/τ_f)) / (2√τ_f (1 + 9τ_s/τ_f))

now also including the hydrodynamic effect, decaying with a fluid-dependent time constant τ_f = a_s²ρ_f/η_f, which represents the time needed by the perturbed fluid flow field to diffuse over the distance of one particle radius. For an incompressible fluid, the initial velocity is now given by ẋ(0) = √(k_B T/(m_s + m_f/2)) = √(k_B T/M), and
the MSD follows as:

⟨Δx²(t)⟩_free = 2D [t − 2√(τ_f t/π) + τ_f − τ_s − τ_f/9] + [k_B T √τ_f / (π a_s³ ρ_f √(5 − 8ρ_s/ρ_f))] · [e^{α₊²t} erfc(α₊√t)/α₊³ − e^{α₋²t} erfc(α₋√t)/α₋³]        (3.17)

with the same coefficients α_± as in Equation 3.16. Equation 3.17 depends on the particle's inertia through τ_s and on the fluid's inertia through τ_f. The two times are connected by the relationship τ_s/τ_f = 2ρ_s/(9ρ_f).
The corresponding characteristic frequency f_f = 1/(2πτ_f) appears in the PSD as [27]:

⟨|x̃(f)|²⟩_free = (D/π²f²) · (1 + √(f/2f_f)) / { [f/f_s + √(f/2f_f) + f/(9f_f)]² + [1 + √(f/2f_f)]² }        (3.18)
The behaviors in the time and frequency limits of all three functions remain similar to the previously discussed case, meaning that, for very short times, the motion is ballistic and, for very long times, the motion is diffusive. However, the transition between the two regimes is algebraic and delayed to significantly longer times compared to the case of simple Stokes friction. This translates into a slow algebraic decay in the MSD (Equation 3.17) and results in a VAF which is governed by a power law rather than by an exponential tail [7]:

⟨ẋ(t)ẋ(0)⟩_free ∝ (t/τ_f)^{−3/2}   for t ≫ τ_f        (3.19)

This power law is usually referred to in the literature as the long-time tail, and arises from the fluid vortices observed around the colloidal particle, as sketched in Figure 3.1.
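Equation 3.16 is straightforward to evaluate numerically. Below is a minimal sketch, assuming a SciPy version whose erfcx (the scaled complementary error function) accepts complex arguments, since for ρ_s/ρ_f > 5/8 the coefficients α_± become complex conjugates; the imaginary parts cancel in the final result. The script checks the equipartition value k_BT/M at t → 0 and estimates the exponent of the long-time tail.

import numpy as np
from scipy.special import erfcx   # erfcx(z) = exp(z^2) erfc(z), complex-capable

kB, T = 1.380649e-23, 295.0
a_s, rho_s = 1.0e-6, 1510.0          # resin sphere of Figure 3.2
rho_f, eta_f = 1000.0, 1.0e-3        # water

m_s = (4/3) * np.pi * a_s**3 * rho_s
tau_s = m_s / (6 * np.pi * eta_f * a_s)
tau_f = a_s**2 * rho_f / eta_f
M = m_s + (2/3) * np.pi * a_s**3 * rho_f          # m_s + m_f/2

disc = np.sqrt(complex(5 - 36 * tau_s / tau_f))   # complex for rho_s/rho_f > 5/8
denom = 2 * np.sqrt(tau_f) * (1 + 9 * tau_s / tau_f)
a_p, a_m = 3 * (3 + disc) / denom, 3 * (3 - disc) / denom

def vaf_sb(t):
    """Free VAF with hydrodynamic memory (Equation 3.16); t in seconds."""
    pref = kB * T * np.sqrt(tau_f) / (2 * np.pi * a_s**3 * rho_f
                                      * np.sqrt(complex(5 - 8 * rho_s / rho_f)))
    w = a_p * erfcx(a_p * np.sqrt(t)) - a_m * erfcx(a_m * np.sqrt(t))
    return (pref * w).real           # imaginary parts cancel by construction

print("C(0+) =", vaf_sb(1e-16), " kB*T/M =", kB * T / M)   # should agree
t1, t2 = 1e-5, 4e-5                  # in the tail region, t >> tau_f
print("tail exponent (approaches -3/2):",
      np.log(vaf_sb(t2) / vaf_sb(t1)) / np.log(t2 / t1))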
The log-log plot in Figure 3.2a compares the VAF given by Equation 3.8 (red line) with that given by Equation 3.16 (blue line), both normalized by their respective initial velocity, for a sphere with radius a_s = 1 μm and density ρ_s = 1.51 kg dm⁻³ immersed in water with viscosity η_f = 10⁻³ Pa s at T = 22 °C. It can be seen that the exponential relaxation resulting from the Stokes friction changes to a power-law decay when the fluid's inertia is accounted for. In the same way, the log-log representations in Figures 3.2b and 3.2c show a comparison between Equations 3.9 and 3.17, as well as between Equations 3.10 and 3.18. Differences in the MSD and PSD are less visible in this representation, but the respective common limiting behaviors, translating to characteristic slopes, are indicated. In Figure 3.2c, the discrepancies visible at high frequencies above 2 MHz (arrow) between both functions arise from the differences in the displaced masses, m_s and M, respectively. The green bars on the abscissa highlight the time and, respectively, frequency regions accessible by OTI, as introduced in Section 3.3.
[Figure 3.2: panels (a)-(c); log-log plots of the normalized VAF, the MSD (nm²) and the PSD (nm² Hz⁻¹) versus time (s) and, respectively, frequency (Hz), with the limiting behaviors e^{−t/τ_s}, (t/τ_f)^{−3/2}, ∝t², ∝t, ∝f⁻² and ∝f⁻⁴ indicated.]
Figure 3.2 Log-log plots of (a) the normalized VAF given by Equation 3.8 (red line) and Equation 3.16 (blue line); (b) the MSD given by Equation 3.9 (red line) and Equation 3.17 (blue line); and (c) the PSD given by Equation 3.10 (red line) and Equation 3.18 (blue line) for a 2 μm-diameter resin sphere in water.

3.2.2.2 Brownian Motion in a Harmonic Potential
Solving the Langevin Equation using Stokes Friction  Solving the Langevin equation using the Stokes friction of Equation 3.4 and the harmonic restoring force F_ex(t) = −kx(t) results in a VAF:

⟨ẋ(t)ẋ(0)⟩ = [k_B T / (m_s (V₊ − V₋))] · (V₊ e^{−V₊t} − V₋ e^{−V₋t})        (3.20)

with

V_± = (1 ± √(1 − 4τ_s/τ_k)) / (2τ_s)
The VAF now has a positive part which decays exponentially to zero as ⟨ẋ(t)ẋ(0)⟩ ∝ e^{−t/τ_s} for t ≪ τ_s, and a negative part which decays exponentially as:

⟨ẋ(t)ẋ(0)⟩ ∝ −e^{−t/τ_k}        (3.21)
for t ≫ τ_s, with a new characteristic time constant τ_k = 6πη_f a_s/k determined by the trap stiffness k. The MSD follows as:

⟨Δx²(t)⟩ = (2k_B T/k) · [1 − (V₊ e^{−V₋t} − V₋ e^{−V₊t})/(V₊ − V₋)]        (3.22)
and the PSD is the Lorentzian:

⟨|x̃(f)|²⟩ = D / {π²f² [1 + (f_k/f)²]}        (3.23)

showing that the motion is also influenced by the trapping potential with the characteristic frequency f_k = 1/(2πτ_k) = k/(12π²η_f a_s), corresponding to the corner frequency of the Lorentzian function.
In the short time limits, the behavior remains the same as introduced above, but in the long time limit it is now governed by the confining potential, that is, the optical trapping constant. Then, the velocity correlations still disappear, but Equation 3.22 approaches the time-independent limit:

⟨Δx²(t)⟩_{t→∞} = 2k_B T/k        (3.24)

and, respectively, the PSD approaches the constant low-frequency plateau:

⟨|x̃(f)|²⟩_{f→0} = 4k_B T γ_f/k²        (3.25)

The sphere's motion is now governed by the potential's restoring force with stiffness k.
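For the optically confined case with plain Stokes friction, the quantities of Equations 3.20 and 3.23-3.25 follow from a few lines of code. The sketch below is an illustration, not part of the original text, and uses the parameter values quoted in Section 3.2.3.

import numpy as np

kB, T = 1.380649e-23, 295.0
a_s, rho_s, eta_f = 1.0e-6, 1510.0, 1.0e-3
k = 10e-6                                  # trap stiffness, 10 uN/m

m_s = (4/3) * np.pi * a_s**3 * rho_s
gamma = 6 * np.pi * eta_f * a_s
tau_s, tau_k = m_s / gamma, gamma / k      # tau_k = 6 pi eta_f a_s / k
D = kB * T / gamma
f_k = 1.0 / (2 * np.pi * tau_k)            # corner frequency (Equation 3.23)

# Exponents of Equation 3.20; real as long as tau_k > 4 tau_s.
V_pm = (1 + np.array([1, -1]) * np.sqrt(complex(1 - 4 * tau_s / tau_k))) / (2 * tau_s)

def psd_lorentz(f):
    """One-sided PSD of a trapped bead, Equation 3.23."""
    return D / (np.pi**2 * (f**2 + f_k**2))

print("V+ = %.3e 1/s, V- = %.3e 1/s" % (V_pm[0].real, V_pm[1].real))
print("corner frequency f_k = %.0f Hz" % f_k)
print("plateau (Eq. 3.25): %.3e vs 4 kB T gamma / k^2 = %.3e"
      % (psd_lorentz(0.0), 4 * kB * T * gamma / k**2))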
Using the Stokes-Boussinesq Friction Force  The most complete solution considered in this work is when the Stokes-Boussinesq friction force of Equation 3.5 and F_ex(t) = −kx(t) are used to set up the Langevin equation. Then, the VAF is [14]:

⟨ẋ(t)ẋ(0)⟩ = [k_B T / (γ_f (τ_s + τ_f/9))] Σ_{i=1}^{4} z_i³ e^{z_i²t} erfc(z_i√t) / Π_{j≠i} (z_i − z_j)        (3.26)

where the coefficients z₁, z₂, z₃ and z₄ are the four roots of the equation:

(τ_s + τ_f/9) z⁴ − √τ_f z³ + z² + 1/τ_k = 0

The MSD follows as:

⟨Δx²(t)⟩ = 2k_B T/k + 2D Σ_{i=1}^{4} e^{z_i²t} erfc(z_i√t) / [(τ_s + τ_f/9) z_i Π_{j≠i} (z_i − z_j)]        (3.27)
and the PSD is given by [27]:

⟨|x̃(f)|²⟩ = (D/π²f²) · (1 + √(f/2f_f)) / { [f_k/f − f/f_s − √(f/2f_f) − f/(9f_f)]² + [1 + √(f/2f_f)]² }        (3.28)

which is by far the most convenient formula of all three quantities to use for fitting data acquired in OTI experiments. Fortunately, despite their complexity, the limiting behaviors of all three expressions simplify in the same way as in the above-discussed cases.
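Because Equation 3.28 is the expression recommended for fitting, it is convenient to have it as a plain function of the characteristic frequencies. The sketch below is one possible transcription; the numerical values in the demonstration call are only representative orders of magnitude for a 0.5 μm resin sphere in water, not measured data.

import numpy as np

def psd_hydro(f, D, f_s, f_f, f_k):
    """One-sided PSD of a trapped sphere with hydrodynamic memory (Eq. 3.28)."""
    sq = np.sqrt(f / (2 * f_f))
    re = f_k / f - f / f_s - sq - f / (9 * f_f)   # elastic (real) part
    im = 1 + sq                                    # dissipative part
    return (D / (np.pi**2 * f**2)) * (1 + sq) / (re**2 + im**2)

# Illustrative values only: D in m^2/s and the frequencies in Hz.
D, f_s, f_f, f_k = 4.3e-13, 1.9e6, 6.4e5, 2.1e2
f = np.logspace(1, 5, 5)
print(psd_hydro(f, D, f_s, f_f, f_k))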
3.2.3
Time Scales of Brownian Motion
All of the above considerations show that many different time scales (as outlined in Table 3.1) govern the physics of Brownian motion. To better appreciate these times, we can consider a sphere with radius a_s = 1 μm in water. The sphere is set into motion by random collisions with fluid molecules within t_col ≈ d_mol/v_mol ≈ 0.1 ps, the correlation time of F_th(t), estimated by the ratio of the mean solvent particle separation d_mol and fluctuation speed v_mol [30]. The momentum is then transferred from the particle to the fluid on very different time scales. If compressibility effects of the fluid are taken into account, one-third of the initial momentum is carried off

Table 3.1 Overview of the characteristic time scales of a Brownian particle in a harmonic potential.

Time constant                            Origin
τ_s = m_s/γ_f = 2a_s²ρ_s/(9η_f)          Relaxation of the particle's momentum through Stokes friction
τ_f = a_s²ρ_f/η_f                        Diffusion of the fluid vorticity over one particle radius
τ_k = 6πη_f a_s/k                        Relaxation of the particle in the optical trapping potential

Typical values are calculated for a resin sphere (a_s = 1 μm, ρ_s = 1.51 kg dm⁻³) in water (ρ_f = 1 kg dm⁻³, η_f = 10⁻³ Pa s).
rapidly by a spherical sound wave, the front of which leaves the sphere after a time t_sw = a_s/c ≈ 0.7 ns, where c is the speed of sound in the fluid [31]. Equation 3.5, which was set up for an incompressible fluid, simply has to be corrected to include a rapid change of the particle's inertial mass m_s to the combined mass M = m_s + m_f/2 [13]. Consequently, the velocity correlation function starts with the initial value given by the equipartition theorem, ⟨ẋ(0)ẋ(0)⟩ = k_B T/m_s, and, after a short time on the order of t_sw, decays from k_B T/m_s to k_B T/M due to acoustic damping of the particle's velocity. When the sound wave has separated and the particle has relaxed with τ_s = m_s/γ_f ≈ 0.9 μs or τ_s* = (m_s + m_f/2)/γ_f ≈ 0.7 μs, the vortex ring develops around the particle. The region of vorticity (see Figure 3.1) grows diffusively, as the remaining momentum is distributed over increasingly larger volumes, with the disturbance taking a time of order τ_f = a_s²ρ_f/η_f to leave the particle. Finally, the optical trapping force, F_ex(t), sets in after a time τ_k = 6πη_f a_s/k and slows down the sphere, confining its motion around the potential minimum. The stronger the trap, the earlier optical confinement will reduce the particle's free motion.
For the model of a sphere in a harmonic potential with a typical spring constant k = 10 μN m⁻¹, τ_s and τ_f are in the microsecond range, whereas τ_k is in the millisecond range. The relaxation time of the optical trapping potential is thus well separated from the others, and its separation can be adjusted by choosing suitable experimental parameters, that is, a_s, ρ_s, ρ_f, η_f and k. The diffusion constant D is on the order of 1 μm² s⁻¹, and hence the time for the sphere to diffuse over its own radius is on the order of 1 s. Correspondingly, within 1 μs, the sphere will have diffused about 1 nm, which is the time and distance range accessible by OTI. This will allow study of the interplay between τ_f and τ_k, as shown in detail in Section 3.4.
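The characteristic times of Table 3.1 are easily reproduced. The following short helper is a sketch; the speed of sound in water, c ≈ 1480 m s⁻¹, is a standard literature value not quoted in the text.

import numpy as np

kB, T = 1.380649e-23, 295.0
a_s, rho_s = 1.0e-6, 1510.0
rho_f, eta_f = 1000.0, 1.0e-3
k = 10e-6                      # trap stiffness, 10 uN/m
c = 1480.0                     # speed of sound in water (m/s), assumed

m_s = (4/3) * np.pi * a_s**3 * rho_s
gamma = 6 * np.pi * eta_f * a_s

print("tau_s = m_s/gamma          = %.2e s" % (m_s / gamma))
print("tau_f = a_s^2 rho_f/eta_f  = %.2e s" % (a_s**2 * rho_f / eta_f))
print("tau_k = 6 pi eta_f a_s / k = %.2e s" % (gamma / k))
print("t_sw  = a_s / c            = %.2e s" % (a_s / c))
print("D     = kB T / gamma       = %.2e m^2/s" % (kB * T / gamma))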
3.3
Experimental Aspects of Optical Trapping Interferometry

3.3.1
Set-up of the Instrument
The apparatus consists mainly of a custom-built inverted light microscope with a 3-D sample positioning stage, an infrared laser for trapping, and a quadrant photodiode for high-resolution 3-D and time-resolved position detection. The two main custom-made circular base plates are made from the titanium alloy Ti-6Al-4V, on the basis of its high tensile strength, light weight, low thermal conductivity, low thermal expansion coefficient and corrosion resistance compared to steel or aluminum. The commercially available mechanical pieces are made of either aluminum or steel. In order to minimize mechanical vibrations, the whole set-up is mounted on a table with tuned damping (RS-4000, Newport, UK), supported by pneumatic isolators (I-2000, Newport). The main features of the instrument are shown in Figure 3.3.
3.3.1.1 Optical Trapping Interferometry and Microscopy Light Path
The optical paths can be divided into an infrared (IR) trapping and detection light path, and a visible illumination and imaging light path, as shown in Figure 3.4. The trapping beam is emitted by a diode-pumped, ultra-low-noise Nd:YAG laser with a wavelength of λ = 1064 nm (IRCL-500-1064-S; CrystaLaser, USA) and a maximal light power of 500 mW in continuous-wave mode. The choice of the near-IR wavelength satisfies the requirement of minimal water absorption, to avoid heating of the sample in the laser focus [29]. A high intensity gradient for good trapping efficiency is achieved by over-illuminating the back aperture of the high-numerical-aperture focusing lens (OBJ). Therefore, the effective laser beam diameter is expanded 20-fold by a telecentric lens system (EXP; Beam Expander, Sill Optics, Germany). In order to minimize noise, the laser is operated at high power, and its intensity is adapted after expansion by neutral density filters (NF1) with variable transmission coefficients (T = 0.25, 0.1, 0.01 or 0.001; OWIS, Germany). Decreasing the transmission coefficient of NF1 will decrease the trapping stiffness, as this depends linearly on the laser power [33]. Next, the IR beam is reflected by a dichroic mirror (DM1; AHF analysentechnik AG, Germany) into the high numerical aperture (NA = 1.2) of a water-immersion objective lens (OBJ), whose long working distance guarantees a stable space-invariant trap through the entire sample chamber. This is particularly essential when studies on Brownian particles far away from any surface boundary are of interest (as will be the case for the experiments discussed below).
The sample is mounted onto an xyz-piezo scanning table (PZT; P-561, Physik Instrumente, Germany) for manipulation and positioning. The PZT with controller (E-710 Digital PZT Controller; Physik Instrumente, Germany) has a travel range of 100 μm along all three dimensions, with a precision of 1 nm. The laser light focused by the objective lens is collected with a condenser (CND; 63X, Achroplan, NA = 0.9, water-immersion; Zeiss, Germany), and projected by a second dichroic mirror (DM2) and two lenses (L1 and L2, with focal lengths f₁ = 30 mm and f₂ = 50 mm) with a magnification of f₂/f₁ ≈ 1.67 onto an InGaAs quadrant photodiode (QPD; G6849, Hamamatsu Photonics, Japan). The QPD is placed in the back focal plane of the condenser and fixed on an xy translation stage (OWIS, Germany) for manual centering of the detector relative to the IR beam. In order to avoid possible saturation of the QPD, a second neutral density filter (NF2) can optionally be placed in front of the QPD. For illumination in the visible range, a 50 W halogen lamp is diffused (DIF) and projected by a mirror (M) through the condenser, objective and a 180 mm tube lens (TL) that creates an image of the object plane onto a charge-coupled device camera (CCD; ORCA ER S5107, Hamamatsu Photonics, Japan). The image of the object plane is digitized (Hamamatsu Digital Controller ORCA ER), and can be further processed (HiPic, Hamamatsu Photonics, Japan).
3.3.1.2 Sample Preparation
The sample chamber consists of a custom-made flow cell. A coverslip (thickness 130 μm) is stuck to a standard microscope slide using two pieces of double-sided tape, arranged in such a way as to form a 5 mm-wide and 70 μm-thick channel with a volume of about 20 μl. After loading with a suspension of microspheres, the flow cell is mounted upside down on the 3-D piezo-stage. In the experiments presented in Section 3.4, either polystyrene (ρ_s = 1.05 kg dm⁻³; Bangs Laboratories, USA), melamine resin (ρ_s = 1.51 kg dm⁻³; Sigma-Aldrich, USA) or silica spheres (ρ_s = 1.96 kg dm⁻³; Bangs Laboratories, USA) were used, with radii (a_s) varying from 0.27 to 2 μm. To guarantee the manipulation and analysis of exclusively one particle, a particle concentration of 10⁶ spheres per milliliter was used to maximize the average distance between two neighboring particles and minimize the mutual influence on their motions [34].
3.3.2
Position Signal Detection and Acquisition
When following the 3-D Brownian motion of the trapped particle, the scattering of the strongly focused trapping laser by the particle is measured with the QPD. The InGaAs quadrant photodiode (G6849, Hamamatsu Photonics, Japan) is 2.0 mm in diameter, with a dead zone of 0.1 mm between the quadrants. The photosensitivity is 0.67 A W⁻¹ at 1064 nm. The QPD signals are fed into a custom-built preamplifier (Pre-AMP; Öffner MSR-Technik, Germany) which provides two differential signals between the segments and one signal that is proportional to the total light intensity (Figure 3.5).

Figure 3.5 Position signal acquisition and data processing. Intensity fluctuations are recorded on the quadrant photodiode (QPD) and converted to volts (Pre-AMP). The signal is amplified (AMP) and digitized using the acquisition card (DAQ). The VAF, MSD and PSD are then calculated from the recorded position time trace.

Preamplification of the quadrant photodiode signals at 20 V mA⁻¹ with 0.67 A W⁻¹ photosensitivity leads to a voltage response of 13.4 V mW⁻¹. Subsequently, differential amplifiers (AMP; Öffner MSR-Technik) adjust the preamplifier signals for optimal digitalization by the data acquisition board (DAQ) with a dynamic range of 12 bits (NI-6110, National Instruments, USA). Amplification of the QPD signal is chosen to span the maximal dynamic range of the acquisition card. The amplifier (Öffner MSR-Technik), with a maximal gain of 500, has a cut-off frequency around 1 MHz. The particle's position can be detected in all three dimensions. On the QPD, scattered and unscattered light generate an interference pattern that corresponds to the probe's position. A displacement of the particle near the beam focus modulates the optical power collected by the QPD. When the sphere moves perpendicularly to the optical axis, that is, along the x-direction, the current signal S_x = (Q₁ + Q₂) − (Q₃ + Q₄), measured between both top segments (Q₁, Q₂) and both bottom segments (Q₃, Q₄) of the QPD, changes correspondingly. The same holds for movements in the y-direction. Displacements along the z-axis instead affect the sum signal S_z = Q₁ + Q₂ + Q₃ + Q₄ of the QPD, and so z displacements can be determined from the changes in total intensity. The full 3-D position information of the probe is thus encoded in the interference pattern of the forward-scattered and transmitted laser light recorded by the QPD. A detailed analysis of the detector response is given for fluctuations perpendicular to the optical axis in Ref. [35], and along the optical axis in Refs [36] and [37]. For small displacements from the trap center, the differential signals from the QPD are proportional to the lateral displacements of the particle in the optical trap, and the sum signal to the axial displacement.
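The quadrant combinations can be summarized in a few lines. In the sketch below, the pairing used for S_y is an assumption made for illustration; the text specifies only the x and z combinations explicitly.

def qpd_signals(Q1, Q2, Q3, Q4):
    """Combine the four quadrant currents into position signals."""
    Sx = (Q1 + Q2) - (Q3 + Q4)   # top minus bottom pairs: x displacement
    Sy = (Q1 + Q3) - (Q2 + Q4)   # left minus right pairs (assumed labeling): y
    Sz = Q1 + Q2 + Q3 + Q4       # total intensity: axial (z) displacement
    return Sx, Sy, Sz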
The scanning stage, CCD camera and data acquisition are controlled and coordinated by a custom-made program written in VEE (Agilent, USA). Data are saved in binary format and analyzed with Igor 6.0 (WaveMetrics, USA).
The 3-D position-time traces of the Brownian motion of the probe can be acquired with up to maximally N = 10⁷ points (this limitation is set by the working memory of the VEE program). Data analysis also becomes very slow when the data files exceed 3 × 10⁷ points per channel, and therefore the combination of data acquisition rate f_acq and total recording time t_tot must be adjusted according to N = f_acq t_tot. A schematic overview of the signal acquisition and data processing schemes is shown in Figure 3.5.
3.3.3
Position Signal Processing
The three quantities of VAF, MSD and PSD, which were introduced in Section 3.2, can be calculated from the same experimental time trace x(t), which is recorded in volts and converted to nanometers after fitting to the suitable theory of Brownian motion (as described in Section 3.4.1). An example of such a time trace, as well as the resulting VAF, MSD and PSD, is shown on the right-hand side of Figure 3.5 for a 2 μm resin bead. For the sake of clarity, only one-dimensional time traces x(t) will be presented throughout the remainder of this chapter, even though the developed data analysis strategy also applies to the y and z directions.
The velocity ẋ(t) = Δx(t)/Δt of the studied sphere is derived from the steps Δx(t) = x(t + Δt) − x(t) it performed, where Δt is the lag time related to the acquisition frequency as f_acq = 1/Δt. For the total number of acquired points N, the total recording time is expressed as t_tot = NΔt.
The velocity autocorrelation function ⟨ẋ(t)ẋ(0)⟩ is then defined as:

⟨ẋ(t)ẋ(0)⟩ = (1/Δt²) ⟨Δx(t)Δx(0)⟩        (3.29)

which is evaluated from the discrete time trace as:

⟨ẋ(t)ẋ(0)⟩ = (f_acq²/N) Σ_{i=1}^{N} Δx(t + iΔt) Δx(iΔt)        (3.30)

The discrete Fourier transform of x(t) is x̃(f) = (1/f_acq) Σ_i e^{i2πtf/N} x(t), where t = iΔt and i = 0, 1, . . ., N/2, and the power spectral density follows as:

⟨|x̃(f)|²⟩ = |x̃(f)|²/t_tot        (3.31)
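A direct transcription of the estimators of Equations 3.29-3.31 might look as follows; this is a sketch assuming a calibrated one-dimensional NumPy trace and a one-sided PSD convention.

import numpy as np

def vaf_msd_psd(x, f_acq, n_lags=1000):
    """x: 1-D position trace (np.array, metres), sampled at f_acq (Hz);
    n_lags must be much smaller than len(x)."""
    dt = 1.0 / f_acq
    dx = np.diff(x)                         # steps Delta x(t)
    N = len(dx)
    lags = np.arange(n_lags)
    # VAF (Equations 3.29/3.30): <dx(t) dx(0)> / dt^2, averaged over the trace
    vaf = np.array([np.mean(dx[l:] * dx[:N - l]) for l in lags]) / dt**2
    # MSD: mean squared displacement at the same lag times
    msd = np.array([np.mean((x[l:] - x[:len(x) - l])**2) if l else 0.0
                    for l in lags])
    # PSD (Equation 3.31): |FT(x)|^2 / t_tot, one-sided
    xf = np.fft.rfft(x) / f_acq
    psd = 2 * np.abs(xf)**2 / (len(x) / f_acq)
    freq = np.fft.rfftfreq(len(x), dt)
    return lags * dt, vaf, msd, freq, psd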
3.3.4
Temporal and Spatial Resolution of the Instrument
Apart from the signal arising from the particle's thermal fluctuations, anything that changes the intensity recorded by the quadrant photodiode will limit the resolution of the system. The main noise sources are the mechanical instabilities
of the microscope, and electronic noise combined with laser intensity fluctuations and pointing instabilities. Mechanical instabilities mainly cause low-frequency noise (drift). Laser intensity fluctuations may lead to temporal variations in the spring constants of the optical trapping potential, and pointing instabilities to unwanted drifting of the trapping focus in the specimen plane. To quantify the contribution of laser noise to the measured signal x(t), it is decomposed into:

x(t) = x_s(t) + x_n(t)

where x_s(t) is the particle's thermal fluctuation and x_n(t) is the noise contribution to the signal. A subtraction of its contribution can increase the quality of the signal. With the assumption that position fluctuations of the sphere and the noise are uncorrelated, that is, ⟨x_s(t)x_n(t′)⟩ = 0, the velocity autocorrelation function can be written as:

⟨ẋ(t)ẋ(0)⟩ = ⟨ẋ_s(t)ẋ_s(0)⟩ + ⟨ẋ_n(t)ẋ_n(0)⟩

the MSD of the acquired signal as:

⟨Δx²(t)⟩ = ⟨Δx_s²(t)⟩ + ⟨Δx_n²(t)⟩

and, similarly:

|x̃(f)|² = |x̃_s(f)|² + |x̃_n(f)|²

where x̃_s(f) and x̃_n(f) are the Fourier transforms of x_s(t) and x_n(t), respectively.
The calibrated (see Section 3.4.1) position fluctuations as a function of time of a trapped sphere (resin, radius a_s = 1 μm in water, k = 5 μN m⁻¹, f_acq = 5 MHz, t_tot = 2 s) are shown in Figure 3.6a (red line). The blue line represents the recording of the empty trap's noise signal x_n(t) after the sphere has been released. x_n(t) can be minimized by optimizing the illumination pattern of the incident laser spot on the QPD. The comparison between x_s(t) and x_n(t) indicates that the laser noise contribution in this configuration is very small; its influence on the VAF, MSD and PSD is shown in Figures 3.6b-d, respectively. As expected, the velocity fluctuations of the laser spot without a scattering particle are uncorrelated and fluctuate around zero (Figure 3.6b, blue line). Correspondingly, ⟨Δx²(t)⟩_n is small and constant (Figure 3.6c, blue line), whereas ⟨Δx²(t)⟩ increases with time (Figure 3.6c, red line). The resolution of the MSD of the sphere's position fluctuations can then be enhanced by subtracting ⟨Δx²(t)⟩_n from ⟨Δx²(t)⟩. The spatial resolution of 8 Å achieved by the apparatus can be read from the first time points of the MSD (inset of Figure 3.6c, indicated by brackets).
In the frequency domain (Figure 3.6d), we define f_N as the frequency at which the power spectrum of the trapped bead (red line) drops to the noise level given by the power spectrum of the empty trap (blue line). The amplifier has a Butterworth-type low-pass filter with a cut-off frequency at 1 MHz, which is slightly above the frequency f_N ≈ 0.8 MHz. Therefore, in the following section we will analyze data in the frequency range up to a maximum of 0.5 MHz, setting the time resolution

Figure 3.6 (a) Position signal of the sphere (a_s = 2 μm) in the trap (red) and of the empty trap (blue) acquired during t_tot = 2 s at f_acq = 5 MHz; (b) VAF, (c) MSD and (d) PSD calculated from the position signal (the black circles represent the PSD blocked in five bins per decade). The frequency f_N, where the signal reaches the noise floor, is indicated by an arrow.

of the OTI system at 2 μs. Additionally, in order to avoid aliasing artifacts [38] from the data acquisition card, the acquired signal is oversampled by at least a factor of 2: f_acq > 2f_N.
When plotting the VAF and PSD on a log-log scale, further data processing can be applied to enhance noisy data. As both functions are sampled at equidistant points, the number of points per interval in a log-log plot increases with, respectively, time or frequency. Therefore, data are commonly averaged within blocks of consecutive data points [39], resulting in equidistantly distributed points on the logarithmic scale. In the example discussed above, data were blocked in five bins per decade (Figure 3.6d, black circles). The scattering of the data around their average value gives the standard error, which remains within the size of the black circles. Together with improving the visibility of the data, blocking allows fitting the data by the least-squares method, which assumes that the analyzed data points are statistically independent and conform to a Gaussian distribution [27]. Whilst the second assumption is satisfied by the VAF and PSD (as defined in Equations 3.29 and 3.31), the first assumption is satisfied only after blocking.
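The blocking procedure described above can be sketched as follows; the geometric bin edges and the use of the standard error as scatter estimate follow the description in the text.

import numpy as np

def block_log(freq, psd, bins_per_decade=5):
    """Average (freq, psd) into logarithmically equidistant bins."""
    freq, psd = np.asarray(freq), np.asarray(psd)
    good = freq > 0
    edges = 10 ** np.arange(np.floor(np.log10(freq[good].min())),
                            np.ceil(np.log10(freq.max())) + 1e-9,
                            1.0 / bins_per_decade)
    f_blk, p_blk, err = [], [], []
    for lo, hi in zip(edges[:-1], edges[1:]):
        sel = good & (freq >= lo) & (freq < hi)
        if sel.sum() > 1:
            f_blk.append(freq[sel].mean())
            p_blk.append(psd[sel].mean())
            err.append(psd[sel].std() / np.sqrt(sel.sum()))  # standard error
    return np.array(f_blk), np.array(p_blk), np.array(err)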
3.4
High-Resolution Analysis of Brownian Motion
After having characterized the performance of the OTI set-up, we can now present measurements of the details of the Brownian motion of the model sphere. We demonstrate that the theory presented in Section 3.2 may be used for fitting and calibrating the data. The influence of experimentally relevant parameters on Brownian motion will be demonstrated, and the consequences of the presence of inertial effects in the data discussed.
3.4.1
Calibration of the Instrument
Position signal calibration consists of determining the detector sensitivity β and the spring constant k of the optical trapping potential. The term β has units of V nm⁻¹, as the acquired position signal is recorded in volts while the position of the particle is expressed in nanometers. Both experimental and theoretical investigations [40] have shown that, close to the trap center, the optical trapping forces are well approximated by three orthogonal forces derived from a harmonic trapping potential. For a given wavelength of the laser beam, the spring constant along each direction depends mainly on the particle's size, the difference between the refractive indices of particle and fluid, and the intensity of the trapping laser light. When the sphere's radius is known, the physics of Brownian motion in a harmonic potential (see Section 3.2.2.2) can be exploited to calibrate the optical trap [38]. Any of the three quantities presented in Section 3.3.3, namely the VAF, MSD or PSD, can be used to obtain β and k simultaneously. An overview is provided in Figure 3.7 for measurements of a resin sphere with radius a_s = 0.5 μm, f_acq = 0.5 MHz, t_tot = 20 s. For least-squares fitting, the VAF (top graph) is normalized by its initial value, blocked in 10 bins per decade, and represented in a linear-log plot, whereas the PSD (bottom graph) is blocked in 10 bins per decade and plotted on a log-log scale. As can be seen by comparing Figures 3.2 and 3.6, the bandwidth of OTI allows the measurement of Brownian motion within a time range during which it is greatly influenced by the hydrodynamic memory effect. Hence, the calibration must be made by using a theory that accounts for the fluid's inertia, that is, using Equation 3.26 instead of Equation 3.8 for fitting the VAF, Equation 3.27 instead of Equation 3.9 for fitting the MSD, and Equation 3.28 instead of Equation 3.10 for fitting the PSD. The black continuous line in each graph of Figure 3.7 therefore corresponds to Equations 3.26, 3.27 and 3.28, respectively, being fitted to the data (red circles) with the two fitting parameters β and k. All three fits generally provide an accuracy of better than 6%, depending on the acquisition frequency and total acquisition time. The relative difference between the values of β obtained from the three equations is less than a few percent [41]. The trapping force constant k (see Table 3.2 and the next section) is obtained from the long-time and, respectively, low-frequency limits of each function, according to Equations 3.21, 3.24 and 3.25, or from the corner frequency f_k in Equation 3.23. The two main sources of error are uncertainties in the determination of the bead size and of the temperature
[Figure 3.7: three panels showing the normalized VAF (with the (t/τ_f)^{−3/2} and −e^{−t/τ_k} behaviors indicated) versus time, the MSD (nm²) with its 2k_BT/k plateau versus time, and the PSD (nm² Hz⁻¹) with its 4k_BTγ_f/k² plateau versus frequency (10¹ to 10⁵ Hz).]

Figure 3.7 Measured VAF (raw data as red line, blocked values as red circles), MSD (circles; data points were removed for clarity) and PSD (only blocked values are shown as circles). The respective fits calculated from Equations 3.26, 3.27 or 3.28 are plotted as black lines.
inside the laser focus. The latter can lead, in particular, to unwanted fluctuations in the fluid's viscosity and density [42].
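A calibration fit along these lines can be sketched with SciPy's curve_fit, assuming the PSD model of Equation 3.28 (transcribed again here so the block is self-contained) and an uncalibrated voltage PSD that scales as β². The names f_blk and p_blk for the blocked experimental spectrum are hypothetical placeholders.

import numpy as np
from scipy.optimize import curve_fit

kB, T = 1.380649e-23, 295.0
a_s, rho_s = 0.5e-6, 1510.0            # resin sphere of Figure 3.7
rho_f, eta_f = 1000.0, 1.0e-3          # water
gamma = 6 * np.pi * eta_f * a_s
m_s = (4/3) * np.pi * a_s**3 * rho_s
f_s = gamma / (2 * np.pi * m_s)                    # Equation 3.11
f_f = eta_f / (2 * np.pi * a_s**2 * rho_f)         # f_f = 1/(2 pi tau_f)
D = kB * T / gamma

def psd_volts(f, beta, k):
    """Model for the measured voltage PSD: beta^2 times Equation 3.28."""
    f_k = k / (2 * np.pi * gamma)
    sq = np.sqrt(f / (2 * f_f))
    re = f_k / f - f / f_s - sq - f / (9 * f_f)
    return beta**2 * (D / (np.pi**2 * f**2)) * (1 + sq) / (re**2 + (1 + sq)**2)

# With f_blk, p_blk the blocked experimental PSD (V^2/Hz):
# popt, pcov = curve_fit(psd_volts, f_blk, p_blk, p0=[1e7, 30e-6])
# beta, k = popt   # detector sensitivity (here V/m) and trap stiffness (N/m)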
3.4.2
Influence of Different Parameters on Brownian Motion
In this section we describe the influences of the trapping potential, the surrounding fluid's properties and the particle's properties on Brownian motion.
3.4.2.1 Changing the Trapping Potential
Table 3.2 Trapping force constants k₁, k₂ and k₃ obtained from the long-time and, respectively, low-frequency limits of the measured quantities.

Quantity                                 k₁                       k₂                       k₃
τ_k = 6πη_f a_s/k                        69 μs → 136 μN m⁻¹       293 μs → 33 μN m⁻¹       798 μs → 12 μN m⁻¹
f_k = k/(12π²η_f a_s)                                             547 Hz → 32 μN m⁻¹       197 Hz → 12 μN m⁻¹
⟨|x̃(f)|²⟩_{f→0} = 4k_B T γ_f/k²

[Figure 3.8: measured VAF, MSD (nm²) and PSD versus time (10⁻⁵ to 10⁻³ s) and, respectively, frequency (10¹ to 10⁵ Hz) for the three force constants k₁, k₂ and k₃.]

Figure 3.8 Measured VAF, MSD and PSD for three different force constants: k₁ = 140 μN m⁻¹ (red line), k₂ = 32 μN m⁻¹ (blue line) and k₃ = 12 μN m⁻¹ (green line).
To minimize the influence of the trapping force, the trapping stiffness is minimized and the OTI configuration is used solely as a position detector for single particle tracking.
3.4.2.2 Changing the Fluid
In order to study the influence of the fluid's inertia and detect the long-time tail in the normalized VAF (see Figure 3.1 and Section 3.2.2.1), we track the Brownian motion of a larger resin sphere (a_s = 1.5 μm, ρ_s = 1.51 kg dm⁻³, f_acq = 0.5 MHz, t_tot = 20 s) in three different solvents having different viscosities η_f and, unavoidably, different densities ρ_f (Figure 3.9). The resin beads are suspended either in a more viscous mixture of 30% glycerol in water, in pure water, or in less viscous acetone.
[Figure 3.9: normalized VAF of the sphere in acetone, H₂O and 30% glycerol, with the e^{−t/τ_s} and t^{−3/2} limiting behaviors indicated.]

Figure 3.9 Log-linear representation of the measured normalized VAF of a resin sphere (a_s = 1.5 μm, ρ_s = 1.51 kg dm⁻³, f_acq = 0.5 MHz, t_tot = 20 s) with k = 10 μN m⁻¹, for three different fluids: 30% glycerol in H₂O (ρ_f = 1.18 kg dm⁻³, η_30% glycerol = 2.11 × 10⁻³ Pa s), water and acetone.

[Figure 3.10: normalized MSD ⟨Δx²(t)⟩/2Dt versus time (10⁻⁵ to 10⁻⁴ s) for the three spheres.]
Figure 3.10 Comparison of the motion of particles with the same radius but different densities. Log-log representation of the measured normalized MSD for a silica sphere (a_s = 1.97 μm, ρ_s = 1.96 kg dm⁻³), a resin sphere (a_s = 2 μm, ρ_s = 1.51 kg dm⁻³) and a polystyrene sphere (a_s = 1.94 μm, ρ_s = 1.05 kg dm⁻³). The fits correspond to the continuous lines in the respective colors.
Having demonstrated agreement between Brownian motion theory and OTI data, and studied the influence of parameters which can be varied experimentally, we can derive a rule of thumb to estimate the time range during which the particle's motion
can be considered as effectively free from the influence of the trap. Furthermore, we can determine for how long the inertia of a Newtonian fluid will influence the Brownian probe's motion and prevent it from performing a diffusive random walk inside an optical trapping potential. This sets the bandwidth of OTI for high-resolution single particle tracking to probe many different media locally.
3.4.3.1 Single Particle Tracking by OTI
We define the time t_free starting from which the optically confined MSD given by Equation 3.27 begins to deviate by at least 2% from the free MSD described by Equation 3.17. We record the motion of different polystyrene spheres of sizes a_s = 0.27, 0.33, 0.50, 0.74, 1.00 and 1.25 μm, confined by potentials with spring constants ranging from 1 to 100 μN m⁻¹. For stronger traps (τ_k < 1 ms), the data are recorded at 500 kHz during 20 s and calibrated in the range between 2 μs and 1 ms. For softer traps (τ_k > 1 ms), the data are recorded at 200 kHz for 50 s and calibrated between 5 μs and 10 ms. In Figure 3.11, t_free is represented as a function of τ_k on a log-log scale, which allows us to formulate an approximate empirical relation between both time scales [41]:

t_free = τ_k/20        (3.32)

Hence, in the time range [2 μs, t_free], which is limited on one side by the noise floor (see Figure 3.6) and on the other side by τ_k/20, OTI can track the motion of a single free Brownian particle. Under these conditions, the sphere probes only its local environment, for up to three decades in time.
3.4.3.2 Diffusion in OTI
The next issue to address is the question of whether or not the Brownian probe has time to reach a purely diffusive behavior before it becomes confined by the potential of the trap. This is equivalent to studying the influence of inertial effects on the particle's motion in the optical trap at times shorter than t_free. Therefore, we investigate the diffusion of a small sphere, which is expected to perturb the fluid less, so that τ_f and the hydrodynamic memory effect are smaller. Furthermore, we adjust the trapping potential to be the softest possible, as the time region where the particle's motion is free from the influence of the trap lasts longer for high τ_k.
[Figure 3.11: log-log plot of t_free (ms) versus τ_k (ms) for spheres with a_s = 0.27, 0.33, 0.50, 0.74, 1.00 and 1.25 μm.]

[Figure 3.12: ⟨Δx²(t)⟩/2Dt (0.88 to 1.00) versus time.]
Figure 3.12 ⟨Δx²(t)⟩/2Dt for the sphere with a_s = 0.27 μm and k = 1.5 μN m⁻¹, f_acq = 0.2 MHz, t_tot = 50 s, fitted by Equation 3.27 (black line). The theory for the free particle is given by the blue line corresponding to Equation 3.17. The arrow indicates the time when ⟨Δx²(t)⟩_free/2Dt reaches diffusive motion to within 2%.
We introduce the dimensionless representation ⟨Δx²(t)⟩/2Dt to distinguish between the free diffusive motion, when ⟨Δx²(t)⟩/2Dt = 1 (see Section 3.2.2.1), and the motion influenced by inertial effects when the particle is either free or optically confined (cases (i) and (ii), respectively). In both cases the motion is nondiffusive, as ⟨Δx²(t)⟩/2Dt < 1.
The measured ⟨Δx²(t)⟩/2Dt for a polystyrene sphere (a_s = 0.27 μm, k = 1.5 μN m⁻¹) is shown in Figure 3.12 (red circles), fitted to Equation 3.27 (black line) and compared to ⟨Δx²(t)⟩_free/2Dt given by Equation 3.17 (blue line). Here, it can be seen that ⟨Δx²(t)⟩/2Dt reaches a maximum of 0.96, but never 1.
For the free particle, ⟨Δx²(t)⟩_free/2Dt = 1 would occur to within 2% error after approximately 200 μs (Figure 3.12, arrow). Thus, in order to observe free diffusive Brownian motion, the optical trap would have to be so weak that t_free > 0.2 ms is satisfied, or equivalently τ_k > 4 ms according to Equation 3.32. However, for all the combinations of particle sizes and spring constants studied, we could never adjust such a long relaxation time. In the particular case of the sphere with a_s = 0.27 μm, a spring constant k < 1 nN m⁻¹ would be needed to observe free diffusive motion for at least one decade in time. However, such a low spring constant does not allow us to trap and observe the particle for a sufficiently long period of time. Hence, in experiments using optical traps, the motion of a particle is influenced either by memory effects and/or by the harmonic potential [41]. This is in contradiction with assumptions commonly made in optical trapping experiments, where a normal diffusive behavior of the trapped particle is assumed and inertial effects from the fluid are ignored [38, 43].
The time-dependent diffusion coefficient can be derived from the VAF or the MSD as:

D(t) = ∫₀ᵗ ⟨ẋ(t′)ẋ(0)⟩ dt′ = (1/2) d⟨Δx²(t)⟩/dt        (3.33)

and approaches the diffusion constant D in the infinite time limit when F_ex(t) = 0:

D = ∫₀^∞ ⟨ẋ(t′)ẋ(0)⟩ dt′ = k_B T/(6πη_f a_s)        (3.34)

The trapping potential can also be characterized from the recorded position statistics through the Boltzmann distribution:

p(x)dx = c e^{−E(x)/k_B T} dx        (3.35)

which describes the probability p(x)dx of finding the Brownian particle in the potential E(x) (c normalizes the probability distribution).
From the calibrated time traces acquired with f_acq = 0.5 MHz, the probability distribution is represented in Figure 3.13 for two different resin spheres with a_s = 0.5 μm (red histogram) and a_s = 2 μm (green histogram), trapped in a similar potential with k = 12.5 μN m⁻¹ but with obviously different τ_k: τ_k^small = 69 μs and τ_k^big = 798 μs. The position histograms, with a bin width of 1 nm, are compared for both spheres, and contain either N = 40 000 data points, corresponding to an acquisition time t_tot = 0.08 s (top graph), N = 400 000 points, corresponding to an acquisition time t_tot = 0.8 s (middle graph), or N = 4 000 000 points, corresponding to an acquisition time t_tot = 8 s (bottom graph).
Even though each histogram apparently contains enough data points to perform a statistical analysis, these points are clearly not statistically independent. Indeed, the upper histogram features only about 100 statistically independent points for the larger sphere and about 1000 for the smaller sphere. The smaller sphere samples the trapping potential well more rapidly than the larger one, which is reflected in the difference between τ_k^small and τ_k^big; the larger sphere will take 10-fold longer to explore the potential and, consequently, for statistical analysis its trajectory should be acquired over correspondingly longer times. The temporal resolution of the Boltzmann statistics method is determined by the time required to record uncorrelated data, and depends heavily on the particle's size and τ_k. The high acquisition rates used throughout these studies are not needed in this case.
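The Boltzmann-statistics method of Equation 3.35 reduces, for a harmonic potential E(x) = kx²/2, to reading the trap stiffness from the variance of the position histogram. The sketch below adds the subsampling step implied by the discussion of statistically independent points; the numbers in the final comment reproduce the example given above.

import numpy as np

kB, T = 1.380649e-23, 295.0

def stiffness_from_histogram(x, f_acq, tau_k):
    """Trap stiffness from position statistics; x is a calibrated trace (m)."""
    step = max(1, int(round(f_acq * tau_k)))   # ~1 sample per relaxation time
    xs = x[::step]                             # approximately independent samples
    return kB * T / np.var(xs), len(xs)

# Example: for tau_k = 798 us at f_acq = 0.5 MHz, step = 399, so a 0.08 s
# trace of 40 000 points leaves only ~100 independent samples, as in the text.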
3.5
Summary and Outlook
In this chapter we have shown how OTI can be used to study Brownian motion down to hydrodynamic time scales, where the response of the surrounding fluid becomes dominant. This is only possible due to the high bandwidth (up to 500 kHz) and
[Figure 3.13: position histograms (counts ×10³ versus displacement, −60 to 60 nm) for a_s = 0.5 μm (τ_k = 69 μs) and a_s = 2.0 μm (τ_k = 798 μs), with N = 40 000, 400 000 and 4 000 000 points from top to bottom.]
Figure 3.13 Histograms of position fluctuations (bin width of 1 nm) acquired with f_acq = 0.5 MHz for spheres with a_s = 0.5 μm and a_s = 2.0 μm. The number of data points N in the histograms increases from top to bottom, from 40 000 (t_tot = 0.08 s) to 4 000 000 (t_tot = 8 s). The trapping stiffness k = 12.5 μN m⁻¹ is similar for both bead sizes.
subnanometer spatial resolution of the position detection configuration. The precision achieved allows us to detect not only the effects of the fluid's inertia but also the more subtle effects of the sphere's inertia.
The details in the motion of the trapped model sphere provide insight into the interplay between inertial effects and the optical trapping potential, as summarized by Figure 3.14. This shows, in particular, the overlap of τ_f, the characteristic time of the
viscous fluid, with τ_k, the relaxation time of the restoring force of the trapping potential. The time t_free, which separates τ_f from τ_k, was determined empirically and corresponds to τ_k/20. Below this time, OTI can be used solely as a position detector for single particle tracking with unprecedented spatial and temporal resolution. At these time scales, the sphere performs a non-random walk, dominated by the nature of the surrounding medium [45]. The presented method is capable of providing new insights into the behavior of media that are more complex than just a simple viscous fluid, thus extending the bandwidth of microrheology by two decades in time [46]. For example, the high-frequency response of a viscoelastic polymer solution should provide information on the nanomechanical properties of the polymer molecules. In particular, highly dynamic polymers, such as those encountered in a living cell, should transmit their mechanical and dynamic signatures to the Brownian particle. Furthermore, an obstacle in the particle's trajectory, such as a surface with, for example, various chemical functionalities, or a cell membrane, should influence the Brownian motion in a characteristic way. Such studies for a variety of biopolymers and surfaces are currently in progress in our laboratory.
Acknowledgments
The authors are grateful to J. Lekki for help in data acquisition, to P. De Los Rios, H. Flyvbjerg and T. Franosch for discussions, and to D. Alexander for reading the manuscript. B.L. and C.G. acknowledge the financial support of the Swiss National Science Foundation and its NCCR. S.J. acknowledges the support of the Gebert Rüf Foundation. The authors also thank EPFL for funding the experimental equipment.
References
1 Haw, M.D. (2002) Journal of Physics Condensed Matter, 14, 7769.
2 Einstein, A. (1905) Annalen Der Physik, 17,
549.
3 Stokes, G.G. (1851) Transactions of
the Cambridge Philosophical Society, 9,
8.
4 Langevin, P. (1908) Comptes Rendus Hebdomadaires des Séances de l'Académie des Sciences, 146, 530.
5 Henri, V. (1908) Comptes Rendus Hebdomadaires des Séances de l'Académie des Sciences, 146, 1024.
6 Vladimirsky, V. and Terletzky, Y. (1945)
Zhurnal Eksperimentalnoi i Teoreticheskoi
Fiziki (in Russian), 15, 259.
15 Frey, E. and Kroy, K. (2005) Annalen Der
Physik, 14, 20.
16 Gittes, F., Schnurr, B., Olmsted, P.D.,
MacKintosh, F.C. and Schmidt, C.F. (1997)
Physical Review Letters, 79, 3286.
17 Boon, J.P. and Bouiller, A. (1976) Physics
Letters A, 55, 391.
18 Paul, G.L. and Pusey, P.N. (1981) Journal of
Physics A - Mathematical and General, 14,
3301.
19 Ohbayashi, K., Kohno, T. and Utiyama, H.
(1983) Physical Review A, 27, 2632.
20 Weitz, D.A., Pine, D.J., Pusey, P.N. and
Tough, R.J.A. (1989) Physical Review Letters,
63, 1747.
21 Kao, M.H., Yodh, A.G. and Pine, D.J.
(1993) Physical Review Letters, 70, 242.
22 Ashkin, A., Dziedzic, J.M., Bjorkholm, J.E.
and Chu, S. (1986) Optics Letters, 11, 288.
23 Berg-Sorensen, K. and Flyvbjerg, H. (2005)
New Journal of Physics, 7, 38.
24 Landau, L.D. and Lifshitz, E.M. (1987)
Fluid Mechanics, Vol. 6, 2nd edn,
Butterworth-Heinemann, Oxford.
25 Lorentz, H.A. (1921) Lessen over Theoretische
Natuurkunde, Vol. V, E.J. Brill, Leiden.
26 van der Hoef, M.A., Frenkel, D. and Ladd, A.J.C. (1991) Physical Review Letters, 67, 3459.
27 Berg-Sorensen, K. and Flyvbjerg, H. (2004) Review of Scientific Instruments, 75, 594.
28 Uhlenbeck, G.E. and Ornstein, L.S. (1930) Physical Review, 36, 823.
29 Svoboda, K. and Block, S.M. (1994) Annual
Review of Biophysics and Biomolecular
Structure, 23, 247.
30 Reif, F. (1985) Fundamentals of Statistical
and Thermal Physics, McGraw-Hill,
Singapore.
31 Henderson, S., Mitchell, S. and Bartlett, P.
(2002) Physical Review Letters, 88, 088302.
4
Nanoscale Thermal and Mechanical Interaction Studies using Heatable Probes
Bernd Gotsmann, Mark A. Lantz, Armin Knoll, and Urs Dürig

4.1
Introduction
results from the broad field of heated-probe SFM are addressed. Although the applications of these techniques are rather diverse, two common elements are the use of a sharp, temperature-sensitive tip and the use of SFM techniques to scan this tip over the surface and simultaneously measure the surface topography. Many of these applications also require a means to heat the tip, and in turn the sample, on a highly local scale. In the following sections we first describe the various types of probe that have been developed for thermal scanning probe microscopy, and outline the basics of probe-based imaging of thermal properties. Later, we analyze the various heat-loss mechanisms that play a role in the interpretation of thermal SPM data. Specific applications are discussed thereafter, including thermomechanical nanoindentation, data storage, and nanopatterning and lithography.
4.2
Heated Probes
At the heart of all the techniques described in this chapter is a heatable probe with an integrated means of sensing the temperature of the probe tip. As with all scanning probe techniques, the resolution is limited, at least in part, by the geometry of the tip apex and the area of contact between the tip and the surface. The most widely used heated probe is a Wollaston wire probe. In this technique, a thin, bent platinum/rhodium wire is used to produce heat and detect temperature. For SPM-based applications, the wire is bent into the shape of a cantilever, as illustrated in Figure 4.1. Often, a mirror is also glued onto the back of the cantilever to improve optical detection of the cantilever bending. The temperature of the wire can be determined by measuring its electrical resistance and using the known temperature dependence of its resistivity.
Another approach, developed by IBM, uses an all-silicon microfabricated cantilever with an integrated heater and tip (see Figure 4.3) [13]. Here, the largest part of the two-legged cantilever is made from highly doped silicon (10²⁰ cm⁻³ As), whereas the part of the cantilever that supports the tip is lower-doped (5 × 10¹⁷ cm⁻³) and serves as both heater and sensor. With dimensions of 4 μm × 6 μm and a thickness of 200 nm, the heater has a resistance of a few kΩ. The known temperature dependence of the resistivity of doped silicon can be exploited to sense the heater temperature. This type of thermal probe has been used in the majority of the experiments described in this chapter.
The temperature calibration of the heater can be carried out by measuring a current-voltage response curve (I-V curve) of the cantilever (Figure 4.4). For this, a resistor is typically placed in series with the cantilever. The measured voltage drop
[Figure 4.4: resistance (kΩ) and current (mA) of the cantilever versus voltage (V) and dissipated power (mW), with the corresponding heater temperature (0 to 800 °C) indicated on the upper axis.]
across the resistor is used to determine the current flowing through the cantilever and, in combination with the measured voltage drop across the cantilever, can be used to calculate the electrical power dissipated in the cantilever. Initially, as the current flowing through the heater is increased, the dissipated power results in an increase in resistance due to increased scattering of the carriers. As the temperature rises, the number of thermally generated carriers also increases, which tends to reduce the rate at which the resistance increases with temperature. When the number of thermally generated carriers equals the number of dopants, the resistance reaches its maximum value, and begins to decrease with further increases in power and temperature. The temperature at which the maximum resistance occurs is a function of the doping density and is known from the literature [15]. The power needed to reach the maximum resistance, P_Rmax, is determined from the measured I-V data. It is assumed that all of the power dissipated in the cantilever contributes to heating of the heater structure, and that the temperature change of the heater is a linear function of the dissipated power. We can then rescale a plot of resistance versus power to a plot of resistance versus temperature using two known values, namely the resistance at room temperature, measured at very low power, and the temperature at which the maximum resistance occurs. For the doping values used here, the maximum resistance occurs at 550 °C, and the heater temperature can thus be calculated using

T_heater = RT + P (550 °C − RT)/P_Rmax ≡ RT + R_th P,

where RT is room temperature.
The implicit assumption here is that the thermal resistance of the system, R_th, is independent of the heater temperature, T_heater. We find that R_th is typically on the order of 10⁵ K W⁻¹ under ambient conditions. When checking all of these assumptions by measuring the temperature of the heater by independent means, we found that the resulting systematic error is far below the measurement errors, fabrication tolerances and scatter. Using this calibration method, we estimate an absolute error of about 30% for the temperature difference ΔT = T_heater − RT. Relative measurements and temperature changes, however, can be conducted with a temperature resolution of 0.1 K.
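The rescaling described above can be condensed into a short routine. The following is a sketch, assuming arrays of measured cantilever voltage and current; the function name is ours, and the linear T(P) interpolation with T_Rmax = 550 °C follows the calibration procedure as described in the text.

import numpy as np

def heater_temperature(V, I, T_room=25.0, T_Rmax=550.0):
    """V, I: np.arrays of cantilever voltage (V) and current (A)."""
    R = V / I                       # cantilever resistance (Ohm)
    P = V * I                       # dissipated power (W)
    P_Rmax = P[np.argmax(R)]        # power at the resistance maximum
    # Linear T(P) through (0, T_room) and (P_Rmax, T_Rmax):
    R_th = (T_Rmax - T_room) / P_Rmax      # thermal resistance (K/W)
    return T_room + R_th * P, R_th

# Usage: T, R_th = heater_temperature(V_meas, I_meas)  # hypothetical arrays;
# R_th is typically ~1e5 K/W under ambient conditions for these levers.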
The time scale that these heaters are able to probe is related to the thermal equilibration time of the cantilevers. Although the dynamics is not a simple RC-type exponential [14, 16], it can be approximated by a single exponential with a single time constant. For the cantilevers used in most of the examples discussed below, the time constant is on the order of 7 to 10 μs; this is then the time constant that limits fast thermal sensing. For a rapid application of heat, however, the heater can be operated in nonequilibrium on time scales down to 1 μs and lower [14]. The transient temperature can be sensed reliably at any time scale by means of the electrical resistance. By design, the time constant can be decreased to values below 1 μs, but there are trade-offs between power consumption, sensitivity, time constant, ease of fabrication and the mechanical stability of the cantilever [16, 17]. Therefore, the optimum design depends critically on the application envisaged. For a detailed study of cantilever designs and time constants, the reader is referred to Ref. [14].
4.3
Scanning Thermal Microscopy (SThM)
cannot easily be insulated, and therefore a heater typically has several heat-loss paths. The main heat-loss paths, which do not vary with the local sample conductivity, are conduction through the cantilever legs, nonlocal air conduction between heater and sample surface, and radiative cooling. Another potential heat-loss path is through a water meniscus that can form at the point of contact between tip and surface. This will effectively increase the heat-conduction cross-section, leading to a reduced lateral resolution. From experiments performed under ambient conditions, Shi et al. [10] concluded that heat transfer between heater and sample is dominated by conduction through a water meniscus. The heat transfer paths are discussed in more detail in Section 4.4.
The method described above can be refined by modulating the heater drive voltage, which results in a modulation of the heat flow to the sample. The resulting ac component of the heater temperature can be measured using a lock-in amplifier and used to produce an ac thermal image. This ac heat-loss signal changes depending on how the modulation period compares with the diffusion time of heat in the tip-surface contact region; that is, lower-frequency signals diffuse further than do high-frequency signals. Thus, by varying the modulation frequency of the heater, the probing depth can be controlled. This can be seen in Figure 4.6, which shows two ac thermal images taken at 1 and 30 kHz. The sample consists of islands of high-thermal-conductivity material surrounded by low-thermal-conductivity material, both covered by a polymer layer. In the image taken at 1 kHz, the ac signal probes below the polymer layer and strong material contrast is observed, whereas at 30 kHz both probing depth and contrast are significantly reduced.
The limits of SThM technology are not easily defined. There is a trade-off between time and temperature resolution on the one hand, and spatial resolution on the other hand. For example, increasing the contact area between the tip and sample increases the thermal conductance, resulting in larger signals and therefore improved sensitivity, but at the expense of reduced lateral resolution. As will be shown below (Section 4.4), the high thermal impedance of the tip and the tip-surface interface make working on samples with high thermal conductivity challenging. Ideally, the
thermal impedance of the tip and tip-surface interface should be comparable to, or smaller than, those of the sample. The tip and tip-sample interface impedances increase as the tip is made sharper and the contact area reduced, making high-spatial-resolution experiments challenging. Nevertheless, a spatial resolution in the range of some tens of nanometers is feasible on some samples. For example, Shi et al. have demonstrated a resolution better than 100 nm on metallic wires [10], and better than 50 nm when imaging a carbon nanotube [12]. A challenge for the quantitative analysis of SThM images is the unknown interaction volume under the probing tip [19]. Nevertheless, the method is very successful in the study of polymers and biological samples. For a recent review, see Ref. [19].
An example of high-lateral-resolution SThM is given in Figure 4.7. Here, the very small contrast between two materials of similar thermal conductivity (silicon oxide and hafnium oxide) is observed. The sample consisted of 2 nm-thick islands of SiO₂ surrounded by a 3 nm-thick film of HfO₂ on a single-crystal silicon substrate. Note that both materials have a considerably higher thermal conductivity than polymers. The measurements were performed in a high-vacuum environment using silicon probes with integrated silicon heaters (as described in Section 4.2). A lateral resolution of about 25 nm was achieved, and the previously unknown thermal conductivity of the 3 nm-thick HfO₂ film was determined. This example nicely demonstrates the potential of using SThM for quantitative measurements, even at high spatial resolution.
4.4
Heat-Transfer Mechanisms
Most of the applications described in this chapter use a sharp, heated probe to deliver heat to a surface on a highly localized scale. However, this process is often rather inefficient because of the high thermal resistance of nanometer-sharp tips and nanometer-sized contact areas, in combination with the other parasitic heat-loss paths present in the system.
In this section, we analyze the various heat-loss paths and mechanisms that can play a role in heated-probe experiments. The analysis in this section is taken from some unpublished results of U. Dürig and B. Gotsmann. The majority of heat-loss paths are undesirable in the sense that they do not contribute to the image contrast in SThM and reduce the efficiency of delivering heat to the sample in other applications. The various heat-loss mechanisms that can contribute to heat loss from a heated tip are illustrated in Figure 4.8 and described in more detail in the corresponding Sections 4.4.1 to 4.4.6. Which of these mechanisms contribute in a given experiment, and the relative magnitudes of their contributions, depend on the details of the cantilever design and the experimental conditions. The potentially undesirable heat-loss mechanisms include conductive heat loss through air and the cantilever (Section 4.4.1) and also thermal radiation (Section 4.4.2). Heat conduction through the tip is desirable, but the thermal resistance of the tip (Section 4.4.4) and conduction through a water meniscus (Section 4.4.3) that can form between the tip and sample limit the sensitivity and resolution. The thermal resistance of the tip-surface interface (Section 4.4.6) and the thermal spreading resistance in the sample (Section 4.4.5) are material-specific and determine the image contrast in SThM. The relative magnitudes of these resistances also play an important role in determining the efficiency of heat delivery to the sample. As a quantitative example, we calculate the thermal impedances of the various heat-loss paths for the microfabricated silicon cantilever with integrated silicon heater described at the end of Section 4.2. Finally, in Section 4.4.7, we describe a set of experiments designed to quantify the various thermal resistances.
4.4.1
Heat Transport Through the Cantilever Legs and Air
[Figure 4.8: schematic of the heat paths from the heater (at T_heater) through the cantilever, the air, the tip and the tip-sample interface into the sample (at room temperature).]

Figure 4.8 (a) Heat paths relevant for experiments using heated probes (numbers refer to the text sections in which they are described); (b) Schematic representation as thermal resistances. Note that the distinction between tip, interface and spreading resistances is not possible in every case.
mean free path of air molecules (about 60 nm). For the cantilever design shown in Figure 4.3, the tip height is 500 nm, resulting in a strong coupling to the substrate. For cantilevers that are long relative to the dimensions of the heater and to the cantilever-surface distance, most of the heat is conducted through the air and into the substrate. For the cantilever in Figure 4.3, the conductivity through the cantilever allows the heat to spread along the cantilever by a distance on the order of a few tens of micrometers. Heat conduction along the cantilever and into the air is thus analogous to a cooling fin.

4.4.2
Thermal Radiation
Heat loss due to thermal black-body radiation, which involves the propagation of electromagnetic waves from a hot object, is described by the Stefan-Boltzmann equation:

S = [π²k_B⁴/(60ℏ³c²)] (T₁⁴ − T₂⁴).

Here, the cooling power per area, S, is expressed in terms of the Boltzmann constant, k_B, the reduced Planck constant, ℏ, the speed of light, c, the temperature of the heated body, T₁, and the temperature of the environment, T₂. For a heater temperature 100 K above the environment temperature, this corresponds to a thermal resistance of about 6 × 10⁸ K W⁻¹ for effective heater dimensions of (9 μm)². Compared to the thermal resistance of the cantilever and air heat-loss paths of about 1-10 × 10⁵ K W⁻¹, the contribution due to black-body radiation is negligible. Under ambient conditions, the overall thermal resistance is dominated by air conduction, and in vacuum by conduction through the legs to the support structure.
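The quoted radiative resistance can be checked numerically. The sketch below assumes, for the emitting area, that both faces of the (9 μm)² heater platform radiate with emissivity 1; this assumption is ours, made to reproduce the quoted order of magnitude.

import numpy as np

kB = 1.380649e-23
hbar = 1.054571817e-34
c = 2.99792458e8
sigma = np.pi**2 * kB**4 / (60 * hbar**3 * c**2)   # Stefan-Boltzmann constant

T2 = 298.0          # environment (K)
T1 = T2 + 100.0     # heater 100 K above the environment
A = 2 * (9e-6)**2   # both faces of the heater platform (assumption)

P = sigma * (T1**4 - T2**4) * A
print("radiated power: %.2e W" % P)
print("thermal resistance: %.1e K/W" % ((T1 - T2) / P))   # ~6e8 K/W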
In heated-probe SPM experiments, the distance between the heater and sample is often less than 1 μm. In this case, the Stefan-Boltzmann equation is only an approximation, and near-field effects must be taken into account. Such effects have a long history of theoretical analysis (see, for example, Refs [24-28]). Experimentally, however, the effect appears difficult to pin down, and very few reports have been made [29-32]. It has been predicted on a theoretical basis that, compared with the Stefan-Boltzmann law, heat transport by evanescent thermal radiation will depend heavily on the materials involved, with a strong distance dependence (1/d² for most cases) and a weakened temperature dependence (T² for most cases) [25]. The effect is also heavily dependent on the dielectric constants of the heater and the sample material. According to theory, we can expect that the near-field effect for a polymer surface is very small. However, for a silicon surface it can be significantly higher, depending on the doping [26-28], but even in this case the effect is expected to be negligible when compared to the total thermal resistance of the cantilever.
Under ambient conditions, the distance-dependent cooling of the heater/cantilever is dominated by air cooling, and therefore it is not possible to observe near-field cooling effects. Under vacuum conditions, the contribution to cooling due to thermal radiation should become measurable, not so much because of the increased overall thermal resistance without air conduction, but rather because we can use the distance dependence to demonstrate the existence of near-field cooling. In air, the distance dependence is dominated by air conduction, whereas in a vacuum the air conduction is of course eliminated and conduction through the cantilever legs does not depend on the heater-sample distance. Thus, on approaching a surface in vacuum, any variation in the thermal resistance that is observed prior to tip-surface contact can likely be attributed to near-field radiation effects.
4.4.3
Thermal Resistance of a Water Meniscus
Under ambient conditions, humidity in the air usually results in the formation of a water meniscus around the tip-sample contact. The size and thermal conductance of the meniscus are a function of both humidity and sample material, and therefore are difficult to control. Thermal conduction through such a water meniscus effectively increases the tip-sample contact area and thereby reduces the lateral resolution. On the other hand, the meniscus improves thermal contact, especially on rough surfaces, and may even be necessary to make nanoscale measurements possible in the first place. In a groundbreaking report, Shi and Majumdar concluded that, in experiments using a 100 nm-diameter metal tip on a metal surface, the influence of the water meniscus is of the same order of magnitude as the conduction through the solid-solid tip-sample contact [10]. Moreover, they also concluded that for relatively blunt tips under ambient conditions, thermal conduction through the air gap between the tip sidewalls and the sample [33] predominates, whereas for sharper tips, solid-solid and water-meniscus conduction dominate [10]. The effects of conduction through a water meniscus can of course be avoided by operating the heated tip in a low-humidity or vacuum environment.
4.4.4
Heat Transfer Through a Silicon Tip
The thermal resistance of the tip stems from the conductance of phonons in the silicon tip and from the layer of native oxide covering it. In the tip, the thermal resistivity is larger than that of bulk silicon because of enhanced phonon scattering at boundary surfaces [3]. The thermal resistance of the silicon tip can be estimated using predictions for the thermal resistivity of silicon nanowires as a function of diameter [34]. Integrating the expression for the varying diameter of a cone-shaped tip with a typical opening angle of 50° down to the apex with a radius of 5–10 nm yields a thermal resistance on the order of 10^6 to 10^7 K W^-1 because of phonon scattering. This is in agreement with finite-element calculations [35].
To develop a hands-on feeling for this rather complex subject, we first derive a simple model for heat conduction in conical structures, in particular with regard to ballistic phonon transport. Let us initially consider a cylindrical rod with a cross-section A = \pi d^2/4 (d = rod diameter). Let us further assume that a constant current I_th of thermal energy flows through the rod, driven by a temperature difference along the rod axis (x-axis). For a temperature gradient \Delta T/\Delta x, the heat flow is

I_{th} = k A\, \frac{\Delta T}{\Delta x},    (4.1)
where k denotes the thermal conductivity of the rod and \Delta T is the temperature difference across a cylindrical slice of thickness \Delta x. In using Equation 4.1, we assume that the phonon mean free path l is small compared with the diameter of the rod. For very thin rods, say d < 100 nm for crystalline Si, this assumption is no longer valid, and one must account for phonon scattering at the boundaries by renormalizing the thermal conductivity according to Matthiessen's rule [36, 37]:

k' = k\, \frac{1}{1 + l/d} \approx k\, \frac{d}{l}, \qquad d \ll l.    (4.2)
Integrating Equation 4.1 along a conical tip with opening angle \Theta and local cross-section A(x) = \pi x^2 \tan^2(\Theta/2) gives the temperature drop between two positions x_1 and x_2 on the cone axis:

T_1 - T_2 = I_{th}\, \frac{1}{k\pi \tan^2(\Theta/2)} \int_{x_1}^{x_2} \frac{dx}{x^2}.    (4.3)
The cylindrical rod approximation has also been applied for d \ll l by simply replacing the constant thermal conductivity k by the renormalized value k' [36]. However, it is not obvious that this approach is applicable, because the cone cross-section changes significantly over a distance l along the tip axis. Therefore, let us examine the problem in more detail.
Consider a point on the cone axis at a distance x_0 from the apex (see Figure 4.9). We assume that d < l. Let S be the spherical cap defined by the intersection of the conical tip with a sphere of radius l centered at x_0. A fraction of the phonons emanating from S arrive at x_0 on a direct path without interference from the tip surface. These unperturbed phonons impinge from a solid angle defined by

\tan\frac{u}{2} = \frac{d(x_0 + l)}{2l} = \frac{d(x_0)}{2l} + \tan\frac{\Theta}{2}.    (4.4)
The fraction of the total heat transport seen at x_0 that is due to these direct phonons is

\eta_d = \frac{I_{th}^{d}(u)}{I_{th}^{d}(\pi)} = \frac{T(x_0) - T(x_0+l)}{l\, dT/dx} \cdot \frac{2\pi \int_0^{u} \cos(u'/2)\,\sin(u'/2)\, d(u'/2)}{2\pi \int_0^{\pi} \cos(u'/2)\,\sin(u'/2)\, d(u'/2)} = \frac{T(x_0) - T(x_0+l)}{l\, dT/dx}\, \sin^2(u/2) \approx \frac{d(x_0)}{l}\, \sin^2(u/2).    (4.5)
The factor (T(x_0) - T(x_0+l))/(l\, dT/dx) accounts for the reduced thermal energy carried by the impinging phonons with respect to the value calculated from the local thermal gradient at x_0. As the temperature difference T(x_0) - T(x_0+l) \approx T(x_0) \propto 1/d^2 (see below), this factor is equal to d/l. Similarly, for the fraction of heat transported by the phonons scattered off the wall, one can write
\eta_w = \frac{I_{th}^{w}(u)}{I_{th}(\pi)} = \frac{2\pi \int_{u}^{\pi} \frac{\Delta T^{w}(u')}{\Delta T(u')}\, \cos(u'/2)\,\sin(u'/2)\, d(u'/2)}{2\pi \int_0^{\pi} \cos(u'/2)\,\sin(u'/2)\, d(u'/2)} \approx \frac{d(x_0)}{l}\, \cos^2(u/2).    (4.6)
As for the direct phonons, the factor \Delta T^{w}/\Delta T denotes the fraction of heat carried by a phonon scattered from the wall with respect to a thermal-equilibrium phonon. A calculation (A. Dürig, unpublished results) yields
1 - \frac{d(x_0)}{l} \;\le\; \frac{\Delta T^{w}(u')}{\Delta T(u')} \;\le\; \frac{1}{1 + d(x_0)/l},    (4.7)

where the equality holds for u' = 0 and deviations for u' > 0 have been neglected.
Hence, we obtain as the final result

k'(x_0) = k\,(\eta_d + \eta_w) = k\, \frac{d(x_0)}{l}.    (4.8)
4:8
1 4l
k p
l
d0 << l
1
dx
d3
1 2l
1
k ptanQ=2 d20
3 1
Rs ;
8 tanQ=2
4:9
1 4l 1
k 3p d0 =22
4:10
where
Rs
The resulting ballistic tip resistance as a function of cone angle \Theta and apex diameter d_0 (from Equation 4.9):

\Theta (°)           90           60           45           30           15
R(\Theta)/R(90°)     1            1.73         2.41         3.73         7.60

d_0 (nm)             1            2            5            10           20
R(90°) (K W^-1)      3.86 × 10^8  9.65 × 10^7  1.54 × 10^7  3.86 × 10^6  9.65 × 10^5
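The angular scaling in this table follows directly from Equation 4.9, R(\Theta) \propto 1/\tan(\Theta/2); a quick check (our own illustration, not part of the original tabulation):

```python
# Check the cone-angle scaling of the ballistic tip resistance,
# R(theta) ~ 1/tan(theta/2), normalized to a 90-degree cone (Eq. 4.9).
import math

for theta_deg in (90, 60, 45, 30, 15):
    ratio = math.tan(math.radians(45)) / math.tan(math.radians(theta_deg / 2))
    print(f"theta = {theta_deg:3d} deg -> R/R(90deg) = {ratio:.2f}")
# Output: 1.00, 1.73, 2.41, 3.73, 7.60 -- matching the tabulated values.
```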
The spreading resistance in the sample is given by

R_{sp} = \frac{1}{2 k_s d_0},    (4.12)

where k_s is the thermal conductivity of the sample.
The net heat current flowing in the tip is I_{th}^{t} = I_{th} - I_{th}^{r}, where I_{th}^{r} = \eta I_{th} denotes the reflected current in the tip. As argued above, the tip temperature at the apex is not altered. Nevertheless, we introduce an effective temperature at a virtual tip interface, T' = T + \eta(T_1 - T), where T_1 is the tip temperature far from the interface (see Figure 4.10b). With this definition, the heat balance can be satisfied with regard to the net current, T' - T_1 = I_{th}^{t} R, where the heat conduction in the tip is represented by the single resistive element, R. The virtual tip interface is connected to the substrate surface by means of a resistive element that must satisfy the equation T' - T_s = I_{th}^{t} R_b. Hence, one obtains

R_b = \frac{\eta}{1-\eta}\,(R + R_{sp}),    (4.13)
\eta = \alpha\left(1 - \cos\frac{u}{2}\right),    (4.14)

where 0 \le \alpha \le 1 is a type of accommodation factor for the back-scattering of the phonons into the tip, and (1 - \cos(u/2)) accounts for the fraction of the solid angle covered by the back-scattered phonons that escape from the tip apex and thermalize in the heat bath. The particular choice for u is heuristically motivated by the observation that the temperature gradient is roughly one order of magnitude lower at a distance above the apex that corresponds to a cone diameter of twice the aperture diameter.
The substrate temperature is one of the key parameters for studying thermomechanical material properties. It can be written as

T_s = \frac{R_{sp}}{R + R_b + R_{sp}}\,(T_1 - T_0) + T_0 = \frac{1-\eta}{1 + R/R_{sp}}\,(T_1 - T_0) + T_0.    (4.15)
Note that already for rather weak back-scattering the interface resistance accounts for a significant fraction of the total resistance; that is, \alpha = 0.3 yields R_b \approx 0.5\,(R + R_{sp}) (see Equations 4.13 and 4.14). Figure 4.11 shows the substrate temperature as a function of tip cone angle for \alpha = 0.5 and 0.75 and for various ratios of R/R_{sp} = (k_s l)/(k d_0) < 1, corresponding to cases in which the spreading resistance dominates the tip resistance. Under such conditions, the substrate temperature is rather insensitive to the values substituted for the accommodation coefficient \alpha and the cut-off angle u.
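To illustrate this insensitivity, Equations 4.14 and 4.15 can be evaluated directly; a minimal sketch (the parameter values are ours, chosen to mirror the ranges used in Figure 4.11):

```python
# Substrate temperature rise (Ts - T0)/(T1 - T0) from Eqs. 4.14 and 4.15.
# Parameter values are illustrative only, mirroring Figure 4.11.
import math

def ts_fraction(alpha, u_deg, r_over_rsp):
    eta = alpha * (1 - math.cos(math.radians(u_deg) / 2))  # Eq. 4.14
    return (1 - eta) / (1 + r_over_rsp)                    # Eq. 4.15

for alpha in (0.5, 0.75):
    for r in (0.025, 0.1, 0.25):
        print(alpha, r, round(ts_fraction(alpha, u_deg=90, r_over_rsp=r), 2))
```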
In agreement with experimental observations [8, 14, 35, 39, 40] (see also Section 4.4.8), the simple model predicts that the temperature rise at the substrate interface is on the order of 0.4 to 0.7 times the total temperature difference between tip and substrate for parameters that correspond to typical experimental conditions. It is clear that the model cannot capture the complexities of phonon scattering in a predictive manner. Instead, a phenomenological parameter \alpha must be introduced to match experimental observations with model predictions. However, the model provides a means of assessing the scaling properties and provides qualitative insight.
Figure 4.11 Substrate temperature at the tip apex, plotted as (T_s - T_0)/(T_1 - T_0) versus cone angle (°): \alpha = 0.75 (solid lines) and 0.5 (dashed lines), and R/R_{sp} = (k_s l)/(k d_0) = 0.025 (blue), 0.05 (green), 0.1 (red) and 0.25 (cyan).
The spreading resistance in the sample is probably the best understood of all the thermal resistances involved. Commonly, it is well approximated by Equation 4.12, which says that the resistance scales inversely with the contact diameter d_0. The scaling is borne out by the fundamental heat-conduction Equation 4.1 by observing that the mean gradient \Delta T/\Delta x scales as 1/d_0 for diffusive transport in a half-space. For a thin film on a substrate, one can account for the effect of the substrate by using an approximate solution proposed by Yovanovich et al. [41]:

R_{sp} = \frac{1}{2 k_s d_0} + \frac{1}{2\pi k_s t}\,\log\!\left(\frac{2}{1 + k_s/k_{sub}}\right).    (4.16)
Here, k_s and t are the thermal conductivity and the thickness of the film, respectively, and k_sub denotes the thermal conductivity of the substrate on which the film has been deposited. In all experiments discussed below, the film thickness is at least one order of magnitude larger than the contact diameter, and we can disregard the finite-size correction term.

For polymer films, a value of k_s = 0.2–0.3 W m^-1 K^-1 is typical. The thermal conductivity can increase by up to a factor of 2 under a pressure of 1 GPa. As the stress under the tip varies during an experiment and is transient within the tip–surface interaction volume, we must resort to estimating an effective pressure-increased thermal conductivity [42]. For the experiments described below, we use a value of k_pol of 0.3–0.6 W m^-1 K^-1. For a contact diameter of d_0 = 10 nm, we obtain an estimated R_sp of approximately (0.8–1.6) × 10^8 K W^-1 for polymers, 10^6–10^7 K W^-1 for oxides, 3 × 10^5 K W^-1 for silicon, and down to 10^4 K W^-1 for metals.
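These estimates follow directly from Equation 4.12; a minimal sketch (the bulk conductivity values below are our own illustrative textbook numbers, not taken from the chapter):

```python
# Spreading resistance R_sp = 1/(2*k_s*d0) for a 10-nm contact (Eq. 4.12).
# The conductivity values are illustrative textbook numbers.
d0 = 10e-9  # contact diameter, m

materials = {
    "polymer (~0.45 W/mK)":          0.45,
    "oxide (e.g. alumina, ~30 W/mK)": 30.0,
    "silicon (~150 W/mK)":           150.0,
    "metal (~400 W/mK)":             400.0,
}
for name, k_s in materials.items():
    print(f"{name}: R_sp = {1 / (2 * k_s * d0):.1e} K/W")
```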
4.4.6
Interface Thermal Resistance
As discussed in Section 4.4.4, the thermal resistance of the interface, R_int, defies accurate prediction. The situation is further complicated because the interface resistance usually depends heavily on the quality of the interface and the contact pressure. For most of the cases described in this chapter, we can assume a single-asperity contact characterized by a contact diameter d_0. Contact-mechanics models can be invoked to estimate d_0. For a tip with a radius of 10 nm and applied loads of a few tens of nanonewtons, d_0 is on the order of a few nanometers.
On the other hand, the values of the interface resistance reported in the literature were measured on macroscopically large areas (rather than a tip–surface contact). Typical values for silicon–polymer interfaces are in the range of 10^-8 to 10^-7 K m^2 W^-1 [43, 44]. For a silicon–silicon interface, the corresponding value obtained for phonon scattering is 2.1 × 10^-9 K m^2 W^-1. The subject of interface scattering has been extensively reviewed by Swartz and Pohl, who discuss the thermal interface (or boundary) resistance between various materials as well as related models [45].

Returning to the nanoscale tip contact, it is not immediately clear how to relate the macroscopic data to the nanoscale interface resistance. One simple approach, adopted by King [35], is to treat the interface as a scattering site for phonons in silicon and to assume that the total interface resistance is inversely proportional to the contact area, which yields R_int = 2.1 × 10^-9 K m^2 W^-1 × 4/(\pi d_0^2) = 2.7 × 10^7 K W^-1 for d_0 = 10 nm. Alternatively, if we substitute the measured value for the polymer–silicon interface resistance, we obtain R_int = 10^-8 to 10^-7 K m^2 W^-1 × 4/(\pi d_0^2) = 1.3 × 10^8 to 1.3 × 10^9 K W^-1 for d_0 = 10 nm.
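The conversion from an areal boundary resistance to a point-contact resistance is a one-liner; a minimal sketch of the numbers just quoted:

```python
# Convert an areal interface resistance r_A (K m^2/W) into the resistance
# of a circular contact of diameter d0: R_int = r_A * 4/(pi*d0^2).
import math

d0 = 10e-9  # contact diameter, m
for label, r_area in [("Si-Si (phonon scattering)", 2.1e-9),
                      ("Si-polymer, lower bound", 1e-8),
                      ("Si-polymer, upper bound", 1e-7)]:
    R_int = r_area * 4 / (math.pi * d0**2)
    print(f"{label}: R_int = {R_int:.1e} K/W")
# -> 2.7e7, 1.3e8 and 1.3e9 K/W, as in the text.
```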
Alternatively, it was shown in Section 4.4.4 that the resistance due to boundary scattering at the tip apex is proportional to the sum of the thermal resistances associated with the conduction paths through substrate and tip. This resistance therefore has two components: one scaling as 1/d_0 and corresponding to the spreading resistance in the substrate; and the other scaling as 1/d_0^2 and corresponding to the tip resistance. The question then arises as to how this is to be reconciled with the 1/d_0^2 scaling suggested by extrapolating from the macroscopic scale.
The interface resistance is a somewhat artificial construct, which bridges the gap in the conduction path where the temperature of the phonon gas cannot be defined unequivocally. The temperature is well defined only if the phonons thermalize by means of mutual scattering. Therefore, the gap typically extends over a distance on the order of the phonon mean free path on either side of the interface. For consistency with the concept of a thermal resistance, the interface resistance is defined as the temperature difference divided by the net heat flux across the gap, as measured by an imaginary
observer with an apparatus that is in thermal equilibrium with the phonon gas. Note,
however, that unlike a regular thermal resistor, the interface resistance cannot be
broken up into a string of series resistors to calculate the temperature at any point along
the gap. In fact, such temperatures have no meaning and merely serve as a mathematical concept. The interface temperature of the tip, T' (which was introduced in Figure 4.10b), is an example of such a fictitious temperature. Moreover, the ballistic tip resistance, R (see Equation 4.9 in Section 4.4.4), constitutes part of the overall interface resistance. Therefore, we must write the following for the interface resistance:
R_{int} = R_b + R,    (4.17)
which spans the entire ballistic propagation path of the phonons through the conical tip, including boundary scattering at the tip–substrate interface, up to their thermalization in the substrate. It is also clear from the above discussion that we cannot simply extrapolate from macroscopic results to nanoscale thermal contacts without accurately accounting for the conduction path. What one can do, however, is to extract a mean backscattering probability from macroscopic experiments. Using the same type of reasoning as in Section 4.4.4, one obtains the following for the interface resistance for a unit area of a planar contact:
r_{int} = \frac{\eta}{1-\eta}\,\frac{l}{k}.    (4.18)
With \eta = \alpha, and assuming l = 1 nm and k = 0.3 W m^-1 K^-1 for the mean free path and the thermal conductivity of polymers, respectively, one must substitute \alpha = 0.75 to 0.97 in order to obtain the experimentally observed values of r_int = 10^-8 to 10^-7 K m^2 W^-1 [43, 44]. The upper bound for the measured interface resistance yields a somewhat unrealistically high value of 0.97 for the backscattering probability. However, it must be borne in mind that it is difficult to obtain good contact uniformity in a large-scale experiment, and therefore the experimental values must be seen as upper bounds.
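Inverting Equation 4.18 for \alpha reproduces these numbers; a quick check (our own arithmetic):

```python
# Solve Eq. 4.18, r_int = alpha/(1-alpha) * l/k, for alpha (with eta = alpha).
l, k = 1e-9, 0.3  # phonon mean free path (m) and conductivity (W/mK) of polymer

for r_int in (1e-8, 1e-7):            # measured areal resistances, K m^2/W
    x = r_int * k / l                 # x = alpha/(1 - alpha)
    alpha = x / (1 + x)
    print(f"r_int = {r_int:.0e} -> alpha = {alpha:.2f}")
# -> alpha = 0.75 and 0.97, as stated in the text.
```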
4.4.7
Combined Heat Transport Through Tip, Interface and Sample
We define the heating efficiency \gamma (similar to, but a simplification of, Equation 4.15 and Figure 4.11) as the increase in the sample surface temperature divided by the total temperature difference between heater and substrate:

\gamma = \frac{R_{sp}}{R_{tip} + R_{int} + R_{sp}}.    (4.19)
Here, R_tip denotes the nonballistic, diffusive component of the tip resistance (the ballistic part is captured in R_int, as explained in Section 4.4.6). This definition is useful for understanding sensitivity issues when using heated probes. The heating efficiency \gamma is a strong function of tip size, that is, of the lateral resolution. Small values of \gamma imply that the measured signal will also be small, indicating that achieving high lateral resolution becomes increasingly difficult.
As outlined in Section 4.4.4 and inferred experimentally [8, 14, 35, 39, 40], typical values for \gamma range from 0.3 to 0.7 for polymer samples. In the case of better thermal conductors, \gamma can be much lower; for example, on metals we estimate \gamma = 10^-3–10^-4, for semiconductors \gamma = 10^-2–10^-3, and for oxides \gamma = 10^-1. This points to the challenges expected when extending the SThM method both to the nanoscale (e.g. a = 5 nm in the above calculations) and to sample materials having a higher thermal conductivity than the commonly used polymers or oxides.
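The trend follows from Equation 4.19: R_sp drops with increasing sample conductivity while R_tip and R_int stay fixed. A minimal sketch (the resistance values are ours, taken from the orders of magnitude estimated earlier in this section):

```python
# Heating efficiency gamma = R_sp/(R_tip + R_int + R_sp), Eq. 4.19.
# Tip and interface resistances fixed; R_sp per material from Eq. 4.12.
R_tip = 1e7      # diffusive tip resistance, K/W (order of magnitude)
R_int = 1e8      # interface resistance, K/W (order of magnitude)

for name, R_sp in [("polymer", 1e8), ("oxide", 1e7),
                   ("silicon", 3e5), ("metal", 1e5)]:
    gamma = R_sp / (R_tip + R_int + R_sp)
    print(f"{name}: gamma = {gamma:.1e}")
```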
We note that, although the heating efficiency reflects the temperatures, it does not reflect how efficient a probe is in terms of heating power. For the cantilever type shown in Figure 4.3 operating in air, the ratio of the power going through the tip to that lost to other heat paths is approximately (10^5 K W^-1)/(10^8 K W^-1) = 0.001. Thus, from a power-consumption point of view, the delivery of heat to the sample through the tip is very inefficient. Improving the efficiency requires either a reduction in the interface and tip thermal resistances or an increase in the air/cantilever thermal resistance. The tip and interface resistances can of course be decreased by using a blunter tip, but at the expense of lateral resolution. Increasing the cantilever thermal resistance requires a decrease in the heater and lead cross-sections and/or an increase in the cantilever length. Among the design issues that restrict the freedom to reduce the heater/lead cross-sections are the mechanical stability, cantilever stiffness, mechanical response time, power consumption of the heater, electrical resistance and noise, thermal response time and fabrication tolerances.
4.4.8
Heat-Transport Experiments Through a TipSurface Point Contact
Bringing the tip into and out of contact with the sample opens and closes heat channels. By varying the contact area, a, only some of the contributions will be affected (the interface and spreading parts).
The total thermal resistance of the heater, R_th, is given by

R_{th} = (T_{heater} - T_{RT})/P,    (4.20)

where T_heater is the temperature of the heater, T_RT is the room temperature, and P is the heating power. The thermal resistance due to conduction through the cantilever legs
and radiation can be determined from the data obtained before the tip contacts the sample. The tip–surface thermal conductance can then be determined by subtracting this value from the data measured with the tip in contact with the sample. Figure 4.12 shows an example of such an experiment, performed using a thermal probe similar to that shown in Figure 4.3 and a sample consisting of 80 nm of SU8 (an epoxy-based photoresist) on a silicon substrate. Out of contact, the displacement translates into a distance change between heater and sample. In contact, the displacement translates into a load force as the tip is pressed against the polymer. In this experiment, the average T_heater was approximately 315 °C, and the change in T_heater resulting from contact was about 1.5 K. From the difference between the thermal resistance out of contact (1.077 MK W^-1) and the thermal resistance in contact (1.071–1.073 MK W^-1), the thermal resistance due to heat transport through the tip–surface contact was calculated to be 2–3 × 10^8 K W^-1 (see Figure 4.14).
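Because the tip–surface channel acts in parallel with the cantilever and air paths, the subtraction is done on conductances rather than resistances; a minimal sketch with the numbers just quoted:

```python
# Extract the tip-surface thermal resistance from the total heater
# resistance measured out of contact and in contact (parallel channels).
R_out = 1.077e6                   # out-of-contact thermal resistance, K/W
for R_in in (1.071e6, 1.073e6):   # in-contact values, K/W
    G_ts = 1.0 / R_in - 1.0 / R_out   # conductance of tip-surface channel
    print(f"R_ts = {1.0 / G_ts:.1e} K/W")
# -> roughly 2e8 and 3e8 K/W, matching the 2-3 x 10^8 K/W quoted above.
```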
The tip–sample resistance is given by

R_{ts} = R_{tip} + R_{int} + R_{sp},    (4.21)

as illustrated in Figure 4.13, where R_tip is the diffusive thermal resistance of the tip, R_int is the tip–sample interface resistance, and R_sp is the spreading resistance in the polymer.
To quantify the different contributions to R_ts experimentally, we can vary the individual contributions by varying the sample material and the applied force. As the contributions depend on the contact radius, 0.5 d_0, it is useful to vary this parameter. To study the contact-area dependence of the overall thermal resistance of the tip–polymer contact, we vary the force during an approach experiment. The contact area is calculated using the JKR model [46]. For this purpose, the applied force, the pull-off force and the tip radius need to be known. The applied force and pull-off force are determined from the cantilever spring constant and the known motion of the tip holder relative to the sample surface. The tip radius is measured ex situ by means of scanning electron microscopy. The data shown in Figure 4.14 were obtained for a tip with a radius of 13.5 nm, corresponding to a variation of the contact diameter from 7 nm just before the contact breaks, at a pull-off force of 15 nN, to 13 nm at the maximum load of 35 nN.
Clearly, the thermal resistance depends on the tip force and hence on the contact diameter. Motivated by the above discussion, we propose the following ansatz for the thermal resistance:

R_{ts} = A_0 + A_1/d_0 + A_2/d_0^2.    (4.22)
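The coefficients A_0, A_1 and A_2 can be extracted from measured (d_0, R_ts) pairs by linear least squares in the basis (1, 1/d_0, 1/d_0^2); a minimal sketch with synthetic data (the data points below are invented, for illustration only):

```python
# Fit the ansatz R_ts = A0 + A1/d0 + A2/d0^2 (Eq. 4.22) by least squares.
# The data points below are synthetic, for illustration only.
import numpy as np

d0 = np.array([7e-9, 9e-9, 11e-9, 13e-9])        # contact diameters, m
R_ts = np.array([3.0e8, 2.5e8, 2.2e8, 2.0e8])    # measured resistances, K/W

X = np.column_stack([np.ones_like(d0), 1 / d0, 1 / d0**2])
(A0, A1, A2), *_ = np.linalg.lstsq(X, R_ts, rcond=None)
print(f"A0 = {A0:.2e} K/W, A1 = {A1:.2e} K m/W, A2 = {A2:.2e} K m^2/W")
```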
4.5
Thermomechanical Nanoindentation
no heat applied, F_0. The quantities T_0 and F_0 are a function of both indentation time and tip geometry.

T_0 is the writing temperature needed in the limit F → 0, which is also the limit of zero stress. In this limit the T_g is defined and therefore, for a given tip radius and indentation time, T_0 is a measure of the T_g.

For a given heater temperature, the temperature reached in the polymer underneath the tip depends on the geometry of the tip (see Section 4.4). Whereas the opening angle of the tip cone has relatively little effect, the contact area, and therefore the heat transport from the tip to the polymer, differs for blunt and for sharp tips.
To reach more quantitative statements about the T_g of a polymer, we must normalize out the effects of the tip geometry. There are two possible solutions to this:

• First, to obtain comparable results, one can use the same tip for all samples being studied. Assuming that the tip shape stays constant over the range of experiments, this procedure yields comparable results that can be correlated to traditionally measured T_g values of polymers.

• The second approach is to pick one of the polymers as a reference sample and to normalize the results on the other polymers with respect to this reference polymer.

Clearly, the first approach is prone to difficulties relating to the necessary constant geometry of the tip, is restricted to the use of a single tip, and therefore cannot be applied as a general method. In the second method, the T_g values measured are rescaled with respect to a reference tip on the reference sample, which cancels to first order the relative difference in heat-transfer properties resulting from the use of different tips.
By using these two approaches, a correlation to T_g measured by conventional means can be made [6, 14], as shown in Figure 4.17. All experimental data were either measured with the same tip or rescaled to a reference tip, as described above.
Figure 4.17 Indentation-writing temperature T_0 as a function of the conventionally determined T_g (°C) for various polymers (polyimide, Ultem, SU8, PMMA, PAEK, polysulfone, polystyrene (PS) and poly-α-Me-styrene). The indentation-writing temperature has been normalized to a reference tip, as described in the text. The linear fit corresponds to T_w = 56 + 2.4 T_g; all data points are within 10% of the fit.
Figure 4.18 Indentation-writing threshold plots (threshold temperature T_threshold (°C) versus load (nN)) determined by writing indentation arrays, as shown in Figure 4.15. Each datum point refers to the temperature needed to write an indentation of 1 nm depth at a given load. Here, thin films of polystyrene with different crosslinker (BCB) contents (0%, 2.6%, 4.7%, 5.4%, 10%, 25% and 30%) were used.
(Figure: hardness (nN), determined directly and by extrapolation.)
chains: the indentation is frozen in, and the loaded rubber springs are kinetically hindered from relaxing.

This picture of rubbery indentation is more consistent with our nanoscopic length scales because no macroscopic changes in the material are involved. In addition, as has been argued [14, 61], the rubbery-indentation picture also captures some of the apparent physics better than the yield picture. For example, it works much better at the ultrafast indentation times that are possible experimentally. Moreover, even polymers with extremely high degrees of crosslinking can undergo a rubbery deformation if the deformation is small. On the other hand, rubbery indentation implies temperatures above T_g, in clear contradiction to the finding of a linear threshold curve all the way down to F_0. In the threshold curve, no apparent transition through the glass transition exists.
In nanoindentation experiments, T_g is not easy to quantify. It can be expected, however, that T_g increases for the high indentation rates typical of these experiments and decreases for the high stresses. Even for the lowest stresses that can be applied in the experiments, significant shear stresses of 100 MPa have to be considered. Although T_g always increases under compressive stress in macroscopic experiments, it is expected that under shear or tensile stress the underlying alpha-transition is eased. We note that a theory on yielding by Robertson [62] predicts WLF (Williams–Landel–Ferry) kinetics below T_g under shear stress, but this theory was found to be useful only near T_g [63]. All in all, T_g is difficult to predict for such experiments.
As mentioned above, an important aspect of both models is the difference in the indentation dynamics: in the yield picture, monomer friction is predominant, whereas in the rubbery picture the chain/network topology of the polymer is the limiting factor.

In the rubbery picture, polymer backbone dynamics above T_g generally follow the so-called time–temperature superposition [1, 64], and their kinetics are well described by the WLF equation:
\log_{10}\frac{t}{t_{ref}} = -c_1\,\frac{T - T_{ref}}{T - T_{inf}}.
Here, t, T and k_B are the indentation time, the indentation temperature of the polymer, and the Boltzmann constant, respectively. The WLF parameters t_ref, T_inf, T_ref and c_1 are fit parameters characteristic of the individual polymer. Note that these parameters are usually found to be independent of the actual quantity measured, be it shear modulus, viscosity or heat capacity. We therefore expect rubbery indentation to be essentially controlled by backbone kinetics following WLF.
In the yielding model, the indentation kinetics is again controlled by the dynamics of the backbone and is essentially of Arrhenius type, with a single activation energy E_a [65]:

\frac{1}{t} = \frac{1}{t_0}\,\exp\!\left(-\frac{E_a}{k_B T}\right).
Thus, to distinguish between rubbery indentation and yielding, the indentation
kinetics should be investigated.
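In practice this means fitting measured threshold time/temperature pairs with both laws and comparing the fits; a minimal sketch, with invented data points standing in for a writing-threshold curve:

```python
# Compare Arrhenius and WLF descriptions of indentation kinetics.
# Data points are invented placeholders for a writing-threshold curve.
import numpy as np

t = np.array([1e-6, 1e-4, 1e-2, 1.0])       # indentation times, s
T = np.array([700.0, 600.0, 520.0, 460.0])  # threshold temperatures, K

# Arrhenius: ln(1/t) = ln(1/t0) - Ea/(kB*T) -> linear in 1/T.
kB = 8.617e-5                               # eV/K
slope, intercept = np.polyfit(1.0 / T, np.log(1.0 / t), 1)
print(f"Arrhenius activation energy Ea = {-slope * kB:.2f} eV")

# WLF: log10(t/tref) = -c1*(T - Tref)/(T - Tinf); Tref, Tinf assumed here.
Tref, Tinf = 460.0, 380.0                   # assumed reference/Vogel temps, K
c1 = -np.polyfit((T - Tref) / (T - Tinf), np.log10(t / 1.0), 1)[0]
print(f"WLF c1 = {c1:.2f} (with assumed Tref, Tinf)")
```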
(Figure 4.20: writing-threshold temperature (100–500 °C) versus indentation time (10^-6 to 10 s) for PMMA and SU8.)
First experiments [66] revealed that the indentation kinetics measured using PMMA and SU8 samples between 1 µs and 1 s cannot be fitted by a single activation energy (i.e. Arrhenius kinetics). An overall fit using WLF was satisfactory. An example of such an experiment is shown in Figure 4.20. Writing-threshold curves, defined as the minimum heater temperature needed to achieve an indentation depth of 1 nm at a given indentation time and at constant load force, were measured. Two prototype polymers were used: one a thin film of PMMA, the other a highly crosslinked thermoset, the epoxy SU8. A good fit with WLF can be obtained for PMMA. The temperatures needed are clearly above the T_g of about 120 °C, in agreement with our rubbery-indentation picture.
In a highly crosslinked system such as SU8, viscous flow can no longer account for the indentation. In this case, the load force was relatively high (200 nN), so that indentation times down to 1 µs were feasible with limited heater temperatures. Here also, a reasonable WLF fit was obtained, despite the fact that for long indentation times the writing-threshold temperatures were considerably lower than the T_g of about 200 °C. Note, however, that for these data points the quality of the data does not allow the exclusion of a transition to Arrhenius-type behavior at long indentation times.
More detailed experiments using a crosslinked polystyrene sample were performed to investigate this topic further. These data are shown in Figure 4.21 in the form of an Arrhenius plot. Although the curve can be fitted linearly at long times, there is a clear deviation from the linear fit above a temperature of 180 °C, which coincides with the T_g of the material measured using DSC. Above T_g, the data are well fitted using WLF and a Vogel temperature T_inf of T_g − 50 °C. Below T_g, however, the simple linear Arrhenius model is the best fit. Despite the many uncertainties, the activation energy can be quantified and is found to be comparable with the activation energy of macroscopic polystyrene samples (1–2 eV).
It is concluded that WLF kinetics predominates at higher temperatures, and that a smooth transition to Arrhenius kinetics can be observed at lower temperatures/longer times. As expected, the transition occurs close to the T_g of the polymer.

These two physical pictures manifest themselves only in the different indentation dynamics. Only in the rubbery model above T_g does the chain-like nature of the polymers become apparent. On the other hand, it becomes clear that both physical pictures are useful for understanding the experiments, and a distinction between them may appear artificial. The reason for this difference from the macroscopic polymer world is twofold: (i) because of the nanometer scale of the experiment and the crosslinked nature of the materials, macroscopic phenomena such as shear bands or crazes are absent; and (ii) the variations of shear stress and indentation rate are rather extreme. This forces a transition between the two conventionally fully separate regimes, which usually are switched only by the temperature relative to T_g.
In summary, a more unified picture of the indentation process emerges. For better clarity, we would like to propose a schematic, qualitative picture (see Figure 4.22). In this schematic, we capture the mechanics of the material using springs and dashpots. Spring k_s is connected in series with the k_p–g unit, in which spring k_p is connected in parallel to the dashpot. The elastic part of the medium is symbolized by k_s, whereas k_p is the elastic part linked to conformational changes of the polymer network. The dashpot, g, is linked to the glass-to-rubber transition in the polymer. Below T_g, the dashpot is locked and can only be deformed by high stress; above T_g, it is open, representing the low-friction sliding of the polymer chains.
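This arrangement is the classical standard-linear-solid (Zener) network, and its creep response under a constant stress step can be integrated directly; a minimal sketch with arbitrary parameter values of our choosing (the dashpot viscosity g effectively diverges below T_g):

```python
# Creep response of the schematic ks -- (kp || dashpot g) network of
# Figure 4.22 under a constant stress step. Parameter values arbitrary.
import numpy as np

ks, kp = 1.0, 0.5          # series and network spring stiffnesses (a.u.)
sigma = 1.0                # applied stress (a.u.)

def strain(t, g):
    """Total strain: instantaneous ks response + retarded kp||g response."""
    tau = g / kp                                   # retardation time
    return sigma / ks + (sigma / kp) * (1 - np.exp(-t / tau))

t = np.linspace(0, 10, 5)
print("hot tip  (open dashpot, g small):", strain(t, g=0.5).round(2))
print("cold tip (locked dashpot, g big):", strain(t, g=1e6).round(2))
# With the dashpot locked, only the elastic ks deformation remains.
```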
Figure 4.22a describes (from left to right) the events during an indentation using a hot tip. If the hot tip is in contact with the polymer sample, the polymer is above T_g, which means that the dashpot is open and essentially free to move. Upon application of external stress, both springs will be deformed according to their strength, as shown by the dashed line in the stress–strain diagram in Figure 4.22b. Let us assume that the total deformation is d_t. Upon cooling, we lock the dashpot in the deformed state, and by releasing the external stress, σ_F,h, spring k_s relaxes to its uncompressed length. This state is shown in the center of Figure 4.22a. We observe a partial loss of the indentation depth (to d_c for the cold case) as we retract the tip, which results from elastic recovery in the material.

However, the k_p–g system is still deformed, and a residual stress given by the deformation of the polymer-network spring k_p is locked in the deformation. In fact, this residual stress can be used to erase the indentation (as will be shown in Section 4.6), and we have also found stress-dependent relaxation in retention studies of the indentations. This deformation above T_g implies that the dynamics follows WLF kinetics, because the backbone relaxation dynamics also follows this law. And indeed, we did observe this behavior for hot tips, as shown above.
If we deform the polymer below T_g, however, the situation is rather different. Under a cold tip, the dashpot is initially in the locked state. If we apply stress, the entire deformation at low stress values will first be absorbed only by the elastic spring k_s. The dashpot only opens once a critical stress, the yield stress σ_y, has been attained, as indicated by the solid line in Figure 4.22b. At the yield stress, the backbone motion is forced by the external stress and we have reached the writing threshold. As the dashpot opens, k_p is deformed accordingly, producing internal stress and, similarly to the hot case, we obtain elastic stress relaxation upon removal of the external force.
The state of the polymer after indentation is therefore remarkably similar in the two physical pictures discussed. Stress is stored in the deformed polymer network and is frozen in by the glassy state of the cold polymer. Only the amounts of elastic recovery (to d_c and d_h) and of the internally stored stress (σ_i,c and σ_i,h) differ slightly. Experimentally, there is evidence of a higher remaining stress in cold indentations than in hot-written ones because, at elevated temperature, the former relax faster (A. Knoll, unpublished results). The other significant manifestation of the two mechanisms is in the indentation dynamics, where we see a transient crossover from yielding to rubbery deformation with increasing temperature.
It is concluded that nanoindentation is a universal technique for studying the deformation physics of polymers at the nanometer scale. By varying the load force, temperature and indentation time, important material properties such as the glass temperature, hardness, shift factors and yield-activation energies can be extracted.
4.6
Application in Data Storage: The Millipede Project
The capability of scanning probe techniques to modify and image a surface on the nanometer scale makes these techniques obvious candidates for data-storage applications. In fact, since the invention of the STM, many demonstrations of bit formation and imaging have been reported in the literature, using almost every SPM technique and many different storage media and write mechanisms. Perhaps the most impressive of these demonstrations, at least from a density point of view, is the manipulation of individual atoms on a surface [67]. Although the storage densities that could be achieved with such techniques are very impressive, the construction of an actual storage system based on one of these ideas requires that numerous issues be addressed, including automated bit detection, system data rate, error correction, bit retention, power consumption, erasability/cyclability, servo/tracking, reliability and cost. Many of these requirements are actually in competition with each other. For example, the highest storage densities demonstrated so far, that is, on the atomic scale, were achieved with very slow read-back speeds and had rather complex system requirements, such as ultra-high-vacuum conditions and low temperatures.
One scanning probe storage technology that achieves a balance between the many competing system requirements is the thermomechanical approach developed by IBM and referred to internally as the millipede project [6, 7, 68]. In order to achieve a data rate comparable to those of conventional storage technologies, IBM has used microelectromechanical systems (MEMS) technology to fabricate large arrays of cantilevers that can be operated in parallel, with each cantilever writing and reading data in its own small storage field. The internal name of the project, millipede, refers to the approximately 1000 cantilevers that were used in one of the first prototype systems.
4.6.1
Writing
4.6.2
Reading
The data are read back by measuring the topography of the polymer surface using the same tip that wrote the data. In the IBM approach, this is done using a read-back mechanism based on heat-transport sensing. For this purpose, a second heater has been integrated into the cantilever structure. This second heater is remote from the tip, and can be heated without causing much of an impact on the tip temperature. When operated in ambient air conditions, the thermal resistance of the read heater exhibits a strong dependence on the distance between heater and medium surface, as discussed in Section 4.4.1. This thermal-resistance dependence results in turn in a heater-temperature dependence, and hence also an electrical-resistance dependence, on the distance between heater and medium surface. (The electrical resistance change with temperature is an intrinsic property of silicon, as discussed in Section 4.2.) This situation can be exploited to sense the topography by applying a constant voltage to the heater and monitoring the changes in the electrical resistance that result as the tip is scanned over the surface. For example, when the tip moves into an indentation (a logical '1'), the distance between cantilever and surface is reduced and the heat-transfer rate increased. This leads to a resistance change of the 1 kΩ heater of ΔR/R ≈ 10^-4 per nanometer (Figure 4.24).
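For a sense of the signal levels involved, the relative resistance change can be converted into an absolute one; a minimal sketch (the indentation depth and read-out current are our own example values):

```python
# Read-back signal estimate for the thermal displacement sensor.
# Example values: 5-nm-deep indentation, 0.5 mA sense current (ours).
R = 1000.0             # heater resistance, ohms
dR_per_nm = 1e-4 * R   # resistance change per nm of tip-surface approach
depth_nm = 5.0
I_sense = 0.5e-3       # sense current, A

dR = dR_per_nm * depth_nm
print(f"dR = {dR:.2f} ohm -> signal = {dR * I_sense * 1e6:.0f} uV")
# -> 0.50 ohm, i.e. ~250 uV at 0.5 mA.
```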
4.6.3
Erasing
If a new indentation is written in the close vicinity of an existing indentation, that is, in the region of the rim around the indentation, then the increase in temperature in the indentation will result in a decrease in the viscosity of the polymer, which allows the elastic stress in the indentation to relax, effectively erasing the indentation. This process usually results in the creation of a new indentation, which can then be erased by repeating the procedure. Thus, a previously written data track can be erased by overwriting the data track with a series of closely spaced indentations. With this procedure, each new indentation erases the preceding one such that, at the end of the data track, all indentations will have been erased except for the last one. A demonstration of this principle is shown in Figure 4.25.
4.6.4
Medium Endurance
Medium endurance is a critical issue. The first challenge for polymer media is to be robust against repeated scanning with a sharp tip. In general, polymers tend to roughen quickly (and form ripples) when they are scanned repeatedly with a sharp tip, even at low load forces (see Section 4.6.5). Of the many solutions proposed to overcome medium wear, only a few can be readily applied at the nanoscale. On the nanometer scale, the homogeneity of the medium is crucial for data-storage applications, and thus phase separation, filler particles or similar ideas cannot be used.
One elegant way to solve the issue of medium wear is to introduce a high degree of crosslinking into the polymer. This not only solves the roughening issue during sliding (reading) [56] but also facilitates erasing, because it provides the medium with a means of storing elastic energy and results in a type of shape memory. To date, more than 10^4 write/erase cycles have been demonstrated using highly crosslinked polymer media (H. Podzidis et al., unpublished results).
The dramatic improvement in wear endurance that occurs with increasing crosslinking is demonstrated in Figure 4.26, where the wear rate is plotted as a function of crosslink density for a set of polystyrene samples that were repeatedly scanned with a sharp tip. At a critical value of crosslinking, the mobility of the polymer is significantly reduced. This occurs when there is a sufficient number of crosslinks that each region of cooperative polymer motion (typically 1–3 nm in size) is affected.
4.6.5
Bit Retention
Bit retention and the long-term stability of written data are also governed by the polymer mobility below the T_g value. This mobility is fundamentally governed by the activation energy of a backbone motion, that is, the so-called alpha-relaxation. Depending on the polymer, this can be as much as several electron volts, and can thus be sufficiently high for typical lifetime requirements. Lifetimes of 10 or more years at operating temperatures of up to 80 °C have been extrapolated from experimental data.
4.6.6
Tip Endurance
Tip endurance may limit the feasibility of several SPM-based data-storage schemes that involve mechanical contact between probe and surface. The endurance requirements of a tip will, of course, vary depending on the application and the system architecture. In general, however, a single tip will have to scan distances ranging from 10^4 to 10^8 m during the lifetime of the device, without losing its ability to read and write data. In thermomechanical data storage, the polymer medium is relatively soft compared to the hard silicon tips used for reading and writing. However, even for this combination, tip wear is still an important issue, and other, even harder, tip materials are currently being investigated. Lubrication has proved to be key in improving the endurance of hard-disk drives, and may also prove to be useful for probe-based storage.
The density limits of thermomechanical data storage are predicted to be well above the 1 Tb in^-2 mark. Ultimately, the density limits will be determined by the mobility of the polymer, which corresponds to finite regions in which cooperative motions of polymer chains or chain segments occur. These regions range in size from 1 to 3 nm. As a small number of such regions must occur in each indented zone, a limit will appear somewhere at or below an indentation spacing of 10 nm.
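As a quick consistency check (our own arithmetic), a 10 nm indentation pitch already corresponds to several terabits per square inch:

```python
# Areal density for a square bit lattice with 10 nm pitch.
pitch = 10e-9                    # m
bits_per_m2 = 1.0 / pitch**2     # 1e16 bits/m^2
in2 = 0.0254**2                  # m^2 per square inch
print(f"{bits_per_m2 * in2 / 1e12:.1f} Tb/in^2")  # -> ~6.5 Tb/in^2
```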
4.6.7
Data Rate
The data rate is commonly one of the weaker aspects of SPM-related data-storage schemes. In the thermomechanical approach, two factors contribute to data-rate limitations:

• The cantilevered tip must be able to follow the topography mechanically; this translates into the requirement of a high mechanical resonance frequency.

• The temperature-based displacement sensor must be able to respond to these topography-induced height changes, ideally with a low power consumption.
The situation is further complicated by the requirement for low applied forces
during the read operation in order to minimize tip and polymer wear. This low-force
requirement in turn entails the need for a small spring constant, which tends to reduce the resonance frequency. Finally, during the write operation, the cantilever must also be able to apply and withstand forces on the order of hundreds of nanonewtons. Thus, in order to achieve a competitive thermomechanical storage technology, all of these competing requirements must be carefully balanced and the cantilever design highly optimized. However, even with optimization, a data rate per cantilever/tip well above 1 MHz appears speculative. Consequently, a high degree of parallelization, with 10^2 to 10^4 tips operating in parallel, is required to achieve a sufficient user data rate, and this is feasible only if the fabrication employs VLSI silicon technology. To date, the fabrication of prototype cantilever arrays with thousands of tips has been demonstrated, as illustrated in Figure 4.27. Moreover, parallel read/write operation at high densities using a small subset of cantilevers has been achieved [70]. Currently, three electrical connections to the array chip are required for each cantilever that is to be operated, and thus the number of cantilevers that can be operated in parallel is limited by the area available for bonding wires. The demonstration of higher degrees of parallelization will require the integration of some of the system electronics behind the cantilevers, and this is an area of current research. The other basic components required to make an actual prototype storage system based on this technology, including a MEMS scanner, a position-sensing and servo-control scheme, a bit-detection scheme and error-correction codes, as well as a system controller, have also been developed. All of this makes the route to highly parallel SPM-based storage appear feasible.
4.7
Nanotribology and Nanolithography Applications
(i) At low temperatures, a ripple pattern is generated. The activated kinetics of ripple formation in PS has been studied using both variable sample temperatures [73] and heated tips [39]. Activation energies of the ripple process have been established that exhibit a similarity to those of the yield process discussed above.

(ii) A second regime is the glass-transition region, that is, tip temperatures that heat the polymer in the interaction region under the tip up to T_g. Here, rather drastic effects become apparent in the form of a dramatic increase in the ripple amplitude.

(iii) In the third regime, above T_g, the material becomes so ductile that it is swept to the sides of the heated scan.
A very different trend is observed when the same experiment is performed with a thermoset material in which material transfer is more suppressed, for example a highly crosslinked epoxy such as a 100 nm-thick film of SU8 (see Figure 4.28b). Because of the high crosslink density of SU8, no ripple pattern can be formed. Overall, the surface remains unchanged by the wear test, and only a marginal depression is observed at the highest temperatures. If any debris is formed, it is apparently so volatile that it cannot be traced on the surface with the probing tip after the wear procedure.
The third example (see Figure 4.28c) demonstrates yet another characteristic degradation mode, in which a chemical reaction is induced by the heated tip.
In microelectronics, the time and expense required to produce a mask set is an issue in the prototyping of integrated circuitry [74]. For this reason, electron-beam lithography (EBL), which is a maskless lithography (ML2) technique, is often used to prototype individual devices. The main drawback of ML2 efforts with respect to conventional lithography is the comparatively low exposure throughput. To remedy this, efforts are underway to develop EBL systems that operate a multitude of beams in parallel so as to reduce the overall exposure time [74]. However, further development and research are needed to control the crosstalk due to the high-voltage control signals and the source brightness. The need for UHV conditions is also seen as an obstacle.
Currently, numerous SPL-based systems are under development or have at least been proposed and, in comparison to EBL tools, these will be more compact and simpler systems. The fabrication of large arrays of probes for massively parallel operation has been realized. One of the most powerful demonstrations of research-scale SPL used electrons extracted from a conducting tip to expose a resist [75]. Structures with resolutions of 30 nm have been transferred, and parallel operation has been demonstrated. The main challenges to this particular technique are the need for a conducting substrate, the high voltage necessary to extract electrons, the reliability of the electron-extraction process, and the lack of a simple overlay strategy.
Heated probes can also be used for SPL. For example, local heating has been used to induce the crosslinking reaction of a conventional photoresist locally [76], after which the pattern is developed in the same way as in conventional lithography. Based on the example given in Figure 4.28c, an alternative SPL method can be applied that directly removes material from the exposed region. This approach offers specific advantages:

• It combines exposure and development in a single step, which not only makes the method simpler but also enables direct inspection of the exposure and direct repair
in a separate repair step. A prerequisite here is that the same probe can be used for imaging, which is possible because the exposure creates a surface topography that can easily be measured. Thereby, the difficult issue of the reliability of SPL is directly tackled.
• It also facilitates the use of a simple overlay strategy based on exposure/development of alignment marks. These marks can be imaged and used to align the new layer to be exposed. Note that the same strategy could be used for stitching-error correction or self-calibration of probe positioning.
Such direct-deposition approaches are particularly appealing for biological applications. Although various methods exist to deposit material from liquid or gas phases using a scanned probe tip (such as local oxidation of semiconductor surfaces [79]), the direct deposition of liquid droplets or lines using dip-pen nanolithography (DPN) has recently attracted considerable attention. In this method, an AFM tip that has been covered with the liquid to be deposited is brought into contact with a surface at the locations where the liquid is to be deposited. In numerous experiments performed over the past few years, a range of materials have been successfully deposited, notably those used for biopatterning applications. A major challenge of DPN is the controlled switching on and off of the deposition process. Since, in many applications, arrays of probes must be run in parallel in order to achieve sufficient throughput, individual probes cannot easily be brought into and out of contact independently; otherwise, all probes would write the same pattern. One strategy to circumvent this problem relies on heated probes, where the temperature of the tip (the sidewalls of which serve as the ink reservoir) can be used to turn the deposition on and off. This has been demonstrated by Sheehan et al. [80] (see Figure 4.30), where lines of ink (octadecylphosphonic acid) were written at linewidths down to 100 nm in a controlled manner.
Acknowledgments
The authors would like to thank C. Bolliger for carefully proofreading the manuscript of this chapter. They are also grateful to the Millipede teams at the IBM Research Laboratories in Zurich and Almaden for their continued collaboration and helpful scientific discussions. Previously unpublished results were obtained in collaboration with J. Frommer, C. J. Hawker, J. Hedrick, M. Hinz and R. Pratt.
References
1 Ferry, J.D. (1980) Viscoelastic Properties of Polymers, 3rd edn, John Wiley & Sons, New York.
2 Cahill, D.G., Ford, W.K., Goodson, K.E., Mahan, G.D., Majumdar, A., Maris, H.J., Merlin, R. and Phillpot, S.R. (2003) Journal of Applied Physics, 93, 793.
3 Chen, G. (2000) International Journal of Thermal Sciences, 39, 471.
4 Chen, G., Borca-Tasciuc, D. and Yang, R.G. (2004) in Encyclopedia of Nanoscience and Nanotechnology, Vol. 7 (ed. H.S. Nalwa), American Scientific Publishers, p. 429.
5 Balandin, A.A. (2005) Journal of Nanoscience and Nanotechnology, 5, 1015.
72 King, W.P., Saxena, S., Nelson, B.A., Weeks, B.L. and Pitchimani, R. (2006) Nano Letters, 6, 2145.
73 (a) Schmidt, H.R., Haugstad, G. and Gladfelter, W.L. (2003) Langmuir, 19, 10390; (b) Wang, X.P., Loy, M.M.T. and Xiao, X. (2002) Nanotechnology, 13, 478.
74 Groves, T.R., Pickard, D., Rafferty, B., Crosland, N., Adam, D. and Schubert, G. (2002) Microelectronic Engineering, 61–62, 285.
75 Wilder, K., Quate, C.F., Singh, B. and Kyser, D.F. (1998) Journal of Vacuum Science and Technology B, 16, 3864.
5
Materials Integration by Dip-Pen Nanolithography
Steven Lenhert, Harald Fuchs, and Chad A. Mirkin
5.1
Introduction
The concept of using a tip coated with an ink, that is, a pen, to write on a surface has been used throughout history and is widely employed today for recording or communicating information by hand. Although the most ancient written texts appear to have been carved into surfaces using sharp tools such as a knife or chisel, there are several reasons why the pen eventually became the hand-writing tool of choice. First, the constructive nature of the pen typically enables a higher contrast than carving, making it possible to distinguish the writing from the surface background without further processing steps. Second, pen writing is relatively independent of the contact force, in comparison with carving. And finally, if desired, a variety of different inks can be readily integrated on the same surface.
The same conceptual advantages that make the pen a useful tool on the macroscale also translate to the nanoscale when the tip of an atomic force microscope is used as an ultra-sharp pen to transfer material to a surface with nanometer-scale resolution, a method known as dip-pen nanolithography (DPN) [1]. By using this technique, high-resolution chemical patterns can be constructed on surfaces in a single deposition step. Because the ink transfer is independent of the contact force between the atomic force microscope tip and the substrate in almost all known cases, it is possible to carry out DPN reproducibly and in parallel, without the requirement for feedback from individual tips. By coating different tips with different inks, it then becomes possible to integrate a wide variety of molecules on a surface. As with other scanning probe lithography (SPL) methods (e.g. mechanical modification, oxidation, local thermal treatments), DPN offers ultra-high lateral resolution, well below 20 nm. As a direct-write lithographic method, DPN enables arbitrary patterns to be drawn without the need for a mask, with capabilities comparable to those of electron-beam lithography (EBL). Additionally, it is a tool that is ideally suited to rapidly producing laboratory prototypes and structures that are incompatible with the harsh conditions associated with conventional microfabrication techniques (soft biological structures in particular).
5.2
Ink Transport
DPN is made possible by the transport of a material (the ink) from the tip of an atomic force microscope to a surface at the point where the tip contacts the surface (Figure 5.1a). As in the case of a macroscopic pen, the ink must flow from the tip of the pen to the paper, and this transport process is typically driven by an interaction between the ink and the substrate. However, quantitative differences appear when this technique is carried out at the nanoscale tip of an atomic force microscope. Most striking is that DPN is able to produce patterns consisting of a single molecular layer.
The ability to obtain quantitative data on the transport rates of inks, as well as the morphological information obtained by in situ AFM imaging, opens the possibility of testing theoretical models of the nanoscale ink transport in DPN. In addition to the perfectly round and sharp spots reproducibly achieved when patterning thiols on gold under optimal conditions, it can occasionally be observed that some spots appear more diffuse, are surrounded by a halo of lower lateral force microscopy (LFM) contrast, or consist only of a ring. Furthermore, in some cases an anomalous diffusion is observed where, instead of circles, fractal-like branches appear.
Figure 5.3 shows schematics of four models that have been developed to explain and understand the different spreading phenomena observed in DPN experiments. The first three models (see Figure 5.3a–c) focus on the deposition of thiol SAMs on gold, where a strong chemical binding of the ink molecule to the substrate is expected and anomalous diffusion is not observed. The fourth model explains anomalous diffusion in terms of strong intermolecular interactions within the ink.
The first two models (Figure 5.3a and b) are based on diffusion theory, and are similar in that they both assume the tip to be an infinite point source and use diffusion theory to describe the spreading of a thiol monolayer on a gold surface. The first model (Figure 5.3a) assumes a constant flux of molecules flowing from the tip, with a concentration of zero outside a spreading island. That idea is consistent with the experimental observations that the area tends to increase linearly with contact time, and that the monolayer islands have sharp edges in the majority of AFM images. The second model (Figure 5.3b) assumes a constant concentration at the tip and an area of lower-density thiol molecules on the surface diffusing away from the tightly packed monolayer. This second model allows for variation in the flux from the source, explains the occasional presence of halos, and provides a slightly better fit to the experimental data (solid line in Figure 5.2b). The constant-concentration model also allows the derivation of an absolute diffusion coefficient using three physically relevant fit parameters (the tip contact area, the ratio of the concentration at the tip to that of a tightly packed monolayer, and the diffusion coefficient).
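The constant-flux picture has a simple quantitative consequence: if molecules leave the tip at a fixed rate Q and assemble into a monolayer of fixed areal density ρ, the dot area grows linearly with dwell time and the radius as its square root; a minimal sketch (Q and ρ are invented example values, ρ being of the order of a dense thiol SAM):

```python
# Dot growth under the constant-flux source model: area A = Q*t/rho,
# radius r = sqrt(A/pi). Q and rho are illustrative values only.
import math

Q = 1e4        # deposition rate, molecules per second (example)
rho = 4.6e18   # molecules per m^2 in a dense thiol SAM (example)

for t in (0.1, 1.0, 10.0):           # tip dwell times, s
    area = Q * t / rho               # covered area, m^2
    r_nm = math.sqrt(area / math.pi) * 1e9
    print(f"t = {t:5.1f} s -> dot radius ~ {r_nm:.0f} nm")
```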
In addition to the general idea of modeling the tip as a point source from which ink molecules diffuse, the effect of a microscopic condensed water meniscus, which forms at the tip–substrate contact point in the presence of humidity, has been considered in order to further unravel the mechanisms by which the ink molecules make their way to the surface [10]. In particular, amphiphilic molecules such as MHA, which are only very slightly soluble in water, might be expected to show an affinity for the air–water interface. The meniscus interface transport model (Figure 5.3c) was therefore developed, and can be used to explain the formation of hollow ring patterns. This model can also predict the transport behavior of a variety of amphiphilic molecules, including some that physisorb on the substrate.
Molecular dynamics simulations have suggested that SAM growth on the nanoscale involves molecular mechanisms that cannot be adequately described by analytical diffusion models [11]. For instance, it was shown that, even in the case of strong binding (e.g. alkanethiols on gold), the monolayer will grow as molecules from the tip displace molecules already bound to the surface, in a mechanism more akin to spreading than to diffusion. Computer simulations have also been able to recreate anomalous diffusion [12] in silico by considering the collective behavior of the molecules in a SAM (Figure 5.3d) [13]. There is evidence supporting aspects of each of these models, and there is no consensus as to which is the best. Further innovations from theoreticians, as well as carefully planned experiments designed to test the different models, are necessary to determine the true situation at the DPN tip.
5.2.2
Experimental Parameters Affecting Ink Transport
Several experimental parameters have been observed to influence ink transport in
DPN, including driving forces (chemical interactions and external fields), ink
composition, surface properties (chemistry and roughness), humidity, temperature
and tip geometry. It is necessary to understand and control these parameters in order
to optimize DPN processes for a particular application. This is especially important if
different materials and nanostructures of the desired materials are to be integrated
on the same substrate.
5.2.2.1 Driving Forces
In order for the ink to move from the tip to the substrate, a driving force is
required; otherwise the ink will simply remain on the tip. Internal driving forces
(i.e. in the absence of external fields) are typically based either on a chemical
reaction of the ink with the substrate (chemisorption) resulting in a SAM, or on
physical adhesion of the ink to the substrate (physisorption). Another approach is
to apply an external field, for instance by heating the tip (thermal DPN) or by
applying a voltage (electrochemical DPN). It is worth noting that all DPN
fabrication processes require some sort of interaction between the ink and the
substrate, even when the driving force is provided externally.
5.2.2.2 Covalent Reaction with the Substrate
The covalent reaction of the ink molecules with the substrate to form a SAM is the
most straightforward and widely used driving force in DPN. The most widely
reproduced and well-studied system is the formation of thiol SAMs, as described in
Section 5.2.1.
However, covalent self-assembly has also been used as a driving force for other ink
molecules. As an example, considerable effort has been made to develop reproducible
methods for DPN on semiconducting or insulating surfaces such as silicon and glass,
as thiols are limited to self-assembly on metallic surfaces.
Functional silane molecules are widely used for the fabrication of SAMs on
semiconductor surfaces, and are therefore the first choice. However, a drawback of
patterning silanes by DPN in air is that they polymerize in the presence of water,
and tend to be liquid in their monomeric state (thiol inks, in contrast, are
typically solid at room temperature). Despite these challenges, it has been shown
that through careful optimization of the experimental conditions (for example,
selection of the molecule, control of humidity and functionalization of the AFM
tip) it is possible to pattern SAMs of functional silanes [16–18]. Another approach
is to choose a functional group for self-assembly that is less sensitive to water;
for example, silazanes have been shown to be suitable inks on semiconductor
surfaces, reacting covalently with OH groups on the surface [19].
Perhaps the most generally applicable strategy for depositing and integrating
arbitrary molecules on an arbitrary surface by DPN is to prefunctionalize the desired
surface with a bulk SAM, which can then be covalently linked to the ink molecule of
choice. This approach has been particularly useful for the patterning of biofunctional
molecules (this will be discussed in more detail in Section 5.6). Briefly, it has been
used successfully for direct DPN of synthetic macromolecules [20], peptides [21] and
DNA [22].
5.2.2.3 Noncovalent Driving Forces
While chemisorption of the DPN ink results in highly stable patterns, covalent
reactions tend to be highly specific, and therefore in some cases it is desirable to be
able to deposit an ink noncovalently. Examples include patterning on inert substrates,
the integration of different materials on the same substrate, and/or the fabrication of
multilayer structures. Such noncovalent deposition is, for example, the method used
for patterning with macroscopic pens.
Numerous examples of noncovalent patterning have been reported in the literature.
For example, the first instance of controlled deposition of organic materials from
an AFM tip was the deposition of thiols on mica [23]. Electrostatic interactions
have been used as a driving force to pattern charged conducting polymers [24], as
well as polyelectrolytes, which could be used as templates for layer-by-layer
assembly [25] on silicon substrates. Inorganic nanostructures were fabricated by
depositing inorganic precursors dispersed in a copolymer surfactant or dissolved in
an ethylene glycol solvent which wets the substrate [26, 27]. Luminescent polymer
nanowires were patterned on glass using only adhesion as a driving force [6, 28].
Nanoparticles (iron oxide and gold) have been picked up by an AFM tip and deposited
noncovalently in a controlled fashion onto mica surfaces in air by DPN [29, 30].
Semiconductor precursors, which are expected to react with each other and
precipitate CdS only in the water meniscus, were used as inks to fabricate
semiconductor nanostructures on mica [31]. Another approach is to mix a functional
molecule with a well-characterized ink, as has been demonstrated by the DPN
patterning of binary ink mixtures [32]. Finally, surfactants can be added to the
ink in order to tune the wettability of the ink on the substrate, providing another
parameter that can be used to control ink transport [33].
5.2.2.4 Tip Geometry and Substrate Roughness
In addition to chemical interactions between the ink and substrate, it is also apparent
that the topography of the substrate and geometry of the tip play a role in ink
transport. The effect that these parameters have on the minimum feature size was
systematically investigated in the case of alkylthiol patterning on gold surfaces,
where the smallest line widths (14 nm) could be achieved with the sharpest tips and
on the smoothest gold available [8]. In another study on the effect of tip
geometry, the AFM tips were deliberately made blunt using laser ablation [34]. It
was then found that not only did the minimum feature size depend on the tip radius,
but also the rate of ink transport, an observation consistent with several of the
ink transport models described above.
5.2.2.5 Humidity and Meniscus Formation
A significant amount of evidence suggests that the transport of ink molecules with
polar groups (such as the thiol MHA) is heavily dependent on humidity, with higher
humidity yielding higher diffusion constants. Although there seems to be only a
slight (if any) humidity dependence for the nonpolar molecule ODT [9, 10], the
effect cannot be ignored in the patterning of nearly all other molecules. Humidity
is therefore an important parameter that must be controlled in order to optimize
DPN conditions. This is achieved ideally by encasing the DPN apparatus in an
environmental chamber, or locally by placing a water-containing capillary tube near
the atomic force microscope tip. Although the possibility that the humidity might
affect the ink properties or substrate reactivity has not been excluded, it is
generally thought that the mechanism for humidity dependence rests on the meniscus
that condenses at the tip of an AFM when it contacts a surface in the presence of
humidity [35].
Theoretical studies of meniscus formation at an atomic force microscope tip, based
on Monte Carlo simulations, have predicted that the meniscus should depend not only
on humidity but also on the tip geometry and surface chemistry [36]. Striking
images confirming the presence of such meniscus formation (and indeed showing that
the meniscus can grow larger than expected) have been made possible by using
environmental scanning electron microscopy (ESEM), as shown in Figure 5.4 [37].
Interestingly, studies of the kinetics of meniscus formation between an atomic
force microscope tip and gold surfaces showed trends similar to the early
patterning rates of thiols on gold. It has therefore been hypothesized that growth
of the water meniscus may in some cases be the rate-limiting step in DPN ink
transport [15, 38]. It should be noted that, although the humidity clearly
influences ink transport in DPN, it does not appear to be a prerequisite, as
patterning has been achieved at 0% humidity and even in ultra-high vacuum [9, 10].
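One way to appreciate this humidity sensitivity is the classical Kelvin equation,
which relates the equilibrium curvature radius of a condensed meniscus to the
relative humidity. The sketch below uses textbook constants for water at room
temperature; it is a continuum back-of-the-envelope estimate for orientation only,
not the Monte Carlo model of Ref. [36].

```python
# Kelvin-equation estimate of the meniscus curvature radius vs. humidity:
# |r_K| = gamma * V_m / (R * T * |ln(RH)|), for water at ~25 C.
import math

GAMMA = 0.072    # surface tension of water (N/m)
V_M   = 1.8e-5   # molar volume of water (m^3/mol)
R     = 8.314    # gas constant (J/(mol K))
T     = 298.0    # temperature (K)

def kelvin_radius(rh):
    """Magnitude of the Kelvin radius (m) at relative humidity rh (0..1)."""
    return GAMMA * V_M / (R * T * abs(math.log(rh)))

for rh in (0.3, 0.5, 0.7, 0.9):
    print(f"RH = {rh:.0%}  ->  |r_K| = {kelvin_radius(rh) * 1e9:4.1f} nm")
```

The estimate grows from well under 1 nm at 30% humidity to about 5 nm at 90%,
which makes plausible why the meniscus, and with it the ink transport, responds so
strongly to modest humidity changes.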
5.2.2.6 External Driving Forces
Another approach to controlling the transport of materials from an atomic force
microscope tip to the surface is to apply an external driving force between the tip
and the substrate. Although fabricating externally addressable tips poses an
engineering challenge, the ability to switch a particular pen on and off greatly
increases the versatility of the technique, especially for parallel arrays in which
each tip may be addressed individually for large-scale integration. Two examples
are the use of heatable cantilevers (thermal DPN) and the application of a voltage
between the tip and sample (electrochemical DPN).
5.2.2.7 Thermal DPN
The idea of thermal DPN is similar to that of a soldering iron; that is, the
material on the tip should be heated above its melting temperature in order to
facilitate transport to the surface. The concept is shown schematically in
Figure 5.5. Although heating the ink can be a disadvantage for biological inks,
which may be sensitive to high temperature and dehydration, it provides a useful
means of patterning other materials. For example, octadecylphosphonic acid (OPA)
was the first compound to be patterned on silicon by using thermal DPN [39]. It was
observed that OPA only began to write when the tip was heated above a critical
temperature.
The method has since been applied to the deposition of conducting polymers [40].
Although it was initially suggested that thermal DPN is necessary for organic
compounds with high melting points, it has since been shown that such compounds can
also be patterned at room temperature (well below their melting temperatures) by
humidity-controlled DPN. For instance, OPA, as well as other compounds with melting
points up to 230 °C, has been patterned at room temperature under the appropriate
humidity [41]. The mechanism of transport in thermal DPN of organic inks therefore
remains unclear, although it appears to be possible to control the ink transport by
controlling the tip temperature. Most striking is that indium metal nanostructures
could be directly written by thermal DPN [42]. Furthermore, by filling a single
carbon nanotube with molten copper and then dispensing it under observation with a
transmission electron microscope, it has been suggested that using such a carbon
nanotube-based spot-welder in thermal DPN might enable the ultra-high-resolution
writing of molten metals [43]. Such direct, nanoscale writing of water-insoluble
metals has not been shown to be possible below the melting point of the metal.
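A minimal way to express the switch-like, "pen writes only above a critical
temperature" behavior is a threshold-plus-Arrhenius rate law. The sketch below is a
toy model only: the critical temperature, activation energy and prefactor are all
assumed illustrative values, not parameters measured in Refs [39–43].

```python
# Toy rate model for thermal DPN: no transport below a critical tip
# temperature (solid ink), then an Arrhenius-like increase once it melts.
import math

T_C = 372.0    # critical (melting-like) temperature of the ink (K), assumed
E_A = 50e3     # apparent activation energy for transport (J/mol), assumed
K0  = 1.0e9    # attempt-frequency prefactor (arbitrary units), assumed
R   = 8.314    # gas constant (J/(mol K))

def deposition_rate(t_tip):
    """Relative ink-transport rate as a function of tip temperature (K)."""
    if t_tip < T_C:
        return 0.0                      # tip too cold: the pen does not write
    return K0 * math.exp(-E_A / (R * t_tip))

for t in (350.0, 372.0, 400.0, 430.0):
    print(f"T_tip = {t:5.1f} K  ->  rate = {deposition_rate(t):10.3e}")
```

The room-temperature patterning of high-melting compounds under controlled humidity
[41] shows that such a hard threshold cannot be the whole story; the sketch captures
only the observed switching behavior, not the underlying mechanism.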
5.2.2.8 Electrochemical DPN
Another approach to controlling the transport of ink from the atomic force
microscope tip is to use the water meniscus as a nanoscale electrochemical cell, in
which metal salts can be dissolved, reduced and precipitated to form metal
nanostructures on the surface; this method, referred to as electrochemical DPN
(E-DPN), is illustrated schematically in Figure 5.6 [44]. The method was further
applied to the controllable transport of his-tagged proteins, which have an
affinity for certain metal ions such as Ni2+. By carrying out E-DPN on
nickel-coated surfaces, the surface could be locally ionized, thereby allowing
proteins on the microscope tip to transport to the surface and bind. The same
approach of combining local surface oxidation with material transport from the
atomic force microscope tip was used to locally oxidize a pre-existing SAM and to
simultaneously deposit organic inks onto those same areas [45].
5.3
Parallel DPN
5.3.1
Passive Arrays
The constructive and chemically driven nature of DPN makes it uniquely amenable to
being carried out in parallel, using arrays of tips, without the need to access
each tip electronically for force feedback. A rough alignment of the tip array with
the surface is sufficient: if the tips are touching the surface, the ink will be
transported at a constant rate that is determined primarily by the ink–substrate
combination. The first demonstration that DPN could be readily carried out in
parallel employed micromachining processes to fabricate one-dimensional arrays with
32 silicon nitride tips or eight boron-doped silicon tips, the latter having
sharper tips at the expense of pen density [46]. The number of tips in a single
linear array was then scaled up to the centimeter scale, using arrays of up to 250
tips, all of which wrote simultaneously [47]. Parallel DPN was then scaled up again
to a two-dimensional array of tips that covered a square centimeter and consisted
of 55 000 probes writing simultaneously; an example is shown in Figure 5.7 [48].
These parallel and constructive capabilities are what give DPN the potential to
integrate materials on unprecedented scales.
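The scale of such arrays invites a quick order-of-magnitude throughput estimate. In
the sketch below the tip count and footprint follow the array of Ref. [48], while
the writing speed and line width are assumed illustrative values.

```python
# Order-of-magnitude throughput estimate for a passive 2-D DPN pen array.
N_TIPS     = 55_000    # pens writing simultaneously (Ref. [48])
ARRAY_AREA = 1.0e8     # array footprint (um^2), i.e. ~1 cm^2 (Ref. [48])
SPEED      = 1.0       # writing speed per tip (um/s), assumed
LINE_WIDTH = 0.1       # written line width (um), assumed

area_per_tip  = ARRAY_AREA / N_TIPS            # ~1800 um^2 -> ~43 um tip pitch
coverage_rate = N_TIPS * SPEED * LINE_WIDTH    # patterned area per second

print(f"area served by each tip  : {area_per_tip:8.0f} um^2")
print(f"array coverage rate      : {coverage_rate:8.0f} um^2/s")
print(f"time to ink 1% of 1 cm^2 : {0.01 * ARRAY_AREA / coverage_rate:8.0f} s")
```

Under these assumptions the whole array patterns thousands of square micrometers
per second, which is what makes wafer-scale molecular patterning by DPN
conceivable at all.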
5.3.2
Active Arrays
One factor which limits the complexity of patterns that can be generated by
parallel DPN as described above is that every tip necessarily writes the same
pattern, provided that it is coated with ink; that is, it is impossible to make
each tip in an array of passive tips write a different pattern. In addition to the
possibility of controlling the driving force by external fields (e.g. by thermal
DPN or E-DPN, as described earlier), another innovative step in the development of
DPN probes for parallel materials integration was the fabrication of cantilevers
that can be externally actuated. The first approach of this type used thermal
bimorph cantilevers, in which heating one side of the cantilever causes it to bend
down so that the tip contacts the substrate and the ink can flow [50]. Again, by
optimizing the tip fabrication, the resolution of patterns generated by active pen
arrays could be brought below 100 nm [51]. Another approach is to use
electrostatically actuated cantilevers, which bend towards the surface under an
applied electric field; this avoids the possibility of unwanted heating of the tip
or thermal crosstalk between neighboring cantilevers [51].
Actuated probes not only increase the available complexity of patterns that can be
generated by a single ink, but also open the door to the integration of different
materials from different tips in an array on an area smaller than the dimensions of
the tip array itself. This can be done by writing with one tip, and then moving the
array such that a neighboring tip writes on or near the same area that has already
been patterned.
5.4
Tip Coating
5.4.1
Methods for Inking Multiple Tips with the Same Ink
An essential part of any DPN process is to bring the ink onto the atomic force
microscope tips. This is typically achieved either by thermally evaporating the ink
onto the tips, or by immersing the tips in the ink in a type of dip-coating
process. Thermal evaporation is rather straightforward and typically results in a
homogeneous ink coating, for instance of ODT. However, the majority of ink
molecules, and especially the more polar ones, do not seem suited to evaporation,
and tend to function better when the tip is coated from solution. For solution
coating, the entire cantilever chip can be dipped by hand into an ink solution;
upon removing the tip and allowing the solvent to dry (e.g. under a stream of inert
gas), a typically homogeneous coating results. However, the distribution of the ink
on a tip coated in this way will inevitably depend on exactly how the ink wets (or
de-wets) the tip, and also on how the ink solutes concentrate on the tip as it
dries. Functionalization of the atomic force microscope tip before coating is
therefore sometimes beneficial. For instance, in order to reproducibly pattern
proteins it was found useful to precoat the tips with a SAM of a thiolated
polyethylene glycol [11-mercapto-undecylpenta(ethylene glycol) disulfide (PEG)],
which makes the tip hydrophilic but prevents the denaturing of adsorbed
proteins [52].
5.4.2
Ink Wells
Inking the tips in the ways described above is well suited to single tips, or to
cases where all tips are to be coated with the same ink. However, in order to take
advantage of the potential for DPN to integrate different materials on the same
substrate in parallel, it is necessary to deliver different inks selectively to
different tips in an array. For this purpose, microfluidic ink-delivery systems
have been developed [53, 54]. An example of tips being dipped in wells that coat
only every second tip in an array is shown in Figure 5.8a. An array in which only
every second tip is coated is useful for experiments that require uncoated tips as
negative controls, or in cases where the patterns need to be spaced further apart
than the tips in the array. Today, ink wells are available commercially (from the
company NanoInk) that allow the integration of up to 24 different inks on a
one-dimensional array. Figure 5.8b shows an example of two different fluorescently
labeled phospholipids integrated on a single cantilever array [49]. In similar
fashion, a chip has also been developed that allows local vapor coating onto tip
arrays [55].
5.4.3
Fountain Pens
One particularly elegant approach to delivering ink to the atomic force microscope
tip is to integrate microfluidic channels directly onto the tip itself, generating
a nanofountain probe (NFP), such as that shown in Figure 5.9 [56]. As standard
microfabrication techniques were used, it has been possible to generate parallel
arrays of NFPs integrated on a single chip, with different ink reservoirs leading
to different tips in the same array, thus enabling the parallel integration of
different inks [57]. As with macroscopic pens, NFPs can be expected to be
particularly useful for patterning inks whose solvent must remain in the ink until
after patterning, as tips coated by dipping in solution are subject to drying.
5.4.4
Nanopipettes
Although micropipettes and nanopipettes differ technically from DPN (in that they
do not necessarily utilize an atomic force microscope tip), they are conceptually
similar to DPN and NFPs in several ways, and are therefore worthy of brief mention
at this point. Cantilevered micropipettes similar to those used for scanning
near-field optical microscopy (SNOM) have been used for the local delivery of an
etchant to a chrome film, with a resolution of 1 µm [58]; indeed, subsequent
studies led to the
5.5
Characterization
In addition to tip inking and writing, a third indispensable part of the DPN
process is the characterization and quality control of the resultant patterns. A
convenient capability of DPN, which is shared by most scanning probe-based
lithography processes, is that the same tip can be used for both patterning and
imaging, in the case of DPN by AFM. In particular, lateral force imaging is
typically used for the characterization of chemical contrast in covalently bound
inks such as thiols on gold (as described earlier and shown in Figure 5.2). There
are two practical issues to be aware of when characterizing DPN patterns with the
same tip that has been used to write them:

• An inked tip will typically continue to write during imaging, and therefore high
  scan speeds must be used to minimize this effect (see the estimate below).
• The vast majority of DPN is carried out in contact mode, using cantilevers whose
  spring constants are too low for intermittent-contact (tapping) mode imaging in
  air. This is especially the case for parallel DPN, where it is impractical to
  have a separate tapping feedback mechanism for each tip. As contact mode
  typically provides inaccurate heights in air, it is often necessary to change
  the tip and realign it to find the patterned area in order to obtain
  quantitative height information.
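The first of these issues can be put into rough numbers. The minimal sketch below
reuses the linear area-growth picture of Section 5.2.1 with an assumed deposition
rate (the value is illustrative, not measured) to show why fast scanning suppresses
unintended writing.

```python
# Rough estimate of the unwanted monolayer smeared onto the surface while
# imaging with an inked tip: the faster the scan, the less ink per line.
SCAN_SIZE = 5.0    # scan line length (um)
RATE      = 0.01   # monolayer area deposited per second of contact (um^2/s), assumed

for speed in (1.0, 10.0, 100.0):            # tip velocity (um/s)
    t_line   = SCAN_SIZE / speed            # time spent on one scan line (s)
    smeared  = RATE * t_line                # extra monolayer area per line (um^2)
    width_nm = smeared / SCAN_SIZE * 1e3    # equivalent smeared line width (nm)
    print(f"v = {speed:6.1f} um/s -> ~{width_nm:5.2f} nm-wide stripe per line")
```

In this toy model, raising the scan speed from 1 to 100 um/s reduces the smeared
stripe from about 10 nm to a sub-monolayer 0.1 nm equivalent, consistent with the
practice of imaging freshly written patterns at high speed.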
Although these two issues represent disadvantages for a high-throughput lithography
process, they can actually serve as significant advantages when characterizing the
ink transport. For instance, in an early study using DPN, monolayer growth could be
observed in situ by scanning the same area repeatedly at high resolution; this
allowed observation of the monolayer growth dynamics as alkanethiols were
transferred from the tip to a gold substrate [63]. It is also possible to carry out
DPN in tapping mode, using a single tip for the simultaneous deposition and imaging
of soft materials as well as for obtaining accurate height information [64]. When
this method was applied to the deposition of poly-D,L-lysine hydrobromide onto mica
surfaces, it was possible to observe the nucleation and growth dynamics of polymer
crystals at a submicron scale that is inaccessible to other methods
(Figure 5.10) [65].
In the DPN patterning of a new ink or substrate, it is essential to determine that the
patterns generated are indeed composed of the intended ink. If a particular
suggests that the practical resolution limits of DPN may not be in the patterning, but
rather in the ability to detect patterns with dimensions smaller than the AFM tip used
to fabricate them [6]. Finally, it is often useful to characterize the AFM tips as well as
the presence and distribution of the ink on the tips, which can typically be achieved
using SEM or optical microscopy.
5.6
Applications Based on Materials Integration by DPN
Functional chemical patterns fabricated by DPN have been used for a wide variety of
scientific applications, and it can be expected that industrial applications will
follow. Even in the many published applications in which only a single ink molecule
is patterned onto a single surface, for example etch resists [73] or templates for
selective deposition [74], DPN has several advantages over conventional
direct-write lithographic techniques such as EBL. While the latter provides
competitive lateral resolution, it is severely limited in throughput and cost.
Furthermore, in contrast to DPN, EBL involves removing material from the substrate,
which requires an extra development step. One advantage of placing resist and
template molecules directly onto the surface is that the remainder of the surface
is left free of contaminants. Beyond this, the ability to generate multicomponent
nanostructures opens entirely new possibilities that are inaccessible by any other
method. Some of the more striking examples are briefly described in the following
sections, in order to provide an overview of the types of unique application made
possible by DPN. Whilst the examples are categorized under selective adsorption,
combinatorial chemistry and biological arrays, these categories are by no means
exhaustive and a significant amount of overlap is apparent between them.
5.6.1
Selective Deposition
Although the selective deposition of materials onto patterned surfaces is not
limited to DPN patterns, the rapid prototyping capability and the ability of DPN to
generate multicomponent nanostructures on a small scale add a new dimension to the
field. A few of the strategies that can be used to immobilize different particles
from solution by selective adsorption or templating are shown in Figure 5.11. In
addition to the adsorption strategies shown, covalent binding, nonpolar adhesion
forces and entropic effects can also be used to direct binding towards desired
areas of the substrate. Surface passivation of the background is often a crucial
step in fabricating templates for selective adsorption, and successful selective
adsorption will most often involve a combination of more than one of these
strategies.
As a first example of electrostatic templating on DPN patterns, positively charged
colloidal particles were immobilized electrostatically onto negatively charged MHA
patterns. By fabricating a variety of MHA dot array patterns with different dot sizes
and spacings, it was possible to screen the pattern dimensions for those capable of
organizing the particles such that each dot had exactly one particle bound, in a
combinatorial fashion [74] (a layout sketch of this kind of screen follows below).
Such an approach was later applied to the fabrication of arrays of individual
bacterial cells [75]. In another approach, the positively and negatively charged
polyelectrolytes poly(diallyldimethylammonium) chloride (PDDA) and
poly(styrenesulfonate) (PSS) have been directly patterned on silicon surfaces by
DPN. Upon the addition of a complementary polyelectrolyte, selective adsorption was
observed, which suggested a compatibility of the method with layer-by-layer
assembly [25]. Furthermore, such electrostatic templates have been used in
combination with molecular combing to organize aligned DNA strands, a biological
molecule which also falls into the polyelectrolyte category and can readily be
adsorbed electrostatically [76].
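Such a screening layout is straightforward to parameterize. The sketch below
generates a combinatorial dot-array library as a sweep over dot diameters and
spacings; all values are illustrative assumptions, not the dimensions used in
Ref. [74].

```python
# Sketch of a combinatorial dot-array layout for screening pattern
# dimensions: each block of the library combines one dot diameter with
# one center-to-center spacing.
DIAMETERS = [100, 200, 400, 800]   # MHA dot diameters (nm), assumed
SPACINGS  = [500, 1000, 2000]      # center-to-center spacings (nm), assumed
DOTS      = 5                      # dots per row and column in each block

def block(diameter, spacing):
    """Return (x, y, d) tuples for one dot block with the given geometry."""
    return [(i * spacing, j * spacing, diameter)
            for i in range(DOTS) for j in range(DOTS)]

library = {(d, s): block(d, s) for d in DIAMETERS for s in SPACINGS}
print(f"{len(library)} parameter combinations, "
      f"{sum(len(b) for b in library.values())} dots in total")
```

Writing every (diameter, spacing) block onto the same substrate and then exposing
it to one particle solution is what turns a single experiment into a parallel
screen of all the geometric parameters at once.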
DNA-directed self-assembly on DPN patterns has been used to organize two
different-sized nanoparticles into nanoarrays, with the spacing between
different-sized particles in an array being as low as 500 nm [74]. This method of
self-sorting was improved by using noncomplementary DNA as a passivation layer,
enabling larger particles to be immobilized [87]. Protein nanoarrays selectively
bound to MHA patterns [77], or patterned by direct-write methods [52], have been
used to subsequently immobilize other protein molecules by molecular recognition.
Covalent linking of the biofunctional group biotin was carried out selectively on
DPN patterns, enabling the binding of streptavidin protein by molecular recognition
and the subsequent binding of biotinylated materials [72]. Similarly, the covalent
coupling of proteins to DPN templates was carried out using succinimide chemistry
[78]. Finally, nonpolar carbon nanotubes were selectively adsorbed to DPN templates
based on differences in surface energy between the patterned and passivated
regions [79].
Although not often emphasized in the literature, passivation is a crucial step in
selective adsorption, ensuring that the materials to be integrated on the surface
do not simply bind everywhere. In a DPN experiment, the background is typically
blocked by a low-surface-energy (methyl-terminated) compound such as ODT [74] or,
in the case

5.6.2
Combinatorial Chemistry

The idea of placing different chemical compounds on the same surface, for exposure
to identical solution conditions, lends itself well to combinatorial chemistry. In
addition to the possibility of ultra-high-density chemical arrays, the nanoscale
resolution of DPN also enables studies of collective molecular interactions, as well
as how the properties of nanoscale aggregates might differ from bulk behavior. For
instance, the ability for nanopatterned SAMs to function as resists against the
chemical etching of metallic films has been investigated combinatorially as a
function of pattern dimension in order to minimize feature sizes [73]. Solid-state
nanostructures with features as low as 15 nm have been fabricated by the direct
deposition of etchant [67], while nanostructures of various metals such as gold, silver
and palladium have been generated with 35 nm dot diameters and 53 nm line
widths [80, 81].
As a first demonstration of the ability to screen the chemical behavior of different
compounds on the same surface, four different thiol molecules were patterned
within an area of 5 µm2 on a single gold surface to form combinatorial libraries,
as shown in Figure 5.12. The libraries were then used to study molecular
displacement from the surface by repeatedly scanning the same area with an
ink-coated tip and observing the order in which the spots disappeared as they were
exchanged for new molecules from the tip [82]. In another creative approach to
combinatorial chemistry, DPN was used to pattern molecules on cantilevers which
could, in principle, be used as force sensors for determining interactions between
molecular libraries [83].
Another unique application of DPN is in the study of microscale and nanoscale phase
separation. For instance, phase separation and pattern formation in conjugated
polymers spin-coated onto combinatorially nanostructured DPN templates were studied
as a function of polymer concentration and MHA dot diameter [84]. The information
resulting from such screenings of pattern dimensions makes it possible to control
pattern formation in thin organic films, with potential applications in the
fabrication of organic electronic and optical devices. Furthermore, by coating the
tip with a mixture of two (or more) inks it becomes possible to observe if, and
how, the molecules de-mix during the DPN writing process, by using in situ AFM
measurements [85]. The use of such separated phases for selective adsorption made
it possible to reduce line widths down to 10 nm.
5.6.3
Biological Arrays
[Figure caption fragment: attachment of oligonucleotide-labeled nanoparticles of
two different diameters (5 nm and 13 nm). Reproduced from Ref. [22] with permission
from the American Association for the Advancement of Science.]
similarity between different DNA strands of the same sequence renders them
promising for parallel arraying, as DNA strands of different sequence can be
patterned under the same environmental conditions.
A variety of different proteins have also been patterned by DPN, both by selective
adsorption and by direct-write processes. To date, selective adsorption using
various coupling strategies appears to have been the most successful route to
functional protein nanoarrays, as it can be carried out without dehydrating fragile
proteins [70, 72, 77, 78, 88, 89]. For example, an antibody nanoarray-based
detection assay for HIV in patients' plasma exceeded the limit of detection of
conventional enzyme-linked immunosorbent assay (ELISA)-based immunoassays (5 pg per
ml of plasma) by more than 1000-fold [90]. Direct-write approaches hold the promise
of larger-scale integration, and have also proven able to generate functional
protein nanoarrays. For example, collagen was the first protein to be directly
deposited onto gold, by first reducing disulfide bridges in the biopolymer and then
allowing the collagen fibrils to reassemble on the gold surface. Functional
his-tagged proteins have been deposited directly onto nickel oxide surfaces via
metal affinity [91], and unmodified antibodies have been directly patterned onto
gold surfaces by nonspecific adsorption [52].
5.7
Conclusions
In conclusion, DPN provides the ability to simultaneously integrate and nanostructure a diverse range of materials on a variety of surfaces, at unprecedented levels of
both spatial resolution and complexity. In principle, several thousand different
materials could be integrated in thousands of different combinations with nanoscale
resolution over square-centimeter (or larger) surface areas. Although much remains
to be done before this dream is achieved, the barriers appear to be surmountable.
Clearly, a basic understanding of the mechanisms behind the nanoscale transport of
ink will be necessary in order to reproducibly carry out and extend the capabilities of
DPN. In addition, innovative probe designs and inking strategies will be essential in
order to expand the capability of DPN for large-scale materials integration.
References
1 Piner, R.D., Zhu, J., Xu, F., Hong, S.H. and
Mirkin, C.A. (1999) Science, 283, 661.
2 Ginger, D.S., Zhang, H. and Mirkin, C.A.
(2004) Angewandte Chemie - International
Edition, 43, 30.
3 Salaita, K., Wang, Y.H. and Mirkin, C.A.
(2007) Nature Nanotechnology, 2, 145.
4 Haaheim, J. and Nafday, O.A. (2008)
Scanning, 30, 137.
44 Li, Y., Maynor, B.W. and Liu, J. (2001)
Journal of the American Chemical Society,
123, 2105.
45 Cai, Y.G. and Ocko, B.M. (2005) Journal
of the American Chemical Society, 127,
16287.
46 Zhang, M., Bullen, D., Chung, S.W., Hong,
S., Ryu, K.S., Fan, Z.F., Mirkin, C.A. and
Liu, C. (2002) Nanotechnology, 13, 212.
47 Salaita, K., Lee, S.W., Wang, X.F., Huang,
L., Dellinger, T.M., Liu, C. and Mirkin, C.A.
(2005) Small, 1, 940.
48 Salaita, K., Wang, Y.H., Fragala, J., Vega,
R.A., Liu, C. and Mirkin, C.A. (2006)
Angewandte Chemie - International Edition,
45, 7220.
49 Lenhert, S., Sun, P., Wang, Y.H., Fuchs, H.
and Mirkin, C.A. (2007) Small, 3, 71.
50 Bullen, D., Chung, S.W., Wang, X.F., Zou,
J., Mirkin, C.A. and Liu, C. (2004) Applied
Physics Letters, 84, 789.
51 Bullen, D. and Liu, C. (2006) Sensors and
Actuators A - Physical, 125, 504.
52 Lee, K.B., Lim, J.H. and Mirkin, C.A.
(2003) Journal of the American Chemical
Society, 125, 5588.
53 Ryu, K.S., Wang, X.F., Shaikh, K., Bullen,
D., Goluch, E., Zou, J., Liu, C. and
Mirkin, C.A. (2004) Applied Physics
Letters, 85, 136.
54 Banerjee, D., Amro, N.A., Disawal, S.
and Fragala, J. (2005) Journal of Microlithography, Microfabrication and
Microsystems, 4, 230.
55 Li, S.F., Shaikh, K.A., Szegedi, S., Goluch,
E. and Liu, C. (2006) Applied Physics Letters,
89, 173125.
56 Kim, K.H., Moldovan, N. and Espinosa,
H.D. (2005) Small, 1, 632.
57 Moldovan, N., Kim, K.H. and Espinosa,
H.D. (2006) Journal of Micromechanics and
Microengineering, 16, 1935.
58 Lewis, A., Kheifetz, Y., Shambrodt, E.,
Radko, A., Khatchatryan, E. and Sukenik,
C. (1999) Applied Physics Letters, 75, 2689.
59 Taha, H., Marks, R.S., Gheber, L.A.,
Rousso, I., Newman, J., Sukenik, C. and
Lewis, A. (2003) Applied Physics Letters, 83,
1041.
6
Scanning Ion Conductance Microscopy of Cellular and Artificial
Membranes
Matthias Böcker, Harald Fuchs, and Tilman E. Schäffer
6.1
Introduction
6.2
Methods
6.2.1
The Basic Set-Up
In a basic SICM set-up, a nanopipette with a small tip opening diameter is
positioned close to a sample surface (Figure 6.1a). The nanopipette is filled with
an electrolyte and the sample is placed in an electrolyte-filled dish. Typically,
the electrolytes in the nanopipette and in the dish are identical, so that no
osmotic flow into or out of the pipette occurs. For the current measurement, a
voltage is applied between two silver/silver chloride (Ag/AgCl) electrodes. Ag/AgCl
electrodes have electrochemical properties that make them well suited to
applications in SICM. For example, they have a very small equilibrium constant at
room temperature, so that only a small amount of Ag+ ions exists in the
electrolyte. Additionally, they are easily fabricated, for example by the
electrolytic deposition of silver chloride on a silver wire. One of the electrodes
is placed inside the pipette (the pipette electrode), while the other is placed
inside the electrolyte in the dish (the bath electrode). The ion current from the
pipette electrode through the pipette, and through it