François Baneyx: Recombinant Protein Expression in Escherichia coli
Plasmid copy number and maintenance
To achieve high gene dosage, heterologous cDNAs are typically cloned into plasmids that replicate in a relaxed fashion and are present at 15–60 (e.g. pMB1/ColE1-derived plasmids) or a few hundred copies per cell (e.g. the pUC series of pMB1 derivatives). When [...]

Table 1
Programmed cell death in E. coli: selected approaches to enhance plasmid stability.

Genetic tool | Principle of action
hok/sok (parB) locus of plasmid R1 | Hok is a 52 amino acid-long membrane-damaging protein encoded on a very stable but translationally inactive transcript. Sok is a highly unstable antisense RNA that binds to the hok mRNA leader region. Rapid decay of the Sok pool in plasmid-free cells leads to the processing of the 3′ end of hok to yield an active transcript. Related system: pndAB of plasmid R483.
ccdAB locus of plasmid F | CcdB is a proteolytically stable 11 kDa protein that inhibits DNA gyrase. CcdA is a 9 kDa protein that binds to CcdB and blocks its action. Because the half-life of CcdA is much shorter than that of CcdB, plasmid-free segregants are killed upon degradation of the ‘antidote’. Related systems: parDE of plasmids RP4/RK2, phd/doc of plasmid P1, parD/pem of plasmids R1/R100.
Complementation | An essential chromosomal gene is deleted or mutated and an intact copy or a suppressor is supplied in trans on a plasmid. Plasmid loss leads to cell death under non-permissive growth conditions. Examples of chromosomal alterations include deletions of genes necessary for the synthesis of essential amino acids, and thermosensitive and nonsense mutations in essential chromosomal genes.

A radically different solution to the problem of plasmid instability is the direct insertion of heterologous genes within the chromosome of E. coli. Although simple delivery vehicles (e.g. bacteriophage λ) are available for this purpose, little emphasis has been placed on this strategy owing to the perception that gene dosage will necessarily be low. However, once chromosomal insertion of a single DNA fragment containing a drug-resistance marker and flanked by two short direct repeats has been achieved, the entire fragment can be amplified to 15–40 copies through recA-mediated duplications by increasing the antibiotic concentration [4••]. Although such amplified structures have been reported to be unstable in the absence of selective pressure [5], Olson et al. [4••] recently reported that tandem repeats of an IGF-I fusion located at the attλ site of the E. coli chromosome remained stable in high-density fermentations conducted without antibiotic. An elegant but more time-consuming approach for the insertion of multiple DNA fragments at different locations of the chromosome was developed by Peredelchuk and Bennett [6]. This scheme uses elements of the Tn1545 site-specific recombination module to randomly integrate a target gene and a drug-resistance marker into the chromosome of a host strain provided with transposon integrase, thereby yielding a collection of clones containing single insertions at different locations of the chromosome. Inserts from the resulting population can be accumulated within a single strain by successive cycles of DNA transfer through bacteriophage P1-mediated transduction, selection for drug resistance and removal of the marker using the excision system of phage λ. As a result, instability problems are eliminated, although strain performance may be compromised if important chromosomal loci are disrupted.

Promoters
For many years the E. coli lactose utilization (lac) operon has served as one of the paradigms of prokaryotic regulation. It is therefore not surprising that many of the promoters used to drive the transcription of heterologous genes have been constructed from lac-derived regulatory elements. Although the lac promoter and its close relative, lacUV5 (which is theoretically not subject to cAMP-dependent regulation, but see [7•]), are rather weak and rarely used for the high-level production of recombinant polypeptides, they are extremely valuable tools for achieving graded expression of helper or toxic proteins, provided that lacY mutant hosts are used and that induction is performed with the non-hydrolyzable lactose analog isopropyl-β-D-1-thiogalactopyranoside (IPTG) (see [8•,9] for a discussion). The synthetic tac and trc promoters, which consist of the –35 region of the trp promoter and the –10 region of the lac promoter, differ only by 1 bp in the length of the spacer domain separating the two hexamers. Both promoters are quite strong and routinely allow the accumulation of polypeptides to about 15–30% of the total cell protein. Although it is often argued that the cost of IPTG limits the usefulness of these promoters, this is rarely a problem for high-added-value products. Furthermore, as little as 50–100 µM IPTG is usually sufficient to achieve full induction. The more serious issue of IPTG toxicity can be circumvented by using lactose as an inducer or by making use of thermosensitive variants of the LacI repressor protein that allow thermal induction of recombinant protein synthesis.

The leakiness of lac-derived promoters may be a concern for the production of membrane proteins or other gene products that are toxic to the cell. For medium copy number plasmids (e.g. pBR322), repression can be efficiently achieved by using host strains carrying the lacIQ allele. This single nucleotide mutation in the –35 hexamer of the chromosomal lacI promoter leads to an increase in the number of LacI repressor molecules from 10–20 to over 100 per cell. For higher copy number plasmids (e.g. pUC derivatives or pMB1 derivatives containing a rom/rop mutation), the lacI or lacIQ genes are typically cloned onto the expression plasmid or provided in trans on a compatible plasmid. It was recently shown, however, that a 15 bp deletion in the lacI promoter that fortuitously replaces the native –35 hexamer with the consensus sequence for σ70-dependent promoters increases the strength of the lacI promoter 170-fold [10•]. Strains bearing the resulting lacIQ1 allele efficiently repress lacI-regulated genes on high copy number plasmids, and full activation of plasmid-borne tac promoters can be achieved with as little as 3–10 µM IPTG [10•].

In recent years, the pET vectors (commercialized by Novagen, Madison, WI) have gained increasing popularity. In this system, target genes are positioned downstream of the bacteriophage T7 late promoter on medium copy number plasmids. The highly processive T7 RNA polymerase
is supplied in trans. Typically, production hosts contain a prophage (λDE3) encoding the enzyme under control of the IPTG-inducible lacUV5 promoter. While this system leads to the synthesis of large amounts of mRNA, and, in most cases, the concomitant accumulation of the desired protein at very high concentrations (40–50% of the total cell protein), it is not without drawbacks. For example, high levels of mRNA can cause ribosome destruction and cell death, and leaky expression of T7 RNA polymerase may result in plasmid or expression instability. Furthermore, even ‘empty’ pET plasmids are toxic to E. coli in the presence of IPTG [11]. Some of the strategies that have been developed to address these issues are co-overexpression of phage T7 lysozyme (which inhibits T7 RNA polymerase) from the compatible pLysS and pLysE plasmids (Novagen) and the insertion of a lac operator sequence downstream of plasmid-encoded T7 promoters, in order to reduce leaky transcription. In addition, empirical selection has yielded strains that are superior to the traditional BL21(DE3) host by overcoming toxic effects associated with the overexpression of membrane and globular proteins under T7 transcriptional control [11]. Finally, because it has been reported that the lacUV5 promoter becomes activated in stationary phase cultures in a process requiring cAMP, cAMP-deficient (cya) mutants of BL21(DE3) should be used for clone selection and fermentation to avoid counter-selection of plasmids carrying toxic genes under T7 control [7•].

An additional limitation of the T7 and other strong promoter systems is that the target protein is often unable to reach a native conformation and either partially or completely segregates within inclusion bodies. Although this problem may be addressed by co-overexpressing folding modulators or through fusion protein technology (see below), an alternative approach is to use promoters that are activated by temperature downshift, as proper protein folding is often favored under low temperature cultivation conditions (see [12•] and references therein). The best characterized cold-shock promoter is that of the major E. coli cold-shock protein CspA [13•]. Although the cspA core promoter is only weakly induced upon temperature downshift, a 159 nucleotide (nt) long untranslated region (UTR) at the 5′ end of cspA-driven transcripts makes them highly unstable at 37°C but significantly increases their stability at low temperatures while simultaneously favoring their preferential engagement by a cold-modified translational machinery containing fewer polysomes and a larger number of monosomes, 30S and 50S ribosomal particles. The cspA promoter is rather well repressed at and above 37°C, compares favorably to the tac promoter for the expression of an aggregation-prone fusion protein at reduced temperatures and remains functional at 10°C [14]. The major disadvantage of the cspA system is that it becomes repressed 1–2 hours after temperature downshift, a time period that is too short to allow high-level accumulation of recombinant proteins. However, the use of a host strain carrying a null mutation in rbfA, a gene encoding a 15 kDa protein associating with free 30S ribosomal subunits, allowed continuous cspA-driven production of β-galactosidase in high cell density fermentations for several hours following transfer to low temperatures [15••]. The recent demonstration that cspA-driven transcription is beneficial for the expression of toxic and proteolytically sensitive gene products, together with the availability of cloning vectors designed for rapidly positioning cDNAs under cspA transcriptional control (M Mujacic, K Cooper, F Baneyx, unpublished data), should stimulate interest in this system. Interestingly, the strong bacteriophage λPL promoter, which is typically used to drive the synthesis of recombinant proteins by transferring strains containing a thermosensitive version of the λcI repressor protein (cI857) from 30 to 42°C, is also cold-inducible [16]. In this case, the main drawback is a high basal level of expression, as low-temperature induction must be performed in strains lacking λcI.

Among the various nutritionally inducible promoters (e.g. phoA and trp, which are induced by phosphate and tryptophan limitation, respectively), the arabinose promoter (araBAD or PBAD) has recently been commercialized by Invitrogen Corp (Carlsbad, CA). This system uses the inexpensive sugar L-arabinose as an inducer and is somewhat weaker than the tac promoter. Although it is commonly believed that araBAD can be used to achieve graded levels of protein expression by varying the arabinose concentration, there is extensive heterogeneity in cell populations treated with subsaturating concentrations of the inducer, with some bacteria fully induced and others not at all [9]. Thus, araBAD will not be useful for precisely controlling the levels of protein accumulation until a host strain that efficiently takes up arabinose by constitutively synthesizing the arabinose transporter(s), or a gratuitous inducer that does not employ them, is identified [8•,9]. Additional promoters regulated by a variety of signals (pH, dissolved oxygen concentration, osmolarity, etc.) are available and have been reviewed in detail elsewhere [17].

Upstream elements
The DNA regions that flank core promoters play an important role in determining transcription efficiency. Upstream (UP) elements located 5′ of the –35 hexamer in certain bacterial promoters are A+T rich sequences that increase transcription by interacting with the α subunit of RNA polymerase [18•]. Because few UP elements have been isolated, Gourse and co-workers [19••] used in vitro selection to identify upstream sequences conferring increased activity to the rrnB P1 core promoter. The best UP sequence was portable and increased in vivo transcription from the rrnB P1 and lac core promoters 326- and 108-fold, respectively.

The degree of homology with the deduced consensus sequence (–59 NNAAA[A/T][A/T]T[A/T]TTTTNNAANNN –38; where N is any nucleotide) was also shown to correlate with the strength of natural UP elements fused upstream of the lac core promoter [20•]. These results suggest that the
positioning of highly active UP sequences upstream of well repressed promoters may increase their strength to a level only achieved thus far with phage promoters, but without the drawbacks associated with phage polymerase expression (e.g. leakiness, toxicity and counter-selection).

mRNA stability
E. coli mRNAs are rather unstable, with half-lives ranging between 30 s and 20 min. The major enzymes involved in mRNA degradation are two 3′→5′ exonucleases (RNase II and polynucleotide phosphorylase [PNPase]) and the endonuclease RNase E [21•,22•]. The catalytic activity of RNase E is located at the protein amino terminus, whereas the carboxy-terminus serves as a scaffold for the assembly of a highly efficient RNA ‘degradosome’ involving PNPase, the DEAD-box RNA helicase RhlB and the glycolytic enzyme enolase. There is considerable controversy over whether RNase E-dependent mRNA decay proceeds in the 5′→3′ or in the opposite direction. In either case, stable secondary structures present in the 5′ UTR of certain transcripts as well as in 3′ rho-independent terminators can both increase mRNA stability; however, their efficiency is modulated by fine features. For example, addition of poly(A) tails to the 3′ end of mRNAs by the seemingly redundant poly(A) polymerases PAP I (the pcnB gene product) and PAP II [23] provides a single stranded ‘toehold’ for RNase II and PNPase that facilitates transcript degradation. In general, polyadenylation is not a problem for recombinant protein expression, as only a small fraction of mRNAs contain poly(A) tails in wild-type E. coli strains.

The stabilizing effect conferred by untranslated 5′ hairpins was first demonstrated in the case of the long-lived ompA mRNA. Fusion of the ompA 5′ UTR to a variety of heterologous mRNAs significantly increased transcript half-life, presumably by interfering with RNase E binding [24]. This protective effect is abrogated, however, when the hairpin is preceded by 5′ unpaired nucleotides [25•]. Because RNase E is much more efficient at cleaving substrates with 5′ monophosphate ends than 5′ triphosphate ends [26••], the stabilizing function of 5′ hairpins may be related to their ability to sequester the end of transcripts. The 5′ UTR of the ompA mRNA appears particularly well suited for this task: among 10 synthetic hairpins, only one was slightly more effective than the ompA UTR in stabilizing lacZ transcripts [25•].

Translational issues
Initiation of translation of E. coli mRNAs requires a Shine-Dalgarno (SD) sequence complementary to the 3′ end of the 16S rRNA and of consensus 5′-UAAGGAGG-3′, followed by an initiation codon, which is most commonly AUG. About 8% of start sites use GUG, whereas UUG and AUU are rare initiators that are only present in autogenously regulated genes (e.g. those encoding ribosomal protein S20 and initiation factor 3). Although the optimal spacing between these two features is 8 nt, translation initiation is only severely affected if this distance is reduced below 4 nt or increased above 14 nt [27]. Because of the close coupling between transcription and translation in prokaryotes, engineering of the translation initiation region is a powerful tool for modulating gene expression in a promoter-independent fashion [27]. This also means that stable mRNA secondary structures encompassing the SD sequence and/or the initiation codon can dramatically reduce gene expression by interfering with ribosome binding. This problem can be circumvented by increasing the homology of SD regions to the consensus, and by raising the number of A residues in the initiation region through site-directed mutagenesis. An additional mRNA feature affecting translation initiation is the downstream box (DB), which is located after the initiation codon and is complementary to bases 1469–1483 of the 16S rRNA. DBs have a 5′-AUGAAUCACAAAGUG-3′ consensus sequence and recent evidence suggests that they play a major role as translational enhancers [28••]. Although introduction of a consensus DB at the 5′ end of genes encoding recombinant proteins would change their amino acid sequence, increasing the homology of this region to that of a DB by using synonymous codons may improve translation initiation of certain transcripts.

Differences in codon usage between prokaryotes and eukaryotes can have a significant impact on heterologous protein production. The arginine codons AGA and AGG are rarely found in E. coli genes, whereas they are common in Saccharomyces cerevisiae and other eukaryotes. The presence of such codons in cloned genes affects protein accumulation levels, mRNA and plasmid stability and, in extreme cases, inhibits protein synthesis and cell growth [29]. An important but much less obvious effect of AGA codons is primary structure changes due to the misincorporation of lysine for arginine, particularly when cells are grown in minimal medium [30•]. Fortunately, these problems can usually be addressed by using site-directed mutagenesis to replace rare arginine codons with the E. coli-preferred CGC codon or by co-overexpressing the argU (dnaY) gene, which encodes tRNAArg(AGG/AGA).

Folding in the cytoplasm
Overproduction of heterologous proteins in the cytoplasm of E. coli is often accompanied by their misfolding and segregation into insoluble aggregates known as inclusion bodies. Although inclusion body formation can greatly simplify protein purification, there is no guarantee that in vitro refolding will yield large amounts of biologically active product (unsuccessful refolding attempts are seldom reported in the literature). A traditional approach to reduce protein aggregation is through fermentation engineering, most commonly by reducing the cultivation temperature (see [12•] and references therein). The more recent realization that in vivo protein folding is assisted by molecular chaperones, which promote the proper isomerization and cellular targeting of other polypeptides by transiently interacting with folding intermediates, and by foldases, which accelerate rate-limiting steps along the folding
pathway, has provided powerful new tools to combat the problem of inclusion body formation [12•,31].

The best characterized molecular chaperones in the cytoplasm of E. coli are the ATP-dependent DnaK-DnaJ-GrpE and GroEL-GroES systems [32•,33•]. DnaK binds to hydrophobic regions exposed to the solvent by nascent or stress-unfolded polypeptides, thereby preventing off-pathway reactions leading to aggregation. The promiscuity of DnaK binding is well explained by the fact that it recognizes heptameric stretches of amino acids consisting of a 4–5 residue-long hydrophobic core flanked by basic residues. This motif occurs every 36 residues in the average protein [34]. DnaJ, which independently binds folding intermediates, activates DnaK for tight substrate binding and might direct it to high-affinity sites. The nucleotide exchange factor GrpE mediates complex resolution: released proteins may either fold into a proper conformation, be recaptured by DnaK-DnaJ for additional cycles of interaction or be reversibly transferred to the ‘downstream’ GroEL-GroES chaperonins. GroEL is an ~800 kDa hollow toroid consisting of two stacked homoheptameric rings. It binds both substrate proteins and GroES (a 70 kDa dome-shaped homoheptamer) via a ring of hydrophobic residues located in its apical domain. Although no clear consensus sequence has been identified, GroEL, like DnaK, appears to favor hydrophobic and basic residues in its substrates [35]. Upon GroES binding, partially structured folding intermediates are released into the inner cavity of GroEL where they can fold in a capped and hydrophilic environment. There is extensive evidence that co-overproduction of the DnaK-DnaJ or GroEL-GroES chaperones can greatly increase the soluble yields of aggregation-prone proteins (see [31] and references therein) and a number of plasmids compatible with pMB1-derived cloning vectors are available for this purpose [36•,37,38]. The process does not involve dissolution of preformed recombinant inclusion bodies but is related to improved folding of newly synthesized protein chains [38]. It is important to point out, however, that the beneficial effect associated with an increase in the intracellular concentration of DnaK-DnaJ and GroEL-GroES is highly dependent on the nature of the overproduced protein, and that success is by no means guaranteed (and highly unlikely if the protein is inherently incapable of folding).

Based on in vitro studies and homology considerations, a number of additional cytoplasmic proteins have been proposed to function as molecular chaperones. They include ClpB, HtpG and IbpA/B, which, like DnaK-DnaJ-GrpE and GroEL-GroES, are heat-shock proteins (Hsps) belonging to the σ32 stress regulon. Although inactivation of these Hsps has a modest effect on the ability of E. coli to handle thermal stress [39•], they appear to have a supporting role in cellular protein folding by acting as minor chaperones that bind folding intermediates or misfolded proteins and transfer them to the DnaK-DnaJ-GrpE team for subsequent reactivation (Figure 1). Although overproduction of IbpA/B, HtpG or ClpB did not suppress the misfolding of an aggregation-prone fusion protein in the E. coli cytoplasm (JG Thomas, F Baneyx, unpublished data), increased intracellular levels of these Hsps might improve the solubility of other substrates, particularly if coordinated with DnaK-DnaJ overexpression.

Figure 1

The trans conformation of X–Pro bonds is energetically favored in nascent protein chains; however, ~5% of all prolyl peptide bonds are found in a cis conformation in native proteins. The trans to cis isomerization of X–Pro bonds is rate limiting in the folding of many polypeptides and is catalyzed in vivo by peptidyl prolyl cis/trans isomerases (PPIases). Three cytoplasmic PPIases, SlyD, SlpA and trigger factor (TF), have been identified to date [40,41]. The most potent is TF, a 48 kDa protein associated with 50S ribosomal subunits that has been postulated to cooperate with chaperones to guarantee proper folding of newly synthesized proteins. Whether TF overproduction will improve the folding of recombinant proteins synthesized in the E. coli cytoplasm remains to be determined. It should be noted, however, that this may not be without physiological consequences, as TF and SlyD overproduction lead to cell filamentation. Interestingly, co-overproduction of a leader-less version of PpiA (thus confining to the cytoplasm a PPIase that normally resides in the periplasm) has been shown to increase the yields of a cytoplasmic fusion protein [42•].

Structural disulfide bonds do not form in the cytoplasm of wild-type E. coli strains, as this environment is reducing and at least five proteins (thioredoxins 1 and 2, and glutaredoxins 1, 2 and 3, the products of the trxA, trxC, grxA, grxB and grxC genes, respectively) are involved in the reduction of disulfide bridges that transiently arise in cytoplasmic enzymes [43••]. Nevertheless, disulfide-bonded recombinant proteins can accumulate in the cytoplasm of surprisingly healthy trxB mutants that lack thioredoxin reductase, a protein responsible for the reduction of oxidized thioredoxins. Oxidation occurs post-translationally and is favored at low temperatures (see [44] and references therein). While mutants lacking both trxB and genes involved in the reduction of glutaredoxins (e.g. gshA and gor) are even more efficient at accumulating oxidized recombinant proteins in dithiothreitol-free medium, they exhibit severe growth deficiencies in the absence of the reducing agent [45]. As the majority of a cysteine-rich eukaryotic protein was found to accumulate in an almost completely oxidized, but inter-molecular disulfide bonded form in trxB mutants held on ice [44], the main challenge will be to engineer protein disulfide isomerases capable of reshuffling disulfide bridges into their native pattern in this environment.

Cytoplasmic degradation
Protein folding and proteolytic degradation are intimately linked, as catabolism is an efficient way to conserve cellular resources by recycling improperly folded or irremediably damaged proteins into their constituent amino acids. In the cytoplasm of E. coli, most, if not all, early degradation steps are carried out by five ATP-dependent Hsps: Lon/La, FtsH/HflB, ClpAP, ClpXP, and ClpYQ/HslUV [46]. ClpAP and ClpXP are two-component proteases that share the same degradation subunit (ClpP) but have different ATPase regulatory subunits (ClpA or ClpX). The latter appear to bind substrates in a chaperone-like manner and use ATP hydrolysis to feed them to the proteolytic center of mini-proteasome structures. Along with FtsH (an inner membrane-associated protease, the active site of which faces the cytoplasm), ClpAP and ClpXP are responsible for the degradation of proteins modified at their carboxyl termini by addition of the non-polar destabilizing tail AANDENYALAA (using amino acid single letter code) [47•,48•]. The tagging mechanism involves the 10Sa (SsrA) stable RNA and is designed to prevent ribosome stalling at the 3′ end of damaged mRNAs [49]. Proteases Lon and ClpYQ appear to be more generic as they efficiently degrade puromycin-truncated proteins; however, there is some evidence that they also exhibit tail specificity. An obvious consequence of the existence of the SsrA tagging system is that any heterologous protein rich in non-polar residues at its carboxyl terminus will be an appetizing substrate for cellular proteases.

A possible strategy to avoid degradation is to make use of host strains bearing mutations in protease genes; however, there are drawbacks to this approach. For example, inactivation of Lon leads to filamentation and FtsH is an essential protein for which only thermosensitive mutants are available. In addition, several proteases are usually involved in the degradation of a given protein substrate, but multiple mutations in genes encoding proteases reduce cell growth rates and compromise strain fitness. An alternative is to target the polypeptide of interest to the insoluble fraction of the cell, as inclusion-body proteins are generally protected from degradation. For a normally soluble protein, this can be achieved by using strains bearing thermosensitive mutations in the major molecular chaperone systems [50]. It is important, however, to bear in mind that certain proteases (e.g. OmpT) adsorb to the surface of inclusion bodies during the recovery process and may degrade the desired protein while it is being refolded. The inner membrane protease FtsH is also active under denaturing conditions and can process recombinant proteins associated with the inner membrane during their refolding (KW Cooper, F Baneyx, unpublished data).

Fusion proteins
Although fusion proteins were originally constructed to facilitate protein purification and immobilization and to couple the activity of enzymes acting in a single metabolic pathway, it soon became apparent that certain fusion partners could greatly improve the solubility of passenger proteins that would otherwise accumulate within inclusion bodies in the cell cytoplasm. Systems suitable for the construction of fusions to maltose-binding protein (MBP), thioredoxin and glutathione S-transferase are commercially available and additional ‘solubilizing’ fusion partners (e.g. variants of DsbA and gpHD) have recently been described [42•,51•]. The most probable reason for improved folding (and/or reduced degradation) of passenger proteins is that the fusion partner efficiently and rapidly reaches a native conformation as it emerges from the ribosome (or soon after its release), and promotes the acquisition of correct structure in downstream folding units by favoring on-pathway isomerization reactions. In the case of unfused cytoplasmic MBP, proper folding requires both DnaK-DnaJ-GrpE and GroEL-GroES, which may recruit chaperones in the vicinity of the passenger protein (JG Thomas, F Baneyx, unpublished data). It has also been proposed that MBP
may directly interact with passenger proteins [52••], thereby acting as an ‘intramolecular’ chaperone, much like protease propeptides do [53]. These mechanisms require the MBP domain to be synthesized first and are in agreement with a study showing that, whereas mammalian aspartic proteinases are soluble when fused to the carboxyl-terminus of MBP, they become insoluble when the order of the fusion proteins is reversed [54]. It should finally be noted that, despite outlandish claims, not all fusion partners are equally proficient at alleviating inclusion body formation. In a systematic comparison of the effectiveness of various fusion partners in increasing the solubility of six aggregation-prone passenger polypeptides, Kapust and Waugh [52••] found that MBP was far superior to either thioredoxin or glutathione-S-transferase as a ‘solubilizing’ partner.

The affinity of certain fusion partners for immobilized ligands can facilitate the purification of the desired fusion protein; however, binding usually occurs with low affinity (which precludes the use of stringent wash conditions) and can be disrupted by passenger proteins. The use of polyhistidine tags at the amino-terminus or at the junction region of the fusion partner can solve this problem by allowing efficient purification via immobilized metal affinity chromatography [51•,55]. An additional advantage of fusion proteins is that they appear to permit the synthesis of otherwise poorly translated polypeptides. A probable explanation for this result is that the translation of passenger proteins containing rare codons occurs with higher efficiency; however, this may also lead to misincorporation of lysine for arginine at rare codons.

Currently, the main disadvantages of fusion-protein technologies are that: firstly, liberation of the passenger proteins requires expensive proteases (e.g. Factor Xa and enterokinase); secondly, cleavage is rarely complete, leading to a reduction in yields; thirdly, additional steps may be required to obtain an active product (e.g. formation and isomerization of disulfide bonds); and finally, solubility is never guaranteed.

Secretion
Polypeptides destined for export are synthesized as preproteins containing an amino-terminal signal sequence (leader peptide) that is cleaved during the translocation process by inner-membrane-associated leader peptidases, the active sites of which face the periplasm. Typical signal sequences are 18–30 amino acids in length and consist of [...] efficient translocation of heterologous polypeptides across the inner membrane when fused to their amino termini. In some cases, however, preproteins are not readily exported and either become ‘jammed’ in the inner membrane, accumulate in precursor inclusion bodies, or are rapidly degraded within the cytoplasm. While membrane jamming is an indication that translocation may be physically impossible (e.g. in the case of large cytoplasmic proteins, unnatural fusion proteins, and mutant proteins evolved by combinatorial approaches), an improved understanding of secretory mechanisms in E. coli has provided clues to circumvent other problems.

Efficient translocation requires that secretory proteins be brought into the vicinity of the inner membrane in a loosely folded form. This is guaranteed by molecular chaperones, which can be either generic (e.g. DnaK and GroEL) or specific for secretory proteins. SecB, a tetrameric polypeptide present at low levels in the cytoplasm, binds to the mature domain of a subset of preproteins destined for the outer membrane and transfers them to the peripheral membrane protein SecA. The latter uses energy derived from ATP hydrolysis and the proton motive force to mediate preprotein export by cycles of insertion and de-insertion into the SecYEG translocon [56•,57•]. The signal recognition particle (SRP), which consists of a 4.5S RNA and a 48 kDa GTPase termed Ffh/P48, binds highly hydrophobic signal sequences in certain preproteins (e.g. integral inner membrane proteins) and delivers them to the peripheral membrane protein FtsY in the vicinity of SecA and SecYEG [58••]. It is therefore probable that the majority of secretory proteins are delivered to the SecA motor via a variety of targeting mechanisms for export through SecYEG; however, some inner-membrane proteins also appear to directly integrate into the lipid bilayer [59].

In view of the above mechanistic information, it is tempting to hypothesize that the misfolding and degradation of a number of heterologous proteins targeted for the periplasm result from their inefficient chaperoning to the translocase, either because they fold (or misfold) too rapidly in the cytoplasm, or because the necessary chaperone(s) become limiting. Attempts to co-overproduce SecB, DnaK-DnaJ and GroEL-GroES have met with variable success and improved secretion depends heavily on the signal-sequence–mature protein combination [60]. This suggests that the signal sequence influences secondary and tertiary structure formation in the mature region of secretory proteins, which in turn affects chaperone recognition. It may
two or more basic residues at the amino terminus, a central therefore be necessary to try several signal sequences
hydrophobic core of seven or more amino acids, and a and/or overproduce different chaperones to optimize the
hydrophilic carboxyl-terminus motif recognized by leader translocation of a given heterologous protein. At present,
peptidases (usually small residues at positions –1 and –3 there are no reports on how overproduction of components
[A, G or S] preceeded by a helix-breaking residue at posi- of the SRP (and in particular FfH) affects protein secretion.
tion –6 [P or G]; where +1 denotes the first amino acid of This route may be particularly valuable for improving the
the mature protein). Many signal sequences derived from assembly of inner membrane proteins. It should finally be
naturally occuring secretory proteins (e.g. OmpA, OmpT, noted that strains selected for their ability to restore the
PelB, β-lactamase and alkaline phosphatase) support the export of preproteins with defective signal sequences are
useful hosts for facilitating export. The most potent mutation (prlA4 in
SecY) was recently shown to function by stabilizing SecA at the SecYEG
translocon [61••].

Folding and degradation in the periplasm
The periplasm is an oxidizing environment that contains enzymes
catalyzing the formation and rearrangement of disulfide bonds [43••,62].
As a result, it is a particularly attractive destination for the
production of secreted eukaryotic proteins. Recent studies have shown
that co-overproduction of protein disulfide isomerases (and in
particular DsbC) can greatly improve proper disulfide bond formation in
cysteine-rich recombinant proteins, such as human tissue plasminogen
activator [63••].

erologous proteins. The facts that only a small amount of information
has been exploited for practical purposes and that many fundamental
aspects of E. coli physiology remain to be uncovered will continue to
fuel progress in optimizing this microorganism for protein expression.
Many improvements have resulted from serendipitous discoveries (e.g. the
usefulness of fusion proteins and the fact that disulfide bridges can
form in the cytoplasm of trxB strains) and this trend is likely to
continue. Although certain post-translational modifications
(e.g. glycosylation) will probably remain beyond the reach of E. coli,
robust engineered strains suitable for the cost-effective production of
a wide variety of complex eukaryotic proteins should become available in
the near future.
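
The typical signal-sequence features listed in the Secretion section (overall length, basic residues at the amino terminus, a hydrophobic core, and the residues at positions –1, –3 and –6) lend themselves to a quick computational screen of candidate leader peptides. The Python sketch below is illustrative only. The names `SignalCheck`, `longest_run` and `check_signal_sequence`, the six-residue window used for the amino terminus, and the set of residues counted as hydrophobic are assumptions that do not come from the text, and the example peptide is hypothetical. Because the text describes these features as typical rather than obligatory, each one is reported separately instead of being combined into a single pass/fail verdict.

```python
from dataclasses import dataclass

BASIC = set("KR")              # basic residues (Lys, Arg)
HYDROPHOBIC = set("AVLIMFW")   # ASSUMPTION: residues counted as hydrophobic
SMALL = set("AGS")             # small residues expected at -1 and -3
HELIX_BREAKERS = set("PG")     # helix-breaking residues expected at -6


@dataclass
class SignalCheck:
    length_ok: bool            # 18-30 residues
    basic_n_terminus: bool     # two or more basic residues near the N-terminus
    hydrophobic_core: bool     # hydrophobic stretch of seven or more residues
    small_minus1_minus3: bool  # A, G or S at positions -1 and -3
    breaker_minus6: bool       # P or G at position -6


def longest_run(seq: str, allowed: set) -> int:
    """Length of the longest uninterrupted stretch of residues in `allowed`."""
    best = current = 0
    for residue in seq:
        current = current + 1 if residue in allowed else 0
        best = max(best, current)
    return best


def check_signal_sequence(signal: str, n_region: int = 6) -> SignalCheck:
    """Score a signal peptide (the mature protein starts right after it).

    ASSUMPTION: the 'amino terminus' is taken as the first `n_region`
    residues, and the hydrophobic core is located as the longest hydrophobic
    run anywhere in the peptide rather than strictly in its centre.
    """
    s = signal.upper()
    return SignalCheck(
        length_ok=18 <= len(s) <= 30,
        basic_n_terminus=sum(r in BASIC for r in s[:n_region]) >= 2,
        hydrophobic_core=longest_run(s, HYDROPHOBIC) >= 7,
        small_minus1_minus3=len(s) >= 3 and s[-1] in SMALL and s[-3] in SMALL,
        breaker_minus6=len(s) >= 6 and s[-6] in HELIX_BREAKERS,
    )


if __name__ == "__main__":
    # Hypothetical 19-residue leader peptide, used purely for illustration.
    print(check_signal_sequence("MKKTLALLAVALLPSTAQA"))
```

A screen of this kind can only flag departures from the typical pattern. As noted in the Secretion section, export efficiency also depends on the combination of signal sequence and mature protein, so candidates still have to be tested experimentally.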
