Identifying Site-specific Metastasis Genes and Functions
G.P. GUPTA,* A.J. MINN,*§ Y. KANG,* ** P.M. SIEGEL,*# I. SERGANOVA,†
C. CORDÓN-CARDO,‡ A.B. OLSHEN,¶ W.L. GERALD,‡ AND J. MASSAGUÉ*
*Cancer Biology and Genetics Program and Howard Hughes Medical Institute, and Departments of †Neurology, ‡Pathology, and ¶Biostatistics, Memorial Sloan-Kettering Cancer Center, New York, New York 10021
Metastasis is a multistep and multifunctional biological cascade that is the final and most life-threatening stage of cancer progression. Understanding the biological underpinnings of this complex process is of extreme clinical relevance and requires unbiased and comprehensive biological scrutiny. In recent years, we have utilized a xenograft model of breast cancer metastasis
to discover genes that mediate organ-specific patterns of metastatic colonization. Examination of transcriptomic data from cohorts of primary breast cancers revealed a subset of site-specific metastasis genes that are selected for early in tumor progression. High expression of these genes predicts the propensity for lung metastasis independently of several classic markers of
poor prognosis. These genes fulfill dual functions—enhanced primary tumorigenicity and augmented organ-specific metastatic
activity. Other metastasis genes fulfill functions specialized for the microenvironment of the metastatic site and are consequently not selected for in primary tumors. These findings improve our understanding of metastatic progression, facilitate the
interpretation of primary tumor gene expression data, and open several important possibilities for future clinical application.
Tumorigenesis involves the temporal acquisition of genetic and epigenetic alterations that ultimately enable a
cell to divide without concern for the homeostatic constraints that limit the growth of normal tissues. As such,
cancer is not a static phenomenon, but rather a dynamic
process that evolves over the course of tumor initiation
and progression, and can manifest with an impressively
diverse array of phenotypic properties from one primary
tumor to the next. One of these properties is the ability of
cancerous cells from one organ to invade and thrive at another organ. Also known as metastasis, distant recurrence
is the leading cause of mortality in patients with solid tumors of most organs. However, not all primary tumors
acquire the metastatic phenotype in the course of disease
progression, and prospectively identifying which patients
are most (or least) likely to develop metastases is of immense clinical importance. Furthermore, understanding
the mechanisms that drive the formation of metastases
may identify novel targets for much-needed therapy
against this deadly biological process (see Fig. 1).
From the perspective of an invasive cancer cell, not all
potential sites for metastasis are created equal. Clinicians
have observed for over a century that certain types of primary tumors are more likely to metastasize to specific organs (Fidler 2003). For example, advanced colon cancers
frequently spread to liver, and breast cancers preferentially metastasize to bone and lung. On the basis of these
clinical observations, Stephen Paget proposed in 1889 the
“seed and soil hypothesis,” which postulated that tumor
cells (the seeds) will only grow in a distant organ if they
are competent to thrive in that microenvironment (the
soil). This theory, which placed prime emphasis on the
cross talk between metastatic tumor cells and their microenvironment, was contested by James Ewing in 1926,
when he proposed that metastatic propensities are dictated primarily by circulatory patterns, i.e., cells will
metastasize to the organ to which they have the greatest
vascular access. Subsequent analyses of patterns of
metastatic spread in patients as well as in experimental
models concluded that although regional recurrences
were highly dependent on the efficiency of vascular perfusion, distant metastatic recurrence for most tumors was
truly non-random, with no correlation to anatomically defined patterns of hematogenous or lymphatic circulation.
Thus, Paget’s seed and soil hypothesis prevailed, although molecular determinants of this putative cross talk
were still entirely unknown.
Present addresses: §Department of Radiation Oncology, University of Chicago, Chicago, Illinois 60637; **Department of
Molecular Biology, Princeton University, Princeton, New Jersey 08544; #Departments of Medicine and Biochemistry,
McGill University, Montreal, Quebec, Canada H34 1A4.
For a cell to successfully metastasize to a distant organ,
it must resist cell death pathways while accomplishing
several distinct biological steps, including intravasation,
adhesion, extravasation, angiogenesis, and growth in a
foreign tissue (Chambers et al. 2002). Presumably, there
are molecular mediators of these various processes, and
the seed and soil hypothesis would suggest that at least
some of these mediators are tissue-specific (Fidler 2003).
Because of these complexities, metastasis is considered
an inefficient process. In fact, it is postulated that malignant tumors release many thousands of cells into circulation daily, yet several orders of magnitude fewer metastases are ever observed in patients. Mouse models of
experimental metastasis also recapitulate this phenomenon, in that only a small fraction of cancer cells are able to
generate macroscopic metastases. Cancer biologists have
harnessed this highly selective feature of metastasis to
discover genes that are specifically enriched (or depleted)
in those tumorigenic cells that give rise to distant metastases in animal models.
Cold Spring Harbor Symposia on Quantitative Biology, Volume LXX. © 2005 Cold Spring Harbor Laboratory Press 0-87969-773-3.
Figure 1. Steps in tumor progression and metastatic dissemination and growth. A schematic depicting the various pathologically defined stages of tumor progression, as well as various functions associated with metastatic spread. The functions related to dissemination are suspected to be more general mediators of metastasis. Subsequent functions required for metastatic colonization (in purple box) may be unique to the microenvironments of the different metastatic sites.
How the multiple functions required for metastasis are
selected for in the process of tumor progression is not
well understood, and is currently a subject for debate
(Bernards and Weinberg 2002; Hynes 2003). Generally,
it is agreed that the genomic and epigenomic instabilities
known to exist in cancerous cells can spawn massive genetic heterogeneity within tumor cell populations. However, whether or not there are specific genetic events selected for during metastatic progression, and where and
when this selection may be taking place, is still highly
contentious. Conventional belief on this issue, largely
from models of experimental metastasis, is that rare variants from primary tumor populations are selected for at
the metastatic site based on an ability to survive and ultimately thrive at the distant site. However, it is unclear why a cell with
such unique abilities would be present with any prevalence in the primary tumor. This uncertainty, coupled with the notion that
metastasis is itself a physically inefficient process, makes
it difficult to explain why metastasis is not rarer
than it actually is.
A breakthrough in this conundrum came when
microarray analysis was conducted on primary tumors
from breast cancer patients. It was discovered by several
laboratories that genes expressed in the bulk primary tumor population were sufficient to predict whether a patient
would develop distant metastatic recurrence (van’t Veer et
al. 2002; Ramaswamy et al. 2003). These surprising findings demanded a reassessment of the prevailing models of
metastasis. Some have interpreted from these results that
primary tumors are, at a very early stage, destined to be either metastatic or non-metastatic, and that no further
meaningful selection is necessary before cells from the primary tumor metastasize to a distant organ. However,
whether any of the genes included in these poor prognosis
signatures mediate metastasis remains an unanswered
question. Additionally, the expression of poor prognosis
genes in primary tumors does not explain the diversity of
organ-specific metastasis patterns exhibited by advanced
breast cancers. It has more recently been appreciated that
poor prognosis signatures derived from different cohorts
of patients by different laboratories yield distinct genes
with very little overlap (van de Vijver et al. 2002; Jenssen
and Hovig 2005; Wang et al. 2005). Thus, the biological
meaning, as well as the clinical utility, of poor prognosis
genes in primary tumors remains to be unraveled.
In recent years, our laboratory has explored mechanisms of metastasis using a heterogeneous breast cancer
cell line derived from the pleural effusion of a patient
with widely metastatic breast cancer (Kang et al. 2003;
Minn et al. 2005a,b). Through various techniques, we
have identified subpopulations of cells within the
parental cell line that exhibit distinct patterns of site-specific metastasis when inoculated into immunocompromised mice. We have demonstrated that these metastatic
phenotypes can be linked to specific patterns of gene expression. By overexpressing candidate metastasis genes
in weakly metastatic cells, or by knocking down their expression in aggressively metastatic cells, we have confirmed that many of these genes are mediators as well
as markers of site-specific metastasis. Finally, we have
applied these site-specific metastasis signatures to a cohort of primary breast cancers with known metastatic outcome, which has yielded significant insight into the biology driving this complex process. We present this body
of work as a new methodological paradigm that couples
experimental models of metastasis with the analysis of
human breast tumors, in an attempt to discover clinically
relevant mechanisms of breast cancer metastasis.
GENETIC HETEROGENEITY DETERMINES
DIFFERENCES IN METASTATIC POTENTIAL
Figure 2. Genetic heterogeneity determines metastatic phenotypes. (A) Multidimensional scaling plot of ~1200 genes that are differentially expressed among SCPs derived from the parental MDA-MB-231 cell line. (B) Metastatic phenotypes after intracardiac and tail vein inoculation of representative SCPs. (C) Expression of a Rosetta-like poor prognosis signature by several SCPs derived from MDA-MB-231 cells. The Pearson correlation coefficient between the different SCPs is invariably greater than 0.95. (D) Orthotopic inoculation of different tumor cell populations and subsequent surgical resection and monitoring for emergent lung metastases. The table shows lung metastatic activity for parental MDA-MB-231 cells, highly bone metastatic 1833 cells, moderately lung metastatic 1834 cells, and aggressively lung metastatic 4175 (LM2) cells.
The MDA-MB-231 cell line is derived from the malignant pleural effusion of a patient with metastatic
breast cancer (Cailleau et al. 1978). We postulated that
this cell line might be composed of cells that are genetically heterogeneous. In fact, one may imagine that malignant cells in the pleural fluid of a patient with
metastatic breast cancer may represent a demographic
cross section of the circulating tumor cells that have been
released from metastases in diverse organs. To test this
hypothesis, we performed limiting dilution cloning of
the parental cell line to derive several distinct populations of single cell-derived progeny, or SCPs. Transcriptomic analysis of these cells using Affymetrix HG-U133A microarrays revealed that over 1200 genes were
differentially expressed among these different populations (Minn et al. 2005b). By representing these gene expression differences in three dimensions using multidimensional scaling, three distinct subgroups of SCPs
were identified (Fig. 2A). These findings confirmed that
cells within the MDA-MB-231 cell line were genetically
heterogeneous and exhibited distinct patterns of gene
expression.
We next sought to determine whether these genetic differences had implications for the metastatic potential of
the different SCPs. Consequently, we xenografted the
SCPs into either the left cardiac ventricle or the lateral tail
vein of immunocompromised nude mice. To facilitate
identification and monitoring of the emergent metastases,
we engineered the SCPs to express a triple modality
imaging vector encoding a fusion protein of thymidine kinase, GFP, and firefly luciferase (Ponomarev et al. 2004),
and performed noninvasive bioluminescence imaging sequentially over time. To our surprise, the SCPs exhibited
a diverse array of phenotypic patterns of metastatic
spread (Fig. 2B). Some of the SCPs were robustly
metastatic to the bones, yet displayed no metastatic activity to the lungs. Alternatively, some SCPs yielded aggressive metastases to the lungs and/or adrenal medulla,
while being only mildly metastatic to the bones. A third
group of SCPs exhibited dormant metastatic behavior,
giving rise only to indolent growths that rarely developed
into overt metastases. Gratifyingly, SCPs that were similar based on the multidimensional scaling plot of differentially expressed genes displayed similar metastatic behaviors. Thus, cells within the pleural effusion-derived
MDA-MB-231 cell line were phenotypically diverse in
their metastatic potential, and these differences correlated
with distinct patterns of gene expression.
ORGAN-SPECIFIC METASTATIC POTENTIAL
IS NOT RELATED TO DIFFERENCES IN A
POOR PROGNOSIS SIGNATURE
By using supervised clustering of a cohort of primary
breast cancers, van’t Veer and colleagues (van’t Veer et
al. 2002) identified a 70-gene poor prognosis signature
that classified breast cancers with a high likelihood of developing distant metastatic recurrence. This signature
was validated on an independent cohort of 295 breast cancers and was shown to be an independent factor predicting patient prognosis (van de Vijver et al. 2002). We
wanted to determine whether genes in this signature correlated with organ-specific patterns of metastasis. To this
end, we confirmed that the parental MDA-MB-231 cell
line expressed a Rosetta-type poor prognosis signature,
containing 54 of the 70 poor prognosis genes identified
by van’t Veer et al. that are represented on the Affymetrix
HG-U133A microarray platform. This subset of genes
was validated by two methods. First, these 54 genes performed nearly as well as the original 70-gene signature in
predicting patient prognosis of the 78-tumor cohort from
which the gene set was derived. Second, Affymetrix
probe sets corresponding to these 54 genes were able to
segregate a subgroup of patients with a worse prognosis
in an independent cohort of primary breast cancers with
at least 5 years of follow-up obtained at our institution.
Parental MDA-MB-231 cells expressed this Rosetta-type
poor prognosis signature in a manner similar to primary
breast cancers that fell into the poor prognosis classification. In contrast, MCF-10A cells (an immortalized nontumorigenic mammary epithelial cell line) did not express the poor prognosis signature.
All of the SCPs derived from MDA-MB-231 cells also
expressed the poor prognosis signature. This was evident
from a hierarchical clustering analysis of the SCPs combined with the MSKCC 82 primary breast cancer cohort.
In addition, there was very little variation in the expression of the poor prognosis genes among the different
SCPs (Fig. 2C, Pearson correlation coefficients greater
than 0.95). Furthermore, none of the poor prognosis
genes correlated with any of the organ-specific metastatic
patterns exhibited by the different SCPs. Thus, although
the poor prognosis genes may indicate whether a primary
tumor is likely to develop distant metastatic recurrence,
expression of these genes does not explain the diversity of
metastatic patterns exhibited by advanced breast cancer
cells. Interestingly, dormant SCPs that rarely gave rise to
overt metastases also uniformly expressed the poor prognosis signature. This indicates that expression of these
genes is not sufficient for metastasis and, consequently,
that additional gene expression events must occur before
cells gain a truly metastatic phenotype.
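The uniformity check described above (pairwise Pearson correlation of signature expression across SCPs) can be sketched as follows. This is an illustration only: the SCP names and expression values are invented, and the helper names `pearson` and `min_pairwise_correlation` are hypothetical, not taken from the original analysis.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (>= 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def min_pairwise_correlation(profiles):
    """Smallest Pearson r over all pairs of named expression profiles."""
    names = sorted(profiles)
    return min(
        pearson(profiles[a], profiles[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    )


# Invented log-expression values for five signature genes in three SCPs
# (assumption: illustrative numbers only, not data from the study).
scp_profiles = {
    "SCP_a": [8.1, 5.2, 9.7, 3.3, 6.8],
    "SCP_b": [8.0, 5.4, 9.5, 3.1, 7.0],
    "SCP_c": [8.3, 5.1, 9.9, 3.6, 6.6],
}
lowest_r = min_pairwise_correlation(scp_profiles)
signature_is_uniform = lowest_r > 0.95  # threshold quoted in Fig. 2C
```

Reporting the minimum over all pairs mirrors the statement that the coefficients were "invariably" above 0.95; a mean would hide a single divergent clone.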
GENES THAT MEDIATE SITE-SPECIFIC
METASTASIS
To identify genes that mediate organ-specific metastasis, we utilized in vivo selection of highly metastatic subpopulations from the weakly metastatic parental cell line
by passaging it through immunocompromised mice. For
bone metastasis assays we injected parental MDA-MB-231 cells into the left cardiac ventricle, and for lung
metastasis we introduced weakly metastatic parental or
1834 cells (a bone metastasis isolate from the parental
cell line that did not exhibit any enrichment in metastatic
activity) into the lateral tail vein. By extracting metastatic
lesions and reinoculating them into mice to assay for enrichment in metastatic activity, we were able to isolate
aggressively bone metastatic sublines after one round of
in vivo selection (denoted BM1), and highly lung metastatic populations after two rounds of in vivo selection
(denoted LM2).
Transcriptomic analysis of these different in vivo selected subpopulations and the parental cell line enabled
the elucidation of bone and lung metastasis gene expression signatures. The bone metastasis signature comprised
102 genes, of which 43 were overexpressed and 59 underexpressed in highly bone metastatic populations
(Kang et al. 2003). Similarly, the lung metastasis signature contained 48 genes that were overexpressed and 47
genes that were underexpressed in aggressively lung
metastatic LM2 populations (Minn et al. 2005a). Many of
the genes in these signatures encoded secretory or cell-surface proteins, making them ideal candidates for enabling interactions between the metastatic tumor cells
and their adopted microenvironments (see Fig. 3A,B). Interestingly, only 6 genes were concordantly expressed in
both metastasis signatures.
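One simple way to picture how a signature splits into overexpressed and underexpressed genes is a fold-change filter between a selected population and the parental line. The sketch below makes several assumptions: the 2-fold cutoff, the gene labels, and the expression values are all invented, and the function `split_signature` is hypothetical. The published signatures were derived with replicate arrays and additional criteria that this sketch does not reproduce.

```python
from math import log2


def split_signature(selected, parental, min_fold=2.0):
    """Split shared genes into over- and underexpressed lists by fold change.

    selected, parental -- dicts mapping gene label to a positive expression value
    min_fold           -- assumed cutoff; the source does not state one here
    """
    threshold = log2(min_fold)
    over, under = [], []
    for gene in sorted(selected.keys() & parental.keys()):
        log_ratio = log2(selected[gene] / parental[gene])
        if log_ratio >= threshold:
            over.append(gene)
        elif log_ratio <= -threshold:
            under.append(gene)
    return over, under


# Invented expression values (assumption: placeholders, not study data).
parental_expr = {"gene_A": 100.0, "gene_B": 100.0, "gene_C": 200.0, "gene_D": 100.0}
selected_expr = {"gene_A": 400.0, "gene_B": 120.0, "gene_C": 50.0, "gene_D": 150.0}

overexpressed, underexpressed = split_signature(selected_expr, parental_expr)
```

Working in log2 space makes the cutoff symmetric, so a 4-fold increase and a 4-fold decrease are treated as equally strong changes.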
There are likely to be many genes that facilitate general
metastatic activity, such as those that promote intravasation from the primary tumor into the circulation. However, our experimental approach was designed in principle to seek
not such genes, but rather genes that mediate metastatic events in
the distant organs (see purple box in Fig. 1). First, our
starting cell line was already derived from cells that had
escaped from the patient’s primary tumor and metastasized to the pleural cavity. In addition, our experimental
metastasis assays involved direct inoculation into the circulation, thereby modeling the later steps of metastatic
growth that are more likely to be site-specific. Nonetheless, the existence of distinct site-specific metastasis signatures expressed by subpopulations of cells from the
same parental cell line confirmed the hypothesis that cells
in malignant breast cancer pleural effusions are differentially genetically endowed to metastasize to various organs.
Figure 3. Cooperation between functional mediators of tissue-specific metastasis. (A) Expression patterns of several overexpressed bone metastasis genes. Populations listed include the two replicates of the parental MDA-MB-231 cells, in vivo selected subpopulations, and SCPs. Metastatic activity of the different populations is color-coded. (B) Heatmap showing expression of several lung metastasis genes overexpressed in lung metastatic populations. Lung metastatic SCPs expressed only a subset of lung metastasis genes. (C) Summary of overexpression experiments demonstrating the cooperative action of different bone metastasis mediators. Relative bone metastatic strength was calculated from the time until 50% of the bone metastatic events occurred and the total percentage of mice that developed bone metastasis for each cohort, both of which were normalized to the basal metastatic activity of parental MDA-MB-231 cells. (D) Relative lung metastatic activity after transducing parental cells with various genes and gene combinations. Lung metastatic activity was determined by bioluminescent imaging approximately 7 weeks after xenografting, normalized to parental cells transduced with a vector control.
To distinguish genes that mediate metastasis from
those that simply correlate with and serve as markers for
metastatic potential, we performed functional assays
(Kang et al. 2003). Overexpression of interleukin-11 (IL-11) and osteopontin (OPN) together, but neither alone,
was sufficient to enhance the osteolytic bone metastatic
activity of parental cells (Fig. 3C). Addition of either connective tissue growth factor (CTGF) or chemokine (C-X-C motif) receptor 4 (CXCR4) to these genes increased the
rate and frequency of bone metastasis (Fig. 3C). These
observations supported the notion that metastasis is a
multistep and multifunctional process and, as such, multiple genes may be required to see an enhancement in the
metastatic rate. Based on the previously known biology
of these gene products, we postulated that CXCR4 may
facilitate homing to and survival in the bone microenvironment, that IL-11 and OPN may cooperate in the recruitment and activation of host osteoclasts, and that
CTGF may modify the extracellular environment to facilitate angiogenesis. In this manner, metastasis genes
may fulfill different aspects of the cross talk between the
tumor cells and stromal cells that are necessary to enable
metastatic growth.
Similar functional analysis identified nine lung metastasis genes that cooperated to promote aggressive lung
metastatic growth (Minn et al. 2005a). Overexpression
studies demonstrated that lung metastasis resulted from
the cooperation of several extracellular modifiers, including the extracellular matrix molecule SPARC, the
chemokine CXCL1, the mitogen Epiregulin, the matrix
metalloproteinases MMP1 and MMP2, the cell surface receptors VCAM1 and IL13RA2, as well as intracellular
modulators of gene expression and signaling including
ID1 and COX2/PTGS2 (Fig. 3D). Importantly, overexpression of these genes had no effect on the bone
metastatic activity of the parental cell line. RNAi-mediated knockdown of ID1, VCAM1, and IL13RA2 resulted
in a greater than 10-fold reduction in lung metastatic
growth within 6 weeks after tail vein injection. These
findings confirmed that many of the genes selected for
during in vivo selection of highly lung metastatic subpopulations were mediators of aggressive lung metastasis. Furthermore, these genes cooperated to facilitate
lung-specific functions necessary for aggressive lung
metastatic growth.
SITE-SPECIFIC METASTASIS GENES ARE
EXPRESSED BY A SUBSET OF CELLS IN THE
PARENTAL CELL POPULATION
Although the site-specific metastasis signatures were
generated by transcriptomic analysis of in vivo selected
populations, they were also useful in identifying in vitro
derived SCPs that were more or less metastatic to either
bone or lung. Hierarchical clustering analysis of the SCPs
with the bone metastasis signature identified a subgroup
of SCPs that was genetically similar to the in vivo selected highly bone metastatic populations (Fig. 3A).
Northern blot analysis of 46 SCPs from the parental
MDA-MB-231 cell line for five of the most differentially
expressed bone metastasis genes revealed SCPs that expressed a continuum of these genes, ranging from none of
them to all five of them (Kang et al. 2003). The bone
metastatic activity of these SCPs was directly correlated
with the degree of expression of these genes, providing
further evidence that these genes were mediators of osteolytic bone metastatic growth. By analyzing expression
of bone metastasis genes in SCPs derived from in vivo selected populations, we noticed an approximately 5-fold
enrichment in the proportion of cells expressing several
bone metastasis genes. Thus, bone metastasis genes were
expressed in a minority of cells in the malignant pleural
effusion-derived parental cell line, and in vivo selection
resulted in the enrichment of these preexisting highly
bone metastatic cells.
Similarly, the in vivo selected lung metastasis signature was also able to segregate lung metastatic SCPs from
those that were not metastatic to the lungs (Minn et al.
2005a). When compared to the LM2 lung metastatic populations obtained through two rounds of in vivo selection, the lung metastatic SCPs expressed only a partial
lung metastasis signature (Fig. 3B). In accordance with
this observation, lung metastatic SCPs were approximately 10-fold less metastatic than the in vivo selected
populations upon injection into the lateral tail vein. An
observation of interest was that the genes in the lung
metastasis signature were naturally divided into two categories. We postulated that genes that were expressed by
both the lung metastatic SCPs and the LM2 populations
may facilitate a baseline level of lung metastatic activity,
which we describe as “lung metastagenicity.” Genes that
were expressed exclusively by the most highly metastatic
LM2 populations may confer functions enabling aggressive growth within the lung microenvironment, which we
denote as “lung metastatic virulence.” Lung metastagenicity genes included ID1, COX2, CXCL1, MMP1, and
many others. Examples of lung metastatic virulence
genes were VCAM1, SPARC, IL13RA2, and MMP2.
Genes from both of these categories were shown to be
mediators of lung metastasis in functional assays.
EXPRESSION OF SITE-SPECIFIC METASTASIS
GENES IN PRIMARY BREAST CANCERS
Because of the recent discovery of poor prognosis signatures in primary breast cancers, we wanted to examine
whether our organ-specific metastasis signatures may
also be expressed at this early stage. To ask this question,
we utilized a cohort of 82 primary breast tumors with at
least 3 years of follow-up, and with known organ-specific
metastatic outcome (hereafter referred to as the MSKCC
cohort). Direct hierarchical clustering of all 82 primary
tumors with the bone metastasis signature did not identify
a subgroup of tumors that expressed the signature in a
manner resembling the BM1 populations (Minn et al.
2005b). When the analysis was restricted only to patients
that were known to develop metastatic recurrence, the
bone metastasis signature segregated patients that went
on to develop primarily bone metastasis from those that
developed metastasis to other sites. Thus, the experimentally derived bone metastasis signature was only
marginally expressed by primary breast tumors, and partial expression of this signature could not be used to
prospectively identify patients with an increased likelihood of developing bone metastatic recurrence.
Hierarchical clustering of this same primary tumor cohort with the lung metastasis signature yielded a dramatically different result (Minn et al. 2005a). A group of tumors in a highly reproducible branch of the dendrogram
expressed the lung metastasis genes in a manner resembling the LM2 populations (Fig. 4A). This subset of tumors had molecular features of aggressive disease, including negative estrogen and progesterone receptor
status, expression of a Rosetta-type poor prognosis signature, and a basaloid genotype. Examination of the
metastatic outcome of these patients revealed a high
prevalence of lung metastatic recurrence (Fig. 4A). This
prompted us to examine these data in a statistically rigorous manner. First, univariate analysis of the lung metastasis signature identified expression of 12 genes that significantly correlated with tumors which gave rise to lung
metastases. A classifier was generated by weighting the
expression of lung metastasis genes according to the
aforementioned univariate correlations, which identified
patients that had a significantly higher likelihood of developing distant lung, but not bone, metastatic recurrence
(p = 0.0018 for lung metastasis, and p = 0.31 [NS] for
bone metastasis). In a separate analysis, weighting the expression of the lung metastasis genes according to the
fold change exhibited by LM2 versus parental MDA-MB-231 populations also created a classifier that distinguished a subgroup of patients with a high risk of developing lung metastasis. Thus, an experimentally derived
lung metastasis signature was at least partially expressed
by a subset of primary breast cancers, and these patients
were more likely to develop lung metastases in the course
of their disease.
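The idea of a classifier built by weighting gene expression can be illustrated with a weighted sum per tumor followed by a cutoff. The following is a minimal sketch under stated assumptions: the gene labels, weights, expression values, and the cutoff of 0 are invented, and `signature_score` and `expresses_signature` are hypothetical helpers. The source does not specify the scoring formula or threshold that was actually used.

```python
def signature_score(expression, weights):
    """Weighted sum of expression over the genes that carry a weight.

    expression -- dict of gene label to (assumed standardized) expression
    weights    -- dict of gene label to weight, e.g. a univariate statistic
                  or a log fold change from the experimental model
    """
    missing = [gene for gene in weights if gene not in expression]
    if missing:
        raise KeyError(f"expression missing for: {missing}")
    return sum(weights[gene] * expression[gene] for gene in weights)


def expresses_signature(expression, weights, cutoff=0.0):
    """True if the tumor's score exceeds the (assumed) cutoff."""
    return signature_score(expression, weights) > cutoff


# Invented weights and standardized expression values (not study data).
weights = {"gene_1": 0.8, "gene_2": 0.5, "gene_3": -0.3}
tumor_x = {"gene_1": 1.2, "gene_2": 0.4, "gene_3": -0.5}
tumor_y = {"gene_1": -0.9, "gene_2": 0.1, "gene_3": 0.7}

score_x = signature_score(tumor_x, weights)
score_y = signature_score(tumor_y, weights)
```

The same function covers both weighting schemes mentioned in the text, since only the source of the `weights` dictionary changes (univariate correlation with outcome versus fold change in LM2 relative to parental cells).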
A biologically meaningful lung metastasis signature
should be expressed by different cohorts of primary tumors transcriptomically profiled on different microarray
platforms. We therefore validated our lung metastasis
signature on the cohort of 78 primary breast cancers utilized by van’t Veer et al. using dual color Agilent cDNA
microarrays (Minn et al. 2005a). Hierarchical clustering
revealed a group of tumors that coexpressed the lung
metastasis signature in a manner resembling the LM2 in
vivo selected populations. Of note, the 9 functionally validated lung metastasis genes were remarkably overexpressed in this group of tumors. Although organ-specific
metastatic outcome is not publicly available for this cohort, time until distant metastatic recurrence and overall
survival of these patients are known. The majority of primary breast cancers that expressed the lung metastasis
signature in this cohort went on to develop distant
metastatic recurrence and had a poor overall survival. We
hypothesize that many of these patients may have suffered from lung metastasis.
Figure 4. Expression of lung metastasis genes by primary breast cancers. (A) Unsupervised hierarchical clustering of the MSKCC cohort (82 tumors) with the 12 most univariately significant lung metastasis genes (correlation with lung metastatic outcome, p<0.05). Also shown are a Rosetta-like poor prognosis signature, ER, PR, and Her2 expression, as well as keratin markers of basal and luminal subtypes of breast cancer. Patients identified with a red dot developed metastases to sites other than the lung, whereas patients labeled with a black dot suffered from lung metastases. Highlighted in a light blue background is a robust branch of the dendrogram containing tumors that expressed the lung metastasis genes in a manner resembling the high lung metastatic populations derived from MDA-MB-231 cells. This group of tumors was enriched for those with a high likelihood of developing lung metastatic recurrence. (B) Lung (top) and bone (bottom) metastasis-free survival curves for patients that expressed the lung metastasis signature (red) and those that did not express the lung metastasis signature.
Interestingly, not all of the genes in the experimentally
derived lung metastasis signature were informative in
predicting primary tumors that went on to develop lung
metastasis. For this reason, training the signature on a primary tumor cohort to weight genes according to their robustness of expression among tumors that coexpress lung
metastasis genes may engender a more accurate algorithm that is better suited to classify primary breast cancers. Because clinical outcome is not needed for algorithm training, the cohort studied by van’t Veer et al. was
used to generate a classifier that accurately segregated tumors which most resembled the LM2 populations (Minn
et al. 2005a). Six of the nine functionally validated genes
were among the most heavily weighted genes in this classifier. The remaining three genes were SPARC, MMP2,
and IL13RA2, which were all lung metastasis virulence
genes in MDA-MB-231 cells. This trained classifier was
the most accurate predictor of lung metastatic recurrence
in the MSKCC cohort (Fig. 4B), providing further evidence that the lung metastasis signature is biologically
relevant, reproducible, and informative in identifying
breast cancer patients with a high likelihood of developing lung metastatic recurrence.
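Metastasis-free survival curves such as those in Fig. 4B are commonly estimated with the Kaplan-Meier product-limit method. The source does not state which estimator was used, so the sketch below is an assumption offered for illustration, with invented follow-up times and a hypothetical function name `kaplan_meier`.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of event-free survival.

    times  -- follow-up time for each patient
    events -- True if the event (e.g. lung metastasis) occurred at that time,
              False if the patient was censored at that time
    Returns a list of (time, survival_probability) at each event time.
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        n_events = sum(1 for ti, e in zip(times, events) if ti == t and e)
        n_leaving = sum(1 for ti in times if ti == t)
        if n_events:
            # Multiply by the fraction of at-risk patients who stay event-free.
            survival *= 1.0 - n_events / at_risk
            curve.append((t, survival))
        # Patients with an event or censored at t leave the risk set.
        at_risk -= n_leaving
    return curve


# Invented follow-up data in years (assumption: not study data).
follow_up = [1, 2, 2, 3, 4]
had_event = [True, True, False, False, True]
curve = kaplan_meier(follow_up, had_event)
```

Censored patients reduce the number at risk without lowering the curve, which is why the estimate can use patients with incomplete follow-up. Computing one curve for tumors that express the signature and one for those that do not gives the two lines compared in each panel.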
A SUBSET OF LUNG METASTASIS GENES, BUT NOT BONE METASTASIS GENES, FACILITATE PRIMARY TUMORIGENICITY
The selective pressure that encourages expression of
metastasis genes in a subset of primary tumors is not apparent. One possibility is that some metastasis genes may
facilitate growth of the primary tumor, thereby increasing
the representation of cells expressing these genes in the
cancerous population. This is supported by the clinical
observation that larger breast cancers tend to have a
poorer prognosis, with an increased likelihood of
metastatic recurrence. According to this postulate, lung
metastasis genes that were informatively expressed in
primary tumors might encourage primary tumor growth,
whereas lung metastasis genes that were not predictive in
primary tumors, and all bone metastasis genes, might
have no effect on primary tumorigenicity.
Figure 5. Lung metastagenicity and primary tumorigenicity. (A) Relative tumor growth rates of different MDA-MB-231 populations
after orthotopic implantation into the mouse mammary fat pad. (B) Tumor growth rates after reducing the expression of various lung
metastasis genes in the 4175 (LM2) population. (C) Model depicting different subsets of lung metastasis genes and the roles they play
in primary tumorigenicity, lung metastagenicity, and lung metastatic virulence. Bone metastasis genes did not significantly affect primary tumor growth, and as such only have a role in bone metastatic growth.
To test this hypothesis, we established an orthotopic
model that mimicked breast cancer progression in patients (Fig. 2D). Injection of MDA-MB-231 populations
in the mouse mammary fat pad gave rise to primary tumors. When these tumors were surgically resected, lung
metastases could be detected only in mice injected with
cells that were selected for aggressive lung metastatic activity (Fig. 2D). We also noticed a difference in primary
tumor growth rate upon orthotopic implantation of these
populations. Although tumors formed by parental and
bone metastatic populations grew at a similar rate in the
mammary fat pad, the tumors formed by lung metastatic
cell populations grew more rapidly in a manner proportional to their lung metastatic aggressiveness (Fig. 5A).
This increase in primary tumor growth rate was not due to
differences in the proliferative rate of the cells, because
immunohistochemical analysis of cell division did not reveal significant disparities. Consequently, genes in the
lung metastasis signature facilitated primary tumorigenicity, whereas bone metastasis genes had no effect on
primary tumor growth rate.
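Comparing growth rates between populations requires reducing each series of tumor measurements to a single number. One common way to do this, sketched below, is to assume roughly exponential growth and fit a straight line to the logarithm of tumor volume over time. The exponential model, the function names, and the measurements are assumptions made for illustration and do not describe how the rates in Fig. 5A were computed.

```python
import math


def exponential_growth_rate(days, volumes):
    """Least-squares slope of log(volume) against time (per day)."""
    logs = [math.log(v) for v in volumes]
    n = len(days)
    mean_t = sum(days) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in days)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(days, logs))
    return sxy / sxx


def doubling_time(rate):
    """Days needed for the volume to double at a given growth rate."""
    return math.log(2) / rate


# Toy, made-up tumor volumes (mm^3) that double every 7 days.
rate = exponential_growth_rate([0, 7, 14, 21], [50, 100, 200, 400])
```

For these toy values the fitted rate corresponds to a doubling time of 7 days. Rates from different populations can then be expressed relative to the parental line.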
To identify which of the lung metastasis genes were
also facilitating primary tumorigenicity, we performed
orthotopic injections of LM2 cells expressing RNAi vectors targeting SPARC, IL13RA2, VCAM1, or ID1.
Whereas targeted inhibition of each of these genes
reduced lung metastasis, both in tail vein assays and after surgical resection of primary mammary fat
pad tumors, only ID1 had a role in promoting primary tumorigenicity (Fig. 5B). Interestingly, ID1 was the only
lung metastagenicity gene among the four genes tested,
and was among the 18 most heavily weighted genes in the
lung metastasis classifier of primary breast cancers. Thus,
lung metastasis genes can be divided into two categories:
those that facilitate both primary tumor formation and
basal lung metastatic activity, and those that enable aggressive growth in the lung microenvironment without
affecting growth in the primary tumor (Fig. 5C).
LESSONS LEARNED
By xenografting a malignant pleural effusion-derived
breast cancer cell line into immunocompromised mice,
we have learned several valuable lessons regarding principles governing metastasis. First, metastatic cells from
the same primary tumor exhibit diverse metastatic
tropisms, reflecting the genetic heterogeneity that pervades metastatic breast cancer populations. Second,
genes that mediate metastasis to lung and bone are distinct, and encode many cell surface and secreted proteins
that enable cross talk between the metastatic tumor cells
and these two divergent organ microenvironments. Together, these gene products act in concert to facilitate the
multifunctional task of metastasis.
Others have recently identified genes whose expression in primary tumors predicts a poor prognosis for breast cancer patients. The organ-specific metastasis genes that we have identified are
distinct from these poor prognosis signatures. In fact,
analysis of SCPs from MDA-MB-231 cells revealed that
expression of these poor prognosis genes is not sufficient
to enable metastasis. Rather, malignant breast cancer
cells become overtly metastatic only when organ-specific
metastasis signatures are layered upon this predisposing
genetic platform.
Analysis of organ-specific metastasis signatures in primary breast carcinomas has provided substantial biological insight into the mechanisms driving metastasis.
Although bone metastasis genes were only weakly expressed in primary tumors that later disseminated to the
bones, the lung metastasis signature was significantly expressed by a subset of breast cancers with a high likelihood of metastasizing to the lungs. This difference may
be due to partial similarities between the breast and lung
microenvironments that do not exist between breast and
bone. In fact, cells expressing lung, but not bone, metastasis genes are significantly more tumorigenic when orthotopically inoculated into the mammary fat pad of
immunocompromised mice. By selectively targeting different genetic mediators of lung metastasis, we exposed
two distinct types of lung metastasis genes: one group
that facilitated both primary tumorigenicity and lung
metastatic ability, and another group that mediated aggressive growth exclusively in the lung microenvironment. Because none of the bone metastasis genes enhanced primary tumorigenicity, they are all examples of
the latter category of metastasis genes (Fig. 5C).
FUTURE DIRECTIONS
Microarray analysis has unveiled a new era in the understanding of breast cancer. Examination of global differences in gene expression has confirmed the existence
of distinct molecular subtypes of breast cancer with different clinical tendencies (Perou et al. 2000; Sorlie et al.
2003), a long-suspected postulate of breast cancer pathologists and clinicians. Microarray technology has also enabled the identification of gene expression signatures that
indicate a poor prognosis for breast cancer patients (van
de Vijver et al. 2002; Wang et al. 2005). However, the
clinical utility of this application is hampered by concerns
regarding the reproducibility and robustness of the correlations between these poor prognosis genes and clinical
outcome, especially when tested on different patient populations using different microarray platforms (Jenssen
and Hovig 2005). Indeed, clinical trials are currently ongoing to address some of these issues. Nonetheless, cancer biologists have, as yet, been unable to make a mechanistic link between the expression of these poor prognosis
genes and the eventual clinical outcome that afflicts
breast cancer patients. Consequently, improvements in
the design of microarray-based experiments are necessary before the wealth of information provided by microarray analyses can be harnessed to its full potential.
One way to extract biological meaning from microarrays of clinical samples may be to use an experimentally
tractable model of cancer as a biological filter for the
identification of functionally relevant metastasis genes.
For example, a recent analysis of a “wound signature,”
derived by identifying serum-responsive genes in fibroblasts cultured in vitro, revealed that it classified tumors
into good and poor prognosis categories almost as effectively as the van’t Veer et al. poor prognosis signature
(Chang et al. 2004, 2005). In contrast to the van’t Veer et
al. signature, the wound signature is based on biological
principles and invites further experimental
investigation using both model systems and clinical specimens.
Our organ-specific metastasis signatures adopted the
paradigm of functional derivation in a model system and
subsequent validation using clinical samples. For this reason, we anticipate that the clinical correlations of the lung
metastasis genes with lung metastatic outcome may be
more reproducible across patient cohorts. Indeed, we
have observed a subgroup of breast cancers expressing
the lung metastasis signature in several other publicly
available gene expression data sets derived from different
microarray platforms. Whether these patients were also
predisposed to developing lung metastasis is currently
being investigated. In addition, we are developing quantitative RT-PCR methodology for the analysis of paraffin-extracted RNA (Cronin et al. 2004; Paik et al. 2004),
which would enhance the validation potential of this signature as a clinically useful prognostic assay. It is our
hope that because this signature is based on several experimentally validated mediators of lung metastasis, it
may be more reproducible than other correlation-based
gene expression signatures.
Our experimental approach could be applied to identify
genes mediating metastasis to other clinically important
sites such as the brain or the liver, or metastasis by other
types of primary tumors. Furthermore, organ-specific
metastasis signatures may have implications for cancer
management and therapy. The prospective identification
of subgroups of patients coexpressing sets of genes that
collectively facilitate metastasis may aid in the selection
of appropriate patient populations for the administration
of novel or preexisting metastasis therapies. For example,
patients with primary tumors that express the lung metastasis signature may be monitored more frequently by thoracic computed tomography (CT) scans, and treated more
aggressively with conventional chemotherapies. Additionally, experimental drugs inhibiting specific biological
pathways may be tested in a combinatorial fashion selectively in this high-risk group of patients. It is not difficult
to imagine that targeted therapies that proved unsuccessful as single agents may be efficacious when rationally combined with other drugs. A thorough understanding of the biological mechanisms by which metastasis
genes facilitate tumorigenicity and/or metastasis should
drive the rational design of clinical trials to test new therapeutic strategies aimed at selectively targeting metastases, while being nontoxic to the patient.
Metastatic cancer remains, for the most part, a complex
and incurable disease. In recent years, cancer biologists
have accumulated a daunting array of technologies that
facilitate the modeling, mechanistic dissection, and gene
discovery of cancer progression. The next era in the fight
against cancer has already begun, and it is founded upon
the interdisciplinary efforts of engineers, biologists, biostatisticians, computer scientists, and clinicians. We are
optimistic that this interdisciplinary communication will
make the coming years among the most exciting in the
war against cancer.
REFERENCES
Bernards R. and Weinberg R.A. 2002. A progression puzzle. Nature 418: 823.
Cailleau R., Olive M., and Cruciger Q.V. 1978. Long-term human breast carcinoma cell lines of metastatic origin: Preliminary characterization. In Vitro 14: 911.
Chambers A.F., Groom A.C., and MacDonald I.C. 2002. Dissemination and growth of cancer cells in metastatic sites. Nat.
Rev. Cancer 2: 563.
Chang H.Y., Sneddon J.B., Alizadeh A.A., Sood R., West R.B.,
Montgomery K., Chi J.T., van de Rijn M., Botstein D., and
Brown P.O. 2004. Gene expression signature of fibroblast
serum response predicts human cancer progression: Similarities between tumors and wounds. PLoS Biol. 2: E7.
Chang H.Y., Nuyten D.S., Sneddon J.B., Hastie T., Tibshirani
R., Sorlie T., Dai H., He Y.D., van’t Veer L.J., Bartelink H.,
van de Rijn M., Brown P.O., and van de Vijver M.J. 2005. Robustness, scalability, and integration of a wound-response
gene expression signature in predicting breast cancer survival.
Proc. Natl. Acad. Sci. 102: 3738.
Cronin M., Pho M., Dutta D., Stephans J.C., Shak S., Kiefer
M.C., Esteban J.M., and Baker J.B. 2004. Measurement of
gene expression in archival paraffin-embedded tissues: Development and performance of a 92-gene reverse transcriptase-polymerase chain reaction assay. Am. J. Pathol. 164: 35.
Fidler I.J. 2003. The pathogenesis of cancer metastasis: The
‘seed and soil’ hypothesis revisited. Nat. Rev. Cancer 3: 453.
Hynes R.O. 2003. Metastatic potential: Generic predisposition
of the primary tumor or rare, metastatic variants-or both? Cell
113: 821.
Jenssen T.K. and Hovig E. 2005. Gene-expression profiling in
breast cancer. Lancet 365: 634.
Kang Y., Siegel P.M., Shu W., Drobnjak M., Kakonen S.M.,
Cordon-Cardo C., Guise T.A., and Massagué J. 2003. A
multigenic program mediating breast cancer metastasis to
bone. Cancer Cell 3: 537.
Minn A.J., Gupta G.P., Siegel P.M., Bos P.D., Shu W., Giri
D.D., Viale A., Olshen A.B., Gerald W.L., and Massagué J.
2005a. Genes that mediate breast cancer metastasis to lung.
Nature 436: 518.
Minn A.J., Kang Y., Serganova I., Gupta G.P., Giri D.D.,
Doubrovin M., Ponomarev V., Gerald W.L., Blasberg R., and
Massagué J. 2005b. Distinct organ-specific metastatic potential of individual breast cancer cells and primary tumors. J.
Clin. Invest. 115: 44.
Paik S., Shak S., Tang G., Kim C., Baker J., Cronin M., Baehner
F.L., Walker M.G., Watson D., Park T., Hiller W., Fisher
E.R., Wickerham D.L., Bryant J., and Wolmark N. 2004. A
multigene assay to predict recurrence of tamoxifen-treated,
node-negative breast cancer. N. Engl. J. Med. 351: 2817.
Perou C.M., Sorlie T., Eisen M.B., van de Rijn M., Jeffrey S.S.,
Rees C.A., Pollack J.R., Ross D.T., Johnsen H., Akslen L.A.,
Fluge O., Pergamenschikov A., Williams C., Zhu S.X., Lonning P.E., Borresen-Dale A.L., Brown P.O., and Botstein D.
2000. Molecular portraits of human breast tumours. Nature
406: 747.
Ponomarev V., Doubrovin M., Serganova I., Vider J., Shavrin
A., Beresten T., Ivanova A., Ageyeva L., Tourkova V., Balatoni J., Bornmann W., Blasberg R., and Gelovani Tjuvajev J.
2004. A novel triple-modality reporter gene for whole-body
fluorescent, bioluminescent, and nuclear noninvasive imaging. Eur. J. Nucl. Med. Mol. Imaging 31: 740.
Ramaswamy S., Ross K.N., Lander E.S., and Golub T.R. 2003.
A molecular signature of metastasis in primary solid tumors.
Nat. Genet. 33: 49.
Sorlie T., Tibshirani R., Parker J., Hastie T., Marron J.S., Nobel
A., Deng S., Johnsen H., Pesich R., Geisler S., Demeter J.,
Perou C.M., Lonning P.E., Brown P.O., Borresen-Dale A.L.,
and Botstein D. 2003. Repeated observation of breast tumor
subtypes in independent gene expression data sets. Proc. Natl.
Acad. Sci. 100: 8418.
van de Vijver M.J., He Y.D., van’t Veer L.J., Dai H., Hart A.A.,
Voskuil D.W., Schreiber G.J., Peterse J.L., Roberts C., Marton M.J., Parrish M., Atsma D., Witteveen A., Glas A., Delahaye L., van der Velde T., Bartelink H., Rodenhuis S., Rutgers E.T., Friend S.H., and Bernards R. 2002. A
gene-expression signature as a predictor of survival in breast
cancer. N. Engl. J. Med. 347: 1999.
van’t Veer L.J., Dai H., van de Vijver M.J., He Y.D., Hart A.A.,
Mao M., Peterse H.L., van der Kooy K., Marton M.J., Witteveen A.T., Schreiber G.J., Kerkhoven R.M., Roberts C.,
Linsley P.S., Bernards R., and Friend S.H. 2002. Gene expression profiling predicts clinical outcome of breast cancer.
Nature 415: 530.
Wang Y., Klijn J.G., Zhang Y., Sieuwerts A.M., Look M.P.,
Yang F., Talantov D., Timmermans M., Meijer-van Gelder
M.E., Yu J., Jatkoe T., Berns E.M., Atkins D., and Foekens
J.A. 2005. Gene-expression profiles to predict distant metastasis of lymph-node-negative primary breast cancer. Lancet
365: 671.