Global Regulatory DNA Potentiation by SMARCA4
Correspondence
[email protected] (A.P.W.F.),
[email protected] (J.A.S.)
In Brief
Chromatin remodeling complexes
regulate developmental transitions and
are increasingly implicated in the
formation of cancer. Here, Lazar et al.
uncover how multiple aspects of
chromatin state and organization can
filter the global effects of the key
remodeling factor SMARCA4 toward
specific changes in expression of key
developmental regulators.
Highlights
• Reactivation of SMARCA4 globally potentiates regulatory DNA accessibility
Article
Global Regulatory DNA Potentiation by SMARCA4
Propagates to Selective Gene Expression Programs
via Domain-Level Remodeling
John E. Lazar,1,2,3 Sandra Stehling-Sun,2,3 Vivek Nandakumar,2 Hao Wang,2 Daniel R. Chee,1,2 Nicholas P. Howard,2
Reyes Acosta,2 Douglass Dunn,2 Morgan Diegel,2 Fidencio Neri,2 Andres Castillo,2 Sean Ibarrientos,2 Kristen Lee,2
Ninnia Lescano,2 Ben Van Biber,2 Jemma Nelson,2 Jessica Halow,2 Richard Sandstrom,2 Daniel Bates,2 Fyodor D. Urnov,2
Alister P.W. Funnell,2,4,* and John A. Stamatoyannopoulos1,2,4,5,*
1Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
2Altius Institute for Biomedical Sciences, Seattle, WA 98121, USA
3These authors contributed equally
4Senior author
5Lead Contact
SUMMARY
The human genome encodes millions of regulatory elements, of which only a small fraction are active within a given cell type. Little is known about the global impact of chromatin remodelers on regulatory DNA landscapes and how this translates to gene expression. We use precision genome engineering to reawaken homozygously inactivated SMARCA4, a central ATPase of the human SWI/SNF chromatin remodeling complex, in lung adenocarcinoma cells. Here, we combine DNase I hypersensitivity, histone modification, and transcriptional profiling to show that SMARCA4 dramatically increases both the number and magnitude of accessible chromatin sites genome-wide, chiefly by unmasking sites of low regulatory factor occupancy. By contrast, transcriptional changes are concentrated within well-demarcated remodeling domains wherein expression of specific genes is gated by both distal element activation and promoter chromatin configuration. Our results provide a perspective on how global chromatin remodeling activity is translated to gene expression via regulatory DNA.
INTRODUCTION

Eukaryotic gene regulation involves the integrated action of sequence-specific transcription factors (TFs) and chromatin-modifying complexes to reorganize nucleosome-bound DNA into an active regulatory template (Kadonaga, 1998). While the role of localized binding of sequence-specific factors has been intensively studied, less is known about the genome-wide interplay between regulatory DNA actuation and specific chromatin remodelers.

The mammalian SWI/SNF (mSWI/SNF) family is a set of closely related chromatin-remodeling complexes, which interact with nucleosomes and other chromatin-modifying factors to activate promoter and enhancer elements in diverse regulatory contexts (Bao et al., 2015; Bossen et al., 2015; Hodges et al., 2018; John et al., 2008). These complexes include canonical BAF (BRG-/BRM-associated factor), ncBAF (non-canonical BAF), and PBAF (Polybromo-associated BAF) (Hodges et al., 2016; Mashtalir et al., 2018), hereinafter collectively referred to as “BAF complexes.”

BAF subunits are frequently mutated in certain cancers and such mutations are thought to be driver events (Kadoch et al., 2013); yet how defective chromatin remodeling, per se, contributes to oncogenesis in these cases is unclear. BAF complexes regulate diverse sets of target genes across different developmental stages and cancer types; as such, modifying their activity can have heterogeneous effects (Hodges et al., 2016). BAF subunits commonly exhibit tumor suppressor function, yet these complexes also activate oncogenic regulatory programs in certain contexts; indeed, the same BAF subunit may act as a tumor suppressor or oncogene at different stages of cancer progression (Glaros et al., 2008; Roy et al., 2015; Sun et al., 2017). Such heterogeneous behavior has been linked to distinct sets of target genes in different cellular contexts. Therefore, a better understanding of how BAF target genes are specified is necessary to interpret their diverse behavior.

Target specificity of BAF complexes is caused, in part, by selective recruitment to regulatory elements either through cell-type-specific subunits (Ho and Crabtree, 2010; Kadoch and Crabtree, 2013) or TFs (Boulay et al., 2017; Kadam et al., 2000; Vierbuchen et al., 2017). However, in eukaryotes, the relationship between recruitment of regulatory factors and target gene expression is not straightforward. In humans, genome-wide assays identify thousands of binding sites for most TFs and
Cell Reports 31, 107676, May 26, 2020 © 2020 The Authors.
This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
chromatin remodelers (Dunham et al., 2012; Gerstein et al., 2012), yet these factors affect the expression of more restricted sets of target genes (Lin et al., 2009; Ramagopalan et al., 2010; Reddy et al., 2009). This phenomenon similarly holds for BAF complexes (Hu et al., 2011; McBride et al., 2018; Raab et al., 2015).

Various models might explain the discrepancy between factor recruitment and effects on expression, including small effects for low-affinity binding (Fisher et al., 2012; Tanay, 2006), synergistic relationships between clustered binding sites (Courey et al., 1989; Li et al., 2002; Parker et al., 2013; Whyte et al., 2013), and enhancer-promoter specificity (Butler and Kadonaga, 2001; Li and Noll, 1994; Zabidi et al., 2015). Experiments using inducible TFs, such as the glucocorticoid and hormone receptors, suggest that binding site strength (Vockley et al., 2016); clustered, codependent TF binding sites (Hakim et al., 2011; Reddy et al., 2009; Vockley et al., 2016); and activation of chromatin domains (Le Dily et al., 2014, 2019) all play a role in defining the link between factors’ binding sites and gene regulation.

To investigate the relationship between the recruitment of a chromatin remodeler, resulting changes to chromatin state, and regulation of specific expression programs, we reactivated SMARCA4 (BRG1) expression in pulmonary adenocarcinoma (A549) cells harboring a well-characterized, homozygous SMARCA4 null mutation (Medina et al., 2008). SMARCA4, one of two mutually exclusive ATPase components of BAF, is commonly mutated in cancers including 5%–10% of lung adenocarcinoma samples (Medina et al., 2008; Collisson et al., 2014; Rodriguez-Nieto et al., 2011), and its loss causes changes in cellular morphology and increased tumorigenicity in lung adenocarcinoma models (Medina et al., 2005; Orvis et al., 2014). Here, we show that SMARCA4 rescue causes a global increase in chromatin accessibility but highly specific alterations to gene expression. The association between distal element activation and expression is influenced by regional chromatin architecture and the promoter state of the gene, demonstrating how a gene’s genomic context can lead to heterogeneous interpretations of widespread chromatin reorganization.

RESULTS

Efficient Repair of the SMARCA4 Locus and Restoration of Expression
SMARCA4 is frequently mutated in non-small-cell lung cancers (Medina et al., 2008; Collisson et al., 2014; Rodriguez-Nieto et al., 2011); thus, we opted to use the A549 lung adenocarcinoma line as a physiologically pertinent model system to study SMARCA4 function. A549 cells are homozygous for a 23-bp deletion in exon 15 of SMARCA4, which introduces a premature stop codon (Medina et al., 2008) (Figure S1A). Since developmental roles of SMARCA4 are sensitive to gene dosage (Bultman et al., 2000), non-physiological levels of expression, whether introduced through transient transfection or stable insertion at a non-native locus, may not fully recapitulate key regulatory interactions. We, therefore, sought to repair the deletion to express SMARCA4 in its native regulatory context.

We generated a panel of Transcription Activator-Like Effector Nucleases (TALENs) targeted near the mutation and screened for their capacity to scarlessly correct SMARCA4 by homology-directed repair (HDR) (Figure 1A; Figures S1A and S1B). We used the lead TALENs to derive multiple clonal lines with homozygous correction of SMARCA4 (SMARCA4+/+) and arbitrarily selected three for phenotypic and regulatory profiling. For matched controls, we selected two lines that were subject to the same transfection and sorting process but which remained unedited at SMARCA4 (SMARCA4−/−). Genotypes were confirmed by Sanger sequencing (Figure S1C).

We next confirmed that SMARCA4 expression was restored in the three genetically corrected clones. In each case, the repair caused marked upregulation of full-length SMARCA4 mRNA expression to levels comparable to those of normal lung tissue (Figures 1B and 1C). SMARCA4 protein is also expressed in these clones (Figure 1D) and exhibits correct nuclear localization by immunofluorescence (IF) (Figures 1E and 1F). Phenotypically, SMARCA4+/+ clones have a decreased growth rate (Figure 1G), in line with SMARCA4’s putative role as a tumor suppressor in lung adenocarcinoma and the involvement of BAF complexes in cell-cycle control (Nagl et al., 2005; Zhang et al., 2000).

SMARCA4 Reactivation Causes a Widespread Increase in Chromatin Accessibility at Pre-marked Sites
To profile the effects of SMARCA4 reactivation on chromatin, we assayed the SMARCA4+/+ and SMARCA4−/− clones for: chromatin accessibility, by DNase I sequencing (DNase I-seq); the histone marks H3K4me1, H3K4me2, and H3K27me3 by CUT&RUN (Skene and Henikoff, 2017); and the genomic occupancy of SMARCA4 and SMARCA2, the two interchangeable ATPase subunits of the BAF complex, by CUT&RUN (Figures 1A and 2A, Tables S1–S3). Data from different clones are highly reproducible, and there is a clear separation in all data types between SMARCA4−/− and SMARCA4+/+ samples (Figure S1D).

Since chromatin accessibility provides a focal measurement of multiple classes of regulatory activity (Gross and Garrard, 1988), we centered our analysis of regulatory changes around DNase I-hypersensitive sites (DHSs). Consistent with BAF’s effects on chromatin in other systems (Bao et al., 2015; Bossen et al., 2015; Hodges et al., 2018; Kelso et al., 2017), we observed a widespread increase in chromatin accessibility in SMARCA4+/+ samples. We identified 32,689 sites with increased accessibility (STAR Methods), hereinafter termed “remodeled DHSs,” compared to only 1,573 sites with decreased accessibility (Figure 2B). Strikingly, 23,861 (73%) of remodeled DHSs are novel DHSs not reproducibly detected in the SMARCA4−/− samples. The remodeled sites are primarily distal (non-promoter) and are enriched for epithelial- and fibroblast-specific DHSs (Thurman et al., 2012) (Figures S2A and S2B), analogous to the dependence of cell-type-specific enhancers on SMARCA4 observed in other systems (Alver et al., 2017; Attanasio et al., 2014).

The remodeled DHSs overlap SMARCA4 binding sites: 60% of remodeled DHSs overlap SMARCA4 CUT&RUN peaks in SMARCA4+/+ clones, and 87% have a SMARCA4 CUT&RUN signal greater than the median for unaffected DHSs (Figure 2C; Figure S2C). Since BAF complexes contain either SMARCA4 or SMARCA2, we analyzed whether the sensitivity of the remodeled DHSs to SMARCA4 activity is due to preferential recruitment of
Figure 1. Restoration of SMARCA4 Expression through Corrective Editing of the Native Locus in A549 Lung Adenocarcinoma Cells
(A) Schematic of SMARCA4 rescue experiments. A549 cells, which harbor a homozygous, disruptive deletion in the SMARCA4 gene, were corrected by TALEN-mediated HDR. Edited pools were single-cell sorted, and clonal lines (SMARCA4+/+ and SMARCA4−/− matched controls) were profiled as indicated.
(B) RNA-seq data demonstrating restoration of SMARCA4 expression in SMARCA4+/+ clones.
(C) Comparison of restored SMARCA4 expression to normal lung tissue. Barplot (mean ± SEM) displays SMARCA4 expression in SMARCA4+/+ and SMARCA4−/− cells compared to normal lung tissue from TCGA (boxplot).
(D) Detection of restored SMARCA4 protein expression by western blot in untransfected (UT), two SMARCA4−/−, and three SMARCA4+/+ clones. K562 extracts are included as a positive control, and GAPDH is included as a loading control.
(E) SMARCA4 (green) and actin (red) IF imaging of SMARCA4−/− and SMARCA4+/+ clones. Scale bar, 5 μm.
(F) Image-based quantitation of SMARCA4 nuclear protein abundance in SMARCA4−/− and SMARCA4+/+ clones. Boxplots show distribution of SMARCA4 IF signal from 500 cells per clone.
(G) Growth curves for SMARCA4+/+ and SMARCA4−/− clones. Data represent mean ± SD of replicates (n = 3 for days 5–6; n = 2 for day 4).
one subunit over the other. For remodeled DHSs bound by either or both ATPases in SMARCA4+/+ samples, the majority are occupied by both (Figure 2C). Quantitatively, the SMARCA4:SMARCA2 ratio at remodeled DHSs that overlap a SMARCA4 peak is only modestly higher than at SMARCA4-bound, unaffected DHSs (Figure S2D). However, while SMARCA2 binds at most remodeled DHSs, this occupancy requires SMARCA4 activity: only 40% of SMARCA4-bound remodeled sites overlap a SMARCA2 peak in SMARCA4−/− cells (compared to 74% for SMARCA4-bound, unaffected sites), and SMARCA2 CUT&RUN signal increases specifically at the remodeled DHSs after SMARCA4 rescue (Figure S2D).

To identify TFs that might recruit BAF complexes to the remodeled DHSs, we performed de novo motif discovery and identified AP-1 (JUN/FOS), RUNX, and SP1 motifs as enriched in the sequences of remodeled DHSs (Figure S2E). AP-1 and SP1 factors are known to recruit BAF complexes (Ito et al., 2001; Kadam et al., 2000), and supporting the motif enrichment, chromatin immunoprecipitation sequencing (ChIP-seq) peaks of AP-1 members JUNB and FOSL2 have the highest overlap with the remodeled DHSs of the 51 TFs assayed by ENCODE in A549 cells (Figure 2D, Table S4) (Dunham et al., 2012; D’Ippolito et al., 2018). For JUNB sites overlapping SMARCA4 peaks, chromatin remodeling occurs preferentially at the subset of sites with low chromatin accessibility in SMARCA4−/− cells. These weakly accessible sites contain low-occupancy AP-1 binding sites as measured by DNase I footprinting, a direct readout of factor occupancy (Figure 2E) (Vierstra and Stamatoyannopoulos, 2016). The sequences of the low-occupancy sites have, on average, weaker motif matches, suggesting that the low occupancy is due, in part, to weaker TF binding affinity at the sites. Upon restoration of SMARCA4, there is an increase in occupancy over the AP-1 motif at these sites to levels observed for the higher-affinity motifs (Figure 2E). The phenomenon is not specific to AP-1, as we observed similar results for sites bound by both SMARCA4 and SP1 (Figure S2F). Taken together, these data suggest that SMARCA4 potentiates TF occupancy at co-bound sites and can markedly increase binding to weaker motifs through chromatin remodeling.

Since SMARCA4 rescue activates low-occupancy sites, we hypothesized that the novel DHSs that appear after SMARCA4 rescue would be pre-marked for regulatory activity in SMARCA4−/− cells. Compared to a background of DHSs active in unrelated cell types, we observed that a higher fraction of novel DHSs overlap TF ChIP-seq peaks for the 51 assayed TFs (Figure S2G). However, 70% of the novel DHSs are not bound by any of the 51 TFs assayed in A549 cells, demonstrating a high degree of co-dependence between TF binding and BAF activity. To extend the analysis, we inspected histone marks around remodeled DHSs. In SMARCA4−/− samples, remodeled DHSs are marked by levels of H3K4me1 similar to those of unaffected DHSs, despite lower chromatin accessibility (Figure 2F). After SMARCA4 rescue, H3K4me1 increases at the remodeled DHSs to levels higher than those at unaffected DHSs (Figure 2F), although the ratio of H3K4me1 to chromatin accessibility decreases toward the genomic average at DHSs (Figure 2G). We observed similar patterns for H3K4me2, but relative to H3K4me1, the H3K4me2 signal tracks closer to the levels expected from a region’s accessibility (Figures 2F and 2G). Therefore, at remodeled sites, SMARCA4 rescue increases chromatin accessibility toward levels predicted by other measures of regulatory activity, and H3K4me1 specifically pre-marks the remodeled DHSs (Figure 2H). These findings are consistent with recent reports highlighting crosstalk between BAF and H3K4me1/MLL complexes: BAF preferentially binds to and remodels H3K4me1-marked nucleosomes (Local et al., 2018; Pan et al., 2019), and chromatin remodeling by BAF allows for recruitment of the H3K4 methyltransferase MLL to enhancers (Pan et al., 2019).

Overall, our results demonstrate that the chromatin accessibility landscape of A549 is highly dependent on SMARCA4 activity: after SMARCA4 rescue, the accessible compartment of the genome expands by 21%, as TF binding and enhancer-associated marks increase at latent, low-occupancy sites pre-marked by H3K4me1.

Promoter State Modifies the Association between Distal Element Activation and Gene Expression
We next compared the chromatin changes to the transcriptional response to understand the relationship between the widespread increase in accessibility at low-affinity binding sites and SMARCA4’s effects on gene expression. The transcriptional effects of SMARCA4 binding appear conditional on chromatin remodeling: the genes closest to remodeled SMARCA4 binding sites are, on average, upregulated (Figure S3A), whereas genes near non-remodeled SMARCA4 sites do not exhibit increased expression, even for CUT&RUN peaks within 1 kb of a gene’s transcription start site (TSS) (Figure 3A).

However, despite the global association between chromatin and expression, the changes in expression following SMARCA4 restoration are highly attenuated compared to the increase in chromatin accessibility (Figure 3B; Figure S3B). We sought to explore this discrepancy by analyzing how local changes in chromatin accessibility associate with gene expression. We linked all DHSs to their nearest TSS and found a dose-dependent
Figure 2. SMARCA4 Reactivation Causes a Widespread Increase in Chromatin Accessibility at Pre-marked Sites
(A) DNase I cleavage, SMARCA2/4 CUT&RUN, and histone modification CUT&RUN read density in SMARCA4−/− and SMARCA4+/+ cells for an example genomic region. Remodeled DHSs are highlighted in red. Gray stripes indicate aligned signal for different assays.
(B) Scatterplot of DNase I cleavages at DHSs in SMARCA4−/− versus SMARCA4+/+ samples. Note the asymmetric increase in accessibility in the SMARCA4+/+ samples.
(C) Relationship between SMARCA2/4 CUT&RUN peaks and SMARCA4 remodeled DHSs. Left: Venn diagram of remodeled DHSs and SMARCA2/4 CUT&RUN peaks in SMARCA4+/+ samples. Right: heatmap of SMARCA4 CUT&RUN signal in SMARCA4+/+ samples (top) and change in DNase I signal (bottom) at the 32,689 remodeled DHSs. Signal displayed is mean of values from three independent clones. DHSs are ordered by SMARCA4 CUT&RUN signal.
(D) Heatmaps display the overlap of remodeled DHSs and SMARCA4 CUT&RUN peaks with ChIP-seq peaks of TFs assayed by ENCODE in A549 cells. All assayed TFs with >500 peaks are included and are ordered by the fraction of peaks overlapping remodeled DHSs.
(E) Top: aggregate cleavage profile (trimmed mean of middle 98% of log2 observed/expected values) around AP-1 motifs in remodeled and unchanging DHSs in SMARCA4−/− and SMARCA4+/+ samples. Motifs were selected from DHSs that overlap both a SMARCA4 CUT&RUN and JUNB ChIP-seq peak and are footprinted in either SMARCA4−/− or SMARCA4+/+ cells. Bottom: DHSs were ordered by motif footprint occupancy in SMARCA4−/− samples. From left to right, heatmaps display the average log fold change in DHS accessibility after SMARCA4 reactivation, DNase I cleavages around AP-1 motif in SMARCA4−/− cells, DNase I cleavages around AP-1 motif in SMARCA4+/+ cells, and AP-1 motif strength. Ordering of DHSs (rows) is consistent across heatmaps. To highlight the trend with footprint occupancy, for heatmaps displaying DHS log fold change and motif p values, DHSs were separated into 25 bins and average values for the bin are indicated.
(F) Relationship between SMARCA4 remodeled DHSs and enhancer-associated histone marks. DNase I cleavage versus H3K4me1 (left) and H3K4me2 (right) CUT&RUN signal is plotted for SMARCA4 remodeled DHSs (red), unaffected DHSs (black), and a background of DHSs accessible in other cell types (tan). Solid lines indicate values in SMARCA4−/− cells and dashed lines indicate values in SMARCA4+/+ cells. Scatterplots display median ± 25th–75th percentile signal for each class of elements.
(G) Boxplots of the ratio of H3K4me1 and H3K4me2 signal to DNase I signal at remodeled DHSs. Ratios are plotted relative to the median histone CUT&RUN:DNase I cleavage ratio in unaffected DHSs.
(H) Model of the relationship between H3K4me1/me2 signal around a DHS and SMARCA4 remodeling.
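
The overlap percentages reported for remodeled DHSs and CUT&RUN peaks (for example, the 60% value in the text and the Venn diagram in C) are interval-overlap fractions. The following is a minimal sketch of such a calculation, assuming half-open (start, end) intervals on a single chromosome and counting any shared base pair as an overlap. The authors' actual criteria are described in their STAR Methods and may differ; the function name and the toy coordinates are hypothetical.

```python
from bisect import bisect_left
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open [start, end) on one chromosome


def fraction_overlapping(sites: List[Interval], peaks: List[Interval]) -> float:
    """Return the fraction of sites sharing at least one base with any peak."""
    if not sites:
        return 0.0
    # Merge peaks so the list is sorted and non-overlapping; this makes the
    # peak end coordinates monotonically increasing for binary search.
    merged: List[Interval] = []
    for start, end in sorted(peaks):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    ends = [end for _, end in merged]
    hits = 0
    for start, end in sites:
        # Index of the first merged peak whose end lies beyond the site start.
        i = bisect_left(ends, start + 1)
        # That peak overlaps the site only if it also starts before the site ends.
        if i < len(merged) and merged[i][0] < end:
            hits += 1
    return hits / len(sites)


# Toy coordinates (illustrative only, not data from the study).
toy_sites = [(100, 200), (300, 400), (500, 600), (700, 800), (900, 1000)]
toy_peaks = [(150, 160), (390, 450), (600, 650)]
toy_fraction = fraction_overlapping(toy_sites, toy_peaks)
```

In this toy example two of the five sites overlap a peak; the peak starting at 600 does not count for the site ending at 600 because the intervals are half-open.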
relationship between the number of activated DHSs near a gene and the average change in expression (Figure 3C). For instance, genes linked to 8 or more SMARCA4-activated DHSs display an average 1.84-fold increase in expression compared to an average 1.07-fold change for genes linked to a single activated DHS.

The higher average increase in expression of genes neighboring larger numbers of remodeled DHSs could be caused by: (1) dispersed control of gene expression, with many DHSs contributing modest effects; and/or (2) focal control, wherein a single DHS affects the gene’s expression, and the higher the number of neighboring remodeled DHSs, the higher the probability of a regulatory impact. Under the focal model, the average increase in expression should be primarily driven by a change in the fraction of genes that are upregulated, whereas in a dispersed model, the average change should be driven more by an increase in the magnitude of upregulation. To distinguish between the two possibilities, we fit a mixture model that separates the fraction of upregulated genes from the magnitude of expression change for upregulated genes. We found that the greater upregulation of genes neighboring a larger number of DHSs appears primarily due to an increase in the fraction of upregulated genes (Figure S3C). This is consistent with a focal model in which the attenuated changes in gene expression arise due to only a subset of remodeled DHSs (approximately 20%) driving changes in gene expression (Figures S3D and S3E). However, we also observed that the fraction of genes upregulated at higher numbers of remodeled DHSs is lower than expected based on a simple model of a constant fraction of DHSs affecting gene expression (Figure S3D). This saturation in the proportion of genes changing expression suggests that locus-wide effects outside of individual DHSs might play a role in whether a gene’s expression responds to distal element activation.

We hypothesized that locus-specific effects might arise, in part, due to genes’ promoter states modifying the effects of distal DHSs. To test the relationship between promoter state and the association between distal element remodeling and gene expression, we first fit a function to weight each distal element to best explain the changes in expression (STAR Methods). We then performed multiple linear regression, regressing genes’ change in expression on the weighted change in chromatin accessibility, promoter state, and interaction terms between the two (STAR Methods). We characterized the promoter state for each gene by CpG island status, CpG methylation, overlap with H3K27me3 peaks, overlap with H3K4me3 peaks, and bivalency (both H3K4me3 and H3K27me3 peaks) in SMARCA4−/− A549s.

We observed strong interactions between the change in distal chromatin accessibility and promoter state (Table S5). In particular, promoter bivalency had positive interaction terms with the change in distal DHSs (Figure 3D), providing further rationale for the observed association between BAF activity and bivalent genes (Nakayama et al., 2017). In contrast, CpG methylation and H3K4me3 status had significant negative interactions with the DHS change (Figures 3D and 3E), with genes whose TSSs were marked solely by H3K4me3, in particular, showing limited response to activation of nearby distal DHSs (Figure 3D). Therefore, while SMARCA4 rescue activates a specific class of genes with repressed or poised promoters marked by H3K27me3, the distal remodeled DHSs appear to have moderated effects on genes with promoters that are already maximally active (only H3K4me3 signal) or stably repressed (high CpG methylation). For example, although the genes GALNT15, DLC1, and LAMC1 all show dramatic increases in distal DHSs near their TSS and in the gene body, they show distinct transcriptional responses to SMARCA4 that are associated with their specific promoter states prior to reactivation (Figure 3G). Importantly, the effect of promoters on gene expression is conditional on distal element activation, as the main effects of promoter states had mostly non-significant relationships with changes in gene expression (Figure 3F; Table S5). The different promoter states are, therefore, not merely marking classes of SMARCA4-sensitive genes but rather appear to alter the response to SMARCA4 reactivation by modifying the effect of distal element activation.

SMARCA4 Rescue Causes Regional Chromatin Remodeling
Since genes located near multiple remodeled DHSs showed the largest response to SMARCA4 rescue, we next investigated whether the high number of remodeled DHSs at certain genes reflected spatial clustering or merely a random distribution. Compared to a block-bootstrapped background, remodeled DHSs show significant spatial clustering (Figure S4A). Therefore, we fit a hidden Markov model (HMM) to identify regions of SMARCA4 sensitivity (Figures 4A and 4B; STAR Methods) and discovered 202 remodeled regions with an average size of 842 kb. These regions contain 7.5% of identified DHSs and 19.5% of the remodeled DHSs.

Figure 3. Promoter State Modifies the Association between Distal Element Activation and Gene Expression
(A) Median change in expression for genes neighboring DHSs with increasing accessibility, SMARCA4 CUT&RUN peaks, or both. Error bars display 25th–75th percentiles.
(B) Scatterplot of expression in SMARCA4−/− versus SMARCA4+/+ clones as measured by RNA-seq. Significantly changing genes are colored, and select up-/downregulated genes are highlighted. Note the attenuated change in gene expression compared to chromatin accessibility (compare to Figure 2A).
(C) Mean change in expression at genes grouped by number of neighboring SMARCA4-responsive DHSs. Dark error bars indicate ±SEM; light error bars indicate 25th–75th percentile.
(D) Genes with bivalent promoters show greater changes in expression (RNA-seq) compared to genes with similar numbers of changing distal DHSs. Plotted is the change in gene expression (mean ± SEM) versus the number of neighboring remodeled DHSs for different gene sets. Gene sets are grouped by ChIP-seq peaks at the gene’s promoter. Inset pie chart indicates the fraction of genes in each set.
(E) Genes with highly methylated promoters show smaller changes in expression (RNA-seq) compared to genes with similar numbers of changing distal DHSs. Plot as in (D), with gene sets grouped by promoter CpG methylation levels.
(F) Promoter effects on gene expression are dependent on distal element changes. Barplot shows the fraction of upregulated genes for genes linked to 0 remodeled DHSs grouped by promoter class. All genes linked to ≥3 remodeled DHSs are included for comparison.
(G) Example change in chromatin accessibility (DNase I cleavage profile, left) and gene expression (swarmplot displaying data from individual samples with mean ± SEM colored, right) for three genes with similar changes in chromatin accessibility, different promoter states, and different changes in expression after SMARCA4 reactivation.
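
As a rough illustration of the "simple model of a constant fraction of DHSs affecting gene expression" mentioned in the Results: if each remodeled DHS linked to a gene independently drives upregulation with probability p, the expected fraction of upregulated genes with n linked DHSs is 1 − (1 − p)^n. The sketch below assumes independence between DHSs and uses p = 0.2, the approximate fraction quoted in the text. The function name and the chosen values of n are illustrative and are not taken from the authors' STAR Methods.

```python
def expected_fraction_upregulated(n_dhs: int, p: float = 0.2) -> float:
    """Expected fraction of upregulated genes under a simple focal model.

    Assumption (hypothetical, for illustration): each of the n_dhs remodeled
    DHSs linked to a gene independently drives upregulation with probability p,
    so a gene is upregulated if at least one of its DHSs is functional.
    """
    if n_dhs < 0:
        raise ValueError("n_dhs must be non-negative")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    return 1.0 - (1.0 - p) ** n_dhs


if __name__ == "__main__":
    # Tabulate the expectation for a few DHS counts.
    for n in (0, 1, 2, 4, 8):
        print(f"{n} remodeled DHSs -> {expected_fraction_upregulated(n):.3f}")
```

Under this constant-fraction assumption the expected upregulated fraction keeps rising with n. The text reports that the observed fraction at high DHS counts falls below such an expectation, which is the saturation attributed to locus-wide effects.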
Within the remodeled domains, in addition to changes at focal sites of accessibility (i.e., remodeled DHSs), we also observed a relative increase in background accessibility to DNase I cleavage outside of DHSs (Figure 4C). This finding demonstrates that the chromatin changes caused by SMARCA4 reactivation are not restricted to individual DHSs but, instead, extend more broadly over the entire remodeled region. Consistent with a model in which SMARCA4 rescue activates the entire chromatin territory, levels of H3K27me3 decline and the enhancer-associated marks H3K4me1 and H3K4me2 increase specifically across the remodeled regions (Figure 4C; Figure S4B).

Previous work has proposed that extended regions of high regulatory activity, termed super-enhancers, regulate cell-type-specific genes and are preferentially sensitive to disruption of regulatory components (Whyte et al., 2013). Therefore, we compared the SMARCA4 remodeled regions to a variation on super-enhancers defined by clustered SMARCA4 binding. The remodeled regions do not correspond to SMARCA4 super-enhancers as identified by the ROSE algorithm (Lovén et al., 2013; Whyte et al., 2013) but, instead, reflect larger regions of chromatin activation (Figure S4C). However, the majority (74%) of these domains contain at least one SMARCA4 super-enhancer (Figure S4C), suggesting that a subset of SMARCA4-binding clusters may act as bona fide locus control regions enabling chromatin activation across a region. The domain-scale chromatin activation may be a key functional consequence of clusters of SMARCA4 binding, as genes closest to the SMARCA4 super-enhancers falling inside remodeled regions show large changes in expression, while the genes closest to the super-enhancers falling outside of remodeled regions show minimal changes in expression (Figure S4C).

Activated Regions Align with Boundaries of Topological Domains
To understand the features demarcating the sharp boundaries of the remodeled domains, we next analyzed the elements at the borders of the activated regions. The boundary elements are enriched for binding sites of CTCF and members of the cohesin complex (SMC3 and RAD21) (Figure 4D). CTCF and cohesin interact to establish three-dimensional chromatin organization, including long-range chromatin interactions and topologically associating domains (TADs) (Ong and Corces, 2014; Rao et al., 2014). We, therefore, compared the regions of SMARCA4 sensitivity to TADs and Hi-C-defined DNA loops in A549 cells. We found a striking alignment between remodeled SMARCA4 regions and both TADs and DNA loops (Figure 4E; Figures S4D–S4I). Combined, these results suggest that, for a subset of loop domains, SMARCA4 reactivation enables chromatin remodeling across the entire domain.

To investigate the chromatin states that specify domain sensi- [...] marks associated with active chromatin, such as H3K4me3, in the SMARCA4−/− state (Figures S4J and S4K).

Despite identifying regional chromatin changes in an unsupervised manner agnostic to 3D chromatin conformation, we find that SMARCA4 reactivation causes not only increases in chromatin accessibility at the nucleosome (i.e., DHSs) level but also regional activation of chromatin at a subset of predefined loop domains.

Regional Chromatin Remodeling Is Associated with Induction of Developmental Regulators
The regions of increased accessibility are enriched for genes with the greatest response to SMARCA4 reactivation. Of the 220 significantly upregulated genes with log2 fold change > 2, 32% are located in the remodeled domains (compared to 12% of all significantly upregulated genes). Highlighting the large effect of regional chromatin remodeling on gene expression, genes found in the activated regions, but without any change in promoter accessibility, have changes in expression similar to those of genes outside these regions but with increased promoter accessibility (Figure 4F). Additionally, genes within the activated regions that also have increased promoter accessibility upon SMARCA4 rescue displayed further, marked upregulation (Figure 4F).

Developmentally regulated genes are proposed to exist in insulated regulatory neighborhoods (Dowen et al., 2014). Consistent with this hypothesis, we observe that genes encoding several lineage-specifying TFs and other key developmental regulators are located in the SMARCA4-sensitive domains. These genes include those encoding the RUNX TFs, the EMT regulator SNAI2, and the homeodomain TF PBX1 (Figures S4D and S4E). Other genes important for epithelial morphogenesis or cell-cell interaction, such as DLC1 and EDIL3, are also found in the activated domains. Supporting the link between domain-level activation and developmental processes more generally, genes annotated with Gene Ontology (GO) terms such as “tissue development,” “cell differentiation,” and “extracellular matrix organization” are enriched in the SMARCA4-sensitive domains (Figure 4G). The enrichment of developmental regulators in epigenetically sensitive domains, therefore, links the largest effects of SMARCA4 on gene expression toward developmental regulatory programs, despite relatively non-specific genome-wide increases in chromatin accessibility.

Chromatin Context of Expression Programs Altered in SMARCA4 Null Patients
We next considered how the relationship between SMARCA4 regulation and chromatin states might affect SMARCA4’s targets in lung adenocarcinoma. We first identified a set of relevant SMARCA4 targets by selecting genes that are downregulated in
tivity to SMARCA4 activation, we compared the genomic fea- The Cancer Genome Atlas (TCGA) lung adenocarcinoma
tures of TADs overlapping regional changes in accessibility to SMARCA4 null samples. The expression of this gene set in
unaffected TADs. While BAF is thought to act, in part, through A549s matches the expression in SMARCA null patient samples
antagonism with polycomb repressive complexes (Kadoch and (Figure 5A; Figure S5A). Rescue of SMARCA4 in A549s partially
Crabtree, 2013; Tamkun et al., 1992), we do not observe a direct reactivates the downregulated TCGA gene set (Figure 5B),
correspondence between H3K27me3 marked domains and consistent with previous work showing partial overlap between
SMARCA4 sensitivity. Instead, sensitive regions overlap with genes affected by SMARCA4 knockdown in lung adenocarci-
bivalent TADs that contain matched levels of H3K27me3 and noma cell lines and genes with decreased expression in
SMARCA4 null patient samples (Orvis et al., 2014). The reactivated genes (i.e., genes with decreased expression in SMARCA4 null patient samples and upregulated after SMARCA4 rescue) are enriched in SMARCA4 remodeled domains, with 26% of the reactivated genes lying within the remodeled regions compared to 12% of all significantly upregulated genes and 2% of all genes.
Since we found that promoter state modifies the effect of distal accessibility changes, we considered whether it might influence which genes are reactivated following SMARCA4 rescue. Reactivated genes have lower promoter methylation than stable genes (Figure 5B), but no significant difference in either H3K27me3 (Figure S5B) or bivalent marks (Fisher exact test, p = 0.50). Promoters of tumor suppressors are frequently methylated in cancers, so the diminished reactivation of genes with highly methylated promoters suggests that accumulated epigenetic changes in cancer cells could alter SMARCA4 targets at different stages of cancer development.
The reactivated genes are strong candidates for mediating SMARCA4's role in lung adenocarcinoma and include genes in pathways involved in lung adenocarcinoma pathogenesis such as WNT (DAAM2, WLS, and TNIK) (Stewart, 2014) and receptor tyrosine kinase signaling (ROS1 and PTPRE) (Collisson et al., 2014) (Figure 5E). Compared to all genes upregulated in the SMARCA4+/+ clones, the 74 reactivated genes are biased for genes whose expression is associated with increased patient survival in wild-type (WT) SMARCA4 samples (χ2 test, p < 0.01; Figure 5C). Reactivated genes are also associated with decreased cancer cell proliferation and invasiveness (Qian et al., 2007; Saintigny et al., 2012; Torrino et al., 2019), most notably DLC1, a known tumor suppressor (Yuan et al., 2004). DLC1 regulates actin organization (Sekimata et al., 1999), and restoration of its expression in cancer cells can induce apoptosis (Zhou et al., 2004), decrease migration and invasion (Goodison et al., 2005), and reduce cell growth (Yuan et al., 2004). DLC1 resides in a SMARCA4-sensitive chromatin domain (Figure 5D) and is the most highly upregulated gene after SMARCA4 rescue (Figures 3B and 3G). Its expression is low in SMARCA4 null patient samples (Figure S5C) and is positively associated with patient survival (Figure 5F). Interestingly, we observed changes in cell morphology after SMARCA4 reactivation (Figure 5G; Figure S5D), in line with previous literature linking mutations in SMARCA4 to changes in cell morphology and cytoskeletal disorganization (Wong et al., 2000). These changes match the description of cytoskeletal changes after ectopic expression of DLC1 in lung adenocarcinoma cells (Yuan et al., 2007).
Across multiple scales, from individual TF binding to chromatin domains, we observe a tight relationship between genes' chromatin states and the transcriptional effects of SMARCA4 binding. This interaction between SMARCA4 and chromatin state adds additional transcriptional specificity to SMARCA4 perturbation beyond recruitment (Figure 6) and likely plays a role in defining the developmental and oncogenic expression
programs regulated by SMARCA4 in lung adenocarcinomas and other systems.

DISCUSSION

Mutations in chromatin remodelers such as SMARCA4 are increasingly implicated in cancer development as well as a range of developmental diseases. A primary challenge in understanding these complexes' roles in human disease is understanding their targets in the relevant cell types and the mechanism by which they affect target gene expression. Here we leveraged a well-profiled model system to disentangle the relationship between SMARCA4's chromatin remodeling activity and its effects on gene expression. Beyond specificity due to SMARCA4 recruitment, we observed that chromatin context plays a key role in shaping transcriptional regulation by SMARCA4 at multiple scales. At the level of regulatory elements, TF binding affinity influences the requirement for SMARCA4 activity to induce nucleosome remodeling. For individual genes, promoter chromatin states modify the effect of distal element activation. At the scale of chromatin domains, domains' sensitivity to remodeling determines the transcriptional effects linked to SMARCA4 binding clusters.
SMARCA4 activation extensively increases chromatin accessibility at cell-type-specific regulatory elements (Bao et al., 2015; Bossen et al., 2015; Hodges et al., 2018; Kelso et al., 2017). SMARCA4's specificity for these elements is only partially explained by targeted recruitment by sequence-specific TFs. Among SMARCA4 binding sites, chromatin remodeling occurs at sites containing low-affinity TF binding sites. Since low-affinity TF binding is believed to be a common feature of developmentally regulated enhancers (Crocker et al., 2015; Farley et al., 2015) and the transcriptional effects of SMARCA4 binding are conditional on chromatin remodeling, the heightened dependence of low-affinity binding sites on SMARCA4 may act to target SMARCA4's effects toward developmental regulatory programs.
The relationship between remodeled regulatory elements and changes in gene expression suggests that a minority of distal regulatory elements with relatively large effect sizes cause changes in gene expression. Which subset of distal elements drives changes in gene expression is dependent not only on features intrinsic to the regulatory element but also on promoter gating, as promoters' chromatin states modify the association between distal element activation and gene expression. Promoter gating of distal elements leads to heterogeneous transcriptional responses to chromatin remodeling and attenuates the effects of the global chromatin remodeling on gene expression. The attenuation of the transcriptional response can, notably, be observed at genes with promoters marked solely by H3K4me3, which show minimal change in expression even when neighboring multiple activated DHSs. We interpret the distinct behavior of H3K4me3-marked promoters to suggest that promoters have an intrinsic maximum level of expression and that additional recruitment of regulatory factors may have minimal effects on expression once that level is reached. For the majority (51%) of protein-coding genes in A549, the most accessible TSS is marked solely by H3K4me3. Further increases in expression of these genes might, therefore, require activation of an alternative promoter. Consistent with this hypothesis, we observe greater chromatin remodeling at alternative TSSs, and genes with annotated alternative promoters are more likely to be upregulated after SMARCA4 reactivation (Figures S3F and S3G). While alternative TSSs introduce isoform diversity, they may also have a function in allowing for graded levels of gene expression through sequential recruitment.
Mutations in chromatin remodelers, including BAF subunits, are frequently linked to dysregulation of developmental expression programs. Our results suggest that these targeted effects may result from the atypical chromatin domain structure of key developmental regulators. Here, in a lung adenocarcinoma model, SMARCA4 rescue upregulates developmental genes annotated with GO terms such as epithelial cell differentiation, regulation of cell morphogenesis, and positive regulation of cell development (Figure S5D). These genes are found in isolated chromatin domains that undergo domain-wide chromatin remodeling after SMARCA4 reactivation. Since many developmental regulators reside in isolated, gene-poor domains, the accessibility of their domains is independent of the regulation of other genes and, therefore, may be preferentially susceptible to a single regulatory perturbation. The regions susceptible to SMARCA4 reactivation contain clustered SMARCA4 binding sites, which may act as locus control regions to control the chromatin state of the entire domain. Notably, genes near clustered binding sites not associated with domain-scale remodeling have minimal change in expression, suggesting that the critical role for clustered binding may be to provide sufficient concentration of regulatory activity to increase chromatin accessibility across an entire domain.
Perturbations to general regulatory factors cause highly specific transcriptional responses, frequently preferentially affecting developmental and oncogenic programs, despite limited targeting specificity. Our results demonstrate how global changes to regulatory activity can be interpreted in highly heterogeneous ways. Locus-specific chromatin features shape the selection of a regulatory factor's target genes and introduce transcriptional specificity to perturbations of the basal regulatory machinery.

STAR★METHODS

Detailed methods are provided in the online version of this paper and include the following:

• KEY RESOURCES TABLE
• RESOURCE AVAILABILITY
  ◦ Lead Contact
  ◦ Materials Availability
  ◦ Data and Code Availability
• EXPERIMENTAL MODEL AND SUBJECT DETAILS
• METHOD DETAILS
  ◦ Synthesis of TALEN constructs
  ◦ Transfections
  ◦ Assessment of HDR and Indel Rates
  ◦ Clonal Isolation
  ◦ Western Blotting
  ◦ Immunofluorescence imaging and analysis
  ◦ DNaseI-Seq
  ◦ RNA-Seq
  ◦ CUT&RUN
  ◦ ENCODE Datasets
  ◦ TCGA Data
• QUANTIFICATION AND STATISTICAL ANALYSIS
  ◦ DNaseI-Seq Data Analysis
  ◦ RNA-Seq Data Analysis
  ◦ CUT&RUN Data Analysis
  ◦ Comparison of DNaseI-seq and RNA-Seq data
  ◦ Regional Changes in Chromatin Accessibility
  ◦ GO Terms

SUPPLEMENTAL INFORMATION

Supplemental Information can be found online at https://doi.org/10.1016/j.celrep.2020.107676.

ACKNOWLEDGMENTS

This work was supported by National Institutes of Health (NIH) grants UM1HG 009444 and U54 HG007010 to J.A.S. and by a charitable financial contribution from GlaxoSmithKline.

AUTHOR CONTRIBUTIONS

J.E.L., A.P.W.F., F.D.U., and J.A.S. conceived the project. J.E.L. conceived and performed the analysis. S.S.-S., N.P.H., R.A., N.L., B.V.B., and A.P.W.F. generated and assayed the SMARCA4+/+ clones. H.W. performed CUT&RUN experiments. V.N. performed IF. D.D., M.D., F.N., A.C., S.I., K.L., J.H., and D.B. performed DNase I- and RNA-seq. S.S.-S., A.P.W.F., D.R.C., J.N., and R.S. performed analysis. J.E.L., A.P.W.F., S.S.-S., V.N., H.W., and J.A.S. wrote the manuscript. A.P.W.F., F.D.U., and J.A.S. supervised the project.

DECLARATION OF INTERESTS

All authors are employees of the not-for-profit Altius Institute for Biomedical Sciences.

Received: July 31, 2019
Revised: December 23, 2019
Accepted: April 30, 2020
Published: May 26, 2020

REFERENCES

Alver, B.H., Kim, K.H., Lu, P., Wang, X., Manchester, H.E., Wang, W., Haswell, J.R., Park, P.J., and Roberts, C.W.M. (2017). The SWI/SNF chromatin remodelling complex is required for maintenance of lineage specific enhancers. Nat. Commun. 8, 14648.

Attanasio, C., Nord, A.S., Zhu, Y., Blow, M.J., Biddie, S.C., Mendenhall, E.M., Dixon, J., Wright, C., Hosseini, R., Akiyama, J.A., et al. (2014). Tissue-specific SMARCA4 binding at active and repressed regulatory elements during embryogenesis. Genome Res. 24, 920–929.

Bao, X., Rubin, A.J., Qu, K., Zhang, J., Giresi, P.G., Chang, H.Y., and Khavari, P.A. (2015). A novel ATAC-seq approach reveals lineage-specific reinforcement of the open chromatin landscape via cooperation between BAF and p63. Genome Biol. 16, 284.

Bossen, C., Murre, C.S., Chang, A.N., Mansson, R., Rodewald, H.-R., and Murre, C. (2015). The chromatin remodeler Brg1 activates enhancer repertoires to establish B cell identity and modulate cell growth. Nat. Immunol. 16, 775–784.

Boulay, G., Sandoval, G.J., Riggi, N., Iyer, S., Buisson, R., Naigles, B., Awad, M.E., Rengarajan, S., Volorio, A., McBride, M.J., et al. (2017). Cancer-Specific Retargeting of BAF Complexes by a Prion-like Domain. Cell 171, 163–178.e19.

Bultman, S., Gebuhr, T., Yee, D., La Mantia, C., Nicholson, J., Gilliam, A., Randazzo, F., Metzger, D., Chambon, P., Crabtree, G., and Magnuson, T. (2000). A Brg1 null mutation in the mouse reveals functional differences among mammalian SWI/SNF complexes. Mol. Cell 6, 1287–1295.

Butler, J.E.F., and Kadonaga, J.T. (2001). Enhancer-promoter specificity mediated by DPE or TATA core promoter motifs. Genes Dev. 15, 2515–2519.

Cermak, T., Doyle, E.L., Christian, M., Wang, L., Zhang, Y., Schmidt, C., Baller, J.A., Somia, N.V., Bogdanove, A.J., and Voytas, D.F. (2011). Efficient design and assembly of custom TALEN and other TAL effector-based constructs for DNA targeting. Nucleic Acids Res. 39, e82.

Collisson, E.A., Campbell, J.D., Brooks, A.N., Berger, A.H., Lee, W., Chmielecki, J., Beer, D.G., Cope, L., Creighton, C.J., et al.; Cancer Genome Atlas Research Network (2014). Comprehensive molecular profiling of lung adenocarcinoma. Nature 511, 543–550.

Courey, A.J., Holtzman, D.A., Jackson, S.P., and Tjian, R. (1989). Synergistic activation by the glutamine-rich domains of human transcription factor Sp1. Cell 59, 827–836.

Crocker, J., Abe, N., Rinaldi, L., McGregor, A.P., Frankel, N., Wang, S., Alsawadi, A., Valenti, P., Plaza, S., Payre, F., et al. (2015). Low affinity binding site clusters confer hox specificity and regulatory robustness. Cell 160, 191–203.

D'Ippolito, A.M., McDowell, I.C., Barrera, A., Hong, L.K., Leichter, S.M., Bartelt, L.C., Vockley, C.M., Majoros, W.H., Safi, A., Song, L., et al. (2018). Pre-established Chromatin Interactions Mediate the Genomic Response to Glucocorticoids. Cell Syst. 7, 146–160.e7.

Dobin, A., Davis, C.A., Schlesinger, F., Drenkow, J., Zaleski, C., Jha, S., Batut, P., Chaisson, M., and Gingeras, T.R. (2013). STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21.

Dowen, J.M., Fan, Z.P., Hnisz, D., Ren, G., Abraham, B.J., Zhang, L.N., Weintraub, A.S., Schujiers, J., Lee, T.I., Zhao, K., and Young, R.A. (2014). Control of cell identity genes occurs in insulated neighborhoods in mammalian chromosomes. Cell 159, 374–387.

Dunham, I., Kundaje, A., Aldred, S.F., Collins, P.J., Davis, C.A., Doyle, F., Epstein, C.B., Frietze, S., Harrow, J., et al.; ENCODE Project Consortium (2012). An integrated encyclopedia of DNA elements in the human genome. Nature 489, 57–74.

Farley, E.K., Olson, K.M., Zhang, W., Brandt, A.J., Rokhsar, D.S., and Levine, M.S. (2015). Suboptimization of developmental enhancers. Science 350, 325–328.

Fisher, W.W., Li, J.J., Hammonds, A.S., Brown, J.B., Pfeiffer, B.D., Weiszmann, R., MacArthur, S., Thomas, S., Stamatoyannopoulos, J.A., Eisen, M.B., et al. (2012). DNA regions bound at low occupancy by transcription factors do not drive patterned reporter gene expression in Drosophila. Proc. Natl. Acad. Sci. USA 109, 21330–21335.

Gerstein, M.B., Kundaje, A., Hariharan, M., Landt, S.G., Yan, K.-K., Cheng, C., Mu, X.J., Khurana, E., Rozowsky, J., Alexander, R., et al. (2012). Architecture of the human regulatory network derived from ENCODE data. Nature 489, 91–100.

Glaros, S., Cirrincione, G.M., Palanca, A., Metzger, D., and Reisman, D. (2008). Targeted Knockout of BRG1 Potentiates Lung Cancer Development. Cancer Res. 68, 3689–3696.

Goodison, S., Yuan, J., Sloan, D., Kim, R., Li, C., Popescu, N.C., and Urquidi, V. (2005). The RhoGAP protein DLC-1 functions as a metastasis suppressor in breast cancer cells. Cancer Res. 65, 6042–6053.

Gross, D.S., and Garrard, W.T. (1988). Nuclease hypersensitive sites in chromatin. Annu. Rev. Biochem. 57, 159–197.

Hakim, O., Sung, M.-H., Voss, T.C., Splinter, E., John, S., Sabo, P.J., Thurman, R.E., Stamatoyannopoulos, J.A., de Laat, W., and Hager, G.L. (2011). Diverse gene reprogramming events occur in the same spatial clusters of distal regulatory elements. Genome Res. 21, 697–706.
tensin binding and Rho-specific GTPase-activating protein activities. Proc. Natl. Acad. Sci. U S A 104, 9012–9017.

Raab, J.R., Resnick, S., and Magnuson, T. (2015). Genome-Wide Transcriptional Regulation Mediated by Biochemically Distinct SWI/SNF Complexes. PLoS Genet. 11, e1005748.

Ramagopalan, S.V., Heger, A., Berlanga, A.J., Maugeri, N.J., Lincoln, M.R., Burrell, A., Handunnetthi, L., Handel, A.E., Disanto, G., Orton, S.-M., et al. (2010). A ChIP-seq defined genome-wide map of vitamin D receptor binding: associations with disease and evolution. Genome Res. 20, 1352–1360.

Rao, S.S.P., Huntley, M.H., Durand, N.C., Stamenova, E.K., Bochkov, I.D., Robinson, J.T., Sanborn, A.L., Machol, I., Omer, A.D., Lander, E.S., and Aiden, E.L. (2014). A 3D map of the human genome at kilobase resolution reveals principles of chromatin looping. Cell 159, 1665–1680.

Reddy, T.E., Pauli, F., Sprouse, R.O., Neff, N.F., Newberry, K.M., Garabedian, M.J., and Myers, R.M. (2009). Genomic determination of the glucocorticoid response reveals unexpected mechanisms of gene regulation. Genome Res. 19, 2163–2171.

Ripley, B.D. (1976). The Second-Order Analysis of Stationary Point Processes. J. Appl. Prob. 13, 255–266.

Rodriguez-Nieto, S., Cañada, A., Pros, E., Pinto, A.I., Torres-Lanzas, J., Lopez-Rios, F., Sanchez-Verde, L., Pisano, D.G., and Sanchez-Cespedes, M. (2011). Massive parallel DNA pyrosequencing analysis of the tumor suppressor BRG1/SMARCA4 in lung primary tumors. Hum. Mutat. 32, E1999–E2017.

Roy, N., Malik, S., Villanueva, K.E., Urano, A., Lu, X., Von Figura, G., Seeley, E.S., Dawson, D.W., Collisson, E.A., and Hebrok, M. (2015). Brg1 promotes both tumor-suppressive and oncogenic activities at distinct stages of pancreatic cancer formation. Genes Dev. 29, 658–671.

Saintigny, P., Peng, S., Zhang, L., Sen, B., Wistuba, I.I., Lippman, S.M., Girard, L., Minna, J.D., Heymach, J.V., and Johnson, F.M. (2012). Global evaluation of Eph receptors and ephrins in lung adenocarcinomas identifies EphA4 as an inhibitor of cell migration and invasion. Mol. Cancer Ther. 11, 2021–2032.

Sakuma, T., Hosoi, S., Woltjen, K., Suzuki, K., Kashiwagi, K., Wada, H., Ochiai, H., Miyamoto, T., Kawai, N., Sasakura, Y., et al. (2013). Efficient TALEN construction and evaluation methods for human cell and animal applications. Genes Cells 18, 315–326.

Sekimata, M., Kabuyama, Y., Emori, Y., and Homma, Y. (1999). Morphological changes and detachment of adherent cells induced by p122, a GTPase-activating protein for Rho. J. Biol. Chem. 274, 17757–17762.

Skene, P.J., and Henikoff, S. (2017). An efficient targeted nuclease strategy for high-resolution mapping of DNA binding sites. eLife 6, e21856.

Skene, P.J., Henikoff, J.G., and Henikoff, S. (2018). Targeted in situ genome-wide profiling with high efficiency for low cell numbers. Nat. Protoc. 13, 1006–1019.

Stewart, D.J. (2014). Wnt Signaling Pathway in Non–Small Cell Lung Cancer. J. Natl. Cancer Inst. 106, djt356.

Sun, X., Wang, S.C., Wei, Y., Luo, X., Jia, Y., Li, L., Gopal, P., Zhu, M., Nassour, I., Chuang, J.-C., et al. (2017). Arid1a Has Context-Dependent Oncogenic and Tumor Suppressor Functions in Liver Cancer. Cancer Cell 32, 574–589.e6.

Tamkun, J.W., Deuring, R., Scott, M.P., Kissinger, M., Pattatucci, A.M., Kaufman, T.C., and Kennison, J.A. (1992). brahma: a regulator of Drosophila homeotic genes structurally related to the yeast transcriptional activator SNF2/SWI2. Cell 68, 561–572.

Tanay, A. (2006). Extensive low-affinity transcriptional interactions in the yeast genome. Genome Res. 16, 962–972.

Thurman, R.E., Rynes, E., Humbert, R., Vierstra, J., Maurano, M.T., Haugen, E., Sheffield, N.C., Stergachis, A.B., Wang, H., Vernot, B., et al. (2012). The accessible chromatin landscape of the human genome. Nature 489, 75–82.

Torrino, S., Roustan, F.R., Kaminski, L., Bertero, T., Pisano, S., Ambrosetti, D., Dufies, M., Uhler, J.P., Lemichez, E., Mettouchi, A., et al. (2019). UBTD1 is a mechano-regulator controlling cancer aggressiveness. EMBO Rep. 20, e46570.

Vierbuchen, T., Ling, E., Cowley, C.J., Couch, C.H., Wang, X., Harmin, D.A., Roberts, C.W.M., and Greenberg, M.E. (2017). AP-1 Transcription Factors and the BAF Complex Mediate Signal-Dependent Enhancer Selection. Mol. Cell 68, 1067–1082.e12.

Vierstra, J., and Stamatoyannopoulos, J.A. (2016). Genomic footprinting. Nat. Methods 13, 213–221.

Vockley, C.M., D'Ippolito, A.M., McDowell, I.C., Majoros, W.H., Safi, A., Song, L., Crawford, G.E., and Reddy, T.E. (2016). Direct GR Binding Sites Potentiate Clusters of TF Binding across the Human Genome. Cell 166, 1269–1281.e19.

Wang, Y., Song, F., Zhang, B., Zhang, L., Xu, J., Kuang, D., Li, D., Choudhary, M.N.K., Li, Y., Hu, M., et al. (2018). The 3D Genome Browser: a web-based browser for visualizing 3D genome organization and long-range chromatin interactions. Genome Biol. 19, 151.

Whyte, W.A., Orlando, D.A., Hnisz, D., Abraham, B.J., Lin, C.Y., Kagey, M.H., Rahl, P.B., Lee, T.I., and Young, R.A. (2013). Master transcription factors and mediator establish super-enhancers at key cell identity genes. Cell 153, 307–319.

Wong, A.K.C., Shanahan, F., Chen, Y., Lian, L., Ha, P., Hendricks, K., Ghaffari, S., Iliev, D., Penn, B., Woodland, A.-M., et al. (2000). BRG1, a Component of the SWI-SNF Complex, Is Mutated in Multiple Human Tumor Cell Lines. Cancer Res. 60, 6171–6177.

Yuan, B.-Z., Jefferson, A.M., Baldwin, K.T., Thorgeirsson, S.S., Popescu, N.C., and Reynolds, S.H. (2004). DLC-1 operates as a tumor suppressor gene in human non-small cell lung carcinomas. Oncogene 23, 1405–1411.

Yuan, B.-Z., Jefferson, A.M., Millecchia, L., Popescu, N.C., and Reynolds, S.H. (2007). Morphological changes and nuclear translocation of DLC1 tumor suppressor protein precede apoptosis in human non-small cell lung carcinoma cells. Exp. Cell Res. 313, 3868–3880.

Zabidi, M.A., Arnold, C.D., Schernhuber, K., Pagani, M., Rath, M., Frank, O., and Stark, A. (2015). Enhancer-core-promoter specificity separates developmental and housekeeping gene regulation. Nature 518, 556–559.

Zhang, H.S., Gavin, M., Dahiya, A., Postigo, A.A., Ma, D., Luo, R.X., Harbour, J.W., and Dean, D.C. (2000). Exit from G1 and S phase of the cell cycle is regulated by repressor complexes containing HDAC-Rb-hSWI/SNF and Rb-hSWI/SNF. Cell 101, 79–89.

Zhang, Y., Liu, T., Meyer, C.A., Eeckhoute, J., Johnson, D.S., Bernstein, B.E., Nusbaum, C., Myers, R.M., Brown, M., Li, W., and Liu, X.S. (2008). Model-based analysis of ChIP-Seq (MACS). Genome Biol. 9, R137.

Zhou, X., Thorgeirsson, S.S., and Popescu, N.C. (2004). Restoration of DLC-1 gene expression induces apoptosis and inhibits both cell growth and tumorigenicity in human hepatocellular carcinoma cells. Oncogene 23, 1308–1313.

Zhou, X., Maricque, B., Xie, M., Li, D., Sundaram, V., Martin, E.A., Koebbe, B.C., Nielsen, C., Hirst, M., Farnham, P., et al. (2011). The Human Epigenome Browser at Washington University. Nat. Methods 8, 989–990.

STAR★METHODS
RESOURCE AVAILABILITY
Lead Contact
Further information and requests for resources and reagents should be directed to and will be fulfilled by the Lead Contact, John
Stamatoyannopoulos ([email protected]).
Materials Availability
All unique/stable reagents generated in this study are available from the Lead Contact with a completed Uniform Biological Materials
Transfer Agreement.
Cell Lines: A549 (ATCC CCL-185) cells were maintained in F-12K Medium (ATCC, Cat #30-2004) supplemented with 10% HyClone
FBS (GE Healthcare Life Sciences, Cat #SH30071.03) and 1% Penicillin-Streptomycin (Corning, Cat #30-002-CI). Cells were
passaged every 4-5 days and detached using Accutase (Innovative Cell Technology, Cat #AT-104). Cells were sourced from ATCC.
METHOD DETAILS
Transfections
For all transfections, a BTX ECM830 device (BTX Harvard Apparatus) with a 2 mm gap cuvette was used. TALEN mRNAs were prepared using a mMessageMachine T7 Ultra Kit (#AM1345, Ambion). Per transfection, 2 × 10^5 cells were collected and washed twice with PBS. Cell pellets were resuspended in 100 μL BTXpress Electroporation Solution (BTX Harvard Apparatus, Cat #45-0805) together with 2 μg mRNA per TALEN monomer and 2 μL of 100 μM 90-mer ssODN containing the 23 nt corrective insertion (underlined) and 33-34 nt homology arms: 5′-ATATGGCGTGTCCCAGGCCCTTGCACGTGGCCTGCAGTCCTACTATGCCGTGGCCCATGCTGTCACTGAGAGAGTGGACAAGCAGTCAGC-3′. The cell/mRNA mixture was transferred to the transfection cuvette and immediately electroporated with one pulse of 250 V for 5 ms. Following electroporation, cells were transferred to 12-well plates containing pre-warmed F-12K Medium and incubated at 37°C.
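As a quick sanity check on the donor design described above, the following minimal Python sketch confirms that the quoted ssODN is a 90-mer and that a 23 nt insertion plus 33 nt and 34 nt homology arms account for its full length. It also converts the stated donor volume and concentration into a molar amount, assuming the donor is 2 μL of a 100 μM stock. The constant and function names are illustrative only, and because the text does not say which arm is 33 nt and which is 34 nt, only their sum is checked.

```python
# ssODN donor sequence, copied from the Transfections paragraph (5' to 3').
SSODN = (
    "ATATGGCGTGTCCCAGGCCCTTGCACGTGGCCTGCAGTCCTACTATGCCGTGGCCCATGCTGT"
    "CACTGAGAGAGTGGACAAGCAGTCAGC"
)

INSERT_NT = 23        # corrective insertion length stated in the text
ARM_NT = (33, 34)     # homology arm lengths; their left/right order is an assumption-free sum only


def is_consistent_donor(seq):
    """Return True if seq is plain DNA and its length equals insertion + both arms."""
    only_dna = set(seq) <= set("ACGT")
    expected_length = INSERT_NT + sum(ARM_NT)
    return only_dna and len(seq) == expected_length


def donor_pmol(volume_ul, concentration_um):
    """Amount of oligo in pmol; 1 uM is 1 pmol per uL."""
    return volume_ul * concentration_um


if __name__ == "__main__":
    print(len(SSODN), is_consistent_donor(SSODN), donor_pmol(2, 100))
```

Under these assumptions each electroporation would carry 200 pmol of donor oligo.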
Clonal Isolation
Seven days post-transfection, edited A549 pools were sorted as single cells into a 96-well plate using a MoFlo Astrios EQ Cell Sorter (Beckman Coulter). Cells were expanded for 17 days and subsequently split 1:10 for maintenance and 9:10 for genomic DNA, which was harvested 4 days later. Genotyping of clones was performed as described above and was confirmed by Sanger sequencing (GENEWIZ) using outer primers (5′-CTCGGCTCTCTGCAAGCT-3′ and 5′-GGAGTTGTACTGGTTGTCTTGT-3′; IDT) that each lie 700 bp away from the edit site to check for larger deletions.
Western Blotting
A549 cells were lysed in RIPA buffer (ThermoFisher Scientific, Catalog #89900) containing Complete, EDTA-free protease inhibitor cocktail tablets (Roche, #11873580001). The total protein concentration was measured by Bradford assay (Bio-Rad). Samples were run on the Simple Western System (Protein Simple) with the 12-220 kDa kit, loading 2-4 μg lysate and antibodies for SMARCA4/BRG1-(G-7) (Santa Cruz, #sc17796) and GAPDH (Santa Cruz, #sc47724) at 1:400 and 1:2000 dilutions, respectively.
DNaseI-Seq
DNaseI-Seq was performed as previously described (Hesselberth et al., 2009; John et al., 2011; Thurman et al., 2012). Briefly, 5 × 10^6 cells were lysed using 0.01% IGEPAL. Nuclei were collected by centrifugation at 500 g for 5 min, and DNaseI digestion was performed for 3 min at 37°C. DNaseI cleavage fragments were size selected by PEG fractionation, fragments were end repaired, and
Illumina sequencing libraries were prepared using the ThruPLEX DNA-seq kit. Libraries were sequenced to a typical depth of 50M
reads (Table S1).
RNA-Seq
RNA was extracted from 500K cells using the QIAGEN RNeasy kit. RNA-Seq libraries (Table S2) were prepared using the TruSeq
Stranded Total RNA kit (Illumina).
CUT&RUN
CUT&RUN datasets (Table S3) were generated as previously described (Skene et al., 2018). Antibodies were obtained from the
following suppliers: H3K4me1 (Active Motif 39297), H3K4me2 (Active Motif 39141), H3K27me3 (Cell Signaling 9733), BRG1/
SMARCA4 (Cell Signaling 49360), BRM/SMARCA2 (Cell Signaling 11966). Protein A-MNase was kindly provided by Dr. Steven Henikoff (Fred Hutchinson Cancer Research Center).
ENCODE Datasets
TF ChIP-Seq
All available ChIP-Seq data in A549 aligned to GRCh38 was downloaded from the ENCODE portal (https://www.encodeproject.org/).
To analyze the fraction of novel SMARCA4 dependent DHSs overlapping pre-existing TF ChIP-Seq peaks, the list was manually
curated to remove non transcription factor targets and a single experiment was chosen for each TF (Table S4). For detailed analysis
of specific TFs, replicate concordant peaks were used: JUNB (ENCFF565QYS and ENCFF683JTQ), CTCF (ENCFF465EGH, ENCFF531OAI, and ENCFF751UOX), RAD21 (ENCFF345LNM, ENCFF512QEA, and ENCFF178CSM), and SMC3 (ENCFF046RJH, ENCFF321GIF, and ENCFF922EYU).
Methylation
A549 bisulfite sequencing datasets were used to analyze promoter methylation (ENCFF005TID, ENCFF003JVR).
Histone ChIP
H3K4me3 (ENCFF428UWO, ENCFF643FMK, ENCFF973TUQ) histone ChIP-Seq data were downloaded from the ENCODE portal.
Hi-C
Hi-C data from two A549 experiments was obtained from the ENCODE portal (https://www.encodeproject.org/). TAD file:
ENCFF716CFF. Chromatin Loop File: ENCFF803ZOW. A549 Hi-C heatmaps were visualized using the 3D Genome Browser
(Wang et al., 2018). Chromatin loops were visualized using the WashU epigenome browser (Zhou et al., 2011).
TCGA Data
Identification of SMARCA4 gene signature
Mutation, expression, and patient data for TCGA lung adenocarcinoma samples were obtained from the Genomic Data
Commons Data Portal (https://portal.gdc.cancer.gov/). Samples with an annotated Nonsense_Mutation or Frame_Shift_Del muta-
tion in SMARCA4 were classified as SMARCA4 null samples. Samples with other mutations (annotated as Missense_Mutation or
Splice_Site) in SMARCA4 or expression level of SMARCA4 below the average expression in SMARCA4 null samples were not
included in the analysis. Expression data were log transformed (with 1 added to the FPKM value) and compared between SMARCA4
null and wild-type (WT) samples. Protein coding genes significantly downregulated in SMARCA4 null samples were identified by Welch’s
t test (BH FDR < 0.01 and |log fold change| > log2[1.5]). Genes with low expression (mean expression < 1 FPKM in both SMARCA4 null
and WT samples) were filtered from the gene signature.
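The thresholds in the gene signature definition above can be illustrated with a short, dependency-free Python sketch. It takes per-gene Welch's t test p values as given inputs (the test itself is not implemented here), applies a Benjamini-Hochberg adjustment, and then applies the fold-change and low-expression filters. The function names, the choice to compute the fold change as the difference of mean log2(FPKM + 1) values (null minus WT), and the use of mean raw FPKM for the expression filter are all assumptions made for illustration, not the authors' code.

```python
import math


def log_fpkm(fpkm):
    """Log transform with a pseudocount of 1 added to the FPKM value."""
    return math.log2(fpkm + 1.0)


def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p values, returned in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


def in_signature(lfc_null_vs_wt, fdr, mean_fpkm_null, mean_fpkm_wt,
                 fdr_cutoff=0.01, fold_cutoff=1.5, min_fpkm=1.0):
    """True if a gene is significantly downregulated in null samples and not lowly expressed."""
    significant = fdr < fdr_cutoff
    downregulated = lfc_null_vs_wt < -math.log2(fold_cutoff)
    low_expression = mean_fpkm_null < min_fpkm and mean_fpkm_wt < min_fpkm
    return significant and downregulated and not low_expression


if __name__ == "__main__":
    fdrs = bh_adjust([0.01, 0.04, 0.03, 0.005])
    print(fdrs)
    print(in_signature(-1.0, 0.001, 5.0, 12.0))
```

In practice the p values would come from a two-sample test with unequal variances run per gene on the log-transformed expression values.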
Stable versus Reactivated genes
Genes in the TCGA SMARCA4 signature that were significantly upregulated after SMARCA4 rescue were classified as reactivated
while those that were not upregulated were classified as stable. Genes with zero mapped reads in either SMARCA4−/− or
SMARCA4+/+ A549s were not included in the stable gene set since the genes were often members of highly duplicated gene families
and likely not detected for technical reasons (i.e., mappability).
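As a sketch, the classification reduces to set operations. The function name and the gene sets passed to it are hypothetical; only the rules themselves come from the text.

```python
def classify_signature_genes(signature, upregulated, zero_read_genes):
    """Split signature genes into reactivated and stable sets.

    signature:       genes in the TCGA SMARCA4 signature
    upregulated:     genes significantly upregulated after SMARCA4 rescue
    zero_read_genes: genes with zero mapped reads in either genotype
    """
    reactivated = signature & upregulated
    # Zero-read genes are excluded from the stable set only.
    stable = signature - upregulated - zero_read_genes
    return reactivated, stable
```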
Survival analysis
Expression data from the SMARCA4 WT samples were log transformed and a Cox regression was fit between expression and patient
survival for each protein coding gene. Patient age and gender were included as additional variables in the regression. p values for the
association between gene expression and survival were derived from the t-statistic of the gene expression regression coefficient. For
comparisons between reactivated genes and all upregulated genes, only upregulated genes expressed highly enough in TCGA samples
to be included in the SMARCA4 gene signature were included.
where $G_N$ is a gamma distribution. We used $p_{1,N}$ as our estimate of the fraction of upregulated genes in the set of genes with
DHS_Change equal to N, and we calculated the weighted mean of the Log2 fold change for the upregulated population to estimate
the average magnitude of expression change.
$$m_{1,N} = \frac{\sum_{\mathrm{DHS\_Change} = N} w_i \, \mathrm{LFC}_i}{\sum_{\mathrm{DHS\_Change} = N} w_i}, \qquad w_i = P(\mathrm{gene}_i \text{ is upregulated})$$
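A minimal sketch of this weighted mean, assuming the per-gene probabilities of belonging to the upregulated component are already available from the fitted mixture model. The function name and its inputs are hypothetical.

```python
def weighted_mean_lfc(lfcs, weights):
    """Weighted mean of log2 fold changes.

    lfcs:    log2 fold change of each gene with DHS_Change = N
    weights: probability that each gene is upregulated (w_i)
    """
    total = sum(weights)
    if total == 0:
        raise ValueError("weights sum to zero; no gene is assigned to the upregulated component")
    return sum(w * x for w, x in zip(weights, lfcs)) / total
```

Genes that are unlikely to be upregulated contribute little to the estimate, so the mean reflects the magnitude of change in the affected population rather than in all genes.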
To understand how the increase in the fraction of genes that are upregulated and the magnitude of upregulation compare to the
expectation from a simple binomial model, we extrapolated from the values found for genes with DHS_Change = 1 (i.e., $p_{1,1}$ and $m_{1,1}$) to
estimate the fraction of genes we would expect to see change in expression if DHS_Change equals k ($p_{\mathrm{binomial},k}$) and the magnitude of
response for those genes ($m_{\mathrm{binomial},k}$).
$$p_{\mathrm{binomial},k} = 1 - (1 - p_{1,1})^{k}$$
$$m_{\mathrm{binomial},k} = \frac{m_{1,N} \, (k \, p_{1,1})}{p_{\mathrm{binomial},k}}$$
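The extrapolation can be written as a short function. This sketch plugs the single-DHS estimates in for both the probability and the magnitude, as the text describes. The function and argument names are hypothetical, and the example values in the usage note are illustrative rather than taken from the data.

```python
def binomial_expectation(p1, m1, k):
    """Expected fraction and magnitude of upregulation for k remodeled DHSs.

    p1: fraction of genes upregulated when DHS_Change = 1
    m1: magnitude of upregulation used for extrapolation (here, the
        single-DHS estimate)
    k:  number of remodeled DHSs
    """
    if not 0.0 < p1 <= 1.0:
        raise ValueError("p1 must be in (0, 1]")
    # Probability that at least one of k independent DHSs has an effect.
    p_k = 1.0 - (1.0 - p1) ** k
    # Expected total effect (k * p1 * m1), conditioned on any effect occurring.
    m_k = m1 * (k * p1) / p_k
    return p_k, m_k
```

With `k = 1` the function returns its inputs unchanged, which is a quick consistency check. With `p1 = 0.5`, `m1 = 1.0`, and `k = 2`, it returns a fraction of 0.75 and a magnitude of about 1.33.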
Aggregate gene level score of distal change in accessibility
To quantify the relationship between changes in accessibility and change in gene expression, we first created an aggregate score that
incorporates each DHS's distance from a gene's TSS (Gencode v25 basic) to weight the potential contribution of each
DHS to the gene's change in expression. Only DHSs greater than 1 kb from a gene's TSS were included to isolate the effects of distal
element remodeling on gene expression.
$$\mathrm{DHS\ Score}_{\mathrm{Gene}_i} = \sum_{1\,\mathrm{kb} \,<\, d_j \,<\, 1\,\mathrm{Mb}} e^{\left[\lambda_{\mathrm{other}} I_{\mathrm{other},ij} \log(d_j) \,+\, \lambda_{\mathrm{closest}} I_{\mathrm{closest},ij} \log(d_j)\right]} \, \mathrm{LFC}_j$$
$$I_{\mathrm{closest},ij} = \begin{cases} 1 & \text{if } \mathrm{DHS}_j \text{ is closest to } \mathrm{Gene}_i \\ 0 & \text{if } \mathrm{DHS}_j \text{ is not closest to } \mathrm{Gene}_i \end{cases}$$
$$I_{\mathrm{other},ij} = \begin{cases} 1 & \text{if } \mathrm{DHS}_j \text{ is not closest to } \mathrm{Gene}_i \\ 0 & \text{if } \mathrm{DHS}_j \text{ is closest to } \mathrm{Gene}_i \end{cases}$$
We used different decay rates for DHSs closest to the gene's TSS and for all other DHSs, and selected the two decay rates to maximize
the correlation (minimize the squared error) between the aggregate gene-level scores and the genes' change in expression (Log2FC).
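The score for one gene can be sketched as below. Several things here are assumptions rather than details given in the text: the logarithm is taken to be natural, distances are in base pairs, and the decay rates in the usage note are made-up values chosen only to show the behavior (negative rates make the weight fall with distance). The function name and input layout are hypothetical.

```python
import math

def dhs_score(dhss, lambda_closest, lambda_other):
    """Aggregate distal DHS score for one gene.

    dhss: iterable of (distance_bp, lfc, is_closest) tuples, one per DHS,
          where distance_bp is the distance from the DHS to the gene's TSS.
    """
    score = 0.0
    for distance, lfc, is_closest in dhss:
        # Keep only distal elements: more than 1 kb and less than 1 Mb away.
        if not (1_000 < distance < 1_000_000):
            continue
        # Exactly one indicator is 1, so only one decay rate applies.
        lam = lambda_closest if is_closest else lambda_other
        score += math.exp(lam * math.log(distance)) * lfc
    return score
```

For example, with `lambda_closest = -0.5` and `lambda_other = -1.0`, a closest DHS at 10 kb with a log fold change of 2 contributes 0.02, whereas a non-closest DHS at 100 kb with a log fold change of 1 contributes 0.00001.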
Regression on promoter state
To analyze the relationship between promoter state and changes in gene expression, we fit the model
$$\mathrm{RNA\ LFC} = \mathrm{DHS\ Score} + \mathrm{Promoter\ State} + \mathrm{DHS\ Score} : \mathrm{Promoter\ State}$$
$$I_{\mathrm{Feature}} = \begin{cases} 1 & \text{if TSS} \pm 1\,\mathrm{kb} \text{ overlaps feature} \\ 0 & \text{otherwise} \end{cases}$$
$$C_{\mathrm{methylation}} = \begin{cases} -1 & \text{if methylation} < 33\text{rd percentile for non-CpG island genes} \\ 0 & \text{if } 33\text{rd percentile} < \text{methylation} < 66\text{th percentile} \\ 1 & \text{if methylation} > 66\text{th percentile for non-CpG island genes} \end{cases}$$
The promoter state included indicator variables for overlap with histone ChIP-Seq peaks and discretized the average fraction of CpG
methylation at the promoter into low, medium, and high values based on the methylation tertiles in non-CpG island promoters (33% and
74% methylated). For genes with multiple TSSs, the TSS with the largest number of differential DHSs within ± 5 kb was selected. Ties
were broken by selecting the TSS with the greatest accessibility in SMARCA4-/- A549s.
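The promoter-state variables and the TSS selection rule can be sketched as follows. The default cut points use the tertile values quoted above (33% and 74% methylated). The function names, the coordinate convention, and the tuple layout are hypothetical.

```python
def methylation_class(fraction, low_cut=0.33, high_cut=0.74):
    """Discretize the promoter methylation fraction into -1 (low), 0 (medium), 1 (high)."""
    if fraction < low_cut:
        return -1
    if fraction > high_cut:
        return 1
    return 0

def overlaps_promoter(tss, feature_start, feature_end, window=1_000):
    """Indicator (1 or 0) for a feature overlapping TSS +/- window."""
    return int(feature_start <= tss + window and feature_end >= tss - window)

def select_tss(candidates):
    """Pick one TSS per gene.

    candidates: list of (tss_position, n_differential_dhs, accessibility),
    where n_differential_dhs counts differential DHSs within 5 kb of the TSS.
    The TSS with the most differential DHSs wins; ties go to the TSS with
    the greatest accessibility.
    """
    return max(candidates, key=lambda c: (c[1], c[2]))[0]
```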
GO Terms
GO analysis of genes whose promoters (TSS ± 1 kb) fall in the SMARCA4-sensitive regions was performed using the Panther web
tool with default parameters and GO biological process complete as the annotation set (Mi et al., 2017).
Supplemental Information
[Figure S1 graphic. Labels recovered from the panels:
A: Sequence at the SMARCA4 locus with TALEN monomers L1 to L5 and R1 to R5.
SMARCA4+/+/ssODN: ATATGGCGTGTCCCAGGCCCTTGCACGTGGCCTGCAGTCCTACTATGCCGTGGCCCATGCTGTCACTGAGAGAGTGGACAAGCAGTCAGC
(translation: Y G V S Q A L A R G L Q S Y Y A V A H A V T E R V D K Q S)
SMARCA4-/- (Q729fs*4): TGAATATGGCGTGTCCCAGGCCCTTGCACGTGGCCTATGCTGTCACTGAGAGAGTGGACAAGCAGTCAGCGCTTA
(translation: Y G V S Q A L A R G L C C H *)
C: Sanger chromatograms for SMARCA4-/- clone #1 and SMARCA4+/+ clone #1, each shown against the reference sequence
AGATGTCGATGATGAATATGGCGTGTCCCAGGCCCTTGCACGTGGCCTGCAGTCCTACTATGCCGTGGCCCATGCTGTCACTGAGAGAGTGGACAAGCAGTCAGCGCTTA
D: PCA scatterplots per assay (labels include DNaseI-Seq, RNA-Seq, SMARCA4, and SMARCA2); axes give the percent variance
explained by each component.]
Figure S1: Generation of SMARCA4+/+ A549 clones. Related to Figure #1
a) Schematic of TALENs targeting the SMARCA4 locus that contains a homozygous 23 bp deletion in A549
cells (Q729fs*4 mutation). TALEN monomers recognizing the + strand (L1 to L5) and – strand (R1 to R5)
are depicted, and heterodimers that were tested are denoted by dotted lines. The sequence of the single
stranded donor DNA (ssODN) used to correct SMARCA4 is shown.
b) Indel and scarless HDR rates for TALENs targeting the SMARCA4 locus. The lead TALEN used to
generate clonal lines is shown in orange. Error bars: STD of Indels and HDR from three sequencing runs.
c) Chromatograms from Sanger sequencing for a representative SMARCA4-/- and SMARCA4+/+ clone.
d) Scatterplots display reduced dimensionality representations (PCA) of data from genome-wide assays
performed on SMARCA4+/+ and SMARCA4-/- clones. For RNA-Seq, PCA is based on normalized counts
for all genes. For all other assays, PCA is based on normalized counts around identified DHSs. All genomic
assays show clear separation between SMARCA4+/+ samples and SMARCA4-/- samples in the first
principal component. For RNA-Seq/DNaseI-Seq, multiple data points from the same clone represent
technical replicates (independent cultures).
[Figure S2 graphic. Labels recovered from the panels: DHS annotation classes (promoter, transcript, proximal-intergenic,
distal-intergenic); Cell Type Overlap of remodeled DHSs; SMARCA4 CUT&RUN Signal for remodeled, unaffected, and other DHSs,
with annotations "60% Overlap", "87% >= Median DHS", and "MACS Peak"; SMARCA2 Signal and SMARCA4 Signal in SMARCA4(-/-) and
SMARCA4(+/+) cells and the SMARCA4/SMARCA2 Ratio for remodeled and unaffected DHSs; de novo motif table (Motif, Pvalue,
Motif Match) listing AP-1 (1.8e-440) and RUNX (6.4e-251); DNaseI Cleavage, Log2(Obs/Exp), for unaffected and remodeled DHSs in
SMARCA4 -/- and SMARCA4 +/+ samples; heatmaps of DHS LFC, DNaseI Cleavage (-15 to 15), and Motif Strength (-Log10[pvalue]);
Fraction of DHSs overlapping TF ChIP Peak.]
Figure S2: Characterization of SMARCA4 remodeled DHSs. Related to Figure #2
a) Genomic distribution of DHSs. Pie charts show the fraction of DHSs overlapping GENCODE transcript
annotations for all A549 DHSs (left) and SMARCA4 remodeled DHSs (right).
b) Overlap of remodeled DHSs with cell type specific DHSs from relevant cell types. Plotted values are
normalized to the fraction of A549 SMARCA4-/- DHSs overlapping the cells’ DHSs. Cell type specific
DHSs are defined as DHSs active in a sample and present in <90% of all ENCODE samples.
c) Quantitative SMARCA4 CUT&RUN signal at remodeled DHSs, unchanged A549 DHSs, and a
background of DHSs active in other cell types. Median signal is plotted with error bars displaying 25th-
75th percentile.
d) Relative SMARCA2 and SMARCA4 binding at SMARCA4 remodeled DHSs. Plotted is the median
SMARCA2 CUT&RUN signal in SMARCA4-/- cells (top left), SMARCA2 CUT&RUN signal in
SMARCA4+/+ cells (top middle), SMARCA4 CUT&RUN signal in SMARCA4-/- cells (bottom left),
SMARCA4 CUT&RUN signal in SMARCA4+/+ cells (bottom middle) and the ratio of
SMARCA4/SMARCA2 CUT&RUN signal in SMARCA4+/+ cells (right) for SMARCA4 remodeled and
unaffected DHSs that overlap SMARCA4 CUT&RUN peaks. Error bars show 25-75th percentiles.
e) Top significant motifs in SMARCA4 remodeled DHSs identified by de novo motif search (meme).
f) Top: Aggregate cleavage profile (trimmed mean of middle 98% of log2 observed/expected values) around
SP1 motifs in SMARCA4 remodeled and unchanging DHSs in SMARCA4-/- and SMARCA4+/+ samples.
Motifs were selected from DHSs that overlap both a SMARCA4 CUT&RUN peak and SP1 ChIP-Seq peak
and are footprinted in either SMARCA4-/- or SMARCA4+/+ cells. Bottom: DHSs were ordered by motif
footprint occupancy in SMARCA4-/- samples. From left to right, heatmaps display the average log fold
change in accessibility after SMARCA4 reactivation, DNaseI cleavages around SP1 motif in SMARCA4-/-
cells, DNaseI cleavages around SP1 motif in SMARCA4+/+ cells, and AP1 motif strength. Ordering of
DHSs (rows) is consistent across heatmaps. To highlight the trend with footprint occupancy, for heatmaps
displaying DHS log fold change and motif pvalues, DHSs were separated into 25 bins and average values
for the bin are shown.
g) Overlap of A549 DHSs, SMARCA4 remodeled DHSs, and DHSs from other cell types with transcription
factor (TF) ChIP-Seq peaks from unedited A549s.
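The aggregate profile in panel f uses a trimmed mean of the middle 98% of values. The sketch below shows one common way to compute such a statistic. How values at the trimming boundary are handled is an assumption, and the function name is hypothetical.

```python
import math

def trimmed_mean(values, middle_fraction=0.98):
    """Mean of the central `middle_fraction` of the sorted values."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    n = len(ordered)
    # Number of values dropped from each tail; the small epsilon guards
    # against floating-point error pushing an exact count just below an integer.
    drop = math.floor(n * (1.0 - middle_fraction) / 2.0 + 1e-9)
    kept = ordered[drop:n - drop]
    return sum(kept) / len(kept)
```

With 100 values and the default setting, the single smallest and single largest values are discarded, which keeps isolated extreme cleavage ratios from dominating the aggregate profile.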
[Figure S3 graphic. Labels recovered from the panels: pairwise correlation heatmaps for DNaseI-Seq and RNA-Seq across
SMARCA4-/- clones 1 and 2 and SMARCA4+/+ clones 1 to 3 (color scale: Correlation); distance axis with ticks from 1000 to 500000;
x-axes "# of Distal Remodeled DHSs" (0 to >=10); y-axes "Expression Change", "Mean Log2 Fold Change of Affected Genes",
"Frequency" (versus RNA LFC, for 0, 1, 4, and >=8 DHSs), and "Fraction of Genes Upregulated" (Single TSS versus Multiple TSS;
Alternative TSS and All Protein Coding Genes); Binomial Model (Additive) overlay; histogram annotations μ = 0.84, μ = 1.09,
μ = 1.35, and π1 = 46%; annotation "Expected LFC effect size ~2x with 10 remodeled DHSs".]
Figure S3: Quantification of relationship between chromatin and gene expression. Related to Figure #3
a) Relationship between gene expression and DHSs as a function of distance between a gene’s TSS and the
DHS. Plotted is the average change in gene expression for all genes within a given genomic distance of a
SMARCA4 remodeled DHS (black) or genes within a given genomic distance of a SMARCA4 remodeled
DHS only if it is the closest gene (blue). Error bars display +/- SEM.
b) Heatmaps of pairwise correlation (Pearson r of log[counts + 1]) between DNaseI-seq and RNA-seq
samples. Color scale is the same for DNaseI and RNA heatmaps.
c) Distribution of expression changes as a function of remodeled DHSs. Top: As in Figure #3c, mean change
in expression at genes grouped by number of neighboring SMARCA4 responsive DHSs. Dark error bars
+/- SEM, light error bars show 25th-75th percentile. Bottom: Histograms of the distribution of changes in
gene expression for genes with n = 0, n = 1, n = 4, and n ≥ 8 remodeled DHSs. For each distribution, a
mixture model was fit to deconvolve unaffected genes (black, distribution of expression for genes with n =
0 remodeled DHSs) and affected genes (red). Mean of the affected distribution and the fraction of genes
assigned to the affected distribution are reported.
d) Comparison of the fraction of genes that change expression with a binomial model where the fraction of
DHSs that lead to upregulation is determined from the population of genes with 1 remodeled DHS. Green
dots show estimates based on mixture model of actual data. Shaded area shows bootstrapped 95%
confidence intervals of binomial model.
e) Comparison of the average change in expression of the subset of genes that change expression with a
binomial model where the fraction of DHSs that lead to upregulation and the average effect of those DHSs
is determined from the population of genes with 1 remodeled DHS. Green dots show estimates based on
mixture model of actual data. Shaded area shows bootstrapped 95% confidence intervals of binomial
model.
f) Genes with a single annotated TSS were compared to genes with multiple distinct TSSs. To define distinct
TSSs, for each gene the primary TSS with the highest accessibility was chosen. All TSSs falling within 1
kb of the primary TSS were considered to be associated with that TSS/promoter while annotated TSSs > 1
kb were considered alternative TSSs. Barplot displays the fraction of upregulated genes after SMARCA4
reactivation for genes with a single TSS and multiple distinct TSSs.
g) Barplot displays the fraction of genes with change in promoter accessibility after SMARCA4 reactivation
for genes with a single TSS and multiple distinct TSSs.
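The definition of distinct TSSs in panel f can be sketched as a small grouping step. The input layout (a mapping from TSS position to accessibility) and the function name are hypothetical; the 1 kb threshold comes from the caption.

```python
def split_tss(tss_accessibility, window=1_000):
    """Separate a gene's annotated TSSs into promoter-associated and alternative.

    tss_accessibility: dict mapping TSS position (bp) to accessibility.
    Returns (primary, associated, alternative), where `associated` holds all
    TSSs within `window` of the primary TSS (including the primary itself)
    and `alternative` holds TSSs farther away.
    """
    # The primary TSS is the one with the highest accessibility.
    primary = max(tss_accessibility, key=tss_accessibility.get)
    associated = sorted(p for p in tss_accessibility if abs(p - primary) <= window)
    alternative = sorted(p for p in tss_accessibility if abs(p - primary) > window)
    return primary, associated, alternative
```

Under this sketch, a gene counts as having multiple distinct TSSs when `alternative` is non-empty.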
[Figure S4 graphic. Labels recovered from the panels: Ripley-K Activated DHSs, Ripley-K All DHSs, and Bootstrapped Background
(x-axis 0 to 500); H3K4me2 Cut&Run Density from Region Start to Region End for SMARCA4 -/- and SMARCA4 +/+, with Log2FC heatmap;
Genomic Overlap, Element Overlap, and Gene Overlap (221, 139, 789) between HMM Remodeled Clusters and SMARCA4 Binding Regions
(Super Enhancers); RNA LFC for Region, Both, and Super Enhancer; browser views with Hi-C, TAD, Chromatin Loops, Remodeled Region,
and DNaseI-Seq (SMARCA4-/- and SMARCA4+/+) tracks at chr21:34,000,000-36,500,000 (hg38; genes IFNGR2, ATP5PO, KCNE2, CLIC6, SETD4,
CHAF1B, SIM2, DNAJC28, MRPS6, KCNE1, RUNX1, CBR1, CLDN14, GART, SLC5A3, RCAN1, CBR3, HLCS, SON, SMIM11A, DOP1B, DONSON, FAM243A,
MORC3, AP000311.1, SMIM34A, CRYZL1, ITSN1) and chr9:18,000,000-20,000,000 (hg38; genes CNTLN, ADAMTSL1, DENND4C, SH3GL2, SAXO1,
RPS6, RRAGA, ACER2, HAUS6, SLC24A2, PLIN2); Loop Region Overlap Span; histogram of Overlap Fraction (0.0 to 1.0), Observed versus
Randomized; H3K27ME3 versus H3K4ME3 heatmaps (Count; Log2 Enrichment).]
Figure S4: Identification and characterization of SMARCA4 remodeled domains. Related to Figure #4
a) Linear Ripley's K (a measure of spatial clustering) of remodeled DHSs versus all detected DHSs. Shaded
region represents null distribution +/- 95% confidence interval based on 1000 block permutations of the
remodeled DHSs.
b) H3K4me2 signal around identified regions in SMARCA4-/- and SMARCA4+/+ clones. Top: Lineplots of the
aggregate (trimmed mean of middle 95%) score over all regions. Bottom: heatmaps of log fold change
values for individual regions.
c) Top: Venn diagram of overlap of SMARCA4 remodeled domains with SMARCA4 super-enhancers
identified by the ROSE algorithm. Overlap is shown by genomic distance (left), element number (right)
and genes linked to regions/super enhancers (bottom). Bottom: Boxplots of change in expression of genes
linked to SMARCA4 remodeled domains, SMARCA4 super enhancers, or both.
d) Hi-C signal, TAD annotations, chromatin loops, and DNaseI cleavage density at an example locus
identified to contain a remodeled domain.
e) As d.
f) As d.
g) As d.
h) Density of TAD boundaries relative to the identified remodeled domains. Lineplot displays density of TAD
boundaries while tick marks below show individual TAD boundaries.
i) Overlap between Hi-C chromatin loops and regional changes in chromatin accessibility. Histogram of the
maximum fraction of genomic overlap between the region and a chromatin loop for each region (blue) with
the same quantity for a background of random regions with equal numbers of DHSs (grey) for comparison.
j) Heatmap of joint distribution for H3K27me3 and H3K4me3. TADs were binned into deciles based on the
average signal of each histone mark, and the number of TADs in each joint bin is plotted.
k) Frequency of TADs overlapping remodeled domains for each joint bin of H3K4me3/H3K27me3 signal.
Heatmap displays Log2(observed/expected) for each bin.
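The quantity plotted in panel i, the maximum fraction of a region overlapped by any single chromatin loop, can be sketched as below. Two assumptions are not stated in the caption: the fraction is taken relative to the region's length, and intervals are half-open `(start, end)` pairs. The function name is hypothetical.

```python
def max_loop_overlap_fraction(region, loops):
    """Largest fraction of `region` covered by any single loop.

    region: (start, end) of a remodeled domain
    loops:  iterable of (start, end) spans between loop anchors
    """
    start, end = region
    length = end - start
    if length <= 0:
        raise ValueError("region must have positive length")
    best = 0.0
    for loop_start, loop_end in loops:
        # Length of the intersection; non-positive means no overlap.
        overlap = min(end, loop_end) - max(start, loop_start)
        if overlap > 0:
            best = max(best, overlap / length)
    return best
```

The same function applied to randomly placed regions with equal numbers of DHSs would give the background distribution described in the caption.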
[Figure S5 graphic. Labels recovered from the panels: SMARCA4 Downregulated Signature (N = 410), Expression (FPKM) for TCGA
SMARCA4 WT, TCGA SMARCA4 LOF, and A549(-/-); Stable versus Reactivated (p = 0.266); DLC1, FPKM in SMARCA4-Null versus WT;
GO Term versus GSEA Normalized Enrichment Score (axis -8 to 6) for ACTIN_FILAMENT_BASED_PROCESS,
REGULATION_OF_CELL_MORPHOGENESIS_INVOLVED_IN_DIFFERENTIATION, POSITIVE_REGULATION_OF_CELL_DEVELOPMENT,
CELL_JUNCTION_ORGANIZATION, POSITIVE_REGULATION_OF_MAPK_CASCADE, REGULATION_OF_CELL_GROWTH, EPITHELIAL_CELL_DIFFERENTIATION,
RHO_PROTEIN_SIGNAL_TRANSDUCTION, CELL_CYCLE_G1_S_PHASE_TRANSITION, DNA_REPLICATION, MULTI_ORGANISM_METABOLIC_PROCESS,
AMIDE_BIOSYNTHETIC_PROCESS, and RRNA_METABOLIC_PROCESS.]
Figure S5: Comparison of expression changes in A549s to a TCGA derived, SMARCA4-null gene signature.
Related to Figure #5
a) Comparison of expression of SMARCA4-null gene signature (TCGA) with SMARCA4-/- A549s. Boxplot
displays distribution of gene expression for the SMARCA4-null gene signature in TCGA SMARCA4-WT,
TCGA SMARCA4-null, and A549s (SMARCA4-/-).
b) Boxplots display distribution of H3K27me3 signal around the TSSs of genes in the reactivated and stable
gene sets.
c) Example of a reactivated gene: Boxplots displaying expression of DLC1 in SMARCA4-null and
SMARCA4-WT tumor samples.
d) Enrichment of biological process GO terms by GSEA in up- and down-regulated genes after SMARCA4
reactivation.
DS-ID clone_ID Clone Total_Reads Aligned_Nuclear_Reads Nuclear_Mapping_Rate Nuclear_Duplicate_Rate SPOT Hotspot_Number Hotspot_Coverage_(bp)
DS62068 A3 WT 82100174 35757728 0.435537786 13.62931112 0.6954 73242 30957873
DS62113 A3 WT 81576474 42730132 0.523804596 8.379403087 0.5805 72367 31206386
DS62073 B8 WT 260458804 127451240 0.489333584 14.63560496 0.5351 89616 43074752
DS62118 B8 WT 133150566 66290020 0.497857591 9.507675514 0.6293 84140 37828699
DS62078 E9 Rescue 113297102 54913020 0.484681594 6.284986694 0.5814 96058 41489731
DS62123 E9 Rescue 93097004 44993816 0.483300365 5.740877813 0.5462 83838 32038179
DS62128 C12 Rescue 147458648 69925906 0.474206884 6.555075597 0.5629 112363 46108620
DS62149 C12 Rescue 352400134 192455072 0.546126557 12.87358226 0.5018 133135 58010819
DS62133 F3 Rescue 138212502 81486932 0.589577143 7.610697627 0.5042 115540 45743470
DS62154 F3 Rescue 127891098 84769052 0.662822146 4.467566772 0.3775 113470 43047811
Table S1: Metadata and library statistics of DNaseI-Seq experiments. Related to STAR methods.
Table S2: Metadata and library statistics of RNA-Seq experiments. Related to STAR methods.
DS-ID Antibody clone_ID Clone Total_Reads Aligned_Nuclear_Reads Nuclear_Mapping_Rate Nuclear_Duplicate_Rate Median_Insert_Size
DS65989 H3K27me3 A3 WT 14356010 12571940 0.875726612 1.805671997 175
DS65990 H3K27me3 B8 WT 16059896 13983776 0.870726436 1.705791054 168
DS65991 H3K27me3 C12 Rescue 17835566 15638850 0.876835083 1.873795068 166
DS65992 H3K27me3 E9 Rescue 15286732 13534414 0.885370006 1.938539785 165
DS65993 H3K27me3 F3 Rescue 15535854 13539026 0.871469698 1.924820884 171
DS65994 H3K4me1 A3 WT 13275202 11917610 0.897734739 1.869082811 172
DS65995 H3K4me1 B8 WT 13594488 12273226 0.902808991 1.909310559 168
DS65996 H3K4me1 C12 Rescue 12992438 11750824 0.904435642 1.87384306 167
DS65997 H3K4me1 E9 Rescue 10380306 9257664 0.891848853 1.749404601 163
DS65998 H3K4me1 F3 Rescue 14371786 13020186 0.905954625 1.938927754 164
DS65999 H3K4me2 A3 WT 17061756 15585638 0.913483817 2.581966808 164
DS66000 H3K4me2 B8 WT 12826120 11620962 0.906038771 2.14302396 158
DS66001 H3K4me2 C12 Rescue 20311292 18620400 0.916751135 2.760026637 157
DS66002 H3K4me2 E9 Rescue 18610562 17018368 0.914446753 3.227359991 152
DS66003 H3K4me2 F3 Rescue 20426588 18684266 0.914703229 2.832019197 160
DS66514 SMARCA4 A3 WT 21210270 15387936 0.725494584 43.00862702 111
DS66515 SMARCA4 B8 WT 19863482 13661520 0.687770654 54.78930602 101
DS66516 SMARCA4 C12 Rescue 24714828 20029624 0.810429431 22.73058146 88
DS66517 SMARCA4 E9 Rescue 24659906 20466306 0.82994258 15.15720521 89
DS66518 SMARCA4 F3 Rescue 25808664 21141410 0.819159411 20.08004197 87
DS66519 SMARCA2 A3 WT 22266478 18083030 0.812119007 20.18917184 92
DS66520 SMARCA2 B8 WT 24096138 20317310 0.843177027 23.36750288 101
DS66521 SMARCA2 C12 Rescue 25568084 18805740 0.735516201 33.93028937 71
DS66522 SMARCA2 E9 Rescue 24379774 19673216 0.806948251 23.72461117 90
DS66523 SMARCA2 F3 Rescue 31411738 24103992 0.767356203 8.826977706 112
Table S3: Metadata and library statistics of CUT&RUN experiments. Related to STAR methods
TF Peak-Number Accession-ID
SREBF1-human 3429 ENCFF624DDK
YY1-human 17078 ENCFF613DTQ
FOXA1-human 33874 ENCFF297HAX
POLR2AphosphoS2 6272 ENCFF156MIR
PHF8-human 17048 ENCFF907WHF
TCF12-human 31794 ENCFF228CDD
HDAC2-human 4167 ENCFF814DAF
EHMT2-human 2352 ENCFF199OOU
ETS1-human 9988 ENCFF896WFR
HES2-human 3242 ENCFF558XCJ
SREBF2-human 838 ENCFF483YCC
ELF1-human 11737 ENCFF935ZUW
CHD2-human 3440 ENCFF310IDS
MAZ-human 4323 ENCFF661NNJ
CBX8-human 2819 ENCFF330OCU
CEBPB-human 47003 ENCFF047UIF
CHD4-human 3542 ENCFF766YPH
RAD21-human 26062 ENCFF897QCA
CBX2-human 141 ENCFF208AXT
NR3C1-human 665 ENCFF963CGV
CTCF-human 43844 ENCFF535MZG
SIN3A-human 38272 ENCFF567BJI
FOSL2-human 33138 ENCFF808RWZ
TAF1-human 17246 ENCFF886KDK
JUND-human 21802 ENCFF587VEY
RFX5-human 6479 ENCFF179WDI
ZFP36-human 11426 ENCFF137JHO
ELK1-human 470 ENCFF605JXG
SP1-human 43742 ENCFF404OSB
RNF2-human 4801 ENCFF110EOX
KDM5A-human 5209 ENCFF149INM
MYC-human 9437 ENCFF542GMN
JUNB-human 11264 ENCFF565QYS
ZC3H11A-human 7858 ENCFF415SIS
SIX5-human 8553 ENCFF189NMX
ATF3-human 11014 ENCFF851UTY
REST-human 9886 ENCFF706DRE
USF2-human 11956 ENCFF593EOW
SMC3-human 23810 ENCFF256LDD
BCL3-human 12352 ENCFF093ZAB
GABPA-human 17425 ENCFF520GJC
POLR2A-human 31834 ENCFF664KTN
RCOR1-human 862 ENCFF993WZP
MAFK-human 70080 ENCFF813WJW
ESRRA-human 2631 ENCFF558UWY
KDM1A-human 7245 ENCFF316CBQ
EP300-human 4797 ENCFF727TYG
CREB1-human 3289 ENCFF576PUH
JUN-human 1767 ENCFF127HJG
ZBTB33-human 11349 ENCFF593ZJA
NFE2L2-human 7256 ENCFF418TUX
Table S4: ENCODE ChIP-Seq experiments used in the analysis of remodeled DHSs. Related to Figure 2.
OLS Regression Results
==============================================================================
Dep. Variable: RNA_LFR R-squared: 0.246
Model: OLS Adj. R-squared: 0.245
Method: Least Squares F-statistic: 414.9
Date: Tue, 23 Apr 2019 Prob (F-statistic): 0.00
Time: 20:57:35 Log-Likelihood: -13494.
No. Observations: 16544 AIC: 2.702e+04
Df Residuals: 16530 BIC: 2.712e+04
Df Model: 13
Covariance Type: nonrobust
=============================================================================================================
coef std err t P>|t| [0.025 0.975]
-------------------------------------------------------------------------------------------------------------
Intercept -0.0039 0.010 -0.383 0.702 -0.024 0.016
H3K4me3_Peak[T.True] 0.0530 0.018 3.019 0.003 0.019 0.087
H3K27me3_Peak[T.True] -0.0047 0.024 -0.196 0.845 -0.051 0.042
Bivalent[T.True] -0.0814 0.028 -2.925 0.003 -0.136 -0.027
Methylation_Fraction -0.0012 0.010 -0.119 0.905 -0.021 0.019
CpG_Island 0.0395 0.019 2.125 0.034 0.003 0.076
DHS_Score 0.5440 0.018 30.996 0.000 0.510 0.578
H3K4me3_Peak[T.True]:DHS_Score -0.0729 0.035 -2.096 0.036 -0.141 -0.005
H3K27me3_Peak[T.True]:DHS_Score 0.0538 0.040 1.360 0.174 -0.024 0.131
Bivalent[T.True]:DHS_Score 0.4889 0.047 10.387 0.000 0.397 0.581
Methylation_Fraction:DHS_Score -0.1010 0.020 -4.994 0.000 -0.141 -0.061
CpG_Island:DHS_Score -0.0478 0.057 -0.845 0.398 -0.159 0.063
Methylation_Fraction:CpG_Island 0.0463 0.019 2.416 0.016 0.009 0.084
Methylation_Fraction:CpG_Island:DHS_Score 0.0338 0.057 0.596 0.551 -0.077 0.145
==============================================================================
Omnibus: 2444.944 Durbin-Watson: 2.034
Prob(Omnibus): 0.000 Jarque-Bera (JB): 39894.936
Skew: -0.017 Prob(JB): 0.00
Kurtosis: 10.607 Cond. No. 32.9
==============================================================================
Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
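One way to read the interaction terms is to add up the coefficients that multiply DHS_Score for a given promoter state. The sketch below does this with the point estimates from the table above. It assumes that Methylation_Fraction is coded as -1, 0, or 1 as in the methods, that CpG_Island is 0 or 1, and that a bivalent promoter also has both histone peak indicators set. The function name is hypothetical, and the result ignores the standard errors.

```python
# Coefficients on DHS_Score and its interactions, copied from the OLS table.
COEF = {
    "DHS_Score": 0.5440,
    "H3K4me3_Peak:DHS_Score": -0.0729,
    "H3K27me3_Peak:DHS_Score": 0.0538,
    "Bivalent:DHS_Score": 0.4889,
    "Methylation_Fraction:DHS_Score": -0.1010,
    "CpG_Island:DHS_Score": -0.0478,
    "Methylation_Fraction:CpG_Island:DHS_Score": 0.0338,
}

def dhs_score_slope(h3k4me3=False, h3k27me3=False, bivalent=False,
                    methylation=0, cpg_island=0):
    """Expected change in RNA log fold change per unit of DHS_Score."""
    slope = COEF["DHS_Score"]
    slope += COEF["H3K4me3_Peak:DHS_Score"] * h3k4me3
    slope += COEF["H3K27me3_Peak:DHS_Score"] * h3k27me3
    slope += COEF["Bivalent:DHS_Score"] * bivalent
    slope += COEF["Methylation_Fraction:DHS_Score"] * methylation
    slope += COEF["CpG_Island:DHS_Score"] * cpg_island
    slope += COEF["Methylation_Fraction:CpG_Island:DHS_Score"] * methylation * cpg_island
    return slope
```

Under these assumptions, the slope for a promoter with no marks and medium methylation is 0.544, whereas a bivalent promoter (`h3k4me3=True, h3k27me3=True, bivalent=True`) has a slope of about 1.01, roughly twice as large.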