Global Regulatory DNA Potentiation by SMARCA4


Article

Global Regulatory DNA Potentiation by SMARCA4
Propagates to Selective Gene Expression Programs
via Domain-Level Remodeling
Authors
John E. Lazar, Sandra Stehling-Sun, Vivek Nandakumar, ..., Fyodor D. Urnov, Alister P.W. Funnell, John A. Stamatoyannopoulos

Correspondence
[email protected] (A.P.W.F.),
[email protected] (J.A.S.)

In Brief
Chromatin remodeling complexes
regulate developmental transitions and
are increasingly implicated in the
formation of cancer. Here, Lazar et al.
uncover how multiple aspects of
chromatin state and organization can
filter the global effects of the key
remodeling factor SMARCA4 toward
specific changes in expression of key
developmental regulators.

Highlights
• Reactivation of SMARCA4 globally potentiates regulatory DNA accessibility
• Promoter chromatin states gate the effects of distal chromatin remodeling
• Clustered SMARCA4 occupancy triggers broad remodeling of chromatin domains
• Domain remodeling particularly permits high-level expression of developmental genes

Lazar et al., 2020, Cell Reports 31, 107676
May 26, 2020 © 2020 The Authors.
https://doi.org/10.1016/j.celrep.2020.107676

John E. Lazar,1,2,3 Sandra Stehling-Sun,2,3 Vivek Nandakumar,2 Hao Wang,2 Daniel R. Chee,1,2 Nicholas P. Howard,2
Reyes Acosta,2 Douglass Dunn,2 Morgan Diegel,2 Fidencio Neri,2 Andres Castillo,2 Sean Ibarrientos,2 Kristen Lee,2
Ninnia Lescano,2 Ben Van Biber,2 Jemma Nelson,2 Jessica Halow,2 Richard Sandstrom,2 Daniel Bates,2 Fyodor D. Urnov,2
Alister P.W. Funnell,2,4,* and John A. Stamatoyannopoulos1,2,4,5,*
1Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
2Altius Institute for Biomedical Sciences, Seattle, WA 98121, USA
3These authors contributed equally
4Senior author
5Lead Contact

*Correspondence: [email protected] (A.P.W.F.), [email protected] (J.A.S.)



SUMMARY

The human genome encodes millions of regulatory elements, of which only a small fraction are active within a given cell type. Little is known about the global impact of chromatin remodelers on regulatory DNA landscapes and how this translates to gene expression. We use precision genome engineering to reawaken homozygously inactivated SMARCA4, a central ATPase of the human SWI/SNF chromatin remodeling complex, in lung adenocarcinoma cells. Here, we combine DNase I hypersensitivity, histone modification, and transcriptional profiling to show that SMARCA4 dramatically increases both the number and magnitude of accessible chromatin sites genome-wide, chiefly by unmasking sites of low regulatory factor occupancy. By contrast, transcriptional changes are concentrated within well-demarcated remodeling domains wherein expression of specific genes is gated by both distal element activation and promoter chromatin configuration. Our results provide a perspective on how global chromatin remodeling activity is translated to gene expression via regulatory DNA.

INTRODUCTION

Eukaryotic gene regulation involves the integrated action of sequence-specific transcription factors (TFs) and chromatin-modifying complexes to reorganize nucleosome-bound DNA into an active regulatory template (Kadonaga, 1998). While the role of localized binding of sequence-specific factors has been intensively studied, less is known about the genome-wide interplay between regulatory DNA actuation and specific chromatin remodelers.

The mammalian SWI/SNF (mSWI/SNF) family is a set of closely related chromatin-remodeling complexes, which interact with nucleosomes and other chromatin-modifying factors to activate promoter and enhancer elements in diverse regulatory contexts (Bao et al., 2015; Bossen et al., 2015; Hodges et al., 2018; John et al., 2008). These complexes include canonical BAF (BRG-/BRM-associated factor), ncBAF (non-canonical BAF), and PBAF (Polybromo-associated BAF) (Hodges et al., 2016; Mashtalir et al., 2018), hereinafter collectively referred to as "BAF complexes."

BAF subunits are frequently mutated in certain cancers and such mutations are thought to be driver events (Kadoch et al., 2013); yet how defective chromatin remodeling, per se, contributes to oncogenesis in these cases is unclear. BAF complexes regulate diverse sets of target genes across different developmental stages and cancer types; as such, modifying their activity can have heterogeneous effects (Hodges et al., 2016). BAF subunits commonly exhibit tumor suppressor function, yet these complexes also activate oncogenic regulatory programs in certain contexts; indeed, the same BAF subunit may act as a tumor suppressor or oncogene at different stages of cancer progression (Glaros et al., 2008; Roy et al., 2015; Sun et al., 2017). Such heterogeneous behavior has been linked to distinct sets of target genes in different cellular contexts. Therefore, a better understanding of how BAF target genes are specified is necessary to interpret their diverse behavior.

Target specificity of BAF complexes is caused, in part, by selective recruitment to regulatory elements either through cell-type-specific subunits (Ho and Crabtree, 2010; Kadoch and Crabtree, 2013) or TFs (Boulay et al., 2017; Kadam et al., 2000; Vierbuchen et al., 2017). However, in eukaryotes, the relationship between recruitment of regulatory factors and target gene expression is not straightforward. In humans, genome-wide assays identify thousands of binding sites for most TFs and

Cell Reports 31, 107676, May 26, 2020 © 2020 The Authors.
This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

chromatin remodelers (Dunham et al., 2012; Gerstein et al., 2012), yet these factors affect the expression of more restricted sets of target genes (Lin et al., 2009; Ramagopalan et al., 2010; Reddy et al., 2009). This phenomenon similarly holds for BAF complexes (Hu et al., 2011; McBride et al., 2018; Raab et al., 2015).

Various models might explain the discrepancy between factor recruitment and effects on expression, including small effects for low-affinity binding (Fisher et al., 2012; Tanay, 2006), synergistic relationships between clustered binding sites (Courey et al., 1989; Li et al., 2002; Parker et al., 2013; Whyte et al., 2013), and enhancer-promoter specificity (Butler and Kadonaga, 2001; Li and Noll, 1994; Zabidi et al., 2015). Experiments using inducible TFs, such as the glucocorticoid and hormone receptors, suggest that binding site strength (Vockley et al., 2016); clustered, codependent TF binding sites (Hakim et al., 2011; Reddy et al., 2009; Vockley et al., 2016); and activation of chromatin domains (Le Dily et al., 2014, 2019) all play a role in defining the link between factors' binding sites and gene regulation.

To investigate the relationship between the recruitment of a chromatin remodeler, resulting changes to chromatin state, and regulation of specific expression programs, we reactivated SMARCA4 (BRG1) expression in pulmonary adenocarcinoma (A549) cells harboring a well-characterized, homozygous SMARCA4 null mutation (Medina et al., 2008). SMARCA4, one of two mutually exclusive ATPase components of BAF, is commonly mutated in cancers including 5%–10% of lung adenocarcinoma samples (Medina et al., 2008; Collisson et al., 2014; Rodriguez-Nieto et al., 2011), and its loss causes changes in cellular morphology and increased tumorigenicity in lung adenocarcinoma models (Medina et al., 2005; Orvis et al., 2014). Here, we show that SMARCA4 rescue causes a global increase in chromatin accessibility but highly specific alterations to gene expression. The association between distal element activation and expression is influenced by regional chromatin architecture and the promoter state of the gene, demonstrating how a gene's genomic context can lead to heterogeneous interpretations of widespread chromatin reorganization.

RESULTS

Efficient Repair of the SMARCA4 Locus and Restoration of Expression
SMARCA4 is frequently mutated in non-small-cell lung cancers (Medina et al., 2008; Collisson et al., 2014; Rodriguez-Nieto et al., 2011); thus, we opted to use the A549 lung adenocarcinoma line as a physiologically pertinent model system to study SMARCA4 function. A549 cells are homozygous for a 23-bp deletion in exon 15 of SMARCA4, which introduces a premature stop codon (Medina et al., 2008) (Figure S1A). Since developmental roles of SMARCA4 are sensitive to gene dosage (Bultman et al., 2000), non-physiological levels of expression (whether introduced through transient transfection or stable insertion at a non-native locus) may not fully recapitulate key regulatory interactions. We, therefore, sought to repair the deletion to express SMARCA4 in its native regulatory context.

We generated a panel of Transcription Activator-Like Effector Nucleases (TALENs) targeted near the mutation and screened for their capacity to scarlessly correct SMARCA4 by homology-directed repair (HDR) (Figure 1A; Figures S1A and S1B). We used the lead TALENs to derive multiple clonal lines with homozygous correction of SMARCA4 (SMARCA4+/+) and arbitrarily selected three for phenotypic and regulatory profiling. For matched controls, we selected two lines that were subject to the same transfection and sorting process but which remained unedited at SMARCA4 (SMARCA4-/-). Genotypes were confirmed by Sanger sequencing (Figure S1C).

We next confirmed that SMARCA4 expression was restored in the three genetically corrected clones. In each case, the repair caused marked upregulation of full-length SMARCA4 mRNA expression to levels comparable to those of normal lung tissue (Figures 1B and 1C). SMARCA4 protein is also expressed in these clones (Figure 1D) and exhibits correct nuclear localization by immunofluorescence (IF) (Figures 1E and 1F). Phenotypically, SMARCA4+/+ clones have a decreased growth rate (Figure 1G), in line with SMARCA4's putative role as a tumor suppressor in lung adenocarcinoma and the involvement of BAF complexes in cell-cycle control (Nagl et al., 2005; Zhang et al., 2000).

SMARCA4 Reactivation Causes a Widespread Increase in Chromatin Accessibility at Pre-marked Sites
To profile the effects of SMARCA4 reactivation on chromatin, we assayed the SMARCA4+/+ and SMARCA4-/- clones for: chromatin accessibility, by DNase I sequencing (DNase I-seq); the histone marks H3K4me1, H3K4me2, and H3K27me3 by CUT&RUN (Skene and Henikoff, 2017); and the genomic occupancy of SMARCA4 and SMARCA2, the two interchangeable ATPase subunits of the BAF complex, by CUT&RUN (Figures 1A and 2A, Tables S1–S3). Data from different clones are highly reproducible, and there is a clear separation in all data types between SMARCA4-/- and SMARCA4+/+ samples (Figure S1D).

Since chromatin accessibility provides a focal measurement of multiple classes of regulatory activity (Gross and Garrard, 1988), we centered our analysis of regulatory changes around DNase I-hypersensitive sites (DHSs). Consistent with BAF's effects on chromatin in other systems (Bao et al., 2015; Bossen et al., 2015; Hodges et al., 2018; Kelso et al., 2017), we observed a widespread increase in chromatin accessibility in SMARCA4+/+ samples. We identified 32,689 sites with increased accessibility (STAR Methods), hereinafter termed "remodeled DHSs," compared to only 1,573 sites with decreased accessibility (Figure 2B). Strikingly, 23,861 (73%) of remodeled DHSs are novel DHSs not reproducibly detected in the SMARCA4-/- samples. The remodeled sites are primarily distal (non-promoter) and are enriched for epithelial- and fibroblast-specific DHSs (Thurman et al., 2012) (Figures S2A and S2B), analogous to the dependence of cell-type-specific enhancers on SMARCA4 observed in other systems (Alver et al., 2017; Attanasio et al., 2014).

The remodeled DHSs overlap SMARCA4 binding sites: 60% of remodeled DHSs overlap SMARCA4 CUT&RUN peaks in SMARCA4+/+ clones, and 87% have a SMARCA4 CUT&RUN signal greater than the median unaffected DHSs (Figure 2C; Figure S2C). Since BAF complexes contain either SMARCA4 or SMARCA2, we analyzed whether the sensitivity of the remodeled DHSs to SMARCA4 activity is due to preferential recruitment of


Figure 1. Restoration of SMARCA4 Expression through Corrective Editing of the Native Locus in A549 Lung Adenocarcinoma Cells
(A) Schematic of SMARCA4 rescue experiments. A549 cells, which harbor a homozygous, disruptive deletion in the SMARCA4 gene, were corrected by TALEN-mediated HDR. Edited pools were single-cell sorted, and clonal lines (SMARCA4+/+ and SMARCA4-/- matched controls) were profiled as indicated.
(B) RNA-seq data demonstrating restoration of SMARCA4 expression in SMARCA4+/+ clones.
(C) Comparison of restored SMARCA4 expression to normal lung tissue. Barplot (mean ± SEM) displays SMARCA4 expression in SMARCA4+/+ and SMARCA4-/- cells compared to normal lung tissue from TCGA (boxplot).
(D) Detection of restored SMARCA4 protein expression by western blot in untransfected (UT), two SMARCA4-/-, and three SMARCA4+/+ clones. K562 extracts are included as a positive control, and GAPDH is included as a loading control.
(E) SMARCA4 (green) and actin (red) IF imaging of SMARCA4-/- and SMARCA4+/+ clones. Scale bar, 5 µm.
(F) Image-based quantitation of SMARCA4 nuclear protein abundance in SMARCA4-/- and SMARCA4+/+ clones. Boxplots show distribution of SMARCA4 IF signal from 500 cells per clone.
(G) Growth curves for SMARCA4+/+ and SMARCA4-/- clones. Data represent mean ± SD of replicates (n = 3 for days 5–6; n = 2 for day 4).

one subunit over the other. For remodeled DHSs bound by either or both ATPases in SMARCA4+/+ samples, the majority are occupied by both (Figure 2C). Quantitatively, the SMARCA4:SMARCA2 ratio at remodeled DHSs that overlap a SMARCA4 peak is only modestly higher than at SMARCA4-bound, unaffected DHSs (Figure S2D). However, while SMARCA2 binds at most remodeled DHSs, this occupancy requires SMARCA4 activity: only 40% of SMARCA4-bound remodeled sites overlap a SMARCA2 peak in SMARCA4-/- cells (compared to 74% for SMARCA4-bound, unaffected sites), and SMARCA2 CUT&RUN signal increases specifically at the remodeled DHSs after SMARCA4 rescue (Figure S2D).

To identify TFs that might recruit BAF complexes to the remodeled DHSs, we performed de novo motif discovery and identified AP-1 (JUN/FOS), RUNX, and SP1 motifs as enriched in the sequences of remodeled DHSs (Figure S2E). AP-1 and SP1 factors are known to recruit BAF complexes (Ito et al., 2001; Kadam et al., 2000), and supporting the motif enrichment, chromatin immunoprecipitation sequencing (ChIP-seq) peaks of AP-1 members JUNB and FOSL2 have the highest overlap with the remodeled DHSs of the 51 TFs assayed by ENCODE in A549 cells (Figure 2D, Table S4) (Dunham et al., 2012; D'Ippolito et al., 2018). For JUNB sites overlapping SMARCA4 peaks, chromatin remodeling occurs preferentially at the subset of sites with low


chromatin accessibility in SMARCA4-/- cells. These weakly accessible sites contain low-occupancy AP-1 binding sites as measured by DNase I footprinting, a direct readout of factor occupancy (Figure 2E) (Vierstra and Stamatoyannopoulos, 2016). The sequences of the low-occupancy sites have, on average, weaker motif matches, suggesting that the low occupancy is due, in part, to weaker TF binding affinity at the sites. Upon restoration of SMARCA4, there is an increase in occupancy over the AP-1 motif at these sites to levels observed for the higher affinity motifs (Figure 2E). The phenomenon is not specific to AP-1, as we observed similar results for sites bound by both SMARCA4 and SP1 (Figure S2F). Taken together, these data suggest that SMARCA4 potentiates TF occupancy at co-bound sites and can markedly increase binding to weaker motifs through chromatin remodeling.

Since SMARCA4 rescue activates low-occupancy sites, we hypothesized that the novel DHSs that appear after SMARCA4 rescue would be pre-marked for regulatory activity in SMARCA4-/- cells. Compared to a background of DHSs active in unrelated cell types, we observed that a higher fraction of novel DHSs overlap TF ChIP-seq peaks for the 51 assayed TFs (Figure S2G). However, 70% of the novel DHSs are not bound by any of the 51 TFs assayed in A549 cells, demonstrating a high degree of co-dependence between TF binding and BAF activity. To extend the analysis, we inspected histone marks around remodeled DHSs. In SMARCA4-/- samples, remodeled DHSs are marked by levels of H3K4me1 similar to those of unaffected DHSs, despite lower chromatin accessibility (Figure 2F). After SMARCA4 rescue, H3K4me1 increases at the remodeled DHSs to levels higher than unaffected DHSs (Figure 2F), although the ratio of H3K4me1 to chromatin accessibility decreases toward the genomic average at DHSs (Figure 2G). We observed similar patterns for H3K4me2, but relative to H3K4me1, the H3K4me2 signal tracks closer to the levels expected from a region's accessibility (Figures 2F and 2G). Therefore, at remodeled sites, SMARCA4 rescue increases chromatin accessibility toward levels predicted by other measures of regulatory activity, and H3K4me1 specifically pre-marks the remodeled DHSs (Figure 2H). These findings are consistent with recent reports highlighting crosstalk between BAF and H3K4me1/MLL complexes: BAF preferentially binds to and remodels H3K4me1-marked nucleosomes (Local et al., 2018; Pan et al., 2019), and chromatin remodeling by BAF allows for recruitment of the H3K4 methyltransferase MLL to enhancers (Pan et al., 2019).

Overall, our results demonstrate that the chromatin accessibility landscape of A549 is highly dependent on SMARCA4 activity: after SMARCA4 rescue, the accessible compartment of the genome expands by 21%, as TF binding and enhancer-associated marks increase at latent, low-occupancy sites pre-marked by H3K4me1.

Promoter State Modifies the Association between Distal Element Activation and Gene Expression
We next compared the chromatin changes to the transcriptional response to understand the relationship between the widespread increase in accessibility at low-affinity binding sites and SMARCA4's effects on gene expression. The transcriptional effects of SMARCA4 binding appear conditional on chromatin remodeling: the genes closest to remodeled SMARCA4 binding sites are, on average, upregulated (Figure S3A), whereas genes near non-remodeled SMARCA4 sites do not exhibit increased expression, even for CUT&RUN peaks within 1 kb of a gene's transcription start site (TSS) (Figure 3A).

However, despite the global association between chromatin and expression, the changes in expression following SMARCA4 restoration are highly attenuated compared to the increase in chromatin accessibility (Figure 3B; Figure S3B). We sought to explore this discrepancy by analyzing how local changes in chromatin accessibility associate with gene expression. We linked all DHSs to their nearest TSS and found a dose-dependent

Figure 2. SMARCA4 Reactivation Causes a Widespread Increase in Chromatin Accessibility at Pre-marked Sites
(A) DNase I cleavage, SMARCA2/4 CUT&RUN, and histone modification CUT&RUN read density in SMARCA4-/- and SMARCA4+/+ cells for an example genomic region. Remodeled DHSs are highlighted in red. Gray stripes indicate aligned signal across the different assays.
(B) Scatterplot of DNase I cleavages at DHSs in SMARCA4-/- versus SMARCA4+/+ samples. Note the asymmetric increase in accessibility in the SMARCA4+/+ samples.
(C) Relationship between SMARCA2/4 CUT&RUN peaks and SMARCA4 remodeled DHSs. Left: Venn diagram of remodeled DHSs and SMARCA2/4 CUT&RUN
peaks in SMARCA4+/+ samples. Right: heatmap of SMARCA4 CUT&RUN signal in SMARCA4+/+ samples (top) and change in DNase I signal (bottom) at the
32,689 remodeled DHSs. Signal displayed is mean of values from three independent clones. DHSs are ordered by SMARCA4 CUT&RUN signal.
(D) Heatmaps display the overlap of remodeled DHSs and SMARCA4 CUT&RUN peaks with ChIP-seq peaks of TFs assayed by ENCODE in A549 cells. All
assayed TFs with >500 peaks are included and are ordered by the fraction of peaks overlapping remodeled DHSs.
(E) Top: aggregate cleavage profile (trimmed mean of middle 98% of log2 observed/expected values) around AP-1 motifs in remodeled and unchanging DHSs in SMARCA4-/- and SMARCA4+/+ samples. Motifs were selected from DHSs that overlap both a SMARCA4 CUT&RUN and JUNB ChIP-seq peak and are footprinted in either SMARCA4-/- or SMARCA4+/+ cells. Bottom: DHSs were ordered by motif footprint occupancy in SMARCA4-/- samples. From left to right, heatmaps display the average log fold change in DHS accessibility after SMARCA4 reactivation, DNase I cleavages around AP-1 motif in SMARCA4-/- cells, DNase I cleavages around AP-1 motif in SMARCA4+/+ cells, and AP-1 motif strength. Ordering of DHSs (rows) is consistent across heatmaps. To highlight the trend with footprint occupancy, for heatmaps displaying DHS log fold change and motif p values, DHSs were separated into 25 bins and average values for the bin are indicated.
(F) Relationship between SMARCA4 remodeled DHSs and enhancer-associated histone marks. DNase I cleavage versus H3K4me1 (left) and H3K4me2 (right)
CUT&RUN signal is plotted for SMARCA4 remodeled DHSs (red), unaffected DHSs (black), and a background of DHSs accessible in other cell types (tan). Solid
lines indicate values in SMARCA4-/- cells and dashed lines indicate values in SMARCA4+/+ cells. Scatterplots display median ± 25th–75th percentile signal for
each class of elements.
(G) Boxplots of the ratio of H3K4me1 and H3K4me2 signal to DNase I signal at remodeled DHSs. Ratios are plotted relative to the median histone
CUT&RUN:DNase I cleavage ratio in unaffected DHSs.
(H) Model of the relationship between H3K4me1/me2 signal around a DHS and SMARCA4 remodeling.
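
The overlap statistics reported for remodeled DHSs and CUT&RUN peaks (Figure 2C) reduce to counting sites that intersect at least one peak. The sketch below is only an illustration of that counting step, not the pipeline used in the study: the function names, the half-open (chrom, start, end) coordinate convention, and the toy coordinates are all assumptions introduced here.

```python
from bisect import bisect_right


def merge_by_chrom(intervals):
    """Merge overlapping or touching intervals, grouped by chromosome."""
    merged = {}
    for chrom, start, end in sorted(intervals):
        runs = merged.setdefault(chrom, [])
        if runs and start <= runs[-1][1]:
            runs[-1][1] = max(runs[-1][1], end)
        else:
            runs.append([start, end])
    return merged


def fraction_overlapping(sites, peaks):
    """Fraction of `sites` overlapping at least one interval in `peaks`.

    Intervals are half-open (chrom, start, end) tuples (an assumption).
    """
    merged = merge_by_chrom(peaks)
    # After merging, interval ends are strictly increasing per chromosome,
    # so one binary search per site finds the only candidate peak.
    ends = {chrom: [e for _, e in runs] for chrom, runs in merged.items()}
    hits = 0
    for chrom, start, end in sites:
        runs = merged.get(chrom)
        if not runs:
            continue
        i = bisect_right(ends[chrom], start)  # first peak ending after site start
        if i < len(runs) and runs[i][0] < end:
            hits += 1
    return hits / len(sites)


# Toy coordinates, invented for illustration only.
sites = [("chr1", 100, 250), ("chr1", 400, 550), ("chr1", 900, 1050),
         ("chr2", 50, 200), ("chr2", 500, 650)]
peaks = [("chr1", 200, 300), ("chr1", 280, 420),
         ("chr2", 0, 40), ("chr2", 640, 700)]

print(f"{fraction_overlapping(sites, peaks):.0%}")  # prints "60%"
```

Merging the peaks first keeps the lookup to a single binary search per site; for real data a dedicated interval library would be the more usual choice.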


Figure 3. Promoter State Modifies the Association between Distal Element Activation and Gene Expression
(A) Median change in expression for genes neighboring DHSs with increasing accessibility, SMARCA4 CUT&RUN peaks, or both. Error bars display 25th–75th
percentiles.


relationship between the number of activated DHSs near a gene and the average change in expression (Figure 3C). For instance, genes linked to 8 or more SMARCA4-activated DHSs display an average 1.84-fold increase in expression compared to an average 1.07-fold change for genes linked to a single activated DHS.

The higher average increase in expression of genes neighboring larger numbers of remodeled DHSs could be caused by: (1) dispersed control of gene expression, with many DHSs contributing modest effects; and/or (2) focal control, wherein a single DHS affects the gene's expression, and the higher the number of neighboring remodeled DHSs, the higher the probability of a regulatory impact. Under the focal model, the average increase in expression should be primarily driven by a change in the fraction of genes that are upregulated, whereas in a dispersed model, the average change should be driven more by an increase in the magnitude of upregulation. To distinguish between the two possibilities, we fit a mixture model to distinguish between the fraction of upregulated genes and the magnitude of expression change for upregulated genes. We found that the greater upregulation of genes neighboring a larger number of DHSs appears primarily due to an increase in the fraction of upregulated genes (Figure S3C). This is consistent with a focal model in which the attenuated changes in gene expression arise due to only a subset of remodeled DHSs (approximately 20%) driving changes in gene expression (Figures S3D and S3E). However, we also observed that the fraction of genes upregulated at higher numbers of remodeled DHSs is lower than expected based on a simple model of a constant fraction of DHSs affecting gene expression (Figure S3D). This saturation in the proportion of genes changing expression suggests that locus-wide effects outside of individual DHSs might play a role in whether a gene's expression responds to distal element activation.

We hypothesized that locus-specific effects might arise, in part, due to genes' promoter states modifying the effects of distal DHSs. To test the relationship between promoter state and the association between distal element remodeling and gene expression, we first fit a function to weight each distal element to best explain the changes in expression (STAR Methods). We then performed multiple linear regression, regressing genes' change in expression on the weighted change in chromatin accessibility, promoter state, and interaction terms between the two (STAR Methods). We characterized the promoter state for each gene by CpG island status, CpG methylation, overlap with H3K27me3 peaks, overlap with H3K4me3 peaks, and bivalency (both H3K4me3 and H3K27me3 peaks) in SMARCA4-/- A549s.

We observed strong interactions between the change in distal chromatin accessibility and promoter state (Table S5). In particular, promoter bivalency had positive interaction terms with the change in distal DHSs (Figure 3D), providing further rationale for the observed association between BAF activity and bivalent genes (Nakayama et al., 2017). In contrast, CpG methylation and H3K4me3 status had significant negative interactions with the DHS change (Figures 3D and 3E), with genes whose TSSs were marked solely by H3K4me3, in particular, showing limited response to activation of nearby distal DHSs (Figure 3D). Therefore, while SMARCA4 rescue activates a specific class of genes with repressed or poised promoters marked by H3K27me3, the distal remodeled DHSs appear to have moderated effects on genes with promoters that are already maximally active (only H3K4me3 signal) or stably repressed (high CpG methylation). For example, although the genes GALNT15, DLC1, and LAMC1 all show dramatic increases in distal DHSs near their TSS and in the gene body, they show distinct transcriptional responses to SMARCA4 that are associated with their specific promoter states prior to reactivation (Figure 3G). Importantly, the effect of promoters on gene expression is conditional on distal element activation, as the main effects of promoter states had mostly non-significant relationships with changes in gene expression (Figure 3F; Table S5). The different promoter states are, therefore, not merely marking classes of SMARCA4-sensitive genes but rather appear to alter the response to SMARCA4 reactivation by modifying the effect of distal element activation.

SMARCA4 Rescue Causes Regional Chromatin Remodeling
Since genes located near multiple remodeled DHSs showed the largest response to SMARCA4 rescue, we next investigated whether the high number of remodeled DHSs at certain genes reflected spatial clustering or merely a random distribution. Compared to a block bootstrapped background, remodeled DHSs show significant spatial clustering (Figure S4A). Therefore, we fit a hidden Markov model (HMM) to identify regions of SMARCA4 sensitivity (Figures 4A and 4B; STAR Methods) and discovered 202 remodeled regions with an average size of 842 kb. These regions contain 7.5% of identified DHSs and 19.5% of the remodeled DHSs.
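
To make the idea of segmenting the genome into SMARCA4-sensitive regions with an HMM more concrete, the following is a minimal two-state sketch. It is not the model described in STAR Methods: the Poisson emissions on per-bin counts of remodeled DHSs, the emission rates, the self-transition probability, and the toy counts are all assumptions chosen for illustration, and the parameters are fixed rather than fitted.

```python
import math


def poisson_logpmf(k, rate):
    """Log-probability of observing count k under a Poisson(rate)."""
    return k * math.log(rate) - rate - math.lgamma(k + 1)


def viterbi_two_state(counts, rates=(0.2, 3.0), stay=0.95):
    """Most likely state path (0 = background, 1 = remodeled region).

    `rates` are assumed mean counts of remodeled DHSs per bin in each
    state; `stay` is the assumed probability of remaining in a state.
    """
    log_stay, log_switch = math.log(stay), math.log(1.0 - stay)
    # Equal prior weight on both states for the first bin.
    score = [math.log(0.5) + poisson_logpmf(counts[0], r) for r in rates]
    backptr = []
    for k in counts[1:]:
        new_score, ptr = [], []
        for s in (0, 1):
            cand = [score[p] + (log_stay if p == s else log_switch)
                    for p in (0, 1)]
            best = 0 if cand[0] >= cand[1] else 1
            ptr.append(best)
            new_score.append(cand[best] + poisson_logpmf(k, rates[s]))
        score = new_score
        backptr.append(ptr)
    # Trace back from the best final state.
    state = 0 if score[0] >= score[1] else 1
    path = [state]
    for ptr in reversed(backptr):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path


def regions(path):
    """Collapse runs of state 1 into (first_bin, last_bin) tuples."""
    out, start = [], None
    for i, s in enumerate(path):
        if s == 1 and start is None:
            start = i
        elif s == 0 and start is not None:
            out.append((start, i - 1))
            start = None
    if start is not None:
        out.append((start, len(path) - 1))
    return out


# Toy per-bin counts of remodeled DHSs, invented for illustration.
counts = [0, 0, 1, 0, 4, 3, 5, 2, 4, 0, 0, 1, 0, 0]
path = viterbi_two_state(counts)
print(regions(path))  # prints [(4, 8)]
```

The high self-transition probability is what makes the segmentation regional: an isolated bin with one remodeled DHS (bin 2 or bin 11 above) is not enough to call a domain, whereas a sustained run of dense bins is.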

(B) Scatterplot of expression in SMARCA4-/- versus SMARCA4+/+ clones as measured by RNA-seq. Significantly changing genes are colored, and select
up-/downregulated genes are highlighted. Note the attenuated change in gene expression compared to chromatin accessibility (compare to Figure 2A).
(C) Mean change in expression at genes grouped by number of neighboring SMARCA4-responsive DHSs. Dark error bars indicate ±SEM; light error bars indicate
25th–75th percentile.
(D) Genes with bivalent promoters show greater changes in expression (RNA-seq) compared to genes with similar numbers of changing distal DHSs. Plotted is the
change in gene expression (mean ± SEM) versus the number of neighboring remodeled DHSs for different gene sets. Gene sets are grouped by ChIP-seq peaks
at the gene’s promoter. Inset pie chart indicates the fraction of genes in each set.
(E) Genes with highly methylated promoters show smaller changes in expression (RNA-seq) compared to genes with similar numbers of changing distal DHSs.
Plot as in (D), with gene sets grouped by promoter CpG methylation levels.
(F) Promoter effects on gene expression are dependent on distal element changes. Barplot shows the fraction of upregulated genes for genes linked to 0 remodeled DHSs grouped by promoter class. All genes linked to ≥3 remodeled DHSs are included for comparison.
(G) Example change in chromatin accessibility (DNase I cleavage profile, left) and gene expression (swarmplot displaying data from individual samples with
mean ± SEM colored, right) for three genes with similar changes in chromatin accessibility, different promoter states, and different changes in expression after
SMARCA4 reactivation.
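
The "simple model of a constant fraction of DHSs affecting gene expression" mentioned in the text can be illustrated with a short calculation. The sketch below assumes that each remodeled DHS independently drives its neighboring gene with the same probability; the independence assumption and the function name are introduced here for illustration, and the default of 0.2 simply reuses the approximate 20% figure quoted in the text.

```python
def expected_fraction_upregulated(n_dhs, p_effective=0.2):
    """P(at least one of n independent remodeled DHSs drives the gene)."""
    return 1.0 - (1.0 - p_effective) ** n_dhs


for n in (1, 2, 4, 8):
    print(n, round(expected_fraction_upregulated(n), 3))
# prints:
# 1 0.2
# 2 0.36
# 4 0.59
# 8 0.832
```

Under these assumptions the expected fraction of responding genes keeps rising with the number of linked DHSs, which is the baseline against which the observed saturation (Figure S3D) is judged.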


Figure 4. Regional Activation by SMARCA4 Aligns with Topological Domains


(A) Schematic of approach to identify regional chromatin remodeling.
(B) Example of locus identified as a regional change in chromatin accessibility.
(C) Background DNase I (left), H3K4me1 CUT&RUN (middle), and H3K27me3 CUT&RUN (right) signals around identified regions in SMARCA4-/- and SMARCA4+/+ clones. Top: lineplots of the aggregate (trimmed mean, middle 95%) score over all regions. To aggregate regions, the raw density for each region
was normalized by the mean signal across the region. Bottom: heatmaps of log fold change values for individual regions.
(D) Enrichment of CTCF and members of the cohesin complex at DHSs marking the boundaries of regional changes. Lineplots display overlap of DHSs with
different peak sets relative to the region boundaries. Values are normalized to the genome-wide average for the peak set.
(E) Heatmap of number of observed chromatin loops neighboring the regional changes in chromatin accessibility.
(F) Boxplots of change in expression of all genes, genes closest to any changing DHSs, genes with an increasing promoter DHS, genes located in a region of
increasing accessibility without a change in promoter accessibility, and genes located in a region of increasing accessibility with a change in promoter
accessibility.
(G) Biological process GO terms associated with protein coding genes found in the SMARCA4-sensitive regions were analyzed to identify enriched terms.
Heatmap displays enrichment of significant terms.


Within the remodeled domains, in addition to changes at focal sites of accessibility (i.e., remodeled DHSs), we also observed a relative increase in background accessibility to DNase I cleavage outside of DHSs (Figure 4C). This finding demonstrates that the chromatin changes caused by SMARCA4 reactivation are not restricted to individual DHSs but, instead, extend more broadly over the entire remodeled region. Consistent with a model in which SMARCA4 rescue activates the entire chromatin territory, levels of H3K27me3 decline and the enhancer-associated marks H3K4me1 and H3K4me2 increase specifically across the remodeled regions (Figure 4C; Figure S4B).

Previous work has proposed that extended regions of high regulatory activity, termed super-enhancers, regulate cell-type-specific genes and are preferentially sensitive to disruption of regulatory components (Whyte et al., 2013). Therefore, we compared the SMARCA4 remodeled regions to a variation on super-enhancers defined by clustered SMARCA4 binding. The remodeled regions do not correspond to SMARCA4 super-enhancers as identified by the ROSE algorithm (Lovén et al., 2013; Whyte et al., 2013) but, instead, reflect larger regions of chromatin activation (Figure S4C). However, the majority (74%) of these domains contain at least one SMARCA4 super-enhancer (Figure S4C), suggesting that a subset of SMARCA4-binding clusters may act as bona fide locus control regions enabling chromatin activation across a region. The domain scale chromatin activation may be a key functional consequence of clusters of SMARCA4 binding, as genes closest to the SMARCA4 super-enhancers falling inside remodeled regions show large changes in expression, while the genes closest to the super-enhancers falling outside of remodeled regions show minimal changes in expression (Figure S4C).

Activated Regions Align with Boundaries of Topological Domains

To understand the features demarcating the sharp boundaries of the remodeled domains, we next analyzed the elements at the borders of the activated regions. The boundary elements are enriched for binding sites of CTCF and members of the cohesin complex (SMC3 and RAD21) (Figure 4D). CTCF and cohesin interact to establish three-dimensional chromatin organization, including long-range chromatin interactions and topologically associating domains (TADs) (Ong and Corces, 2014; Rao et al., 2014). We, therefore, compared the regions of SMARCA4 sensitivity to TADs and Hi-C-defined DNA loops in A549 cells. We found a striking alignment between remodeled SMARCA4 regions and both TADs and DNA loops (Figure 4E; Figures S4D–S4I). Combined, these results suggest that, for a subset of loop domains, SMARCA4 reactivation enables chromatin remodeling across the entire domain.

To investigate the chromatin states that specify domain sensitivity to SMARCA4 activation, we compared the genomic features of TADs overlapping regional changes in accessibility to unaffected TADs. While BAF is thought to act, in part, through antagonism with polycomb repressive complexes (Kadoch and Crabtree, 2013; Tamkun et al., 1992), we do not observe a direct correspondence between H3K27me3 marked domains and SMARCA4 sensitivity. Instead, sensitive regions overlap with bivalent TADs that contain matched levels of H3K27me3 and marks associated with active chromatin, such as H3K4me3, in the SMARCA4-/- state (Figures S4J and S4K).

Despite identifying regional chromatin changes in an unsupervised manner agnostic to 3D chromatin conformation, we find that SMARCA4 reactivation causes not only increases in chromatin accessibility at the nucleosome (i.e., DHSs) level but also regional activation of chromatin at a subset of predefined loop domains.

Regional Chromatin Remodeling Is Associated with Induction of Developmental Regulators

The regions of increased accessibility are enriched for genes with the greatest response to SMARCA4 reactivation. Of the 220 significantly upregulated genes with log2 fold change > 2, 32% are located in the remodeled domains (compared to 12% of all significantly upregulated genes). Highlighting the large effect of regional chromatin remodeling on gene expression, genes found in the activated regions, but without any change in promoter accessibility, have changes in expression similar to those of genes outside these regions but with increased promoter accessibility (Figure 4F). Additionally, genes within the activated regions that also have increased promoter accessibility upon SMARCA4 rescue displayed further, marked upregulation (Figure 4F).

Developmentally regulated genes are proposed to exist in insulated regulatory neighborhoods (Dowen et al., 2014). Consistent with this hypothesis, we observe that genes encoding several lineage-specifying TFs and other key developmental regulators are located in the SMARCA4-sensitive domains. These genes include those encoding the RUNX TFs, the EMT regulator SNAI2, and the homeodomain TF PBX1 (Figures S4D and S4E). Other genes important for epithelial morphogenesis or cell-cell interaction, such as DLC1 and EDIL3, are also found in the activated domains. Supporting the link between domain level activation and developmental processes more generally, genes annotated with Gene Ontology (GO) terms such as "tissue development," "cell differentiation," and "extracellular matrix organization" are enriched in the SMARCA4-sensitive domains (Figure 4G). The enrichment of developmental regulators in epigenetically sensitive domains, therefore, links the largest effects of SMARCA4 on gene expression toward developmental regulatory programs, despite relatively non-specific genome-wide increases in chromatin accessibility.

Chromatin Context of Expression Programs Altered in SMARCA4 Null Patients

We next considered how the relationship between SMARCA4 regulation and chromatin states might affect SMARCA4’s targets in lung adenocarcinoma. We first identified a set of relevant SMARCA4 targets by selecting genes that are downregulated in The Cancer Genome Atlas (TCGA) lung adenocarcinoma SMARCA4 null samples. The expression of this gene set in A549s matches the expression in SMARCA4 null patient samples (Figure 5A; Figure S5A). Rescue of SMARCA4 in A549s partially reactivates the downregulated TCGA gene set (Figure 5B), consistent with previous work showing partial overlap between genes affected by SMARCA4 knockdown in lung adenocarcinoma cell lines and genes with decreased expression in


Figure 5. Chromatin Context of Expression Programs Altered in SMARCA4 Null Patients


(A) Schematic of partial upregulation of SMARCA4 null-associated genes after SMARCA4 rescue in A549 cells. Reactivated genes are defined as genes
repressed in SMARCA4 null tumor samples with a significant (false discovery rate [FDR] < 0.05) increase in expression in SMARCA4+/+ clones.
(B) Reactivated genes have lower levels of promoter methylation in A549 cells compared to stable genes (p < 0.01, Wilcoxon rank-sum test). Left: boxplots of the
change in expression of all genes up-/downregulated in SMARCA4 null patient samples. Right: boxplots indicate the fraction of CpGs methylated in promoters of
stable and reactivated genes.
(C) Reactivated genes show a bias toward positive associations with survival compared to all upregulated genes. Pie charts indicate the fraction of reactivated
(right) and all upregulated genes (left) with positive, negative, and no association with survival (Cox regression, uncorrected p < 0.05).
(D) Example of a reactivated gene: DNase I cleavage, H3K4me1 CUT&RUN, and H3K27me3 CUT&RUN profiles at the DLC1 locus in SMARCA4+/+ and
SMARCA4-/- samples. The identified remodeled domain is annotated (red bar).
(E) Example of a reactivated gene: DLC1 is the top upregulated gene and controls cell shape and motility by regulating Rho GTPase activity.
(F) Example of a reactivated gene: Kaplan-Meier curve indicates top and bottom tertiles of DLC1 expression in SMARCA4-WT tumors. DLC1 expression has a
significant positive association with patient survival (FDR = 0.001).
(G) Differential interference contrast (DIC) microscopy and IF (image as in Figure 1E) indicate changes in actin organization and cellular shape in SMARCA4+/+
cells. Scale bars, 5 μm.
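
The grouping and survival curve described for panel (F) can be illustrated with a short sketch. It is an assumption-laden example rather than the analysis used in the paper: the function names, the dictionary layout mapping sample IDs to expression values, the rank-based tertile split, and the toy inputs in the tests are all hypothetical. The estimator itself is the standard Kaplan-Meier product-limit formula.

```python
def tertile_groups(expression):
    """Return (bottom, top) tertiles of sample IDs ranked by expression.

    `expression` maps a sample ID to an expression value (hypothetical layout).
    """
    ranked = sorted(expression, key=expression.get)
    k = len(ranked) // 3  # samples per tertile, rounded down
    return ranked[:k], ranked[len(ranked) - k:]


def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct time with an observed event.

    times  -- follow-up duration per patient
    events -- 1 if the event (death) was observed, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve
```

Under these assumptions, one curve would be computed for the samples in each tertile and the two curves compared.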


Figure 6. Transcriptional Specificity Independent of Recruitment


Model of the relationship between SMARCA4 recruitment and effects on gene expression. (A) For isolated SMARCA4 binding sites, the effect of SMARCA4
binding on gene expression is modified by the TFs’ binding affinity at the regulatory element and the promoter state of the associated gene. (B) For dense clusters
of SMARCA4 binding, high-level changes in expression occur in the context of domain-wide chromatin remodeling, which occurs at a subset of domains that
have balanced levels of histone marks associated with active and repressed chromatin.

SMARCA4 null patient samples (Orvis et al., 2014). The reactivated genes (i.e., genes with decreased expression in SMARCA4 null patient samples and upregulated after SMARCA4 rescue) are enriched in SMARCA4 remodeled domains, with 26% of the reactivated genes lying within the remodeled regions compared to 12% of all significantly upregulated genes and 2% of all genes.

Since we found that promoter state modifies the effect of distal accessibility changes, we considered whether it might influence which genes are reactivated following SMARCA4 rescue. Reactivated genes have lower promoter methylation than stable genes (Figure 5B), but no significant difference in either H3K27me3 (Figure S5B) or bivalent marks (Fisher exact test, p = 0.50). Promoters of tumor suppressors are frequently methylated in cancers, so the diminished reactivation of genes with highly methylated promoters suggests that accumulated epigenetic changes in cancer cells could alter SMARCA4 targets at different stages of cancer development.

The reactivated genes are strong candidates for mediating SMARCA4’s role in lung adenocarcinoma and include genes in pathways involved in lung adenocarcinoma pathogenesis such as WNT (DAAM2, WLS, and TNIK) (Stewart, 2014) and receptor tyrosine kinase signaling (ROS1 and PTPRE) (Collisson et al., 2014) (Figure 5E). Compared to all genes upregulated in the SMARCA4+/+ clones, the 74 reactivated genes are biased for genes whose expression is associated with increased patient survival in wild-type (WT) SMARCA4 samples (χ² test, p < 0.01; Figure 5C). Reactivated genes are also associated with decreased cancer cell proliferation and invasiveness (Qian et al., 2007; Saintigny et al., 2012; Torrino et al., 2019), most notably DLC1, a known tumor suppressor (Yuan et al., 2004). DLC1 regulates actin organization (Sekimata et al., 1999), and restoration of its expression in cancer cells can induce apoptosis (Zhou et al., 2004), decrease migration and invasion (Goodison et al., 2005), and reduce cell growth (Yuan et al., 2004). DLC1 resides in a SMARCA4-sensitive chromatin domain (Figure 5D) and is the most highly upregulated gene after SMARCA4 rescue (Figures 3B and 3G). Its expression is low in SMARCA4 null patient samples (Figure S5C) and is positively associated with patient survival (Figure 5F). Interestingly, we observed changes in cell morphology after SMARCA4 reactivation (Figure 5G; Figure S5D), in line with previous literature linking mutations in SMARCA4 to changes in cell morphology and cytoskeletal disorganization (Wong et al., 2000), that match the description of cytoskeletal changes after ectopic expression of DLC1 in lung adenocarcinoma cells (Yuan et al., 2007).

Across multiple scales, from individual TF binding to chromatin domains, we observe a tight relationship between genes’ chromatin states and the transcriptional effects of SMARCA4 binding. This interaction between SMARCA4 and chromatin state adds transcriptional specificity to SMARCA4 perturbation beyond recruitment (Figure 6) and likely plays a role in defining the developmental and oncogenic expression


programs regulated by SMARCA4 in lung adenocarcinomas and other systems.

DISCUSSION

Mutations in chromatin remodelers such as SMARCA4 are increasingly implicated in cancer development as well as a range of developmental diseases. A primary challenge in understanding these complexes’ roles in human disease is understanding their targets in the relevant cell types and the mechanism by which they affect target gene expression. Here we leveraged a well-profiled model system to disentangle the relationship between SMARCA4’s chromatin remodeling activity and its effects on gene expression. Beyond specificity due to SMARCA4 recruitment, we observed that chromatin context plays a key role in shaping transcriptional regulation by SMARCA4 at multiple scales. At the level of regulatory elements, TF binding affinity influences the requirement for SMARCA4 activity to induce nucleosome remodeling. For individual genes, promoter chromatin states modify the effect of distal element activation. At the scale of chromatin domains, domains’ sensitivity to remodeling determines the transcriptional effects linked to SMARCA4 binding clusters.

SMARCA4 activation extensively increases chromatin accessibility at cell-type-specific regulatory elements (Bao et al., 2015; Bossen et al., 2015; Hodges et al., 2018; Kelso et al., 2017). SMARCA4’s specificity for these elements is only partially explained by targeted recruitment by sequence-specific TFs. Among SMARCA4 binding sites, chromatin remodeling occurs at sites containing low-affinity TF binding sites. Since low-affinity TF binding is believed to be a common feature of developmentally regulated enhancers (Crocker et al., 2015; Farley et al., 2015) and the transcriptional effects of SMARCA4 binding are conditional on chromatin remodeling, the heightened dependence of low-affinity binding sites on SMARCA4 may act to target SMARCA4’s effects toward developmental regulatory programs.

The relationship between remodeled regulatory elements and changes in gene expression suggests that a minority of distal regulatory elements with relatively large effect sizes cause changes in gene expression. Which subset of distal elements drives changes in gene expression is dependent not only on features intrinsic to the regulatory element but also on promoter gating, as promoters’ chromatin states modify the association between distal element activation and gene expression. Promoter gating of distal elements leads to heterogeneous transcriptional responses to chromatin remodeling and attenuates the effects of the global chromatin remodeling on gene expression. The attenuation of the transcriptional response can, notably, be observed at genes with promoters marked solely by H3K4me3, which show minimal change in expression even when neighboring multiple activated DHSs. We interpret the distinct behavior of H3K4me3-marked promoters to suggest that promoters have an intrinsic maximum level of expression and that additional recruitment of regulatory factors may have minimal effects on expression once that level is reached. For the majority (51%) of protein-coding genes in A549, the most accessible TSS is marked solely by H3K4me3. Further increases in expression of these genes might, therefore, require activation of an alternative promoter. Consistent with this hypothesis, we observe greater chromatin remodeling at alternative TSSs, and genes with annotated alternative promoters are more likely to be upregulated after SMARCA4 reactivation (Figures S3F and S3G). While alternative TSSs introduce isoform diversity, they may also have a function in allowing for graded levels of gene expression through sequential recruitment.

Mutations in chromatin remodelers, including BAF subunits, are frequently linked to dysregulation of developmental expression programs. Our results suggest that these targeted effects may result from the atypical chromatin domain structure of key developmental regulators. Here, in a lung adenocarcinoma model, SMARCA4 rescue upregulates developmental genes annotated with GO terms such as epithelial cell differentiation, regulation of cell morphogenesis, and positive regulation of cell development (Figure S5D). These genes are found in isolated chromatin domains that undergo domain-wide chromatin remodeling after SMARCA4 reactivation. Since many developmental regulators reside in isolated, gene-poor domains, the accessibility of their domains is independent of the regulation of other genes and, therefore, may be preferentially susceptible to a single regulatory perturbation. The regions susceptible to SMARCA4 reactivation contain clustered SMARCA4 binding sites, which may act as locus control regions to control the chromatin state of the entire domain. Notably, genes near clustered binding sites not associated with domain scale remodeling have minimal change in expression, suggesting that the critical role for clustered binding may be to provide sufficient concentration of regulatory activity to increase chromatin accessibility across an entire domain.

Perturbations to general regulatory factors cause highly specific transcriptional responses, frequently preferentially affecting developmental and oncogenic programs, despite limited targeting specificity. Our results demonstrate how global changes to regulatory activity can be interpreted in highly heterogeneous ways. Locus-specific chromatin features shape the selection of a regulatory factor’s target genes and introduce transcriptional specificity to perturbations of the basal regulatory machinery.

STAR★METHODS

Detailed methods are provided in the online version of this paper and include the following:

• KEY RESOURCES TABLE
• RESOURCE AVAILABILITY
  ◦ Lead Contact
  ◦ Materials Availability
  ◦ Data and Code Availability
• EXPERIMENTAL MODEL AND SUBJECT DETAILS
• METHOD DETAILS
  ◦ Synthesis of TALEN constructs
  ◦ Transfections
  ◦ Assessment of HDR and Indel Rates
  ◦ Clonal Isolation
  ◦ Western Blotting


  ◦ Immunofluorescence imaging and analysis
  ◦ DNaseI-Seq
  ◦ RNA-Seq
  ◦ CUT&RUN
  ◦ ENCODE Datasets
  ◦ TCGA Data
• QUANTIFICATION AND STATISTICAL ANALYSIS
  ◦ DNaseI-Seq Data Analysis
  ◦ RNA-Seq Data Analysis
  ◦ CUT&RUN Data Analysis
  ◦ Comparison of DNaseI-seq and RNA-Seq data
  ◦ Regional Changes in Chromatin Accessibility
  ◦ GO Terms

SUPPLEMENTAL INFORMATION

Supplemental Information can be found online at https://doi.org/10.1016/j.celrep.2020.107676.

ACKNOWLEDGMENTS

This work was supported by National Institutes of Health (NIH) grants UM1HG 009444 and U54 HG007010 to J.A.S. and by a charitable financial contribution from GlaxoSmithKline.

AUTHOR CONTRIBUTIONS

J.E.L., A.P.W.F., F.D.U., and J.A.S. conceived the project. J.E.L. conceived and performed the analysis. S.S.-S., N.P.H., R.A., N.L., B.V.B., and A.P.W.F. generated and assayed the SMARCA4+/+ clones. H.W. performed CUT&RUN experiments. V.N. performed IF. D.D., M.D., F.N., A.C., S.I., K.L., J.H., and D.B. performed DNase I- and RNA-seq. S.S.-S., A.P.W.F., D.R.C., J.N., and R.S. performed analysis. J.E.L., A.P.W.F., S.S.-S., V.N., H.W., and J.A.S. wrote the manuscript. A.P.W.F., F.D.U., and J.A.S. supervised the project.

DECLARATION OF INTERESTS

All authors are employees of the not-for-profit Altius Institute for Biomedical Sciences.

Received: July 31, 2019
Revised: December 23, 2019
Accepted: April 30, 2020
Published: May 26, 2020

REFERENCES

Alver, B.H., Kim, K.H., Lu, P., Wang, X., Manchester, H.E., Wang, W., Haswell, J.R., Park, P.J., and Roberts, C.W.M. (2017). The SWI/SNF chromatin remodelling complex is required for maintenance of lineage specific enhancers. Nat. Commun. 8, 14648.
Attanasio, C., Nord, A.S., Zhu, Y., Blow, M.J., Biddie, S.C., Mendenhall, E.M., Dixon, J., Wright, C., Hosseini, R., Akiyama, J.A., et al. (2014). Tissue-specific SMARCA4 binding at active and repressed regulatory elements during embryogenesis. Genome Res. 24, 920–929.
Bao, X., Rubin, A.J., Qu, K., Zhang, J., Giresi, P.G., Chang, H.Y., and Khavari, P.A. (2015). A novel ATAC-seq approach reveals lineage-specific reinforcement of the open chromatin landscape via cooperation between BAF and p63. Genome Biol. 16, 284.
Bossen, C., Murre, C.S., Chang, A.N., Mansson, R., Rodewald, H.-R., and Murre, C. (2015). The chromatin remodeler Brg1 activates enhancer repertoires to establish B cell identity and modulate cell growth. Nat. Immunol. 16, 775–784.
Boulay, G., Sandoval, G.J., Riggi, N., Iyer, S., Buisson, R., Naigles, B., Awad, M.E., Rengarajan, S., Volorio, A., McBride, M.J., et al. (2017). Cancer-Specific Retargeting of BAF Complexes by a Prion-like Domain. Cell 171, 163–178.e19.
Bultman, S., Gebuhr, T., Yee, D., La Mantia, C., Nicholson, J., Gilliam, A., Randazzo, F., Metzger, D., Chambon, P., Crabtree, G., and Magnuson, T. (2000). A Brg1 null mutation in the mouse reveals functional differences among mammalian SWI/SNF complexes. Mol. Cell 6, 1287–1295.
Butler, J.E.F., and Kadonaga, J.T. (2001). Enhancer-promoter specificity mediated by DPE or TATA core promoter motifs. Genes Dev. 15, 2515–2519.
Cermak, T., Doyle, E.L., Christian, M., Wang, L., Zhang, Y., Schmidt, C., Baller, J.A., Somia, N.V., Bogdanove, A.J., and Voytas, D.F. (2011). Efficient design and assembly of custom TALEN and other TAL effector-based constructs for DNA targeting. Nucleic Acids Res. 39, e82.
Collisson, E.A., Campbell, J.D., Brooks, A.N., Berger, A.H., Lee, W., Chmielecki, J., Beer, D.G., Cope, L., Creighton, C.J., et al.; Cancer Genome Atlas Research Network (2014). Comprehensive molecular profiling of lung adenocarcinoma. Nature 511, 543–550.
Courey, A.J., Holtzman, D.A., Jackson, S.P., and Tjian, R. (1989). Synergistic activation by the glutamine-rich domains of human transcription factor Sp1. Cell 59, 827–836.
Crocker, J., Abe, N., Rinaldi, L., McGregor, A.P., Frankel, N., Wang, S., Alsawadi, A., Valenti, P., Plaza, S., Payre, F., et al. (2015). Low affinity binding site clusters confer hox specificity and regulatory robustness. Cell 160, 191–203.
D’Ippolito, A.M., McDowell, I.C., Barrera, A., Hong, L.K., Leichter, S.M., Bartelt, L.C., Vockley, C.M., Majoros, W.H., Safi, A., Song, L., et al. (2018). Pre-established Chromatin Interactions Mediate the Genomic Response to Glucocorticoids. Cell Syst. 7, 146–160.e7.
Dobin, A., Davis, C.A., Schlesinger, F., Drenkow, J., Zaleski, C., Jha, S., Batut, P., Chaisson, M., and Gingeras, T.R. (2013). STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21.
Dowen, J.M., Fan, Z.P., Hnisz, D., Ren, G., Abraham, B.J., Zhang, L.N., Weintraub, A.S., Schujiers, J., Lee, T.I., Zhao, K., and Young, R.A. (2014). Control of cell identity genes occurs in insulated neighborhoods in mammalian chromosomes. Cell 159, 374–387.
Dunham, I., Kundaje, A., Aldred, S.F., Collins, P.J., Davis, C.A., Doyle, F., Epstein, C.B., Frietze, S., Harrow, J., et al.; ENCODE Project Consortium (2012). An integrated encyclopedia of DNA elements in the human genome. Nature 489, 57–74.
Farley, E.K., Olson, K.M., Zhang, W., Brandt, A.J., Rokhsar, D.S., and Levine, M.S. (2015). Suboptimization of developmental enhancers. Science 350, 325–328.
Fisher, W.W., Li, J.J., Hammonds, A.S., Brown, J.B., Pfeiffer, B.D., Weiszmann, R., MacArthur, S., Thomas, S., Stamatoyannopoulos, J.A., Eisen, M.B., et al. (2012). DNA regions bound at low occupancy by transcription factors do not drive patterned reporter gene expression in Drosophila. Proc. Natl. Acad. Sci. USA 109, 21330–21335.
Gerstein, M.B., Kundaje, A., Hariharan, M., Landt, S.G., Yan, K.-K., Cheng, C., Mu, X.J., Khurana, E., Rozowsky, J., Alexander, R., et al. (2012). Architecture of the human regulatory network derived from ENCODE data. Nature 489, 91–100.
Glaros, S., Cirrincione, G.M., Palanca, A., Metzger, D., and Reisman, D. (2008). Targeted Knockout of BRG1 Potentiates Lung Cancer Development. Cancer Res. 68, 3689–3696.
Goodison, S., Yuan, J., Sloan, D., Kim, R., Li, C., Popescu, N.C., and Urquidi, V. (2005). The RhoGAP protein DLC-1 functions as a metastasis suppressor in breast cancer cells. Cancer Res. 65, 6042–6053.
Gross, D.S., and Garrard, W.T. (1988). Nuclease hypersensitive sites in chromatin. Annu. Rev. Biochem. 57, 159–197.
Hakim, O., Sung, M.-H., Voss, T.C., Splinter, E., John, S., Sabo, P.J., Thurman, R.E., Stamatoyannopoulos, J.A., de Laat, W., and Hager, G.L. (2011). Diverse gene reprogramming events occur in the same spatial clusters of distal regulatory elements. Genome Res. 21, 697–706.

Hesselberth, J.R., Chen, X., Zhang, Z., Sabo, P.J., Sandstrom, R., Reynolds, Lin, B., Wang, J., Hong, X., Yan, X., Hwang, D., Cho, J.-H., Yi, D., Utleg, A.G.,
A.P., Thurman, R.E., Neph, S., Kuehn, M.S., Noble, W.S., et al. (2009). Global Fang, X., Schones, D.E., et al. (2009). Integrated expression profiling and ChIP-
mapping of protein-DNA interactions in vivo by digital genomic footprinting. seq analyses of the growth inhibition response program of the androgen re-
Nat. Methods 6, 283–289. ceptor. PLoS ONE 4, e6589.
Ho, L., and Crabtree, G.R. (2010). Chromatin remodelling during development. Local, A., Huang, H., Albuquerque, C.P., Singh, N., Lee, A.Y., Wang, W., Wang,
Nature 463, 474–484. C., Hsia, J.E., Shiau, A.K., Ge, K., et al. (2018). Identification of H3K4me1-
Hodges, C., Kirkland, J.G., and Crabtree, G.R. (2016). The Many Roles of BAF associated proteins at mammalian enhancers. Nat. Genet. 50, 73–82.
(mSWI/SNF) and PBAF Complexes in Cancer. Cold Spring Harb. Perspect. Love, M.I., Huber, W., and Anders, S. (2014). Moderated estimation of fold
Med. 6, a026930. change and dispersion for RNA-Seq data with DESeq2. Genome Biol. 15, 550.
Hodges, H.C., Stanton, B.Z., Cermakova, K., Chang, C.-Y., Miller, E.L., Lovén, J., Hoke, H.A., Lin, C.Y., Lau, A., Orlando, D.A., Vakoc, C.R., Bradner,
Kirkland, J.G., Ku, W.L., Veverka, V., Zhao, K., and Crabtree, G.R. (2018). J.E., Lee, T.I., and Young, R.A. (2013). Selective inhibition of tumor oncogenes
Dominant-negative SMARCA4 mutants alter the accessibility landscape of tis- by disruption of super-enhancers. Cell 153, 320–334.
sue-unrestricted enhancers. Nat. Struct. Mol. Biol. 25, 61–72. Mashtalir, N., D’Avino, A.R., Michel, B.C., Luo, J., Pan, J., Otto, J.E., Zullow,
Hu, G., Schones, D.E., Cui, K., Ybarra, R., Northrup, D., Tang, Q., Gattinoni, L., H.J., McKenzie, Z.M., Kubiak, R.L., St Pierre, R., et al. (2018). Modular Orga-
Restifo, N.P., Huang, S., and Zhao, K. (2011). Regulation of nucleosome land- nization and Assembly of SWI/SNF Family Chromatin Remodeling Complexes.
scape and transcription factor targeting at tissue-specific enhancers by BRG1. Cell 175, 1272–1288.e20.
Genome Res. 21, 1650–1658. McBride, M.J., Pulice, J.L., Beird, H.C., Ingram, D.R., D’Avino, A.R., Shern,
Ito, T., Yamauchi, M., Nishina, M., Yamamichi, N., Mizutani, T., Ui, M., Mura- J.F., Charville, G.W., Hornick, J.L., Nakayama, R.T., Garcia-Rivera, E.M.,
kami, M., and Iba, H. (2001). Identification of SWI.SNF complex subunit et al. (2018). The SS18-SSX Fusion Oncoprotein Hijacks BAF Complex Target-
BAF60a as a determinant of the transactivation potential of Fos/Jun dimers. ing and Function to Drive Synovial Sarcoma. Cancer Cell 33, 1128–1141.e7.
J. Biol. Chem. 276, 2852–2857. Medina, P.P., Carretero, J., Ballestar, E., Angulo, B., Lopez-Rios, F., Esteller,
John, S., Sabo, P.J., Johnson, T.A., Sung, M.-H., Biddie, S.C., Lightman, S.L., M., and Sanchez-Cespedes, M. (2005). Transcriptional targets of the chro-
Voss, T.C., Davis, S.R., Meltzer, P.S., Stamatoyannopoulos, J.A., and Hager, matin-remodelling factor SMARCA4/BRG1 in lung cancer cells. Hum. Mol.
G.L. (2008). Interaction of the glucocorticoid receptor with the chromatin land- Genet. 14, 973–982.
scape. Mol. Cell 29, 611–624. Medina, P.P., Romero, O.A., Kohno, T., Montuenga, L.M., Pio, R., Yokota, J.,
John, S., Sabo, P.J., Thurman, R.E., Sung, M.-H., Biddie, S.C., Johnson, T.A., and Sanchez-Cespedes, M. (2008). Frequent BRG1/SMARCA4-inactivating
Hager, G.L., and Stamatoyannopoulos, J.A. (2011). Chromatin accessibility mutations in human lung cancer cell lines. Hum. Mutat. 29, 617–622.
pre-determines glucocorticoid receptor binding patterns. Nat. Genet. 43, Mi, H., Huang, X., Muruganujan, A., Tang, H., Mills, C., Kang, D., and Thomas,
264–268. P.D. (2017). PANTHER version 11: expanded annotation data from Gene
Kadam, S., McAlpine, G.S., Phelan, M.L., Kingston, R.E., Jones, K.A., and Ontology and Reactome pathways, and data analysis tool enhancements. Nu-
Emerson, B.M. (2000). Functional selectivity of recombinant mammalian cleic Acids Res. 45 (D1), D183–D189.
SWI/SNF subunits. Genes Dev. 14, 2441–2451. Morrow, J.J., Bayles, I., Funnell, A.P.W., Miller, T.E., Saiakhova, A., Lizardo,
M.M., Bartels, C.F., Kapteijn, M.Y., Hung, S., Mendoza, A., et al. (2018). Posi-
Kadoch, C., and Crabtree, G.R. (2013). Reversible disruption of mSWI/SNF
tively selected enhancer elements endow osteosarcoma cells with metastatic
(BAF) complexes by the SS18-SSX oncogenic fusion in synovial sarcoma.
competence. Nat. Med. 24, 176–185.
Cell 153, 71–85.
Nagl, N.G., Patsialou, A., Haines, D.S., Dallas, P.B., Beck, G.R., and Moran, E.
Kadoch, C., Hargreaves, D.C., Hodges, C., Elias, L., Ho, L., Ranish, J., and
(2005). The p270 (ARID1A/SMARCF1) Subunit of Mammalian SWI/SNF-
Crabtree, G.R. (2013). Proteomic and bioinformatic analysis of mammalian
Related Complexes Is Essential for Normal Cell Cycle Arrest. Cancer Res.
SWI/SNF complexes identifies extensive roles in human malignancy. Nat.
65, 9236–9244.
Genet. 45, 592–601.
Nakayama, R.T., Pulice, J.L., Valencia, A.M., McBride, M.J., McKenzie, Z.M.,
Kadonaga, J.T. (1998). Eukaryotic transcription: an interlaced network of tran-
Gillespie, M.A., Ku, W.L., Teng, M., Cui, K., Williams, R.T., et al. (2017).
scription factors and chromatin-modifying machines. Cell 92, 307–313.
SMARCB1 is required for widespread BAF complex-mediated activation of
Kelso, T.W.R., Porter, D.K., Amaral, M.L., Shokhirev, M.N., Benner, C., and enhancers and bivalent promoters. Nat. Genet. 49, 1613–1623.
Hargreaves, D.C. (2017). Chromatin accessibility underlies synthetic lethality
Neph, S., Kuehn, M.S., Reynolds, A.P., Haugen, E., Thurman, R.E., Johnson,
of SWI/SNF subunits in ARID1A-mutant cancers. eLife 6, e30506.
A.K., Rynes, E., Maurano, M.T., Vierstra, J., Thomas, S., et al. (2012). BEDOPS:
high-performance genomic feature operations. Bioinformatics 28, 1919–1920.
Le Dily, F., Baù, D., Pohl, A., Vicent, G.P., Serra, F., Soronellas, D., Castellano, G., Wright, R.H.G., Ballare, C., Filion, G., et al. (2014). Distinct structural transitions of chromatin topological domains correlate with coordinated hormone-induced gene regulation. Genes Dev. 28, 2151–2162.
Le Dily, F., Vidal, E., Cuartero, Y., Quilez, J., Nacht, A.S., Vicent, G.P., Carbonell-Caballero, J., Sharma, P., Villanueva-Cañas, J.L., Ferrari, R., et al. (2019). Hormone-control regions mediate steroid receptor-dependent genome organization. Genome Res. 29, 29–39.
Li, H., and Durbin, R. (2009). Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25, 1754–1760.
Li, X., and Noll, M. (1994). Compatibility between enhancers and promoters determines the transcriptional specificity of gooseberry and gooseberry neuro in the Drosophila embryo. EMBO J. 13, 400–406.
Li, Q., Peterson, K.R., Fang, X., and Stamatoyannopoulos, G. (2002). Locus control regions. Blood 100, 3077–3086.
Liao, Y., Smyth, G.K., and Shi, W. (2014). featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics 30, 923–930.
Ong, C.-T., and Corces, V.G. (2014). CTCF: an architectural protein bridging genome topology and function. Nat. Rev. Genet. 15, 234–246.
Orvis, T., Hepperla, A., Walter, V., Song, S., Simon, J., Parker, J., Wilkerson, M.D., Desai, N., Major, M.B., Hayes, D.N., et al. (2014). BRG1/SMARCA4 Inactivation Promotes Non–Small Cell Lung Cancer Aggressiveness by Altering Chromatin Organization. Cancer Res. 74, 6486–6498.
Pan, J., McKenzie, Z.M., D’Avino, A.R., Mashtalir, N., Lareau, C.A., St Pierre, R., Wang, L., Shilatifard, A., and Kadoch, C. (2019). The ATPase module of mammalian SWI/SNF family complexes mediates subcomplex identity and catalytic activity-independent genomic targeting. Nat. Genet. 51, 618–626.
Parker, S.C.J., Stitzel, M.L., Taylor, D.L., Orozco, J.M., Erdos, M.R., Akiyama, J.A., van Bueren, K.L., Chines, P.S., Narisu, N., Black, B.L., et al. (2013). Chromatin stretch enhancer states drive cell-specific gene regulation and harbor human disease risk variants. Proc. Natl. Acad. Sci. U S A 110, 17921–17926.
Qian, X., Li, G., Asmussen, H.K., Asnaghi, L., Vass, W.C., Braverman, R., Yamada, K.M., Popescu, N.C., Papageorge, A.G., and Lowy, D.R. (2007). Oncogenic inhibition by a deleted in liver cancer gene requires cooperation between


tensin binding and Rho-specific GTPase-activating protein activities. Proc. Natl. Acad. Sci. U S A 104, 9012–9017.
Raab, J.R., Resnick, S., and Magnuson, T. (2015). Genome-Wide Transcriptional Regulation Mediated by Biochemically Distinct SWI/SNF Complexes. PLoS Genet. 11, e1005748.
Ramagopalan, S.V., Heger, A., Berlanga, A.J., Maugeri, N.J., Lincoln, M.R., Burrell, A., Handunnetthi, L., Handel, A.E., Disanto, G., Orton, S.-M., et al. (2010). A ChIP-seq defined genome-wide map of vitamin D receptor binding: associations with disease and evolution. Genome Res. 20, 1352–1360.
Rao, S.S.P., Huntley, M.H., Durand, N.C., Stamenova, E.K., Bochkov, I.D., Robinson, J.T., Sanborn, A.L., Machol, I., Omer, A.D., Lander, E.S., and Aiden, E.L. (2014). A 3D map of the human genome at kilobase resolution reveals principles of chromatin looping. Cell 159, 1665–1680.
Reddy, T.E., Pauli, F., Sprouse, R.O., Neff, N.F., Newberry, K.M., Garabedian, M.J., and Myers, R.M. (2009). Genomic determination of the glucocorticoid response reveals unexpected mechanisms of gene regulation. Genome Res. 19, 2163–2171.
Ripley, B.D. (1976). The Second-Order Analysis of Stationary Point Processes. J. Appl. Prob. 13, 255–266.
Rodriguez-Nieto, S., Cañada, A., Pros, E., Pinto, A.I., Torres-Lanzas, J., Lopez-Rios, F., Sanchez-Verde, L., Pisano, D.G., and Sanchez-Cespedes, M. (2011). Massive parallel DNA pyrosequencing analysis of the tumor suppressor BRG1/SMARCA4 in lung primary tumors. Hum. Mutat. 32, E1999–E2017.
Roy, N., Malik, S., Villanueva, K.E., Urano, A., Lu, X., Von Figura, G., Seeley, E.S., Dawson, D.W., Collisson, E.A., and Hebrok, M. (2015). Brg1 promotes both tumor-suppressive and oncogenic activities at distinct stages of pancreatic cancer formation. Genes Dev. 29, 658–671.
Saintigny, P., Peng, S., Zhang, L., Sen, B., Wistuba, I.I., Lippman, S.M., Girard, L., Minna, J.D., Heymach, J.V., and Johnson, F.M. (2012). Global evaluation of Eph receptors and ephrins in lung adenocarcinomas identifies EphA4 as an inhibitor of cell migration and invasion. Mol. Cancer Ther. 11, 2021–2032.
Sakuma, T., Hosoi, S., Woltjen, K., Suzuki, K., Kashiwagi, K., Wada, H., Ochiai, H., Miyamoto, T., Kawai, N., Sasakura, Y., et al. (2013). Efficient TALEN construction and evaluation methods for human cell and animal applications. Genes Cells 18, 315–326.
Sekimata, M., Kabuyama, Y., Emori, Y., and Homma, Y. (1999). Morphological changes and detachment of adherent cells induced by p122, a GTPase-activating protein for Rho. J. Biol. Chem. 274, 17757–17762.
Skene, P.J., and Henikoff, S. (2017). An efficient targeted nuclease strategy for high-resolution mapping of DNA binding sites. eLife 6, e21856.
Skene, P.J., Henikoff, J.G., and Henikoff, S. (2018). Targeted in situ genome-wide profiling with high efficiency for low cell numbers. Nat. Protoc. 13, 1006–1019.
Stewart, D.J. (2014). Wnt Signaling Pathway in Non–Small Cell Lung Cancer. J. Natl. Cancer Inst. 106, djt356.
Sun, X., Wang, S.C., Wei, Y., Luo, X., Jia, Y., Li, L., Gopal, P., Zhu, M., Nassour, I., Chuang, J.-C., et al. (2017). Arid1a Has Context-Dependent Oncogenic and Tumor Suppressor Functions in Liver Cancer. Cancer Cell 32, 574–589.e6.
Tamkun, J.W., Deuring, R., Scott, M.P., Kissinger, M., Pattatucci, A.M., Kaufman, T.C., and Kennison, J.A. (1992). brahma: a regulator of Drosophila homeotic genes structurally related to the yeast transcriptional activator SNF2/SWI2. Cell 68, 561–572.
Tanay, A. (2006). Extensive low-affinity transcriptional interactions in the yeast genome. Genome Res. 16, 962–972.
Thurman, R.E., Rynes, E., Humbert, R., Vierstra, J., Maurano, M.T., Haugen, E., Sheffield, N.C., Stergachis, A.B., Wang, H., Vernot, B., et al. (2012). The accessible chromatin landscape of the human genome. Nature 489, 75–82.
Torrino, S., Roustan, F.R., Kaminski, L., Bertero, T., Pisano, S., Ambrosetti, D., Dufies, M., Uhler, J.P., Lemichez, E., Mettouchi, A., et al. (2019). UBTD1 is a mechano-regulator controlling cancer aggressiveness. EMBO Rep. 20, e46570.
Vierbuchen, T., Ling, E., Cowley, C.J., Couch, C.H., Wang, X., Harmin, D.A., Roberts, C.W.M., and Greenberg, M.E. (2017). AP-1 Transcription Factors and the BAF Complex Mediate Signal-Dependent Enhancer Selection. Mol. Cell 68, 1067–1082.e12.
Vierstra, J., and Stamatoyannopoulos, J.A. (2016). Genomic footprinting. Nat. Methods 13, 213–221.
Vockley, C.M., D’Ippolito, A.M., McDowell, I.C., Majoros, W.H., Safi, A., Song, L., Crawford, G.E., and Reddy, T.E. (2016). Direct GR Binding Sites Potentiate Clusters of TF Binding across the Human Genome. Cell 166, 1269–1281.e19.
Wang, Y., Song, F., Zhang, B., Zhang, L., Xu, J., Kuang, D., Li, D., Choudhary, M.N.K., Li, Y., Hu, M., et al. (2018). The 3D Genome Browser: a web-based browser for visualizing 3D genome organization and long-range chromatin interactions. Genome Biol. 19, 151.
Whyte, W.A., Orlando, D.A., Hnisz, D., Abraham, B.J., Lin, C.Y., Kagey, M.H., Rahl, P.B., Lee, T.I., and Young, R.A. (2013). Master transcription factors and mediator establish super-enhancers at key cell identity genes. Cell 153, 307–319.
Wong, A.K.C., Shanahan, F., Chen, Y., Lian, L., Ha, P., Hendricks, K., Ghaffari, S., Iliev, D., Penn, B., Woodland, A.-M., et al. (2000). BRG1, a Component of the SWI-SNF Complex, Is Mutated in Multiple Human Tumor Cell Lines. Cancer Res. 60, 6171–6177.
Yuan, B.-Z., Jefferson, A.M., Baldwin, K.T., Thorgeirsson, S.S., Popescu, N.C., and Reynolds, S.H. (2004). DLC-1 operates as a tumor suppressor gene in human non-small cell lung carcinomas. Oncogene 23, 1405–1411.
Yuan, B.-Z., Jefferson, A.M., Millecchia, L., Popescu, N.C., and Reynolds, S.H. (2007). Morphological changes and nuclear translocation of DLC1 tumor suppressor protein precede apoptosis in human non-small cell lung carcinoma cells. Exp. Cell Res. 313, 3868–3880.
Zabidi, M.A., Arnold, C.D., Schernhuber, K., Pagani, M., Rath, M., Frank, O., and Stark, A. (2015). Enhancer-core-promoter specificity separates developmental and housekeeping gene regulation. Nature 518, 556–559.
Zhang, H.S., Gavin, M., Dahiya, A., Postigo, A.A., Ma, D., Luo, R.X., Harbour, J.W., and Dean, D.C. (2000). Exit from G1 and S phase of the cell cycle is regulated by repressor complexes containing HDAC-Rb-hSWI/SNF and Rb-hSWI/SNF. Cell 101, 79–89.
Zhang, Y., Liu, T., Meyer, C.A., Eeckhoute, J., Johnson, D.S., Bernstein, B.E., Nusbaum, C., Myers, R.M., Brown, M., Li, W., and Liu, X.S. (2008). Model-based analysis of ChIP-Seq (MACS). Genome Biol. 9, R137.
Zhou, X., Thorgeirsson, S.S., and Popescu, N.C. (2004). Restoration of DLC-1 gene expression induces apoptosis and inhibits both cell growth and tumorigenicity in human hepatocellular carcinoma cells. Oncogene 23, 1308–1313.
Zhou, X., Maricque, B., Xie, M., Li, D., Sundaram, V., Martin, E.A., Koebbe, B.C., Nielsen, C., Hirst, M., Farnham, P., et al. (2011). The Human Epigenome Browser at Washington University. Nat. Methods 8, 989–990.


STAR+METHODS

KEY RESOURCES TABLE

REAGENT or RESOURCE SOURCE IDENTIFIER


Antibodies
Rabbit polyclonal anti-H3K4me1 Active Motif Cat# AM39297; RRID: AB_2615075
Rabbit polyclonal anti-H3K4me2 Active Motif Cat# AM39141; RRID: AB_2614985
Rabbit polyclonal anti-H3K27me3 Cell Signaling Cat# 9733; RRID: AB_2616029
Rabbit polyclonal anti-BRG1 Cell Signaling Cat# 49360; RRID: AB_2728743
Rabbit polyclonal anti-BRM Cell Signaling Cat# 11966; RRID: AB_2797783
Mouse monoclonal anti-GAPDH Santa Cruz Biotech. Cat# sc-47724; RRID: AB_627678
Mouse monoclonal anti-BRG1 Santa Cruz Biotech. Cat# sc-17796; RRID: AB_626762
Critical Commercial Assays
QIAGEN RNeasy QIAGEN Cat# 74104
TruSeq Stranded Total RNA kit Illumina Cat# 20020599
ThruPLEX DNA-seq Kit Takara Cat# R400677, R400660 (index)
Deposited Data
Raw and analyzed data This study GEO: GSE132293
A549 ChIP-Seq data ENCODE Consortium https://www.encodeproject.org/
A549 Methylation data ENCODE Consortium ENCFF003JVR, ENCFF005TID
A549 HiC data ENCODE Consortium ENCFF716CFF (TADs), ENCFF513HKS (loops)
Lung Adenocarcinoma Expression/Mutation Data TCGA https://portal.gdc.cancer.gov/
Experimental Models: Cell Lines
A549 ATCC CCL-185
Oligonucleotides
SMARCA4 repair donor DNA: atatggcgtgtcccaggcccttgcacgtggcctgcagtcctactatgccgtggcccatgctgtcactgagagagtggacaagcagtcagc This paper N/A
Forward primer for assessment of SMARCA4 HDR rates: GCACATTGTCACAGATAGGAATGTGTG This paper N/A
Reverse primer for assessment of SMARCA4 HDR rates: CCGAGGCAGCTACGTGGC This paper N/A
Forward primer for SMARCA4 Sanger sequencing: CTCGGCTCTCTGCAAGCT This paper N/A
Reverse primer for SMARCA4 Sanger sequencing: GGAGTTGTACTGGTTGTCTTGT This paper N/A
Software and Algorithms
Bwa (Li and Durbin, 2009) http://bio-bwa.sourceforge.net/
RNA-STAR (Dobin et al., 2013) https://github.com/alexdobin/STAR
hotspot-v2 Altius Institute https://github.com/Altius/hotspot2
DESeq2 package (Love et al., 2014) https://bioconductor.org/packages/release/bioc/html/DESeq2.html
Bedops (Neph et al., 2012) https://bedops.readthedocs.io
macs-v2 (Zhang et al., 2008) https://github.com/taoliu/MACS


RESOURCE AVAILABILITY

Lead Contact
Further information and requests for resources and reagents should be directed to and will be fulfilled by the Lead Contact, John
Stamatoyannopoulos ([email protected]).

Materials Availability
All unique/stable reagents generated in this study are available from the Lead Contact with a completed Uniform Biological Materials
Transfer Agreement.

Data and Code Availability


All RNA-Seq, DNaseI-Seq, and CUT&RUN data from this study are available at NCBI Gene Expression Omnibus (GEO). The
accession number for data reported in this paper is GEO: GSE132293. Custom scripts used for analysis are available on request.

EXPERIMENTAL MODEL AND SUBJECT DETAILS

Cell Lines: A549 (ATCC CCL-185) cells were maintained in F-12K Medium (ATCC, Cat #30-2004) supplemented with 10% HyClone
FBS (GE Healthcare Life Sciences, Cat #SH30071.03) and 1% Penicillin-Streptomycin (Corning, Cat #30-002-CI). Cells were
passaged every 4-5 days and detached using Accutase (Innovative Cell Technology, Cat #AT-104). Cells were sourced from
ATCC (A549 (ATCC CCL-185)).

METHOD DETAILS

Synthesis of TALEN constructs


TALEN monomers were cloned using adaptations of previously described methods (Cermak et al., 2011; Sakuma et al., 2013). To
target the Q729fs*4 mutation site in SMARCA4, five TALEN monomers were designed to target each of the sense and antisense
strands. Sense-targeting TALENs recognize the following sequences: L1, 5′-TGAATATGGCGTGTCCCAGGCC-3′; L2, 5′-TGGCGTGTCCCAGGCCCTTGC-3′;
L3, 5′-TGGCGTGTCCCAGGCCCTTGCA-3′; L4, 5′-TGTCCCAGGCCCTTGCACGTG-3′; L5, 5′-TCCCAGGCCCTTGCACGTGGCC-3′.
Antisense-targeting TALENs recognize the following: R1, 5′-TGTCCACTCTCTCAGTGACAGC-3′; R2, 5′-TGCTTGTCCACTCTCTCAGTG-3′;
R3, 5′-TGCTTGTCCACTCTCTCAGT-3′; R4, 5′-TGACTGCTTGTCCACTCTCT-3′; R5, 5′-TAAGCGCTGACTGCTTGTCCAC-3′.

Transfections
For all transfections, a BTX ECM830 device (BTX Harvard Apparatus) with a 2 mm gap cuvette was used. TALEN mRNAs were prepared
using a mMessageMachine T7 Ultra Kit (#AM1345, Ambion). Per transfection, 2 × 10^5 cells were collected and washed twice
with PBS. Cell pellets were resuspended in 100 μL BTXpress Electroporation Solution (BTX Harvard Apparatus, Cat#45-0805)
together with 2 μg mRNA per TALEN monomer and 2 μL 100 μM 90-mer ssODN containing the 23 nt corrective insertion (underlined)
and 33-34 nt homology arms: 5′-ATATGGCGTGTCCCAGGCCCTTGCACGTGGCCTGCAGTCCTACTATGCCGTGGCCCATGCTGTCACTGAGAGAGTGGACAAGCAGTCAGC-3′.
The cell/mRNA mixture was transferred to the transfection cuvette and immediately
electroporated with one pulse of 250 V for 5 ms. Following electroporation, cells were transferred to 12 well plates containing
pre-warmed F-12K Medium and incubated at 37°C.

Assessment of HDR and Indel Rates


Cell pellets (50,000 cells per sample) were washed with PBS and genomic DNA was harvested and amplified by PCR 48 h
post-transfection as described previously (Morrow et al., 2018). Gene-specific portions of primers used are as follows: Forward,
5′-GCACATTGTCACAGATAGGAATGTGTG-3′; Reverse, 5′-CCGAGGCAGCTACGTGGC-3′. Additionally, primers contained sequences to
incorporate Illumina TruSeq adapters. Successful amplification was confirmed by electrophoresis using E-gel 48 and 96 cassettes
(ThermoFisher Scientific). PCR products were subsequently diluted 1:200 and 1 μL was used as a template for a second round
of amplification to incorporate barcodes and adapters. Amplification was achieved using AccuPrime Taq DNA Polymerase (ThermoFisher
Scientific) as per the manufacturer’s recommendations and using the following cycle conditions: 94°C 2 min; 12 cycles of 94°C
15 s, 60°C 30 s, 68°C 30 s; 68°C 10 min. PCR products were cleaned up using a QIAquick PCR Purification Kit (QIAGEN), quantified
using a Qubit (ThermoFisher Scientific) and loaded on a MiniSeq using Mid or High Output 300-cycle Reagent Kits (Illumina). Paired
end reads were merged and filtered to remove duplicate tags. Filtered reads were aligned to the SMARCA4−/− reference sequence
and evaluated for insertions and deletions (indels). Editing efficiency (indel rate) was determined by tabulating only those indels that
partially or completely overlap the recognition sequence of the TALEN dimer (and the intervening spacer). Scarless HDR was
quantified as the proportion of reads that showed perfect complementarity to the wild-type (SMARCA4+/+) reference sequence.


Clonal Isolation
7 days post-transfection, edited A549 pools were sorted as single cells into a 96 well plate using a MoFlo Astrios EQ Cell Sorter
(Beckman Coulter). Cells were expanded for 17 days and subsequently split 1:10 for maintenance and 9:10 for genomic DNA, which was
harvested 4 days later. Genotyping of clones was performed as described above and was confirmed by Sanger sequencing
(GENEWIZ) using outer primers (5′-CTCGGCTCTCTGCAAGCT-3′ and 5′-GGAGTTGTACTGGTTGTCTTGT-3′; IDT) that each
lie 700 bp away from the edit site to check for larger deletions.

Western Blotting
A549 cells were lysed in RIPA buffer (ThermoFisher Scientific, Catalog #89900) containing Complete, EDTA-free protease inhibitor
cocktail tablets (Roche, #11873580001). The total protein concentration was measured by Bradford assay (Bio-Rad). Samples
were run on the Simple Western System (Protein Simple) with the 12-220 kDa kit, loading 2-4 μg lysate and antibodies for
SMARCA4/BRG1 (G-7) (Santa Cruz, #sc-17796) and GAPDH (Santa Cruz, #sc-47724) at 1:400 and 1:2000 dilutions, respectively.

Immunofluorescence imaging and analysis


A549 cells were detached by mild trypsinization, washed twice with PBS, and then seeded onto sterile cover glasses (Fisher
Scientific, 1.5, 18x18 mm) placed in sterile 6-well tissue culture plates for 15-20 min, following which the wells were gently filled with 2 mL
pre-warmed media (ATCC F-12K, 10% FBS, 1% Pen/Strep). Seeded cells were incubated for 4-6 h at 37°C to recover and adhere
to the coverglass, following which they were washed 1x with PBS and then fixed with 4% PFA (Polysciences Inc, #18814-10) in PBS
for 10 min at room temperature. Fixed cells were washed 3 times with 1x PBS, permeabilized with 0.25% Triton X-100 in PBS for
10 min at room temperature, blocked for 1 h with 2% BSA (Jackson Immunoresearch, #001-000-161), and then incubated for 2 h
at room temperature with primary antibodies against SMARCA4 (SC-17796, mouse, 1:500 dilution) in 2% BSA/1x PBS.
Subsequently, cells were washed 3x with 0.05% Tween-20 (Bio-rad, #161-0781) in PBS, and then incubated for 1 h with donkey anti-mouse
Cy3 (1:500 dilution, #711-166-152, Jackson Labs) secondary antibody and AlexaFluor647 Phalloidin (ThermoFisher Scientific,
A22287, 1:100) in 2% BSA/1x PBS. Lastly, cells were counterstained with DAPI (100 ng/mL in 1x PBS) for 10 min at room temperature
and washed 3 times with 0.05% Tween in PBS prior to mounting on glass slides using Prolong Gold (Molecular Probes P36930). Each
wash step lasted 3 min at room temperature, and was performed on a shaker (Stovall Inc). Specimen coverslips were imaged using an
inverted Nikon Eclipse Ti widefield microscope equipped with an Andor Zyla 4.2CL10 CMOS camera with a 4.2-megapixel sensor
and 6.5 μm pixel size (18.8 mm diagonal FOV). Focused 2D cell images were acquired using a 40x Nikon Plan Apo 0.9 NA air
objective. Acquired images were subject to 3 rounds of iterative blind deconvolution using Autoquant software (version X3.3, Media
Cybernetics, NY) to minimize the effect of out-of-focus blurring that is inherent to widefield microscopy optics. Deconvolved images
were processed using in-house MATLAB (version 2017B, Mathworks, Natick, MA) scripts to numerically estimate the SMARCA4
protein content in every cell nucleus, and for downstream statistical analysis.

DNaseI-Seq
DNaseI-Seq was performed as previously described (Hesselberth et al., 2009; John et al., 2011; Thurman et al., 2012). Briefly, 5 × 10^6
cells were lysed using 0.01% IGEPAL. Nuclei were collected by centrifugation at 500 g for 5 min, and DNaseI digestion was
performed for 3 min at 37°C. DNaseI cleavage fragments were size selected by PEG fractionation, fragments were end repaired, and
Illumina sequencing libraries were prepared using the ThruPLEX DNA-seq kit. Libraries were sequenced to a typical depth of 50M
reads (Table S1).

RNA-Seq
RNA was extracted from 500K cells using the QIAGEN RNeasy kit. RNA-Seq libraries (Table S2) were prepared using the TruSeq
Stranded Total RNA kit (Illumina).

CUT&RUN
CUT&RUN datasets (Table S3) were generated as previously described (Skene et al., 2018). Antibodies were obtained from the
following suppliers: H3K4me1 (Active Motif 39297), H3K4me2 (Active Motif 39141), H3K27me3 (Cell Signaling 9733), BRG1/
SMARCA4 (Cell Signaling 49360), BRM/SMARCA2 (Cell Signaling 11966). Protein A-MNase was kindly provided by Dr. Steven Henikoff
(Fred Hutchinson Cancer Research Center).

ENCODE Datasets
TF ChIP-Seq
All available ChIP-Seq data in A549 aligned to GRCh38 were downloaded from the ENCODE portal (https://www.encodeproject.org/).
To analyze the fraction of novel SMARCA4-dependent DHSs overlapping pre-existing TF ChIP-Seq peaks, the list was manually
curated to remove non-transcription factor targets and a single experiment was chosen for each TF (Table S4). For detailed analysis
of specific TFs, replicate concordant peaks were used: JUNB (ENCFF565QYS and ENCFF683JTQ), CTCF (ENCFF465EGH,
ENCFF531OAI, and ENCFF751UOX), RAD21 (ENCFF345LNM, ENCFF512QEA, and ENCFF178CSM), and SMC3 (ENCFF046RJH,
ENCFF321GIF, and ENCFF922EYU).


Methylation
A549 bisulfite sequencing datasets were used to analyze promoter methylation (ENCFF005TID, ENCFF003JVR).
Histone ChIP
H3K4me3 (ENCFF428UWO, ENCFF643FMK, ENCFF973TUQ) histone ChIP-Seq data were downloaded from the ENCODE portal.
Hi-C:
Hi-C data from two A549 experiments was obtained from the ENCODE portal (https://www.encodeproject.org/). TAD file:
ENCFF716CFF. Chromatin Loop File: ENCFF803ZOW. A549 Hi-C heatmaps were visualized using the 3D Genome Browser
(Wang et al., 2018). Chromatin loops were visualized using the WashU epigenome browser (Zhou et al., 2011).

TCGA Data
Identification of SMARCA4 gene signature
Mutation, expression, and patient data for TCGA lung adenocarcinoma samples were obtained from the Genomic Data
Commons Data Portal (https://portal.gdc.cancer.gov/). Samples with an annotated Nonsense_Mutation or Frame_Shift_Del
mutation in SMARCA4 were classified as SMARCA4 null samples. Samples with other mutations (annotated as Missense_Mutation or
Splice_Site) in SMARCA4 or expression level of SMARCA4 below the average expression in SMARCA4 null samples were not
included in the analysis. Expression data were log transformed (with 1 added to the FPKM value) and compared between SMARCA4
null and wild-type (WT) samples. Protein coding genes significantly downregulated in SMARCA4 null samples were identified by Welch’s
t test (BH FDR < 0.01 and |log fold change| > log2[1.5]). Genes with low expression (mean expression < 1 FPKM in both SMARCA4 null
and WT samples) were filtered from the gene signature.
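The thresholding step of this signature can be illustrated with a minimal Python sketch. It assumes the Welch's t test p values and log2 fold changes (SMARCA4 null minus WT, so negative values mean lower expression in null samples) have already been computed elsewhere; the function names `benjamini_hochberg` and `select_signature` are hypothetical and are not taken from the study's scripts.

```python
import math

def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value to the smallest, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

def select_signature(genes, p_values, log2_fold_changes, fdr=0.01, min_fc=1.5):
    """Keep genes that are downregulated beyond the fold-change cutoff at the given FDR.

    Assumption: log2_fold_changes are (null - WT), so downregulated genes are negative.
    """
    q_values = benjamini_hochberg(p_values)
    cutoff = math.log2(min_fc)
    return [g for g, q, lfc in zip(genes, q_values, log2_fold_changes)
            if q < fdr and lfc < -cutoff]
```

For example, `select_signature(["A", "B"], [1e-4, 2e-4], [-2.0, -0.1])` keeps only gene `A`, because gene `B` passes the FDR filter but not the fold-change filter.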
Stable versus Reactivated genes
Genes in the TCGA SMARCA4 signature that were significantly upregulated after SMARCA4 rescue were classified as reactivated
while those that were not upregulated were classified as stable. Genes with zero mapped reads in either SMARCA4−/− or
SMARCA4+/+ A549s were not included in the stable gene set since these genes were often members of highly duplicated gene families
and likely not detected for technical reasons (i.e., mappability).
Survival analysis
Expression data from the SMARCA4 WT samples were log transformed and a Cox regression was fit between expression and patient
survival for each protein coding gene. Patient age and gender were included as additional variables in the regression. p values for the
association between gene expression and survival were derived from the t-statistic of the gene expression regression coefficient. For
comparisons between reactivated genes and all upregulated genes, only upregulated genes expressed highly enough in TCGA
samples to be included in the SMARCA4 gene signature were included.

QUANTIFICATION AND STATISTICAL ANALYSIS

DNaseI-Seq Data Analysis


Read mapping and peak calling:
Reads were processed and peaks of DNaseI cleavages (DHSs) were identified using the default ENCODE DCC DNase-DHS pipeline,
version 2, paired-end (ENCPL202DNS) aligning to GRCh38/hg38.
Consensus A549 peak calls:
Peak calls across the different DNaseI experiments were combined by first merging all peak calls from individual experiments. For
each merged element, a core element was identified by the full width at half maximum of DNaseI density in samples that had a peak
overlapping the element. All peaks that overlapped (at least 50%) the core element were removed, and the process was repeated
on the remaining peaks until no more elements were added. Since combining many datasets may increase the fraction of spurious
peaks, an additional filter was applied based on the combined z-score ($\sum Z\text{-scores} / \sqrt{\text{peak number}}$) of each element. To choose
an empirical threshold, logistic regression was performed on all pairs of technical replicates to identify the z-score at which 95% of
the DHSs were expected to be replicated, and the average value of all sets of replicates was used as the final threshold.
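The combined z-score filter can be sketched in a few lines of Python. This is only an illustration of the formula stated above; the function names and the dictionary layout (element name mapped to the z-scores of its contributing peaks) are assumptions, and the threshold would come from the replicate-based logistic regression described in the text.

```python
import math

def combined_z_score(z_scores):
    """Sum of the per-peak z-scores divided by the square root of the peak number."""
    if not z_scores:
        raise ValueError("at least one z-score is required")
    return sum(z_scores) / math.sqrt(len(z_scores))

def filter_elements(element_z_scores, threshold):
    """Keep consensus elements whose combined z-score reaches the threshold.

    element_z_scores: dict mapping an element name to the list of z-scores
    of the individual peaks that overlap it (hypothetical input layout).
    """
    kept = {}
    for name, z_scores in element_z_scores.items():
        score = combined_z_score(z_scores)
        if score >= threshold:
            kept[name] = score
    return kept
```

Dividing by the square root of the peak number keeps the score on a standard normal scale under the null, so an element supported by many weak peaks is not favored simply because it has more terms in the sum.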
Identification of SMARCA4 sensitive DHSs
DNaseI cleavages were counted at consensus peaks using bedops tools (version 2.4.35) (Neph et al., 2012). Reads from technical
replicates (i.e., different DNaseI experiments on the same clone) were summed. Data were normalized using DESeq2's normalization
strategy restricted to DHSs identified in all samples. Using the R package DESeq2 (Love et al., 2014), SMARCA4 remodeled DHSs
were defined as DHSs with a log2 fold change significantly (FDR < 0.05) greater than log2(1.5).
MEME
Motif enrichment was performed using MEME (parameters: -dna -mod zoops -nmotifs 10 -minw 5 -maxw 12 -revcomp) on the top
1000 DHSs with the most significant increase in accessibility. To input sequences of identical length, the midpoint of each DHS was
identified and padded by 100 base pairs. Motifs were matched to known motifs using TOMTOM.


RNA-Seq Data Analysis


Reads were aligned to GRCh38/hg38 using RNA-STAR (version 2.3.1) (Dobin et al., 2013), and counts per gene (Gencode v25 basic
annotation) were quantified using featureCounts (Liao et al., 2014). To convert raw counts to FPKM, RNA counts were first normalized
for library size using DESeq’s median ratio normalization and the normalized counts were converted to estimated FPKM values using
the average library size. Differential genes (FDR < 0.05) were identified using DESeq2 (Love et al., 2014). All genes with zero counts
across all conditions were excluded from further analysis.
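The count-to-FPKM conversion described above can be sketched as follows. The text does not give the exact formula, so this Python sketch rests on stated assumptions: size factors follow the median-of-ratios idea (median ratio of each sample's counts to the per-gene geometric mean, using genes with nonzero counts in every sample), and FPKM is then estimated as normalized counts × 10^9 / (gene length × mean library size). The function names and the gene length input are hypothetical.

```python
import math
from statistics import median

def size_factors(counts):
    """Median-of-ratios size factors. counts: list of per-gene rows, one count per sample."""
    n_samples = len(counts[0])
    # Only genes with nonzero counts in every sample have a defined geometric mean.
    usable = [row for row in counts if all(c > 0 for c in row)]
    log_geo_means = [sum(math.log(c) for c in row) / n_samples for row in usable]
    factors = []
    for j in range(n_samples):
        log_ratios = [math.log(row[j]) - lg for row, lg in zip(usable, log_geo_means)]
        factors.append(math.exp(median(log_ratios)))
    return factors

def estimated_fpkm(counts, gene_lengths_bp):
    """Convert raw counts to estimated FPKM using the average library size (assumed formula)."""
    factors = size_factors(counts)
    n_samples = len(factors)
    library_sizes = [sum(row[j] for row in counts) for j in range(n_samples)]
    mean_library = sum(library_sizes) / n_samples
    return [
        [(c / f) * 1e9 / (length * mean_library) for c, f in zip(row, factors)]
        for row, length in zip(counts, gene_lengths_bp)
    ]
```

Under this scheme, a sample sequenced twice as deeply but otherwise identical yields the same estimated FPKM values as its shallower counterpart, which is the purpose of normalizing before converting.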

CUT&RUN Data Analysis


Paired end reads were mapped to GRCh38/hg38 using bwa (version 0.7.12) (Li and Durbin, 2009). Read counts in genomic windows
around DHSs were quantified using a custom script. For SMARCA2/SMARCA4 CUT&RUN, peaks were identified using MACS2
(version 2.2.1) (Zhang et al., 2008) with a control experiment against SMARCA4 in A549 (SMARCA4−/−) cells used as the background.

Comparison of DNaseI-seq and RNA-Seq data


Mixture model of RNA change
To separate out the effects of an increase in the fraction of upregulated genes and an increase in the magnitude of upregulation, we fit a
mixture model to each set of genes partitioned by the number of distal remodeled DHSs closest to the gene (DHS_Change). We first
fit the background t distribution, $t_0$, to the log2 fold change in gene expression (LFC) for genes with zero changing DHSs. Holding
the background $t_0$ distribution fixed, for each number of remodeled DHSs $N$, we fit a mixture model

$$\mathrm{LFC}_{\mathrm{DHS\_Change}=N} \sim p_{0,N} \cdot t_0 + p_{1,N} \cdot G_N$$

where $G_N$ is a gamma distribution. We used $p_{1,N}$ as our estimate of the fraction of upregulated genes in the set of genes with
DHS_Change equal to $N$, and we calculated the weighted mean of the log2 fold change for the upregulated population to estimate
the average magnitude of expression change.

$$\mu_{1,N} = \frac{\sum_{\mathrm{DHS\_Change}=N} w_i \cdot \mathrm{LFC}_i}{\sum_{\mathrm{DHS\_Change}=N} w_i}, \qquad w_i = P(\mathrm{gene}_i \text{ is upregulated})$$

To understand how the increase in the fraction of genes that are upregulated and the magnitude of upregulation compare to the
expectation from a simple binomial model, we extrapolated from the values found for genes with DHS_Change = 1 (i.e., $p_{1,1}$ and $\mu_{1,1}$) to
estimate the fraction of genes we would expect to see change in expression if DHS_Change equals $k$ ($p_{\mathrm{binomial},k}$) and the magnitude of
response for those genes ($\mu_{\mathrm{binomial},k}$).

$$p_{\mathrm{binomial},k} = 1 - (1 - p_{1,1})^k$$

$$\mu_{\mathrm{binomial},k} = \frac{\mu_{1,N} \cdot (k \cdot p_{1,1})}{p_{\mathrm{binomial},k}}$$
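The weighted mean and the binomial extrapolation are simple enough to check numerically. The following Python sketch assumes the mixture weights (posterior probabilities that each gene is upregulated) have already been obtained from a fitted mixture model; it does not fit the mixture itself, and the function names are hypothetical.

```python
def weighted_mean_lfc(lfcs, weights):
    """Weighted mean log2 fold change, weights = P(gene is upregulated)."""
    total = sum(weights)
    if total == 0:
        raise ValueError("weights sum to zero")
    return sum(w * x for w, x in zip(weights, lfcs)) / total

def p_binomial(p_single, k):
    """Expected fraction of upregulated genes with k remodeled DHSs,
    if each DHS acts independently with probability p_single."""
    return 1.0 - (1.0 - p_single) ** k

def mu_binomial(mu, p_single, k):
    """Expected magnitude among responding genes: mu * (k * p_single) / p_binomial."""
    return mu * (k * p_single) / p_binomial(p_single, k)
```

As a worked case, with a single-DHS response probability of 0.5 and k = 2, the expected responding fraction is 1 − 0.5² = 0.75, and a per-DHS magnitude of 1.0 gives an expected magnitude of 1.0 × (2 × 0.5) / 0.75 ≈ 1.33 among responding genes.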
Aggregate gene level score of distal change in accessibility
To quantify the relationship between changes in accessibility and change in gene expression, we first created an aggregate score that
incorporates the distance of each DHS from a gene’s TSS (Gencode v25 basic) to weight the potential contribution of each
DHS to the gene’s change in expression. Only DHSs greater than 1 kb from a gene’s TSS were included to isolate the effects of distal
element remodeling on gene expression.

$$\mathrm{DHS\_Score}_{\mathrm{Gene}_i} = \sum_{1\,\mathrm{kb} < d_j < 1\,\mathrm{Mb}} e^{-[\lambda_{\mathrm{other}} I_{\mathrm{other},ij} \log(d_j) + \lambda_{\mathrm{closest}} I_{\mathrm{closest},ij} \log(d_j)]} \cdot \mathrm{LFC}_j$$

$d_j$ = distance of $\mathrm{DHS}_j$ from $\mathrm{TSS}_{\mathrm{Gene}_i}$

$\lambda_{\mathrm{closest}}, \lambda_{\mathrm{other}}$ = decay rates

$\mathrm{LFC}_j$ = log2 fold change of $\mathrm{DHS}_j$

$$I_{\mathrm{closest},ij} = \begin{cases} 1 & \text{if } \mathrm{DHS}_j \text{ is closest to } \mathrm{Gene}_i \\ 0 & \text{if } \mathrm{DHS}_j \text{ is not closest to } \mathrm{Gene}_i \end{cases}$$

$$I_{\mathrm{other},ij} = \begin{cases} 1 & \text{if } \mathrm{DHS}_j \text{ is not closest to } \mathrm{Gene}_i \\ 0 & \text{if } \mathrm{DHS}_j \text{ is closest to } \mathrm{Gene}_i \end{cases}$$

We used different decay rates for DHSs closest to the gene’s TSS and all other DHSs and selected the two decay rates to maximize
the correlation (minimize the squared error) between the aggregate gene level scores and genes’ change in expression (Log2FC).
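A distance-weighted aggregate score of this kind can be sketched in Python as below. Several details are assumptions made for illustration: the logarithm is taken as the natural log, the weight is written as exp(−λ·log d) so that positive decay rates down-weight distant DHSs, and the input is a list of (distance, DHS log2 fold change, is-closest flag) tuples for a single gene. The function name `dhs_score` is hypothetical.

```python
import math

def dhs_score(dhs_list, lambda_closest, lambda_other,
              min_dist=1_000, max_dist=1_000_000):
    """Aggregate distal accessibility change for one gene.

    dhs_list: iterable of (distance_bp, dhs_log2_fold_change, is_closest) tuples.
    DHSs at or within min_dist, or at or beyond max_dist, are ignored.
    """
    score = 0.0
    for distance, lfc, is_closest in dhs_list:
        if not (min_dist < distance < max_dist):
            continue
        # The closest DHS gets its own decay rate; all others share a second one.
        rate = lambda_closest if is_closest else lambda_other
        score += math.exp(-rate * math.log(distance)) * lfc
    return score
```

Because exp(−λ·log d) equals d^(−λ), the weight is a power-law function of distance; the two decay rates would then be chosen, for example by a grid search, to maximize the correlation between the scores and the observed expression changes.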
Regression on promoter state
To analyze the relationship between promoter state and changes in gene expression, we fit the model

$$\mathrm{RNA\ LFC} = \mathrm{DHS\_Score} + \mathrm{Promoter\ State} + \mathrm{DHS\_Score} : \mathrm{Promoter\ State}$$

$$\mathrm{Promoter\ State} = I_{\mathrm{H3K4me3}} + I_{\mathrm{H3K27me3}} + I_{\mathrm{Bivalent}} + I_{\mathrm{CpG\ Island}} + C_{\mathrm{methylation}} + I_{\mathrm{CpG\ Island}} \cdot C_{\mathrm{methylation}}$$

$$I_{\mathrm{Feature}} = \begin{cases} 1 & \text{if TSS} \pm 1\,\mathrm{kb} \text{ overlaps feature} \\ 0 & \text{otherwise} \end{cases}$$

$$C_{\mathrm{methylation}} = \begin{cases} -1 & \text{if methylation} < 33\text{rd percentile for non-CpG island genes} \\ 0 & \text{if 33rd percentile} < \text{methylation} < 66\text{th percentile} \\ 1 & \text{if methylation} > 66\text{th percentile for non-CpG island genes} \end{cases}$$

The promoter state included indicator variables for overlap with histone ChIP-Seq peaks and discretized the average fraction of CpG
methylation at the promoter into low, medium, and high values based on the methylation tertiles in non-CpG promoters (33% and
74% methylated). For genes with multiple TSSs, the TSS with the largest number of differential DHSs ± 5 kb was selected. Ties
were broken by selecting the TSS with the greatest accessibility in SMARCA4−/− A549s.
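The construction of the promoter-state covariates can be illustrated with a short Python sketch. It makes several assumptions that are not stated in the text: low, medium, and high methylation are coded as −1, 0, and 1; the cutoffs are the 33% and 74% methylation values quoted above; and a promoter is called bivalent when it overlaps both H3K4me3 and H3K27me3. All names are hypothetical, and the regression fit itself is not shown.

```python
def methylation_class(fraction, low_cut=0.33, high_cut=0.74):
    """Discretize promoter CpG methylation into -1 (low), 0 (medium), 1 (high)."""
    if fraction < low_cut:
        return -1
    if fraction > high_cut:
        return 1
    return 0

def promoter_state_terms(has_h3k4me3, has_h3k27me3, is_cpg_island, methylation_fraction):
    """Build the promoter-state covariates for one gene as a dict of model terms."""
    c_meth = methylation_class(methylation_fraction)
    i_cpg = int(is_cpg_island)
    return {
        "I_H3K4me3": int(has_h3k4me3),
        "I_H3K27me3": int(has_h3k27me3),
        # Assumption: bivalent means both marks overlap the TSS window.
        "I_Bivalent": int(has_h3k4me3 and has_h3k27me3),
        "I_CpG_Island": i_cpg,
        "C_methylation": c_meth,
        "I_CpG_Island*C_methylation": i_cpg * c_meth,
    }
```

Each of these terms would enter the model both as a main effect and as an interaction with the DHS score, so that the slope relating distal remodeling to expression change can differ between promoter classes.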

Regional Changes in Chromatin Accessibility


Ripley K
Ripley K at multiple scales $d$ was calculated using the edge-corrected formula

$$K = \lambda^{-1} \sum_{\mathrm{DHS}_i} \sum_{\mathrm{DHS}_{j \neq i}} w_i \cdot I(\mathrm{distance}_{ij} < d) / N$$

where $\lambda$ is the DHS density on the chromosome, $w_i$ is the inverse of the fraction of the window ± $d$ basepairs from $\mathrm{DHS}_i$ that lies within the
chromosome, $I(\mathrm{distance}_{ij} < d)$ is an indicator variable that $\mathrm{DHS}_j$ is within $d$ basepairs of $\mathrm{DHS}_i$, and $N$ is the number of DHSs on
the chromosome (Ripley, 1976).
To compare clustering of SMARCA4 responsive DHSs to all DHSs we subtracted the Ripley K calculated from all DHSs from the Ripley
K calculated from SMARCA4 responsive DHSs with increasing accessibility. We then estimated a background distribution for
this quantity by block bootstrapping with blocks of 5 consecutive DHSs.
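A one-dimensional, edge-corrected Ripley K of this form can be written directly from the definition. The Python sketch below is a naive quadratic-time illustration, assuming DHS positions are given as midpoint coordinates on a single chromosome of known length; the function name `ripley_k` is hypothetical.

```python
def ripley_k(positions, d, chrom_length):
    """Edge-corrected Ripley K at scale d for points on one chromosome.

    positions: DHS coordinates (bp); d: scale (bp); chrom_length: chromosome size (bp).
    """
    n = len(positions)
    if n == 0:
        raise ValueError("at least one position is required")
    density = n / chrom_length  # lambda: DHSs per base pair
    total = 0.0
    for i, xi in enumerate(positions):
        # Fraction of the window [xi - d, xi + d] that lies on the chromosome.
        lo = max(0, xi - d)
        hi = min(chrom_length, xi + d)
        weight = 1.0 / ((hi - lo) / (2.0 * d))
        neighbors = sum(1 for j, xj in enumerate(positions)
                        if j != i and abs(xj - xi) < d)
        total += weight * neighbors
    return total / (density * n)
```

The clustering comparison described above then amounts to computing `ripley_k` for the SMARCA4 responsive DHSs and for all DHSs at each scale and taking the difference.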
HMM
To identify chromatin domains sensitive to SMARCA4 rescue, we fit a 3-state HMM to the standardized DHS log fold changes
(LFC/LFC_SE). Emissions from each state were modeled by normal distributions. Transitions between neighboring DHS $x$ in state
$i$ and DHS $y$ in state $j$ were modeled by an exponentially decaying function $T_{xy,ij} = p_j + A_{i,j} \cdot e^{-\lambda_i d_{xy}}$ where $p_j$ is the starting probability
of state $j$, $A_{i,j}$ is a constant, $\lambda_i$ is a decay rate, and $d_{xy}$ is the distance between the DHSs. Parameters were fit using the Baum-Welch
(EM) algorithm. For the transition parameters $A$ and $\lambda$, since no closed form solution for the maximum likelihood estimate exists, at
each iteration of the EM algorithm, the MLE of the parameters was found by optimization using coordinate descent. After fitting the
parameters, the most probable state for each DHS was determined using the Viterbi algorithm and neighboring DHSs with the same
state were merged into regions.
For each region, a total score was calculated as the combined z-score ($\sum Z\text{-scores} / \sqrt{\text{peak number}}$) across the region. We
then performed block bootstrapping with blocks of 5 neighboring DHSs followed by identifying and scoring regions from the
randomized sequences to estimate a null score distribution. A regional score threshold was then selected to obtain regions with a false
discovery rate of 5% when compared to scores from the background distribution.
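The block bootstrap used to build the null distribution can be sketched as follows. The text specifies only the block size, so this Python sketch assumes non-overlapping blocks of consecutive DHSs that are resampled with replacement until the original number of DHSs is reached; the function name `block_bootstrap` is hypothetical.

```python
import random

def block_bootstrap(values, block_size=5, rng=None):
    """Resample a sequence in blocks of neighboring entries, preserving local order.

    values: per-DHS statistics ordered along the genome.
    Returns a randomized sequence of the same length.
    """
    rng = rng or random.Random()
    n = len(values)
    # Assumption: non-overlapping blocks, drawn with replacement.
    blocks = [values[i:i + block_size] for i in range(0, n, block_size)]
    resampled = []
    while len(resampled) < n:
        resampled.extend(rng.choice(blocks))
    return resampled[:n]
```

Resampling whole blocks keeps short-range correlation between neighboring DHSs intact while breaking up longer domains, which is what makes the randomized sequences a reasonable background for domain-level scores.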


Super Enhancer Comparisons


Clusters of SMARCA4 peaks (SMARCA4 super enhancers) were identified using the ROSE algorithm (Lovén et al., 2013; Whyte et al.,
2013). SMARCA4 super enhancers were linked to the closest TSS for comparison of gene expression changes.

GO Terms
GO analysis of genes whose promoters (TSS ± 1 kb) fall in the SMARCA4 sensitive regions was performed using the Panther web
tool with default parameters and GO biological process complete as the annotation set (Mi et al., 2017).



Cell Reports, Volume 31

Supplemental Information

Global Regulatory DNA Potentiation by SMARCA4


Propagates to Selective Gene Expression Programs
via Domain-Level Remodeling
John E. Lazar, Sandra Stehling-Sun, Vivek Nandakumar, Hao Wang, Daniel R.
Chee, Nicholas P. Howard, Reyes Acosta, Douglass Dunn, Morgan Diegel, Fidencio
Neri, Andres Castillo, Sean Ibarrientos, Kristen Lee, Ninnia Lescano, Ben
Van Biber, Jemma Nelson, Jessica Halow, Richard Sandstrom, Daniel Bates, Fyodor D.
Urnov, Alister P.W. Funnell, and John A. Stamatoyannopoulos
[Figure S1 image, panels A-D. A: schematic of TALEN monomers L1-L5 and R1-R5 at the SMARCA4 locus, with the SMARCA4+/+ (ssODN-corrected) and SMARCA4-/- (Q729fs*4) sequences and their translations. C: Sanger chromatograms against the reference sequence for SMARCA4-/- clone #1 and SMARCA4+/+ clone #1. D: PCA scatterplots for DNaseI-Seq, RNA-Seq, SMARCA4, SMARCA2, H3K4me1, H3K4me2, and H3K27me3; axes give the percent variance explained by each component; points are SMARCA4-/- clones 1-2 and SMARCA4+/+ clones 1-3.]
Figure S1: Generation of SMARCA4+/+ A549 clones. Related to Figure #1
a) Schematic of TALENs targeting the SMARCA4 locus that contains a homozygous 23 bp deletion in A549
cells (Q729fs*4 mutation). TALEN monomers recognizing the + strand (L1 to L5) and – strand (R1 to R5)
are depicted, and heterodimers that were tested are denoted by dotted lines. The sequence of the single
stranded donor DNA (ssODN) used to correct SMARCA4 is shown.
b) Indel and scarless HDR rates for TALENs targeting the SMARCA4 locus. The lead TALEN used to
generate clonal lines is shown in orange. Error bars: STD of Indels and HDR from three sequencing runs.
c) Chromatograms from Sanger sequencing for a representative SMARCA4-/- and SMARCA4+/+ clone.
d) Scatterplots display reduced dimensionality representations (PCA) of data from genome-wide assays
performed on SMARCA4+/+ and SMARCA4-/- clones. For RNA-Seq, PCA is based on normalized counts
for all genes. For all other assays, PCA is based on normalized counts around identified DHSs. All genomic
assays show clear separation between SMARCA4+/+ samples and SMARCA4-/- samples in the first
principal component. For RNA-Seq/DNaseI-Seq, multiple data points from the same clone represent
technical replicates (independent cultures).
[Figure S2 image, panels A-G. A: pie charts of genomic annotation (promoter, transcript, proximal-intergenic, distal-intergenic) for A549 (SMARCA4-/-) DHSs and remodeled DHSs. B: cell type overlap, plotted as normalized overlap with remodeled DHSs per cell type. C: SMARCA4 CUT&RUN signal for remodeled DHSs, unaffected DHSs, and background ENCODE DHSs; annotations: 60% overlap with a MACS peak, 87% >= median DHS. D: SMARCA2 and SMARCA4 signal in SMARCA4-/- and SMARCA4+/+ cells, and SMARCA4/SMARCA2 ratio, for remodeled and unaffected DHSs. E: motif table (motif, p-value, motif match): AP-1, 1.8e-440; RUNX, 6.4e-251; G-rich (SP1), 2.2e-101. F: DNaseI cleavage, Log2(Obs/Exp), from -15 to +15 bp around SP1 motifs in SMARCA4-/- and SMARCA4+/+ cells for remodeled and unaffected DHSs; heatmaps of DHS LFC and motif strength (-Log10[pvalue]). G: fraction of DHSs overlapping a TF ChIP peak.]
Figure S2: Characterization of SMARCA4 remodeled DHSs. Related to Figure #2
a) Genomic distribution of DHSs. Pie charts show the fraction of DHSs overlapping GENCODE transcript
annotations for all A549 DHSs (left) and SMARCA4 remodeled DHSs (right).
b) Overlap of remodeled DHSs with cell type specific DHSs from relevant cell types. Plotted values are
normalized to the fraction of A549 SMARCA4-/- DHSs overlapping the cells’ DHSs. Cell type specific
DHSs are defined as DHSs active in a sample and present in <90% of all ENCODE samples.
c) Quantitative SMARCA4 CUT&RUN signal at remodeled DHSs, unchanged A549 DHSs, and a
background of DHSs active in other cell types. Median signal is plotted with error bars displaying 25th-
75th percentile.
d) Relative SMARCA2 and SMARCA4 binding at SMARCA4 remodeled DHSs. Plotted is the median
SMARCA2 CUT&RUN signal in SMARCA4-/- cells (top left), SMARCA2 CUT&RUN signal in
SMARCA4+/+ cells (top middle), SMARCA4 CUT&RUN signal in SMARCA4-/- cells (bottom left),
SMARCA4 CUT&RUN signal in SMARCA4+/+ cells (bottom middle) and the ratio of
SMARCA4/SMARCA2 CUT&RUN signal in SMARCA4+/+ cells (right) for SMARCA4 remodeled and
unaffected DHSs that overlap SMARCA4 CUT&RUN peaks. Error bars show 25-75th percentiles.
e) Top significant motifs in SMARCA4 remodeled DHSs identified by de novo motif search (MEME).
f) Top: Aggregate cleavage profile (trimmed mean of middle 98% of log2 observed/expected values) around
SP1 motifs in SMARCA4 remodeled and unchanging DHSs in SMARCA4-/- and SMARCA4+/+ samples.
Motifs were selected from DHSs that overlap both a SMARCA4 CUT&RUN peak and SP1 ChIP-Seq peak
and are footprinted in either SMARCA4-/- or SMARCA4+/+ cells. Bottom: DHSs were ordered by motif
footprint occupancy in SMARCA4-/- samples. From left to right, heatmaps display the average log fold
change in accessibility after SMARCA4 reactivation, DNaseI cleavages around SP1 motif in SMARCA4-/-
cells, DNaseI cleavages around SP1 motif in SMARCA4+/+ cells, and AP1 motif strength. Ordering of
DHSs (rows) is consistent across heatmaps. To highlight the trend with footprint occupancy, for heatmaps
displaying DHS log fold change and motif p-values, DHSs were separated into 25 bins and average values
for each bin are shown.
g) Overlap of A549 DHSs, SMARCA4 remodeled DHSs, and DHSs from other cell types with transcription
factor (TF) ChIP-Seq peaks from unedited A549s.
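The aggregate cleavage profile in panel f uses a trimmed mean of the middle 98% of values. A minimal sketch of that statistic is given below; the function names and the rule for rounding the number of trimmed values are assumptions, not the authors' code.

```python
def trimmed_mean(values, keep=0.98):
    """Mean of the middle `keep` fraction of values (equal trimming at both tails)."""
    vals = sorted(values)
    n = len(vals)
    # Number of values dropped from each tail; the small epsilon guards
    # against floating-point error when n * (1 - keep) / 2 is an integer.
    k = int(n * (1.0 - keep) / 2.0 + 1e-9)
    kept = vals[k:n - k]
    return sum(kept) / len(kept)

def aggregate_profile(profiles, keep=0.98):
    """Per-position trimmed mean over equal-length log2(obs/exp) vectors, one per motif site."""
    return [trimmed_mean(column, keep) for column in zip(*profiles)]
```

Trimming both tails keeps a few extreme cleavage values from dominating the aggregate at any single position.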
[Figure S3 image, panels A-G. A: RNA LFC versus DHS distance to TSS (1,000 to 500,000 bp) for all genes and for the closest gene. B: pairwise correlation heatmaps (scale 0.5 to 1.0) of DNaseI-Seq and RNA-Seq samples across SMARCA4-/- clones 1-2 and SMARCA4+/+ clones 1-3. C: expression change (RNA log2 fold change) versus number of distal remodeled DHSs (0 to >=8); histograms of RNA LFC for genes with 0, 1, 4, and >=8 DHSs, annotated with mixture-model estimates (μ = 0.84, π1 = 17%; μ = 1.09, π1 = 46%; μ = 1.35, π1 = 67%). D: fraction of genes affected versus number of distal remodeled DHSs (0 to >=10), with binomial model (p = 0.19). E: mean log2 fold change of affected genes versus number of distal remodeled DHSs (1 to >=10), with additive binomial model; annotation: expected LFC effect size ~2x with 10 remodeled DHSs. F: fraction of genes upregulated for single-TSS and multiple-TSS genes, with all protein-coding genes as reference. G: fraction of differentially accessible promoters for single-TSS and multiple-TSS genes (primary TSS, alternative TSS, both), with all protein-coding genes as reference.]
Figure S3: Quantification of relationship between chromatin and gene expression. Related to Figure #3
a) Relationship between gene expression and DHSs as a function of distance between a gene’s TSS and the
DHS. Plotted is the average change in gene expression for all genes within a given genomic distance of a
SMARCA4 remodeled DHS (black) or genes within a given genomic distance of a SMARCA4 remodeled
DHS only if it is the closest gene (blue). Error bars display +/- SEM.
b) Heatmaps of pairwise correlation (Pearson r of log[counts + 1]) between DNaseI-seq and RNA-seq
samples. Color scale is the same for DNaseI and RNA heatmaps.
c) Distribution of expression changes as a function of remodeled DHSs. Top: As in Figure #3c, mean change
in expression at genes grouped by number of neighboring SMARCA4 responsive DHSs. Dark error bars
show +/- SEM, light error bars show 25th-75th percentile. Bottom: Histograms of the distribution of changes in
gene expression for genes with n = 0, n = 1, n = 4, and n ≥ 8 remodeled DHSs. For each distribution, a
mixture model was fit to deconvolve unaffected genes (black, distribution of expression for genes with n =
0 remodeled DHSs) and affected genes (red). Mean of the affected distribution and the fraction of genes
assigned to the affected distribution are reported.
d) Comparison of the fraction of genes that change expression with a binomial model where the fraction of
DHSs that lead to upregulation is determined from the population of genes with 1 remodeled DHS. Green
dots show estimates based on mixture model of actual data. Shaded area shows bootstrapped 95%
confidence intervals of binomial model.
e) Comparison of the average change in expression of the subset of genes that change expression with a
binomial model where the fraction of DHSs that lead to upregulation and the average effect of those DHSs
is determined from the population of genes with 1 remodeled DHS. Green dots show estimates based on
mixture model of actual data. Shaded area shows bootstrapped 95% confidence intervals of binomial
model.
f) Genes with a single annotated TSS were compared to genes with multiple distinct TSSs. To define distinct
TSSs, for each gene the primary TSS with the highest accessibility was chosen. All TSSs falling within 1
kb of the primary TSS were considered to be associated with that TSS/promoter while annotated TSSs > 1
kb were considered alternative TSSs. Barplot displays the fraction of upregulated genes after SMARCA4
reactivation for genes with a single TSS and multiple distinct TSSs.
g) Barplot displays the fraction of genes with change in promoter accessibility after SMARCA4 reactivation
for genes with a single TSS and multiple distinct TSSs.
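The binomial model in panels d and e can be written out as a short worked sketch. It assumes each remodeled DHS independently activates its gene with probability p and that effects add. The function names are hypothetical, and p = 0.19 is the value printed in the panel d legend of the original figure.

```python
def fraction_affected(n_dhs, p):
    """P(at least one of n_dhs remodeled DHSs activates the gene) = 1 - (1 - p)^n."""
    return 1.0 - (1.0 - p) ** n_dhs

def expected_lfc_affected(n_dhs, p, mu_single):
    """Expected LFC of affected genes if each activating DHS adds mu_single.

    E[number of activating DHSs | at least one] = n * p / (1 - (1 - p)^n).
    """
    return mu_single * n_dhs * p / fraction_affected(n_dhs, p)

if __name__ == "__main__":
    for n in (1, 4, 10):
        print(n, round(fraction_affected(n, 0.19), 3),
              round(expected_lfc_affected(n, 0.19, 1.0), 3))
```

Under these assumptions, genes with 10 remodeled DHSs are expected to show about twice the single-DHS effect, which agrees with the "~2x" annotation in panel e.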
[Figure S4 image, panels A-K. A: Ripley-K versus genomic distance (0 to 500 kb) for activated DHSs (experimental data) and all DHSs, with bootstrapped background. B: H3K4me2 CUT&RUN density from region start to region end in SMARCA4-/- and SMARCA4+/+ cells, with Log2FC heatmap. C: Venn diagrams of HMM remodeled regions versus SMARCA4 binding clusters (super enhancers): genomic overlap 163 Mb, 6 Mb, 17 Mb; element overlap 52, 150, 960; gene overlap 221, 139, 789; boxplots of RNA LFC for region, both, and super enhancer. D-G: browser views showing Hi-C, TADs, chromatin loops, remodeled region, and DNaseI-Seq in SMARCA4-/- and SMARCA4+/+ cells at chr21:34,000,000-36,500,000, chr9:18,000,000-20,000,000, chr5:95,000,000-100,000,000, and chr8:47,000,000-49,000,000 (hg38). H: TAD boundary density versus relative genomic location (region start to region end). I: frequency histogram of loop/region overlap fraction (0.0 to 1.0), observed versus randomized. J: count heatmap over H3K27me3 and H3K4me3 bins. K: Log2 enrichment heatmap over H3K27me3 and H3K4me3 bins.]
Figure S4: Identification and characterization of SMARCA4 remodeled domains. Related to Figure #4
a) Linear Ripley's K (a measure of spatial clustering) of remodeled DHSs versus all detected DHSs. Shaded
region represents the null distribution +/- 95% confidence interval based on 1000 block permutations of the
remodeled DHSs.
b) H3K4me2 signal around identified regions in SMARCA4-/- and SMARCA4+/+ clones. Top: Lineplots of the
aggregate (trimmed mean of middle 95%) score over all regions. Bottom: heatmaps of log fold change
values for individual regions.
c) Top: Venn diagram of overlap of SMARCA4 remodeled domains with SMARCA4 super-enhancers
identified by the ROSE algorithm. Overlap is shown by genomic distance (left), element number (right)
and genes linked to regions/super enhancers (bottom). Bottom: Boxplots of change in expression of genes
linked to SMARCA4 remodeled domains, SMARCA4 super enhancers, or both.
d) Hi-C signal, TAD annotations, chromatin loops, and DNaseI cleavage density at an example locus
identified to contain a remodeled domain.
e) As d.
f) As d.
g) As d.
h) Density of TAD boundaries relative to the identified remodeled domains. Lineplot displays density of TAD
boundaries while tick marks below show individual TAD boundaries.
i) Overlap between Hi-C chromatin loops and regional changes in chromatin accessibility. Histogram of the
maximum fraction of genomic overlap between the region and a chromatin loop for each region (blue) with
the same quantity for a background of random regions with equal numbers of DHSs (grey) for comparison.
j) Heatmap of joint distribution for H3K27me3 and H3K4me3. TADs were binned into deciles based on the
average signal of each histone mark, and the number of TADs in each joint bin is plotted.
k) Frequency of TADs overlapping remodeled domains for each joint bin of H3K4me3/H3K27me3 signal.
Heatmap displays Log2(observed/expected) for each bin.
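The clustering statistic in panel a can be illustrated with a one-dimensional Ripley's K estimator. This is a generic textbook form without edge correction. The normalization by n(n - 1) and the function name are assumptions, and the block-permutation null used for the shaded region is not shown.

```python
def ripley_k_1d(positions, r, length):
    """1D Ripley's K: length / (n * (n - 1)) * #{ordered pairs i != j with |x_i - x_j| <= r}.

    For points placed uniformly at random on a long interval, K(r) is close to 2r;
    larger values indicate clustering at scale r.
    """
    pts = sorted(positions)
    n = len(pts)
    pairs = 0
    for a in range(n):
        b = a + 1
        # Count unordered pairs (a, b) within distance r of each other.
        while b < n and pts[b] - pts[a] <= r:
            pairs += 1
            b += 1
    return length * 2 * pairs / (n * (n - 1))
```

Evenly spaced points such as `[0, 1, 2, 3]` on an interval of length 4 give K(1) = 2, whereas tightly grouped points give a larger value at the same radius.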
[Figure S5 image, panels A-D. A: expression (FPKM) of the SMARCA4 downregulated signature (N = 410) in TCGA SMARCA4 WT, TCGA SMARCA4 LOF, and A549(-/-). B: H3K27me3 ChIP-Seq signal for stable and reactivated gene sets; annotation p = 0.266. C: DLC1 expression (FPKM) in SMARCA4-null and WT tumor samples. D: GSEA normalized enrichment score (-8 to 6) by GO term: ACTIN_FILAMENT_BASED_PROCESS, REGULATION_OF_CELL_MORPHOGENESIS_INVOLVED_IN_DIFFERENTIATION, POSITIVE_REGULATION_OF_CELL_DEVELOPMENT, CELL_JUNCTION_ORGANIZATION, POSITIVE_REGULATION_OF_MAPK_CASCADE, REGULATION_OF_CELL_GROWTH, EPITHELIAL_CELL_DIFFERENTIATION, RHO_PROTEIN_SIGNAL_TRANSDUCTION, CELL_CYCLE_G1_S_PHASE_TRANSITION, DNA_REPLICATION, MULTI_ORGANISM_METABOLIC_PROCESS, AMIDE_BIOSYNTHETIC_PROCESS, RRNA_METABOLIC_PROCESS.]
Figure S5: Comparison of expression changes in A549s to a TCGA derived, SMARCA4-null gene signature.
Related to Figure #5
a) Comparison of expression of SMARCA4-null gene signature (TCGA) with SMARCA4-/- A549s. Boxplot
displays distribution of gene expression for the SMARCA4-null gene signature in TCGA SMARCA4-WT,
TCGA SMARCA4-null, and A549s (SMARCA4-/-).
b) Boxplots display distribution of H3K27me3 signal around the TSSs of genes in the reactivated and stable
gene sets.
c) Example of a reactivated gene: Boxplots displaying expression of DLC1 in SMARCA4-null and
SMARCA4-WT tumor samples.
d) Enrichment of biological process GO terms by GSEA in up- and down-regulated genes after SMARCA4
reactivation.
DS-ID clone_ID Clone Total_Reads Aligned_Nuclear_Reads Nuclear_Mapping_Rate Nuclear_Duplicate_Rate SPOT Hotspot_Number Hotspot_Coverage_(Bp)
DS62068 A3 WT 82100174 35757728 0.435537786 13.62931112 0.6954 73242 30957873
DS62113 A3 WT 81576474 42730132 0.523804596 8.379403087 0.5805 72367 31206386
DS62073 B8 WT 260458804 127451240 0.489333584 14.63560496 0.5351 89616 43074752
DS62118 B8 WT 133150566 66290020 0.497857591 9.507675514 0.6293 84140 37828699
DS62078 E9 Rescue 113297102 54913020 0.484681594 6.284986694 0.5814 96058 41489731
DS62123 E9 Rescue 93097004 44993816 0.483300365 5.740877813 0.5462 83838 32038179
DS62128 C12 Rescue 147458648 69925906 0.474206884 6.555075597 0.5629 112363 46108620
DS62149 C12 Rescue 352400134 192455072 0.546126557 12.87358226 0.5018 133135 58010819
DS62133 F3 Rescue 138212502 81486932 0.589577143 7.610697627 0.5042 115540 45743470
DS62154 F3 Rescue 127891098 84769052 0.662822146 4.467566772 0.3775 113470 43047811

Table S1: Metadata and library statistics of DNaseI-Seq experiments. Related to STAR methods.

DS-ID clone_ID Clone Total_Reads Ribosomal_Fraction Nuclear_Duplicate_Rate
DS62081 A3 WT 37741048 0.002260986 19.13560566
DS62136 A3 WT 21387064 0.003123009 19.07689757
DS62082 B8 WT 19496110 0.002126783 13.00062813
DS62137 B8 WT 23606732 0.001931907 15.14737593
DS62083 E9 Rescue 19386874 0.001954209 14.73615514
DS62138 E9 Rescue 42126714 0.00217838 17.044498
DS62157 C12 Rescue 30192364 0.002025082 14.87589216
DS62139 C12 Rescue 43113384 0.0023672 18.32239776
DS62158 F3 Rescue 9700602 0.00273715 11.39960602
DS62140 F3 Rescue 26039012 0.003401819 16.72635612

Table S2: Metadata and library statistics of RNA-Seq experiments. Related to STAR methods.
DS-ID Antibody clone_ID Clone Total_Reads Aligned_Nuclear_Reads Nuclear_Mapping_Rate Nuclear_Duplicate_Rate Median_Insert_Size
DS65989 H3K27me3 A3 WT 14356010 12571940 0.875726612 1.805671997 175
DS65990 H3K27me3 B8 WT 16059896 13983776 0.870726436 1.705791054 168
DS65991 H3K27me3 C12 Rescue 17835566 15638850 0.876835083 1.873795068 166
DS65992 H3K27me3 E9 Rescue 15286732 13534414 0.885370006 1.938539785 165
DS65993 H3K27me3 F3 Rescue 15535854 13539026 0.871469698 1.924820884 171
DS65994 H3K4me1 A3 WT 13275202 11917610 0.897734739 1.869082811 172
DS65995 H3K4me1 B8 WT 13594488 12273226 0.902808991 1.909310559 168
DS65996 H3K4me1 C12 Rescue 12992438 11750824 0.904435642 1.87384306 167
DS65997 H3K4me1 E9 Rescue 10380306 9257664 0.891848853 1.749404601 163
DS65998 H3K4me1 F3 Rescue 14371786 13020186 0.905954625 1.938927754 164
DS65999 H3K4me2 A3 WT 17061756 15585638 0.913483817 2.581966808 164
DS66000 H3K4me2 B8 WT 12826120 11620962 0.906038771 2.14302396 158
DS66001 H3K4me2 C12 Rescue 20311292 18620400 0.916751135 2.760026637 157
DS66002 H3K4me2 E9 Rescue 18610562 17018368 0.914446753 3.227359991 152
DS66003 H3K4me2 F3 Rescue 20426588 18684266 0.914703229 2.832019197 160
DS66514 SMARCA4 A3 WT 21210270 15387936 0.725494584 43.00862702 111
DS66515 SMARCA4 B8 WT 19863482 13661520 0.687770654 54.78930602 101
DS66516 SMARCA4 C12 Rescue 24714828 20029624 0.810429431 22.73058146 88
DS66517 SMARCA4 E9 Rescue 24659906 20466306 0.82994258 15.15720521 89
DS66518 SMARCA4 F3 Rescue 25808664 21141410 0.819159411 20.08004197 87
DS66519 SMARCA2 A3 WT 22266478 18083030 0.812119007 20.18917184 92
DS66520 SMARCA2 B8 WT 24096138 20317310 0.843177027 23.36750288 101
DS66521 SMARCA2 C12 Rescue 25568084 18805740 0.735516201 33.93028937 71
DS66522 SMARCA2 E9 Rescue 24379774 19673216 0.806948251 23.72461117 90
DS66523 SMARCA2 F3 Rescue 31411738 24103992 0.767356203 8.826977706 112

Table S3: Metadata and library statistics of CUT&RUN experiments. Related to STAR methods
TF Peak-Number Accession-ID
SREBF1-human 3429 ENCFF624DDK
YY1-human 17078 ENCFF613DTQ
FOXA1-human 33874 ENCFF297HAX
POLR2AphosphoS2 6272 ENCFF156MIR
PHF8-human 17048 ENCFF907WHF
TCF12-human 31794 ENCFF228CDD
HDAC2-human 4167 ENCFF814DAF
EHMT2-human 2352 ENCFF199OOU
ETS1-human 9988 ENCFF896WFR
HES2-human 3242 ENCFF558XCJ
SREBF2-human 838 ENCFF483YCC
ELF1-human 11737 ENCFF935ZUW
CHD2-human 3440 ENCFF310IDS
MAZ-human 4323 ENCFF661NNJ
CBX8-human 2819 ENCFF330OCU
CEBPB-human 47003 ENCFF047UIF
CHD4-human 3542 ENCFF766YPH
RAD21-human 26062 ENCFF897QCA
CBX2-human 141 ENCFF208AXT
NR3C1-human 665 ENCFF963CGV
CTCF-human 43844 ENCFF535MZG
SIN3A-human 38272 ENCFF567BJI
FOSL2-human 33138 ENCFF808RWZ
TAF1-human 17246 ENCFF886KDK
JUND-human 21802 ENCFF587VEY
RFX5-human 6479 ENCFF179WDI
ZFP36-human 11426 ENCFF137JHO
ELK1-human 470 ENCFF605JXG
SP1-human 43742 ENCFF404OSB
RNF2-human 4801 ENCFF110EOX
KDM5A-human 5209 ENCFF149INM
MYC-human 9437 ENCFF542GMN
JUNB-human 11264 ENCFF565QYS
ZC3H11A-human 7858 ENCFF415SIS
SIX5-human 8553 ENCFF189NMX
ATF3-human 11014 ENCFF851UTY
REST-human 9886 ENCFF706DRE
USF2-human 11956 ENCFF593EOW
SMC3-human 23810 ENCFF256LDD
BCL3-human 12352 ENCFF093ZAB
GABPA-human 17425 ENCFF520GJC
POLR2A-human 31834 ENCFF664KTN
RCOR1-human 862 ENCFF993WZP
MAFK-human 70080 ENCFF813WJW
ESRRA-human 2631 ENCFF558UWY
KDM1A-human 7245 ENCFF316CBQ
EP300-human 4797 ENCFF727TYG
CREB1-human 3289 ENCFF576PUH
JUN-human 1767 ENCFF127HJG
ZBTB33-human 11349 ENCFF593ZJA
NFE2L2-human 7256 ENCFF418TUX

Table S4: ENCODE ChIP-Seq experiments used in the analysis of remodeled DHSs. Related to Figure 2.
OLS Regression Results
==============================================================================
Dep. Variable: RNA_LFR R-squared: 0.246
Model: OLS Adj. R-squared: 0.245
Method: Least Squares F-statistic: 414.9
Date: Tue, 23 Apr 2019 Prob (F-statistic): 0.00
Time: 20:57:35 Log-Likelihood: -13494.
No. Observations: 16544 AIC: 2.702e+04
Df Residuals: 16530 BIC: 2.712e+04
Df Model: 13
Covariance Type: nonrobust
=============================================================================================================
coef std err t P>|t| [0.025 0.975]
-------------------------------------------------------------------------------------------------------------
Intercept -0.0039 0.010 -0.383 0.702 -0.024 0.016
H3K4me3_Peak[T.True] 0.0530 0.018 3.019 0.003 0.019 0.087
H3K27me3_Peak[T.True] -0.0047 0.024 -0.196 0.845 -0.051 0.042
Bivalent[T.True] -0.0814 0.028 -2.925 0.003 -0.136 -0.027
Methylation_Fraction -0.0012 0.010 -0.119 0.905 -0.021 0.019
CpG_Island 0.0395 0.019 2.125 0.034 0.003 0.076
DHS_Score 0.5440 0.018 30.996 0.000 0.510 0.578
H3K4me3_Peak[T.True]:DHS_Score -0.0729 0.035 -2.096 0.036 -0.141 -0.005
H3K27me3_Peak[T.True]:DHS_Score 0.0538 0.040 1.360 0.174 -0.024 0.131
Bivalent[T.True]:DHS_Score 0.4889 0.047 10.387 0.000 0.397 0.581
Methylation_Fraction:DHS_Score -0.1010 0.020 -4.994 0.000 -0.141 -0.061
CpG_Island:DHS_Score -0.0478 0.057 -0.845 0.398 -0.159 0.063
Methylation_Fraction:CpG_Island 0.0463 0.019 2.416 0.016 0.009 0.084
Methylation_Fraction:CpG_Island:DHS_Score 0.0338 0.057 0.596 0.551 -0.077 0.145
==============================================================================
Omnibus: 2444.944 Durbin-Watson: 2.034
Prob(Omnibus): 0.000 Jarque-Bera (JB): 39894.936
Skew: -0.017 Prob(JB): 0.00
Kurtosis: 10.607 Cond. No. 32.9
==============================================================================
Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.

Table S5: Promoter state regression results. Related to Figure #3.
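To make the interaction terms in Table S5 easier to read, the fitted model can be evaluated directly. The sketch below copies the coefficients from the table. The function name and the covariate encoding (booleans as 0/1, CpG_Island as 0/1) are assumptions, and the scaling of DHS_Score and Methylation_Fraction is not stated in the source.

```python
COEF = {
    "Intercept": -0.0039, "K4": 0.0530, "K27": -0.0047, "Biv": -0.0814,
    "Meth": -0.0012, "CpG": 0.0395, "DHS": 0.5440,
    "K4:DHS": -0.0729, "K27:DHS": 0.0538, "Biv:DHS": 0.4889,
    "Meth:DHS": -0.1010, "CpG:DHS": -0.0478,
    "Meth:CpG": 0.0463, "Meth:CpG:DHS": 0.0338,
}

def predict_rna_lfc(dhs_score, h3k4me3=False, h3k27me3=False, bivalent=False,
                    methylation=0.0, cpg_island=0.0):
    """Evaluate the Table S5 regression for one promoter (hypothetical helper)."""
    c = COEF
    k4, k27, biv = float(h3k4me3), float(h3k27me3), float(bivalent)
    # Main effects that do not involve the DHS score.
    base = (c["Intercept"] + c["K4"] * k4 + c["K27"] * k27 + c["Biv"] * biv
            + c["Meth"] * methylation + c["CpG"] * cpg_island
            + c["Meth:CpG"] * methylation * cpg_island)
    # Slope on DHS_Score, modified by the promoter-state interaction terms.
    slope = (c["DHS"] + c["K4:DHS"] * k4 + c["K27:DHS"] * k27 + c["Biv:DHS"] * biv
             + c["Meth:DHS"] * methylation + c["CpG:DHS"] * cpg_island
             + c["Meth:CpG:DHS"] * methylation * cpg_island)
    return base + slope * dhs_score
```

If a bivalent promoter is encoded with both peak indicators and the bivalent indicator set to true (an assumed encoding), the slope on DHS_Score is 0.5440 - 0.0729 + 0.0538 + 0.4889 = 1.0138. That is roughly twice the baseline slope of 0.5440.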
