Redundancy ensures integrity of gene expression independent of genome architecture

Part of finding out how something works is trying to break it. Two recent papers from Professor Jennifer Mitchell’s lab try to break the folded structure in the genome that links the essential Sox2 gene with its far-away regulators. But no matter how many elements they removed or added around the Sox2 region, this structure was maintained, revealing the depths of the redundant systems that protect genome integrity.

Sox2 is crucial for the identity of embryonic stem cells. Dr Ian Tobias, a postdoctoral fellow in the Mitchell lab in Cell & Systems Biology, notes that “It makes a lot of sense, from an evolutionary and developmental biology standpoint, that such a key regulator of cell fate wouldn’t be so fragile as to have its regulatory control lost with the deletion of a small segment of DNA.”

Sox2 and its distant control region are bundled together

Inside our cells, the genome consists of long ribbons of DNA that contain all the information to make every part of the body. In organs like lung or brain, the DNA is fenced off in different ways to turn on lung- or brain-specific genes, while hiding genes that should not be active. The Sox2 gene region is open and active in embryonic stem cells but closed in most other cell types.

Active Sox2 in embryonic stem cells is isolated within a region that contains only the gene and its activators, including the Sox2 Control Region (SCR). The SCR is over 100,000 base pairs away from Sox2, so this region is relatively long. Individual organised regions of the genome have been called topologically associating domains (TADs), and the Mitchell lab studies how these domains are established and maintained.

Tiegh Taylor, a PhD student in the Mitchell lab, used gene editing to tease out the mechanisms that keep the Sox2-SCR gene domain intact. Working with colleagues in the Sexton lab in Strasbourg, France, Taylor studied the region around Sox2 and challenged the dogma of how the winding ribbons of DNA in a cell are organized.

In large-scale studies of the genome, the CTCF protein is almost always found at TAD boundaries, leading to the idea that CTCF-bound DNA acts as a fence that isolates DNA outside of the domain from DNA inside the domain. Separate transcription factor proteins are responsible for individual gene activation, bundling DNA within the TAD together.

CTCF is not required for Sox2 TAD maintenance

What the Mitchell and Sexton labs showed is that transcription factor proteins can still keep DNA organized in embryonic stem cells even in the absence of the CTCF fences.

How did they show this? The SCR is composed of two clusters of transcription factor binding sites flanking a CTCF site. At the Sox2-SCR loop, transcription factor proteins interact to enhance gene activation, giving the SCR the designation of an “enhancer”.

When the researchers removed the transcription factor-binding clusters, Sox2 enhancement was lost, but the TAD bundle remained in place. Removing all of the SCR including the CTCF site loosened the bundle to interact with an adjacent TAD, so presumably CTCF is holding the TAD together.

Surprisingly, Taylor found that although this CTCF site was sufficient to fence in the TAD, it wasn’t necessary. When the CTCF site alone was removed from the SCR, the TAD bundle was maintained, presumably due to the transcription factors binding to the enhancer surrounding the missing CTCF binding site. These transcription factor interactions are so strong, in fact, that they are able to ‘jump over’ an artificial CTCF fence that was later placed between Sox2 and the SCR.

Taylor notes that “It wasn’t clear what the finding was at first. As the data came in, it seemed like nothing we could do would break this interaction. It wasn’t until we thought about it for a while that we realized that maybe it’s not that these various proteins are so defined in their roles of interaction vs enhancer; it’s more about this complex cooperativity between all these factors that makes this locus so robust. It really highlights just how complicated our genome is.”

This work was published in the journal Genes & Development as “Transcriptional regulation and chromatin architecture maintenance are decoupled functions at the Sox2 locus”.

Transcription factors maintain Sox2 architecture, even as stem cells become neural progenitors in mice

Taylor’s colleague Dr Ian Tobias is studying what happens to enhancer activity when embryonic stem cells change into other cell types where Sox2 is active. “We’re interested in the neural lineage because it’s also highly dependent on the Sox2 protein; the model we use is neural progenitor cells (NPCs). During the differentiation of embryonic stem cells to NPCs, Sox2 interacts with a region that’s even further away than the SCR.”

Using techniques similar to Taylor’s approach, Tobias found that when the far-away region is deleted, Sox2 doesn’t turn on in NPCs, making this site a candidate Sox2 distal neural enhancer, or DNE. At the same time, collaborators at the NIH in Maryland were looking at TADs in neurons and saw a clear domain linking the DNE to Sox2. When the NIH group tried to block TAD formation in mice by inserting CTCF sites between the DNE and Sox2, the interaction was reduced but not eliminated, again showing how robust these interactions can be.

The result of this collaboration has been published in Nature Genetics as “Enhancer–promoter interactions can bypass CTCF-mediated boundaries and contribute to phenotypic robustness”.

New targets for understanding genome architecture and disease

The combined results in these two papers reveal that specific elements supplying enhancer function also contribute to maintaining DNA structure. They consequently argue for looking beyond CTCF to explain how DNA folds in the nucleus.

Professor Mitchell notes that “Now that we know these enhancer sequences are important for regulating Sox2 during development, our next steps are studying aberrant expression of Sox2 in disease.” Tobias and others in the Mitchell lab will be pursuing the role of Sox2 enhancers and their associated proteins in neurological disorders and other diseases.