Research
Exploring the Function and Regulation of Endogenous Retroviruses (ERVs)
It is well known that retroviruses pose a threat to human health by infecting somatic cells, but retroviruses have also been infecting our mammalian ancestors for millions of years and have been incorporated into the germline as endogenous retroviruses (ERVs) that account for nearly 10% of our genomic DNA. We are interested in studying ERVs from two perspectives: 1) as a parasite that must be kept in check by the host to prevent widespread viral activation and genomic instability and 2) as a symbiont that can be co-opted and utilized by the host for developmental and evolutionary advantage. Specifically, our work aims to study how the host has adapted molecular recognition machinery to transcriptionally silence ERVs, how ERVs have sometimes evaded these silencing mechanisms during development, and how this evasion has led to host co-option of viral regulatory sequences that may have contributed to early cell fate choices in the pre-implantation embryos of placental mammals. These studies will lead to important insights into the molecular “arms race” between retroviruses and their hosts that continues to play out today, and will also lead to a better understanding of how ERVs have played an integral role in driving our development and evolution as a species.
Exploring the function of KRAB-Zinc finger Proteins as repressors of ERVs
Krüppel-associated box zinc finger proteins (KZFPs) make up the largest family of transcription factors in mammals (estimated at several hundred in mice and humans), with even deeper roots in amniotes. Each species has its own unique repertoire of KZFPs, with a small number shared with closely related species and a larger fraction specific to each species. Despite their abundance, little is known about their physiological functions. KZFPs consist of an N-terminal KRAB domain that binds the co-repressor KAP1 and a variable number of C-terminal C2H2 zinc finger domains that mediate sequence-specific DNA binding. Once bound to the KRAB domain, KAP1 recruits the histone methyltransferase (HMT) SETDB1 and heterochromatin protein 1 (HP1) to initiate heterochromatic silencing. Several lines of evidence point to a role for the KZFP family in ERV silencing. First, the number of C2H2 zinc finger genes in mammals correlates with the number of ERVs. Second, the KZFP ZFP809 was isolated based on its ability to bind to the primer binding site for proline tRNA (PBSpro) of murine leukemia virus (MuLV). Third, deletion of the genes encoding the KZFP co-repressors, Trim28 (which encodes KAP1) or Setdb1, leads to activation of many ERVs. Thus, we have begun a systematic interrogation of KZFP function as a potential adaptive repression system against ERVs.
We initially focused on ZFP809 as a likely ERV-suppressing KZFP since it was originally identified as part of a repression complex that recognizes infectious MuLV via direct binding to the 18 nt PBSpro sequence. Using ChIP-seq of epitope-tagged ZFP809, we determined that ZFP809 binds to several sub-classes of ERV elements via the PBSpro. We found that Zfp809 knockout tissues displayed high levels of VL30pro element expression and that the targeted elements displayed a shift away from repressive epigenetic marks. ZFP809-mediated repression extended to a handful of genes that contained adjacent VL30pro integrations. These studies provided the first demonstration of the in vivo requirement of a KZFP in the recognition and silencing of ERVs.
As a follow-up to our studies on ZFP809, we began a systematic analysis of KZFPs using a medium-throughput ChIP-seq screen and functional genomics of KZFP clusters and individual KZFP genes. Our ChIP-seq data indicate that the majority of recently evolved KZFP genes interact with and repress distinct and partially overlapping ERV targets. This hypothesis is strongly supported by the distinct ERV reactivation phenotypes we observed in mouse ESC lines, each lacking one of five of the largest KZFP gene clusters. Furthermore, our preliminary evidence suggests that KZFP cluster KO mice are viable but have elevated rates of somatic retrotransposition of specific retrotransposon families, providing the first direct genetic link between KZFP gene diversification and retrotransposon mobility.
Ancient KZFPs and phenotypic innovations in mammals
Although the majority of mouse KZFPs explored to date bind to ERVs and other retrotransposon families, we have begun to focus on more ancient KZFPs that are conserved across evolution and play important roles in mammalian development.
ZFP568
We determined that ZFP568 is a direct repressor of a placenta-specific isoform of the Igf2 gene called Igf2-P0. Insulin-like growth factor 2 (Igf2) is the major fetal growth hormone in mammals. We demonstrated that loss of Zfp568, which causes gastrulation failure, or mutation of the ZFP568 binding site at the Igf2-P0 promoter causes inappropriate Igf2-P0 activation. We also showed that the lethality caused by loss of Zfp568 could be rescued by deletion of Igf2. These data highlight the exquisite selectivity with which members of the KZFP family repress their targets and identify an additional layer of transcriptional control of a key growth factor regulating fetal and placental development. In an exciting follow-up to these studies, we determined that ZFP568 is highly conserved and under purifying selection in eutherians, with the exception of humans. Human ZNF568 allelic variants have lost the ability to bind and repress Igf2-P0, which may have been driven by the loss of the Igf2-P0 transcript in human placenta.
ZFP110 (called ZNF274 in humans)
We have found that Zfp110 is essential for development in mice and, like its human ortholog, binds to the 3' end of KZFP genes. We speculate that it is necessary to protect KZFP genetic loci that may be particularly unstable.
ZFP661 (called ZNF2 in humans)
We demonstrated that Zfp661 mutants have dendrite arborization defects and display autism-like behaviors. Mechanistically, ZFP661 balances the expression of clustered protocadherin genes, increasing their diversity in neurons, by binding antagonistically near CTCF sites, which prevents CTCF from trapping cohesin. We hypothesize that the emergence of ZFP661 was a critical adaptation that co-evolved with larger brains in mammals.
ZFP777 (called ZNF777 in humans)
Zfp777 is a highly conserved KZFP with deep roots in amniotes. We found that Zfp777 is essential for survival in mice, with KOs dying perinatally. Zfp777 mutants display ventricular septal heart defects, similar to a human patient harboring a ZNF777 mutation. Mechanistically, ZFP777 binds to the promoters of thousands of developmental genes, where it is necessary for repression of these target genes.
PRDM9 and the regulation of Meiotic Recombination
PRDM9 is the ancestral member of the KZFP family. PRDM9 contains a DNA-binding zinc finger array and a KRAB domain, like other KZFPs, but it is unique in several respects. First, PRDM9 does not interact with KAP1. Second, PRDM9 also contains a histone methyltransferase domain that methylates histone H3 on both K4 and K36 to generate the H3K4me3/H3K36me3 dual mark. Third, PRDM9 is expressed exclusively during a brief window of meiotic prophase, where its activity directs the programmed DNA double-strand break machinery to initiate meiotic recombination. Fourth, the zinc finger array is rapidly evolving within and between species. We have begun exploring how this rapid evolution may be linked with infertility. PRDM9 mutations cause sterility in male and female mice, and mutations have also been linked with infertility in humans. Our ongoing work has demonstrated that rare PRDM9 alleles identified in men with azoospermia have gain- and loss-of-function phenotypes in DNA binding assays, providing a plausible link to infertility mechanisms.
We also began an in silico search for factors that may function downstream of PRDM9. We identified two factors, ZCWPW1 and ZCWPW2, that bind to the dual histone methylation mark placed by PRDM9 (H3K4me3 and H3K36me3) in vitro and at hotspots in vivo. We showed that loss of either Zcwpw1 or Zcwpw2 in mice leads to complete male sterility, meiotic arrest and failed synapsis. Strikingly, we determined that the positioning of DNA double-strand breaks (DSBs) is not altered in Zcwpw1 KOs, demonstrating that Zcwpw1 is required not for the initiation but for the repair of PRDM9-induced DSBs. However, in Zcwpw2 KOs, DSBs partially relocated towards promoters. This suggests that Zcwpw2 plays a role in linking PRDM9-induced histone methylation marks to efficient production of DSBs at hotspots, while Zcwpw1 plays a critical role in ensuring efficient homologous DNA repair. Our ongoing studies are focusing on how the ZCWPW1/2 factors impact meiotic recombination.
Cataloguing new KZFP genes using long-read sequencing
We have embarked on an effort to fully annotate the KZFP genes throughout murine species, as many KZFP genetic loci have large gaps in the reference assembly. We are utilizing a combination of long-read sequencing methods to provide near telomere-to-telomere (T2T) genomes of mice along with comprehensive annotations of KZFP genes in both laboratory and wild strains. Our data show that waves of TE/ERV integration are a driving force of KZFP gene innovation and adaptation within KZFP gene clusters.