Site-specific DNA replication origins exist in all prokaryotes, their plasmids and their phage, as well as in animal and plant viruses, and all single cell eukaryotes studied to date. We and others have shown that they consist of two essential components (1): one or more binding sites for an origin recognition protein (e.g. SV40 T-ag) or protein complex (e.g. ORC), and an easily unwound sequence, the “DNA unwinding element” (DUE). DNA unwinding begins within the DUE, followed by DNA synthesis on each of the resulting single-stranded DNA templates. Most, if not all, eukaryotic origins initiate replication in both directions, resulting in a transition from discontinuous to continuous DNA synthesis on each template strand, the “origin of bi-directional replication” (OBR; Fig. 2B). The two leading strand initiation events that mark the OBR are separated by only one or two nucleotides (2). Auxiliary origin components have been identified in both viral and cellular origins (e.g. ScARS1, Fig. 2C). We and others have shown that they consist of binding sites for transcription factors that facilitate either binding of origin recognition proteins or DNA unwinding (3).
Viral origins contain a specific core DNA sequence that is required for initiation to proceed efficiently, whereas the requirement for specific DNA sequences is greatly diminished in single cell eukaryotes and virtually absent in multicellular eukaryotes. For example, SV40 T-ag is assembled into a hexameric helicase only after binding to the SV40 replication origin. In contrast, S. cerevisiae replication origins share only ~15% of their sequences in common, namely a genetically required consensus sequence needed for ORC binding (elements A & B1, Fig. 2C). Origins in the fission yeast Schizosaccharomyces pombe lack a genetically required consensus sequence, although they do exhibit “autonomously replicating sequence” (ARS) activity. Thus, origins in multicellular eukaryotes are more similar to those in S. pombe than to those in S. cerevisiae in terms of their size, sequence complexity, and their lack of a required consensus sequence. Origins in multicellular organisms generally lack ARS activity, although they do exhibit “replicator” activity (the ability to impart origin activity when translocated to other chromosomal sites), and they can be inactivated by sequence alterations.
We and others have mapped replication origins to specific genomic loci in differentiated cells of flies, frogs and mammals (1,2), but the nature of these loci remains elusive. Studies employing 2D-gel fractionation of total genomic DNA to detect replication bubbles or to map the polarity of replication forks generally conclude that initiation events are distributed uniformly over intergenic regions as large as 55 kb (“initiation zones”), whereas methods that map either the relative distribution or the relative abundance of nascent DNA strands along the genome invariably conclude that initiation events originate within specific loci of ~1 kb or less, similar in size to fission yeast origins (Fig. 2A). The initiation zone may reflect a basal level of “random” initiation events bounded by two actively transcribed genes (4). The specific, high intensity initiation sites within intergenic regions contain replicators that can be inactivated by internal deletions, but they lack an identifiable, genetically required consensus sequence, such as the “ARS consensus sequence” found in budding yeast replicators (5-8). They also contain an OBR comparable to those reported in yeast (Fig. 2E). In addition, origin activity can be regulated from sequences such as “locus control regions” that are many kilobases distal to the OBR but affect the accessibility of initiation sites to replication proteins. Finally, metazoan ORCs do not require binding to specific DNA sequences in order to initiate assembly of a pre-RC. Early embryos undergoing rapid cell cleavage (e.g. frogs, flies, sea urchin, and fish) initiate DNA replication with no obvious requirement for any specific DNA sequences, and Drosophila ORC or human ORC can replace Xenopus ORC in frog egg extracts with the same result.
How might such disparate data be reconciled? The simplest answer is that metazoan genomes contain many potential initiation sites for DNA replication, but during animal development some of these sites are selectively activated while others are suppressed (1,9). This is evident from the simple fact that site-specific initiation is developmentally acquired (10). Initiation sites are uniformly distributed throughout the genome in embryos undergoing rapid cell cleavages prior to the onset of zygotic gene expression, with no apparent preference for specific sequences. After this stage, initiation events become restricted to specific sites. Remarkably, these specific sites appear to be reestablished each time a cell divides (11,12). Therefore, epigenetic as well as genetic parameters must determine where initiation will occur. Several epigenetic parameters have been implicated in eukaryotic origin specificity, such as nucleotide pool levels (13), transcription factor binding sites (M. Mechali and M. Leffak, unpublished results), ratio of initiation proteins to DNA (12), gene transcription (14), chromosome structure (10,15), nuclear organization (16), DNA topology (negative superhelical energy facilitates DNA unwinding) (17) and DNA methylation (18,19). Thus, DNA sequence requirements may reflect their ability to support such epigenetic parameters. In addition, ORCs from S. pombe (20), Drosophila (17), Xenopus eggs (21), and human cells (22) all preferentially bind to asymmetric A:T-rich sequences that are characteristic of replication origins in fission yeast and frequently found at replication origins in flies and mammals. Such differences presumably reflect the effects of sequence context and chromatin structure on origin activity. This allows metazoans the flexibility to change their pattern of replication origins to accommodate changes in the length of S-phase and in the pattern of gene expression that can occur during animal development.
Thus, evolution has retained the same basic mechanism for DNA replication throughout the eukaryotic kingdom without sacrificing the flexibility needed in gene expression and genomic changes needed to create complex, multicellular organisms.
Figure 2. Eukaryotic DNA Replication Origins. A. Initiation zones have been detected using 2D-gel electrophoresis methods to detect replication bubble and fork structures. For example, initiation occurs throughout the 55 kb intergenic region between the hamster DHFR and 2BE2121 genes, although most initiation events occur within a 12.5 kb region that contains two origins of bi-directional DNA replication (OBRs), ori-β and ori-β' (Kobayashi et al., 1998). Another OBR (ori-γ) has also been mapped. As cells enter S-phase, only two MNase hypersensitive sites appear in this region, one at ori-β and one at ori-γ (Pemov et al., 1998), suggesting that pre-RCs are assembled predominantly at the OBRs.
B. An "origin of bi-directional DNA replication" (OBR) is the transition between continuous and discontinuous DNA synthesis on the two template strands that marks the place where bi-directional replication begins. This transition occurs because the two complementary DNA strands are anti-parallel (5'→3' : 3'←5'), and all DNA polymerases synthesize DNA in only one direction (5'→3'). Therefore, DNA synthesis can occur continuously on one template of a replication fork in the same direction as DNA unwinding (termed "leading strand synthesis"), but it must occur discontinuously on the complementary template, in the direction opposite to DNA unwinding, through the repeated initiation of short nascent DNA fragments called "Okazaki fragments" (termed "lagging strand synthesis").
C. Saccharomyces cerevisiae replication origins are 100 to 150 bp. The core component consists of an ScORC binding site (~30 bp) that includes two genetically identifiable elements, A and B1, as well as a DUE that usually contains the genetically identifiable B2 element. Element A contains an asymmetric A:T-rich “ARS consensus sequence” that is required for origin activity. B1 facilitates A in binding ScORC. B2 appears to be a weak ScORC binding site that facilitates pre-RC assembly. Some origins also contain an auxiliary component, element B3 (~22 bp), that binds transcription factor Abf-1. The OBR is marked by two start sites for leading strand DNA synthesis at specific nucleotides separated by 1 bp. Despite the fact that only ~15% of the sequences in S. cerevisiae origins are shared in common (parts of elements A and B1), they exhibit a flexible modular anatomy in which homologous elements from different origins are interchangeable.
D. Schizosaccharomyces pombe replication origins contain at least two SpORC binding sites, consisting of asymmetric A:T-rich sequences, that act synergistically to facilitate assembly of a single pre-RC at an adjacent site. The OBR is 4 bp in ARS1 (Gomez and Antequera, 1999) and 11 bp in ARS3001 (Kong and DePamphilis, 2002). ARS3001, for example, consists of four genetically required sites contained within ~570 bp; Δ2 and Δ6 are weakly required while Δ3 and Δ9 are strongly required (Kim and Huberman, 1998; Kong and DePamphilis, 2002). SpORC binds strongly to the Δ3 site, weakly to the Δ6 site, and not at all to the remaining sequences. Pre-RC assembly appears to occur primarily at the Δ3+Δ2 region, and an OBR has been mapped to the Δ2 site. Thus, the Δ3+Δ2 region (~100 bp) is equivalent to a simple S. cerevisiae origin. Δ6 appears similar to the B2 element in S. cerevisiae origins in that it is a weak SpORC binding site that facilitates origin activity. Remarkably, the Δ9 region, which is required for origin activity to the same extent as the Δ3 region, neither binds ORC nor functions as a centromere, although it does bind an as yet unidentified protein throughout the cell cycle. Therefore, Δ9 may be a novel origin component.
E. Mammalian replication origins are exemplified by the one at the 3'-end of the human lamin B2 gene. This origin has been mapped to <0.5 kb, and the OBR has been mapped to 3 bp (Abdurashidova et al., 2000). A 290 bp fragment exhibits ectopic origin activity, and internal deletions eliminate origin activity (Paixao et al., 2004). Orc2, Orc1 and Cdc6 have been photo cross-linked to specific DNA sequences within a 78 bp footprint whose size is cell cycle dependent (Abdurashidova et al., 2003).
Current Research on DNA Replication Origins
Given the seemingly intractable nature of mammalian replication origins, we examined S. pombe as a possible paradigm for mammals. S. pombe replication origins (0.5 to 1 kb; Fig. 2D) are five to ten times larger than those in S. cerevisiae (0.1 to 0.15 kb; Fig. 2C). S. pombe replication origins, like those in S. cerevisiae, contain ARS elements that function in vivo as replication origins, but S. pombe ARSs are not interchangeable with those in S. cerevisiae, and they lack a genetically required consensus sequence. However, S. pombe replication origins do contain two or more regions that are required for full ARS activity, and these regions consist of asymmetric A:T-rich sequences with A residues clustered on one strand and T residues on the other (20,23). Thus, despite the absence of a required consensus sequence, S. pombe origins still exhibit sequence-specificity in their design.
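The notion of an "asymmetric A:T-rich sequence" (A residues clustered on one strand, T residues on the other) can be made concrete with a small calculation. The following is only an illustrative sketch, not a method used in the studies cited here: the function names, the window size, and the two thresholds (`min_at`, `min_abs_skew`) are hypothetical choices made for this example. For each window along one strand it reports the A+T fraction and the strand skew (A - T)/(A + T); a skew near +1 or -1 means the A residues lie almost entirely on one strand.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class WindowStats:
    start: int          # 0-based start of the window on the given strand
    at_fraction: float  # (A + T) / window length
    a_t_skew: float     # (A - T) / (A + T); 0.0 if the window has no A or T


def window_stats(seq: str, window: int = 50, step: int = 10) -> List[WindowStats]:
    """Compute A+T content and A/T strand skew in sliding windows."""
    seq = seq.upper()
    stats = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        a = chunk.count("A")
        t = chunk.count("T")
        at = a + t
        skew = (a - t) / at if at else 0.0
        stats.append(WindowStats(start, at / window, skew))
    return stats


def asymmetric_at_windows(seq: str, window: int = 50, step: int = 10,
                          min_at: float = 0.7,
                          min_abs_skew: float = 0.5) -> List[WindowStats]:
    """Return windows that are both A:T-rich and strand-asymmetric.

    The thresholds are arbitrary illustrative values, not published cutoffs.
    """
    return [w for w in window_stats(seq, window, step)
            if w.at_fraction >= min_at and abs(w.a_t_skew) >= min_abs_skew]


if __name__ == "__main__":
    # Toy sequence: an A-tract, a G:C-rich spacer, then an alternating AT run.
    toy = "A" * 50 + "GC" * 25 + "AT" * 25
    for w in asymmetric_at_windows(toy, window=50, step=50):
        print(w)
```

In the toy sequence, the alternating AT run is just as A:T-rich as the A-tract but has no strand skew, so only the A-tract window passes both filters. This is the distinction between "A:T-rich" and "asymmetric A:T-rich" that the text draws.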
We found that, as with S. cerevisiae, all six S. pombe ORC subunits remain tightly bound to chromatin throughout the cell cycle (24). However, in contrast to S. cerevisiae, site-specific DNA binding by SpORC requires only the SpOrc4 subunit; neither the presence nor absence of ATP and the other five subunits affects SpOrc4 binding to DNA in vitro (20,24). The SpOrc4 subunit is unique among eukaryotes in that its N-terminal half contains nine AT-hook motifs that specifically bind the minor groove of AT-rich DNA. Although SpOrc4 has a general affinity for all AT-rich DNA, SpORC is bound in vivo to S. pombe origins and not to other AT-rich sequences in the regions between origins (25). Thus, while each AT-hook motif binds tightly to [], site-specificity likely results from the arrangement of all nine motifs acting in concert. Although the same three ORC subunits bind ATP in SpORC as in other eukaryotic ORCs (21), ATP is not required for SpORC DNA binding specificity. Therefore, it may be required for pre-replication complex (pre-RC) assembly.
The well characterized S. pombe replication origin ARS3001 contains four genetically required elements (Δ2, Δ3, Δ6 and Δ9; Fig. 2D). We found that ARS3001 consists of a single strong SpORC binding site (Δ3) that results in assembly of a pre-replication complex and initiation of DNA synthesis at the Δ2/Δ3 locus (20). Weak ORC binding occurs at Δ6, but this may simply facilitate recruitment of Cdc18 to the Δ2/Δ3 locus. The Δ9 element neither binds ORC nor functions as a centromere, but does bind an as yet unidentified protein throughout the cell cycle.
Although SpORC cannot substitute for XlORC in initiating DNA replication in a frog egg extract, either SpORC or SpOrc4 alone can compete with XlORC for binding to DNA and thereby prevent XlORC from initiating DNA replication (21). Thus, despite differences in their ATP requirements and DNA sequence specificities, eukaryotic ORCs are functionally conserved. The same subunits bind ATP and exhibit ATPase activity, and XlORC initiates DNA replication preferentially at the same or similar sites to those targeted in S. pombe. In fact, similar sequence preferences were subsequently reported for HsORC (22) and DmORC (17).
Taken together, these results reveal (a) that the SpORC is designed to bind to specific genomic sites that are not only AT-rich but that contain other, as yet unspecified characteristics, (b) that fission yeast replication origins are equivalent to two or more budding yeast origins in tandem, and (c) that affinity for A:T-asymmetric DNA is a common property of all eukaryotic ORCs. The fact that AT-hook motifs are unique to the SpORC strongly suggests that the need to accommodate changes in the length of S-phase and the pattern of gene expression during animal development requires greater flexibility in selection of initiation sites.
What constitutes a replication origin in mammalian cells and what advantages does it impart to the cell? We are currently attempting to identify specific ORC/chromatin binding sites in vivo. Using constitutively expressed, epitope-tagged proteins, we will purify protein-DNA complexes that exist in human cells during normal cell proliferation. We hope to correlate the binding of ORC and other pre-RC components with known origins of bidirectional replication, and to determine whether or not these sequences are methylated, whether this chromatin is hyperacetylated, and what other proteins may be bound at these sites. We will also determine how various epigenetic parameters may influence origin selection. This information will be used to identify novel origins and to create an artificial origin where we can more easily manipulate the sequences and the proteins that bind to these sequences. The ultimate goal of these studies is to assemble pre-replication complexes at specific genomic sites in vitro using the proteins and DNA sequences identified and purified in these studies.
References
- DePamphilis, M. L. (1999) Bioessays 21, 5-16.
- Bielinsky, A. K., and Gerbi, S. A. (2001) J Cell Sci 114, 643-651.
- DePamphilis, M. L. (1993) Trends Cell Biol 3, 161-167.
- Saha, S., Shan, Y., Mesner, L. D., and Hamlin, J. L. (2004) Genes Dev 18, 397-410.
- Liu, G., Malott, M., and Leffak, M. (2003) Mol Cell Biol 23, 1832-1842.
- Wang, L., Lin, C. M., Brooks, S., Cimbora, D., Groudine, M., and Aladjem, M. I. (2004) Mol Cell Biol 24, 3373-3386.
- Altman, A. L., and Fanning, E. (2004) Mol Cell Biol 24, 4138-4150.
- Paixao, S., Colaluca, I. N., Cubells, M., Peverali, F. A., Destro, A., Giadrossi, S., Giacca, M., Falaschi, A., Riva, S., and Biamonti, G. (2004) Mol Cell Biol 24, 2958-2967.
- DePamphilis, M. L. (2003) Cell 114, 274-275.
- Mechali, M. (2001) Nat Rev Genet 2, 640-645.
- Li, F., Chen, J., Solessio, E., and Gilbert, D. M. (2003) J Cell Biol 161, 257-266.
- Li, C. J., Bogan, J. A., Natale, D. A., and DePamphilis, M. L. (2000) J Cell Sci 113 (Pt 5), 887-898.
- Anglana, M., Apiou, F., Bensimon, A., and Debatisse, M. (2003) Cell 114, 385-394.
- Lin, C. M., Fu, H., Martinovsky, M., Bouhassira, E., and Aladjem, M. I. (2003) Curr Biol 13, 1019-1028.
- Gerbi, S. A., and Bielinsky, A. K. (2002) Curr Opin Genet Dev 12, 243-248.
- Li, F., Chen, J., Izumi, M., Butler, M. C., Keezer, S. M., and Gilbert, D. M. (2001) J Cell Biol 154, 283-292.
- Remus, D., Beall, E. L., and Botchan, M. R. (2004) Embo J 23, 897-907.
- Harvey, K. J., and Newport, J. (2003) Mol Cell Biol 23, 6769-6779.
- Rein, T., Kobayashi, T., Malott, M., Leffak, M., and DePamphilis, M. L. (1999) J Biol Chem 274, 25792-25800.
- Kong, D., and DePamphilis, M. L. (2002) Embo J 21, 5567-5576.
- Kong, D., Coleman, T. R., and DePamphilis, M. L. (2003) Embo J 22, 3441-3450.
- Vashee, S., Cvetic, C., Lu, W., Simancek, P., Kelly, T. J., and Walter, J. C. (2003) Genes Dev 17, 1894-1908.
- Takahashi, T., Ohara, E., Nishitani, H., and Masukata, H. (2003) Embo J 22, 964-974.
- Kong, D., and DePamphilis, M. L. (2001) Mol Cell Biol 21, 8095-8103.
- Wuarin, J., Buck, V., Nurse, P., and Millar, J. B. (2002) Cell 111, 419-431.