Tandem Repeat-Containing MITEs in the Clam Donax trunculus

Two distinct classes of repetitive sequences, interspersed mobile elements and satellite DNAs, shape eukaryotic genomes and drive their evolution. Short arrays of tandem repeats can also be present within nonautonomous miniature inverted repeat transposable elements (MITEs). In the clam Donax trunculus, we characterized a composite, high copy number MITE, named DTC84. It is composed of a central region built of up to five core repeats linked to a microsatellite segment at one array end and flanked by sequences holding short inverted repeats. The modular composition and the conserved putative target site duplication sequence AA at the element termini are equivalent to the composition of several elements found in the cupped oyster Crassostrea virginica and in some insects. A unique feature of the D. trunculus element is an ordered array of core repeat variants, distinguished by diagnostic changes. The position of variants in the array is fixed, regardless of alterations in the core repeat copy number. Each repeat harbors a palindrome near the junction with the following unit, which is a potential hotspot responsible for array length variations. As a consequence, variations in the number of tandem repeats and variations in flanking sequences make every sequenced element unique. Core repeats may thus be considered as individual units within the MITE, with flanking sequences representing a “cassette” for internal repeats. Our results demonstrate that the onset and spread of tandem repeats can be more intimately linked to processes of transposition than previously thought and suggest that genomes are shaped by interplays within a complex network of repetitive sequences.

Key words: mobile element, MITE, satellite DNA, tandem repeats, sequence rearrangements, evolution.
© The Author(s) 2013. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact [email protected]
Šatović and Plohl. Genome Biol. Evol. 5(12):2549–2559. doi:10.1093/gbe/evt202. Advance Access publication December 6, 2013.

Introduction

Eukaryotic genomes host two ubiquitous classes of highly abundant repetitive sequences, satellite DNAs (satDNAs) and transposable elements (TEs) (López-Flores and Garrido-Ramos 2012). TEs are sequence segments able to move to new genomic locations and form interspersed repeats if replicated in this process (Finnegan 1989; Kazazian 2004; Jurka et al. 2007). A large number of diverse TEs exist in genomes, grouped into two basic classes based on mechanisms of transposition. Class I elements transpose by RNA-mediated mechanisms, while DNA-mediated processes spread class II elements. Each class includes autonomous and nonautonomous copies, the former being able to code for all products needed for their own transposition while the latter depend on enzymes produced by the former. Passive transposition of a whole palette of sequences is possible due to the ability of mechanisms involved in transposition to recognize DNA secondary structures, such as inverted repeats (Craig 1995; Izsvák et al. 1999; Coates et al. 2011).

satDNAs are tandemly repeated noncoding sequences located in heterochromatic chromosomal compartments (Plohl et al. 2008). The characteristic low sequence variability of satDNAs is considered to be a consequence of a phenomenon called concerted evolution, in which mutations are homogenized among repeats of a family in a genome and fixed among individuals in a population (Dover 1986). Many satDNAs differing in length, sequence, copy number, and origin can coexist in a genome, but processes and possible constraints limiting their onset and persistence are understood only fragmentarily (Meštrović et al. 2006).

Despite differences in structure, organization, mechanisms of spread, and sequence dynamics, a growing number of reports indicate traits that link TEs and satDNAs. Internal tandem repeats found in some TEs provoke a hypothesis that their expansion may represent a source of some satDNAs (Noma and Ohtsubo 2000; Gaffney et al. 2003; Macas et al. 2009). In addition, satDNA repeats were found as single units or short arrays interspersed in euchromatic portions of the genome, probably as parts of yet uncharacterized TEs (Cafasso et al. 2003; Brajković et al. 2012). It was suggested that inverted repeats formed by inversion of satDNA monomers can promote interspersed distribution of such units (Mravinac and Plohl 2010). It must be noted that the direction of transition between organizational forms is difficult to assess because shifts from mobile elements to satDNAs can also be anticipated in the opposite direction (Heikkinen et al. 1995; Macas et al. 2009).

Miniature inverted-repeat transposable elements (MITEs) are one group of nonautonomous DNA transposons. They are small (usually up to 600 bp), lack coding potential and/or an RNA pol III promoter site, and are characterized by terminal or subterminal inverted repeats, the ability to fold into secondary structures, and short target site duplication (TSD) sequences formed in the process of insertion (Feschotte et al. 2002). MITEs are usually present in a high copy number in genomes and are widespread in plants, animals, and fungi (Bureau and Wessler 1992; Wang et al. 2010; Fleetwood et al. 2011). They are considered to be derived from larger autonomous elements (Feschotte and Mouchès 2000) and probably propagate through a cut-and-paste mechanism of transposition combined with gap repair and/or aberrant DNA replication, triggered by secondary structures (Izsvák et al. 1999; Coates et al. 2011).

Some MITE sequences have tandem repeats in their central part. Arrays of a variable number of tandem repeats (usually up to 6) are a common trait of the MITE-like elements DINE-1 (Yang and Barbash 2008), SGM (Miller et al. 2000), mini-me (Wilder and Hollocher 2001), and PERI (Kuhn and Heslop-Harrison 2011), described in Drosophila, and of MINE-2 in some Lepidoptera (Coates et al. 2011). Tandem repeats of these elements are followed by a short microsatellite array at one end. Both modules are embedded between flanking sequences characterized by an inverted repeat and the TSD sequence AA. The described modular structure was found in elements of the pearl family detected in the cupped oyster Crassostrea virginica (Gaffney et al. 2003). In addition, internal tandem repeats (core repeats) of pearl share sequence similarity and unit length with several satDNAs widespread in bivalve mollusks (Plohl and Cornudella 1996; López-Flores et al. 2004; Biscotti et al. 2007; Plohl et al. 2010), thus linking TEs and satDNAs in these organisms.

The standard focus in studying TEs is mostly on characterization of sequence traits that might be responsible for their mobility (López-Flores and Garrido-Ramos 2012). The same also holds for repeat-incorporating MITEs, while available information is scarce if we consider sequence dynamics, range, and possible causes of variability of tandem repeats residing within them. Here, we characterize a novel MITE element in the clam D. trunculus, DTC84, with focus on tandem repeats residing within it and suggest pathways and mechanisms involved in their evolution.

Materials and Methods

Construction of Partial Genomic Libraries

Donax trunculus genomic DNA was obtained from commercially supplied adult specimens using an adjusted phenol/chloroform extraction protocol (Plohl and Cornudella 1997). Following the strategy described by Biscotti et al. (2007), genomic DNA was partially digested (10 µg of DNA, 37 °C/5 min) with 5 U of AluI restriction endonuclease (Fermentas) in order to reveal a mass of degraded fragments in a range between 300 and 3,000 bp. The fragments were ligated into the pUC19/SmaI vector. Transformed Escherichia coli DH5α competent cells (Invitrogen) were grown on 90-mm ampicillin-selective plates. After colony transfer, positively charged membranes (Amersham) were probed with digoxigenin-labeled AluI-digested (complete digestion) D. trunculus genomic DNA. Labeling, hybridization, and signal detection were performed as described in the following section. Hybridization was conducted at 65 °C in 20 mM sodium phosphate buffer (pH 7.2), 20% sodium dodecyl sulfate (SDS), allowing ~80% sequence similarity.

Southern Hybridization and Dot Blot Quantification

For Southern analysis, genomic DNA (2.5 µg/sample) was digested with 20 U of restriction endonucleases overnight, and fragments were separated by electrophoresis on 1% agarose gel and transferred onto a positively charged nylon membrane (Roche). Polymerase chain reaction (PCR)-amplified fragments of interest were labeled with digoxigenin by random priming using the DIG DNA Labeling and Detection Kit (Roche) and used as a hybridization probe. Membranes were hybridized in 20 mM sodium phosphate (pH 7.2), 20% SDS, at low, moderate, and high stringency conditions (60 °C, 65 °C, and 68 °C, respectively). Stringency washing was conducted in 20 mM sodium phosphate buffer, 1% SDS, at a temperature three degrees lower than the hybridization temperature. To detect the hybridization signal, membranes were incubated with anti-digoxigenin alkaline phosphatase conjugate, and chemiluminescent signals induced by CDP-Star (Roche) were captured on X-ray films (Amersham).

The relative genomic contribution of the DTC84 core repeat sequence was determined by dot blot analysis. Serial dilutions of D. trunculus genomic DNA and core repeat sequences were spotted onto a nylon membrane. Hybridization was performed under high (68 °C) and low stringency conditions (60 °C).

PCR Amplification

To amplify core repeats in DTC84 elements, the primer pair DTC84AluSatF: TTGCCTGTGACGTCTACTTGTGC and DTC84AluSatR: AGAGGTCACAGGCAACCATCCA was derived according to the DTC84 clone. Amplification was performed with initial denaturation at 94 °C for 5 min, 35 cycles of 94 °C for 30 s, 57 °C for 30 s, 72 °C for 30 s, and final extension at 72 °C for 7 min.

Primers constructed according to sequence segments that flank core repeats, DTC84mobF: AACAAGAGCACCGCTGGGCG and DTC84mobR: CGCACGTTTGAAAAACGGGACGTA, were used in order to amplify additional copies of DTC84 elements. Amplification was performed with initial denaturation at 94 °C for 5 min, 35 cycles of 94 °C for 30 s, 55 °C for 30 s, 72 °C for 1 min, and final extension at 72 °C for 7 min. All PCR products were cloned into the pGEM-T Easy Vector System (Promega), and recombinant clones containing multimers were sequenced. All cloned fragments were sequenced at Macrogen Inc. (Korea) on an ABI3730XL DNA Analyzer. Sequences submitted to GenBank obtained the following accession numbers: KC981676–KC981759.

Sequence Analysis

Obtained multimeric DNA sequences were trimmed to exclude primer binding sites from sequence analysis. Sequence editing and alignments were performed using the Geneious 5.4.3 program (Biomatters Ltd.).

Substructures, repeats, and motifs were searched with appropriate applications within the online tool Oligonucleotids repeats finder, developed by Bazin, Kosarev, and Babenko (http://wwwmgs.bionet.nsc.ru/mgs/programs/oligorep/InpForm.htm, last accessed December 20, 2013). To construct phylogenetic networks, the program Network (Fluxus Technology Ltd 1999–2012) was used. The CENSOR software tool was used for screening query sequences against the Repbase reference collection of repetitive DNA (Jurka et al. 2005). The distribution of nucleotide diversity along the core repeat sequence, using a set of complete repeats, was calculated as the average number of nucleotide differences per site, using a 10-bp window with a sliding step of 1 bp in DnaSP v4.5 (Rozas et al. 2003).

Results

DTC84 MITE Repetitive DNA Family

Recombinant clones enriched in the repetitive fraction of the D. trunculus genome were initially selected based on a positive correlation between hybridization signal intensity and genomic abundance of repetitive sequences. Sequenced inserts (in total 117 kb in 67 colonies) revealed, among others, a 2,707-bp-long fragment named DTC84Alu. This fragment turned out to be of interest because it includes a short array composed of five ~160-bp-long tandem repeats, followed by a microsatellite segment at one array end. In the next screening of the initial set of about 1,000 colonies, a repeat-specific hybridization probe revealed four additional genomic fragments as positives. Besides single or few repeats organized in tandem, sequencing and sequence alignment of these fragments with DTC84Alu disclosed two ~50-bp-long conserved segments that flank the repeats (thus called core repeats) and the microsatellite. A schematic presentation of sequenced genomic fragments is shown in supplementary figure S1a, Supplementary Material online.

Observed sequence segments have a modular composition which includes flanking sequence L + (core repeats)_{1–5} + (ACGG/ACGA)_{2–13} microsatellite + flanking sequence R (fig. 1a). In order to obtain a broader view of sequences between flanking segments, corresponding primers were used in PCR amplification of genomic DNA. In agreement with elements obtained from genomic clones, a random selection of 19 amplified fragments revealed the conserved sequence and organizational pattern of the new element, named DTC84 after the sequence segment cloned first. All depicted DTC84 elements are shown in figure 1b. Based on the common microsatellite motif ACGG and low sequence similarity (57%) between core repeats of the two elements (supplementary fig. S1b, Supplementary Material online), DTC84 is closest to the CvG MITE-like pearl element from C. virginica (Gaffney et al. 2003).

The possibility that mechanisms of transposition are involved in the spread of DTC84 is indicated by the observation of AA dinucleotides as a putative TSD that defines element ends in the cloned genomic fragments (boxed in fig. 2 consensus sequence). Among other structural features that may be of significance for transposition is an 11-bp-long inversely oriented motif positioned at the ends of the otherwise unrelated sequence modules L and R (figs. 1a and 2, green arrows). This motif is located subterminally in the flanking segment L while it occupies a terminal location in R, in agreement with the position of inverted repeats in pearl (Gaffney et al. 2003). As in pearl, complete pol III box A and box B priming sites were not found in DTC84.

Additional complexity of DTC84 elements is given by differences in junctions between the flanking module L and core repeats (figs. 1b and 2a). This flanking module can be linked directly to core repeats but it can also accommodate two types of spacer segments, of ~80 and from 70 to 160 bp, all sharing the first 16 nucleotides (fig. 2a). The common part may represent a residual of the flanking sequence itself, missing when the junction with core repeats is direct. Spacer sequences are not related to other components of DTC84 and do not show any apparent substructure. At the opposite core repeat array end, the microsatellite segment is linked directly to the flanking sequence R (figs. 1b and 2b).

FIG. 1. Modular structure of DTC84 element. (a) Terminal and subterminal inverted repeats of DTC84 and equivalent elements are indicated by arrows within yellow and gray boxes that represent flanking modules. Solid black arrow in the middle of the element represents core repeats, while the microsatellite segment is shown as a dark blue box. Conserved putative TSD sequence is given at element ends. (b) DTC84 in cloned genomic fragments (DTC84Alu, 84-16F, 84-17F, 84-35, and 84-37) and in genomic fragments retrieved by PCR amplification of genomic DNA with primers specific for left (yellow boxes) and right (gray boxes) flanking modules. Core repeats are shown as pink boxes, arrowheads indicating the orientation, and microsatellite regions are shown as dark blue boxes. Two types of DTC84 core repeats are indicated, and the type 2 core repeat adjacent to the microsatellite is additionally marked by an asterisk. A waved line on the left or right end of the core repeat box indicates truncation. Solid black lines represent genomic sequences other than those in described modules. (c) Alignment of consensus sequences of different types of core repeats. Dots indicate identity; only differences to the consensus sequence of the first repeat type are shown.

Structured Arrays of DTC84 Core Repeats

An additional set of core repeats was cloned and sequenced after PCR amplification of genomic DNA with core repeat-specific primers. After excision of primer sites, a total of 41 core repeat sequences were extracted from PCR-obtained multimers (up to 5-mer) and compared with 65 core repeats obtained from
DTC84 elements described earlier (supplementary fig. S2, Supplementary Material online). It must be noted that the genomic environment of PCR-obtained core repeats is not known; they can be incorporated in DTC84 elements but also associated with some other undetermined genomic sequences. All compared core repeats differ mostly in single nucleotide substitutions (supplementary fig. S2, Supplementary Material online) and reveal a consensus length of 156 bp. In addition, sliding window analysis showed that nucleotide differences among core repeat variants are unevenly distributed along the sequence and form regions of reduced variability, approximately in the first half and near the end of the core repeat (supplementary fig. S3, Supplementary Material online).

FIG. 2. Sequence alignment of DTC84 element modules. Consensus sequence is derived according to the majority principle, and only differences to the consensus are shown in each segment. Dots indicate identity and dashes insertions or deletions. (a) Left flanking module (yellow bar) and two groups of accompanying spacer segments. (b) Core repeats, microsatellite region (dark blue bar), and right flanking module (gray bar). For simplicity, only the beginning of the first core repeat and the ending of the last core repeat in each array are shown, while interruption is indicated by slashes in the consensus sequence. Complete alignment of all core repeats can be seen in supplementary figure S2, Supplementary Material online. Black vertical arrows above the consensus sequence indicate starting positions of truncated repeats. Inverted repeats in flanking modules are indicated by green arrows. AA dinucleotides as putative TSDs are boxed in the consensus sequence. Dashed arrows above the consensus sequence show primer positions.

The first core repeat in DTC84 is regularly 5′ end truncated by 4–67 nt. Only two core repeats out of 24 located at that position are complete (84mob12 and 84mob16). Truncated core repeats start at 11 different nucleotide positions, indicating that they were predominantly formed in independent events (fig. 2b). The distribution of truncation sites is, however, not quite random: seven core repeats start at position 8, while the first nucleotide of four core repeats is at position 42. Curiously, the starting nucleotide in these two sets is always A. Although this nucleotide may represent a preferred site of truncation, identity in starting position can also be a consequence of amplification events that followed truncation.

Disregarding the truncated part, all sequenced arrays of DTC84 elements are composed of an integer number of core repeats, all being positioned in the same orientation and forming the same junction with the microsatellite. At its 3′ position, every DTC84 core repeat ends with the 12-nt-long palindrome TTGTCCGGACAA followed by the sequence AAATT, after which the next repeat unit starts. This palindrome is conserved in the majority of core repeats, and mutated variants are rare in this segment (16 out of 106 sequenced core repeats; supplementary fig. S2, Supplementary Material online). Specifically, the transition point of the last core repeat into the microsatellite sequence differs in that the sequence following the palindrome is AAAGGT.

Phylogenetic analysis (fig. 3) combined with visual inspection of aligned core repeats (supplementary fig. S2, Supplementary Material online) revealed clustering and enabled identification of variants according to cluster-specific diagnostic nucleotides. Two types of core repeats are distinctive, based on the six most exclusive variable positions. Type 1 variants are recognized by CC-T-G-A-T nucleotides while the same positions are occupied by TG-C-A-G-G nucleotides in variants of type 2 (fig. 1c and supplementary fig. S2, Supplementary Material online). Although the difference is subtle, type 2 variants can be further subdivided into two groups. As explained in the previous paragraph, those linked directly to the microsatellite (indicated as 2*) differ from others by two additional diagnostic nucleotides located just at the monomer end (fig. 1c and supplementary fig. S2, Supplementary Material online). In the phylogenetic network, these core repeats group mostly in the left node of the type 2 branch (fig. 3).

FIG. 3. Grouping of DTC84 core repeats. Type 1 core repeats are indicated by green circles, type 2 are yellow, while type 2* are orange.

Sequenced DTC84 elements are composed of core repeats arranged according to the formula: type 1 (up to 3 consecutive repeats) + type 2 (1–2 repeats) (fig. 1b). It is important to note that no correlation could be derived among core repeat array length and/or composition, truncation point of the first repeat, spacer segments, and number of repeats in the microsatellite array. For example, dimeric core repeats of the arrangement type 1 + type 2 build 11 sequenced DTC84 elements (84mob23, 84mob3, 84-37, 84mob40, 84mob11, 84mob5, 84mob18, 84mob15, 84mob41, 84mob37, and 84mob16; fig. 1b). Despite identical core repeat composition, they differ in all other above-mentioned features (fig. 2). It can be concluded that every studied DTC84 element shows a unique combination of sequence features that vary independently.

In core repeat arrays of some DTC84 elements, only type 2 is found (in 84-35, 84mob24, and in the partial segment 84-16; fig. 1b). Following the rule of the first core repeat in the array being truncated, in these cases the truncated repeat is of type 2. In other words, core repeats of any type can appear in the truncated form, but strictly depending on their position in the array.

Genomic Organization of DTC84 Core Repeats Revealed by Southern Blot Hybridization

Southern hybridization experiments were performed in an effort to provide a more general view of organizational patterns of core repeats in the D. trunculus genome (fig. 4a). Digestion of genomic DNA with endonuclease MspI, cutting once within the core repeat monomer sequence, and with MboI, which predominantly cuts core repeats of type 2, revealed short ladders of multimers on Southern blots. The methylation-sensitive endonuclease HpaII (isoschizomer of MspI) revealed a slightly less degraded profile (fig. 4a), in agreement with the previous conclusion about methylation of D. trunculus genomic DNA (Petrović et al. 2009). The blurred appearance and hybridization smear of up to about 5 kb could be related to association of core repeats or their segments with other genomic sequences, resulting in a number of hybridizing fragments of different length. In agreement are also short ladders obtained after amplification with PCR primers located in conserved flanking sequences (fig. 4b) and those obtained with primers specific for core repeats (not shown). The obtained results suggest predominant organization of core repeats in short arrays, as observed in the cloned DTC84 representatives.

FIG. 4. Southern hybridization of Donax trunculus genomic DNA. (a) Genomic DNA digested with MspI (methylation insensitive, lane 1), HpaII (isoschizomer of MspI, methylation sensitive, lane 2), and MboI (methylation insensitive, lane 3) and hybridized with the core repeat-specific probe. (b) Electrophoretic separation of genomic DNA amplified with PCR primers located in conserved flanking sequences. For the primer position, see figure 2.

According to the dot blot analysis (not shown) performed at high stringency conditions (68 °C), DTC84 core repeats constitute up to 1% of the bivalve genome or 8.9 × 10^[…] copies, considering the genome size reported by Hinegardner (1974). The estimated contribution of these sequences almost doubles at low stringency conditions (60 °C), indicating that a large number of related sequences should exist in the genome. Genomic abundance estimated to be <1% for homogeneous core repeats constituting DTC84 elements is roughly in agreement with their occurrence in the cloned genomic DNA fragments.

Other Pearl-Related Repetitive Sequences in D. trunculus

Among sequences depicted in the initial cloning of D. trunculus genomic DNA, local Blast revealed one fragment, DTC37AluF, which in a stretch of 140 nucleotides shares ~61% similarity with DTC84 and with the pearl element CvG (Gaffney et al. 2003; supplementary fig. S4a and b, Supplementary Material online). This match stretches over a part of the core repeat and the adjacent microsatellite region. The microsatellite array that follows the putative core repeat in DTC37AluF is heterogeneous, built of the ACGG motif and its related variants ACTG and ACGA. In addition, DTC37AluF turned out to be related to another genomic fragment, DTC41AluF. They share 87% similarity in the cloned core repeat segment, both being truncated at the same nucleotide in the cloning procedure. The microsatellite region differs between them only in length, and it is followed by a 94% similar, 50-bp-long R flanking sequence, unrelated to equivalent modules in DTC84 and in CvG. Although not further explored in this work, the observed similarity confirms that divergent elements of the DTC84 type exist in the D. trunculus genome, flanked with their distinctive sequences.

In the genomic clone DTC12Alu, we detected a 190-bp-long sequence that shows 63% similarity to the CvE element (Gaffney et al. 2003). This fragment incorporates part of the putative core repeat, the microsatellite segment, and adjacent sequence (supplementary fig. S4c, Supplementary Material online). The microsatellite in DTC12Alu is composed of ACCG and ACAG motifs, instead of the ACTG motif found in the original CvE (supplementary fig. S4d, Supplementary Material online).

Discussion

The modular structure of repetitive elements described in this work in the clam D. trunculus corresponds to MITE-like elements of the pearl family, detected in C. virginica, the blood ark Anadara trapezia, the sea urchin Strongylocentrotus purpuratus (Cohen et al. 1985; Gaffney et al. 2003), and in the Mediterranean mussel Mytilus galloprovincialis (Kourtidis et al. 2006). They also share an equivalent structure with a group of elements described in Drosophila and some other insects (Locke et al. 1999; Miller et al. 2000; Wilder and Hollocher 2001; Yang and Barbash 2008; Coates et al. 2011; Kuhn and Heslop-Harrison 2011). All these elements are characterized by TSDs and left and right flanking sequences with 10–20-bp-long inverted repeats located at terminal and subterminal positions. In addition, they all have a central region composed of a sequence repeated in tandem up to about five times, adjacent at one side to a short tetranucleotide-based microsatellite segment (fig. 1a). Exceptionally, the microsatellite segment is missing in the M. galloprovincialis element (Kourtidis et al. 2006). At the DNA sequence level, only the dinucleotide AA as the putative TSD is apparently conserved in all of them, indicating involvement of a family of related integrases in the process of movement. Based on abundance, TIRs, dinucleotide TSD, and a lacking or incomplete RNA pol III promoter, these elements are mostly referred to as MITE-like TEs. It was recently suggested that they can be alternatively classified as members of Helitron-like TEs, the class exploiting rolling-circle replication in its spread (Yang and Barbash 2008).

Rapid sequence divergence is expected along the whole length of nonautonomous TEs because of the lack of any potential coding function, while secondary structure and architecture of termini are assumed to be major requirements for transposition (Craig 1995; Coates et al. 2011). Comparisons of 12 Drosophila species showed that the central region of DINE-1 elements differs among species according to the core repeat sequence and length. The most variable parameter is copy number, varying also among elements of the same type within a species (Yang and Barbash 2008). In C. virginica, core repeats of three elements belonging to the pearl family (CvA, CvE, and CvG) share little or no sequence similarity (Gaffney et al. 2003). However, related sequences can be detected in other species. Sequence similarity between core repeats of CvE and those of the element detected in M. galloprovincialis is relatively high, ~74–78% (Kourtidis et al. 2006). Sequences related to C. virginica core repeats can also be identified in D. trunculus, CvG being distantly related to DTC84 core repeats and CvE to a fragment of the clone DTC12Alu.

Particularly intriguing is the sequential distribution of DTC84 core repeat variants. Based on observed variations in array length and arrangement of core repeats, it is difficult to explain the evolution of array diversity only as a consequence of accumulation of mutations and subsequent amplification of particular sequence types. The fixed position of the variant 2* and truncation of variants indicate that arrays may be modified by consecutive deletions in the preexisting ancestral array. Excision of core repeats could lead to different truncation of the newly formed first repeat, simply as a result of imprecise mechanisms of sequence turnover. Proneness to alterations by insertions and deletions is additionally evident from the finding of two types of spacer sequences that separate core repeats and the flanking module L.

The landmark of DTC84 core repeats is a palindrome near the core repeat end, located in the segment with reduced variability compared with the rest of the sequence. The equivalent segment of nonhomologous core repeats of the pearl family element CvA also exhibits reduced sequence variability, but, instead of hosting a palindrome, it is characterized by a microsatellite-like motif (different from the microsatellite in continuation of core repeats, Gaffney et al. 2003). CvA-related sequences organized as satDNAs preserve reduced variability in this region and disclose another one, located roughly in the middle of the core repeat (Plohl et al. 2010). Interestingly, two regions of reduced sequence variability were also observed in the alignment of core repeats of the Drosophila PERI element (Kuhn and Heslop-Harrison 2011).

Putative motifs hidden in conserved sequence segments were observed in many satDNA monomers (e.g., Hall et al. 2003; Meštrović et al. 2006). Their occurrence is usually interpreted as a result of evolution under constraints, most likely due to functional interactions with protein components in chromatin. Experimental evidence is, however, still missing and the only well-described interaction is between the 17-bp-long CENP-B motif in a subset of human alpha satDNA monomers and the CENP-B protein (Masumoto et al. 1989). This sequence motif is found in apparently unrelated satDNAs of several mammalian species, and its variants were also detected in some invertebrates, including nematodes (Meštrović et al. 2013) and bivalve mollusks (Canapa et al. 2000). The CENP-B protein is likely to be involved in human centromere assembly (Ohzeki et al. 2002), but its similarity to pogo-like transposases suggests its origin from domesticated mobile elements and a possible role in satDNA sequence rearrangements (Kipling and Warburton 1997; Casola et al. 2008). One large group of MITEs, abundant in the human and other genomes, is related to the pogo-like subfamily of Tc1/mariner transposons (Feschotte et al. 2002). These data and the observed putative motifs suggest that transposase-related mechanisms may participate in alterations of tandem repeats of DTC84 and equivalent elements, as well as in formation of satDNAs derived from these sequences.

In addition to sequence motifs, the microsatellite region can also have a role in element structuring and dynamics, or alternatively it can be trailed as a consequence of these processes (Wilder and Hollocher 2001; Coates et al. 2011). A possible significance is supported by the existence of the microsatellite repeat motif ACGG in CvG (Gaffney et al. 2003) and in DTC84, despite weak or no relevant sequence similarity between other modules.

The idea of core repeats and satDNA monomers as independent insertion/deletion units is further supported by the finding of diverse satellite monomers (or their short arrays) at euchromatic genome locations, including in the vicinity of genes (Kuhn et al. 2012; Brajković et al. 2012). Other observations also stress transposition as an intrinsic feature of at least some sequences arranged in tandem. For example, it was proposed that mechanisms of transposition spread human alpha satDNA to new genomic locations (Alkan et al. 2007).

[…] in Izsvák et al. 1999). In a case when the duplicated segment is not excised, the result will be formation of a tandem copy. Similarly, another model hypothesizes that internal MITE tandem repeats can be built during DNA replication based on the ability of a sequence to form stem-loop structures (Hikosaka and Kawahara 2004). The latter model predicts unidirectional expansion of tandem repeats into long arrays typical for satDNAs. The arrangement of DTC84 core repeats is consistent with consecutive introduction of mutations during stepwise repeat duplication, while observed diversity in composition of variants can be accomplished by subsequent deletions, as explained earlier.

In conclusion, the ordered distribution of mutations in the array of a variable number of core repeats in DTC84 is consistent with deletion events in a preformed segment, occurring independently with respect to the flanking modules. In this way, flanking sequences may be considered as a kind of a "cassette" for internal core repeats. One core repeat sequence end is marked by a palindrome, located within a segment of reduced sequence variability. Such segments may have a role in core repeat rearrangements, probably by mechanisms related to transposition. Although limited, the similarity of some core repeats and satDNAs characterized previously in D. trunculus and other bivalve mollusks indicates a complex network which links tandem repeats residing inside MITEs and those expanded into arrays of satDNAs.

Supplementary Material

Supplementary figures S1–S4 are available at Genome Biology and Evolution online (http://www.gbe.oxfordjournals.org/).
In addition, analysis of variability of related satDNAs shared by groups of species led to the model in which bursts of spread are followed by long periods of Acknowledgments stasis, a feature pertinent to mobile elements (Mes ˇtrovic ´ The authors thank Brankica Mravinac, Nevenka Mes ˇtrovic ´ , et al. 2006). and Andrea Luchetti for critical reading and comments on It is striking that core repeats of DTC84, pearl-related se- the manuscript. This work was supported by Research Fund quences, and monomers of eight different satDNAs detected of Ministry of Science, Education and Sports of Republic of in D. trunculus are of similar length, about 160 bp (Plohl and Croatia, project no. 098-0982913-2756. ´ ´ Cornudella 1996, 1997; Petrovic and Plohl 2005; Petrovic et al. 2009). It was hypothesized that preferred length of Literature Cited satDNA repeats of 140–200 bp and 340 bp is favored by the Alkan C, et al. 2007. Organization and evolution of primate centromeric chromatin structure (Henikoff et al. 2001). Based on observed DNA from whole-genome shotgun sequence data. PLoS Comput Biol. similarities with core repeats, we can also speculate that the 3(9):1807–1818. preferred length mirrors specificities of mechanisms involved Biscotti MA, et al. 2007. Repetitive DNA, molecular cytogenetics and genome organization in the King scallop (Pecten maximus). Gene in initial processes of repeat formation. 406:91–98. Compared with copy number alterations in arrays of Brajkovic ´ J, Feliciello I, Bruvo-Madaric B, Ugarkovic ´ Ð. 2012. Satellite DNA- tandem repeats, much less can be said about mechanisms like elements associated with genes within euchromatin of the beetle responsible for initial tandem duplications of repetitive se- Tribolium castaneum. G3 2:931–941. quences. In this regard, duplication of entire MITE elements Bureau TE, Wessler SR. 1992. Tourist: a large family of small inverted repeat elements frequently associated with maize genes. 
Plant Cell is proposed to be consequence of aberrant DNA replication 4:1283–1294. (Izsva ´ k et al. 1999). Briefly, due to an inverted repeat and/or a Cafasso D, Cozzolino S, De Luca P, Chinali G. 2003. An unusual satellite palindrome, sequence can be duplicated when DNA polymer- DNA from Zamia paucijuga (Cycadales) characterised by two different ase passes through a MITE, followed by excision of the dupli- organisations of the repetitive unit in the plant genome. Gene 311: cated segment and its reintegration into a new location (fig. 4 71–79. Genome Biol. Evol. 5(12):2549–2559 doi:10.1093/gbe/evt202 Advance Access publication December 6, 2013 2557 ˇ Satovic and Plohl GBE Canapa A, Barucca M, Cerioni PN, Olmo E. 2000. A satellite DNA con- inside HSP70 introns of the Mediterranean mussel (Mytilus gallopro- taining CENP-B box-like motifs is present in the antarctic scallop vincialis). Genome 49:1451–1458. Adamussium colbecki. Gene 247:175–180. Kuhn GCS, Heslop-Harrison JS. 2011. Characterization and Casola C, Hucks D, Feschotte C. 2008. Convergent domestication of genomic organization of PERI, a repetitive DNA in the Drosophila pogo-like transposases into centromere-binding proteins in fission buzzatii cluster related to DINE-1 transposable elements and highly yeast and mammals. Mol Biol Evol. 25:29–41. abundant in the sex chromosomes. Cytogenet Genome Res. 132: Coates BS, Kroemer JA, Sumerford DV, Hellmich RLL. 2011. A novel 79–88. class of miniature inverted repeat transposable elements (MITEs) Kuhn GCS, Ku ¨ ttler H, Moreira-Filho O, Heslop-Harrison JS. 2012. that contain hitchhiking (GTCY) microsatellites. Insect Mol Biol. 20: The 1.688 repetitive DNA of Drosophila: concerted evolution at 15–27. different genomic scales and association with genes. Mol Biol Evol. Cohen JB, Hoffman-Liebermann B, Kedes L. 1985. Structure and 29:7–11. Locke J, Howard LT, Aippersbach N, Podemski L, Hodgetts RB. 1999. 
The unusual characteristics of a new family of transposable elements in the sea urchin Strongylocentrotus purpuratus. Mol Cell Biol. 5: characterization of DINE-1, a short, interspersed repetitive element 2804–2813. present on chromosome and in the centric heterochromatin of Craig NL. 1995. Unity in transposition reactions. Science 270:253–254. Drosophila melanogaster. Chromosoma 108:356–366. Dover GA. 1986. Molecular drive in multigene families: How biologi- Lo ´ pez-Flores I, Garrido-Ramos MA. 2012. The repetitive DNA content of cal novelties arise, spread and are assimilated. Trends Genet. 2: eukaryotic genomes. In: Garrido-Ramos MA, editor. Repetitive DNA 159–165. VII. Basel (Switzerland): Karger Publishers. p. 1–28. Feschotte C, Mouche ` s C. 2000. Evidence that a family of miniature Lo ´ pez-Flores I, et al. 2004. The molecular phylogeny of oysters based on a inverted-repeat transposable elements (MITEs) from the Arabidopsis satellite DNA related to transposons. Gene 339:181–188. thaliana genome has arisen from a pogo-like DNA transposon. Mol Macas J, Koblı´zkova ´ A, Navra ´ tilova ´ A, Neumann P. 2009. Hypervariable 3’ UTR region of plant LTR-retrotransposons as a source of novel sat- Biol Evol. 17:730–737. Feschotte C, Zhang X, Wessler S. 2002. Miniature inverted-repeat trans- ellite repeats. Gene 448:198–206. Masumoto H, Masukata H, Muro Y, Nozaki N, Okazaki T. 1989. A human posable elements (MITEs) and their relationship with established DNA transposons. In: Craig N, editor. Mobile DNA II. Washington (DC): centromere antigen (CENP-B) interacts with a short specific sequence ASM Press. p. 1147–1158. in alphoid DNA, a human centromeric satellite. J Cell Biol. 109: Finnegan DJ. 1989. Eukaryotic transposable elements and genome evolu- 1963–1973. tion. Trends Genet. 5:103–107. Mes ˇtrovic ´ N, Castagnone-Sereno P, Plohl M. 2006. Interplay of selec- Fleetwood DJ, et al. 2011. 
Abundant degenerate miniature inverted- tive pressure and stochastic events directs evolution of the MEL172 repeat transposable elements in genomes of epichloid fungal endo- satellite DNA library in root-knot nematodes. Mol Biol Evol. 23: phytes of grasses. Genome Biol Evol. 3:1253–1264. 2316–2325. Gaffney PM, Pierce JC, Mackinley AG,Titchen DA,Glenn WK.2003. Pearl, Mes ˇtrovic ´ N, et al. 2013. Conserved DNA motifs, including the CENP-B a novel family of putative transposable elements in bivalve mollusks. box-like, are possible promoters of satellite DNA array rearrangements J Mol Evol. 56:308–316. in nematodes. PLoS One 8:e67328. Hall SE, Kettler G, Preuss D. 2003. Centromere satellites from Arabidopsis Miller WJ, Nagel A, Bachmann J, Bachmann L. 2000. Evolutionary dynam- populations: maintenance of conserved and variable domains. ics of the SGM transposon family in the Drosophila obscura species Genome Res. 13:195–205. group. Mol Biol Evol. 17:1597–1609. Heikkinen E, Launonen V, Muller E, Bachmann L. 1995. The pvB370 Mravinac B, Plohl M. 2010. Parallelism in evolution of highly repetitive BamHI satellite DNA family of the Drosophila virilis group and its evo- DNAs in sibling species. Mol Biol Evol. 27:1857–1867. lutionary relation to mobile dispersed genetic pDv elements. J Mol Noma K, Ohtsubo E. 2000. Tnat1 and Tnat2 from Arabidopsis thaliana: Evol. 41:604–614. novel transposable elements with tandem repeat sequences. DNA Res. Henikoff S, Ahmad K, Malik HS. 2001. The centromere para- 7:1–7. dox: stable inheritance with rapidly evolving DNA. Science 293: Ohzeki J, Nakano M, Okada T, Masumoto H. 2002. CENP-B box is required for de novo centromere chromatin assembly on human alphoid DNA. 1098–1102. J Cell Biol. 159:765–775. Hikosaka A, Kawahara A. 2004. Lineage-specific tandem repeats riding Petrovic ´ V, Plohl M. 2005. 
Sequence divergence and conservation in or- on a transposable element of MITE in Xenopus evolution: a new mechanism for creating simple sequence repeats. J Mol Evol. 59: ganizationally distinct subfamilies of Donax trunculus satellite DNA. 738–746. Gene 362:37–43. Hinegardner R. 1974. Cellular DNA content of the Mollusca. Comp Petrovic V, et al. 2009. A GC-rich satellite DNA and karyology of the Biochem Physiol A Comp Physiol. 47:447–460. bivalve mollusk Donax trunculus: a dominance of GC-rich heterochro- Izsva ´ k Z, et al. 1999. Short inverted-repeat transposable elements in teleost matin. Cytogenet Genome Res. 124:63–71. fish and implications for a mechanism of their amplification. J Mol Plohl M, Cornudella L. 1996. Characterization of a complex satellite DNA Evol. 48:13–21. in the mollusc Donax trunculus: analysis of sequence variations and Jurka J, Kapitonov VV, Kohany O, Jurka MV. 2007. Repetitive sequences in divergence. Gene 169:157–164. complex genomes: structure and evolution. Annu Rev Genomics Hum Plohl M, Cornudella L. 1997. Characterization of interrelated sequence motifs in four satellite DNAs and their distribution in the genome of Genet. 8:241–259. the mollusc Donax trunculus. J Mol Evol. 44:189–198. Jurka J, et al. 2005. Repbase Update, a database of eukaryotic repetitive Plohl M, Luchetti A, Mestrovic ´ N, Mantovani B. 2008. Satellite DNAs be- elements. Cytogenet Genome Res. 110:462–467. Kazazian HH. 2004. Mobile elements: drivers of genome evolution. tween selfishness and functionality: structure, genomics and evolution Science 303:1626–1632. of tandem repeats in centromeric (hetero)chromatin. Gene 409: Kipling D, Warburton PE. 1997. Centromeres, CENP-B and Tigger too. 72–82. Trends Genet. 13:141–145. Plohl M, et al. 2010. Long-term conservation vs high sequence divergence: Kourtidis A, Drosopoulou E, Pantzartzi CN, Chintiroglou CC, Scouras ZG. the case of an extraordinarily old satellite DNA in bivalve mollusks. 2006. 
Three new satellite sequences and a mobile element found Heredity 104:543–551. 2558 Genome Biol. Evol. 5(12):2549–2559 doi:10.1093/gbe/evt202 Advance Access publication December 6, 2013 Tandem Repeat-Containing MITEs GBE Rozas J, Sanchez-DelBarrio JC, Messeguer X, Rozas R. 2003. DnaSP, DNA Wilder J, Hollocher H. 2001. Mobile elements and the genesis of micro- polymorphism analyses by the coalescent and other methods. satellites in dipterans. Mol Biol Evol. 18:384–392. Bioinformatics 19:2496–2497. Yang H-P, Barbash DA. 2008. Abundant and species-specific DINE-1 trans- Wang S, Zhang L, Meyer E, Matz MV. 2010. Characterization of a group posable elements in 12 Drosophila genomes. Genome Biol. 9:R39. of MITEs with unusual features from two coral genomes. PLoS One 5: e10700. Associate editor: Josefa Gonzalez Genome Biol. Evol. 5(12):2549–2559 doi:10.1093/gbe/evt202 Advance Access publication December 6, 2013 2559 http://www.deepdyve.com/assets/images/DeepDyve-Logo-lg.png Genome Biology and Evolution Oxford University Press

Tandem Repeat-Containing MITEs in the Clam Donax trunculus


References (54)

Publisher
Oxford University Press
Copyright
© The Author(s) 2013. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
ISSN
1759-6653
eISSN
1759-6653
DOI
10.1093/gbe/evt202
pmid
24317975

Abstract

Two distinct classes of repetitive sequences, interspersed mobile elements and satellite DNAs, shape eukaryotic genomes and drive their evolution. Short arrays of tandem repeats can also be present within nonautonomous miniature inverted repeat transposable elements (MITEs). In the clam Donax trunculus, we characterized a composite, high copy number MITE, named DTC84. It is composed of a central region built of up to five core repeats linked to a microsatellite segment at one array end and flanked by sequences holding short inverted repeats. The modular composition and the conserved putative target site duplication sequence AA at the element termini are equivalent to the composition of several elements found in the cupped oyster Crassostrea virginica and in some insects. A unique feature of D. trunculus element is ordered array of core repeat variants, distinctive by diagnostic changes. Position of variants in the array is fixed, regardless of alterations in the core repeat copy number. Each repeat harbors a palindrome near the junction with the following unit, being a potential hotspot responsible for array length variations. As a consequence, variations in number of tandem repeats and variations in flanking sequences make every sequenced element unique. Core repeats may be thus considered as individual units within the MITE, with flanking sequences representing a “cassette” for internal repeats. Our results demonstrate that onset and spread of tandem repeats can be more intimately linked to processes of transposition than previously thought and suggest that genomes are shaped by interplays within a complex network of repetitive sequences. Key words: mobile element, MITE, satellite DNA, tandem repeats, sequence rearrangements, evolution. 
Introduction

Eukaryotic genomes host two ubiquitous classes of highly abundant repetitive sequences, satellite DNAs (satDNAs) and transposable elements (TEs) (López-Flores and Garrido-Ramos 2012). TEs are sequence segments able to move to new genomic locations and form interspersed repeats if replicated in this process (Finnegan 1989; Kazazian 2004; Jurka et al. 2007). Large number of diverse TEs exists in genomes, grouped into two basic classes based on mechanisms of transposition. Class I elements transpose by RNA-mediated mechanisms, while DNA-mediated processes spread class II elements. Each of them includes autonomous and nonautonomous copies, the former being able to code for all products needed for their own transposition while the latter depend on enzymes produced by the first. Passive transposition of a whole palette of sequences is possible due to the ability of mechanisms involved in transposition to recognize DNA secondary structures, such as inverted repeats (Craig 1995; Izsvák et al. 1999; Coates et al. 2011).

satDNAs are tandemly repeated noncoding sequences located in heterochromatic chromosomal compartments (Plohl et al. 2008). Characteristic low sequence variability of satDNAs is considered to be a consequence of a phenomenon called concerted evolution, in which mutations are homogenized among repeats of a family in a genome and fixed among individuals in a population (Dover 1986). Many satDNAs differing in length, sequence, copy number, and origin can coexist in a genome, but processes and possible constraints limiting their onset and persistence are understood only fragmentarily (Meštrović et al. 2006).

Despite differences in structure, organization, mechanisms of spread, and sequence dynamics, growing number of reports indicate traits that link TEs and satDNAs. Internal tandem repeats found in some TEs provoke a hypothesis that their expansion may represent a source of some satDNAs (Noma and Ohtsubo 2000; Gaffney et al. 2003; Macas et al. 2009). In addition, satDNA repeats were found as single units or short arrays interspersed in euchromatic portions of the genome, probably as parts of yet uncharacterized TEs (Cafasso et al. 2003; Brajković et al. 2012). It was suggested that inverted repeats formed by inversion of satDNA monomers can promote interspersed distribution of such units (Mravinac and Plohl 2010). It must be noted that a direction of transition between organizational forms is difficult to assess because shifts from mobile elements to satDNAs can also be anticipated in the opposite direction (Heikkinen et al. 1995; Macas et al. 2009).

Miniature inverted-repeat transposable elements (MITEs) are one group of nonautonomous DNA transposons. They are small (usually up to 600 bp), lack coding potential and/or RNA pol III promoter site, and are featured by terminal or subterminal inverted repeats, ability to fold into secondary structures, and short target site duplication (TSD) sequences formed in the process of insertion (Feschotte et al. 2002). MITEs are usually present in a high copy number in genomes and are widespread in plants, animals, and fungi (Bureau and Wessler 1992; Wang et al. 2010; Fleetwood et al. 2011). They are considered to be derived from larger autonomous elements (Feschotte and Mouchès 2000) and probably propagate through a cut-and-paste mechanism of transposition combined with a gap repair and/or aberrant DNA replication, triggered by secondary structures (Izsvák et al. 1999; Coates et al. 2011).

Some MITE sequences have tandem repeats in their central part. Arrays of variable number of tandem repeats (usually up to 6) are a common trait of MITE-like elements DINE1 (Yang and Barbash 2008), SGM (Miller et al. 2000), mini-me (Wilder and Hollocher 2001), and PERI (Kuhn and Heslop-Harrison 2011), described in Drosophila, and of MINE-2 in some Lepidoptera (Coates et al. 2011). Tandem repeats of these elements are followed by a short microsatellite array at one end. Both modules are embedded between flanking sequences featured by an inverted repeat and the TSD sequence AA. Described modular structure was found in elements of the pearl family detected in the cupped oyster Crassostrea virginica (Gaffney et al. 2003). In addition, internal tandem repeats (core repeats) of pearl share sequence similarity and unit length with several satDNAs widespread in bivalve mollusks (Plohl and Cornudella 1996; López-Flores et al. 2004; Biscotti et al. 2007; Plohl et al. 2010), thus linking TEs and satDNAs in these organisms.

Standard focus in studying TEs is mostly on characterization of sequence traits that might be responsible for their mobility (López-Flores and Garrido-Ramos 2012). The same also holds for repeat-incorporating MITEs while available information is scarce if we consider sequence dynamics, range, and possible causes of variability of tandem repeats residing within them. Here, we characterize a novel MITE element in the clam D. trunculus, DTC84, with focus on tandem repeats residing within it and suggest pathways and mechanisms involved in their evolution.

© The Author(s) 2013. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact [email protected]

Šatović and Plohl. Genome Biol. Evol. 5(12):2549–2559. doi:10.1093/gbe/evt202. Advance Access publication December 6, 2013.

Materials and Methods

Construction of Partial Genomic Libraries

Donax trunculus genomic DNA was obtained from commercially supplied adult specimens using adjusted phenol/chloroform extraction protocol (Plohl and Cornudella 1997). Following the strategy described by Biscotti et al. (2007), genomic DNA was partially digested (10 μg of DNA, 37 °C/5 min) with 5 U of AluI restriction endonuclease (Fermentas) in order to reveal mass of degraded fragments in a range between 300 and 3,000 bp. The fragments were ligated into the pUC19/SmaI vector. Transformed Escherichia coli DH5α competent cells (Invitrogen) were grown on 90-mm ampicillin-selective plates. After colony transfer, positively charged membranes (Amersham) were probed with digoxigenin-labeled AluI-digested (complete digestion) D. trunculus genomic DNA. Labeling, hybridization, and signal detection were performed as described in the following section. Hybridization was conducted under 65 °C in 20 mM sodium phosphate buffer (pH 7.2), 20% sodium dodecyl sulfate (SDS), allowing ~80% sequence similarity.

Southern Hybridization and Dot Blot Quantification

For Southern analysis, genomic DNA (2.5 μg/sample) was digested with 20 U of restriction endonucleases overnight, fragments were separated by electrophoresis on 1% agarose gel, and transferred onto a positively charged nylon membrane (Roche). Polymerase chain reaction (PCR)-amplified fragments of interest were labeled with digoxigenin by random priming using the DIG DNA Labeling and Detection Kit (Roche) and used as a hybridization probe. Membranes were hybridized in 20 mM sodium phosphate (pH 7.2), 20% SDS, at low, moderate, and high stringency conditions (60 °C, 65 °C, and 68 °C, respectively). Stringency washing was conducted in 20 mM sodium phosphate buffer, 1% SDS, at the temperature three degrees lower than the hybridization temperature. To detect the hybridization signal, membranes were incubated with anti-digoxigenin alkaline phosphatase conjugate, and chemiluminescent signals induced by CDP-Star (Roche) were captured on X-ray films (Amersham).

The relative genomic contribution of the DTC84 core repeat sequence was determined by dot blot analysis. Serial dilutions of D. trunculus genomic DNA and core repeat sequences were spotted onto a nylon membrane. Hybridization was performed under high (68 °C) and low stringency conditions (60 °C).

PCR Amplification

To amplify core repeats in DTC84 elements, primer pair DTC84AluSatF: TTGCCTGTGACGTCTACTTGTGC and DTC84AluSatR: AGAGGTCACAGGCAACCATCCA was derived according to the DTC84 clone. Amplification was performed with initial denaturation at 94 °C for 5 min, 35 cycles of 94 °C for 30 s, 57 °C for 30 s, 72 °C for 30 s, and final extension at 72 °C for 7 min.

Primers constructed according to sequence segments that flank core repeats, DTC84mobF: AACAAGAGCACCGCTGGGCG and DTC84mobR: CGCACGTTTGAAAAACGGGACGTA, were used in order to amplify additional copies of DTC84 elements. Amplification was performed with initial denaturation at 94 °C for 5 min, 35 cycles of 94 °C for 30 s, 55 °C for 30 s, 72 °C for 1 min, and final extension at 72 °C for 7 min.

All PCR products were cloned into pGEM-T Easy Vector System (Promega), and recombinant clones containing multimers were sequenced. All cloned fragments were sequenced at Macrogen Inc. (Korea) on ABI3730XL DNA Analyzer. Sequences submitted to GenBank obtained the following accession numbers: KC981676–KC981759.

Sequence Analysis

Obtained multimeric DNA sequences were trimmed to exclude primer binding sites from sequence analysis. Sequence editing and alignments were performed using the Geneious 5.4.3 program (Biomatters Ltd.).

Substructures, repeats, and motifs were searched with appropriate applications within the online tool Oligonucleotids repeats finder, developed by Bazin, Kosarev, and Babenko (http://wwwmgs.bionet.nsc.ru/mgs/programs/oligorep/InpForm.htm, last accessed December 20, 2013). To construct phylogenetic networks, program Network (Fluxus Technology Ltd 1999–2012) was used. CENSOR software tool was used for screening query sequences against a reference collection of the Repbase repetitive DNA collection (Jurka et al. 2005). The distribution of nucleotide diversity along the core repeat sequence using a set of complete repeats was calculated as average number of nucleotide differences per site, using a 10-bp window with a sliding step of 1 bp in DnaSP v4.5 (Rozas et al. 2003).

Results

DTC84 MITE Repetitive DNA Family

Recombinant clones enriched in repetitive fraction of D. trunculus genome were initially selected based on a positive correlation between hybridization signal intensity and genomic abundance of repetitive sequences. Sequenced inserts (in total 117 kb in 67 colonies) revealed, among others, 2,707-bp-long fragment named DTC84Alu. This fragment turned out to be of interest because it includes a short array composed of five ~160-bp-long tandem repeats, followed by a microsatellite segment at one array end. In the next screening of initial set of about 1,000 colonies, repeat-specific hybridization probe revealed four additional genomic fragments as positives. Besides single or few repeats organized in tandem, sequencing and sequence alignment of these fragments with DTC84Alu disclosed two ~50-bp-long conserved segments that flank repeats (thus called core repeats) and the microsatellite. Schematic presentation of sequenced genomic fragments is shown in supplementary figure S1a, Supplementary Material online.

Observed sequence segments have modular composition which includes flanking sequence L + (core repeats)₁₋₅ + (ACGG/ACGA)₂₋₁₃ microsatellite + flanking sequence R (fig. 1a). In order to make a broader view on sequences between flanking segments, corresponding primers were used in PCR amplification of genomic DNA. In agreement with elements obtained from genomic clones, random selection of 19 amplified fragments revealed conserved sequence and organizational pattern of the new element, named DTC84 after the sequence segment cloned first. All depicted DTC84 elements are shown in figure 1b. Based on the common microsatellite motif ACGG and low sequence similarity (57%) between core repeats of the two elements (supplementary fig. S1b, Supplementary Material online), DTC84 is closest to the CvG MITE-like pearl element from C. virginica (Gaffney et al. 2003).

Possibility that mechanisms of transposition are involved in the spread of DTC84 is indicated by observation of AA dinucleotides as putative TSD that defines element ends in the cloned genomic fragments (boxed in fig. 2 consensus sequence). Among other structural features that may be of significance for transposition is 11-bp-long inversely oriented motif positioned at ends of otherwise unrelated sequence modules L and R (figs. 1a and 2, green arrows). This motif is located subterminally in the flanking segment L while it occupies terminal location in R, in agreement with the position of inverted repeats in pearl (Gaffney et al. 2003). As in pearl, complete pol III box A and box B priming sites were not found in DTC84.

Additional complexity of DTC84 elements is given by differences in junctions between the flanking module L and core repeats (figs. 1b and 2a). This flanking module can be linked directly to core repeats but it can also accommodate two types of spacer segments, of ~80 and from 70 to 160 bp, all sharing the first 16 nucleotides (fig. 2a). The common part may represent a residual of the flanking sequence itself, missing when the junction with core repeats is direct. Spacer sequences are not related to other components of DTC84 and do not show any apparent substructure. At the opposite core repeat array end, the microsatellite segment is linked directly to the flanking sequence R (figs. 1b and 2b).

Structured Arrays of DTC84 Core Repeats

Additional set of core repeats was cloned and sequenced after PCR amplification of genomic DNA with core repeat-specific primers. After excision of primer sites, a total of 41 core repeat sequences were extracted from PCR-obtained multimers (up to 5-mer) and compared with 65 core repeats obtained from DTC84 elements described earlier (supplementary fig. S2, Supplementary Material online). It must be noted that genomic environment of PCR-obtained core repeats is not known; they can be incorporated in DTC84 elements but also associated with some other undetermined genomic sequences. All compared core repeats differ mostly in single nucleotide substitutions (supplementary fig. S2, Supplementary Material online) and revealed a consensus length of 156 bp. In addition, sliding window analysis showed that nucleotide differences among core repeat variants are unevenly distributed along the sequence and form regions of reduced variability, approximately in the first half and near the end of the core repeat (supplementary fig. S3, Supplementary Material online).

FIG. 1. Modular structure of DTC84 element. (a) Terminal and subterminal inverted repeats of DTC84 and equivalent elements are indicated by arrows within yellow and gray boxes that represent flanking modules. Solid black arrow in the middle of element represents core repeats, while the microsatellite segment is shown as a dark blue box. Conserved putative TSD sequence is given at element ends. (b) DTC84 in cloned genomic fragments (DTC84Alu, 84-16F, 84-17F, 84-35, and 84-37) and in genomic fragments retrieved by PCR amplification of genomic DNA with primers specific for left (yellow boxes) and right (gray boxes) flanking modules. Core repeats are shown as pink boxes, arrowheads indicating the orientation, and microsatellite regions are shown as dark blue boxes. Two types of DTC84 core repeats are indicated, and type 2 core repeat adjacent to the microsatellite is additionally marked by asterisk. Waved line on the left or right end of the core repeat box indicates truncation. Solid black lines represent genomic sequences other than in described modules. (c) Alignment of consensus sequences of different types of core repeats. Dots indicate identity, only differences to the consensus sequence of the first repeat type are shown.

FIG. 2. Sequence alignment of DTC84 element modules. Consensus sequence is derived according to the majority principle, and only differences to the consensus are shown in each segment. Dots indicate identity and dashes insertions or deletions. (a) Left flanking module (yellow bar) and two groups of accompanying spacer segments. (b) Core repeats, microsatellite region (dark blue bar), and right flanking module (gray bar). For simplicity, only beginning of the first core repeat and ending of the last core repeat in each array is shown, while interruption is indicated by slashes in the consensus sequence. Complete alignment of all core repeats can be seen in supplementary figure S2, Supplementary Material online. Black vertical arrows above the consensus sequence indicate starting positions of truncated repeats. Inverted repeats in flanking modules are indicated by green arrows. AA dinucleotides as putative TSDs are boxed in the consensus sequence. Dashed arrows above the consensus sequence show primer positions.

FIG. 3. Grouping of DTC84 core repeats. Type 1 core repeats are indicated by green circles, type 2 are yellow, while type 2* are orange.

The first core repeat in DTC84 is regularly 5′-end truncated for 4–67 nt. Only two core repeats out of 24 located at that position are complete (84mob12 and 84mob16). Truncated core repeats start at 11 different nucleotide positions, indicating that they were predominantly formed in independent events (fig. 2b). The distribution of truncation sites is however not quite random and seven core repeats start at the position 8, while the first nucleotide of four core repeats is at the position 42. Curiously, starting nucleotide in these two sets is always A. Although this nucleotide may represent a preferred site of truncation, identity in starting position can also be a consequence of amplification events that followed truncation.

[…] 3′ position, every DTC84 core repeat ends with the 12-nt-long palindrome TTGTCCGGACAA followed by the sequence AAATT, after which starts the next repeat unit. This palindrome is conserved in the majority of core repeats, and mutated variants are rare in this segment (16 out of 106 sequenced core repeats; supplementary fig. S2, Supplementary Material online). Specifically, transition point of the last core repeat into the microsatellite sequence differs in a way that sequence following the palindrome is AAAGGT.

Phylogenetic analysis (fig. 3) combined with visual inspection of aligned core repeats (supplementary fig. S2, Supplementary Material online) revealed clustering and enabled identification of variants according to cluster-specific diagnostic nucleotides. Two types of core repeats are distinctive, based on six most exclusive variable positions.
Type 1 Disregarding the truncated part, all sequenced arrays of variants are recognized by CC-T-G-A-T nucleotides while the DTC84 elements are composed of an integer number of same positions are occupied by TG-C-A-G-G nucleotides in core repeats, all being positioned in the same orientation variants of the type 2 (fig. 1c and supplementary fig. S2, and forming the same junction with the microsatellite. At its Supplementary Material online). Although the difference is 2554 Genome Biol. Evol. 5(12):2549–2559 doi:10.1093/gbe/evt202 Advance Access publication December 6, 2013 Tandem Repeat-Containing MITEs GBE subtle, type 2 variants can be further subdivided into two groups. As explained in the previous paragraph, those linked directly to the microsatellite (indicated as 2*) differ from others by two additional diagnostic nucleotides located just at the monomer end (fig. 1c and supplementary fig. S2, Supplementary Material online). In the phylogenetic network, these core repeats group mostly in the left node of the type 2 branch (fig. 3). Sequenced DTC84 elements are composed of core repeats arranged according to the formula: type 1 (up to 3 consecu- tive repeats)+ type 2(1–2repeats) (fig. 1b). It is important to note that no correlation could be derived among core repeat array length and/or composition, truncation point of the first repeat, spacer segments, and number of repeats in the micro- satellite array. For example, dimeric core repeats of the arrangement type 1+ type 2 build 11 sequenced DTC84 ele- ments (84mob23, 84mob3, 84-37, 84mob40, 84mob11, 84mob5, 84mob18, 84mob15, 84mob41, 84mob37, and 84mob16; fig. 1b). Despite identical core repeat composition, they differ in all other above-mentioned features (fig. 2). It can be concluded that every studied DTC84 element shows a FIG.4.—Southern hybridization of Donax trunculus genomic DNA. 
unique combination of sequences featured in an independent (a) Genomic DNA digested with MspI (methylation insensitive, line 1), manner. HpaII (isoshisomer of MspI, methylation sensitive, line 2), and MboI(meth- In core repeat arrays of some DTC84 elements, only type 2 ylation insensitive, line 3) and hybridized with the core repeat-specific is found (in 84-35, 84mob24, and in the partial segment probe. (b) Electrophoretic separation of genomic DNA amplified with 84-16; fig. 1b). Following the rule of the first core repeat in PCR primers located in conserved flanking sequences. For the primer array being truncated, in these cases this is the type 2. In other position, see figure 2. words, core repeats of any type can appear in the truncated form but strictly depending on their position in the array. According to the dot blot analysis (not shown) performed at high stringency conditions (68 C), DTC84 core repeats con- stituteupto1%ofthe bivalvegenomeor8.9  10 copies, Genomic Organization of DTC84 Core Repeats Revealed considering the genome size reported by Hinegardner (1974). by Southern Blot Hybridization Estimated contribution of these sequences almost double at Southern hybridization experiments were performed in efforts low stringency conditions (60 C), indicating that a large to provide a more general view on organizational patterns of number of related sequences should exist in the genome. core repeats in the D. trunculus genome (fig. 4a). Digestion of Genomic abundance estimated to be <1% for homogeneous genomic DNA with endonuclease MspI, cutting once within core repeats constituting DTC84 elements is roughly in agree- the core repeat monomer sequence, and with MboI, which ment with their occurrence in the cloned genomic DNA predominantly cuts core repeats of the type 2, revealed short fragments. ladders of multimers on Southern blots. 
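As a rough illustration of the points above, the following Python sketch checks that the reported 12-nt motif TTGTCCGGACAA is a true DNA palindrome (identical to its own reverse complement) and lists the multimer sizes expected when an enzyme cuts once per monomer in a short tandem array and digestion is partial. The function names are hypothetical and not taken from the paper. The 156-bp monomer length and the five-repeat array are the values reported in the text, used here as assumptions, and fragments reaching into flanking DNA are not modeled. The motif also contains CCGG, the recognition sequence shared by MspI and HpaII, which would be consistent with a single MspI cut per monomer, although the text does not state where the cut site lies.

```python
# Illustrative sketch only; function names are hypothetical, not from the paper.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_palindrome(seq):
    """A DNA palindrome reads the same 5'->3' on both strands."""
    return seq.upper() == reverse_complement(seq)


def ladder_sizes(monomer_len, n_repeats):
    """Sizes of internal fragments from a tandem array cut once per monomer.

    With partial digestion, fragments spanning k adjacent cut sites are
    k monomers long. Junction fragments extending into flanking DNA have
    unknown lengths and are left out (simplifying assumption).
    """
    return [k * monomer_len for k in range(1, n_repeats)]


PALINDROME = "TTGTCCGGACAA"  # 12-nt motif reported at the core repeat end
MSPI_SITE = "CCGG"           # recognition sequence of MspI and HpaII

palindrome_ok = is_palindrome(PALINDROME)
site_in_palindrome = MSPI_SITE in PALINDROME
sizes = ladder_sizes(156, 5)  # 156-bp consensus, five-repeat array (assumed)

print(palindrome_ok, site_in_palindrome, sizes)
```

Under these assumptions the internal fragments form a ladder of whole-monomer multiples, which is the kind of short ladder the Southern blots are described as showing.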
Methylation-sensitive endonuclease HpaII (isoschizomer of MspI) revealed a slightly less degraded profile (fig. 4a), in agreement with the previous conclusion about methylation of D. trunculus genomic DNA (Petrović et al. 2009). Blurred appearance and hybridization smear of up to about 5 kb could be related to association of core repeats or their segments with other genomic sequences, resulting in a number of hybridizing fragments of different length. In agreement are also short ladders obtained after amplification with PCR primers located in conserved flanking sequences (fig. 4b) and those obtained with primers specific for core repeats (not shown). Obtained results suggest predominant organization of core repeats in short arrays, as observed in the cloned DTC84 representatives.

Other Pearl-Related Repetitive Sequences in D. trunculus

Among sequences depicted in initial cloning of D. trunculus genomic DNA, local Blast revealed one fragment, DTC37AluF, which in a stretch of 140 nucleotides shares ~61% similarity with DTC84 and with the pearl element CvG (Gaffney et al. 2003; supplementary fig. S4a and b, Supplementary Material online). This match stretches over a part of the core repeat and adjacent microsatellite region. The microsatellite array that follows the putative core repeat in DTC37AluF is heterogeneous, built of the ACGG motif and its related variants ACTG and ACGA. In addition, DTC37AluF turned out to be related to another genomic fragment, DTC41AluF. They share 87% similarity in the cloned core repeat segment, both being truncated at the same nucleotide in the cloning procedure. The microsatellite region differs between them only in length, and it is followed by a 94% similar, 50-bp-long R flanking sequence, unrelated to equivalent modules in DTC84 and in CvG. Although not further explored in this work, observed similarity confirms that divergent elements of the DTC84 type exist in the D. trunculus genome, flanked with their distinctive sequences.

In the genomic clone DTC12Alu, we detected a 190-bp-long sequence that shows 63% similarity to the CvE element (Gaffney et al. 2003). This fragment incorporates part of the putative core repeat, the microsatellite segment, and adjacent sequence (supplementary fig. S4c, Supplementary Material online). The microsatellite in DTC12Alu is composed of ACCG and ACAG motifs, instead of the ACTG motif found in the original CvE (supplementary fig. S4d, Supplementary Material online).

Discussion

Modular structure of repetitive elements described in this work in the clam D. trunculus corresponds to MITE-like elements of the pearl family, detected in C. virginica, the blood ark Anadara trapezia, the sea urchin Strongylocentrotus purpuratus (Cohen et al. 1985; Gaffney et al. 2003), and in the Mediterranean mussel Mytilus galloprovincialis (Kourtidis et al. 2006). They also share equivalent structure with a group of elements described in Drosophila and some other insects (Locke et al. 1999; Miller et al. 2000; Wilder and Hollocher 2001; Yang and Barbash 2008; Coates et al. 2011; Kuhn and Heslop-Harrison 2011). All these elements are characterized by TSDs and left and right flanking sequences with 10–20-bp-long inverted repeats located at terminal and subterminal positions. In addition, they all have a central region composed of a sequence repeated in tandem up to about five times, adjacent at one side to a short tetranucleotide-based microsatellite segment (fig. 1a). Exceptionally, the microsatellite segment is missing in the M. galloprovincialis element (Kourtidis et al. 2006). At the DNA sequence level, only the dinucleotide AA as the putative TSD is apparently conserved in all of them, indicating involvement of a family of related integrases in the process of movement. Based on abundance, TIRs, dinucleotide TSD, and lack of or incomplete RNA pol III promoter, these elements are mostly referred to as MITE-like TEs. It was recently suggested that they can be alternatively classified as members of Helitron-like TEs, the class exploiting rolling-circle replication in its spread (Yang and Barbash 2008).

Rapid sequence divergence is expected along the whole length of nonautonomous TEs because of the lack of any potential coding function, while secondary structure and architecture of termini are assumed to be major requirements for transposition (Craig 1995; Coates et al. 2011). Comparisons of 12 Drosophila species showed that the central region of DINE-1 elements differs among species according to the core repeat sequence and length. The most variable parameter is copy number, varying also among elements of the same type within a species (Yang and Barbash 2008). In C. virginica, core repeats of three elements belonging to the pearl family (CvA, CvE, and CvG) share little or no sequence similarity (Gaffney et al. 2003). However, related sequences can be detected in other species. Sequence similarity between core repeats of CvE and those of the element detected in M. galloprovincialis is relatively high, ~74–78% (Kourtidis et al. 2006). Sequences related to C. virginica core repeats can also be identified in D. trunculus, CvG being distantly related to DTC84 core repeats and CvE to a fragment of the clone DTC12Alu.

Particularly intriguing is the sequential distribution of DTC84 core repeat variants. Based on observed variations in array length and arrangement of core repeats, it is difficult to anticipate evolution of array diversity only as a consequence of accumulation of mutations and subsequent amplification of particular sequence types. Fixed position of the variant 2* and truncation of variants indicate that arrays may be modified by consecutive deletions in the preexisting ancestral array. Excision of core repeats could lead to different truncation of the newly formed first repeat, simply as a result of imprecise mechanisms of sequence turnover. Proneness to alterations by insertions and deletions is additionally evident from the finding of two types of spacer sequences that separate core repeats and the flanking module L.

The landmark of DTC84 core repeats is a palindrome near the core repeat end, located in the segment with reduced variability compared with the rest of the sequence. Equivalent segment of nonhomologous core repeats of the pearl family CvA also exhibits reduced sequence variability, but, instead of hosting a palindrome, it is characterized by a microsatellite-like motif (different from the microsatellite in continuation of core repeats, Gaffney et al. 2003). CvA-related sequences organized as satDNAs preserve reduced variability in this region and disclose another one, located roughly in the middle of the core repeat (Plohl et al. 2010). Interestingly, two regions of reduced sequence variability were also observed in alignment of core repeats of the Drosophila PERI element (Kuhn and Heslop-Harrison 2011).

Putative motifs hidden in conserved sequence segments were observed in many satDNA monomers (e.g., Hall et al. 2003; Meštrović et al. 2006). Their occurrence is usually interpreted as a result of evolution under constraints, most likely due to functional interactions with protein components in chromatin. Experimental evidence is, however, still missing, and the only well-described interaction is between the 17-bp-long CENP-B motif in a subset of human alpha satDNA monomers and the CENP-B protein (Masumoto et al. 1989). This sequence motif is found in apparently unrelated satDNAs of several mammalian species, and its variants were also detected in some invertebrates, including nematodes (Meštrović et al. 2013) and bivalve mollusks (Canapa et al. 2000). The CENP-B protein is likely to be involved in the human centromere assembly (Ohzeki et al. 2002), but its similarity to pogo-like transposases suggests its origin from domesticated mobile elements and a possible role in satDNA sequence rearrangements (Kipling and Warburton 1997; Casola et al. 2008). One large group of MITEs, abundant in the human and other genomes, is related to the pogo-like subfamily of Tc1/mariner transposons (Feschotte et al. 2002). These data and the observed putative motifs suggest that transposase-related mechanisms may participate in alterations of tandem repeats of DTC84 and equivalent elements, as well as in formation of satDNAs derived from these sequences.

In addition to sequence motifs, the microsatellite region can also have a role in element structuring and dynamics, or alternatively it can be trailed as a consequence of these processes (Wilder and Hollocher 2001; Coates et al. 2011). A possible significance is supported by existence of the microsatellite repeat motif ACGG in CvG (Gaffney et al. 2003) and in DTC84, despite weak or no relevant sequence similarity between other modules.

The idea about core repeats and satDNA monomers as independent insertion/deletion units is further supported by finding diverse satellite monomers (or their short arrays) on euchromatic genome locations, including in the vicinity of genes (Kuhn et al. 2012; Brajković et al. 2012). Other observations also stress transposition as an intrinsic feature of at least some sequences arranged in tandem. For example, it was proposed that mechanisms of transposition spread human alpha satDNA to new genomic locations (Alkan et al. 2007). In addition, analysis of variability of related satDNAs shared by groups of species led to the model in which bursts of spread are followed by long periods of stasis, a feature pertinent to mobile elements (Meštrović et al. 2006).

It is striking that core repeats of DTC84, pearl-related sequences, and monomers of eight different satDNAs detected in D. trunculus are of similar length, about 160 bp (Plohl and Cornudella 1996, 1997; Petrović and Plohl 2005; Petrović et al. 2009). It was hypothesized that preferred length of satDNA repeats of 140–200 bp and 340 bp is favored by the chromatin structure (Henikoff et al. 2001). Based on observed similarities with core repeats, we can also speculate that the preferred length mirrors specificities of mechanisms involved in initial processes of repeat formation.

Compared with copy number alterations in arrays of tandem repeats, much less can be said about mechanisms responsible for initial tandem duplications of repetitive sequences. In this regard, duplication of entire MITE elements is proposed to be a consequence of aberrant DNA replication (Izsvák et al. 1999). Briefly, due to an inverted repeat and/or a palindrome, a sequence can be duplicated when DNA polymerase passes through a MITE, followed by excision of the duplicated segment and its reintegration into a new location (fig. 4 in Izsvák et al. 1999). In a case when the duplicated segment is not excised, the result will be formation of a tandem copy. Similarly, another model hypothesizes that internal MITE tandem repeats can be built during DNA replication based on the ability of a sequence to form stem-loop structures (Hikosaka and Kawahara 2004). The latter model predicts unidirectional expansion of tandem repeats into long arrays typical for satDNAs. The arrangement of DTC84 core repeats is consistent with consecutive introduction of mutations during stepwise repeat duplication, while observed diversity in composition of variants can be accomplished by subsequent deletions, as explained earlier.

In conclusion, ordered distribution of mutations in the array of a variable number of core repeats in DTC84 is consistent with deletion events in a preformed segment, occurring independently with respect to the flanking modules. In this way, flanking sequences may be considered as a kind of a "cassette" for internal core repeats. One core repeat sequence end is marked by a palindrome, located within a segment of reduced sequence variability. Such segments may have a role in core repeat rearrangements, probably by mechanisms related to transposition. Although limited, similarity of some core repeats and satDNAs characterized previously in D. trunculus and other bivalve mollusks indicates a complex network which links tandem repeats residing inside MITEs and those expanded into arrays of satDNAs.

Supplementary Material

Supplementary figures S1–S4 are available at Genome Biology and Evolution online (http://www.gbe.oxfordjournals.org/).

Acknowledgments

The authors thank Brankica Mravinac, Nevenka Meštrović, and Andrea Luchetti for critical reading and comments on the manuscript. This work was supported by Research Fund of Ministry of Science, Education and Sports of Republic of Croatia, project no. 098-0982913-2756.

Literature Cited

Alkan C, et al. 2007. Organization and evolution of primate centromeric DNA from whole-genome shotgun sequence data. PLoS Comput Biol. 3(9):1807–1818.
Biscotti MA, et al. 2007. Repetitive DNA, molecular cytogenetics and genome organization in the King scallop (Pecten maximus). Gene 406:91–98.
Brajković J, Feliciello I, Bruvo-Madaric B, Ugarković Ð. 2012. Satellite DNA-like elements associated with genes within euchromatin of the beetle Tribolium castaneum. G3 2:931–941.
Bureau TE, Wessler SR. 1992. Tourist: a large family of small inverted repeat elements frequently associated with maize genes. Plant Cell 4:1283–1294.
Cafasso D, Cozzolino S, De Luca P, Chinali G. 2003. An unusual satellite DNA from Zamia paucijuga (Cycadales) characterised by two different organisations of the repetitive unit in the plant genome. Gene 311:71–79.
Canapa A, Barucca M, Cerioni PN, Olmo E. 2000. A satellite DNA containing CENP-B box-like motifs is present in the antarctic scallop Adamussium colbecki. Gene 247:175–180.
Casola C, Hucks D, Feschotte C. 2008. Convergent domestication of pogo-like transposases into centromere-binding proteins in fission yeast and mammals. Mol Biol Evol. 25:29–41.
Coates BS, Kroemer JA, Sumerford DV, Hellmich RLL. 2011. A novel class of miniature inverted repeat transposable elements (MITEs) that contain hitchhiking (GTCY) microsatellites. Insect Mol Biol. 20:15–27.
Cohen JB, Hoffman-Liebermann B, Kedes L. 1985. Structure and unusual characteristics of a new family of transposable elements in the sea urchin Strongylocentrotus purpuratus. Mol Cell Biol. 5:2804–2813.
Craig NL. 1995. Unity in transposition reactions. Science 270:253–254.
Dover GA. 1986. Molecular drive in multigene families: How biological novelties arise, spread and are assimilated. Trends Genet. 2:159–165.
Feschotte C, Mouchès C. 2000. Evidence that a family of miniature inverted-repeat transposable elements (MITEs) from the Arabidopsis thaliana genome has arisen from a pogo-like DNA transposon. Mol Biol Evol. 17:730–737.
Feschotte C, Zhang X, Wessler S. 2002. Miniature inverted-repeat transposable elements (MITEs) and their relationship with established DNA transposons. In: Craig N, editor. Mobile DNA II. Washington (DC): ASM Press. p. 1147–1158.
Finnegan DJ. 1989. Eukaryotic transposable elements and genome evolution. Trends Genet. 5:103–107.
Fleetwood DJ, et al. 2011. Abundant degenerate miniature inverted-repeat transposable elements in genomes of epichloid fungal endophytes of grasses. Genome Biol Evol. 3:1253–1264.
Gaffney PM, Pierce JC, Mackinley AG, Titchen DA, Glenn WK. 2003. Pearl, a novel family of putative transposable elements in bivalve mollusks. J Mol Evol. 56:308–316.
Hall SE, Kettler G, Preuss D. 2003. Centromere satellites from Arabidopsis populations: maintenance of conserved and variable domains. Genome Res. 13:195–205.
Heikkinen E, Launonen V, Muller E, Bachmann L. 1995. The pvB370 BamHI satellite DNA family of the Drosophila virilis group and its evolutionary relation to mobile dispersed genetic pDv elements. J Mol Evol. 41:604–614.
Henikoff S, Ahmad K, Malik HS. 2001. The centromere paradox: stable inheritance with rapidly evolving DNA. Science 293:1098–1102.
Hikosaka A, Kawahara A. 2004. Lineage-specific tandem repeats riding on a transposable element of MITE in Xenopus evolution: a new mechanism for creating simple sequence repeats. J Mol Evol. 59:738–746.
Hinegardner R. 1974. Cellular DNA content of the Mollusca. Comp Biochem Physiol A Comp Physiol. 47:447–460.
Izsvák Z, et al. 1999. Short inverted-repeat transposable elements in teleost fish and implications for a mechanism of their amplification. J Mol Evol. 48:13–21.
Jurka J, Kapitonov VV, Kohany O, Jurka MV. 2007. Repetitive sequences in complex genomes: structure and evolution. Annu Rev Genomics Hum Genet. 8:241–259.
Jurka J, et al. 2005. Repbase Update, a database of eukaryotic repetitive elements. Cytogenet Genome Res. 110:462–467.
Kazazian HH. 2004. Mobile elements: drivers of genome evolution. Science 303:1626–1632.
Kipling D, Warburton PE. 1997. Centromeres, CENP-B and Tigger too. Trends Genet. 13:141–145.
Kourtidis A, Drosopoulou E, Pantzartzi CN, Chintiroglou CC, Scouras ZG. 2006. Three new satellite sequences and a mobile element found inside HSP70 introns of the Mediterranean mussel (Mytilus galloprovincialis). Genome 49:1451–1458.
Kuhn GCS, Heslop-Harrison JS. 2011. Characterization and genomic organization of PERI, a repetitive DNA in the Drosophila buzzatii cluster related to DINE-1 transposable elements and highly abundant in the sex chromosomes. Cytogenet Genome Res. 132:79–88.
Kuhn GCS, Küttler H, Moreira-Filho O, Heslop-Harrison JS. 2012. The 1.688 repetitive DNA of Drosophila: concerted evolution at different genomic scales and association with genes. Mol Biol Evol. 29:7–11.
Locke J, Howard LT, Aippersbach N, Podemski L, Hodgetts RB. 1999. The characterization of DINE-1, a short, interspersed repetitive element present on chromosome and in the centric heterochromatin of Drosophila melanogaster. Chromosoma 108:356–366.
López-Flores I, Garrido-Ramos MA. 2012. The repetitive DNA content of eukaryotic genomes. In: Garrido-Ramos MA, editor. Repetitive DNA VII. Basel (Switzerland): Karger Publishers. p. 1–28.
López-Flores I, et al. 2004. The molecular phylogeny of oysters based on a satellite DNA related to transposons. Gene 339:181–188.
Macas J, Koblízková A, Navrátilová A, Neumann P. 2009. Hypervariable 3′ UTR region of plant LTR-retrotransposons as a source of novel satellite repeats. Gene 448:198–206.
Masumoto H, Masukata H, Muro Y, Nozaki N, Okazaki T. 1989. A human centromere antigen (CENP-B) interacts with a short specific sequence in alphoid DNA, a human centromeric satellite. J Cell Biol. 109:1963–1973.
Meštrović N, Castagnone-Sereno P, Plohl M. 2006. Interplay of selective pressure and stochastic events directs evolution of the MEL172 satellite DNA library in root-knot nematodes. Mol Biol Evol. 23:2316–2325.
Meštrović N, et al. 2013. Conserved DNA motifs, including the CENP-B box-like, are possible promoters of satellite DNA array rearrangements in nematodes. PLoS One 8:e67328.
Miller WJ, Nagel A, Bachmann J, Bachmann L. 2000. Evolutionary dynamics of the SGM transposon family in the Drosophila obscura species group. Mol Biol Evol. 17:1597–1609.
Mravinac B, Plohl M. 2010. Parallelism in evolution of highly repetitive DNAs in sibling species. Mol Biol Evol. 27:1857–1867.
Noma K, Ohtsubo E. 2000. Tnat1 and Tnat2 from Arabidopsis thaliana: novel transposable elements with tandem repeat sequences. DNA Res. 7:1–7.
Ohzeki J, Nakano M, Okada T, Masumoto H. 2002. CENP-B box is required for de novo centromere chromatin assembly on human alphoid DNA. J Cell Biol. 159:765–775.
Petrović V, Plohl M. 2005. Sequence divergence and conservation in organizationally distinct subfamilies of Donax trunculus satellite DNA. Gene 362:37–43.
Petrović V, et al. 2009. A GC-rich satellite DNA and karyology of the bivalve mollusk Donax trunculus: a dominance of GC-rich heterochromatin. Cytogenet Genome Res. 124:63–71.
Plohl M, Cornudella L. 1996. Characterization of a complex satellite DNA in the mollusc Donax trunculus: analysis of sequence variations and divergence. Gene 169:157–164.
Plohl M, Cornudella L. 1997. Characterization of interrelated sequence motifs in four satellite DNAs and their distribution in the genome of the mollusc Donax trunculus. J Mol Evol. 44:189–198.
Plohl M, Luchetti A, Meštrović N, Mantovani B. 2008. Satellite DNAs between selfishness and functionality: structure, genomics and evolution of tandem repeats in centromeric (hetero)chromatin. Gene 409:72–82.
Plohl M, et al. 2010. Long-term conservation vs high sequence divergence: the case of an extraordinarily old satellite DNA in bivalve mollusks. Heredity 104:543–551.
Rozas J, Sanchez-DelBarrio JC, Messeguer X, Rozas R. 2003. DnaSP, DNA polymorphism analyses by the coalescent and other methods. Bioinformatics 19:2496–2497.
Wang S, Zhang L, Meyer E, Matz MV. 2010. Characterization of a group of MITEs with unusual features from two coral genomes. PLoS One 5:e10700.
Wilder J, Hollocher H. 2001. Mobile elements and the genesis of microsatellites in dipterans. Mol Biol Evol. 18:384–392.
Yang H-P, Barbash DA. 2008. Abundant and species-specific DINE-1 transposable elements in 12 Drosophila genomes. Genome Biol. 9:R39.

Associate editor: Josefa Gonzalez

Genome Biol. Evol. 5(12):2549–2559 doi:10.1093/gbe/evt202 Advance Access publication December 6, 2013

Journal

Genome Biology and Evolution, Oxford University Press

Published: Dec 1, 2013
