The distribution of a germline methylation marker suggests a regional mechanism of LINE-1 silencing by the piRNA-PIWI system
© Sigurdsson et al; licensee BioMed Central Ltd. 2012
Received: 28 November 2011
Accepted: 24 April 2012
Published: 24 April 2012
A defense system against transposon activity in the human germline based on PIWI proteins and piRNA has recently been discovered. It represses the activity of LINE-1 elements via DNA methylation by a largely unknown mechanism. Based on the dispersed distribution of clusters of piRNA genes in a strand-specific manner on all human chromosomes, we hypothesized that this system might work preferentially on local and proximal sequences. We tested this hypothesis with a methylation-associated SNP (mSNP) marker which is based on the density of C-T transitions in CpG dinucleotides as a surrogate marker for germline methylation.
We found significantly higher density of mSNPs flanking piRNA clusters in the human genome for flank sizes of 1-16 Mb. A dose-response relationship between number of piRNA genes and mSNP density was found for up to 16 Mb of flanking sequences. The chromosomal density of hypermethylated LINE-1 elements had a significant positive correlation with the chromosomal density of piRNA genes (r = 0.41, P = 0.05). Genome windows of 1-16 Mb containing piRNA clusters had significantly more hypermethylated LINE-1 elements than windows not containing piRNA clusters. Finally, the minimum distance to the next piRNA cluster was significantly shorter for hypermethylated LINE-1 compared to normally methylated elements (14.4 Mb vs 16.1 Mb).
Our observations support our hypothesis that the piRNA-PIWI system preferentially methylates sequences in close proximity to the piRNA clusters and perhaps physically adjacent sequences on other chromosomes. Furthermore, they suggest that this proximity effect extends up to 16 Mb. This could be due to an unknown localization signal, transcription of piRNA genes near the nuclear membrane or the presence of an unknown RNA molecule that spreads across the chromosome and targets the methylation directed by the piRNA-PIWI complex. Our data suggest a region-specific molecular mechanism which can be sought experimentally.
Approximately 70% of CpG dinucleotides are methylated in the human genome . Additionally, non-CpG methylation, primarily of the CWG trinucleotide, has recently been described in the human genome . It accounts for 25% of cytosine methylation in embryonic stem cells, compared to only 0.02% of all cytosine methylation in adult cell lines . DNA methylation has several important functions in mammalian genomes, highlighted by the lethal phenotype of homozygous mutants of DNA methyltransferases [3, 4]. These functions include regulation of tissue-specific gene expression [5, 6], parent-of-origin expression of imprinted genes  and the stable maintenance of the inactive X chromosome in females [8, 9].
Transposable elements (TE) comprise 45% of the human genome  and their activity threatens genomic stability . The two major active subfamilies of transposable elements are the LINE-1 subfamily of the LINE (long interspersed nucleotide elements) family and the Alu subfamily of the SINE (short interspersed nucleotide elements) family . Several defense systems against harmful effects of TE exist, and some of them involve DNA methylation . A hypothesis that DNA methylation is a global defense system against all TE in the human genome  has been challenged . Examples of alternative defense mechanisms are families of enzymes interfering with TE activity by altering the properties of the RNA transcripts from TE , such as the APOBEC3 family . While some of the APOBEC3 family members can inhibit retrotransposition by both LINE-1 and Alu , other members are selective for inhibition of only Alu .
DNA methylation appears to be involved in defense against LINE-1 and long terminal repeat (LTR) elements in mammals. The transcription of LINE-1 elements is controlled by DNA methylation, at least in vitro. Mice homozygous for mutations in the DNMT3L gene lose the methylation of LINE-1 and LTR elements, resulting in their increased transcription.
A germline TE defense system based on the interaction of piwi-interacting RNA (piRNA) with PIWI (P-element induced wimpy testis) proteins was originally described in Drosophila. Similar protein-piRNA complexes have subsequently been discovered in the mouse and human germline, where they seem to mediate defense against mouse IAP (Intracisternal A-particle) and LINE-1 elements via DNA methylation. Male mice with mutated PIWI proteins (MILI, MIWI2) are sterile and have increased LINE-1 and IAP transcription [19, 21], although female null mutants are fertile. In the germline, PIWI proteins in the cytoplasm cleave transcribed RNAs into piRNA. A ping-pong amplification cycle occurs that increases the relative abundance of TE transcripts in the piRNA pool. The genes coding for the 24-30 nucleotide piRNAs are found in clusters on all chromosomes in both humans and mice, with the majority of piRNA genes in each cluster generally arising from the same strand. Their locations in the human and mouse genomes depend on neither TE nor gene density. Following the formation of piRNA, a piRNA-PIWI complex moves into the nucleus, where the complex directs de novo methylation of LINE-1 and IAP elements [22, 23] via an unknown mechanism. piRNA-PIWI appears to have a role in the pathogenesis of human cancer and this is currently the focus of ongoing research. In particular, Hiwi, a member of the piwi gene family, appears to be over-expressed in germline tumors such as seminomas. It is also over-expressed in non-germline tumors such as sarcomas, where high Hiwi expression is associated with a worse prognosis. Increased Hiwi expression in tumors is associated with increased DNA methylation levels, and these tumors respond to inhibitors of DNA methyltransferases.
There are at least 208 piRNA clusters in the human genome, on average one cluster every 16 Mb. The distribution of piRNA gene clusters on each human chromosome suggests that this defense mechanism might be regional, with a range of a few tens of megabases. This hypothesis predicts that the piRNA-PIWI defense mechanism preferentially methylates sequences that are in proximity to the piRNA clusters. Here, we present data in support of this hypothesis using our previously published methylation-associated SNP (mSNP) surrogate marker of germline methylation. This marker was created by constructing a database of the density of C-T or A-G SNPs within the CpG dinucleotide (mSNPs) from the second generation HapMap dataset of SNPs, under the assumption that the majority of these SNPs are likely to stem from the hypermutability of methylated cytosine in the germline.
The mSNP density adjacent to piRNA clusters
Dose-response relationship between piRNA elements and mSNPs
Chromosomal distribution of normally methylated LINE-1 elements and hypermethylated LINE-1 elements in the human genome
LINE-1 elements preferentially reside within gene-poor and AT-rich regions of the genome. For each LINE-1 element we measured the density of mSNPs in 10 kb flanks on each side of the element. We defined hypermethylated LINE-1 elements as elements with mSNP flank density in the top 10th percentile. The hypermethylated LINE-1 elements had a mean of 8.9 mSNPs in the 10 kb of sequence flanking the elements (median 8, range 7-41 mSNPs). In contrast, the normally methylated LINE-1 elements had a mean of 2.4 mSNPs in the 10 kb of sequence flanking the elements (median 2, range 0-6 mSNPs).
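The percentile-based definition above can be illustrated with a minimal sketch. It is not the authors' Java implementation: the function name `split_by_top_decile`, the nearest-rank percentile rule and the handling of ties (only counts strictly above the 90th percentile value are labeled hypermethylated) are assumptions made for illustration, and the counts in the usage lines are made-up toy values.

```python
import math


def split_by_top_decile(flank_counts):
    """Label elements whose flank mSNP count lies above the 90th percentile.

    flank_counts: one mSNP count per LINE-1 element (hypothetical input).
    Returns (threshold, labels); labels[i] is True for "hypermethylated".
    """
    if not flank_counts:
        raise ValueError("flank_counts must not be empty")
    ordered = sorted(flank_counts)
    # Nearest-rank 90th percentile (an assumed convention, not from the paper).
    rank = max(1, math.ceil(0.9 * len(ordered)))
    threshold = ordered[rank - 1]
    # Strictly-above rule: elements tied with the threshold stay "normal".
    labels = [count > threshold for count in flank_counts]
    return threshold, labels


# Toy usage with invented counts for ten elements.
toy_counts = [0, 1, 2, 2, 3, 3, 4, 5, 6, 9]
toy_threshold, toy_labels = split_by_top_decile(toy_counts)
```

Because mSNP counts are small integers, ties at the cutoff are common, so the labeled fraction will generally not be exactly 10%; how ties were resolved in the original analysis is not stated.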
We hypothesized that we might observe a positive correlation between the number of hypermethylated LINE-1 elements and the number of piRNA genes on a chromosomal level. For each chromosome, we calculated the chromosomal density of hypermethylated LINE-1 elements by dividing the absolute number of hypermethylated LINE-1 elements by the chromosomal length. We found that hypermethylated LINE-1 elements were over-represented on chromosomes 6, 10, 13, 17, 18, 20, 21 and 22 and under-represented on chromosomes 1, 2, 3, 4, 5, 7, 14 and 15 (data not shown). Similarly, we calculated the chromosomal density of piRNA genes by dividing the absolute number of piRNA genes per chromosome by the chromosomal length. There was an over-representation of piRNA genes for chromosomes 6, 10, 17, 19 and 22 and an under-representation of piRNA genes for chromosomes 1, 2, 3, 4, 5, 7, 8, 12, 13, 16, 20 and 21. There was a positive correlation between the chromosomal piRNA gene density and the hypermethylated LINE-1 element density (Spearman's ρ = 0.42, p = 0.05).
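The density calculation and rank correlation described above can be sketched as follows. This is an illustrative pure-Python version, not the R code used in the study; the helper names (`density_per_mb`, `rank_with_ties`, `spearman_rho`) are hypothetical, and the sketch computes only the correlation coefficient, not its P value.

```python
import math


def density_per_mb(count, length_bp):
    """Number of elements per megabase of chromosome."""
    return count / (length_bp / 1e6)


def rank_with_ties(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(rank_with_ties(x), rank_with_ties(y))
```

In use, one would build two lists with one density per chromosome (piRNA genes and hypermethylated LINE-1 elements) and pass them to `spearman_rho`.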
Intra-chromosomal distribution of hypermethylated LINE-1 elements in the human genome and relationship with piRNA clusters
Distance between LINE-1 elements and piRNA clusters in the human genome
The presence of several piRNA clusters on each mouse and human chromosome might suggest that a great amount of piRNA is needed for the piRNA-PIWI defense system or that the system operates preferentially on sequences proximal to the piRNA clusters via an unknown mechanism. We tested the hypothesis that the piRNA-PIWI system operates preferentially on sequences adjacent to piRNA clusters. Based on the number of the piRNA clusters, the size range of the effect could be within tens of megabases. If this hypothesis is true, we should observe more germline methylation of sequences proximal to piRNA clusters. In particular, this methylation should be within LINE-1 sequences, the dominant target of the piRNA-PIWI defense system. This hypothesis is testable using the mSNP surrogate marker of germline methylation in the human genome that we have recently described. Our methods only allow us to test this on proximal sequences on the same chromosome, as the spatial organization of the chromosome segments in the human germline is unclear.
We found that the density of mSNPs flanking piRNA clusters was significantly higher for flank sizes of 1-16 Mb compared to the genome average mSNP density. We also found a dose-response relationship between numbers of piRNA genes within piRNA clusters and adjacent mSNPs extending up to 16 Mb. Combined, these results support a hypothesis that there might be more germline methylation overall adjacent to piRNA clusters, and that the range of the effect is within 16 Mb. The sequences immediately flanking the piRNA clusters (< 1 Mb) might be spared methylation. This could be due to hypomethylation of the piRNA clusters themselves to maintain their high expression in the germline. Alternatively, our approach might be underpowered to detect a difference in the smaller window sizes.
Second, we defined hypermethylated LINE-1 elements based on our mSNP marker and analyzed their distribution in the human genome. We found that the chromosomal density of hypermethylated LINE-1 elements correlated positively with the density of piRNA genes. We also found significantly more hypermethylated LINE-1 elements in genome windows containing piRNA clusters for window sizes of 1-16 Mb. The mean minimum distance from a LINE-1 element to the nearest piRNA cluster was 14-16 Mb, and was significantly shorter for hypermethylated LINE-1 elements. This supports the view that a higher number of hypermethylated LINE-1 elements are adjacent to piRNA clusters, consistent with a hypothesis that the piRNA-PIWI system might mediate methylation of sequences proximal to the piRNA clusters.
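The minimum-distance comparison discussed above can be illustrated with a short sketch. It assumes, purely for illustration, that each LINE-1 element and each piRNA cluster is reduced to a single coordinate on the same chromosome; how the original analysis measured distance (cluster edges, midpoints, or otherwise) is not specified here, and the function name is hypothetical.

```python
import bisect


def nearest_cluster_distance(position, cluster_positions_sorted):
    """Distance in bp from `position` to the closest cluster coordinate.

    cluster_positions_sorted: ascending cluster coordinates on the same
    chromosome as `position` (an assumed, simplified representation).
    """
    if not cluster_positions_sorted:
        raise ValueError("no clusters on this chromosome")
    i = bisect.bisect_left(cluster_positions_sorted, position)
    candidates = []
    if i < len(cluster_positions_sorted):
        # Closest cluster at or to the right of the element.
        candidates.append(cluster_positions_sorted[i] - position)
    if i > 0:
        # Closest cluster to the left of the element.
        candidates.append(position - cluster_positions_sorted[i - 1])
    return min(candidates)
```

Applying this function to every element in each group and averaging the results would give group means comparable in kind to the reported 14.4 Mb and 16.1 Mb.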
There are several limitations to our approach. The resolution of our mSNP data set limits the sequence length to be analyzed to a minimum of 10 kb. If possible, analyzing shorter sequences might be more appropriate, given earlier observations that the methylation spread from retroviruses extends up to 1 kb. We might therefore expect an even stronger effect if we could limit our selection of hypermethylated LINE-1 elements using methylation information from shorter sequences.
Our mSNP marker correlates with several other features, including GC ratio, CpG density, repeat density and the absolute density of SNPs. However, many of the observations presented are independent of these effects, such as the dose-response relationship and the chromosomal distribution of piRNA elements and hypermethylated LINE-1 elements. Furthermore, the locations of piRNA clusters depend on neither gene nor repeat density.
A potential piRNA-PIWI defense system working on proximal sequences is intriguing, especially given that the amplification cycle and the formation of the piRNA-PIWI complex occur in the cytoplasm. Perhaps a unique localization signal is transcribed and involved in each piRNA complex? There could also be an RNA molecule that spreads across the chromosomes in both directions from the piRNA clusters and recruits the piRNA-PIWI complex to the region. This could operate in a similar manner to the XIST mechanism of silencing of the X chromosome. Alternatively, actively transcribed piRNA genes might be located near the nuclear membrane, so that the formation of the piRNA-PIWI complex would occur immediately outside the nucleus and the complex would then return to the sequences closest to the cluster. Our data might also reflect some structural characteristics of chromosomes resulting in correlation between germline methylation, LINE-1 elements and piRNA clusters.
In summary, our results support an intriguing regional effect of piRNA-PIWI mediated methylation of LINE-1 sequences. These results call for experiments to determine causation and molecular mechanisms.
We have previously described the mSNP data set and demonstrated its usage as a surrogate marker for germline methylation . It involves screening large databases of SNPs for mutations likely to stem from mutations of methylated cytosine into thymine (C → T or G → A within a CpG dinucleotide). We created a genome-wide marker of germline methylation from the entire HapMap phase II data set  and a dataset of ancestral alleles for the HapMap SNPs . From these data sets, we defined mSNP as any C/T or G/A SNP with adjacent 3' guanine base (for C/T SNPs) or 5' cytosine base (for G/A SNPs). Additionally, we required that the ancestral allele was either cytosine or guanine, excluding T → C and A → G mutations.
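The mSNP definition above translates into a simple rule. The sketch below is only an illustration: the function name `is_msnp` and its argument layout are hypothetical and do not reflect the authors' Java scripts or any HapMap file format.

```python
def is_msnp(alleles, ancestral, base_5prime, base_3prime):
    """Return True if a SNP meets the mSNP definition given in the text.

    alleles:      the two observed alleles, e.g. ("C", "T")
    ancestral:    the inferred ancestral allele
    base_5prime:  reference base immediately 5' of the SNP
    base_3prime:  reference base immediately 3' of the SNP
    """
    observed = frozenset(a.upper() for a in alleles)
    ancestral = ancestral.upper()
    if observed == {"C", "T"}:
        # C -> T on this strand: ancestral C followed by a 3' guanine.
        return ancestral == "C" and base_3prime.upper() == "G"
    if observed == {"G", "A"}:
        # G -> A, i.e. C -> T on the opposite strand: ancestral G with a 5' cytosine.
        return ancestral == "G" and base_5prime.upper() == "C"
    return False
```

The ancestral-allele check is what excludes T → C and A → G changes, which would otherwise look identical at the level of the two observed alleles.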
The mSNP dataset was validated by demonstrating that it reflected the hypermutability of methylated cytosines and by showing a negative correlation of the marker with CpG islands, sequences generally hypomethylated in the human genome.
For this project, we downloaded the entire human genome sequence from the UCSC data browser (http://genome.ucsc.edu). All data were in NCBI35 (UCSC hg17) coordinates. The sequence was used to calculate GC ratio and count CpGs. Data on LINE-1 repeats were extracted from the Rmsk table downloaded from the UCSC table browser. The locations of piRNA clusters in the human genome were taken from Girard et al. Data processing scripts were written in the Java programming language and thoroughly tested prior to use. All scripts are available at http://www.hi.is/~mis.
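As an illustration of the GC ratio and CpG counting step, a minimal sketch is given below. It is not the authors' Java code; the function name is hypothetical, and excluding uncalled bases (N) from the GC ratio denominator is an assumption rather than something stated in the text.

```python
def gc_ratio_and_cpg_count(sequence):
    """Return (GC ratio, CpG count) for a DNA sequence string.

    Bases other than A, C, G, T (e.g. N) are ignored when computing
    the GC ratio; this convention is an assumption of the sketch.
    """
    seq = sequence.upper()
    called = sum(seq.count(base) for base in "ACGT")
    gc = seq.count("G") + seq.count("C")
    # "CG" cannot overlap with itself, so str.count finds every CpG.
    cpg = seq.count("CG")
    ratio = gc / called if called else 0.0
    return ratio, cpg
```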
All statistics and figure preparation were done in R, version 2.11.0 (The R Foundation, Austria). Non-parametric methods were used when the data did not fit a normal distribution. Correlation was evaluated by Spearman's rank correlation. Estimates of means and mean differences were assigned 95% confidence intervals by calculating a bootstrap estimate of the parameters and their distribution with 1,000-10,000 random samplings of the data. This was done with the boot package in R. Statistical and figure processing scripts are available upon request. The level of statistical significance was set at 0.05.
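The bootstrap procedure can be sketched as below. The study used the boot package in R; this Python version is a simplified stand-in that uses the plain percentile method, which may differ from the interval type the authors chose. The function name, its parameters and the fixed seed are illustrative assumptions.

```python
import random


def bootstrap_mean_diff_ci(a, b, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for mean(a) - mean(b).

    a, b: non-empty lists of numbers (e.g. distances for two groups).
    Returns (lower, upper) bounds of the (1 - alpha) interval.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    diffs = []
    for _ in range(n_resamples):
        # Resample each group with replacement, keeping its original size.
        resampled_a = [rng.choice(a) for _ in a]
        resampled_b = [rng.choice(b) for _ in b]
        diffs.append(sum(resampled_a) / len(resampled_a)
                     - sum(resampled_b) / len(resampled_b))
    diffs.sort()
    lower = diffs[int((alpha / 2) * n_resamples)]
    upper = diffs[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper
```

A difference in means would be regarded as significant at the 0.05 level when the resulting 95% interval excludes zero.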
This work was funded by the University of Iceland Research Fund, the Icelandic Student Innovation Fund, The Landspítali University Hospital Science Fund and the Memorial Fund of Bergþóra Magnúsdóttir and Jakob Bjarnason.
- Ehrlich M, Gama-Sosa MA, Huang LH, Midgett RM, Kuo KC, McCune RA, Gehrke C: Amount and distribution of 5-methylcytosine in human DNA from different types of tissues of cells. Nucleic Acids Res. 1982, 10: 2709-2721. 10.1093/nar/10.8.2709.
- Lister R, Pelizzola M, Dowen RH, Hawkins RD, Hon G, Tonti-Filippini J, Nery JR, Lee L, Ye Z, Ngo Q, Edsall L, Antosiewicz-Bourget J, Stewart R, Ruotti V, Millar AH, Thomson JA, Ren B, Ecker JR: Human DNA methylomes at base resolution show widespread epigenomic differences. Nature. 2009, 462: 315-322. 10.1038/nature08514.
- Kaneda M, Okano M, Hata K, Sado T, Tsujimoto N, Li E, Sasaki H: Essential role for de novo DNA methyltransferase Dnmt3a in paternal and maternal imprinting. Nature. 2004, 429: 900-903. 10.1038/nature02633.
- Okano M, Xie S, Li E: Cloning and characterization of a family of novel mammalian DNA (cytosine-5) methyltransferases. Nat Genet. 1998, 19: 219-220. 10.1038/890.
- Song F, Smith JF, Kimura MT, Morrow AD, Matsuyama T, Nagase H, Held WA: Association of tissue-specific differentially methylated regions (TDMs) with differential gene expression. Proc Natl Acad Sci USA. 2005, 102: 3336-3341. 10.1073/pnas.0408436102.
- Weber M, Hellmann I, Stadler MB, Ramos L, Pääbo S, Rebhan M, Schübeler D: Distribution, silencing potential and evolutionary impact of promoter DNA methylation in the human genome. Nat Genet. 2007, 39: 457-466. 10.1038/ng1990.
- Reik W, Walter J: Genomic imprinting: parental influence on the genome. Nat Rev Genet. 2001, 2: 21-32.
- Hellman A, Chess A: Gene body-specific methylation on the active X chromosome. Science. 2007, 315: 1141-1143. 10.1126/science.1136352.
- Venolia L, Gartler SM, Wassman ER, Yen P, Mohandas T, Shapiro LJ: Transformation with DNA from 5-azacytidine-reactivated X chromosomes. Proc Natl Acad Sci USA. 1982, 79: 2352-2354. 10.1073/pnas.79.7.2352.
- Lander ES, Linton LM, Birren B, Nusbaum C, Zody MC, Baldwin J, Devon K, Dewar K, Doyle M, FitzHugh W, Funke R, Gage D, Harris K, Heaford A, Howland J, Kann L, Lehoczky J, LeVine R, McEwan P, McKernan K, Meldrim J, Mesirov JP, Miranda C, Morris W, Naylor J, Raymond C, Rosetti M, Santos R, Sheridan A, Sougnez C, et al: Initial sequencing and analysis of the human genome. Nature. 2001, 409: 860-921. 10.1038/35057062.
- Hedges DJ, Deininger PL: Inviting instability: transposable elements, double-strand breaks, and the maintenance of genome integrity. Mutat Res. 2007, 616: 46-59. 10.1016/j.mrfmmm.2006.11.021.
- Zamudio N, Bourc'his D: Transposable elements in the mammalian germline: a comfortable niche or a deadly trap?. Heredity. 2010, 105: 92-104. 10.1038/hdy.2010.53.
- Yoder JA, Walsh CP, Bestor TH: Cytosine methylation and the ecology of intragenomic parasites. Trends Genet. 1997, 13: 335-340. 10.1016/S0168-9525(97)01181-5.
- Bird A: Does DNA methylation control transposition of selfish elements in the germline?. Trends Genet. 1997, 13: 469-472. 10.1016/S0168-9525(97)01310-3.
- Bogerd HP, Wiegand HL, Doehle BP, Lueders KK, Cullen BR: APOBEC3A and APOBEC3B are potent inhibitors of LTR-retrotransposon function in human cells. Nucleic Acids Res. 2006, 34: 89-95. 10.1093/nar/gkj416.
- Hulme AE, Bogerd HP, Cullen BR, Moran JV: Selective inhibition of Alu retrotransposition by APOBEC3G. Gene. 2007, 390: 199-205. 10.1016/j.gene.2006.08.032.
- Hata K, Sakaki Y: Identification of critical CpG sites for repression of L1 transcription by DNA methylation. Gene. 1997, 189: 227-234. 10.1016/S0378-1119(96)00856-6.
- Bourc'his D, Bestor TH: Meiotic catastrophe and retrotransposon reactivation in male germ cells lacking Dnmt3L. Nature. 2004, 431: 96-99. 10.1038/nature02886.
- Aravin AA, Naumova NM, Tulin AV, Vagin VV, Rozovsky YM, Gvozdev VA: Double-stranded RNA-mediated silencing of genomic tandem repeats and transposable elements in the D. melanogaster germline. Curr Biol. 2001, 11: 1017-1027. 10.1016/S0960-9822(01)00299-8.
- Girard A, Sachidanandam R, Hannon GJ, Carmell MA: A germline-specific class of small RNAs binds mammalian Piwi proteins. Nature. 2006, 442: 199-202.
- Carmell MA, Girard A, van de Kant HJG, Bourc'his D, Bestor TH, de Rooij DG, Hannon GJ: MIWI2 is essential for spermatogenesis and repression of transposons in the mouse male germline. Dev Cell. 2007, 12: 503-514. 10.1016/j.devcel.2007.03.001.
- Kuramochi-Miyagawa S, Watanabe T, Gotoh K, Totoki Y, Toyoda A, Ikawa M, Asada N, Kojima K, Yamaguchi Y, Ijiri TW, Hata K, Li E, Matsuda Y, Kimura T, Okabe M, Sakaki Y, Sasaki H, Nakano T: DNA methylation of retrotransposon genes is regulated by Piwi family members MILI and MIWI2 in murine fetal testes. Genes Dev. 2008, 22: 908-917. 10.1101/gad.1640708.
- Shoji M, Tanaka T, Hosokawa M, Reuter M, Stark A, Kato Y, Kondoh G, Okawa K, Chujo T, Suzuki T, Hata K, Martin SL, Noce T, Kuramochi-Miyagawa S, Nakano T, Sasaki H, Pillai RS, Nakatsuji N, Chuma S: The TDRD9-MIWI2 complex is essential for piRNA-mediated retrotransposon silencing in the mouse male germline. Dev Cell. 2009, 17: 775-787. 10.1016/j.devcel.2009.10.012.
- Siddiqi S, Matushansky I: Piwis and piwi-interacting RNAs in the epigenetics of cancer. J Cell Biochem. 2012, 113: 373-380. 10.1002/jcb.23363.
- Qiao D, Zeeman A, Deng W, Looijenga LHJ, Lin H: Molecular characterization of hiwi, a human member of the piwi gene family whose overexpression is correlated to seminomas. Oncogene. 2002, 21: 3988-3999. 10.1038/sj.onc.1205505.
- Taubert H, Greither T, Kaushal D, Würl P, Bache M, Bartel F, Kehlen A, Lautenschläger C, Harris L, Kraemer K, Meye A, Kappler M, Schmidt H, Holzhausen H, Hauptmann S: Expression of the stem cell self-renewal gene Hiwi and risk of tumour-related death in patients with soft-tissue sarcoma. Oncogene. 2007, 26: 1098-1100. 10.1038/sj.onc.1209880.
- Sigurdsson MI, Smith AV, Bjornsson HT, Jonsson JJ: HapMap methylation-associated SNPs, markers of germline DNA methylation, positively correlate with regional levels of human meiotic recombination. Genome Res. 2009, 19: 581-589. 10.1101/gr.086181.108.
- Jähner D, Jaenisch R: Retrovirus-induced de novo methylation of flanking host sequences correlates with gene inactivity. Nature. 1985, 315: 594-597. 10.1038/315594a0.
- Lee JT: The X as model for RNA's niche in epigenomic regulation. Cold Spring Harb Perspect Biol. 2010, 2: a003749. 10.1101/cshperspect.a003749.
- International HapMap Consortium: A second generation human haplotype map of over 3.1 million SNPs. Nature. 2007, 449: 851-861. 10.1038/nature06258.
- Thomas DJ, Trumbower H, Kern AD, Rhead BL, Kuhn RM, Haussler D, Kent WJ: Variation resources at UC Santa Cruz. Nucleic Acids Res. 2007, 35: D716-D720. 10.1093/nar/gkl953.
- Kent WJ, Sugnet CW, Furey TS, Roskin KM, Pringle TH, Zahler AM, Haussler D: The human genome browser at UCSC. Genome Res. 2002, 12: 996-1006.
- Jurka J, Kapitonov VV, Pavlicek A, Klonowski P, Kohany O, Walichiewicz J: Repbase Update, a database of eukaryotic repetitive elements. Cytogenet Genome Res. 2005, 110: 462-467. 10.1159/000084979.
- Karolchik D, Hinrichs AS, Furey TS, Roskin KM, Sugnet CW, Haussler D, Kent WJ: The UCSC Table Browser data retrieval tool. Nucleic Acids Res. 2004, 32: D493-D496. 10.1093/nar/gkh103.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.