Common variants at the 9q22.33, 14q13.3 and ATM loci, and risk of differentiated thyroid cancer in the Cuban population

Background The incidence of differentiated thyroid carcinoma (DTC) in Cuba is low and the contribution of host genetic factors to DTC in this population has not been investigated so far. Our goal was to assess the role of known risk polymorphisms in DTC cases living in Havana. We genotyped five polymorphisms located at the DTC susceptibility loci on chromosome 14q13.3 near NK2 homeobox 1 (NKX2-1), on chromosome 9q22.33 near Forkhead factor E1 (FOXE1) and within the DNA repair gene Ataxia-Telangiectasia Mutated (ATM) in 203 cases and 212 age- and sex-matched controls. Potential interactions between these polymorphisms and other DTC risk factors such as body surface area, body mass index, size, ethnicity and, for women, parity were also examined. Results Significant association with DTC risk was found for rs944289 near NKX2-1 (OR per A allele = 1.6, 95% CI: 1.2–2.1), and for three polymorphisms near or within FOXE1, namely rs965513 (OR per A allele = 1.7, 95% CI: 1.2–2.3), rs1867277 in the promoter region of the gene (OR per A allele = 1.5, 95% CI: 1.1–1.9) and the poly-alanine tract expansion polymorphism rs71369530 (OR per Long Allele = 1.8, 95% CI: 1.3–2.5); only the latter two remained significant after correction for multiple testing. Overall, no association was seen between DTC and the coding SNP D1853N (rs1801516) in ATM (OR per A Allele = 1.1, 95% CI: 0.7–1.7). Nevertheless, women who had 2 or more pregnancies had a 3.5-fold increase in risk of DTC if they carried the A allele (OR 3.5, 95% CI: 1.2–9.8), whereas the OR was 0.8 (95% CI: 0.4–1.6) in those who had fewer than 2. Conclusions We confirmed in the Cuban population the role of the loci previously associated with DTC susceptibility in European and Japanese populations through genome-wide association studies. Our results on ATM and the number of pregnancies raise interesting questions on the mechanisms by which oestrogens, or other hormones, alter the DNA damage response and DNA repair through the regulation of key effector proteins such as ATM. Due to the small size of our study and to multiple tests, all these results warrant further investigation.


Background
Cuba is the largest island in the Caribbean Sea, with an estimated population of over 11 million people according to the 2012 census [1]. The Cuban population has a mixed ethnic composition, with a large proportion of people of African or Spanish origin. Before the Spanish colonisation, Cuba was occupied by Native Americans who had migrated from the mainland of North, Central and South America several centuries before [2]. Today, Afro-Cubans, mostly from Congo, account for 35% of the population [1]. The registered incidence of DTC is low in Cuba, being 4.1 per 100,000 in females and 1.0 in males [1,3].
In a previous case-control study on thyroid cancer risk factors conducted in Cuba [4], we showed that DTC risk was lower in populations of African origin, was increased with parity and body surface area, and was higher in farmers than in people pursuing other types of activities. A history of ionizing radiation, agricultural occupation and an artesian well as the main source of drinking water were also associated with a significantly increased risk of developing DTC. In women, irregular cycles and menopause status were associated with a higher risk of DTC. On the other hand, DTC risk was lower in current or former smokers than in non-smokers [4].
The contribution of host genetic factors to DTC susceptibility has never been assessed in the Cuban population. Since NKX2-1 (NK2 homeobox 1, also called TTF1 for Thyroid Transcription Factor 1), FOXE1 (Forkhead factor E1, also called TTF2 for Thyroid Transcription Factor 2) and ATM (Ataxia-Telangiectasia Mutated) have been associated with DTC in other populations and are compelling candidates due to their roles in thyroid development or response to DNA damage, we chose to assess the contribution of genetic variations in or near these three genes to the risk of DTC in the Cuban population.
NKX2-1 and FOXE1 encode thyroid-specific transcription factors that play an important role in thyroid development and whose expression is modified in thyroid tumours [5][6][7]. The first thyroid cancer genome-wide association study (GWAS) reported the contribution of two SNPs near these two genes, namely rs944289, located 337 kb upstream of NKX2-1 on chromosome 14q13.3, and rs965513, located 57 kb upstream of FOXE1 on chromosome 9q22.33, to the risk of developing DTC in the European population [8]. Subsequently, the relationship between these two loci and DTC susceptibility has been investigated in other populations. These associations varied with ethnic background, and FOXE1 polymorphisms were more strongly correlated with the pathogenesis of papillary thyroid carcinoma (PTC) than NKX2-1 polymorphisms [9][10][11][12]. In particular, two functional polymorphisms in FOXE1 appeared to be of specific interest: rs1867277, located within the 5′ untranslated region (UTR) and involved in the allele-specific transcriptional regulation of FOXE1 through recruitment of the USF1/USF2 transcription factors [13][14][15][16], and rs71369530, the poly-alanine expansion in the FOXE1 coding region [17,18].
ATM is a key initiator of the DNA damage response and some ATM SNPs have been reported to play a role in hormone-dependent cancers and radiation sensitivity [19]. In particular, the common missense substitution D1853N (rs1801516) has been shown to play a role in DTC risk following irradiation [16,20], but the association with sporadic PTC was not replicated in a meta-analysis [11].

Results
For each genotyped polymorphism, allele and genotype frequencies were calculated and Hardy-Weinberg equilibrium (HWE) was tested in the studied sample set. The five polymorphisms were in HWE among the analysed control subjects (Table 1). We noted that in the control population the minor allele frequency (MAF) of rs944289 near NKX2-1 was significantly lower in subjects of African origin than in the others (p = 0.03). No significant difference in MAF was observed for the other tested polymorphisms between the different ethnic groups (p > 0.3, whatever the polymorphism).
When stratifying by sex and age, and adjusting for body surface area (BSA), body mass index (BMI), size, ethnicity, tobacco consumption and, for women, number of pregnancies, all tested SNPs except rs1801516 in ATM (D1853N) were found to be associated with an increased risk of DTC in the Cuban population (Table 2). ORs per minor allele ranged from 1.5 (95% CI: 1.1-1.9) for rs1867277, located in the 5′UTR of FOXE1, to 1.8 (95% CI: 1.3-2.5) for the length polymorphism rs71369530 in the coding sequence of FOXE1 (Table 2).
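To illustrate how a per-allele OR and its confidence interval relate to a logistic regression coefficient, the minimal Python sketch below converts a log-odds coefficient and its standard error into an OR with a 95% CI. It assumes a Wald-type interval, which the text does not state. The function names are hypothetical, and the standard error is back-calculated from the published interval for rs71369530 rather than taken from the study data.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def or_with_ci(beta, se, z=Z_95):
    """Return (OR, lower, upper) from a log-odds coefficient and its SE.

    Assumes a Wald-type interval: exp(beta +/- z * se).
    """
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def se_from_ci(lower, upper, z=Z_95):
    """Back-calculate the SE of the log-OR from a reported Wald-type interval."""
    return (math.log(upper) - math.log(lower)) / (2 * z)


# Reported for rs71369530: OR per Long allele = 1.8, 95% CI: 1.3-2.5.
# The SE below is reconstructed from rounded values, so it is approximate.
se = se_from_ci(1.3, 2.5)
odds_ratio, lower, upper = or_with_ci(math.log(1.8), se)
print(f"OR = {odds_ratio:.1f}, 95% CI: {lower:.1f}-{upper:.1f}")
```

Because the published OR and interval are rounded to one decimal, the reconstruction is only expected to agree with them to that precision.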
A systematic investigation of potential interactions between the five polymorphisms and ethnicity, BMI, BSA, size, tobacco consumption and, for women, parity (Tables 3 and 4) revealed a suggestive interaction (p = 0.03) between the ATM coding SNP and the number of pregnancies. Women who had 2 or more pregnancies had a 3.5-fold (95% CI: 1.2-9.8) increase in risk of DTC if carrying the A allele, whereas this OR was 0.8 (95% CI: 0.4-1.6) in women who had had no pregnancy or only one (Table 3).

Discussion
In the present work, we assessed the relationship between five putative or recognized susceptibility polymorphisms and DTC risk in the Cuban population, where the incidence of thyroid cancer is particularly low [4].
We replicated the association between polymorphisms at the NKX2-1 and FOXE1 loci and DTC risk previously reported in GWAS in European [8] and Japanese [14] populations. When multiple testing is taken into account using the Bonferroni correction [19], the significance threshold for the p-value is 0.01, and only 2 of the 3 tested SNPs at the FOXE1 locus remained significant.
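The threshold quoted above is consistent with dividing a family-wise error rate of 0.05 by the five polymorphisms tested. The short Python sketch below shows that calculation. Both the 0.05 level and the count of five tests are assumptions inferred from the text, and the function names are hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def is_significant(p_value, threshold):
    """A test passes only if its p-value falls below the corrected threshold."""
    return p_value < threshold


FAMILY_ALPHA = 0.05   # assumed family-wise error rate (not stated in the text)
N_POLYMORPHISMS = 5   # five polymorphisms were genotyped

threshold = bonferroni_threshold(FAMILY_ALPHA, N_POLYMORPHISMS)
print(f"Bonferroni threshold: {threshold:.3f}")
```

Under this reading, a nominal p-value such as 0.03 would count as suggestive but would not pass the corrected threshold.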
The power of the present study was low for the ATM SNP rs1801516 because of its low MAF (11%) in controls. For this SNP, a power of 80% is reached only for an OR of 3.5 or higher. For the other tested SNPs, with MAFs of about 20% in controls (range: 15% to 27%, Table 1), our study had a power of 80% to detect an association if the OR is 1.7 or higher when multiple testing is not taken into account, and if the OR is 2.0 or higher when it is. Given the size of our study, only strong interactions could be detected: a power of 80% is reached for gene-environment interactions of a factor of 3, assuming an environmental factor present in 50% of controls, a main OR for the environmental factor equal to 2, a SNP MAF equal to 20% and a main OR per minor allele equal to 1.5. All these figures are given without correction for multiple testing. In addition to its low power, our study suffers from the traditional limitations of case-control studies. Although size, number of pregnancies and smoking habits were probably well reported by subjects, we cannot verify that cases correctly reported their weight before thyroid cancer, as specified in the questionnaire, rather than their weight at the time of interview.
Although we did not find an association between ATM D1853N (rs1801516) and DTC risk in the whole study set, a significant (p = 0.03) interaction was found in women between this polymorphism and the number of pregnancies, which is another known risk factor for DTC [4]. In the Cuban study the minor allele (A) was significantly associated with a 3.5-fold increased risk of DTC among women who had had two or more pregnancies.
Interestingly, as observed in the Cuban population [4], an increased risk of DTC with increasing number of pregnancies had been observed in natives of French Polynesia (OR = 3.1, 95% CI: 1.2-8.3) [21], where the average number of children is very high (about 4 children per woman in the controls). In the French Polynesian study the minor allele A was quite rare in the population (2% in controls), but it was associated with a significantly increased risk of DTC (Maillard et al. submitted). These observations raise interesting questions about the biological role of ATM, and possibly of other DNA repair genes, in the development of hormone-related cancers. The Ser/Thr protein kinase ATM is primarily known as a central element of the cellular response to double-strand break (DSB) lesions. DNA DSBs can be generated by DNA damaging agents, such as ionizing radiation, following the collapse of stalled replication forks or the response to uncapped telomeres. Unrepaired DSBs can severely disrupt DNA replication in proliferating cells, usually leading to cell death, or leave chromosomal aberrations leading to cancer formation. In previous studies on radiation-induced PTC or sporadic PTC, the missense substitution D1853N in ATM had been associated with a decreased risk [16,22]. More recently, it has been reported that this conserved variant, which falls just upstream of the FAT kinase domain [23], may modify the genetic susceptibility to DTC and its clinical manifestation in carriers of a rare BRCA1 pathogenic variant. In particular, both ATM rs1801516 and BRCA1 rs16941 variants modify the impact of male gender on clinical variables [24]. An emerging hypothesis is that ATM is exploited in undamaged cells in signalling pathways other than DSB repair, in response to various stimuli or physiological situations such as hormonal exposure [25].
Table 3 Results of interaction tests between genetic factors and other putative risk factors for DTC
One could also hypothesize that oestrogen could contribute to DTC via the induction of DNA damage. For instance, in breast cancer it has been proposed that oestrogen receptor signalling converges to suppress effective DNA repair and apoptosis in favour of proliferation [26]. A variation in breast cancer risk associated with parity has been evidenced according to the type of mutation in the DNA repair gene BRCA1, acting in the same pathway as ATM [27]. Hence, further studies are warranted to better understand the role of ATM in hormone-related cancers such as DTC.

Conclusions
We confirmed in the Cuban population the role of the loci that have been previously associated with DTC susceptibility in European and Japanese populations through genome-wide association studies. Moreover, our result on ATM and the number of pregnancies raises interesting questions on the mechanisms by which oestrogens, or other hormones, alter the DNA damage response and DNA repair through the regulation of key effector proteins such as ATM. Due to the small size of our study and to multiple testing, all these results warrant further investigation in a larger sample set.

Methods
This case-control study was carried out in Havana, Cuba, and was reviewed and approved by the Clinical Research Ethics Committee of the National Institute of Oncology (INOR), Havana, Cuba. Informed written consent was obtained from all study participants.

Subjects selection and interviews
The cases and controls selection process as well as the case-control study methodology have been described elsewhere [4]. In brief, cases lived in the Havana area, were between 18 and 50 years old at time of DTC diagnosis and had been treated between 2000 and 2011 at INOR or at the Institute of Endocrinology of Havana. Of the 240 eligible DTC cases, 37 (15%) individuals were not interviewed because they could not be located (n = 32) or refused to participate (n = 5). The final study population consisted of 203 cases. Controls were selected from the general population living in the same areas using consultation files from primary care units (family doctors). They were frequency-matched with cases by age at cancer diagnosis (±5 years) and gender. Of the 229 potential controls, 17 refused and 212 agreed to be interviewed.
All 415 participants were interviewed face-to-face by trained professionals (nurses and medical staff) using a structured questionnaire between January 2009 and December 2011, in the presence of a parent, a relative, or a general practitioner. Case and control characteristics are described in Table 5. All participants gave their consent for saliva sampling and genetic analyses.

DNA isolation
Saliva samples were collected using a DNA Genotek Oragene DNA collection kit (Ottawa, Canada). Genomic DNA (gDNA) was extracted using a standard inorganic method (Qiagen Autopure LS, Courtaboeuf, France). The gDNA was then quantified with the Life Technologies Picogreen kit (Saint-Aubin, France). For the genotyping, DNA from study participants was randomized on plates and all samples were analysed simultaneously. For quality control purposes, duplicates of 10% of the samples were interspersed throughout the plates.

Genotyping
Five polymorphisms that were observed in previous studies to be associated with DTC were selected for genotyping: the nonsynonymous SNP rs1801516 (D1853N) in ATM, the GWAS SNP rs944289 near PTCSC3 and NKX2-1 at 14q13.3, the GWAS SNP rs965513 near FOXE1 at 9q22.33, rs1867277 in the 5′UTR of FOXE1, and the poly-alanine stretch polymorphism rs71369530 in FOXE1 that is the result of a variable number of alanine repeats.
For SNPs rs944289, rs965513, rs1867277, and rs1801516, 25 ng of gDNA were analysed using High-Resolution Melting curve (HRM) analysis with a specific probe. Some representative samples were re-sequenced by dye-terminator sequencing to confirm the genotype [28]. Fluorescence readings and data analyses were done with the LightScanner Hi-Res Melting System (Idaho Technology Inc., Salt Lake City, UT).
For rs71369530, 30 ng of gDNA was amplified by PCR with fluorescently end-labelled forward primers (5′-6-FAM or 5′-HEX) using KAPA 2G Fast HotStart ReadyMix (KAPA Biosystems, Woburn, MA, US) in a 10 μl final reaction volume (0.5 mM MgCl2, 5% DMSO, 0.25 mM primers). The fluorescently-labelled PCR product was loaded on an ABI 3730 capillary sequencer and analysed as a variable length fragment polymorphism using GenScan size standards (ROX-500) as internal size standards. Data were collected and visualized with Genotyper Software v3.7. To determine the number of repeats corresponding to each allele identified in the genotyping assay, the PCR products from 6 homozygous individuals were Sanger sequenced.
The sequences of all PCR primers, HRM probes, and all PCR conditions are available from the authors on request.

Statistical analyses
For the statistical analyses, the study participants were classified into three categories according to the ethnicity of their parents: European (both parents of European origin), African (both parents of African origin), and other (all other combinations of parental origin). Body mass index (BMI) was defined as weight (kg) divided by height (m) squared, and body surface area (BSA) was calculated using the Boyd formula: $\mathrm{BSA}\,(\mathrm{m}^2) = 0.0003207 \times \mathrm{weight}^{\,0.7285 - 0.0188 \times \log_{10}(\mathrm{weight})} \times \mathrm{height}^{0.3}$, where weight is expressed in g and height in cm [29]. Quantitative factors were categorised into tertiles based on their distribution among the controls. Anthropometric categorisation was defined separately for men and for women.
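The two anthropometric indices defined above can be sketched in a few lines of Python. This is an illustrative implementation under stated assumptions, not the authors' code: the function names are hypothetical, and the logarithm in the Boyd formula is taken to be base 10, as in the usual statement of that formula.

```python
import math


def bmi(weight_kg, height_m):
    """Body mass index: weight (kg) divided by height (m) squared."""
    return weight_kg / height_m ** 2


def bsa_boyd(weight_g, height_cm):
    """Body surface area (m^2) by the Boyd formula.

    Weight is in grams and height in centimetres; the base-10
    logarithm is assumed for the weight-dependent exponent.
    """
    exponent = 0.7285 - 0.0188 * math.log10(weight_g)
    return 0.0003207 * weight_g ** exponent * height_cm ** 0.3


# Hypothetical subject: 70 kg, 170 cm
print(f"BMI = {bmi(70.0, 1.70):.1f} kg/m^2")
print(f"BSA = {bsa_boyd(70_000.0, 170.0):.2f} m^2")
```

Note the unit change between the two functions: BMI takes kilograms and metres, whereas the Boyd formula takes grams and centimetres.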
Allele and genotype frequencies were calculated and HWE was tested using a χ² test in the studied sample set for each polymorphism. The five SNPs were in HWE among the analysed control subjects (Table 1).
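A Hardy-Weinberg test of this kind can be sketched as follows for a bi-allelic marker. The sketch assumes a Pearson χ² statistic with one degree of freedom and no continuity correction, details that the text does not specify. The function name and the genotype counts in the example are made up for illustration.

```python
import math


def hwe_chi2(n_aa, n_ab, n_bb):
    """Return (chi2, p_value) for a Hardy-Weinberg test on genotype counts.

    Expected counts come from the observed allele frequencies; the p-value
    uses the chi-square distribution with 1 degree of freedom.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    if min(expected) == 0:
        raise ValueError("monomorphic marker: HWE test is undefined")
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of chi-square with 1 df: erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Made-up genotype counts, not taken from Table 1
chi2, p_value = hwe_chi2(140, 64, 8)
print(f"chi2 = {chi2:.2f}, p = {p_value:.2f}")
```

A marker would be considered in HWE when the p-value exceeds the chosen significance level.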
For the genotype analysis of the FOXE1 multi-allelic poly-alanine stretch length polymorphism (rs71369530), we considered a bi-allelic marker with the three possible genotypes according to the length of the alanine tract: Short/Short, Short/Long and Long/Long, with short alleles (S) including alleles coding for a stretch of 12-14 alanines and long alleles (L) comprising those alleles coding for a stretch of 16-19 alanines.
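The recoding of the multi-allelic repeat into a bi-allelic marker can be sketched as below. The function names are hypothetical, and raising an error for repeat counts outside the two stated ranges (for example 15 alanines) is an assumption, since the text does not say how such alleles would be handled.

```python
SHORT_RANGE = range(12, 15)  # alleles coding for 12-14 alanines
LONG_RANGE = range(16, 20)   # alleles coding for 16-19 alanines


def classify_allele(n_alanines):
    """Map an alanine repeat count to 'Short' or 'Long'."""
    if n_alanines in SHORT_RANGE:
        return "Short"
    if n_alanines in LONG_RANGE:
        return "Long"
    # Assumption: repeat counts outside both ranges are rejected
    raise ValueError(f"repeat count {n_alanines} is outside the defined classes")


def long_allele_count(allele_1, allele_2):
    """Number of Long alleles (0, 1 or 2), usable as a per-allele covariate."""
    classes = (classify_allele(allele_1), classify_allele(allele_2))
    return classes.count("Long")


def genotype(allele_1, allele_2):
    """Return the bi-allelic genotype label for a pair of repeat counts."""
    labels = ("Short/Short", "Short/Long", "Long/Long")
    return labels[long_allele_count(allele_1, allele_2)]


print(genotype(14, 16))
```

The Long allele count is the quantity that would enter a co-dominant (per-allele) model, while the dominant and recessive models collapse the three genotypes into two groups.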
Nineteen strata were defined based on age and gender, seven for men and twelve for women. The association between the five analysed polymorphisms and risk of DTC was assessed using multiple logistic regressions and assuming co-dominant, dominant, and recessive genetic models of inheritance [30,31]. Crude analyses and analyses adjusted for environmental thyroid cancer risk factors were performed. Tests for interaction were performed to determine whether the putative associations of SNPs with the