To our knowledge, this study represents the first investigation of genotyping error rates in a wildlife DNA register, and the first application of mixed models to examine how multiple factors, such as time, microsatellite marker and sample quality, jointly influence genotyping quality.
A major challenge with microsatellite data sets is sharing data between laboratories and comparing data from different analytical platforms. Despite the importance of these challenges, systematic shifts of allelic scores have, with few exceptions, not been thoroughly examined (e.g., [6, 15, 16]). Furthermore, to our knowledge no studies have investigated how calibration between laboratories over time influences the ability to produce calibrated data. Although nine true values of 221 were recorded as 220 at marker GATA417 in Lab 1, we did not detect any systematic shift of allelic values for any of the markers implemented in the NMDR. This was despite the fact that the analyses were conducted in four separate laboratories in three countries, over a period stretching almost a decade. We conclude that even though the genotyping for the NMDR has been conducted by several laboratories, and during a period in which genotyping platforms underwent significant technological change from gel-based to capillary-based electrophoresis instruments, systematic genotyping errors due to allele size calibration were not present. This demonstrates the importance of calibrating genotype scoring between laboratories, in addition to conducting blind proficiency tests before new laboratories take over an existing DNA register, as was done for the NMDR.
Assuming that genotyping errors are distributed identically and independently (or almost independently) across the markers within an individual is convenient when calculating genotyping error rates. Among studies utilizing simulations, it is therefore a common simplification [10, 24, 27, 28]. However, such an assumption is often unrealistic [7, 27, 29]. This is well illustrated by a study of genotyping errors in 510 loci, in which ten errors were detected, all occurring in the same individual. Within-individual dependencies like this can easily be modeled by increasing the standard deviations of the random effects MP:IND and IND in the mixed models (4).
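The consequence of such a within-individual random effect for error clustering can be illustrated with a small simulation. This is a sketch only, not the models fitted here: the intercept and σ values below are hypothetical, with the intercept chosen to give a per-locus error rate near 0.013, and the marker count of 10 is likewise illustrative.

```python
import math
import random

random.seed(1)

def simulate(sigma, n_ind=20000, n_loci=10, beta0=-4.3):
    """Simulate per-locus genotyping errors with an individual random
    effect b_i ~ N(0, sigma) on the logit scale (a stripped-down sketch
    of an IND-type effect; beta0 = -4.3 gives an error rate near 0.013).
    Returns P(2+ errors | at least one error) across individuals."""
    any_err = 0   # individuals with >= 1 erroneous locus
    multi = 0     # individuals with >= 2 erroneous loci
    for _ in range(n_ind):
        b = random.gauss(0, sigma)
        p = 1 / (1 + math.exp(-(beta0 + b)))
        k = sum(random.random() < p for _ in range(n_loci))
        any_err += k >= 1
        multi += k >= 2
    return multi / any_err

# Under independence (sigma = 0) a second error in the same individual is
# rare; with between-individual variability, errors cluster strongly.
print(simulate(0.0), simulate(2.0))
```

The simulation shows why an i.i.d. assumption understates the chance of multiple errors in the same individual: once one error is observed, that individual is likely one with a high random-effect value, and hence prone to further errors.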
Because all individual whales caught by Norway are required to be genotyped at all markers in the NMDR, samples of questionable quality cannot be disregarded or left as missing data, as is possible in many studies [8, 30]. Despite this, the overall error rate of 0.013 per locus in the NMDR is in concordance with the published literature on microsatellites from tissue samples [7–9, 31]. Still, the inclusion of poor-quality samples in the analysis is an error source beyond the control of any laboratory, and contributes to the standard deviation of IND, σ_IND, representing the effect of variability in sample quality between individuals, being greater than zero in all models featuring IND (Table 2).
We initially investigated an interaction effect between individual and multiplex (MP:IND), but it turned out to be superfluous when a model also contained an individual random effect (IND). Further, the inclusion of MP:IND alone in a model without IND did little to improve the model fit (Table 2). This leads to the conclusion that sample quality dominated over the mishandling of multiplex assays as a source of errors, although the same conclusion may not apply in other studies.
Among the fixed effects, YEAR was the most important. Because LAB and YEAR were confounded, we were unable to assess their individual impacts on the genotyping error rate. However, models including YEAR had the best AIC scores (Table 2). This can partly be explained by LAB being a less parsimonious representation of technological and procedural advances than YEAR, and partly by models including YEAR fitting the data more closely than models including LAB (Table 2). Given the considerable impact of sample quality (IND) relative to that of the mishandling of multiplex assays (MP:IND), we conclude that the importance of YEAR mostly reflects technological progress in instrumentation rather than procedural changes. Although the complete eradication of genotyping errors seems unlikely, we have documented a positive development over the last decade in the NMDR.
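The role of parsimony in this comparison follows directly from the definition of AIC, which trades goodness of fit against the number of parameters. The sketch below uses purely hypothetical log-likelihoods and parameter counts, not the values behind Table 2:

```python
def aic(loglik, n_params):
    """Akaike information criterion, AIC = 2k - 2*ln(L); lower is better."""
    return 2 * n_params - 2 * loglik

# Hypothetical comparison: a YEAR-like model with fewer parameters and a
# slightly higher log-likelihood beats a LAB-like model on AIC.
aic_year = aic(-1199.0, 5)
aic_lab = aic(-1200.0, 9)
print(aic_year, aic_lab)
```

Even if the two log-likelihoods had been equal, the model with fewer parameters would win on AIC, which is what "less parsimonious representation" means in practice.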
In addition to the time aspect, large variation in the genotyping error rate between markers was detected (Table 6). Such variation was expected, and was the reason for the positive impact of LOCUS on the model fit. It is also known that longer alleles may be more prone to genotyping errors than shorter ones [32–34], which is consistent with the trend seen here (Table 6). The relationship between allele size and error rate is not deterministic, however, as illustrated by locus EV001 harboring zero errors (Table 6).
Initially, the R function "lmer" was used to fit the mixed models. However, due to convergence problems on the bootstrap datasets, we switched to the R package "glmmADMB", which turned out to be more numerically robust but had the limitation that only a single random effect could be included at a time. This is why models including both IND and MP:IND were run in SAS.
Since the beginning of the millennium, the number of peer-reviewed articles mentioning genotyping errors has increased drastically. It has been discussed how best to obtain estimates of genotyping error rates [23, 31, 35], reduce the number of errors [7, 8] and statistically handle the uncertainty that necessarily accompanies errors [10, 27]. We have focused on how to accurately model the genotyping error rate. This is important both for understanding the underlying mechanisms behind errors, and for being able to use the data for statistical inference.
The presence of genotyping errors weakens the ability to accurately match individual samples to a DNA register. On average, there is a 17% chance of a mismatch between a true multilocus genotype and the corresponding genotype in the NMDR for data accumulated over a decade (Table 5). Tissue samples from all individuals are stored, so it is possible to analyze them again and thereby correct suspected errors. This is useful, e.g., in a legal setting, where high confidence in the validity of the genotypes is imperative in order to take legal action, and only a few samples are involved.

The use of genetic tagging to obtain abundance estimates [36–39] and to monitor populations [19, 40–42] is widespread. For such applications, reanalyzing all close mismatches (samples matching at all but a few markers) may not be feasible for financial or other reasons. If a rule is applied that allows close mismatches to count as recaptures, the within-individual dependence structure described by σ_IND is of great importance. Assuming independence (σ_IND = 0) when this is not the case, the probability of additional errors occurring in individuals harboring at least one error will potentially be grossly underestimated (Table 5). Genotyping errors may also strongly influence the outcome of parentage analysis [10, 43]. As with individual identification, one can compensate by assigning parentage even if a candidate parent–offspring pair does not share at least one allele at every marker [44, 45]. A somewhat related matter is the degree of relatedness represented by the LOD score [46, 47], used in, e.g., [48–50] to make inferences about population structure and size from identified close kin. In both cases, it matters whether a few individuals contain many genotyping errors or the errors are more evenly spread.
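The magnitude of the mismatch problem, and the benefit of tolerating close mismatches, can be sketched under an independence assumption (σ_IND = 0). The per-locus rate of 0.013 is from this study, whereas the marker count of 14 is an illustrative guess that happens to roughly reproduce the reported 17% figure, not the actual NMDR panel size:

```python
import math

def p_mismatch(e, n_loci, allowed=0):
    """P(a recorded genotype differs from the true genotype at more than
    `allowed` loci), assuming independent per-locus errors at rate e."""
    p_accept = sum(math.comb(n_loci, k) * e**k * (1 - e)**(n_loci - k)
                   for k in range(allowed + 1))
    return 1 - p_accept

# Exact matching vs. tolerating one mismatching locus (illustrative):
print(p_mismatch(0.013, 14, 0))  # roughly 0.17: many true matches rejected
print(p_mismatch(0.013, 14, 1))  # far smaller once one mismatch is allowed
```

Note that with σ_IND > 0 errors cluster within individuals, so conditional on one mismatch further mismatches are more likely than this independent calculation suggests, and a close-mismatch rule will reject true recaptures more often than computed here (Table 5).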