Composite likelihood estimation of demographic parameters

Background Most existing likelihood-based methods for fitting historical demographic models to DNA sequence polymorphism data to do not scale feasibly up to the level of whole-genome data sets. Computational economies can be achieved by incorporating two forms of pseudo-likelihood: composite and approximate likelihood methods. Composite likelihood enables scaling up to large data sets because it takes the product of marginal likelihoods as an estimator of the likelihood of the complete data set. This approach is especially useful when a large number of genomic regions constitutes the data set. Additionally, approximate likelihood methods can reduce the dimensionality of the data by summarizing the information in the original data by either a sufficient statistic, or a set of statistics. Both composite and approximate likelihood methods hold promise for analyzing large data sets or for use in situations where the underlying demographic model is complex and has many parameters. This paper considers a simple demographic model of allopatric divergence between two populations, in which one of the population is hypothesized to have experienced a founder event, or population bottleneck. A large resequencing data set from human populations is summarized by the joint frequency spectrum, which is a matrix of the genomic frequency spectrum of derived base frequencies in two populations. A Bayesian Metropolis-coupled Markov chain Monte Carlo (MCMCMC) method for parameter estimation is developed that uses both composite and likelihood methods and is applied to the three different pairwise combinations of the human population resequence data. The accuracy of the method is also tested on data sets sampled from a simulated population model with known parameters. Results The Bayesian MCMCMC method also estimates the ratio of effective population size for the X chromosome versus that of the autosomes. The method is shown to estimate, with reasonable accuracy, demographic parameters from three simulated data sets that vary in the magnitude of a founder event and a skew in the effective population size of the X chromosome relative to the autosomes. The behavior of the Markov chain is also examined and shown to convergence to its stationary distribution, while also showing high levels of parameter mixing. The analysis of three pairwise comparisons of sub-Saharan African human populations with non-African human populations do not provide unequivocal support for a strong non-African founder event from these nuclear data. The estimates do however suggest a skew in the ratio of X chromosome to autosome effective population size that is greater than one. However in all three cases, the 95% highest posterior density interval for this ratio does include three-fourths, the value expected under an equal breeding sex ratio. Conclusion The implementation of composite and approximate likelihood methods in a framework that includes MCMCMC demographic parameter estimation shows great promise for being flexible and computationally efficient enough to scale up to the level of whole-genome polymorphism and divergence analysis. Further work must be done to characterize the effects of the assumption of linkage equilibrium among genomic regions that is crucial to the validity of applying the composite likelihood method.


Background
The availability of whole-genome polymorphism data offers both great opportunities and tremendous challenges to the study of population genetics. Complete genotype information from populations allows increased resolution of parameters in complex evolutionary or demographic models. The challenge is to develop computational methods that permit the efficient use of such large-scale datasets. Likelihood-based coalescent methods have proven very flexible for the analysis of DNA sequence polymorphism. However full likelihood methods, such as Markov chain Monte Carlo (MCMC) and Importance Sampling (IS), are not efficient enough to scale up to genome-wide datasets, necessitating the use of approximate methods for estimating likelihoods. One problem with existing MCMC and IS methods is that a proposal function must be employed to efficiently search among candidate coalescent histories. To circumvent this problem, approximate likelihood methods have proven useful. This class of methods reduces the dimensionality of a full DNA polymorphism dataset to a set of summary statistics, thereby also reducing the number of coalescent histories that need to be sampled to obtain an estimate of the likelihood.
One potential drawback of approximate likelihood methods is that a significant amount of information contained in the original data may be lost. A second problem with full MCMC and IS methods is that integrating over the entire space of possible histories for partially linked polymorphisms along a chromosome can quickly become computationally intractable. In this regard, composite likelihood has been shown to be a promising method for the analysis of partially linked polymorphisms [1][2][3]. Using this method, the likelihood function is computed marginally for each polymorphism (or contiguous sets of linked polymorphisms) and their product is taken to be an approximation of the full likelihood [4]. Because composite likelihood methods are found to yield consistent estimators of population parameters when the number of regions examined becomes very large [5,6], they may be particularly applicable to whole-genome datasets.
One of the commonly used class of models in population genetics aims to quantify divergence time by measuring the genetic distance between two populations or species. Yet many measures of genetic distance are susceptible to biases introduced by non-equilibrium conditions during the histories of the populations. Specifically, evolutionary forces that reduce within-population variation are known to inflate measures of genetic distance; such forces may include natural selection [7] or temporal fluctuations in the effective population size [8][9][10]. In contrast to the locus-specific effects of natural selection, fluctuations in effective size, such as population bottlenecks, are expected to influence the frequencies of alleles throughout the entire genome and therefore should be readily detectable using genome-wide polymorphism data. Thus, it is desirable to develop methods that can not only estimate divergence time from genome-wide polymorphism data, but can also simultaneously account for non-equilibrium demographic events, such as population bottlenecks.
One novel implementation of a coalescent-based method that simultaneously estimates divergence time between two populations and accounts for population bottlenecks is described by Li and Stephan [11]. This method achieves the necessary computational economies by summarizing two-population polymorphism data in the form of the joint frequency spectrum. The joint frequency spectrum is a two-dimensional matrix whose elements are the frequencies of the derived nucleotide allele in a joint sample from two populations or species. Using the joint frequency spectrum, mutations can be classified as either fixed, shared, or exclusive to one of the populations [12]. Li and Stephan [11] estimate divergence time and bottleneck parameters from a joint frequency spectrum constructed from 250 X-linked loci, representing samples of African and non-African populations of Drosophila melanogaster.
While the approach of Li and Stephan [11] does provide an economical method for fitting a parameter-rich population divergence model to a large polymorphism dataset, it can nonetheless be further economized and extended. Because the authors consider linkage disequilibrium among polymorphisms within loci, the mutation rate per locus must be included as a parameter. In contrast, by adopting a composite likelihood approach and assuming linkage equilibrium among polymorphism loci, it is possible to eliminate the mutation rate as a parameter, similar to a recent approach by Hernandez et al. [13]. Lastly, Li and Stephan [11] used a maximum likelihood method that evaluates a fixed set of proposed parameter values for their model. This approach does not capitalize on advances in Bayesian MCMC methods for model parameter estimation. The present study extends the approach of Li and Stephan [11] by both eliminating the mutation rate as a nuisance parameter and implementing a Bayesian MCMC approach that takes advantage of multiprocessor/ multicore computer architecture. The proposed method is tested for accuracy using simulated joint frequency spectra and is then applied to three large autosomal and X-linked resequence datasets from African, European, Asian, and Oceanian human populations [14]. Three pairwise analyses of the populations are performed to estimate the parameters of an "Out-of-Africa" bottleneck model (Figure 1), paying particular attention to the effective population size of the X chromosome versus the autosomes.

Behavior of the Markov Chain
To gain confidence regarding the convergence of Markov chains to their stationary distribution, it is important that the chains mix well and also that independent runs of the chains converge to the same posterior probability distribution. The mixing of independently seeded chains is assessed by measuring the autocorrelation of parameter values accepted from the prior probability distributions. Autocorrelations are measured at lag intervals from 1 to 50. Table 1 presents the autocorrelations, at lag 50 (ρ 50 ) for each parameter value, over all ten replicate runs. For each of the six different datasets shown in the table, the two time parameters show the weakest mixing behavior. Similarly, for each dataset, the two time parameters showed the highest levels of cross-correlation, ranging between -0.2 and -0.4 (data not shown). Interestingly, the parameter that shows the best mixing behavior is the ancestral population size scaling factor α 3 . The potential scale reduction factor (PRSF) and the upper 97.5% quantile of the PRSF distribution are all very close to unity (Table 1) for every dataset, except simulated data set G, which differs from the others in that much longer divergence times are involved. A PRSF value significantly greater than one implies that chains must be run longer to achieve convergence to the stationary distribution.

Simulated Datasets
The performance of the method under a true population divergence model is assessed using twelve simulated joint frequency spectra. Table 2 lists the parameter values for each of the twelve simulated data sets presented here. The simulated data sets are intended to represent both recent population bottlenecks (A-F), as well as older population bottlenecks (G-L). The duration of the reduction phase of the bottleneck is the same in all of the simulated data sets, however, populations either experience a ten-fold reduction or no reduction at all, in which case the model reduces to one of pure population growth (C, F, I, L). Similarly, during the recovery phase of the bottlenecks, populations can grow by either 100-fold or 1000-fold. Lastly, the ratio of the X chromosome effective population size to that of the autosomes varies between three-quarters (expected if there are an equal number of breeding males and females) and unity (a 7:1 ratio of reproducing females to males).
The posterior probability distributions shown in Figure 2 illustrate several consistencies, as well as several systematic biases in the MCMCMC estimation procedure. For both recent and ancient population bottlenecks, the time of recovery (t 1 ) is estimated accurately. However, the duration of the bottleneck (t 2 ) tends to be slightly, and consistently, overestimated when the bottleneck occurred 2N 1 generations ago (data sets G-L). When the bottleneck is recent, the MCMC method tends to systematically underestimate the current effective population size of population 2 (α 1 ), regardless of whether it is 100 or 1000 times that of the founding population size. However, underestimation does not appear to be a problem for the data sets obtained from simulations of an older bottleneck time. Also, the size of the founder population (α 2 ) tends to be consistently overestimated when the bottleneck is ancient. In all cases, the ancestral population size (α 3 ) is estimated accurately, which is compatible with the results of Becquet and Przeworski [15]. Lastly, the ratio of the effective population size of the X chromosome to that of the autosomes (h) is estimated accurately, however, in most cases, the 95% HPD interval includes the value of the parameter expected under an alternate case of interest.
For example, Figure 2 shows that the median values of tend to slightly overestimate the true value of h = 1, and that, in all but four cases (D, E, J, and K), the 95% HPD interval includes 0.75 when the true h value is unity and h Demographic model Figure 1 Demographic model. A schematic of the two population divergence model that is fit to the joint frequency spectrum. Looking forward in time, the ancestral population splits t 1 + t 2 generations before the present into two descendant populations. At this time, the effective size of population 1 is assumed to be N 1 and the founding size of population 2 is assumed to be α 2 N 1 . Then, after t 2 generations, population 2 grows to effective size α 1 N 1 . Lastly, the effective size of the common ancestral population is assumed to be α 3 N 1 . Thus, the divergence model is governed by five parameters that need to estimated: t 1 , t 2 , α 1 , α 2 and α 3 .
also when the true value of h is three-fourths the 95% HPD interval includes unity. This observation suggests that the MCMCMC method may not always have the adequate power to reject the null hypothesis that h = 3/4. The effect that analyzing a larger data set may have on this power remains to be investigated.

Estimates of Human Bottleneck Parameters
The quantiles of the marginal posterior probability distributions obtained by applying the method to the resequence data of Wall et al. [14] are shown in Figure 3, with each of the three comparisons between continental human populations shown side by side. Also, Table 3 provides the numerical values for the median estimated parameter values and the corresponding 95% HPD intervals. It should be first noted that none of these results are consistent with a population bottleneck model. In each of the three comparisons, the ancestral effective population size is estimated to be twice that of the current Mandenka effective population size and the median estimated values of neither α 1 or α 2 are greater than one. Figure 4 plots the joint posterior probability distributions of α 1 and α 2 for simulated bottleneck data set A and the three empirical resequencing data sets. These joint distributions confirm that the method accurately detects recent population growth from data simulated under a bottleneck model, but is also unable to support recent population growth for the data of Wall et al. [14]. The resequence data are consistent with a model of a reduced effective population size with no subsequent expansion for any of the four sampled human populations.
The divergence time of African and non-African populations (t d ) is consistent across comparisons. The estimated median Africa-Asia divergence time is 0.1010 × 2N 1 with The behavior of the Markov chains for all datasets, as determined by ten independent runs of the chain for each dataset. Three statistics are given: ρ 50 is the autocorrelation of each parameter at lag 50 steps, PSRF refers to the potential scale reduction factor for each parameter, and Upper PRSF is the 97.5% quantile of the PRSF. Multivariate PRSF values are not given. The resequencing data sets of Wall et al. [14] are abbreviated as Africa-Asia (AA), Africa-Europe (AE), and Africa-Oceania (AO). Additionally, autocorrelation values are also presented for three representative simulated data sets (A, D, and G).  Simulated data estimates Figure 2 Simulated data estimates. Accuracy of parameter estimation for the twelve parameters in the divergence model. For each parameter, the ratio of the estimated median posterior probability to the "true" value of the parameter in the simulation. The horizontal gray lines delineate a ratio of unity. The heavy lines in the box plots are the median, the hinge of the boxes are the 25% and 75% quantiles and the outer whiskers represent the 2.5% and 97.5% quantiles. Results are presented for each of the twelve simulated datasets, the parameters of which are listed in Table 1. Posterior probability distributions are taken over all ten replicate runs of the Markov chain. There is no support for the hypothesis that these estimated times significantly differ from one another and therefore, a single Africa/non-Africa divergence event cannot be rejected.
There is some suggestion that the rate of coalescence, after divergence from the African population, may be higher in the Asian population than in the other two non-African populations. This can be seen by examining the intensity of the effective population size reduction phase (F = t 2 / α 2 ). For the Africa-Asia comparison the estimated median F is 0.1027 with a 95% HPD interval of 0.0010-0.2297, while the estimated median F for the Africa-Europe comparison is 0.0603 (95% HPD interval of 0.0019-0.2035) and 0.05122 for the Africa-Oceania comparison (95% HPD interval of 0.0012-0.1936). While the estimated median F from the Africa-Asia comparison does not lie outside the 95% HPD intervals of the other two comparisons, the difference is much more pronounced than the difference between divergence time estimates, yet cannot Human data estimates Figure 3 Human data estimates. Representations of the posterior probability distributions for the six divergence model parameters from the data of Wall et al. [14]. Three pairwise population comparisons are plotted: Africa-Asia (AA), Africa-Europe (AE), and Africa-Oceania (AO). The heavy lines in the box plots are the median, the hinge of the boxes are the 25% and 75% quantiles and the outer whiskers represent the 2.5% and 97.5% quantiles. Numerical values for the median and 95% highest posterior density intervals can be found in Table 3. In the plot of the ratio of the X chromosome to autosomal effective population size (h), the horizontal gray line delineates a ratio of 3/4. As in Figure 2, the posterior probability distributions shown here are taken over all ten replicate runs of the Markov chain. be considered conclusive evidence. Lastly, the estimated median ratios of effective population sizes for the X chromosome to that of the autosomes is greater than 3/4 for all comparisons. The model assumes a single value of h for all populations in the model. The highest estimated median h is found in the Africa-Oceania comparison, while the smallest is found in the Africa-Asia comparison ( Table 3).

Conclusion
Efficient computational methods for fitting complex demographic or evolutionary models to large genomic datasets present a great challenge to population geneticists. The method presented here uses two approximations to achieve the necessary computational efficiencies. The first is an approximate likelihood method, in which large genomic polymorphism datasets are summarized in terms of the joint frequency spectrum. This approach reduces the number of coalescent genealogies that must be sampled to obtain an estimation of the likelihood, compared with most full likelihood-based approaches. Secondly, this methodology is rendered feasible by a composite likelihood approach, which assumes that all polymorphic sites are in linkage equilibrium and have independent genealogical histories. The method is implemented using a model of allopatric population divergence, with a founder event occurring in the history of one of the two diverging populations.
Simulated datasets are used to investigate the accuracy of the MCMC parameter estimation. The method is found to perform well, although it experiences some difficulty Evidence for population growth Figure 4 Evidence for population growth. Joint posterior density plots for the α 1 and α 2 parameters for four different data sets: A) simulated data set A, B) Africa-Asia, C) Africa-Europe, and D) Africa-Oceania. The dashed line plots the case of α 1 = α 2 , which is indicative of no recent population growth. In panel A, the posterior density for the simulated bottleneck data lies below the dashed line, supporting recent population growth. However, in the other three panels representing the empirical resequence data of [14], the joint posterior density lies within the one-to-one region, suggesting a lack of evidence for recent population growth. delineating the time of the founding event versus the time of population growth parameters, even though the intensity of the bottleneck is estimated accurately. The problem of bottleneck models being partially identifiable, with respect to the timing and magnitude of the reduction phase, was also observed by several other investigators [16][17][18]. This suggests that current approaches to estimating bottleneck parameters may be limited to estimating the total amount of drift occurring during the reduction phase of the bottleneck (e.g., the product of the bottleneck duration and the magnitude). Again, it remains to be determined what effect the size of a data set may have on this problem of identifiability.
The composite likelihood method is applied to three joint frequency spectra datasets constructed from samples of four continental human populations [14]. The results indicate that there is evidence for a reduction in both the African and non-African effective population sizes, but no evidence that this reduction was followed by a recovery in size that is characteristic of population bottlenecks. As noted by Fay and Wu [19], there is expected to be a period of time, following a population bottleneck, during which the X and the autosomes lag in their signal of population growth compared to the Y chromosome and mitochondrial DNA (i.e., a slower accumulation of rare mutations than the two haploid compartments of the genome).
The conclusion gleaned from the present analysis stands in contrast to those of Voight et al. [17], which is the only other resequencing study to test a population bottleneck model explicitly. Voight  Thus, it appears far from conclusive that there is convincing evidence for recent population growth from either autosomal or X-linked non-coding resequence datasets.
The differential recovery time for the X chromosome versus the autosomes has also prompted Pool and Nielsen [20] to suggest that the X chromosome may recover rare variants more quickly than the autosomes following a population bottleneck and that this could elevate the ratio of X-linked to autosomal nucleotide diversity. Indeed, the human data analyzed here show equivalent levels of Xlinked and autosomal diversity [21]. The composite likelihood analysis suggests that, for this dataset, the effective population size of the X chromosome is equal to, or greater than, that of the autosomes, even after taking into consideration that effects of a bottleneck. The effect described by Pool and Nielsen [20] would elevate the X to autosomal diversity ratio upwards of 1,000 generations following the recovery period of the bottleneck.
The conclusion presented here, that a bottleneck alone is insufficient to produce the observed elevation in X-linked diversity, indirectly supports the conclusions of Hammer et al. [21], that a systematic skew in the breeding sex-ratio is responsible for the X to autosome diversity ratio. However, if the expected ratio of X chromosome effective population size to that of the autosomes, under a purely neutral demographic model, is h = 9/(8(2 -ϕ)), where ϕ is the proportion of the population that is female, then lim ϕ→1 h = 9/8. This expected maximum value for the ratio of X to autosomal effective population size is lower than all three of our median estimates of h. Although the lower bound of the 95% HPD interval for h varies between 0.46 and 0.60 for the three comparisons, the median estimates suggest that there may be additional forces, such as sexbiased migration, that act at a genome-wide scale. Interestingly, the conclusions reached here contrast with those of Keinan et al. [22], who use single nucleotide polymorphism (SNP) genotype data to infer a lower effective population size for the X chromosome. These contrasting conclusions may reflect the different types of data used in the analyses (resequence versus SNP-typing) or the influence of natural selection near regions of the genome with a high density of coding sequence. Undoubtedly, this dichotomy will be resolved by the forthcoming data from the 1,000 Genomes Project.
The combination of approximate and composite likelihood methods is a promising approach for scaling up population genetics analyses to the level of wholegenome polymorphism data, yet much remains to be done to characterize the validity and accuracy of these methods. Projects are underway to examine further the properties of this method and to apply it to a full-genome polymorphism dataset.

Coalescent Model and Likelihood Estimation
The proposed composite/approximate likelihood method is applied to a model of allopatric population divergence, in which one of the populations experiences a transient founder event. This scenario is modeled by the coalescent process. Looking backwards in time, two populations with effective sizes N 1 and N 2 = α 1 N 1 . The second popula-tion changes its effective population size at time t 1 to N B = α 2 N 1 . Then at time t 1 + t 2 , the two populations are descended from an ancestral population of size N A = α 3 N 1 ( Figure 1). Therefore, the divergence time of the two populations is t d = t 1 + t 2 and time is measured in units of 2N 1 generations before the present. Thus, the model consists of four different rates of coalescence and its behavior is governed by a total of six parameters, which are collectively referred to as the vector λ = {t 1 , t 2 , α 1 , α 2 , α 3 }.
The coalescent process underlying this model traces the ancestry of n 1 and n 2 chromosomes sampled from each of the two populations, respectively. The total number of chromosomes in the joint sample is n = n 1 + n 2 . The number of ancestral lineages remaining within each of the two populations decays independently as time is traced backwards, such that there will be 1 ≤ k 1 ≤ n 1 sampled lineages remaining in population 1 and 1 ≤ k 2 ≤ n 2 lineages remaining in population 2. At time t d , the remaining sampled lineages merge so that k = k 1 + k 2 , at which point they are exchangeable and continue to coalesce until the most recent common ancestor of the joint sample. For t <t d , the rate of coalescence for the joint sample is the sum of two independent exponential distributed rates. The total coalescence rate (u) is given by Given that a coalescent event occurs and t <t d , the probability that two randomly chosen lineages in population 1 coalesce is The probability that the coalescent event occurs in population 2 is simply Pr (c 2 ) = 1 -Pr(c 1 ). When a separate joint frequency spectrum representing X-linked data is also being considered, then the parameter h scales both t 1 and t 2 , such that t 1 (X) = t 1 (A)/h and t 2 (X) = t 2 (A)/h, where t * (X) is the time of the event for X-linked loci and t * (A) is the time of the event for autosomal loci. Under the assumption of neutral evolution, when the male and females population sizes are equal, h is expected to be 3/4.
A bifurcating coalescent genealogy consists of 2n -2 branches. Each branch in the genealogy can be labeled b z for 1 ≤ z ≤ 2n -2. By drawing from an exponential distribution given by equation (1), each branch can be assigned a length T z . Let the total length of the entire genealogy be if . (2) is added to the zero probability entry. Although this pseudocount is somewhat arbitrary, it is equal to a single random branch length drawn from the entire Monte Carlo sample of size r. Finally, the likelihood of λ, given the entire observed joint frequency spectrum (S) from a total of polymorphic nucleotide sites, can be approximated by where S ij is the number of sites with configuration (i, j) and provided that i + j is not equal to zero or n 1 + n 2 , since neither of these two conditions would result in a polymorphism at that site. A total of r = 10 5 coalescent genealogies are sampled to estimate Pr(i, j|λ).
By taking the product of the marginal likelihoods across all polymorphic nucleotide sites in equation (7), it is assumed that all sites are in linkage equilibrium and, therefore, have independent coalescent genealogies. This means that equation (7) belongs to a class of approximate methods known as composite likelihood [4]. There is some suggestion that composite likelihood estimators of population parameters are consistent, particularly when the number of regions examined becomes very large. While the data of Wall et al. [14] consist of only 40 independent regions of the genome, the method may be promising for future analyses of whole-genome polymorphism data. One potential consequence of linkage disequilibrium among sites may be that the resulting credible intervals for the MCMC parameter estimation may be too narrow.

Metropolis-coupled Markov chain Monte Carlo (MCMCMC) Parameter Inference
A Bayesian approach is used to estimate the model parameters in λ from the observed joint frequency spectrum data (S). The parameters in λ constitute the state of a Markov chain that relies upon equation (7) to sample from its stationary distribution, where f(λ) is the prior density of parameter values and f(S) = ∫L(S|λ) f(λ)dλ is a normalizing constant. The prior densities employed in this study were chosen from repeated exploratory runs of the Markov chain. For the analysis of the human resequence data, the prior distributions for the two time parameters are uniform over the interval (0, 1), while the prior for the ratio of X chromosome to autosomal effective population size is uniform over the interval (0, 5). The prior densities for the relative effective population size parameters are exponential with mean 3. Markov chain transition probabilities are governed by the Metropolis-Hastings criterion [24,25].
Multiple Markov chains are run in parallel and then Metropolis-coupled, a method in which chains attempt to swap their current states [26]. Metropolis-coupled chains are known to improve the mixing of parameter values [27] and also convergence behavior [28]. Metropolis-coupled chain x can be assigned a heating term (β x ) to modify the Metropolis-Hastings transition probability from the current state λ to a proposed state λ'. This modified transition probability (U) is given by the equation where q(λ → λ') is the probability of proposing a move from state λ to λ'. The probability that an attempted swap of parameter values between two randomly selected chains x and y is successful can be written as [27]. For the purposes of the present study, an incremental heating scheme was used for eight Metropolis-coupled chains; in this scheme, the heating term for chain x is given by β x = 1/[1 + ΔK(x -1)]. A temperature increment parameter of ΔK = 1.1 is used in this study, which yields an average swapping rate of 30-40% between chains. Only the state of the non-heated chain is recorded at each step.
The general Metropolis-coupled Markov chain Monte Carlo (MCMCMC) algorithm proceeds as follows: 1) Randomly assign initial parameters value in λ, sampled from f (λ).
3) Randomly select a parameter in λ and propose a new value from f (λ) to obtain λ'.
G In step 6, the algorithm returns to step 2, rather than step 3. This means that at each step the likelihood of the current state is recalculated, rather than retained from the previous step. Retaining the previous likelihood may result in the chain becoming stuck in a state that, by chance from the Monte Carlo sample, yields an unusually high likelihood; however it is also guaranteed to produce samples from the true posterior distribution, regardless of the Monte Carlo sample size [29]. While the true variance of the target distribution can be obtained with this "sticky" method, it may also impede the convergence behavior of the chain and, hence, require more steps in the chain. In contrast, the practice of recalculating the likelihood ("smooth" MCMC) are expected to result in higher acceptance rates, while the resulting posterior distribution may also have increased variance over that of the true target distribution, but it may also improve convergence behavior [30]. Initial runs using the "sticky" MCMC approach yielded lower overall rates of parameter mixing than did the "smooth" MCMC method. Therefore, only results using the "smooth" MCMC algorithm are reported here.
For each dataset, ten independently seeded replicates of the Markov chain are run for 10 5 steps, not including an initial "burn-in" period of 10 3 steps. A C++ program, written to perform the MCMCMC method called mc3 is freely available over the internet at http://www.rochester.edu/ College/BIO/labs/Garrigan/software.htm. The OpenMP application program interface http://www.openmp.org/ is used to distribute parallelized Markov chains across the shared memory of eight cores in dual Intel Quad-core Xeon processors running at 2.66 GHz. The duration of the Metropolis-coupled runs are typically 40-80 hours, depending upon the dataset or the MCMC algorithm. The mc3 program allows users to input the desired number of Metropolis-coupled chains to run on their particular computer.
The potential scale reduction factor (PSRF) is used to quantify the convergence of the ten replicate runs to the stationary distribution [31], as implemented in the CODA package for the free statistical programming environment Rhttp://www.r-project.org. The CODA package is also used to calculate posterior probability densities, parameter autocorrelations and cross-correlations, as well as the 95% highest posterior density (HPD) intervals.

Assessing the Accuracy of the Method
Twelve simulated joint frequency spectra are generated under a two-population divergence model with known parameters, given in Table 1. Marginal posterior probability distributions for the model parameters are then estimated, using the method outlined above. For each simulated data set, a joint frequency spectrum of 1000 unlinked single nucleotide polymorphism loci is generated for both the X chromosome and the autosomes, resulting in a total of 2000 unlinked polymorphic sites. The number of sampled chromosomes in each simulation is n 1 = 20 and n 2 = 20, for both the X-linked and autosomal joint frequency spectra. The prior distributions used for the analysis of simulated data sets A-F are the same as those given above for the analysis of the human resequence data, except the prior for α 1 was exponentially distributed with mean 50. For simulated data sets G-L, only the prior for t 1 was altered to a uniform distribution over the interval (0, 5).
somes; however, Pool and Nielsen [20] show that the X versus autosomal effective size may be equal during the growth phase following a population bottleneck. The intention here is to use the method to ascertain whether rapid population growth following a bottleneck, or a decrease in male effective population size, may result in increased levels of X chromosome genetic diversity.

Authors' contributions
D.G. designed the study, wrote the MC3 program, performed the analyses and wrote the paper.