Fig. 1 | BMC Genomic Data

From: Dating ancient splits in phylogenetic trees, with application to the human-Neanderthal split

Fig. 1

We use a large, comprehensive phylogeny, such as Phylotree, and assume that its topology and branch lengths are known. We also assume that the phylogeny is detailed enough to describe all the substitutions that occurred between its sequences. From this phylogeny, we extract the number of transitions and transversions that occurred at each site, which gives \(\vec {X}_{mtDNA}\) (the number of transitions at each site) and a list of phylogeny transversion sites (sites at which at least one transversion occurred along the phylogeny). We aim to estimate the distance between two sequences that are not necessarily part of the tree and may be much more distant from each other than the tree's branch lengths. To do so, we extract a binary vector \(\vec {Z}\) that states, for each site, whether the two sequences are identical (\(z_i=0\), marked black) or different (\(z_i=1\), marked orange). We identify the sites at which a transversion must have occurred between the two sequences (marked blue) and remove them from \(\vec {X}_{mtDNA}\) and \(\vec {Z}\), thus shortening these vectors. We remove the phylogeny transversion sites in the same way.
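
The filtering step described in the caption can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes two aligned sequences of equal length containing only the bases A, C, G and T, a per-site transition count list standing in for \(\vec {X}_{mtDNA}\), and 0-based site indices. The names `is_transversion` and `filter_sites` and the toy data are hypothetical.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transversion(a, b):
    """Return True if bases a and b differ by a purine/pyrimidine change."""
    return (a in PURINES and b in PYRIMIDINES) or (
        a in PYRIMIDINES and b in PURINES
    )


def filter_sites(seq1, seq2, x_mtdna, phylo_tv_sites):
    """Build Z for two aligned sequences and drop transversion sites.

    seq1, seq2      -- aligned sequences (assumed: same length, only A/C/G/T)
    x_mtdna         -- per-site transition counts taken from the phylogeny
    phylo_tv_sites  -- 0-based indices of phylogeny transversion sites
    Returns the shortened X vector, the shortened Z vector, and the
    indices of the sites that were kept.
    """
    if not (len(seq1) == len(seq2) == len(x_mtdna)):
        raise ValueError("sequences and x_mtdna must have the same length")

    # z_i = 0 where the two sequences are identical, 1 where they differ.
    z = [0 if a == b else 1 for a, b in zip(seq1, seq2)]

    # Sites where the two sequences differ by a transversion.
    pair_tv_sites = {
        i for i, (a, b) in enumerate(zip(seq1, seq2)) if is_transversion(a, b)
    }

    # Remove both kinds of transversion sites from X and Z.
    drop = pair_tv_sites | set(phylo_tv_sites)
    keep = [i for i in range(len(z)) if i not in drop]
    return [x_mtdna[i] for i in keep], [z[i] for i in keep], keep


# Toy example (made-up data, for illustration only).
seq_a = "ACGTACGT"
seq_b = "ATGTCCGA"
x_toy = [0, 3, 1, 0, 2, 0, 5, 1]
x_short, z_short, kept = filter_sites(seq_a, seq_b, x_toy, {2})
print(kept, x_short, z_short)
```

In this toy example, sites 4 and 7 differ by a transversion between the two sequences and site 2 is listed as a phylogeny transversion site, so all three are dropped. The only remaining difference between the sequences is the C/T transition at site 1.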
