Figure 2 | BMC Genetics

Figure 2

From: Clustering by genetic ancestry using genome-wide SNP data

Accuracy, Stability, Between-Cluster Distance and Scoring Index (SI) for HGDP Africans. Accuracy, stability and between-cluster distance for the k-means cluster assignments are displayed in red, and the same measures for the permuted cluster assignments are displayed in blue. The SI averages the relative gain in accuracy, stability and between-cluster distance and is maximized at 8 clusters (0.931; CI: 0.923, 0.938). Note that the graphical display of accuracy, stability and distance shows that no single score component would be sufficient to identify the correct number of clusters. For example, high accuracy is not sufficient to conclude that the clustering is optimal. In this example, the accuracy is nearly perfect when the number of clusters is less than 7, which would suggest that any number of clusters less than 7 is equally optimal. These numbers are not all equally optimal, however, as demonstrated by the measures of stability and between-cluster distance, which continue to increase.
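
The way the SI combines the three measures can be illustrated with a minimal sketch. The caption does not give the formula for "relative gain", so the definition used here, (observed - permuted) / observed, is an assumption, as are the function names and the toy input values, which are invented for illustration and are not taken from the HGDP analysis.

```python
from statistics import mean

MEASURES = ("accuracy", "stability", "distance")


def relative_gain(observed, permuted):
    """Assumed definition: gain of the observed value over the permuted baseline,
    expressed as a fraction of the observed value."""
    if observed == 0:
        return 0.0
    return (observed - permuted) / observed


def scoring_index(observed, permuted):
    """Average the relative gains over accuracy, stability and distance."""
    return mean(relative_gain(observed[m], permuted[m]) for m in MEASURES)


def best_k(observed_by_k, permuted_by_k):
    """Return the number of clusters with the highest scoring index."""
    return max(
        observed_by_k,
        key=lambda k: scoring_index(observed_by_k[k], permuted_by_k[k]),
    )


# Toy values (hypothetical, for illustration only). Accuracy is identical
# for k = 2 and k = 3, so accuracy alone cannot separate them.
observed_by_k = {
    2: {"accuracy": 1.0, "stability": 0.5, "distance": 0.5},
    3: {"accuracy": 1.0, "stability": 0.8, "distance": 0.8},
    4: {"accuracy": 0.5, "stability": 0.8, "distance": 1.0},
}
permuted_by_k = {
    k: {"accuracy": 0.5, "stability": 0.4, "distance": 0.4}
    for k in observed_by_k
}

if __name__ == "__main__":
    for k in sorted(observed_by_k):
        si = scoring_index(observed_by_k[k], permuted_by_k[k])
        print(f"k={k}: SI={si:.3f}")
    print("best k:", best_k(observed_by_k, permuted_by_k))
```

In the toy input, k = 2 and k = 3 have the same accuracy, but the averaged index still separates them because stability and distance differ, which mirrors the point made in the caption.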
