| | Total | Unique | Unique N > 2 | % Red. |
|---|---|---|---|---|
| All Sequences | 24152 | 2530 | 1007 | 95.83% |
| Pre-2000 | 2396 | 475 | 184 | 92.32% |
| 2000-2010 | 13753 | 1473 | 560 | 95.93% |
| Post-2010 | 8003 | 909 | 373 | 95.34% |
| Human | 13315 | 783 | 310 | 97.67% |
| Swine | 3288 | 742 | 277 | 91.58% |
| Other | 7549 | 1103 | 453 | 94.00% |
- Six subsets of the total sequence set were constructed: three divided by date of isolation (Pre-2000, 2000-2010, and Post-2010) and three divided by host organism (Human, Swine, and Other). Two filters were applied to each unfiltered set ('Total' column): the first retains only unique strains ('Unique' column), and the second retains only those unique strains that occur at least twice in the unfiltered set ('Unique N > 2' column). The '% Red.' column gives the percent reduction in set size between the 'Total' column and the 'Unique N > 2' column. A higher percent reduction indicates greater sequence repetition in the initial set.
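
The two filters and the '% Red.' calculation described in the caption can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names `filter_counts` and `percent_reduction` and the `min_count` parameter are hypothetical, and the threshold of 2 follows the caption's "at least twice" wording, which is an assumption about how the 'Unique N > 2' column was built.

```python
from collections import Counter


def filter_counts(sequences, min_count=2):
    """Return (total, unique, repeated) set sizes for a list of sequences.

    total    -- size of the unfiltered set
    unique   -- number of distinct sequences
    repeated -- number of distinct sequences seen at least `min_count` times
                (assumed threshold, taken from the caption's "at least twice")
    """
    counts = Counter(sequences)
    total = len(sequences)
    unique = len(counts)
    repeated = sum(1 for c in counts.values() if c >= min_count)
    return total, unique, repeated


def percent_reduction(total, filtered):
    """Percent reduction in set size from `total` to `filtered`."""
    return 100.0 * (1 - filtered / total)


# Reproduce the '% Red.' value for the 'All Sequences' row of the table.
print(f"{percent_reduction(24152, 1007):.2f}%")  # prints "95.83%"
```

For example, `filter_counts(["A", "A", "B", "C", "C", "C"])` returns `(6, 3, 2)`: six sequences in total, three distinct, and two of those occurring at least twice.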