Phylogeny: tracing COVID around the world.

COVID-19 spatial clustering. nextstrain.org

The Phylogeny of SARS-CoV-2.

Over the last fifteen years a new tool has emerged to assist in the tracking of outbreaks - genetic sequencing. Machines that can rapidly sequence the genomes of people and animals can also be turned to pathogens, sequencing their genes to establish their pedigree, timing and transfer.

What we are looking for are small, stable, single-base mutations. These are copying errors, random mistakes that accumulate in the genome. The mutations are passed down to all descendants of a particular organism, establishing a hierarchy of different strains descending from the original organism. The branching diagram or phylogenetic tree that results from testing different strains and establishing their evolutionary relationships from these distinctive markers is known as a haplotree.
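The nesting of strains by inherited mutations can be sketched in a few lines of Python. This is only a minimal illustration, not the method used by Nextstrain: the function name `build_haplotree` and the mutation labels `m1`, `m2`, `m3` are hypothetical, and the sketch assumes that mutations are only ever gained, never lost or repeated, so that every strain carries all the mutations of its ancestors.

```python
def build_haplotree(samples):
    """Return {haplotype: parent haplotype} for a set of sampled strains.

    samples maps a sample name to the set of mutations it carries
    relative to the original (reference) genome.
    Assumes mutations are only ever gained, never lost or repeated.
    """
    # Distinct mutation sets found in the samples, plus the unmutated root.
    haplotypes = {frozenset(muts) for muts in samples.values()}
    root = frozenset()
    haplotypes.add(root)

    parent = {}
    for hap in haplotypes:
        if hap == root:
            continue
        # Candidate ancestors carry a strict subset of this strain's mutations.
        ancestors = [other for other in haplotypes if other < hap]
        # The closest ancestor is the one sharing the most mutations.
        parent[hap] = max(ancestors, key=len)
    return parent


# Hypothetical samples and mutation labels, for illustration only.
samples = {
    "origin": set(),
    "sample_1": {"m1"},
    "sample_2": {"m1", "m2"},
    "sample_3": {"m3"},
}
tree = build_haplotree(samples)
```

In this toy data the strain of `sample_2` descends from the strain of `sample_1`, while `sample_3` branches directly from the origin. Real sequence data is noisier, so production tools rely on statistical tree-building rather than strict subset tests.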

SARS-CoV-2 is a positive-sense single-strand RNA virus - others include the viruses behind the common cold, polio, yellow fever and most kinds of gastroenteritis. These viruses all have small genomes (SARS-CoV-2 has about 30,000 bases) and are capable of recombination to form new variants.

The copying of RNA generally lacks the error-checking mechanisms of DNA replication, and therefore RNA viruses mutate far more rapidly than DNA-based organisms. A typical RNA haplotree covers only a few months, whereas human YDNA or mtDNA haplotrees cover millennia.

SARS-CoV-2 also has several protein-coding genes that may be used for phylogeny - for example, the gene for the external spikes.
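The link between mutation rate and the time span of a tree can be made concrete with a rough "molecular clock" calculation. The sketch below is illustrative only: the function name `expected_mutations` is hypothetical, and the rate (1e-3 changes per base per year) and genome length (about 30,000 bases) are assumed round numbers, not figures taken from the data discussed here.

```python
def expected_mutations(rate_per_site_per_year, genome_length, days):
    """Expected number of mutations accumulated along one lineage.

    Assumes a constant mutation rate over time and across the genome.
    """
    return rate_per_site_per_year * genome_length * days / 365.0


# Illustrative values only: an assumed rate of 1e-3 changes per base per
# year and a genome of about 30,000 bases.
per_month = expected_mutations(1e-3, 30_000, 30)
per_year = expected_mutations(1e-3, 30_000, 365)
```

Under these assumed numbers a lineage picks up a couple of mutations a month, which is enough to separate events only weeks apart. A slowly mutating marker would need centuries to show the same amount of branching.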

The haplotree of COVID-19

nextstrain.org - more detail on original

The phylogeny of the SARS-CoV-2 RNA strand as known at 6 April 2020 is shown in the Figure, which covers 3486 sequenced virus samples from different parts of the world. There are three basal branches, each with many sub-branches formed as the virus evolved. The colours represent specific localities as per the map at the beginning of the post.

The diagram shows the great complexity of moves and infections that have been occurring globally, with most major branches present in all locations. However, there is a major European branch containing about half of all sequenced samples, and a major American branch with 495 descendants. Wuhan was closed shortly afterwards.

The topmost branch (clade A2a) has been sequenced the most, with 1718 descendants. It has three mutations from the original ancestor and dates to 13 January. It consists almost entirely of European samples and their descendants. There is an early Shanghai branch, probably from the source, which went extinct after 6 February. The clade expands extremely rapidly from 28 January, possibly from an English source, forming new branches in France, Spain, Portugal, the Netherlands, Belgium, Iceland, Canada and the USA, with at least half a dozen different British branches. [Note - there are no samples from Italy or Germany, and from the known progress of the disease, some of the action probably originated there.]

The second branch (clade B1) has 1210 sequenced descendants. It splits again into a number of sub-branches by 25 December. One of these contains Patient Zero, and most of the sequenced American samples belong to this line. Other branches continue in China till late February.

The third branch (clade A1a) has 543 descendants, including many of the Chinese samples, and quite a few in Scandinavia, the Low Countries and Britain.

Australia provided a number of early samples, and a few of these are direct from China. Most of the cases however are in the European strain A2a (some via the USA) or in the US strain B1, indicating they arrived from countries that were not being screened at the time.

Patient Zero seems to be responsible for about half the US cases. This is fairly usual - the number of descendants of any group of ancestors soon randomly forms a "geometric distribution" in which a few originals have many descendants and many individuals have few descendants (see Flood 2016).
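The way a few founders come to dominate by chance alone can be seen in a small simulation. This is a minimal sketch under stated assumptions, not the model of Flood 2016: the function name `founder_descendants` is hypothetical, the number of cases is held constant from one generation to the next, and every case picks its infector uniformly at random, so no founder has any built-in advantage.

```python
import random
from collections import Counter


def founder_descendants(n_founders, generations, seed=0):
    """Count how many current cases trace back to each original founder.

    Neutral model (an assumption): constant number of cases per
    generation, each infected by a randomly chosen case from the
    previous generation.
    """
    rng = random.Random(seed)
    # Each current case is labelled with the founder it descends from.
    lineage = list(range(n_founders))
    for _ in range(generations):
        lineage = [rng.choice(lineage) for _ in range(n_founders)]
    # Founders whose lines died out do not appear in the result.
    return Counter(lineage)


counts = founder_descendants(n_founders=100, generations=50, seed=1)
# Largest founder lines first.
ranked = counts.most_common()
```

Printing `ranked` shows a handful of founders accounting for most of the cases, while the majority of the original lines have died out entirely. The exact counts depend on the random seed.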

Summary

The phylogeny shows how rapidly COVID-19 spread around the world. Several clades can be identified strongly with particular regions. However, in a globalised world, any recent example of the clade might have passed through one or more other regions before arriving at the tested location.

A single case might appear to have caused no further spread, when in fact COVID went 'underground' through a chain of mild cases, finally reappearing a month later and creating the ultimate pandemic. For a disease as infectious and dangerous as COVID-19, it is essential to close borders both at source and at destination, and if a case does slip through, to be prepared to test all contacts.

The experience of Australia shows that it is wise to presume nothing when a global outbreak is going on, and to exclude travel not just from the known area of infection, but also from countries in which only single, apparently isolated cases have been found.

