SARS-CoV-2 Variant Evolution

(beta version)

This website accompanies the study "Evolutionary trajectory and origin of SARS-CoV-2 variant"
by Anyou Wang

Quality attributes extracted from traditional alignment-based methods fail to provide the dynamic evolutionary trajectory and origin of SARS-CoV-2 (severe acute respiratory syndrome coronavirus 2), which causes the global COVID-19 pandemic. This study develops an alignment-free approach that combines Fréchet distance (Fr) with an artificial recurrent neural network to reveal the evolutionary trajectory and origin of SARS-CoV-2 variants. Fr measures the dissimilarity between the reference genome and a variant genome across 84 genome features (4 single nucleotides, 16 dinucleotides, and 64 codons). Recurrent neural networks use the Fr values of these 84 features to quantitatively identify variants and to reveal the evolutionary trajectory and origin of SARS-CoV-2 from more than one million genome sequences. In total, 34 SARS-CoV-2 variants have been identified. All of these variants dynamically delete parts of their genome during evolution, but the trajectory and degree of deletion vary among variants, which can be classified into 3 groups: slight deletion (13 members), middle-level deletion (17 members), and high deletion (4 members). The slight deletion group behaves like the wild type; its trajectory fluctuates only slightly and temporarily, and it has very low infection capacity. The high deletion group follows a rough, fluctuating trajectory with a large loss and also infects humans only lightly. The middle deletion group gradually deletes its genome along a rhythmic trajectory that corresponds to the pandemic peaks, and this group causes most of the global COVID-19 cases. At least 3 mink coronavirus variants possess 56 genome features similar to those of SARS-CoV-2 and are predicted to be able to infect humans. Mink is therefore the most likely origin of SARS-CoV-2, and the origin path follows this order: mink, cat, tiger, mouse, hamster, dog, lion, gorilla, leopard, bat, and pangolin. These findings provide a valuable guideline for combating COVID-19.
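
The abstract does not spell out how Fr is computed for each feature, so the following Python sketch is only one plausible reading, not the study's implementation. It enumerates the 84 features, then assumes that each feature is represented as a cumulative-count curve along the genome and that Fr is the discrete Fréchet distance between the reference and variant curves. The function names, the overlapping-window treatment of codons, and the cumulative-curve representation are all assumptions made for illustration.

```python
from itertools import product

NUCLEOTIDES = "ACGT"


def feature_names():
    """Return the 84 features: 4 single nucleotides, 16 dinucleotides, 64 codons."""
    return ["".join(p) for k in (1, 2, 3) for p in product(NUCLEOTIDES, repeat=k)]


def cumulative_curve(seq, feature):
    """Cumulative count of `feature` at each start position.

    Assumption: windows overlap and ignore reading frame.
    """
    k = len(feature)
    curve, total = [], 0
    for i in range(len(seq) - k + 1):
        if seq[i:i + k] == feature:
            total += 1
        curve.append(total)
    return curve


def discrete_frechet(p, q):
    """Discrete Fréchet distance between two 1-D curves (dynamic programming)."""
    if not p or not q:
        raise ValueError("both curves must be non-empty")
    prev = None
    for i, a in enumerate(p):
        row = []
        for j, b in enumerate(q):
            d = abs(a - b)
            if i == 0 and j == 0:
                best = d
            elif i == 0:
                best = max(row[j - 1], d)
            elif j == 0:
                best = max(prev[0], d)
            else:
                best = max(min(prev[j], prev[j - 1], row[j - 1]), d)
            row.append(best)
        prev = row
    return prev[-1]


def fr_profile(reference, variant):
    """Map each of the 84 features to its Fr between reference and variant."""
    return {
        f: discrete_frechet(cumulative_curve(reference, f),
                            cumulative_curve(variant, f))
        for f in feature_names()
    }


if __name__ == "__main__":
    # Toy sequences only: the quadratic-time DP is far too slow for
    # full-length genomes in pure Python.
    profile = fr_profile("ACGTACGT", "ACGTCGT")
    print(len(profile), profile["A"])
```

Under these assumptions, the resulting 84-value profile is the kind of feature vector that a recurrent neural network could take as input for one genome.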

To search the Fr database, please enter a virus GISAID accession ID.

For example, EPI_ISL_601443
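
As a small convenience, an ID can be checked locally before it is submitted. The sketch below assumes, based only on the example above, that an ID is the prefix EPI_ISL_ followed by one or more digits. The helper name is hypothetical and is not part of this website.

```python
import re

# Assumed format, inferred from the example ID: "EPI_ISL_" followed by digits.
_ID_PATTERN = re.compile(r"EPI_ISL_\d+")


def looks_like_gisaid_id(text):
    """Return True if `text` matches the assumed EPI_ISL_<digits> format."""
    return _ID_PATTERN.fullmatch(text.strip()) is not None


if __name__ == "__main__":
    print(looks_like_gisaid_id("EPI_ISL_601443"))
```

A passing check only means the text is well-formed; it does not confirm that the ID exists in the Fr database.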