Ching-Yung Lin: Coronavirus strains worldwide can be divided into eight categories; the longer the virus spreads, the more it mutates

Posted by Global Bio and Investment on Apr 30, 2020

(Photo: Global Bio and Investment)

Today (April 30, 2020), at the "TAIWAN is Helping: All-round AI x Epidemic Prevention Online Forum" sponsored by the Daquan Health Care Sub-center of the Ministry of Science and Technology Subsidy Station in Taiwan, Graphen founder and CEO Ching-Yung Lin shared how artificial intelligence tools can be used to monitor the evolution of the COVID-19 virus.

In mid-March, Ching-Yung Lin used AI data analysis tools to draw up the "COVID-19 Gene Evolution Pathway Analysis". To date, he has aligned the genetic sequences of 12,294 virus strains worldwide. According to Ching-Yung Lin, the COVID-19 virus genome can produce 26 proteins, and each protein is like a machine part with its own function. When a gene mutates, it affects how the protein is translated, and a changed part can in turn change how the virus functions. Gene mutations are therefore very important to track.
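The link between a gene mutation and a changed protein can be illustrated with a minimal Python sketch. It is not Graphen's tooling: the function names `translate` and `protein_changes` and the nine-letter sequences are made up for illustration, and the only real-world fact it relies on is the standard genetic code. It shows that a single-letter substitution may swap one amino acid (changing the "part") or leave the protein unchanged.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
# Sequences are written with T (as in sequence files) instead of RNA's U.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq: str) -> str:
    """Translate a coding sequence into one-letter amino acids ('*' = stop).

    Only the letters A, C, G and T are supported in this sketch.
    """
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


def protein_changes(reference: str, mutated: str) -> list[tuple[int, str, str]]:
    """Return (1-based position, old residue, new residue) for each difference.

    Assumes substitutions only, so both sequences have the same length.
    """
    if len(reference) != len(mutated):
        raise ValueError("sequences must have the same length")
    pairs = zip(translate(reference), translate(mutated))
    return [(i + 1, a, b) for i, (a, b) in enumerate(pairs) if a != b]


# Made-up toy sequences, not real viral data.
reference = "ATGGATTTT"  # codons ATG GAT TTT
missense = "ATGGGTTTT"   # GAT -> GGT changes the second amino acid
silent = "ATGGACTTT"     # GAT -> GAC encodes the same amino acid

print(protein_changes(reference, missense))
print(protein_changes(reference, silent))
```

A real pipeline would work on aligned whole genomes and handle insertions, deletions and ambiguous bases, which this sketch deliberately leaves out.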

Based on more than 12,000 coronavirus strains, the virus can be roughly divided into eight categories. (Image source: Graphen)

The virus's evolutionary tree has been expanding since December 2019. Based on the current sequence comparison of 12,294 virus strains, the viruses can be roughly divided into eight categories (A1, A2, B, C, D, E, F, G, H).

As for where the coronavirus originated, Ching-Yung Lin said in a follow-up interview with this publication that A1 and A2 are the L and S types described in an earlier Chinese paper. A1 is concentrated mainly in Wuhan, while A2 is distributed throughout China, including Wuhan, Jiangxi, Shandong and Zhejiang. The two strains diverged very early and all the other types descend from one or the other, so these two should be the earliest variants of the virus.

Because too little viral data from Wuhan is available, more data is needed to determine whether A1 or A2 is the earlier variant. He added that, in any case, it seems clear that the virus started in China, unless other early data is published later.

The eight virus families over time and their variability characteristics in different countries. (Source: Graphen)

Ching-Yung Lin also used data analysis tools to see how the eight major types of virus spread around the world (above). He said the family B virus spread fastest and infected the most people at the beginning, with the F, E, D, C, G and H variants appearing one after another. But as outbreaks continued in various places, the B virus, which had spread fastest and caused the most infections, was overtaken by the H virus, the most recent to appear.

The H virus is currently the main infectious strain on the east coast of the United States. The evolutionary tree shows that this mutant strain came from France. It is characterized by mutations in the S protein and the ORF3 protein. The function of ORF3 is to pierce the host cell membrane and allow the replicated virus to spread. Ching-Yung Lin said he was surprised by this result, because such a "high-purity" viral population has not yet been seen anywhere else: in New York, the H virus alone accounts for 85%. The mutations may have made the H virus spread faster and more aggressively than other types.

Virus mutation variability and mortality rate by country. (Image source: Graphen)

He also noted that a few days ago there were reports claiming that the more the virus mutates, the higher the fatality rate. However, the data above show no clear correlation between the average number of virus variants (Avg Variants) and the national mortality rate (Mortality Rate). The average number of variants in a country reflects when the outbreak reached that country, while the mortality rate is affected by a variety of factors, including the number of confirmed cases, the actual number of tests carried out in each country, and the size of high-risk groups.
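One common way to check a claim like this is a Pearson correlation coefficient between the two per-country columns. The sketch below is an illustration only, not the analysis Graphen ran: the function name `pearson` is an assumption, and the numbers are invented placeholders, not the figures from the chart.

```python
from math import sqrt


def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length, at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and the two sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / sqrt(var_x * var_y)


# Invented placeholder values, one entry per hypothetical country.
hypothetical_avg_variants = [4.1, 5.3, 6.0, 6.8, 7.5]
hypothetical_mortality_rate = [6.2, 1.4, 9.8, 2.1, 5.0]

r = pearson(hypothetical_avg_variants, hypothetical_mortality_rate)
print(f"Pearson r = {r:.2f}")
```

A coefficient near +1 or -1 would point to a strong linear relationship and a value near 0 to little or none. Even a strong value would not show that mutations cause deaths, since the timing of each outbreak and testing levels influence both columns.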

From the perspective of natural selection, once the virus mutates, the variants with a higher transmission rate are the ones that get passed on. If the fatality rate is too high, it works against the virus's own reproduction, as was the case with MERS and SARS. Under the pressure of natural selection, the virus will therefore keep mutating toward the "best balance" between infectivity and lethality. The actual figures so far thus do not support the assumed link between each country's viral variability and its fatality rate.

However, based on the analysis results, Ching-Yung Lin also believes that as the number of infected people grows, the COVID-19 virus will keep mutating and the number of mutations will keep rising.

Graphen now offers a visualization on its official website of how the various virus types are progressing in different regions. By clicking on a state, country, region or city, users can view the coronavirus status there. Ching-Yung Lin said that AI can help analyze literature and data, and that Graphen's goal is to use AI technology to solve the problems humanity is facing.

Source: Original story in Chinese