This article is from the WeChat public account The Academy of Science (ID: kexuedayuan). Author: learning CADD egg

I thought this would still be an ordinary winter vacation.

Until the situation became more serious.

Investigation tasks for the virus's various proteins were assigned to laboratory members scattered far and wide. There are no bottles, no reagents, no expensive instruments. A computer that can connect remotely to the server is all I have.

Figure 1. Drug development sometimes requires only a laptop that can connect to the server (Image source: veer gallery)

Many people think that drug development can only be done in a lab coat. However, after more than a decade of progress in theoretical methods, computer hardware, and professional software from both industry and academia, computer-aided drug design (CADD) has matured considerably. Its application has greatly increased the speed and efficiency of new drug development, and it has become one of the routine methods of modern drug development [1].

Relying on this technology, even while staying at home and working remotely, I can contribute to the search for potential medicines against this viral infection.

Finding key proteins

Protein is the material basis of life and an important component of all the cells and tissues that make up bacteria, viruses, animals and plants. Functional proteins each carry out their own roles, keeping the whole organism running normally.

Take a virus as an example. A virus is a non-cellular form composed of nucleic acid (DNA or RNA) and protein. Viruses cannot replicate on their own; they must parasitize living host cells, relying on the host cell's raw materials, energy supply and workspace to replicate and release themselves [2]. The process is like a criminal (the virus) from Country A who sells counterfeit banknotes but cannot print Country A's currency himself. Once he has hacked into a banknote printing factory in Country B (the host cell), he can commandeer all of the factory's resources to print the currency of Country A (the nucleic acid and protein of the virus).

The life cycle of a virus goes through six major steps: adsorption, penetration, uncoating, biosynthesis, assembly, and release [2]. The virus's functional proteins have a clear division of labor and cooperate closely in these steps to complete the whole cycle, from infecting the host cell to replicating the virus. So far, researchers have identified ten gene sequences in the genome of the novel coronavirus (2019-nCoV), including orf1ab, S, E, M, and N, each encoding the corresponding viral protein.

By analogy with the SARS virus, which is also a coronavirus, we can reasonably infer the functions of the proteins encoded by the novel coronavirus genes. For example, the orf1ab gene encodes the orf1ab polyprotein, which participates in the transcription and replication of viral RNA and carries multiple functions such as protease and methyltransferase activity. The S gene encodes the surface glycoprotein of the coronavirus, also known as the spike protein, which binds to the ACE2 protein in the human body and directly mediates the virus's infection of, and fusion with, the host cell. These proteins are distributed over the viral envelope like a crown, hence the name “coronavirus”. All of these functional proteins play important roles in viral infection and replication. Interfering with one or more of them, individually or simultaneously, and inhibiting their activity can block the virus from infecting host cells or from replicating inside them, and thereby have a therapeutic effect.

So once the crystal structure of a viral protein is obtained, we can combine it with our prior understanding of the protein's function and use computer-aided drug design to search for potentially effective drugs in a targeted manner. How does this process work?


Fish and knife, lock and key

If we judged by appearance, in an age that prizes smooth and supple skin, the pitted, pockmarked surface of a protein would hardly count as beautiful. Yet it is precisely these hollow cavities that hide the real secrets.

Take as an example a segment of the orf1ab polyprotein of the SARS virus: the 3C-like protease, also called the Mpro protein. This protein is mainly responsible for hydrolyzing polyproteins into functional peptides so that each can carry out its own role. It is like preparing a fish: a kitchen knife is used to scale and gut the whole fish and to divide it into head, body and tail, which are then stewed, braised and steamed separately. The active site of the protease is like the blade of this kitchen knife, and it is hidden in a cavity (pocket) on the protein surface (Figure 2).


Figure 2. The pitted surface of the SARS Mpro protein (white) and its protease active site pocket (red circle), PDB ID 2GX4 [3] (Image source: rendered with PyMol)
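The “cutting” described above can be sketched in a few lines of Python. This is only an illustration under a simplified assumption: that the protease cuts after a glutamine (Q) that is preceded by leucine (L) and followed by serine, alanine or glycine (S, A, G), which is the commonly described preference of coronavirus Mpro. Real substrate recognition is more nuanced, and the function name `cleave` and the toy sequence in the usage note are hypothetical, not taken from any viral genome.

```python
import re

# Simplified assumption: cut between "LQ" and a following S, A or G.
# The pattern is zero-width, so no residues are consumed by the split
# (splitting on empty matches requires Python 3.7 or later).
CLEAVAGE_SITE = re.compile(r"(?<=LQ)(?=[SAG])")


def cleave(polyprotein: str) -> list:
    """Split a one-letter amino acid string at every simplified cleavage site."""
    return CLEAVAGE_SITE.split(polyprotein)


if __name__ == "__main__":
    # Made-up toy sequence, used only to show the mechanics.
    for peptide in cleave("MKTAVLQSGFRKLQAPPTVLQGWWE"):
        print(peptide)
```

Each returned fragment plays the role of one “cut of fish”. An inhibitor sitting in the active site pocket corresponds to this function never being called.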

Biologically active small molecules obtained by screening or synthesis can bind tightly in this active site pocket (Figure 3). This inhibits the activity of the protease and prevents it from cleaving polyproteins into functional polypeptides, so those polypeptides cannot play their part in subsequent viral replication and infection. It is like confiscating the fish knife, or putting a guard over the blade to blunt it, so that the fish can no longer be prepared or cooked.


Figure 3. SARS Mpro protein surface (white) and an inhibitor reported in the literature (green sticks), PDB ID 2GX4 [3] (Image source: PyMol)

For each protein, finding a molecule with satisfactory biological activity is like facing an exquisite lock: you need an equally exquisite key that fits it closely. The job of computer-aided drug design is to find such a key efficiently and at low cost. Building on prior understanding and systematic analysis of the protein structure, it uses computation to evaluate the binding strength and mode of action of various molecules in the protein pocket, so as to screen for or design the molecules most likely to act as inhibitors (or agonists) of the protein's function.

Computer-aided drug discovery

Unfortunately, despite the rapid development of related technologies in recent years, finding such a well-fitting key is not easy. It often takes decades and billions of dollars for a new drug to go from research and development to market. The good news is that there are always a few locks that look alike, so their keys can be shared. For example, inhibitors of HIV protease (such as nelfinavir and lopinavir) may act on coronavirus proteases, and inhibitors of the RNA-dependent RNA polymerase (RdRp) that replicates the RNA of Ebola or influenza virus (such as favipiravir and remdesivir) may also act on the coronavirus RdRp [5].

In the face of an acute outbreak, searching for suitable molecules among “old drugs” that are already on the market or in clinical trials clearly has a time advantage over developing new molecules from scratch. Using computer-aided drug design, specifically virtual screening based on a technique called molecular docking, we can simulate the binding conformation of each “old drug” molecule in the viral protein pocket and evaluate its theoretical binding strength with scoring functions and free energy calculations, in order to assess how likely the molecule is to be a potential inhibitor. It is as if, once we know the structure of the “lock”, we no longer have to insert every key into the keyhole by hand to find the one that opens it. Instead, computer simulation screens out the few keys most likely to open the lock, and only those keys are then tested.
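The “screen first, test only the best keys” idea can be sketched as a simple ranking loop. This is a minimal sketch, not a real docking workflow: `virtual_screen` and `score_fn` are hypothetical names, `score_fn` stands in for whatever docking program or scoring function is actually used, and the convention that a more negative score means stronger predicted binding is an assumption that depends on the tool. The scores in the usage example are made-up placeholders, not results.

```python
from typing import Callable, Dict, List, Tuple


def virtual_screen(
    library: Dict[str, str],
    score_fn: Callable[[str], float],
    top_n: int,
) -> List[Tuple[str, float]]:
    """Score every molecule in the library and return the top_n best hits.

    library  : maps a compound name to some molecule representation
    score_fn : placeholder for a docking/scoring call (assumption)
    top_n    : how many candidates to pass on to real experiments
    """
    scored = [(name, score_fn(molecule)) for name, molecule in library.items()]
    # Assumed convention: more negative score = stronger predicted binding.
    scored.sort(key=lambda pair: pair[1])
    return scored[:top_n]


if __name__ == "__main__":
    # Placeholder scores for four imaginary "keys"; not real data.
    placeholder_scores = {"key_A": -5.2, "key_B": -9.1, "key_C": -7.4, "key_D": -3.0}
    library = {name: name for name in placeholder_scores}
    for name, score in virtual_screen(library, placeholder_scores.__getitem__, top_n=2):
        print(name, score)
```

Only the short list returned here would go on to the bench, which is where the savings in time and cost come from.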

Furthermore, with a technique called homology modeling, even when the structure of the viral protein “lock” is unknown and only its amino acid sequence is available, we can build a plausible model of the protein structure and submit a virtual screening job. According to early news reports, the similarity between 2019-nCoV and the SARS virus at the genome level is only 70%. In fact, however, sequence comparison of the viral proteins shows that some key proteins share more than 95% amino acid identity with their SARS virus counterparts (for example, Figure 4). Therefore, drawing on the SARS virus protein crystal structures deposited in the Protein Data Bank during earlier SARS research, we can build reasonable structures for the corresponding 2019-nCoV proteins.


Figure 4. Amino acid sequence alignment of the Mpro protein between SARS and 2019-nCoV (Image source: BoxShade)
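A percentage like the one quoted for the key proteins comes from comparing two already aligned sequences position by position. The sketch below shows one common way to compute it. The function name `percent_identity` is hypothetical, and the choice to skip any column containing a gap is an assumption: alignment tools differ in what they use as the denominator, so their reported numbers can differ slightly.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of identical residues over the gap-free columns of an alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    compared = 0  # columns where neither sequence has a gap
    matches = 0   # columns where both residues are identical
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # assumption: gapped columns are ignored
        compared += 1
        if res_a == res_b:
            matches += 1

    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Two made-up aligned fragments that differ at one of ten positions.
    print(percent_identity("ACDEFGHIKL", "ACDEYGHIKL"))
```

The higher the identity, the more reasonable it is to use the known structure as a template for homology modeling.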

Therefore, when basic research such as structural biology cannot keep pace with an acute outbreak, when the virus is highly pathogenic, laboratory conditions are limited, and few laboratories meet biosafety requirements, and when there is too much to do, too few staff, and too high a cost, computer-aided methods are an efficient strategy. They provide valuable suggestions for discovering active molecules and exploring mechanisms, and so buy precious time for the development of specific drugs.


Closing words

I have been working on this novel coronavirus project from home since the third day of the winter vacation, about two weeks ago. Without a doubt, this is the most fulfilling and intense vacation I have had in the final stretch of my PhD.

But as mentioned earlier, drug discovery is no easy task. As some scholars have said, “No matter how serious the disease is, no matter how urgent our desire for new drugs and new vaccines is, the laws of new drug and new vaccine development cannot be bypassed.” [6] Drug development follows its own inherent laws, which do not bend to human will. Although, as mentioned earlier, many drugs are currently in clinical or preclinical studies [5], protecting yourself from infection remains the top priority for everyone.

Fortunately, compared with 17 years ago, we now have the Tianhe and Shenwei supercomputers, and the computing power available for simulating protein systems is far beyond what it was then. One research group after another, and one researcher after another, has not hesitated to give up rest time to take part in this work.

Perhaps it is as the high school politics textbook says: “The road is tortuous, but the future is bright!”


(Written in the gap between server calculation results in February 2020)

References:

1. Jorgensen, W. L., The many roles of computation in drug discovery. Science 2004, 303, 1813.

2. Institute of Microbiology, Chinese Academy of Sciences, Academy of Sciences, Invasion & Counterattack! Offense and defense of pathogens and human body. https://mp.weixin.qq.com/s/Zh9-8C-hilrlKk8jrVo60g

3. Yang, S.; Chen, S. J.; Hsu, M. F.; et al., Synthesis, crystal structure, structure-activity relationships, and antiviral activity of a potent SARS coronavirus 3CL protease inhibitor. J. Med. Chem. 2006, 49, 4971.

4. Guan Li, Academy of Sciences, How do new drugs move from the laboratory to the market? https://mp.weixin.qq.com/s/huwBvSKldeBp0TJBnKY_aQ

5. Medicine Rubik’s Cube, Medicine Rubik’s Cube Info,