Molecular Modeling and Docking Study on Glycoproteins of Tomato Spotted Wilt Virus and its Inhibition by Antiviral Agents

Research Article

Austin J Proteomics Bioinform & Genomics. 2014;1(1): 6.

Soundararajan P1, Manivannan A1, Muneer S2, Park YG1, Ko CH1 and Jeong BR1,2,3*

1Department of Horticulture, Division of Applied Life Science (BK21Plus Program), Graduate School, Gyeongsang National University, Jinju 660-701, Korea

2Institute of Agricultural & Life Science, Gyeongsang National University, Jinju 660-701, Korea

3Research Institute of Life Science, Gyeongsang National University, Jinju 660-701, Korea

*Corresponding author: Jeong BR, Department of Horticulture, Division of Applied Life Science (BK21Plus Program), Graduate School, Gyeongsang National University, Jinju 660-701, Korea

Received: July 04, 2014; Accepted: September 02, 2014; Published: September 04, 2014

Abstract

Tomato spotted wilt virus (TSWV) belongs to the genus Tospovirus and has a tripartite, single-stranded, negative-sense RNA genome. Tospovirus is the only genus in the family Bunyaviridae that infects plants. During viral infection, the envelope glycoprotein (GP) precursor encoded by the complementary strand is post-translationally cleaved into two spike glycoproteins, GN (amino-terminal) and GC (carboxyl-terminal). GN and GC play key roles in particle assembly, maturation, and release in infected cells. Of the two, GN is the more important because, during maturation, GC is retained in the endoplasmic reticulum (ER) unless it is co-expressed with GN; GN enables the co-migration of GN and GC from the ER to the Golgi complex. To broaden the spectrum of viral inhibitors, it is necessary to understand the structural details of the viral proteins and their interactions with antiviral drugs. To elucidate these interactions, the three-dimensional structures of both GN and GC were predicted. The compounds used for protein-ligand docking were tunicamycin, distamycin-A, tiazofurin, actigard, admire, and ribavirin. Protein-protein (GC-GN) docking indicated that the C-terminal region of GN is necessary for heterodimerization with GC and for localization from the ER to the Golgi complex. Protein-ligand (GN-antiviral compound) docking identified tunicamycin and distamycin-A as the most efficient compounds. In conclusion, this study explored the structural details of GN and GC and the docking interactions of GC-GN and GN-antiviral compounds in TSWV.

Keywords: Glycoprotein; Modeling; Docking; Antiviral compounds; Inhibition

Abbreviations

GP: Glycoprotein; GN: Amino-terminal of Glycoprotein; GC: Carboxy-terminal of Glycoprotein; ER: Endoplasmic Reticulum; TSWV: Tomato Spotted Wilt Virus

Introduction

To our knowledge, this is the first modeling and docking report on the glycoprotein (GP) of tomato spotted wilt virus (TSWV). TSWV belongs to the genus Tospovirus, which has a broad host range and worldwide distribution [1]. Unlike the other genera of the family Bunyaviridae, Tospovirus is unique in comprising plant-infecting members [2]. These enveloped plant viruses cause great economic losses in most (sub)tropical and temperate crops. Approximately 1080 plant species are affected by TSWV, and most tospoviruses are transmitted by insect vectors belonging to the order Thysanoptera [3-4]. The Tospovirus genome is a tripartite RNA composed of a large (L) (<9 kb), a medium (M) (~4.8 kb), and a small (S) (~3 kb) segment. The L segment encodes an RNA-dependent RNA polymerase (RdRp), the M segment encodes GP and a non-structural protein (NSm), and the S segment encodes another non-structural protein (NSs) and the nucleoprotein (N) [5]. Among these proteins, the envelope GP plays a crucial role in particle assembly, maturation, and release in the host organism [6]. During viral infection, the GP precursor encoded by the complementary strand is post-translationally cleaved into two spike glycoproteins, GN and GC (where N and C refer to the amino and carboxyl termini) [7]. Initially, the envelope proteins GN and GC accumulate separately in the endoplasmic reticulum (ER) [8]. Viruses of this family acquire their lipid membrane and mature at the Golgi complex of the host cell; animal-infecting bunyaviruses follow the same mechanism during infection [9]. GC is arrested in the ER and can migrate to the Golgi complex only by forming a dimer with GN. Transient expression studies of TSWV in plant cells have shown that GN can migrate on its own and can also co-migrate with GC from the ER to the Golgi complex. Unlike GN, GC is unable to leave the ER on its own. Therefore, the GN-GC interaction is a rather unique feature among plant viruses and is required for replication of the virus in the vector as well as for virus particle assembly, development, maturation, and organization in the host [6,10,11].

To broaden the spectrum of viral inhibitors, it is necessary to understand the structural details of the viral proteins and their interactions with antiviral drugs. Therefore, antiviral agents used in practice against TSWV, namely tunicamycin, distamycin-A, tiazofurin, actigard, admire, and ribavirin, were virtually screened in the current study to identify the most promising compounds for inhibiting TSWV. Tunicamycin, an N-glycosylation inhibitor, was reported to affect the exit of GC/GN from the ER [12]. Distamycin-A was reported to delay the spread of TSWV in tobacco leaves [13]. Carner et al. [14] reported that tiazofurin showed efficient activity against TSWV in tomato. Ribavirin significantly reduced the overall growth rate of the virus by lengthening the lag phase in TSWV-affected cells in both tomato and tobacco [15]. Application of actigard and admire early in infection gave some reduction of TSWV in flue-cured tobacco [16].

However, to date no report has documented a three-dimensional (3D) model of TSWV GN and GC, their protein-protein interaction, or their interaction with antiviral compounds. Biological databases hold the amino acid sequences of many important proteins, but 3D structures are available for only a limited number of them. Structural details of a protein are necessary to study its exact function and nature. Computational prediction of protein structure and of interaction/recognition processes has been applied successfully in biological research. This approach can guide a vast array of experiments while reducing time and cost [17-19]. Hence, to understand the GN-GC interaction and the inhibitory mechanism of antiviral compounds, a molecular modeling and docking strategy was carried out. Further, the current work provides theoretical guidance for the research and development of antiviral agents against TSWV.

Materials and Methods

Molecular modeling

The sequences of GN and GC were retrieved separately from the full-length precursor envelope glycoprotein (ID: O55647) in the UniProt database (https://www.uniprot.org/). I-TASSER, ranked as the No. 1 server for protein structure prediction, was used to predict the 3D models [20]. For ab initio modeling, I-TASSER generates the full-length model of a protein by excising continuous fragments from threading alignments and reassembling them using replica-exchange Monte Carlo simulations. Low-temperature replicas (decoys) generated during the simulation are clustered by SPICKER, and the top five cluster centroids are selected for generating full atomic models. Cluster density is defined as the number of structure decoys per unit of space in the SPICKER cluster. A higher cluster density means that the structure occurs more often in the simulation trajectory and therefore indicates a better-quality model. The confidence (C)-score estimates the quality of the models based on the significance of the threading template alignments and the convergence parameters of the structure assembly simulations. To assess the accuracy of the predicted model, the template modeling (TM) score and the root mean square deviation (RMSD), both measures of structural similarity between two structures, are estimated from the C-score.
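The two similarity measures named above can be illustrated with a minimal Python sketch. It assumes that the two structures are given as lists of C-alpha coordinates that are already residue-matched and superposed; a full TM-score program additionally searches for the superposition that maximizes the score, which is not done here. The function names and the floor of 0.5 Å on the distance scale d0 for very short chains are choices made for this sketch, and I-TASSER itself estimates both quantities from the C-score rather than computing them against a known structure.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two matched lists of (x, y, z) points."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(total / len(coords_a))


def tm_d0(target_length):
    """Length-dependent distance scale d0 = 1.24 * (L - 15)^(1/3) - 1.8, in angstroms.

    The 0.5 floor for short chains (L <= 21) is an assumption of this sketch.
    """
    if target_length <= 21:
        return 0.5
    return 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8


def tm_score(coords_model, coords_ref, target_length=None):
    """TM-score for one fixed superposition (no search over superpositions).

    TM = (1 / L_target) * sum_i 1 / (1 + (d_i / d0)^2), summed over matched residues.
    """
    if len(coords_model) != len(coords_ref) or not coords_ref:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    length = target_length if target_length is not None else len(coords_ref)
    d0 = tm_d0(length)
    total = sum(
        1.0 / (1.0 + (math.dist(a, b) / d0) ** 2)
        for a, b in zip(coords_model, coords_ref)
    )
    return total / length
```

Unlike RMSD, which grows without bound and is dominated by the worst-fitting residues, the TM-score lies between 0 and 1 and down-weights large local deviations, which is why the two are usually reported together.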

Validation and transmembrane and active site prediction

The best 3D model was validated with the Structural Analysis and Verification Server (SAVES) (https://services.mbi.ucla.edu/SAVES/). Within SAVES, the stereochemical quality of the protein structure, in terms of both residue-by-residue geometry and overall structure geometry, was checked by PROCHECK [21] and WHAT_CHECK. Non-bonded interactions between different atom types were evaluated by ERRAT. VERIFY_3D was used to determine the compatibility of the atomic model (3D) with its own amino acid sequence (1D) by assigning a structural class based on location and environment (alpha, beta, loop, polar, etc.) and comparing the results with good structures. PROVE was used to calculate the volumes of atoms in the macromolecule with an algorithm that treats the atoms as hard spheres, and to compute a statistical Z-score deviation of the model from highly resolved (2.0 Å or better) and refined (R-factor of 0.2 or better) PDB-deposited structures. The transmembrane regions of both GN and GC were predicted with the TMHMM server (v. 2.0) [22]. The active site was predicted with Q-SiteFinder, an online web server [23].

Ligand preparation and validation

The chemical structures of the antiviral compounds distamycin-A (3003), tiazofurin (403014), actigard (77928), admire (77934), and ribavirin (34439) were retrieved from the ChemSpider database (https://www.chemspider.com). The structure of tunicamycin (6433557) was retrieved from PubChem (https://www.ncbi.nlm.nih.gov/pccompound). The compound identification number in the respective database is given in brackets. All structures were sketched in ArgusLab (v. 4.0.1) and optimized with the universal force field until an optimized geometry was achieved [24]. After optimization, all compounds were converted to PDB format for the docking studies.

Protein-protein docking simulation

To identify the region of GN that is essential for dimerization with GC during the rescue of GC from the ER, a protein-protein docking simulation was carried out using GRAMM-X (v. 1.2.0), a protein-protein docking web server from the Vakser Lab [25]. The whole protein structures of both GN and GC were submitted for the protein-protein interaction. A maximum of 100 output models was requested, with default parameters otherwise. In GRAMM-X, the intermolecular energy function is smoothed by changing the range of the atom-atom potentials. This technique locates the area of the global minimum of intermolecular energy for structures of different accuracy. GRAMM-X has been shown to detect near-native matches in complexes with large conformational changes.

Protein-ligand docking simulation

AutoDock (v. 4.2) was used to analyze the interaction between GN and the antiviral compounds [26]. First, Kollman united-atom charges, solvation parameters, and polar hydrogens were added to the protein. Gasteiger charges were then assigned and the non-polar hydrogens were merged. The rigid roots of the flexible molecules were defined, and the ligands were allowed to rotate freely by applying AutoTors. To dock the compounds against the macromolecule, grid maps must be set on the receptor to define the region of interaction. The x, y, and z dimensions of the grid box were varied according to the size of the ligand, and the spacing between grid points was set to 0.375 Å. The Lamarckian genetic algorithm (LGA), a hybrid of the genetic algorithm (GA) with an adaptive local search (LS) method that improves on the performance of the GA alone, was used for the conformational search. A maximum of 10 conformers was used for each compound. The population size was set to 100, and individuals were initialized with the following parameters: maximum number of energy evaluations, 55 × 10^5; maximum number of generations, 1000; maximum number of top individuals that automatically survive, 1; mutation rate, 0.02; crossover rate, 0.8; step sizes, 0.2 Å for translations, 5.0° for quaternions, and 5.0° for torsions; cluster tolerance, 0.5 Å; external grid energy, 1000; maximum initial energy, 0.0; and maximum number of retries, 10000. In total, 50 LGA runs were performed for each antiviral compound, and the best configuration was selected based on the lowest global energy value, the number of H-bonds, and the number of electrostatically interacting amino acids.
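For orientation, the search settings listed above can be collected in the "keyword value" form used by AutoDock 4 docking parameter files (DPF). The following Python sketch only renders that fragment. The keyword names are given as recalled from the AutoDock 4 DPF format and should be checked against the AutoDock user guide; the helper name render_dpf_fragment is hypothetical; the number of energy evaluations is read as 55 × 10^5, which is an assumption about the original notation; and map, ligand, and output entries that a complete DPF requires are deliberately omitted.

```python
# Search settings quoted in the text, as (keyword, value) pairs.
# Keyword names are assumed to match the AutoDock 4 DPF format.
LGA_SETTINGS = [
    ("ga_pop_size", "100"),              # population size
    ("ga_num_evals", str(55 * 10**5)),   # assumed reading of "55 x 10^5"
    ("ga_num_generations", "1000"),      # maximum number of generations
    ("ga_elitism", "1"),                 # top individuals that survive
    ("ga_mutation_rate", "0.02"),
    ("ga_crossover_rate", "0.8"),
    ("tstep", "0.2"),                    # translation step, angstroms
    ("qstep", "5.0"),                    # quaternion step, degrees
    ("dstep", "5.0"),                    # torsion step, degrees
    ("rmstol", "0.5"),                   # cluster tolerance, angstroms
    ("extnrg", "1000.0"),                # external grid energy
    ("e0max", "0.0 10000"),              # max initial energy, max retries
    ("ga_run", "50"),                    # number of LGA runs
]


def render_dpf_fragment(settings):
    """Render (keyword, value) pairs as DPF-style lines (hypothetical helper)."""
    return "\n".join(f"{keyword} {value}" for keyword, value in settings)


if __name__ == "__main__":
    print(render_dpf_fragment(LGA_SETTINGS))
```

The grid spacing of 0.375 Å belongs to the grid parameter file prepared for AutoGrid rather than to the docking parameters, so it is not part of this fragment.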

Visualization

Discovery Studio Visualizer (v. 4.0) was used to visualize the modeled proteins and docked molecules. The active sites of GN predicted by Q-SiteFinder were viewed in PyMOL (v. 1.7).

Results and Discussion

Molecular modeling of glycoprotein

To date, neither a crystallographic/NMR structure nor a theoretically predicted 3D model of the TSWV envelope glycoprotein is available. However, several experimental studies on TSWV have suggested that the envelope GP is the major contributor to the pathogenicity of TSWV and that it is associated with attachment/receptor binding on the host cell surface. Therefore, the present report combines the existing experimental results with bioinformatics approaches to understand the interactions of GN with GC and of GN with antiviral agents. In addition, structural information on the TSWV proteins would be a significant step toward identifying new drugs and their targets. Because TSWV proteins share little homology even within the genus, theoretical structure prediction for GP is very difficult. In computational biology, prediction of the 3D structure of a protein from its amino acid sequence is relatively easy when the candidate protein shows significant sequence similarity (around 30% or higher) to a protein of known structure [17,18]. The glycoprotein precursor is 1135 amino acids long and comprises three parts: residues 1-35 act as a signal peptide, residues 36-484 form GN, and residues 485-1135 form GC. Modeling was performed only on the regions corresponding to GN and GC.
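The division of the precursor into its three regions can be sketched in Python as follows. The residue ranges are those given above (1-based, inclusive); the function name split_precursor is hypothetical, and the placeholder string in the usage example stands in for the actual UniProt sequence, which is not reproduced here.

```python
# Residue ranges (1-based, inclusive) as stated in the text.
REGIONS = {
    "signal_peptide": (1, 35),
    "GN": (36, 484),
    "GC": (485, 1135),
}


def split_precursor(sequence, regions=REGIONS):
    """Return {region name: subsequence} for 1-based inclusive residue ranges."""
    parts = {}
    for name, (start, end) in regions.items():
        if not 1 <= start <= end <= len(sequence):
            raise ValueError(
                f"{name}: range {start}-{end} does not fit a sequence "
                f"of length {len(sequence)}"
            )
        # Convert the 1-based inclusive range to a 0-based Python slice.
        parts[name] = sequence[start - 1:end]
    return parts


if __name__ == "__main__":
    placeholder = "X" * 1135  # stand-in for the real precursor sequence
    for name, subsequence in split_precursor(placeholder).items():
        print(name, len(subsequence))
```

With these ranges the signal peptide, GN, and GC regions are 35, 449, and 651 residues long, which together account for all 1135 residues of the precursor.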

In the current work, five models were predicted by the I-TASSER server for each protein, and the best model of GN and of GC was selected on the basis of the C-score, TM-score, and RMSD (Figure 1A and B) for further analysis. In a benchmark set of 500 non-homologous proteins, the C-score is highly correlated with TM-score and RMSD. The I-TASSER parameters of the selected GN and GC models, namely C-score, number of decoys, cluster density, TM-score, and RMSD, are shown in Table 1. The CHARMM force field was applied to minimize the energy of GN and GC, giving -1096.92 and -5506.352 kJ·mol⁻¹, respectively. The φ-ψ torsion angles of the modeled proteins were evaluated by PROCHECK. The Ramachandran plot of the energy-minimized 3D model of GN showed 93.0% of residues in favorable regions (75.2% in the core and 17.8% in additionally allowed regions), 4.7% in generously allowed regions, and only 2.2% in disallowed regions. For the GC model, 91.6% of residues were in favorable regions (74.7% in the core and 16.9% in additionally allowed regions), 5.9% in generously allowed regions, and 2.6% in disallowed regions. The theoretically predicted models satisfied almost all the parameters of WHAT_CHECK, ERRAT, VERIFY_3D, and PROVE (data not shown).
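The backbone torsion angles plotted in a Ramachandran diagram are dihedral angles defined by four consecutive backbone atoms: φ by C(i-1), N(i), CA(i), C(i), and ψ by N(i), CA(i), C(i), N(i+1). As an illustration of the quantity being evaluated, and not of PROCHECK's own code, a dihedral angle can be computed from four atomic positions as in the following Python sketch. The function name and the example coordinates are hypothetical.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dihedral_degrees(p0, p1, p2, p3):
    """Signed dihedral angle in degrees about the p1-p2 bond, in (-180, 180]."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    if norm == 0.0:
        raise ValueError("p1 and p2 must be distinct points")
    axis = tuple(x / norm for x in b1)
    # Project the two outer bonds onto the plane perpendicular to the axis.
    v = _sub(b0, tuple(_dot(b0, axis) * x for x in axis))
    w = _sub(b2, tuple(_dot(b2, axis) * x for x in axis))
    x = _dot(v, w)
    y = _dot(_cross(axis, v), w)
    return math.degrees(math.atan2(y, x))


if __name__ == "__main__":
    # Hypothetical coordinates: the outer atoms lie on opposite sides (trans).
    print(dihedral_degrees((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1)))
```

Each residue's (φ, ψ) pair computed in this way is then assigned to the core, additionally allowed, generously allowed, or disallowed region of the plot, and the percentages reported above are the fractions of residues falling in each region.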