Network Analysis of Differential Expression of ANLN and Its Interacting Proteins in Esophageal Squamous Cell Carcinoma

Research Article

Austin J Biosens & Bioelectron. 2023; 8(1): 1046.

Hong Sun1,2; Yufei Cao1,2; Bingli Wu1,2; Beibei Tong1,2; Huayan Zou1,2; Liyan Xu2; Enmin Li1-3*

¹Department of Biochemistry and Molecular Biology, Shantou University Medical College, Shantou 515041, China

²Key Laboratory of Molecular Biology in High Cancer Incidence Coastal Chao Shan Area of Guangdong Higher Education Institutes, Shantou University Medical College, Shantou 515041, China

³Shantou Academy of Medical Sciences, Shantou 515041, China

*Corresponding author: Enmin Li, Department of Biochemistry and Molecular Biology, Shantou University Medical College, Shantou 515041, China. Email: nmli@stu.edu.cn

Received: August 31, 2023 Accepted: September 25, 2023 Published: October 02, 2023

Abstract

Anillin (ANLN) is an actin-binding protein that was originally isolated from Drosophila melanogaster embryos. As a key regulator of cytokinesis, ANLN is highly expressed in various tumors, where it leads to abnormal cell division and promotes the proliferation, migration, and invasion of cancer cells. At present, the role and regulatory mechanism of ANLN in human Esophageal Squamous Cell Carcinoma (ESCC) are not fully understood. The purpose of this study was to construct a Protein-Protein Interaction Network (PPIN) that integrates expression data in ESCC to reveal the characteristics of ANLN and its interacting proteins. Differentially expressed ANLN and its interacting proteins in ESCC were identified from our previous RNA-seq data. Construction of a specific PPI network showed that many differentially expressed genes/proteins may interact with ANLN. Multiple pathways enriched for ANLN and its differentially expressed interacting genes were identified by functional enrichment analysis, Gene Ontology (GO) analysis and KEGG pathway analysis. In addition, ANLN, ECT2, ACADM and PPP1R9A were found to play important roles in the occurrence and development of ESCC and are proposed as new prognostic factors for ESCC. These bioinformatics analyses provide a comprehensive perspective on the role of ANLN in ESCC.

Keywords: ANLN and its interacting proteins; Differential expression; Protein-protein interaction network; Esophageal squamous cell carcinoma

Introduction

In the human genome, ANLN is located on chromosome 7p14.2 and encodes a protein of 1124 amino acids with a theoretical molecular weight of 124 kDa [1]. ANLN is a scaffold protein with several unique domains. Its N-terminus contains several domains that bind to contractile ring proteins, and its C-terminus contains RBD, C2 and PH domains that bind to the plasma membrane [2]. ANLN is an actin-binding protein that was originally isolated from the embryos of Drosophila melanogaster [3]. ANLN plays a role in cell growth, migration and cytokinesis [4]. As a highly conserved protein, ANLN regulates cell division by interacting with different proteins such as F-actin, myosin II and septins [5]. During cytokinesis, ANLN is concentrated in the contractile ring, where it helps provide contractile force and mediates the separation of daughter cells [6]. ANLN improves contraction efficiency by binding directly to phosphorylated myosin or by coordinating the interaction between F-actin and Non-muscle Myosin II (NM II) [7,8]. ANLN binds septins, important regulators of cell division and mechanotransduction, to regulate cell proliferation, differentiation and migration [9,10]. Several studies have found that ANLN plays a crucial role in promoting tumor cell proliferation and that its absence can inhibit tumor cell division [11,12]. In addition, mouse models have confirmed that ANLN promotes the growth of gastric cancer cells in vivo [13]. High expression of ANLN in breast cancer cells is associated with poor prognosis in patients with breast cancer [14]. ANLN is related to T lymphocyte infiltration in pancreatic cancer and has potential value in immunotherapy [15]. ANLN enhances the metastasis of lung adenocarcinoma by promoting epithelial-mesenchymal transition [16]. Targeting the USP10-ANLN axis can effectively inhibit cell cycle progression in ESCC [17].
Overall, the high expression of ANLN leads to abnormal cell division, thereby promoting the proliferation, migration, and invasion of cancer cells [18-23].

ESCC is a common malignant tumor of the digestive system with high morbidity and mortality. According to the global cancer statistics for 2020, the incidence of ESCC ranks seventh and its mortality ranks sixth among all malignant tumors [24]. ESCC is the most common subtype of esophageal cancer worldwide, with a high prevalence in East Asia, East and South Africa and southern Europe [25], and its five-year survival rate is only 15%-25% [24], which makes it a serious threat to human health [26]. China is a country with a high incidence of ESCC, accounting for more than half of the world's cases and deaths [27]. ESCC is the fourth leading cause of cancer death in China, with 375,000 patients dying of the disease [28]. With the continuous accumulation of biological knowledge and various omics data, protein-protein interaction data are becoming increasingly abundant, which makes it possible to use them to construct gene regulatory networks for tumors and other diseases. PPI networks can provide a powerful platform for revealing the pathogenesis of complex diseases and for drug development [29-32]. In this study, we collected the proteins that interact with human ANLN, constructed a PPI network containing the differentially expressed ANLN-interacting proteins in ESCC, and analyzed its clinical significance. These data help to reveal the role of ANLN in ESCC cells and provide an important molecular basis for studying the regulatory mechanism of ANLN in ESCC.

Materials and Methods

Expression of ANLN and its Interacting Proteins in ESCC

The gene set of ANLN and its interacting proteins was collected from NCBI (https://www.ncbi.nlm.nih.gov/), HPRD (http://www.hprd.org/) (Release 9) and BioGRID (http://thebiogrid.org/) (Release 4.4.209). From our previous high-throughput RNA-seq data on 15 pairs of ESCC clinical samples, we screened the expression trends of ANLN and its interacting proteins in ESCC using the same stringent criteria (fold change >2 or <0.5, FDR <0.05) [33]. The differentially expressed genes/proteins that interact with ANLN were used to construct the PPI networks.
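The screening rule above can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline used in [33]: the `GeneStat` record, the function names and the placeholder gene symbols and values are all assumptions made for the example, and the fold change is assumed to be a plain tumor/normal ratio rather than a log-transformed value.

```python
from typing import Dict, Iterable, NamedTuple, Optional, Set


class GeneStat(NamedTuple):
    """One row of a differential-expression table (hypothetical layout)."""
    symbol: str
    fold_change: float  # assumed: tumor/normal ratio, not log2
    fdr: float


def classify(gene: GeneStat, fc_up: float = 2.0, fc_down: float = 0.5,
             fdr_cutoff: float = 0.05) -> Optional[str]:
    """Return 'up', 'down', or None using the thresholds quoted in the text."""
    if gene.fdr >= fdr_cutoff:
        return None  # not significant after multiple-testing correction
    if gene.fold_change > fc_up:
        return "up"
    if gene.fold_change < fc_down:
        return "down"
    return None  # significant but the change is too small


def filter_interactors(stats: Iterable[GeneStat],
                       interactors: Set[str]) -> Dict[str, str]:
    """Keep only interactors of the bait protein that pass the screen."""
    result = {}
    for gene in stats:
        if gene.symbol not in interactors:
            continue
        label = classify(gene)
        if label is not None:
            result[gene.symbol] = label
    return result


# Placeholder symbols and values, invented purely for illustration.
example_stats = [
    GeneStat("GENE_A", 3.2, 0.001),   # passes: up-regulated
    GeneStat("GENE_B", 0.3, 0.010),   # passes: down-regulated
    GeneStat("GENE_C", 5.0, 0.200),   # fails the FDR cut-off
    GeneStat("GENE_D", 1.4, 0.001),   # fails the fold-change cut-off
    GeneStat("GENE_E", 4.0, 0.001),   # passes the screen but is not an interactor
]
example_interactors = {"GENE_A", "GENE_B", "GENE_C", "GENE_D"}
example_result = filter_interactors(example_stats, example_interactors)
```

Applying both thresholds before intersecting with the interactor list, or after, gives the same gene set; the order here simply mirrors the order in which the text describes the steps.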

Construction of the ANLN PPI Network

The up-to-date human protein-protein interaction data sets were downloaded and collated from HPRD (http://www.hprd.org/) (Release 9) and BioGRID (http://thebiogrid.org/) (Release 4.4.209). These two data sets contain low-throughput and high-throughput experimental results collected from the published literature [34,35]. Based on these data, a complete human protein-protein interaction network was constructed with Cytoscape software, and duplicated edges and self-loops were deleted through the "Network Modification" menu so that they would not distort the calculation of topology parameters of the PPI subnetwork [36,37]. The resulting network contains 21,678 unique proteins and 790,776 interactions and was treated as the parental PPI network. It is well known that up-regulated and down-regulated genes or proteins play an important role in tumorigenesis [38]. To highlight their importance, a specific PPIN of ANLN and its differentially expressed interacting proteins in ESCC was generated with Cytoscape, following the steps we described earlier [39]. First, the ANLN-related dysregulated gene IDs (official gene symbols) were listed in a text file and mapped onto the parental PPI network imported into Cytoscape via the menu "Select→Nodes→From ID List File". Second, the first-level interactions between these proteins and their neighbors were extracted via the menus "Select→Nodes→First Neighbors of Selected Nodes" and "New→Network→From Selected Nodes, All Edges" to obtain the ANLN PPI network.
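The Cytoscape menu steps above correspond to simple graph operations. The following sketch reproduces them outside Cytoscape in plain Python, assuming the interactions are available as pairs of gene symbols; the function names are hypothetical and this is not the Cytoscape implementation.

```python
from typing import Dict, Iterable, Set, Tuple

Adjacency = Dict[str, Set[str]]


def build_network(pairs: Iterable[Tuple[str, str]]) -> Adjacency:
    """Build an undirected network, dropping self-loops and duplicate edges.

    Storing neighbors in sets removes duplicates (A-B and B-A collapse into
    one edge); self-loops are skipped but their node is still kept.
    """
    adj: Adjacency = {}
    for a, b in pairs:
        adj.setdefault(a, set())
        adj.setdefault(b, set())
        if a != b:
            adj[a].add(b)
            adj[b].add(a)
    return adj


def edge_count(adj: Adjacency) -> int:
    """Number of undirected edges (each edge is stored twice)."""
    return sum(len(neighbors) for neighbors in adj.values()) // 2


def first_neighbor_subnetwork(adj: Adjacency, seeds: Iterable[str]) -> Adjacency:
    """Seeds plus their first neighbors, with all edges among the selection.

    Mirrors 'First Neighbors of Selected Nodes' followed by
    'From Selected Nodes, All Edges'. Seeds absent from the network are ignored.
    """
    selected = {s for s in seeds if s in adj}
    for seed in list(selected):
        selected |= adj[seed]
    return {node: adj[node] & selected for node in selected}


# Toy parental network with one duplicate edge and one self-loop.
toy_pairs = [("A", "B"), ("B", "A"), ("A", "A"), ("B", "C"), ("C", "D"), ("D", "E")]
parent = build_network(toy_pairs)
sub = first_neighbor_subnetwork(parent, ["A", "C"])
```

Keeping all edges among the selected nodes, rather than only the edges that touch a seed, is what lets the subnetwork show interactions between two neighbors of the seed proteins.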

Topology Analysis of ANLN PPINs

The Network Analyzer plugin in Cytoscape was used to analyze the topology of the ANLN PPI network. Several topology metrics were calculated, including the clustering coefficient, degree distribution, neighborhood connectivity and topological coefficient, to reveal the organization and structure of this complex network [40]. The degree of a node is the number of edges attached to it, that is, its number of connections; a node here represents a protein. The more proteins a node interacts with, the greater its degree, and the power-law distribution of node degree is the most significant feature. The network topology analysis method and its important parameters were described in our previous work [39]. The degree distribution of the network satisfies P(k) = n(k)/N, where N is the total number of nodes in the network and n(k) is the number of nodes with degree k; that is, P(k) is the fraction of nodes that have degree k. The degree distribution is the set of P(k) values over all k. Network Analyzer can also fit the power-law curve y = βx^a and calculate the R² value, which reflects how well the curve fits the degree distribution. The closer R² is to 1, the better the fit. In addition, Network Analyzer can calculate several other important topology parameters at the same time, including the shortest path length and closeness centrality.
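As an illustration of the two quantities defined above, the sketch below computes P(k) = n(k)/N from an adjacency map and fits a power-law curve of the form y = β·x^a. The fit is done by ordinary least squares on log-transformed values, with R² computed on the same log scale; this is an assumption about how such a fit is commonly done and may differ in detail from what Network Analyzer reports. The function names are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, Iterable, Set, Tuple


def degree_distribution(adj: Dict[str, Set[str]]) -> Dict[int, float]:
    """P(k) = n(k) / N: fraction of nodes that have degree k."""
    n_total = len(adj)
    counts = Counter(len(neighbors) for neighbors in adj.values())
    return {k: n_k / n_total for k, n_k in sorted(counts.items())}


def fit_power_law(points: Iterable[Tuple[float, float]]) -> Tuple[float, float, float]:
    """Fit y = beta * x**a by linear regression of log(y) on log(x).

    Returns (beta, a, r_squared). Points with non-positive coordinates are
    skipped because their logarithm is undefined.
    """
    logs = [(math.log(x), math.log(y)) for x, y in points if x > 0 and y > 0]
    n = len(logs)
    if n < 2:
        raise ValueError("need at least two points with positive coordinates")
    mean_x = sum(x for x, _ in logs) / n
    mean_y = sum(y for _, y in logs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in logs)
    syy = sum((y - mean_y) ** 2 for _, y in logs)
    if sxx == 0 or syy == 0:
        raise ValueError("degenerate data: x or y is constant")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in logs)
    a = sxy / sxx                       # exponent = slope on the log-log scale
    log_beta = mean_y - a * mean_x      # intercept on the log-log scale
    ss_res = sum((y - (log_beta + a * x)) ** 2 for x, y in logs)
    r_squared = 1.0 - ss_res / syy
    return math.exp(log_beta), a, r_squared


# Toy star-shaped network: one hub connected to three leaves.
toy_adj = {"hub": {"x", "y", "z"}, "x": {"hub"}, "y": {"hub"}, "z": {"hub"}}
toy_pk = degree_distribution(toy_adj)   # {1: 0.75, 3: 0.25}
```

To fit a real network, pass `degree_distribution(adj).items()` to `fit_power_law`; a negative exponent with R² close to 1 is the pattern usually read as scale-free behavior.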

Subcellular Layer of ANLN PPIN

The subcellular localization information for each protein in the ANLN PPI network was downloaded from the HPRD database and imported into the network as node attributes. With the Cerebral plugin in Cytoscape, nodes were rearranged into layers according to their subcellular locations, without changing the interactions between them [41].

Functional Enrichment Analysis of Differentially Expressed Genes Related to ANLN

To further understand the functional relationship between ANLN and its interacting proteins, the Functional Annotation Chart module of the Database for Annotation, Visualization and Integrated Discovery (DAVID) (https://david.ncifcrf.gov/) was used to analyze and annotate the ANLN-related genes that were up-regulated or down-regulated in ESCC. Statistically significant results (P<0.001, FDR<0.05) were selected and visualized with the Enrichment Map plugin in Cytoscape. The functional classification covers Gene Ontology (GO), INTERPRO, KEGG, SMART and other annotation categories [42].
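The statistics behind this kind of enrichment screen can be sketched as follows. The example uses a plain hypergeometric over-representation test and the Benjamini-Hochberg adjustment as stand-ins; DAVID applies its own modified Fisher exact test, so the numbers produced here would not match DAVID output exactly. The function names and the toy counts are assumptions for illustration only.

```python
from math import comb
from typing import List, Sequence


def hypergeom_pvalue(k: int, n: int, K: int, N: int) -> float:
    """Upper-tail probability P(X >= k) of the hypergeometric distribution.

    N: genes in the background, K: background genes carrying the annotation,
    n: genes in the query list, k: query genes carrying the annotation.
    """
    total = comb(N, n)
    upper = min(n, K)
    favorable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favorable / total


def benjamini_hochberg(pvalues: Sequence[float]) -> List[float]:
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so the adjustment stays monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def significant_terms(pvalues: Sequence[float], p_cutoff: float = 0.001,
                      fdr_cutoff: float = 0.05) -> List[int]:
    """Indices of terms passing both cut-offs quoted in the text."""
    fdr = benjamini_hochberg(pvalues)
    return [i for i, p in enumerate(pvalues) if p < p_cutoff and fdr[i] < fdr_cutoff]
```

For example, `hypergeom_pvalue(5, 5, 5, 10)` asks how likely it is that all five genes of a query list fall in a five-gene category when the background holds ten genes; the answer is 1/252.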

GO Enrichment Analysis

GO Tree is an important part of WebGestalt (WEB-based GEne SeT AnaLysis Toolkit) [43,44]. It uses the GO Directed Acyclic Graph (DAG) to organize and visualize gene sets, and offers expandable trees, bar charts of selected annotation levels and enriched DAGs. Among these, the DAG is the most intuitive representation of GO enrichment results: it shows the parent-child hierarchy of GO terms together with their degree of enrichment, and is used to visualize the GO categories that the statistical module identifies as enriched in genes. The darker the color of a term in the graph, the more significant its enrichment, while an uncolored term is not significantly enriched. In addition, bar charts can be used to summarize the enrichment results.

KEGG Pathway Enrichment Analysis

One of the most crucial tasks in biological research is to determine the molecular pathways in which proteins are involved. WebGestalt can display the name of each KEGG pathway associated with the gene set under study, all genes involved in the pathway, and the corresponding Entrez IDs [43,45]. In addition, the KEGG table not only provides a P value that reflects the statistical significance of each pathway, but also links to the KEGG map, where the genes in the gene set are marked in red.

Survival Analysis of ANLN-Related Differentially Expressed Genes in ESCC

The expression data set GSE53625, which contains 179 paired ESCC samples, was downloaded from the GEO database. Cases with survival of <3 months were excluded, leaving 175 cases of ESCC. For each differentially expressed gene, the optimal cut-off point of its expression level was determined with X-Tile (version 3.6.1) software, and the ESCC patients were divided into a high-expression group and a low-expression group. Survival was analyzed with the Kaplan-Meier method and the log-rank test, and survival curves were drawn with GraphPad Prism 5 [46,47].
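For readers unfamiliar with the two survival methods named above, the following sketch implements the Kaplan-Meier estimator and a two-group log-rank test in plain Python. It is a textbook illustration under stated assumptions, not a reproduction of the X-Tile or GraphPad Prism calculations: the cut-off search is not included, the function names are hypothetical, and events are coded as 1 (death observed) or 0 (censored).

```python
import math
from typing import List, Sequence, Tuple


def kaplan_meier(times: Sequence[float], events: Sequence[int]) -> List[Tuple[float, float]]:
    """Return (time, survival probability) at each distinct event time."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all subjects that share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # deaths and censored cases both leave the risk set
    return curve


def logrank_test(times_a: Sequence[float], events_a: Sequence[int],
                 times_b: Sequence[float], events_b: Sequence[int]) -> Tuple[float, float]:
    """Two-group log-rank test; returns (chi-square statistic, p-value), 1 df."""
    group_a = list(zip(times_a, events_a))
    group_b = list(zip(times_b, events_b))
    event_times = sorted({t for t, e in group_a + group_b if e})
    observed_a = expected_a = variance = 0.0
    for t in event_times:
        n_a = sum(1 for x, _ in group_a if x >= t)   # at risk in group A
        n_b = sum(1 for x, _ in group_b if x >= t)   # at risk in group B
        d_a = sum(1 for x, e in group_a if x == t and e)
        d_b = sum(1 for x, e in group_b if x == t and e)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    chi2 = (observed_a - expected_a) ** 2 / variance
    # Survival function of a chi-square variable with one degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value
```

In a workflow like the one described, the two groups passed to `logrank_test` would be the high-expression and low-expression patients defined by the chosen cut-off; because that cut-off is selected to maximize separation, the resulting p-value is optimistic unless corrected.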

Results

The Differential Expression of ANLN and its Interacting Proteins in ESCC

In our previous study, we performed high-throughput RNA sequencing (RNA-seq) on 15 pairs of ESCC clinical samples to identify key functional lncRNAs that regulate the expression of downstream protein-coding genes in ESCC [33]. To determine the expression pattern of ANLN and its interacting proteins in ESCC, we analyzed their expression levels in the RNA-seq data using a fold change >2 or <0.5 (FDR<0.05) as the threshold. In these data, 192 genes were differentially expressed: 155 were up-regulated, with fold changes ranging from 2.01 to 12.1, and 37 were down-regulated, with fold changes ranging from 0.043 to 0.493 (Figure 1).