Computational Fine-Tuning of Functional Single Nucleotide Polymorphisms Associated with ACP5 Gene to Characterize Missense Mutations
Running Title: SNP Analysis of ACP5 Gene

The acid phosphatase 5 (ACP5) gene plays a vital role in the synthesis of the tartrate-resistant acid phosphatase (TRAP5) enzyme. TRAP5 is a ~35 kDa glycosylated di-iron metalloenzyme responsible for regulating the activity of the protein osteopontin. TRAP5 has two isoforms, TRAP5a and TRAP5b. TRAP5a has low enzymatic activity because a loop interacts with the active site; the more active TRAP5b is generated upon proteolytic cleavage of this loop. TRAP5a serves as a marker of systemic macrophage function and chronic inflammation, while TRAP5b serves as a marker of osteoclast activity. ACP5 is evolutionarily conserved and acts as a multifunctional protein involved in the generation of reactive oxygen species, normal bone development, macrophage function, and osteoblast regulation; it affects a series of pathways and reflects bone resorption and osteoclast activity. To understand its fundamental role, missense mutations of the ACP5 gene were functionally investigated through an in silico approach. Two nsSNPs, G109R and L201P, were predicted to be deleterious by multiple computational tools: SIFT, PolyPhen-1, PolyPhen-2, MAPP, SNAP, PredictSNP, and PhD-SNP. Structural analysis was also performed and showed no similarity between the native and mutant structures. Therefore, these reported mutations in ACP5 modify the expression, function, and structure of the TRAP5 protein. These findings suggest that TRAP5 can be a therapeutic target in immunological disorders, cancer, and metabolic bone diseases. These deleterious mutations can be lethal to its function and may hamper therapeutic strategies, leading to various diseases such as autoimmune cytopenia, systemic lupus erythematosus (SLE), immune-osseous dysplasia, spasticity with leukodystrophy, moyamoya syndrome, and Sjögren's syndrome.


Introduction
Acid phosphatase 5 (ACP5) is an iron-containing enzyme of the purple acid phosphatase class that occurs in two isoforms and catalyzes the hydrolysis of various phosphate esters under acidic conditions [1]. ACP5 is transcribed from a single gene containing 5 exons, of which the first three exons (E51, E52, E53) have three alternative promoters [2], while the TRAP5 protein is translated from exon 2 to exon 5 [3]. ACP5 is mainly found in macrophages; its 975 bp coding sequence encodes a protein of 325 amino acids, including a signal peptide of 19 residues and 2 potential N-glycosylation sites. It plays a vital role in the synthesis of tartrate-resistant acid phosphatase 5 (TRAP5), which has a molecular weight of 35-37 kDa [3]. It has also been reported that 969 bp of TRAP5 corresponds to 323 amino acids, with a putative signal sequence of 19 amino acids and 2 potential glycosylation sites. ACP5 is the most basic of the acid phosphatases, is insensitive to inhibition by L(+)-tartrate, and is found in the spleen and macrophages. Its enzymatic activity regulates the activity of the protein osteopontin. Catalytic activity requires a mixed-valent di-iron center. TRAP5 is primarily expressed by differentiated cells of the mononuclear phagocyte system, which include osteoclasts, macrophages, and dendritic cells. TRAP5 is found in two isoforms, TRAP5a and TRAP5b. TRAP5a is mainly synthesized as a monomeric proenzyme and has a molecular weight of 35 kDa [4][5][6]. TRAP5a, which is primarily found in immune cells, has low enzyme activity because a repressive loop interacts with the active site [7]. Increased levels of TRAP5a are seen in obesity, end-stage renal disease, and rheumatoid arthritis [8][9][10][11].
TRAP5b, in contrast, is a disulfide-linked heterodimer containing an N-terminal fragment of 20-23 kDa joined to a C-terminal fragment of 16-17 kDa. It originates from post-translational cleavage of the monomeric form, exhibits significantly increased phosphatase activity [12][13][14], and is primarily present in osteoclasts. TRAP5b is associated with the development of bone diseases such as osteoporosis, Paget's disease of bone, and multiple myeloma [15][16][17]. TRAP5b can be used as a serum marker for the diagnosis of bone metastatic disease in patients with myeloma and certain types of cancer [18][19][20][21]. Osteoclasts are involved in bone remodeling. During remodeling, the protein osteopontin is activated and allows the osteoclast to attach to the bone. Once bone resorption is complete, TRAP5b inactivates osteopontin, which causes the osteoclast to be released from the bone [21]. Osteopontin also plays an important role in the immune system, where it is found in macrophages and dendritic cells. During infection, this protein promotes inflammation and regulates the activity of other immune cells, which is necessary to fight foreign invaders such as bacteria and viruses. Just as TRAP5b inactivates osteopontin in bone cells, TRAP5a inactivates osteopontin in dendritic cells and macrophages when it is no longer needed. TRAP5 has been implicated in various processes, including (a) encouraging tumor metastasis, (b) indicating poor prognosis for hepatocellular carcinoma [22], (c) enabling spontaneous metastasis, invasion, transformation, and development of cancer [23], and (d) engaging in epithelial cell migration and regulation [24]. TRAP5 has gained research interest as a possible drug target because of its proposed role in the pathogenesis of osteoporosis and cancer.
TRAP5 activity can be decreased indirectly by neutralizing the protein with antibodies [25] or by applying RNA interference [26]. However, the use of small chemical inhibitors to selectively inhibit TRAP5 would not only offer a method to study TRAP5-mediated effects in pathologies such as cancer but would also open up possibilities for future therapeutic strategies involving modulation of TRAP5 activity. Spondyloenchondrodysplasia (SPENCD) is a recently identified complex genetic disorder with craniofacial, skeletal, neurological, and autoimmune manifestations [27][28][29]. More precisely, it involves skeletal dysplasia and radiolucent metaphyseal lesions arising from biallelic mutations of the ACP5 gene, which encodes the TRAP5 enzyme [30,31]. Ramesh et al. and others showed earlier [32][33][34] that autoimmune cytopenia, immune-osseous dysplasia, spasticity with leukodystrophy, systemic lupus erythematosus (SLE), moyamoya syndrome, and Sjögren's syndrome could be caused by ACP5 mutations [35][36][37]. Other salient features of SPENCD include growth retardation with developmental defects, clumsy gestures, and particular neurological symptoms such as intracranial/cerebral calcifications [31].
To examine the deleterious effect of nsSNPs on the structure and function of the TRAP5 protein, several diverse approaches, including empirical and support-vector-based methods, have been employed.
Thus, the goal of this study is to identify nsSNPs of the ACP5 gene that are likely to alter the structural and functional aspects of the TRAP5 protein.
Since a massive amount of SNP data is available for the ACP5 gene, it may not be feasible for a researcher to study every single SNP in the laboratory to analyze its biological impact on the structural and functional stability of the encoded protein.
Designing experiments to depict the impact of each nsSNP on protein function would be time-consuming, difficult, and expensive. Therefore, this study offers an inexpensive and less time-consuming alternative approach to screen amino acid substitutions and select candidate nsSNPs of the ACP5 gene for further wet-laboratory analysis.

Data Retrieval
Sequence and variant information for ACP5 was taken from the UniProt database (P13686). UniProt is a collective protein resource in which data about protein sequences are available [38]. The ACP5 protein comprises 325 amino acids, and its Homo sapiens sequence was downloaded in FASTA format. A total of 2030 non-redundant ACP5 SNPs were curated from NCBI. Among these identified variants, 233 were missense SNPs, 106 synonymous, and 1707 intronic SNPs (Figure 1). In this study, two diverse approaches, support-vector-based and empirical methods, were utilized to screen non-synonymous SNPs. The annotation and sequence of the TRAP5 protein are shown in Figures 2 and 3, respectively.
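As a minimal sketch of how a downloaded FASTA record can be read before it is passed to the prediction tools, the snippet below parses FASTA text into headers and sequences. The function name `parse_fasta` and the toy record are illustrative assumptions; the toy sequence is a placeholder and is not the ACP5 sequence.

```python
def parse_fasta(text: str) -> dict:
    """Parse FASTA text into a {header: sequence} dictionary."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:]  # drop the leading '>'
            records[header] = ""
        elif header is not None:
            records[header] += line  # sequence may span several lines
    return records


# Toy record for illustration only (NOT the real ACP5 sequence).
TOY_FASTA = """>toy|EXAMPLE|placeholder record
MKTAYIAKQR
QISFVKSHFS
"""

records = parse_fasta(TOY_FASTA)
for header, sequence in records.items():
    print(header, len(sequence))
```

In practice the input would be the FASTA file downloaded for P13686, and the sequence length could be checked against the 325 residues stated above.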

Tools for identifying deleterious SNPs
All the nsSNPs of the ACP5 gene were analyzed with various computational tools, namely SIFT Blink, PolyPhen-1, PolyPhen-2, MAPP, SNAP, PredictSNP, and PhD-SNP, to assess their deleterious effect [39,40]. The selected tools follow different algorithms, such as sequence-homology and evolution-based methods, with diverse criteria. The effect of each amino acid change relative to the native residue was prioritized based on the scoring systems of the above-mentioned tools.
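Amino acid changes in this study are written in native-position-mutant notation (for example, G109R). The following sketch, under the assumption of one-letter residue codes and 1-based positions, shows how such a label can be parsed and applied to a sequence while checking that the native residue matches. The helper names `parse_substitution` and `apply_substitution` and the toy sequence are hypothetical and are not taken from any of the tools above.

```python
import re

# One-letter native residue, 1-based position, one-letter mutant residue.
SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(label: str):
    """Split a label such as 'G109R' into (native, position, mutant)."""
    match = SUBSTITUTION.match(label)
    if match is None:
        raise ValueError(f"not a valid substitution label: {label!r}")
    native, position, mutant = match.groups()
    return native, int(position), mutant


def apply_substitution(sequence: str, label: str) -> str:
    """Return the sequence with the substitution applied."""
    native, position, mutant = parse_substitution(label)
    if not 1 <= position <= len(sequence):
        raise ValueError(f"position {position} is outside the sequence")
    if sequence[position - 1] != native:
        raise ValueError(
            f"expected {native} at position {position}, "
            f"found {sequence[position - 1]}"
        )
    return sequence[: position - 1] + mutant + sequence[position:]


print(parse_substitution("G109R"))
# Toy sequence for illustration only (not ACP5).
print(apply_substitution("MKTGLA", "G4R"))
```

Checking the native residue before substituting guards against numbering mismatches, for example between a precursor sequence and a mature protein that lacks the signal peptide.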

SIFT Blink
Sorting Intolerant from Tolerant (SIFT) Blink predicts whether an amino acid substitution affects protein function, based on sequence homology and the physical properties of amino acids. SIFT applies to laboratory-induced missense mutations and to naturally occurring non-synonymous polymorphisms. SIFT has evolved into one of the standard tools for assessing missense variations [41][42]. SIFT Blink uses precomputed NCBI BLAST results for single-protein analysis. Each submitted mutation is assessed individually. A substitution with a probability score ≥0.05 is designated benign, and one with a score <0.05 is designated deleterious. Assessing the degree of conservation of the affected residue has been the most consistent approach for determining the pathogenicity of a missense variant [43,44].
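The SIFT cutoff described above can be sketched as a small function. This is only an illustration of the thresholding rule stated in the text (score < 0.05 deleterious, score ≥ 0.05 benign); the function name `classify_sift` is hypothetical and the example scores are made up for demonstration.

```python
def classify_sift(score: float) -> str:
    """Label a SIFT probability score using the 0.05 cutoff."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("SIFT scores are expected to lie in [0, 1]")
    return "deleterious" if score < 0.05 else "benign"


# Made-up example scores, not results from this study.
for example_score in (0.00, 0.049, 0.05, 0.80):
    print(example_score, classify_sift(example_score))
```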

PolyPhen
PolyPhen (Polymorphism Phenotyping, versions 1 and 2) is a program that predicts the potential impact of an amino acid substitution on the function and structure of a protein using straightforward physical and comparative considerations. Its score indicates the likelihood that a substitution is damaging. The input is the protein sequence in FASTA format together with the position of the substitution and the native and mutant amino acids. The output includes a score, sensitivity, specificity, and the PSIC (position-specific independent count) score, and predictions are reported as benign, possibly damaging, or probably damaging [45]. The PolyPhen-2 score ranges from 0.0 (tolerated) to 1.0 (deleterious). Variants with scores from 0.0 to 0.15 are likely to be benign, and variants with scores from 0.85 to 1.0 are more likely to be damaging.
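The score ranges above can be illustrated with a short classifier. This is a sketch rather than PolyPhen-2's own logic: the function name `classify_polyphen2` is hypothetical, the label for the intermediate range and the treatment of the exact boundary values 0.15 and 0.85 are assumptions, and the example scores are made up.

```python
def classify_polyphen2(score: float) -> str:
    """Map a PolyPhen-2 score in [0, 1] to a qualitative label."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("PolyPhen-2 scores are expected to lie in [0, 1]")
    if score <= 0.15:   # assumption: boundary counted as benign
        return "benign"
    if score >= 0.85:   # assumption: boundary counted as probably damaging
        return "probably damaging"
    return "possibly damaging"  # assumption: label for the middle range


# Made-up example scores, not results from this study.
for example_score in (0.02, 0.50, 0.99):
    print(example_score, classify_polyphen2(example_score))
```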

MAPP
MAPP stands for Multivariate Analysis of Protein Polymorphism. It uses six physicochemical properties to identify missense variants that compromise a protein: charge, polarity, hydropathy, free energy in alpha-helical conformation, free energy in beta-sheet conformation, and side-chain volume. The predictive score is calculated by quantifying these physicochemical properties. The physicochemical characteristics of missense variants show a good association with human disease. MAPP also reports the degree of confidence in the predicted mutations [46].

SNAP
SNAP (screening for non-acceptable polymorphisms) classifies the non-synonymous SNPs of any protein as neutral or non-neutral (deleterious). It uses a sequence-based algorithm. SNAP predicts the functional effects of nsSNPs by integrating predicted protein structure features (including secondary structure and solvent accessibility), evolutionary information, residue conservation, and other relevant data. A protein sequence is needed as input. SNAP includes a reliability index for each prediction, which shows its confidence level, so that deleterious predictions made with low confidence can be identified [47].

PhD-SNP
PhD-SNP stands for Predictor of human Deleterious Single Nucleotide Polymorphisms. It uses a support vector machine (SVM) based algorithm to classify non-synonymous SNPs as human genetic disease-causing (deleterious) or benign mutations [48].
SNP analysis was executed using different bioinformatic methods: SIFT Blink, PolyPhen-1, PolyPhen-2, MAPP, SNAP, PredictSNP, and PhD-SNP. Each method operates on a particular algorithm, so deleterious nsSNPs can be predicted more reliably by combining different computational tools [30][31][32][33][34]. Hence, a combinatorial method was adopted to reduce false-positive predictions, and only those nsSNPs that were predicted to be deleterious by all of the above-mentioned methods were retained.
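The combinatorial filter described above, which retains only variants called deleterious by every tool, can be sketched as follows. The data layout (a dictionary of per-tool boolean calls), the function name `consensus_deleterious`, and the placeholder variants are assumptions made for illustration; they are not the study's data or pipeline.

```python
TOOLS = [
    "SIFT", "PolyPhen-1", "PolyPhen-2", "MAPP",
    "SNAP", "PredictSNP", "PhD-SNP",
]


def consensus_deleterious(calls: dict, tools=TOOLS) -> list:
    """Return variants that every listed tool calls deleterious.

    `calls` maps a variant label to {tool name: True if deleterious}.
    A missing tool result is treated as "not deleterious".
    """
    return sorted(
        variant
        for variant, by_tool in calls.items()
        if all(by_tool.get(tool, False) for tool in tools)
    )


# Placeholder calls for illustration only (not results from this study).
toy_calls = {
    "variant_A": {tool: True for tool in TOOLS},
    "variant_B": {**{tool: True for tool in TOOLS}, "MAPP": False},
    "variant_C": {"SIFT": True},  # results missing for the other tools
}

print(consensus_deleterious(toy_calls))
```

Requiring agreement among all tools is a strict criterion: it lowers the chance of false positives at the cost of discarding variants on which the tools disagree.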

Results and Discussions
In recent years, the investigation of non-synonymous SNPs has emerged as a new diagnostic approach. In this analysis, the SNPs of ACP5 were categorized by region: coding synonymous (SNPs), coding non-synonymous (nsSNPs), intronic, and untranslated regions (3' UTR and 5' UTR). Out of 233 missense variants, only 43 SNPs were predicted to be deleterious by the prediction tools (Table 1). Of these, only two (rs387906672 and rs781050795) were found to be damaging by all seven computational tools mentioned above.

Prediction of deleterious nsSNPs
The deleterious SNPs are listed in Table 1, which compares the results of SIFT Blink, PolyPhen-1, PolyPhen-2, MAPP, SNAP, PredictSNP, and PhD-SNP. The two consensus substitutions, G109R and L201P, each result from a change in a single nucleotide (Table 1). Prior studies have indicated that using different algorithmic methods to classify detrimental nsSNPs increases prediction accuracy. The native structure of the protein was downloaded from the PDB database (PDB ID: 2BQ8), and the mutations were introduced using Swiss-PdbViewer. Figure 4 shows the native and mutant structures of the TRAP5 protein (G109R and L201P). These deleterious mutations (G109R and L201P) may therefore hinder the function and expression of ACP5. These mutations are responsible for spondyloenchondrodysplasia (SPENCD). SPENCD is an inherited disease that mainly affects bone growth and the immune system. Its signs and symptoms can become evident at any time from infancy to puberty. The disease was initially reported in 1976 as a skeletal dysplasia with vertebral and metaphyseal changes implying cartilage persistence within the bone. It was subsequently found that individuals with SPENCD have neurological symptoms of brain calcification and spasticity and suffer from a high incidence of autoimmune disease [49]. In contrast, Lausch et al. reported that the disease was first identified in 1958 in a 10-year-old boy with juvenile systemic lupus erythematosus (SLE) and unique bone lesions [49].
In summary, this study proposes two highly deleterious nsSNPs of the ACP5 gene that can be taken forward into epidemiological studies to assess their phenotypic association and correlation with different diseases.

Conclusions
Using multiple computational platforms, two nsSNPs, G109R and L201P, were predicted to be highly deleterious, altering the structure and function of the TRAP5 protein.
TRAP5 can be a therapeutic target in immunological disorders, cancer, and metabolic bone diseases. As a result, these deleterious mutations can be lethal to its function and may hamper therapeutic strategies targeting the protein. To investigate the deleterious effect of these nsSNPs (G109R and L201P) on the structure and function of the TRAP5 protein more precisely, molecular dynamics simulations will be performed in future work.