3D-QSAR MIFs Studies on 3,5-Substituted-1,4,2-dioxazole Derivatives Using Open3DQSAR Tools

Molecular interaction fields (MIFs) were applied to a data set of thirty-three (33) dioxazole derivatives to generate 3D-QSAR models at various 3D grid spacings. An excellent cross-validated correlation coefficient q² (0.903) and conventional correlation coefficient r² (0.985) were obtained at a 2.0 Å grid spacing, indicating that the model for this class of compounds is statistically significant. The calculated biological activities showed a high degree of agreement with the experimental values.


Introduction
Amoebiasis [1] is an intestinal infection caused by the protozoan parasite Entamoeba histolytica (E. histolytica) [2,3]. E. histolytica can destroy almost all tissues of the human body, including the intestinal mucosa, the liver and, to a lesser extent, the brain, skin, cartilage and even bone [4]. Amoebiasis occurs worldwide [5] but is mostly seen in tropical and developing countries where sanitation and hygiene are poor. Acute amoebiasis can present as diarrhoea or dysentery with frequent, small and often bloody stools. Chronic amoebiasis can present with gastrointestinal symptoms plus fatigue, weight loss and occasional fever. Extraintestinal amoebiasis can occur if the parasite spreads to other organs, most commonly the liver, where it causes amoebic liver abscess, which presents with fever and right upper quadrant abdominal pain. Amoebic colitis results from ulcerating mucosal lesions caused by the release of parasite-derived hyaluronidases and proteases. Hepatic infection occurs as a consequence of entry of the parasite into the afferent bloodstream.
Paromomycin and iodoquinol are the drugs of choice for the treatment of asymptomatic infections proven to be caused by E. histolytica [6,7]. For symptomatic intestinal or extraintestinal infections (e.g., hepatic abscess), the drugs of choice are metronidazole (Flagyl) or tinidazole (Tindamax), immediately followed by treatment with iodoquinol or paromomycin [6,7]. Although medicinal chemistry has advanced greatly, many difficulties remain in the treatment of amoebiasis caused by E. histolytica. Recent studies have sought to improve the treatment of this infection by developing new antiamoebic agents [8]: a set of dioxazole derivatives showed better activity than the reference drug metronidazole and, furthermore, were non-toxic to human kidney epithelial cells.
In addition, QSAR studies have been reported that identify important structural features responsible for antiamoebic activity [9,10]. The most important step is to find candidate compounds with the desired biological activity. Quantitative structure-activity relationships (QSAR) are now an established tool for drug design. Over the years, many methods, algorithms and techniques have been developed and applied in QSAR studies [11,12]. Today, QSARs are applied in many disciplines, with particular emphasis on drug design. Three-dimensional quantitative structure-activity relationship (3D-QSAR) analysis [13] is now one of the most widely used and refined of these techniques.

Experimental Data
The 3,5-substituted-1,4,2-dioxazole derivatives and their antiamoebic activities were taken from the work of Bhat et al. [8]. A comparative study of the in vitro antiamoebic activity of all synthesized dioxazole derivatives (33 compounds) was performed against the HM1:IMSS strain of Entamoeba histolytica. The experimental IC50 values of the compounds, in μM (micromolar), compared very favourably with the IC50 value of metronidazole. Most of them are non-toxic to human kidney epithelial cells. The toxicity studies against a human kidney epithelial cell line showed that all the compounds were non-toxic, with IC50 values in the range of 0.41-1.80 µM [8].

Data Set
3D-QSAR modeling was carried out through MIFs analysis by randomly dividing the data set of 33 molecules into a training set [14] of 27 molecules and a test set of 6 molecules (test-set compounds are marked with *). Following standard practice, all experimental IC50 values (μM) were converted to the negative logarithm of IC50, i.e., pIC50, which was used as the dependent variable in the 3D-QSAR study. The structures of these compounds are given in Table 1 and their biological activities are listed in Table 4.
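The data preparation described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure actually used in the study: it assumes that the IC50 values in μM are first expressed in mol/L before taking the negative base-10 logarithm (the text does not say so explicitly), and the function names, the random seed and the compound numbering 1-33 are hypothetical.

```python
import math
import random


def pic50_from_micromolar(ic50_um):
    """Return pIC50 = -log10(IC50 in mol/L) for an IC50 given in micromolar.

    Assumption: the conversion to mol/L is our choice; the source only says
    "negative logarithm of IC50".
    """
    if ic50_um <= 0:
        raise ValueError("IC50 must be positive")
    return -math.log10(ic50_um * 1e-6)


def random_split(ids, n_test, seed=0):
    """Randomly pick n_test identifiers for the test set; the rest are training."""
    rng = random.Random(seed)          # hypothetical seed, for reproducibility
    test = sorted(rng.sample(ids, n_test))
    test_lookup = set(test)
    train = [i for i in ids if i not in test_lookup]
    return train, test


compound_ids = list(range(1, 34))      # 33 compounds, numbered 1..33
train_ids, test_ids = random_split(compound_ids, n_test=6)
print(len(train_ids), len(test_ids))   # 27 6
print(round(pic50_from_micromolar(1.80), 3))  # 5.745
```

With this convention a more potent compound (smaller IC50) receives a larger pIC50, which is why pIC50 rather than IC50 serves as the dependent variable.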

Structure Generation
All 3D structures were first generated with GaussView 03, and energy minimization was performed with the MOPAC2012 software [15] using the semi-empirical AM1 method. All geometric variables of each compound were then fully optimized with the Gaussian03W program, version 6.0 [16], at the B3LYP/6-31G(d,p) level of theory. Frequency calculations confirmed that the low-energy conformers had only real frequencies, and these lowest-energy conformations were used in the Open3DQSAR tools [17] for the MIFs analysis.

Alignment
The molecules were superimposed by atom-based alignment using the Open3DALIGN tools [18,19], as shown in Figure 2. Compound 10 (Figure 1) was selected from the data set as the template for aligning the other compounds because of its high biological activity and representative chemical structure. Default values were used unless otherwise noted. The geometries were then optimized to an RMS gradient criterion [20] with the MMFF94s force field implemented in TINKER, invoked through the corresponding Open3DALIGN command option. The energy convergence criterion in the alignment was 0.01 kcal/mol.

Generating MIFs
Molecular interaction fields (MIFs) are the interaction energies between a probe atom (or molecule) and a set of aligned molecules; they are used to establish three-dimensional quantitative structure-activity relationship (3D-QSAR) equations. To generate the MIFs, a probe atom is moved systematically from point to point within a defined 3D grid around each aligned molecule [21], and at each grid point the interaction energy between the probe and the target molecule is calculated. In this study, the 33 aligned molecules were placed in 3D cubic lattices of various grid spacings. The steric (van der Waals) and electrostatic (Coulombic) interaction energies were calculated for each molecule at each grid point using an alkyl carbon probe (the default), with charges assigned automatically by the OpenBabel utilities. Energies lower than -40.0 kcal/mol or greater than 40.0 kcal/mol were cut off, because a few extreme values in the data set may severely bias the model.
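The grid scan and the energy cutoff can be illustrated with the following sketch. It is not the Open3DQSAR implementation: a generic 12-6 Lennard-Jones term stands in for the steric field, the probe charge of +1, the well depth `epsilon`, the contact distance `r_min` and the two-atom toy molecule are all hypothetical, and values beyond the cutoff are assumed to be clamped to ±40 kcal/mol rather than discarded.

```python
import math

CUTOFF = 40.0        # kcal/mol, the limit quoted in the text
COULOMB_K = 332.06   # kcal*Angstrom/(mol*e^2), converts q1*q2/r to kcal/mol


def clamp(energy, limit=CUTOFF):
    """Truncate an energy to the interval [-limit, +limit] (assumed behaviour)."""
    return max(-limit, min(limit, energy))


def grid_points(lo, hi, spacing):
    """Yield the (x, y, z) nodes of a cubic lattice spanning [lo, hi] per axis."""
    n = int(round((hi - lo) / spacing)) + 1
    axis = [lo + i * spacing for i in range(n)]
    for x in axis:
        for y in axis:
            for z in axis:
                yield (x, y, z)


def probe_energies(atoms, point, probe_charge=1.0,
                   epsilon=0.1, r_min=3.8, min_dist=1e-3):
    """Return (steric, electrostatic) probe energies at one grid point.

    atoms: list of (x, y, z, partial_charge). All force-field parameters
    here are illustrative placeholders, not MMFF94 values.
    """
    steric = 0.0
    electrostatic = 0.0
    for ax, ay, az, charge in atoms:
        # min_dist avoids division by zero when a node sits on an atom
        r = max(math.dist(point, (ax, ay, az)), min_dist)
        ratio6 = (r_min / r) ** 6
        steric += epsilon * (ratio6 * ratio6 - 2.0 * ratio6)
        electrostatic += COULOMB_K * probe_charge * charge / r
    return clamp(steric), clamp(electrostatic)


# Hypothetical two-atom "molecule" on a 2.0 Angstrom lattice
toy_molecule = [(0.0, 0.0, 0.0, -0.3), (1.4, 0.0, 0.0, 0.3)]
field = [probe_energies(toy_molecule, p) for p in grid_points(-4.0, 4.0, 2.0)]
print(len(field))  # 125
```

Each molecule thus contributes one row of steric and electrostatic values to the field matrix, with one column per grid node and field type; a coarser spacing gives fewer columns.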

MIFs Study
The MIFs analysis described here was performed with the Open3DQSAR tools [17], available from SourceForge.net, using the partial least squares (PLS) technique [22] with the NIPALS algorithm [23]. Steric and electrostatic fields were included in the analysis, and various grid spacings were applied. All 33 compounds of the data set were superimposed (see Figure 2) onto a template by an atom-by-atom least-squares fit, with one of the most active compounds (10) as the reference molecule. After alignment, the molecules were placed in 3D grids [21] of various spacings. The steric and electrostatic fields were then calculated using a 'CR' alkyl carbon probe with the default charge, and the energies were cut off at -40 and 40 kcal/mol. Regression analysis of the resulting field matrix was performed by PLS. To obtain the 3D-QSAR models, PLS analysis was performed using the steric and electrostatic fields alone and in combination. Cross-validation [24] in PLS was carried out by the leave-one-out (LOO) method [25,26] to check the predictive ability of the models and to determine the optimal number of components for the final 3D-QSAR models. The optimum number of PLS components for the final, non-cross-validated analysis was chosen on the basis of the smallest standard error of estimation (SEE) values from the cross-validated analysis.
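The PLS and leave-one-out steps can be sketched as follows. This is a minimal single-response PLS (PLS1) written in plain Python with NIPALS-style deflation, offered as an illustration rather than as the Open3DQSAR code; the function names and the toy data are hypothetical, and q² is computed here against the mean of all training activities, which is one of several conventions in use.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def fit_pls1(X, y, n_comp):
    """Fit a PLS1 model; returns (x_mean, y_mean, [(w, p, q), ...])."""
    n, m = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(m)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(m)] for row in X]  # centred X
    f = [v - y_mean for v in y]                                # centred y
    comps = []
    for _ in range(n_comp):
        w = [dot([E[i][j] for i in range(n)], f) for j in range(m)]
        norm = math.sqrt(dot(w, w))
        if norm < 1e-12:           # nothing left to explain
            break
        w = [v / norm for v in w]                  # weight vector
        t = [dot(row, w) for row in E]             # scores
        tt = dot(t, t)
        p = [dot([E[i][j] for i in range(n)], t) / tt for j in range(m)]
        q = dot(f, t) / tt                         # y loading
        E = [[E[i][j] - t[i] * p[j] for j in range(m)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        comps.append((w, p, q))
    return x_mean, y_mean, comps


def predict_pls1(model, x):
    """Predict y for one sample by repeating the deflation on the new row."""
    x_mean, y_mean, comps = model
    e = [xi - mi for xi, mi in zip(x, x_mean)]
    y_hat = y_mean
    for w, p, q in comps:
        t = dot(e, w)
        y_hat += q * t
        e = [ei - t * pi for ei, pi in zip(e, p)]
    return y_hat


def loo_q2(X, y, n_comp):
    """Leave-one-out cross-validated q2 = 1 - PRESS / SS."""
    press = 0.0
    for i in range(len(X)):
        model = fit_pls1(X[:i] + X[i + 1:], y[:i] + y[i + 1:], n_comp)
        press += (y[i] - predict_pls1(model, X[i])) ** 2
    y_mean = sum(y) / len(y)
    ss = sum((v - y_mean) ** 2 for v in y)
    return 1.0 - press / ss


# Hypothetical toy data: y is an exact linear function of two descriptors
X = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0], [5.0, 7.0], [6.0, 5.0]]
y = [2.0 * a - b + 3.0 for a, b in X]
scores = {c: loo_q2(X, y, c) for c in (1, 2)}
best_n_comp = max(scores, key=scores.get)   # component count with highest q2
```

In a real MIFs study the rows of `X` would be the field values of the training molecules at every grid node, so the number of columns far exceeds the number of compounds; this is the situation PLS is designed for.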

Results and Discussion
To develop an effective 3D-QSAR model, the cross-validated correlation coefficient (q²), the non-cross-validated correlation coefficient (r²), the standard error of estimation (SEE) and the F-statistic were taken into consideration. LOO cross-validation was carried out first for each 3D-QSAR model. The number of components identified in the LOO cross-validation was then used in the final non-cross-validated PLS run. The optimal number of components was determined by selecting the highest q² value.
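For reference, the statistics named above are conventionally defined as shown below. The formulas are not given in the source, so these standard forms are an assumption. Here y_i is the experimental pIC50 of training compound i, ŷ_i its fitted value from the final model, ŷ_(i) its prediction when compound i is left out, ȳ the mean experimental activity, n the number of training compounds and c the number of PLS components.

```latex
\begin{align}
q^2 &= 1 - \frac{\sum_{i=1}^{n} \bigl(y_i - \hat{y}_{(i)}\bigr)^2}
                {\sum_{i=1}^{n} \bigl(y_i - \bar{y}\bigr)^2} \\
r^2 &= 1 - \frac{\sum_{i=1}^{n} \bigl(y_i - \hat{y}_i\bigr)^2}
                {\sum_{i=1}^{n} \bigl(y_i - \bar{y}\bigr)^2} \\
\mathrm{SEE} &= \sqrt{\frac{\sum_{i=1}^{n} \bigl(y_i - \hat{y}_i\bigr)^2}{n - c - 1}} \\
F &= \frac{r^2 / c}{\bigl(1 - r^2\bigr) / (n - c - 1)}
\end{align}
```

Because q² is built from predictions for compounds excluded from the fit, it is always lower than or equal to what the same data would give for r², and it can become negative for a model with no predictive value.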
The statistical results of the MIFs studies are summarized in Table 2. The predicted antiamoebic activities of the dioxazole derivatives are listed against their experimental activities in Table 4, and the correlation between the predicted and experimental activities is depicted in Figure 3.
The MIFs-based PLS calculations resulted in several models, so the selection of the final model is an important issue. To obtain the 3D-QSAR models, PLS analysis was performed using the steric and electrostatic MIFs alone and in combination, at varying 3D grid spacings. Seven models were produced by varying the grid spacing (see Table 2). All the models showed comparatively good statistical results except model 5 (Table 2). Model 3, which uses both steric and electrostatic fields at a 2.0 Å grid spacing, was chosen as the working MIFs model; its validity and predictability were assessed by an r² value of 0.985 and a q² value of 0.903 with 5 components, an F value of 293.46 and a standard error of estimation (SEE) of 0.035. The steric and electrostatic contributions were 79.89% and 20.11%, respectively. A graphical inspection of the experimental versus calculated pIC50 values indicated that the overall fit was satisfactory for all the models proposed in Table 2. Table 2 thus shows that the best model is model 3; that is, the combined steric and electrostatic fields at a 2.0 Å grid spacing gave the best statistical results.
The external predictive ability of a MIFs model is extremely important for its applicability. Therefore, r²pred was used as the criterion for the final selection of the single best model. As shown in Table 3, all the models gave comparatively good r²pred values except model 5; model 3 had the highest r²pred value (0.974) and the lowest standard error of prediction (SEP, 0.048) for the test set (six compounds, indicated by * in Table 4). Therefore, model 3 was selected as the best MIFs model.
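The external validation statistics can be computed as in the sketch below. The definitions are those commonly used in the 3D-QSAR literature and are an assumption here, since the source does not state them: r²pred compares the squared prediction errors on the test set with the squared deviations of the test activities from the training-set mean, and SEP is taken as the root mean squared prediction error. The function names and the three activity values are hypothetical and are not taken from Table 4.

```python
import math


def r2_pred(y_test, y_test_pred, y_train_mean):
    """Predictive r2 = 1 - PRESS / SD, with SD taken about the training mean."""
    press = sum((obs - pred) ** 2 for obs, pred in zip(y_test, y_test_pred))
    sd = sum((obs - y_train_mean) ** 2 for obs in y_test)
    return 1.0 - press / sd


def sep(y_test, y_test_pred):
    """Standard error of prediction, assumed here to be sqrt(PRESS / n_test)."""
    press = sum((obs - pred) ** 2 for obs, pred in zip(y_test, y_test_pred))
    return math.sqrt(press / len(y_test))


# Hypothetical pIC50 values for illustration only
observed = [5.0, 6.0, 7.0]
predicted = [5.1, 5.9, 7.0]
training_mean = 6.0

print(round(r2_pred(observed, predicted, training_mean), 3))  # 0.99
print(round(sep(observed, predicted), 4))                     # 0.0816
```

Unlike q², these values depend only on compounds that played no part in building the model, which is why they serve as the final selection criterion.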

Conclusions
We have carried out MIFs studies on thirty-three (33) dioxazole derivatives against their experimental biological activities. Atom-based alignment combined with variation of the 3D grid spacing was used to build the models for the MIFs analysis. The present studies have established that the model derived through the MIFs analysis is reliable and statistically significant. We found that PLS analysis at a 2.0 Å 3D grid spacing with the Open3DQSAR tools gave excellent statistical results in terms of q² and r² for the dioxazole analogues and showed a high degree of agreement with the experimental antiamoebic activities. The dioxazole analogues can therefore be said to have good antiamoebic activity. In addition, dioxazoles may find a role in the treatment of gastrointestinal infections caused by the protozoan parasite E. histolytica.