In Silico Design: Those Accentuate Assembly of HIV-1 Capsid

Considering the significance of RNA genome in the HIV infection, its release can be forbidden by inhibiting encapsulated Capsid (CA) ‘‘core’’. In this context, design of new molecules that interact with HIV-1 CA assembly was made by conducting 3D-QSAR and Pharmacophore studies on a stockpile of fifty eight molecules. The supramolecular interactions of these inhibitors with the receptors active site amino acids Glu 35, Gly 60, His62, Gln 63, Ala 65, and Val 136 has been characterized using docking studies. The combined five point Pharmacophore hypothesis AHHRR, and 3D-QSAR CoMFA (Comparative molecular field analysis) and CoMSIA (Comparative molecular similarity indices analysis) has shown good statistical partial least square factors. The association of generated 3D-QSAR and PHASE pharmacophore hypothesis has provided structural insights to explore new molecules with enhanced CA assembly activity.


Introduction
Capsid protein (CA) of Human immunodeficiency Virus type-1 (HIV-1) is a fullerene cone accomplished with approximately 250 hexamers and 12 pentamers that plays a prominent role in the initial stages of post host cell entry and in virus assembly [1][2][3][4]. Proteolytic cleavage of Gag (Matrix, Capsid and Nucleo capsid) polyprotein releases CA which reassembles to form a conical "core" that encloses the viral RNA genome [5][6][7]. The assembly process of CA protein is associated with the N-terminal domain (CANTD residues 1-145) and C-terminal domain (CACTD residues 151-231). The CANTD aid in virion maturation by self associating into hexameric rings and incorporation of the cellular protein cyclophilin A (CypA). While CACTD contributes to Gag-Gag interactions [8][9][10]. Once the virus enters the cell, the core disassembles and releases ribonucleoprotien complex, resulting in the formation of reverse transcription complexes [11][12][13][14]. However, virions with deformed core structured are defective in initiating reverse transcription and exhibit reduced infectivity [15]. It is believed that the N-terminus of CA can refold to form a new γ-hairpin helix by cleavage of CA from matrix protein (MA), which is stabilized by a buried salt bridge between Proline-1 of CA and the carboxyl side chain of Asp51. The renovation from tubular to spherical forms can be induced by changing the pH from 7.0 to 6.8 [16,17]. Several small molecule inhibitors like CAP-1, NYAD-1 a peptide inhibitor, bevirimat, a triterpene derivative have been reported to interfere with CA function on binding to the N-terminal domain of CA in both its mature and immature forms [18][19][20][21][22][23][24][25]. Pharmacophore mapping was performed by using Pharmacophore Alignment and Scoring Engine (PHASE) to evaluate features necessary for the ligand to interact with a receptor [26]. The pharmacophore may be used as a query in searching 3D databases containing "drug like" small organic molecules and also aid in exploring and engineering of new scaffold with high potency [27]. To investigate the mode of recognition and interaction mechanism of HIV-1 CA inhibitors with CA protein, molecular docking studies are performed. Further 3D-QSAR studies were performed by using CoMFA and CoMSIA analysis.

Ligand construction and preparation
A stockpile of fifty eight 1,5-dihydrobenzo[b] [1,4]diazepine-2,4-dione derivatives [28][29][30] (Figure 1) with their experimental information were collected from the literature. Outlining the three dimensional structures of these ligands using Maestro build panel in Schrödinger Suite, all possible low energy states are generated at physiological pH range of 7 +/-2 in the Ligprep module of Schrödinger.

Generation of the common pharmacophore hypothesis (CPH)
The pharmacophore hypothesis and alignment were carried out by PHASE [31] (version 2.5, 2011; Schrodinger, LLC, New York, NY). Considering the significance of EC 50 in biological activity [32,33] and we have converted E C 50 into the corresponding pEC 50 (-logE C 50 ), and randomly selected thirty five molecules as training set to generate Pharmacophore models. PHASE defines chemical features of ligand that facilitate non covalent bonding between the ligand and target receptor. PHASE models are created by applying Partial Least Squares (PLS) regression. The accuracy of the models improves with an increase in the number of PLS factors [34]. The PLS regression analysis was carried out using three PLS factors and possible statistical parameters are evaluated. The test set predictions were described as Q 2 root mean squared error and Pearson correlation coefficient (R) value. The regression value is calculated using the formula;

Docking studies
The X-ray structure of CANTD in complex with the benzodiazepine inhibitor (PDB: 4E91) [35] was retrieved from Protein Data Bank (http://www.rcsb.org/). Protein was prepared by using protein preparation wizard in Schrödinger.
[36] GLIDE 5.6 (Grid-based ligand docking with energies) docking was performed by generating a grid (10 Å x10 Å x10 Å) around the active site of capsid assembly inhibitor. Rigid receptor docking protocols of Glide, namely Standard Precision (SP) and Extra Precision (XP) were employed to gain insight into the binding modes of all inhibitors. Further, to account for the receptor flexibility, a computationally intensive Induced Fit Docking (IFD) was performed. To validate the docking protocol, co-crystallized ligand benzodiazepinedione was re-docked and its atomic root mean square deviation (RMSD) was calculated [37][38][39].

Prime/MM-GBSA calculations
The relative binding free energies for best ranking molecules in XP mode are calculated by Molecular Mechanics Generalized Born Surface Area (MM/GBSA) using Prime. The dock pose features that are locally optimized features from glide were minimized from prime and complex energies were calculated using OPLS_2005 force field. The relative binding free energy ΔGbind was estimated according to equation.

ΔGbind=Ecomplex (minimized)-[Eligand (unbound, minimized) + Ereceptor (unbound, minimized)]
ΔGbind is the calculated relative free energy which includes both ligand and receptor strain energy. Ecomplex (minimized) is MM-GBSA energy of the minimized complex, Eligand (unbound, minimized) is the MM-GBSA energy of the ligand after removing it from the complex and allowing it to relax. Ereceptor (unbound, minimized) is the MM-GBSA energy of protein after separating it from the ligand.

CoMFA and CoMSIA Models
CoMFA and CoMSIA studies are performed by aligning data set molecules and are evaluated for thier steric, electrostatic and hydrogen bond effect on bioactivities using SYBYLX-2.1 [40,41]. Gasteiger Huckle charges were assigned for all the inhibitors [42]. For CoMFA, the overlapped molecules were placed in a rectangular grid points separated by 2 Å. The van der Waals potential and columbic terms, representing the steric and electrostatic fields, were calculated using standard Tripos force field [43]. The regression analysis was carried out using set of variables, and cross validated the PLS method (leave-one-out). The final model (noncross-validated conventional analysis) was developed with the optimal number of components which poses the highest q 2 value.
For CoMSIA, five physicochemical properties namely steric, electrostatic, hydrophobic, hydrogen bond donor and acceptor were calculated. The CoMFA or CoMSIA descriptors were used as independent variables and pEC50 values as dependent variables in partial least square analysis. [34] The predictive correlation coefficient (r2pred ) based on the test molecules, is computed Novel Approaches in Drug Designing & Development with the formula r 2 pred = (SD-PRESS)/SD, where SD is the sum of the squared deviations between the biological activities of test set and mean activities of the training set molecules. PRESS is the sum of squared deviation between predicted and observed activity for each molecule in the test set.

ADME prediction
The ADME (absorption, distribution, metabolism and excretion) properties of designed molecules were evaluated computationally using Qik Prop module of Schrodinger [44]. The physically significant descriptors and pharmaceutically relevant properties of organic molecules with compliance to Lipinski's rule of five were calculated by Qik prop.

Results and Discussion
The hexameric architecture of capsid relay on three basic type of interactions. (i) CANTD-CANTD six fold symmetric interfaces that create the hexameric rings [45]. (ii) CANTD-CACTD intermolecular interface that reinforces the hexamer [8][9][10] and (iii) CACTD-CACTD dimeric interface that links the hexameric rings to form the lattice [46][47][48][49][50]. The current study is mainly focused on hexamer formation of CA by disrupting the intermolecular interaction of CANTD and hence in silico methods employed are restricted to N-terminal domain only. The X-ray crystal structure (4E91) of HIV-1 CANTD is a single chain containing the N-terminal domain. Sundquist reported intermolecular N-terminal domain interactions between Phe-32, Glu35, Gly60, His62, Gln63, and Ala65 residues of helices 1 and 2 [51]. The perturbation of intermolecular N-terminal domain interaction involves in disruption of NTD-NTD interaction and hence inhibits CA assembly [52]. PHASE analysis for a set of 58 molecules derived five point CPHs belonging to AHHRR, AHRRR. Thirty five molecules in the training set were aligned on these CPHs and evaluated for PLS analysis using three PLS factors. The predictivity of each hypothesis was cross validated by the test set of twenty three molecules. The variants named AHHRR (model A 1 ) and AHRRR (model B1) in Figure 2 showed better statistical significance than AHHRR (A 2 ), AHHRR (A 3 ) and AHHRR (A 4 ). Since all the five pharmacophore hypothesis capitulate a statistically significant data as shown in Table 1, the hypothesis AHHRR (A 1 ) with good survival score 3.549, with significant R2 of 0.926 and Q 2 of 0.824 was considered in this work. From the above results it is confirmed that the obtained PHASE model (r 2 > 0.5, q 2 > 0.6, [(r 2 -r0 2 )/r 2 < 0.1, 0.85 ≤ k ≤ 1.15 and rm2 > 0.5) was in the acceptable range.

Pharmacophore generation
The hydrogen bond acceptor A 2 , and the phenyl ring R 12 was found to be significant for the activity. The carbonyl group at second position can form hydrogen bond interaction with His 62 residue in the active site, while the aromatic ring at third position contributes to hydrophobicity. The hypothesis AHHRR (A 1 ) was fit into the molecular skeleton of highest active molecule of capsid assembly inhibitor 50g. The distance and angles between the five features of the best two models AHHRR (A 1 ) and AHRRR (B 1 ) are shown in Table 1 & 2 (provided in supplementary data). The pharmacophore model implies that molecules which would fit the AHHRR (A 1 ) sites may consequence in stronger interactions with the active site residues. The distance and the angle maps of the best two models are represented in Figure S 1 given in supplementary data. The field contribution (blue and red cube contours) of the best two models are represented in Figure 3. The blue cubes represent that substitution of hydrophobic favored groups while red cubes represent disfavored region. Scatter plot of experimental and phase predicted pEC50 is shown in (Figure 2-4) ( Table 1).    To evaluate the validity of docking protocol, root mean square deviation (RMSD) calculations were performed. The crystal ligand was haul out from the complex (PDB id: 4E91) and subsequently re-docked into the receptor. Best dock pose from docking studies differ from the original conformation by 0.216 Å, substantiating the robustness of docking protocol to enumerate the experimental binding mode. Figure 5 shows the overlay of the X-ray crystal structures (colored in cyan) of capsid assembly inhibitor and the re-docked pose (colored in plum) ( Figure 5).

Molecular docking studies and Prime MM/GBSA calculations
Fifty eight molecules with benzodiazepine scaffold which were reported to exhibit inhibitory effect on assembly process of HIV-1 capsid (Figure 1) are chosen for docking studies. The binding orientation of these has shown hydrogen bond interactions with Glu35, Gly60, His 62, Gln 63, Ala 65, Val 135 and Tyr 145 amino acid residues and are said to exhibit HIV-1 replication with a novel mechanism of action [52]. This compounds bind to the N-terminal domain (NTD) of the viral capsid protein (CA) and is believed to prevent its ensemble into conical core that is trivial for virion maturation and viral infectivity [53].
The hydrogen bond interaction with His 62 and Phe 32 residues of helices 1 and 2 are plays a crucial role since they hinder the hexameric lattice organization by interrupting intermolecular N-terminal interaction. SP dock pose of highest active molecule 50g display polar H-bond interaction with His 62 (Figure 6a), while XP dock pose exhibit additional π-π stacking interaction with Phe 32. (Figure 6b) The IFD dock pose for the same 50g molecule reveal polar hydrogen bonding interaction with Phe 32 and His 62 residues (Figure 6c). Application of different docking protocols i.e., SP, XP and IFD resulted in variable dock scores, of which IFD has shown more refined scores. Increase in dock scores in IFD protocol is possibly due to additional flexibility attributed to the protein, which allow more interactions with active site residues. However, some deviations have been observed in IFD scores of 45f and 53h which were lower than XP dock scores which might be due to steric clashes.  The relative binding affinities of protein ligand complexes were assessed by Molecular mechanics with generalized born surface area (MM/GBSA). The ∆Gbind value of highest active molecule 50g was found to be -134.50 that is slightly higher than the lead molecule 54 which was screened through CA assembly assay. It is imperative to evaluate statistical correlation between the different computational parameters and experimental pEC50 values to validate the applicability of in silico studies in predicting the HIV-1 CA inhibitors. These inhibitors showed a Novel Approaches in Drug Designing & Development statistically significant correlation of -0.579 for XP docking, -0.614 for prime MMGBSA (Figure 7). The list of inhibitors along with their pEC50, SP, XP and IFD dock scores, ∆Gbind values from PRIME along with predicted CoMFA and CoMSIA values have been tabulated in Table 2.  The 3D-QSAR CoMFA and CoMSIA analysis was performed by aligning the dock poses of all fifty eight molecules (Figure 8). A training set of thirty five molecules (which has been selected for generating PHASE models) were also chosen for constructing CoMFA and CoMSIA models. The best predictions were obtained with CoMFA standard model q2 value of 0.607, r2 value of 0.968 and CoMSIA with q2 value of 0.616 and r2 value of 0.954 respectively. The predictivity of the model was cross validated by the test set of twenty three molecules. The predictive correlation coefficient r2 pred of 0.576 for CoMFA and 0.528 for CoMSIA explains good predictive ability of the models. The scatter plot of actual and predicted pEC50 values for training and test set of CoMFA and CoMSIA studies are shown in Figure 9a and Figure  9b. The results of CoMFA and CoMSIA are listed in Table 3.

CoMFA and CoMSIA contour maps
To visualize the information derived from the 3D-QSAR models, contour maps were generated. The contour plots are representation of the lattice points in the grid and the difference in the lattice points is strongly connected with the difference in the receptor binding affinity. Whereas molecular fields defines the favorable and unfavorable interaction energies of aligned molecules with a probe atom traversing across the lattice points, suggesting the modification required to design new potent molecules.
The CoMFA contours indicate the region in space where the molecule would favorably or unfavorably interact with the receptor, while CoMSIA contours indicate the areas within the specified region where the presence of groups with particular physicochemical property binds to the receptor. Therefore the most potent inhibitor among the series 50g was displayed on the maps for visualization.
The steric contours of CoMFA and CoMSIA Figure 10(a) and Figure 10(b) indicate the regions where sterically bulky substituent might have favorable (green) and unfavorable (yellow) effects on the activity of the inhibitor. The methoxy and hydroxyl substitutions are necessary to have good interactions with the receptor, and pyrazole ring is less sterically crowded, substitutions at this position with a bulky group facilitate increase in the activity. The electrostatic contours of CoMFA and CoMSIA Figure 10(c) and Figure 10(d) suggest that increasing the negative charge in red region will have a propensity for increase in binding affinity towards the receptor. The hydrophobic contour of CoMSIA Figure 11  As the first hit molecule screened through Capsid assembly assay was compound 54 with an enamine side chain at 3 rd position was reported to be chemically instable by Lee Fader [28], and the SAR studies were performed by replacing the enamine side chain with more polar groups like 2-methoxyethyl, 2-hydroxyethyl chains which resulted in no loss of potency thus confirmed that small substitutions at this position can be tolerated [29]. In this article designing of new molecules was attempted by substitution on phenyl ring at 5 th position but results were not industrious in terms of compliance to PHASE. These observations lead us to a conclusion that the substitution at R 1 , R 2 and R 3 of phenyl moiety at third position are well tolerated and therefore were targeted for isosteric substitution ( Figure 11). The sterically favored pyrazole at R 3 , that showed single interaction with His 62 has been substituted with 1,2,4 triazole and electronegative favored methoxy (-OMe) group in 50g has been replaced with methyl sulfonate (-SO2Me) of N10, showed equivalent dock scores and facilitate newer interactions with Val 27 and Val 59. This implies that improved electronegativity on the substituents showed higher affinity. Increase in aliphatic carbon chain on diazapine nitrogen with propyl (N 1 -N 6 ) and isobutyl (N 7 -N 10 ) accounts for increased hydrophobicity. Considering the structural necessities depicted in Figure 12, we have designed ten new molecules which adhere to PHASE as well as CoMFA and CoMSIA. Docking studies revealed that the designed molecules Figure 13 show good interaction with the active site amino acids. The structures of newly designed molecules and their possible interactions are tinted in Figure 14. The pharmacokinetic data of newly designed molecules is illustrated in Table 3 (provided in supplementary data).

Novel Approaches in Drug Designing & Development
Design of new molecules by retaining quinoline scaffold. The quinolines not only exhibited antibacterial, antimalarial but also shown to inhibit HIV-1 activity by involving in assembly process. Hence new molecules are designed with a view to interact with N-terminal domain of HIV-1 Capsid, they too have shown the similar interactions as that of reported molecules. The docks pose C 1 with its ligand interactions are depicted in the Figure 14.
The oxygen of carbonyl carbon forms hydrogen bond interaction with amine of Phe 32, where as the amine of quinoline forms hydrogen bond interaction with the His 62 amino acids within the active site.

Prediction of ADME properties
The newly designed molecules were analyzed for their druglikeness by assessing their physiochemical properties (Table S3 provided in supplementary data) and by applying Lipinski's rule of five. This rule states that the molecule should have molecular weight < 650 Daltons, H-bond donors <5, H-bond acceptors <10, and a log P of <5. For the selected 10 molecules the partition coefficient (QPlogPo/w), water solubility (QPlogS) and MDCK cell permeability (QPPMDCK) properties have been estimated. All these pharmacokinetic parameters were found to be within the acceptable range.

Novel Approaches in Drug Designing & Development Conclusion
In summary, Docking studies guided by the receptor, predicted the binding conformations, and binding free energies for a series of fifty eight benzodiazepenedione inhibitors against HIV-1 capsid which are well correlated with their reported inhibitory activities, The docking results for the newly designed molecules provide additional hydrogen bond interactions with residues Val 27, Val 59 and Gly 60 along with His 62, Phe 32. The higher binding affinities of designed molecules N1, N2, N9 and N10 may be attributed to the additional hydrophobic interaction. Pharmacophore generation and 3D QSAR CoMFA and CoMSIA field distribution are in good conformity with the structural requirements of the active site of capsid assembly inhibitors that allows conception of a plausible template for designing novel potent inhibitors. The predicted activities of newly designed molecules for both CoMFA and CoMSIA are found to be equivalent to that of the highest active molecule 50g and expected to be active against HIV-1 Capsid assembly ( Figure  14).